{"id":166044,"date":"2023-01-19T09:48:00","date_gmt":"2023-01-19T08:48:00","guid":{"rendered":"https:\/\/liora.io\/en\/?p=166044"},"modified":"2026-02-06T09:07:40","modified_gmt":"2026-02-06T08:07:40","slug":"recurrent-neural-network-what-is-it","status":"publish","type":"post","link":"https:\/\/liora.io\/en\/recurrent-neural-network-what-is-it","title":{"rendered":"Recurrent Neural Networks: The complete Guide"},"content":{"rendered":"<strong>If you are a follower of our blog, you already know what a neural network is (if not, feel free to read this article first) but what does the adjective recurrent bring to this model? In this article, we will see how recurrent neural networks, called RNN, have become a classic model in <a href=\"https:\/\/liora.io\/en\/all-about-deep-learning\">deep learning<\/a>.<\/strong><h3>How do we apply this neural network in practice?<\/h3>\nBefore explaining an <b>RNN<\/b>, let&#8217;s focus on a ball. It is common in <strong><a href=\"https:\/\/liora.io\/en\/machine-learning-what-is-it-and-why-does-it-change-the-world\">Machine Learning<\/a><\/strong> to want to predict the trajectory of a moving object. As shown in figure 1, from the starting point, the ball can take <b>all sorts of directions<\/b>.<img decoding=\"async\" width=\"800\" height=\"656\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2021\/07\/Sans-titre-1-Recupere_Plan-de-travail-1-1024x840.png\" alt=\"\" loading=\"lazy\">\n\nHowever, if we take <b>figure 2<\/b>, it is clear that the ball will continue to go <b>towards the right<\/b>, because its past positions show a movement towards the right.&nbsp;\n\nSo far, everything seems logical.\n\nUnlike in figure 1, we have more data about the past, so we are better able to predict the movement of the ball.\n\nThus, it seems we just need to give our <b>neural network<\/b> the past ball positions and our study is over. However, how do we choose the number of input neurons? 
A trajectory can be split into as many points as we want, whether 10 or 100.\n\n<img decoding=\"async\" width=\"800\" height=\"243\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2021\/07\/Sans-titre-1-Recupere_Plan-de-travail-1-copie-1024x311.png\" alt=\"\" loading=\"lazy\">\n\nThe white ball represents the current position of the ball and the light blue balls represent the&nbsp;<b>past positions<\/b>&nbsp;of the white ball, so we guess that the ball is heading to the right.&nbsp;\n\nLet&#8217;s ignore this detail for now, and fix the size of our input samples. Let&#8217;s look at&nbsp;<b>Figure 3<\/b>: where will the ball go?\n\n<img decoding=\"async\" width=\"800\" height=\"484\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2021\/07\/Sans-titre-1-Recupere_Plan-de-travail-1-copie-2-1024x619.png\" alt=\"\" loading=\"lazy\">\n\nAdmittedly, the motion of the ball is <b>more complex <\/b>than in the previous example, but we can still tell that the next move of the ball will be <b>upwards<\/b>. Will the model make the same prediction? Unfortunately not, because the neural network does not think like we do! The model, unlike us, does not take into account the link between the inputs. Yet the <b>inputs are not independent<\/b> of each other, so we must preserve this link between them when we train our neural network.&nbsp;\n\nWe need to overcome <b>2 problems<\/b>:&nbsp;\n<ul>\n \t<li><b>The size<\/b> of our input samples is not fixed.<\/li>\n \t<li>The input data are <b>not linked<\/b> together by the network.&nbsp;<\/li>\n<\/ul>\nLet&#8217;s go back to our ball trajectory study. We will work in a plane, so each position of the ball has two coordinates, which we name x^1,x^2. We want to predict the next location of the ball; the predicted coordinates will be \u0177^1,\u0177^2. 
Let us represent this with a traditional neural network.\n\n<img decoding=\"async\" width=\"800\" height=\"365\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2021\/07\/Sans-titre-1-Recupere-04.png\" alt=\"\" loading=\"lazy\">\n\nMore compactly, if we set (x^1,x^2)=x_t and (\u0177^1,\u0177^2)=\u0177_t, we can draw the diagram below:\n\n<img decoding=\"async\" width=\"800\" height=\"389\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2021\/07\/Sans-titre-1-Recupere-05-1.png\" alt=\"\" loading=\"lazy\">\n\nHere the index t denotes the time step, so x_t holds the coordinates of the ball at time t. Representing the neural network by a function f, we have:\n\nf(x_t)=\u0177_t\n\nAs mentioned earlier, this network only studies the <b>motion of the ball locally<\/b>, one instant at a time, whereas we want to take into account several instants of the ball&#8217;s motion. We can&#8217;t do this with a <b>traditional neural network<\/b>, but with RNNs everything changes, because the concept of recurrence is introduced. Let us observe the figure below. We add an input <b>h_t<\/b>, called the hidden state. This hidden state carries the information used to produce <b>\u0177_t <\/b>and is given as an argument to the next prediction, in addition to that step&#8217;s own input. The network can thus process a whole sequence, linking outputs and inputs <b>without fixing the size of our input samples<\/b>.\n\n<img decoding=\"async\" width=\"800\" height=\"620\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2021\/07\/Sans-titre-1-Recupere-06-1024x793.png\" alt=\"\" loading=\"lazy\">\n\nIf we write this as a formula, we get the following equation:\n\n<img decoding=\"async\" width=\"800\" height=\"450\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2023\/01\/Ajouter-un-titre-1024x576.jpg\" alt=\"\" loading=\"lazy\">\n\nWe can see the concept of <b>recurrence<\/b>. 
To predict the next term, we need the previous information carried by <i><b>h_{t-1}<\/b><\/i>.&nbsp;\n\nThe recurrence is even more obvious in this summary diagram.\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex is-content-justification-center\"><div class=\"wp-block-button \"><a class=\"wp-block-button__link wp-element-button \" href=\"\/en\/courses\/data-ai\/data-scientist\">Learn to use predictive models<\/a><\/div><\/div>\n\n<h3>What are the different applications of RNNs?<\/h3>\n<img decoding=\"async\" width=\"800\" height=\"335\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2021\/07\/Sans-titre-1-Recupere-07-1024x429.png\" alt=\"\" loading=\"lazy\">\n<ul>\n \t<li><b>One to many<\/b>: the RNN receives a single input and returns multiple outputs.&nbsp;The classic example of this mode is image captioning.<\/li>\n<\/ul>\n<ul>\n \t<li><i><b>Many to one<\/b><\/i>: there are <b>several inputs<\/b> and a <b>single output<\/b>. An illustration of this mode is the sentiment analysis of texts, which identifies a feeling from a group of words. Another is determining the missing word that finishes the sentence received as input.&nbsp;<\/li>\n \t<li><i><b>Many to many<\/b><\/i>: finally, we can take <b>several inputs<\/b> and get <b>several outputs<\/b>. The number of inputs and outputs is not necessarily the same. 
Text translation is one example, but we can be more ambitious and try to complete a piece of music from its beginning.<\/li>\n<\/ul>\nGood, we have seen <b>how an RNN works <\/b>and its <b>multiple applications<\/b>, but is it perfect?&nbsp;\n\nUnfortunately, it has a major drawback, called <b>short-term memory<\/b>, of which we will see an example in <b>NLP <\/b>(<i>Natural Language Processing<\/i>).\n<h3>Is the RNN a goldfish?<\/h3>\nTake the case of sentence completion.&nbsp;\n<p style=\"padding-left: 25px\"><em>&#8220;<strong>I like sushi, I\u2019m going to eat at the \u2026<\/strong>&#8221;<\/em><\/p>\nA basic RNN struggles to predict <b>Japan<\/b>, because it does not retain the word <i><b>sushi<\/b><\/i>. To predict Japan, the RNN needs a stronger memory. We can achieve that by making the neurons more complex.&nbsp; In particular, we will look at the <b>LSTM<\/b> (<i>Long Short-Term Memory<\/i>). In addition to the conventional hidden state <i>h_t<\/i>, we add a second state called<i> c_t.<\/i> Here,<i> h_t<\/i> represents the short-term memory of the neuron and<i> c_t <\/i>the long-term memory.\n\nWe will not go into the technical details of this intimidating architecture. The main point is that we have a much more complex cell, which helps address the memory problem. 
With the LSTM, h_t and c_t pass through 4 gates.\n<ul>\n \t<li>The first gate <b>eliminates<\/b> unnecessary information: it is the forget gate.<\/li>\n \t<li>The second gate <b>stores <\/b>the new information: it is the store gate (usually called the input gate).<\/li>\n \t<li>The third gate <b>updates<\/b> the cell state c_t with the results of the forget gate and the store gate: it is the update gate.<\/li>\n \t<li>Finally, the last gate (output gate) <b>gives us <i>y_t<\/i> and<i> h_t<\/i><\/b>.<\/li>\n<\/ul>\nThis long process allows us to <b>control the information<\/b> we keep and transmit over time.\n\nThe RNN learns what to keep and what to forget during training. The <b>LSTM<\/b> (Long Short-Term Memory) is not the only option: we can also use the <b>GRU<\/b> (Gated Recurrent Unit); only the architecture of the cell changes.\n\nLet us now summarize what we have seen. <b>RNNs are a particular type of neural network<\/b> that can process data that are not independent and have no fixed size. However, standard RNNs are quite limited by the <b>short memory problem<\/b>, which we can address by using more complex cells like LSTM or GRU.\n\nIn addition, we can draw a parallel with another type of neural network: <b>convolutional neural networks<\/b> (CNN). Indeed, CNNs are known for sharing information across space, while RNNs are known for sharing information across time. 
Finally, if you want to put RNNs into practice, don\u2019t hesitate to <b>join one of our Data Scientist training courses<\/b>.\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex is-content-justification-center\"><div class=\"wp-block-button \"><a class=\"wp-block-button__link wp-element-button \" href=\"\/en\/courses\/data-ai\/data-scientist\">Know more about our Data Scientist course<\/a><\/div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>If you are a follower of our blog, you already know what a neural network is (if not, feel free to read this article first) but what does the adjective recurrent bring to this model? In this article, we will see how recurrent neural networks, called RNN, have become a classic model in deep learning. [&hellip;]<\/p>\n","protected":false},"author":79,"featured_media":80135,"comment_status":"open","ping_status":"open","sticky":false,"template":"elementor_theme","format":"standard","meta":{"_acf_changed":false,"editor_notices":[],"footnotes":""},"categories":[2433],"class_list":["post-166044","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-ai"],"acf":[],"_links":{"self":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/166044","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/users\/79"}],"replies":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/comments?post=166044"}],"version-history":[{"count":1,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/166044\/revisions"}],"predecessor-version":[{"id":206461,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/166044\/revisions\/206461"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/media\/80135"}],"wp:attachment":[{"hre
f":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/media?parent=166044"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/categories?post=166044"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}