10 LSTM, Peephole LSTM, GRU

9/25/22, 1:57 PM What is LSTM , peephole LSTM and GRU?
| by Jaimin Mungalpara | Nerd For Tech | Medium
Published in Nerd For Tech
Jaimin Mungalpara
Feb 5, 2021 · 5 min read
What is LSTM, peephole LSTM and GRU?

Long Short-Term Memory (LSTM) was introduced by Hochreiter & Schmidhuber
(1997) and has since been refined by many researchers. An LSTM is a special kind of
RNN that can remember long-term dependencies. LSTMs are specifically designed to
avoid the problems that simple RNNs face on long sequences, chiefly vanishing
gradients. You can learn about RNNs in my previous article, Understanding RNN.
The LSTM's architecture is what makes it good at remembering long-term
dependencies. A simple RNN has a very simple repeating module, such as a single
tanh or ReLU layer, as represented in the figure below.
The repeating module in a standard RNN contains a single layer. Image taken from https://colah.github.io/
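As a rough illustration of that single repeating layer, the sketch below applies one tanh layer to the concatenation of the previous hidden state and the current input. The names `rnn_step`, `W` and `b` and the toy weights are assumptions made for this example, not taken from the article. Plain Python lists are used to keep it dependency-free.

```python
import math

def rnn_step(h_prev, x_t, W, b):
    """One step of a simple tanh RNN: h_t = tanh(W . [h_prev, x_t] + b)."""
    hx = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    return [math.tanh(sum(w * v for w, v in zip(row, hx)) + b_i)
            for row, b_i in zip(W, b)]

# Toy sizes: hidden size 2, input size 1 (weights are arbitrary).
W = [[0.5, -0.3, 0.8],
     [0.1, 0.4, -0.6]]
b = [0.0, 0.1]

h = [0.0, 0.0]
for x in ([1.0], [0.5], [-1.0]):  # the same module is reused at every time step
    h = rnn_step(h, x, W, b)
print(h)
```

The loop is the "chain": the hidden state returned by one step is fed into the next.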
An LSTM has the same kind of chain structure, but its repeating module is not a
single simple neural network layer. Instead, each repeating module contains
four interacting neural network layers.
https://medium.com/nerd-for-tech/what-is-lstm-peephole-lstm-and-gru-77470d84954b
LSTM Gates
The notation in the figure above can be read as follows. Yellow boxes represent
neural network layers. Pink dots are pointwise operations, such as vector
multiplication or addition. Merging lines denote concatenation, and a splitting
line means the same content is copied to different locations.
Image taken from https://colah.github.io/
There are three gates in the LSTM architecture:
1. Forget Gate
2. Input Gate
3. Output Gate
The Core Idea Behind LSTM

The top of the diagram contains the cell state, a path along which information can
flow easily with only minor linear operations. Gates can add information to the
cell state or remove information from it.
The gates are composed of a sigmoid neural network layer and a pointwise
multiplication operation. The sigmoid outputs a value between 0 and 1: if it is 0,
nothing is passed through, and if it is 1, everything is passed through.
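A minimal sketch of this gating idea, assuming a hypothetical helper `apply_gate` that is not part of the article: each value is scaled by a sigmoid output, so a gate near 0 blocks it and a gate near 1 lets it through.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def apply_gate(gate_logits, values):
    """Scale each value by a sigmoid gate in (0, 1), element by element."""
    return [sigmoid(g) * v for g, v in zip(gate_logits, values)]

values = [2.0, -1.0, 4.0]
# Very negative logit -> gate near 0 (blocked); very positive -> near 1 (passed).
gated = apply_gate([-20.0, 0.0, 20.0], values)
print(gated)
```

In practice the gate rarely sits exactly at 0 or 1, so most values are passed partially, as the middle element shows.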
Forget Gate
This is the first step in the LSTM network: the forget gate decides which
information is kept in the cell state and which is thrown away.
It takes ht-1 and xt as input and outputs a number between 0 and 1 for each
element of Ct-1. This output is then pointwise multiplied with Ct-1, which decides
how much of each element is passed through. For example, when we are working with
context-based data and the context changes, the forget gate discards the
information that is no longer relevant to the context.
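In the usual notation, this step can be written as below. Here ft is the forget gate output, σ is the sigmoid, and [ht-1, xt] is the concatenation of the previous hidden state and the current input. The weight matrix W_f and bias b_f are the gate's learned parameters; the article does not name them, so these symbols are assumptions.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \\
\text{kept part of the old state} &= f_t \odot C_{t-1}
\end{aligned}
```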
Input Gate
This step decides which new information is going to be stored in the cell state.
The operation is done in two steps. First, a sigmoid neural network layer decides
which values we will update. Then a tanh layer creates a vector of new candidate
values, C̃t, that could be added to the state.
Now we have to update the old cell state Ct-1 to the new cell state Ct.
We multiply the old state Ct-1 by ft, forgetting the things that are no longer
required. Then we add it * C̃t to bring in the new context we need to remember.
This is where new information is added to the context, to be remembered and passed
to the next stage.
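Written out, the two sub-steps and the cell state update look like this. The parameter names W_i, b_i, W_C and b_C are assumed here for illustration, and ⊙ denotes pointwise multiplication.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
\end{aligned}
```

Because the update is a sum of two gated terms, the old state can pass through almost unchanged whenever ft is close to 1 and it is close to 0.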
Output Gate
At this stage, we have to decide what we are going to output. The output is based
on our cell state, but filtered: first we run a sigmoid neural network layer that
decides what parts of the cell state we’re going to output. Then we put the cell
state through tanh (to push the values to be between −1 and 1) and multiply it by
the output of the sigmoid gate, so that only the selected parts are output.
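Putting the three gates together, one LSTM time step can be sketched as follows. This is a toy illustration in plain Python under stated assumptions, not the author's or any library's implementation. The function names, the parameter dictionary `p`, and the weight values are all made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, b, v, act):
    """Apply act(W . v + b) with W stored as a list of rows."""
    return [act(sum(w * x for w, x in zip(row, v)) + b_i)
            for row, b_i in zip(W, b)]

def lstm_step(h_prev, c_prev, x_t, p):
    """One LSTM step; p maps 'f', 'i', 'c', 'o' to (W, b) pairs."""
    hx = h_prev + x_t                                     # [h_{t-1}, x_t]
    f = layer(*p["f"], hx, sigmoid)                       # forget gate
    i = layer(*p["i"], hx, sigmoid)                       # input gate
    c_tilde = layer(*p["c"], hx, math.tanh)               # candidate values
    c = [f_k * c_k + i_k * g_k                            # new cell state
         for f_k, c_k, i_k, g_k in zip(f, c_prev, i, c_tilde)]
    o = layer(*p["o"], hx, sigmoid)                       # output gate
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]  # new hidden state
    return h, c

# Toy parameters (arbitrary values): hidden size 2, input size 1.
W = [[0.1, -0.2, 0.3],
     [0.0, 0.4, -0.1]]
b = [0.0, 0.0]
params = {name: (W, b) for name in "fico"}  # shared only to keep the demo short

h, c = [0.0, 0.0], [0.0, 0.0]
for x in ([1.0], [0.5]):
    h, c = lstm_step(h, c, x, params)
print(h, c)
```

A real network would learn a separate weight matrix and bias for each of the four layers. They are shared here only to keep the example short.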
Here we can see the whole idea of the LSTM in one GIF.
Peephole Architecture
So far we have seen the simple LSTM network, but this architecture has been
modified over time in many research papers. One popular LSTM variant,
introduced by Gers & Schmidhuber (2000), adds “peephole connections.” This
means that we let the gate layers look at the cell state.
With peephole connections, all the gates receive the cell state as an additional
input.
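One common way to write the peephole variant is shown below, following the formulation in colah's blog. The forget and input gates see the previous cell state, while the output gate sees the updated one. Treat this as a sketch: the parameter names are assumed, and the original paper and some implementations differ in details, for example by giving each unit its own (diagonal) peephole weight.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [C_{t-1}, h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i \cdot [C_{t-1}, h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o \cdot [C_t, h_{t-1}, x_t] + b_o\right)
\end{aligned}
```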
GRU
Another variation on the LSTM is the Gated Recurrent Unit, or GRU, introduced by
Cho, et al. (2014). It combines the forget and input gates into a single update
gate. It also merges the cell state and hidden state. The resulting model is
simpler than the traditional LSTM and has been growing in popularity.
[Figure: the entire GRU operation]
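The GRU update can be sketched in code as follows. This is a toy illustration under stated assumptions: the function names and weights are made up, biases are omitted, and the new state is formed as (1 - zt) * ht-1 + zt * h̃t, the convention used in colah's blog. Some references swap the roles of zt and 1 - zt.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, v, act):
    """Apply act(W . v) with W stored as a list of rows (biases omitted)."""
    return [act(sum(w * x for w, x in zip(row, v))) for row in W]

def gru_step(h_prev, x_t, W_z, W_r, W_h):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    hx = h_prev + x_t                                 # [h_{t-1}, x_t]
    z = layer(W_z, hx, sigmoid)                       # update gate
    r = layer(W_r, hx, sigmoid)                       # reset gate
    rh = [r_k * h_k for r_k, h_k in zip(r, h_prev)]   # reset part of old state
    h_tilde = layer(W_h, rh + x_t, math.tanh)         # candidate state
    return [(1.0 - z_k) * h_k + z_k * g_k             # blend old and new
            for z_k, h_k, g_k in zip(z, h_prev, h_tilde)]

# Toy parameters (arbitrary values): hidden size 2, input size 1.
W = [[0.2, -0.1, 0.5],
     [0.3, 0.1, -0.4]]
h = [0.0, 0.0]
for x in ([1.0], [-0.5]):
    h = gru_step(h, x, W, W, W)  # weights shared only to keep the demo short
print(h)
```

Compared with the LSTM step, there is no separate cell state and one fewer gate: the single update gate decides both how much of the old state to keep and how much of the candidate to add.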
Here we looked at a few LSTM variants, but there are other variants as well. The
LSTM already overcomes many of the drawbacks of the simple RNN; still, researchers
have taken another step beyond it, which is called attention.
References
Understanding LSTM Networks, colah’s blog. https://colah.github.io/
https://towardsdatascience.com/animated-rnn-lstm-and-gru-ef124d06cf45