10 LSTM, Peephole LSTM, GRU


9/25/22, 1:57 PM What is LSTM , peephole LSTM and GRU?

| by Jaimin Mungalpara | Nerd For Tech | Medium


Published in Nerd For Tech

Jaimin Mungalpara

Feb 5, 2021 · 5 min read


What is LSTM, peephole LSTM and GRU?


Long Short Term Memory (LSTM) was introduced by Hochreiter & Schmidhuber
(1997) and was later refined by many researchers. An LSTM is a special kind of RNN
that can remember long-term dependencies. LSTMs are specifically designed to avoid
the problems faced by plain RNNs. You can learn about RNNs in my previous article,
Understanding RNN. The LSTM architecture is what makes it strong at remembering
long-term dependencies. A simple RNN has a very simple repeating module, such as a
single tanh or ReLU layer, which is shown in the figure below.

The repeating module in a standard RNN contains a single layer. Image taken from https://colah.github.io/
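To make the single repeating layer concrete, here is a minimal sketch of one plain RNN step in pure Python. The function name `rnn_step`, the weight names `W_x`, `W_h`, `b`, and all the numbers are illustrative assumptions, not something taken from the figure.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of a plain RNN: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t

# Toy sizes: 2 inputs, 2 hidden units. Weights are arbitrary illustrative values.
W_x = [[0.5, -0.3], [0.1, 0.8]]
W_h = [[0.2, 0.0], [0.0, 0.2]]
b = [0.0, 0.0]

# The same module is applied at every time step, carrying h forward.
h = [0.0, 0.0]
for x_t in ([1.0, 0.0], [0.0, 1.0]):
    h = rnn_step(x_t, h, W_x, W_h, b)
```

The only thing carried from one step to the next is `h`, which is why a plain RNN struggles to keep information around for many steps.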

Well, an LSTM also has the same kind of chain structure, but the repeating module
does not have a simple neural network architecture. Each repeating module contains
4 neural network layers.


https://medium.com/nerd-for-tech/what-is-lstm-peephole-lstm-and-gru-77470d84954b

LSTM Gates

The notation in the figure above can be explained as follows. Yellow boxes
represent neural network layers. Pink dots are pointwise operations, such as
vector multiplication or addition. Merging lines denote concatenation, and a
splitting line means the same content is copied to different locations.

Image taken from https://colah.github.io/

There are 3 gates in the LSTM architecture:

1. Forget Gate

2. Input Gate

3. Output Gate

The Core Idea Behind LSTM



The top of the diagram contains the cell state, a path along which information can
pass easily with only some minor linear operations. Gates can add or remove
information in the cell state.

The gates are composed of a sigmoid neural network layer and a pointwise
multiplication operation. The sigmoid outputs a value between 0 and 1: if it is 0,
nothing is passed through, and if it is 1, everything is passed through.
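The gating idea can be sketched in a few lines of Python. The helper names `sigmoid` and `apply_gate` and the toy numbers are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def apply_gate(gate_logits, values):
    """Scale each value by sigmoid(logit), a number strictly between 0 and 1."""
    return [sigmoid(z) * v for z, v in zip(gate_logits, values)]

values = [2.0, 2.0, 2.0]
logits = [-20.0, 0.0, 20.0]   # strongly closed, half open, strongly open
gated = apply_gate(logits, values)
```

With these logits, the first value is almost entirely blocked, the second is halved, and the third passes through almost unchanged. In practice the sigmoid never reaches exactly 0 or 1, so a gate is a soft switch rather than a hard one.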

Forget Gate

This is the first step in the LSTM network: deciding which information is allowed
to pass through the cell state. This decision is made by the forget gate.

It takes ht-1 and xt as input and outputs a number between 0 and 1 for each entry
of Ct-1. This output is pointwise multiplied with Ct-1, which decides how much of
each entry is passed through. If we are working with context-based data and the
context changes, the forget gate discards the information that is no longer
relevant to the context.
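As a rough sketch of the forget gate, the code below uses the common formulation f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f). The names `forget_gate`, `dense`, `W_f`, `b_f` and the hand-picked weights are assumptions chosen so the effect is easy to see; a trained network would learn its own weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(W, v, b):
    """Affine map W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def forget_gate(h_prev, x_t, W_f, b_f):
    """f_t = sigmoid(W_f [h_prev, x_t] + b_f), one value per cell-state entry."""
    concat = h_prev + x_t            # list concatenation plays the role of [h_{t-1}, x_t]
    return [sigmoid(z) for z in dense(W_f, concat, b_f)]

h_prev, x_t = [0.1, -0.2], [1.0]
C_prev = [5.0, -3.0]
W_f = [[0.0, 0.0, 10.0],     # large positive weight on x_t: keep cell entry 0
       [0.0, 0.0, -10.0]]    # large negative weight on x_t: forget cell entry 1
b_f = [0.0, 0.0]

f_t = forget_gate(h_prev, x_t, W_f, b_f)
kept = [f * c for f, c in zip(f_t, C_prev)]   # pointwise multiply with C_{t-1}
```

Here the first cell-state entry survives almost untouched, while the second is pushed close to zero.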

Input Gate


In this step we decide which information is going to be stored in the cell state.
This operation is done in two steps. First, a sigmoid neural network layer decides
which values we will update. Next, a tanh layer creates a vector of new candidate
values, C̃t, that could be added to the state.

Now we have to update the old cell state Ct-1 to the new cell state Ct.

We multiply the old state Ct-1 by ft, forgetting the things which are no longer
required. Then we add it*C̃t to update the context which we need to remember. This
adds the new information to the context to be remembered and passed to the next stage.
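The two-step update can be sketched as follows, assuming the usual formulation C_t = f_t * C_{t-1} + i_t * C̃_t with pointwise products. The function name `update_cell_state` and the weight names `W_i`, `b_i`, `W_c`, `b_c` are hypothetical, and the toy weights are chosen so the result can be checked by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(W, v, b):
    """Affine map W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def update_cell_state(C_prev, f_t, h_prev, x_t, W_i, b_i, W_c, b_c):
    """C_t = f_t * C_prev + i_t * C_tilde (all products are pointwise)."""
    concat = h_prev + x_t
    i_t = [sigmoid(z) for z in dense(W_i, concat, b_i)]         # which values to update
    C_tilde = [math.tanh(z) for z in dense(W_c, concat, b_c)]   # candidate values
    return [f * c + i * g for f, c, i, g in zip(f_t, C_prev, i_t, C_tilde)]

# Toy example: one cell-state entry, one hidden unit, one input.
C_prev, f_t = [2.0], [0.5]
h_prev, x_t = [0.0], [1.0]
W_i, b_i = [[0.0, 0.0]], [0.0]   # i_t = sigmoid(0) = 0.5
W_c, b_c = [[0.0, 1.0]], [0.0]   # C_tilde = tanh(x_t) = tanh(1)

C_t = update_cell_state(C_prev, f_t, h_prev, x_t, W_i, b_i, W_c, b_c)
```

In this toy case the new state is half of the old state plus half of the candidate, that is 0.5 * 2.0 + 0.5 * tanh(1).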

Output Gate

At this stage, we have to decide what we are going to send to the output. This
will be based on our cell state, but in a filtered form. First, we run a sigmoid
neural network layer which decides what parts of the cell state we’re going to
output. Then, we put the cell state through tanh (to push the values to be between
−1 and 1) and multiply it by the output of the sigmoid gate, so that only the parts
we decided on are output.
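A minimal sketch of the output step, assuming the common formulation o_t = sigmoid(W_o · [h_{t-1}, x_t] + b_o) and h_t = o_t * tanh(C_t). The names `output_step`, `W_o`, `b_o` and the weights are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(W, v, b):
    """Affine map W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def output_step(C_t, h_prev, x_t, W_o, b_o):
    """h_t = o_t * tanh(C_t), where o_t is a sigmoid gate on [h_prev, x_t]."""
    concat = h_prev + x_t
    o_t = [sigmoid(z) for z in dense(W_o, concat, b_o)]
    return [o * math.tanh(c) for o, c in zip(o_t, C_t)]

C_t = [3.0, -3.0]
h_prev, x_t = [0.0, 0.0], [1.0]
W_o = [[0.0, 0.0, 10.0],     # gate open for cell entry 0
       [0.0, 0.0, -10.0]]    # gate closed for cell entry 1
b_o = [0.0, 0.0]

h_t = output_step(C_t, h_prev, x_t, W_o, b_o)
```

The first hidden value comes out close to tanh(3), while the second is suppressed to nearly zero even though its cell-state entry is large in magnitude.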


Here we can see the whole idea of the LSTM in one gif.
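For reference, the three gates and the state update can also be written together as one set of equations. This is the commonly used formulation, as in colah's blog; the symbols W and b for the weights and biases and ⊙ for the pointwise product are notation assumed here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
```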

Peephole Architecture

Until now we have seen the simple LSTM network, but this architecture has been
modified over time in many research papers. One popular LSTM variant,
introduced by Gers & Schmidhuber (2000), adds “peephole connections.” This
means that we let the gate layers look at the cell state.


With peephole connections, all the gates receive the cell state as an additional
input.
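A hedged sketch of one peephole LSTM step is given below. It follows the concatenation form in which the forget and input gates look at the previous cell state and the output gate looks at the updated one. The function name, the parameter dictionary `p`, and its keys are hypothetical, and the exact layout of the peephole weights is an assumption; other formulations restrict them further.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(W, v, b):
    """Affine map W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def peephole_lstm_step(x_t, h_prev, C_prev, p):
    """One peephole LSTM step; p is a dict of hypothetical weights and biases."""
    hx = h_prev + x_t
    # Forget and input gates peek at the previous cell state C_prev.
    f_t = [sigmoid(z) for z in dense(p["W_f"], C_prev + hx, p["b_f"])]
    i_t = [sigmoid(z) for z in dense(p["W_i"], C_prev + hx, p["b_i"])]
    C_tilde = [math.tanh(z) for z in dense(p["W_c"], hx, p["b_c"])]
    C_t = [f * c + i * g for f, c, i, g in zip(f_t, C_prev, i_t, C_tilde)]
    # The output gate peeks at the freshly updated cell state C_t.
    o_t = [sigmoid(z) for z in dense(p["W_o"], C_t + hx, p["b_o"])]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, C_t)]
    return h_t, C_t

# One cell, one hidden unit, one input. All weights are zero except the
# forget-gate peephole weight (column 0 of W_f, which multiplies C_prev).
p = {
    "W_f": [[1.0, 0.0, 0.0]], "b_f": [0.0],
    "W_i": [[0.0, 0.0, 0.0]], "b_i": [0.0],
    "W_c": [[0.0, 0.0]],      "b_c": [0.0],
    "W_o": [[0.0, 0.0, 0.0]], "b_o": [0.0],
}
h_small, C_small = peephole_lstm_step([0.0], [0.0], [0.1], p)
h_large, C_large = peephole_lstm_step([0.0], [0.0], [4.0], p)
```

Because the forget gate can see the cell state here, the larger state is retained at a higher rate than the smaller one, even though the input and hidden state are identical in both calls.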

GRU

Another variation on the LSTM is the Gated Recurrent Unit, or GRU, introduced by
Cho, et al. (2014). It combines the forget and input gates into a single update
gate, which is new in this architecture. It also merges the cell state and hidden
state. The resulting model is simpler than the traditional LSTM and has been
growing in popularity.

The entire GRU operation can be seen in the figure.
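As a rough sketch, one GRU step can be written in Python as below. It assumes the formulation h_t = (1 - z_t) * h_{t-1} + z_t * h̃_t, where z_t is the update gate and r_t is a reset gate applied to the previous hidden state; some libraries write the last line with z_t and 1 - z_t swapped. The function name `gru_step`, the dictionary `p`, and its keys are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(W, v, b):
    """Affine map W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def gru_step(x_t, h_prev, p):
    """One GRU step; there is no separate cell state, only the hidden state."""
    hx = h_prev + x_t
    z_t = [sigmoid(v) for v in dense(p["W_z"], hx, p["b_z"])]   # update gate
    r_t = [sigmoid(v) for v in dense(p["W_r"], hx, p["b_r"])]   # reset gate
    reset_h = [r * h for r, h in zip(r_t, h_prev)]
    h_tilde = [math.tanh(v) for v in dense(p["W_h"], reset_h + x_t, p["b_h"])]
    # A single gate z_t decides both how much old state to drop and how much
    # new candidate to add.
    return [(1.0 - z) * h + z * g for z, h, g in zip(z_t, h_prev, h_tilde)]

# Toy example: one hidden unit, one input.
p = {
    "W_z": [[0.0, 0.0]], "b_z": [0.0],   # z_t = sigmoid(0) = 0.5
    "W_r": [[0.0, 0.0]], "b_r": [0.0],   # r_t = sigmoid(0) = 0.5
    "W_h": [[0.0, 1.0]], "b_h": [0.0],   # h_tilde = tanh(x_t)
}
h_t = gru_step([1.0], [0.4], p)
```

In this toy case the new hidden state is half of the old one plus half of the candidate, 0.5 * 0.4 + 0.5 * tanh(1), which shows how the single update gate plays the role of both the forget and input gates.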


Here, we looked at some LSTM variants, but there are other variants as well. LSTM
already addresses the main drawbacks of the simple RNN; still, researchers have
looked for a further step, which is called attention.

References

Understanding LSTM Networks, colah’s blog: colah.github.io
https://towardsdatascience.com/animated-rnn-lstm-and-gru-ef124d06cf45
