
Recurrent neural networks

Each of the three types of neural networks (artificial, convolutional, and recurrent) is used to solve supervised machine learning problems.
Index
• The types of problems solved by recurrent neural networks
• The relationships between the different parts of the brain and the different neural networks
• The composition of a recurrent neural network and how each hidden layer can be used to help train the hidden layer for the next observation in the data set
Types of problems solved by recurrent neural networks
• RNNs are deep learning models that are typically used to solve time series problems.
• They are used in real-world applications like self-driving cars and high-frequency trading algorithms.
Mapping Neural Networks to Parts of the Human Brain
Neural networks were designed to mimic the human brain in two ways: 1) construction (both the brain and neural networks are composed of neurons), and 2) function (both are used to make decisions and predictions).
The most important characteristic of the brain that neural networks mimic is the ability to learn. The ability of a neural network to change its weights through each epoch of its training stage is similar to the long-term memory seen in humans (and other animals).

The three main parts of the brain are:


• The cerebrum
• The brainstem
• The cerebellum

Arguably the most important part of the brain is the cerebrum. It contains four lobes:
• The frontal lobe
• The parietal lobe
• The temporal lobe
• The occipital lobe

The temporal lobe is the part of the brain associated with long-term memory. Since the artificial neural network has the property of long-term memory, many researchers have compared artificial neural networks with the temporal lobe of the human brain.
Similarly, the occipital lobe is the component of the brain that powers our vision. Since convolutional neural networks are typically used to solve computer vision problems, you could say that they are equivalent to the occipital lobe. Recurrent neural networks are used to solve time series problems; they can learn from events that happened in recent iterations of their training stage. In this way, they are often compared to the frontal lobe of the brain, which powers our short-term memory.

To summarize, researchers often pair each of the three neural nets with the following parts of the brain:
• Artificial neural networks -> the temporal lobe
• Convolutional neural networks -> the occipital lobe
• Recurrent neural networks -> the frontal lobe
The Composition of a Recurrent Neural Network
Let’s now discuss the composition of a recurrent neural network. First, recall that the composition of a basic neural network has the following appearance:
The first modification that needs to be made to this neural network is that
each layer of the network should be squashed together, like this:
Then, three more modifications need to be made:
• The neural network’s neuron synapses need to be simplified to a single line
• The entire neural network needs to be rotated 90 degrees
• A loop needs to be generated around the hidden layer of the neural net
The neural network will now have the following appearance:

The line that circles the hidden layer of the recurrent neural network is called the temporal loop. It indicates that the hidden layer not only generates an output, but that this output is fed back as an input into the same layer.
A visualization is helpful in understanding this. As you can see in
the following image, the hidden layer used on a specific
observation of a data set is not only used to generate an
output for that observation, but it is also used to train the
hidden layer of the next observation.

This property of one observation helping to train the next observation is why recurrent neural networks are so useful for solving time series analysis problems.
What is Recurrent Neural Network (RNN)?
A Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step. In traditional neural networks, all the inputs and outputs are independent of each other. However, in cases where we need to predict the next word of a sentence, the previous words are required, and hence there is a need to remember them. RNNs solve this issue with the help of a hidden layer. The main and most important feature of an RNN is its hidden state, which remembers some information about a sequence. This state is also referred to as the memory state, since it remembers previous inputs to the network. The RNN uses the same parameters for each input, as it performs the same task on all inputs and hidden layers to produce the output. This reduces parameter complexity, unlike other neural networks.
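To make this concrete, here is a minimal sketch of the idea in NumPy. All of the weights and the toy sequence below are invented for illustration; the point is that the same two weight matrices are reused at every step, and the hidden state h carries information forward through the sequence.

```python
# Minimal RNN forward pass: shared weights, one hidden state threaded
# through the whole sequence. Shapes and values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(4, 3))   # input -> hidden (shared across steps)
W_hh = rng.normal(scale=0.1, size=(4, 4))   # hidden -> hidden (shared across steps)

def step(h_prev, x):
    # The same parameters are reused at every time step.
    return np.tanh(W_xh @ x + W_hh @ h_prev)

xs = rng.normal(size=(5, 3))   # a toy sequence of 5 inputs, each of size 3
h = np.zeros(4)                # the hidden ("memory") state
for x in xs:
    h = step(h, x)             # h now carries information from earlier inputs
print(h)                       # the final state summarizes the whole sequence
```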

How does an RNN differ from a feedforward neural network?

Artificial neural networks that do not have looping nodes are called feedforward neural networks. Because all information is only passed forward, this kind of neural network is also referred to as a multi-layer neural network.
• Information moves unidirectionally from the input layer, through any hidden layers, to the output layer in a feedforward neural network. These networks are appropriate for tasks such as image classification, where input and output are independent. However, their inability to retain previous inputs makes them less useful for sequential data analysis.
Recurrent vs. feedforward networks
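The contrast can be shown in a few lines of toy NumPy code (shapes and weights invented for illustration): the feedforward output at each step depends only on that step's input, while the recurrent output also depends on everything that came before it.

```python
# Feedforward vs. recurrent, side by side, on the same toy sequence.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4, 3))      # shared input weights
W_hh = rng.normal(scale=0.1, size=(4, 4))   # recurrent weights
xs = rng.normal(size=(6, 3))                # 6 toy inputs of size 3

# Feedforward: the output at step t depends only on x_t.
ff_outputs = [np.tanh(W @ x) for x in xs]

# Recurrent: the output at step t depends on x_t AND all earlier inputs.
h = np.zeros(4)
rnn_outputs = []
for x in xs:
    h = np.tanh(W @ x + W_hh @ h)
    rnn_outputs.append(h)
```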

Recurrent Neuron and RNN Unfolding

The fundamental processing unit in a Recurrent Neural Network (RNN) is the recurrent unit (sometimes informally called a recurrent neuron). This unit has the ability to maintain a hidden state, allowing the network to capture sequential dependencies by remembering previous inputs while processing. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) variants improve the RNN’s ability to handle long-term dependencies.
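As a hedged sketch of such gated units, the snippet below uses PyTorch's built-in LSTMCell (this assumes PyTorch is installed; all sizes are arbitrary illustrative choices). The cell threads both a hidden state and an extra cell state through the sequence.

```python
# Stepping an LSTM cell through a toy sequence with PyTorch.
import torch
import torch.nn as nn

cell = nn.LSTMCell(input_size=3, hidden_size=4)  # a gated recurrent unit
x_seq = torch.randn(5, 1, 3)                     # 5 time steps, batch of 1
h = torch.zeros(1, 4)                            # hidden state
c = torch.zeros(1, 4)                            # LSTM's extra cell state

for x in x_seq:
    # The gates inside the cell decide what to remember and what to forget.
    h, c = cell(x, (h, c))
```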
Types Of RNN
There are four types of RNNs, based on the number of inputs and outputs in the network.
• One to One
This type of RNN behaves the same as any simple neural network; it is also known as a vanilla neural network. There is only one input and one output.

• One to Many
In this type of RNN, there is one input and many outputs associated with it. One of the most common examples of this network is image captioning, where, given an image, we predict a sentence consisting of multiple words.

• Many to One
In this type of network, many inputs are fed to the network at several states, generating only one output. This type of network is used for problems like sentiment analysis, where we give multiple words as input and predict only the sentiment of the sentence as output (a code sketch of this pattern follows this list).

• Many to Many
In this type of neural network, there are multiple inputs and multiple outputs. One example of this problem is language translation, where we provide multiple words from one language as input and predict multiple words in the second language as output.
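Here is a hedged sketch of the many-to-one pattern in NumPy, with invented shapes standing in for word embeddings: the network reads the whole sequence and produces a single output from the final hidden state, as in sentiment analysis.

```python
# Many-to-one: consume a sequence, emit one output at the end.
import numpy as np

rng = np.random.default_rng(2)
W_xh = rng.normal(scale=0.1, size=(8, 5))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(8, 8))   # hidden -> hidden
W_hy = rng.normal(scale=0.1, size=(1, 8))   # hidden -> single output

words = rng.normal(size=(7, 5))   # pretend: 7 word embeddings of size 5
h = np.zeros(8)
for x in words:                   # many inputs...
    h = np.tanh(W_xh @ x + W_hh @ h)
score = W_hy @ h                  # ...one output (e.g., a sentiment score)
```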
Recurrent Neural Network Architecture
RNNs have the same input and output architecture as any other deep neural architecture. However, differences arise in the way information flows from input to output. Unlike deep neural networks, where each dense layer has its own weight matrix, in an RNN the weights are shared across the network.
The RNN calculates a hidden state hi for every input xi.
How does an RNN work?
The Recurrent Neural Network consists of multiple fixed activation function units, one for each time step. Each unit has an
internal state which is called the hidden state of the unit. This hidden state signifies the past knowledge that the network
currently holds at a given time step. This hidden state is updated at every time step to signify the change in the knowledge of
the network about the past. The hidden state is updated using the following recurrence relation:

The formula for calculating the current state:


ht = f(ht-1, xt)
where ht is the current state, ht-1 is the previous state, and xt is the input at the current step.

Formula for applying the activation function (tanh):

ht = tanh(Whh · ht-1 + Wxh · xt)
where Whh is the weight at the recurrent neuron and Wxh is the weight at the input neuron.

The formula for calculating output:


Yt = Why · ht
where Yt is the output and Why is the weight at the output layer.
These parameters are updated using backpropagation.
However, since an RNN works on sequential data, an adapted form of backpropagation is used, known as backpropagation through time (BPTT).
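The three formulas above transcribe almost directly into code. The NumPy sketch below (with illustrative shapes and random weights) uses Wxh, Whh, and Why in the same roles as in the equations.

```python
# The RNN recurrence, written with the same symbols as the formulas above.
import numpy as np

rng = np.random.default_rng(3)
W_xh = rng.normal(scale=0.1, size=(4, 3))   # Wxh: weight at the input neuron
W_hh = rng.normal(scale=0.1, size=(4, 4))   # Whh: weight at the recurrent neuron
W_hy = rng.normal(scale=0.1, size=(2, 4))   # Why: weight at the output layer

x_seq = rng.normal(size=(6, 3))
h = np.zeros(4)                              # h_0
for x_t in x_seq:
    h = np.tanh(W_hh @ h + W_xh @ x_t)       # ht = tanh(Whh·ht-1 + Wxh·xt)
    y = W_hy @ h                             # Yt = Why·ht
```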

Issues of Standard RNNs


• Vanishing Gradient: Text generation, machine translation, and stock market prediction are just a few examples of the time-dependent, sequential problems that can be modelled with recurrent neural networks. In practice, though, the vanishing gradient problem makes RNNs difficult to train: as gradients are propagated back through many time steps, they shrink exponentially, and the network struggles to learn long-range dependencies.
• Exploding Gradient: An exploding gradient occurs when, during training, the gradient grows exponentially rather than decaying. Large error gradients build up and lead to very large updates to the neural network model weights, which is the source of this issue. A numeric illustration of both effects follows.
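The following toy NumPy calculation illustrates (but does not prove) both problems: a backpropagated gradient picks up a factor of roughly Whh at every time step, so over many steps its norm either collapses toward zero or blows up, depending on the scale of Whh. The scales 0.3 and 1.7 below are arbitrary.

```python
# Repeated multiplication by the recurrent weights, as happens during BPTT.
import numpy as np

rng = np.random.default_rng(4)
W = rng.normal(size=(4, 4))
W_small = 0.3 * W / np.linalg.norm(W, 2)   # spectral norm 0.3 -> vanishing
W_big = 1.7 * W / np.linalg.norm(W, 2)     # spectral norm 1.7 -> exploding

g_small = np.ones(4)
g_big = np.ones(4)
for t in range(50):                        # gradient flowing back 50 steps
    g_small = W_small.T @ g_small
    g_big = W_big.T @ g_big
print(np.linalg.norm(g_small))             # shrinks toward zero
print(np.linalg.norm(g_big))               # grows to an enormous value
```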
Training through RNN
• A single time step of the input is provided to the network.
• The network calculates the current state from the current input and the previous state.
• The current ht becomes ht-1 for the next time step.
• One can take as many time steps as the problem requires, joining the information from all the previous states.
• Once all the time steps are completed, the final state is used to calculate the output.
• The output is then compared to the actual (target) output, and the error is generated.
• The error is back-propagated through the network to update the weights; the RNN is thus trained using backpropagation through time. A compact sketch of this loop follows.
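The sketch below condenses those steps into a toy NumPy training loop, assuming a single sequence, a single target, and a squared-error loss (all invented for illustration). The backward pass walks the time steps in reverse, which is exactly backpropagation through time.

```python
# Toy BPTT: forward through time, then accumulate gradients in reverse.
import numpy as np

rng = np.random.default_rng(5)
W_xh = rng.normal(scale=0.1, size=(4, 3))
W_hh = rng.normal(scale=0.1, size=(4, 4))
W_hy = rng.normal(scale=0.1, size=(1, 4))

x_seq = rng.normal(size=(6, 3))   # one toy sequence
target = np.array([0.5])          # one toy target
lr = 0.1

for epoch in range(100):
    # Forward: keep every hidden state for the backward pass.
    hs = [np.zeros(4)]
    for x_t in x_seq:
        hs.append(np.tanh(W_hh @ hs[-1] + W_xh @ x_t))
    y = W_hy @ hs[-1]
    loss = 0.5 * np.sum((y - target) ** 2)

    # Backward through time: walk the steps in reverse.
    dW_xh = np.zeros_like(W_xh)
    dW_hh = np.zeros_like(W_hh)
    dW_hy = np.outer(y - target, hs[-1])
    dh = W_hy.T @ (y - target)              # gradient into the final state
    for t in reversed(range(len(x_seq))):
        dz = (1 - hs[t + 1] ** 2) * dh      # back through tanh
        dW_xh += np.outer(dz, x_seq[t])
        dW_hh += np.outer(dz, hs[t])
        dh = W_hh.T @ dz                    # pass gradient one step further back

    W_xh -= lr * dW_xh                      # gradient descent updates
    W_hh -= lr * dW_hh
    W_hy -= lr * dW_hy
    if epoch % 20 == 0:
        print(epoch, loss)                  # loss should shrink over epochs
```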

Advantages of Recurrent Neural Network


• An RNN remembers information through time, which is precisely what makes it useful for time series prediction. (Architectures such as Long Short-Term Memory extend this memory over longer spans.)
• Recurrent neural networks are even used together with convolutional layers to extend the effective pixel neighborhood.
Disadvantages of Recurrent Neural Network
• Vanishing and exploding gradient problems.
• Training an RNN is a very difficult task.
• It cannot process very long sequences when using tanh or ReLU as the activation function.
Applications of Recurrent Neural Network
Language Modelling and Generating Text, Speech Recognition, Machine Translation, Image Recognition, Face Detection, Time Series Forecasting

Variations of the Recurrent Neural Network (RNN)


• To overcome problems like vanishing and exploding gradients, several advanced versions of the RNN have been developed; two of these are the Bidirectional Neural Network (BiNN) and Long Short-Term Memory (LSTM).
• Bidirectional Neural Network (BiNN)
A BiNN is a variation of the recurrent neural network in which the input information flows in both directions, and the outputs of both directions are combined to produce the output, as sketched below. A BiNN is useful in situations where the context of the input is important, such as NLP tasks and time-series analysis problems.
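Here is a hedged NumPy sketch of the bidirectional idea (all shapes and weights invented): one pass reads the sequence left to right, a second pass reads it right to left, and the two hidden states are combined at each step.

```python
# Bidirectional RNN: two independent passes over the same sequence.
import numpy as np

rng = np.random.default_rng(6)
Wx_f = rng.normal(scale=0.1, size=(4, 3))   # forward-pass weights
Wh_f = rng.normal(scale=0.1, size=(4, 4))
Wx_b = rng.normal(scale=0.1, size=(4, 3))   # backward-pass weights
Wh_b = rng.normal(scale=0.1, size=(4, 4))

xs = rng.normal(size=(5, 3))                # a toy sequence

def run(seq, Wx, Wh):
    h, states = np.zeros(4), []
    for x in seq:
        h = np.tanh(Wx @ x + Wh @ h)
        states.append(h)
    return states

fwd = run(xs, Wx_f, Wh_f)                   # reads past -> future
bwd = run(xs[::-1], Wx_b, Wh_b)[::-1]       # reads future -> past, re-aligned

# Combine both directions at every step, e.g. by concatenation.
combined = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```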
• Long Short-Term Memory (LSTM)
Long Short-Term Memory works on a read-write-and-forget principle: given the input, the network reads and writes the most useful information from the data and forgets the information that is not important for predicting the output. To do this, three new gates are introduced into the RNN, so that only the selected information is passed through the network.
• Difference between RNN and Simple Neural Network
An RNN is considered the better version of a deep neural network when the data is sequential; there are significant differences between the two.
