Artificial Neural Networks (ANN): Dr. M. Sivagnanasundaram


Artificial Neural Networks (ANN)

With approximately 100 billion neurons, the human brain sends signals at speeds as fast as 268 mph! In essence, a neural network is a collection of neurons connected by synapses.

This collection is organized into three main layers: the input layer, the
hidden layer, and the output layer. You can have many hidden layers,
which is where the term deep learning comes into play.

In an artificial neural network, there are several inputs, called features, which produce a single output, called a label.

Artificial Neural Networks (ANN)
• The circles represent neurons while the lines represent synapses.
• The role of a synapse is to multiply an input by its weight. You can think of weights as the “strength” of the connection between neurons. Weights primarily define the output of a neural network, yet they remain highly flexible. An activation function is then applied to return an output.
• At their core, neural networks are simple: they take the dot product of the inputs and weights and apply an activation function. When the weights are adjusted via the gradient of the loss function, the network adapts to produce more accurate outputs (see the sketch below).
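As a minimal sketch of that last bullet (illustrative code, not part of the original slides; the input, weight, and bias values are made up):

import numpy as np

def sigmoid(x):
    # activation function: squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

inputs  = np.array([0.05, 0.10])   # feature values (illustrative)
weights = np.array([0.15, 0.20])   # connection "strengths" (illustrative)
bias    = 0.35                     # illustrative bias term

net = np.dot(inputs, weights) + bias   # dot product of inputs and weights
out = sigmoid(net)                     # apply the activation function to return an output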

How does an ANN work?

• Takes the inputs as a matrix.
• Multiplies the inputs by a set of weights (performs a dot product, aka matrix multiplication).
• Applies an activation function.
• Returns an output.
• The error is calculated as the difference between the desired output from the data and the predicted output. This gives us a gradient we can use, via gradient descent, to alter the weights.
• The weights are then altered slightly according to the error.
• To train, this process is repeated 1,000+ times. The more data the network is trained on, the more accurate our outputs will be.
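A hedged sketch of that loop for a single sigmoid neuron trained with squared error; the data, starting weights, and learning rate below are illustrative, not taken from the slides:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0.05, 0.10]])     # inputs as a matrix (1 sample, 2 features)
y = np.array([[0.01]])           # desired output
W = np.array([[0.15], [0.20]])   # starting weights (illustrative)
eta = 0.5                        # learning rate

for _ in range(1000):            # repeat the process 1,000+ times
    out = sigmoid(X @ W)                      # dot product, then activation
    error = out - y                           # predicted output minus desired output
    grad = X.T @ (error * out * (1.0 - out))  # gradient of 1/2 * error^2 with respect to W
    W -= eta * grad                           # alter the weights slightly according to the error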

Forward propagation
and back propagation
• To understand forward propagation and backpropagation, we’re going to use a neural network with two inputs, two hidden neurons, and two output neurons. Additionally, the hidden and output neurons will each include a bias.
• The goal of backpropagation is to
optimize the weights so that the neural
network can learn how to correctly map
arbitrary inputs to outputs.
• For the rest of this part we’re going to
work with a single training set: given
inputs 0.05 and 0.10, we want the neural
network to output 0.01 and 0.99.
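A small setup sketch in Python; the starting weights and biases below are illustrative assumptions, not values taken from the slide's diagram:

# training set: given these inputs, we want the network to produce these outputs
i1, i2 = 0.05, 0.10
target_o1, target_o2 = 0.01, 0.99

# assumed initial parameters of the 2-2-2 network (illustrative values)
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input layer  -> hidden layer
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden layer -> output layer
b1, b2 = 0.35, 0.60                       # hidden-layer and output-layer biases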

Forward propagation
• To begin, let’s see what the neural network currently predicts given the weights and biases above and inputs of 0.05 and 0.10. To do this we’ll feed those inputs forward through the network.
• We figure out the total net input to each hidden
layer neuron, squash the total net input using an
activation function (here we use the logistic
function), then repeat the process with the output
layer neurons.

Forward propagation..
Here’s how we calculate the total net input for h1:

We then squash it using the logistic function to get the output of h1:

Carrying out the same process for h2, we get:
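In symbols, the hidden-layer computations described above are (a sketch using the network's variable names i1, i2, w1–w4, and b1; the slide shows the numeric results):

net_{h1} = w_1 i_1 + w_2 i_2 + b_1 \cdot 1
out_{h1} = 1 / (1 + e^{-net_{h1}})
net_{h2} = w_3 i_1 + w_4 i_2 + b_1 \cdot 1, \qquad out_{h2} = 1 / (1 + e^{-net_{h2}})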

Forward propagation..
We repeat this process for the output layer neurons,
using the output from the hidden layer neurons as
inputs.
Here’s the output for o1:

And carrying out the same process for o2, we get:
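In symbols (a sketch; the slide shows the numeric results, including the out_{o1} value of 0.75136507 quoted on the next slide):

net_{o1} = w_5 \, out_{h1} + w_6 \, out_{h2} + b_2 \cdot 1, \qquad out_{o1} = 1 / (1 + e^{-net_{o1}})
net_{o2} = w_7 \, out_{h1} + w_8 \, out_{h2} + b_2 \cdot 1, \qquad out_{o2} = 1 / (1 + e^{-net_{o2}})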

Calculating the Total Error
We can now calculate the error for each output neuron using the squared error function and sum them to get the total error.

The 1/2 is included so that the exponent is cancelled when we differentiate later on. The result is eventually multiplied by a learning rate anyway, so it doesn’t matter that we introduce a constant here.

For example, the target output for o1 is 0.01 but the neural network output 0.75136507, therefore its error is:

Repeating this process for o2 (remembering that the target is 0.99) we get:

The total error for the neural network is the sum of these errors:
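In symbols, and using the numbers quoted above (the value of E_{o2} depends on out_{o2}, which is not quoted here):

E_{total} = \sum \tfrac{1}{2} (target - output)^2
E_{o1} = \tfrac{1}{2} (target_{o1} - out_{o1})^2 = \tfrac{1}{2} (0.01 - 0.75136507)^2 \approx 0.274811
E_{total} = E_{o1} + E_{o2}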

Back Propagation
• Our goal with backpropagation is to update each of the weights in the network so that they bring the actual output closer to the target output, thereby minimizing the error for each output neuron and the network as a whole.
• Consider w5. We want to know how much a change in w5 affects the total error, aka ∂E_total/∂w5.
• ∂E_total/∂w5 is read as “the partial derivative of E_total with respect to w5”. You can also say “the gradient with respect to w5”.
• By applying the chain rule we know that:
  ∂E_total/∂w5 = (∂E_total/∂out_o1) · (∂out_o1/∂net_o1) · (∂net_o1/∂w5)
• Chain Rule (using ′): (f(g(x)))′ = f′(g(x)) g′(x)
• Chain Rule (using d/dx): dy/dx = (dy/du)(du/dx)

Back Propagation..
• We need to figure out each piece in this equation.
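The first piece follows directly from the squared error function defined earlier (a sketch of the step):

\partial E_{total} / \partial out_{o1} = \partial / \partial out_{o1} \left[ \tfrac{1}{2} (target_{o1} - out_{o1})^2 \right] = -(target_{o1} - out_{o1}) = out_{o1} - target_{o1}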

Back Propagation..
• The partial derivative of the logistic function is the
output multiplied by 1 minus the output:
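In symbols, for the logistic output out_{o1} = 1 / (1 + e^{-net_{o1}}):

\partial out_{o1} / \partial net_{o1} = out_{o1} (1 - out_{o1})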

Back Propagation..
• Finally, how much does the total net input of o1
change with respect to w5?

• Putting it all together:

• To decrease the error, we then subtract this value from the current weight (optionally multiplied by some learning rate, eta, which we’ll set to 0.5):
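In symbols, these last steps are (a sketch; the slide works them out numerically):

\partial net_{o1} / \partial w_5 = out_{h1}
\partial E_{total} / \partial w_5 = (\partial E_{total} / \partial out_{o1}) \cdot (\partial out_{o1} / \partial net_{o1}) \cdot (\partial net_{o1} / \partial w_5)
w_5^{+} = w_5 - \eta \cdot \partial E_{total} / \partial w_5, \qquad \eta = 0.5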

Back Propagation..
• We can repeat this process to get the new weights w6, w7, and w8:
  w6+ = 0.408666186
  w7+ = 0.511301270
  w8+ = 0.561370121

Back Propagation for Hidden Layer
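As a sketch of this step: each hidden-layer weight follows the same chain-rule pattern, with the extra wrinkle that out_{h1} feeds both output neurons, so the error terms from o1 and o2 are summed:

\partial E_{total} / \partial w_1 = (\partial E_{total} / \partial out_{h1}) \cdot (\partial out_{h1} / \partial net_{h1}) \cdot (\partial net_{h1} / \partial w_1)
\partial E_{total} / \partial out_{h1} = \partial E_{o1} / \partial out_{h1} + \partial E_{o2} / \partial out_{h1}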

Back Propagation for Hidden Layer…
Following the same process for ∂E_o2/∂out_h1, we get:

Therefore:

We need to calculate ∂out_h1/∂net_h1 and ∂net_h1/∂w1.
Back Propagation for Hidden Layer…
We calculate the partial derivative of the total net input to h1 with respect to w1 the same way as we did for the output neuron:

Putting it all together:
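In symbols (a sketch of the two pieces above):

\partial net_{h1} / \partial w_1 = i_1
\partial E_{total} / \partial w_1 = (\partial E_{total} / \partial out_{h1}) \cdot out_{h1} (1 - out_{h1}) \cdot i_1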

Back Propagation for Hidden Layer…
Now we can update w1
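The update has the same form as for the output-layer weights (with the same learning rate eta = 0.5):

w_1^{+} = w_1 - \eta \cdot \partial E_{total} / \partial w_1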

Repeating this for w2, w3, and w4 gives the remaining updated weights.

