Unit 2


3. What are the steps in the backpropagation algorithm?


STEP ONE: initialize the weights and biases.
 The weights in the network are initialized to random numbers from the interval [-1, 1].
 Each unit has a bias associated with it.
 The biases are similarly initialized to random numbers from the interval [-1, 1].
STEP TWO: feed the training sample.
 The training data are presented to the network as input.
STEP THREE: propagate the inputs forward.
 We compute the net input and output of each unit in the hidden and output layers.
 Each unit in the hidden and output layers takes its net input and then applies an activation function to it. The function symbolizes the activation of the neuron represented by the unit; it is commonly called a logistic or sigmoid function.
STEP FOUR: back-propagate the error.
 When the output layer is reached, the error is computed and propagated backwards through the network.
STEP FIVE: update weights and biases to reflect the propagated errors.
 Weights are updated by the following equations, where l is a constant between 0.0 and 1.0 reflecting the learning rate; this learning rate is fixed for the implementation.

$\Delta w_{ij} = (l)\,\mathrm{Err}_j\,O_i$

$w_{ij} = w_{ij} + \Delta w_{ij}$


STEP SIX: terminating conditions.
 Training stops when there is no remaining error, i.e. when the error meets the threshold requirement.
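Below is a minimal sketch of these six steps for a tiny 2-3-1 network. The layer sizes, the sigmoid activation, the learning rate, and the stopping threshold are illustrative assumptions, not values prescribed by the text.

```python
# A minimal sketch of the six backpropagation steps above.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# STEP ONE: weights and biases start as random numbers in [-1, 1]
W1 = rng.uniform(-1, 1, (3, 2)); b1 = rng.uniform(-1, 1, 3)
W2 = rng.uniform(-1, 1, (1, 3)); b2 = rng.uniform(-1, 1, 1)

lr = 0.5                                # learning rate l in (0.0, 1.0), fixed

# STEP TWO: feed one training sample (input x, desired output target)
x, target = np.array([0.0, 1.0]), np.array([1.0])

for epoch in range(10_000):
    # STEP THREE: propagate inputs forward (net input, then activation)
    h = sigmoid(W1 @ x + b1)            # hidden-layer outputs
    o = sigmoid(W2 @ h + b2)            # output-layer outputs

    # STEP FOUR: back-propagate the error (sigmoid derivative is o(1-o))
    err_o = o * (1 - o) * (target - o)
    err_h = h * (1 - h) * (W2.T @ err_o)

    # STEP FIVE: dw_ij = l * Err_j * O_i, then w_ij += dw_ij
    W2 += lr * np.outer(err_o, h); b2 += lr * err_o
    W1 += lr * np.outer(err_h, x); b1 += lr * err_h

    # STEP SIX: terminate once the error meets the threshold
    if abs((target - o).item()) < 1e-3:
        break
```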
Why is a multilayer neural network needed?
Neural networks (broadly speaking) need multiple layers in order to learn more detailed and more abstract relationships within the data, and how the features interact with each other at a non-linear level.
Adding more layers (apart from increasing the computational complexity of the training and testing phases) allows the interactions within the input data to be represented more easily, and allows more abstract features to be learned and used as input to the next hidden layer.

2. WHAT IS A MULTILAYER PERCEPTRON? HOW IS IT TRAINED USING BACKPROPAGATION? WHAT IS THE LINEAR SEPARABILITY ISSUE AND WHAT IS THE ROLE OF THE HIDDEN LAYER?

MULTILAYER PERCEPTRON:
A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes: an input layer, a hidden layer and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. The MLP utilizes a supervised learning technique called backpropagation for training.

A multilayer perceptron is a neural network connecting multiple layers in a directed graph, which
means that the signal path through the nodes only goes one way. Each node, apart from the
input nodes, has a nonlinear activation function.
BACKPROPAGATION IS USED TO TRAIN THE MLP:
An MLP uses backpropagation as a supervised learning technique. Since there are multiple layers of neurons, the MLP is a deep learning technique.
The MLP is widely used for solving problems that require supervised learning, as well as in research into computational neuroscience and parallel distributed processing.
Desired outputs are compared to the outputs the system actually achieves, and the system is then tuned by adjusting connection weights to narrow the difference between the two as much as possible.
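As a hedged illustration of this training loop, the sketch below uses scikit-learn and its make_moons toy data (both are assumptions, not named in the text); calling fit() runs backpropagation internally, tuning the connection weights to shrink the output error.

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(10,),  # one hidden layer of 10 units
                    activation='relu',         # nonlinear activation
                    max_iter=1000, random_state=0)
mlp.fit(X, y)                                  # supervised training via backprop
print(mlp.score(X, y))                         # accuracy on the training data
```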

LINEAR SEPARABILITY:

Linear separability refers to the fact that classes of patterns with n-dimensional vectors can be separated with a single decision surface. In the two-dimensional case of Figure 2.9, a line represents the decision surface.

Figure 2.9: Linearly Separable Pattern

THE LINEAR SEPARABILITY ISSUE: a single-layer perceptron can only form a linear decision surface, so it fails on problems such as the XOR gate, whose two output classes cannot be separated by any single straight line; the sketch below demonstrates this.
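A minimal demonstration (scikit-learn and its estimators are an assumed choice): a single linear unit cannot fit XOR, while an MLP with one hidden layer can.

```python
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]                        # XOR truth table

linear = Perceptron(max_iter=1000).fit(X, y)
print(linear.score(X, y))               # 0.75 at best: no separating line exists

mlp = MLPClassifier(hidden_layer_sizes=(4,), activation='tanh',
                    solver='lbfgs', max_iter=2000,
                    random_state=0).fit(X, y)
print(mlp.score(X, y))                  # typically 1.0: the hidden layer bends the boundary
```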

Role of hidden layer:


In a neural network, each hidden layer does something with its input and the weights from the previous layer, applies a non-linearity to it, and then sends the output to the next layer. Which operations are applied depends on the type of neural network. With a feed-forward neural network, the input is multiplied by the weights, summed, and passed through a non-linear function of your choice (e.g. sigmoid, ReLU, tanh).

So each hidden layer applies a non-linearity to its input, and the more hidden layers you stack together, the more complex the functions you will be able to model. This is why neural networks are said to be universal approximators: with enough hidden units and layers, the network is able to approximate any mapping from the network's input to the expected output. A forward-pass sketch follows below.
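A minimal numpy sketch of that idea (the layer sizes and the ReLU choice are illustrative assumptions): each layer multiplies its input by weights, sums, applies a non-linearity, and hands the result to the next layer, so stacked layers compose increasingly complex functions.

```python
import numpy as np

def relu(v):
    return np.maximum(0.0, v)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                               # network input
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)

h1 = relu(W1 @ x + b1)                 # first hidden layer's features
h2 = relu(W2 @ h1 + b2)                # next layer builds on those features
print(h2)
```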

1. Discuss the working behavior of SUPPORT VECTOR MACHINES:


A Support Vector Machine (SVM) is a supervised machine learning algorithm which can be used for both classification and regression challenges. However, it is mostly used in classification problems. In this algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features you have), with the value of each feature being the value of a particular coordinate. Then, we perform classification by finding the hyper-plane that differentiates the two classes very well.
Support vectors are simply the coordinates of individual observations. The Support Vector Machine finds the frontier (hyper-plane/line) which best segregates the two classes.
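A hedged minimal example (scikit-learn is assumed, and the toy points are invented for illustration): fit a linear SVM on 2-D data, classify new points, and list the support vectors, i.e. the observations closest to the separating hyper-plane.

```python
from sklearn.svm import SVC

X = [[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 8]]
y = [0, 0, 0, 1, 1, 1]                 # two classes of points

clf = SVC(kernel='linear').fit(X, y)
print(clf.predict([[2, 2], [7, 7]]))   # expected: [0 1]
print(clf.support_vectors_)            # coordinates of the support vectors
```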
WORKING:
 Identify the right hyper-plane (Scenario 1): Here, we have three hyper-planes (A, B and C). Now, identify the right hyper-plane to classify stars and circles.

You need to remember a thumb rule to identify the right hyper-plane: "Select the hyper-plane which segregates the two classes better." In this scenario, hyper-plane "B" has performed this job excellently.

 Identify the right hyper-plane (Scenario 2): Here, we have three hyper-planes (A, B and C) and all are segregating the classes well. Now, how can we identify the right hyper-plane?

Here, maximizing the distance between the nearest data points (of either class) and the hyper-plane will help us decide on the right hyper-plane. This distance is called the margin.

Above, you can see that the margin for hyper-plane C is high compared to both A and B. Hence, we name C the right hyper-plane. Another compelling reason for selecting the hyper-plane with the higher margin is robustness: if we select a hyper-plane having a low margin, then there is a high chance of misclassification. A sketch of computing this margin follows below.
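For a linear SVM, the decision function is w·x + b and the margin width is 2/||w||, which is exactly the quantity the SVM maximizes (scikit-learn and the toy points below are assumptions for illustration).

```python
import numpy as np
from sklearn.svm import SVC

X = [[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]]
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel='linear').fit(X, y)
w = clf.coef_[0]                       # normal vector of the hyper-plane
print(2.0 / np.linalg.norm(w))         # the maximized margin width
```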
 Identify the right hyper-plane (Scenario 3): Hint: use the rules discussed in the previous sections to identify the right hyper-plane.

Some of you may have selected hyper-plane B, as it has a higher margin compared to A. But here is the catch: SVM selects the hyper-plane which classifies the classes accurately prior to maximizing the margin. Here, hyper-plane B has a classification error while A has classified everything correctly. Therefore, the right hyper-plane is A.

 Can we classify two classes (Scenario 4)? Below, I am unable to segregate the two classes using a straight line, as one of the stars lies in the territory of the other (circle) class as an outlier.

As I have already mentioned, the one star at the other end is like an outlier for the star class. SVM has a feature to ignore outliers and find the hyper-plane that has the maximum margin. Hence, we can say that SVM is robust to outliers; the sketch below shows the knob that controls this tolerance.
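In soft-margin SVMs this tolerance is governed by a regularization parameter, usually called C (the parameter name follows scikit-learn, and the toy data are invented): a small C keeps a wide margin and lets the stray point be ignored, while a large C bends the boundary to fit it.

```python
from sklearn.svm import SVC

X = [[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6], [1.5, 1.5]]
y = [0, 0, 0, 1, 1, 1, 1]              # last point: a class-1 outlier among class 0

soft = SVC(kernel='linear', C=0.1).fit(X, y)   # tolerant: ignores the outlier
hard = SVC(kernel='linear', C=100).fit(X, y)   # strict: chases the outlier
print(soft.n_support_, hard.n_support_)        # support-vector counts per class
```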
 Find the hyper-plane to segregate the classes (Scenario 5): In the scenario below, we can't have a linear hyper-plane between the two classes, so how does SVM classify them? Till now, we have only looked at linear hyper-planes.

SVM can solve this problem easily! It does so by introducing an additional feature. Here, we will add a new feature z = x^2 + y^2. Now, let's plot the data points on the x and z axes.

In the above plot, the points to consider are:
o All values of z will always be positive, because z is the squared sum of both x and y.
o In the original plot, the red circles appear close to the origin of the x and y axes, leading to lower values of z, while the stars lie relatively far from the origin, resulting in higher values of z.

In SVM, it is now easy to have a linear hyper-plane between these two classes. But another burning question arises: do we need to add this feature manually to get such a hyper-plane? No: SVM has a technique called the kernel trick. Kernels are functions which take a low-dimensional input space and transform it into a higher-dimensional space, i.e. they convert a non-separable problem into a separable problem. They are mostly useful in non-linear separation problems. Simply put, the kernel performs some extremely complex data transformations, then finds out how to separate the data based on the labels or outputs you have defined. Both routes, the manual z feature and a kernel, are sketched below.
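A hedged sketch of both routes (make_circles, scikit-learn, and all parameter values are assumptions for illustration): adding z = x^2 + y^2 by hand versus letting an RBF kernel lift the data to a separable space automatically.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

Z = [[x1, x2, x1**2 + x2**2] for x1, x2 in X]      # hand-added z feature
manual = SVC(kernel='linear').fit(Z, y)            # linear SVM in (x, y, z) space
rbf = SVC(kernel='rbf').fit(X, y)                  # kernel trick, no manual feature

print(manual.score(Z, y))
print(rbf.score(X, y))                 # near 1.0; the boundary is a circle in input space
```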

When we look at the learned hyper-plane in the original input space, it looks like a circle.
