Unit 1 NNDL
In the context of a neural network, a neuron is the most fundamental unit of processing. It is also
called a perceptron. A neural network is modeled on the way a human brain works: it simulates the way
biological neurons signal to one another. A neuron is a function that maps an input vector of size n
to a single value. It is a cascade of a linear and a non-linear operation, as shown below.
N(w, x) = nl(x1*w1 + x2*w2 + ... + xn*wn + b)
where w = {w1, w2, ..., wn} is the weight vector of size n,
x = {x1, x2, ..., xn} is the input vector of size n,
b is a bias,
and nl is a non-linear function.
The linear operation is a dot product of the input vector with a weight vector of the same size, with a
scalar bias value added. The non-linear function is a function of a single variable. A typical
non-linear function is relu, defined as relu(x) = x for x > 0, else 0. In this article, we use an
input vector of size 1 and choose relu as the non-linear function. Thus the equation for a neuron is
N(x) = relu(w*x + b), where w is the weight and b the bias.
If w > 0, then N(x) > 0 when x > -b/w. We will refer to -b/w as the origin of the neuron.
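The single-input neuron described above can be sketched in Python as follows. This is only a minimal illustration: the function names `relu` and `neuron` and the values w = 2, b = -4 are assumptions chosen for the example, not part of the original notes.

```python
def relu(x):
    """Return x for x > 0, else 0."""
    return x if x > 0 else 0.0


def neuron(x, w, b):
    """Single-input neuron: N(x) = relu(w*x + b)."""
    return relu(w * x + b)


# Illustrative (assumed) parameters: w = 2, b = -4, so the origin is -b/w = 2.
w, b = 2.0, -4.0
origin = -b / w

for x in (1.0, 2.0, 3.0):
    # The output is positive only when x lies to the right of the origin.
    print(x, neuron(x, w, b))
```

With these values the neuron outputs 0 for x = 1 and x = 2, and 2 for x = 3, which matches the condition x > -b/w.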
Perceptron
The perceptron is one of the simplest artificial neural network architectures. It was introduced by Frank
Rosenblatt in 1957. It is the simplest type of feedforward neural network, consisting of a single layer
of input nodes that are fully connected to a layer of output nodes. It can learn linearly separable
patterns. It uses a slightly different type of artificial neuron known as a threshold logic unit (TLU), a
model first introduced by Warren McCulloch and Walter Pitts in the 1940s.
Types of Perceptron
Single-Layer Perceptron: This type of perceptron is limited to learning linearly separable patterns. It is
effective for tasks where the data can be divided into distinct categories by a straight line.
A perceptron, the basic unit of a neural network, comprises essential components that collaborate
in information processing.
Input Features: The perceptron takes multiple input features; each input feature
represents a characteristic or attribute of the input data.
Weights: Each input feature is associated with a weight, determining the significance of
each input feature in influencing the perceptron’s output. During training, these weights
are adjusted to learn the optimal values.
Summation Function: The perceptron calculates the weighted sum of its inputs using
the summation function. The summation function combines the inputs with their
respective weights to produce a weighted sum.
Activation Function: The weighted sum is then passed through an activation function.
The perceptron uses the Heaviside step function, which takes the summed value as
input, compares it with the threshold, and provides the output as 0 or 1.
Output: The final output of the perceptron is determined by the activation function’s
result. For example, in binary classification problems, the output might represent a
predicted class (0 or 1).
Bias: A bias term is often included in the perceptron model. The bias allows the model to
make adjustments that are independent of the input. It is an additional parameter that is
learned during training.
Learning Algorithm (Weight Update Rule): During training, the perceptron learns by
adjusting its weights and bias based on a learning algorithm. A common approach is the
perceptron learning algorithm, which updates weights based on the difference between
the predicted output and the true output.
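The components listed above (inputs, weights, bias, summation, and step activation) can be put together in a short NumPy sketch. The names `heaviside` and `perceptron_output`, the threshold of 0, the choice to output 1 when the sum equals the threshold, and the AND-gate weights are all assumptions made for illustration.

```python
import numpy as np


def heaviside(z, threshold=0.0):
    """Step activation: 1 if z reaches the threshold, else 0 (assumed convention)."""
    return 1 if z >= threshold else 0


def perceptron_output(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the step function."""
    z = np.dot(w, x) + b      # summation function
    return heaviside(z)       # activation function


# Hypothetical weights and bias that make the perceptron behave like a logical AND.
w = np.array([1.0, 1.0])
b = -1.5

outputs = [perceptron_output(np.array(x), w, b)
           for x in ([0, 0], [0, 1], [1, 0], [1, 1])]
print(outputs)  # [0, 0, 0, 1]
```

Here the weights are fixed by hand; in practice they would be found by the learning algorithm.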
Perceptron working
A weight is assigned to each input node of a perceptron, indicating the significance of that input to
the output. The perceptron’s output is the weighted sum of the inputs passed through an
activation function, which decides whether or not the perceptron will fire. It computes the weighted sum of
its inputs as:
z = w1*x1 + w2*x2 + ... + wn*xn
When all the neurons in a layer are connected to every neuron of the previous layer, it is known as
a fully connected layer or dense layer.
The output of the fully connected layer can be written as:
f(X) = h(XW + b)
where X is the input, W is the matrix of weights for the input neurons, b is the bias, and h is the step
function.
During training, the perceptron’s weights are adjusted to minimize the difference between the
predicted output and the actual output. Usually, supervised learning algorithms like the delta rule or
the perceptron learning rule are used for this:
wij = wij + η * (yj - yj^) * xi
Here wij is the weight between the i’th input and the j’th output neuron, xi is the i’th input value, yj and yj^ are the
j’th actual and predicted values, and η is the learning rate.
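The weight update described above can be sketched as a small training loop. This is a minimal example under stated assumptions: a single output neuron, a 0/1 step activation that fires when the sum is at least 0, zero-initialized weights, a learning rate of 1, and the AND gate as a toy dataset. The names `train_perceptron` and `step` are illustrative.

```python
import numpy as np


def step(z):
    """Heaviside step applied elementwise: 1 where z >= 0, else 0."""
    return (z >= 0).astype(int)


def train_perceptron(X, y, lr=1.0, epochs=20):
    """Perceptron learning rule: w <- w + lr * (y - y_hat) * x, one sample at a time."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            y_hat = int(np.dot(w, xi) + b >= 0)   # predicted output
            error = target - y_hat                # actual minus predicted
            w += lr * error * xi                  # weight update
            b += lr * error                       # bias update
    return w, b


# Toy dataset (assumed): the linearly separable AND gate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w, b = train_perceptron(X, y)
predictions = step(X @ w + b)
print(predictions)  # [0 0 0 1]
```

Because the AND gate is linearly separable, the rule stops changing the weights once every sample is classified correctly.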
Limitations of Perceptron
The perceptron was an important development in the history of neural networks, as it demonstrated
that simple neural networks could learn to classify patterns. However, the perceptron model has some
limitations that can make it unsuitable for certain types of problems:
Limited to linearly separable problems
Convergence issues with non-separable data
Requires labeled data
Sensitivity to input scaling
Lack of hidden layers
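The first two limitations can be illustrated with the XOR function, a standard example of data that is not linearly separable. The sketch below is an assumption-laden illustration: it uses a 0/1 step activation, zero-initialized weights, a learning rate of 1, and 100 epochs, and the name `train_perceptron` is hypothetical.

```python
import numpy as np


def train_perceptron(X, y, lr=1.0, epochs=100):
    """Perceptron learning rule applied one sample at a time."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            y_hat = int(np.dot(w, xi) + b >= 0)
            w += lr * (target - y_hat) * xi
            b += lr * (target - y_hat)
    return w, b


# XOR: no single straight line separates the 1s from the 0s.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([0, 1, 1, 0])

w, b = train_perceptron(X, y_xor)
predictions = (X @ w + b >= 0).astype(int)
num_errors = int(np.sum(predictions != y_xor))
print("misclassified XOR points:", num_errors)  # at least 1, however long we train
```

No choice of weights and bias can classify all four XOR points with a single linear boundary, so the updates never settle on an error-free solution.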
In an alternative formulation, the perceptron uses a step function that returns +1 if the weighted sum of its inputs is greater than or equal to 0, and -1 otherwise.
More generally, an activation function maps its input into a required range such as (0, 1) or (-1, 1).
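The ranges mentioned above can be checked numerically. In this short sketch the sigmoid and tanh functions are used as assumed examples of activations that map into (0, 1) and (-1, 1), and `sign_step` is an illustrative name for a +1/-1 step function that returns +1 at exactly 0.

```python
import numpy as np


def sign_step(z):
    """Return +1 where z >= 0 and -1 elsewhere (assumed convention at z = 0)."""
    return np.where(z >= 0, 1, -1)


def sigmoid(z):
    """Squash values into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sign_step(z))   # [-1 -1  1  1  1]
print(sigmoid(z))     # every value lies strictly between 0 and 1
print(np.tanh(z))     # every value lies strictly between -1 and 1
```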
o Input values or one input layer: The input layer of the perceptron is made of artificial input
neurons and takes the initial data into the system for further processing.
o Weights and Bias:
Weight: It represents the strength of the connection between units. If the weight
from node 1 to node 2 is larger, then neuron 1 has a more considerable influence on
neuron 2.
Bias: It is the same as the intercept added in a linear equation. It is an additional parameter
whose task is to shift the output along with the weighted sum of the inputs to the neuron.
o Net sum: It calculates the total weighted sum of the inputs.
o Activation Function: Whether a neuron is activated or not is determined by an activation
function. The activation function is applied to the weighted sum with the bias added to
give the result.
Multi-Layer Neural Network
To be accurate, a fully connected multi-layered neural network is known as a Multi-Layer Perceptron. A
multi-layered neural network consists of multiple layers of artificial neurons or nodes. Unlike
single-layer neural networks, most networks in recent times are multi-layered. The
following diagram is a visualization of a multi-layer neural network.
The multi-layer perceptron is a more complex artificial neural network architecture. It is
substantially formed from multiple layers of perceptrons. TensorFlow is a very popular deep
learning framework, and this notebook is a guide to building a neural network with this
library. To understand what a multi-layer perceptron is, it helps to develop a multi-layer
perceptron from scratch using NumPy.
MLP networks are used for supervised learning. The typical learning algorithm for MLP networks
is called the backpropagation algorithm.
A multilayer perceptron (MLP) is a feedforward artificial neural network that generates a set of
outputs from a set of inputs. An MLP is characterized by several layers of nodes connected as a
directed graph between the input and output
layers. An MLP uses backpropagation to train the network and is a deep learning method.
MLP (Multi-Layer Perceptron) is a type of neural network with an architecture consisting
of input, hidden, and output layers of interconnected neurons. It is capable of learning
complex patterns and performing tasks such as classification and regression by
adjusting its parameters through training. Let’s explore the architecture of an MLP in
detail:
Input Layer: The input layer is where the MLP first meets the dataset. Each neuron in this
layer corresponds to one feature of the incoming data. For instance, in image classification each
neuron might represent the intensity value of a pixel. The input layer passes these unprocessed
input values on to the neurons in the first hidden layer.
Hidden Layers: MLPs have one or more hidden layers between
the input and output layers. The main computations happen in these layers.
Every neuron in a hidden layer processes the data that comes from the neurons
in the preceding layer. Neurons within the same hidden layer are not connected directly
to one another; they interact only through weighted connections to adjacent layers. The hidden
layer transformations allow the network to learn intricate relationships and
representations in the data. The complexity of the task influences the choice of depth
(number of hidden layers) and width (number of neurons in each layer).
Output Layer: The neurons in the output layer, the last layer of the MLP, generate
the model’s predictions. The structure of this layer is determined by the
particular task at hand. For binary classification, a probability score may be
generated by a single neuron with a sigmoid activation function. For multi-class
classification, multiple neurons, often with softmax activation, can assign a probability to each class.
For regression tasks, the output layer
frequently has just a single neuron that predicts a continuous value.
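The three layers described above can be sketched as a forward pass in NumPy. This is a hedged illustration only: it assumes one hidden layer with ReLU activation and a single linear output neuron, and the weights are hand-picked (not learned) so that the network reproduces XOR, a function a single perceptron cannot represent. The name `mlp_forward` is hypothetical.

```python
import numpy as np


def relu(z):
    """Rectified linear unit applied elementwise."""
    return np.maximum(z, 0.0)


def mlp_forward(X, W1, b1, W2, b2):
    """Input layer -> one hidden layer (ReLU) -> linear output layer."""
    hidden = relu(X @ W1 + b1)   # hidden layer activations
    return hidden @ W2 + b2      # output layer (single neuron)


# Hand-picked (assumed) parameters: 2 inputs, 2 hidden neurons, 1 output.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0],
               [-2.0]])
b2 = np.array([0.0])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = mlp_forward(X, W1, b1, W2, b2).ravel()
print(outputs)  # [0. 1. 1. 0.]
```

The hidden layer turns the inputs into an intermediate representation that the output neuron can combine linearly, which is what lets an MLP go beyond linearly separable problems.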
Each neuron in a hidden or output layer applies an activation function to the weighted sum of its inputs. The sigmoid, hyperbolic tangent (tanh), and rectified
linear unit (ReLU) are commonly used activation functions. The MLP modifies its connection
(synapse) weights during training using backpropagation and optimization methods like
gradient descent. This process reduces the discrepancy between predicted and actual
outputs, helping the network learn and fine-tune its parameters. MLPs
are appropriate for a variety of machine learning and deep learning problems, from
straightforward to extremely complicated, due to their flexibility in terms of the number of
hidden layers, neurons per layer, and choice of activation functions.
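Training with backpropagation and gradient descent, as described above, can be sketched from scratch in NumPy. The example below is a minimal sketch under several assumptions that are not in the original notes: one tanh hidden layer of 4 neurons, a sigmoid output neuron, binary cross-entropy loss, full-batch gradient descent with a learning rate of 0.1, a fixed random seed, and XOR as toy data. The name `train_mlp` is illustrative.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_mlp(X, y, hidden=4, lr=0.1, steps=2000, seed=0):
    """Full-batch gradient descent on an MLP with one hidden layer."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0, size=(hidden, 1))
    b2 = np.zeros(1)
    n = X.shape[0]
    losses = []

    for _ in range(steps):
        # Forward pass: hidden layer (tanh), then output layer (sigmoid).
        h = np.tanh(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)
        p_safe = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
        loss = -np.mean(y * np.log(p_safe) + (1 - y) * np.log(1 - p_safe))
        losses.append(float(loss))

        # Backward pass: propagate the error from the output to the hidden layer.
        dz2 = (p - y) / n                       # gradient at the output pre-activation
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0)
        dz1 = (dz2 @ W2.T) * (1.0 - h ** 2)     # tanh'(a) = 1 - tanh(a)^2
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0)

        # Gradient descent update of all weights and biases.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

    return (W1, b1, W2, b2), losses


# Toy data (assumed): XOR inputs and targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

params, losses = train_mlp(X, y)
print("initial loss:", losses[0])
print("final loss:", losses[-1])
```

The point of the sketch is the structure of the loop (forward pass, backward pass, update); how low the final loss goes depends on the initialization, learning rate, and number of steps.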
Feed Forward Neural Network
A feedforward neural network (FNN) is one of the two broad types of artificial neural network,
characterized by the direction of the flow of information between its layers. Its flow is uni-directional,
meaning that the information in the model flows in only one direction (forward) from the input nodes,
through the hidden nodes (if any), and to the output nodes, without any cycles or loops, in contrast to
recurrent neural networks, which have a bi-directional flow. Modern feedforward networks are trained
using the backpropagation method and are colloquially referred to as "vanilla" neural networks.