Soft Computing Unit-2


School of Computing Science and Engineering

Course Code: BCSE9303
Course Name: Soft Computing
Program: B.Tech
Name of the Faculty: Geetika Sharma


UNIT-2

Neural Network



Biological nervous system
• The biological nervous system is the most important part of many living
things, in particular, human beings.
• The brain is at the centre of the human nervous system.
• In fact, any biological nervous system consists of a large number of
interconnected processing units called neurons.
• Each neuron is approximately 10 µm long, and neurons can operate in
parallel.
• Typically, a human brain consists of approximately 10^11 neurons
communicating with each other with the help of electrical impulses.



Neuron: Basic unit of nervous system



Neuron and its working
A neuron has four main parts: dendrite, soma, axon and synapse.
• Dendrite : A bush of very thin fibres.
• Axon : A long cylindrical fibre.
• Soma : Also called the cell body; it plays a role like that of the nucleus of a cell.
• Synapse : A junction where the axon makes contact with the
dendrites of neighboring neurons.



Neuron and its working
• Each neuron contains a chemical called a neurotransmitter.
• A signal (also called sense) is transmitted across neurons by this chemical.
• That is, all inputs from other neurons arrive at a neuron through its dendrites.
• These signals are accumulated (summed at the soma) and then
serve as the output to be transmitted through the axon.
• An action may produce an electrical impulse, which usually lasts for about
a millisecond.
• Note that this pulse is generated due to an incoming signal, and a signal
does not produce a pulse in the axon unless it crosses a threshold value.
• Also, note that the action signal in the axon of a neuron is the cumulative effect
of the signals that arrive at the dendrites and are summed up at the soma.



Artificial neural network
• The human brain is a highly complex structure that can be viewed as a
massive, highly interconnected network of simple processing
elements called neurons.
• Artificial neural networks (ANNs), or simply neural networks (NNs),
are simplified models (i.e. imitations) of the biological nervous
system and are therefore motivated by the kind of computing
performed by the human brain.
• The behavior of a biological neural network can be captured by a
simple model called an artificial neural network.



Analogy between BNN and ANN



Artificial neural network
• Here, x1, x2, · · · , xn are the n inputs to the artificial neuron.
• w1, w2, · · · , wn are weights attached to the input links.
• Hence, the total input, say I, received by the soma of the artificial
neuron is
$I = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n = \sum_{i=1}^{n} w_i x_i$
• To generate the final output y, the sum is passed to a filter φ called the
transfer function, which releases the output.
• That is, y = φ(I)



Artificial neural network
• A very commonly known transfer function is the thresholding
function.
• In this thresholding function, the sum (i.e. I) is compared with a
threshold value θ.
• If the value of I is greater than θ, then the output is 1, else it is 0 (this
is just like a simple linear filter).
• In other words, $y = \phi\left(\sum_{i=1}^{n} w_i x_i - \theta\right)$
• where $\phi(z) = 1$ if $z > 0$ and $\phi(z) = 0$ if $z \le 0$, so that y = 1 exactly when I > θ.
• Such a φ is called a step function (also known as the Heaviside function).
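The thresholded neuron described above can be sketched in a few lines of Python. This is only an illustration: the function names, the weights and the threshold value are assumptions chosen so that the neuron behaves like a logical AND, and they do not come from the slides.

```python
def weighted_sum(inputs, weights):
    """Total input I = w1*x1 + ... + wn*xn received by the artificial neuron."""
    return sum(w * x for w, x in zip(weights, inputs))


def step_neuron(inputs, weights, theta):
    """Output 1 if the total input I is greater than the threshold theta, else 0."""
    total = weighted_sum(inputs, weights)
    return 1 if total > theta else 0


# Illustrative values (assumed): two inputs with equal weights and theta = 1.5.
weights = [1.0, 1.0]
theta = 1.5
outputs = [step_neuron([a, b], weights, theta) for a in (0, 1) for b in (0, 1)]
print(outputs)  # [0, 0, 0, 1]
```

With these values the neuron fires only when both inputs are 1.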



• Note that a biological neuron receives all inputs through the dendrites,
sums them and produces an output if the sum is greater than a
threshold value.
• The input signals are passed on to the cell body through the synapse,
which may accelerate or retard an arriving signal.
• It is this acceleration or retardation of the input signals that is modelled
by the weights.
• An effective synapse, which transmits a stronger signal, will have a
correspondingly larger weight, while a weak synapse will have a smaller
weight.
• Thus, weights here are multiplicative factors of the inputs to account
for the strength of the synapse.

Artificial neural network
• We may note that a neuron is a part of an interconnected network
of the nervous system and serves the following purposes.
• Computation on input signals
• Transportation of signals (at a very high speed)
• Storage of information
• Perception, automatic training and learning
• We can also see the analogy between the biological neuron and the
artificial neuron. Every component of the model (i.e. the artificial
neuron) bears a direct analogy to that of a biological neuron. It is
this model which forms the basis of a neural network (i.e. an artificial
neural network).



Advantages of ANN
• ANNs exhibit mapping capabilities, that is, they can map input
patterns to their associated output patterns.
• ANNs learn by examples. Thus, an ANN architecture can be
trained with known examples of a problem before it is tested for
its inference capabilities on unknown instances of the problem. In
other words, it can identify new objects on which it was not previously trained.
• ANNs possess the capability to generalize. This makes them useful
in applications where an exact mathematical model of the problem
is not possible.



Advantages of ANN
• ANNs are robust and fault-tolerant systems. They can, therefore,
recall full patterns from incomplete, partial or noisy patterns.
• ANNs can process information in parallel, at high speed and in a
distributed manner. Thus, a massively parallel distributed processing
system made up of highly interconnected (artificial) neural
computing elements, having the ability to learn and acquire knowledge, is
possible.



Learning of neural networks:
• Concept of learning.
• Learning in:
• Single layer feed forward neural network
• multilayer feed forward neural network
• recurrent neural network
• Types of learning in neural networks.



What are the Learning Rules in Neural Network?
• A learning rule (or learning process) is a method or mathematical
logic that improves the performance of an Artificial Neural Network
when it is applied over the network. A learning rule updates the
weights and bias levels of a network when the network is simulated in a
specific data environment.
• Applying a learning rule is an iterative process. It helps a neural
network to learn from the existing conditions and improve its
performance.



The concept of learning
• Learning is an important feature of human computational ability.
• Learning may be viewed as the change in behavior acquired due to
practice or experience, and it lasts for a relatively long time.
• As it occurs, the effective coupling between the neurons is modified.
• In the case of artificial neural networks, it is a process of modifying the
neural network by updating its weights, biases and other
parameters, if any.
• During learning, the parameters of the network are optimized,
and the result is a process of curve fitting.
• It is then said that the network has passed through a learning phase.



• Classification Of Supervised Learning Algorithms
• #1) Gradient Descent Learning
• #2) Stochastic Learning
• Classification Of Unsupervised Learning Algorithms
• #1) Hebbian Learning
• #2) Competitive Learning
• Reinforced Learning



Different learning rules in the Neural network:

• Hebbian learning rule – It identifies how to modify the weights of the
nodes of a network.
• Perceptron learning rule – The network starts its learning by assigning a
random value to each weight.
• Delta learning rule – The modification in the synaptic weight of a node is
equal to the multiplication of the error and the input.
• Correlation learning rule – The correlation rule is a supervised
learning rule.
• Outstar learning rule – It can be used when the nodes or
neurons in a network are assumed to be arranged in a layer.



Supervised learning:
• In this learning, every input pattern that is used to train the network
is associated with an output pattern.
• This is called the "training set of data". Thus, in this form of learning, the
input-output relationships of the training scenarios are available.
• Here, the output of the network is compared with the corresponding
target value and the error is determined.
• The error is then fed back to the network for updating the network. This
results in an improvement.
• This type of training is called learning with the help of a teacher.



Unsupervised learning
• If the target output is not available, then the error in prediction cannot
be determined. In such a situation, the system learns on its own by
discovering and adapting to structural features in the input patterns.
• This type of training is called learning without a teacher.
Reinforced learning:
• In this technique, although a teacher is available, it does not tell the
expected answer, but only tells whether the computed output is correct or
incorrect. A reward is given for a correct answer and a
penalty for a wrong answer. This information helps the network in its
learning process.



Gradient Descent learning :
• This learning technique is based on the minimization of the error E, defined in
terms of the weights and the activation function of the network.
• It requires the activation function employed by the network to be
differentiable, as the weight update depends on the gradient of the
error E.
• Thus, if ∆Wij denotes the weight update of the link connecting the i-th and
j-th neurons of two neighboring layers, then
$\Delta W_{ij} = -\eta \, \frac{\partial E}{\partial W_{ij}}$
where η is the learning rate parameter and ∂E/∂Wij is the error gradient with
respect to the weight Wij. The negative sign moves the weight in the direction
that decreases E.
• The least mean square and back propagation are two variations of this
learning technique.
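As a rough illustration of gradient descent learning, the sketch below applies the update ∆w = −η ∂E/∂w to a single linear neuron with the squared error E = ½(t − y)². The linear neuron, the sample input, the target and the learning rate η = 0.1 are all assumptions made for this example; the slides do not fix any of them.

```python
def predict(weights, x):
    """Output of a linear neuron: y = sum_i w_i * x_i."""
    return sum(w * xi for w, xi in zip(weights, x))


def error(weights, x, target):
    """Squared error E = 0.5 * (target - y)^2 for one training sample."""
    return 0.5 * (target - predict(weights, x)) ** 2


def gradient_step(weights, x, target, eta):
    """One update w_i <- w_i - eta * dE/dw_i, where dE/dw_i = -(target - y) * x_i."""
    y = predict(weights, x)
    grads = [-(target - y) * xi for xi in x]
    return [w - eta * g for w, g in zip(weights, grads)]


# Illustrative sample (assumed): one input pattern and its target.
weights = [0.0, 0.0]
x, target, eta = [1.0, 2.0], 1.0, 0.1
errors = []
for _ in range(20):
    errors.append(error(weights, x, target))
    weights = gradient_step(weights, x, target, eta)
print(errors[0], errors[-1])
```

The recorded error shrinks at every step, which is the behavior the weight update rule is designed to produce.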
Stochastic learning:
• In this method, weights are adjusted in a probabilistic fashion. Simulated
annealing is an example of such learning (it is the learning mechanism
employed by Boltzmann and Cauchy machines).
Hebbian learning:
• This learning is based on correlative weight adjustment. It is, in fact,
the learning technique inspired by biology.
• Here, the input-output pattern pairs (Xi, Yi) are associated with the
weight matrix W. W is also known as the correlation matrix.
• This matrix is computed as follows.
$W = \sum_{i=1}^{n} X_i Y_i^{T}$
where $Y_i^{T}$ is the transpose of the associated output vector Yi.
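The correlation matrix can be computed directly from its definition as a sum of outer products. The following is a minimal sketch in plain Python; the helper names and the two bipolar pattern pairs are assumptions used only for illustration.

```python
def outer(x, y):
    """Outer product x * y^T, returned as a list of rows."""
    return [[xi * yj for yj in y] for xi in x]


def hebbian_matrix(pairs):
    """Correlation matrix W = sum over i of X_i * Y_i^T for the given (X_i, Y_i) pairs."""
    rows, cols = len(pairs[0][0]), len(pairs[0][1])
    w = [[0] * cols for _ in range(rows)]
    for x, y in pairs:
        for r, row in enumerate(outer(x, y)):
            for c, value in enumerate(row):
                w[r][c] += value
    return w


# Illustrative bipolar pattern pairs (assumed, not from the slides).
pairs = [([1, -1, 1], [1, -1]), ([-1, 1, 1], [-1, 1])]
W = hebbian_matrix(pairs)
print(W)  # [[2, -2], [-2, 2], [0, 0]]
```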



Competitive learning:
• In this learning method, those neurons which respond strongly to
input stimuli have their weights updated.
• When an input pattern is presented, all neurons in the layer
compete and the winning neuron undergoes weight adjustment.
• This is why it is called a winner-takes-all strategy.
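A winner-takes-all update might look like the sketch below. The slides do not give a concrete update formula, so two assumptions are made here: the response of a neuron is the dot product of its weights with the input, and the winner moves toward the input by w ← w + α(x − w) with a hypothetical rate α.

```python
def winner_index(weight_vectors, x):
    """Index of the neuron with the strongest response (largest dot product)."""
    responses = [sum(w * xi for w, xi in zip(wv, x)) for wv in weight_vectors]
    return responses.index(max(responses))


def competitive_step(weight_vectors, x, alpha):
    """Update only the winning neuron, moving its weights toward the input x."""
    k = winner_index(weight_vectors, x)
    weight_vectors[k] = [w + alpha * (xi - w) for w, xi in zip(weight_vectors[k], x)]
    return k


# Two neurons with illustrative (assumed) initial weights.
neurons = [[1.0, 0.0], [0.0, 1.0]]
k = competitive_step(neurons, [0.9, 0.1], alpha=0.5)
print(k, neurons)
```

Only the winning neuron changes; the losing neuron keeps its weights.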



Elements of a Neural Network
• Input Layer :- This layer accepts the input features. It provides
information from the outside world to the network. No computation
is performed at this layer; its nodes just pass on the
information (features) to the hidden layer.
• Hidden Layer :- Nodes of this layer are not exposed to the outer
world; they are part of the abstraction provided by any neural
network. The hidden layer performs all sorts of computation on the
features entered through the input layer and transfers the result to
the output layer.
• Output Layer :- This layer brings the information learned by the
network to the outer world.



Activation function
• The activation function decides whether a neuron should be activated or not by
calculating the weighted sum and adding the bias to it. The purpose of the
activation function is to introduce non-linearity into the output of a neuron.
• Explanation :- A neural network has neurons that work according to
their weights, biases and respective activation functions. In a
neural network, we update the weights and biases of the neurons on
the basis of the error at the output. This process is known as
back-propagation. Activation functions make back-propagation possible since
the gradients are supplied along with the error to update the weights and
biases.
• Why do we need non-linear activation functions :- A neural network without
an activation function is essentially just a linear regression model. The
activation function applies a non-linear transformation to the input, making the
network capable of learning and performing more complex tasks.
Activation Functions
• It may be defined as the extra force or effort applied over the input to obtain an exact output. In ANN, we
can also apply activation functions over the input to get the exact output. The following are some activation
functions of interest:
Linear Activation Function
• It is also called the identity function as it performs no input editing. It can be defined as:
F(x) = x
Sigmoid Activation Function
• It is of two types, as follows:
• Binary sigmoidal function: This activation function performs input editing between 0 and 1. It is positive
in nature. It is always bounded, which means its output cannot be less than 0 or more than 1. It is also
strictly increasing in nature, which means the higher the input, the higher the output. It can be defined as
F(x) = sigm(x) = 1/(1 + exp(−x))
• Bipolar sigmoidal function: This activation function performs input editing between -1 and 1. It can be
positive or negative in nature. It is always bounded, which means its output cannot be less than -1 or
more than 1. It is also strictly increasing in nature, like the binary sigmoid function. It can be defined as:
F(x) = 2/(1 + exp(−x)) − 1 = (1 − exp(−x))/(1 + exp(−x))
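The two sigmoidal functions can be checked numerically with a short Python sketch. It uses the definitions 1/(1 + exp(−x)) for the binary form and 2/(1 + exp(−x)) − 1 for the bipolar form; the function names are chosen here for illustration.

```python
import math


def binary_sigmoid(x):
    """Binary sigmoid: output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def bipolar_sigmoid(x):
    """Bipolar sigmoid: output lies strictly between -1 and 1."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


for x in (-2.0, 0.0, 2.0):
    print(x, binary_sigmoid(x), bipolar_sigmoid(x))
```

At x = 0 the binary form gives 0.5 and the bipolar form gives 0, and both increase with x.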
Types of Activation Functions
Several different types of activation functions are used in Deep Learning. Some of them are explained below:
• Step Function:
The step function is one of the simplest kinds of activation functions. Here, we consider a threshold value (0 in
the definition below), and if the net input x reaches the threshold then the neuron is activated. Mathematically,
• f(x)=1, if x>=0
• f(x)=0, if x<0
• [Figure: graphical representation of the step function]



Sigmoid Function:
• The sigmoid function is a widely used activation function. It is defined as:
f(x) = 1/(1+e^(-x))
• This is a smooth function and is continuously differentiable. The biggest
advantage that it has over the step and linear functions is that it is non-linear. This
essentially means that when multiple neurons use the sigmoid function as
their activation function, the output is non-linear as well. The function ranges
from 0 to 1 and has an S shape.



Single layer feed forward NN training
• In a single layer network, several neurons are arranged in one layer, with inputs and
weights connected to every neuron.
• Learning in such a network occurs by adjusting the weights associated with
the inputs so that the network can classify the input patterns.
• A single neuron in such a neural network is called a perceptron.
• The algorithm to train a perceptron is stated below. Let there be a perceptron
with (n + 1) inputs x0, x1, x2, · · · , xn where x0 = 1 is the bias input.
• Let f denote the transfer function of the neuron. Suppose X¯ and Y¯ denote
the input and output vectors of a training data set, and W¯ denotes the weight matrix.
• With this input-output relationship pattern and configuration of a perceptron,
the algorithm Training Perceptron is stated in the
following slide.



Single layer feed forward NN training
1. Initialize W¯ = (w0, w1, · · · , wn) to some random weights.
2. For each input pattern x ∈ X¯ do (here, x = {x0, x1, ..., xn}):
• Compute $I = \sum_{i=0}^{n} w_i x_i$
• Compute the observed output y = f(I), where f(I) = 1 if I > 0 and f(I) = 0 if I ≤ 0
• Add y to Y¯′, which is initially empty (Y¯′ = Y¯′ + y)
3. If the desired output Y¯ matches the observed output Y¯′, then output
W¯ and exit.
4. Otherwise, update the weight matrix W¯ as follows. For each output y ∈ Y¯′ do:
• If the observed output y is 1 instead of 0, then wi = wi − αxi , (i = 0, 1, 2, · · · n)
• Else, if the observed output y is 0 instead of 1, then wi = wi + αxi , (i = 0, 1, 2, · · · n)
5. Go to step 2.
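The steps above can be sketched in Python as follows. This is an illustrative version under a few assumptions: the weights start at zero rather than at random values (so that the run is reproducible), α = 1, the training data is the logical AND of two inputs, and a hypothetical max_epochs guard stops the loop if the patterns are not linearly separable.

```python
def step(i):
    """Transfer function f(I): 1 if I > 0, else 0."""
    return 1 if i > 0 else 0


def observe(weights, x):
    """Observed output for pattern x, with the bias input x0 = 1 prepended."""
    xb = [1.0] + list(x)
    return step(sum(w * xi for w, xi in zip(weights, xb)))


def train_perceptron(patterns, targets, alpha=1.0, max_epochs=100):
    # Assumption: zero initial weights instead of random ones.
    weights = [0.0] * (len(patterns[0]) + 1)
    for _ in range(max_epochs):
        observed = [observe(weights, x) for x in patterns]      # step 2
        if observed == targets:                                 # step 3
            return weights
        for x, y, t in zip(patterns, observed, targets):        # step 4
            xb = [1.0] + list(x)
            if y == 1 and t == 0:
                weights = [w - alpha * xi for w, xi in zip(weights, xb)]
            elif y == 0 and t == 1:
                weights = [w + alpha * xi for w, xi in zip(weights, xb)]
    raise RuntimeError("no separating weights found within max_epochs")


patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]                       # logical AND
weights = train_perceptron(patterns, targets)
predictions = [observe(weights, x) for x in patterns]
print(weights, predictions)
```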
Single layer feed forward NN training
• In the above algorithm, α is the learning parameter and is a constant
decided by some empirical studies.
Note :
• The algorithm Training Perceptron is based on the supervised
learning technique.
• ADALINE (Adaptive Linear Network Element) is a closely related
single-neuron model and is sometimes used as an alternative term to perceptron.
• If there are 10 neurons in the single layer feed forward
neural network to be trained, then we have to iterate the algorithm
for each perceptron in the network.



Training multilayer feed forward neural network
• As with a single layer feed forward neural network, a supervised training
methodology is followed to train a multilayer feed forward neural
network.
• Before going on to the training of such a neural network,
we redefine some terms involved in it.
• A block diagram and its configuration for a three layer multilayer FF
NN of type l − m − n is shown in the next slide.



multilayer feed forward neural network



multilayer feed forward neural network
• For simplicity, we assume that all neurons in a particular layer follow the
same transfer function and that different layers follow their respective
transfer functions, as shown in the configuration.
• Let us consider a specific neuron in each layer, say the i-th, j-th and k-th
neurons in the input, hidden and output layer, respectively.
• Also, let vij denote the weight between the i-th neuron (i = 1, 2, · · · , l) in the
input layer and the j-th neuron (j = 1, 2, · · · , m) in the hidden layer.
• Similarly, wjk represents the connecting weight between the j-th
neuron (j = 1, 2, · · · , m) in the hidden layer and the k-th neuron (k = 1, 2,
· · · , n) in the output layer.



Back Propagation Algorithm
• The above discussion covers how to calculate the values of the different
parameters in an l − m − n multilayer feed forward neural network.
• Next, we will discuss how to train such a neural network.
• We consider the most popular algorithm, called the Back-Propagation
algorithm, which is a supervised learning technique.
• The principle of the Back-Propagation algorithm is based on
error correction with the steepest-descent method.
• We first discuss the method of steepest descent, followed by its use
in the training algorithm.



Back Propagation Algorithm
• Input values
• x1 = 0.05
  x2 = 0.10
• Initial weights
• w1 = 0.15     w5 = 0.40
  w2 = 0.20     w6 = 0.45
  w3 = 0.25     w7 = 0.50
  w4 = 0.30     w8 = 0.55
• Bias values
• b1 = 0.35     b2 = 0.60
• Target values
• T1 = 0.01
  T2 = 0.99
• Now, we first calculate the values of H1 and H2 by a forward pass.
• Forward pass:
• To find the value of H1, we first multiply the input values by the weights as
• H1 = x1×w1 + x2×w2 + b1
  H1 = 0.05×0.15 + 0.10×0.20 + 0.35
  H1 = 0.3775
• To calculate the final result of H1, we apply the sigmoid function as
  H1final = 1/(1 + e^(−0.3775)) = 0.593269992



We will calculate the value of H2 in the same way as H1
• H2 = x1×w3 + x2×w4 + b1
  H2 = 0.05×0.25 + 0.10×0.30 + 0.35
  H2 = 0.3925



• To calculate the final result of H2, we apply the sigmoid function as:
  H2final = 1/(1 + e^(−0.3925)) = 0.596884378
• Now, we calculate the values of y1 and y2 in the same way as we calculated
H1 and H2.
• To find the value of y1, we first multiply the inputs, i.e., the final outcomes
of H1 and H2, by the weights as
• y1 = H1final×w5 + H2final×w6 + b2
  y1 = 0.593269992×0.40 + 0.596884378×0.45 + 0.60
  y1 = 1.10590597



• To calculate the final result of y1, we apply the sigmoid function
as:
  y1final = 1/(1 + e^(−1.10590597)) ≈ 0.75136507
• Our target values are 0.01 and 0.99. Our y1 and y2 values do not
match our target values T1 and T2.
• Now, we will find the total error, which measures the difference
between the outputs and the target outputs. The total error is
calculated as
  Etotal = Σ ½ (target − output)²
• So, the total error is:
  Etotal = ½ (T1 − y1final)² + ½ (T2 − y2final)²
• Now, we will backpropagate this error to update the weights using a
backward pass.
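The forward pass can be reproduced with a short Python sketch using the input, weight, bias and target values listed earlier. Two points are assumptions rather than statements from the slides: y2 is computed with w7 and w8 in the same pattern as y1, and the total error is taken as the sum of ½(target − output)² over the two outputs.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
t1, t2 = 0.01, 0.99

# Hidden layer: weighted sum plus bias, then sigmoid.
h1 = sigmoid(x1 * w1 + x2 * w2 + b1)
h2 = sigmoid(x1 * w3 + x2 * w4 + b1)

# Output layer: same pattern, fed by the hidden outputs.
y1 = sigmoid(h1 * w5 + h2 * w6 + b2)
y2 = sigmoid(h1 * w7 + h2 * w8 + b2)   # assumed wiring for the second output

# Total squared error over both outputs.
e_total = 0.5 * (t1 - y1) ** 2 + 0.5 * (t2 - y2) ** 2
print(h1, h2, y1, y2, e_total)
```

The hidden values agree with the 0.593269992 and 0.596884378 used in the y1 calculation.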



Backward pass at the output layer
• To update a weight, we calculate the error corresponding to that weight
with the help of the total error. The error on weight w is calculated by
differentiating the total error with respect to w.
• The back-propagation algorithm can be followed to train a neural network
to set its topology, connecting weights, bias values and many other
parameters. In the present discussion, we will only consider updating
weights.
• Thus, we can write the error E corresponding to a particular
training scenario T as a function of the variables V and W. That is, E = f(V, W, T).
• In the BP algorithm, this error E is to be minimized using the gradient
descent method. According to the gradient descent method,
the changes in weight values are given by
• ∆V = −η(∂E/∂V) ….. eq(1) and
• ∆W = −η(∂E/∂W) ….. eq(2)
Backward pass at the output layer
• Note that the negative sign signifies the fact that if ∂E/∂V (or ∂E/∂W)
> 0, then we have to decrease V (or W), and vice versa.
• Let vij (and wjk) denote the weights connecting the i-th neuron (at the
input layer) to the j-th neuron (at the hidden layer), and the j-th
neuron (at the hidden layer) to the k-th neuron (at the output layer).
• Also, let ek denote the error at the k-th output neuron, with observed
output $O^{o}_{O_k}$ and target output $T^{o}_{O_k}$ for a sample input
$I \in T_I$.
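As an illustration of one backward-pass update, the sketch below differentiates the total error of the worked example with respect to the single weight w5 using the chain rule, then applies ∆w = −η ∂E/∂w. The learning rate η = 0.5, the squared-error form of E and the wiring of y2 through w7 and w8 are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Values from the worked example; ETA is an assumed learning rate.
X1, X2 = 0.05, 0.10
W1, W2, W3, W4 = 0.15, 0.20, 0.25, 0.30
W6, W7, W8 = 0.45, 0.50, 0.55
B1, B2 = 0.35, 0.60
T1, T2 = 0.01, 0.99
ETA = 0.5


def hidden():
    """Final hidden-layer outputs, which do not depend on w5."""
    return sigmoid(X1 * W1 + X2 * W2 + B1), sigmoid(X1 * W3 + X2 * W4 + B1)


def total_error(w5):
    h1, h2 = hidden()
    y1 = sigmoid(h1 * w5 + h2 * W6 + B2)
    y2 = sigmoid(h1 * W7 + h2 * W8 + B2)
    return 0.5 * (T1 - y1) ** 2 + 0.5 * (T2 - y2) ** 2


def grad_w5(w5):
    """Chain rule: dE/dw5 = (y1 - T1) * y1 * (1 - y1) * h1."""
    h1, h2 = hidden()
    y1 = sigmoid(h1 * w5 + h2 * W6 + B2)
    return (y1 - T1) * y1 * (1.0 - y1) * h1


w5 = 0.40
gradient = grad_w5(w5)
w5_new = w5 - ETA * gradient          # delta_w = -eta * dE/dw
print(gradient, w5_new, total_error(w5), total_error(w5_new))
```

Because the gradient is positive here, the update decreases w5 and the total error drops.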



Neural network architectures
• There are three fundamental classes of ANN architectures:
• Single layer feed forward architecture
• Multilayer feed forward architecture
• Recurrent network architecture
• Before discussing these architectures, we first discuss the
mathematical details of a neuron at a single level. To do this, let us
first consider the AND problem and its possible solution with a neural
network.



Single layer feed forward neural network
• The concept of the AND problem and its solution with a single
neuron can be extended to multiple neurons.



Single layer feed forward neural network
• A layer of n neurons constitutes a single layer feed forward
neural network.
• It is so called because it contains a single layer of artificial neurons.
• Note that although the input layer and output layer, which receive input signals
and transmit output signals, are called layers, they are
actually boundaries of the architecture and hence not truly layers.
• The only layer in the architecture is formed by the synaptic links carrying the
weights, which connect every input to the output neurons.
• In a single layer neural network, the inputs x1, x2, · · · , xm are
connected to the layer of n neurons through the weight matrix W. The
weight matrix W is of size m × n.
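A single layer with an m × n weight matrix can be evaluated as in the sketch below. The indexing convention (W[i][j] links input i to neuron j), the step transfer function with threshold θ = 0, and the sample weights are assumptions for illustration only.

```python
def single_layer_forward(x, W, theta=0.0):
    """Outputs of n neurons for m inputs; W[i][j] links input i to neuron j."""
    m, n = len(W), len(W[0])
    outputs = []
    for j in range(n):
        total = sum(W[i][j] * x[i] for i in range(m))   # weighted sum for neuron j
        outputs.append(1 if total > theta else 0)       # step transfer function
    return outputs


# Illustrative 2 x 3 weight matrix (assumed): 2 inputs, 3 neurons.
W = [[1.0, -1.0, 0.5],
     [1.0, 1.0, -0.5]]
print(single_layer_forward([1, 0], W))  # [1, 0, 1]
```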



Multilayer feed forward neural networks
• A multilayer feedforward neural network is an interconnection of
perceptrons in which data and calculations flow in a single direction,
from the input data to the outputs. 
• The number of layers in a neural network is the number of layers of
perceptrons. 
• The simplest neural network is one with a single input layer and an
output layer of perceptrons. 
• Technically, this is referred to as a one-layer feedforward network
with two outputs because the output layer is the only layer with an
activation calculation.



Multilayer feed forward neural networks
• Figure shows a schematic diagram of multilayer feed forward neural
network:



Multilayer feed forward neural networks
• This network, as its name indicates, is made up of multiple layers.
• Thus, architectures of this class, besides possessing an input and an
output layer, also have one or more intermediary layers called
hidden layers.
• The hidden layer(s) aid in performing useful intermediary
computation before directing the input to the output layer.
• A multilayer feed forward network with l input neurons (number of
neurons at the first layer), m1, m2, · · · , mp neurons at the
i-th hidden layer (i = 1, 2, · · · , p) and n neurons at the last layer (the
output neurons) is written as an l − m1 − m2 − · · · − mp − n
MLFFNN.



Feedback Network
• A feedback network has feedback paths, which means the signal can
flow in both directions using loops. This makes it a non-linear
dynamic system, which changes continuously until it reaches a state
of equilibrium. It may be divided into the following types −
• Recurrent networks − They are feedback networks with closed
loops. Following are the two types of recurrent networks.
• Fully recurrent network − It is the simplest neural network
architecture because all nodes are connected to all other nodes and
each node works as both input and output.
• Jordan network − It is a closed loop network in which the output
will go to the input again as feedback.



Feedback Network

• Fully recurrent network

• Jordan network



Adaptive Resonance Theory (ART)
• Adaptive resonance theory is a type of neural network technique developed
by Stephen Grossberg and Gail Carpenter in 1987. The basic ART uses an
unsupervised learning technique.
• The terms "adaptive" and "resonance" suggest that these networks are
open to new learning (i.e. adaptive) without discarding the previous or
old information (i.e. resonance).
• ART networks are known to solve the stability-plasticity dilemma: stability
refers to their ability to memorize what has been learned, and plasticity refers
to the fact that they are flexible enough to gain new information. Because of
this, ART networks are always able to learn new input patterns without
forgetting the past.
• ART networks implement a clustering algorithm. An input is presented to the
network and the algorithm checks whether it fits into one of the already
stored clusters. If it fits, the input is added to the cluster that matches it
best; otherwise a new cluster is formed.
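The clustering behavior described above can be illustrated with a heavily simplified sketch. This is not the full ART1 algorithm: the match score (fraction of the input's 1-bits shared with a stored prototype), the hypothetical `vigilance` threshold and the prototype update by logical AND are assumptions chosen to show the "fit or create a new cluster" idea on binary inputs.

```python
def match_score(x, prototype):
    """Fraction of the 1-bits of x that are also set in the prototype (x must have a 1-bit)."""
    overlap = sum(a & b for a, b in zip(x, prototype))
    return overlap / sum(x)


def art_like_cluster(inputs, vigilance):
    """Assign each binary input to the best-matching cluster or open a new one."""
    prototypes, labels = [], []
    for x in inputs:
        scores = [match_score(x, p) for p in prototypes]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= vigilance:
            # The input fits: refine the stored prototype and reuse the cluster.
            prototypes[best] = [a & b for a, b in zip(x, prototypes[best])]
            labels.append(best)
        else:
            # No stored cluster fits well enough: form a new cluster.
            prototypes.append(list(x))
            labels.append(len(prototypes) - 1)
    return prototypes, labels


inputs = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
prototypes, labels = art_like_cluster(inputs, vigilance=0.6)
print(labels)      # [0, 0, 1, 1]
print(prototypes)  # [[1, 1, 0, 0], [0, 0, 0, 1]]
```

Earlier clusters are kept when a new one is added, which mirrors the idea of learning new patterns without forgetting old ones.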
Adaptive Resonance Theory (ART)

• Types of Adaptive Resonance Theory (ART)
Carpenter and Grossberg developed different ART architectures as a result
of 20 years of research. The ARTs can be classified as follows:
• ART1 – It is the simplest and most basic ART architecture. It is capable of
clustering binary input values.
• ART2 – It is an extension of ART1 that is capable of clustering
continuous-valued input data.
• Fuzzy ART – It is the augmentation of ART with fuzzy logic.
• ARTMAP – It is a supervised form of ART learning where one ART module learns
based on the previous ART module. It is also known as predictive ART.
• FARTMAP – This is a supervised ART architecture with fuzzy logic
included.



Adaptive Resonance Theory (ART)
• Basics of Adaptive Resonance Theory (ART) Architecture
An adaptive resonance theory network is a type of neural network that is
self-organizing and competitive. It can be of either type: the
unsupervised ones (ART1, ART2, ART3, etc.) or the supervised
ones (ARTMAP). Generally, the supervised algorithms are named
with the suffix "MAP".
The basic ART model is unsupervised in nature and consists of :
• F1 layer or the comparison field (where the inputs are processed)
• F2 layer or the recognition field (which consists of the clustering
units)
• The Reset Module (which acts as a control mechanism)



Advantage of Adaptive Resonance Theory (ART)

• It exhibits stability and is not disturbed by a wide variety of inputs
provided to its network.
• It can be integrated and used with various other techniques to give
better results.
• It can be used in various fields such as mobile robot control, face
recognition, land cover classification, target recognition, medical
diagnosis, signature verification, clustering web users, etc.
• It has advantages over competitive learning (and over networks such as BPNN).
Competitive learning lacks the capability to add new clusters when
deemed necessary, and it does not guarantee stability in forming clusters.



Limitations of Adaptive Resonance Theory
• Some ART networks (like Fuzzy ART and ART1) are inconsistent,
as their results depend upon the order in which the training data is
presented, or upon the learning rate.



Self-organizing map
• The Self-Organizing Map (SOM), or Kohonen map, is a type of artificial neural
network introduced by Teuvo Kohonen in the 1980s.
• SOM is trained using unsupervised learning and is a little different
from other artificial neural networks: it does not learn by
backpropagation with SGD; it uses competitive learning to adjust the
weights of its neurons. This type of artificial neural
network is used for dimension reduction, reducing the data by creating a
spatially organized representation. It also helps to discover
correlations in the data.



SOM’s architecture
• Self-organizing maps have two layers: the first one is the input layer and the
second one is the output layer, or feature map.
• Unlike other ANN types, SOM neurons have no activation function; the
weights are passed directly to the output layer without further processing.
• Each neuron in a SOM is assigned a weight vector with the same dimensionality
d as the input space.



Self organizing maps training
• As mentioned before, SOM does not use backpropagation with SGD
to update weights; this type of unsupervised artificial neural
network uses competitive learning to update its weights.
• Competitive learning is based on three processes :
• Competition
• Cooperation
• Adaptation
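The three processes can be sketched as a single training step in Python. The details below are assumptions for illustration: neurons are arranged on a one-dimensional map, competition picks the neuron with the smallest squared Euclidean distance to the input, cooperation uses a Gaussian neighbourhood of width sigma around the winner, and adaptation moves each weight vector toward the input with rate alpha.

```python
import math


def som_step(weights, x, alpha, sigma):
    """One SOM update on a 1-D map; weights is a list of weight vectors."""
    # Competition: the best matching unit (BMU) is closest to the input.
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    bmu = dists.index(min(dists))
    for j, w in enumerate(weights):
        # Cooperation: neighbours of the BMU on the map are also updated,
        # with a strength that decays with map distance (Gaussian assumed).
        h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
        # Adaptation: move the weight vector toward the input.
        weights[j] = [wi + alpha * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu


# Three neurons with illustrative (assumed) initial weight vectors.
som = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
bmu = som_step(som, [0.9, 1.0], alpha=0.5, sigma=1.0)
print(bmu, som)
```

The winner moves the most in proportion to its distance from the input, and its map neighbours follow with smaller neighbourhood strength.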
