unit 4 5 NN
SOMs are used for clustering, visualization, and feature extraction in datasets where the structure may not be
immediately apparent.
1. Topological Preservation
o Similar data points in the input space are mapped to neighboring nodes in the output space.
2. Competitive Learning
o Neurons in the SOM compete for each input during training, and only one neuron (or a few) is
updated, leading to specialization.
3. Neighborhood Function
o Neurons surrounding the "winning neuron" (Best Matching Unit - BMU) are also updated,
maintaining the smoothness of the map.
4. Dimensionality Reduction
o SOMs reduce high-dimensional data into a comprehensible and visualizable format.
SOM Structure
1. Input Layer
o Accepts the high-dimensional input data.
2. Map Layer
o A grid of neurons (e.g., 2D lattice) where each neuron has a weight vector of the same dimension
as the input data.
SOM Algorithm
1. Initialization
2. Input Selection
Select an input vector $\mathbf{x} = [x_1, x_2, \ldots, x_d]$ from the dataset.
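3. Best Matching Unit (BMU)
Find the neuron whose weight vector is closest to the input: $\mathrm{BMU} = \arg\min_i \|\mathbf{x} - \mathbf{w}_i\|$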
4. Neighborhood Function
Define a neighborhood around the BMU. Commonly used neighborhood functions include Gaussian or
bubble functions.
The neighborhood size shrinks over time:
$h_{i,BMU}(t) = \exp\left(-\frac{\|r_i - r_{BMU}\|^2}{2\sigma(t)^2}\right)$
where $r_i$ is the location of neuron $i$, and $\sigma(t)$ is the neighborhood radius at time $t$.
5. Weight Update
Update the weight vector of the BMU and its neighbors using:
$\mathbf{w}_i(t+1) = \mathbf{w}_i(t) + \eta(t) \cdot h_{i,BMU}(t) \cdot (\mathbf{x} - \mathbf{w}_i(t))$
where:
o $\eta(t)$: Learning rate (decreases over time).
o $h_{i,BMU}(t)$: Neighborhood function.
6. Repeat
Repeat steps 2–5 for all input vectors and for multiple iterations, reducing $\sigma(t)$ and $\eta(t)$ over time.
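The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the text does not say how the weights are initialized or how $\eta(t)$ and $\sigma(t)$ decay, so random initialization in [0, 1) and an exponential decay are assumed here, and the names train_som and find_bmu are hypothetical.

```python
import math
import random


def find_bmu(weights, x):
    """Step 3: grid position of the neuron whose weight vector is closest to x.

    Minimizing the squared distance gives the same winner as the Euclidean norm.
    """
    return min(
        weights,
        key=lambda pos: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[pos])),
    )


def train_som(data, rows, cols, n_iters=500, eta0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Step 1 (assumed): random initial weights in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for t in range(n_iters):
        # Assumed schedule: eta(t) and sigma(t) decay exponentially.
        decay = math.exp(-t / n_iters)
        eta, sigma = eta0 * decay, sigma0 * decay
        x = rng.choice(data)            # Step 2: input selection
        bmu = find_bmu(weights, x)      # Step 3: best matching unit
        for pos, w in weights.items():
            grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2 * sigma ** 2))   # Step 4: Gaussian neighborhood
            for k in range(dim):                        # Step 5: weight update
                w[k] += eta * h * (x[k] - w[k])
    return weights


data = [[0.0, 0.0], [0.1, 0.0], [0.9, 1.0], [1.0, 1.0]]
som = train_som(data, rows=3, cols=3)
print(find_bmu(som, [0.0, 0.0]), find_bmu(som, [1.0, 1.0]))
```

Because every update moves a weight part of the way toward an input, the weights stay inside the range spanned by the initial weights and the data.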
Applications of SOM
1. Clustering
o Organizing data into clusters without predefined labels.
2. Data Visualization
o Mapping high-dimensional data into 2D for better interpretation.
3. Pattern Recognition
o Identifying and classifying patterns in data.
4. Dimensionality Reduction
o Simplifying datasets while retaining structure.
5. Anomaly Detection
o Identifying data points that deviate significantly from the norm.
Advantages of SOM
Topology-preserving mapping.
Suitable for high-dimensional and unlabeled data.
Provides intuitive visualizations.
Limitations of SOM
SOMs are a powerful tool in neural networks for unsupervised learning tasks, especially when data exploration
and visualization are critical.
In neural networks, a feature map refers to the output of a convolutional layer or similar operation that
transforms the input data into a set of learned features. Here are some key properties of feature maps:
1. Spatial Dimensions: A feature map typically has width and height dimensions corresponding to the size
of the input data after convolution or pooling operations. It represents the spatial arrangement of features
detected in the input.
2. Depth/Channels: A feature map has depth, representing the number of filters applied in a convolution
layer. Each filter generates one channel, capturing a different aspect of the data (such as edges, textures,
or shapes).
3. Activation: The values in a feature map are often passed through an activation function (like ReLU) that
introduces non-linearity, allowing the model to learn complex patterns.
4. Learned Features: Each feature map captures certain features of the input data. For example, in image
recognition, different feature maps may highlight edges, corners, or textures.
5. Local Receptive Field: The size of the region from the input that contributes to each feature map
element is called the receptive field. It determines how much context each feature map can capture.
6. Pooling: Pooling layers (like max pooling or average pooling) applied to feature maps reduce their
spatial dimensions, retaining important features while making the model more computationally efficient.
7. Depth Control: The depth of the feature map can be adjusted by changing the number of filters used in
the convolution operation, which directly impacts the capacity of the model to learn diverse features.
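The spatial size and depth described above can be computed directly. The sketch below uses the standard output-size relation for a convolution or pooling window without dilation, floor((in + 2*padding - kernel) / stride) + 1. The function names are hypothetical and only for illustration.

```python
def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a convolution or pooling window."""
    return (in_size + 2 * padding - kernel) // stride + 1


def feature_map_shape(height, width, n_filters, kernel, stride=1, padding=0):
    """(height, width, depth) of the feature map; depth equals the number of filters."""
    return (
        conv_output_size(height, kernel, stride, padding),
        conv_output_size(width, kernel, stride, padding),
        n_filters,
    )


# A 32x32 input with 16 filters of size 3x3, stride 1, no padding.
shape = feature_map_shape(32, 32, n_filters=16, kernel=3)
print(shape)

# 2x2 max pooling with stride 2 halves each spatial dimension; depth is unchanged.
pooled = (conv_output_size(shape[0], 2, stride=2),
          conv_output_size(shape[1], 2, stride=2),
          shape[2])
print(pooled)
```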
Computer Simulations
Computer Simulations in Neural Networks involve creating virtual environments or systems to study, design,
and test neural network models. These simulations allow researchers and developers to explore the behavior and
performance of neural networks without the need for physical implementation. Here are the key aspects:
1. Purpose of Computer Simulations:
Model Testing: Test how neural networks perform on specific datasets or tasks.
Parameter Tuning: Experiment with hyperparameters like learning rate, number of layers, or activation
functions.
Behavior Analysis: Understand how a network learns and identifies patterns.
Scalability and Performance: Simulate large-scale networks to analyze computational requirements.
2. Advantages:
Cost-Effective: Simulations reduce the need for physical resources like specialized hardware or devices.
Time-Saving: Networks can be trained and tested in a controlled virtual setup, speeding up development.
Safety: Useful in scenarios where physical testing is risky, such as robotics or autonomous vehicles.
Insights: Provides detailed visualizations and logs of internal processes like weight updates and feature
extraction.
3. Tools and Platforms:
Frameworks: Tools like TensorFlow, PyTorch, Keras, and CNTK allow creating and simulating neural networks
easily.
Cloud Simulations: Services like Google Colab or AWS provide powerful cloud GPUs for training and simulating
complex neural networks.
Visualization Tools: Platforms like TensorBoard visualize training progress, loss curves, and feature maps.
4. Applications:
AI Development: Testing AI for natural language processing, image recognition, or game-playing strategies.
Research: Simulating neural networks for academic purposes, such as studying how learning algorithms evolve.
Optimization: Experimenting with various neural architectures to find the most efficient design.
5. Limitations:
Example:
In autonomous driving, a neural network can be trained in a simulation environment to recognize pedestrians,
vehicles, and traffic signs. This helps verify that the model is robust before it is deployed in a real car.
Learning Vector Quantization (LVQ) is a supervised machine learning algorithm that combines ideas from
neural networks and vector quantization. It is primarily used for classification tasks and works by finding
prototypes (representative vectors) for each class in the data. Here's an overview:
1. Prototype-Based Representation:
o Instead of storing all training data, LVQ represents each class with a few prototype vectors.
o These prototypes act as reference points for classifying new input data.
2. Supervised Learning:
o Unlike unsupervised methods like k-means, LVQ requires labeled data to guide the placement of
prototypes.
3. Distance-Based Classification:
o The classification of an input vector is based on the prototype it is closest to, typically using the
Euclidean distance.
1. Initialization:
o Choose the number of prototypes for each class.
o Initialize these prototypes randomly or by selecting representative samples from the training data.
2. Training:
o For each input sample:
1. Find the nearest prototype to the input.
2. Adjust the prototype's position:
If the prototype belongs to the correct class, move it closer to the input.
If the prototype belongs to the wrong class, move it farther from the input.
o This adjustment uses a learning rate and is repeated for several epochs.
3. Classification:
o After training, classify a new input by assigning it the class of the nearest prototype.
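The training and classification steps above correspond to the basic LVQ1 rule, which can be sketched as follows. This is a minimal sketch, not a reference implementation: the fixed learning rate, the number of epochs, the toy data, and the names nearest, train_lvq1, and classify are all assumptions made for illustration.

```python
import random


def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(prototypes[i][0], x)),
    )


def train_lvq1(samples, prototypes, lr=0.1, epochs=20, seed=0):
    """samples: list of (vector, label). prototypes: list of [vector, label], updated in place."""
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, label in samples:
            vec, proto_label = prototypes[nearest(prototypes, x)]
            # Correct class: move closer to the input. Wrong class: move away.
            sign = 1.0 if proto_label == label else -1.0
            for k in range(len(vec)):
                vec[k] += sign * lr * (x[k] - vec[k])
    return prototypes


def classify(prototypes, x):
    """Assign x the class of its nearest prototype."""
    return prototypes[nearest(prototypes, x)][1]


samples = [
    ([0.0, 0.0], "a"), ([0.1, 0.2], "a"), ([0.2, 0.1], "a"),
    ([1.0, 1.0], "b"), ([0.9, 0.8], "b"), ([0.8, 0.9], "b"),
]
# One prototype per class, initialized from a training sample of that class.
prototypes = [[[0.1, 0.2], "a"], [[0.9, 0.8], "b"]]
train_lvq1(samples, prototypes)
print(classify(prototypes, [0.05, 0.05]), classify(prototypes, [0.95, 0.95]))
```

In practice the learning rate is usually decreased over the epochs; it is kept constant here to keep the sketch short.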
Advantages:
Applications:
Example:
In a handwriting recognition task, prototypes might represent typical shapes of each letter. During training,
these prototypes adjust their positions to better represent the letter shapes in the dataset, making it easier to
classify new handwritten inputs.
Adaptive Pattern Classification in neural networks refers to the ability of a neural network to learn, recognize,
and adapt to patterns in data by updating its parameters (weights and biases) during the training process. This
concept is a cornerstone of neural networks and allows them to generalize across different types of data for
tasks like classification, recognition, and prediction.
1. Adaptivity:
o Neural networks adapt their parameters through iterative learning algorithms like backpropagation.
o This adaptation enables the network to improve its pattern recognition performance over time.
2. Generalization:
o The network learns to classify unseen patterns (test data) by generalizing from the training data.
3. Nonlinear Mapping:
o Neural networks can model complex and nonlinear relationships between input patterns and their
corresponding classes.
4. Layered Processing:
o Multiple layers in the network (input, hidden, output) allow hierarchical feature extraction, enhancing
pattern classification capability.
5. Error-Driven Learning:
o The network minimizes a loss function (e.g., mean squared error, cross-entropy) by adjusting weights,
making its classification decisions more accurate.
Steps in Adaptive Pattern Classification:
1. Data Preprocessing:
o Normalize or scale the input data for better network performance.
o Split the data into training, validation, and test sets.
2. Feature Extraction:
o The network learns important features of input patterns through its hidden layers.
3. Model Training:
o Use algorithms like stochastic gradient descent (SGD) or Adam for weight updates.
o Apply supervised learning to adjust weights based on labeled data.
4. Pattern Classification:
o After training, the network assigns class labels to input patterns based on its learned decision
boundaries.
5. Performance Evaluation:
o Measure classification accuracy, precision, recall, and F1-score to evaluate the model.
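The evaluation measures in step 5 can be computed from the true and predicted labels. The sketch below handles the binary case only; the function name classification_metrics and the example labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 3/4.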
Advantages:
Dynamic Learning: Neural networks can learn and adapt to new data.
Handles Complex Patterns: Suitable for tasks involving high-dimensional and nonlinear patterns.
Scalable: Can be extended to larger datasets and deeper architectures.
Disadvantages:
Training Complexity: Requires significant computational resources and time for large models.
Overfitting: Networks may overfit to the training data if not properly regularized.
Data Dependency: Performance relies heavily on the quality and quantity of labeled data.
Applications:
Image Recognition: Classifying images into categories (e.g., identifying cats vs. dogs).
Speech and Audio Processing: Recognizing spoken words or musical genres.
Medical Diagnosis: Classifying diseases based on patient data.
Financial Forecasting: Predicting market trends and classifying risks.
Example: In a digit recognition task (e.g., MNIST), an adaptive pattern classification neural network learns to
distinguish handwritten digits by adapting its parameters to represent features like edges, curves, and shapes.
After training, it can accurately classify new digit images.
Neuro Dynamics in Neural Networks
Neuro dynamics refers to the study of how neural networks evolve over time as dynamic systems. It involves
understanding how the states of neurons or nodes change based on interactions and inputs, leading to pattern
formation, stability, or complex behaviors in the network.
A dynamical system is a mathematical framework to describe how the state of a system evolves over time
based on certain rules or equations. In the context of neural networks:
Each neuron's state (e.g., activation level) evolves according to its inputs, weights, and biases.
The system's behavior can be described using differential equations (for continuous-time networks) or
difference equations (for discrete-time networks).
Mathematical Formulation:
Equilibrium states are states where the system does not change over time (i.e., the system reaches a steady
state).
Stable Equilibrium: If small perturbations to the state return to the equilibrium, it is considered stable.
o Example: A ball resting at the bottom of a bowl.
Unstable Equilibrium: If small perturbations move the state away from the equilibrium, it is considered
unstable.
o Example: A ball balanced at the peak of a hill.
An equilibrium is locally stable if the eigenvalues of the Jacobian matrix of $f(x)$, evaluated at that
equilibrium, all have negative real parts (for continuous-time systems) or all have magnitude less than 1 (for
discrete-time systems).
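The eigenvalue test can be checked by hand for a small system. The sketch below is restricted to a 2x2 Jacobian, whose eigenvalues follow from the trace and determinant, and it assumes the Jacobian has already been evaluated at the equilibrium of interest. The function names and example matrices are hypothetical.

```python
import cmath


def eigenvalues_2x2(J):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = J
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


def is_stable(J, continuous=True):
    """Continuous time: all real parts negative. Discrete time: all magnitudes below 1."""
    eigs = eigenvalues_2x2(J)
    if continuous:
        return all(ev.real < 0 for ev in eigs)
    return all(abs(ev) < 1 for ev in eigs)


# Eigenvalues -1 +/- 2j: a stable spiral in continuous time.
J_spiral = [[-1.0, 2.0], [-2.0, -1.0]]
# Eigenvalues 0.5 and -0.3: stable as a discrete-time map.
J_map = [[0.5, 0.0], [0.0, -0.3]]

print(is_stable(J_spiral, continuous=True), is_stable(J_map, continuous=False))
```

The same matrix can be stable under one criterion and unstable under the other, so it matters whether the model is continuous-time or discrete-time.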
3. Attractors
Attractors are states or sets of states toward which the system evolves over time.
Attractors help neural networks model natural phenomena such as memory (point attractors) or oscillations in
biological systems (limit cycles).
Neuro dynamical models use the principles of dynamical systems to describe and simulate the behavior of
neurons and networks.
1. Hopfield Networks:
o A recurrent neural network where neurons update asynchronously.
o Designed to minimize an energy function.
o States evolve toward stable equilibria, useful for associative memory and optimization tasks.
7. Winner-Take-All Networks:
o Networks where one neuron or group dominates the activity, inhibiting others.
o Models competitive dynamics and decision-making.
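One simple way to see winner-take-all dynamics is the MAXNET scheme, in which every neuron is inhibited by the summed activity of all the others until only one stays active. The text does not name a specific model, so MAXNET is used here only as an assumed example; the function name and the inhibition strength are illustrative.

```python
def winner_take_all(activations, inhibition=None, max_steps=1000):
    """Iterate mutual inhibition until at most one neuron remains active.

    The inhibition strength should be smaller than 1 / (n - 1).
    Exact ties between the largest activations are not resolved.
    """
    x = [max(0.0, a) for a in activations]
    n = len(x)
    eps = inhibition if inhibition is not None else 1.0 / (2 * n)
    for _ in range(max_steps):
        if sum(1 for v in x if v > 0) <= 1:
            break
        total = sum(x)
        # Each neuron is suppressed by the summed activity of all other neurons.
        x = [max(0.0, v - eps * (total - v)) for v in x]
    return x


final = winner_take_all([0.2, 0.5, 0.4], inhibition=0.2)
winner = final.index(max(final))
print(winner, final)
```

The neuron that starts with the largest activation is the only one left active, although its final activation is lower than its initial value.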
Memory Models: Hopfield networks use point attractors to store and retrieve patterns.
Oscillatory Behavior: Limit cycle attractors simulate brain rhythms.
Decision-Making: Winner-take-all networks model competitive decision processes.
Biological Processes: Spiking neural networks mimic real neuronal activity.
Visual Representation
Imagine a landscape where the state of the neural network is represented by a ball moving on the surface:
By understanding neuro dynamics, researchers can design neural networks that are stable, robust, and capable of
modeling complex patterns like those observed in real-world systems or biological brains.
In the context of recurrent neural networks (RNNs), the manipulation of attractors refers to altering or
designing the dynamics of a network so that its state-space behavior is guided toward specific desired patterns
or outputs. Attractors are stable states or regions in the state space of a dynamical system where the system
tends to settle, making them crucial in understanding and controlling RNN behavior.
This paradigm focuses on shaping the attractor landscape of the network for specific applications, such as
associative memory, pattern recognition, and solving optimization problems.
Key Concepts
1. Associative Memory
o Networks like Hopfield networks use point attractors to store and retrieve patterns. Partial or
noisy inputs evolve toward the nearest stored pattern.
2. Pattern Recognition
o Patterns or features are encoded as attractors. The system identifies patterns by evolving its state
toward the nearest attractor.
3. Optimization Problems
o Attractors represent optimal solutions. The network's dynamics guide the system toward these
solutions (e.g., traveling salesman problem).
4. Dynamic Systems Modeling
o Limit cycles and strange attractors are used to model periodic or chaotic phenomena, such as
weather or biological rhythms.
Advantages
Challenges
1. Complexity of Dynamics
o Manipulating attractors in high-dimensional systems can be computationally intensive.
2. Unintended Attractors
o Improper training may lead to spurious attractors, causing errors in retrieval or convergence.
3. Stability vs. Flexibility
o Designing a network with attractors that are both stable and flexible for varying inputs is
non-trivial.
Conclusion
The manipulation of attractors paradigm leverages the inherent dynamics of recurrent neural networks to
achieve tasks like memory recall, pattern recognition, and optimization. By carefully designing the network's
weight structure and energy landscape, it is possible to encode specific behaviors and ensure robust
performance in a variety of real-world applications.
Hopfield
The Hopfield neural network, invented by Dr. John J. Hopfield, consists of one layer of 'n' fully connected
recurrent neurons. It is generally used for auto-association and optimization tasks. Its output is computed
through a converging iterative process, so it responds differently from ordinary feedforward neural networks.
Discrete Hopfield Network
It is a fully interconnected neural network where each unit is connected to every other unit. It behaves in a
discrete manner, i.e. it gives finite distinct output, generally of two types:
Binary (0/1)
Bipolar (-1/1)
The weights associated with this network are symmetric in nature and have the following properties.
1. $w_{ij} = w_{ji}$
2. $w_{ii} = 0$
Structure & Architecture of Hopfield Network
Each neuron has an inverting and a non-inverting output.
Being fully connected, the output of each neuron is an input to all other neurons but not to itself.
The below figure shows a sample representation of a Discrete Hopfield Neural Network architecture having the
following elements.
The Hopfield model is a type of recurrent neural network designed for associative memory. It stores patterns
and retrieves them based on partial or noisy input, acting like a content-addressable memory system.
The Hopfield network is made of neurons, each connected to every other neuron.
Each neuron can take a value of +1 or -1 (bipolar states).
The connection between two neurons $i$ and $j$ has a weight $w_{ij}$.
Start with all weights as 0: $w_{ij} = 0$ for all $i, j$
where:
o $P_{ki}$ and $P_{kj}$: States of neurons $i$ and $j$ in pattern $P_k$.
o $N$: Number of neurons.
Repeat this process for all patterns, summing the contributions.
Key Rules:
o Do not allow self-connections: $w_{ii} = 0$.
After processing all patterns, the final weights represent the memory of the network.
Example
If you want to store $P_1 = [+1, -1]$ and $P_2 = [-1, +1]$:
1. For $P_1$:
$w_{11} = 0, \quad w_{12} = (+1)(-1), \quad w_{21} = (-1)(+1), \quad w_{22} = 0$
2. For $P_2$:
Update the same weights with contributions from $P_2$.
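The storage procedure and recall from a noisy input can be sketched as below. Two assumptions are made here: each pattern contributes $P_{ki} P_{kj} / N$ to $w_{ij}$ (the usual Hebbian rule; the $1/N$ scaling is assumed and does not change which states are stable), and a neuron with zero net input is set to +1. The function names and the 8-neuron patterns are hypothetical.

```python
def train_hopfield(patterns):
    """Sum the Hebbian contribution of each bipolar pattern; no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]       # start with all weights at 0
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                   # keep w_ii = 0
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w, state, sweeps=10):
    """Asynchronous recall: update one neuron at a time until nothing changes."""
    s = list(state)
    n = len(s)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            field = sum(w[i][j] * s[j] for j in range(n))
            new = 1 if field >= 0 else -1    # assumed tie-break: zero input gives +1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


# The two-neuron example from the text.
w_small = train_hopfield([[1, -1], [-1, 1]])
print(w_small)

# Two orthogonal 8-neuron patterns; the first is recalled from a copy with one flipped state.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hopfield([p1, p2])
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
restored = recall(w, noisy)
print(restored)
```

For the two-neuron example both patterns contribute -1/2 to $w_{12}$ and $w_{21}$, so the stored weights are symmetric with a zero diagonal.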
Why is It Simple?
Real-Life Analogy
This simple design makes the Hopfield model intuitive and efficient for small datasets.