
Artificial Neural Network: Introduction

 An artificial neural network (ANN) may be defined as an information-processing model that is inspired by the way biological nervous systems, such as the brain, process information.
 This model tries to replicate only the most basic functions of the brain.
 An ANN is composed of a large number of highly interconnected processing units (neurons) working together in a network to solve specific problems.
 Like human beings, artificial neural networks learn by example.
 An ANN is configured for a specific application, such as spam classification, face recognition or pattern recognition, through a learning process.
 Each neuron is connected to the others by connection links.
 Each connection link is associated with a weight which carries information about the input signal.
 This information is used by the neural network to solve a particular problem.
 ANNs' collective behaviour is characterized by their ability to learn, recall and generalize training patterns or data, similar to the human brain.
 They have the capability to model networks of biological neurons as found in the brain. Thus, the ANN processing elements are called neurons or artificial neurons.
 Each neuron has an internal state of its own, called the activation level of the neuron.
 The activation signal of a neuron is transmitted to other neurons.
 A neuron can send only one signal at a time, which can be transmitted to several other neurons.
Example:

In the above example there are two layers, one input layer and one output layer; the minimum condition to construct any neural network is that it should have at least two layers.
In this example the input layer has two neurons (X1, X2) and there is one output neuron (Y), making a total of three neurons. x1 and x2 are the inputs to X1 and X2, and y is the output of neuron Y.
The neurons are connected by separate links, and each link is associated with a weight. Initially some random weight is assigned to each link. Computation is performed and an output is obtained; once the termination condition is satisfied (say, error less than 0.5), whatever weights are in use are considered valid. If the termination condition is not satisfied, the weights need to be modified. To modify the weights, algorithms such as gradient descent and backpropagation are applied to update them.
Usually every neuron performs two types of computational operation:

 Summation
 Apply activation function (activation functions are not fixed; based on requirements a different function, such as sigmoid, can be chosen)
i.e., for the simple neuron net architecture above, the net input is calculated in the following way:
Summation: y_in = x1*w1 + x2*w2
Apply activation function: y = f(y_in)
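As a minimal sketch in Python (assuming a sigmoid activation and arbitrary example values for the inputs and weights):

import math

def sigmoid(y_in):
    # Squashes the net input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-y_in))

# Example inputs and (randomly chosen) weights -- illustrative values only
x1, x2 = 0.6, 0.4
w1, w2 = 0.3, 0.7

y_in = x1 * w1 + x2 * w2   # summation step
y = sigmoid(y_in)          # activation step
print(y_in, y)             # net input and neuron output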
STATE THE MAJOR DIFFERENCES BETWEEN BIOLOGICAL AND
ARTIFICIAL NEURAL NETWORKS
1. Size: Our brain contains about 86 billion neurons and more than 100 trillion synapses
(connections). The number of "neurons" in artificial networks is much smaller than that.
2. Signal transport and processing: The human brain works asynchronously, ANNs work
synchronously.
3. Processing speed: Single biological neurons are slow, while standard neurons in ANNs are
fast.
4. Topology: Biological neural networks have complicated topologies, while ANNs are often
in a tree structure.
5. Speed: certain biological neurons fire around 200 times a second on average. Signals travel at different speeds depending on the type of nerve impulse, ranging from 0.61 m/s up to 119 m/s. Signal travel speed also varies from person to person depending on sex, age, height, temperature, medical condition, lack of sleep, etc. Information in artificial neurons is carried by continuous floating-point values, the synaptic weights. There are no refractory periods in artificial neural networks (periods during which it is impossible to send another action potential because the sodium channels are locked shut), and artificial neurons do not experience "fatigue": they are functions that can be evaluated as many times, and as fast, as the computer architecture allows.
6. Fault-tolerance: biological neural networks, due to their topology, are also fault-tolerant. Artificial neural networks are not modelled for fault tolerance or self-regeneration (similarly to fatigue, these ideas are not applicable to matrix operations), though recovery is possible by saving the current state (weight values) of the model and continuing the training from that saved state.
7. Power consumption: the brain consumes about 20% of all the human body's energy; despite its large share, an adult brain operates on about 20 watts (barely enough to dimly light a bulb), which is extremely efficient. Taking into account that humans can still operate for a while on nothing more than some vitamin-C-rich lemon juice and beef tallow, this is quite remarkable. For a benchmark: a single Nvidia GeForce Titan X GPU runs on 250 watts alone, and requires a power supply. Our machines are far less efficient than biological systems. Computers also generate a lot of heat when used, with consumer GPUs operating safely between 50–80 °C instead of 36.5–37.5 °C.
8. Learning: we still do not understand how brains learn, or how redundant connections store and recall information. By learning, we build on information that is already stored in the brain. Our knowledge deepens by repetition and during sleep, and tasks that once required focus can be executed automatically once mastered. Artificial neural networks, on the other hand, have a predefined model, where no further neurons or connections can be added or removed. Only the weights of the connections (and the biases representing thresholds) can change during training. The network starts with random weight values and slowly tries to reach a point where further changes in the weights would no longer improve performance. Biological networks usually don't stop/start learning; ANNs have distinct fitting (training) and prediction (evaluation) phases.
9. Field of application: ANNs are specialized. They can perform one task. They might be perfect at playing chess, but they fail at playing Go (or vice versa). Biological neural networks can learn completely new tasks.
10. Training algorithm: ANNs use gradient descent for learning. Human brains use something different.
APPLICATIONS OF ANN
1. Data Mining: Discovery of meaningful patterns (knowledge) from large volumes of data.
2. Expert Systems: A computer program for decision making that simulates thought process
of a human expert.
3. Fuzzy Logic: Theory of approximate reasoning.
4. Artificial Life: Evolutionary Computation, Swarm Intelligence.
5. Artificial Immune System: A computer program based on the biological immune system.
6. Medical: At the moment, research is mostly on modelling parts of the human body and recognizing diseases from various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.). Neural networks are ideal for recognizing diseases from scans since there is no need to provide a specific algorithm for how to identify the disease. Neural networks learn by example, so the details of how to recognize the disease are not needed. What is needed is a set of examples that are representative of all the variations of the disease. The quantity of examples is not as important as the quality. The examples need to be selected very carefully if the system is to perform reliably and efficiently.
7. Computer Science: Researchers in quest of artificial intelligence have created spin-offs like dynamic programming, object-oriented programming, symbolic programming, intelligent storage-management systems and many more such tools. The primary goal of creating an artificial intelligence still remains a distant dream, but people are getting an idea of the ultimate path which could lead to it.
8. Aviation: Airlines use expert systems in planes to monitor atmospheric conditions and
system status. The plane can be put on autopilot once a course is set for the destination.
9. Weather Forecast: Neural networks are used for predicting weather conditions. Previous
data is fed to a neural network, which learns the pattern and uses that knowledge to predict
weather patterns.
10. Neural networks in business: Business is a diverse field with several general areas of specialization, such as accounting or financial analysis. Almost any neural network application would fit into one of these business areas.
11. There is some potential for using neural networks for business purposes, including
resource allocation and scheduling.
12. There is also strong potential for using neural networks for database mining, that is, searching for patterns implicit within the explicitly stored information in databases. Most of the funded work in this area is classified as proprietary, so it is not possible to report on the full extent of the work going on. Most work applies neural networks, such as the Hopfield-Tank network, to optimization and scheduling.
13. Marketing: There is a marketing application which has been integrated with a neural
network system. The Airline Marketing Tactician (a trademark abbreviated as AMT) is a
computer system made of various intelligent technologies including expert systems. A feed
forward neural network is integrated with the AMT and was trained using back-propagation
to assist the marketing control of airline seat allocations. The adaptive neural approach was
amenable to rule expression. Additionally, the application's environment changed rapidly and
constantly, which required a continuously adaptive solution.
14. Credit evaluation: The HNC company, founded by Robert Hecht-Nielsen, has developed several neural network applications. One of them is a credit scoring system which increases the profitability of the existing model by up to 27%. The HNC neural systems were also applied to mortgage screening. A neural-network automated mortgage-insurance underwriting system was developed by the Nestor Company. This system was trained with 5048 applications, of which 2597 were certified. The data related to property and borrower qualifications. In a conservative mode the system agreed with the underwriters on 97% of the cases; in a liberal mode it agreed on 84% of the cases. The system ran on an Apollo DN3000 and used 250 KB of memory while processing a case file in approximately 1 second.
ADVANTAGES OF ANN
1. Adaptive learning: An ability to learn how to do tasks based on the data given for training
or initial experience.
2. Self-Organisation: An ANN can create its own organisation or representation of the
information it receives during learning time.
3. Real Time Operation: ANN computations may be carried out in parallel, and special
hardware devices are being designed and manufactured which take advantage of this
capability.
4. Pattern recognition: a powerful technique for harnessing the information in the data and generalizing about it. Neural nets learn to recognize the patterns which exist in the data set.
5. The system is developed through learning rather than programming. Neural nets teach themselves the patterns in the data, freeing the analyst for more interesting work.
6. Neural networks are flexible in a changing environment. Although neural networks may take some time to learn a sudden drastic change, they are excellent at adapting to constantly changing information.
7. Neural networks can build informative models whenever conventional approaches fail.
Because neural networks can handle very complex interactions they can easily model data
which is too difficult to model with traditional approaches such as inferential statistics or
programming logic.
8. Performance of neural networks is at least as good as classical statistical modelling, and
better on most problems. The neural networks build models that are more reflective of the
structure of the data in significantly less time.
LIMITATIONS OF ANN
In this technological era everything has Merits and some Demerits in others words there is a
Limitation with every system which makes this ANN technology weak in some points. The
various Limitations of ANN are:-
1) ANN is not a daily life general purpose problem solver.
2) There is no structured methodology available in ANN.
3) There is no single standardized paradigm for ANN development.
4) The Output Quality of an ANN may be unpredictable.
5) Many ANN Systems does not describe how they solve problems.
6) Black box Nature
7) Greater computational burden.
8) Proneness to over fitting.
9) Empirical nature of model development
ARTIFICIAL NEURAL NETWORK CONCEPTS/TERMINOLOGY
Here is a glossary of basic terms you should be familiar with before learning the details of
neural networks.
Inputs: Source data fed into the neural network, with the goal of making a decision or
prediction about the data. Inputs to a neural network are typically a set of real values; each
value is fed into one of the neurons in the input layer.
Training Set: A set of inputs for which the correct outputs are known, used to train the neural
network.
Outputs : Neural networks generate their predictions in the form of a set of real values or
boolean decisions. Each output value is generated by one of the neurons in the output layer.
Neuron/perceptron: The basic unit of the neural network. It accepts an input and generates a prediction. Each neuron accepts part of the input and passes it through the activation function. Common activation functions are sigmoid, tanh and ReLU. Activation functions help generate output values within an acceptable range, and their non-linear form is crucial for training the network.
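As a brief sketch (library-independent, using NumPy only for convenience), the three activation functions mentioned above can be written as:

import numpy as np

def sigmoid(x):
    # Maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Maps any real value into (-1, 1)
    return np.tanh(x)

def relu(x):
    # Passes positive values through, zeroes out negatives
    return np.maximum(0.0, x)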
Weight Space: Each neuron is given a numeric weight. The weights, together with the
activation function, define each neuron’s output. Neural networks are trained by fine-tuning
weights, to discover the optimal set of weights that generates the most accurate prediction.
Forward Pass: The forward pass takes the inputs, passes them through the network and
allows each neuron to react to a fraction of the input. Neurons generate their outputs and pass
them on to the next layer, until eventually the network generates an output.
Error Function: Defines how far the actual output of the current model is from the correct
output. When training the model, the objective is to minimize the error function and bring
output as close as possible to the correct value.
Backpropagation: In order to discover the optimal weights for the neurons, we perform a
backward pass, moving back from the network’s prediction to the neurons that generated that
prediction. This is called backpropagation. Backpropagation tracks the derivatives of the
activation functions in each successive neuron, to find weights that bring the loss function to
a minimum, which will generate the best prediction. This is a mathematical process called
gradient descent.
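As a minimal sketch of one gradient descent step (assuming a single weight w, a model prediction w * x, and a squared-error loss; the learning rate here is a hypothetical value):

# One gradient descent step for a single weight.
# Assumed setup: prediction = w * x, loss = (prediction - target) ** 2
def gradient_step(w, x, target, lr=0.1):
    prediction = w * x
    # dLoss/dw, obtained via the chain rule
    grad = 2.0 * (prediction - target) * x
    return w - lr * grad   # move against the gradient to reduce the loss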
Bias and Variance: When training neural networks, as in other machine learning techniques, we try to balance bias and variance. Bias measures how well the model fits the training set, i.e. how well it can predict the known outputs of the training examples. Variance measures how well the model works with unknown inputs that were not available during training. Another meaning of bias is a "bias neuron", which is used in every layer of the neural network. The bias neuron holds the number 1 and makes it possible to shift the activation function up, down, left and right on the number graph.
Hyperparameters: A hyperparameter is a setting that affects the structure or operation of the neural network. In real deep learning projects, tuning hyperparameters is the primary way to build a network that provides accurate predictions for a given problem. Common hyperparameters include the number of hidden layers, the activation function, and how many times (epochs) training should be repeated.

Basic components of ANN


The models of ANN are specified by three basic entities, namely:
1. The model's synaptic interconnections;
2. The training or learning rules adopted for updating and adjusting the connection weights;
3. Their activation functions.
1. Connections
An ANN consists of a set of highly interconnected processing elements (neurons) such that each processing element's output is connected through weights to the other processing elements or to itself; delay lead and lag-free connections are allowed.
Hence, the arrangements of these processing elements and the geometry of their
interconnections are essential for an ANN.
There exist five basic types of neuron connection architectures. They are:
1. Single-layer feed-forward network
2. Multilayer feed-forward network
3. Single node with its own feedback
4. Single-layer recurrent network
5. Multilayer recurrent network.

Single-layer feed-forward network


In this network the input layer can have one or more neurons and the output layer can have one or more neurons. In this type of network each input neuron is directly connected to the output neurons, as shown in the figure (X1 connected to Y1, Y2, ..., Ym; X2 connected to Y1, Y2, ..., Ym; and so on). This network is also called a fully connected single-layer feed-forward network.
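A minimal sketch of this forward pass (assuming two input neurons, three output neurons, random illustrative weights, and an identity activation for simplicity):

import numpy as np

n, m = 2, 3                      # number of input and output neurons (example sizes)
x = np.array([0.5, -1.0])        # input signals x1, x2
W = np.random.randn(n, m)        # one weight per link Xi -> Yj, randomly initialized

y_in = x @ W                     # net input of each output neuron: sum of xi * wij
y = y_in                         # identity activation here; sigmoid etc. could be used
print(y)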
Multilayer feed-forward network

In a multilayer feed-forward network, along with the input and output layers there will be hidden layers. Each layer can have multiple neurons; the input layer is connected to the hidden layers, and the hidden layers are connected to the output layer.
Single node with its own feedback
In a single node with its own feedback there is only one layer, the input layer, and the output of the input neuron is the output of the network. The output is validated; if the validation is correct the model is accepted, and if not, feedback is given and the computation part of the neuron is modified.
Single-layer recurrent network

A single-layer recurrent network has both input and output layers; while modifying weights, instead of considering only the input and previous weights, the output of the previous iteration is considered as well.
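As an illustrative sketch of this feedback idea (assuming a single neuron whose previous output is fed back as an extra weighted input; the weights here are hypothetical):

import math

def step(x, prev_y, w_in=0.8, w_fb=0.3):
    # Net input combines the current input and the fed-back previous output
    y_in = w_in * x + w_fb * prev_y
    return 1.0 / (1.0 + math.exp(-y_in))   # sigmoid activation

y = 0.0                       # initial feedback value
for x in [0.2, 0.5, 0.9]:     # a short input sequence
    y = step(x, y)
    print(y)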
Multilayer recurrent network.
A multilayer recurrent network is the same as a single-layer recurrent network except that it has hidden layers in addition to the input and output layers.
2. Learning
The main property of an ANN is its capability to learn.
Learning or training is a process by means of which a neural network adapts itself to a stimulus by making proper parameter adjustments, resulting in the production of the desired response.
Broadly, there are two kinds of learning in ANNs:
1. Parameter learning: It updates the connecting weights in a neural net.
2. Structure learning: It focuses on the change in network structure (which includes the
number of processing elements as well as their connection types).

WHAT IS A PERCEPTRON?
A perceptron is a binary classification algorithm modelled after the functioning of the human brain; it was intended to emulate the neuron. The perceptron, while it has a simple structure, has the ability to learn an input-output mapping.

What is a Multilayer Perceptron?


A multilayer perceptron (MLP) is a group of perceptrons, organized in multiple layers, that
can accurately answer complex questions. Each perceptron in the first layer (on the left)
sends signals to all the perceptrons in the second layer, and so on. An MLP contains an input
layer, at least one hidden layer, and an output layer.

The perceptron learns as follows:


1. Take the inputs, which are fed into the perceptrons in the input layer, multiply them by their weights, and compute the sum.
2. Add the number one, multiplied by a "bias weight". This is a technical step that makes it possible to move the output function of each perceptron (the activation function) up, down, left and right on the number graph.
3. Feed the sum through the activation function; in a simple perceptron system, the activation function is a step function.
4. The result of the step function is the output (see the sketch below).
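As a minimal sketch of these four steps (assuming two inputs and arbitrary illustrative weights, not a trained model):

def perceptron(inputs, weights, bias_weight):
    # Steps 1-2: weighted sum of the inputs plus 1 * bias_weight
    total = sum(x * w for x, w in zip(inputs, weights)) + 1.0 * bias_weight
    # Steps 3-4: step activation function produces the binary output
    return 1 if total >= 0 else 0

# Example call with made-up weights
print(perceptron([0.7, 0.3], [0.5, -0.4], bias_weight=-0.1))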
A multilayer perceptron is quite similar to a modern neural network. By adding a few
ingredients, the perceptron architecture becomes a full-fledged deep learning system:
 Activation functions and other hyperparameters: a full neural network uses a variety of
activation functions which output real values, not boolean values like in the classic
perceptron. It is more flexible in terms of other details of the learning process, such as the
number of training iterations (iterations and epochs), weight initialization schemes,
regularization, and so on. All these can be tuned as hyperparameters.
 Backpropagation: a full neural network uses the backpropagation algorithm, to perform
iterative backward passes which try to find the optimal values of perceptron weights, to
generate the most accurate prediction.
 Advanced architectures: full neural networks can have a variety of architectures that can
help solve specific problems. A few examples are Recurrent Neural Networks (RNN),
Convolutional Neural Networks (CNN), and Generative Adversarial Networks (GAN).
WHAT IS BACKPROPAGATION AND WHY IS IT IMPORTANT?
After a neural network is defined with initial weights, and a forward pass is performed to
generate the initial prediction, there is an error function which defines how far away the
model is from the true prediction. There are many possible algorithms that can minimize the
error function—for example, one could do a brute force search to find the weights that
generate the smallest error. However, for large neural networks, a training algorithm is
needed that is very computationally efficient. Backpropagation is that algorithm—it can
discover the optimal weights relatively quickly, even for a network with millions of weights.
HOW DOES BACKPROPAGATION WORK?
1. Forward pass: weights are initialized and inputs from the training set are fed into the
network. The forward pass is carried out and the model generates its initial prediction.
2. Error function: the error function is computed by checking how far away the prediction is
from the known true value.
3. Backpropagation: using gradient descent, the backpropagation algorithm calculates how much the output values are affected by each of the weights in the model. To do this, it calculates partial derivatives, going back from the error function to a specific neuron and its weight. This provides complete traceability from the total error back to a specific weight which contributed to that error. The result of backpropagation is a set of weights that minimizes the error function.
4. Weight update: weights can be updated after every sample in the training set, but this is usually not practical. Typically, a batch of samples is run in one big forward pass, and then backpropagation is performed on the aggregate result. The batch size and the number of batches used in training, called iterations, are important hyperparameters that are tuned to get the best results. Running the entire training set through the backpropagation process once is called an epoch.

Training algorithm of BPNN:


1. Inputs X arrive through the preconnected path.
2. The input is modelled using real weights W. The weights are usually selected randomly.
3. Calculate the output of every neuron, from the input layer through the hidden layers to the output layer.
4. Calculate the error in the outputs: Error = Actual Output - Desired Output.
5. Travel back from the output layer to the hidden layer to adjust the weights so that the error is decreased.
Keep repeating the process until the desired output is achieved; a runnable sketch of these steps follows below.
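A minimal runnable sketch of these five steps, assuming a tiny 2-2-1 network with sigmoid activations and the classic XOR patterns as training data (an illustration only, not the exact network from the diagram):

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 2: randomly selected real weights W (and biases) for a 2-2-1 network
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # input layer -> hidden layer
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # hidden layer -> output layer

# Step 1: inputs X arrive, here the four XOR patterns with their desired outputs
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

lr = 0.5
for epoch in range(10000):
    # Step 3: calculate the output of every neuron, layer by layer
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # Step 4: error in the outputs (actual output - desired output)
    error = y - T
    # Step 5: travel back, adjusting weights so that the error decreases
    d_out = error * y * (1.0 - y)             # gradient at the output layer
    d_hid = (d_out @ W2.T) * h * (1.0 - h)    # gradient propagated to hidden layer
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_hid); b1 -= lr * d_hid.sum(axis=0)

print(np.round(y, 2))  # should approach the XOR targets (depends on the random init)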
Architecture of back propagation network:
As shown in the diagram, the architecture of a BPN has three interconnected layers with weights on the connections. The hidden layer and the output layer also have a bias, whose value is always 1, attached to them. As is clear from the diagram, the working of a BPN is in two phases: one phase sends the signal from the input layer to the output layer, and the other phase back-propagates the error from the output layer to the input layer.
Explain various applications of Deep Learning.
There are various interesting applications of Deep Learning that have turned things that were impossible a decade ago into reality. Some of them are:
1. Color restoration, where a given image in greyscale is automatically turned into a colored one.
2. Recognizing handwritten messages.
3. Adding sound to a silent video that matches with the scene taking place.
4. Self-driving cars
5. Computer Vision: for applications like vehicle number plate identification and facial
recognition.
6. Information Retrieval: for applications like search engines, both text search, and image
search.
7. Marketing: for applications like automated email marketing, target identification
8. Medical Diagnosis: for applications like cancer identification, anomaly detection
9. Natural Language Processing: for applications like sentiment analysis, photo tagging
10. Online Advertising, etc
Loss function in neural networks.
Neural networks use optimization strategies to minimize the error of the algorithm. The way we actually compute this error is by using a loss function. It is used to quantify how well or how badly the model is performing.
These are broadly divided into two categories, i.e. regression loss and classification loss.
1. Regression Loss Function
Regression Loss is used when we are predicting continuous values like the price of a house or
sales of a company.
Eg. Mean Squared Error
Mean squared error is the mean of the squared differences between the actual and predicted values. If the difference is large, the model penalizes it heavily, since we are computing the squared difference.
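A minimal sketch of this computation (the values are made-up examples):

import numpy as np

def mse(actual, predicted):
    # Mean of the squared differences; large errors are penalized quadratically
    return np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2)

print(mse([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))   # example regression targets and predictions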
2. Binary Classification Loss Function
Suppose we are dealing with a Yes/No situation like “a person has diabetes or not”, in this
kind of scenario Binary Classification Loss Function is used.
Eg. Binary Cross Entropy Loss
It gives a probability value between 0 and 1 for the classification task. Cross-entropy calculates the average difference between the predicted and actual probabilities (a sketch covering both cross-entropy variants appears after the multi-class section below).
3. Multi-Class Classification Loss Function
If we take a dataset like Iris, where we need to predict the three class labels Setosa, Versicolor and Virginica, then in such cases, where the target variable has more than two classes, a multi-class classification loss function is used.
Eg. Categorical Cross-Entropy Loss:
This is similar to binary cross-entropy, but is used for multi-class classification problems.
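A minimal sketch of both cross-entropy losses (assuming the model outputs probabilities; a small epsilon guards against log(0)):

import numpy as np

EPS = 1e-12  # avoids taking log of exactly zero

def binary_cross_entropy(y_true, y_pred):
    y_pred = np.clip(np.asarray(y_pred, dtype=float), EPS, 1.0 - EPS)
    y_true = np.asarray(y_true, dtype=float)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

def categorical_cross_entropy(y_true, y_pred):
    # y_true is one-hot encoded; y_pred holds one probability per class
    y_pred = np.clip(np.asarray(y_pred, dtype=float), EPS, 1.0)
    return -np.mean(np.sum(np.asarray(y_true) * np.log(y_pred), axis=1))

# Examples with made-up predictions
print(binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.7]))
print(categorical_cross_entropy([[1, 0, 0], [0, 1, 0]],
                                [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]))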
Differentiate between Machine Learning & Deep Learning.
1. Data dependencies for Performance:
When the data is small, deep learning algorithms don't perform that well. This is because deep learning algorithms need a large amount of data to understand the data perfectly. On the other hand, traditional machine learning algorithms, with their handcrafted rules, prevail in this scenario.
2. Hardware dependencies
Deep learning algorithms heavily depend on high-end machines, contrary to traditional
machine learning algorithms, which can work on low-end machines. Deep learning
algorithms inherently do a large amount of matrix multiplication operations. These operations
can be efficiently optimized using a GPU.
3. Feature engineering:
Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data. Feature engineering turns your inputs into things the algorithm can understand.
In machine learning, most of the applied features need to be identified by an expert and then hand-coded as per the domain and data type. Features can be pixel values, shape, texture, position and orientation. The performance of most machine learning algorithms depends on how accurately the features are identified and extracted.
Deep learning algorithms try to learn high-level features from data, reducing the task of developing a new feature extractor for every problem. For example, a convolutional neural network will try to learn low-level features such as edges and lines in its early layers, then parts of faces, and then a high-level representation of a face.

4. Problem Solving approach


When solving a problem using a traditional machine learning algorithm, it is generally recommended to break the problem down into different parts, solve them individually, and combine them to get the result. Deep learning, in contrast, advocates solving the problem end-to-end.
Eg. Suppose you have a task of multiple-object detection: the task is to identify what the object is and where it is present in the image. In a typical ML approach, you would divide the problem into two steps, object detection and object recognition. In the deep learning approach, by contrast, you would do the process end-to-end.
5. Execution time
Usually, a deep learning algorithm takes a long time to train. This is because there are so
many parameters in a deep learning algorithm that training them takes longer than usual.
Whereas machine learning comparatively takes much less time to train, ranging from a few
seconds to a few hours.
This in turn is completely reversed at test time. At test time, a deep learning algorithm takes much less time to run, whereas for k-nearest neighbours (an ML algorithm) test time increases with the size of the data. This is not applicable to all machine learning algorithms, though, as some of them have small testing times too.
6. Interpretability:
Suppose we use deep learning to give automated scoring to essays. The performance it gives in scoring is quite excellent and is near human performance. But there is an issue: it does not reveal why it has given that score. Mathematically you can find out which nodes of a deep neural network were activated, but we don't know what these neurons were supposed to model and what these layers of neurons were doing collectively. So we fail to interpret the results.
On the other hand, machine learning algorithms like decision trees give us crisp rules as to
why it chose what it chose, so it is particularly easy to interpret the reasoning behind it.
Therefore, algorithms like decision trees and linear/logistic regression are primarily used in
industry for interpretability.
