Neural-Networks - First Lecture
Agenda
History of Artificial Neural Networks
What is an Artificial Neural Network?
How does it work?
Learning
Learning paradigms
Supervised learning
Unsupervised learning
Reinforcement learning
Application areas
Advantages and Disadvantages
History of the Artificial Neural Networks
The history of ANNs stems from the 1940s, the decade of the first electronic
computer.
However, the first important step took place in 1957 when Rosenblatt
introduced the first concrete neural model, the perceptron. Rosenblatt also
took part in constructing the first successful neurocomputer, the Mark I
Perceptron. After this, the development of ANNs proceeded as outlined below.
History of the Artificial Neural Networks
Rosenblatt's original perceptron model contained only one layer. From this,
a multi-layered model was derived in 1960. At first, the use of the multi-layer
perceptron (MLP) was complicated by the lack of an appropriate learning algorithm.
In 1974, Werbos introduced the so-called backpropagation algorithm
for a three-layered perceptron network.
History of the Artificial Neural Networks
The application area of MLP networks remained rather limited until the
breakthrough in 1986, when Rumelhart and McClelland introduced a general
backpropagation algorithm for multi-layered perceptrons.
In 1982, Hopfield brought out his idea of a neural network. Unlike the
neurons in an MLP, the Hopfield network consists of only one layer whose
neurons are fully connected with each other.
History of the Artificial Neural Networks
Since then, new versions of the Hopfield network have been developed.
The Boltzmann machine has been influenced by both the Hopfield network
and the MLP.
History of the Artificial Neural Networks
In 1988, Radial Basis Function (RBF) networks were first introduced by
Broomhead & Lowe. Although the basic idea of RBF had been developed some 30
years earlier under the name "method of potential functions", the work by
Broomhead & Lowe opened a new frontier in the neural network community.
History of the Artificial Neural Networks
In 1982, Kohonen introduced a quite different kind of network model, the
Self-Organizing Map (SOM). The SOM is a certain kind of topological map which
organizes itself based on the input patterns it is trained with.
The SOM originated from the LVQ (Learning Vector Quantization) network,
the underlying idea of which was also Kohonen's, from 1972.
History of the Artificial Neural Networks
Since then, research on artificial neural networks has
remained active, leading to many new network types, as
well as hybrid algorithms and hardware for neural
information processing.
Artificial Neural Network
An artificial neural network consists of a pool of simple
processing units which communicate by sending signals to
each other over a large number of weighted connections.
Artificial Neural Network
A set of major aspects of a parallel distributed model include:
a set of processing units (cells).
a state of activation for every unit, which is equivalent to the output of the
unit.
connections between the units. Generally each connection is defined by a
weight.
a propagation rule, which determines the effective input of a unit from its
external inputs.
an activation function, which determines the new level of activation based
on the effective input and the current activation.
an external input for each unit.
a method for information gathering (the learning rule).
an environment within which the system must operate, providing input
signals and, if necessary, error signals. (A minimal code sketch combining
these components follows.)
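As a minimal illustration of these components, the Python sketch below models a single processing unit; the names (ProcessingUnit, propagate, activate) and the choice of a sigmoid activation function are assumptions made for this example only.

```python
import math

class ProcessingUnit:
    """One unit: weighted connections, a propagation rule, and an activation function."""

    def __init__(self, weights, external_input=0.0):
        self.weights = weights              # one weight per incoming connection
        self.external_input = external_input
        self.activation = 0.0               # state of activation (the unit's output)

    def propagate(self, inputs):
        """Propagation rule: effective input = weighted sum of inputs + external input."""
        return sum(w * x for w, x in zip(self.weights, inputs)) + self.external_input

    def activate(self, inputs):
        """Activation function (a sigmoid here) maps the effective input to a new activation."""
        self.activation = 1.0 / (1.0 + math.exp(-self.propagate(inputs)))
        return self.activation

# A unit with three weighted connections
unit = ProcessingUnit(weights=[0.5, -0.2, 0.3])
print(unit.activate([1.0, 0.0, 1.0]))       # effective input 0.8 -> activation ~0.69
```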
Computers vs. Neural Networks
("Standard" computers vs. neural networks comparison table)

Biological neuron vs. artificial neuron:
Dendrites: Input
Cell body: Processor
Synapse: Link
Axon: Output
How do our brains work?
A processing element
The axon endings almost touch the dendrites or cell body of the
next neuron.
How do our brains work?
A processing element
Processing: ∑ = x1 + x2 + ... + xm
Output: y = ∑
How do ANNs work?
Not all inputs are equal: each input enters through a weighted connection.
Input: x1, x2, ..., xm
Weights: w1, w2, ..., wm
Processing: ∑ = x1·w1 + x2·w2 + ... + xm·wm = y
Output: y
How do ANNs work?
The weighted sum is not passed directly to the next neuron; it first goes
through a transfer function.
Input: x1, x2, ..., xm
Weights: w1, w2, ..., wm
Processing: the weighted sum vk = x1·w1 + x2·w2 + ... + xm·wm
Transfer function (activation function): f(vk)
Output: y = f(vk)
The output is therefore a function of the inputs, the weights, and the
transfer function.
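As a short worked example (with made-up numbers), the same inputs produce different outputs under different weights and a simple threshold transfer function:

```python
# Illustrative numbers only: same inputs, different weights -> different outputs
inputs = [1.0, 0.0, 1.0]
w_a    = [0.5, -0.2, 0.3]
w_b    = [0.1, 0.9, -0.4]

def neuron_output(xs, ws, transfer):
    vk = sum(x * w for x, w in zip(xs, ws))    # weighted sum
    return transfer(vk)                        # transfer (activation) function

step = lambda v: 1.0 if v >= 0.5 else 0.0      # a simple threshold transfer function
print(neuron_output(inputs, w_a, step))        # vk = 0.8  -> 1.0
print(neuron_output(inputs, w_b, step))        # vk = -0.3 -> 0.0
```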
Artificial Neural Networks
An ANN can:
1. compute any computable function, by the appropriate
selection of the network topology and weight values.
2. learn from experience!
Specifically, by trial‐and‐error
Learning by trial‐and‐error
Continuous process of:
Trial:
Processing an input to produce an output (in ANN terms:
compute the output for a given input)
Evaluate:
Evaluating this output by comparing the
actual output with the expected output.
Adjust:
Adjust the weights.
How does it work?
Set initial values of the weights randomly.
Input: truth table of the XOR
Do
Read input (e.g. 0, and 0)
Compute an output (e.g. 0.60543)
Compare it to the expected output. (Diff= 0.60543)
Modify the weights accordingly.
Loop until a condition is met (a runnable sketch of this loop follows below)
Condition: a certain number of iterations
Condition: an error threshold
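Below is a runnable sketch of this loop, assuming a small 2-4-1 sigmoid network trained with backpropagation on the XOR truth table; the hidden-layer size, learning rate, and stopping values are illustrative choices rather than values given in the lecture.

```python
import numpy as np

# Input: truth table of XOR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# Set initial values of the weights randomly (small values)
rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, (2, 4)); b1 = np.zeros((1, 4))
W2 = rng.uniform(-1, 1, (4, 1)); b2 = np.zeros((1, 1))

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
lr, max_iters, err_threshold = 0.5, 20000, 0.001

for it in range(max_iters):                    # condition: number of iterations
    H = sigmoid(X @ W1 + b1)                   # hidden layer activations
    Y = sigmoid(H @ W2 + b2)                   # compute an output
    error = T - Y                              # compare it to the expected output
    if np.mean(error ** 2) < err_threshold:    # condition: error threshold
        break
    # Modify the weights accordingly (backpropagation of the error)
    dY = error * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 += lr * H.T @ dY; b2 += lr * dY.sum(axis=0, keepdims=True)
    W1 += lr * X.T @ dH; b1 += lr * dH.sum(axis=0, keepdims=True)

print(np.round(Y, 3))                          # outputs should approach [0, 1, 1, 0]
```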
Design Issues
Initial weights (small random values ∈[‐1,1])
Transfer function (how the inputs and the weights
are combined to produce the output)
Error estimation
Weights adjusting
Number of neurons
Data representation
Size of training set
Transfer Functions
Linear: The output is proportional to the total
weighted input.
Threshold: The output is set at one of two values,
depending on whether the total weighted input is
greater than or less than some threshold value.
Non‐linear: The output varies continuously but not
linearly as the input changes.
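A minimal Python sketch of the three kinds of transfer function described above; the sigmoid is used here as one common example of a non-linear function.

```python
import math

def linear(v, c=1.0):
    # Linear: output proportional to the total weighted input
    return c * v

def threshold(v, theta=0.0):
    # Threshold: one of two values, depending on whether the input exceeds theta
    return 1.0 if v > theta else 0.0

def sigmoid(v):
    # Non-linear: output varies continuously but not linearly with the input
    return 1.0 / (1.0 + math.exp(-v))

for v in (-2.0, 0.0, 2.0):
    print(linear(v), threshold(v), round(sigmoid(v), 3))
```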
Error Estimation
The root mean square error (RMSE) is a frequently used measure of the
difference between the values predicted by a model or estimator and the
values actually observed from the thing being modeled or estimated:
RMSE = sqrt( (1/n) · Σ (predicted_i − observed_i)² )
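A small helper that computes this measure, assuming the predicted and observed values arrive as equal-length Python lists:

```python
import math

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

print(rmse([0.9, 0.1, 0.8], [1.0, 0.0, 1.0]))   # ~0.141
```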
Weights Adjusting
After each iteration, weights should be adjusted to
minimize the error.
– Trying all possible weights (exhaustive search)
– Back propagation
Back Propagation
Back-propagation is an example of supervised learning; it is used at each
layer to minimize the error between the layer's response and the actual data.
The error at each hidden layer is an
average of the evaluated error
Hidden layer networks are trained this
way
Back Propagation
N is a neuron.
Nw is one of N's input weights.
Nout is N's output.
Nw = Nw + ΔNw
ΔNw = Nout · (1 − Nout) · NErrorFactor
NErrorFactor = NExpectedOutput − NActualOutput
This works only for the last layer, as we can know
the actual output, and the expected output.
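A sketch of this update for the weights feeding one sigmoid output neuron; the learning rate and the per-connection input factor are common additions that the simplified rule above does not spell out, so treat them as assumptions.

```python
def update_output_weights(weights, inputs, expected, actual, learning_rate=0.5):
    """Adjust the weights of one output neuron, following the rule above.

    NErrorFactor = expected - actual
    delta        = actual * (1 - actual) * NErrorFactor
    Each weight moves by learning_rate * delta * its input (the rate and the
    input factor are assumptions; the simplified rule omits them)."""
    error_factor = expected - actual
    delta = actual * (1 - actual) * error_factor
    return [w + learning_rate * delta * x for w, x in zip(weights, inputs)]

# Example: the neuron produced 0.60543 while 0.0 was expected
print(update_output_weights([0.4, -0.1], [1.0, 1.0], expected=0.0, actual=0.60543))
```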
Number of neurons
Many neurons:
Higher accuracy
Slower
Risk of over‐fitting
Memorizing, rather than understanding
The network will be useless with new problems.
Few neurons:
Lower accuracy
Inability to learn at all
Optimal number.
Data representation
Usually input/output data needs pre‐processing
Pictures
Pixel intensity
Text:
A pattern
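Two illustrative pre-processing sketches, assuming 8-bit grayscale pixels scaled into [0, 1] and a one-hot pattern over lowercase letters; both encodings are assumptions chosen for illustration, not prescriptions from the lecture.

```python
def normalize_pixels(pixels):
    # Pixel intensity: scale 8-bit grayscale values (0-255) into [0, 1]
    return [p / 255.0 for p in pixels]

def one_hot_char(ch):
    # Text as a pattern: a one-hot vector over the letters a-z
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return [1.0 if ch == c else 0.0 for c in alphabet]

print(normalize_pixels([0, 128, 255]))   # [0.0, 0.501..., 1.0]
print(one_hot_char("c")[:5])             # [0.0, 0.0, 1.0, 0.0, 0.0]
```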
Size of training set
There is no one-size-fits-all formula.
Overfitting can occur if a "good" training set is not
chosen.
What constitutes a “good” training set?
Samples must represent the general population.
Samples must contain members of each class.
Samples in each class must contain a wide range of
variations or noise effect.
The size of the training set is related to the number of
hidden neurons
Learning Paradigms
Supervised learning
Unsupervised learning
Reinforcement learning
Supervised learning
This is what we have seen so far!
A network is fed a set of training
samples (inputs and their corresponding
outputs), and it uses these samples to learn
the general relationship between the
inputs and the outputs.
This relationship is represented by the
values of the weights of the trained
network.
Unsupervised learning
No desired output is associated with the
training data!
Faster than supervised learning
Used to find out structures within data:
Clustering
Compression
Reinforcement learning
Like supervised learning, but:
Weight adjustment is not directly related to the error
value.
The error value is used to randomly shuffle the weights!
Learning is relatively slow due to this randomness.
Application Areas
Function approximation
including time series prediction and modeling.
Classification
including pattern and sequence recognition, novelty
detection, and sequential decision making.
(radar systems, face identification, handwritten text recognition)
Data processing
including filtering, clustering, blind source separation, and
compression.
(data mining, e-mail spam filtering)
Advantages / Disadvantages
Advantages
Adapt to unknown situations
Powerful: can model complex functions.
Easy to use: learns by example, and very little domain-specific
expertise is needed from the user.
Disadvantages
Forgets
Not exact
Large complexity of the network structure
Conclusion
Artificial Neural Networks are an imitation of biological
neural networks, but much simpler ones.
Computing has a lot to gain from neural networks.
Their ability to learn by example makes them very flexible and
powerful; furthermore, there is no need to devise an algorithm in
order to perform a specific task.
Conclusion
Neural networks also contribute to areas of research such as
neurology and psychology. They are regularly used to model
parts of living organisms and to investigate the internal
mechanisms of the brain.
Many factors affect the performance of ANNs, such as the
transfer functions, the size of the training sample, the network topology,
the weight-adjusting algorithm, ...
Thank You
Q. How does each neuron work in ANNs?
What is back propagation?
A neuron: receives input from many other neurons;
changes its internal state (activation) based on the
current input;
sends one output signal to many other neurons, possibly
including its input neurons (in which case the ANN is a recurrent network).