ANN Introduction


Introduction to Neural Networks

KP4842 Artificial Intelligence in Manufacturing


Mechanical Engineering Programme
UNIVERSITI KEBANGSAAN MALAYSIA
Prof Dr Dzuraidah Abd Wahab
Definition

A neural network, viewed as an adaptive machine, is defined as:

A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity (inclination) for storing experiential knowledge and making it available for use.
 ANN is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information.

 ANNs, like humans, learn by example, for instance in pattern recognition and data classification.

 ANN learning involves adjustments to the synaptic connections that exist between the neurons.
It resembles the brain in two respects:

 Knowledge is acquired by the network from its environment through a learning process.

 Inter-neuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
• Biological neural activity

– Each neuron has a body, an axon, and many dendrites


• Can be in one of two states: firing or at rest.
• A neuron fires if the total incoming stimulus exceeds its threshold.

– Synapse: thin gap between axon of one neuron and dendrite of another.
• Signal exchange
• Synaptic strength/efficiency
Artificial neural networks:

 are parallel computing devices consisting of many interconnected simple processors

 share many characteristics of real biological neural networks such as the human brain

 acquire knowledge from their environment through a learning process; this knowledge is stored in the connection strengths (weights) between processing units (neurons)

 can be used to model complex patterns and prediction problems.


ANN ↔ Biological NN correspondence:

 Nodes ↔ Cell body
◦ input – signal from other neurons
◦ output – firing frequency
◦ node function – firing mechanism

 Connections ↔ Synapses
◦ connection strength – synaptic strength

• Highly parallel, simple local computation (at neuron level)


achieves global results as emerging property of the
interaction (at network level)
• Pattern directed (the meaning of individual nodes exists only in the
context of a pattern)
• Learning/adaptation plays an important role.
The next neuron can choose to either
accept it or reject it depending on the
strength of the signal.

Source: https://towardsdatascience.com/introduction-to-neural-networks-
advantages-and-applications-96851bd1a207
ANN is a human idealisation of the real networks of neurons.
Advantages of ANN
 Able to derive meaning from complicated or imprecise data,
e.g. to extract patterns and detect trends that are too complex
for humans or other computer techniques to notice.

 Able to provide projections in new situations of interest and
answer "what if" questions.
Other advantages:
 Adaptive learning –able to learn how to do tasks based on data
given during training or initial experience
 Self organisation - able to create its own organisation or
representation of the information it receives during learning.
 Real time operation - performs computations in parallel.
 Fault tolerance - via redundant information coding; partial
destruction of a network leads to a corresponding degradation of
performance, but some network capabilities may be retained even
with major network damage.
ANN vs Conventional Computers

ANN:
 Uses a large number of highly interconnected processing elements
(neurons) working in parallel to solve a specific problem.
 Learns by example; therefore the examples must be selected carefully.

Conventional computers:
 Use an algorithmic approach, i.e. follow a set of instructions in
order to solve a problem. This restricts problem-solving capability
to problems that are already understood and that we know how to solve.
 Use a cognitive approach to problem solving; the way the problem is
to be solved must be known.

Note: ANN should complement algorithmic computers. For maximum
efficiency, conventional computers can be used to supervise the ANN.
The mathematical model of the neuron must take into account 3 basic
components:
1. The synapses of the biological neuron, which are modelled as
weights represented as numbers: a positive value designates an
excitatory connection, a negative value an inhibitory connection.
In the biological neuron, the synapse interconnects the neurons
and gives the strength of the connection.
2. All inputs are summed together and modified by the weights; this
is known as a linear combination.
3. An activation function that controls the amplitude of the output,
e.g. to an acceptable range such as 0 to 1 or -1 to 1.
ANN - distributed representation
ANNs consist of a pool of simple processing units which communicate
by sending signals to each other over a large number of weighted
connections.
The aspects of the parallel distributed model (i.e. many processing
units can carry out their computations at the same time) are:
1. A set of processing units, i.e. neurons
2. A state of activation yk for every unit, which is equivalent to
the output of the unit
3. Connections between the units: each connection is defined by a
weight wjk which determines the effect that the signal of unit j
has on unit k
4. A propagation rule which determines the effective input sk of a
unit from its external inputs
5. An activation function Fk that determines the new level of
activation based on the effective input sk(t) and the current
activation yk(t), i.e. the update
6. An external input, i.e. bias or offset (ok), for each unit
7. A method for information gathering (the learning rule)
8. An environment within which the system must operate, providing
input signals and, if necessary, error signals
Thus the artificial neuron is defined by the components:
1. A set of inputs, xi.
2. A set of weights, wij.
3. A bias, bi.
4. An activation function, f.
5. Neuron output, y
The subscript i indicates the i-th input or weight.

As the inputs and output are external, the parameters of this model
are the weights, the bias and the activation function, and these
therefore DEFINE the model.
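The components above (inputs xi, weights, bias, activation function, output y) can be sketched as a single artificial neuron; the input, weight and bias values below are illustrative only, not taken from the slides:

```python
import math

def sigmoid(s):
    # Squashes the weighted sum into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-s))

def neuron(x, w, b, f):
    """Single artificial neuron: weighted sum of inputs plus bias,
    passed through an activation function f."""
    s = sum(xi * wi for xi, wi in zip(x, w)) + b   # linear combination
    return f(s)

# Hypothetical inputs, weights and bias for illustration
y = neuron(x=[1.0, 0.5], w=[0.4, -0.6], b=0.1, f=sigmoid)
print(round(y, 3))  # → 0.55
```

Note how the negative weight acts as an inhibitory connection and the positive one as excitatory, matching the biological analogy described earlier.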
Processing Units in ANN

Each processing unit receives input from neighbours or external
sources and computes an output signal from it, which is propagated
to other units.
The units are also responsible for adjustment of the weights.

Types of units:
 Input units that receive data from outside of the neural
network
 Output units that send data out of the neural network
 Hidden units whose input and output signals remain within the
neural network.
A simple example of ANN application

A bank wants to assess whether to approve a customer's loan
application, so it wants to predict whether the customer is likely
to default on the loan. It has data like below; Column X has to be
predicted.

A simple ANN architecture for the example

Network Architecture

 The ANN has an input layer, a hidden layer and an output layer.
It is called a Multi Layer Perceptron (MLP).

 The network architecture used in the example is called a "feed-
forward network", since the input signals flow in only one
direction (from inputs to outputs).

 The purpose of the hidden layer is to distill some of the important
patterns from the input before they are passed on to the next layer.
By omitting redundant information, the system becomes faster and
more efficient.

Activation function

The purpose of an activation function is:

 to capture the non-linear relationship between the inputs
 to convert the input into a more useful output

The activation function used in the example is the sigmoid:

 O1 = 1 / (1 + exp(-F))
where F = W1*X1 + W2*X2 + W3*X3

 The sigmoid activation function creates an output with values
between 0 and 1.

Similarly, the hidden layer leads to the final prediction at
the output layer:
 O3 = 1 / (1 + exp(-F1))
where F1 = W7*H1 + W8*H2

 The output value (O3) is between 0 and 1.

 A value closer to 1 (e.g. 0.75) indicates a higher likelihood
of the customer defaulting.
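The two-layer computation above can be sketched end to end. W1 and W2 use the values 0.56 and 0.92 quoted later in the slides; every other input and weight value is a hypothetical placeholder, since the slides do not list a full set:

```python
import math

def sigmoid(f):
    return 1.0 / (1.0 + math.exp(-f))

# Hypothetical customer record (scaled values, invented for illustration)
x1, x2, x3 = 0.35, 0.62, 0.80   # e.g. Age, Debt Ratio, a third feature

# W1, W2 from the slides; the remaining weights are placeholders
w1, w2, w3 = 0.56, 0.92, 0.30   # inputs -> H1
w4, w5, w6 = 0.10, 0.45, 0.70   # inputs -> H2
w7, w8 = 0.85, 0.25             # hidden -> O3

# Hidden layer: each unit applies the sigmoid to its weighted sum
h1 = sigmoid(w1 * x1 + w2 * x2 + w3 * x3)
h2 = sigmoid(w4 * x1 + w5 * x2 + w6 * x3)

# Output layer: O3 = 1 / (1 + exp(-F1)), where F1 = W7*H1 + W8*H2
o3 = sigmoid(w7 * h1 + w8 * h2)
print(round(o3, 3))  # a value between 0 and 1: likelihood of default
```

With these placeholder numbers O3 comes out around 0.69, which this scheme would read as a fairly high likelihood of default.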

Weights

 The weights W represent the importance associated with the
inputs. If W1 is 0.56 and W2 is 0.92, then higher importance is
attached to X2 (Debt Ratio) than to X1 (Age) in predicting H1.

 A good model with high accuracy gives predictions that are very close to
the actual values. Here, Column X values should be very close to Column
W values. The error in prediction is the difference between column W and
column X:
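A minimal sketch of this error computation, using made-up actual (Column X) and predicted (Column W) values rather than the table from the slides:

```python
# Hypothetical columns: actual outcomes (X) vs model predictions (W)
actual    = [1, 0, 1, 0]              # Column X: did the customer default?
predicted = [0.75, 0.20, 0.60, 0.30]  # Column W: model outputs in (0, 1)

# Per-row prediction error: difference between Column X and Column W
errors = [a - p for a, p in zip(actual, predicted)]

# Mean squared error summarises the errors in a single number
mse = sum(e * e for e in errors) / len(errors)
print(round(mse, 4))  # → 0.0881
```

Training aims to drive a summary error like this toward zero by adjusting the weights.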

Optimisation
 The key to a good model with accurate predictions is to find
the "optimal values of W (the weights)" that minimise the
prediction error. This is achieved by the "back propagation
algorithm", and it is what makes the ANN a learning algorithm:
by learning from its errors, the model is improved.

 The most common optimisation algorithm is "gradient descent",
in which different values of W are tried iteratively and the
prediction errors assessed. To reach the optimal W, the values
of W are changed in small amounts and the impact on the
prediction error is assessed.
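A minimal gradient-descent sketch for a single weight, with an illustrative input/target pair (not the loan example's actual data): the weight is repeatedly nudged against the gradient of the squared error.

```python
# Minimise the squared error E = (t - w*x)^2 for one input/target pair
x, t = 2.0, 1.0   # hypothetical input and desired output
w = 0.0           # initial weight
lr = 0.05         # learning rate: size of each small change to w

for _ in range(100):
    y = w * x                   # prediction with the current weight
    grad = -2.0 * (t - y) * x   # dE/dw: slope of the error surface
    w -= lr * grad              # step w in the direction that reduces E

print(round(w, 3))  # → 0.5, since 0.5 * 2.0 = 1.0 gives zero error
```

Back propagation applies this same idea to every weight in the network at once, using the chain rule to compute each weight's gradient.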
Pattern of connections between units
(topologies)

Feed-forward neural networks
 The data flow from input to output units is strictly feed-forward.
Data processing can extend over multiple layers of units, but no
feedback connections are present.
 Examples of feed-forward NNs are the Perceptron and Adaline.

Recurrent neural networks
 Contain feedback connections; contrary to feed-forward networks,
the dynamical properties of the network are important.
 Examples of recurrent NNs are those by Anderson (1977), Kohonen
(1977) and Hopfield (1982).
Training of ANN

Categories of ANN learning:

 Supervised learning (Associative learning)


 Unsupervised learning (Self organisation)
 Reinforcement learning
Supervised learning (Associative learning)

 The network is trained by providing it with input and matching
output patterns.

 The input-output pairs can be provided by an external teacher or
by the system which contains the neural network (self-supervised).

 The paradigms of supervised learning include error-correction
learning and reinforcement learning.

 Least mean square convergence, which is common in many learning
paradigms, is used to minimise the error between the desired and
computed unit values.
Unsupervised learning (Self organisation)

 Uses no external teacher and is based only on local information.

 Referred to as self organisation: the network organises the data
itself and detects emergent collective properties, i.e. an output
unit is trained to respond to clusters of patterns within the input.

 The system is expected to discover statistically salient features
of the input population.

 Paradigms of unsupervised learning are Hebbian learning and
competitive learning.
Reinforcement Learning

 It is an intermediate form of supervised and unsupervised learning.
 The learning machine performs some action on the environment and
gets a feedback response from the environment.
 The learning system grades its action as good or bad based on the
environmental response and adjusts its parameters accordingly;
parameter adjustment continues until an equilibrium state occurs,
after which there are no more changes in its parameters.
 Unsupervised (self organising) learning may be categorised under
this type of learning.
Learning in ANN results in adjustment of the weights of
the connections between units according to some
modification rule.

Common learning rules are variants of the Hebbian
learning rule and the Widrow-Hoff rule (also called the delta rule).
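The Widrow-Hoff (delta) rule mentioned above can be sketched for a single linear unit: each weight moves in proportion to the error between the desired and computed output. The training samples and learning rate here are invented for illustration:

```python
# Delta rule: w_i <- w_i + eta * (d - y) * x_i, with desired output d
eta = 0.1                 # learning rate
w = [0.0, 0.0]            # initial weights
samples = [([1.0, 0.0], 1.0),   # (inputs, desired output)
           ([0.0, 1.0], 0.0)]

for _ in range(50):       # repeated presentation of the samples
    for x, d in samples:
        y = sum(wi * xi for wi, xi in zip(w, x))          # linear output
        w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]

print([round(wi, 2) for wi in w])  # → [0.99, 0.0]
```

The first weight converges toward 1.0 (its input must reproduce the desired output 1.0) while the second stays at 0.0, showing the least-mean-square error shrinking with each pass.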
Benefits of NN
NNs are capable of solving complex (large-scale) problems
that are currently intractable, through:
 their computing power, owing to a massively parallel
distributed structure
 their ability to learn and therefore generalise. Generalisation
refers to the neural network producing reasonable
outputs for inputs not encountered during training (i.e.
the learning process).
Useful properties of NN
 Nonlinearity
 Input-Output Mapping
 Adaptivity
 Evidential Response
 Contextual Information
 Uniformity of Analysis and Design
 Neurobiological Analogy
 Non-linearity – an artificial neuron can be linear or non-linear.
A network made up of an interconnection of nonlinear neurons is
itself nonlinear, and the nonlinearity is distributed throughout
the network.
 Input-output mapping – learning in a NN involves modification of
the synaptic weights by applying a set of labelled training samples
or task examples. Each example consists of a unique input signal
and a corresponding desired response.
 Adaptivity – a NN has a built-in capability to adapt its synaptic
weights to changes in the surrounding environment. A NN trained to
operate in a specific environment can easily be retrained to deal
with minor changes in the operating environmental conditions.
