ANN Introduction


Introduction to Neural Networks

KP4842 Artificial Intelligence in Manufacturing


Mechanical Engineering Programme
UNIVERSITI KEBANGSAAN MALAYSIA
Prof Dr Dzuraidah Abd Wahab
Definition

A neural network, viewed as an adaptive machine, is defined as:

A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity (inclination) for storing experiential knowledge and making it available for use.
 ANN is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information.

 ANNs, like humans, learn by example, for instance in pattern recognition and data classification.

 ANN learning involves adjustments to the synaptic connections that exist between the neurons.
It resembles the brain in two respects:

 Knowledge is acquired by the network from its environment through a learning process.

 Inter-neuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
• Biological neural activity

– Each neuron has a body, an axon, and many dendrites


• Can be in one of two states: firing or at rest.
• A neuron fires if the total incoming stimulus exceeds its threshold.

– Synapse: thin gap between axon of one neuron and dendrite of another.
• Signal exchange
• Synaptic strength/efficiency
Artificial neural networks:

 are parallel computing devices consisting of many interconnected simple processors

 share many characteristics of real biological neural networks such as the human brain

 acquire knowledge from their environment through a learning process; this knowledge is stored in the connection strengths (weights) between processing units (neurons)

 can be used to model complex patterns and prediction problems.


ANN ↔ Biological NN correspondence:

 Nodes ↔ Cell body
◦ input – signal from other neurons
◦ output – firing frequency
◦ node function – firing mechanism

 Connections ↔ Synapses
◦ connection strength – synaptic strength

• Highly parallel, simple local computation (at neuron level)


achieves global results as emerging property of the
interaction (at network level)
• Pattern directed (the meaning of individual nodes exists only in the
context of a pattern)
• Learning/adaptation plays an important role.
The next neuron can choose to either
accept it or reject it depending on the
strength of the signal.

Source: https://towardsdatascience.com/introduction-to-neural-networks-
advantages-and-applications-96851bd1a207
ANN is a human idealisation of the real networks of neurons.
Advantages of ANN
 Able to derive meaning from complicated or imprecise data,
e.g. to extract patterns and detect trends that are too complex
for humans or other computer techniques to notice.

 Able to provide projections in new situations of interest and
answer "what if" questions.
Other advantages:
 Adaptive learning –able to learn how to do tasks based on data
given during training or initial experience
 Self organisation - able to create its own organisation or
representation of the information it receives during learning.
 Real time operation - performs computations in parallel.
 Fault tolerance - via redundant information coding; partial
destruction of a network leads to a corresponding degradation of
performance, but some network capabilities may be retained even
with major network damage.
ANN vs Conventional Computers

ANN:
 Uses a large number of highly interconnected processing elements
(neurons) working in parallel to solve a specific problem.
 Learns by example; therefore the examples must be selected carefully.

Conventional computers:
 Use an algorithmic approach, i.e. follow a set of instructions in
order to solve a problem. This restricts problem-solving capability
to problems that are already understood and that we know how to solve.
 Use a cognitive approach to problem solving; the way the problem is
to be solved must be known.

Note: ANN should complement algorithmic computers. For maximum
efficiency, conventional computers can be used to supervise the ANN.
The mathematical model of the neuron must take into account 3 basic
components:
1. The synapses of the biological neuron, which are modelled as
weights represented as numbers: a positive value designates an
excitatory connection, a negative value an inhibitory connection.
In the biological neuron, the synapse interconnects the neurons
and gives the strength of the connection.
2. All inputs are summed together and modified by the weights; this
is known as a linear combination.
3. An activation function that controls the amplitude of the output,
e.g. to an acceptable range such as 0 to 1 or -1 to 1.
ANN - distributed representation
ANNs consist of a pool of simple processing units which communicate
by sending signals to each other over a large number of weighted
connections.
The aspects of the parallel distributed model (i.e. many processing
units can carry out their computations at the same time) are:
1. A set of processing units, i.e. neurons
2. A state of activation yk for every unit, which is equivalent to
the output of the unit
3. Connections between the units: each connection is defined by a
weight wjk which determines the effect that the signal of unit j
has on unit k
4. A propagation rule which determines the effective input sk of a
unit from its external inputs
5. An activation function Fk that determines the new level of
activation based on the effective input sk(t) and the current
activation yk(t), i.e. the update
6. An external input, i.e. bias or offset (ok), for each unit
7. A method for information gathering (the learning rule)
8. An environment within which the system must operate, providing
input signals and, if necessary, error signals
Thus the artificial neuron is defined by the components:
1. A set of inputs, xi.
2. A set of weights, wij.
3. A bias, bi.
4. An activation function, f.
5. Neuron output, y
The subscript i indicates the i-th input or weight.

As the inputs and output are external, the parameters of this model
are the weights, the bias and the activation function, and these
therefore DEFINE the model.
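The components above (inputs xi, weights, bias, activation function, output y) can be sketched as a single artificial neuron; the input, weight and bias values below are illustrative only, not taken from the slides:

```python
import math

def sigmoid(s):
    # Squashes the weighted sum into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-s))

def neuron(x, w, b, f):
    """Single artificial neuron: weighted sum of inputs plus bias,
    passed through an activation function f."""
    s = sum(xi * wi for xi, wi in zip(x, w)) + b   # linear combination
    return f(s)

# Hypothetical inputs, weights and bias for illustration
y = neuron(x=[1.0, 0.5], w=[0.4, -0.6], b=0.1, f=sigmoid)
print(round(y, 3))  # → 0.55
```

Note how the negative weight acts as an inhibitory connection and the positive one as excitatory, matching the biological analogy described earlier.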
Processing Units in ANN

Each processing unit receives input from neighbours or external
sources and computes an output signal from it, which is propagated
to other units.
The units are also responsible for adjustment of the weights.

Types of units:
 Input units that receive data from outside of the neural
network
 Output units that send data out of the neural network
 Hidden units whose input and output signals remain within the
neural network.
A simple example of ANN application

A bank wants to assess whether to approve a customer's loan
application, so it wants to predict whether the customer is likely
to default on the loan. It has data like below; Column X has to be
predicted.

A simple ANN architecture for the example

Network Architecture

 The ANN has an input layer, a hidden layer and an output layer.
It is called a Multi Layer Perceptron (MLP).

 The network architecture used in the example is called a "feed-
forward network", since the input signals flow in only one
direction (from inputs to outputs).

 The purpose of the hidden layer is to distill some of the important
patterns from the input before they are passed on to the next layer.
By omitting redundant information, the system becomes faster and
more efficient.

Activation function

The purpose of an activation function is:

 to capture the non-linear relationship between the inputs
 to convert the input into a more useful output

The activation function used in the example is the sigmoid:

 O1 = 1 / (1 + exp(-F))
where F = W1*X1 + W2*X2 + W3*X3

 The sigmoid activation function creates an output with values
between 0 and 1.

Similarly, the hidden layer leads to the final prediction at
the output layer:
 O3 = 1 / (1 + exp(-F1))
where F1 = W7*H1 + W8*H2

 The output value (O3) is between 0 and 1.

 A value closer to 1 (e.g. 0.75) indicates a higher likelihood
of the customer defaulting.
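The two-layer computation above can be sketched end to end. W1 and W2 use the values 0.56 and 0.92 quoted later in the slides; every other input and weight value is a hypothetical placeholder, since the slides do not list a full set:

```python
import math

def sigmoid(f):
    return 1.0 / (1.0 + math.exp(-f))

# Hypothetical customer record (scaled values, invented for illustration)
x1, x2, x3 = 0.35, 0.62, 0.80   # e.g. Age, Debt Ratio, a third feature

# W1, W2 from the slides; the remaining weights are placeholders
w1, w2, w3 = 0.56, 0.92, 0.30   # inputs -> H1
w4, w5, w6 = 0.10, 0.45, 0.70   # inputs -> H2
w7, w8 = 0.85, 0.25             # hidden -> O3

# Hidden layer: each unit applies the sigmoid to its weighted sum
h1 = sigmoid(w1 * x1 + w2 * x2 + w3 * x3)
h2 = sigmoid(w4 * x1 + w5 * x2 + w6 * x3)

# Output layer: O3 = 1 / (1 + exp(-F1)), where F1 = W7*H1 + W8*H2
o3 = sigmoid(w7 * h1 + w8 * h2)
print(round(o3, 3))  # a value between 0 and 1: likelihood of default
```

With these placeholder numbers O3 comes out around 0.69, which this scheme would read as a fairly high likelihood of default.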

Weights

 The weights W represent the importance associated with the
inputs. If W1 is 0.56 and W2 is 0.92, then higher importance is
attached to X2 (Debt Ratio) than to X1 (Age) in predicting H1.

 A good model with high accuracy gives predictions that are very close to
the actual values. Here, Column X values should be very close to Column
W values. The error in prediction is the difference between column W and
column X:
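A minimal sketch of this error computation, using made-up actual (Column X) and predicted (Column W) values rather than the table from the slides:

```python
# Hypothetical columns: actual outcomes (X) vs model predictions (W)
actual    = [1, 0, 1, 0]              # Column X: did the customer default?
predicted = [0.75, 0.20, 0.60, 0.30]  # Column W: model outputs in (0, 1)

# Per-row prediction error: difference between Column X and Column W
errors = [a - p for a, p in zip(actual, predicted)]

# Mean squared error summarises the errors in a single number
mse = sum(e * e for e in errors) / len(errors)
print(round(mse, 4))  # → 0.0881
```

Training aims to drive a summary error like this toward zero by adjusting the weights.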

Optimisation
 The key to a good model with accurate predictions is to find
the "optimal values of W (the weights)" that minimise the
prediction error. This is achieved by the "back propagation
algorithm", and it is what makes the ANN a learning algorithm:
by learning from its errors, the model is improved.

 The most common optimisation algorithm is "gradient descent",
in which different values of W are tried iteratively and the
prediction errors assessed. To reach the optimal W, the values
of W are changed in small amounts and the impact on the
prediction error is assessed.
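A minimal gradient-descent sketch for a single weight, with an illustrative input/target pair (not the loan example's actual data): the weight is repeatedly nudged against the gradient of the squared error.

```python
# Minimise the squared error E = (t - w*x)^2 for one input/target pair
x, t = 2.0, 1.0   # hypothetical input and desired output
w = 0.0           # initial weight
lr = 0.05         # learning rate: size of each small change to w

for _ in range(100):
    y = w * x                   # prediction with the current weight
    grad = -2.0 * (t - y) * x   # dE/dw: slope of the error surface
    w -= lr * grad              # step w in the direction that reduces E

print(round(w, 3))  # → 0.5, since 0.5 * 2.0 = 1.0 gives zero error
```

Back propagation applies this same idea to every weight in the network at once, using the chain rule to compute each weight's gradient.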
Pattern of connections between units
(topologies)

Feed-forward neural networks
 The data flow from input to output units is strictly feed-forward.
Data processing can extend over multiple layers of units, but no
feedback connections are present.
 Examples of feed-forward NNs are the Perceptron and Adaline.

Recurrent neural networks
 Contain feedback connections; contrary to feed-forward networks,
the dynamical properties of the network are important.
 Examples of recurrent NNs are those by Anderson (1977), Kohonen
(1977) and Hopfield (1982).
Training of ANN

Categories of ANN learning:

 Supervised learning (Associative learning)


 Unsupervised learning (Self organisation)
 Reinforcement learning
Supervised learning (Associative learning)

 The network is trained by providing it with input and matching
output patterns.

 The input-output pairs can be provided by an external teacher or
by the system which contains the neural network (self-supervised).

 The paradigms of supervised learning include error-correction
learning and reinforcement learning.

 Least mean square convergence, which is common in many learning
paradigms, is used to minimise the error between the desired and
computed unit values.
Unsupervised learning (Self organisation)

 Uses no external teacher and is based only on local information.

 Referred to as self organisation: the network organises the data
itself and detects emergent collective properties, i.e. an output
unit is trained to respond to clusters of patterns within the input.

 The system is expected to discover statistically salient features
of the input population.

 Paradigms of unsupervised learning are Hebbian learning and
competitive learning.
Reinforcement Learning

 It is an intermediate form of supervised and unsupervised learning.
 The learning machine performs some action on the environment and
gets a feedback response from the environment.
 The learning system grades its action as good or bad based on the
environmental response and adjusts its parameters accordingly;
parameter adjustment continues until an equilibrium state occurs,
after which there are no more changes in its parameters.
 Unsupervised (self organising) learning may be categorised under
this type of learning.
Learning in ANN results in adjustment of the weights of
the connections between units according to some
modification rule.

Common learning rules are variants of the Hebbian
learning rule and the Widrow-Hoff rule (also called the delta rule).
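The Widrow-Hoff (delta) rule mentioned above can be sketched for a single linear unit: each weight moves in proportion to the error between the desired and computed output. The training samples and learning rate here are invented for illustration:

```python
# Delta rule: w_i <- w_i + eta * (d - y) * x_i, with desired output d
eta = 0.1                 # learning rate
w = [0.0, 0.0]            # initial weights
samples = [([1.0, 0.0], 1.0),   # (inputs, desired output)
           ([0.0, 1.0], 0.0)]

for _ in range(50):       # repeated presentation of the samples
    for x, d in samples:
        y = sum(wi * xi for wi, xi in zip(w, x))          # linear output
        w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]

print([round(wi, 2) for wi in w])  # → [0.99, 0.0]
```

The first weight converges toward 1.0 (its input must reproduce the desired output 1.0) while the second stays at 0.0, showing the least-mean-square error shrinking with each pass.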
Benefits of NN
NNs are capable of solving complex (large-scale) problems
that are currently intractable, through:
 their computing power, owing to a massively parallel
distributed structure
 their ability to learn and therefore generalise. Generalisation
refers to the neural network producing reasonable
outputs for inputs not encountered during training (i.e.
the learning process).
Useful properties of NN
 Nonlinearity
 Input-Output Mapping
 Adaptivity
 Evidential Response
 Contextual Information
 Uniformity of Analysis and Design
 Neurobiological Analogy
 Non-linearity – an artificial neuron can be linear or non-linear.
A network made up of an interconnection of nonlinear neurons is
itself nonlinear, and the nonlinearity is distributed throughout
the network.
 Input-output mapping – learning in a NN involves modification of
the synaptic weights by applying a set of labelled training samples
or task examples. Each example consists of a unique input signal
and a corresponding desired response.
 Adaptivity – a NN has a built-in capability to adapt its synaptic
weights to changes in the surrounding environment. A NN trained to
operate in a specific environment can easily be retrained to deal
with minor changes in the operating environmental conditions.
