Fundamentals of Artificial Neural Network: Workshop On "Neural Network Approach For Image Processing", Feb 4 & 5, 2011


CHAPTER 1

FUNDAMENTALS OF ARTIFICIAL NEURAL NETWORK

1.1 Introduction

A majority of information processing today is carried out by digital computers. Many tasks are ideally suited to solution by conventional computers: scientific and mathematical problem solving; database creation, manipulation and maintenance; electronic communication; word processing, graphics and desktop publishing; even the simple control functions that add intelligence to and simplify our household tools and appliances are handled quite effectively by today's computers. In contrast, cognitive tasks such as speech and image processing are hard to solve by the conventional algorithmic approach, whereas human beings are typically much better than a digital computer at perceiving and identifying an object of interest in a natural scene, interpreting natural language, and many other natural cognitive tasks. One reason why we are much better at, for example, recognizing objects in a complex scene is the way our brain is organized. The brain employs a computational architecture that is well suited to solving complex problems that a digital computer would have a difficult time with. Since conventional computers are not suited to this type of problem, we borrow features from the physiology of the brain as the basis for new computing models, and this technology has come to be known as the Artificial Neural Network (ANN). This chapter gives an overview of the fundamental principles of artificial neural networks.

1.2 Biological Neural Network

The brain is the central element of the nervous system. The information
processing cells of the brain are the neurons. Figure 1.1 shows the structure of a
biological neuron. It is composed of a cell body, or soma and two types of out-reaching
tree-like branches: the axon and the dendrites. The cell body has a nucleus that contains
information about hereditary traits and a plasma that holds the molecular equipment for
producing material needed by the neuron. A neuron receives signals from other neurons through its dendrites (receivers) and transmits signals generated by its cell body along the axon (transmitter), which eventually branches into strands and substrands. At the terminals of these strands are the synapses. A synapse is an elementary structure and fundamental functional unit between two neurons. The synapse's effectiveness can be adjusted by the signals passing through it, so that synapses can learn from the activities in which they participate.

Figure 1.1 A biological Neuron


It is estimated that the brain consists of about 10^11 neurons, which are interconnected to form a vast and complex neural network.

1.3 What is an Artificial Neural Network?

An Artificial Neural Network is an information-processing paradigm inspired by the way biological nervous systems, such as the brain, process information. In common with biological neural networks, ANNs can accommodate many inputs in parallel and encode the information in a distributed fashion. Typically, the information stored in an artificial net is shared by many of its processing units, which are analogous to neurons in the brain. Table 1.1 shows how the terminology of the biological neural net is associated with that of the artificial neural net.

Table 1.1 Associated Terminologies of Biological and Artificial Neural Net

Biological Neural Network        Artificial Neural Network
Cell Body (Soma)                 Neurons
Synapse                          Weights
Dendrites                        Input
Axon                             Output

Unlike its programmed computing counterpart, the neurocomputing approach to information processing first involves a learning process, in which an artificial neural network architecture adaptively responds to inputs according to a learning rule. After the neural network has learned what it needs to know, the trained network can be used to perform certain tasks depending on the particular application. The ability to learn by example and to generalize are the principal characteristics of an artificial neural network. The learning process is what sets a neural network apart from a digital computer, which must be explicitly programmed to process information.

1.4 Computational Model of Artificial Neuron

In 1943, McCulloch and Pitts proposed a binary threshold unit as a computational model for an artificial neuron. Figure 1.2 shows the fundamental representation of an artificial neuron. Each node has a node function associated with it, which, together with a set of local parameters, determines the output of the node for a given input. The input signals are transmitted by means of connection links. Each link has an associated weight, which multiplies the incoming signal. Positive weights correspond to excitatory synapses, while negative weights model inhibitory ones. This mathematical neuron computes a weighted sum of its n input signals x_i, i = 1, 2, ..., n, given by

net = w_1 x_1 + w_2 x_2 + ... + w_n x_n = ∑_{i=1}^{n} w_i x_i

If this sum exceeds the neuron’s threshold value then the neuron fires (the output is ON).
This is illustrated in Figure 1.3.


Figure 1.2 Basic artificial neuron (inputs x1, ..., xn with weights w1, ..., wn, threshold θ and output y)

The output of the neuron can be expressed by

y = f( ∑_{i=1}^{n} w_i x_i )

where f(.) is a step function defined by

f(net) = 1,  net > θ
       = 0,  net ≤ θ

The bias or offset θ can instead be introduced as one of the inputs by adding an extra input x_0, permanently set to 1, with weight w_0 = -θ. The thresholding is then performed at 0 and the output takes the simpler expression

y = f( ∑_{i=0}^{n} w_i x_i )

The lower limit of the sum in the above equation is 0 rather than 1, and the value of the input x_0 is always set to 1.
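The computation performed by this neuron is easy to reproduce in a few lines of MATLAB. The sketch below is purely illustrative; the particular inputs, weights and threshold value are assumptions, not values taken from the text.

    % McCulloch-Pitts style neuron: weighted sum followed by a hard threshold
    x = [0.5 1.0 -0.3];          % assumed input signals x1..xn
    w = [0.8 0.4  0.6];          % assumed synaptic weights w1..wn
    theta = 0.6;                 % assumed threshold
    net = sum(w .* x);           % net = w1*x1 + w2*x2 + ... + wn*xn
    y = double(net > theta);     % step activation: output is 1 only if net exceeds theta

Running these lines gives net = 0.62 and hence y = 1, i.e. the neuron fires for this particular input.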
Figure 1.3 Thresholding function: (a) thresholding at 0, (b) thresholding at θ

Other choices for the activation function, besides the thresholding function, are given in Figure 1.4.


Figure 1.4 Common non-linear activation functions

Networks consisting of MP neurons with binary (on-off) output signals can be configured
to perform several logical functions. Figure 1.5 shows some examples of logic circuits
realized using the MP model.

Figure 1.5 Logic Networks using MP Neurons
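As an illustration of Figure 1.5, the two-input AND and OR functions can each be realized with a single MP neuron; only the threshold differs. The weight and threshold values below are one common choice (an assumption for illustration), not the only possible one.

    % Two-input logic functions with MP neurons (unit weights, different thresholds)
    X = [0 0; 0 1; 1 0; 1 1];      % all binary input combinations, one per row
    w = [1 1];
    net = X * w';                   % weighted sum for each of the four patterns
    AND = double(net > 1.5)         % fires only when both inputs are 1
    OR  = double(net > 0.5)         % fires when at least one input is 1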


1.5 Architectures of Neural Networks

The arrangement of neurons into layers and the pattern of connections within and between layers is called the architecture of the network. The neurons within a layer may be either fully interconnected or not interconnected at all. The number of layers in the network is defined as the number of layers of weighted interconnection links between the slabs of neurons. Based on the connection pattern (architecture), ANNs can be grouped into three categories:
(i) Feed forward networks
(ii) Feedback networks
(iii) Competitive networks

Different connection patterns yield different network behaviour. Generally speaking, feed-forward networks are static: they produce only one set of output values, rather than a sequence of values, from a given input. Feed-forward networks are also memoryless, in the sense that their response to an input is independent of the previous network state. In single layer feed forward networks, an input layer of source nodes projects onto an output layer of neurons, but not vice versa. Multilayer feedforward neural networks distinguish themselves by the presence of one or more hidden layers, whose computation nodes are correspondingly called hidden neurons or hidden units. Recurrent, or feedback, networks on the other hand are dynamic systems. When a new input pattern is presented, the neuron outputs are computed; because of the feedback paths, the inputs to each neuron are then modified, which leads the network to enter a new state. Competitive networks use both feedforward and feedback connections.

1.6 Learning in Artificial Neural Networks

Learning Process in the ANN context can be viewed as the problem of updating network
architecture and connection weights so that a network can efficiently perform a specific
task. The network usually must learn the specific weights from available training
patterns. Performance is improved over time by iteratively updating the weights in the
network. The ability of ANNs to learn automatically from examples makes them an attractive and exciting tool for many tasks.
There are three main learning paradigms for network training:
1. Supervised,
2. Unsupervised and
3. Reinforcement learning.

• Supervised learning or associative learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system that contains the neural network (self-supervised).


• Unsupervised learning or self-organization, in which an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.

• Reinforcement learning may be considered an intermediate form of the above two types of learning. Here the learning machine performs some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response and adjusts its parameters accordingly. Parameter adjustment generally continues until an equilibrium state is reached, after which there are no further changes in the parameters. Self-organizing neural learning may be categorized under this type of learning.


CHAPTER 2

FEED FORWARD NEURAL NETWORK

2.1 Introduction

In chapter 1, the mathematical details of a neuron at the single cell level and as a network
were described. Although single neurons can perform certain simple pattern detection
functions, the power of neural computation comes from the neurons connected in a
network structure. Larger networks generally offer greater computational capabilities but
the converse is not true. Arranging neurons in layers or stages is supposed to mimic the
layered structure of certain portions of the brain.
In this chapter, we describe the most popular Artificial Neural Network (ANN) architecture, namely the Feedforward (FF) network. First we briefly review the perceptron model. Next, a multilayer feedforward network architecture is presented. This type of network is sometimes called a multilayer perceptron because of its similarity to perceptron networks with more than one layer. We then derive the generalized delta (backpropagation) learning rule and see how it is implemented in practice. Finally, we examine variations in the learning process that improve efficiency, ways to avoid some potential problems that can arise during training, the choice of training parameters, and the capabilities and limitations of the ANN.

2.2 Perceptron Model

A two-layer feedforward neural network with hard-limiting threshold units in the output layer is called a single-layer perceptron model. A perceptron model consisting of two
input units and three output units is shown in Figure 2.1. The number of units in the
input layer corresponds to the dimensionality of the input pattern vectors. The units in the
input layer are linear as the input layer merely contributes to fan out the input to each of
the output units. The number of output units depends on the number of distinct classes in
the pattern classification task.
The perceptron network is trained with the help of perceptron algorithm which is
a supervised learning paradigm. The network is given a desired output for each input
pattern. During the learning process, the actual output yi generated by the network may
not equal the desired output di. The perceptron learning rule is based on error-correction
principle. The basic principle of error correction learning rule is to use the error signal
(di-yi) to modify the connection weights to gradually reduce this error.


Figure 2.1 Single layer perceptron with input layer units X1, X2 and output layer units Y1, Y2, Y3 connected through weights wij

The algorithm for weight adjustment using perceptron learning law is given below:

• Initialize weights and thresholds by assigning random values.


• Present a training pattern – i.e., an input and the desired output.
• Calculate the actual output:

  y(t) = f( ∑_{i=0}^{n} w_i(t) x_i(t) )

• Adjust the weights:
  If the output is correct:                      w_i(t+1) = w_i(t)
  If the output is 0 but should be 1 (class A):  w_i(t+1) = w_i(t) + η x_i(t)
  If the output is 1 but should be 0 (class B):  w_i(t+1) = w_i(t) − η x_i(t)
  where 0 ≤ η ≤ 1 is a constant that controls the adaptation rate.
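A minimal MATLAB sketch of this learning law for a single output unit is given below. The training data (the two-input AND function), the learning rate and the number of passes are assumptions chosen only to make the example concrete.

    % Perceptron learning for one output unit with a 0/1 step activation
    X   = [0 0; 0 1; 1 0; 1 1];    % assumed training inputs, one pattern per row
    d   = [0; 0; 0; 1];            % assumed desired outputs (logical AND)
    eta = 0.2;                     % adaptation rate
    w   = zeros(1,2);  b = 0;      % weights and threshold (bias) start at zero
    for epoch = 1:50
        for p = 1:size(X,1)
            y = double(w*X(p,:)' + b > 0);     % actual output for pattern p
            w = w + eta*(d(p) - y)*X(p,:);     % no change when the output is correct
            b = b + eta*(d(p) - y);
        end
    end

After training, w*X' + b is positive only for the input [1 1], so the unit has learned the AND function.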

Figure 2.2 Supervised Learning


2.3 Geometrical interpretation of perceptron network output

A pattern classification problem solved with the perceptron model can be viewed as determining the hyper surfaces that separate the multidimensional patterns belonging to different classes. With linear threshold units at the output, the hyper surfaces reduce to hyperplanes, i.e. straight lines in a two-dimensional pattern space. The network can produce as many distinct separating lines in the pattern space as it has output units, as shown in Figure 2.2. These lines can be used to separate different classes, provided the regions formed by the pattern classification problem are linearly separable.
Figure 2.2 Linearly separable classes in the (x1, x2) pattern space

There are certain restrictions on the class of problems for which the perceptron model can be used: the perceptron network is applicable only if the patterns are linearly separable. Because many classification problems do not possess this linear-separability property, this condition places a severe restriction on the applicability of the perceptron network. A feedforward network with a hidden layer is an obvious choice in this case; the details are given in the next section.

Problem 1:

Train the perceptron network using the training vectors

x1 = [1; -2; 0; -1],   x2 = [0; 1.5; -0.5; -1],   x3 = [-1; 1; 0.5; -1]

The desired responses are d1 = -1, d2 = -1 and d3 = 1 respectively.
Take η = 0.1 and the initial weight vector

w' = [1; -1; 0; 0.5]
Step 1:  net = 2.5;   w2 = [0.8  -0.6  0  0.7]

Step 2:  net = -1.6;  no correction

Step 3:  net = -2.1;  w3 = [0.6  -0.4  0.1  0.5]
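The steps above follow the bipolar perceptron rule w_new = w_old + η (d − o) x with o = sign(net). Under that assumption, the short script below reproduces the quoted values of net and the updated weight vectors.

    % Worked perceptron example (bipolar targets, sign activation)
    x   = [ 1  -2    0  -1;             % x1, x2 and x3 stored as rows
            0   1.5 -0.5 -1;
           -1   1    0.5 -1];
    d   = [-1 -1 1];
    eta = 0.1;
    w   = [1 -1 0 0.5];                 % initial weight vector w'
    for k = 1:3
        net = w * x(k,:)'               % 2.5, -1.6, -2.1
        o   = sign(net);
        w   = w + eta*(d(k) - o)*x(k,:) % unchanged whenever o equals d(k)
    end

The final weight vector printed by the loop is [0.6 -0.4 0.1 0.5], matching Step 3.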

2.4 Multilayer Feed Forward Neural Network

Figure 2.3 shows the structure of multilayer feed forward neural network. This type of
architecture is part of a large class of feed forward neural networks with the neurons
arranged in cascaded layers. The neural network architecture in this class share a
common feature that all neurons in a layer (or sometimes called a slab) are connected to
all neurons in adjacent layers through unidirectional branches. That is, the branches and
links can only broadcast information in one direction, that is, the “forward direction”. The
branches have associated transmittances, that is, synaptic weights that can be adjusted
according to a defined learning rule. Feed forward networks do not allow connections
between neurons within any layer of architecture. At every neuron, the output of the
linear combiner, that is, the neuron activity level is input to a non linear activation
function f (.), whose output is the response of the neuron.
The neurons in the network typically have activity levels in the range [-1, 1], and
in some applications the range [0, 1] is used. The network in Figure 2.3 actually has three layers: the first layer does not perform any computations but only serves to feed the input signal to the neurons of the second layer (called the hidden layer), whose outputs are in turn input to the third layer (the output layer). The output of the output layer is the network response. This network can perform a non-linear
input/output mapping. In general there can be any number of hidden layers in the
architecture; however, from a practical perspective, only one or two hidden layers are
typically used. In fact, it can be shown that a Multi Layer Perceptron (MLP) with only one hidden layer, given a sufficient number of hidden neurons, acts as a universal approximator of non-linear mappings.


Figure 2.3 A typical multi layer feed forward network architecture: input layer (i), hidden layer (h) and output layer (j)

2.5 Back Propagation Learning Algorithm

Back Propagation learning is the commonly used algorithm for training the MLP. It is a
gradient descent method minimizing the mean square error between the actual and target
output of a multi layer perceptron.

Let (x^p, d^p) be a training data pair.

In iteration k, the forward pass is

net_h^k = ∑_{i=0}^{n} x_i w_{hi}^k ,    h = 1, 2, ..., H

z_h^k = f(net_h^k) = 1 / (1 + e^{-net_h^k}) ,    h = 1, 2, ..., H

net_j^k = ∑_{h=0}^{H} z_h^k w_{jh}^k ,    j = 1, 2, ..., m

y_j^k = f(net_j^k) = 1 / (1 + e^{-net_j^k}) ,    j = 1, 2, ..., m

Error function:

E^k = (1/2) ∑_{j=1}^{m} (d_j - y_j^k)^2

The weights are adjusted so that this error is minimized.

Hidden-to-output layer weights:

w_{jh}^{k+1} = w_{jh}^k + ∆w_{jh}^k ,   with   ∆w_{jh}^k = -η ∂E^k / ∂w_{jh}^k

∂E^k / ∂w_{jh}^k = -(d_j^k - y_j^k) f'(net_j^k) z_h^k = -δ_j^k z_h^k

Here δ_j^k = (d_j^k - y_j^k) f'(net_j^k) represents the output error scaled by the slope of the activation function.

Input-to-hidden layer weights:

w_{hi}^{k+1} = w_{hi}^k + ∆w_{hi}^k ,   with   ∆w_{hi}^k = -η ∂E^k / ∂w_{hi}^k

By the chain rule,

∂E^k / ∂w_{hi}^k = (∂E^k / ∂z_h^k)(∂z_h^k / ∂net_h^k)(∂net_h^k / ∂w_{hi}^k)
                 = -( ∑_{j=1}^{m} δ_j^k w_{jh}^k ) f'(net_h^k) x_i
                 = -δ_h^k x_i

Here δ_h^k = ( ∑_{j=1}^{m} δ_j^k w_{jh}^k ) f'(net_h^k).

Weight update equations:

• Hidden to output layer:    w_{jh}^{k+1} = w_{jh}^k + η δ_j^k z_h^k

• Input to hidden layer:     w_{hi}^{k+1} = w_{hi}^k + η δ_h^k x_i
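The update equations above translate almost line for line into MATLAB. The sketch below performs one forward pass and one weight update for a single training pair; the layer sizes, data and learning rate are assumed values chosen only to make the example concrete, and bias weights are omitted for brevity.

    % One back propagation iteration for a single-hidden-layer network (no biases)
    n = 3; H = 4; m = 2;               % assumed numbers of input, hidden and output units
    x = rand(n,1);  d = rand(m,1);     % assumed training pair (x, d)
    Whi = 0.1*randn(H,n);              % input-to-hidden weights w_hi
    Wjh = 0.1*randn(m,H);              % hidden-to-output weights w_jh
    eta = 0.5;
    f  = @(a) 1./(1+exp(-a));          % logistic activation
    fp = @(z) z.*(1-z);                % its derivative, written in terms of the output

    z = f(Whi*x);                      % hidden outputs z_h (forward pass)
    y = f(Wjh*z);                      % network outputs y_j
    dj = (d - y).*fp(y);               % output deltas  δ_j
    dh = (Wjh'*dj).*fp(z);             % hidden deltas  δ_h
    Wjh = Wjh + eta*dj*z';             % hidden-to-output update
    Whi = Whi + eta*dh*x';             % input-to-hidden update

Repeating these lines over many training pairs and many epochs is exactly the pattern-mode training described in Section 2.6.1.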


The back propagation algorithm consists of the following steps:

Step 1. Initialize weights and offsets


Initialize all weights and node offsets to small random values.
Step 2. Present Input and Desired Output vector
Present continuous input vector x and specify the desired output d. Output vector
elements are set to zero values except for that corresponding to the class of the
current input.
Step 3. Calculate actual outputs
Calculate the actual output vector y using the sigmoidal nonlinearity

f(net_i) = 1 / (1 + e^{-net_i})

Step 4. Adapt weights


Adjust the weights by

w_ij(t+1) = w_ij(t) + η δ_j x_i'

where x_i' is the output of node i, η is the learning rate constant and δ_j is the sensitivity of node j. If node j is an output node, then

δ_j = f'(net_j)(d_j − y_j)

where d_j is the desired output of node j, y_j is the actual output and f'(net_j) is the derivative of the activation function evaluated at net_j. If node j is an internal (hidden) node, then the sensitivity is defined as

δ_j = f'(net_j) ∑_k δ_k w_jk
where k sums over all nodes in the layer above the node j. Update equations are
derived using the chain derivation rule applied to the LMS training criterion
function.

Step 5. Repeat by going to step 2

Training the network with the back propagation algorithm results in a non-linear mapping between the input and output variables. Thus, given the input/output pairs, the network can have its weights adjusted by the back propagation algorithm to capture the non-linear relationship.
After training, the networks with fixed weights can provide the output for the
given input. Once trained, the network can be used as a classifier model for any
engineering applications.

2.6 Design Issues

This section discusses the various design issues that concern the inner workings of the
back propagation algorithm.

2.6.1 Pattern or Batch Mode Training


The Back Propagation algorithm operates by sequentially presenting data drawn from a training set to a predefined network architecture. There are two choices in implementing this algorithm. In the first, as soon as a pattern is presented to the network, its error gradients are calculated and the network weights are changed immediately based on these instantaneous (local) gradient values; this is called pattern mode training. Alternatively, one can accumulate the error gradients over an entire epoch and then change the weights in one step; this is called batch mode training.
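The difference between the two modes lies only in when the weight change is applied. The toy example below contrasts them on a single linear unit trained with the LMS (mean-square error) criterion; the data, learning rate and epoch count are assumptions made purely for illustration.

    % Pattern (on-line) versus batch mode gradient descent for one linear unit
    X = [0 0; 0 1; 1 0; 1 1];   d = [0; 1; 1; 2];   % assumed toy data (d = x1 + x2)
    eta = 0.1;

    w_pat = zeros(2,1);                 % pattern mode: update after every pattern
    for epoch = 1:200
        for p = 1:size(X,1)
            e = d(p) - X(p,:)*w_pat;
            w_pat = w_pat + eta*e*X(p,:)';
        end
    end

    w_bat = zeros(2,1);                 % batch mode: accumulate over the whole epoch
    for epoch = 1:200
        g = zeros(2,1);
        for p = 1:size(X,1)
            e = d(p) - X(p,:)*w_bat;
            g = g + e*X(p,:)';          % accumulate the gradient contributions
        end
        w_bat = w_bat + eta*g;          % single weight change per epoch
    end

Both w_pat and w_bat end up close to [1; 1], the exact solution for this data; they differ only in how often the weights are updated along the way.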

2.6.2 Selection of Network Structure

Both the generalization and approximation ability of a feed forward neural network are
closely related to the architecture of the network and the size of the training set. Choosing an appropriate network architecture means choosing the number of layers and the number of hidden neurons per layer. Although the back propagation algorithm can be applied to any number of hidden layers, a three-layered network can approximate any continuous function. Selecting the number of neurons in the hidden layers of multilayer networks is therefore an important issue: the number of nodes must be large enough to form a decision region as complex as that required by the problem, yet not so large that the weights cannot be reliably estimated from the available training data.

In general, a cross-validation approach is used to select an appropriate network architecture. The operational procedure of the cross-validation approach is given below:

• Divide the data set into a training set Ttraining and a test set Ttest.
• Subdivide Ttraining into two subsets: one to train the network Tlearning, and
one to validate the network Tvalidation.
• Train different network architectures on Tlearning and evaluate their
performance on Tvalidation.
• Select the best network.
• Finally, retrain this network architecture on Ttraining.
• Test for generalization ability using Ttest.
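A minimal way to realize this splitting in plain MATLAB is sketched below. The data matrix and the 60/20/20 proportions are assumptions made for illustration, not values prescribed by the procedure above.

    % Random split of the patterns into learning, validation and test sets
    data = rand(100,5);                  % assumed data set: one pattern per row
    N    = size(data,1);
    idx  = randperm(N);                  % shuffle the pattern order
    nlearn = round(0.6*N);  nval = round(0.2*N);
    Tlearning   = data(idx(1:nlearn), :);
    Tvalidation = data(idx(nlearn+1:nlearn+nval), :);
    Ttest       = data(idx(nlearn+nval+1:end), :);
    Ttraining   = [Tlearning; Tvalidation];   % used for the final retraining step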

2.6.3 Weight Initialization

It is important to correctly choose a set of initial weights for the network. Sometimes it
can decide whether or not the network is able to learn the training set function. It is
common practice to initialize weights to small random values within some
interval [−ε, ε]. Initializing all the weights of the network to the same value can lead to network paralysis, where the network learns nothing because the weight changes are uniformly zero. Further, very small ranges of weight randomization should be avoided in general
since this may lead to very slow learning in the initial stages. Alternatively, an incorrect
choice of weights might lead to network saturation where weight changes are almost
negligible over consecutive epochs.


To get the best result the initial weights (and biases) are set to random numbers
between -0.5 and 0.5 or between -1 and 1. In general, the initialization of weights (bias)
can be done randomly.
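For a single layer this kind of initialization takes only a couple of lines; the layer dimensions below are assumed values used only for illustration.

    % Random initialization of weights and biases in the interval [-0.5, 0.5]
    nin = 8;  nout = 4;                  % assumed numbers of inputs and neurons
    W = rand(nout, nin) - 0.5;           % weights drawn uniformly from [-0.5, 0.5]
    b = rand(nout, 1)  - 0.5;            % biases drawn from the same interval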

2.6.4 Termination criteria for Training

The motivation in applying a back propagation net is to achieve a balance between memorization and generalization; it is not necessarily advantageous to continue training until the error reaches a minimum value. The weight adjustments are based on the training patterns. As long as the error on the validation set keeps decreasing, training continues. Whenever this error begins to increase, the net is starting to memorize the training patterns, and at this point training is terminated.

2.7 Applications of Feed Forward Neural Networks

Multilayered feed forward neural networks trained using the back propagation algorithm account for a majority of real-world ANN applications. This is because the back propagation algorithm is easy to implement and fast and efficient to operate. Some of the applications of ANN are mentioned below.

• Speech recognition
• Data Mining
• Robot arm control
• Bio-Informatics
• Power system security assessment
• Load forecasting
• Image processing


CHAPTER 3

WORKING WITH MATLAB

3.1 Introduction to MATLAB

MATLAB stands for MATrix LABoratory. It is a software package for high performance
numerical computation and visualization. It also provides an interactive environment with
hundreds of reliable and accurate built-in mathematical functions. MATLAB’s built-in
functions provide excellent tools for linear algebraic computations, data analysis, signal
processing, optimization, numerical solution of ordinary differential equation and many
other types of scientific computations. The basic building block of MATLAB is the
matrix. The fundamental data type is the array. Vectors, Scalars, real matrices and
complex matrices are all handled as special cases of the basic data type.
Dimensioning of a matrix is automatic in MATLAB. MATLAB is case-sensitive:
Most of the MATLAB commands and built-in functions are in lower case letters. The
output of every command is displayed on the screen unless MATLAB is directed
otherwise. A semi-colon at the end of a command suppresses the screen output, except
for graphics and on-line help commands.

3.2 MATLAB Windows

MATLAB works through the following three basic windows:


1. Command Window
2. Graphics Window
3. Edit Window

3.2.1 Command Window

This is the main window, which is characterized by the MATLAB command prompt, '>>'. All commands, including those for running user-written programs, are typed in this window at the MATLAB command prompt. Alongside it, the MATLAB desktop provides four other sub-windows:
• Launch Pad:
This sub-window lists all MATLAB related applications and toolboxes.
• Work Space:
This sub-window lists all variables that have been generated so far and shows
their type and size. Various operations can be performed on these variables such
as plotting.
• Command History:
All commands typed on the MATLAB command prompt get recorded, even
across multiple sessions in this window.
• Current Directory:
This is the sub-window, in which all files from the current directory are listed.


3.2.2 Graphics Window

The output of all graphics commands typed in the command window are flushed to the
graphics or figure window, a separate gray window with white background color. The
user can create as many figure windows as the system memory will allow.

3.2.3 Edit Window

In this sub-window, programs can be written, edited, created and saved in files called
“M-files”. MATLAB provides its own built-in editor.

3.3 Working with Script Files

A script file is a user-created file with a sequence of MATLAB commands in it. It may be
created by selecting a new M-file from the file menu in edit window. The file must be
saved with a ‘.m’ extension to its name, thereby, making it an M-file. This file is
executed by typing its name, without extension at the command prompt in command
window. If the '%' symbol is placed before a line in a MATLAB program, that line is treated as a comment line.
Eg. % MATLAB is a user friendly program
The character limit for a line in MATLAB program is 4096.
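For example, a script saved as circle_area.m (the file name and contents are only an illustration) is run by typing circle_area at the command prompt:

    % circle_area.m - compute and display the area of a circle
    r = input('enter radius ');     % prompt the user for the radius
    A = pi * r^2;                   % area of the circle
    disp(A)                         % display the result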

3.4 Working with Directories

The following are some important commands for working with directories:
• pwd (print working directory)
This command displays the current working directory.
Eg: >>pwd
C:\matlabR12\work
Displays the present working directory.
• cd (change directory)
This command is used to change the current working directory to a new directory.
Eg: >>cd new directory
• dir (directory)
On execution of this command, the contents present in the current directory can
be viewed.
• addpath
This command is used to add the specified directories to the existing path.
Eg: >>addpath D:\matlabR12\work
>>addpath C:\mywork
(or)
>>addpath D:\matlabR12\work C:\mywork


3.5 Variables

Expressions typed without a variable name are evaluated by MATLAB, and the result is
stored and displayed by a variable, ans. The result can also be stored to a variable name.
A variable name should begin with a letter and can be up to 31 characters long. After the first letter, letters, digits and underscores may be used. Variables used in script files are global. The global command declares a set of variables to be globally accessible to all or some functions without passing the variables in the input list.
Eg: >>global x y;
An M-file can prompt for input from the keyboard by using input command.
Eg: >>V=input(‘enter radius’)
displays the string - enter radius - and waits for a number to be entered. That number
will be assigned to the variable, V.

3.6 Matrix and vectors

A matrix is entered row-wise with consecutive elements of a row separated by a space or


by a comma, and the rows are separated by semi-colons. The entire matrix must be enclosed within square brackets. Elements of the matrix may be real numbers, complex numbers or valid MATLAB expressions.
Eg: >>A=[1 2 3 ; 5 6 7; 10 11 12]
displays
A =
     1     2     3
     5     6     7
    10    11    12
A Vector is a special case of a matrix, with just one row or one column. It is entered the
same way as a matrix.
Eg: >> V = [1 2 3 ] produces a row vector.
>> U = [1;2;3] produces a column vector.
To create a vector of numbers over a given range with a specified increment.
The general command to do this in MATLAB is,
1. V=Initial value : Increment : Final value
The three values in the above assignment can also be valid MATLAB
expressions. If no increment is specified, MATLAB uses the default
increment of 1.
Eg: >>a = 0:10:100 produces a = [0 10 20 …….100]
2. linspace(a,b,n)
This generates a linearly spaced vector of length n from a to b.
Eg: >>u = linspace(0,100,6) generates u = [0 20 40 60 80 100]
3. logspace(a,b,n)
This generates a logarithmically spaced vector of length n from 10a to 10b
Eg: >>v = logspace(0,3,4) produces v =[1 10 100 1000]
Special vectors, such as vectors of zeros or ones of a specific length, can be created with
the utility matrix functions zeros, ones, etc.
Eg: >>p = zeros(1,2) initializes a two element long zero row vector.
>>m = ones(10,1) creates a 10 element long column vector of 1’s.


The following are some of the commands used in matrix and vector manipulation:
• Square brackets with no elements between them creates a null matrix.
Eg: >>X = [ ] produces a null vector
• Any row(s) or column(s) of a matrix can be deleted by setting the row or
column to a null vector.
Eg: >>A(2,:) = [ ] deletes the second row of the matrix A.
• MATLAB provides a higher level of index specification: a range of rows and columns can be specified at the same time.
Eg: >>B=A(2:4,1:4) creates a matrix B consisting of the elements in rows 2 to 4 and columns 1 to 4 of the matrix A.
• When the specified range covers all rows (or columns) of the matrix, a colon alone can be used.
Eg: >>B=A(2:3,:) creates a matrix B consisting of all elements in the 2nd and 3rd rows of the matrix A.
• No Dimension declaration required normally, but if the matrix is large then it
is advisable to initialize it
Eg: >>A=zeros(m,n); creates or initializes a matrix A of zeros as
elements with a dimension of m x n.
• The size of the matrix can be determined by the command size(matrix name)
in the command window.
Eg: >>a=size(A);
• If we want to know about the size of the row and column in a matrix then type
Eg: >>[m,n] = size(matrix name)
• Transpose of a matrix
Eg: >>B = A’
• Appending a row or a column to a matrix
>>A=[A u] appends the column vector u as a new column of A
>>A=[v ; A] appends the row vector v as a new row of A
where u and v are the column and row to be added.
• Deleting a row or a column of a matrix: any row or column can be deleted by setting it to a null vector.
>>U(:,a) = [ ] deletes the a-th column from the matrix U.
>>V(b,:) = [ ] deletes the b-th row from the matrix V.
• Performing arithmetic operations on a matrix and on an array
Eg: >>A=[1 2; 3 4]
>>A=A*A;       (matrix multiplication)
• For the element-wise (array) operation:
>>A=A.*A


3.7 Plotting a graph

The following are the basic commands used in plotting a graph or figure:

>>plot(x,y) – To plot y against x
>>xlabel('x') – To label the x-axis
>>ylabel('y') – To label the y-axis
>>title('Title name') – To set the title for the graph
>>axis('equal') – To set equal length scales on both axes (needed, for example, to make a circle look circular)
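Putting these commands together, a short script that draws a unit circle might look as follows (the variable names are arbitrary):

    theta = linspace(0, 2*pi, 100);   % 100 points around the circle
    x = cos(theta);
    y = sin(theta);
    plot(x, y)
    xlabel('x')
    ylabel('y')
    title('Unit circle')
    axis('equal')                     % equal scaling so the circle is not distorted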

3.8 Loops, Branches and Control flow

3.8.1 For loop

It is used to repeat a statement or a group of statements a fixed number of times.


Format:
1. for i = m:n
………
……..
end
Where,
m – initial value
n – final value
2. for i = initial value: increment: final value
…….
end
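For instance, the following loop (purely illustrative) prints the squares of the numbers 1 to 5:

    for i = 1:5
        disp(i^2)
    end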

3.8.2 While Loop

It is used to execute a statement or a group of statements an indefinite number of times, until the condition specified by while is no longer satisfied.
Format:
while condition
…….
…….
end

3.8.3 If – else statement:

This construction provides a logical branching for computations, also nested if statement
is possible as long as we have matching end statements

Eg: >>i=6;
>>j=21;
>>if i>5
k=i;


elseif (i>1) & (j==20)


k=5*i+j;
else
k=1;
end

3.8.4 Switch case – otherwise

This construction provides a logical branching for computations. A flag is used as switch
and the values of the flag make up the different cases for execution.
Eg: color = input(‘color =’,’s’);
switch color
case ‘red’
c=[1 0 0]
case ‘blue’
c=[0 0 1]
otherwise
error(‘Invalid color’)
end

3.8.5 Error

The command error inside a function or a script aborts the execution, displays the error
message and returns the control to the keyboard.

3.8.6 Break

The command break inside a for loop or while loop terminates the execution of the loop, even if the condition for execution of the loop is still true.
Eg: for i=1:10
if u(i)<0
break
end
a=a+v(i);
end

3.8.7 Return

The command return transfers control back to the invoking function.


3.9 Exercises :

1. Perform the following arithmetic operations


(i)   Y = 6X^3 + 4/X,        X = 2
(ii)  Y = (X/4) * 3,         X = 8
(iii) Y = (4X)^2 / 25,       X = 10
(iv)  Y = 2 * sin(X)/5,      X = 2
(v)   Y = 7 * X^(1/3) + 4X,  X = 20

2. Write a MATLAB program to calculate the volume of a circular cylinder

3. Plot the polynomial


f(X) = 9X^3 − 5X^2 + 3X + 7

4. Given A = [3 7 -4 12; -5 9 10 2; 6 13 8 11; 15 5 4 1]
a. Create a vector V consisting of the elements in the second column of A
b. Create a vector W consisting of the elements in the second row of A

5. Given A = [3 7 -4 12; -5 9 10 2; 6 13 8 11; 15 5 4 1]
(i) Create a 4 x 3 array B consisting of all elements in the second through fourth columns of A
(ii) Create a 3 x 4 array C consisting of all elements in the second through
fourth rows of A
(iii) Create a 2 x 3 array D consisting of all elements in the first two rows
and the last three columns of A


CHAPTER 4

MATLAB NEURAL NETWORK TOOLBOX

4.1 Introduction

The Neural Network Toolbox is a collection of predefined functions built on the MATLAB numerical computing environment. These predefined functions can be called by the user to simulate various types of neural network models. Artificial neural network technology is being used to solve a wide variety of complex scientific, engineering and business problems. Neural networks are well suited to such problems because, like their biological counterparts, an artificial neural network can learn, and can therefore be trained to find solutions, recognize patterns, classify data, and forecast future events.


Because neural networks require intensive matrix computations, MATLAB provides a natural framework for rapidly implementing them and for exploring their behaviour and applications.

4.2 Basic Commands in Neural Network Toolbox

HARDLIM - Hard limit transfer function.


A = hardlim(N)
info = hardlim(code)
HARDLIM is a transfer function. Transfer functions calculate a layer's output from its net
input.
HARDLIM(N) takes one input,N - SxQ matrix of net input (column) vectors and returns 1
where N is positive, 0 elsewhere.
HARDLIM(CODE) returns useful information for each CODE string:
'deriv' - Name of derivative function.
'name' - Full name.
'output' - Output range.
'active' - Active input range.

HARDLIMS - Symmetric hard limit transfer function.


A = hardlims(N)
info = hardlims(code)
HARDLIMS(N) takes one input,N - SxQ matrix of net input (column) vectors and returns
1 where N is positive, -1 elsewhere.

PURELIN - Linear transfer function.


A = purelin(N)
info = purelin(code)
PURELIN calculates a layer's output from its net input.
PURELIN(N) takes one input, N, an SxQ matrix of net input (column) vectors, and returns N.
PURELIN(CODE) returns useful information for each CODE string:


'deriv' - Returns name of derivative function.


'name' - Returns full name.
'output' - Returns output range.
'active' - Returns active input range.

LOGSIG - Logarithmic sigmoid transfer function.


A = logsig(N)
info = logsig(code)
LOGSIG calculates a layer's output from its net input.
LOGSIG(N) takes one input, N, an SxQ matrix of net input (column) vectors, and returns each element of N squashed between 0 and 1.
LOGSIG(CODE) returns useful information for each CODE string:

'deriv' - Returns name of derivative function.


'name' - Returns full name.
'output' - Returns output range.
'active' - Returns active input range.

TANSIG Hyperbolic tangent sigmoid transfer function.


A = tansig(N)
info = tansig(code)
TANSIG calculates a layer's output from its net input.
TANSIG(N) takes one input, N, an SxQ matrix of net input (column) vectors, and returns each element of N squashed between -1 and 1.
TANSIG(CODE) returns useful information for each CODE string:
'deriv' - Returns name of derivative function.
'name' - Returns full name.
'output' - Returns output range.
'active' - Returns active input range.

SATLIN Saturating linear transfer function.


A = satlin(N)
info = satlin(code)
SATLIN calculates a layer's output from its net input.
SATLIN(N) takes one input, N, an SxQ matrix of net input (column) vectors, and returns the values of N truncated into the interval [0, 1] (the symmetric version SATLINS saturates into [-1, 1]).
SATLIN(CODE) returns useful information for each CODE string:
'deriv' - Returns name of derivative function.
'name' - Returns full name.
'output' - Returns output range.
'active' - Returns active input range.
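Assuming the Neural Network Toolbox is installed, these transfer functions can be tried directly at the command prompt; the values shown in the comments follow from the definitions above:

    >> a = logsig(0)           % returns 0.5
    >> a = tansig([-10 0 10])  % returns approximately [-1 0 1]
    >> a = purelin(2.5)        % returns 2.5
    >> a = hardlims([-2 3])    % returns [-1 1]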


4.3 Design and Simulation Using the MATLAB Neural Network Toolbox

• Generate the training and test data sets.

• Load the data using the command "load" and store it in a variable
(e.g.) >> load PR.dat
>> XY=PR

• Separate the input and output data from the variable XY, storing the input data in a variable X and the output data in a variable Y.

• Split the input and output data into training and test cases:
X1, Y1 for training
X2, Y2 for testing

• Normalize the input / output data if they are in different ranges, using the formula below.

Normalized value of the ith pattern in jth variable is given by


X1n(i,j)=(X1(i,j)-minX(j))/(maxX(j)-minX(j));

Similarly normalized value of the ith pattern in the jth output variable
Y1n(i,j)=(Y1(i,j)-minY(j))/(maxY(j)-minY(j));
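In MATLAB the same min-max normalization can be applied to all patterns and columns at once; the sketch below assumes X1 and Y1 hold the training input and output patterns row-wise, as in the earlier steps.

    minX = min(X1);  maxX = max(X1);      % column-wise minima and maxima
    minY = min(Y1);  maxY = max(Y1);
    X1n = (X1 - repmat(minX,size(X1,1),1)) ./ repmat(maxX-minX,size(X1,1),1);
    Y1n = (Y1 - repmat(minY,size(Y1,1),1)) ./ repmat(maxY-minY,size(Y1,1),1);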

• Define the network structure using the command
net=newff(PR,[nhid nout],{'tansig' 'purelin'},'trainscg')
where
newff creates the feed forward network object
tansig is the tan-sigmoid transfer function
purelin is the linear transfer function
trainscg trains the network with the scaled conjugate gradient algorithm

• Initialize the weights and biases using the commands
rand('seed',0);
net.layers{1}.initFcn='initwb';
net.inputWeights{1,1}.initFcn='rands';
net.biases{1,1}.initFcn='rands';
net.biases{2,1}.initFcn='rands';
net=init(net);

• Specify the number of hidden neurons and the number of output neurons using the assignments (these must be made before calling newff)
nhid = ---
nout = ---


• Specify the number of epochs and the error goal using the assignment
net.trainParam.epochs=----;
net.trainParam.goal=-----;

• Train the network using the command
[net, TR] = train(net, X1n', Y1n')

• Compare the network output with the actual output.

• Re-normalize the output values.
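Putting the steps above together, a condensed end-to-end script is sketched below. It uses the older newff interface shown in this section, and the data are synthetic values invented purely for illustration; treat it as a sketch of the workflow rather than a verified toolbox session.

    % End-to-end sketch: generate data, normalize, create, train and test a network
    X1 = rand(50,3);   Y1 = sum(X1,2);       % assumed training data: 3 inputs, 1 output
    X2 = rand(20,3);                         % assumed test inputs
    minX = min(X1);  maxX = max(X1);  minY = min(Y1);  maxY = max(Y1);
    X1n = (X1 - repmat(minX,50,1)) ./ repmat(maxX-minX,50,1);
    Y1n = (Y1 - minY) ./ (maxY - minY);
    X2n = (X2 - repmat(minX,20,1)) ./ repmat(maxX-minX,20,1);

    PR  = [min(X1n)' max(X1n)'];             % input ranges required by newff
    net = newff(PR, [10 1], {'tansig' 'purelin'}, 'trainscg');
    net.trainParam.epochs = 500;
    net.trainParam.goal   = 1e-4;
    net = train(net, X1n', Y1n');            % patterns are presented column-wise
    Y2n = sim(net, X2n')';                   % normalized network output for the test set
    Y2  = Y2n*(maxY - minY) + minY;          % re-normalize back to the original range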

