International Journal of Innovative Technology and Exploring Engineering (IJITEE)

ISSN: 2278-3075, Volume-8, Issue-6S4, April 2019

Neural Network Programming in Python

Primož Podržaj

Abstract—In this paper a basic introduction to neural networks is given. Emphasis is placed on the two-layer perceptron, which is used extensively for function approximation. The backpropagation learning rule is then briefly introduced. A short introduction to the Python programming language is given, and a program for the perceptron design is written and discussed in some detail. The “neurolab” library is used for this purpose.

Index Terms—Neural networks, Perceptron, Python.

Artificial intelligence is quickly becoming ubiquitous in our day-to-day lives as AI systems become more and more capable. They are used in signal processing, control, pattern recognition, medicine, speech production and recognition, and business [1]. Recently, they have also been used extensively in image and video processing tasks [2], [3]. It is therefore important that a wide set of engineers gets at least a basic understanding of the field of artificial intelligence, its advantages and drawbacks. The first and best-known field within artificial intelligence is neural networks. They have been known for quite some time, and various architectures have been developed to solve specific tasks. One of the first was the so-called perceptron, which can, among other things, be used for function approximation. A major breakthrough in the field of neural networks was made when the backpropagation algorithm was introduced [4]. In this paper a Python-based realization of such a network is presented and discussed.

Python is a high-level general-purpose programming language created by Guido van Rossum in 1991. It has a design philosophy that puts emphasis on code readability. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library. The first release was followed by Python 2.0 in 2000 and Python 3.0 in 2008. At the time of writing this paper the latest version is Python 3.7. In comparison with other programming languages such as C/C++, Java, and Fortran, Python is a higher-level language. The computation time is therefore typically a little longer, but it is much easier to program in. Python is, in fact, the programming language with the largest increase in ratings [5]. It is especially popular in educational environments. There is one aspect of Python that always has been, and always will be, the most important in the entire language: readability [6].

Neural networks are defined as networks consisting of many interconnected neurons. A neuron (or nerve cell) is a special biological cell that processes information. Neurons are connected at joints called synapses. A typical neuron together with a synaptic joint is shown in Fig. 1.

Fig. 1: The major structures of a typical neuron (left) and a synaptic joint (right). Labels in the neuron: cell body, axon, dendrites, synaptic terminals. Labels in the synaptic joint: synaptic terminal, synaptic vesicles, synaptic gap.

The human brain, the best known and most capable of neural networks, has some $10^{11}$ neurons. Each neuron has about 10,000 synapses on average. Therefore the total number of connections is around $10^{15}$. The cell body of a neuron sums the incoming signals from the dendrites. A particular neuron will send an impulse to its axon if sufficient input signals are received to stimulate the neuron to its threshold level. However, if the inputs do not reach the required threshold, the input will quickly decay and will not generate any action.

The biological neuron model is the foundation of the artificial neuron, which is shown schematically in Fig. 2.

Fig. 2: The artificial neuron model

The total input to the neuron is defined by the following equation:

$$n = w_{1,1} p_1 + w_{1,2} p_2 + \dots + w_{1,R} p_R + b \qquad (1)$$

Individual inputs $p_1, p_2, \dots, p_R$ are each weighted by the corresponding elements $w_{1,1}, w_{1,2}, \dots, w_{1,R}$ and then summed up with the bias $b$ to form the total input $n$ to the neuron. In matrix notation, Eq. 1 can be written in the following form:

$$n = \mathbf{W} \cdot \mathbf{p} + b \qquad (2)$$
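The weighted sum of Eqs. 1 and 2 can be checked numerically with a few lines of NumPy. The following is a minimal sketch rather than part of the presented program: the weights, inputs and bias are hypothetical values chosen for illustration, and the log-sigmoid is assumed as the function applied to the total input n.

```python
import numpy as np

def logsig(n):
    """Log-sigmoid function (assumed here for illustration)."""
    return 1.0 / (1.0 + np.exp(-n))

def neuron_output(w, p, b, f=logsig):
    """Return f(w . p + b) for a single neuron with R inputs."""
    n = np.dot(w, p) + b  # total input n, as in Eqs. (1) and (2)
    return f(n)

# Hypothetical values for a neuron with R = 3 inputs
w = np.array([0.5, -1.0, 2.0])  # weights w_{1,1}, w_{1,2}, w_{1,3}
p = np.array([1.0, 2.0, 0.5])   # inputs p_1, p_2, p_3
b = 0.5                         # bias

n = np.dot(w, p) + b            # 0.5 - 2.0 + 1.0 + 0.5 = 0.0
a = neuron_output(w, p, b)      # logsig(0.0) = 0.5
```

Writing the sum as a dot product mirrors the matrix form of Eq. 2, so the same function works for any number of inputs R.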



The output of the neuron can then be determined by the following equation:

$$a = f(\mathbf{W} \cdot \mathbf{p} + b) \qquad (3)$$

The function f can be a linear or nonlinear function of n. It is usually called the transfer function. Complicated neural networks are of course composed of many neurons. A basic unit of complicated neural networks is a layer, which is made of one or more parallel neurons. Neurons within the same layer usually have the same transfer function. The output(s) of a neural network can in such a case be determined by the following equation:

$$\mathbf{a} = \mathbf{f}(\mathbf{W} \cdot \mathbf{p} + \mathbf{b}) \qquad (4)$$

The schematic representation of such a neuron setup (a layer) is shown in Fig. 3.

Fig. 3: Schematic representation of a single layer

Neural networks can in general be composed of more than one layer. Based on the connections between layers, neural networks can be divided into intralayer, interlayer or recurrent, as shown in Fig. 4.

Fig. 4: Different types of networks based on connections (intralayer, interlayer, recurrent)

Based on the direction of the connections, interlayer neural networks can further be divided into feedforward and feedback ones, as shown in Fig. 5.

Fig. 5: Feedforward and feedback neural networks

As an example, a schematic representation of a three-layer feedforward neural network is shown in Fig. 6.

Fig. 6: Three-layer feedforward neural network (input, first layer, second layer, third layer)

The output of such a three-layer feedforward neural network is determined by the following equation:

$$\mathbf{a} = \mathbf{f}^3\left(\mathbf{W}^3 \cdot \mathbf{f}^2\left(\mathbf{W}^2 \cdot \mathbf{f}^1\left(\mathbf{W}^1 \cdot \mathbf{p} + \mathbf{b}^1\right) + \mathbf{b}^2\right) + \mathbf{b}^3\right) \qquad (5)$$

This equation extends in a straightforward way to neural networks with more than three layers.

Artificial neural networks can perform different tasks, depending on the types of transfer functions and the neuron interconnections. One of the most common ones (if not the most commonly used one) is the two-layer perceptron shown in Fig. 7.

Fig. 7: A two-layer perceptron used for function approximation (input, logsig layer, linear layer)

In order to design a neural network we also need to determine the number of neurons in the first layer. Common sense says that more neurons should be able to approximate functions better. There is, however, always a certain upper limit associated with this process. Too many neurons can also result in poor function approximation, as demonstrated in Fig. 8.

Fig. 8: Good and poor function approximation
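The nested structure of Eq. 5 amounts to applying Eq. 4 once per layer, with the output of one layer serving as the input of the next. The following NumPy sketch illustrates this under stated assumptions: the helper names `logsig`, `purelin` and `feedforward` are illustrative, the weights are random placeholders, and the example network (one input, three log-sigmoid neurons, one linear output neuron) is a hypothetical instance of the two-layer perceptron of Fig. 7.

```python
import numpy as np

def logsig(n):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-n))

def purelin(n):
    """Linear transfer function (illustrative name)."""
    return n

def feedforward(p, layers):
    """Propagate the input column vector p through a list of layers.

    Each layer is a tuple (W, b, f): weight matrix of shape (S, R),
    bias column of shape (S, 1) and transfer function f.
    """
    a = p
    for W, b, f in layers:
        a = f(W @ a + b)  # Eq. (4) applied to the previous layer's output
    return a

# Hypothetical two-layer perceptron with random placeholder weights
rng = np.random.default_rng(0)
layers = [
    (rng.uniform(-1, 1, (3, 1)), rng.uniform(-1, 1, (3, 1)), logsig),
    (rng.uniform(-1, 1, (1, 3)), rng.uniform(-1, 1, (1, 1)), purelin),
]
p = np.array([[0.5]])
a = feedforward(p, layers)  # column vector of shape (1, 1)
```

Adding a third tuple to `layers` gives the three-layer case of Eq. 5, and longer lists give deeper networks without any change to the function.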


The main application of the two-layer perceptron is function approximation. A very important question, however, still has to be answered. So far, nothing has been said about how to determine the correct values of the parameters w and b so that the neural network functions properly. To do this, a training process must be conducted. Although there is a myriad of training approaches, most of them fall into the following two categories [7]:
• Supervised learning
In this case a network is trained by a sequence of pairs of vectors. The first one is the input vector and the second one the target vector. The weights can be modified at each step (after each pair), or a matrix of all the vectors can be formed and then used for training. The training processes are therefore called incremental or batch learning, respectively [8].
• Unsupervised learning
In this case there are no target vectors. The weights are modified based only on the input vectors.
The algorithm used to train the two-layer perceptron is the so-called backpropagation algorithm (also known as the Delta rule) [9]. It is composed of a forward and a backward stage and is described in some detail in [10]. The main goal of the algorithm is to make the difference between the actual and the target outputs as small as possible.

In order to use Python, we must of course install it. On the Windows platform it is very popular to install it together with Anaconda and then write the program in Jupyter Notebook [11]. Among many other possibilities, the PyCharm software package can be installed to use Python [12].

As already noted in the Introduction, Python is a high-level general-purpose programming language. It is especially popular in the scientific community due to a wide set of freely available libraries, in particular scientific ones (linear algebra, visualization tools, plotting, image analysis, differential equations solving, symbolic computations, statistics etc.). Probably the three most important ones are NumPy, SciPy and Matplotlib [13]. NumPy is a library which adds support for the creation of large, multi-dimensional arrays and matrices, together with a large set of operations working on them [14]. The SciPy software library implements a set of functions for processing scientific data, such as statistics, signal processing, image processing, and function optimization [15]. Matplotlib is, as the name suggests, a library used for plotting data [16]. Besides these common libraries, the Neurolab library, created specifically for neural network implementation [17], will be used. The code starts with the import of the needed libraries:

import numpy as np
import neurolab as nl
import pylab as pl

Then a two-layer neural network is created by the following command:

net = nl.net.newff([[-10, 10]], [15, 1])

The parameters mean that the network expects inputs in the [-10, 10] range. The hidden layer has 15 neurons and the output layer 1 neuron. It should be emphasized that the weights are randomly initialized.

After that we need to create samples for neural network training. Let us say that we want the neural network to approximate the function 0.75 cos(x) in the range [-10, 10]. The input vector x can be created by the following command:

x = np.linspace(-10, 10, 100)

It consists of 100 equally spaced values between -10 and 10. The corresponding output vector y can be easily obtained:

y = 0.75*np.cos(x)

In the next step we must reshape both x and y into column form. The len() function returns the number of items in an object:

size = len(x)

Then we can modify x and y into new vectors inp (input) and trg (target) by the following commands:

inp = x.reshape(size,1)
trg = y.reshape(size,1)

The reshape() function’s arguments are the numbers of rows and columns in the new vector. Now, after the creation of the input and target vectors, we can train the neural network using them. We will use the following command:

error = net.train(inp, trg, epochs=300, show=100, goal=0.01)

The train() function we have used is the main part of the presented program. For its detailed description, it is best to check the documentation [18]. It has several parameters. The first two are the input and the target vector. Then the number of epochs needs to be stated (the default value is 500). The show parameter determines the interval, in epochs, at which the error is printed out (the default value is 100). The goal parameter determines the value of the error at which the training will stop (the default value is 0.01). It is also important to at least know which training algorithm is being used. The default is the “Gradient descent with momentum backpropagation and adaptive learning rate” algorithm (in the Neurolab library it is known as neurolab.train.train_gdx()). All the default values can be obtained by the net.trainf.defaults command. The typical output of the program at this stage is given below:

Epoch: 100; Error: 0.02729823200135833;
Epoch: 200; Error: 0.018311521643474635;
Epoch: 300; Error: 0.01722912104269513;
The maximum number of train epochs is reached

Since the number of epochs was limited to 300, the learning process stopped after 300 epochs without reaching the target error of 0.01. This is, however, by no means the only possibility. As the weights of the neural network are randomly initialized during the network creation, we get a different result every time we rerun the program. We might for example get the following output:

Epoch: 100; Error: 0.03487234318088822;
Epoch: 200; Error: 0.01774028250123779;
The goal of learning is reached
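To make the forward and backward stages of backpropagation, and the effect of random weight initialization, more concrete, the training loop can be sketched in plain NumPy. This is a minimal sketch under stated assumptions and not Neurolab's implementation: it uses plain batch gradient descent (without the momentum and adaptive learning rate of train_gdx), assumes a log-sigmoid hidden layer and a linear output layer, and the function name `train_two_layer`, the learning rate and the error scaling 0.5 * sum(e^2) are illustrative choices.

```python
import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

def train_two_layer(inp, trg, hidden=15, epochs=300, lr=0.01, seed=None):
    """Batch gradient descent for a logsig/linear two-layer perceptron.

    Returns the list of errors (0.5 * sum of squared differences),
    one value per epoch. Illustrative sketch, not Neurolab's train_gdx.
    """
    rng = np.random.default_rng(seed)
    # Random initial weights: without a fixed seed every run starts differently
    W1 = rng.uniform(-1, 1, (hidden, inp.shape[1]))
    b1 = rng.uniform(-1, 1, (hidden, 1))
    W2 = rng.uniform(-1, 1, (1, hidden))
    b2 = rng.uniform(-1, 1, (1, 1))
    P, T = inp.T, trg.T                  # one column per training sample
    N = P.shape[1]
    errors = []
    for _ in range(epochs):
        # Forward stage
        a1 = logsig(W1 @ P + b1)         # hidden layer output, shape (hidden, N)
        a2 = W2 @ a1 + b2                # linear output layer, shape (1, N)
        e = a2 - T
        errors.append(0.5 * np.sum(e ** 2))
        # Backward stage: gradients of the mean squared error
        d2 = e / N                               # linear layer, derivative is 1
        d1 = (W2.T @ d2) * a1 * (1.0 - a1)       # logsig derivative is a(1 - a)
        W2 -= lr * (d2 @ a1.T)
        b2 -= lr * d2.sum(axis=1, keepdims=True)
        W1 -= lr * (d1 @ P.T)
        b1 -= lr * d1.sum(axis=1, keepdims=True)
    return errors

# Training samples prepared as in the program
x = np.linspace(-10, 10, 100)
y = 0.75 * np.cos(x)
size = len(x)
inp = x.reshape(size, 1)
trg = y.reshape(size, 1)

errors_a = train_two_layer(inp, trg, seed=0)
errors_b = train_two_layer(inp, trg, seed=1)  # different initial weights
```

Fixing the seed makes a run repeatable, while two different seeds give two different error histories, which is the behaviour observed when the program is rerun. Plain gradient descent converges slowly on this problem, so the sketch is meant to show the structure of the algorithm rather than to reproduce the error values printed above.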
The error variable on the left side stores the errors obtained during each training iteration. In order to compare the actual output of the neural network with the target, we form the out vector with the following command:

out = net.sim(inp)

The out vector thus stores the actual values of the output of the neural network after training. Now we just need to plot the results. We will plot the results in two subplots (one above the other). The first one will plot the error vector. We can get it with the following commands:

pl.subplot(211)
pl.plot(error)
pl.xlabel('Epoch number')
pl.ylabel('error (default SSE)')

The three-digit parameter in the subplot() function gives the number of rows, the number of columns and the index of the specific subplot. In order to visually analyze the performance of the neural network, we will compare the actual and the target output. This will be done in the second subplot using the following commands:

pl.subplot(212)
pl.plot(inp, trg, '-', inp, out, '.', inp, trg, 'p')
pl.legend(['train target', 'net output'])
pl.show()

The obtained output is shown in Fig. 9.

Fig. 9: The result of the program

The result is by no means always that good. If, for example, we only use 20 values for training, we get the result shown in Fig. 10, despite the training goal being reached.

Fig. 10: The result when only 20 values are used for training

When only 15 values are used, the result is even worse (see Fig. 11).

Fig. 11: The result when only 15 values are used for training

Besides the number of samples used for training, the number of neurons in the hidden layer can also have a big influence on the performance of the neural network. In accordance with intuition, the performance of the neural network will deteriorate as the number of neurons decreases. If, for example, only 3 neurons are used in the hidden layer, we get the result shown in Fig. 12.
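The observation that the training goal can be reached while the approximation is still poor can be quantified by evaluating the approximator on a denser grid than the one used for training. The sketch below illustrates only this evaluation step. As a loudly stated assumption, it replaces the neural network with a stand-in approximator (piecewise-linear interpolation through the training samples via np.interp), so the numbers are not those of the perceptron; the function names and the error scaling 0.5 * sum(e^2) are also illustrative.

```python
import numpy as np

def target(x):
    """Function to be approximated, as in the program."""
    return 0.75 * np.cos(x)

def sse(t, o):
    """Sum squared error, here taken as 0.5 * sum((t - o)^2) (assumption)."""
    return 0.5 * np.sum((t - o) ** 2)

def train_and_dense_error(n_train, n_test=500):
    """Return (error on the training samples, error on a dense grid)."""
    x_train = np.linspace(-10, 10, n_train)
    y_train = target(x_train)

    # Stand-in approximator (NOT a neural network): it reproduces the
    # training samples exactly and interpolates linearly between them.
    def approx(x):
        return np.interp(x, x_train, y_train)

    x_test = np.linspace(-10, 10, n_test)
    return sse(y_train, approx(x_train)), sse(target(x_test), approx(x_test))

# Fewer training samples: the training error stays at zero, but the error
# between the samples grows.
results = {n: train_and_dense_error(n) for n in (100, 20, 15)}
for n, (train_err, dense_err) in results.items():
    print(n, train_err, dense_err)
```

With a trained network, `approx` would be replaced by a call that simulates the network on the dense grid; the comparison of the two error values stays the same.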


Fig. 12: The result with only 3 neurons in the hidden layer

Contrary to intuition, there is also a problem with neural networks having too many neurons. Even if we disregard the longer training times needed when the network has more neurons in the hidden layer, a problem of overfitting might appear as well. If we use a network with 200 neurons in the hidden layer and then analyze it with 500 values in the [-10, 10] interval, we get the result shown in Fig. 13.

Fig. 13: The result with 200 neurons in the hidden layer

We can clearly see that many points are far from the target values, despite the goal of the training being reached.

Neural networks are still a hot topic within the field of artificial intelligence. In this paper a simple two-layer perceptron used for function approximation is implemented in Python using the Neurolab library. All the steps in the program are thoroughly explained. The performance of the network is also analyzed, and the dependence of the network performance on the number of training samples and the number of neurons in the hidden layer is demonstrated. The whole procedure is valuable for anyone starting the study of neural networks and wanting to implement them in Python. The further step into more complex neural networks is facilitated in this way.

1. L. V. Fausett, “Fundamentals of neural networks: architectures, algorithms, and applications,” Prentice-Hall.
2. T. Lindblad, J. M. Kinser, and J. G. Taylor, “Image processing using pulse-coupled neural networks,” Springer.
3. J. Howse, “OpenCV computer vision with Python,” Packt Publishing, 2013.
4. J. L. McClelland, D. E. Rumelhart, and PDP Research Group, “Parallel distributed processing,” Explorations in the Microstructure of Cognition, 2, 216-271, 1986.
5. https://www.tiobe.com/tiobe-index/
6. R. Van Hattem, “Mastering Python,” Packt Publishing, 2016.
7. A. Zilouchian and M. Jamshidi, “Intelligent control systems using soft computing methodologies,” CRC Press, 2001.
8. M. T. Hagan, H. B. Demuth, M. H. Beale, and O. De Jesus, “Neural network design, 2nd Ed.,” Hagan and Demuth, 2013.
9. I. N. Da Silva, D. H. Spatti, R. A. Flauzino, L. H. Bartocci Liboni, and S. F. dos Reis Alves, “Artificial neural networks: A practical course,” Springer, 2017.
10. A. F. Gad, “Practical Computer Vision Applications Using Deep Learning with CNNs,” Apress, 2018.
11. J. P. Mueller, “Beginning programming with Python for dummies,” John Wiley & Sons, 2018.
12. Q. N. Islam, “Mastering PyCharm,” Packt Publishing, 2015.
13. R. Johansson, “Numerical Python: Scientific Computing and Data Science Applications with Numpy, SciPy and Matplotlib, 2nd Ed.,” Apress, 2019.
14. I. Idris, “NumPy: Beginner's Guide, 3rd Ed.,” Packt Publishing Ltd, 2015.
15. J. Nunez-Iglesias, S. van der Walt, and H. Dashnow, “Elegant SciPy: The Art of Scientific Python,” O'Reilly, 2017.
16. D. M. McGreggor, “Mastering matplotlib: A practical guide that takes you beyond the basics of matplotlib and gives solutions to plot complex data,” Packt Publishing, 2015.
17. https://pythonhosted.org/neurolab/
18. https://pythonhosted.org/neurolab/lib.html#neurolab.train.trai

