Unit III Deep Learning Chapter Notes
The term "Artificial neural network" refers to a biologically inspired sub-field of artificial
intelligence modeled after the brain. An Artificial neural network is usually a
computational network based on biological neural networks that construct the structure
of the human brain. Similar to a human brain has neurons interconnected to each other,
artificial neural networks also have neurons that are linked to each other in various
layers of the networks. These neurons are known as nodes.
This tutorial covers the main aspects of artificial neural networks. We will discuss ANNs,
Adaptive Resonance Theory, the Kohonen self-organizing map, building blocks, unsupervised
learning, genetic algorithms, etc.
The typical Artificial Neural Network looks something like the given figure.
Dendrites from the biological neural network represent inputs in artificial neural networks,
the cell nucleus represents nodes, synapses represent weights, and the axon represents the
output.
Biological Neural Network -> Artificial Neural Network
Dendrites -> Inputs
Cell nucleus -> Nodes
Synapse -> Weights
Axon -> Output
There are around 86 billion neurons in the human brain, and each neuron connects to
somewhere between 1,000 and 100,000 other neurons. In the human brain, data is stored
in a distributed manner, and we can extract more than one piece of this data in parallel
when necessary. We can say that the human brain is made up of incredibly powerful
parallel processors.
We can understand the artificial neural network with an example. Consider a digital logic
gate that takes an input and gives an output, such as an "OR" gate, which takes two inputs.
If one or both inputs are "On," the output is "On." If both inputs are "Off," the output is
"Off." Here the output is a fixed function of the input. Our brain does not perform the
same task: the relationship between outputs and inputs keeps changing because the
neurons in our brain are "learning."
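As a small illustration (not part of the original notes), the OR gate above can be written
both as a fixed rule and as a single artificial neuron with hand-picked weights; the weight
and bias values below are one illustrative choice, not the only one.

def or_gate(a, b):
    # fixed input-output rule: output is On (1) if either input is On
    return 1 if (a == 1 or b == 1) else 0

def or_neuron(a, b, w1=1.0, w2=1.0, bias=-0.5):
    # one "neuron": weighted sum plus bias, then a step activation
    s = w1 * a + w2 * b + bias
    return 1 if s > 0 else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, or_gate(a, b), or_neuron(a, b))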
Input Layer:
As the name suggests, it accepts inputs in several different formats provided by the
programmer.
Hidden Layer:
The hidden layer lies between the input and output layers. It performs all the
calculations needed to find hidden features and patterns.
Output Layer:
The input goes through a series of transformations in the hidden layers, and the final
result is conveyed through this layer.
The artificial neural network takes the inputs, computes their weighted sum, and adds a
bias. This computation is represented in the form of a transfer function.
Parallel processing capability:
Artificial neural networks can perform more than one task simultaneously.
Storing data on the entire network:
Unlike traditional programming, data is stored on the whole network, not in a database.
The disappearance of a couple of pieces of data in one place does not prevent the network
from working.
Fault tolerance:
Corruption of one or more cells of the ANN does not prevent it from generating output,
and this feature makes the network fault-tolerant.
Unrecognized behavior of the network:
This is the most significant issue of ANNs. When an ANN produces a solution, it does not
provide insight into why and how, which decreases trust in the network.
Hardware dependence:
Artificial neural networks require processors with parallel processing power, in accordance
with their structure; the network is therefore dependent on suitable hardware.
Difficulty of showing the problem to the network:
ANNs can only work with numerical data, so problems must be converted into numerical
values before being presented to the ANN. The representation chosen here directly
influences the performance of the network and depends on the user's abilities.
Afterward, each input is multiplied by its corresponding weight (these weights are the
details the artificial neural network uses to solve a specific problem). In general terms,
the weights represent the strength of the interconnection between neurons inside the
network. All the weighted inputs are summed inside the computing unit.
If the weighted sum is zero, a bias is added to make the output non-zero, or to otherwise
scale up the system's response. The bias behaves like an extra input fixed at 1 with its own
weight. The total of the weighted inputs can range from 0 to positive infinity, so to keep
the response within the desired limits, a maximum value is benchmarked and the total of
the weighted inputs is passed through the activation function.
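A minimal sketch of the computation just described, assuming a plain Python/NumPy setting
(the inputs, weights, and bias below are illustrative values, not taken from the notes):

import numpy as np

def neuron_output(inputs, weights, bias):
    # each input is multiplied by its corresponding weight and the results are summed
    weighted_sum = np.dot(inputs, weights)
    # the bias makes the output non-zero even when the weighted sum is zero
    total = weighted_sum + bias
    # the activation function keeps the response within the desired limits
    # (a sigmoid here, one of the functions described next)
    return 1.0 / (1.0 + np.exp(-total))

x = np.array([0.5, 0.3, 0.2])   # example inputs
w = np.array([0.4, 0.7, 0.2])   # example weights
print(neuron_output(x, w, bias=0.1))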
The activation function refers to the set of transfer functions used to achieve the desired
output. There are different kinds of activation functions, primarily either linear or
non-linear sets of functions. Some commonly used activation functions are the binary,
linear, and tan hyperbolic sigmoidal activation functions. Let us take a look at each of
them in detail:
Binary:
In the binary activation function, the output is either a one or a zero. To accomplish this,
a threshold value is set up: if the net weighted input of the neuron is greater than the
threshold, the activation function returns 1; otherwise the output is returned as 0.
Sigmoidal Hyperbolic:
The sigmoidal hyperbolic function is generally seen as an "S"-shaped curve. Here the tan
hyperbolic function is used to approximate the output from the actual net input. The
function is defined as:
f(x) = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
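A rough sketch of these two activation functions in plain Python (the threshold value is an
illustrative choice):

import math

def binary_activation(net_input, threshold=0.0):
    # returns 1 if the net weighted input exceeds the threshold, else 0
    return 1 if net_input > threshold else 0

def tanh_activation(net_input):
    # S-shaped curve: (e^x - e^-x) / (e^x + e^-x), output in (-1, 1)
    return math.tanh(net_input)

print(binary_activation(0.7))   # 1
print(tanh_activation(0.7))     # about 0.604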
Feedback ANN:
In this type of ANN, the output is fed back into the network to internally arrive at the
best-evolved result. As per the University of Massachusetts Lowell Centre for Atmospheric
Research, feedback networks feed information back into themselves and are well suited to
solving optimization problems. Internal system error corrections utilize feedback ANNs.
Feed-Forward ANN:
A feed-forward network is a basic neural network comprising an input layer, an output layer,
and at least one hidden layer of neurons. By assessing its output against its input, the
strength of the network can be observed from the collective behavior of the associated
neurons, and the output is decided. The primary advantage of this network is that it learns
to evaluate and recognize input patterns.
A perceptron is a neural network unit that performs a precise computation to detect features
in the input data. The perceptron is mainly used to classify data into two parts; therefore,
it is also known as a linear binary classifier.
The perceptron uses a step function that returns +1 if the weighted sum of its inputs is
greater than or equal to 0, and -1 otherwise.
The activation function is used to map the input to a required range such as (0, 1) or
(-1, 1).
a. In the first step, all the inputs x are multiplied by their weights w.
b. In this step, add all the multiplied values and call the result the weighted sum.
c. In our last step, apply the weighted sum to the correct activation function.
For Example:
A Unit Step Activation Function
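A minimal sketch of these three steps with a unit step activation (the inputs, weights, and
bias below are illustrative, not from the notes):

import numpy as np

def unit_step(weighted_sum):
    # unit step activation: +1 if the weighted sum is >= 0, otherwise -1
    return 1 if weighted_sum >= 0 else -1

def perceptron(inputs, weights, bias):
    # step a: multiply each input x by its weight w
    # step b: add all the multiplied values -> weighted sum
    weighted_sum = np.dot(inputs, weights) + bias
    # step c: apply the weighted sum to the activation function
    return unit_step(weighted_sum)

x = np.array([1.0, 0.0])
w = np.array([0.6, 0.6])
print(perceptron(x, w, bias=-0.5))   # +1 for this example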
There are two types of perceptron architecture, which differ in how the artificial neural
network functions, as follows:
Single-layer perceptron:
This was the first proposed neural model. The neuron's local memory consists of a vector of
weights. The single-layer perceptron computes the sum of the input vector, with each
element multiplied by the corresponding element of the weight vector; the value obtained
is passed as input to an activation function, which produces the output.
o The weights are initialized with random values at the start of training.
o For each element of the training set, the error is calculated as the difference between
the desired output and the actual output. The calculated error is used to adjust the
weights, as sketched below.
o The process is repeated until the error made on the entire training set falls below a
specified limit, or until the maximum number of iterations has been reached.
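A rough sketch of this training procedure for a single perceptron, assuming NumPy and the
classic perceptron weight-update rule (the learning rate, error limit, and the tiny AND-style
dataset are illustrative choices):

import numpy as np

def train_perceptron(X, targets, learning_rate=0.1, error_limit=0.0, max_iterations=100):
    # weights and bias start with random values at the beginning of training
    rng = np.random.default_rng(0)
    weights = rng.normal(size=X.shape[1])
    bias = rng.normal()
    for _ in range(max_iterations):
        total_error = 0.0
        for x, target in zip(X, targets):
            output = 1 if np.dot(x, weights) + bias >= 0 else 0
            # error = desired output - actual output, used to adjust the weights
            error = target - output
            weights += learning_rate * error * x
            bias += learning_rate * error
            total_error += abs(error)
        # stop once the error over the entire training set is below the limit
        if total_error <= error_limit:
            break
    return weights, bias

# illustrative training set: the AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)
print(train_perceptron(X, t))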
Multi-layer perceptron:
MLP networks are used in a supervised learning setting. A typical learning algorithm for
MLP networks is the backpropagation algorithm.
A multilayer perceptron (MLP) is a feed-forward artificial neural network that generates a
set of outputs from a set of inputs. An MLP is characterized by several layers of nodes
connected as a directed graph between the input and output layers. MLP uses
backpropagation for training the network, and it is a deep learning method.
Now, we focus on an MLP implementation for an image classification problem.
# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot = True)

import tensorflow as tf
import matplotlib.pyplot as plt

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 100
display_step = 1

# Network Parameters
n_hidden_1 = 256   # 1st layer num features
n_hidden_2 = 256   # 2nd layer num features
n_input = 784      # MNIST data input (img shape: 28*28)
n_classes = 10     # MNIST total classes (0-9 digits)

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])

# weights layer 1
h = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
# bias layer 1
bias_layer_1 = tf.Variable(tf.random_normal([n_hidden_1]))
# layer 1
layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, h), bias_layer_1))

# weights layer 2
w = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))
# bias layer 2
bias_layer_2 = tf.Variable(tf.random_normal([n_hidden_2]))
# layer 2
layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, w), bias_layer_2))

# weights output layer
output = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
# bias output layer
bias_output = tf.Variable(tf.random_normal([n_classes]))
# output layer
output_layer = tf.matmul(layer_2, output) + bias_output

# cost function
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    logits = output_layer, labels = y))

# optimizer
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)
# optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)

# Plot settings
avg_set = []
epoch_set = []

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples / batch_size)

        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer, feed_dict = {x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost, feed_dict = {x: batch_xs, y: batch_ys}) / total_batch

        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))
        avg_set.append(avg_cost)
        epoch_set.append(epoch + 1)

    print("Training phase finished")

    plt.plot(epoch_set, avg_set, 'o', label = 'MLP Training phase')
    plt.ylabel('cost')
    plt.xlabel('epoch')
    plt.legend()
    plt.show()

    # Test model
    correct_prediction = tf.equal(tf.argmax(output_layer, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Model Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
There are two ways to work with TensorFlow:
o Build graphs and run sessions: do all the set-up first, then execute a session to
evaluate tensors and run operations.
o Create our code and run it on the fly, using an interactive session.
For this first part, we will use the interactive session, which is more suitable for an
environment like a Jupyter notebook.
sess = tf.InteractiveSession()
Creating placeholders
It is best practice to create placeholders before variable assignments when using
TensorFlow. Here we create placeholders for the inputs ("Xs") and outputs ("Ys").
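A minimal sketch of such placeholders, assuming the same MNIST shapes used in the MLP
example above (784 input pixels, 10 output classes); the names x and y are illustrative:

# placeholder for the inputs ("Xs"): each row is a flattened 28x28 image
x = tf.placeholder(tf.float32, shape=[None, 784])
# placeholder for the outputs ("Ys"): one-hot encoded digit labels 0-9
y = tf.placeholder(tf.float32, shape=[None, 10])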