Convolutional Neural Networks: Performance Comparison of Residual Deep Network for the Brain Tumor Detection


CHAPTER 3

CONVOLUTIONAL NEURAL NETWORKS

The convolutional neural network (CNN) is a specialized deep learning architecture that draws inspiration from the human visual cortex. It is a feed-forward network commonly employed for pattern recognition and image processing. CNNs, a subset of deep neural networks (DNNs), are frequently applied to computer vision problems. A CNN contains a variety of trainable layers and uses weight sharing, which reduces the structural complexity of the network model and makes it resemble biological neural networks more closely [8].

Convolutional neural networks, one of the most popular deep learning techniques, have gained popularity because of their exceptional performance in a wide range of computer vision applications, including image classification, object identification, face information extraction, image retrieval, and many more. CNN is an end-to-end learning architecture that learns directly from the gathered data.

3.1 Introduction

An image is made up of pixels. In the RGB model, each pixel has three color components: red, green, and blue. Each component's value typically ranges from 0 (no color) to 255 (maximum intensity).

Traditional neural networks have difficulty with computer vision problems since even a small image contains a huge amount of data. A 512x512 monochrome image has 262,144 pixels. If each pixel intensity of this image is fed separately into a fully connected network, each neuron needs 262,144 weights. A full HD 1920x1080 image would need 2,073,600 weights per neuron. If the images are polychrome, the number of weights is further multiplied by the number of color channels. As a result, it is clear that as image size increases, the total number of free parameters in the network quickly grows to enormous proportions. Models that are too large lead to overfitting and sluggish performance [3].
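As a rough illustration of the numbers quoted above, the weight counts can be computed directly. The sketch below assumes, purely for illustration, a single fully connected layer in which every neuron sees every pixel:

```python
# Rough sketch of the parameter counts discussed above: each neuron of a
# fully connected layer needs one weight per input value.

def weights_per_neuron(height, width, channels=1):
    """Number of weights a single fully connected neuron needs for an image input."""
    return height * width * channels

print(weights_per_neuron(512, 512))        # 262144 for a 512x512 monochrome image
print(weights_per_neuron(1920, 1080))      # 2073600 for a full HD monochrome image
print(weights_per_neuron(1920, 1080, 3))   # 6220800 when the image has 3 color channels
```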


CNNs originate from artificial neural networks (ANNs). Neural networks were originally created to simulate the neural activity of the human brain. The groundbreaking study of the threshold logic unit [4] established a particular kind of connection between artificial and biological neurons in neuroscientific terms. About 100 billion neurons are active simultaneously in the human brain [5]. On more or less serial computers, artificial neurons are implemented as mathematical operations. Developments in engineering and mathematics, rather than biology, serve as the primary inspiration for research into neural networks [2].

Figure 3.1 An artificial neuron [3].

Figure 3.1 [3] depicts an artificial neuron based on the McCulloch-Pitts model. m input parameters x_j are provided to the neuron k, and the neuron has m weight parameters (w_kj). The weight parameters frequently include a bias term with a matched dummy input fixed at 1. The weights and inputs are linearly combined and summed, and the sum is then passed to the activation function of the neuron, which generates the output y_k.
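In symbols, the neuron described above can be written in the standard form below, where x_j denotes the j-th input, w_kj the corresponding weight of neuron k, and φ its activation function, with one dummy input fixed at 1 acting as the bias:

\[
y_k = \varphi\!\left( \sum_{j=1}^{m} w_{kj}\, x_j \right)
\]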

Artificial neurons are put together to form neural networks. Most of the time, the neurons are arranged in layers. In a fully connected feed-forward multi-layer network, each output of a layer of neurons is fed as input to each neuron in the subsequent layer, as shown in Figure 3.2. As a result, while certain layers process data received from other neurons, others handle the original input data.

Figure 3.2 A fully connected multi-layer neural network with an input layer, hidden layers, and an output layer [2].

The number of weights assigned to each neuron is the same as the number of neurons in the layer before it. An input layer, hidden layers, and an output layer are the three types of layers that make up a convolutional neural network [3]. The input layer typically only passes data along without changing it. Most of the computation takes place in the hidden layers. The output layer transforms the activations of the last hidden layer into an output, such as a classification. A multi-layer feed-forward network with at least one hidden layer can be built to compute practically any function and thus serves as a universal approximator. This work primarily covers convolutional and fully connected networks. In contrast to fully connected networks, convolutional networks use parameter sharing and have fewer connections [2].
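To make the contrast concrete, the following sketch compares the free parameters of a fully connected layer with those of a convolutional layer on the same input; the 3x3 kernel size and the 16 filters are arbitrary values chosen only for illustration:

```python
# Illustrative parameter counts (weights only) for a 512x512 single-channel input.
height, width, channels = 512, 512, 1

# Fully connected layer with 16 output neurons: every neuron connects to every pixel.
fc_neurons = 16
fc_params = fc_neurons * (height * width * channels)          # 4,194,304 weights

# Convolutional layer with 16 filters of size 3x3: the same small filter is
# reused (parameter sharing) at every spatial position of the image.
num_filters, kernel = 16, 3
conv_params = num_filters * (kernel * kernel * channels)      # 144 weights

print(fc_params, conv_params)
```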


Figure 3.3 Design and trained convolutional neural network [3].

To train a neural network to approximate target outputs from known inputs, the weights of each neuron are chosen. Analytical resolution of the neuron weights in a multi-layer network is challenging. The back-propagation process makes iteratively solving for the weights simple and efficient. Gradient descent is used as the optimization technique in the traditional version. Gradient descent can take a long time and is not always successful in locating the global minimum of the error, but given the right configuration (also referred to as hyperparameters in machine learning), it generally functions satisfactorily.

An input vector is transmitted forward through the neural network at the algorithm's initial stage. The network neurons' weights have previously been initialized to some values, such as small random numbers. Using a loss function, the output produced by the network is compared to the desired output (which should be known from the training samples). The gradient of the loss function is then calculated. This gradient is also called the error value. When mean squared error is used as the loss function, the output layer error value is simply the difference between the present and desired output.

The error values for the hidden layer neurons are then calculated by propagating the errors back through the network. The chain rule of derivatives can be used to solve the gradients of the hidden neuron loss function. The neuron weights are finally updated by calculating the gradient of the weights and subtracting a fraction of the gradient from the weights. This fraction is referred to as the learning rate [3]. Both fixed and dynamic learning rates are possible. After the weights have been adjusted, the procedure proceeds by running these phases again with different inputs until the weights converge.
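A minimal sketch of the update rule described above, for a single linear neuron trained with mean squared error; all names and numeric values here are illustrative and are not taken from the thesis:

```python
import numpy as np

# One neuron y = w . x trained by gradient descent with mean squared error.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.01, size=3)      # weights initialized to small random values
x = np.array([0.5, -1.0, 2.0])          # one input vector
target = 1.0                            # desired output from the training sample
learning_rate = 0.1

for step in range(100):
    y = w @ x                           # forward pass
    error = y - target                  # gradient of 0.5*(y - target)**2 w.r.t. y
    grad_w = error * x                  # chain rule: dL/dw = dL/dy * dy/dw
    w -= learning_rate * grad_w         # subtract a fraction (the learning rate) of the gradient

print(w @ x)                            # close to the target after the updates
```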


The description above covers online learning, in which the weight updates are determined for every new input. Online learning can result in "zig-zagging" behavior, in which the gradient estimate from a single data point changes course and does not approach the minimum directly. Full-batch learning is a different method of computing the updates, in which the weight updates are computed over the entire dataset [2].

Another crucial component, the activation function, determines the final output of each neuron. To build a successful network, it is crucial to choose this function carefully. Early researchers discovered that vision and other linear systems had serious limitations because they could not resolve problems that were not linearly separable. These kinds of problems can occasionally be solved by linear systems using manually created feature detectors; however, this is not the best application of machine learning. Simply adding layers is also ineffective, because a network made up of linear neurons will always remain linear, regardless of the number of levels it has. Rectified linear units (ReLU) are a simple and efficient approach to building a non-linear network [2].
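A tiny numerical check of the claim that stacking linear neurons stays linear, and that inserting a rectified linear unit breaks this; the matrices here are arbitrary example values:

```python
import numpy as np

W1 = np.array([[1.0, 2.0], [0.5, -1.0]])
W2 = np.array([[2.0, 0.0], [1.0, 1.0]])
x = np.array([3.0, -2.0])

# Two stacked linear layers collapse into one linear layer with weights W2 @ W1.
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# Inserting a rectified linear unit between the layers makes the mapping non-linear.
relu = lambda v: np.maximum(v, 0.0)
print(W2 @ relu(W1 @ x))   # no single weight matrix reproduces this for all inputs
```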

3.2 Core Structure of CNN


A biological notion termed the receptive field served as the basis for CNN's fundamental design. The visual cortex of animals has receptive fields. They perform the function of detectors that are sensitive to specific forms of stimulus, such as edges. They overlap and are spread out over the visual field. Computers can approximate this biological process using a convolutional filtering technique. Convolution can be used to filter images in image processing to create a variety of visible effects. The hand-selected convolutional filter below, acting similarly to a receptive field, recognizes horizontal edges in an image:

 1  2  1
 0  0  0
-1 -2 -1
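The kernel above can be applied to an image with an ordinary 2-D convolution. A minimal sketch using SciPy (an assumed library choice; the input array is a made-up toy image):

```python
import numpy as np
from scipy.signal import convolve2d

# Hand-selected kernel from the text: responds strongly to horizontal edges.
kernel = np.array([[ 1,  2,  1],
                   [ 0,  0,  0],
                   [-1, -2, -1]])

# Toy image: bright upper half, dark lower half, i.e. a single horizontal edge.
image = np.zeros((6, 6))
image[:3, :] = 255

edges = convolve2d(image, kernel, mode="same")
print(edges)   # large magnitudes along the rows where the brightness changes
```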


A convolutional layer of a neural network can be created by combining several convolutional filters. Machine learning is used to train the filters' matrix values as neuron parameters. The multiplication operation of a standard neural network layer is replaced by the convolution operation. The layer's output is typically described as a volume. The dimensions of the activation map determine the volume's height and width, while the number of filters determines the volume's depth.

In comparison to a fully connected neural layer, there are far fewer free parameters because the same filters are applied to every part of the image. The convolutional layer's neurons are only connected to a small portion of the input and share most of their parameters. The parameter sharing that results from convolution ensures translation invariance. An alternate description is to consider the convolutional layer as a fully connected layer whose weights have an infinitely strong prior [2]. This prior forces the neurons to share weights at different spatial locations and to have zero weight outside the receptive field.

3.3 Basic CNN Components


Convolutional neural networks mainly comprise three types of layers, described in the following subsections: the convolutional layer, the pooling layer, and the fully connected layer.

3.3.1 Convolution Layer

The convolutional layer, which has local connections and shared weights, is the central component of the convolutional neural network. The convolutional layer's objective is to learn feature representations of the input. The convolutional layer is made up of several feature maps. Neurons of the same feature map extract local features from different positions in the previous layer, while for a single neuron, local features are extracted from the same position in the different feature maps of the previous layer. The input feature maps are first convolved with a learned kernel to produce a new feature map, and the results are then fed into a nonlinear activation function. Applying different kernels results in distinct feature maps [7].
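A minimal PyTorch sketch of the operation this subsection describes; the framework, kernel size, filter count, and activation are illustrative choices, not values taken from the thesis:

```python
import torch
import torch.nn as nn

# One convolutional layer: 8 learned 3x3 kernels slide over a 1-channel input,
# producing 8 feature maps; the result is passed through a nonlinear activation.
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
activation = nn.ReLU()

x = torch.randn(1, 1, 64, 64)          # batch of one 64x64 grayscale image
feature_maps = activation(conv(x))     # shape: (1, 8, 64, 64)
print(feature_maps.shape)
```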


3.3.2 Pooling Layer

The sampling procedure is analogous to fuzzy filtering. By reducing the size of the feature maps and boosting the robustness of feature extraction, the pooling layer performs a kind of secondary feature extraction. It is typically placed between two convolutional layers. The size of the feature maps in the pooling layer is determined by the movement step (stride) of its kernels. Average pooling and max pooling are the two most commonly used pooling operations. By stacking several convolutional layers and pooling layers, we can extract high-level representations of the inputs [8].
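A small sketch of max pooling with a 2x2 window and stride 2, which halves the height and width of a feature map; the input values are arbitrary:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Max pooling with a 2x2 window and stride 2 (assumes even height and width)."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 8]])
print(max_pool_2x2(fm))   # [[4, 2], [2, 8]]
```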

3.3.3 Fully-Connected Layer

A convolutional neural network's classifier typically consists of one or more fully connected layers. They connect every neuron in the current layer to every neuron in the preceding layer. Fully connected layers do not retain any spatial information. An output layer follows the final fully connected layer. Softmax regression is frequently used for classification applications because it produces a well-behaved probability distribution over the outputs. The SVM is another commonly used technique that can be combined with a CNN to address various classification problems.
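A minimal sketch of the softmax function mentioned above, which turns the raw scores of the final fully connected layer into a probability distribution; the score values are made up:

```python
import numpy as np

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)      # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([2.0, 1.0, 0.1])         # e.g. outputs of the last fully connected layer
print(softmax(scores))                     # ~[0.659, 0.242, 0.099], sums to 1
```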

3.3.4 Pooling and Stride

To make the network easier to handle for classification, it is helpful to reduce the size of the activation maps in the deeper layers of the network. Pooling and stride are two further key ideas in convolutional layers. In general, the deep layers of the network require more filter matrices to distinguish the many high-level patterns, but less information about the precise spatial placement of objects. By decreasing the height and width of the data volume, we can increase its depth while maintaining a tolerable computation time [9].

The size of the data volume can be decreased in two ways. One method is to add a pooling layer after a convolutional layer, which efficiently down-samples the activation maps. By making the detectors less precise about exact positions, pooling also has the additional benefit of increasing the translation invariance of the final network. Pooling, though, has the potential to destroy knowledge of the spatial relationships between pattern components. Max-pooling is a common pooling technique. Simply defined, max-pooling outputs the highest value found in a rectangular region of the activation map [2].

Changing the stride parameter of the convolution operation is another method for reducing the data volume size. Depending on the value of the stride parameter, the convolution output may be calculated for a neighborhood centered on every pixel of the input image (stride 1) or only for every n-th pixel (stride n). According to research, using convolutional layers with a higher stride value often allows pooling layers to be eliminated without sacrificing accuracy. The stride operation is equivalent to pooling with a fixed grid [6].
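The effect of stride (and pooling) on the data volume can be summarized with the standard output-size formula; the helper below illustrates it, with kernel size, stride, and padding values chosen only as examples:

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial size of a convolution or pooling output along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(512, 3, stride=1, padding=1))   # 512: size preserved
print(conv_output_size(512, 3, stride=2, padding=1))   # 256: stride 2 halves the map
print(conv_output_size(512, 2, stride=2))              # 256: equivalent to 2x2 pooling
```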

3.3.5 ReLU
A non-linear activation function, such as the rectified linear activation function, is frequently applied in the convolutional layer. The activations are occasionally treated as a distinct layer between the convolutional layer and the pooling layer. Some systems, like [6], also include a regularization layer known as local response normalization. Local response normalization imitates lateral inhibition, a biological process in which stimulated neurons suppress the activity of nearby neurons. However, alternative regularization methods are more widely used at present, and these are covered in the next section.

3.3.6 Regularization
Regularization describes techniques that add constraints or information to the machine learning system in order to reduce overfitting. A traditional way of using regularization in neural networks is to add a penalty term to the objective/loss function that penalizes particular kinds of weights. Regularization methods specific to deep neural networks come in a variety of forms. The common practice of "dropout" aims to lessen the co-adaptation of neurons. This is accomplished by randomly removing neurons from the network while it is being trained, so that each training sample or mini-batch is processed by a slightly different neural network. As a result, the system is less dependent on any single neuron or connection, and regularization is effectively implemented in a computationally efficient manner [2].
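A minimal sketch of dropout as described above, in its common "inverted" form; the keep probability is an arbitrary illustrative value:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, keep_prob=0.5):
    """Inverted dropout: randomly zero neurons during training and rescale the survivors."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

layer_output = np.array([0.2, 1.5, -0.3, 0.8, 2.1])
print(dropout(layer_output))   # each call drops a (likely different) random subset of neurons
```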


3.3.7 Overfitting


The amount of training data can also be increased to minimize overfitting. When it is impossible to obtain more real samples, data augmentation is used to generate additional samples from the available data. For classification with convolutional networks, this can be done by applying transformations to the input images that do not change the observed object classes but make the task harder for the system. The images can, for instance, be rotated, sub-sampled, or cropped and scaled differently. Noise may also be added to the input images [2].
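A minimal sketch of label-preserving augmentations of the kind listed above (rotation, cropping, added noise); the transformation parameters are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image):
    """Return a randomly transformed copy of an image; the class label stays the same."""
    out = np.rot90(image, k=rng.integers(0, 4))           # rotate by a multiple of 90 degrees
    top, left = rng.integers(0, 4, size=2)
    out = out[top:top + 28, left:left + 28]               # random 28x28 crop of the 32x32 input
    out = out + rng.normal(scale=5.0, size=out.shape)     # add a little noise
    return np.clip(out, 0, 255)

image = rng.integers(0, 256, size=(32, 32)).astype(float)
print(augment(image).shape)   # (28, 28): an extra training sample derived from the original
```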

3.3.8 Training, Test, and Validation Set

Typically, the data is divided into training, validation, and test data sets. The data used to train the network is known as the training data set. The network adjusts its weights and biases to best fit this training data set. The network periodically outputs the training accuracy during training; this is the classification accuracy obtained when the model is applied to the training data set. The network receives this training data set for a predetermined number of rounds. The training accuracy typically increases with each round until it plateaus at a certain point. The actual goal, however, is not particularly good accuracy on the training set, but the best possible accuracy on the test set (test accuracy).

A network's performance is gauged by the test accuracy. The test set is made up of images that were included neither in the network's training nor in its validation. Consequently, it shows how well the network can generalize from known data. The end goal is to produce a network with a high capacity for generalization over unseen data; however, the model with the best test accuracy is not always the one with the best training accuracy. At some point during training, the network may over-adapt to the training data set. This is called overfitting. A validation data set can be given to the network during training to help prevent it from occurring.

The model selection process can take advantage of the validation data set. It makes it possible to test the network's capacity to generalize to fresh image data with each cycle. High validation accuracy is therefore more important than high training accuracy [Witten2000] [Buduma2017] [Goodfellow2016].
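A minimal sketch of the three-way split described above; the 70/15/15 proportions are an illustrative choice, not values taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def split_dataset(num_samples, train_frac=0.70, val_frac=0.15):
    """Shuffle sample indices and divide them into training, validation and test sets."""
    indices = rng.permutation(num_samples)
    n_train = int(train_frac * num_samples)
    n_val = int(val_frac * num_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]          # samples never seen during training or validation
    return train, val, test

train, val, test = split_dataset(1000)
print(len(train), len(val), len(test))        # 700 150 150
```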


3.8.1 Execution Role of Convolutional Neural Networks in Image Category Classification

The fundamental idea behind achieving maximum accuracy in image classification with convolutional neural networks (CNNs) is the execution of the feed-forward (flow) architecture together with the back-propagation algorithm.

Figure 3.4 Image Processing Steps Through Convolutional Neural Networks [3]

The training procedure itself (forward pass, comparison with the desired output via the loss function, back-propagation of the error values, and gradient-descent weight updates governed by the learning rate) is the same as described earlier in this chapter [3].
