Unit3 2023 NNDL
CNN
• Introduction - Components of CNN Architecture - Rectified Linear Unit (ReLU) Layer
• Exponential Linear Unit (ELU, or SELU) - Unique Properties of CNN - Architectures of CNN
• Applications of CNN.
• CNN
• CNNs are a class of Deep Neural Networks that can recognize and classify
particular features from images and are widely used for analyzing visual images.
Their applications range from image and video recognition and image classification
to medical image analysis, computer vision, and natural language processing.
• CNNs achieve high accuracy, which makes them well suited to image recognition.
Image recognition has a wide range of uses across industries such as medical
image analysis, security, recommendation systems, etc.
• The term "Convolution" in CNN denotes the mathematical operation of convolution,
a special kind of linear operation in which two functions are combined to
produce a third function that expresses how the shape of one function is
modified by the other. In simple terms, an image and a filter, each representable
as a matrix, are multiplied element-wise and summed to give an output that is
used to extract features from the image (see the sketch below).
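A minimal sketch of this sliding multiply-and-sum in NumPy (the image and kernel values are made up for illustration; deep learning libraries implement this operation, technically cross-correlation, under the name convolution):

import numpy as np

image = np.array([[1, 2, 0],
                  [0, 1, 3],
                  [4, 1, 1]])   # a tiny 3x3 "image"
kernel = np.array([[1, 0],
                   [0, -1]])    # a tiny 2x2 filter

out = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        # multiply the kernel with the 2x2 patch it covers, then sum
        out[i, j] = np.sum(image[i:i+2, j:j+2] * kernel)
print(out)  # the resulting 2x2 feature map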
CNN
• In the fast-paced world of computer vision and image processing, one problem
consistently stands out: the ability to effectively recognize and classify images.
• As we continue to digitize and automate our world, the demand for systems that
can understand and interpret visual data is growing at an unprecedented rate.
• The challenge is not just about recognizing images – it's about doing so accurately
and efficiently. Traditional machine learning methods often fall short, struggling to
handle the complexity and high dimensionality of image data. This is
where Convolutional Neural Networks (CNNs) come to the rescue.
• CNN architectures are the most popular deep learning framework. CNNs have
shown remarkable success in tackling the problem of image recognition, bringing
a newfound level of precision and scalability.
How Does A Computer Read an Image?
• Consider this image of the New York skyline. Upon first glance you will see a lot
of buildings and colors. So how does the computer process this image?
• To the computer, the image is simply a grid of numbers: each pixel is an intensity
value, and a color image is three such grids, one each for the red, green, and blue
channels.
• Fully connected layer: Fully connected layers are one of the most basic
types of layers in a convolutional neural network (CNN).
• As the name suggests, each neuron in a fully connected layer is
connected to every neuron in the previous layer.
• Fully connected layers are typically used towards the end of a CNN, when
the goal is to take the features learned by the convolutional and pooling
layers and use them to make predictions, such as classifying the input to a
label.
• For example, if we were using a CNN to classify images of animals, the final
fully connected layer might take the features learned by the previous layers
and use them to classify an image as containing a dog, cat, bird, etc. (a
minimal sketch follows).
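A minimal Keras sketch of such a classification head (the feature-vector size of 1024 and the hidden width of 128 are assumptions for illustration, not values from the slides):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(1024,)),            # flattened feature vector from earlier layers (size assumed)
    layers.Dense(128, activation='relu'),   # fully connected: every neuron sees all 1024 inputs
    layers.Dense(3, activation='softmax'),  # one score per class: dog, cat, bird
])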
various kernel filters in CNN
• The purpose of using various kernel filters in Convolutional Neural
Networks (CNNs) is to capture different types of features and patterns
present in the input data, such as images. Each filter specializes in
detecting a specific feature, such as edges, corners, textures, or more
complex shapes. By applying multiple filters of different types, CNNs
can learn a diverse set of features and build a richer representation of
the input data.
kernel filters in CNN
• Some common types of kernel filters and their purposes are shown through
the following examples:
• Edge Detection: Edge detection filters are used to identify abrupt
intensity changes in an image, which correspond to edges. A common
edge detection filter is the Sobel filter. It has two variations, one for
detecting vertical edges and another for detecting horizontal edges.
These filters highlight regions of rapid intensity change in the input
image.
• Example of a Sobel filter for vertical edge detection:
Sobel Filter:
-1 0 1
-2 0 2
-1 0 1
kernel filters in CNN
• Blur (Gaussian) Filter: Gaussian filters are used for blurring or
smoothing an image. They take a weighted average of the pixel values in a
neighborhood to reduce noise and fine details. Gaussian filters are often
used as pre-processing steps.
• Example of a Gaussian filter (in practice the weights are normalized by their sum, 16):
Gaussian Filter:
1 2 1
2 4 2
1 2 1
kernel filters in CNN
• Sharpening Filter: Sharpening filters enhance edges and details in an
image. They work by subtracting a blurred version of the image from
the original image. This amplifies high-frequency components, making
edges more pronounced.
• Example of a sharpening filter:
Sharpening Filter:
 0 -1  0
-1  5 -1
 0 -1  0
kernel filters in CNN
• Embossing Filter: Embossing filters create a 3D effect by emphasizing
the differences between adjacent pixels. They simulate the effect of
light and shadow on a textured surface.
• Example of an embossing filter:
Embossing Filter:
-2 -1 0
-1  1 1
 0  1 2
kernel filters in CNN
• Identity Filters: Identity filters do not alter the input and serve as baseline
filters. They can be used to visualize the regions where different patterns are
detected by other filters.
• Example: Identity Filter (No Change):
0 0 0
0 1 0
0 0 0
• Custom Filters: Custom filters can be designed to detect specific patterns
relevant to a particular task, such as detecting diagonal lines, corners, or
texture patterns.
• Example: Custom Filter for Diagonal Line Detection:
 0 0 -1
 0 1  0
-1 0  0
• Using these various kernel filters, CNNs can learn a diverse set of features at
different scales and orientations. By stacking multiple convolutional layers with
different filters, CNNs become capable of recognizing complex visual patterns and
features, making them a powerful tool for a wide range of computer vision tasks.
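As a minimal sketch, the hand-crafted kernels above can be applied with an off-the-shelf 2D convolution (the 8x8 random image is a stand-in for a real grayscale image):

import numpy as np
from scipy.signal import convolve2d

sobel_v = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])              # vertical-edge kernel from above
gaussian = np.array([[1, 2, 1],
                     [2, 4, 2],
                     [1, 2, 1]]) / 16.0       # blur kernel, normalized to sum to 1

image = np.random.rand(8, 8)                  # stand-in grayscale image

edges  = convolve2d(image, sobel_v,  mode='same', boundary='symm')
smooth = convolve2d(image, gaussian, mode='same', boundary='symm')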
WORKING OF CNN
• Example: Handwritten Digit Recognition
• Imagine you have a CNN trained to recognize handwritten digits (0-9).
The input to the CNN is a grayscale image of a digit.
• Input Image: a grayscale image of the digit "7".
• Convolutional Layer:
• The first layer in the CNN is a convolutional layer. It consists of
multiple filters that slide over the input image. Each filter detects
specific features like edges, corners, or textures.
• As the filters move across the input image, they perform convolutions
to create feature maps that highlight relevant patterns. Each filter
generates a separate feature map.
• Activation Function: After convolution, an activation function
(typically ReLU) is applied element-wise to each feature map. This
introduces non-linearity to the network and helps capture complex
relationships in the data.
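A minimal Keras sketch of such a convolution + ReLU stage (the filter count of 8 and kernel size of 3 are illustrative assumptions):

import numpy as np
from tensorflow.keras import layers

x = np.random.rand(1, 28, 28, 1).astype('float32')  # one 28x28 grayscale digit (stand-in)
conv = layers.Conv2D(filters=8, kernel_size=3, activation='relu')
feature_maps = conv(x)
print(feature_maps.shape)  # (1, 26, 26, 8): one 26x26 feature map per filter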
WORKING OF CNN
• Pooling Layer:
• The next step is pooling (often max pooling). This layer reduces the spatial dimensions of the
feature maps, reducing computational complexity and making the network more robust to
variations in the input.
• Max pooling takes the maximum value from each pooling region and discards the rest.
• Flattening:
• The pooled feature maps are flattened into a 1D vector. This prepares the data for the fully
connected layers.
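A small numeric sketch of 2x2 max pooling followed by flattening (the feature-map values are made up):

import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 5, 7],
                 [1, 0, 3, 2]])

# 2x2 max pooling with stride 2: keep only the largest value in each 2x2 block
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)          # [[6 2]
                       #  [2 7]]
print(pooled.ravel())  # flattened to a 1D vector: [6 2 2 7]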
WORKING OF CNN
• Fully Connected Layers:
• The flattened vector is passed through fully connected layers. These layers learn higher-level
representations of the features, combining information from different parts of the image.
• Output Layer:
• The final fully connected layer leads to the output layer. For digit recognition, this layer typically has
10 neurons (one for each digit). The output values represent the model's confidence in each
possible class (digit).
• Softmax Activation:
• A softmax activation function is applied to the output layer. It converts the output values into a
probability distribution, indicating the likelihood of the input image belonging to each class.
• Prediction:
• The class with the highest probability becomes the predicted label for the input image. In our
example, the CNN predicts that the input image is most likely the digit "7."
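Putting the stages together, a minimal Keras sketch of such a digit classifier (all layer sizes here are illustrative assumptions, not from the slides):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),         # grayscale digit image
    layers.Conv2D(8, 3, activation='relu'),  # convolution + ReLU
    layers.MaxPooling2D(2),                  # max pooling
    layers.Flatten(),                        # flatten to a 1D vector
    layers.Dense(64, activation='relu'),     # fully connected layer
    layers.Dense(10, activation='softmax'),  # probability for each digit 0-9
])
# After training, the prediction is the most probable class:
# digit = model.predict(x).argmax(axis=1)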
WORKING OF CNN
• This entire process of convolution, activation, pooling, flattening, and
fully connected layers constitutes the core working of a Convolutional
Neural Network. The network learns to adjust the filter values during
training to recognize different features in the input data, allowing it to
make accurate predictions for various tasks such as image
recognition.
Various stages of a CNN model with illustrations - PYTHON
• REFER WORD FILE CNN
ReLU
• Vanishing –
• As the backpropagation algorithm advances downwards (or backwards) from
the output layer towards the input layer, the gradients often get smaller
and smaller and approach zero, which eventually leaves the weights of the
initial or lower layers nearly unchanged. As a result, gradient descent
never converges to the optimum. This is known as the vanishing
gradients problem.
• Exploding –
• On the contrary, in some cases, the gradients keep on getting larger and
larger as the backpropagation algorithm progresses. This, in turn, causes
very large weight updates and causes the gradient descent to diverge. This
is known as the exploding gradients problem.
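A rough numeric illustration of both problems: if each of 30 layers scales the gradient by roughly the same factor, the effect compounds geometrically (the per-layer factors below are made up for demonstration):

depth = 30
small_factor, large_factor = 0.25, 4.0  # illustrative per-layer gradient factors

print(small_factor ** depth)  # ~8.7e-19: gradient vanishes, early layers barely update
print(large_factor ** depth)  # ~1.2e18:  gradient explodes, weight updates diverge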
ReLU (Rectified Linear Unit) Activation Function
• ReLU is currently one of the most widely used activation functions. Compared
to sigmoid and tanh, it largely avoids the vanishing gradient problem for
positive inputs, which makes it more effective than those earlier activation
functions in deep networks.
Range: 0 to infinity
Advantages of ReLU:
•All negative values are converted to 0, so no negative activations are
passed forward.
•There is no upper bound on the output (the maximum is infinity), so the
vanishing gradient problem is largely avoided for positive inputs, which
keeps prediction accuracy and training efficiency high.
•It is fast to compute compared to other activation functions.
• Disadvantages of ReLU:
• Dying ReLU Problem: ReLU units can sometimes become inactive
during training, causing the gradient to be zero for all negative inputs.
This is known as the "dying ReLU" problem, which can slow down
learning.
• Unbounded Activation: ReLU doesn't have an upper bound on the
output, which can lead to issues like exploding gradients in deeper
networks.
• Output Instability: ReLU's output can be unstable if the input is not
properly normalized.
• Leaky ReLU Function
• Leaky ReLU addresses the dying ReLU problem by giving negative inputs a
small non-zero slope α (e.g. 0.01) instead of clamping them to zero:
f(x) = x for x > 0, and f(x) = αx otherwise.
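A minimal NumPy sketch of ReLU and Leaky ReLU (alpha = 0.01 is a common default, assumed here):

import numpy as np

def relu(x):
    # negative inputs become 0; positive inputs pass through unchanged
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    # negative inputs keep a small slope alpha, so their gradient never dies completely
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))        # [0.  0.  0.  1.5]
print(leaky_relu(x))  # [-0.02  -0.005  0.  1.5]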
Merits of LeNet Architecture:
3. Low Computational Requirements: Due to its simplicity, LeNet requires fewer
computational resources and can run on less powerful hardware compared to deep
neural networks with many layers.
4. Pioneering Architecture: LeNet laid the foundation for modern CNNs and inspired
the development of more advanced architectures, making it a significant
contribution to the field of deep learning.
Demerits of LeNet Architecture:
3. Poor Performance on Complex Tasks: While effective for digit recognition, LeNet
may not perform well on more challenging computer vision tasks, such as object
detection or image segmentation.
4. Lack of Depth: LeNet's limited depth may result in difficulty capturing
hierarchical features or learning complex data representations, which are
important for many modern tasks.