Student Notes: Convolutional Neural Networks (CNN) Introduction
These notes are taken from the first two weeks of the Convolutional Neural
Networks course (part of the Deep Learning specialization) by Andrew Ng on
Coursera. The course is four weeks long, but I didn’t take notes for the last
two weeks, which cover object localization/detection, face recognition, and
neural style transfer.
Why Convolutions
• Parameter sharing: a feature detector (such as a vertical edge
detector) that’s useful in one part of the image is probably useful in
another part of the image.
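To get a feel for what parameter sharing buys, here is a rough sketch that counts parameters for a convolutional layer versus a fully connected layer mapping the same input to the same output size. The layer sizes (32x32x3 input, six 5x5 filters) and the helper names conv_params and dense_params are my own illustrative assumptions, not numbers from these notes.

```python
def conv_params(f, in_channels, n_filters):
    # Each filter has f * f * in_channels weights plus one bias,
    # and the same filter is reused at every position of the image.
    return (f * f * in_channels + 1) * n_filters


def dense_params(n_in, n_out):
    # Every input unit connects to every output unit, plus one bias per output.
    return (n_in + 1) * n_out


# Hypothetical layer: 32x32x3 input, six 5x5 filters, 28x28x6 output.
conv = conv_params(f=5, in_channels=3, n_filters=6)
dense = dense_params(n_in=32 * 32 * 3, n_out=28 * 28 * 6)
print(conv, dense)
```

Under these assumed sizes the convolutional layer needs a few hundred parameters while the fully connected one needs over fourteen million.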
Convolution Operation
Step 1: overlay the filter on the input, perform element-wise multiplication,
and sum the results.
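The overlay, multiply, and sum step can be sketched in NumPy as below. This is a minimal version assuming a single-channel input, no padding, and a stride of 1. The function name conv2d and the sample image are my own, and the 3x3 vertical edge filter is only an example. Like most deep learning code, it does not flip the filter, so strictly speaking it is cross-correlation.

```python
import numpy as np


def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Overlay the filter on the current window of the input,
            # multiply element-wise, then sum to get one output cell.
            window = image[i:i + f_h, j:j + f_w]
            out[i, j] = np.sum(window * kernel)
    return out


# Sample 6x6 image: bright on the left half, dark on the right half.
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
# Example 3x3 vertical edge detector.
vertical_edge = np.array([[1, 0, -1]] * 3, dtype=float)

result = conv2d(image, vertical_edge)
print(result)
```

The output is 4x4, and the nonzero columns in the middle mark where the vertical edge sits in the input.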
Stride
Stride governs how many cells the filter moves across the input to calculate
the next cell in the result.
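A small sketch of how stride changes where each window starts, assuming a single-channel input and no padding. The function name conv2d_strided and the 7x7 sample input are hypothetical choices for illustration.

```python
import numpy as np


def conv2d_strided(image, kernel, stride=1):
    """Convolution (no padding) where the filter jumps `stride` cells per step."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out_h = (n_h - f_h) // stride + 1
    out_w = (n_w - f_w) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The top-left corner of the window moves by `stride` cells.
            top, left = i * stride, j * stride
            window = image[top:top + f_h, left:left + f_w]
            out[i, j] = np.sum(window * kernel)
    return out


image = np.arange(49, dtype=float).reshape(7, 7)
kernel = np.ones((3, 3))

out_s1 = conv2d_strided(image, kernel, stride=1)
out_s2 = conv2d_strided(image, kernel, stride=2)
print(out_s1.shape, out_s2.shape)
```

With a stride of 2 the filter skips every other position, so the result is smaller than with a stride of 1.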
Padding
Notice that the dimension of the result has changed due to padding. See the
following section on how to calculate the output dimension.
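As a quick check on how padding affects the result size, here is a sketch using zero padding with np.pad together with the standard output-size formula floor((n + 2p - f) / s) + 1. The symbols n (input size), f (filter size), p (padding), and s (stride), and the helper name output_size, are assumptions I am introducing here.

```python
import numpy as np


def output_size(n, f, p=0, s=1):
    # Standard formula: floor((n + 2p - f) / s) + 1
    return (n + 2 * p - f) // s + 1


image = np.ones((6, 6))
# Add a border of one zero-valued cell on every side (p = 1).
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)
print(output_size(n=6, f=3))        # no padding
print(output_size(n=6, f=3, p=1))   # padding that keeps the input size
```

With a 3x3 filter, p = 1 keeps the output the same size as the 6x6 input, while no padding shrinks it.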
When the input has more than one channel (e.g. an RGB image), the filter
should have a matching number of channels. To calculate one output cell,
perform the convolution on each matching channel, then add the results together.
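The multi-channel case can be sketched like this, assuming a height x width x channels layout, no padding, and a stride of 1. The function name conv_volume and the all-ones sample data are hypothetical.

```python
import numpy as np


def conv_volume(volume, kernel):
    """Convolve an (H, W, C) input with one (f, f, C) filter into a 2D map."""
    n_h, n_w, n_c = volume.shape
    f_h, f_w, k_c = kernel.shape
    assert n_c == k_c, "filter must have the same number of channels as the input"
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Multiply channel by channel, then sum over all channels at once.
            window = volume[i:i + f_h, j:j + f_w, :]
            out[i, j] = np.sum(window * kernel)
    return out


rgb = np.ones((6, 6, 3))      # stand-in for an RGB image
kernel = np.ones((3, 3, 3))   # one filter with 3 matching channels

out = conv_volume(rgb, kernel)
print(out.shape)
```

Note that one filter gives a single 2D output no matter how many channels the input has. Stacking the outputs of several filters is what produces multiple output channels.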
1 x 1 Convolution
Shorthand Representation
This simpler representation will be used from now on to represent one
convolutional layer:
Pooling Layer
A pooling layer is used to reduce the size of the representation and to speed
up computation, as well as to make some of the features it detects a bit more
robust.
Common types of pooling are max pooling and average pooling, but these days
max pooling is more common.
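Both pooling types can be sketched with one small function. This is a minimal version for a single 2D feature map, using the size f and stride s hyper-parameters. The function name pool2d, the mode argument, and the sample 4x4 input are my own assumptions.

```python
import numpy as np


def pool2d(x, f=2, s=2, mode="max"):
    """Pool a 2D feature map with window size f and stride s."""
    out_h = (x.shape[0] - f) // s + 1
    out_w = (x.shape[1] - f) // s + 1
    # Max pooling keeps the largest value in each window;
    # average pooling keeps the mean of the window.
    reduce = np.max if mode == "max" else np.mean
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = reduce(window)
    return out


x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]], dtype=float)

max_pooled = pool2d(x, mode="max")
avg_pooled = pool2d(x, mode="avg")
print(max_pooled)
print(avg_pooled)
```

With f = 2 and s = 2 the 4x4 input shrinks to 2x2, which is the size reduction described above.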
Interesting properties of a pooling layer:
• it has hyper-parameters:
o size (f)
o stride (s)