CS60010: Deep Learning CNN - Part 1: Sudeshna Sarkar
CNN – Part 1
Sudeshna Sarkar
Spring 2019
1 Feb 2019
LeNet-5 (LeCun, 1998)
Locally Connected Layer
STATIONARITY? Statistics are similar at
different locations
[Figure: a grayscale image mapped to a feature map; each feature-map unit computes w^T x on a local patch of the image. Source: Ranzato]
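A minimal 1D sketch of the locally connected layer pictured above, assuming stride 1 and one separate weight vector per output location. The function name `locally_connected_1d` and the toy numbers are illustrative assumptions, not taken from the slides.

```python
def locally_connected_1d(x, weights, k):
    """Each output location i applies its own weight vector weights[i]
    to the local patch x[i:i+k] (stride 1, no weight sharing)."""
    n_out = len(x) - k + 1
    assert len(weights) == n_out, "need one weight vector per output location"
    out = []
    for i in range(n_out):
        patch = x[i:i + k]
        # w^T x for this location only
        out.append(sum(w * v for w, v in zip(weights[i], patch)))
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
# Hypothetical weights: location i picks out the i-th element of its patch.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
feature_map = locally_connected_1d(x, weights, k=3)
print(feature_map)  # [1.0, 3.0, 5.0]
```

If the statistics really are similar at different locations, the same weight vector could plausibly be reused at every location, which is what a convolution layer does.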
Output Size
Output size: (N - K) / S + 1
[Figure: input layer -> hidden layer -> output layer]
now:
32 height
32 width
3 depth
Convolution Layer
32x32x3 image
5x5x3 filter
1 number:
the result of taking a dot product between the
filter and a small 5x5x3 chunk of the image
(i.e. 5*5*3 = 75-dimensional dot product + bias)
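The "1 number" above can be sketched in plain Python. This is a minimal illustration, assuming the chunk and the filter are nested lists indexed [row][col][channel]; the name `filter_response` and the constant values are made up for the example.

```python
def filter_response(chunk, filt, bias):
    """Dot product between a filter and an equally sized image chunk, plus a bias."""
    total = 0.0
    for chunk_row, filt_row in zip(chunk, filt):
        for chunk_px, filt_px in zip(chunk_row, filt_row):
            for value, weight in zip(chunk_px, filt_px):
                total += value * weight
    return total + bias


# A 5x5x3 chunk of ones and a 5x5x3 filter of 0.5s (illustrative values).
chunk = [[[1.0] * 3 for _ in range(5)] for _ in range(5)]
filt = [[[0.5] * 3 for _ in range(5)] for _ in range(5)]
value = filter_response(chunk, filt, bias=1.0)
print(value)  # 75 products of 1.0 * 0.5, plus the bias: 38.5
```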
[Figure: one 5x5x3 filter applied to the 32x32x3 image produces a 28x28x1 activation map]
activation maps
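A sketch of how one activation map could be computed, assuming stride 1, no padding, and no filter flipping (the usual convention in deep learning libraries). The function name `convolve_single_filter` and the all-ones data are assumptions for illustration.

```python
def convolve_single_filter(image, filt, bias=0.0):
    """Slide one FxFxC filter over an HxWxC image (stride 1, no padding).
    Returns a 2D activation map of size (H-F+1) x (W-F+1)."""
    height, width = len(image), len(image[0])
    f = len(filt)
    out = []
    for i in range(height - f + 1):
        row = []
        for j in range(width - f + 1):
            total = bias
            for di in range(f):
                for dj in range(f):
                    for c in range(len(filt[di][dj])):
                        total += image[i + di][j + dj][c] * filt[di][dj][c]
            row.append(total)
        out.append(row)
    return out


image = [[[1.0, 1.0, 1.0] for _ in range(32)] for _ in range(32)]  # 32x32x3
filt = [[[1.0, 1.0, 1.0] for _ in range(5)] for _ in range(5)]     # 5x5x3
activation_map = convolve_single_filter(image, filt)
print(len(activation_map), len(activation_map[0]))  # 28 28
```

Each entry of the map is one 75-dimensional dot product; with all-ones data every entry equals 75.0.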
Convolution Layer
[Figure: 32x32x3 input -> 28x28x6 activation maps]
32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6
32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ….
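The shapes in the stack above can be checked with a small helper. This is a sketch assuming stride 1 and no padding; `conv_output_shape` is a hypothetical name, not a library function.

```python
def conv_output_shape(in_shape, num_filters, f, stride=1):
    """Shape of the output volume of a CONV layer without padding.
    Each filter spans the full input depth, so the output depth
    equals the number of filters."""
    height, width, _depth = in_shape
    out_h = (height - f) // stride + 1
    out_w = (width - f) // stride + 1
    return (out_h, out_w, num_filters)


shape = (32, 32, 3)
shapes = [shape]
# (number of filters, filter size) for the two layers shown on the slide
for num_filters, f in [(6, 5), (10, 5)]:
    shape = conv_output_shape(shape, num_filters, f)
    shapes.append(shape)
print(shapes)  # [(32, 32, 3), (28, 28, 6), (24, 24, 10)]
```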
[Figure: 32x32x3 image, 5x5x3 filter -> 28x28x1 activation map]
7x7 input (spatially)
assume 3x3 filter
applied with stride 2
=> 3x3 output!
7x7 input (spatially)
assume 3x3 filter
applied with stride 3?
doesn’t fit!
cannot apply 3x3 filter on
7x7 input with stride 3.
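To see why stride 2 fits and stride 3 does not, one can list where the filter lands along one spatial dimension. A small sketch, assuming the input is at least as wide as the filter; `filter_positions` is an illustrative name.

```python
def filter_positions(n, f, stride):
    """Start offsets of a size-f filter on a length-n input, plus the number
    of trailing input cells that no placement reaches."""
    starts = list(range(0, n - f + 1, stride))
    leftover = n - (starts[-1] + f)
    return starts, leftover


print(filter_positions(7, 3, 2))  # ([0, 2, 4], 0): three placements, exact fit
print(filter_positions(7, 3, 3))  # ([0, 3], 1): one column is left over
```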
Output size:
(N - F) / stride + 1
e.g. N = 7, F = 3:
stride 1 => (7 - 3)/1 + 1 = 5
stride 2 => (7 - 3)/2 + 1 = 3
stride 3 => (7 - 3)/3 + 1 = 2.33 :\
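The formula can be written as a short function. Returning `None` when the filter does not fit is a design choice made for this sketch, and `output_size` is a hypothetical name.

```python
def output_size(n, f, stride):
    """(N - F) / stride + 1, or None when the result is not a whole number."""
    span = n - f
    if span < 0 or span % stride != 0:
        return None  # the filter does not tile the input at this stride
    return span // stride + 1


for stride in (1, 2, 3):
    print(stride, output_size(7, 3, stride))
# 1 5
# 2 3
# 3 None
```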
(recall:)
(N - F) / stride + 1
32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ….
Figure: I. Kokkinos
W2 = (W1 - K) / S + 1 and H2 = (H1 - K) / S + 1
[Figure: image and translated image]