Concepts in Deep Learning


COMPUTER SCIENCE AND ENGINEERING

Course Code: CST 384
Course Name: CONCEPTS IN DEEP LEARNING
Category: VAC        L-T-P: 3-1-0        Credits: 4        Year of Introduction: 2019

Preamble:
This course aims to give the learner an overview of the concepts and algorithms involved in deep learning. Deep learning is a subfield of machine learning, which is itself a subfield of artificial intelligence. Basic concepts and application areas of machine learning, deep networks, convolutional neural networks and recurrent neural networks are covered here. This is a foundational course that will help students understand the capabilities, challenges, and consequences of deep learning and prepare them to participate in the development of leading-edge AI technology. They will gain the knowledge needed to take a definitive step into the world of AI.

Prerequisite: Sound knowledge of the basics of linear algebra and probability theory.

Course Outcomes: After the completion of the course, the student will be able to

CO1 Demonstrate basic concepts in machine learning. (Cognitive Knowledge Level:
Understand)

CO2 Illustrate the validation process of machine learning models using hyper-parameters
and validation sets. (Cognitive Knowledge Level: Understand)

CO3 Demonstrate the concept of the feed forward neural network and its training process.
(Cognitive Knowledge Level: Apply)

CO4 Build CNN and Recurrent Neural Network (RNN) models for different use cases.
(Cognitive Knowledge Level: Apply)

CO5 Use different neural network/deep learning models for practical applications.
(Cognitive Knowledge Level: Apply)


Mapping of course outcomes with program outcomes

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12

CO1

CO2

CO3

CO4

CO5

Abstract POs defined by National Board of Accreditation

PO#  Broad PO                                       PO#   Broad PO

PO1  Engineering Knowledge                          PO7   Environment and Sustainability
PO2  Problem Analysis                               PO8   Ethics
PO3  Design/Development of solutions                PO9   Individual and team work
PO4  Conduct investigations of complex problems     PO10  Communication
PO5  Modern tool usage                              PO11  Project Management and Finance
PO6  The Engineer and Society                       PO12  Lifelong learning


Assessment Pattern

Bloom’s Category     Continuous Assessment Tests          End Semester Examination
                     Test 1 (%)       Test 2 (%)          Marks (%)

Remember             30               30                  30
Understand           40               40                  40
Apply                30               30                  30
Analyse              -                -                   -
Evaluate             -                -                   -
Create               -                -                   -

Mark Distribution

Total Marks     CIE Marks     ESE Marks     ESE Duration
150             50            100           3 hours

Continuous Internal Evaluation Pattern:

Attendance : 10 marks

Continuous Assessment Tests : 25 marks

Continuous Assessment Assignment : 15 marks

Internal Examination Pattern:

Each of the two internal examinations is to be conducted out of 50 marks. The First Internal Examination shall preferably be conducted after completing the first half of the syllabus, and the Second Internal Examination shall preferably be conducted after completing the remaining part of the syllabus.


There will be two parts: Part A and Part B. Part A contains 5 questions (preferably, 2 questions each from the completed modules and 1 question from the partly covered module), each carrying 3 marks, adding up to 15 marks for Part A. Students should answer all questions from Part A. Part B contains 7 questions (preferably, 3 questions each from the completed modules and 1 question from the partly covered module), each carrying 7 marks. Out of the 7 questions in Part B, a student should answer any 5.

End Semester Examination Pattern:

There will be two parts: Part A and Part B. Part A contains 10 questions, with 2 questions from each module, each carrying 3 marks. Students should answer all questions. Part B contains 2 questions from each module, of which a student should answer any one. Each question can have a maximum of 2 sub-divisions and carries 14 marks.

Syllabus

INTRODUCTION TO DEEP LEARNING

(General Instructions: Instructors are to introduce students to any one software platform and
demonstrate the working of the algorithms in the syllabus using suitable use cases and public
datasets to give a better understanding of the concepts discussed. Tutorial hour may be used for this
purpose)

Module-1 (Introduction)

Key components - Data, models, objective functions, optimization algorithms, Learning algorithm.
Supervised learning- regression, classification, tagging, web search, page ranking, recommender
systems, sequence learning, Unsupervised learning, Reinforcement learning, Historical Trends in
Deep Learning. Other Concepts - overfitting, underfitting, hyperparameters and validation sets,
estimators, bias and variance.
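
Module 1's treatment of hyperparameters and validation sets lends itself to a quick demonstration. The sketch below is a minimal illustration of tuning one hyperparameter on a held-out validation set; scikit-learn and the ridge penalty are illustrative assumptions, since the syllabus leaves the choice of software platform open.

```python
# Minimal sketch: tune a hyperparameter on a validation set that is never
# used for fitting. The platform (scikit-learn) and model (ridge regression)
# are assumptions for illustration only.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

# Hold out 25% of the data purely for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Sweep the regularization strength and compare validation scores.
for alpha in [0.01, 0.1, 1.0, 10.0]:
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    print(f"alpha={alpha}: validation MSE = {val_mse:.4f}")
```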

Module-2 (Optimization and Neural Networks)

Neural Networks – Perceptron, gradient descent solution for perceptron, multilayer perceptron, activation functions, architecture design, chain rule, back propagation, gradient-based learning. Introduction to optimization – gradient-based optimization, linear least squares, stochastic gradient descent. Building ML algorithms and challenges.
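
Since the module walks from the perceptron learning rule to stochastic gradient descent, a compact sketch may help fix the idea. This is a minimal NumPy illustration; the toy data and hyperparameters are assumptions, not part of the syllabus.

```python
# Minimal NumPy sketch of perceptron training with stochastic (per-sample)
# updates, illustrating the gradient-descent solution for the perceptron.
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """y must be in {-1, +1}; X has shape (n_samples, n_features)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Update only on misclassified samples (the perceptron rule).
            if yi * (xi @ w + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Linearly separable toy data: logical AND with bipolar labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
print(np.sign(X @ w + b))  # expected: [-1. -1. -1.  1.]
```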


Module-3 (Convolutional Neural Network)

Convolutional Neural Networks – convolution operation, motivation, pooling, structure of CNN, convolution and pooling as an infinitely strong prior, variants of convolution functions, structured outputs, data types, efficient convolution algorithms. Practical challenges of common deep learning architectures – early stopping, parameter sharing, dropout. Case study: AlexNet, VGG, ResNet.
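
A didactic sketch of the two core operations named here, convolution (implemented as cross-correlation, as in most frameworks) and max pooling, follows. The loop-based code favors clarity over efficiency, and the example image and kernel are assumptions.

```python
# Minimal NumPy sketch of 2-D convolution (stride 1, no padding) and
# max pooling; didactic loops, not an efficient kernel.
import numpy as np

def conv2d(image, kernel):
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Cross-correlation, as most DL frameworks implement "convolution".
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(x, size=2):
    h, w = x.shape
    out = np.zeros((h // size, w // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # simple vertical-edge detector
print(max_pool(conv2d(image, edge_kernel)))
```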

Module-4 (Recurrent Neural Network)

Recurrent neural networks – computational graphs, RNN design, encoder-decoder sequence-to-sequence architectures, deep recurrent networks, recursive neural networks, modern RNNs (LSTM and GRU), practical use cases for RNNs.
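
The unrolled computational-graph view of an RNN can be made concrete in a few lines of NumPy. This minimal forward pass (dimensions and random weights are illustrative assumptions) shows the same weight matrices being reused at every time step:

```python
# Minimal NumPy sketch of unrolling a vanilla RNN over a sequence;
# LSTM and GRU add gating on top of this recurrence.
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b_h):
    """x_seq: (T, input_dim). Returns hidden states of shape (T, hidden_dim)."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in x_seq:  # the same weights are reused at every time step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 5, 3, 4
x_seq = rng.normal(size=(T, input_dim))
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)
print(rnn_forward(x_seq, W_xh, W_hh, b_h).shape)  # (5, 4)
```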

Module-5 (Application Areas)

Applications – computer vision, speech recognition, natural language processing; common word embeddings: Continuous Bag-of-Words, Word2Vec, Global Vectors for Word Representation (GloVe). Research areas – autoencoders, representation learning, Boltzmann machines, deep belief networks.
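
As a taste of the word-embedding material, here is a minimal sketch of the CBOW forward pass: context embeddings are averaged and scored against the whole vocabulary. The toy corpus, dimensions and random initialization are assumptions for illustration.

```python
# Minimal NumPy sketch of the Continuous Bag-of-Words (CBOW) forward pass:
# average the context embeddings, then softmax over the vocabulary.
import numpy as np

vocab = ["the", "cat", "sat", "on", "mat"]
word2id = {w: i for i, w in enumerate(vocab)}
V, d = len(vocab), 8

rng = np.random.default_rng(0)
W_in = rng.normal(size=(V, d)) * 0.1   # input (context) embeddings
W_out = rng.normal(size=(d, V)) * 0.1  # output (prediction) weights

def cbow_predict(context_words):
    ids = [word2id[w] for w in context_words]
    h = W_in[ids].mean(axis=0)          # average the context embeddings
    scores = h @ W_out
    e = np.exp(scores - scores.max())   # numerically stable softmax
    return e / e.sum()

# Probability distribution over the center word given its context.
print(cbow_predict(["the", "sat"]))     # context around "cat"
```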

Text Books
1. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2015 ed.
2. Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, Dive into Deep Learning, August 2019.
3. Charu C. Aggarwal, Neural Networks and Deep Learning, Springer International Publishing AG, 2018.

Reference Books
1. Russell Reed and Robert J. Marks II, Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks, A Bradford Book, 2014.
2. Mohit Sewak, Md. Rezaul Karim, and Pradeep Pujari, Practical Convolutional Neural Networks, Packt Publishing, 2018.
3. Sudharsan Ravichandran, Hands-On Deep Learning Algorithms with Python, Packt Publishing, 2019.
4. Francois Chollet, Deep Learning with Python, Manning Publications Co., 2018.


Sample Course Level Assessment Questions

Course Outcome 1 (CO1):

1. Compare regression and classification.
2. Define supervised learning. Distinguish between regression and classification.
3. Discuss the different learning approaches used in machine learning.
Course Outcome 2 (CO2):
1. What are hyperparameters? Why are they needed?
2. What issues are to be considered while selecting a model for applying machine
learning in a given problem?
Course Outcome 3 (CO3):
1. Update the parameter V11 in the given MLP using back propagation, with a learning rate of 0.5 and the sigmoid activation function. Initial weights are given as V11 = 0.2, V12 = 0.1, V21 = 0.1, V22 = 0.3, W11 = 0.5, W21 = 0.2. (A worked sketch follows this list.)

[Figure: a 2-2-1 MLP with inputs 0.6 and 0.9, input-to-hidden weights V, hidden-to-output weights W, and target output 0.8]

2. Draw the architecture of a multi-layer perceptron.


3. Derive the update rules for the parameters of a multi-layer neural network using gradient descent.
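
For question 1 above, the NumPy code below performs one backpropagation update on a 2-2-1 sigmoid network under squared-error loss; it is a minimal sketch that reads the figure as inputs 0.6 and 0.9 with target 0.8, and assumes the duplicated "V11 = 0.2" in the question is a typo.

```python
# Minimal NumPy sketch of one backpropagation update for a 2-2-1 MLP with
# sigmoid activations and squared-error loss (CO3, question 1).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.6, 0.9])                 # inputs from the figure
t = 0.8                                  # target output
V = np.array([[0.2, 0.1], [0.1, 0.3]])   # V[i][j]: weight, input i -> hidden j
W = np.array([0.5, 0.2])                 # hidden-to-output weights
lr = 0.5

# Forward pass.
h = sigmoid(x @ V)
y = sigmoid(h @ W)

# Backward pass; sigmoid derivative is y * (1 - y).
delta_out = (y - t) * y * (1 - y)
delta_hidden = delta_out * W * h * (1 - h)

W -= lr * delta_out * h
V -= lr * np.outer(x, delta_hidden)
print("updated V11:", V[0, 0])
```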

Course Outcome 4 (CO4):


1. Give two benefits of using convolutional layers instead of fully connected ones for
visual tasks.
2. Suppose that a CNN was trained to classify images into different categories. It
performed well on a validation set that was taken from the same source as the
training set but not on a testing set. What could be the problem with the training of
such a CNN? How will you ascertain the problem? How can those problems be
solved?
3. Explain how the cell state is updated in the LSTM model from Ct-1 to Ct.
4. Show the steps involved in using an LSTM to predict stock prices.
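
Questions 3 and 4 refer to the LSTM cell-state update; for reference, a standard formulation (where ⊙ denotes elementwise multiplication) is:

```latex
\begin{aligned}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{input gate} \\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{candidate state} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell-state update} \\
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right), \quad h_t = o_t \odot \tanh(C_t)
\end{aligned}
```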


Course Outcome 5 (CO5):


1. Explain how the cell state is updated in the LSTM model from Ct-1 to Ct.
2. Show the steps involved in using an LSTM to predict stock prices.
3. Illustrate the working of an RNN with an example of a single sequence defined on a vocabulary of four words.
Course Outcome 6 (CO6):
1. Develop a deep learning solution for problems in the domain of (i) natural language processing or (ii) computer vision. (Assignment)
2. Illustrate the working of an RNN with an example of a single sequence defined on a vocabulary of four words.

Model Question Paper

QP CODE:                                                                 Pages: 4

Reg No:_______________

Name:_________________

APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY


SIXTH SEMESTER B.TECH DEGREE EXAMINATION (MINOR), MONTH & YEAR

Course Code: CST 384


Course Name: CONCEPTS IN DEEP LEARNING
Max. Marks: 100                                          Duration: 3 Hours

PART A
Answer all Questions. Each question carries 3 Marks
1. Distinguish between supervised learning and reinforcement learning. Illustrate with an example.

2. Differentiate classification and regression.

3. Compare overfitting and underfitting. How can they affect model generalization?


4. Why can a single perceptron not simulate the simple XOR function? Explain how this limitation is overcome.

5. Illustrate the strengths and weaknesses of convolutional neural networks.

6. Illustrate the convolution and pooling operations with an example.

7. How many parameters are there in AlexNet? Why is the dataset size (1.2 million images) important for the success of AlexNet?

8. Explain your understanding of unfolding a recursive or recurrent computation into a computational graph.

9. Illustrate the use of deep learning concepts in Speech Recognition.

10. What is an autoencoder? Give one application of an autoencoder.


(10x3=30)

Part B
(Answer any one question from each module. Each question carries 14 Marks)

11. (a) “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” What is your understanding of the terms task, performance and experience? Explain with two examples. (10)

(b) How does the bias-variance trade-off affect machine learning algorithms? (4)

OR

12. (a) Illustrate the concepts of web search, page ranking and recommender systems with suitable examples. (10)

(b) List and discuss the different hyperparameters used in fine-tuning traditional machine learning models. (4)

13. (a) How do multilayer neural networks learn and encode higher-level features from input features? (7)

(b) Explain gradient descent and the delta rule. Why is a stochastic approximation to gradient descent needed? (7)

OR

14. (a) Find the new weights for the network using the backpropagation algorithm, given an input pattern [-1, 1] and a target output of +1. Use a learning rate of alpha = 0.3 and the bipolar sigmoid function. (7)

(b) Write an algorithm for backpropagation which uses the stochastic gradient descent method. Comment on the effect of adding momentum to the network. (7)

15. (a) The input to a CNN architecture is a color image of size 112x112x3. The first convolution layer comprises 64 kernels of size 5x5, applied with a stride of 2 and padding 0. What will be the number of parameters? (5)

(b) Let X = [-1, 0, 3, 5] and W = [0.3, 0.5, 0.2, 0.1] be the input and weights of the i-th layer of a neural network, to which a softmax function is applied. What should be the output? (4)

(c) Draw and explain the architecture of a convolutional network. (5)

OR

16. (a) Explain the concepts behind (i) early stopping, (ii) dropout and (iii) weight decay. (9)


(b) How is backpropagation used to learn higher-order features in a convolutional network? (5)
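
A quick numerical cross-check of questions 15(a) and 15(b) is sketched below. In (a), biases are assumed to be counted; in (b), the question's wording is ambiguous, so the softmax is applied to the elementwise products X*W as one plausible reading.

```python
# Minimal sketch for cross-checking questions 15(a) and 15(b); the reading
# of (b) and the inclusion of biases in (a) are assumptions.
import numpy as np

# 15(a): 64 kernels of size 5x5 over a 3-channel input, plus one bias each.
params = 64 * (5 * 5 * 3 + 1)
print(params)  # 4864

# 15(b): numerically stable softmax over the elementwise products.
X = np.array([-1.0, 0.0, 3.0, 5.0])
W = np.array([0.3, 0.5, 0.2, 0.1])
z = X * W
e = np.exp(z - z.max())
print(e / e.sum())
```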

17. (a) Explain the working of an RNN and discuss how backpropagation through time is used in recurrent networks. (8)

(b) Describe the working of long short-term memory (LSTM) in RNNs. (6)

OR

18. (a) What are the vanishing gradient and exploding gradient problems? (8)

(b) Why do RNNs have a tendency to suffer from exploding/vanishing gradients? How can this challenge be overcome? (6)
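
One standard remedy for the exploding-gradient problem raised in question 18 is gradient-norm clipping; a minimal sketch follows (the threshold of 1.0 is an arbitrary assumption).

```python
# Minimal NumPy sketch of global gradient-norm clipping, a common remedy
# for exploding gradients in RNN training.
import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    total = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total + 1e-12))
    return [g * scale for g in grads]

grads = [np.array([3.0, 4.0]), np.array([12.0])]  # global norm = 13
for g in clip_by_global_norm(grads):
    print(g)  # rescaled so the global norm is at most 1
```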

19. (a) Explain any two word embedding techniques. (8)

(b) Explain the merits and demerits of using autoencoders in computer vision. (6)

OR

20. (a) Illustrate the use of representation learning in object classification. (7)

(b) Compare the Boltzmann machine with the deep belief network. (7)

Teaching Plan

CONCEPTS IN DEEP LEARNING (45 Hours)

Module 1 : Introduction (9 hours)

1.1 Key components - Data, models, objective functions, optimization algorithms (TB2: Section 1.1-1.2) 1 hour


1.2 Learning algorithm (TB1: Section 5.1), Supervised learning - regression, classification (TB2: Section 1.3.1) 1 hour

1.3 Tagging, web search, page ranking (TB2: Section 1.3.1) 1 hour

1.4 Recommender systems, sequence learning, unsupervised learning, reinforcement learning (TB2: Section 1.3.2-1.3.4) 1 hour

1.5 Historical trends in deep learning (TB1: Section 1.2) 1 hour

1.6 Concepts: over-fitting, under-fitting, hyperparameters and validation sets (TB1: Section 5.2-5.3) 1 hour

1.7 Concepts: estimators, bias and variance (TB1: Section 5.4) 1 hour

1.8 Demonstrate the concepts of supervised learning algorithms using a suitable platform 1 hour

1.9 Demonstrate the concepts of unsupervised learning using a suitable platform 1 hour

Module 2 : Optimization and Neural Networks (9 hours)

2.1 Perceptron, stochastic gradient descent, gradient descent solution for perceptron (TB3: Section 1.1-1.2.1) 1 hour

2.2 Multilayer perceptron (TB3: Section 1.2.2), (TB1: Section 6.1, 6.3) 1 hour

2.3 Activation functions - sigmoid, tanh, softmax, ReLU, leaky ReLU (TB3: Section 1.2.1.3-1.2.1.5) 1 hour

2.4 Architecture design (TB1: Section 6.4, TB3: Section 1.6) 1 hour

2.5 Chain rule, back propagation (TB3: Section 1.3) 1 hour


2.6 Gradient based learning (TB1: Section 6.2) 1 hour

2.7 Gradient based optimization (TB1: Section 4.3) 1 hour

2.8 Linear least squares using a suitable platform. (TB1: Section 4.5) 1 hour

2.9 Building ML Algorithms and Challenges (TB3: 1.4, TB1: 5.10-5.11) 1 hour

Module 3 : Convolutional Neural Network (10 hours)

3.1 Convolution operation, Motivation, pooling (TB1:Section 9.1-9.3) 1 hour

3.2 Structure of CNN (TB3: Section 8.2) 1 hour

3.3 Convolution and Pooling as an infinitely strong prior (TB1: Section 9.4) 1 hour

3.4 Variants of convolution functions - multilayer convolutional network, tensors, kernel flipping, downsampling, strides and zero padding (TB1: Section 9.5) 1 hour

3.5 Variants of convolution functions - unshared convolutions, tiled convolution, training different networks (TB1: Section 9.5) 1 hour

3.6 Structured outputs, data types (TB1: Section 9.6-9.7) 1 hour

3.7 Efficient convolution algorithms (TB1: Section 9.8, 9.10) 1 hour

3.8 Practical challenges of common deep learning architectures - early stopping (TB3: 4.6) 1 hour

3.9 Practical challenges of common deep learning architectures - parameter sharing, drop-out (TB3: Section 4.9, 4.5.4) 1 hour

3.10 Case study: AlexNet, VGG, ResNet (TB3: Section 8.4.1-8.4.3, 8.4.5) 1 hour


Module 4 : Recurrent Neural Network (8 hours)

4.1 Computational graphs (TB1: Section 10.1) 1 hour

4.2 RNN (TB1: Section 10.2-10.3) 1 hour

4.3 Encoder – decoder sequence to sequence architectures. (TB1: Section 10.4) 1 hour

4.4 Deep recurrent networks (TB1: Section 10.5) 1 hour

4.5 Recursive neural networks, modern RNNs (TB1: Section 10.6, 10.10) 1 hour

4.6 LSTM and GRU (TB1: Section 10.10, TB3: Section 7.5-7.6) 1 hour

4.7 Practical use cases for RNNs. (TB1: Section 11.1-11.4) 1 hour

4.8 Demonstrate the concepts of RNN using a suitable platform. 1 hour

Module 5 : Applications and Research (9 hours)

5.1 Computer vision. (TB1: Section 12.2) 1 hour

5.2 Speech recognition. (TB1: Section 12.3) 1 hour

5.3 Natural language processing. (TB1: Section 12.4) 1 hour

5.4 Common word embeddings: Continuous Bag-of-Words, Word2Vec (TB3: Section 2.6) 1 hour

5.5 Common word embeddings: Global Vectors for Word Representation (GloVe) (TB3: Section 2.9.1 - Pennington 2014) 1 hour

5.6 Brief introduction to current research areas - autoencoders, representation learning (TB3: Section 4.10) 1 hour


5.7 Brief introduction to current research areas - representation learning (TB3: Section 9.3) 1 hour

5.8 Brief introduction to current research areas - Boltzmann machines, deep belief networks (TB1: Section 20.1, TB3: Section 6.3) 1 hour

5.9 Brief introduction to current research areas - deep belief networks (TB1: Section 20.3) 1 hour
