Deep Learning - IIT Ropar - Unit 10 - Week 7
$\hat{f}_2(x) = w_0 + w_1 x^2 + w_2 x^2 + w_4 x^4 + w_5 x^5$
$\hat{f}_1(x)$
$\hat{f}_2(x)$
It is not possible to decide without knowing the true distribution of data points in the dataset.
Yes, the answer is correct.
Score: 1
Accepted Answers:
$\hat{f}_2(x)$
2) We generate the data using the following model: 1 point
$y = 5x^3 + 2x^2 + x + 3$.
We fit the two models $\hat{f}_1(x)$ and $\hat{f}_2(x)$ on this data and train them using a neural network.
https://onlinecourses.nptel.ac.in/noc24_cs114/unit?unit=92&assessment=295 1/4
10/27/24, 1:13 PM Deep Learning - IIT Ropar - - Unit 10 - Week 7
$\hat{f}_1(x)$ has a higher bias than $\hat{f}_2(x)$.
$\hat{f}_1(x)$ has a higher variance than $\hat{f}_2(x)$.
$\hat{f}_2(x)$ has a higher bias than $\hat{f}_1(x)$.
$\hat{f}_2(x)$ has a higher variance than $\hat{f}_1(x)$.
Yes, the answer is correct.
Score: 1
Accepted Answers:
$\hat{f}_1(x)$ has a higher bias than $\hat{f}_2(x)$.
$\hat{f}_2(x)$ has a higher variance than $\hat{f}_1(x)$.
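The accepted answers to Q2 can be checked with a small simulation. The sketch below is only an illustration under several assumptions that are not in the quiz: $\hat{f}_1$ is taken to be a straight line (its definition is not visible in this extract), $\hat{f}_2$ is taken to be a general degree-5 polynomial, both are fitted by ordinary least squares rather than by a neural network, and the data follow $y = 5x^3 + 2x^2 + x + 3$ plus Gaussian noise on a fixed grid in $[-1, 1]$. The helper names are hypothetical.

```python
import random

def true_f(x):
    # Data-generating model assumed for Q2.
    return 5 * x**3 + 2 * x**2 + x + 3

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (pure Python)."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y for the design matrix A[i][j] = x_i**j.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x**i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = aty[i] - sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = s / ata[i][i]
    return coeffs

def predict(coeffs, x):
    return sum(c * x**k for k, c in enumerate(coeffs))

def bias_variance(degree, trials=300, n_points=20, noise=1.0, seed=0):
    """Estimate average squared bias and variance over a grid of test points."""
    rng = random.Random(seed)
    xs = [-1 + 2 * i / (n_points - 1) for i in range(n_points)]  # fixed design
    test_xs = [-0.9 + 0.2 * k for k in range(10)]
    preds = [[] for _ in test_xs]
    for _ in range(trials):
        # Each trial is a fresh noisy dataset drawn from the same model.
        ys = [true_f(x) + rng.gauss(0, noise) for x in xs]
        coeffs = fit_poly(xs, ys, degree)
        for k, x0 in enumerate(test_xs):
            preds[k].append(predict(coeffs, x0))
    bias_sq = var = 0.0
    for x0, p in zip(test_xs, preds):
        mean = sum(p) / len(p)
        bias_sq += (mean - true_f(x0)) ** 2
        var += sum((v - mean) ** 2 for v in p) / len(p)
    return bias_sq / len(test_xs), var / len(test_xs)

bias1, var1 = bias_variance(degree=1)  # stand-in for f_1 (assumed linear)
bias5, var5 = bias_variance(degree=5)  # stand-in for f_2
print(f"degree 1: bias^2={bias1:.3f}, variance={var1:.3f}")
print(f"degree 5: bias^2={bias5:.3f}, variance={var5:.3f}")
```

Under these assumptions the simpler model shows the larger squared bias and the more flexible model shows the larger variance, which agrees with the accepted answers.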
Common Data Q3-Q6
Consider a function $L(w, b) = 0.5w^2 + 5b^2 + 1$ and its contour plot given below:
3) What is the value of $L(w^*, b^*)$, where $w^*$ and $b^*$ are the values that minimize the function?
1.2
No, the answer is incorrect.
Score: 0
Accepted Answers:
(Type: Range) 0.9,1.1
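As a numerical check on the accepted range for Q3, the following sketch runs plain gradient descent on $L(w, b) = 0.5w^2 + 5b^2 + 1$. The starting point, learning rate, and step count are arbitrary choices made for illustration and are not part of the quiz.

```python
def L(w, b):
    return 0.5 * w**2 + 5 * b**2 + 1

def grad_L(w, b):
    # Partial derivatives of L with respect to w and b.
    return w, 10 * b

w, b = 3.0, -2.0   # arbitrary starting point (assumption)
lr = 0.05          # learning rate (assumption)
for _ in range(1000):
    gw, gb = grad_L(w, b)
    w -= lr * gw
    b -= lr * gb

min_value = L(w, b)
print(w, b, min_value)
```

The iterates shrink toward $w = 0$, $b = 0$, where the loss equals the constant term 1, inside the accepted range 0.9 to 1.1.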
4) What is the sum of the elements of $\nabla L(w^*, b^*)$? 1 point
0
Yes, the answer is correct.
Score: 1
Accepted Answers:
(Type: Numeric) 0
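The accepted answer to Q4 follows from the gradient of the function in the Common Data statement, $L(w, b) = 0.5w^2 + 5b^2 + 1$. A short derivation, written here as a sketch:

```latex
\nabla L(w, b)
  = \begin{bmatrix} \partial L / \partial w \\ \partial L / \partial b \end{bmatrix}
  = \begin{bmatrix} w \\ 10b \end{bmatrix},
\qquad
\nabla L(w^*, b^*) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
  \;\Rightarrow\; w^* = 0,\quad b^* = 0,
\qquad
\text{sum of elements} = 0 + 0 = 0 .
```

The gradient vanishes at any minimizer of a differentiable function, so the sum is zero regardless of the particular coefficients.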
5) What is the determinant of $H_L(w^*, b^*)$, where $H$ is the Hessian of the function? 1 point
10
Yes, the answer is correct.
Score: 1
Accepted Answers:
(Type: Numeric) 10
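The determinant in Q5 can be checked without symbolic algebra. The sketch below approximates the Hessian of $L(w, b) = 0.5w^2 + 5b^2 + 1$ with central finite differences at the minimizer $(0, 0)$. The step size `h` and the helper name `hessian` are illustrative choices, not part of the quiz.

```python
def L(w, b):
    return 0.5 * w**2 + 5 * b**2 + 1

def hessian(f, w, b, h=1e-3):
    """Central finite-difference approximation of the 2x2 Hessian of f at (w, b)."""
    f_ww = (f(w + h, b) - 2 * f(w, b) + f(w - h, b)) / h**2
    f_bb = (f(w, b + h) - 2 * f(w, b) + f(w, b - h)) / h**2
    f_wb = (f(w + h, b + h) - f(w + h, b - h)
            - f(w - h, b + h) + f(w - h, b - h)) / (4 * h**2)
    return [[f_ww, f_wb], [f_wb, f_bb]]

H = hessian(L, 0.0, 0.0)
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
print(H, det_H)
```

Because $L$ is quadratic, the Hessian is the same at every point: approximately `[[1, 0], [0, 10]]`, with determinant close to 10.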
6) Compute the eigenvalues and eigenvectors of the Hessian. According to the eigenvalues of the Hessian, which parameter is the loss more sensitive to? 1 point
b
w
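This extract does not show the accepted answer for Q6, so the following is only a worked derivation from the function as stated, $L(w, b) = 0.5w^2 + 5b^2 + 1$. The Hessian is diagonal, so its eigenvalues and eigenvectors can be read off directly:

```latex
H = \begin{bmatrix}
      \partial^2 L / \partial w^2 & \partial^2 L / \partial w\,\partial b \\
      \partial^2 L / \partial b\,\partial w & \partial^2 L / \partial b^2
    \end{bmatrix}
  = \begin{bmatrix} 1 & 0 \\ 0 & 10 \end{bmatrix},
\qquad
\lambda_1 = 1,\; v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \text{ (the } w \text{ direction)},
\qquad
\lambda_2 = 10,\; v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \text{ (the } b \text{ direction)} .
```

The larger eigenvalue belongs to the $b$ direction, so the curvature along $b$ is ten times the curvature along $w$, and a step of the same size changes the loss more when taken in $b$.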
7) Suppose that a model produces zero training error. What happens if we use L2 regularization, in general? 1 point
It might increase training error
It might decrease test error
It might decrease training error
Reduce the complexity of the model by driving less important weights to close to zero
Yes, the answer is correct.
Score: 1
Accepted Answers:
It might increase training error
It might decrease test error
Reduce the complexity of the model by driving less important weights to close to zero
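A toy example can make the first and last accepted answers to Q7 concrete. The sketch below fits a single weight with and without an L2 penalty on noiseless data, so the unregularized fit has exactly zero training error. The data, the penalty strength `lam=10.0`, and the function names are assumptions chosen for illustration.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form minimizer of sum((y - w*x)**2) + lam * w**2 for one weight w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def train_mse(w, xs, ys):
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x for x in xs]  # noiseless toy data: y = 2x

w_plain = fit_ridge_1d(xs, ys, lam=0.0)    # no regularization
w_ridge = fit_ridge_1d(xs, ys, lam=10.0)   # with L2 penalty
mse_plain = train_mse(w_plain, xs, ys)
mse_ridge = train_mse(w_ridge, xs, ys)

print(w_plain, mse_plain)  # 2.0 0.0
print(w_ridge, mse_ridge)  # 1.5 1.875
```

The penalty pulls the weight toward zero and the training error rises from zero. Whether the test error falls depends on the data, which is why the quiz options say "might".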
8) Suppose that we apply Dropout regularization to a feed-forward neural network. Suppose further that the mini-batch gradient descent algorithm is used for updating the parameters of the network. Choose the correct statement(s) from the following statements. 1 point
The weights of the neurons which were dropped during the forward propagation at the $t^{\text{th}}$
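The answer options for Q8 are cut off in this extract, so the sketch below only illustrates the general mechanism the question is about. It performs one plain SGD step on a tiny one-hidden-layer ReLU network with inverted dropout and checks which weights change. It uses a single training example instead of a mini-batch for brevity, a fixed dropout mask instead of a sampled one, and no momentum or weight decay. All names and numbers are hypothetical.

```python
def sgd_step_with_dropout(W, v, x, target, mask, keep_prob, lr):
    """One plain SGD step for y = sum_j v[j] * dropout(relu(W[j] . x))."""
    hidden = len(W)
    pre = [sum(w_jk * x_k for w_jk, x_k in zip(W[j], x)) for j in range(hidden)]
    # Inverted dropout: zero the dropped units, rescale the kept ones.
    act = [max(0.0, pre[j]) * mask[j] / keep_prob for j in range(hidden)]
    y = sum(v_j * h_j for v_j, h_j in zip(v, act))
    err = y - target  # derivative of the loss 0.5 * (y - target)**2 w.r.t. y
    # Gradient w.r.t. v[j] is err * act[j], which is zero for dropped units.
    new_v = [v_j - lr * err * h_j for v_j, h_j in zip(v, act)]
    new_W = []
    for j in range(hidden):
        # The mask also gates the gradient flowing back into unit j.
        gate = mask[j] / keep_prob if pre[j] > 0 else 0.0
        new_W.append([w_jk - lr * err * v[j] * gate * x_k
                      for w_jk, x_k in zip(W[j], x)])
    return new_W, new_v

W = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
v = [0.5, 0.5, 0.5, 0.5]
mask = [1, 0, 1, 0]  # units 1 and 3 are dropped in this iteration (assumption)
new_W, new_v = sgd_step_with_dropout(W, v, x=[1.0, 2.0], target=0.0,
                                     mask=mask, keep_prob=0.5, lr=0.1)
unchanged = [new_W[j] == W[j] and new_v[j] == v[j] for j in range(len(W))]
print(unchanged)  # [False, True, False, True]
```

In this setting the dropped units receive zero gradient, so their incoming and outgoing weights are untouched in that iteration. With momentum or weight decay those weights could still move, so the result should not be read as general.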
9) We have trained four different models on the same dataset using various hyperparameters. The training and validation errors for each model are provided below. Based on this information, which model is likely to perform best on the test dataset? 1 point
Model 1
Model 2
Model 3
Model 4
10) Consider the problem of recognizing a letter (in upper case or lower case) of the English alphabet in an image. There are 26 letters in the alphabet. Therefore, a team decided to use a CNN to solve this problem. Suppose that data augmentation is being used for regularization. Which of the following transformation(s) on all the training images is (are) appropriate to the problem? 1 point