hw1 f21112 Problems11
Homework 1
Please upload your assignments on or before October 1, 2021.
• You are encouraged to discuss ideas with each other. But you must acknowledge your
collaborator, and you must compose your own writeup and code independently.
• We require answers to theory questions to be written in LaTeX. (Figures can be hand-drawn,
but any text or equations must be typeset.) Handwritten homework submissions will not be
graded.
• We require answers to coding questions in the form of a Jupyter notebook. It is important
to include brief, coherent explanations of both your code and your results to show us your
understanding. Use the text block feature of Jupyter notebooks to include explanations.
• Upload both your theory and coding answers in the form of a single PDF on Gradescope.
comment on potential practical issues involved in defining such networks. If not, explain
why not.
3. (3 points) Calculating gradients. Suppose that $z$ is a vector with $n$ elements. We would like
to compute the gradient of $y = \mathrm{softmax}(z)$. Show that the Jacobian of $y$ with respect to $z$, $J$,
is given by
$$J_{ij} = \frac{\partial y_i}{\partial z_j} = y_i (\delta_{ij} - y_j),$$
where $\delta_{ij}$ is the Kronecker delta, i.e., 1 if $i = j$ and 0 otherwise. Hint: Your algebra could be simplified if
you try computing the log derivative, $\partial \log y_i / \partial z_j$.
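Optional sanity check: before writing up the derivation, it can help to confirm the stated identity numerically. The sketch below is only an illustration, assuming NumPy is available. It compares the closed form from Problem 3 against central finite differences on an arbitrary test vector. The helper names (`softmax_jacobian`, `numerical_jacobian`) and the test vector are hypothetical choices, not part of the assignment, and a numerical check is not a substitute for the proof.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D vector."""
    shifted = z - np.max(z)          # subtracting the max does not change the result
    exp = np.exp(shifted)
    return exp / exp.sum()

def softmax_jacobian(z):
    """Closed form from the problem statement: J_ij = y_i (delta_ij - y_j)."""
    y = softmax(z)
    return np.diag(y) - np.outer(y, y)

def numerical_jacobian(f, z, eps=1e-6):
    """Central finite differences; column j approximates dy/dz_j."""
    n = z.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        J[:, j] = (f(z + step) - f(z - step)) / (2 * eps)
    return J

# Arbitrary test vector (an assumption made for illustration only).
z = np.array([1.0, -0.5, 2.0, 0.3])
J_closed = softmax_jacobian(z)
J_numeric = numerical_jacobian(softmax, z)
max_abs_diff = np.max(np.abs(J_closed - J_numeric))
print("max |closed form - finite difference| =", max_abs_diff)
```

Because the entries of y sum to 1, every column of J should also sum to zero, which gives a second quick check on a derivation.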
4. (3 points) Improving the FashionMNIST classifier. Recall that in the first recitation, we trained
a simple logistic regression model to classify MNIST digits. Repeat the same experiment, but
now use a (dense) neural network with three (3) hidden layers with 256, 128, and 64 neurons
respectively, all with ReLU activations. Display the training and test loss curves, and report the test
accuracy of your final model. You may have to tweak the total number of training epochs to
get reasonable accuracy. Finally, draw any 3 image samples from the test dataset, visualize
the predicted class probabilities for each sample, and comment on what you can observe from
these plots.
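To make the architecture in Problem 4 concrete, here is a rough NumPy sketch of the forward pass only. It is not the recitation code and it does no training. It assumes 28 x 28 grayscale inputs flattened to 784 features and 10 output classes, and it uses He-style random initialization and a random stand-in batch purely for illustration. In your submission you would build and train the model in the framework used in the recitation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 784 inputs (28 x 28 images), hidden layers 256-128-64, 10 classes.
layer_sizes = [784, 256, 128, 64, 10]

# One (weights, bias) pair per layer; He-style scaling is an illustrative choice.
params = [
    (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

def forward(x, params):
    """Return class probabilities for a batch x of shape (batch, 784)."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)                   # dense layer + ReLU
    W, b = params[-1]
    logits = h @ W + b                                   # output layer, no ReLU
    logits = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)          # row-wise softmax

batch = rng.random((3, 784))     # stand-in for 3 flattened test images
probs = forward(batch, params)
n_params = sum(W.size + b.size for W, b in params)
print(probs.shape, n_params)
```

Each row of `probs` is the kind of vector you would plot as a bar chart over the class labels when visualizing the predicted class probabilities for a test sample.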
5. (4 points) Implementing back-propagation in Python from scratch. Open the (incomplete)
Jupyter notebook provided as an attachment to this homework in Google Colab (or another Python
IDE of your choice) and complete the missing items.
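A common way to test a from-scratch back-propagation implementation is a finite-difference gradient check. The sketch below is independent of the provided notebook, whose contents are not reproduced here. It uses a hypothetical single ReLU layer with a squared-error loss, and the names `loss_and_grad` and `numerical_grad` are illustrative assumptions. The same pattern should carry over to whatever functions the notebook asks you to complete.

```python
import numpy as np

def loss_and_grad(W, x, t):
    """Loss 0.5 * ||relu(W x) - t||^2 and its gradient with respect to W."""
    a = W @ x                        # pre-activation
    h = np.maximum(a, 0.0)           # ReLU
    diff = h - t
    loss = 0.5 * np.sum(diff ** 2)
    grad_a = diff * (a > 0)          # back-propagate through the ReLU
    grad_W = np.outer(grad_a, x)     # chain rule: dL/dW = grad_a x^T
    return loss, grad_W

def numerical_grad(W, x, t, eps=1e-6):
    """Central finite-difference approximation of dL/dW, one entry at a time."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        W_plus = W.copy()
        W_plus[idx] += eps
        W_minus = W.copy()
        W_minus[idx] -= eps
        loss_plus, _ = loss_and_grad(W_plus, x, t)
        loss_minus, _ = loss_and_grad(W_minus, x, t)
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# Small random problem (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
x = rng.normal(size=3)
t = rng.normal(size=4)

_, g_analytic = loss_and_grad(W, x, t)
g_numeric = numerical_grad(W, x, t)
rel_error = np.linalg.norm(g_analytic - g_numeric) / (
    np.linalg.norm(g_analytic) + np.linalg.norm(g_numeric) + 1e-12
)
print("relative error:", rel_error)
```

A very small relative error suggests the analytic gradient is consistent with the loss. A large one usually points to a sign, transpose, or activation-derivative mistake.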