Deep Learning Basics Lecture 9: Recurrent Neural Networks
• Examples
• Sentiment analysis
• Machine translation
Example: machine translation
Figure from the paper “DenseCap: Fully Convolutional Localization Networks for Dense Captioning”, by Justin Johnson, Andrej Karpathy, Li Fei-Fei
Computational graphs
A typical dynamic system
$s^{(t+1)} = f(s^{(t)}; \theta)$
Figure from Deep Learning, Goodfellow, Bengio and Courville
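As a concrete illustration, unrolling this recurrence just applies the same map repeatedly. A minimal sketch, where the particular $f$ and $\theta$ below are made-up choices for demonstration, not from the lecture:

```python
import numpy as np

# Toy dynamic system s^(t+1) = f(s^(t); theta).
# This f and theta are hypothetical; any fixed function of the state works.
theta = 0.9

def f(s, theta):
    return theta * s

s = np.array([1.0, 2.0])  # initial state s^(0)
for t in range(3):
    s = f(s, theta)       # the SAME f and theta applied at every step
print(s)                  # s^(3) = f(f(f(s^(0))))
```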
A system driven by external data
$s^{(t+1)} = f(s^{(t)}, x^{(t+1)}; \theta)$
Figure from Deep Learning, Goodfellow, Bengio and Courville
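The same sketch extends naturally once an external input enters each update; again the particular $f$ below is a hypothetical choice, shown only to illustrate the new signature:

```python
# Extending the toy system above with an external input sequence.
theta = 0.9

def f(s, x, theta):
    return theta * s + x  # the update now also consumes the current input

xs = [1.0, -0.5, 2.0]   # external inputs x^(1), x^(2), x^(3)
s = 0.0                 # initial state s^(0)
for x in xs:
    s = f(s, x, theta)  # same f and theta at every step, new input each step
print(s)
```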
Compact view
$s^{(t+1)} = f(s^{(t)}, x^{(t+1)}; \theta)$
Figure from Deep Learning, Goodfellow, Bengio and Courville
The square denotes a one-step time delay.
Key: the same $f$ and $\theta$ are used for all time steps
Recurrent neural networks (RNN)
Recurrent neural networks
• Use the same computational function and parameters across all time steps of the sequence
• Each time step: take the current input entry and the previous hidden state to compute the new hidden state and the output entry
• Loss: typically computed at every time step
Figure: an RNN unrolled over time, showing the input, state, output, and loss (compared against the label) at each time step.
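A minimal numpy sketch of this forward pass. All sizes, the random initialization, and the per-step softmax cross-entropy below are illustrative assumptions, not values from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, d_out, T = 4, 8, 3, 5   # illustrative sizes and sequence length

# Shared parameters, reused at every time step
W_xh = rng.normal(scale=0.1, size=(d_hid, d_in))
W_hh = rng.normal(scale=0.1, size=(d_hid, d_hid))
W_hy = rng.normal(scale=0.1, size=(d_out, d_hid))
b_h = np.zeros(d_hid)
b_y = np.zeros(d_out)

xs = rng.normal(size=(T, d_in))      # input sequence
ys = rng.integers(0, d_out, size=T)  # a label at every time step

h = np.zeros(d_hid)                  # initial hidden state
total_loss = 0.0
for t in range(T):
    # new hidden state from the current input entry and the previous state
    h = np.tanh(W_xh @ xs[t] + W_hh @ h + b_h)
    logits = W_hy @ h + b_y          # output entry at step t
    p = np.exp(logits - logits.max())
    p /= p.sum()                     # softmax over output classes
    total_loss += -np.log(p[ys[t]])  # cross-entropy, computed every step

print(total_loss / T)
```

Note how the loop body never indexes into a per-step parameter: the same `W_xh`, `W_hh`, `W_hy` are applied at every step, which is exactly the weight sharing emphasized in the compact view above.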
• Many variants
• Information about the past can be carried in many other forms
• Output only at the end of the sequence
Example: use the output at the previous step as the information passed to the next step
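A sketch of the output-at-the-end variant (e.g., sentiment analysis), reusing the hypothetical parameters from the forward-pass example above:

```python
# Variant: a single output at the end of the sequence.
# Reuses xs, W_xh, W_hh, W_hy, b_h, b_y, d_hid, T from the sketch above.
h = np.zeros(d_hid)
for t in range(T):
    h = np.tanh(W_xh @ xs[t] + W_hh @ h + b_h)
logits = W_hy @ h + b_y  # one output entry and one loss for the whole sequence
```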