Lab RNN Intro
RNN Tasks
(figures: RNN input/output task configurations, from vanilla RNNs to sequence-to-sequence)
e.g. Translation: sequence of words → sequence of words
Source: CS231n Lecture 10
RNN Model
Vanilla RNN Model
●
Variables: input x^(t), hidden state h^(t), output y^(t)
●
Current state depends on current inputs and previous state
●
RNNs can yield outputs at each time step
h^(t) = f_{W_hh}( h^(t−1), f_{W_ih}( x^(t) ) )
y^(t) = f_{W_ho}( h^(t) ), ∀ t ∈ {1 … τ}
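Below is a minimal NumPy sketch of these equations, assuming tanh for the hidden-state nonlinearity and a plain linear readout; the weight names W_ih, W_hh, W_ho mirror the subscripts above but are otherwise illustrative, and the two nested functions are folded into the usual single-nonlinearity update.

```python
# Minimal NumPy sketch of the vanilla RNN equations above.
# Assumptions (not from the slides): f is tanh, the output is a linear map,
# and the weight names mirror the subscripts in the formulas.
import numpy as np

def rnn_forward(x_seq, W_ih, W_hh, W_ho, b_h, b_o, h0):
    """Run a vanilla RNN over a sequence of inputs x_seq (shape (tau, input_dim))."""
    h = h0
    outputs = []
    for x_t in x_seq:                          # unfold in time: t = 1 .. tau
        # h^(t) depends on the previous state and the current input
        h = np.tanh(W_hh @ h + W_ih @ x_t + b_h)
        # an output y^(t) is produced at every timestep
        outputs.append(W_ho @ h + b_o)
    return outputs, h

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, tau = 3, 5, 2, 4
outs, h_final = rnn_forward(
    rng.normal(size=(tau, input_dim)),
    rng.normal(size=(hidden_dim, input_dim)),
    rng.normal(size=(hidden_dim, hidden_dim)),
    rng.normal(size=(output_dim, hidden_dim)),
    np.zeros(hidden_dim), np.zeros(output_dim),
    np.zeros(hidden_dim),
)
```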
Unfolding RNN in time
(figures: the same RNN cell unrolled across timesteps)
Backpropagation through time
(figure: forward through the entire sequence to compute the loss, then backward through the entire sequence to compute the gradient)
●
A truncated variant is used in practice
●
Summary of the algorithm (sketched in code below):
– Present a sequence of k1 timesteps of input and output pairs to the network.
– Unroll the network, then calculate and accumulate errors across k2 timesteps.
– Roll up the network and update the weights.
– Repeat.
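A hedged PyTorch sketch of this loop with k1 = k2 = K: the model, readout, and dimensions are illustrative assumptions, not from the slides. The hidden state is detached after each K-step chunk, so gradients never flow further back than one chunk.

```python
# Truncated BPTT sketch: present K timesteps, accumulate the loss over them,
# backpropagate, update, then detach the hidden state and continue.
import torch
import torch.nn as nn

def train_tbptt(model, x, y, K=20, epochs=1, lr=1e-2):
    """x: (T, 1, input_dim), y: (T, 1, output_dim) - one long sequence."""
    readout = nn.Linear(model.hidden_size, y.size(-1))
    opt = torch.optim.SGD(list(model.parameters()) + list(readout.parameters()), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        h = torch.zeros(1, 1, model.hidden_size)
        for start in range(0, x.size(0), K):
            x_chunk = x[start:start + K]
            y_chunk = y[start:start + K]
            opt.zero_grad()
            out, h = model(x_chunk, h)      # unroll K steps
            loss = loss_fn(readout(out), y_chunk)
            loss.backward()                 # gradients only flow K steps back
            opt.step()
            h = h.detach()                  # "roll up": cut the graph here
    return model

# Usage: a tiny RNN trained on a random toy sequence.
rnn = nn.RNN(input_size=3, hidden_size=8)
T = 100
train_tbptt(rnn, torch.randn(T, 1, 3), torch.randn(T, 1, 1), K=10)
```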
Teacher Forcing and Warm-start
●
When training an RNN to generate a sequence, the predictions (outputs y^(t)) of the RNN cell are often fed back as the input to the cell at the next timestep
●
Teacher Forcing: at training time, use the ground-truth targets of the sequence, instead of the RNN's predictions, as inputs to the next step (sketched in code below)
●
Warm-start: when using an RNN to predict the next value conditioned on previous predictions, it is sometimes necessary to give the RNN some context (known ground-truth elements) before letting it predict on its own
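A minimal sketch contrasting the two ideas with an nn.LSTMCell; the cell, readout, and dimensions are illustrative assumptions. During training, the ground-truth target is fed as the next input (teacher forcing); at generation time, the cell is warm-started on a known prefix and then fed its own predictions.

```python
# Teacher forcing (training) vs. warm-start generation (inference) sketch.
import torch
import torch.nn as nn

hidden_dim, feat_dim = 16, 1
cell = nn.LSTMCell(feat_dim, hidden_dim)
readout = nn.Linear(hidden_dim, feat_dim)

def teacher_forced_loss(targets):
    """Training: the ground-truth value at step t is the input at step t+1."""
    h = torch.zeros(1, hidden_dim)
    c = torch.zeros(1, hidden_dim)
    loss, inp = 0.0, targets[0]
    for t in range(1, len(targets)):
        h, c = cell(inp, (h, c))
        pred = readout(h)
        loss = loss + ((pred - targets[t]) ** 2).mean()
        inp = targets[t]            # teacher forcing: feed the target, not pred
    return loss / (len(targets) - 1)

def generate(context, n_steps):
    """Inference: warm-start on known context, then feed back own predictions."""
    h = torch.zeros(1, hidden_dim)
    c = torch.zeros(1, hidden_dim)
    for x in context:               # warm-start: condition on ground truth
        h, c = cell(x, (h, c))
    preds, inp = [], readout(h)
    for _ in range(n_steps):
        preds.append(inp)
        h, c = cell(inp, (h, c))    # feed back the previous prediction
        inp = readout(h)
    return torch.stack(preds)

seq = torch.randn(10, 1, feat_dim)  # toy sequence: 10 steps, batch size 1
loss = teacher_forced_loss(seq)
future = generate(seq[:5], n_steps=5)
```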
LSTM
LSTM Cell
Img source: https://medium.com/@kangeugine/
●
Input Gate (i in (0, 1) – sigmoid) – scales input to cell (write)
●
Output Gate (o in (0, 1) – sigmoid) – scales output from cell (read)
●
Forget Gate (f in (0, 1) – sigmoid) – scales old cell values (reset memory)
LSTM Cell - Equations
i_t = σ( θ_xi x^(t) + θ_hi h^(t−1) + b_i )
f_t = σ( θ_xf x^(t) + θ_hf h^(t−1) + b_f )
o_t = σ( θ_xo x^(t) + θ_ho h^(t−1) + b_o )
g_t = tanh( θ_xg x^(t) + θ_hg h^(t−1) + b_g )
c_t = f_t ⊙ c^(t−1) + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t), where ⊙ is elementwise multiplication
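A direct NumPy transcription of the equations above for a single timestep; the parameter shapes and the random toy initialization are illustrative assumptions.

```python
# One LSTM step, following the slide's equations. Weight names mirror the
# thetas above; shapes and initialization are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step. p is a dict of parameters theta_* and biases b_*."""
    i = sigmoid(p["theta_xi"] @ x_t + p["theta_hi"] @ h_prev + p["b_i"])  # input gate
    f = sigmoid(p["theta_xf"] @ x_t + p["theta_hf"] @ h_prev + p["b_f"])  # forget gate
    o = sigmoid(p["theta_xo"] @ x_t + p["theta_ho"] @ h_prev + p["b_o"])  # output gate
    g = np.tanh(p["theta_xg"] @ x_t + p["theta_hg"] @ h_prev + p["b_g"])  # candidate
    c = f * c_prev + i * g          # elementwise: forget old memory, write new
    h = o * np.tanh(c)              # expose a gated view of the cell state
    return h, c

# Toy usage with random parameters.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
params = {}
for gate in "ifog":
    params[f"theta_x{gate}"] = rng.normal(size=(n_hid, n_in))
    params[f"theta_h{gate}"] = rng.normal(size=(n_hid, n_hid))
    params[f"b_{gate}"] = np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), params)
```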
LSTMs in practice
●
Sutskever et al., Sequence to Sequence Learning with Neural Networks, NIPS 2014
– Models are huge :-)