Neural Language Model, RNNs: Pawan Goyal


Neural Language Model, RNNs

Pawan Goyal

CSE, IIT Kharagpur

February 8th, 2022

Pawan Goyal (IIT Kharagpur) Neural Language Model, RNNs February 8th, 2022 1 / 38
Storage Problems with n-gram Language Model

[Handwritten annotations, largely illegible. Legible fragments: "parameters?", "Large n-grams used".]
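
As a rough illustration of the storage problem: a full n-gram table over a vocabulary of size V can hold up to V^n entries, one per possible n-gram. The sketch below only computes that upper bound. The vocabulary size of 10,000 and the helper name `max_ngram_parameters` are assumptions chosen for illustration, not figures from the slides.

```python
def max_ngram_parameters(vocab_size: int, n: int) -> int:
    """Upper bound on the number of distinct n-grams (table entries)."""
    return vocab_size ** n

# Assumed vocabulary size, chosen only for illustration.
V = 10_000
for n in (1, 2, 3, 4):
    print(f"n={n}: up to {max_ngram_parameters(V, n):.1e} parameters")
```

The bound grows exponentially in n, which is why large n quickly becomes impractical to store.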
Neural LM?

[Handwritten sketch, largely illegible. Legible fragments: "word", "avg", "d-dim", "preserve the order".]
A fixed-window neural language model
A fixed-window neural language model
[Handwritten annotations, largely illegible. Legible fragments: "hidden", "student", "words?".]
A fixed-window neural language model


[Handwritten annotations, largely illegible. Legible fragment: "Jane opened their".]
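
The printed content of these slides did not survive extraction, so the following is only a minimal sketch of what a fixed-window neural language model commonly looks like: the embeddings of the window words are concatenated (which preserves word order), passed through one hidden layer, and mapped to a softmax over the vocabulary. All names (`E`, `W`, `U`, `b1`, `b2`), the tanh hidden activation, and the toy sizes are assumptions, not taken from the slides.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(z_i - m) for z_i in z]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    """Small random matrix, used only to get a runnable example."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def fixed_window_lm(window_ids, E, W, b1, U, b2):
    """Return a distribution over the next word given the window words."""
    # Concatenate the embeddings in order, so word order is preserved.
    e = [x for idx in window_ids for x in E[idx]]
    h = [math.tanh(a + b) for a, b in zip(matvec(W, e), b1)]
    return softmax([o + b for o, b in zip(matvec(U, h), b2)])

# Hypothetical sizes: vocabulary 5, embedding dim 4, window 3, hidden dim 6.
rng = random.Random(0)
V, d, window, hidden = 5, 4, 3, 6
E = rand_matrix(V, d, rng)                 # one d-dim embedding per word
W = rand_matrix(hidden, window * d, rng)   # size depends on the window length
U = rand_matrix(V, hidden, rng)
b1, b2 = [0.0] * hidden, [0.0] * V

probs = fixed_window_lm([2, 0, 3], E, W, b1, U, b2)
print(len(probs), round(sum(probs), 6))
```

Note that the shape of `W` is tied to the window length, so enlarging the window means enlarging the model.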

Recurrent Neural Networks
[Handwritten diagram, largely illegible. Legible labels: "W", "U".]
Core Idea
Apply the same weights repeatedly!
[Handwritten note: accumulating all info]
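
A minimal sketch of the core idea, assuming a tanh update and tiny hand-picked parameters (the values of `W`, `U`, `b` and the helper names `rnn_step` and `encode` are hypothetical): one step function with one set of weights is applied at every position, so sequences of any length yield a state of the same size.

```python
import math

def rnn_step(h_prev, x, W, U, b):
    """One recurrent step: the new state mixes the old state and the input."""
    return [
        math.tanh(
            b_i
            + sum(w * h for w, h in zip(W_row, h_prev))
            + sum(u * x_j for u, x_j in zip(U_row, x))
        )
        for W_row, U_row, b_i in zip(W, U, b)
    ]

# Hypothetical tiny parameters: 2-dim state, 2-dim inputs.
W = [[0.5, -0.2], [0.1, 0.3]]
U = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]

def encode(sequence):
    """Fold a whole sequence into one state vector."""
    h = [0.0, 0.0]  # initial state
    for x in sequence:  # the same W, U, b are reused at every step
        h = rnn_step(h, x, W, U, b)
    return h

short = encode([[1.0, 0.0]])
long = encode([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]])
print(len(short), len(long))
```

Unlike the fixed-window sketch, no parameter shape here depends on the sequence length.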

Recurrent Neural Networks

We can process a sequence of vectors x by applying a recurrence formula at each step:
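
The formula itself did not survive extraction. A generic form consistent with the update equations given later in these slides would be the following, where $f_\theta$ and $\theta$ are assumed notation rather than the slide's own symbols:

```latex
h^{(t)} = f_\theta\left(h^{(t-1)},\, x^{(t)}\right),
\qquad \theta = \{W, U, b\}
% With the tanh activation assumed later in the deck, this becomes:
h^{(t)} = \tanh\left(b + W h^{(t-1)} + U x^{(t)}\right)
```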

Forward propagation for the RNN: first model

Activation function for the hidden units
Assume the hyperbolic tangent activation function

Form of output and loss function
Assume the output is discrete (predicting words)
We can obtain a vector of normalized probabilities over the output, ŷ.

Update Equations
Initial state: $h^{(0)}$
From t = 1 to t = τ (the final time step), the following update equation is applied:

$a^{(t)} = b + W h^{(t-1)} + U x^{(t)}$

[Handwritten annotations, largely illegible. Legible fragments: "softmax", "tanh(... + b)".]
Forward Propagation

$a^{(t)} = b + W h^{(t-1)} + U x^{(t)}$
$h^{(t)} = \tanh(a^{(t)})$
$o^{(t)} = c + V h^{(t)}$
$\hat{y}^{(t)} = \mathrm{softmax}(o^{(t)})$

[Handwritten note: cross-entropy]
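
The four forward-propagation equations can be sketched in plain Python as follows. The parameter names `U`, `W`, `V`, `b`, `c` follow the slide; the toy values, the one-hot inputs, the function name `rnn_forward`, and summing the per-step cross-entropy into one total loss are assumptions made for illustration.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def rnn_forward(xs, targets, h0, U, W, V, b, c):
    """Apply the update equations over a sequence; return predictions and loss."""
    h, y_hats, loss = h0, [], 0.0
    for x, target in zip(xs, targets):
        # a(t) = b + W h(t-1) + U x(t)
        a = [b_i + wh + ux for b_i, wh, ux in zip(b, matvec(W, h), matvec(U, x))]
        h = [math.tanh(a_i) for a_i in a]                    # h(t) = tanh(a(t))
        o = [c_i + vh for c_i, vh in zip(c, matvec(V, h))]   # o(t) = c + V h(t)
        y_hat = softmax(o)                                   # y_hat(t) = softmax(o(t))
        loss += -math.log(y_hat[target])  # cross-entropy against the true next word
        y_hats.append(y_hat)
    return y_hats, loss

# Hypothetical toy setup: 3-word vocabulary, one-hot inputs, 2 hidden units.
U = [[0.1, 0.2, -0.1], [0.0, -0.3, 0.2]]
W = [[0.5, 0.1], [-0.2, 0.4]]
V = [[0.3, -0.1], [0.2, 0.2], [-0.4, 0.1]]
b, c = [0.0, 0.0], [0.0, 0.0, 0.0]
xs = [[1, 0, 0], [0, 1, 0]]   # input words 0, then 1
targets = [1, 2]              # indices of the true next words
y_hats, loss = rnn_forward(xs, targets, [0.0, 0.0], U, W, V, b, c)
print([round(sum(y), 6) for y in y_hats], loss > 0)
```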
