Neural Language Model, RNNS: Pawan Goyal
Pawan Goyal (IIT Kharagpur) Neural Language Model, RNNs February 8th, 2022 1 / 38
Storage Problems with n-gram Language Model
[Handwritten annotation: how many parameters does the model need? Storing counts for large n-grams blows up storage, so a small n has to be used.]
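The storage problem noted above can be made concrete with a quick calculation: in the worst case an n-gram model needs a count for every possible n-gram, i.e. up to V^n parameters for a vocabulary of size V. A minimal sketch (V = 10,000 is an assumed example size):

```python
# Sketch: worst-case parameter count of an n-gram language model.
# An assumed vocabulary size; real vocabularies are often larger.
V = 10_000

for n in range(1, 5):
    # Up to V**n distinct n-grams, hence up to V**n counts to store.
    print(f"n={n}: up to {V**n:,} parameters")
```

Already at n = 3 the worst case is a trillion counts, which is why small n has to be used in practice.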
Neural LM?

[Handwritten sketch: each context word is mapped to a d-dimensional embedding; the embeddings are concatenated so as to preserve the order of the words.]
A fixed-window neural language model
A fixed-window neural language model
[Handwritten sketch of the fixed-window model: the embeddings of the window's words are concatenated, fed through a hidden layer, and a softmax over the vocabulary predicts the next word.]
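A minimal numpy sketch of the fixed-window model described above. All sizes and the random parameters are illustrative assumptions: concatenate the window's embeddings (preserving order), apply a tanh hidden layer, then a softmax over the vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, window, hidden = 1000, 50, 4, 128    # assumed sizes

E = rng.normal(size=(V, d))                # embedding matrix
W = rng.normal(size=(hidden, window * d))  # hidden-layer weights
b1 = np.zeros(hidden)
U = rng.normal(size=(V, hidden))           # output weights
b2 = np.zeros(V)

def predict_next(word_ids):
    # Concatenate the window's embeddings; order is preserved.
    e = E[word_ids].reshape(-1)            # shape: (window * d,)
    h = np.tanh(W @ e + b1)                # hidden layer
    o = U @ h + b2                         # scores over the vocabulary
    p = np.exp(o - o.max())
    return p / p.sum()                     # softmax distribution

p = predict_next([3, 17, 42, 7])
print(p.shape, float(p.sum()))
```

Note that the weight matrix W is tied to the window size: a different window length would need a different W, which is one limitation the RNN removes.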
Recurrent Neural Networks
[Handwritten sketch: an unrolled RNN. The same matrices are reused at every step: U maps each input into the hidden state, W carries the hidden state forward, and V maps the hidden state to the output, so the hidden state accumulates information about all inputs seen so far.]

Core Idea
Apply the same weights repeatedly!
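The core idea above, reusing one set of weights at every time step, means the parameter count is independent of the sequence length. A tiny sketch under assumed dimensions:

```python
import numpy as np

d, h = 50, 128                       # assumed input and hidden sizes
U = np.random.randn(h, d)            # input -> hidden (shared across steps)
W = np.random.randn(h, h)            # hidden -> hidden (shared across steps)
b = np.zeros(h)

def run(xs):
    # The SAME U, W, b are applied at every time step.
    state = np.zeros(h)
    for x in xs:
        state = np.tanh(U @ x + W @ state + b)
    return state

short = run(np.random.randn(3, d))    # 3-step sequence
long = run(np.random.randn(100, d))   # 100-step sequence, no new weights
print(short.shape, long.shape)
```

Both calls use exactly the same parameters, so the model handles inputs of any length.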
Forward propagation for the RNN: first model
Update Equations
Initial state: h(0).
From t = 1 to t = τ, the following update equation is applied:

a(t) = b + Wh(t-1) + Ux(t)
[Handwritten sketch: the RNN computation graph. Concatenating the previous hidden state and the input gives the equivalent form h(t) = tanh(W[h(t-1); x(t)] + b), with W of shape h x (h + d); the output is ŷ(t) = softmax(Vh(t) + c).]
Forward Propagation

a(t) = b + Wh(t-1) + Ux(t)
h(t) = tanh(a(t))
o(t) = c + Vh(t)
ŷ(t) = softmax(o(t))

Loss: cross-entropy.
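The forward-propagation equations above can be sketched directly in numpy. The sizes, random parameters, and target word ids below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, V = 50, 64, 1000                     # assumed: input, hidden, vocab sizes

U = rng.normal(scale=0.1, size=(h, d))     # input -> hidden
W = rng.normal(scale=0.1, size=(h, h))     # hidden -> hidden
Vout = rng.normal(scale=0.1, size=(V, h))  # hidden -> output
b, c = np.zeros(h), np.zeros(V)

def softmax(o):
    e = np.exp(o - o.max())
    return e / e.sum()

def forward(xs):
    """Apply the RNN update equations over a sequence of input vectors."""
    h_t = np.zeros(h)                      # initial state h(0)
    ys = []
    for x_t in xs:                         # from t = 1 to t = tau
        a_t = b + W @ h_t + U @ x_t        # a(t) = b + Wh(t-1) + Ux(t)
        h_t = np.tanh(a_t)                 # h(t) = tanh(a(t))
        o_t = c + Vout @ h_t               # o(t) = c + Vh(t)
        ys.append(softmax(o_t))            # y_hat(t) = softmax(o(t))
    return ys

ys = forward(rng.normal(size=(5, d)))
# Cross-entropy loss against hypothetical target word ids at each step:
targets = [3, 1, 4, 1, 5]
loss = -sum(np.log(y[t]) for y, t in zip(ys, targets)) / len(targets)
print(len(ys), float(loss))
```

Each ŷ(t) is a distribution over the vocabulary, and the cross-entropy averages the negative log-probability assigned to the true next word at each step.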