Lecture1 PDF

Introduction to Machine Learning (CS419M)
Lecture 1: What and why?
Jan 5, 2018
What is Machine Learning?
• Ability of computers to “learn” from “data” or “past experience”
• data: Comes from various sources such as sensors, domain knowledge, experimental runs, etc.
• learn: Make intelligent predictions or decisions based on data

Pigeon Superstition
Video link: https://www.youtube.com/watch?v=TtfQlkGwE2U


1. Supervised learning
Supervised Learning
• Given a labeled set of input-output pairs, $D = \{(x_i, y_i)\}_{i=1}^{N}$, the objective is to learn a function mapping the inputs x to outputs y
• Inputs can be complex objects such as images, sentences, speech signals, etc. Typically take the form of features.
• Outputs are either categorical (classification tasks) or real-valued (regression tasks). More on these concepts in later classes.
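As a minimal sketch of this setup, assume a toy labeled set D of 2-D feature vectors with categorical labels. The data, the label names and the helper `predict_1nn` are invented for illustration, and the 1-nearest-neighbour rule is just one very simple way to obtain a function from x to y; it is not a method named on the slides.

```python
import math

# Toy labeled set D = {(x_i, y_i)}: 2-D feature vectors with categorical labels.
# The values and label names are invented for illustration.
D = [
    ((1.0, 1.0), "cat"),
    ((1.5, 2.0), "cat"),
    ((5.0, 5.0), "dog"),
    ((6.0, 4.5), "dog"),
]

def predict_1nn(train, x):
    """Return the label of the training input closest to x (Euclidean distance)."""
    nearest_x, nearest_y = min(train, key=lambda pair: math.dist(pair[0], x))
    return nearest_y

print(predict_1nn(D, (1.2, 1.4)))  # query near the "cat" examples
print(predict_1nn(D, (5.5, 5.0)))  # query near the "dog" examples
```

Because the labels are categorical, this is a classification task; replacing the labels with real numbers would turn the same setup into regression.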
Image recognition
Image from “ImageNet classification with deep CNNs”, Krizhevsky et al.


1. Supervised learning: decision trees, neural networks, etc.


2. Unsupervised learning
Unsupervised Learning
• Given a set of inputs, $D = \{x_i\}_{i=1}^{N}$, discover some patterns in the data
• Most common example: Clustering
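A rough sketch of clustering, using k-means (Lloyd's algorithm) on invented 2-D points. The function name `kmeans`, the data and the choice of initial centroids are assumptions made for illustration only.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate an assignment step and a mean-update step."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Unlabeled inputs D = {x_i}; values invented for illustration.
X = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(X, centroids=[X[0], X[3]])
print(centroids)
print(clusters)
```

No labels are given anywhere: the two groups emerge only from the distances between the inputs.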


2. Unsupervised learning: k-means clustering, etc.


3. Reinforcement learning
When do we need ML? (I)
• For tasks that are easily performed by humans but are complex for
computer systems to emulate
• Vision: Identify faces in a photograph, objects in a video or still image, etc.
• Natural language: Translate a sentence from Hindi to English, question answering, etc.
• Speech: Recognise spoken words, speak sentences naturally
• Game playing: Play games like chess
• Robotics: Walking, jumping, displaying emotions, etc.
• Driving a car, flying a plane, navigating a maze, etc.
Relationship between AI, ML, DL
Image from: https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/

When do we need ML? (II)
• For tasks that are beyond human capabilities
• Analysis of large and complex datasets
• E.g. IBM Watson’s Jeopardy-playing machine
Image credit: https://i.ytimg.com/vi/P18EdAKuC1U/maxresdefault.jpg

• E.g. Autopilot controls
Image credit: https://media.newyorker.com/photos/59095c8a019dfc3494e9f7b9/16:9/w_1200,h_630,c_limit/Hazards-of-Autopilot.jpg

ML and Statistics?
Glossary
Machine learning                  Statistics
network, graphs                   model
weights                           parameters
learning                          fitting
generalization                    test set performance
supervised learning               regression/classification
unsupervised learning             density estimation, clustering
large grant = $1,000,000          large grant = $50,000
nice place to have a meeting:     nice place to have a meeting:
Snowbird, Utah, French Alps       Las Vegas in August
Glossary from: http://statweb.stanford.edu/~tibs/stat315a/glossary.pdf

Course Specifics
Pre-requisites
Officially:
No prerequisites. But it would help if you’ve taken “Data
Structures and Algorithms”, “Data Analysis and Interpretation”,
or any equivalent course taught by the respective departments.
When to take this course:

Should be comfortable with
• probability
• linear algebra
• multivariable calculus
• programming
Course webpage
https://www.cse.iitb.ac.in/~pjyothi/cs419/
Course logistics
Reading:
All mandatory reading will be freely available online and posted
on the course website.
Textbooks (available online):
1. Understanding Machine Learning. Shai Shalev-Shwartz and
Shai Ben-David. Cambridge University Press. 2017.
2. The Elements of Statistical Learning. Trevor Hastie, Robert
Tibshirani and Jerome Friedman. Second Edition. 2009.
Communication:  
We will use Moodle to communicate about the course.
Attendance:  
Strongly advised to attend all lectures. A lot of the material covered
in class will not be on the slides. (Also, points for participation.)
Course TAs
1. Himanshu Agarwal, M.Tech. II
2. Aniket Kuiri, M.Tech. II
3. Pooja Palod, M.Tech. II
4. Sunandini Sanyal, M.Tech. I
5. Anupama Vijjapu, M.Tech. I
6. Saurabh Garg, B.Tech. IV
7. Tanmay Parekh, B.Tech. IV

Course Syllabus
Provide an overview of machine learning and well-known
techniques. We will briefly cover some ML applications as well.
Some Topics:
• Basic foundations of ML, classification/regression, Naive
Bayes’ classifier, linear and logistic regression
• Supervised learning: Decision trees, support vector machines, neural networks, etc.
• Unsupervised learning: k-means clustering, etc.
• Brief introduction to ML applications in computer vision, speech and natural language processing.
Evaluation (subject to change)
Assignments: 4-5 assignments contributing to 40% of your grade.
Late policy for assignments will vary and will be announced along
with the assignment.
Assignments will contain programming questions
Midsem and final exam: 15% + 25%
Participation: 5%. Four in-class quizzes will be randomly handed out. Should have attempted at least two to receive full points for participation.
Academic Integrity Policy
• Write what you know.
• Use your own words.
• If you refer to *any* external material, *always* cite your sources. Follow proper citation guidelines.
• If you’re caught for plagiarism or copying, penalties are much higher than simply omitting that question.
• In short: Just not worth it.
Image credit: https://www.flickr.com/photos/kurok/22196852451

Evaluation — Project
Grading: Constitutes 15% of the total grade. (Exceptional projects
could get extra credit. Details posted on website.)
Team: 2-3 members. Individual projects are highly discouraged.
Project details:
• Apply the techniques you studied in class to any interesting
problem of your choice
• Think of a problem early and work on it throughout the
course. Project milestones will be posted on Moodle.
• Examples of project ideas: auto-complete code, generate
song lyrics, help IRCTC predict ticket prices, etc.
• Feel free to be creative; consult with the TAs or me about feasibility
Datasets abound…
Kaggle: https://www.kaggle.com/datasets
Another good resource: http://deeplearning.net/datasets/
Popular resource for ML beginners: http://archive.ics.uci.edu/ml/index.php
Interesting datasets for computational journalists: http://cjlab.stanford.edu/2015/09/30/lab-launch-and-data-sets/
Speech and language resources: www.openslr.org/
… and so do ML libraries/toolkits
scikit-learn, OpenCV, Keras, TensorFlow, NLTK, etc.
Some basic concepts
Typical ML approach
• How do we approach an ML problem?
• Modeling: Use a model to represent the task

Modeling
[Figure: graphical model over the variables Word, Trans, Posn, Phn, Prev-Phn, L-Lag/T-Lag/G-Lag, L-Phn/T-Phn/G-Phn, Lip-op/TT-op/Glot and the corresponding surface variables (sur Lip-op, sur TT-op, sur Glot)]
Typical ML approach
• Decoding/Inference: Given a model, answer questions with respect to the model
Inference
Given an observed set of values, how accurately can we predict
the identity of a word?
Frame #   Articulatory Features
1         G1 LP0 V0 T0
2         G1 LP1 V0 T0
3         G1 LP1 V0 T0
4         G1 LP2 V1 T3
5         G1 LP2 V1 T3
:
[Figure: the graphical model (Word, Posn, Phn, L-Lag/T-Lag/G-Lag, L-Phn/T-Phn/G-Phn, Lip-Op/TT-Op/Glot and their surface variables) takes the articulatory features and produces a word list weighted by probabilities]
Typical ML approach
• Learning: The model could be parameterized and the parameters are learned using data
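The three steps (modeling, inference, learning) can be illustrated with a deliberately tiny model, far simpler than the speech model pictured on the slides: a coin with unknown bias. The class `CoinModel`, its method names and the data are hypothetical and chosen only for this sketch.

```python
class CoinModel:
    """Modeling: represent the task as a coin with one parameter, P(heads) = theta."""

    def __init__(self, theta=0.5):
        self.theta = theta

    def fit(self, flips):
        """Learning: set theta to the fraction of heads (maximum-likelihood estimate)."""
        self.theta = sum(flips) / len(flips)
        return self

    def prob(self, flips):
        """Inference: probability the model assigns to a sequence of flips."""
        p = 1.0
        for f in flips:
            p *= self.theta if f == 1 else 1.0 - self.theta
        return p

# 1 = heads, 0 = tails; data invented for illustration.
data = [1, 1, 0, 1, 1, 0, 1, 1]
model = CoinModel().fit(data)
print(model.theta)            # fraction of heads in the data
print(model.prob([1, 1, 0]))  # probability of heads, heads, tails under the fitted model
```

The same split carries over to richer models: the structure is fixed by the modeler, the parameters come from data, and questions are then answered against the fitted model.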
How do we know if our model’s any good?
• Generalization: Does the trained model produce good
predictions on examples beyond the training set?
• We should be careful not to overfit the training data
• Occam’s Razor: All other things being equal, pick the simplest solution
• These concepts will be made more precise in later classes
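A small sketch of overfitting versus generalization, under invented assumptions: the underlying trend is y = 2x, the training labels carry noise of +/-1, and the two models (`fit_memorizer`, `fit_line`) are illustrative stand-ins for an overly flexible model and a simple one.

```python
def mse(model, data):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def fit_memorizer(train):
    """Overly flexible model: return the stored y of the nearest training x."""
    def predict(x):
        nearest_x, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest_y
    return predict

def fit_line(train):
    """Simple model: least-squares straight line y = a * x + b."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
    b = mean_y - a * mean_x
    return lambda x: a * x + b

# Invented data: noisy training labels, noise-free held-out examples.
train = [(0, 1), (1, 1), (2, 5), (3, 5), (4, 9)]
test = [(0.4, 0.8), (1.4, 2.8), (2.4, 4.8), (3.4, 6.8)]

for name, fit in [("memorizer", fit_memorizer), ("line", fit_line)]:
    model = fit(train)
    print(name, "train MSE:", mse(model, train), "test MSE:", mse(model, test))
```

The memorizer reproduces the training set exactly, noise included, and pays for it on the held-out examples; the line has a nonzero training error but a much smaller test error.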

No free lunch theorem
• There is no single best model that works optimally for all kinds
of problems (Wolpert 1997)
• No learning is possible without some prior assumptions about the problem at hand
• Need many different types of models to cover the variety of problems in the real world. Each model will have a range of algorithms that can be used to train it.
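A toy illustration of the idea, not the formal statement of the theorem: with four inputs and no assumption about which of the 16 possible labelings is the true one, every learner averages out to chance on the inputs it has not seen. The two learners and the train/test split are invented for this sketch.

```python
from itertools import product

# Tiny invented setting: four inputs; the learner sees labels for the first two
# and is scored on the two it has not seen.
TRAIN_X, TEST_X = (0, 1), (2, 3)

def always_zero(train_labels, x):
    """Learner A: ignore the data and always predict 0."""
    return 0

def copy_majority(train_labels, x):
    """Learner B: predict the majority training label (ties go to 1)."""
    return 1 if 2 * sum(train_labels) >= len(train_labels) else 0

def average_test_accuracy(learner):
    """Average off-training-set accuracy over all 16 possible labelings."""
    total = 0.0
    targets = list(product([0, 1], repeat=4))
    for f in targets:
        train_labels = [f[x] for x in TRAIN_X]
        correct = sum(learner(train_labels, x) == f[x] for x in TEST_X)
        total += correct / len(TEST_X)
    return total / len(targets)

print(average_test_accuracy(always_zero))    # 0.5
print(average_test_accuracy(copy_majority))  # 0.5
```

A learner can beat chance only once some labelings are assumed to be more likely than others, which is the sense in which prior assumptions are needed.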
