Introduction to ML


CSPC204 MACHINE LEARNING

Dr K P Sharma
Assistant Professor
Department of Computer Science and Engineering
Email: [email protected]

Department of Computer Science and Engineering


Dr. B.R. Ambedkar National Institute of Technology
Jalandhar Punjab-144011
Syllabus
DEPARTMENT: COMPUTER SCIENCE AND ENGINEERING
COURSE CODE: CSPC-204
COURSE TITLE: MACHINE LEARNING
COURSE DESIGNATION: REQUIRED
PRE-REQUISITES: NONE
CONTACT HOURS/CREDIT SCHEME: (L-T-P-C: 3-0-0-3)
COURSE ASSESSMENT METHODS: One mid-semester and one end-semester exam, along with
assignments and quizzes.

COURSE OUTCOMES: After the completion of the course, the students will be able to:
1. Demonstrate in-depth knowledge of methods and theories in the field of machine learning, and provide an
introduction to the basic principles, techniques, and applications of machine learning, classification tasks, and
decision tree learning.
2. Apply decision tree learning, Bayesian learning and artificial neural networks to real-world problems.
3. Demonstrate the use of genetic algorithms and genetic programming.
4. Apply inductive and analytical learning with perfect domain theories.

1/16/2024 DR KP SHARMA, NIT JALANDHAR 2


Syllabus
Mapping of course outcomes (CSPC-204) to program outcomes PO1–PO12 (H = high, M = medium):
CO 1: M H H
CO 2: H M H
CO 3: M M
CO 4: M

TOPICS COVERED
Introduction: Well-Posed learning problems, Basic concepts, Designing a learning system, Issues in
machine learning. Types of machine learning: Learning associations, Supervised learning
(Classification and Regression Trees, Support vector machines), Unsupervised learning (Clustering),
Instance-based learning (K-nearest Neighbor, Locally weighted regression, Radial Basis Function),
Reinforcement learning (Learning Task, Q-learning, Value function approximation, Temporal
difference learning).



Syllabus
Decision Tree Learning: Decision tree representation, appropriate problems for decision tree
learning, Univariate Trees (Classification and Regression), Multivariate Trees, Basic Decision Tree
Learning algorithms, Hypothesis space search in decision tree learning, Inductive bias in decision tree
learning, Issues in decision tree learning.
Bayesian Learning: Bayes theorem and concept learning, Bayes optimal classifier, Gibbs algorithms,
Naive Bayes Classifier, Bayesian belief networks, The EM algorithm.
Artificial Neural Network: Neural network representation, Neural Networks as a paradigm for
parallel processing, Linear discrimination, Pairwise separation, Gradient Descent, Logistic
discrimination, Perceptron, Training a perceptron, Multilayer perceptron, Back propagation
Algorithm. Recurrent Networks, Dynamically modifying network structure.
Genetic Algorithms: Basic concepts, Hypothesis space search, Genetic programming, Models of
evolution and learning, Parallelizing Genetic Algorithms.
.



Syllabus
Inductive and Analytical Learning: Learning rule sets, Comparison between inductive and
analytical learning, Analytical learning with perfect domain theories: Prolog-EBG. Inductive
Analytical approaches to learning, Using prior knowledge to initialize hypothesis (KBANN
Algorithm), to alter search objective (Tangent Prop and EBNN Algorithm), to augment search
operators (FOCL Algorithm).

TEXT BOOKS, AND/OR REFERENCE MATERIAL


◦ Mitchell T. M., Machine Learning, McGraw-Hill (1997).
◦ Alpaydin E., Introduction to Machine Learning, MIT Press (2010), 2nd ed.
◦ Bishop C., Pattern Recognition and Machine Learning, Springer-Verlag (2006).
◦ Michie D., Spiegelhalter D. J., Taylor C. C., Machine Learning, Neural and Statistical Classification,
Overseas Press (2009), 1st ed.



Machine Learning:
Introduction

Machine Learning
Human beings have always dreamt of creating machines with human-like capabilities.
Some examples: robots in manufacturing, mining, agriculture, space and ocean
exploration, health science, etc.
However, these machines simply follow explicit commands.
The goal of robotics and systems science is to cast intelligence into machines:
- to create intelligent machines that can emulate human intelligence.

Human Intelligence
Possesses complex sensory mechanisms, control, affective (emotional process), and cognitive (thought
process) aspects of information processing and decision making.
This process is supported by biological neurons, over one hundred billion in number, in our central
nervous system (CNS).
The CNS acquires information from the external environment through various natural sensory mechanisms,
integrates the information, and provides appropriate interpretation through cognitive computing.
The cognitive process then advances further towards attributes such as learning, recollection
and reasoning, which result in appropriate action through muscular control.
- Is it possible to emulate this behaviour?
- How can the modelling be done?
- What is the mathematics behind it?

Human Intelligence and Computers
Traditionally, computers have been used mainly for the storage and processing of numerical data.
◦ Can we perform cognitive functions such as learning, remembering, reasoning, perceiving, etc.?

We must develop new tools and hardware that can deal with cognitive learning.
The process of cognitive thinking does not necessarily follow mathematical laws.
◦ Is there any cognitive mathematics?

If we re-examine some of the mathematical aspects of our thinking process and the hardware
aspects of neurons, we may succeed to some extent in the emulation process.
Computing methods based on neural networks may be able to provide a thinking machine.

Machine Learning
Machine Learning (ML) is a subfield of artificial intelligence (AI) that focuses on the
development of algorithms and statistical models that enable computer systems to improve
their performance on a specific task over time, without being explicitly programmed. The core
idea behind machine learning is to allow computers to learn patterns and make decisions or
predictions based on data.
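As a minimal illustration of this idea, the sketch below classifies points by learning from labelled examples rather than from hand-coded rules. The data points and labels are made up for illustration, and a simple 1-nearest-neighbour rule stands in for the more sophisticated algorithms covered later in the course:

```python
import math

# Toy labelled data: (feature vector, label). Values are hypothetical.
training_data = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"),
    ((8.0, 8.0), "B"), ((9.0, 7.5), "B"),
]

def predict(x):
    """Classify x by the label of its nearest training example (1-NN)."""
    nearest = min(training_data, key=lambda pair: math.dist(x, pair[0]))
    return nearest[1]

print(predict((1.2, 1.4)))  # close to the "A" cluster
print(predict((8.5, 8.0)))  # close to the "B" cluster
```

No classification rule was written by hand; the behaviour comes entirely from the labelled data, which is the essence of the definition above.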



Examples of Successful Applications of Machine Learning
Learning to recognize spoken words (Lee, 1989; Waibel, 1989).
Learning to drive an autonomous vehicle (Pomerleau, 1989).
Learning to classify new astronomical structures (Fayyad et al., 1995).
Learning to play world-class backgammon (Tesauro 1992, 1995).

Why is Machine Learning Important?
Some tasks cannot be defined well, except by examples (e.g., recognizing people).
Relationships and correlations can be hidden within large amounts of data. Machine
Learning/Data Mining may be able to find these relationships.
Human designers often produce machines that do not work as desired in the environments in
which they are used.
The amount of knowledge available about certain tasks might be too large for explicit
encoding by humans (e.g., medical diagnosis).
Environments change over time.
New knowledge about tasks is constantly being discovered by humans. It may be difficult to
continuously re-design systems “by hand”.

Forms of Learning
The field of machine learning usually distinguishes four forms of learning:
supervised learning, unsupervised learning, reinforcement learning, and
learning based on natural processes.
1. Supervised/Direct Learning: Classification and Regression
2. Unsupervised Learning: Cluster analysis and Association analysis
3. Reinforcement Learning
4. Learning based on natural processes (e.g., genetic algorithms)

Comparison Table

| Criteria | Supervised ML | Unsupervised ML | Reinforcement ML |
| --- | --- | --- | --- |
| Definition | Learns by using labelled data | Trained using unlabelled data without any guidance | Works by interacting with the environment |
| Type of data | Labelled data | Unlabelled data | No predefined data |
| Type of problems | Regression and classification | Association and clustering | Exploitation or exploration |
| Supervision | Extra supervision | No supervision | No supervision |
| Algorithms | Linear regression, logistic regression, SVM, KNN, etc. | K-means, C-means, Apriori | Q-learning, SARSA |
| Aim | Calculate outcomes | Discover underlying patterns | Learn a series of actions |
| Application | Risk evaluation, sales forecasting | Recommendation systems, anomaly detection | Self-driving cars, gaming, healthcare |
Well-Posed Learning Problem
Definition: A computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P, if its performance at tasks in T, as measured
by P, improves with experience E.

For example: a computer program that learns to play checkers might improve its
performance, as measured by its ability to win, at the class of tasks involving playing
checkers games.



Some example learning problems
A checkers learning problem
Task T: playing checkers
Performance measure P: percentage of games won against opponents.
Training experience E: playing practice games against itself and opponents

Some example learning problems
A handwriting recognition learning problem
Task T: recognizing and classifying handwritten words within images
Performance measure P: percentage of words correctly classified.
Training experience E: a database of handwritten words with given classifications.
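The performance measure P above can be computed directly from a system's predictions. A minimal sketch, using hypothetical word lists:

```python
# Performance measure P: the percentage of words correctly classified.
# The predicted and actual word lists below are hypothetical.
predicted = ["cat", "dog", "bird", "fish"]
actual    = ["cat", "dog", "bird", "frog"]

correct = sum(p == a for p, a in zip(predicted, actual))
accuracy = 100.0 * correct / len(actual)
print(accuracy)  # 75.0
```

Improving with experience E then means this percentage rises as the system sees more of the labelled database.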

Some example learning problems
A robot driving learning problem
Task T: driving on public four-lane highways using vision sensors.
Performance measure P: average distance travelled before an error.
Training experience E: a sequence of images and steering commands recorded while observing
a human driver.

Designing a Learning System:
An Example
1. Problem Description
2. Choosing the Training Experience
3. Choosing the Target Function
4. Choosing a Representation for the Target Function
5. Choosing a Function Approximation Algorithm
6. Final Design

Designing a Learning System
Consider designing a program to learn to play
checkers, with the goal of entering it in the world
checkers tournament

Performance measure: the percentage of games it
wins in this tournament.
Requires the following design choices
◦ Choosing Training Experience
◦ Choosing the Target Function
◦ Choosing the Representation of the Target Function
◦ Choosing the Function Approximation Algorithm

Choosing the Training Experience
What training experience should the system have?
◦ A design choice with great impact on the outcome.
What amount of interaction should there be between the system
and the supervisor?
Which training examples?
Will the training experience provide direct or indirect feedback?

Direct feedback is easier to learn from.

Choosing the Training Experience
What amount of interaction should there be between the system and the
supervisor?

◦ Choice #1: No freedom.


◦ Choice #2: Semi-free
◦ Choice #3: Total-freedom.

Choosing the Training Experience
Which training examples?
◦ There is a huge number of possible games.
◦ No time to try all possible games.
◦ System should learn with examples that it will encounter in the future.
◦ For example, if the goal is to beat humans, it should be able to do well in situations that
humans encounter when they play (this is hard to achieve in practice).

Choosing the Training Experience
◦ If training the checkers program consists only of experiences
played against itself, it may never encounter crucial board
states that are likely to be played by the human checkers
champion
◦ Most theory of machine learning rests on the assumption
that the distribution of training examples is identical to the
distribution of test examples

Partial Design of Checkers Learning Program
A checkers learning problem:
◦ Task T: playing checkers
◦ Performance measure P: percent of games won in the world tournament
◦ Training experience E: games played against itself
Remaining choices
◦ The exact type of knowledge to be learned
◦ A representation for this target knowledge
◦ A learning mechanism

Choosing the Target Function
What should be learned exactly?

The computer program knows the legal moves.

It should learn how to choose the best move from among the legal moves.

The computer should learn a 'hidden' function.
◦ target function: ChooseMove : B → M
◦ where B is the set of legal board states and M is the set of legal moves

ChooseMove is difficult to learn given only indirect training experience.
Choosing the Target Function
So, our alternative target function:
◦ An evaluation function that assigns a numerical score to any given
board state
◦ V : B → ℝ (where ℝ is the set of real numbers)
◦ V(b) for an arbitrary board state b in B:
◦ if b is a final board state that is won, then V(b) = 100
◦ if b is a final board state that is lost, then V(b) = -100
◦ if b is a final board state that is drawn, then V(b) = 0
◦ if b is not a final state, then V(b) = V(b'), where b' is the best final board
state that can be achieved starting from b and playing optimally until the
end of the game
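The final-state cases of this definition can be written down directly; only the recursive case resists direct computation. A minimal sketch, where a board state is summarised by a status string, which is an illustrative simplification:

```python
# Sketch of the target function V on final board states only. Representing a
# board by a status string ("won"/"lost"/"drawn") is an illustrative
# simplification, not a real board representation.
def V_final(status):
    if status == "won":
        return 100
    if status == "lost":
        return -100
    if status == "drawn":
        return 0
    raise ValueError("V on non-final states is defined recursively and is not "
                     "efficient to compute; it must be approximated instead.")

print(V_final("won"))    # 100
print(V_final("drawn"))  # 0
```

The raised error marks exactly where the nonoperational part of the definition begins, motivating the search for an operational approximation.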

Choosing the Target Function
V(b) gives a recursive definition for board state b
◦ Not usable because it is not efficient to compute, except in the first
three trivial cases
◦ a nonoperational definition
The goal of learning is to discover an operational
description of V
Learning the target function is often called function
approximation
◦ The learned function is referred to as V̂, an approximation to V
Choosing a Representation for the Target Function
Choice of representation involves trade-offs
◦ Pick a very expressive representation to allow a close approximation to
the ideal target function V
◦ The more expressive the representation, the more training data is required
to choose among alternative hypotheses
Use a linear combination of the following board features:
◦ x1: the number of black pieces on the board
◦ x2: the number of red pieces on the board
◦ x3: the number of black kings on the board
◦ x4: the number of red kings on the board
◦ x5: the number of black pieces threatened by red (i.e. which can be captured on
red's next turn)
◦ x6: the number of red pieces threatened by black

V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6

where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning
algorithm. The learned values of w1 through w6 will determine the relative importance of the
various board features in determining the value of the board, whereas the weight w0 provides
an additive constant to the board value.
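This linear representation is just a weighted sum and can be sketched in a few lines; the weight values below are arbitrary illustrative numbers, not learned ones:

```python
# Sketch of the linear evaluation function
#   V_hat(b) = w0 + w1*x1 + ... + w6*x6
# where a board state b is summarised by its six features x1..x6.
def v_hat(weights, features):
    """weights: [w0, w1, ..., w6]; features: [x1, ..., x6]."""
    w0 = weights[0]
    return w0 + sum(w * x for w, x in zip(weights[1:], features))

weights  = [0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]  # arbitrary, not learned
features = [3, 0, 1, 0, 0, 0]  # 3 black pieces, 1 black king, no red pieces

print(v_hat(weights, features))  # 0 + 1*3 + (-1)*0 + 3*1 + ... = 6.0
```

With this representation fixed, learning a checkers strategy reduces to choosing the seven numbers w0 through w6.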
Partial Design of Checkers Learning Program
A checkers learning problem:
◦ Task T: playing checkers
◦ Performance measure P: percent of games won in the world
tournament
◦ Training experience E: games played against itself
◦ Target function: V : B → ℝ
◦ Target function representation: V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6

• The first three items above correspond to the specification of the learning task, whereas the final two
items constitute design choices for the implementation of the learning program.
• Notice the net effect of this set of design choices is to reduce the problem of learning a checkers strategy
to the problem of learning values for the coefficients w0 through w6 in the target function
representation.
Choosing a Function Approximation Algorithm
To learn V̂ we require a set of training examples, each describing a board state b and
a training value Vtrain(b)
◦ Each training example is an ordered pair ⟨b, Vtrain(b)⟩

Example: the following training example describes a board state b in which black has won the game (note x2
= 0 indicates that red has no remaining pieces) and for which the target function value Vtrain(b) is therefore
+100:

⟨⟨x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0⟩, +100⟩

Recall the features:
x1: the number of black pieces on the board
x2: the number of red pieces on the board
x3: the number of black kings on the board
x4: the number of red kings on the board
x5: the number of black pieces threatened by red (i.e. which can be
captured on red's next turn)
x6: the number of red pieces threatened by black
Estimating Training Values
Need to assign specific scores to intermediate board states.
Approximate the training value of an intermediate board state b using the learner's current
approximation V̂ applied to the successor board state following b:

Vtrain(b) ← V̂(Successor(b))

◦ Simple and successful approach

◦ More accurate for states closer to end states
Adjusting the Weights
Choose the weights wi to best fit the set of training examples.
Minimize the squared error E between the training values and the values predicted by the
hypothesis:

E ≡ Σ⟨b, Vtrain(b)⟩ ∈ training examples (Vtrain(b) − V̂(b))²

Require an algorithm that
◦ will incrementally refine weights as new training examples become available
◦ will be robust to errors in these estimated training values
Least Mean Squares (LMS) is one such algorithm.
LMS Weight Update Rule
For each training example ⟨b, Vtrain(b)⟩:
◦ Use the current weights to calculate V̂(b)
◦ For each weight wi, update it as

wi ← wi + η (Vtrain(b) − V̂(b)) xi

◦ where η is a small constant (e.g. 0.1) that moderates the size of the weight update
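The update rule above can be sketched as a small training loop. The single training example and learning rate below are illustrative, and w0 is treated as a weight on a constant input of 1:

```python
# Sketch of the LMS weight-update rule:
#   w_i <- w_i + eta * (Vtrain(b) - V_hat(b)) * x_i
def v_hat(weights, features):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, eta=0.1):
    """One LMS step over all weights for a single training example."""
    error = v_train - v_hat(weights, features)
    new_w = [weights[0] + eta * error]  # w0 has a constant input of 1
    new_w += [w + eta * error * x for w, x in zip(weights[1:], features)]
    return new_w

weights = [0.0] * 7
example = ([3, 0, 1, 0, 0, 0], 100.0)  # winning board for black, Vtrain = +100
for _ in range(100):
    weights = lms_update(weights, *example)

print(round(v_hat(weights, example[0]), 2))  # converges towards 100.0
```

Each step nudges the weights in the direction that reduces the error on the current example, which is what makes the rule both incremental and robust to noisy training values.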
Summary of Design Choices
[Figure: summary of the design choices for the checkers learning program]

Issues in Machine Learning (i.e., Generalization)
What algorithms are available for learning a concept? How well do
they perform?
How much training data is sufficient to learn a concept with high
confidence?
When is it useful to use prior knowledge?
Are some training examples more useful than others?
What are the best tasks for a system to learn?
What is the best way for a system to represent its knowledge?

Experience: Training Data
[Figure: scatter plot of training examples with axes Age and Income]

Labelled Training Data: Classification
[Figure: the same Age-Income scatter plot with a class label attached to each point]

Possible Classifiers
[Figures: several candidate decision boundaries separating the labelled points in the Age-Income plane]

The Process
[Figure: overview of the learning process]

Training Process
[Figure: the training process, from labelled data to a learned model]

Another Example: Prediction or Regression
[Figures: fitting a function to data points in order to predict a continuous output]

Linear Regression
[Figure: a straight line fitted to the data points]
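A minimal sketch of simple one-variable linear regression, fitting y = ax + b with the closed-form least-squares solution; the data values below are made up:

```python
# Fit y = a*x + b to toy data by minimising the squared error,
# using the closed-form least-squares solution.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.1, 5.9, 8.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x); intercept from the means
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(round(a, 2), round(b, 2))  # slope near 2, intercept near 0
```

This is the same minimise-the-squared-error idea as the LMS rule for the checkers program, but solved in one step rather than iteratively.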
