CS8082U4L01 - K-Nearest Neighbour Learning


CS8082 – Machine Learning Techniques

Unit 4 – Instance-Based Learning

k-Nearest Neighbor Learning


Unit Objectives

• Understand the need for machine learning in various problem-solving tasks
• Study the various supervised, semi-supervised and unsupervised learning algorithms in machine learning

Unit Outcomes

At the end of the course, the student should be able to:


• CO 1: Differentiate between supervised, unsupervised and semi-supervised machine learning approaches
• CO 2: Apply a specific supervised or unsupervised machine learning algorithm to a particular problem
• CO 3: Analyze and suggest the appropriate machine learning approach for various types of problems
• CO 4: Design and make modifications to existing machine learning algorithms to suit an individual application
• CO 5: Provide useful case studies on advanced machine learning algorithms

Prerequisite

• Understanding of the k-NN algorithm
• Familiarity with distance-based instance similarity measures
• General idea of gradient descent
What is instance-based learning?

• In contrast to learning methods that construct a general, explicit description of the target function when training examples are provided, instance-based learning methods simply store the training examples.
• Instance-based learning includes nearest neighbor and locally weighted regression methods that assume instances can be represented as points in a Euclidean space.
• Instance-based methods are sometimes referred to as "lazy" learning methods because they delay processing until a new instance must be classified.
Instance-based learning

• Learning in these algorithms consists of simply storing the presented training data.
• When a new query instance is encountered, a set of similar related instances is retrieved from memory and used to classify the new query instance.
• Many of these techniques construct only a local approximation to the target function, rather than an approximation over the entire instance space.
1. k-Nearest Neighbor Learning

• Assumes all instances correspond to points in the n-dimensional space.
• The nearest neighbors of an instance are defined in terms of the standard Euclidean distance.
• Let an arbitrary instance x be described by the feature vector ⟨a_1(x), a_2(x), …, a_n(x)⟩, where a_r(x) denotes the value of the r-th attribute of x.
• Then the distance between two instances xi and xj is defined to be d(xi, xj), where

  d(xi, xj) = sqrt( Σ_{r=1..n} ( a_r(xi) − a_r(xj) )² )
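As a concrete illustration (not part of the original slides), a minimal Python/NumPy sketch of this distance computation, assuming each instance is represented as an array of its n attribute values:

```python
import numpy as np

def euclidean_distance(xi, xj):
    """Standard Euclidean distance between two instances,
    each given as an n-dimensional attribute vector."""
    xi, xj = np.asarray(xi, dtype=float), np.asarray(xj, dtype=float)
    return np.sqrt(np.sum((xi - xj) ** 2))

# Example: d((1, 2), (4, 6)) = 5.0
print(euclidean_distance([1, 2], [4, 6]))
```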
k-Nearest Neighbor Learning

Target function: discrete-valued or real-valued

• CASE 1: Discrete-valued
  Given a query instance xq, the algorithm finds the k training examples nearest to xq and returns the most common target value among them (for k = 1, the value of the single nearest example):

  f̂(xq) ← argmax_{v ∈ V} Σ_{i=1..k} δ(v, f(xi)),  where δ(a, b) = 1 if a = b and 0 otherwise
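A minimal sketch of this discrete-valued case in Python (illustrative only; function and variable names are my own):

```python
import numpy as np
from collections import Counter

def knn_classify(query, examples, labels, k=5):
    """Return the most common label among the k training
    examples closest to the query (standard Euclidean distance)."""
    examples = np.asarray(examples, dtype=float)
    dists = np.sqrt(((examples - np.asarray(query, dtype=float)) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Example usage with a toy two-class data set
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = ["A", "A", "A", "B", "B", "B"]
print(knn_classify([0.5, 0.5], X, y, k=3))  # -> "A"
```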
k-Nearest Neighbor Learning

• The diagram on the right side of the figure shows the shape of the decision surface induced by 1-NN over the entire instance space.
• The decision surface is a combination of convex polyhedra surrounding each of the training examples.
• For every training example, the polyhedron indicates the set of query points whose classification will be completely determined by that training example.
• Query points outside the polyhedron are closer to some other training example.
• This kind of diagram is often called the Voronoi diagram of the set of training examples.
k-Nearest Neighbor Learning

• CASE 2: Continuous-valued
  The algorithm calculates the mean value of the k nearest training examples rather than their most common value:

  f̂(xq) ← ( Σ_{i=1..k} f(xi) ) / k
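A corresponding sketch for the continuous-valued case, returning the mean target value of the k nearest neighbors (names are illustrative, not from the slides):

```python
import numpy as np

def knn_regress(query, examples, targets, k=5):
    """Estimate f(query) as the mean target value of the
    k nearest training examples (plain, unweighted k-NN)."""
    examples = np.asarray(examples, dtype=float)
    targets = np.asarray(targets, dtype=float)
    dists = np.sqrt(((examples - np.asarray(query, dtype=float)) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return targets[nearest].mean()
```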
Distance-Weighted Nearest Neighbor Algorithm

Refinement of the k-NN algorithm:

• Weight the contribution of each of the k neighbors according to its distance from the query point, giving greater weight to closer neighbors.
• For example, weight the vote of each neighbor according to the inverse square of its distance from xq, i.e. wi ≡ 1 / d(xq, xi)².
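A sketch of this refinement for a discrete target, weighting each neighbor's vote by the inverse square of its distance (the exact-match check for zero distance and all names are my own assumptions):

```python
import numpy as np
from collections import defaultdict

def weighted_knn_classify(query, examples, labels, k=5):
    """Distance-weighted k-NN: each of the k nearest neighbors
    votes with weight 1 / d(query, x_i)**2."""
    examples = np.asarray(examples, dtype=float)
    dists = np.sqrt(((examples - np.asarray(query, dtype=float)) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    votes = defaultdict(float)
    for i in nearest:
        if dists[i] == 0.0:          # query coincides with a training point
            return labels[i]
        votes[labels[i]] += 1.0 / dists[i] ** 2
    return max(votes, key=votes.get)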
Distance-Weighted Nearest Neighbor Algorithm
Distance-weighted instances – Real-valued target function

For a real-valued target function, the distance-weighted estimate replaces the simple mean with a weighted average of the k nearest neighbors:

  f̂(xq) ← ( Σ_{i=1..k} wi f(xi) ) / ( Σ_{i=1..k} wi ),  where wi = 1 / d(xq, xi)²

• The only disadvantage of considering all examples is that our classifier will run more slowly.
• If all training examples are considered when classifying a new query instance, we call the algorithm a global method.
• If only the nearest training examples are considered, we call it a local method.
• When the above rule is applied as a global method, using all training examples, it is known as Shepard's method.
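For the real-valued case, the global (Shepard-style) estimate can be sketched as an inverse-square-weighted average over all training examples; this is only an illustration of the idea stated above, with names of my own choosing:

```python
import numpy as np

def shepard_estimate(query, examples, targets):
    """Global distance-weighted estimate: every training example
    contributes, weighted by 1 / d(query, x_i)**2."""
    examples = np.asarray(examples, dtype=float)
    targets = np.asarray(targets, dtype=float)
    dists = np.sqrt(((examples - np.asarray(query, dtype=float)) ** 2).sum(axis=1))
    if np.any(dists == 0.0):               # exact match: return its target value
        return targets[dists == 0.0][0]
    weights = 1.0 / dists ** 2
    return np.sum(weights * targets) / np.sum(weights)
```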
2. Locally weighted regression

• A generalization of nearest-neighbor approximation: instead of estimating the target function f(x) only at the single query point x = xq, it constructs an explicit approximation to f over a local region surrounding xq.
• Locally weighted regression uses nearby or distance-weighted training examples to form this local approximation to f.
Locally weighted regression
• local because the function is approximated based only on data near the query point,
• weighted because the contribution of each training example is weighted by its distance from the query point, and
• regression because this is the term used widely in the statistical learning community for the problem of approximating real-valued functions.
Understanding Locally weighted regression
Locally weighted linear regression

Consider the case where the target function is approximated near xq using a linear function:

  f̂(x) = w0 + w1 a_1(x) + … + wn a_n(x)

• For a global approximation to the target function, gradient descent can be used to choose the weights w0, …, wn that minimize the squared error summed over the set D of training examples,

  E = ½ Σ_{x ∈ D} ( f(x) − f̂(x) )²,

  using the weight update Δwj = η Σ_{x ∈ D} ( f(x) − f̂(x) ) a_j(x).
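A sketch of this global case (illustrative code, not from the slides; assumes suitably scaled data and a small learning rate): batch gradient descent on the weights of a linear hypothesis, minimizing the squared error over all of D.

```python
import numpy as np

def fit_linear_global(X, y, lr=0.01, epochs=1000):
    """Fit f_hat(x) = w0 + w1*a1(x) + ... + wn*an(x) by batch
    gradient descent on the squared error over all training data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend a 1 for the bias w0
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        error = y - Xb @ w                           # f(x) - f_hat(x) for each example
        w += lr * Xb.T @ error                       # delta_wj = lr * sum error * a_j(x)
    return w
```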
Locally weighted linear regression
• How shall we modify this procedure to derive a local approximation rather than a global one?
• The simple way is to redefine the error criterion E to emphasize fitting the local training examples.
• Three possible criteria are given below, where K(d(xq, x)) is a kernel function that decreases with the distance d:

  1. Minimize the squared error over just the k nearest neighbors:
     E1(xq) = ½ Σ_{x ∈ k nearest nbrs of xq} ( f(x) − f̂(x) )²
  2. Minimize the squared error over the entire set D, weighting each example's error by the kernel of its distance from xq:
     E2(xq) = ½ Σ_{x ∈ D} ( f(x) − f̂(x) )² K(d(xq, x))
  3. Combine 1 and 2:
     E3(xq) = ½ Σ_{x ∈ k nearest nbrs of xq} ( f(x) − f̂(x) )² K(d(xq, x))
Locally weighted linear regression

• Rederiving the gradient descent rule for the third criterion gives:

  Δwj = η Σ_{x ∈ k nearest nbrs of xq} K(d(xq, x)) ( f(x) − f̂(x) ) a_j(x)

• The contribution of instance x to the weight update is now multiplied by the distance penalty K(d(xq, x)), and the error is summed over only the k nearest training examples.
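A corresponding sketch of locally weighted linear regression for a single query point, using a Gaussian kernel over the k nearest neighbors as the distance penalty (the kernel choice, parameters, and names are assumptions of this illustration, not prescribed by the slides):

```python
import numpy as np

def lwlr_predict(query, X, y, k=10, tau=1.0, lr=0.01, epochs=500):
    """Fit a linear model to the k nearest neighbors of the query,
    weighting each example's error by K(d) = exp(-d**2 / (2*tau**2)),
    then return the local model's prediction at the query point."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    q = np.asarray(query, dtype=float)
    dists = np.sqrt(((X - q) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    Xn = np.hstack([np.ones((len(nearest), 1)), X[nearest]])  # bias column for w0
    yn = y[nearest]
    kernel = np.exp(-dists[nearest] ** 2 / (2 * tau ** 2))
    w = np.zeros(Xn.shape[1])
    for _ in range(epochs):
        error = yn - Xn @ w
        w += lr * Xn.T @ (kernel * error)            # update weighted by distance penalty
    return np.concatenate(([1.0], q)) @ w
```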
Summary
• k-NN is an instance-based algorithm for approximating real-valued or discrete-valued target functions, assuming instances correspond to points in an n-dimensional Euclidean space.
• The target function value for a new query is estimated from the known values of the k nearest training examples.
• Locally weighted regression methods are a generalization of k-NN in which an explicit local approximation to the target function is constructed for each query instance.
• The local approximation to the target function may be based on a variety of functional forms, such as constant, linear, or quadratic functions, or on spatially localized kernel functions.
References

TEXT BOOKS:

1. Tom M. Mitchell, "Machine Learning", McGraw-Hill Education (India) Private Limited, 2013.

REFERENCES:

2. Ethem Alpaydin, "Introduction to Machine Learning (Adaptive Computation and Machine Learning)", The MIT Press, 2004.
3. Stephen Marsland, "Machine Learning: An Algorithmic Perspective", CRC Press, 2009.
