Module 1
Machine Learning
21BCA6D02
VI Sem BCA - DA
Course Objectives & Outcomes
COB1 Introduce the basic concepts of Machine Learning techniques in problem solving
COB2 Familiarize Machine Learning model building and evaluation
COB3 Introduce Neural Network and Deep Learning
3. Distribution of examples
• Learning is reliable when training examples follow a distribution similar to that of future test examples
• If the training experience consists only of games played against itself, that experience may not be fully representative of the distribution of future test cases
• Mastery of one distribution of examples will not necessarily lead to strong performance over another distribution
Designing a Learning System
II. Choosing the Target Function
Use V to choose the best successor board state, and thereby the best legal move
• 1. if b is a final board state that is won, then V(b) = 100
• 2. if b is a final board state that is lost, then V(b) = -100
• 3. if b is a final board state that is drawn, then V(b) = 0
• 4. if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that
can be achieved starting from b (assuming optimal play until the end of the game)
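The four rules above can be sketched directly for final states; the non-final case (rule 4) requires game-tree lookahead, so only the terminal values are shown here (the `outcome` labels are hypothetical, not from any particular board library):

```python
# Terminal values of the target function V, per the four rules above.
# `outcome` is a hypothetical label describing a FINAL board state.
def target_value(outcome):
    if outcome == "won":
        return 100
    if outcome == "lost":
        return -100
    if outcome == "drawn":
        return 0
    # Rule 4: a non-final state takes the value of the best final state
    # reachable from it, which requires search -- not shown here.
    raise ValueError("non-final state: V(b) = V(b') requires lookahead")
```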
• Operational description of V to evaluate the states and select moves
• Large table with distinct entry specifying value for each distinct board state
• Collection of rules that match against the features of the board state
• Quadratic polynomial function of predefined board features
• Artificial Neural Network
• Pick an expressive representation to allow a close approximation to the ideal target function
• But a more expressive representation requires more training data
• For simplicity, use a representation in which V is calculated as a linear combination of board features
Estimating training values from the value of the successor state has been shown to converge toward perfect estimates of V train
V. The Final Design
Performance System
Critic
Generalizer
Experiment Generator
Perspectives and Issues in ML
Perspectives:
Iteratively tune the weights whenever the predicted value differs from the training value
A hypothesis consistent with the training data will, ideally, generalize correctly to unseen examples
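The weight-tuning step above can be sketched with the LMS rule for a linear evaluation function; the features, learning rate, and training value below are illustrative, not from an actual checkers program:

```python
# LMS weight update for a linear evaluation function
# V'(b) = w0 + w1*x1 + ... + wn*xn, where the xi are board features.

def v_hat(weights, features):
    """Predicted value: w0 + sum of wi * xi (w0 is the bias weight)."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, eta=0.1):
    """One LMS step: nudge each weight to shrink (v_train - V'(b))."""
    error = v_train - v_hat(weights, features)
    updated = [weights[0] + eta * error]                        # x0 = 1
    updated += [w + eta * error * x for w, x in zip(weights[1:], features)]
    return updated

# Repeated updates on one example drive the prediction toward V_train.
w = [0.0, 0.0, 0.0]                   # bias + two feature weights
for _ in range(200):
    w = lms_update(w, features=[3.0, 1.0], v_train=100.0)
```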
Issues:
What algorithms exist for learning general target functions from specific training examples?
When and how can prior knowledge held by learner guide the generalizing from examples?
How can the learner automatically alter its representation to improve its ability to represent and learn the target function?
Statistical Learning
Statistical Learning is a set of tools for understanding data
Two classes: supervised and unsupervised learning
Supervised learning: predicting or estimating an output based on one or more inputs, guided by observed pairs of inputs and outputs
Unsupervised learning: finding relationships or patterns within the given data, without a supervising output
Suppose that we observe a response Y and p different predictors X = (X₁, X₂, …, Xp).
Y =f(X) + ε
Here f is the function describing the relationship between X and Y, and ε is the random error term.
Why Estimate f?
The exact form of f is not known, so we use statistical methods to construct an estimate f’
Y’ = f’(X)
Reducible Error
Arises from the mismatch between f’ and f; it can, in principle, be reduced by using a better statistical learning method
Irreducible Error
Arises from the fact that X does not completely determine Y: variables outside of X, captured by the error term ε, still have some effect on Y
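Holding f’ and X fixed, the standard decomposition of the expected squared prediction error makes the two pieces explicit (a sketch, writing $\hat{f}$ for f’):

```latex
\mathbb{E}\big[(Y - \hat{Y})^2\big]
  = \mathbb{E}\big[(f(X) + \varepsilon - \hat{f}(X))^2\big]
  = \underbrace{\big[f(X) - \hat{f}(X)\big]^2}_{\text{reducible}}
  + \underbrace{\operatorname{Var}(\varepsilon)}_{\text{irreducible}}
```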
How to Estimate f?
Using training data, apply a statistical learning method to estimate the unknown function f
Parametric Methods
• 1. Make an assumption about functional form of f, such as “f is linear in X”
• 2. Perform procedure that uses training data to train the model
• In case of linear model, this procedure estimates parameters β0, β1, ..., βp
• Most common approach to fit linear model is (ordinary) least squares
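As a sketch of step 2 in the one-predictor linear case, ordinary least squares has a closed form; the data below are made up purely for illustration:

```python
# Parametric sketch: assume f is linear in X and estimate beta0, beta1
# by ordinary least squares (closed form for simple regression).

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 3.0 * x for x in xs]          # noiseless f(X) = 2 + 3X

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Slope = sample covariance of (X, Y) over sample variance of X
beta1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
        / sum((x - x_bar) ** 2 for x in xs)
beta0 = y_bar - beta1 * x_bar
# On this noiseless data the fit recovers beta0 = 2, beta1 = 3 exactly.
```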
Non-Parametric Methods
Do not make assumptions about the form of f.
Have the potential to fit a wider range of possible shapes for f.
The problem of estimating f is not reduced to a set number of parameters.
More observations are needed compared to a parametric approach to estimate f accurately
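One common non-parametric method, k-nearest-neighbours regression, can be sketched in a few lines; it assumes nothing about the form of f and simply averages the nearest training responses (data and k are illustrative):

```python
# Non-parametric sketch: k-nearest-neighbours regression predicts at x0
# by averaging the responses of the k closest training points.

def knn_predict(x0, xs, ys, k=3):
    """Average the responses of the k training points nearest to x0."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 4.0, 9.0, 16.0, 25.0]      # samples of a non-linear f
# knn_predict(3.0, xs, ys, k=3) averages y at x = 2, 3, 4 -> 29/3
```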
Trade-off between Prediction Accuracy and Model Interpretability
If prediction is the goal, more flexible methods may be preferred even at the cost of interpretability
If inference is the goal, simple and inflexible methods are easier to interpret
Supervised learning methods fit a model that captures the relationship between the
predictors and the response measurements.
• The goal is to accurately predict the response variables for future observations
Unsupervised learning takes place when we have a set of observations and a vector of
measurements xi, but no response yi.
Clustering works best when the groups are significantly distinct from each other.
There are some scenarios where only a subset of the observations has response measurements.
Both qualitative and quantitative predictors can be used to predict both types of response variables
Choosing an appropriate statistical learning method is based on the type of the response variable
Regression vs Classification
Regression:
• Attempts to find the best-fit line, which predicts the output more accurately
• Can be further divided into Linear and Non-linear Regression
Classification:
• Tries to find the decision boundary, which divides the dataset into different classes
• Can be further divided into Binary Classifiers and Multi-class Classifiers
Assessing Model Accuracy
• Every data set is different and there is no one statistical learning method that works best for all data sets
The test MSE can be evaluated on previously unseen test observations, and the learning method that produces the smallest test MSE
will be chosen.
There is no guarantee that a model with the lowest training MSE also has the lowest test MSE.
Methods that focus on minimizing the training MSE can still end up with a large test MSE.
• A model with a small training MSE and large test MSE is overfitting the data, picking up patterns on the
training data that don’t exist in the test data.
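Training and test MSE are computed with the same formula, just on different data; a minimal sketch (the numbers are illustrative):

```python
# Mean squared error: the average of the squared prediction errors.
def mse(y_hat, y):
    return sum((p - a) ** 2 for p, a in zip(y_hat, y)) / len(y)

# A model can fit its training data perfectly yet err on test data:
train_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # perfect fit -> 0.0
test_mse  = mse([1.0, 2.0, 3.0], [1.5, 2.5, 2.0])   # (0.25+0.25+1)/3 = 0.5
```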
Bias-Variance Trade-Off
• The expected test MSE can be broken down into the sum of three quantities:
• 1. the variance of f’(x₀)
• 2. the squared bias of f’(x₀)
• 3. the variance of the error terms ε
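Written out for a test point x₀ (a sketch, writing $\hat{f}$ for the estimate f’), the decomposition reads:

```latex
\mathbb{E}\big[(y_0 - \hat{f}(x_0))^2\big]
  = \operatorname{Var}\big(\hat{f}(x_0)\big)
  + \big[\operatorname{Bias}\big(\hat{f}(x_0)\big)\big]^2
  + \operatorname{Var}(\varepsilon)
```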
• To minimize expected test MSE, we need to choose a statistical learning method that achieves both low
variance and low bias.
• Variance refers to how much f’ would change if it were repeatedly estimated using different training data sets.
• Generally, the more flexible a model is, the higher its variance.
• Bias is the error introduced from approximating a complicated problem by a much simpler model.
• Fitting a linear regression to data that is not linear will always lead to high bias, no matter how many
observations are in the training set.
• More flexible methods lead to higher variance and lower bias