Machine Learning


What is machine learning?
And why is everyone talking about it?
Machine learning

Machine learning is a category of algorithms that allows systems to explore data in order to perform a specific task effectively without being explicitly programmed. Instead, it relies on patterns and relationships found in the data.
How is this possible? How does this mysterious process happen? That’s what we’ll see in this talk, but first let’s take a look at machine learning applications and the different types of machine learning.
Types of ML
Supervised learning is where you:
- Have labeled data, which means you already know the corresponding output for each input.
- Use an algorithm to learn the hidden patterns between the inputs and the output.
- Model these hidden patterns with a mapping function.
We’ll define it by saying that supervised learning is based on the concept of teaching a machine learning algorithm how to predict using data that has already been labeled.
REGRESSION and CLASSIFICATION
In regression:
The output variable is a continuous value.
The input variables are typically numeric values.
The mapping function is a mathematical function.
EXAMPLE
To predict the salary of an employee, let’s use a regression algorithm.

Linear Regression
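As an illustration of the salary example, here is a minimal sketch of simple linear regression fitted with the closed-form least-squares formulas. The data (years of experience against salary in thousands) and the function name `fit_simple_linear_regression` are hypothetical and are not taken from the slides.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: the points lie exactly on salary = 30 + 2 * years.
years = [1, 2, 3, 4, 5]
salary = [32, 34, 36, 38, 40]

slope, intercept = fit_simple_linear_regression(years, salary)

# Predict the salary of an employee with 6 years of experience.
predicted_salary = intercept + slope * 6
```

Real salary data would not fall exactly on a line, so the fitted line would only approximate the points.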

Prediction Errors

error = y - y_predicted
Loss/Cost functions
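The slides do not say which loss function they use, so the sketch below assumes two common choices: mean squared error (MSE) and mean absolute error (MAE). Both are built from the prediction errors `y - y_predicted`. The numbers are made up for illustration.

```python
def prediction_errors(y_true, y_pred):
    """Per-sample error: y - y_predicted."""
    return [y - y_hat for y, y_hat in zip(y_true, y_pred)]


def mean_squared_error(y_true, y_pred):
    """Average of the squared errors; large errors are penalized more."""
    errors = prediction_errors(y_true, y_pred)
    return sum(e * e for e in errors) / len(errors)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute errors."""
    errors = prediction_errors(y_true, y_pred)
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical targets and predictions.
y = [3.0, 5.0, 7.0]
y_predicted = [2.0, 5.0, 9.0]

errors = prediction_errors(y, y_predicted)   # [1.0, 0.0, -2.0]
mse = mean_squared_error(y, y_predicted)     # (1 + 0 + 4) / 3
mae = mean_absolute_error(y, y_predicted)    # (1 + 0 + 2) / 3
```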

Multi-Linear Regression Model
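A multi-linear model predicts the output as a bias plus a weighted sum of several input features. The sketch below fits such a model with plain batch gradient descent on the mean squared error. This is one possible fitting method, chosen here as an assumption because it needs no external library; the data, learning rate, and epoch count are also hypothetical.

```python
def predict(weights, bias, features):
    """y_predicted = bias + w1 * x1 + w2 * x2 + ..."""
    return bias + sum(w * x for w, x in zip(weights, features))


def fit_multilinear(X, y, learning_rate=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean squared error."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        # Here the error is y_predicted - y, which gives the gradient its sign.
        errors = [predict(weights, bias, row) - target for row, target in zip(X, y)]
        grad_w = [
            2 / n_samples * sum(e * row[j] for e, row in zip(errors, X))
            for j in range(n_features)
        ]
        grad_b = 2 / n_samples * sum(errors)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical data generated from y = 1 + 2 * x1 + 3 * x2 (no noise).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1, 3, 4, 6, 8, 9]

weights, bias = fit_multilinear(X, y)
```

On this small noise-free dataset the fitted weights should approach 2 and 3 and the bias should approach 1. Features with very different scales would need a smaller learning rate or prior scaling.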

Two possible pitfalls:

Overfitting and underfitting
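To make overfitting and underfitting concrete, the sketch below compares three models on the same hypothetical data: a constant model that ignores the input (too simple), a fitted line, and a model that memorizes the training set by copying the nearest training target (too flexible). The data, the noise pattern, and the model names are all assumptions made for this illustration.

```python
def mse(y_true, y_pred):
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def fit_line(xs, ys):
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical data: y = 2x + 1, with noise of +1 / -1 added to the training targets.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [2, 2, 6, 6, 10, 10]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x + 1 for x in test_x]

mean_train_y = sum(train_y) / len(train_y)
slope, intercept = fit_line(train_x, train_y)


def constant_model(x):
    """Too simple: always predicts the training mean (tends to underfit)."""
    return mean_train_y


def linear_model(x):
    return intercept + slope * x


def memorizing_model(x):
    """Too flexible: copies the target of the nearest training point (tends to overfit)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


results = {}
for name, model in [("constant", constant_model),
                    ("linear", linear_model),
                    ("memorizing", memorizing_model)]:
    train_error = mse(train_y, [model(x) for x in train_x])
    test_error = mse(test_y, [model(x) for x in test_x])
    results[name] = (train_error, test_error)
```

In `results`, the constant model has a high error on both sets, while the memorizing model has zero training error but a worse test error than the line. That gap between training and test error is the usual symptom of overfitting.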

In regression problems, we can use linear, multilinear and polynomial regression.
Graphical plots help us choose the type of regression.
Regressions are sensitive to outliers.
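The sensitivity to outliers can be seen with a small experiment: fit a least-squares line to clean data, then change a single target into an outlier and fit again. The data below is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one input variable."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


xs = [1, 2, 3, 4, 5]
clean_ys = [2, 4, 6, 8, 10]      # exactly y = 2x
outlier_ys = [2, 4, 6, 8, 30]    # same data, but the last target is an outlier

clean_slope, clean_intercept = fit_line(xs, clean_ys)
outlier_slope, outlier_intercept = fit_line(xs, outlier_ys)
```

A single outlier triples the fitted slope here (from 2 to 6), because least squares penalizes large errors quadratically.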
In classification:
The output variable is a categorical variable.
The mapping function is based on probabilities or the measure of similarities between a pair of objects.
Logistic Regression

EXAMPLE
We can use it to predict whether a candidate will be admitted or not (yes or no).

Sigmoid function
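The sigmoid function maps any real number to a value between 0 and 1, which can then be read as a probability. A minimal sketch follows. The two-branch form is a common way to avoid overflow for large negative inputs and is a choice made here, not something stated on the slides.

```python
import math


def sigmoid(z):
    """sigmoid(z) = 1 / (1 + e^(-z)), computed without overflowing."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)          # safe: z is negative, so exp_z is below 1
    return exp_z / (1.0 + exp_z)


middle = sigmoid(0)              # exactly 0.5
very_low = sigmoid(-1000)        # 0.0 after rounding
very_high = sigmoid(1000)        # 1.0 after rounding
```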

Confusion matrix
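A confusion matrix counts how often each actual class was predicted as each class. The sketch below builds one for the admission example. The labels and predictions are hypothetical, and the row/column layout (rows = actual, columns = predicted) is a convention assumed here.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels] for actual in labels]


# Hypothetical admission outcomes and model predictions.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

matrix = confusion_matrix(actual, predicted, labels=["yes", "no"])

# Correct predictions sit on the diagonal of the matrix.
correct = sum(matrix[i][i] for i in range(len(matrix)))
accuracy = correct / len(actual)
```

The same function works for more than two classes by passing a longer `labels` list.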

Logistic regression is used for binary classification problems with two possible outcomes.
Logistic regression applies a sigmoid function to the output of a linear equation to predict a probability between 0 and 1.
To measure model performance, we use a confusion matrix, which applies whether the output has two classes or more.
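Putting these points together, here is a minimal sketch of logistic regression with one input feature, trained by gradient descent. The admission data, the centering of the scores, the learning rate, and the 0.5 decision threshold are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=1000):
    """Gradient descent on the average log loss for a single feature."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        probabilities = [sigmoid(weight * x + bias) for x in xs]
        grad_w = sum((p - y) * x for p, y, x in zip(probabilities, ys, xs)) / n
        grad_b = sum(p - y for p, y in zip(probabilities, ys)) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


# Hypothetical data: exam score and admission decision (1 = yes, 0 = no).
scores = [1, 2, 3, 4, 5, 6]
admitted = [0, 0, 0, 1, 1, 1]

# Center the scores so that gradient descent behaves well.
mean_score = sum(scores) / len(scores)
centered = [s - mean_score for s in scores]
weight, bias = train_logistic_regression(centered, admitted)


def predict_admission(score):
    """Return (probability of admission, predicted class with a 0.5 threshold)."""
    probability = sigmoid(weight * (score - mean_score) + bias)
    return probability, int(probability >= 0.5)


predictions = [predict_admission(s)[1] for s in scores]
```

The linear equation `weight * x + bias` produces a score, and the sigmoid turns that score into a probability. The class comes from comparing the probability with the threshold.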
KNN

~Also known as the lazy learner~


KNN: Non-Parametric
KNN steps
For a given data point x of an unknown class:
Choose the value of K.
Calculate the distance between x and all the data points in the training data.
Pick the K entries in the training data that are closest to the new data point.
Take the majority vote, i.e. the most common class/label among those K entries; that will be the class of the new data point.
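The steps above can be sketched in a few lines. This version assumes Euclidean distance (via `math.dist`, available in Python 3.8+) and uses hypothetical two-dimensional points. Ties in the vote are broken by whichever label `Counter` lists first.

```python
import math
from collections import Counter


def knn_predict(training_points, training_labels, x, k):
    """Classify x by majority vote among its k nearest training points."""
    # Distance between x and every point in the training data.
    distances = [
        (math.dist(point, x), label)
        for point, label in zip(training_points, training_labels)
    ]
    # Keep the k closest entries.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote: the most common label among those k entries.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training data: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

class_near_first_group = knn_predict(points, labels, (2, 2), k=3)
class_near_second_group = knn_predict(points, labels, (6, 5), k=3)
```

There is no training step: all the work happens at prediction time, which is why KNN is called a lazy learner.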
KNN Pros
01 No assumptions about data, unlike the ones imposed by, for example, the linear model.
02 Simple algorithm: easy to understand and interpret.
03 No training needed.
04 Variety of distance functions.
KNN Cons
01 K-NN needs homogeneous features: all features must have the same scale.
02 K-NN might be very easy to implement, but as the dataset grows, the efficiency and/or the speed of the algorithm decline very fast.
03 Outlier sensitivity: the K-NN algorithm is very sensitive to outliers, as it simply chooses the neighbors based on distance criteria.
04 Imbalanced data causes problems: for example, if we have two classes and one of them represents 80% of the dataset, that dominant class will tend to win the majority vote.
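The need for features on the same scale can be shown with a small example. Below, each hypothetical person is described by an age and a salary. Without scaling, the salary dominates the distance. After min-max scaling (one possible scaling method, assumed here), the nearest neighbor of the first person changes.

```python
import math


def min_max_scale(points):
    """Rescale every feature to the range [0, 1]. Assumes no feature is constant."""
    columns = list(zip(*points))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        tuple((value - low) / (high - low) for value, low, high in zip(point, lows, highs))
        for point in points
    ]


def nearest_to_first(points):
    """Index of the point closest to points[0], excluding points[0] itself."""
    return min(range(1, len(points)), key=lambda i: math.dist(points[0], points[i]))


# Hypothetical people: (age in years, yearly salary).
people = [(25, 50000), (60, 50500), (26, 51000), (40, 60000)]

nearest_unscaled = nearest_to_first(people)               # salary dominates
nearest_scaled = nearest_to_first(min_max_scale(people))  # age counts too
```

Unscaled, the 60-year-old (index 1) looks closest to the 25-year-old because their salaries are similar. After scaling, the 26-year-old (index 2) is closest.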

KNN is a non-parametric machine learning algorithm.
The purpose of KNN is to predict the class of a new data point from the classes of its nearest neighbors.
To find the best k value, we can apply the elbow method.
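One way to apply the elbow method is to compute the error rate for several values of k and look for the point where the error stops improving. The sketch below does this with leave-one-out evaluation on hypothetical one-dimensional data containing two mislabeled points. The evaluation scheme and the candidate k values are assumptions, not something the slides specify.

```python
from collections import Counter


def knn_predict_1d(train, x, k):
    """train is a list of (value, label) pairs; vote among the k closest values."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_error(data, k):
    """Fraction of points misclassified when each is predicted from all the others."""
    mistakes = 0
    for i, (value, label) in enumerate(data):
        others = data[:i] + data[i + 1:]
        if knn_predict_1d(others, value, k) != label:
            mistakes += 1
    return mistakes / len(data)


# Hypothetical data: group A near 1-4, group B near 7-10,
# plus two noisy points (2.4 labeled B, 8.4 labeled A).
data = [
    (1.0, "A"), (2.0, "A"), (3.0, "A"), (4.0, "A"), (2.4, "B"),
    (7.0, "B"), (8.0, "B"), (9.0, "B"), (10.0, "B"), (8.4, "A"),
]

error_rates = {k: leave_one_out_error(data, k) for k in (1, 3, 5)}
```

On this data the error drops sharply from k = 1 to k = 3 and then stays flat, so the elbow is at k = 3. On real data the curve is usually plotted and read by eye.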
Thank You
