Noida Institute of Engineering and Technology


NOIDA INSTITUTE OF ENGINEERING AND TECHNOLOGY

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING


SEMINAR
ON
MACHINE LEARNING

NAME: LALIT MOHAN SRIVASTAV


SECTION: CSE-VIIIB
ROLL NO: 1613310107
CONTENTS
 Why Machine learning?
 Defining Machine Learning
 Traditional Programming vs Machine Learning
 Application of Machine Learning
 Process of Learning
 Machine Learning Algorithm
 Decision Learning
 Filter Spam example
 Measuring Algorithm performance
 Software
 Conclusion
 References
Why Machine Learning?

 Develop systems that can automatically adapt and customize themselves to individual users.
 Discover new knowledge from large databases (data mining).
 Mimic human abilities and take over certain monotonous tasks.
 Develop systems that are too difficult or expensive to construct manually.
Defining Machine Learning

 Machine learning is a method of data analysis that automates analytical model building.
 Machine learning (ML) is the study of computer algorithms that improve automatically through experience.
 Machine learning algorithms build a mathematical model based on sample data, known as "training data".
 Machine learning is closely related to computational statistics.
 Python is well suited to a variety of machine learning tasks.
Traditional Programming vs Machine Learning
Traditional Programming

Data + Program → Computer → Output

Machine Learning

Data + Output → Computer → Program
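The contrast above can be sketched in Python. This is only an illustration, not a real learning system: the rule y = 2x + 1, the function names and the sample data are all assumptions made up for the example.

```python
# Traditional programming: a human writes the program (the rule).
def handwritten_program(x):
    return 2 * x + 1  # hypothetical rule chosen by the programmer


# Machine learning: data and desired outputs go in, a program comes out.
def learn_program(xs, ys):
    """Fit y = a*x + b by ordinary least squares and return it as a function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    a = covariance / variance
    b = mean_y - a * mean_x
    return lambda x: a * x + b


data = [0, 1, 2, 3, 4]
output = [handwritten_program(x) for x in data]  # data + program -> output
learned_program = learn_program(data, output)    # data + output -> program
print(learned_program(10))  # 21.0
```

The learned function was never told the rule; it recovered it from the examples alone.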
Application of Machine Learning
Process of Learning

 Gathering Data
 Data preparation
 Choosing a model
 Training
 Evaluation
 Parameter Tuning
 Prediction
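The seven steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions: the toy dataset (hours studied, hours slept, exam passed) is made up, and k-nearest neighbours is just one possible model choice, not the one the slides prescribe.

```python
from collections import Counter

# 1. Gathering data: ((hours_studied, hours_slept), passed) with made-up values.
data = [
    ((1.0, 4.0), 0), ((2.0, 5.0), 0), ((3.0, 4.5), 0),
    ((6.0, 7.0), 1), ((7.0, 6.5), 1), ((8.0, 8.0), 1),
    ((2.5, 6.0), 0), ((6.5, 7.5), 1),
    ((1.5, 5.5), 0), ((7.5, 7.0), 1),
]

# 2. Data preparation: split into training, validation and test sets.
train, validation, test = data[:6], data[6:8], data[8:]


# 3. Choosing a model: k-nearest neighbours (majority vote of the k closest points).
def predict(train_set, k, point):
    def squared_distance(example):
        return sum((a - b) ** 2 for a, b in zip(example[0], point))

    nearest = sorted(train_set, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# 4. Training: k-NN simply stores the training examples, so nothing to fit here.

# 5. Evaluation: fraction of examples predicted correctly.
def accuracy(train_set, k, examples):
    hits = sum(predict(train_set, k, x) == y for x, y in examples)
    return hits / len(examples)


# 6. Parameter tuning: keep the k that scores best on the validation set.
best_k = max([1, 3, 5], key=lambda k: accuracy(train, k, validation))
test_accuracy = accuracy(train, best_k, test)

# 7. Prediction on a new, unseen example.
new_prediction = predict(train, best_k, (7.0, 7.0))
print(best_k, test_accuracy, new_prediction)
```

Tuning on a separate validation set keeps the test set untouched, so the final accuracy is measured on data the model has never influenced.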
Contd…

 Learning = improving with experience at some task:
 Improve over task T,
 with respect to performance measure P,
 based on experience E.
Machine Learning Algorithm
 Supervised Learning
   Regression: predicting numbers
   Classification: predicting classes
 Unsupervised Learning
   Clustering: finding groups
   Dimensionality Reduction: finding efficient representations
 Semi-supervised Learning
 Reinforcement Learning
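As an illustration of the unsupervised case ("finding groups"), here is a minimal sketch of k-means clustering in one dimension. The height values and the initial centres are made-up assumptions; no labels are given to the algorithm.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm in one dimension; `centers` are the initial guesses."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the group of its nearest centre.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each centre to the mean of its group
        # (an empty group keeps its previous centre).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


heights_cm = [150, 152, 155, 180, 183, 185]  # unlabeled, made-up values
centers, groups = kmeans_1d(heights_cm, centers=[140.0, 200.0])
print(groups)  # [[150, 152, 155], [180, 183, 185]]
```

A supervised method would need a label for every height; here the two groups emerge from the data alone.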
Decision Learning

 Decision tree learning is one of the predictive modeling approaches used in machine learning.
 A decision tree can be used to visually and explicitly represent decisions and decision making.
 It uses a decision tree to go from observations about an item to conclusions about the item's target value.
 A decision tree is drawn upside down, with its root at the top.
Contd…
In the image on the left, the bold text in black represents a condition (internal node), based on which the tree splits into branches (edges). The end of a branch that doesn't split anymore is the decision (leaf), in this case whether the passenger died or survived, represented as red and green text respectively.
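The structure described above (conditions at internal nodes, decisions at leaves) can be sketched as a nested dictionary. The image itself is not reproduced here, so the features (sex, age, sibsp) and the thresholds below are hypothetical, chosen only to illustrate the idea.

```python
# Internal nodes hold a condition; leaves hold the decision.
# Features and thresholds are assumptions, not taken from the slide.
tree = {
    "condition": lambda p: p["sex"] == "male",
    "yes": {
        "condition": lambda p: p["age"] > 9.5,
        "yes": "died",
        "no": {
            "condition": lambda p: p["sibsp"] > 2.5,
            "yes": "died",
            "no": "survived",
        },
    },
    "no": "survived",
}


def decide(node, passenger):
    """Walk from the root to a leaf, following the branch each condition selects."""
    while isinstance(node, dict):
        branch = "yes" if node["condition"](passenger) else "no"
        node = node[branch]
    return node


print(decide(tree, {"sex": "female", "age": 30, "sibsp": 0}))  # survived
print(decide(tree, {"sex": "male", "age": 40, "sibsp": 0}))    # died
print(decide(tree, {"sex": "male", "age": 5, "sibsp": 1}))     # survived
```

Decision tree learning builds such a tree automatically from labelled data; here the tree is written by hand to show only how a prediction is read off it.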
Filter Spam Example
Example: Spam Filtering
Spam refers to unsolicited commercial email or unsolicited bulk email.
T: identify spam emails
P: % of spam emails that were filtered, and % of non-spam emails that were incorrectly filtered out
E: a database of emails that were labelled by users
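The task, performance measure and experience of the spam example can be made concrete with a small sketch. Everything here is an assumption for illustration: the emails are made up, and a fixed keyword rule stands in for a filter that would normally be learned from E.

```python
# E: emails labelled by users (True = spam). Made-up examples.
labelled_emails = [
    ("win a free prize now", True),
    ("cheap loans, click here", True),
    ("limited offer just for you", True),
    ("meeting moved to 3 pm", False),
    ("free lunch in the canteen today", False),
    ("project report attached", False),
]

# T: identify spam. A hypothetical keyword rule stands in for a learned filter.
SPAM_WORDS = {"free", "cheap", "prize", "click"}


def is_spam(text):
    words = (word.strip(",.") for word in text.lower().split())
    return any(word in SPAM_WORDS for word in words)


# P: share of spam that was filtered, and share of non-spam wrongly filtered out.
spam = [text for text, label in labelled_emails if label]
non_spam = [text for text, label in labelled_emails if not label]
spam_filtered = 100 * sum(is_spam(t) for t in spam) / len(spam)
non_spam_filtered = 100 * sum(is_spam(t) for t in non_spam) / len(non_spam)

print(f"{spam_filtered:.1f}% of spam emails filtered")
print(f"{non_spam_filtered:.1f}% of non-spam emails incorrectly filtered out")
```

A good filter pushes the first number up while keeping the second one down; looking at only one of them hides the trade-off.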
The Learning Process for our example
Measuring Algorithm performance
There are various metrics for measuring the performance of machine learning algorithms.
1. Confusion Matrix:
1. A confusion matrix is a table with two dimensions, "Actual" and "Predicted".
2. Its cells hold the counts of "True Positives (TP)", "True Negatives (TN)", "False Positives (FP)" and "False Negatives (FN)":

                Actual 1    Actual 0
Predicted 1     TP          FP
Predicted 0     FN          TN

Terms associated with confusion matrix
 True Positives (TP) − the actual class and the predicted class of the data point are both 1.
 True Negatives (TN) − the actual class and the predicted class of the data point are both 0.
 False Positives (FP) − the actual class of the data point is 0 and the predicted class is 1.
 False Negatives (FN) − the actual class of the data point is 1 and the predicted class is 0.
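The four terms above can be counted directly from a list of actual and predicted labels. A minimal sketch, assuming binary labels with 1 as the positive class; the label lists are made up.

```python
def confusion_counts(actual, predicted):
    """Return (TP, TN, FP, FN) for binary labels where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return tp, tn, fp, fn


actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]     # made-up true labels
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # made-up model output
tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)  # 3 5 1 1
```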
2. Classification Accuracy:
1. It may be defined as the number of correct predictions made as a ratio of all predictions made.
2. We can easily calculate it from the confusion matrix with the help of the following formula −
Accuracy = (TP+TN)/(TP+FP+FN+TN)
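As a quick worked example of accuracy as the ratio of correct predictions to all predictions, the sketch below uses hypothetical counts (3 TP, 5 TN, 1 FP, 1 FN).

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions divided by all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts: 3 TP, 5 TN, 1 FP, 1 FN.
print(accuracy(tp=3, tn=5, fp=1, fn=1))  # 0.8

# On imbalanced data accuracy can mislead: a model that never predicts
# the positive class still scores highly here.
print(accuracy(tp=0, tn=95, fp=0, fn=5))  # 0.95
```

The second call is why accuracy is usually read together with the other metrics in this section.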

3. Classification Report:
This report consists of the scores of Precision, Recall, F1 and Support. They are explained as follows −
1. Precision, used in document retrieval, may be defined as the fraction of the documents returned by our ML model that are correct.
We can easily calculate it from the confusion matrix with the help of the following formula −
Precision = TP/(TP+FP)
2. Recall or sensitivity may be defined as the fraction of actual positives returned by our ML model.
We can easily calculate it from the confusion matrix with the help of the following formula −
Recall = TP/(TP+FN)
3. Specificity, in contrast to recall, may be defined as the fraction of actual negatives returned by our ML model.
We can easily calculate it from the confusion matrix with the help of the following formula −
Specificity = TN/(TN+FP)
4. F1 Score gives us the harmonic mean of precision and recall.
Mathematically, it weights precision and recall equally; its best value is 1 and its worst is 0.
F1 = 2∗(precision∗recall)/(precision+recall)
5. Support is the number of actual samples of each class.
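The scores of the classification report can be computed from the four confusion matrix counts. A minimal sketch: the function name `report_scores` and the counts are hypothetical, and the code assumes none of the denominators is zero.

```python
def report_scores(tp, tn, fp, fn):
    """Precision, recall, specificity and F1 from confusion matrix counts.

    Assumes every denominator is non-zero (no guard against empty classes).
    """
    precision = tp / (tp + fp)    # correct among everything predicted positive
    recall = tp / (tp + fn)       # found among everything actually positive
    specificity = tn / (tn + fp)  # found among everything actually negative
    f1 = 2 * (precision * recall) / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts: 3 TP, 5 TN, 1 FP, 1 FN.
scores = report_scores(tp=3, tn=5, fp=1, fn=1)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```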
4. AUC (Area Under ROC Curve) is a performance metric, based on varying threshold values, for classification problems.
ROC is a probability curve and AUC measures separability, that is, how well the model distinguishes between the classes.
The ROC curve is created by plotting TPR (True Positive Rate), i.e. sensitivity or recall, against FPR (False Positive Rate), i.e. 1 − specificity, at each threshold.
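The construction of the ROC curve and its area can be sketched in plain Python. This is a minimal sketch under assumptions: the labels and scores are made up, both classes must be present, and the area is taken with the trapezoidal rule.

```python
def roc_points(actual, scores):
    """(FPR, TPR) pairs obtained by sweeping the threshold from high to low."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        predicted = [1 if s >= threshold else 0 for s in scores]
        tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
        fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


actual = [1, 1, 0, 1, 0, 0]              # made-up true labels
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # made-up predicted probabilities
area = auc(roc_points(actual, scores))
print(round(area, 3))  # 0.889
```

An area of 1 means the scores separate the two classes perfectly, while 0.5 is what random scores would give.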
5. LOGLOSS (Logarithmic Loss) is also called logistic regression loss or cross-entropy loss.
It is defined on probability estimates and measures the performance of a classification model whose prediction is a probability value between 0 and 1.
It can be understood more clearly by contrasting it with accuracy: accuracy counts the predictions that match the actual label, whereas log loss measures how far each predicted probability is from the actual label.
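The difference between log loss and accuracy shows up when two models make the same decisions with different confidence. A minimal sketch: both sets of probabilities are made up, and the code assumes every probability lies strictly between 0 and 1.

```python
import math


def log_loss(actual, probabilities):
    """Mean negative log-likelihood of the true labels (binary cross-entropy).

    Assumes each probability is strictly between 0 and 1.
    """
    total = 0.0
    for y, p in zip(actual, probabilities):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(actual)


def accuracy(actual, probabilities, threshold=0.5):
    """Share of labels matched after thresholding the probabilities."""
    hits = sum((p >= threshold) == (y == 1) for y, p in zip(actual, probabilities))
    return hits / len(actual)


actual = [1, 0, 1, 0]              # made-up true labels
confident = [0.9, 0.1, 0.8, 0.2]   # hypothetical model A
hesitant = [0.6, 0.4, 0.55, 0.45]  # hypothetical model B

print(accuracy(actual, confident), accuracy(actual, hesitant))
print(round(log_loss(actual, confident), 3), round(log_loss(actual, hesitant), 3))
```

Both models classify every example correctly, so accuracy cannot tell them apart; log loss is lower for the model whose probabilities sit closer to the true labels.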
Software Suites

 Python scikit-learn
 MATLAB, mlpy
 Oracle Data Mining
 SAS Enterprise Miner
 STATISTICA Data Miner
 Orange
 Ayasdi
 IBM SPSS Modeler
 Apache Mahout
Conclusion

 Machine learning is a quickly growing field in computer science.
 It has applications in nearly every other field of study.
 It is already being implemented commercially because machine learning can solve problems too difficult or time consuming for humans to solve.
 To describe machine learning in general terms, a variety of models are used to learn patterns in data and make accurate predictions based on the patterns they observe.
References

 https://www.geeksforgeeks.org/machine-learning/
 https://en.wikipedia.org/wiki/Machine_learning
 https://www.tutorialspoint.com/machine_learning_with_python/machine_learning_algorithms_performance_metrics.htm
 https://www.researchgate.net/publication/2940012_Machine_Learning_Techniques_in_Spam_Filtering
 https://www.analyticsvidhya.com/blog/2017/09/machine-learning
THANK YOU
