365 ML Infographic


Supervised Learning

Algorithms covered, listed in order of increasing complexity and decreasing interpretability: Linear Regression, Logistic Regression, Ridge Regression, Lasso Regression, K-Nearest Neighbors, Decision Trees, Naïve Bayes, Support Vector Machines, Random Forests, XGBoost.

Description

Linear Regression: Supervised learning algorithm that fits a linear equation based on the training data. The equation is used for predictions on new data.
Logistic Regression: Classification algorithm that predicts the probability of an event occurring using a logistic function. Logistic regression can transform into its logit form, where the log of the odds is equal to a linear model. Applied whenever we want to predict categorical outputs.
Ridge Regression: Regression algorithm that applies regularization to deal with overfitted data. The method uses L2 regularization.
Lasso Regression: Regression algorithm that applies regularization and feature selection to deal with overfitted data. The method uses L1 regularization.
K-Nearest Neighbors: An algorithm that classifies a sample based on the category of its K-nearest neighbors, where K is an integer.
Decision Trees: Algorithm that creates a tree-like structure with questions regarding the input posed as tree nodes (e.g., is the input < 2.6?). It is primarily used for classification. The structure is hierarchical and can be easily visualized.
Naïve Bayes: An algorithm that performs classification according to Bayes' theorem. The model assigns a sample to the class with the largest conditional probability.
Support Vector Machines: Supervised learning models that construct a maximal-margin hyperplane during training to find the best solution for the data. SVMs can employ different kernels to separate the space and ensure further flexibility.
Random Forests: Algorithm made up of many decision trees. Its result is determined either by the average of all outputs or the most popular outcome. Bootstrap aggregating generates different datasets from the original and feeds them to the trees; a subset of features is chosen at random for each tree. The goal is to reduce overfitting and improve accuracy.
XGBoost: Tree boosting system that is sparsity-aware and performs weighted approximate tree learning. XGBoost is very scalable and includes automatic feature selection.
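A minimal sketch of how a few of these models are fitted in practice, assuming scikit-learn and a small synthetic dataset (the variable names below are illustrative and not part of the infographic):

```python
# Sketch only: scikit-learn is assumed to be installed; the data is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                            # numerical features
y_reg = 2.0 * X[:, 0] - X[:, 1] + rng.normal(size=200)   # continuous target
y_clf = (X[:, 0] + X[:, 2] > 0).astype(int)              # binary target

# Linear regression: fits a linear equation to the training data,
# then uses that equation for predictions on new data.
lin = LinearRegression().fit(X, y_reg)
print(lin.predict(X[:3]))

# Logistic regression: predicts the probability of an event occurring.
# In its logit form, log(p / (1 - p)) equals a linear model of the inputs.
log_reg = LogisticRegression().fit(X, y_clf)
print(log_reg.predict_proba(X[:3]))

# Support vector machine: choosing a different kernel changes how the space is separated.
svm = SVC(kernel="rbf").fit(X, y_clf)
print(svm.predict(X[:3]))
```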

Used for

Linear Regression: Regression
Logistic Regression: Classification
Ridge Regression: Regression
Lasso Regression: Regression
K-Nearest Neighbors: Regression, Classification
Decision Trees: Regression, Classification
Naïve Bayes: Classification
Support Vector Machines: Regression, Classification
Random Forests: Regression, Classification
XGBoost: Regression, Classification

Input

All ten algorithms accept both numerical and categorical inputs.
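Because every algorithm in the chart accepts numerical and categorical inputs, a common first step is to encode the categorical features numerically. A small sketch, assuming scikit-learn and pandas (the toy columns below are made up for illustration):

```python
# Sketch only: scikit-learn and pandas are assumed; the columns are invented.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "age": [25, 32, 47, 51],                      # numerical feature
    "income": [30_000, 45_000, 80_000, 62_000],   # numerical feature
    "city": ["Paris", "Lyon", "Paris", "Nice"],   # categorical feature
})

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "income"]),               # scale numerical columns
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),  # encode the categorical column
])
X = preprocess.fit_transform(df)
print(X.shape)  # (4, numerical columns + one-hot columns)
```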

Handles

Each algorithm is rated on whether it handles small, medium, and large datasets, sparse data, and high dimensions; the mix varies by algorithm.

Preprocessing

Linear Regression: Standardizing the inputs; removing irrelevant features
Logistic Regression: Standardizing the inputs; removing irrelevant features
Ridge Regression: Standardizing the inputs
Lasso Regression: Standardizing the inputs
K-Nearest Neighbors: Standardizing the inputs; removing irrelevant features
Decision Trees: No preprocessing required
Naïve Bayes: Tokenizing (for Multinomial and Complement Naïve Bayes); encoding (for Categorical Naïve Bayes)
Support Vector Machines: Input needs to be rescaled to [-1, 1]
Random Forests: No preprocessing required
XGBoost: Data needs to be in a DMatrix
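The preprocessing steps above map onto standard library calls. A sketch assuming scikit-learn and the xgboost package are installed (the tiny arrays are placeholders):

```python
# Sketch only: scikit-learn and xgboost are assumed; the arrays are placeholders.
import numpy as np
import xgboost as xgb
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0], [2.0, 150.0], [3.0, 400.0], [4.0, 320.0]])
y = np.array([0, 1, 0, 1])

X_std = StandardScaler().fit_transform(X)                     # standardizing the inputs
X_svm = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)  # rescaling to [-1, 1] for SVMs

dtrain = xgb.DMatrix(X_std, label=y)                          # XGBoost expects a DMatrix
booster = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=5)
print(booster.predict(xgb.DMatrix(X_std))[:2])
```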

Algorithm speed

Linear Regression: training fast; testing fast
Logistic Regression: training fast; testing fast
Ridge Regression: training fast; testing fast
Lasso Regression: training fast; testing fast
K-Nearest Neighbors: training fast; testing slow
Decision Trees: training fast; testing really fast
Naïve Bayes: training slow; testing fast
Support Vector Machines: training slow; testing fast
Random Forests: training fast; testing fast
XGBoost: training fast; testing fast

Avoiding overfitting

Linear Regression: Regularization
Logistic Regression: Regularization
Ridge Regression: Adjust the penalty term
Lasso Regression: Adjust the penalty term
K-Nearest Neighbors: Increase the number of neighbors
Decision Trees: Perform pruning (during or after training)
Naïve Bayes: Resistant to overfitting
Support Vector Machines: Resistant to overfitting
Random Forests: The construction prevents overfitting; prune the individual trees
XGBoost: Resistant to overfitting
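These overfitting controls correspond to ordinary hyperparameters. A sketch assuming scikit-learn; the specific values are illustrative, not recommendations:

```python
# Sketch only: scikit-learn is assumed; alpha, K, and the pruning settings are illustrative.
from sklearn.datasets import load_diabetes, load_iris
from sklearn.linear_model import Lasso, Ridge
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X_reg, y_reg = load_diabetes(return_X_y=True)
X_clf, y_clf = load_iris(return_X_y=True)

# Ridge / Lasso: adjust the penalty term (alpha) to control how strongly coefficients shrink.
Ridge(alpha=10.0).fit(X_reg, y_reg)
Lasso(alpha=0.1).fit(X_reg, y_reg)

# K-nearest neighbors: increasing the number of neighbors smooths the decision boundary.
KNeighborsClassifier(n_neighbors=15).fit(X_clf, y_clf)

# Decision tree: prune during training (max_depth) or after training (cost-complexity ccp_alpha).
DecisionTreeClassifier(max_depth=3, ccp_alpha=0.01).fit(X_clf, y_clf)
```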

Pros

Linear Regression: Intuitive; easy to interpret; works very well with linear data
Logistic Regression: Intuitive; easy to interpret; easy to implement; does not require a linear relationship between the dependent and independent variable; shows which predictors are important
Ridge Regression: Prevents overfitting and multicollinearity issues; lowers variance
Lasso Regression: Prevents overfitting and multicollinearity issues; lowers variance; performs feature selection
K-Nearest Neighbors: Intuitive; easy to implement; suitable for non-linear problems; fast training
Decision Trees: Simple to understand and interpret; in-built feature selection; requires little data preprocessing; fast testing on new data; suitable for non-linear problems
Naïve Bayes: Easy to implement; learns well from small datasets; makes predictions in real time; performs well with large datasets; handles sparse data; irrelevant features do not affect the performance; compatible with limited-resource platforms
Support Vector Machines: Fast testing on new data; flexibility – handles linear and non-linear problems, both regression and classification; overcomes the curse of dimensionality
Random Forests: Requires little to no data preprocessing; automatically handles overfitting in most cases; performs well with large datasets; gives better results than decision trees; suitable for non-linear problems
XGBoost: Easy to parallelize; easily scalable; smart trees penalizer; handles sparse data; lots of hyperparameters to control; suitable for non-linear problems

Cons

Linear Regression: Assumes linearity between the dependent and independent variables, otherwise it performs weakly; sensitive to outliers
Logistic Regression: Prone to overfitting when dealing with high-dimensional data (the curse of dimensionality); requires large sample sizes; limited to linearly separable problems
Ridge Regression: Increases bias; sensitive to irrelevant features; difficult to interpret the coefficients in the final model – they shrink towards 0
Lasso Regression: Increases bias; difficult to interpret the coefficients in the final model – they shrink towards 0
K-Nearest Neighbors: Sensitive to irrelevant features; training can take up too much computer memory; testing can be computationally intensive; suffers from the curse of dimensionality – the number of data points is comparable to the number of dimensions; not suitable for extrapolation problems; not suitable for categorical data
Decision Trees: Sometimes unstable – small variations in the input data can result in big changes to the tree structure; greedy learning algorithms are not guaranteed to find the globally optimal solution; prone to overfitting; does not solve regression problems well, as it provides a piecewise constant approximation
Naïve Bayes: Does not consider the feature dependencies; not suitable for regression tasks; a bad estimator; difficult to optimize, as there are not many hyperparameters
Support Vector Machines: Slow training, especially for non-linear kernels; low interpretability; not suitable for large datasets
Random Forests: Less interpretable than decision trees – a black-box model; does not solve regression problems well; outperformed by gradient-boosted trees
XGBoost: Lack of interpretability; difficult to optimize – many different parameters

Applications

Linear Regression: Forecasting; medical research studies; trends evaluation; clinical measures testing; variable dependencies estimation
Logistic Regression: Medical research; gaming; applied stress testing and fraud prevention; advertising; predicting bankruptcy; developing software packages; credit card default predictions
Ridge Regression: Genetic studies; water resource prediction; house price predictions
Lasso Regression: Forecasting; time series prediction; text editing
K-Nearest Neighbors: Text mining; facial recognition; outlier detection; credit card fraud detection; loan default predictions; recommendation systems
Decision Trees: Succession planning; expansion of production; medical diagnosis
Naïve Bayes: Spam filtering; document categorization; news articles categorization; sentiment analysis
Support Vector Machines: Time series prediction; facial recognition; data processing for medical diagnosis
Random Forests: Product recommendation; credit card fraud detection; medical diagnosis
XGBoost: Store sales prediction; ad click-through rate prediction; motion detection; product categorization

Starter datasets

Linear Regression: Diabetes Dataset (toy); California Housing, The Ames Housing Price Dataset, Medical insurance costs (real-world)
Logistic Regression: Iris, Digits, Breast cancer (toy); Census income (real-world)
Ridge Regression: Hitters Baseball Data, The Ames Housing Dataset
Lasso Regression: Diabetes (toy); Food for All (real-world)
K-Nearest Neighbors: Iris, Digits (toy); MNIST, Olivetti faces (real-world)
Decision Trees: Iris, Wine (toy); Census income, Student-dropouts (real-world)
Naïve Bayes: Iris (toy); 20 newsgroups, 20 newsgroups vectorized (real-world)
Support Vector Machines: Iris, Digits, Breast cancer (toy); MNIST, Labeled Faces in the Wild (real-world)
Random Forests: Iris, Breast cancer (toy); Olivetti faces, Web purchases (real-world)
XGBoost: Mushrooms data, Student-dropouts, California housing
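Many of the toy and real-world starter datasets named above ship with, or can be fetched through, scikit-learn. A sketch, assuming scikit-learn is installed (the fetch_* calls download data on first use):

```python
# Sketch only: scikit-learn is assumed; fetch_* functions download data on first use.
from sklearn.datasets import (
    fetch_20newsgroups,
    fetch_california_housing,
    load_diabetes,
    load_iris,
)

X_iris, y_iris = load_iris(return_X_y=True)                    # toy classification dataset
X_diab, y_diab = load_diabetes(return_X_y=True)                # toy regression dataset
X_house, y_house = fetch_california_housing(return_X_y=True)   # real-world regression dataset
newsgroups = fetch_20newsgroups(subset="train")                # real-world text dataset
print(X_iris.shape, X_diab.shape, X_house.shape, len(newsgroups.data))
```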


www.365datascience.com
