365 ML Infographic
Algorithms covered: Linear Regression | Logistic Regression | Ridge Regression | Lasso Regression | K-Nearest Neighbors | Decision Trees | Naïve Bayes | Support Vector Machines | Random Forests | XGBoost
Description
- Linear Regression: Supervised learning algorithm that fits a linear equation based on the training data. The equation is used for predictions on new data.
- Logistic Regression: Classification algorithm that predicts the probability of an event occurring using a logistic function. Logistic regression can be transformed into its logit form, where the log of the odds is equal to a linear model. Applied whenever we want to predict categorical outputs.
- Ridge Regression: Regression algorithm that applies regularization to deal with overfitted data. The method uses L2 regularization.
- Lasso Regression: Regression algorithm that applies regularization and feature selection to deal with overfitted data. The method uses L1 regularization.
- K-Nearest Neighbors: An algorithm that classifies a sample based on the category of its K nearest neighbors, where K is an integer.
- Decision Trees: Algorithm that creates a tree-like structure with questions regarding the input posed as tree nodes (e.g., is the input < 2.6?). It is primarily used for classification. The structure is hierarchical and can be easily visualized.
- Naïve Bayes: An algorithm that performs classification according to Bayes' theorem. The model assigns a sample to the class with the largest conditional probability.
- Support Vector Machines: Supervised learning models that construct a maximal-margin hyperplane during training to find the best solution for the data. SVMs can employ different kernels to separate the space and ensure further flexibility.
- Random Forests: Algorithm made up of many decision trees. Its result is determined either by the average of all outputs or by the most popular outcome. Bootstrap aggregating generates different datasets from the original and feeds them to the trees. A subset of features is chosen at random for each tree. The goal is to reduce overfitting and improve accuracy.
- XGBoost: Tree boosting system that is sparsity-aware and performs weighted approximate tree learning. XGBoost is very scalable and includes automatic feature selection.
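All ten model families described above have standard open-source implementations. The snippet below is a minimal sketch, not part of the original graphic, showing how each one can be fitted with scikit-learn and the xgboost package (assuming both are installed); the datasets and hyperparameters are illustrative only.

# Minimal sketch: fitting the ten model families described above.
from sklearn.datasets import load_breast_cancer, load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression, Ridge, Lasso
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Regression example: linear, ridge, and lasso regression on the diabetes data.
X_reg, y_reg = load_diabetes(return_X_y=True)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(X_reg, y_reg, random_state=0)
for reg in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
    reg.fit(Xr_train, yr_train)
    print(type(reg).__name__, "R^2:", round(reg.score(Xr_test, yr_test), 3))

# Classification example: the remaining algorithms on the breast cancer data.
X_clf, y_clf = load_breast_cancer(return_X_y=True)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(X_clf, y_clf, random_state=0)
classifiers = (
    LogisticRegression(max_iter=5000),
    KNeighborsClassifier(n_neighbors=5),
    DecisionTreeClassifier(random_state=0),
    GaussianNB(),
    SVC(kernel="rbf"),
    RandomForestClassifier(n_estimators=200, random_state=0),
    XGBClassifier(n_estimators=200, eval_metric="logloss"),
)
for clf in classifiers:
    clf.fit(Xc_train, yc_train)
    print(type(clf).__name__, "accuracy:", round(clf.score(Xc_test, yc_test), 3))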
Used for
- Linear Regression: Regression
- Logistic Regression: Classification
- Ridge Regression: Regression
- Lasso Regression: Regression
- K-Nearest Neighbors: Regression, Classification
- Decision Trees: Regression, Classification
- Naïve Bayes: Classification
- Support Vector Machines: Regression, Classification
- Random Forests: Regression, Classification
- XGBoost: Regression, Classification
Input
- All ten algorithms: Numerical, Categorical
Handles: small dataset, medium dataset, large dataset, sparse data, high dimensions (rated per algorithm in the original graphic)
Preprocessing
- Linear Regression: Standardizing the inputs; removing irrelevant features
- Logistic Regression: Standardizing the inputs; removing irrelevant features
- Ridge Regression: Standardizing the inputs; removing irrelevant features
- Lasso Regression: Standardizing the inputs
- K-Nearest Neighbors: Standardizing the inputs
- Decision Trees: No preprocessing required
- Naïve Bayes: Tokenizing (for Multinomial and Complement Naïve Bayes); encoding (for Categorical Naïve Bayes)
- Support Vector Machines: Input needs to be rescaled to [-1, 1]
- Random Forests: No preprocessing required
- XGBoost: Data needs to be in a DMatrix
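As a concrete illustration of the preprocessing steps listed above, here is a short sketch assuming scikit-learn and xgboost are installed; the dataset and scaler choices are examples, not prescriptions from the graphic.

# Standardizing, rescaling to [-1, 1], and packing data into a DMatrix.
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)

# Standardizing the inputs (linear/logistic/ridge/lasso regression, KNN):
# zero mean, unit variance per feature.
X_std = StandardScaler().fit_transform(X)

# Rescaling the inputs to [-1, 1] (support vector machines).
X_svm = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)

# Packing the data into a DMatrix (XGBoost's native data structure).
dtrain = xgb.DMatrix(X, label=y)

print(X_std.mean(axis=0).round(2)[:3], X_svm.min(), X_svm.max(), dtrain.num_row())

The DMatrix step applies to XGBoost's native training API; the scikit-learn-style XGBClassifier and XGBRegressor wrappers accept NumPy arrays directly.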
Algorithm speed
- Linear Regression: Training: fast; Testing: fast
- Logistic Regression: Training: fast; Testing: fast
- Ridge Regression: Training: fast; Testing: fast
- Lasso Regression: Training: fast; Testing: fast
- K-Nearest Neighbors: Training: fast; Testing: slow
- Decision Trees: Training: fast; Testing: really fast
- Naïve Bayes: Training: slow; Testing: fast
- Support Vector Machines: Training: slow; Testing: fast
- Random Forests: Training: fast; Testing: fast
- XGBoost: Training: fast; Testing: fast
Pros
- Linear Regression: Intuitive; Easy to interpret; Works very well with linear data; Shows which predictors are important
- Logistic Regression: Intuitive; Easy to interpret; Easy to implement; Does not require a linear relationship between the dependent and independent variable
- Ridge Regression: Prevents overfitting and multicollinearity issues; Lowers variance
- Lasso Regression: Prevents overfitting and multicollinearity issues; Performs in-built feature selection; Lowers variance
- K-Nearest Neighbors: Intuitive; Easy to implement; Fast training; Suitable for non-linear problems
- Decision Trees: Simple to understand and interpret; Learns well from small datasets; Requires little data preprocessing; Fast testing on new data; Suitable for non-linear problems
- Naïve Bayes: Easy to implement; Makes predictions in real time; Handles sparse data; Performs well with large datasets; Irrelevant features do not affect the performance; Compatible with limited-resource platforms
- Support Vector Machines: Fast testing on new data; Flexibility – handles linear and non-linear problems, regression and classification; Overcomes the curse of dimensionality
- Random Forests: Requires little to no data preprocessing; Automatically handles overfitting in most cases; Performs well with large datasets; Gives better results than a decision tree; Suitable for non-linear problems
- XGBoost: Easy to parallelize; Easily scalable; Smart tree penalization; Handles sparse data; Performs well with large datasets; Lots of hyperparameters to control; Suitable for non-linear problems
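One of the pros listed for random forests is that they give better results than a single decision tree. A quick way to check that claim on a toy dataset is sketched below (assuming scikit-learn is installed; the dataset and settings are illustrative only).

# Compare a single decision tree with a random forest via cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=300, random_state=0)

# 5-fold cross-validated accuracy; averaging over many bootstrapped trees
# usually lowers variance and lifts the score relative to one tree.
print("tree  :", cross_val_score(tree, X, y, cv=5).mean().round(3))
print("forest:", cross_val_score(forest, X, y, cv=5).mean().round(3))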
Cons
- Linear Regression: Assumes linearity between the dependent and independent variables, otherwise it performs weakly; Sensitive to outliers
- Logistic Regression: Prone to overfitting when dealing with high-dimensional data (the curse of dimensionality); Requires large sample sizes; Limited to linearly separable problems
- Ridge Regression: Increases bias; Sensitive to irrelevant features; Difficult to interpret the coefficients in the final model – they shrink towards 0
- Lasso Regression: Increases bias; Difficult to interpret the coefficients in the final model – they shrink towards 0
- K-Nearest Neighbors: Sensitive to irrelevant features; Training can take up too much computer memory; Testing can be computationally intensive; Suffers from the curse of dimensionality – the number of data points is comparable to the number of dimensions; Not suitable for extrapolation problems; Not suitable for categorical data
- Decision Trees: Sometimes unstable – small variations in the input data can result in big changes to the tree structure; Greedy learning algorithms are not guaranteed to find the globally optimal solution; Prone to overfitting; Does not solve regression problems well, as it provides a piecewise constant approximation
- Naïve Bayes: Does not consider feature dependencies; Bad estimator; Not suitable for regression tasks; Difficult to optimize, as there are not many hyperparameters
- Support Vector Machines: Slow training, especially for non-linear kernels; Low interpretability; Not suitable for large datasets
- Random Forests: Less interpretable than decision trees – a black-box model; Does not solve regression problems well; Outperformed by gradient-boosted trees
- XGBoost: Lack of interpretability; Difficult to optimize – many different parameters
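The ridge and lasso cons about coefficients shrinking towards 0 can be made concrete with a small sketch (assuming scikit-learn is installed; the dataset and alpha values are illustrative): as the regularization strength grows, ridge shrinks all coefficients smoothly, while lasso sets more and more of them exactly to 0, which is also the mechanism behind its in-built feature selection.

# Shrinkage of ridge and lasso coefficients as alpha increases.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)  # standardize the inputs, as recommended above

for alpha in (0.01, 1.0, 100.0):
    ridge = Ridge(alpha=alpha).fit(X, y)
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X, y)
    print(f"alpha={alpha:>6}: "
          f"ridge |coef| sum={abs(ridge.coef_).sum():7.1f}, "
          f"lasso zero coefs={int((lasso.coef_ == 0).sum())}/{len(lasso.coef_)}")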
Applications
Examples highlighted across the ten algorithms include: forecasting; store sales prediction; house price estimation; medical research; genetic studies; time series prediction; product recommendation; facial recognition; text mining; document categorization; sentiment analysis; spam filtering; gaming; succession planning; water resource planning; expansion of production; variable dependencies; advertising; credit card fraud detection; loan default prediction; credit card default predictions; predicting bankruptcy; recommendation systems; ad click-through rate prediction; developing software packages.
Starter datasets
Toy and real-world datasets suggested across the ten algorithms include: Iris; Diabetes; Digits; Wine; Breast cancer; California Housing; Hitters Baseball Data; Web purchases; Medical insurance costs; Food for All; The Ames Housing Dataset; Census income; MNIST; Student-dropouts; Labeled Faces in the Wild; Olivetti faces; 20 newsgroups; 20 newsgroups vectorized; Mushrooms data.
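Many of the starter datasets named above ship with scikit-learn, so they can be loaded in a couple of lines. A small sketch, assuming scikit-learn is installed (the fetch_* datasets are downloaded on first use):

from sklearn.datasets import fetch_california_housing, load_diabetes, load_iris

X, y = load_iris(return_X_y=True)            # Iris (classification)
print("Iris:", X.shape, "classes:", sorted(set(y)))

X, y = load_diabetes(return_X_y=True)        # Diabetes (regression)
print("Diabetes:", X.shape)

housing = fetch_california_housing()         # California housing (regression)
print("California housing:", housing.data.shape)

# Digits, Wine, Breast cancer, Olivetti faces, 20 newsgroups, 20 newsgroups
# vectorized, and Labeled Faces in the Wild are also available through
# sklearn.datasets via the corresponding load_* / fetch_* functions.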
www.365datascience.com