365 ML Infographic
Algorithms covered: Linear Regression | Logistic Regression | Ridge Regression | Lasso Regression | K-Nearest Neighbors | Decision Trees | Naïve Bayes | Support Vector Machines | Random Forests | XGBoost
Description
- Linear Regression: Supervised learning algorithm that fits a linear equation based on the training data. The equation is used for predictions on new data.
- Logistic Regression: Classification algorithm that predicts the probability of an event occurring using a logistic function. Logistic regression can be transformed into its logit form, where the log of the odds is equal to a linear model. Applied whenever we want to predict categorical outputs.
- Ridge Regression: Regression algorithm that applies regularization to deal with overfitted data. The method uses L2 regularization.
- Lasso Regression: Regression algorithm that applies regularization and feature selection to deal with overfitted data. The method uses L1 regularization.
- K-Nearest Neighbors: An algorithm that classifies a sample based on the category of its K nearest neighbors, where K is an integer.
- Decision Trees: Algorithm that creates a tree-like structure with questions regarding the input posed as tree nodes (e.g., is the input < 2.6?). It is primarily used for classification. The structure is hierarchical and can be easily visualized.
- Naïve Bayes: An algorithm that performs classification according to Bayes' theorem. The model assigns a sample to the class with the largest conditional probability.
- Support Vector Machines: Supervised learning models that construct a maximal-margin hyperplane during training to find the best solution for the data. SVMs can employ different kernels to separate the space and ensure further flexibility.
- Random Forests: Algorithm made up of many decision trees. Its result is determined either by the average of all outputs or by the most popular outcome. Bootstrap aggregating generates different datasets from the original and feeds them to the trees. A subset of features is chosen at random for each tree. The goal is to reduce overfitting and improve accuracy.
- XGBoost: Tree boosting system that is sparsity-aware and performs weighted approximate tree learning. XGBoost is very scalable and includes automatic feature selection.
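All ten model families described above have standard open-source implementations. The snippet below is a minimal sketch, not part of the original graphic, showing how each one can be fitted with scikit-learn and the xgboost package (assuming both are installed); the datasets and hyperparameters are illustrative only.

# Minimal sketch: fitting the ten model families described above.
from sklearn.datasets import load_breast_cancer, load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression, Ridge, Lasso
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Regression example: linear, ridge, and lasso regression on the diabetes data.
X_reg, y_reg = load_diabetes(return_X_y=True)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(X_reg, y_reg, random_state=0)
for reg in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
    reg.fit(Xr_train, yr_train)
    print(type(reg).__name__, "R^2:", round(reg.score(Xr_test, yr_test), 3))

# Classification example: the remaining algorithms on the breast cancer data.
X_clf, y_clf = load_breast_cancer(return_X_y=True)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(X_clf, y_clf, random_state=0)
classifiers = (
    LogisticRegression(max_iter=5000),
    KNeighborsClassifier(n_neighbors=5),
    DecisionTreeClassifier(random_state=0),
    GaussianNB(),
    SVC(kernel="rbf"),
    RandomForestClassifier(n_estimators=200, random_state=0),
    XGBClassifier(n_estimators=200, eval_metric="logloss"),
)
for clf in classifiers:
    clf.fit(Xc_train, yc_train)
    print(type(clf).__name__, "accuracy:", round(clf.score(Xc_test, yc_test), 3))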
Used for
- Linear Regression: Regression
- Logistic Regression: Classification
- Ridge Regression: Regression
- Lasso Regression: Regression
- K-Nearest Neighbors: Regression, Classification
- Decision Trees: Regression, Classification
- Naïve Bayes: Classification
- Support Vector Machines: Regression, Classification
- Random Forests: Regression, Classification
- XGBoost: Regression, Classification
Input
- All ten algorithms: Numerical, Categorical
Handles: small dataset, medium dataset, large dataset, sparse data, high dimensions (rated per algorithm in the original graphic)
Preprocessing
- Linear Regression: Standardizing the inputs; removing irrelevant features
- Logistic Regression: Standardizing the inputs; removing irrelevant features
- Ridge Regression: Standardizing the inputs; removing irrelevant features
- Lasso Regression: Standardizing the inputs
- K-Nearest Neighbors: Standardizing the inputs
- Decision Trees: No preprocessing required
- Naïve Bayes: Tokenizing (for Multinomial and Complement Naïve Bayes); encoding (for Categorical Naïve Bayes)
- Support Vector Machines: Input needs to be rescaled to [-1, 1]
- Random Forests: No preprocessing required
- XGBoost: Data needs to be in a DMatrix
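As a concrete illustration of the preprocessing steps listed above, here is a short sketch assuming scikit-learn and xgboost are installed; the dataset and scaler choices are examples, not prescriptions from the graphic.

# Standardizing, rescaling to [-1, 1], and packing data into a DMatrix.
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)

# Standardizing the inputs (linear/logistic/ridge/lasso regression, KNN):
# zero mean, unit variance per feature.
X_std = StandardScaler().fit_transform(X)

# Rescaling the inputs to [-1, 1] (support vector machines).
X_svm = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)

# Packing the data into a DMatrix (XGBoost's native data structure).
dtrain = xgb.DMatrix(X, label=y)

print(X_std.mean(axis=0).round(2)[:3], X_svm.min(), X_svm.max(), dtrain.num_row())

The DMatrix step applies to XGBoost's native training API; the scikit-learn-style XGBClassifier and XGBRegressor wrappers accept NumPy arrays directly.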
Algorithm speed
- Linear Regression: Training: fast; Testing: fast
- Logistic Regression: Training: fast; Testing: fast
- Ridge Regression: Training: fast; Testing: fast
- Lasso Regression: Training: fast; Testing: fast
- K-Nearest Neighbors: Training: fast; Testing: slow
- Decision Trees: Training: fast; Testing: really fast
- Naïve Bayes: Training: slow; Testing: fast
- Support Vector Machines: Training: slow; Testing: fast
- Random Forests: Training: fast; Testing: fast
- XGBoost: Training: fast; Testing: fast
Pros
- Linear Regression: Intuitive; Easy to interpret; Works very well with linear data; Shows which predictors are important
- Logistic Regression: Intuitive; Easy to interpret; Easy to implement; Does not require a linear relationship between the dependent and independent variable
- Ridge Regression: Prevents overfitting and multicollinearity issues; Lowers variance
- Lasso Regression: Prevents overfitting and multicollinearity issues; Performs in-built feature selection; Lowers variance
- K-Nearest Neighbors: Intuitive; Easy to implement; Fast training; Suitable for non-linear problems
- Decision Trees: Simple to understand and interpret; Learns well from small datasets; Requires little data preprocessing; Fast testing on new data; Suitable for non-linear problems
- Naïve Bayes: Easy to implement; Makes predictions in real time; Handles sparse data; Performs well with large datasets; Irrelevant features do not affect the performance; Compatible with limited-resource platforms
- Support Vector Machines: Fast testing on new data; Flexibility – handles linear and non-linear problems, regression and classification; Overcomes the curse of dimensionality
- Random Forests: Requires little to no data preprocessing; Automatically handles overfitting in most cases; Performs well with large datasets; Gives better results than a decision tree; Suitable for non-linear problems
- XGBoost: Easy to parallelize; Easily scalable; Smart tree penalization; Handles sparse data; Performs well with large datasets; Lots of hyperparameters to control; Suitable for non-linear problems
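One of the pros listed for random forests is that they give better results than a single decision tree. A quick way to check that claim on a toy dataset is sketched below (assuming scikit-learn is installed; the dataset and settings are illustrative only).

# Compare a single decision tree with a random forest via cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=300, random_state=0)

# 5-fold cross-validated accuracy; averaging over many bootstrapped trees
# usually lowers variance and lifts the score relative to one tree.
print("tree  :", cross_val_score(tree, X, y, cv=5).mean().round(3))
print("forest:", cross_val_score(forest, X, y, cv=5).mean().round(3))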
Cons
- Linear Regression: Assumes linearity between the dependent and independent variables, otherwise it performs weakly; Sensitive to outliers
- Logistic Regression: Prone to overfitting when dealing with high-dimensional data (the curse of dimensionality); Requires large sample sizes; Limited to linearly separable problems
- Ridge Regression: Increases bias; Sensitive to irrelevant features; Difficult to interpret the coefficients in the final model – they shrink towards 0
- Lasso Regression: Increases bias; Difficult to interpret the coefficients in the final model – they shrink towards 0
- K-Nearest Neighbors: Sensitive to irrelevant features; Training can take up too much computer memory; Testing can be computationally intensive; Suffers from the curse of dimensionality – the number of data points is comparable to the number of dimensions; Not suitable for extrapolation problems; Not suitable for categorical data
- Decision Trees: Sometimes unstable – small variations in the input data can result in big changes to the tree structure; Greedy learning algorithms are not guaranteed to find the globally optimal solution; Prone to overfitting; Does not solve regression problems well, as it provides a piecewise constant approximation
- Naïve Bayes: Does not consider feature dependencies; Bad estimator; Not suitable for regression tasks; Difficult to optimize, as there are not many hyperparameters
- Support Vector Machines: Slow training, especially for non-linear kernels; Low interpretability; Not suitable for large datasets
- Random Forests: Less interpretable than decision trees – a black-box model; Does not solve regression problems well; Outperformed by gradient-boosted trees
- XGBoost: Lack of interpretability; Difficult to optimize – many different parameters
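The ridge and lasso cons about coefficients shrinking towards 0 can be made concrete with a small sketch (assuming scikit-learn is installed; the dataset and alpha values are illustrative): as the regularization strength grows, ridge shrinks all coefficients smoothly, while lasso sets more and more of them exactly to 0, which is also the mechanism behind its in-built feature selection.

# Shrinkage of ridge and lasso coefficients as alpha increases.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)  # standardize the inputs, as recommended above

for alpha in (0.01, 1.0, 100.0):
    ridge = Ridge(alpha=alpha).fit(X, y)
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X, y)
    print(f"alpha={alpha:>6}: "
          f"ridge |coef| sum={abs(ridge.coef_).sum():7.1f}, "
          f"lasso zero coefs={int((lasso.coef_ == 0).sum())}/{len(lasso.coef_)}")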
Applications
Examples highlighted across the ten algorithms include: forecasting; store sales prediction; house price estimation; medical research; genetic studies; time series prediction; product recommendation; facial recognition; text mining; document categorization; sentiment analysis; spam filtering; gaming; succession planning; water resource planning; expansion of production; variable dependencies; advertising; credit card fraud detection; loan default prediction; credit card default predictions; predicting bankruptcy; recommendation systems; ad click-through rate prediction; developing software packages.
Starter datasets
Toy and real-world datasets suggested across the ten algorithms include: Iris; Diabetes; Digits; Wine; Breast cancer; California Housing; Hitters Baseball Data; Web purchases; Medical insurance costs; Food for All; The Ames Housing Dataset; Census income; MNIST; Student-dropouts; Labeled Faces in the Wild; Olivetti faces; 20 newsgroups; 20 newsgroups vectorized; Mushrooms data.
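Many of the starter datasets named above ship with scikit-learn, so they can be loaded in a couple of lines. A small sketch, assuming scikit-learn is installed (the fetch_* datasets are downloaded on first use):

from sklearn.datasets import fetch_california_housing, load_diabetes, load_iris

X, y = load_iris(return_X_y=True)            # Iris (classification)
print("Iris:", X.shape, "classes:", sorted(set(y)))

X, y = load_diabetes(return_X_y=True)        # Diabetes (regression)
print("Diabetes:", X.shape)

housing = fetch_california_housing()         # California housing (regression)
print("California housing:", housing.data.shape)

# Digits, Wine, Breast cancer, Olivetti faces, 20 newsgroups, 20 newsgroups
# vectorized, and Labeled Faces in the Wild are also available through
# sklearn.datasets via the corresponding load_* / fetch_* functions.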
www.365datascience.com