DP-100 Cheat Sheet: Machine Learning


Machine learning in a nutshell

Machine learning uses algorithms to identify patterns in data. Those patterns are used to create a model that can make predictions.
The Azure for the Data Scientist course (DP-100) focuses on creating and using machine learning models with Azure Machine Learning.
To understand the purpose of the exercises, a simplified overview is provided here.

1. Define the problem


Decide what the model should predict and how success will be measured.

Classification: Predict a categorical value.
Regression: Predict a numerical value.
Time-series forecasting: Predict future numerical values based on time-series data.
Computer Vision: Classify images or detect objects in images.
Natural Language Processing: Extract insights from text.

2. Get the data


Find data sources and get access. Azure Machine Learning connects to the three Azure storage services most commonly used for data science.

Azure Blob Storage: Object cloud storage. Uses a flat namespace to store unstructured data.
Azure Data Lake Gen2: Unlimited object cloud storage. Uses a hierarchical namespace for granular access control.
Azure SQL Database: Relational cloud database. Used for tabular and transactional data.

3. Prepare the data


Explore the data. Clean and transform the data based on the model’s requirements.

Exploratory data analysis (EDA): Analyse your data, get summary statistics, and understand possible correlations between variables.
Feature engineering: Transform the data to create features that will help the model to predict the target value.
Create validation set: Split the data into training and validation or test datasets to evaluate the model.
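
The three preparation steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the temperature and sales figures are made-up sample data, and the helper names (summarize, pearson, standardize, train_validation_split) are hypothetical, not part of Azure Machine Learning.

```python
import math
import random
import statistics


def summarize(values):
    """EDA: basic summary statistics for one numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def pearson(xs, ys):
    """EDA: Pearson correlation between two numeric columns."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def standardize(values):
    """Feature engineering: rescale a column to mean 0 and stdev 1."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [(v - mean) / stdev for v in values]


def train_validation_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then split off a validation set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Made-up sample data: (temperature, ice_cream_sales) pairs.
rows = [(15, 120), (18, 150), (21, 190), (24, 240), (27, 300),
        (30, 360), (16, 130), (22, 200), (26, 280), (29, 340)]

temperatures = [t for t, _ in rows]
sales = [s for _, s in rows]

summary = summarize(temperatures)
correlation = pearson(temperatures, sales)
scaled_temperatures = standardize(temperatures)
train_rows, validation_rows = train_validation_split(rows)
```

The fixed seed makes the split reproducible, which helps when comparing models trained on the same data.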

4. Train the model


Choose an algorithm and hyperparameters through trial and error.

Data: Includes features (what influences the value to be predicted) and the target value (if it exists).
Algorithm: Based on the task (e.g. classification), different algorithms and hyperparameters can be tried.
Model: Often stored as a binary file (e.g. pickle file). Use on new data with the same features to predict the target value.
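
As a rough sketch of how data, algorithm, and model relate, the example below fits a one-feature linear regression by least squares, serializes the result with pickle, and uses the restored model on new data. The algorithm choice, the sample numbers, and the function names (fit_linear, predict) are assumptions made for illustration; a real project would normally use a machine learning library and search over hyperparameters.

```python
import pickle


def fit_linear(features, targets):
    """Algorithm: ordinary least squares for target = slope * x + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, targets))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, features):
    """Apply a trained model to new data with the same feature."""
    return [model["slope"] * x + model["intercept"] for x in features]


# Data: made-up features and target values (here target = 2 * x + 1).
features = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 5.0, 7.0, 9.0]

# Train, then serialize the model to bytes, as would be written to a pickle file.
model = fit_linear(features, targets)
model_bytes = pickle.dumps(model)

# Later: restore the model and predict the target for unseen feature values.
restored_model = pickle.loads(model_bytes)
predictions = predict(restored_model, [5.0, 6.0])
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.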

5. Integrate the model


Use an endpoint to generate predictions.

Real-time predictions: Create a lightweight app to predict the target value in real time for each new data measurement.
Batch predictions: Create a pipeline to predict the target value on a new set of data measurements.
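
The difference between the two integration styles can be sketched as follows. This is not how an Azure Machine Learning endpoint is implemented: the model parameters, the request shape, and the function names (handle_request, run_batch) are hypothetical and only contrast scoring one measurement per call with scoring a whole set at once.

```python
# Hypothetical trained model: target = slope * measurement + intercept.
MODEL = {"slope": 2.0, "intercept": 1.0}


def predict_one(model, measurement):
    """Score a single measurement."""
    return model["slope"] * measurement + model["intercept"]


def handle_request(model, request):
    """Real-time style: one measurement in, one prediction out, immediately."""
    return {"prediction": predict_one(model, request["measurement"])}


def run_batch(model, measurements):
    """Batch style: score a collected set of measurements in one run."""
    return [predict_one(model, m) for m in measurements]


realtime_response = handle_request(MODEL, {"measurement": 5.0})
batch_predictions = run_batch(MODEL, [1.0, 2.0, 3.0])
```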

6. Monitor the model


Track the model’s performance.

Data drift: When new data differs significantly from the training dataset.
Evaluation metrics: Keep track of the model’s performance. When the model’s predictions are increasingly incorrect.
Retrain the model
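
One simple way to turn these monitoring ideas into a check is sketched below. The drift measure (shift of the new mean in units of the training standard deviation), the use of mean absolute error as the evaluation metric, and both thresholds are assumptions chosen for illustration, not Azure Machine Learning defaults.

```python
import statistics


def drift_score(training_values, new_values):
    """Shift of the new mean from the training mean, in training stdevs."""
    train_mean = statistics.mean(training_values)
    train_stdev = statistics.stdev(training_values)
    return abs(statistics.mean(new_values) - train_mean) / train_stdev


def mean_absolute_error(actual, predicted):
    """Evaluation metric: average absolute difference between truth and prediction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def should_retrain(drift, error, drift_threshold=2.0, error_threshold=10.0):
    """Flag retraining when either assumed threshold is exceeded."""
    return drift > drift_threshold or error > error_threshold


# Made-up monitoring data.
training_temperatures = [15, 18, 21, 24, 27]
new_temperatures = [33, 35, 36, 38, 40]
actual_sales = [100, 110, 120]
predicted_sales = [90, 95, 100]

drift = drift_score(training_temperatures, new_temperatures)
error = mean_absolute_error(actual_sales, predicted_sales)
retrain = should_retrain(drift, error)
```

In practice the thresholds would be tuned per feature and per metric rather than fixed up front.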

Learn more on:

Create machine learning models Self-paced Microsoft Learn content on using Azure Machine Learning
Explore visual tools for machine learning on Microsoft Learn Azure Machine Learning documentation

© Copyright Microsoft Corporation. All rights reserved.
