DTEXP5

University Institute of Engineering
Department of Computer Science & Engineering
Experiment: 2.1
Student Name: UID:

Branch: Computer Science & Engineering Section/Group:
Semester: 1 Date of Performance: 22.12.2022
Subject Name: Disruptive Technology
Subject Code: 22ECH -102
1. Aim of the practical:

To develop a prediction model based on linear/logistic regression.
2. Tool Used:
The PyCaret library and Google Colab.
3. Basic Concept/ Command Description:

In supervised learning, we are given a labeled set of input-output pairs
$D = \{(x_i, y_i)\}_{i=1}^{N}$
where D is called the training set, and N is the number of training examples.
In the simplest setting, each training input $x_i$ is a D-dimensional vector of numbers, representing, say, the
height and weight of a person. These are called features, attributes or covariates. In general, however, $x_i$
could be a complex structured object, such as an image, a sentence, an email message, a time series, a
molecular shape, a graph, etc.
Similarly, the form of the output or response variable can in principle be anything, but most methods
assume that $y_i$ is a categorical or nominal variable from some finite set, $y_i \in \{1, \dots, C\}$ (such as male or
female), or that $y_i$ is a real-valued scalar (such as income level). When $y_i$ is categorical, the problem is
known as classification or pattern recognition, and when $y_i$ is real-valued, the problem is known as
regression. Another variant, known as ordinal regression, occurs where the label space Y has some natural
ordering, such as grades A–F.
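The distinction between a real-valued target (regression) and a categorical target (classification) can be sketched on synthetic data. The example below is only an illustration under stated assumptions: the two features, the weights `true_w`, the bias `true_b` and the noise level are all made up, and the fit uses ordinary least squares from NumPy rather than PyCaret.

```python
import numpy as np

# Synthetic training set: N examples, each a 2-dimensional feature vector.
# All values here are hypothetical and chosen only for illustration.
rng = np.random.default_rng(0)
N = 200
X = rng.normal(size=(N, 2))

# Regression target: a real-valued scalar built from assumed weights plus small noise.
true_w = np.array([3.0, -2.0])
true_b = 5.0
y_real = X @ true_w + true_b + rng.normal(scale=0.01, size=N)

# Classification target: a categorical label from the finite set {0, 1}.
y_class = (y_real > true_b).astype(int)

# Fit linear regression by ordinary least squares.
# A column of ones is appended so that the bias is estimated as well.
X_aug = np.hstack([X, np.ones((N, 1))])
coef, *_ = np.linalg.lstsq(X_aug, y_real, rcond=None)
w_hat, b_hat = coef[:2], coef[2]

print("estimated weights:", w_hat, "estimated bias:", b_hat)
print("class labels present:", np.unique(y_class))
```

With low noise, the estimated weights should land close to the assumed ones. The same inputs could be used for classification by training on `y_class` instead of `y_real`.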
The second main type of machine learning is the descriptive or unsupervised learning approach. Here we
are only given inputs, $D = \{x_i\}_{i=1}^{N}$, and the goal is to find “interesting patterns” in the data. This is
sometimes called knowledge discovery. This is a much less well-defined problem, since we are not told
what kinds of patterns to look for, and there is no obvious error metric to use.
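One common way to look for patterns in unlabeled inputs is clustering. The sketch below uses a plain k-means loop in NumPy on two made-up groups of points. The data, the helper name `k_means` and the choice of k-means itself are assumptions for illustration; the worksheet does not prescribe a particular unsupervised method.

```python
import numpy as np

rng = np.random.default_rng(0)
# Unlabeled inputs: two hypothetical, well-separated groups of 2-D points.
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(100, 2)),
])
rng.shuffle(X)  # shuffle the rows so the ordering carries no group information

def k_means(X, k, n_iter=20, seed=0):
    """Cluster the rows of X into k groups with a basic k-means loop."""
    init_rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[init_rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

centroids, labels = k_means(X, k=2)
print("centroids:\n", centroids)
print("cluster sizes:", np.bincount(labels))
```

No labels are supplied at any point. The grouping comes only from distances between the inputs, which is why there is no obvious error metric to check it against.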
There is a third type of machine learning, known as reinforcement learning, which is somewhat less
commonly used. This is useful for learning how to act or behave when given occasional reward or
punishment signals.
4. Code:
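!pip install pycaret &> /dev/null
print("PyCaret installed successfully!")
# Import the regression functions (setup, compare_models, create_model, ...)
from pycaret.regression import *
from pycaret.datasets import get_data
# List the datasets available in the PyCaret repository
dataSets = get_data('index')
# Load the Boston housing dataset as a pandas DataFrame
bostonDataSet = get_data("boston")
print(type(bostonDataSet))
# Set up the experiment with 'medv' as the target, then compare all models
# Note: silent=True is a PyCaret 2.x argument; newer releases may reject it
s = setup(data = bostonDataSet, target = 'medv', silent = True)
cm = compare_models()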
# Model performance using data normalization
s = setup(data = bostonDataSet, target = 'medv', normalize = True,
          normalize_method = 'zscore', silent = True)
# Model performance using feature selection
s = setup(data = bostonDataSet, target = 'medv', feature_selection = True,
          feature_selection_threshold = 0.9, silent = True)
# Model performance using outlier removal
s = setup(data = bostonDataSet, target = 'medv', remove_outliers = True,
          outliers_threshold = 0.05, silent = True)
# Model performance using PCA
s = setup(data = bostonDataSet, target = 'medv', pca = True, pca_method = 'linear', silent = True)
cm = compare_models()
# Create a random forest (RF) model
rfModel = create_model('rf')
# Save the trained model
sm = save_model(rfModel, 'rfModelFile')
# Load the model
rfModel = load_model('rfModelFile')
# Select the top 10 rows of the Boston dataset as the new dataset
newDataSet = get_data("boston").iloc[:10]
# Make predictions on the new dataset
newPredictions = predict_model(rfModel, data = newDataSet)
newPredictions
# Scatter plot of actual vs. predicted values
import matplotlib.pyplot as plt
predicted = newPredictions.iloc[:, -1]  # Last column: predicted value
actual = newPredictions.iloc[:, -2]     # Second-last column: actual 'medv'
plt.scatter(actual, predicted)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title('Actual vs Predicted')
plt.savefig("result-scatter-plot.jpg", dpi=300)
plt.show()
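Alongside the scatter plot, the agreement between actual and predicted values can be summarised numerically. The sketch below computes three common regression metrics (MAE, RMSE and R2) with NumPy. The helper name `regression_metrics` and the sample numbers are hypothetical; they are not results from this experiment, and the code is not PyCaret's own implementation.

```python
import numpy as np

def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R2 for paired actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    mae = np.mean(np.abs(errors))            # mean absolute error
    rmse = np.sqrt(np.mean(errors ** 2))     # root mean squared error
    ss_res = np.sum(errors ** 2)             # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot               # coefficient of determination
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Made-up values for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 20.0, 29.0, 40.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

In the worksheet's code, the `actual` and `predicted` columns taken from `newPredictions` could be passed to such a function in the same way.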

5. Observations, Simulation Screen Shots and Discussions:

6. Result and Summary:

Machine learning is usually divided into two main types. In the predictive or supervised learning
approach, the goal is to learn a mapping from inputs x to outputs y.
7. Learning outcomes (What I have learnt):
1. Getting Data: How to import data from the PyCaret repository
2. Setting up Environment: How to set up an experiment in PyCaret and get started with building
regression models
3. Create Model: How to create a model, perform cross-validation and evaluate regression metrics
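The cross-validation mentioned in outcome 3 can be sketched by hand. The example below splits synthetic data into k folds, fits a least-squares linear model on k-1 folds and scores it on the held-out fold. The helper name `k_fold_rmse`, the data and the choice of RMSE as the score are assumptions for illustration; PyCaret performs its own cross-validation internally when a model is created.

```python
import numpy as np

def k_fold_rmse(X, y, k=5, seed=0):
    """Return the held-out RMSE of a least-squares linear model for each of k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))      # shuffle before splitting
    folds = np.array_split(indices, k)     # k roughly equal index groups
    X_aug = np.hstack([X, np.ones((len(X), 1))])  # extra column for the bias
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds only.
        coef, *_ = np.linalg.lstsq(X_aug[train_idx], y[train_idx], rcond=None)
        # Score on the fold that was held out.
        residuals = y[test_idx] - X_aug[test_idx] @ coef
        scores.append(np.sqrt(np.mean(residuals ** 2)))
    return np.array(scores)

# Synthetic data with assumed weights and noise, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + 0.5 + rng.normal(scale=0.1, size=100)

scores = k_fold_rmse(X, y, k=5)
print("per-fold RMSE:", scores)
print("mean RMSE:", scores.mean())
```

Averaging the per-fold scores gives an estimate of how the model performs on data it was not trained on, which is a fairer basis for comparing models than the training error.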
Evaluation Grid (To be filled by Faculty):

Sr. No. | Parameters | Marks Obtained | Maximum Marks
1. | Student Performance (task implementation and result evaluation) | | 12
2. | Viva-Voce | | 10
3. | Worksheet Submission (Record) | | 8
| Total Marks Obtained: | | 30
Signature of Faculty (with Date):
