DTEXP5


University Institute of Engineering

Department of Computer Science & Engineering

Experiment: 2.1

Student Name: UID:


Branch: Computer Science & Engineering Section/Group:
Semester: 1 Date of Performance: 22.12.2022
Subject Name: Disruptive Technology
Subject Code: 22ECH-102

1. Aim of the practical:


To develop a prediction model based on linear/logistic regression.

2. Tool Used:
PyCaret library and Google Colab

3. Basic Concept/ Command Description:


In the predictive or supervised learning approach, we are given a labeled set of input-output pairs

$D = \{(x_i, y_i)\}_{i=1}^{N}$

where D is called the training set and N is the number of training examples.

In the simplest setting, each training input x_i is a vector of numbers representing, say, the height and
weight of a person. These are called features, attributes or covariates. The dimension of this vector is the
number of features per example and should not be confused with the training set D. In general, however, x_i
could be a complex structured object, such as an image, a sentence, an email message, a time series, a
molecular shape, a graph, etc.

Similarly, the form of the output or response variable can in principle be anything, but most methods
assume that y_i is a categorical or nominal variable from some finite set, y_i ∈ {1,...,C} (such as male or
female), or that y_i is a real-valued scalar (such as income level). When y_i is categorical, the problem is
known as classification or pattern recognition, and when y_i is real-valued, the problem is known as
regression. Another variant, known as ordinal regression, occurs where the label space Y has some natural
ordering, such as grades A–F.
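
The distinction between the two kinds of response variable can be illustrated with a small sketch. The data below are made up for illustration (they are not taken from the experiment), and the names `X`, `y_reg`, `y_cls` and `coef` are hypothetical. The regression target is generated from a known linear rule so that an ordinary least-squares fit with NumPy can be checked by eye.

```python
import numpy as np

# Hypothetical toy training set: N = 5 examples, 2 features each
# (height in cm, weight in kg). Values are invented for illustration.
X = np.array([
    [150.0, 50.0],
    [160.0, 58.0],
    [170.0, 68.0],
    [180.0, 77.0],
    [190.0, 90.0],
])

# Real-valued response -> regression.
# Generated from a known rule: y = 2*height + 3*weight + 10.
y_reg = 2.0 * X[:, 0] + 3.0 * X[:, 1] + 10.0

# Categorical response -> classification (labels from the finite set {0, 1}).
y_cls = (X[:, 0] > 165.0).astype(int)

# Ordinary least squares: append a column of ones for the intercept term.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y_reg, rcond=None)

print("N =", X.shape[0], "features per example =", X.shape[1])
print("class labels:", y_cls)
print("fitted coefficients (height, weight, intercept):", np.round(coef, 3))
```

Because the regression target was built without noise, the fitted coefficients should recover the generating rule up to floating-point error.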

The second main type of machine learning is the descriptive or unsupervised learning approach. Here we
are only given inputs, $D = \{x_i\}_{i=1}^{N}$, and the goal is to find “interesting patterns” in the data. This is
sometimes called knowledge discovery. This is a much less well-defined problem, since we are not told
what kinds of patterns to look for, and there is no obvious error metric to use.
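
One common way to look for such patterns is clustering. The sketch below uses k-means, which is chosen here purely as an illustration and is not part of this experiment. The function `kmeans`, the sample points and the choice of initial centroids are all assumptions made for the example.

```python
import numpy as np

def kmeans(points, init_centroids, n_iter=10):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids.astype(float).copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (N, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(len(centroids)):
            members = points[labels == k]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[k] = members.mean(axis=0)
    return labels, centroids

# Unlabeled inputs only: two visibly separated groups of invented points.
points = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],
])

# Start from one point in each group (a simplifying assumption).
labels, centroids = kmeans(points, init_centroids=points[[0, 3]])

print("cluster labels:", labels)
print("centroids:", np.round(centroids, 3))
```

Note that no target values are supplied anywhere: the grouping is derived from the inputs alone, which is why there is no obvious error metric to score it against.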

There is a third type of machine learning, known as reinforcement learning, which is somewhat less
commonly used. This is useful for learning how to act or behave when given occasional reward or
punishment signals.

4. Code:
!pip install pycaret &> /dev/null
print("PyCaret installed successfully!!")

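from pycaret.datasets import get_data

# The regression module provides the functions used in the rest of this worksheet
from pycaret.regression import setup, compare_models, create_model, save_model, load_model, predict_model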

dataSets = get_data('index')

bostonDataSet = get_data("boston")

print(type(bostonDataSet))

# silent=True skips the interactive data-type confirmation prompt (PyCaret 2.x argument)
s = setup(data = bostonDataSet, target='medv', silent=True)

cm = compare_models()

# Model Performance using data "Normalization"

s = setup(data = bostonDataSet, target = 'medv', normalize = True,
          normalize_method = 'zscore', silent=True)

cm = compare_models()

# Model Performance using "Feature Selection"

s = setup(data = bostonDataSet, target = 'medv', feature_selection = True,
          feature_selection_threshold = 0.9, silent=True)

cm = compare_models()

 

# Model Performance using "Outlier Removal"

s = setup(data = bostonDataSet, target = 'medv', remove_outliers = True,
          outliers_threshold = 0.05, silent=True)

cm = compare_models()

# Model Performance using "PCA"

s = setup(data = bostonDataSet, target = 'medv', pca = True, pca_method = 'linear', silent=True)

cm = compare_models()

# Create RF Model
rfModel = create_model('rf')

# Save the trained model
sm = save_model(rfModel, 'rfModelFile')

# Load the model
rfModel = load_model('rfModelFile')

# Select top 10 rows from boston dataset
newDataSet = get_data("boston").iloc[:10]

# Make prediction on new dataset
newPredictions = predict_model(rfModel, data = newDataSet)
newPredictions

# Scatter plot b/w actual and predicted

import matplotlib.pyplot as plt

predicted = newPredictions.iloc[:,-1]   # Last column
actual = newPredictions.iloc[:,-2]      # 2nd last column

# x-axis carries the actual values, y-axis the predicted values
plt.scatter(actual, predicted)

plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title('Actual Vs Predicted')

plt.savefig("result-scatter-plot.jpg", dpi=300)
plt.show()
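
The scatter plot gives a visual impression of fit quality. The same comparison can be summarised numerically with error metrics such as MAE, RMSE and R2. The sketch below computes them with NumPy. The helper `regression_metrics` and the sample values are hypothetical and stand in for the `actual` and `predicted` columns above; they are not results from this experiment.

```python
import numpy as np

def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R2 for paired actual/predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = predicted - actual
    mae = np.mean(np.abs(errors))                    # mean absolute error
    rmse = np.sqrt(np.mean(errors ** 2))             # root mean squared error
    ss_res = np.sum(errors ** 2)                     # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Invented values for illustration only.
sample_actual = [20.0, 22.0, 30.0, 34.0]
sample_predicted = [21.0, 21.0, 32.0, 33.0]

metrics = regression_metrics(sample_actual, sample_predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With the worksheet's own data, the two Series could be passed in directly, for example `regression_metrics(actual, predicted)`.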



5. Observations, Simulation Screen Shots and Discussions:



6. Result and Summary:


Machine learning is usually divided into two main types. In the predictive or supervised learning
approach, the goal is to learn a mapping from inputs x to outputs y. 

7. Learning outcomes (What I have learnt):

1. Getting Data: How to import data from the PyCaret repository

2. Setting up Environment: How to set up an experiment in PyCaret and get started with building
regression models

3. Create Model: How to create a model, perform cross-validation and evaluate regression metrics
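
The cross-validation step mentioned in the third outcome can be sketched by hand. This is a minimal illustration of k-fold cross-validation for a least-squares linear model, assuming synthetic data; the helpers `kfold_indices` and `cross_validate_linear` are hypothetical names and do not reproduce PyCaret's internal implementation.

```python
import numpy as np

def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    return np.array_split(order, n_folds)

def cross_validate_linear(X, y, n_folds=5):
    """Fit least squares on k-1 folds, report MAE on the held-out fold."""
    folds = kfold_indices(len(X), n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Append a column of ones so the model has an intercept.
        A_train = np.column_stack([X[train_idx], np.ones(len(train_idx))])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([X[test_idx], np.ones(len(test_idx))])
        pred = A_test @ coef
        scores.append(np.mean(np.abs(pred - y[test_idx])))
    return np.array(scores)

# Synthetic data: a linear rule plus a small amount of Gaussian noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=50)

scores = cross_validate_linear(X, y, n_folds=5)
print("MAE per fold:", np.round(scores, 3))
print("mean MAE:", round(float(scores.mean()), 3))
```

Each example is held out exactly once, so the averaged score estimates how the model performs on data it was not trained on.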

Evaluation Grid (To be filled by Faculty):


Sr. No. | Parameters | Marks Obtained | Maximum Marks
1. | Student Performance (task implementation and result evaluation) | | 12
2. | Viva-Voce | | 10
3. | Worksheet Submission (Record) | | 8
 | Signature of Faculty (with Date): | Total Marks Obtained: | 30
