3-Polynomial Regression Using Python


Machine Learning

19ME 3220
III/IV B.Tech
Odd Sem
for the Academic Year 2021-22

Session-23
Polynomial Regression
S. Ramesh Kumar
Course Co-ordinator
Polynomial Regression
Polynomial Regression is a regression algorithm that models the relationship
between the dependent variable (y) and the independent variable (x) as an nth-degree polynomial.

The Polynomial Regression equation is given below:

y = b0 + b1x1 + b2x1^2 + b3x1^3 + ...... + bnx1^n


• It is also called a special case of Multiple Linear Regression in ML, because we add polynomial terms to the Multiple Linear Regression equation to convert it into Polynomial Regression.
• It is a linear model with a modification made in order to increase accuracy.
• The dataset used to train a Polynomial Regression model is non-linear in nature.
• It makes use of a linear regression model to fit complicated, non-linear functions and datasets.
• Hence, "In Polynomial Regression, the original features are converted into polynomial features of the required degree (2, 3, ..., n) and then modelled using a linear model", as sketched in the example below.
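To see what that last point means in code, here is a minimal sketch on synthetic data (not the salary dataset used later in these slides): the single feature x is expanded into [1, x, x^2], and an ordinary LinearRegression is then fitted on the expanded matrix.

# Minimal sketch: polynomial regression = linear regression on polynomial features
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(1, 11).reshape(-1, 1)            # one original feature
y = 2 + 3 * x.ravel() + 0.5 * x.ravel() ** 2   # non-linear target

x_poly = PolynomialFeatures(degree=2).fit_transform(x)   # columns: [1, x, x^2]
model = LinearRegression().fit(x_poly, y)                # still a plain linear model
print(model.intercept_, model.coef_)                     # recovers ~2 and ~[0, 3, 0.5]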
Polynomial Regression
The need for Polynomial Regression in ML can be understood from the points below:

If we apply a linear model to a linear dataset, it gives a good result, as we have seen with Simple Linear Regression. But if we apply the same model, without any modification, to a non-linear dataset, the output will be poor: the loss function will increase, the error rate will be high, and accuracy will drop.

So for such cases, where the data points are arranged in a non-linear fashion, we need the Polynomial Regression model. We can understand this better using the comparison diagram below of a linear dataset and a non-linear dataset.
Polynomial Regression
In the above image, we have taken a dataset which is arranged non-linearly. If we try to cover it with a linear model, we can clearly see that it hardly covers any data points. On the other hand, a curve covering most of the data points is what the Polynomial model provides.

Hence, if a dataset is arranged in a non-linear fashion, we should use the Polynomial Regression model instead of Simple Linear Regression.

Equation of the Polynomial Regression Model:

Simple Linear Regression equation:    y = b0 + b1x                                  .........(a)

Multiple Linear Regression equation:  y = b0 + b1x1 + b2x2 + b3x3 + .... + bnxn     .........(b)

Polynomial Regression equation:       y = b0 + b1x + b2x^2 + b3x^3 + .... + bnx^n   .........(c)

The Simple and Multiple Linear Regression equations are also polynomial equations, of degree one, and the Polynomial Regression equation is a linear equation extended to the nth degree. For example, with n = 2, equation (c) becomes y = b0 + b1x + b2x^2, which is still linear in the coefficients b0, b1 and b2. So if we add higher-degree terms to our linear equation, it is converted into a Polynomial (Linear) equation.
Implementation of Polynomial Regression using Python
Problem Description:
There is a Human Resources company that is going to hire a new candidate. The candidate has stated that his previous salary was 160K per annum, and HR has to check whether he is telling the truth or bluffing. To verify this, they only have a dataset from his previous company in which the salaries of the top 10 positions are listed along with their levels.
By checking the available dataset, we have found that there is a non-linear relationship between the position levels and the salaries.

Our goal is to build a bluff-detector regression model, so that HR can hire an honest candidate. Below are the steps to build such a model.

We will predict the output for level 6.5, because the candidate has 4+ years' experience as a regional manager, so he must be somewhere between levels 6 and 7.
Polynomial Regression
Steps for Polynomial Regression:
1. Data Pre-processing
2. Build a Linear Regression model and fit it to the dataset
3. Build a Polynomial Regression model and fit it to the dataset
4. Visualize the results of the Linear Regression and Polynomial Regression models
5. Predict the output
Data Pre-processing Step:
The data pre-processing step remains the same as in the previous regression models, except that we will not use feature scaling and we will not split the dataset into training and test sets, for two reasons:

The dataset contains very few records, so dividing it into training and test sets would leave the model unable to find the correlation between salaries and levels.

In this model, we want very accurate salary predictions, so the model should keep all the available information.
Polynomial Regression
Code for Polynomial Regression
STEP-1

# importing libraries  
import pandas as pd 
import numpy as np  
import matplotlib.pyplot as plt
 
#importing datasets  
data_set= pd.read_csv('Position_Salaries.csv')  
  
#Extracting Independent and dependent Variable  
x= data_set.iloc[:, 1:2].values  
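# note: slicing 1:2 (rather than just 1) keeps x as a 2-D column array, which scikit-learn expects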
y= data_set.iloc[:, 2].values  
Polynomial Regression
Build and fit the Linear regression model to the dataset.
STEP-2
#Fitting the Linear Regression to the dataset  
from sklearn.linear_model import LinearRegression  
lin_regs= LinearRegression()  
lin_regs.fit(x,y)  

#Visualizing the result for the Linear Regression model
plt.scatter(x,y,color="blue")  
plt.plot(x,lin_regs.predict(x), color="red")  
plt.title("Bluff detection model(Linear Regression)")  
plt.xlabel("Position Levels")  
plt.ylabel("Salary")  
plt.show()  
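As an optional check that is not in the original slides, the R² score can quantify how poorly the straight line fits this data; a low value here confirms that the linear model under-fits the non-linear salary curve.

#Scoring the Linear Regression fit (optional)
print(lin_regs.score(x, y))   # R^2 of the straight-line fit on the training data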
Polynomial Regression
STEP-3

#Fitting the Polynomial regression to the dataset 
from sklearn.preprocessing import PolynomialFeatures  
poly_regs= PolynomialFeatures(degree= 2)  
x_poly= poly_regs.fit_transform(x)  
lin_reg_2 =LinearRegression()  
lin_reg_2.fit(x_poly, y)  

In the above lines of code, we have used poly_regs.fit_transform(x), because we first convert the feature matrix into a polynomial feature matrix and then fit the linear regression model to it. The parameter value (degree=2) is our choice; increasing it generates higher-order polynomial features.

After executing the code, we will get another matrix x_poly, which can be
seen under the variable explorer option:
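As a quick sanity check (assuming the usual Position_Salaries.csv, where the levels run from 1 to 10), each row of x_poly should contain [1, level, level^2]:

print(x_poly[:3])
# expected output:
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]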
Polynomial Regression
Visualizing the result for Polynomial Regression
STEP-4
#Visualizing the result for Polynomial Regression
plt.scatter(x,y,color="blue")  
plt.plot(x, lin_reg_2.predict(poly_regs.fit_transform(x)), color="red") 
 
plt.title("Bluff detection model(Polynomial Regression)")  
plt.xlabel("Position Levels")  
plt.ylabel("Salary")  
plt.show()  
Polynomial Regression
In the above plot, the predictions are close to the real values. The plot will change as we change the degree.

(Plots for degree = 3 and degree = 4)

Change in degree
If we change to degree = 3, we get a more accurate plot.
If we change to degree = 4, we get the most accurate plot.

Hence, we can get more accurate results on the training data by increasing the degree of the polynomial.
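The refits for degrees 3 and 4 are not shown as code in the slides; a short sketch of how such a comparison could be run, reusing x, y and the imports from the earlier steps (the degrees tried here are just examples), is:

#Comparing predictions for level 6.5 at different polynomial degrees
for degree in (2, 3, 4):
    poly = PolynomialFeatures(degree=degree)
    model = LinearRegression().fit(poly.fit_transform(x), y)
    print(degree, model.predict(poly.fit_transform([[6.5]])))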
Polynomial Regression
STEP - 5
Predicting the final result with the Linear Regression model:

Now, we will predict the final output using the Linear Regression model to see whether the candidate is telling the truth or bluffing. For this, we will use the predict() method and pass the value 6.5.

lin_pred = lin_regs.predict([[6.5]])
print(lin_pred)

OUTPUT: [330378.78787879]

Predicting the final result with the Polynomial Regression model:


Now, we will predict the final output using the Polynomial Regression model and compare it with the Linear model. Below is the code for it:

poly_pred = lin_reg_2.predict(poly_regs.fit_transform([[6.5]]))
print(poly_pred)

OUTPUT: [158862.45265153]

The predicted output of the Polynomial Regression model is [158862.45265153], which is much closer to the real value, and close to the 160K the candidate claimed.
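One way to turn this prediction into an explicit bluff check, where the 10% tolerance is an arbitrary illustrative choice and not from the slides, is:

#Hypothetical bluff check based on the polynomial prediction
claimed_salary = 160000
predicted_salary = poly_pred[0]
if claimed_salary <= predicted_salary * 1.10:   # allow the claim to exceed the prediction by up to 10%
    print("Claim looks plausible")
else:
    print("Possible bluff")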
