3-Polynomial Regression Using Python


Machine Learning

19ME 3220
III/IV B.Tech
Odd Sem
for the Academic Year 2021-22

Session-23
Polynomial Regression
S. Ramesh Kumar
Course Co-ordinator
Polynomial Regression
Polynomial Regression is a regression algorithm that models the relationship
between the dependent variable (y) and the independent variable (x) as an nth-degree polynomial.

The Polynomial Regression equation is given below:

y = b0 + b1x1 + b2x1^2 + b3x1^3 + ...... + bnx1^n


• It is also called a special case of Multiple Linear Regression in ML, because we add polynomial terms to the Multiple Linear Regression equation to convert it into Polynomial Regression.
• It is a linear model with a modification made in order to increase accuracy.
• The dataset used to train a Polynomial Regression model is non-linear in nature.
• It makes use of a linear regression model to fit complicated, non-linear functions and datasets.
• Hence, "In Polynomial Regression, the original features are converted into polynomial features of the required degree (2, 3, ..., n) and then modelled using a linear model", as sketched in the example below.
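To see what that last point means in code, here is a minimal sketch on synthetic data (not the salary dataset used later in these slides): the single feature x is expanded into [1, x, x^2], and an ordinary LinearRegression is then fitted on the expanded matrix.

# Minimal sketch: polynomial regression = linear regression on polynomial features
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(1, 11).reshape(-1, 1)            # one original feature
y = 2 + 3 * x.ravel() + 0.5 * x.ravel() ** 2   # non-linear target

x_poly = PolynomialFeatures(degree=2).fit_transform(x)   # columns: [1, x, x^2]
model = LinearRegression().fit(x_poly, y)                # still a plain linear model
print(model.intercept_, model.coef_)                     # recovers ~2 and ~[0, 3, 0.5]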
Polynomial Regression
The need for Polynomial Regression in ML can be understood from the points below:

If we apply a linear model to a linear dataset, it gives a good result, as we have seen with Simple Linear Regression. But if we apply the same model, without any modification, to a non-linear dataset, the output will be poor: the loss function will increase, the error rate will be high, and accuracy will drop.

So for such cases, where the data points are arranged in a non-linear fashion, we need the Polynomial Regression model. We can understand this better using the comparison diagram below of a linear dataset and a non-linear dataset.
Polynomial Regression
In the above image, we have taken a dataset which is arranged non-linearly. If we try to cover it with a linear model, we can clearly see that it hardly covers any data points. On the other hand, a curve covering most of the data points is what the Polynomial model provides.

Hence, if a dataset is arranged in a non-linear fashion, we should use the Polynomial Regression model instead of Simple Linear Regression.

Equation of the Polynomial Regression Model:

Simple Linear Regression equation:    y = b0 + b1x                                  .........(a)

Multiple Linear Regression equation:  y = b0 + b1x1 + b2x2 + b3x3 + .... + bnxn     .........(b)

Polynomial Regression equation:       y = b0 + b1x + b2x^2 + b3x^3 + .... + bnx^n   .........(c)

The Simple and Multiple Linear Regression equations are also polynomial equations, of degree one, and the Polynomial Regression equation is a linear equation extended to the nth degree. For example, with n = 2, equation (c) becomes y = b0 + b1x + b2x^2, which is still linear in the coefficients b0, b1 and b2. So if we add higher-degree terms to our linear equation, it is converted into a Polynomial (Linear) equation.
Implementation of Polynomial Regression using Python
Problem Description:
There is a Human Resources company that is going to hire a new candidate. The candidate has stated that his previous salary was 160K per annum, and HR has to check whether he is telling the truth or bluffing. To verify this, they only have a dataset from his previous company in which the salaries of the top 10 positions are listed along with their levels.
By checking the available dataset, we have found that there is a non-linear relationship between the position levels and the salaries.

Our goal is to build a bluff-detector regression model, so that HR can hire an honest candidate. Below are the steps to build such a model.

We will predict the output for level 6.5, because the candidate has 4+ years' experience as a regional manager, so he must be somewhere between levels 6 and 7.
Polynomial Regression
Steps for Polynomial Regression:
1. Data Pre-processing
2. Build a Linear Regression model and fit it to the dataset
3. Build a Polynomial Regression model and fit it to the dataset
4. Visualize the results of the Linear Regression and Polynomial Regression models
5. Predict the output
Data Pre-processing Step:
The data pre-processing step remains the same as in the previous regression models, except that we will not use feature scaling and we will not split the dataset into training and test sets, for two reasons:

The dataset contains very few records, so dividing it into training and test sets would leave the model unable to find the correlation between salaries and levels.

In this model, we want very accurate salary predictions, so the model should keep all the available information.
Polynomial Regression
Code for Polynomial Regression
STEP-1

# importing libraries  
import pandas as pd 
import numpy as np  
import matplotlib.pyplot as plt
 
#importing datasets  
data_set= pd.read_csv('Position_Salaries.csv')  
  
#Extracting Independent and dependent Variable  
x= data_set.iloc[:, 1:2].values  
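# note: slicing 1:2 (rather than just 1) keeps x as a 2-D column array, which scikit-learn expects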
y= data_set.iloc[:, 2].values  
Polynomial Regression
Build and fit the Linear regression model to the dataset.
STEP-2
#Fitting the Linear Regression to the dataset  
from sklearn.linear_model import LinearRegression  
lin_regs= LinearRegression()  
lin_regs.fit(x,y)  

#Visualizing the result for the Linear Regression model
plt.scatter(x,y,color="blue")  
plt.plot(x,lin_regs.predict(x), color="red")  
plt.title("Bluff detection model(Linear Regression)")  
plt.xlabel("Position Levels")  
plt.ylabel("Salary")  
plt.show()  
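As an optional check that is not in the original slides, the R² score can quantify how poorly the straight line fits this data; a low value here confirms that the linear model under-fits the non-linear salary curve.

#Scoring the Linear Regression fit (optional)
print(lin_regs.score(x, y))   # R^2 of the straight-line fit on the training data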
Polynomial Regression
STEP-3

#Fitting the Polynomial regression to the dataset 
from sklearn.preprocessing import PolynomialFeatures  
poly_regs= PolynomialFeatures(degree= 2)  
x_poly= poly_regs.fit_transform(x)  
lin_reg_2 =LinearRegression()  
lin_reg_2.fit(x_poly, y)  

In the above lines of code, we have used poly_regs.fit_transform(x), because we first convert the feature matrix into a polynomial feature matrix and then fit the linear regression model to it. The parameter value (degree=2) is our choice; increasing it generates higher-order polynomial features.

After executing the code, we will get another matrix x_poly, which can be
seen under the variable explorer option:
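As a quick sanity check (assuming the usual Position_Salaries.csv, where the levels run from 1 to 10), each row of x_poly should contain [1, level, level^2]:

print(x_poly[:3])
# expected output:
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]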
Polynomial Regression
Visualizing the result for Polynomial Regression
STEP-4
#Visualizing the result for Polynomial Regression
plt.scatter(x,y,color="blue")  
plt.plot(x, lin_reg_2.predict(poly_regs.fit_transform(x)), color="red") 
 
plt.title("Bluff detection model(Polynomial Regression)")  
plt.xlabel("Position Levels")  
plt.ylabel("Salary")  
plt.show()  
Polynomial Regression
In the above plot, the predictions are close to the real values. The plot will change as we change the degree.

(Plots for degree = 3 and degree = 4)

Change in degree
If we change to degree = 3, we get a more accurate plot.
If we change to degree = 4, we get the most accurate plot.

Hence, we can get more accurate results on the training data by increasing the degree of the polynomial.
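The refits for degrees 3 and 4 are not shown as code in the slides; a short sketch of how such a comparison could be run, reusing x, y and the imports from the earlier steps (the degrees tried here are just examples), is:

#Comparing predictions for level 6.5 at different polynomial degrees
for degree in (2, 3, 4):
    poly = PolynomialFeatures(degree=degree)
    model = LinearRegression().fit(poly.fit_transform(x), y)
    print(degree, model.predict(poly.fit_transform([[6.5]])))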
Polynomial Regression
STEP - 5
Predicting the final result with the Linear Regression model:

Now, we will predict the final output using the Linear Regression model to see whether the candidate is telling the truth or bluffing. For this, we will use the predict() method and pass the value 6.5.

lin_pred = lin_regs.predict([[6.5]])
print(lin_pred)

OUTPUT: [330378.78787879]

Predicting the final result with the Polynomial Regression model:


Now, we will predict the final output using the Polynomial Regression model and compare it with the Linear model. Below is the code for it:

poly_pred = lin_reg_2.predict(poly_regs.fit_transform([[6.5]]))
print(poly_pred)

OUTPUT: [158862.45265153]

The predicted output of the Polynomial Regression model is [158862.45265153], which is much closer to the real value, and close to the 160K the candidate claimed.
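One way to turn this prediction into an explicit bluff check, where the 10% tolerance is an arbitrary illustrative choice and not from the slides, is:

#Hypothetical bluff check based on the polynomial prediction
claimed_salary = 160000
predicted_salary = poly_pred[0]
if claimed_salary <= predicted_salary * 1.10:   # allow the claim to exceed the prediction by up to 10%
    print("Claim looks plausible")
else:
    print("Possible bluff")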
