Dav Exp3 66


EXPERIMENT 3: Implement Multiple Linear Regression in Python

AIM:-To implement Multiple Linear Regression on the given data set.

THEORY:-

Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that
uses several explanatory variables to predict the outcome of a response variable. The goal of multiple
linear regression is to model the linear relationship between the explanatory (independent) variables
and the response (dependent) variable. In essence, multiple regression extends simple ordinary
least-squares (OLS) regression to more than one explanatory variable.

The multiple regression model is based on the following assumptions:

• There is a linear relationship between the dependent variable and the independent variables.

• The independent variables are not too highly correlated with each other.

• The observations yi are selected independently and randomly from the population.

• The residuals are normally distributed with a mean of 0 and a constant variance σ².
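The second assumption (predictors not too highly correlated) can be checked before fitting. The sketch below is illustrative only: the three predictors are synthetic, with x2 deliberately built as a near copy of x1, and the variance inflation factor (VIF) is a commonly used diagnostic that is not part of the original experiment.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)   # nearly a copy of x1 (assumed for the demo)
x3 = rng.normal(size=n)                    # generated independently of x1 and x2
X = np.column_stack([x1, x2, x3])

# Pairwise Pearson correlations between the predictors (columns of X).
corr = np.corrcoef(X, rowvar=False)

# Variance inflation factors: the diagonal of the inverse correlation matrix.
# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the other predictors.
vif = np.diag(np.linalg.inv(corr))

print(np.round(corr, 2))
print(np.round(vif, 1))
```

A correlation close to 1 between two predictors, or a large VIF (values above roughly 5 to 10 are often treated as a warning sign), suggests that one of the variables may need to be dropped or combined with the other.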

The multiple linear regression equation is:

y = β0 + β1x1 + β2x2 + ... + βpxp

The terms in this equation are:

y is the predicted or expected value of the dependent variable.

x1, x2, ..., xp are the p independent (predictor) variables.

β0 is the intercept: the value of y when all the independent variables are equal to zero.

β1, β2, ..., βp are the estimated regression coefficients. Each regression coefficient represents the
change in y for a one-unit change in the respective independent variable, with the other independent
variables held constant.
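The equation above can be illustrated by estimating the coefficients with ordinary least squares. This is a minimal sketch, not the method used later in this experiment: the data are synthetic, the true coefficients (2, 3, -1) are assumed values, and the helper name `fit_mlr` is hypothetical. NumPy's `np.linalg.lstsq` does the fitting.

```python
import numpy as np

def fit_mlr(X, y):
    """Return (b0, coefs) minimising the squared error of y ≈ b0 + X @ coefs."""
    # Prepend a column of ones so the intercept b0 is estimated along with the slopes.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0], beta[1:]

# Synthetic, noise-free data generated from y = 2 + 3*x1 - 1*x2 (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]

b0, coefs = fit_mlr(X, y)
print(b0, coefs)
```

Because the synthetic data contain no noise, the fit recovers the intercept and both slopes up to floating-point error. With real data the estimates only approximate the underlying relationship.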

Because it combines several explanatory variables, which can enter the model as linear or nonlinear
(transformed) terms, this model can account for more of the variance in the response than a regression
on a single variable. It also helps in predicting outcomes more precisely and in understanding how much
each explanatory variable contributes to the variance explained by the model.

Manav Mangela T13 66 1

Assumptions for Multiple Linear Regression:

• A linear relationship should exist between the target and the predictor variables.

• The regression residuals must be normally distributed.

• MLR assumes little or no multicollinearity (correlation between the independent variables) in the data.
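The residual assumption in the list above can be checked roughly after a model has been fitted. The following is a minimal sketch on synthetic data. The true coefficients and the noise level (sigma = 0.5) are assumed values chosen for illustration, and sample skewness is only a coarse indicator of normality.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 2
X = rng.normal(size=(n, p))
noise = rng.normal(scale=0.5, size=n)          # true sigma = 0.5 (assumed)
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + noise

# Least-squares fit with an intercept column.
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
residuals = y - design @ beta

# With an intercept in the model, the residuals average to zero.
resid_mean = residuals.mean()

# Estimate of the residual variance sigma^2, using n - p - 1 degrees of freedom.
sigma2_hat = residuals @ residuals / (n - p - 1)

# Sample skewness: close to 0 for a symmetric distribution such as the normal.
z = (residuals - resid_mean) / residuals.std()
skewness = np.mean(z ** 3)

print(resid_mean, sigma2_hat, skewness)
```

A histogram or Q-Q plot of `residuals` gives a more complete picture than these summary numbers alone.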

CODE:-

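# Import the libraries used for data handling and plotting
import pandas as pd
import seaborn as sb
import matplotlib.pyplot as plt
import numpy as np

# Load the Boston housing data set
house = pd.read_csv('https://github.com/YBIFoundation/Dataset/raw/main/Boston.csv')

# Summary statistics for every column
house.describe()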


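house.columns

# Features: every column except the target MEDV (median home value)
X = house.drop(columns=['MEDV'])
y = house['MEDV']

# Scatter plot of MEDV against RM (number of rooms) with a fitted straight line
x = house['RM'].to_numpy()
plt.scatter(x, y)
plt.xlabel('Number of rooms')
plt.ylabel('Median Value of homes')

# Slope (b1) and intercept (b0) of the least-squares line through the points
b1, b0 = np.polyfit(x, y, 1)
plt.plot(x, b1*x + b0, color='red')
plt.show()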

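# Use 70% of the rows for training and the rest for testing; random_state makes the split reproducible
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=2529)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

# Fit the multiple linear regression model on the training data
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)

# Estimated intercept
model.intercept_

# Predicted MEDV for the test rows
y_pred = model.predict(X_test)
y_pred

# Estimated regression coefficients, one per column of X
model.coef_

# Metrics used to evaluate the predictions
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error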


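# Evaluate the predictions: the R² score is the share of variance in y_test explained by the model
score = r2_score(y_test, y_pred)
print('r2 score is ', score)

# Mean squared error of the predictions on the test set
print('mean squared error is ', mean_squared_error(y_test, y_pred))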
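To make the R² score reported above concrete, it can be computed by hand as 1 - SS_res / SS_tot. The sketch below uses four made-up observations and predictions, not values from the Boston data, and the helper names `r_squared` and `mse` are hypothetical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

def mse(y_true, y_pred):
    """Mean of the squared prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Made-up observations and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 7.0, 9.5]

r2_value = r_squared(y_true, y_hat)
mse_value = mse(y_true, y_hat)
print(r2_value, mse_value)
```

Here SS_res = 0.75 and SS_tot = 20, so R² = 1 - 0.75/20 = 0.9625 and the mean squared error is 0.75/4 = 0.1875. An R² of 1 means the predictions match the observations exactly, while a value near 0 means the model does no better than always predicting the mean.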

OUTPUT:-

CONCLUSION:-Thus we have successfully implemented and executed Multiple Linear Regression.
