Project Presentation On House Price Prediction System: Presented by Name: Simran B Solanki Roll No: 19020
House Price Prediction System
Programming Language
Python
Tools
Jupyter Notebook, Spyder
Libraries
House Price Prediction System: Data Cleaning, Exploratory Data Analysis, Regression Model
Google Form Link
https://docs.google.com/forms/d/107XIIJ1n1kgKjKmjZnR-PYEW3hQ-jblNUfylul82qDQ/edit?ts=606d4684#responses
Data Cleaning
Collected through survey
Changing the Dataset Column
Data Cleaning
Counting the null values and converting categorical data into integers
Removing the null values
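The cleaning steps above can be sketched with pandas as follows. This is only a minimal illustration: the column names (`carpet_area`, `bedrooms`, `house_loan`) and the sample values are hypothetical and are not taken from the survey data.

```python
import numpy as np
import pandas as pd

# Hypothetical survey responses; column names and values are illustrative only.
df = pd.DataFrame({
    "carpet_area": [450, 600, np.nan, 1200, 900],
    "bedrooms": [1, 1, 2, 3, 2],
    "house_loan": ["Yes", "No", "Yes", None, "Yes"],
})

# Count of null values in each column.
null_counts = df.isnull().sum()

# Remove every row that contains a null value.
clean = df.dropna().reset_index(drop=True)

# Convert the categorical column into integer codes.
# Categories are sorted alphabetically, so "No" becomes 0 and "Yes" becomes 1.
clean["house_loan"] = clean["house_loan"].astype("category").cat.codes

print(null_counts)
print(clean)
```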
Exploratory Data Analysis
Analysis Result
From the graph we can observe some outliers in carpet area with respect to the number of bedrooms:
0 bedrooms (1 RK): carpet area ranges from 100 to 800, so the house with carpet area 3500 is an outlier.
1 BK: carpet area ranges from 200 to 760, so there are 3 high outliers, with carpet areas 1000, 1010 and 2500.
2 BK: carpet area ranges from 480 to 1500, so there is 1 high outlier, with carpet area 1750.
3 BK: carpet area ranges from 900 to 2050, so there is 1 high outlier, with carpet area 2400.
4 BK: carpet area ranges from 1990 to 2350, so there is 1 high outlier, with carpet area 3500, and 1 low outlier, with carpet area 100.
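Outliers like these can also be flagged numerically. The sketch below assumes the common box-plot rule (values beyond 1.5 times the interquartile range from the quartiles), applied separately to each bedroom count; the slides do not state which rule the graph used, and the function name `flag_outliers`, the column names and the sample rows are hypothetical.

```python
import pandas as pd

def flag_outliers(df, group_col="bedrooms", value_col="carpet_area", k=1.5):
    """Return rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR] within each group."""
    grouped = df.groupby(group_col)[value_col]
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    mask = (df[value_col] < q1 - k * iqr) | (df[value_col] > q3 + k * iqr)
    return df[mask]

# Illustrative rows only, not the survey data.
houses = pd.DataFrame({
    "bedrooms":    [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "carpet_area": [100, 300, 450, 800, 3500, 200, 400, 500, 760, 2500],
})

outliers = flag_outliers(houses)
print(outliers)
```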
Exploratory Data Analysis
5) Histogram to represent Variation in Carpet Area
From the graph we can observe that most carpet areas in the dataset lie between 300 and 1000 sq.ft.
6) Count plot to represent Count of House Loan w.r.t Carpet Area
From the graph we can observe that the majority of the people were willing to take a house loan.
Exploratory Data Analysis
7) Graphs to represent Regression Analysis
A) Carpet Area vs No of Bedroom
From the above graph we can conclude that there is a positive linear relation between the No. of Bedrooms and Carpet Area: as the No. of Bedrooms increases, Carpet Area also increases. However, the points do not lie close to the regression line.
Exploratory Data Analysis
B) Carpet Area vs Price
From the above graph we can conclude that there is a positive linear relation between the House Price and Carpet Area: as Carpet Area increases, House Price also increases. However, the points do not lie close to the regression line.
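How closely the points follow a regression line can be quantified with the R2 value of the fitted line. The following is a minimal sketch using NumPy; the area and price values are made up for illustration and are not the survey data.

```python
import numpy as np

# Hypothetical (carpet area in sq.ft, price) pairs, for illustration only.
area = np.array([300, 450, 600, 750, 900, 1200, 1500], dtype=float)
price = np.array([20, 24, 41, 38, 60, 62, 95], dtype=float)

# Least-squares straight line: price ~ slope * area + intercept.
slope, intercept = np.polyfit(area, price, deg=1)
predicted = slope * area + intercept

# R2 = 1 - (residual sum of squares) / (total sum of squares).
ss_res = np.sum((price - predicted) ** 2)
ss_tot = np.sum((price - price.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# For a single-feature line with intercept, R2 equals the squared Pearson correlation.
pearson_r = np.corrcoef(area, price)[0, 1]

print(f"slope={slope:.4f}, intercept={intercept:.2f}, R2={r2:.3f}")
```

An R2 close to 1 means the points lie near the line; a lower value means a trend exists but the points scatter around it.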
Exploratory Data Analysis
Analysis Result
The training set is the one on which we train and fit our model, basically to fit the parameters.
Test data is used only to assess the performance of the model.
LabelEncoder encodes labels with a value between 0 and n_classes-1, where n_classes is the number of distinct labels.
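A minimal sketch of label encoding and a train/test split with scikit-learn is shown below. The labels, the feature matrix and the 30% test size are assumptions chosen for illustration; the slides do not state the split that was actually used.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical categorical answers, for illustration only.
labels = ["Yes", "No", "Yes", "No", "No", "Yes", "No", "Yes", "Yes", "Yes"]

# LabelEncoder sorts the distinct labels and maps them to 0 .. n_classes-1.
encoder = LabelEncoder()
y = encoder.fit_transform(labels)

# Hypothetical feature matrix with 10 rows and 2 columns.
X = np.arange(20).reshape(10, 2)

# The model is fitted on the training part only; the test part is held back
# to assess performance. test_size=0.3 is an assumed value.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

print(list(encoder.classes_), y)
print(X_train.shape, X_test.shape)
```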
Regression Models
By comparing the R2 scores of the regression models, we conclude that the Random Forest Regressor predicts more accurately than the other regression models: it has the highest R2 score, i.e. 0.755300315635314.
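A comparison of this kind can be sketched as follows. Only the Random Forest Regressor is named in the slides; Linear Regression and Decision Tree Regressor are assumed here as stand-ins for the other models, and the data is synthetic, so the scores printed by this sketch do not reproduce the reported value.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration only: price depends on area and bedrooms plus noise.
rng = np.random.default_rng(0)
area = rng.uniform(300, 2000, size=200)
bedrooms = np.clip((area // 500).astype(int), 0, 4)
price = 50 * area + 20000 * bedrooms + rng.normal(0, 15000, size=200)
X = np.column_stack([area, bedrooms])

X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=0
)

# The two models besides Random Forest are assumptions, not taken from the slides.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree Regressor": DecisionTreeRegressor(random_state=0),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training set and score it on the test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

best_model = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: R2 = {score:.3f}")
print("Highest R2:", best_model)
```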
Classification Models
From the above graph we conclude that since the error rate does not fluctuate after k=23, we choose the k value as k=23.
From the graph we conclude that since the accuracy neither increases nor decreases at k=23, the chosen k value is appropriate.
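The error-rate curve used to choose k can be computed as in the sketch below. The data comes from scikit-learn's `make_classification` and the range of k values (1 to 40) is an assumption, so the best k found here is not expected to match k=23.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class data, for illustration only.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Error rate on the test set for each candidate k (assumed range 1..40).
k_values = list(range(1, 41))
error_rates = []
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    error_rates.append(np.mean(knn.predict(X_test) != y_test))

# Smallest k that reaches the lowest error rate.
best_k = k_values[int(np.argmin(error_rates))]
print("k with lowest error rate:", best_k)
```

Plotting `error_rates` against `k_values` gives the kind of graph from which a stable k can be read off.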
K-Nearest Neighbors Classifier
Classification Report Confusion Matrix Heat Map
From the above analysis result we conclude that the overall accuracy score of the model is 0.71, i.e. about 71% of the test cases are classified correctly, so the model fits reasonably well.
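The relation between the confusion matrix and the accuracy score can be illustrated with scikit-learn's metrics. The true and predicted labels below are hypothetical and are not the project's results; the class names passed to `target_names` are also assumed.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical labels: 1 = will take a house loan, 0 = will not.
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Accuracy is the share of correct predictions, i.e. the diagonal of the
# confusion matrix divided by the total number of samples.
accuracy = accuracy_score(y_true, y_pred)

report = classification_report(y_true, y_pred, target_names=["No loan", "Loan"])

print(cm)
print(accuracy)
print(report)
```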
Accuracy of the Classification Models
By comparing the accuracy of the classification models, we conclude that the K-Nearest Neighbor Classifier predicts more accurately than the other classification models: it has the highest accuracy score, i.e. 0.7073170731707317.
Hence, the K-Nearest Neighbor Classifier is used for predicting whether or not the user will take a house loan to buy a house.
House Price Prediction System