House Prices Advanced Regression Techniques
https://doi.org/10.22214/ijraset.2023.49031
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue II Feb 2023- Available at www.ijraset.com
Abstract: Data mining is increasingly used in the real estate industry. Its capacity to extract useful information from raw data
makes it especially well suited to predicting home values, identifying key housing characteristics, and many other tasks.
Research indicates that homeowners and the real estate industry are frequently anxious about price swings. This paper reviews
the literature to examine the most useful models and the most important attributes for predicting home values. The findings
confirm that Random Forest and XGBoost are the most effective of the models compared. Additionally, our results suggest that
locational and structural characteristics are significant predictors of housing values. This study should be helpful, particularly
to housing developers and academics, in identifying the most effective machine learning model for research in this field and
the most significant factors that influence home prices.
Keywords: Advanced regression, random forest, data mining, machine learning, XGBoost
I. INTRODUCTION
Alongside food, water, and other necessities, housing is one of the most basic requirements of human life. The demand for
housing has increased in tandem with people's living standards. Most people worldwide purchase a home as a place to live,
although some build or buy homes as an investment or a source of income.
A nation's currency, which serves as a crucial economic indicator, has a positive impact on housing markets. To meet housing
demand, homebuilders and contractors purchase raw materials, while homeowners purchase household goods such as furniture and
appliances, which shows how new housing supply feeds into the economy. In addition, a high housing supply indicates that
consumers have the capital to make large investments and that the construction industry is in good condition.
Numerous human rights groups and international organizations have emphasized the significance of housing. Housing is deeply
embedded in the political, financial, and economic structures of every nation. Nevertheless, homeowners, builders, and the real
estate industry have long been concerned about the volatility of home prices, and significant price increases in the housing
markets of many nations have made homes unaffordable. A potential rise in property prices affects both the national economy and
residents' quality of life. Finally, investors who build homes as an investment are also affected. Demand for homes rises every
year, which drives up house prices. The difficulty is that many variables, such as location and property demand, can influence
the price of a home. To support investors in making decisions and builders in setting prices, most stakeholders, including
buyers, developers, builders, and the real estate industry, want to know exactly which characteristics or factors affect the
price of a house.
House prices can be predicted using a variety of machine learning models, including support vector regression and artificial
neural networks. Developers, property investors, and home buyers all benefit from a house-price model in different ways. Such a
model gives home buyers, investors, and builders a wealth of information, such as a valuation of the current market price of a
home, which helps them determine a suitable price. It can also help prospective buyers work out which features fit their budget.
Previous studies examined the factors that influence home prices and forecast prices with a machine learning model as separate
tasks. This article, by contrast, combines the two: the attributes of homes and their predicted prices.
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 371
As a result, the goal of the research is both to guide firms in the sector and to contribute to the literature. The study used a
124-month dataset of total house sales in Turkey covering January 2008 (2008:1) to April 2018 (2018:4). The sales time series was
modeled using LSTM (Long Short-Term Memory, a nonlinear model) and ARIMA (Autoregressive Integrated Moving Average, a linear
model). A hybrid LSTM-ARIMA model was then developed and applied to improve the forecasts. When the MAPE (Mean Absolute
Percentage Error) and MSE (Mean Squared Error) values obtained from each of these methods were compared, the hybrid model showed
the best performance with the lowest error rate. The fact that all of the models produced very close results demonstrates the
consistency of the forecasts. The study is therefore expected to make a significant contribution to the literature.
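The two error measures named above can be written out directly. The following is a minimal sketch using only the standard definitions of MSE and MAPE; the sales figures are invented for illustration and are not taken from the Turkish dataset.

```python
def mse(actual, predicted):
    """Mean Squared Error: average of the squared forecast errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly sales and forecasts (illustrative values only).
actual_sales = [100.0, 200.0, 400.0]
forecast_sales = [110.0, 180.0, 400.0]

print("MSE :", mse(actual_sales, forecast_sales))
print("MAPE:", mape(actual_sales, forecast_sales), "%")
```

MAPE is scale-free, which makes it convenient for comparing models across series of different magnitudes, while MSE penalizes large errors more heavily.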
D. Housing Price Prediction Using Machine Learning Algorithms: The Case of Melbourne City, Australia
Estimating the price of a home is a fundamental aspect of real estate. The literature attempts to derive useful knowledge from
historical data on the property market. In this study, machine learning techniques are applied to past property transactions in
Australia to build models that are useful to home buyers and sellers. The analysis reveals a wide gap in property prices between
Melbourne's most expensive and most affordable areas. Furthermore, the experiments show that combining Stepwise and Support
Vector Machine, evaluated by mean squared error, is a competitive approach.
B. Gradient Boost
Since its introduction in 1999, gradient boosting has become a widely used machine learning (ML) technique because of its
efficiency, accuracy, and interpretability. It excels at ML tasks such as multiclass classification, click prediction, and
ranking. With the recent emergence of big data, gradient boosting faces new challenges, especially in balancing accuracy and
efficiency. Gradient boosting has several parameters. The following procedure may be used to set them so as to maintain a balance
between fit and regularization: (1) setting the regularization parameters (lambda, alpha), and (2) reducing the learning rate
and determining the optimal parameters.
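To make the role of the learning rate concrete, the following is a minimal sketch of gradient boosting for squared-error regression, using one-split decision stumps on a single feature. It is an illustration under stated assumptions, not the implementation used in this paper: the helper names (fit_stump, fit_boosting) and the floor-area and price values are hypothetical.

```python
def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) for the split with least squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # every candidate leaves both sides non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosting(x, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps to the residuals round by round, shrinking each step."""
    base = sum(y) / len(y)           # initial prediction: the mean target
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, predictions)
        ]
    return base, stumps, predictions


def mean_squared_error(y, predictions):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y)


# Hypothetical data: floor area (square metres) and price (thousands).
floor_area = [50, 60, 80, 100, 120, 150]
price = [150, 170, 240, 310, 350, 450]

base, stumps, fitted = fit_boosting(floor_area, price)
baseline_mse = mean_squared_error(price, [base] * len(price))
boosted_mse = mean_squared_error(price, fitted)
print("baseline MSE:", baseline_mse, "boosted MSE:", boosted_mse)
```

A smaller learning rate makes each round contribute less, so more rounds are needed to reach the same training fit; this is the trade-off behind step (2) above.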
C. XG Boost Algorithm
XGBoost, or Extreme Gradient Boosting, is a sensible choice when a very fast ML algorithm is needed: it works with tree-based
models and aims for state-of-the-art accuracy while making efficient use of CPU resources. The XGBoost algorithm, created by
Tianqi Chen, has gained prominence through its widespread use in hackathons and Kaggle competitions. In short, XGBoost is a
decision-tree-based ensemble learning framework that uses gradient descent to optimize its underlying objective function and
offers a high degree of flexibility while delivering the desired results by making full use of the available processing capacity.
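The effect of the regularization parameters lambda and alpha can be illustrated with the closed-form leaf value used in regularized tree boosting. The sketch below assumes squared-error loss (gradient = prediction minus target, Hessian = 1 per sample) and an L1 soft-threshold on the gradient sum; the function names are hypothetical, and only the parameter names reg_lambda and reg_alpha follow XGBoost's naming.

```python
def soft_threshold(gradient_sum, alpha):
    """Shrink the gradient sum toward zero by alpha (L1 regularization)."""
    if gradient_sum > alpha:
        return gradient_sum - alpha
    if gradient_sum < -alpha:
        return gradient_sum + alpha
    return 0.0


def leaf_weight(gradient_sum, hessian_sum, reg_lambda=1.0, reg_alpha=0.0):
    """Leaf value minimizing the regularized second-order objective for one leaf."""
    return -soft_threshold(gradient_sum, reg_alpha) / (hessian_sum + reg_lambda)


# Hypothetical leaf with three samples whose residuals (target - prediction) are 10, 12, 8.
# Under squared error the gradients are the negated residuals and each Hessian is 1.
gradient_sum = -(10.0 + 12.0 + 8.0)
hessian_sum = 3.0

unregularized = leaf_weight(gradient_sum, hessian_sum, reg_lambda=0.0)   # mean residual
l2_only = leaf_weight(gradient_sum, hessian_sum, reg_lambda=1.0)
l1_and_l2 = leaf_weight(gradient_sum, hessian_sum, reg_lambda=1.0, reg_alpha=6.0)
print(unregularized, l2_only, l1_and_l2)
```

With no regularization the leaf simply predicts the mean residual; increasing lambda or alpha pulls the leaf value toward zero, which is how these parameters trade fit for smoother models.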
IV. DATASETS
1) Upload Dataset
2) Data Preprocessing
3) Feature Extraction
4) Model Generation
5) Random Forest Classifier
6) XGBoost Classifier
7) Accuracy Prediction
A. Data Collection
This step was completed by the owners of the original dataset. We examined the composition of the dataset, the relationships
among its attributes, and a description of its key features and of the dataset as a whole. The dataset is further split into 66%
for training and 33% for testing the algorithms. Moreover, each class in the full dataset should be represented in roughly the
correct proportion in both the training and testing sets so that each is a representative sample. Different training-to-testing
ratios are used in this article.
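The requirement that each class keep roughly the same proportion in both sets is usually called a stratified split. The following is a minimal sketch; the function name stratified_split and the price-band labels are hypothetical and are used only to illustrate the 66%/33% division described above.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.33, seed=0):
    """Return (train_indices, test_indices) with per-class proportions preserved."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    train, test = [], []
    for indices in indices_by_class.values():
        rng.shuffle(indices)                       # randomize within each class
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical price-band labels for 99 houses.
labels = ["low"] * 60 + ["mid"] * 30 + ["high"] * 9
train_idx, test_idx = stratified_split(labels)

print("train size:", len(train_idx), "test size:", len(test_idx))
print("test class counts:", Counter(labels[i] for i in test_idx))
```

Because the split is made class by class, a rare class such as "high" cannot end up entirely in one of the two sets, as can happen with a purely random split.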
B. Data Processing
The data obtained may contain missing values, which lead to inconsistencies. To obtain better results, the data must be
pre-processed to improve the algorithm's performance. Outliers should be removed, and variable transformation should be
performed. We use the map function to address these issues.
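The three preprocessing steps mentioned above can be sketched as follows. This is an illustration under assumptions, not the paper's actual pipeline: median imputation, the 1.5 x IQR outlier rule, and the log transform are common choices substituted here because the text does not specify which methods were used, and the price values are invented.

```python
import math
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def remove_outliers(values, k=1.5):
    """Keep values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical sale prices with one missing value and one extreme value.
raw_prices = [120000, 135000, None, 150000, 142000, 128000, 2500000, 138000]

filled = impute_median(raw_prices)
cleaned = remove_outliers(filled)
# Variable transformation applied element-wise with the built-in map function.
log_prices = list(map(math.log1p, cleaned))

print("after imputation:", filled)
print("after outlier removal:", cleaned)
```

The log transform compresses the long right tail that house prices typically have, which often helps regression models.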
C. Model Generation
Machine learning is the process of detecting and learning patterns in data in order to produce appropriate predictions. ML
algorithms search for patterns in data and learn from them, and an ML model improves with each training iteration. To assess the
effectiveness of a model, the data must first be split into training and test sets. Accordingly, before training our models, we
divided the data into two sets: the training set, which contained 70% of the full dataset, and the test set, which contained the
remaining 30%. We then applied a set of performance measures to our model's predictions. In this case, we attempted to predict
whether an individual would default on a debt. Accuracy should not be the only metric used to evaluate how well a model
performs; the F1 score and the confusion matrix should also be considered. What matters is that appropriate performance metrics
are chosen for each situation.
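The relationship between accuracy, the confusion matrix, and the F1 score can be shown with a short sketch for a binary task. The labels below are invented for illustration (1 = positive class, 0 = negative class) and the function names are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Return (true_pos, false_pos, false_neg, true_neg) for binary labels 0/1."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    true_neg = sum(1 for a, p in pairs if a == 0 and p == 0)
    return true_pos, false_pos, false_neg, true_neg


def accuracy(actual, predicted):
    true_pos, false_pos, false_neg, true_neg = confusion_matrix(actual, predicted)
    return (true_pos + true_neg) / (true_pos + false_pos + false_neg + true_neg)


def f1_score(actual, predicted):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    true_pos, false_pos, false_neg, _ = confusion_matrix(actual, predicted)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical test-set labels and model predictions.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

print("confusion matrix (TP, FP, FN, TN):", confusion_matrix(actual, predicted))
print("accuracy:", accuracy(actual, predicted))
print("F1 score:", f1_score(actual, predicted))
```

Accuracy counts the many correctly predicted negatives, whereas F1 ignores true negatives, so the two can diverge sharply when one class is rare.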
A. Disadvantages
Previously, houses had to be searched for manually, which was a tedious process.
Various prediction models (machine learning models such as Random Forest and XGBoost) may be used to estimate house prices. A
house-price model offers several benefits to home buyers, property investors, and builders. It provides a wealth of information,
such as a valuation of current market prices, which helps them determine house pricing. It can also help prospective buyers
identify the property features that fit their budget. Previous research examined the factors that influence home prices and
forecast house prices with an ML model as separate tasks. This article, by contrast, combines predicted home prices with
property attributes.
B. Advantages
This model can help prospective buyers determine which property features they can afford within their budget.
VI. CONCLUSION
This paper reviewed and assessed current research on the significant attributes of house prices, as well as the data mining
approaches used to predict them. Properties in desirable locations, for example close to a shopping mall or other amenities, are
more expensive than homes in rural areas with fewer amenities. With an accurate prediction model, investors, home buyers, and
developers would be able to estimate a reasonable price for a home. This work also discussed the attributes that earlier studies
used to predict property prices with different prediction models. Taken together, the findings show that Random Forest and
XGBoost are capable of predicting property values. These models were built using various input parameters and show a strong
positive relationship with property price. Finally, the goal of this study was to assist other researchers in developing a
practical model that can readily and reliably predict property values. Further work on a real-world model is needed, with our
results used for validation.
REFERENCES
[1] S. Temür, M. Akgün, and G. Temür, “Predicting Housing Sales in Turkey Using Arima, Lstm and Hybrid Models,” J. Bus. Econ. Manag., vol. 20, no. 5, pp.
920–938, 2019, doi: 10.3846/jbem.2019.10190.
[2] A. Ebekozien, A. R. Abdul-Aziz, and M. Jaafar, “Housing finance inaccessibility for low-income earners in Malaysia: Factors and solutions,” Habitat Int., vol.
87, no. April, pp. 27–35, 2019, doi: 10.1016/j.habitatint.2019.03.009.
[3] A. Jafari and R. Akhavian, “Driving forces for the US residential housing price: a predictive analysis,” Built Environ. Proj. Asset Manag., vol. 9, no. 4, pp.
515–529, 2019, doi: 10.1108/BEPAM-07-2018-0100.
[4] Choong Wei Cheng, “Statistical Analysis of Housing Prices in Petaling,” Universiti Tunku Abdul Rahman, 2018.
[5] R. E. Febrita, A. N. Alfiyatin, H. Taufiq, and W. F. Mahmudy, “Data-driven fuzzy rule extraction for housing price prediction in Malang, East Java,” 2017 Int.
Conf. Adv. Comput. Sci. Inf. Syst. ICACSIS 2017, vol. 2018-Janua, pp. 351–358, 2018, doi: 10.1109/ICACSIS.2017.8355058.
[6] G. Gao et al., “Location-Centered House Price Prediction: A Multi-Task Learning Approach,” pp. 1–14, 2019, [Online]. Available:
http://arxiv.org/abs/1901.01774.
[7] T. D. Phan, “Housing price prediction using machine learning algorithms: The case of Melbourne city, Australia,” Proc. - Int. Conf. Mach. Learn. Data Eng.
iCMLDE 2018, pp. 8–13, 2019, doi: 10.1109/iCMLDE.2018.00017.
[8] Y. Y. S. Song, T. Zhou, H. Yachi, and S. Gao, “Forecasting house price index of China using dendritic neuron model,” PIC 2016 - Proc. 2016 IEEE Int. Conf.
Prog. Informatics Comput., pp. 37–41, 2017, doi: 10.1109/PIC.2016.7949463.
[9] R. Aswin Rahadi, S. K. Wiryono, D. P. Koesrindartoto, and I. B. Syamwil, “Factors Affecting Housing Products Price in Jakarta Metropolitan Region,” Int. J.
Prop. Sci., vol. 6, no. 1, pp. 1–21, 2016, doi: 10.22452/ijps.vol6no1.2.
[10] A. Nur, R. Ema, H. Taufiq, and W. Firdaus, “Modeling House Price Prediction using Regression Analysis and Particle Swarm Optimization Case Study :
Malang, East Java, Indonesia,” Int. J. Adv. Comput. Sci. Appl., vol. 8, no. 10, pp. 323–326, 2017, doi: 10.14569/ijacsa.2017.081042.
[11] A. Yusof and S. Ismail, “Multiple Regressions in Analysing House Price Variations,” Commun. IBIMA, vol. 2012, pp. 1–9, 2012, doi: 10.5171/2012.383101.
[12] A. Osmadi, E. M. Kamal, H. Hassan, and H. A. Fattah, “Exploring the elements of housing price in Malaysia,” Asian Soc. Sci., vol. 11, no. 24, pp. 26–38,
2015, doi: 10.5539/ass.v11n24p26.
[13] T. L. Chin and K. W. Chau, “A critical review of literature on the hedonic price model,” Int. J. Hous. Sci. Its Appl., vol. 27, no. 2, pp. 145–165, 2003.
[14] M. J. Ball, “Recent Empirical Work on the Determinants of Relative House Prices,” Urban Stud., vol. 10, no. 2, pp. 213–233, 1973, doi:
10.1080/00420987320080311.
[15] M. Rodriguez, “Managing Corporate Real Estate: Evidence from the Capital Markets,” Journal of Real Estate Literature, 1996.
[16] H. Vasani, H. Gandhi, S. Panchal, and S. Mishra, “House Price Prediction Using Advanced Regression Techniques,” Dec. 2022.
[17] J. Ponnian, S. Pari, U. Ramadass, and C. P. Ooi, “A Unified Libraries for GDI Logic to Achieve Low-Power and High-Speed Circuit Design,” Dec. 2022.