
House Price Predicting Model using Machine Learning

Rahul Chauhan, Graphic Era Hill University, Dehradun, India, [email protected]
Prof. Kamal Ghanshala, Graphic Era University, Dehradun, India, [email protected]
Utkarsh Gupta, Graphic Era Hill University, Dehradun, India, [email protected]

Abstract: Everyone wishes to buy and live in a dream house that suits their lifestyle and provides
facilities according to their needs. The main parameters people look at are the surrounding
locality, the area of the house in square feet, the number of rooms and bathrooms, the location,
etc.; these are the features used to predict the house price. This model helps people select a
house that is suitable for their living. Because people are very concerned about their budgets
before buying anything expensive, the model also helps them choose houses within their budgets,
so that the purchase does not affect their financial state in the future. The model predicts the
price of a house according to the buyer's requirements. This study implements several machine
learning algorithms: Linear Regression (LR), Gradient Boosting Regression (GBR) and Random
Forest (RF) Regression. Finally, the algorithm that yields the highest accuracy is used for
predicting the house price.

Keywords: Machine Learning (ML), House Price Prediction, Regression Techniques, ML Algorithm.

I. INTRODUCTION

Buying a house is one of the most important decisions a person makes in life. Everyone dreams of living
in their dream house at a price within their budget. The price of a house may depend on a wide variety of
factors, such as its location, its features, and the demand for the property. Predicting house values is
therefore beneficial not only for buyers but also for real estate agents and economic professionals.
Without data we cannot train a model, which is why data is called the heart of a machine learning model.
We therefore give the model information such as the location of the house, the number of bedrooms and
bathrooms, and other amenities so that it can predict the house price accurately. Machine learning builds
such models from previous data and uses them to make predictions on new data. Demand for housing is
increasing daily because the population is rising rapidly. People who do not know the actual price of a
particular house may lose money. Of the dataset used for our model, 80% is used for training and the
remaining 20% for testing. Many algorithms can be used to predict house prices; here I have used the
Linear Regression algorithm to perform the prediction. The workflow consists of the following steps:

1. Collecting data: Gather a large dataset containing information about houses, such as the location of
the house, the number of bedrooms and bathrooms, the year in which the house was built, etc.

2. Data cleaning: Clean the data by handling missing values, outliers and duplicates. Convert
categorical features into numerical features using techniques such as one-hot encoding, label encoding,
or ordinal encoding.

3. Splitting the data: Split the data into two sets: (i) a training set, which is used to train the
model, and (ii) a testing set, which is used to evaluate the model's performance.

4. Training the model: Train a regression model on the training set. In this way the model learns to
predict house prices based on the features of the house.

5. Evaluating the model: Use the testing set to evaluate the model's performance by calculating metrics
such as mean squared error (MSE), mean absolute error (MAE) and the R-squared value.

6. Deploying the model: Once you are satisfied with the model's performance, deploy it to make
predictions on new data.

7. Monitoring the model: Keep monitoring the model's performance and update it periodically to ensure
that it stays accurate.
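
The evaluation metrics named in step 5 can be illustrated with a minimal sketch in plain Python. The function names and the four example prices are hypothetical and are not taken from the paper's dataset; they only show how MSE, MAE and R-squared are computed from actual and predicted values.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted house prices (arbitrary currency units).
actual = [50.0, 80.0, 120.0, 65.0]
predicted = [55.0, 75.0, 110.0, 70.0]

mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MSE={mse:.2f}  MAE={mae:.2f}  R^2={r2:.3f}")
```

MSE penalizes large errors more heavily than MAE because the errors are squared, which is why the two are usually reported together.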

This work aims to explain how machine learning techniques can be applied to solve real-world problems
in the real estate industry, and how accurately prices can be predicted using data-driven approaches.
The model can also help buyers and sellers make better decisions and improve the overall efficiency
of the real estate market.

II. LITERATURE SURVEY

Much prior work has addressed house price prediction. Different levels of accuracy have been achieved
using different methodologies, techniques and datasets. Bahia [11] studied real estate market
forecasting of house prices using data mining techniques. The main idea was to construct the model
using two types of neural network: a Feed Forward Neural Network (FFNN) and a Cascade Forward Neural
Network (CFNN). CFNN was observed to give better results than FFNN on the MSE performance metric.

Madhuri presented a comparative study of house price prediction using regression techniques such as
Elastic Net, multiple linear, Ridge, and LASSO regression. Common parameters of the house, namely the
price and the area in square feet, were used. Tang et al. studied house price prediction based on an
ensemble learning algorithm, which they regard as a strong tool for prediction. The random forest
algorithm and the ensemble learning algorithm were compared, and the ensemble learning algorithm
achieved better accuracy than the random forest algorithm.

Darshil Shah, Volume 7 Issue 3 (2020) [2], notes that existing house price prediction models carry high
risk and rely on outdated datasets. To overcome this, the authors propose a new automated system with
better prediction. They use techniques such as XGBoost, LightGBM, and Random Forest to train the model
and predict house prices, and they created an RPA that generates more accurate and consistent results
with fewer errors, which helps customers make better decisions.

G. Naga Satish, Volume 8 Issue 9 (2019) [4], used algorithms such as Linear Regression, Lasso
Regression, and Gradient Boosting with different predictor variables and plotted the outputs as bar
graphs. They found that 3-bedroom houses are the most common and 7-bedroom houses the least common,
and on this basis they built a model that can predict the value of a house.

P. Durganjali proposed house resale price prediction using classification algorithms. In that paper,
the resale price of a house is predicted using algorithms such as Linear Regression, Decision Tree,
K-Means and Random Forest. Many factors affect the house price, including physical attributes, location
and economic factors. RMSE is used as the performance metric; the algorithms are applied to different
datasets to find the most accurate model.

Sifei Lu proposed a hybrid regression technique for house price prediction. The paper examines creative
feature engineering with a limited dataset and limited data features. The proposed approach has
recently been deployed as the key kernel for the Kaggle challenge “House Prices: Advanced Regression
Techniques”. The goal of the paper is to predict reasonable prices for customers with respect to their
budgets and priorities.

Patel and Upadhyay discussed various pruning methods and their features and evaluated their pruning
effectiveness. They also measured accuracy on the glass and diabetes datasets using the WEKA tool under
various pruning factors. The ID3 algorithm splits on attributes based on their entropy. TDIDT is an
algorithm that constructs a set of classification rules through the intermediate representation of a
decision tree. The WEKA interface is used for testing datasets with a variety of open-source machine
learning algorithms.

III. METHODOLOGY

In this model, I focus on predicting house prices using machine learning algorithms such as Linear
Regression. The proposed system, “House Price Prediction Using Machine Learning”, predicts the house
price from multiple features. The model is trained on features such as the number of bedrooms and
bathrooms, the area of the house in square feet, and the location of the house. Of the historical data,
80% is used for training and the remaining 20% for testing. The raw data is stored in a ‘.csv’ file. I
have mainly used three libraries: ‘pandas’, ‘numpy’ and ‘sklearn’. pandas is used to load the ‘.csv’
file into a Jupyter notebook and to clean and manipulate the data. NumPy is used for numerical array
operations. sklearn (scikit-learn) is used for the actual analysis; it contains many built-in functions
that help solve the problem, including the train_test_split function used for the train-test split.

1. Data Collection- A dataset is a collection of data used for prediction purposes. Datasets can hold
any type of record that is stored in the system. Machine learning projects require a large amount of
data, because without data we cannot train the model. An ideal dataset has either well-labeled fields
and members or a data dictionary that can be used to relabel the data. A good dataset is complete,
reliable, and accurate. A dataset can also be thought of as a container for storing data. We examined
various datasets on Kaggle that would suit our project objective. After looking at many datasets, we
selected a house pricing dataset for the city of Bangalore.

2. Data cleaning- Data cleaning is the process of detecting and removing errors to increase the value
of the data. It is carried out with the help of data wrangling tools. It is the process of identifying
and correcting inaccurate records in a record set, table or database. It finds incomplete data and
replaces dirty data. The data is modified to ensure that it is accurate and correct. Data cleaning is
used to make a dataset consistent.
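
A minimal pandas sketch of these cleaning operations is given here for illustration. The column names (location, total_sqft, bath, price), the sample rows, and the 1.5 × IQR outlier rule are assumptions chosen for the example; the paper does not state which rule or columns were actually used.

```python
import pandas as pd

# Hypothetical raw listings; column names and values are illustrative assumptions.
raw = pd.DataFrame({
    "location": ["Whitefield", "Whitefield", "Hebbal", "Hebbal", "Indiranagar", None],
    "total_sqft": [1200.0, 1200.0, 1500.0, 1400.0, 90000.0, 1100.0],
    "bath": [2, 2, 3, 2, 2, 2],
    "price": [60.0, 60.0, 95.0, 85.0, 70.0, 55.0],
})


def clean(df):
    """Remove duplicate rows, rows without a location, and area outliers."""
    df = df.drop_duplicates()
    df = df.dropna(subset=["location"])
    # Assumed outlier rule: keep areas within 1.5 * IQR of the quartiles.
    q1 = df["total_sqft"].quantile(0.25)
    q3 = df["total_sqft"].quantile(0.75)
    iqr = q3 - q1
    in_range = df["total_sqft"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[in_range].reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

In this sample the duplicated Whitefield row, the row with a missing location, and the 90000 sq ft entry are dropped, leaving three rows.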

3. Pre-processing of the dataset- This step covers pre-processing the dataset and splitting it into a
train (train.csv) and a test (test.csv) dataset. The dataset contained non-numerical features such as
the location of the property, the condition of the property, ventilation, etc.

We converted these non-numerical features into numerical features using the OneHotEncoder and
LabelEncoder classes from the scikit-learn library. The dataset also contained empty cells; we replaced
them with the mean of the column using the SimpleImputer class from the scikit-learn library. Here, the
target feature is the sale price.

The dataset is split into training and test sets in the ratio 8:2 using the train_test_split function.
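
The following sketch shows one way to combine mean imputation, one-hot encoding and the 8:2 split with scikit-learn. The column names and values are hypothetical, only OneHotEncoder is shown (not LabelEncoder), and fitting the transformers after the split is a choice made for this example, not necessarily the order used in the paper.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: one categorical feature, one numeric feature with gaps.
data = pd.DataFrame({
    "location": ["A", "B", "A", "C", "B", "A", "C", "B", "A", "C"],
    "total_sqft": [1000.0, np.nan, 1200.0, 1500.0, 900.0,
                   np.nan, 1600.0, 950.0, 1100.0, 1550.0],
    "price": [50.0, 45.0, 60.0, 90.0, 42.0, 58.0, 95.0, 44.0, 55.0, 92.0],
})
X = data[["location", "total_sqft"]]
y = data["price"]  # target feature

# 8:2 split of rows into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

preprocess = ColumnTransformer(
    [
        # Replace empty numeric cells with the column mean.
        ("num", SimpleImputer(strategy="mean"), ["total_sqft"]),
        # Turn each location into its own 0/1 column.
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["location"]),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

# Fit on the training rows only, then reuse the fitted transformers on the test rows.
X_train_enc = preprocess.fit_transform(X_train)
X_test_enc = preprocess.transform(X_test)
print(X_train_enc.shape, X_test_enc.shape)
```

Fitting the imputer and encoder on the training rows alone keeps information from the test rows out of the training process; imputing before the split, as the text describes, is simpler but lets the test rows influence the column mean.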

4. Training the model- Here the data is divided into two parts, training and testing. 80 percent of the
data is used for training the model and the remaining 20 percent is used for testing. Training the
model means fitting a machine learning algorithm to the training dataset, which consists of sample
outputs with the corresponding sets of input data, as represented in the figure.

5. Testing the model- Once the model is trained, it is tested on the test dataset. The model outputs
predictions for the processed dataset, and these are compared with the known values to measure how
accurate the model is. That is, validation/testing is done for the model that was built. Test datasets
are used to evaluate machine learning programs that have been trained on an initial training dataset.
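
As an illustration of the training and testing steps, the sketch below fits scikit-learn's LinearRegression on 80% of a synthetic dataset and scores it on the remaining 20%. The data is generated from an assumed linear price rule with noise; it stands in for the Bangalore dataset, which is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): price depends linearly on area and bedrooms.
rng = np.random.default_rng(42)
n = 200
sqft = rng.uniform(500, 3000, n)
bedrooms = rng.integers(1, 6, n)
price = 0.05 * sqft + 8.0 * bedrooms + rng.normal(0, 5, n)

X = np.column_stack([sqft, bedrooms])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=42
)

# Training: fit the regression coefficients on the 80% training portion.
model = LinearRegression().fit(X_train, y_train)

# Testing: predict on the held-out 20% and compare with the true prices.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE={mse:.2f}  MAE={mae:.2f}  R^2={r2:.3f}")
```

Because the test rows are never seen during fitting, these scores estimate how the model would behave on new listings rather than how well it memorized the training data.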

6. Simple Linear Regression- In this type of regression model, a linear relationship is established
between the target variable, which is the dependent variable (Y), and a single independent variable (X).
The linear relationship between the dependent and independent variables is established by fitting a
regression line between them. The equation of the line is given by:

Y = a + bX (1)

where ‘a’ and ‘b’ are the model parameters, called regression coefficients. When X is 0, Y equals ‘a’,
which is the Y-intercept of the line; ‘b’ is the slope, which gives the change in Y per unit change in
X. If the value of ‘b’ is large, a small change in X produces a large change in Y, and vice versa. To
compute the values of ‘a’ and ‘b’ we use the Ordinary Least Squares method. The values predicted by the
linear regression model may not always be accurate. To account for the difference between predicted and
actual values, we add an error term ε to equation (1):

Y = a + bX + ε (2)
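
A minimal sketch of the Ordinary Least Squares computation for the simple case is given below in plain Python. It uses the standard closed-form solution, slope = Sxy / Sxx and intercept = mean(Y) - slope × mean(X); the five data points are hypothetical.

```python
def fit_simple_ols(xs, ys):
    """Return intercept a and slope b minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx            # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Hypothetical data: area in hundreds of square feet and price.
area = [10.0, 12.0, 15.0, 18.0, 20.0]
price = [46.0, 52.0, 66.0, 76.0, 85.0]

a, b = fit_simple_ols(area, price)

# Residuals play the role of the error term: actual minus fitted value.
residuals = [y - (a + b * x) for x, y in zip(area, price)]
print(f"a={a:.3f}  b={b:.3f}")
```

With an intercept in the model, the least-squares residuals always sum to zero, which is a quick sanity check on any implementation.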

Some assumptions are made in simple linear regression, as follows:

1. The number of observations must be greater than the number of parameters.

2. The regression is valid only over the restricted range or period covered by the data.

3. The error term has an expected value of 0. It is also commonly assumed to be normally distributed;
a zero mean alone does not imply normality.

7. Polynomial Regression- It is an extension of simple linear regression and, because it remains linear
in its coefficients, a special case of multiple linear regression. Whereas linear regression fits a
straight regression line between the dependent and independent variables, here a straight line fits
poorly because the relationship between the target variable and the predictor variable is not linear.
Instead of a straight line, a curve is fitted to the two variables. This is accomplished by fitting a
polynomial equation of degree n to the nonlinear data, which gives a curvilinear relationship between
the dependent and independent variables. In polynomial regression the predictor terms (X, X^2, ..., X^n)
are not independent of each other, unlike the single predictor in simple linear regression. The equation
of polynomial regression is as follows:

Y = a + b1X + b2X^2 + b3X^3 + ... + bnX^n (3)

The advantages of polynomial regression are as follows:

1. Polynomial regression can closely approximate a nonlinear relationship between the dependent and
independent variables.

2. Increasing the degree of the polynomial improves the fit to the training data, although a very high
degree may overfit.

3. A wide range of curves can be fitted by varying the degree of the model.

The disadvantage of polynomial regression is as follows:

1. It is very sensitive to outliers in the dataset, since outliers increase the variance of the model,
and the model then underperforms when it encounters unseen data points.
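
The effect of the polynomial degree can be explored with a small NumPy sketch. The quadratic "true" curve, the noise level and the chosen degrees are assumptions made for this illustration; np.polyfit performs the least-squares fit of equation (3).

```python
import numpy as np

rng = np.random.default_rng(0)


def true_curve(x):
    """Assumed underlying nonlinear relationship (a quadratic)."""
    return 1.0 + 2.0 * x - 3.0 * x ** 2


# Noisy training points and a separate set of held-out points.
x_train = np.linspace(0.0, 1.0, 20)
x_test = np.linspace(0.025, 0.975, 20)
y_train = true_curve(x_train) + rng.normal(0, 0.1, x_train.size)
y_test = true_curve(x_test) + rng.normal(0, 0.1, x_test.size)


def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))


results = {}
for degree in (1, 2, 6):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_err = mse(y_train, np.polyval(coeffs, x_train))
    test_err = mse(y_test, np.polyval(coeffs, x_test))
    results[degree] = (train_err, test_err)
    print(f"degree={degree}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```

The training error can only stay the same or decrease as the degree grows, because each higher-degree model contains the lower-degree ones. The error on held-out points carries no such guarantee, which is why the degree is usually chosen by comparing test or validation error rather than training error.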

8. Data Analysis- Before giving the data to any model, we have to be sure that all the data is accurate
and ready to use. To do this, we analyzed our dataset based on its features, their characteristics, and
the relations among the features. From the analysis we found
