PROJECT REPORT
On
House Price Prediction Using Machine Learning
Submitted for the partial fulfillment of the requirement for the degree of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE ENGINEERING - AIML
Submitted By
Name - Samyam Prakash
Roll - 2204037
Reg no. – 2201287660
Guided By
Prof. Sidhartha Samanta
GITA AUTONOMOUS COLLEGE
BHUBANESWAR
April 2025
COMPUTER SCIENCE ENGINEERING - ARTIFICIAL INTELLIGENCE
GITA AUTONOMOUS COLLEGE, BHUBANESWAR
Ref No.: - ……………… Date: - ………………...
Certificate
This is to certify that the project report entitled “Title of the project” submitted by
Mr./Ms.------------, Reg No. ------------------ is an authentic work carried out by
him/her at GITA Autonomous College, Bhubaneswar under my guidance. This
project is for the partial fulfillment for the degree of Bachelor of Technology.
Prof Sidharth Samanta Prof. (Dr.) Prasanta Kumar Bal
(Guide, Dept. of CSE-AI) (H.O.D., Dept. of CSEAI/AIML)
Examined By
(External)
ACKNOWLEDGEMENT
I express my gratitude to Prof. Sidharth Samanta, project
supervisor, for the guidance and constant support provided.
I also take this opportunity to thank Prof. (Dr.) Prasanta Kumar
Bal, Head of Department, Computer Science Engineering –
Artificial Intelligence / AIML, for his constant support and timely
advice.
Lastly, words fall short to express my gratitude to all the faculty
members of the CSE-AI Department and my friends for their support,
cooperation, constructive criticism, and valuable suggestions during
the preparation of this project report.
Thanking All
(Signature of the student)
Reg. No: __________
ABSTRACT
The prediction of housing prices has become an increasingly important application of data science
and machine learning, driven by the rapid growth of real estate markets and the demand for accurate,
data-driven decision-making. This mini project, titled "House Price Prediction Using Machine
Learning and Flask Deployment," aims to develop an efficient and user-friendly web application
capable of predicting house prices based on multiple attributes such as area, number of bedrooms
and bathrooms, presence of amenities like guest rooms, basement, air conditioning, and the
furnishing status of the house.
The project utilizes a supervised machine learning model trained on a real-world dataset named
Housing.csv. The model was developed using Python libraries such as Pandas, Scikit-Learn, and
Pickle for serialization. The dataset included both numerical and categorical features, which were
carefully preprocessed to enhance the model’s accuracy. The trained model was integrated into a
web-based interface using the Flask framework. A simple, clean web form collects user input, which
is then fed into the model to generate a price prediction.
To ensure better user interaction, functionalities such as automatic form clearing, refresh options,
and proper field validations have been incorporated. The system was deployed locally using the
Werkzeug server integrated within Flask. It demonstrates how machine learning models can be
easily transitioned from notebooks to interactive web applications.
This project highlights the importance of combining software development skills with data science
expertise to create real-world solutions. It also discusses challenges like feature mismatch and model
overfitting, and it outlines future scope such as deploying on cloud platforms, adding more features
like location-based analysis, and integrating dynamic datasets for real-time predictions.
The final system is reliable, scalable for further upgrades, and serves as a mini prototype of what a
full-fledged property price prediction system would look like.
Keywords: Machine Learning, Housing Price Prediction, Flask, Web Deployment, Supervised
Learning.
TABLE OF CONTENTS
Chapter 1 : INTRODUCTION
Objective of the project
Need for the system
Advantages of the system
Related works
Chapter 2 : DEVELOPMENT OF THE SYSTEM
Hardware and software requirements
System requirements and system specifications
About the dataset
Vision of the project
Chapter 3 : IMPLEMENTATION & CODING
System architecture
Code flow and explanation
Chapter 4 : RESULTS
Output screenshots
Interpretation of results
Chapter 5 : FUTURE SCOPE, ADVANTAGES AND DISADVANTAGES
Chapter 6 : CONCLUSION AND DISCUSSION
Chapter 7 : REFERENCES
CHAPTER 1: INTRODUCTION
1.1 Objective of the Project
In today’s rapidly evolving real estate market, determining the
accurate price of a residential property is a critical factor for both
buyers and sellers. Traditional methods of price estimation often
rely on manual comparisons, real estate agents, or rough
approximations, which may be subjective, time-consuming, and
inaccurate.
With the availability of vast real estate data and the advancement
in machine learning, there is a growing need for an automated,
data-driven solution that can offer precise property price
predictions. This can aid not only homeowners and buyers but also
real estate professionals, bankers, and analysts in making informed
decisions.
The system fulfills several key needs:
• Elimination of Manual Effort: Reduces dependence on
traditional, error-prone methods of estimation.
• Data-Driven Insights: Provides accurate price predictions
based on real historical data and measurable house features.
• Increased Transparency: Enables users to understand how
various features impact the price, improving trust and
awareness.
• Time Efficiency: Instantly predicts the price based on entered
attributes, reducing turnaround time.
• Scalability and Accessibility: With a web-based interface, it
can be accessed easily by anyone, anywhere.
This need forms the basis for building a predictive system that
blends machine learning capabilities with an intuitive user interface
for broader utility.
1.2 Need for the System
The primary objective of this mini project is to design and
implement a house price prediction system using a machine
learning model, and integrate it into a web-based application using
the Flask framework.
The specific objectives include:
1. To collect and preprocess real estate data for training a
regression-based machine learning model.
2. To develop a predictive model using supervised learning
techniques capable of analyzing multiple features and
estimating house prices.
3. To build a user-friendly web application that accepts user
input and displays predictions dynamically.
4. To ensure proper feature encoding, scaling, and
serialization so that the model performs consistently and
accurately.
5. To implement additional functionalities like form reset,
validations, and refresh for a better user experience.
6. To explore deployment strategies such as using local Flask
hosting with future plans for cloud deployment.
Ultimately, the project aims to demonstrate the real-world
application of machine learning in solving practical problems and
serve as a foundational prototype that can be extended or
commercialized.
1.3 Advantages of the System
The developed House Price Prediction System offers multiple
advantages that address both technical and practical
requirements of real-world users. These advantages make the
system efficient, scalable, and user-friendly.
1. Accurate Predictions
• The system uses a machine learning model trained on real-
world housing data. By analyzing features such as area,
number of bedrooms and bathrooms, amenities, main-road
access, furnishing status, and parking, the system delivers
data-driven price estimates.
2. User-Friendly Interface
• The web interface is built using the Flask framework and
designed to be clean, minimal, and intuitive. It allows users
with minimal technical expertise to easily input details and
receive predictions without any confusion.
3. Real-Time Output
• The system provides immediate feedback. As soon as the user
inputs the required data, the model processes it in real-time
and displays the predicted price, improving the speed of
decision-making.
4. Portable and Lightweight
• The entire project, including the trained model, web
application, and required dependencies, is lightweight and can
run on standard computing environments, making it suitable
for local and small-scale deployment.
5. Expandable for Future Enhancements
• The project is designed in a modular way, allowing future
integration of advanced features like:
o Location-based analysis using map APIs
o Real-time data from online listings
o Dynamic model updates with new training data
6. Cost-Effective Solution
The system is a free and open-source prototype that can serve
as an alternative to paid property valuation services or real
estate consultants, making it highly cost-effective for users.
7. Educational Value
The project is a useful learning resource for students and
developers, as it covers core concepts in machine learning,
data preprocessing, Flask development, and model
deployment.
1.4 Related Works
Several works in the past have utilized regression models, neural
networks, and hybrid approaches to predict house prices. This
project extends those ideas by integrating a Flask-based web
interface with a machine learning model for ease of use.
CHAPTER 2: DEVELOPMENT OF THE SYSTEM
2.1 Hardware Requirements
• Processor: Intel i5 or above
• RAM: 8 GB minimum
• Storage: 10 GB free space
• Network: Broadband internet connection
2.2 Software Requirements
• Python 3.10+
• Flask Framework
• Pandas, NumPy, Scikit-learn, Joblib/Pickle
• Jupyter Notebook/VS Code
• HTML/CSS for front-end
2.3 System Requirements
The system must support Python, Flask, and a web browser. A
local server or online hosting environment is necessary for
deployment.
2.4 About the Dataset
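The dataset (Housing.csv) includes the target column, price, and
housing features such as:
• Area
• Number of bedrooms, bathrooms, and stories
• Main road access and preferred-area indicator
• Presence of amenities like guestroom, basement, hot water
heating, and air conditioning
• Number of parking spaces
• Furnishing status (furnished, semi-furnished, unfurnished)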
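As a quick illustration of how these attributes split into numerical and categorical columns, the following sketch builds a small in-memory sample. The three rows are made-up placeholder values, and the column names are taken from the training and Flask code in Chapter 3; with the real file, `pd.read_csv("Housing.csv")` would replace the hand-written sample.

```python
import pandas as pd

# Hypothetical three-row sample; the values are invented for illustration only.
sample = pd.DataFrame({
    "price": [5_000_000, 4_200_000, 3_100_000],
    "area": [3000, 2400, 1800],
    "bedrooms": [3, 3, 2],
    "bathrooms": [2, 1, 1],
    "stories": [2, 2, 1],
    "mainroad": ["yes", "yes", "no"],
    "guestroom": ["no", "yes", "no"],
    "basement": ["yes", "no", "no"],
    "hotwaterheating": ["no", "no", "no"],
    "airconditioning": ["yes", "no", "no"],
    "parking": [2, 1, 0],
    "prefarea": ["yes", "no", "no"],
    "furnishingstatus": ["furnished", "semi-furnished", "unfurnished"],
})

# Separate numeric columns from text (categorical) columns
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
categorical_cols = sample.select_dtypes(exclude="number").columns.tolist()

print("Numeric:", numeric_cols)
print("Categorical:", categorical_cols)
for col in categorical_cols:
    print(col, "->", sorted(sample[col].unique()))
```

The categorical columns are the ones that need encoding before a regression model can use them.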
2.5 Vision of the Project
The vision of this project is to create a smart, reliable, and scalable
house price prediction system using the power of machine learning
and web development technologies. With the increasing demand
for data-driven decision-making in the real estate sector, this
system aims to provide a seamless and user-friendly interface that
enables users—especially homeowners, buyers, and real estate
professionals—to estimate property prices based on essential
features of the house.
The core vision elements include:
1. Accuracy and Efficiency:
Deliver highly accurate price predictions using a trained
machine learning model based on real-world housing data.
The system is designed to efficiently process both numerical
and categorical data attributes for robust prediction.
2. Accessibility:
Make housing price prediction accessible to all users by
deploying the model on a web platform. This allows anyone
with internet access to use the system without needing any
technical knowledge of machine learning or coding.
3. User Experience:
Design a simple yet effective interface that allows users to
input features like area, bedrooms, bathrooms, amenities (like
air conditioning, guestroom), and instantly get predictions.
Form validation and auto-clearing fields enhance usability.
4. Extensibility:
Build a modular and scalable solution that can easily be
expanded in the future. For instance, additional features like
location-based pricing, real-time market trends, or cloud
deployment can be incorporated later.
5. Educational Value:
Serve as a mini-prototype for educational institutions and
beginner-level data science students to understand the
practical implementation of machine learning models in web
environments using frameworks like Flask.
By fulfilling this vision, the project not only demonstrates the
practical application of AI in real-world domains but also bridges
the gap between machine learning model development and end-
user accessibility.
CHAPTER 3: IMPLEMENTATION & CODING
3.1 System Architecture
The system follows a client-server architecture where the frontend
(user interface) communicates with the backend (Flask server) to
provide real-time house price predictions. The Flask backend loads
a trained machine learning model, processes the user input,
performs prediction, and sends the result back to the frontend.
3.2 Workflow of the System
1. User Input: The user enters housing attributes through a
form.
2. Data Formatting: The input is processed and converted into
a DataFrame structure.
3. Model Loading: A pre-trained model (e.g., Linear Regression
or Random Forest) is loaded using joblib.
4. Prediction: The input data is passed to the model, and the
output (predicted price) is returned.
5. Output Display: The predicted price is shown on the
webpage in real-time.
3.3 Tools and Technologies Used
• Python: Backend programming and ML model training.
• Flask: Lightweight web framework used to build the server.
• HTML/CSS/JavaScript: To create the web interface.
• Pandas & NumPy: Data handling and numerical operations.
• Scikit-learn: Machine learning library for model training and
prediction.
• Jupyter Notebook: For model development and testing.
• Joblib: To save and load the trained model efficiently.
3.4 Important Code Snippets
3.4.1 Model Training (Jupyter Notebook)
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import joblib

# Load dataset
data = pd.read_csv("Housing.csv")

# Preprocessing: one-hot encode the categorical columns
# (drop_first=True drops the first category of each column)
data = pd.get_dummies(data, drop_first=True)

# Split data into features (X) and target (y)
X = data.drop("price", axis=1)
y = data["price"]
# random_state fixes the seed so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Save model
joblib.dump(model, "house_price_model.pkl")
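Because the prediction code has to supply exactly the columns the model was trained on, it helps to see what `pd.get_dummies(..., drop_first=True)` produces. The sketch below uses a hypothetical three-row sample with only two of the categorical columns; `dtype=int` is added here just to make the 0/1 values explicit.

```python
import pandas as pd

# Hypothetical sample with one yes/no column and the furnishing column
sample = pd.DataFrame({
    "price": [5_000_000, 4_200_000, 3_100_000],
    "area": [3000, 2400, 1800],
    "mainroad": ["yes", "no", "yes"],
    "furnishingstatus": ["furnished", "semi-furnished", "unfurnished"],
})

encoded = pd.get_dummies(sample, drop_first=True, dtype=int)

# Numeric columns come first, followed by the generated dummy columns
print(encoded.columns.tolist())
print(encoded)
```

Categories are sorted alphabetically before the first one is dropped, so "no" and "furnished" become the baselines: a furnished house is represented by zeros in both furnishing columns, and the numeric columns precede all dummy columns in the final feature order.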
3.4.2 Flask App Backend (app.py)
from flask import Flask, render_template, request
import pandas as pd
import joblib

app = Flask(__name__)
model = joblib.load("house_price_model.pkl")

# Form fields, grouped by how they are encoded
NUMERIC_FIELDS = ["area", "bedrooms", "bathrooms", "stories", "parking"]
# Yes/no fields map to the "<name>_yes" columns created by get_dummies
BINARY_FIELDS = ["mainroad", "guestroom", "basement",
                 "hotwaterheating", "airconditioning", "prefarea"]


@app.route("/", methods=["GET", "POST"])
def home():
    if request.method == "POST":
        # Collect numeric input values from the form
        features = {name: int(request.form[name]) for name in NUMERIC_FIELDS}

        # Encode yes/no answers as 1/0
        for name in BINARY_FIELDS:
            features[f"{name}_yes"] = 1 if request.form[name] == "yes" else 0

        # Encode furnishingstatus; "furnished" is the baseline category
        # dropped by drop_first=True, so it maps to zeros in both columns
        furnishing = request.form["furnishingstatus"]
        features["furnishingstatus_semi-furnished"] = int(
            furnishing == "semi-furnished")
        features["furnishingstatus_unfurnished"] = int(
            furnishing == "unfurnished")

        # Create a one-row DataFrame in the column order used in training
        input_df = pd.DataFrame([features]).reindex(
            columns=model.feature_names_in_, fill_value=0)

        # Predict price
        predicted_price = float(model.predict(input_df)[0])
        return render_template("index.html",
                               prediction=round(predicted_price, 2))

    return render_template("index.html", prediction=None)


if __name__ == "__main__":
    app.run(debug=True)
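The report mentions field validation; one way to do this on the server side, before the values reach the model, is sketched below. The helper validate_form and the numeric bounds are hypothetical and chosen only for illustration; the field names follow the form and backend code in this section.

```python
# Hypothetical bounds chosen only for illustration; tune them to the dataset.
NUMERIC_LIMITS = {
    "area": (100, 100_000),
    "bedrooms": (1, 20),
    "bathrooms": (1, 20),
    "stories": (1, 10),
    "parking": (0, 10),
}
YES_NO_FIELDS = ["mainroad", "guestroom", "basement",
                 "hotwaterheating", "airconditioning", "prefarea"]
FURNISHING_CHOICES = {"furnished", "semi-furnished", "unfurnished"}


def validate_form(form):
    """Return (cleaned_values, error_messages) for a dict-like form."""
    cleaned, errors = {}, []

    # Numeric fields must be whole numbers inside the allowed range
    for name, (low, high) in NUMERIC_LIMITS.items():
        raw = str(form.get(name, "")).strip()
        try:
            value = int(raw)
        except ValueError:
            errors.append(f"{name}: expected a whole number")
            continue
        if low <= value <= high:
            cleaned[name] = value
        else:
            errors.append(f"{name}: must be between {low} and {high}")

    # Dropdown fields must carry one of the known option values
    for name in YES_NO_FIELDS:
        answer = form.get(name)
        if answer in ("yes", "no"):
            cleaned[name] = answer
        else:
            errors.append(f"{name}: expected 'yes' or 'no'")

    furnishing = form.get("furnishingstatus")
    if furnishing in FURNISHING_CHOICES:
        cleaned["furnishingstatus"] = furnishing
    else:
        errors.append("furnishingstatus: unknown option")

    return cleaned, errors


good_form = {
    "area": "2500", "bedrooms": "3", "bathrooms": "2", "stories": "2",
    "parking": "2", "mainroad": "yes", "guestroom": "no", "basement": "yes",
    "hotwaterheating": "no", "airconditioning": "yes", "prefarea": "yes",
    "furnishingstatus": "semi-furnished",
}
cleaned, errors = validate_form(good_form)
print(cleaned, errors)
```

In a Flask view, `validate_form(request.form)` could be called first, and the error list passed back to the template when it is not empty.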
3.4.3 HTML Form (templates/index.html)
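<!DOCTYPE html>
<html lang="en">
<head>
  <title>House Price Predictor</title>
</head>
<body>
  <h2>Enter House Details</h2>
  <form method="POST">
    Area: <input type="number" name="area" min="1" required><br>
    Bedrooms: <input type="number" name="bedrooms" min="0" required><br>
    Bathrooms: <input type="number" name="bathrooms" min="0" required><br>
    Stories: <input type="number" name="stories" min="1" required><br>
    Parking Spaces: <input type="number" name="parking" min="0" required><br>
    Main Road Access:
    <select name="mainroad">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </select><br>
    <!-- The remaining yes/no fields expected by app.py -->
    Guest Room:
    <select name="guestroom">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </select><br>
    Basement:
    <select name="basement">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </select><br>
    Hot Water Heating:
    <select name="hotwaterheating">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </select><br>
    Air Conditioning:
    <select name="airconditioning">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </select><br>
    Preferred Area:
    <select name="prefarea">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </select><br>
    Furnishing Status:
    <select name="furnishingstatus">
      <option value="unfurnished">Unfurnished</option>
      <option value="semi-furnished">Semi-Furnished</option>
      <option value="furnished">Furnished</option>
    </select><br><br>
    <input type="submit" value="Predict Price">
  </form>
  {% if prediction is not none %}
  <h3>Predicted Price: ₹{{ prediction }}</h3>
  {% endif %}
</body>
</html>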
CHAPTER 4: RESULTS
4.1 Output Screenshots
4.2 Interpretation of Results
The primary aim of the system is to predict the price of a house
based on user-input attributes. After successful model training and
deployment using Flask, the system takes input data from the user
through a web interface and returns the predicted price. Here's how
to interpret the results obtained from the system:
1. Input Attributes Used
The model considers multiple parameters like:
• Area (in sq. ft)
• Number of bedrooms and bathrooms
• Presence of amenities (air conditioning, basement, guestroom,
hot water heating)
• Furnishing status (furnished, semi-furnished, unfurnished)
• Parking spaces
• Proximity to the main road
• Preferred location indicator
Each of these features contributes to the final predicted price based
on the patterns learned during training.
2. Sample Prediction Output
For example, if a user enters the following:
• Area: 2500 sq. ft
• Bedrooms: 3
• Bathrooms: 2
• Stories: 2
• Main Road: Yes
• Guestroom: No
• Basement: Yes
• Hot Water Heating: No
• Air Conditioning: Yes
• Parking: 2 cars
• Preferred Area: Yes
• Furnishing Status: Semi-Furnished
The system may output:
Predicted Price: ₹76,00,000
This means that based on similar historical data and the trained
model, a house with these features is expected to cost
approximately ₹76 lakhs.
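The sample output uses Indian digit grouping (lakhs and crores), whereas a rounded number printed directly would appear as 7600000.0. A small formatting helper could bridge that gap; format_inr below is a hypothetical function, not part of the project code.

```python
def format_inr(amount: float) -> str:
    """Format a rupee amount with Indian digit grouping (3, then 2s)."""
    rupees = int(round(amount))
    sign = "-" if rupees < 0 else ""
    digits = str(abs(rupees))
    if len(digits) <= 3:
        return f"{sign}₹{digits}"

    # The last three digits form one group; the rest are grouped in pairs
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return f"{sign}₹{','.join(groups)},{tail}"


print(format_inr(7600000))  # ₹76,00,000
```

The formatted string can then be passed to the template in place of the raw number.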
3. Accuracy and Model Confidence
The Linear Regression model fits a best-fit linear function (a
hyperplane over the input features) to the training data. The exact
accuracy depends on the dataset and on the train/test split. As an
indicative range rather than a measured result, a well-preprocessed
dataset with cleaned and encoded categorical features typically
results in:
• R² Score (Goodness of Fit): Around 0.75–0.85
• Mean Absolute Error (MAE): Roughly 5–10% deviation from the
actual price
This implies the model performs well on familiar patterns but may
deviate for unseen feature combinations or outliers.
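To replace indicative ranges with measured values, both metrics can be computed on the held-out test split. The sketch below uses synthetic data generated with NumPy as a stand-in, so the printed numbers say nothing about the housing model; with the real project, X_test and y_test from the training notebook would be used instead.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: a linear target with a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# R²: share of the variance in y explained by the model (1.0 is perfect)
r2 = r2_score(y_test, y_pred)
# MAE: average absolute difference between predicted and actual values
mae = mean_absolute_error(y_test, y_pred)
# MAE relative to the mean actual value, as a rough percentage deviation
mae_pct = 100 * mae / np.mean(y_test)

print(f"R2: {r2:.3f}  MAE: {mae:.3f}  MAE%: {mae_pct:.2f}")
```

Evaluating on the test split rather than the training data gives a fairer picture of how the model behaves on unseen houses.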
4. Usability Insights
• For Buyers: Users can evaluate whether the listed price of a
house is fair or overpriced.
• For Sellers: Helps in pricing the property appropriately based
on features.
• For Agents: Can be integrated into their platforms to give
dynamic price suggestions to clients.
5. Limitations in Interpretation
• The prediction does not account for current market
fluctuations or seasonal price changes.
• Geographic diversity is limited to the data scope, i.e., the
model may not generalize well to cities or areas not
represented in the dataset.
CHAPTER 5: FUTURE SCOPE, ADVANTAGES & DISADVANTAGES
5.1 Future Scope
The house price prediction system developed in this project holds
immense potential for future enhancements. Several additions and
improvements can make it even more effective, robust, and
commercially viable:
1. Integration with Real-time Market Data
o Connect the system to real estate APIs to fetch real-time
location-based market trends.
o Enhance predictions by incorporating real-time demand,
property age, and neighborhood details.
2. Geolocation and Map Integration
o Include GPS-based location input for more location-
aware predictions.
o Use maps to allow users to pick a location directly on the
interface.
3. Advanced ML Models and Deep Learning
o Use models like Random Forest, XGBoost, or Deep
Neural Networks to increase accuracy.
o Implement ensemble learning for improved performance.
4. User Authentication and Data Storage
o Add login/signup functionality and enable users to save
their previous predictions.
o Maintain a database of user queries and prediction
history.
5. Mobile Application Deployment
o Convert the current web app into a cross-platform mobile
application using frameworks like React Native or
Flutter.
6. Multilingual Support
o Add support for multiple Indian languages to ensure
better reach across diverse user bases.
7. Visual Analytics
o Provide graphical representation of pricing trends based
on area, city, or features.
o Help users compare houses based on attributes.
5.2 Advantages of the System
The developed house price prediction system offers several
benefits:
1. Instant and Accurate Predictions
o Provides price estimates within seconds based on inputs,
helping in decision-making.
2. User-Friendly Interface
o Simple and clean design makes it easy for even non-
technical users to use the system.
3. Efficient Resource Usage
o Requires minimal hardware and software resources to
run effectively.
4. Customizable for Local Markets
o Can be adapted to different regional housing markets by
retraining with new datasets.
5. Open for Integration
o Can be integrated with real-estate listing websites and
apps for commercial use.
5.3 Disadvantages of the System
Despite its advantages, there are some limitations and
disadvantages:
1. Limited Dataset Scope
o Predictions are limited to the dataset used. It may not
cover every type of housing locality or pricing trend.
2. No Consideration of External Factors
o It does not take into account economic trends, inflation,
or property age, which influence real-world prices.
3. Basic ML Model Used
o Linear Regression, while effective, may not capture
complex non-linear patterns in the data.
4. Lack of Legal or Regulatory Insights
o The system does not incorporate legal aspects such as
property taxes, registration charges, or loan options.
5. Security and Privacy
o As it stands, there's no user authentication or data
protection measures implemented.
CHAPTER 6: CONCLUSION AND DISCUSSION
6.1 Conclusion
This mini project titled "House Price Prediction using Machine
Learning" aimed to develop a reliable system that estimates house
prices based on various input features such as area, number of
rooms, furnishing status, and presence of specific amenities.
Using a Linear Regression algorithm, the system was trained on a
real-world housing dataset. It successfully captured the underlying
patterns and relationships between the input features and the house
price. A simple and interactive Flask-based web application was
developed for deployment, enabling end-users to input housing
details and receive a predicted price in real-time.
The project provides a cost-effective and practical solution for real
estate professionals, individual buyers, and sellers who want to
understand the property valuation in a data-driven manner.
6.2 Discussion
Key Highlights
• The project makes use of a classical machine learning model
that balances accuracy and interpretability.
• The user interface is simple, allowing users with no technical
background to use the system easily.
• The Flask backend allows for seamless integration with the
trained model and dynamic response handling.
Challenges Faced
• Handling categorical data like furnishing status and binary
options required one-hot encoding, which added
dimensionality to the model.
• Ensuring the feature alignment during model training and
prediction was critical. A mismatch caused errors which were
handled through feature engineering and consistent
preprocessing.
• Creating a responsive and user-friendly UI while ensuring
data validation was a key design decision.
Model Performance
• The trained model shows good performance for the provided
dataset with a decent R² score and acceptable mean error
margins.
• For better accuracy in future versions, more complex models
like Random Forest or XGBoost can be considered.
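A comparison of this kind can be sketched with cross-validation. The example below is an illustration on synthetic, deliberately non-linear data, not a result from the housing dataset; XGBoost is left out because it is a separate third-party package, and scikit-learn's RandomForestRegressor stands in for the more complex model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data with a curved relationship that a straight line cannot fit
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = (X[:, 0] - 5.0) ** 2

candidates = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Mean R² over 5 folds for each candidate model
scores = {
    name: cross_val_score(estimator, X, y, cv=5, scoring="r2").mean()
    for name, estimator in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: mean R2 = {score:.3f}")
```

On the real dataset the gap may be smaller or even reversed, which is why the comparison should be run on Housing.csv before switching models.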
Scalability
The project architecture is scalable:
• It can be easily deployed on cloud platforms like Render,
Heroku, or AWS.
REFERENCES
1. GeeksforGeeks – Linear Regression in Machine Learning
https://www.geeksforgeeks.org/linear-regression-python-implementation/
2. Scikit-learn Documentation
https://scikit-learn.org/stable/documentation.html
3. Pandas Documentation
https://pandas.pydata.org/docs/
4. Flask Documentation
https://flask.palletsprojects.com/en/2.3.x/
5. Kaggle Datasets – Housing Price Prediction
https://www.kaggle.com/
6. JavaTpoint – Machine Learning Linear Regression
https://www.javatpoint.com/machine-learning-linear-regression
7. W3Schools – HTML, CSS, JS Basics
https://www.w3schools.com/
8. Medium Blogs – Machine Learning Projects in Python
https://medium.com/
9. Real Python – Building Flask Web Applications
https://realpython.com/flask-by-example-part-1-project-setup/
10. Python Official Documentation
https://docs.python.org/3/
----X----