python project major project
Programming With Python -I (University of Mumbai)
A PROJECT SYNOPSIS
on
House Price Prediction Using Machine Learning
Submitted By
1. Shreyas Saundade (64)
2. Shivani Singh (67)
3. Aditya Wadnerkar (71)
Under the Guidance of
Prof. Vaibhav Dhage
Department of Computer Science and Engineering
(Data Science)
Saraswati Education Society’s
SARASWATI COLLEGE OF ENGINEERING
Kharghar, Navi Mumbai
(Affiliated to University of Mumbai)
Academic Year: 2021-22
Saraswati College of Engineering, Kharghar
Vision
“To develop a core of eminency in Engineering Education and Research”
Mission
“To educate Students to become quality techno-crafts for taking up challenges in all
facets of life”
Department of Computer Science and Engineering
(Data science)
Vision
“To be among renowned institution in Computer Science Engineering (CSE)
education and research by developing globally competent graduates.”
Mission
1. To produce quality Engineering graduates by imparting quality training,
hands on experience and value education.
2. To pursue research and new technologies in Computer Science Engineering
and across interdisciplinary areas that extends the scope of Computer
Engineering and benefit humanity.
3. To provide stimulating learning ambience to enhance innovative ideas,
problem solving ability, leadership qualities, team-spirit and ethical
responsibilities.
(Approved by AICTE, Reg. by Maharashtra Govt. DTE, Affiliated to Mumbai University)
PLOT NO. 46/46A, SECTOR NO. 5, BEHIND MSEB SUBSTATION, KHARGHAR, NAVI MUMBAI-410210
Tel. : 022-27743706 to 11 * Fax : 022-27743712 * Website: www.sce.edu.in
CERTIFICATE
This is to certify that the requirements for the synopsis entitled “House Price Prediction Using Machine
Learning” have been successfully completed by the following students:
Roll Number    Name
64             Shreyas Saundade
67             Shivani Singh
71             Aditya Wadnerkar
in partial fulfillment of Sem-III, Bachelor of Engineering of Mumbai University, AIML/Data Science,
at Saraswati College of Engineering, Kharghar, during the academic year 2021-22.
Internal Guide               External Examiner
Prof. Vaibhav Daghe
Project Coordinator          Head of Department
Prof. Vijay R. Kapure        Prof. Shraddha Subhedar
House Price Prediction System
ABSTRACT
House prices are closely tied to the wider economy, which is an important motivation for predicting
them. A property's value is central to any real estate transaction, and housing price trends not only
concern buyers and sellers but also indicate the current economic situation. It is therefore important
to predict housing prices without bias so that both buyers and sellers can make informed decisions. In
this project, we create a website where the user enters property details to predict the house price, a
date up to which the price should be forecast, and a budget range for recommending the best location.
The project uses two datasets: one contains features for a large number of housing sales in Mumbai, and
the other contains the house price index of Mumbai. We use several feature selection methods and a
feature extraction method with Multiple Linear Regression to predict the current house price, an ARIMA
model to forecast the price a few years ahead in Mumbai, and a content-based recommendation system to
recommend the best location within the user's budget near their area of interest.
Keywords: House price prediction using machine learning algorithms, recommendation of houses
according to user choice
INDEX
List of Figures
1. Introduction
1.1 General
1.2 Objective
1.3 Problem statement
2. Literature review
3. Methodology
3.1 Project details
3.2 System analysis and design
3.2.1 System analysis
3.3 Project timeline
4. Implementation and Results
4.1 Implementation
4.2 Results
5. Conclusion
6. Future scope
7. Reference
8. Certifications and Publications
Program Educational Objectives (PEO)
1. To apply statistical data analysis and other data science techniques to effectively solve real-world problems.
2. To motivate & prepare students for lifelong learning and research to manifest global competitiveness.
3. To equip students with communication, team work and leadership skills to accept challenges in all facets of life
ethically.
Program Outcomes (PO)
At the end of the program, a student will be able to:
1. Apply the knowledge of Mathematics, Science and Engineering Fundamentals to solve complex Data Science
Problems.
2. Identify, formulate and analyze Data analysis Problems and derive conclusion using First Principle of
Mathematics, Engineering Science and Computer Science.
3. Investigate Complex Data Science problems to find appropriate solution leading to valid conclusion.
4. Design a data science model, process to meet specified needs with appropriate attention to health and Safety
Standards, Environmental and Societal Considerations.
5. Create, select and apply appropriate techniques, resources and advance Engineering software to analyze tools
and design for Data Science Problems.
6. Understand the Impact of Data Science solution on society and environment for Sustainable development.
7. Understand Societal, health, Safety, cultural, Legal issues and Responsibilities relevant to Engineering
Profession.
8. Apply Professional ethics, accountability and equity in Engineering Profession.
9. Work Effectively as a member and leader in multidisciplinary team for a common goal.
10. Communicate Effectively within a Profession and Society at large.
11. Appropriately incorporate principles of Management and Finance in one’s own Work.
12. Identify educational needs and engage in lifelong learning in a Changing World of Technology.
Program Specific Objectives (PSO)
1. Identify, understand, formulate and analyse complex engineering problems in the field of Data Analysis, Big
Data, Database Management, Predictive Analysis, Trends Identification and Identifying Business Insights.
2. Acquire, store, retrieve, process and finally convert data into knowledge in the field of artificial intelligence,
data mining, network management and security, and Internet of Things applications through the efficient use of
secure, reliable and cost-effective state-of-the-art analysis tools.
Lab Objectives:
Students will try to:
1. Become acquainted with the process of identifying needs and converting them into a problem.
2. Become familiar with the process of solving a problem in a group.
3. Become acquainted with the process of applying basic engineering fundamentals to attempt solutions to problems.
4. Inculcate the process of self-learning and research.
Lab Outcomes:
Students will be able to:
1. Identify problems based on societal /research needs.
2. Apply Knowledge and skill to solve societal problems in a group.
3. Develop interpersonal skills to work as member of a group or leader.
4. Draw the proper inferences from available results through theoretical/ experimental/simulations.
5. Analyze the impact of solutions in societal and environmental context for sustainable development.
6. Use standard norms of engineering practices.
7. Excel in written and oral communication.
8. Demonstrate capabilities of self-learning in a group, which leads to life long learning.
9. Demonstrate project management principles during project work.
INTRODUCTION
Investment is a business activity that interests most people in this era of globalization. Several assets
are commonly used for investment, for example gold, stocks and property, and property investment in
particular has increased significantly. Housing price trends not only concern buyers and sellers but also
indicate the current economic situation. Many factors affect house prices, such as location, BHK
(bedroom-hall-kitchen) configuration and floor. A location with good access to highways, expressways,
schools, shopping malls and local employment opportunities also contributes to a rise in house prices.
Predicting house prices manually is difficult, which is why many systems have been developed for house
price prediction. The aim of this system is to create a website through which the user enters house
requirements as input, which is then passed to a linear regression model that predicts the house price.
The website also allows the user to forecast the predicted house price up to a date specified by the
user. This is done with a second model, ARIMA (Autoregressive Integrated Moving Average).
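To make the forecasting idea concrete, the sketch below hand-rolls the simplest non-trivial ARIMA variant, ARIMA(1,1,0): difference the series once, fit an AR(1) model to the differences by least squares, and add the forecast differences back onto the last observed level. The order (1,1,0) is an assumption, since the report does not state which order was used; the index values and function names are hypothetical, and a real project would more likely use a library implementation.

```python
def fit_ar1(diffs):
    """Least-squares fit of d[t] = const + phi * d[t-1]; returns (const, phi)."""
    prev, curr = diffs[:-1], diffs[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = sxy / sxx if sxx else 0.0  # constant differences: nothing for AR(1) to fit
    const = mean_curr - phi * mean_prev
    return const, phi


def forecast_arima110(series, steps):
    """Forecast `steps` future values with a hand-rolled ARIMA(1,1,0)."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    # "I" part: first-order differencing removes the trend in the level.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # "AR" part: model each difference from the previous one.
    const, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = const + phi * last_diff  # next predicted change
        level = level + last_diff            # undo the differencing
        forecasts.append(level)
    return forecasts


# Hypothetical yearly house price index values, for illustration only.
hpi = [100.0, 103.0, 105.5, 109.0, 111.0, 114.5, 117.0, 120.5]
print(forecast_arima110(hpi, steps=3))
```

The moving-average term is omitted here (q = 0) to keep the sketch short; fitting it requires iterative estimation rather than a single least-squares step.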
Over the last few decades, with the rise of YouTube, Amazon, Netflix and many other web services,
recommender systems have taken an ever larger place in our lives. From e-commerce (suggesting to buyers
articles that could interest them) to online advertising (suggesting to users content that matches their
preferences), recommender systems are now unavoidable in our daily online journeys. In general terms,
recommender systems are algorithms that suggest relevant items to users (the items being movies to
watch, text to read, products to buy or anything else, depending on the industry).
This website also provides an option for recommendations, using a content-based recommendation system.
In this project, we use two datasets extracted from Makaan.com by web scraping. One dataset consists of
features such as location, BHK, floor and furnishing status for different locations in Mumbai. This
dataset is used for prediction. The other dataset consists of the house price index of Mumbai for the
last 10 years. This dataset is used for forecasting.
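A content-based recommender of the kind described above can be sketched as two steps: keep the listings that fall inside the user's budget, then rank them by cosine similarity between each listing's feature vector and the user's preference vector. The report does not describe its feature encoding or similarity measure, so the features (scaled to the 0-1 range), the listings, the prices and all names below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(listings, preferences, budget_min, budget_max, top_n=3):
    """Return the names of the affordable listings most similar to the preferences."""
    affordable = [
        item for item in listings
        if budget_min <= item["price_lakh"] <= budget_max
    ]
    ranked = sorted(
        affordable,
        key=lambda item: cosine_similarity(item["features"], preferences),
        reverse=True,
    )
    return [item["location"] for item in ranked[:top_n]]


# Hypothetical listings; features = [size score, furnished, floor score].
listings = [
    {"location": "Kharghar", "price_lakh": 85, "features": [0.5, 1.0, 0.2]},
    {"location": "Andheri", "price_lakh": 150, "features": [0.5, 1.0, 0.8]},
    {"location": "Thane", "price_lakh": 95, "features": [0.75, 0.0, 0.5]},
    {"location": "Panvel", "price_lakh": 60, "features": [0.5, 1.0, 0.3]},
]

print(recommend(listings, preferences=[0.5, 1.0, 0.2], budget_min=50, budget_max=100))
```

Filtering on budget before ranking keeps the similarity score about the property itself, so an out-of-budget listing can never be recommended however well it matches.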
Machine learning has played a major role in recent years in image detection, spam recognition, speech
commands, product recommendation and medical diagnosis. Present-day machine learning algorithms help
in enhancing security alerts, ensuring public safety and improving medical care. Machine learning
systems also provide better customer service and safer automobile systems. In the present
PROPOSED SYSTEM
1. Data gathering
2. Dataset analysis
3. Training the regression model
4. Testing
Dataset Preparation
The dataset was imported from the scikit-learn library in Python. It includes 4 pre-labelled variables
in total: 3 independent variables (area, room count and building age) and 1 dependent variable (price),
as shown in Figure 1.0 below.
Fig 1.0: Dataset
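Whatever the source of the data, the table has to be separated into the three independent variables and the dependent variable before training. A minimal sketch, assuming the data is available as CSV text; the column names and the sample rows are hypothetical and are not taken from the report's dataset.

```python
import csv
import io

# Hypothetical sample rows; the column names are assumptions.
SAMPLE_CSV = """area,room_count,building_age,price
650,1,12,7200000
900,2,5,11500000
1200,3,20,14800000
480,1,30,5100000
"""


def load_dataset(csv_text):
    """Split CSV text into feature rows X and target values y."""
    features, targets = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append(
            [int(row["area"]), int(row["room_count"]), int(row["building_age"])]
        )
        targets.append(float(row["price"]))
    return features, targets


X, y = load_dataset(SAMPLE_CSV)
print(len(X), "rows; first row:", X[0], "->", y[0])
```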
Linear Regression
In this project, we used the Linear Regression algorithm to predict the current house price.
• The Linear Regression algorithm takes two kinds of variables: independent variables (X) and a
dependent variable (Y).
• We used the sklearn library to import the Linear Regression model.
• The dataset containing different cities with their features and prices is used for training the
Linear Regression model.
• The dataset entries are divided into two parts: 80% for training and 20% for testing.
• The Linear Regression model is trained on the X_train independent variable entries and the Y_train
dependent variable entries.
• The trained model is tested on the 20% test set. After training and testing, the model is used for
prediction.
• The accuracy of the trained linear regression model is 86.67%.
• Formula: $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \epsilon_i$
$Y_i$ = dependent variable
$X_{i1}, \dots, X_{ip}$ = independent variables
$\beta_0$ = y-intercept (constant term)
$\beta_1, \dots, \beta_p$ = coefficients of the independent variables
$\epsilon_i$ = the model's error term
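To show where the coefficients in the formula come from, the sketch below estimates them by ordinary least squares using the normal equations, $\hat{\beta} = (X^T X)^{-1} X^T y$, solved with Gaussian elimination in plain Python. This is an illustration of the underlying arithmetic, not the report's code (which uses sklearn); the toy data is generated from known, made-up coefficients so that the recovered values can be checked, and all function names are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; features may be collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_ols(X, y):
    """Return [beta_0, beta_1, ..., beta_p] minimising the squared error."""
    rows = [[1.0] + [float(v) for v in row] for row in X]  # leading 1 = intercept
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


def predict(beta, x):
    """Evaluate beta_0 + beta_1*x_1 + ... + beta_p*x_p for one observation."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Toy data built from known coefficients (10, 2, 3, -1), so there is no noise.
X_toy = [[1, 2, 3], [2, 1, 0], [3, 5, 1], [4, 0, 2], [5, 3, 4], [0, 1, 1]]
y_toy = [10 + 2 * a + 3 * b - 1 * c for a, b, c in X_toy]

beta = fit_ols(X_toy, y_toy)
print(beta)  # should recover roughly [10, 2, 3, -1]
print(predict(beta, [2, 2, 2]))
```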
INDEPENDENT VARIABLES     DEPENDENT VARIABLE
AREA (int)                PRICE (rupees)
ROOM COUNT (int)
BUILDING AGE (int)
Table 1.0
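The 80/20 split and the scoring step described in this section can be sketched in plain Python. The report does not say which metric its 86.67% "accuracy" refers to; the sketch assumes the coefficient of determination, $R^2 = 1 - SS_{res} / SS_{tot}$, which is the value scikit-learn's `LinearRegression.score` returns. The function names and the toy values are hypothetical.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and hold out `test_fraction` of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


rows = list(range(10))
train, test = split_train_test(rows)
print(len(train), "training rows,", len(test), "test rows")

# Made-up actual and predicted prices, purely to exercise the metric.
print(r_squared([10.0, 12.0, 15.0], [11.0, 12.0, 14.0]))
```

An R² of 1 means the predictions match the held-out prices exactly, while 0 means the model does no better than always predicting the mean price.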
SYSTEM SPECIFICATION
The following table lists the system specification used in the project.
Type of Equipment     Specification
Processor             Core i3, Gen 3
Laptop                Dell, Res: 1360x768
Anaconda              Latest version
Jupyter               Latest version
Microsoft Excel       Latest version
Chrome                Browser
Table 2.0
Algorithm
STEP 1: Start
STEP 2: Import the libraries
STEP 3: Load the dataset
STEP 4: Define a linear regression model
STEP 5: Train and test the linear regression model
STEP 6: Obtain the coefficients of the multiple linear regression formula
STEP 7: Predict the price using the multiple linear regression coefficients
STEP 8: Stop
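Steps 2 to 7 can be sketched with scikit-learn as follows. The data is synthetic, generated from made-up coefficients plus noise, because the report's dataset is not reproduced here; the variable names and the example house are likewise assumptions.

```python
# STEP 2: import the libraries
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# STEP 3: load the dataset (synthetic stand-in with made-up coefficients)
rng = np.random.default_rng(0)
n = 200
area = rng.integers(300, 2000, size=n)   # square feet
rooms = rng.integers(1, 5, size=n)       # room count
age = rng.integers(0, 40, size=n)        # building age in years
price = (50_000 + 9_000 * area + 400_000 * rooms - 30_000 * age
         + rng.normal(0, 500_000, size=n))

X = np.column_stack([area, rooms, age])
y = price

# STEP 4: define a linear regression model
model = LinearRegression()

# STEP 5: train on 80% of the rows and test on the remaining 20%
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R^2 on the held-out rows
print("R^2 on test set:", r2)

# STEP 6: obtain the coefficients of the fitted formula
print("intercept:", model.intercept_)
print("coefficients (area, rooms, age):", model.coef_)

# STEP 7: predict the price of a new, hypothetical house
new_house = np.array([[850, 2, 5]])
predicted_price = model.predict(new_house)[0]
print("predicted price:", predicted_price)
```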
Result
Fig 0.2: Output 1
Fig 0.3: Output 2
Fig 0.4: Output 3
CONCLUSION
House price predictions are expected to help people who plan to buy a house know the future price range
so that they can plan their finances well. House price predictions also help property investors
understand the trend of housing prices in a given location.
Machine learning is broadly defined as the capability of a machine to imitate intelligent human
behavior. We use machine learning model predictions because they allow businesses to make accurate
estimates of the outcome of a question based on historical data, across many kinds of problems.
The system makes use of a data mining algorithm, Linear Regression. The Linear Regression algorithm is
used to predict the house price according to the property requirements given by the customer, with an
accuracy of 86.7%.
The main objective of this prediction system is to reduce manual calculation and time, and to carry out
the whole process with ease.
References
[1] Real Estate Price Prediction with Regression and Classification, CS 229, Autumn 2016.
[2] Gongzhu Hu, Jinping Wang, and Wenying Feng. Multivariate Regression Modelling for Home Value Estimates
with Evaluation using Maximum Information Coefficient.
[3] Byeonghwa Park and Jae Kwon Bae (2015). Using machine learning algorithms for housing price prediction.
Expert Systems with Applications, Volume 42, Pages 2928-2934.
[4] https://www.coursera.org/specializations/recommender-systems
[5] https://www.analyticsvidhya.com/blog/2015/08/beginners-guide-learn-content-based-recommender-systems/
[6] https://towardsdatascience.com/how-to-build-from-scratch-a-content-based-movie-recommender-with-natural-language-processing-25ad400eb243
[7] https://www.makaan.com/
[8] https://tradingeconomics.com/