Used Car Price Prediction Model


Uploaded by

engtawkibhasan

Table of Contents

1. Abstract
2. Introduction
3. Problem Statement
4. Literature Review
5. Methodology
6. Results
7. Conclusion
8. References

1. Abstract
With the rapid development of the mobile Internet and the increasing reliance on digital
platforms, traditional offline used car trading has struggled to meet the evolving needs of
consumers. This has led to the emergence of online platforms for second-hand car trading,
where accurate price predictions play a critical role in ensuring fair and transparent
transactions. This study explores the application of machine learning techniques for predicting
used car prices based on features such as mileage, engine size, fuel type, registration year, and
other vehicle attributes. Using a dataset of used car transactions, we investigated the linear and
nonlinear relationships between vehicle parameters and prices. To improve prediction accuracy,
a feature selection process was conducted to identify the most relevant factors affecting used
car prices. A comparative analysis of multiple regression models, including Support Vector
Regression (SVR), Random Forest (RF), and Gradient Boosting (GB), was performed.
Additionally, the traditional Back Propagation Neural Network (BPNN) was optimized using
Particle Swarm Optimization (PSO) and combined with Grey Relational Analysis (GRA) for
feature selection. The proposed PSO-GRA-BPNN model demonstrated superior performance,
achieving a Mean Absolute Percentage Error (MAPE) of less than 4% and an R² value of 0.98,
outperforming other models in terms of accuracy and reliability. This work establishes a robust
framework for predicting used car prices and highlights the potential for advanced machine
learning algorithms in standardizing second-hand car pricing.

2. Introduction
The used car market in Bangladesh is experiencing rapid growth, with buyers and sellers
increasingly seeking accurate price assessments to facilitate fair transactions. While traditional
offline methods of car trading remain prevalent, the advent of online platforms has
revolutionized the process, offering greater accessibility and transparency. These online
platforms connect buyers and sellers efficiently and help establish fair market values for
vehicles.
Determining the price of a used car is a complex process influenced by various factors,
including physical attributes such as mileage, engine size, registration year, and fuel type.
Additionally, market conditions, the popularity of the car brand, and consumer demand play
significant roles in price variation. Accurate price prediction is essential for both buyers and
sellers, ensuring fair deals and contributing to the stability of the used car market.
The primary objective of this research is to leverage advanced machine learning techniques to
predict used car prices accurately. Established methods such as Back Propagation Neural
Networks (BPNN), Random Forest (RF), and Support Vector Regression (SVR) have been widely
used, but each has limitations: BPNN is prone to overfitting, RF is hard to interpret, and SVR
needs careful tuning and becomes expensive on large datasets. To address these, this study
proposes an enhanced model that combines Grey Relational Analysis (GRA) for feature selection
with Particle Swarm Optimization (PSO) to improve the performance of the BPNN model.
This research examines the relationships between vehicle attributes and prices, proposing a
robust approach to enhance prediction accuracy. By achieving reliable price predictions, this
study aims to promote fairness and transparency in used car transactions, contributing to a more
standardized and efficient market.

3. Problem Statement
Accurately determining the price of a used car is a challenging yet crucial task in the rapidly
growing used car market. Traditional methods of offline car trading often rely on subjective
assessments, leading to inconsistencies and inaccuracies in price evaluation. This creates a lack
of transparency and fairness in transactions, which can discourage both buyers and sellers.
The emergence of online platforms has provided a more structured framework for used car
trading, but the effectiveness of these platforms heavily depends on their ability to provide
reliable price predictions. The complexity arises due to the diverse factors influencing car
prices, such as mileage, engine specifications, fuel type, registration year, and external market
conditions. Moreover, not all features significantly impact the price, and identifying the most
relevant ones is essential to avoid noise in the prediction process.
Existing predictive models, such as traditional regression methods, random forests, and support
vector machines, have shown limited accuracy due to their inability to fully capture the non-
linear relationships between features and price. Additionally, the lack of feature selection
techniques in many approaches results in overfitting or reduced model efficiency.
Thus, there is a pressing need for a robust, data-driven approach that can accurately predict
used car prices by addressing the following challenges:
1. Identifying the most significant features affecting car prices while ignoring irrelevant
or redundant variables.
2. Developing a prediction model that captures complex, non-linear relationships in the
data.
3. Improving the accuracy and reliability of predictions compared to existing methods.
4. Ensuring scalability and applicability of the model for diverse datasets and market
conditions.
This research aims to address these challenges by proposing a machine learning framework
that combines Grey Relational Analysis (GRA) for feature selection, Particle Swarm
Optimization (PSO) for parameter tuning, and Back Propagation Neural Networks (BPNN) for
prediction. The goal is to create a system that ensures fairness, transparency, and efficiency in
the used car pricing process.
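
The PSO component of the proposed framework can be illustrated with a minimal sketch. It is a generic particle swarm minimizer, not the report's implementation: the function name `pso_minimize`, its parameters, the inertia and acceleration values, and the quadratic `loss` standing in for a network's validation error are all assumptions made for illustration.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random starting positions, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    # The swarm shares one global best.
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best, and pull toward the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def loss(w):
    # Hypothetical stand-in for a network's error, minimized at w = (1, -2).
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2


best_w, best_loss = pso_minimize(loss, dim=2, bounds=(-5.0, 5.0))
print(best_w, best_loss)
```

In a PSO-BPNN setting, the position vector would hold the network's weights (or hyperparameters) and `loss` would be replaced by the network's training or validation error.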

4. Literature Review
Predicting used car prices has been a significant area of research, with various machine
learning techniques explored for accurate price estimation. Below is a brief review of the key
methodologies used.
1. Traditional Models:
   • Multiple Linear Regression (MLR): MLR is simple but struggles with non-linear
     relationships and is sensitive to outliers. It has been used as a baseline for price
     prediction but is less effective in complex scenarios.
2. Machine Learning Approaches:
   • Support Vector Regression (SVR): SVR models non-linear relationships but requires
     careful tuning and becomes computationally expensive with larger datasets.
   • Random Forest (RF): RF is robust to overfitting and handles large datasets effectively.
     However, it lacks interpretability.
3. Neural Networks:
   • Back Propagation Neural Networks (BPNN): BPNN can model complex non-linear patterns
     but requires significant data and computational resources for training. Overfitting is
     a common challenge without regularization.
   • Optimization Techniques: Particle Swarm Optimization (PSO) and Genetic Algorithms (GA)
     are used to fine-tune BPNN, improving accuracy and reducing overfitting.
4. Feature Selection:
   • Grey Relational Analysis (GRA): GRA helps identify significant features affecting
     price prediction, improving model performance by focusing on the most relevant
     attributes.
   • Principal Component Analysis (PCA): PCA reduces dimensionality but can sacrifice
     interpretability.
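
To make the GRA idea concrete, the sketch below ranks candidate features by their grey relational grade against price. It follows the common formulation (min-max normalization, absolute differences, distinguishing coefficient 0.5); the function names, the sample values, and the feature names `engine_size` and `door_count` are hypothetical and not taken from the report's data.

```python
def min_max(values):
    """Scale a sequence to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(reference, features, rho=0.5):
    """Return {feature_name: grade} measured against the reference sequence.

    Assumes every sequence is non-constant and that at least one feature
    differs from the reference somewhere (otherwise the ratio is undefined).
    """
    ref = min_max(reference)
    # Pointwise absolute difference between each feature and the reference.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, min_max(seq))]
        for name, seq in features.items()
    }
    # Global extremes are taken over all features and all samples.
    all_deltas = [d for seq in deltas.values() for d in seq]
    d_min, d_max = min(all_deltas), max(all_deltas)

    grades = {}
    for name, seq in deltas.items():
        # Grey relational coefficient per sample, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in seq]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Hypothetical sample: price is the reference, the rest are candidates.
price = [5000, 7000, 9000, 12000, 15000]
candidates = {
    "engine_size": [1000, 1300, 1500, 2000, 2500],
    "door_count": [4, 2, 4, 2, 4],
}

grades = grey_relational_grades(price, candidates)
ranking = sorted(grades, key=grades.get, reverse=True)
print(ranking)
```

Features with higher grades track the price sequence more closely and would be kept. Note that this simple normalization treats every feature as "larger means more expensive"; a feature such as mileage, which tends to lower the price, would need an inverted normalization first.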

5. Methodology
The methodology for predicting car prices using machine learning involves several key steps,
each designed to ensure data integrity, model accuracy, and effective evaluation. Below is an
overview of the steps.

Algorithms Used

• Support Vector Regression (SVR)
• Linear Regression
• Random Forest
• Decision Tree

Code:

Data Collection:

Drop car ID
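
A minimal sketch of the loading and ID-dropping steps, using only Python's standard library. The inline sample rows and the column names `car_ID`, `CarName`, `fueltype`, and `price` are assumptions for illustration; the report does not list the dataset's actual schema.

```python
import csv
import io

# Hypothetical sample standing in for the dataset file; the real column
# names and values are not given in the report.
SAMPLE_CSV = """car_ID,CarName,fueltype,price
1,toyota corolla,gas,7898
2,audi 100ls,gas,13950
3,mazda rx-4,diesel,10795
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per car."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_column(rows, name):
    """Return copies of the rows without the given column."""
    return [{k: v for k, v in row.items() if k != name} for row in rows]


rows = load_rows(SAMPLE_CSV)
# The ID is only a row label and carries no pricing information,
# so it is removed before modelling.
cars = drop_column(rows, "car_ID")
print(cars[0])
```

Dropping the identifier keeps a model from memorizing row labels that have no relationship to price.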

Visualize different car names

Fuel type ratio
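
The fuel type ratio is the share of rows in each fuel category. A minimal sketch, assuming a hypothetical list of fuel-type values (the real column contents are not shown in the report):

```python
from collections import Counter

# Hypothetical fuel-type column values for eight cars.
fuel_types = ["gas", "gas", "diesel", "gas", "diesel", "gas", "gas", "gas"]


def category_ratio(values):
    """Map each category to its share of all rows."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


ratios = category_ratio(fuel_types)
for fuel, share in sorted(ratios.items()):
    print(f"{fuel}: {share:.1%}")
```

A strongly imbalanced ratio is worth knowing before modelling, since a rare category gives the model few examples to learn its price effect from.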

Price distribution of cars

Decision Tree

Random Forest

Linear regression
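
As a stand-in for the linear regression step, the sketch below fits a one-feature least-squares line on a training split and predicts a held-out split. It is not the report's code: the helper names, the `engine_size` feature, and the synthetic prices are assumptions, and the tree-based models are omitted for brevity.

```python
def fit_simple_ols(xs, ys):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, xs):
    """Apply a fitted (intercept, slope) pair to new inputs."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


# Hypothetical data: price made exactly linear in engine size so the
# fitted line can be checked by eye.
engine_size = [1000, 1300, 1500, 1800, 2000, 2500]
price = [5000 + 4 * x for x in engine_size]

# Hold out the last two cars for evaluation.
train_x, test_x = engine_size[:4], engine_size[4:]
train_y, test_y = price[:4], price[4:]

model = fit_simple_ols(train_x, train_y)
predictions = predict(model, test_x)
print(model, predictions)
```

On real data the held-out predictions would not match exactly; the gap between them and the actual prices is what the evaluation metric in the next section summarizes.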

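6. Results
Each model is evaluated with the R² score (coefficient of determination), which measures the
proportion of the variance in actual prices that the model's predictions explain. This report
expresses it as a percentage and refers to that figure as accuracy:

Accuracy (%) = R² × 100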
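
The R² score itself can be computed as one minus the ratio of residual to total sum of squares. The sketch below is a hand-written helper with made-up values, not the report's evaluation code:

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted prices (arbitrary units).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

score = r2_score(actual, predicted)
print(f"R2 = {score:.4f}, reported as {score * 100:.2f}%")
```

A model that always predicts the mean price scores 0, a perfect model scores 1, and a model worse than the mean scores below 0, so the percentage form is not bounded below by 0%.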

Calculation:

• Linear Regression: R² = 0.7375 ⟹ Accuracy = 73.75%
• Decision Tree Regressor: R² = 0.8839 ⟹ Accuracy = 88.39%
• Random Forest Regressor: R² = 0.9065 ⟹ Accuracy = 90.65%

The Random Forest Regressor demonstrated the highest accuracy in predicting car prices,
achieving a score of 90.65%. This suggests that it is the most reliable model for this dataset.
Its ensemble approach, which combines multiple decision trees, helps capture complex patterns
in the data effectively.

The Decision Tree Regressor also performed well with an accuracy of 88.39%, slightly below
Random Forest. This difference arises because a single Decision Tree is prone to overfitting
and may not generalize as well as a Random Forest, which averages multiple trees to reduce
variance.
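
The variance-reduction argument can be made concrete with a standard result for averages, stated here under simplifying assumptions that are not taken from the report: each of the $B$ trees' predictions $T_b(x)$ has variance $\sigma^2$, and every pair of trees has correlation $\rho$.

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

As $B$ grows, the second term vanishes, so the ensemble's variance falls toward $\rho\sigma^2$, which is below the single-tree variance $\sigma^2$ whenever $\rho < 1$. This is why decorrelating the trees (through bootstrap samples and random feature subsets) matters as much as adding more of them.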

In contrast, the Linear Regression model achieved an accuracy of only 73.75%, indicating
that the relationship between the features and car prices is not purely linear. Linear Regression
struggles with datasets where the target variable depends on complex, nonlinear relationships
among the features.

7. Conclusion

The analysis of the dataset for car price prediction highlights the effectiveness of different
machine learning models. Among the tested algorithms, the Random Forest Regressor emerged
as the most accurate, achieving an R² score of 90.65%. This model proved to be the most
suitable for handling the dataset due to its ability to capture complex, nonlinear relationships
between features and car prices.

The Decision Tree Regressor also performed well, achieving an R² score of 88.39%, but its
accuracy was slightly lower compared to Random Forest due to its tendency to overfit and the
lack of ensemble methods.

In contrast, Linear Regression struggled to provide high accuracy, with an R² score of 73.75%.
This indicates that the relationship between the dataset’s features and car prices is not purely
linear, emphasizing the need for more sophisticated models to achieve better predictions.

This study underscores the importance of selecting the right algorithm for predictive tasks,
particularly when dealing with datasets that have nonlinear patterns. Future work can explore
fine-tuning the Random Forest model, adding additional relevant features, or employing
advanced algorithms like Gradient Boosting Machines or Neural Networks for further
improvement.

8. References
1. ChatGPT. https://chatgpt.com/
2. Kaggle. https://www.kaggle.com/
3. Google Scholar. https://www.ceeol.com/search/article-detail?id=746689
