(Aide300) Group 4 - Final Report

This final report presents a study on predicting stock prices in Vietnam's real estate market using four Machine Learning models: Linear Regression, Decision Tree, Random Forest, and Support Vector Regression. The research aims to provide investors with data-driven tools for better decision-making amidst market instability, focusing on five major real estate companies. The findings suggest that Linear Regression consistently outperforms other models in terms of accuracy and stability across the evaluated stocks.

Uploaded by

40Thiên TrangB2

FOREIGN TRADE UNIVERSITY

HO CHI MINH CAMPUS

FINAL REPORT

Course name: Artificial Intelligence in Era of Digital Transformation

Date: 14/07/2025 – Class code: 24 – Course code: AIDE300

No.  Full Name           ID          Peer Evaluation (0%-100%)
1    Nguyễn Mai Anh      2312255005  100%
2    Nguyễn Thiên Trang  2312255072  100%
3    Võ Lâm Thanh Trúc   2312255074  100%

Grade (in number) Grade (in words)

Examiner 1’s signature Examiner 2’s signature

TABLE OF CONTENTS
CHAPTER I. INTRODUCTION AND OVERVIEW
1. Executive Summary
2. Introduction
2.1. Identify problems
2.2. Report Objective
3. Disclaimer
3.1. AI Prompt
3.2. Dataset
CHAPTER II. PREDICTIVE MODEL BUILDING
1. Data Cleaning Process
2. Data Description
3. Model Development
3.1. Data Preparation
3.2. Model Selection Rationale
3.3. Feature Engineering
4. Model Training
4.1. DXG.VN
4.2. PDR.VN
4.3. VHM.VN
4.4. NVL.VN
4.5. KDH.VN
5. Evaluation & selection
5.1. MAE (Mean Absolute Error)
5.2. MSE (Mean Squared Error)
5.3. R² (R-squared)
6. 5-Day Forecasting (03/07/2025 – 08/07/2025)
6.1. DXG.VN
6.2. PDR.VN
6.3. VHM.VN
6.4. NVL.VN
6.5. KDH.VN
7. Discussion of Potential Causes for the Identified Trends
CHAPTER III. CONCLUSION
1. Findings
2. Recommendations and Future Work

CHAPTER I. INTRODUCTION AND OVERVIEW

1. Executive Summary

In Vietnam's highly unstable real estate market, shaken by COVID-19 and stricter
credit policies, predicting stock prices in this industry has become increasingly
important. Without effective forecasting tools and data-driven strategies, investors
are prone to emotionally driven decisions, higher risk, and lost opportunities.

This report addresses that gap by applying and comparing four Machine Learning
models (Linear Regression, Decision Tree, Random Forest, and Support Vector
Regression) to predict the stock prices of five real estate companies in Vietnam: DXG,
PDR, VHM, NVL, and KDH. These were chosen for their diversity in scale,
capitalization, and business models, representative of Vietnam’s real estate sector.
Expanding beyond one single stock enhances model robustness and generalizability, as
supported by Domingos (2012) and Goodfellow et al. (2016).

Historical prices and technical indicators were preprocessed, and the models were
trained on the resulting features. Performance was evaluated using MAE, MSE, and R².
This research aims to give investors a baseline for decision-making and to serve as a step
toward more advanced Deep Learning applications in financial prediction.

2. Introduction
2.1. Identify problems

The real estate sector in Vietnam has experienced a long period of instability
in the post-COVID recovery period because of credit tightening, legal barriers, and
fluctuations in interest rates, among other factors. Official data show that in Q1 2020
only 1,300 real estate units were sold, which is 14.3 percent of the total supply and the
lowest absorption rate in four years (ThS. Nguyen Thi Hoa, 2020). Meanwhile, the growth
of credit to the real estate sector declined significantly, from 26 percent in 2018 to
12 percent in 2020, indicating tighter lending policies and limited access to funds
(Phan Nam, 2022). These changes have increased market uncertainty and tested the
decision-making of both institutional and retail investors.

In this regard, there is an increasing need for data-driven tools that can facilitate
more precise and timely investment decisions. Machine Learning (ML), which can
capture complex patterns and nonlinear relationships, has become an interesting method
of stock price prediction. Nevertheless, its performance in a turbulent and noisy market
such as real estate is not well-researched in the Vietnamese setting.

2.2. Report Objective

This study aims to apply Machine Learning models to predict real estate stock
prices in the next 5 days. Specifically, the team will:

- Compare the effectiveness of four machine learning models (Linear Regression,
Decision Tree, Random Forest, and Support Vector Regression) using common
evaluation metrics: MAE, MSE, and R².
- Determine the best-performing prediction model for each stock code to give
individual investors a firmer basis for decision-making.
- Lay the foundation for the development of a Deep Learning model with long-term
prediction capabilities and better adaptability to unstructured factors such as
market events. This model will be tested on highly volatile stocks such as NVIDIA
in the next phase of the course.
3. Disclaimer
3.1. AI Prompt

The code and models in this report were built on Google Colab using historical
data of five major Vietnamese real estate stocks (DXG - Dat Xanh Group Joint Stock
Company, PDR - Phat Dat Real Estate Development Joint Stock Company, VHM -
Vinhomes Joint Stock Company, NVL - No Va Land Investment Group Corporation
(Novaland), KDH - Khang Dien House Trading and Investment Joint Stock Company)
from 01/07/2015 to 01/07/2025, retrieved through the yfinance API. Core Python
libraries including pandas, numpy, scikit-learn, matplotlib, and seaborn were used for
data processing, model training, evaluation, and visualization. The code structure
includes:

- Calculation of technical indicators (e.g., EMA_10, EMA_20, MACD, Signal Line,
ATR, Upper Band);
- Removal of multicollinearity via correlation analysis;
- Model training and testing using Linear Regression, Decision Tree, Random
Forest, and SVR;
- Model comparison using MAE, MSE, and R²;
- Forecasting and visualizing the next 5 trading days’ prices for each stock.

The code was partially generated with the help of AI tools (ChatGPT-4o, Gemini 2.0,
DeepSeek V3, Grok 3.0) and later refined by the team to fit the project’s needs. It is
intended purely for academic use, not as financial advice.

3.2. Dataset

The dataset spans varying historical depths, with individual series starting anywhere
between 2015 and 2023, reflecting real-world forecasting where data availability
differs across assets. This variation allows for evaluating model flexibility and
generalization across data richness levels, aligning with recommendations to test
model robustness under diverse conditions (Géron, 2019).

CHAPTER II. PREDICTIVE MODEL BUILDING

1. Data Cleaning Process

Figure 1. Data cleaning Code

Forward fill followed by backward fill was used to handle missing values so
that continuity is maintained without data loss. One duplicate row was identified
and removed to preserve data integrity. The key columns (open, high, low, close,
price, and volume) were cast to float so that they can be used in numerical
calculations.
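
The cleaning steps described above can be sketched as follows. This is a minimal illustration rather than the code shown in Figure 1: the column names, the toy data, and the choice to detect duplicates by repeated date index are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

def clean_prices(df: pd.DataFrame) -> pd.DataFrame:
    """Fill gaps, drop duplicated dates, and cast price/volume columns to float."""
    numeric_cols = ["Open", "High", "Low", "Close", "Volume"]  # assumed column names
    out = df.copy()
    # Forward fill first, then backward fill to cover gaps at the very start.
    out = out.ffill().bfill()
    # Assumption: a duplicate row is one whose date index appears more than once.
    out = out[~out.index.duplicated(keep="first")]
    # Cast to float so later indicator arithmetic behaves consistently.
    out[numeric_cols] = out[numeric_cols].astype(float)
    return out

# Toy data with missing values and one duplicated date (hypothetical values).
raw = pd.DataFrame(
    {
        "Open": [10, np.nan, 12, 12, 13],
        "High": [11, 12, 13, 13, np.nan],
        "Low": [9, 10, 11, 11, 12],
        "Close": [np.nan, 11, 12, 12, 13],
        "Volume": [100, 200, 300, 300, 400],
    },
    index=pd.to_datetime(
        ["2025-01-02", "2025-01-03", "2025-01-06", "2025-01-06", "2025-01-07"]
    ),
)
cleaned = clean_prices(raw)
print(cleaned)
```

Backward fill is only needed for gaps at the start of a series, where no earlier value exists to carry forward.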

2. Data Description

Figure 2. Historical Closing Prices of Selected Stocks (2015–2025)

The chart shows the differences in price trends and volatility of the chosen
stocks throughout time. Such differences underline the need to train an individual
model per stock and select features that reflect each stock’s unique nature to
enhance prediction accuracy.

Figure 3. Distribution of Closing Prices (2015–2025)

Most closing prices fall between 10,000 and 30,000 VND, with DXG and KDH
appearing more frequently in this range. In contrast, VHM shows a broader and higher
price spread, reflecting stronger historical valuation. These variations may affect how
each model learns and predicts stock behavior.

Figure 4. Cumulative Returns Over Time (2015–2025)

Stocks like KDH and DXG demonstrate strong and consistent long-term growth,
while others such as PDR, VHM, and NVL display shorter time horizons and greater
volatility. Consistent trends support stable predictions, whereas noisy or limited data
challenge the model’s ability to generalize.

Figure 5. 20-Day Exponential Moving Averages (EMA) of Stock Prices

The EMA plot shows strong upward trends in VHM, DXG, and KDH, while NVL
and PDR display lower and more volatile patterns. Clearer trends may aid model
learning, whereas volatile series require more robust algorithms to capture patterns.

Figure 6. Rolling Standard Deviation of Daily Returns for Selected Stocks

The rolling standard deviation reveals volatility differences across stocks. DXG
and KDH show broader historical coverage and cyclical patterns, while PDR, VHM, and
NVL have shorter, more volatile windows from 2023. These distinctions highlight
varying risk profiles and their impact on model generalization.
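
The quantities behind Figures 4 and 6 (cumulative returns and the rolling standard deviation of daily returns) can be computed along the following lines. This is a sketch under assumptions: the report does not state the rolling window length, so `window` is a hypothetical parameter, and the price series is made up.

```python
import numpy as np
import pandas as pd

def return_statistics(close: pd.Series, window: int = 20) -> pd.DataFrame:
    """Daily returns, cumulative returns, and rolling volatility of a close series."""
    daily = close.pct_change(fill_method=None)
    # Compound the daily returns; the first day has no return, so treat it as 0.
    cumulative = (1 + daily.fillna(0)).cumprod() - 1
    # Sample standard deviation of daily returns over a sliding window.
    rolling_std = daily.rolling(window).std()
    return pd.DataFrame(
        {
            "daily_return": daily,
            "cumulative_return": cumulative,
            "rolling_std": rolling_std,
        }
    )

close = pd.Series([100.0, 110.0, 99.0, 108.9], name="Close")  # hypothetical prices
stats = return_statistics(close, window=2)
print(stats)
```

The final cumulative return equals the last price divided by the first price, minus one, which is a quick consistency check on any such chart.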

3. Model Development

3.1. Data Preparation

We used a supervised learning setup, where the target variable was the next-day
closing price. A Target column was created by shifting the Close column by one time
step. The last row was dropped due to a missing target value from the shift. Each ticker
was printed to verify dataset integrity after cleansing.

Each dataset was split into 80% training and 20% testing without shuffling to
preserve time-series order. For example, DXG.VN had 1,982 training rows and 496
testing rows, which leaves enough history for fitting while keeping every test
observation later than the training period.

The same pattern was applied to all five real estate stocks. Processed datasets
(X_train, X_test, y_train, y_test) were stored in a structured dictionary for easy access
during model training and evaluation.
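
A minimal sketch of the preparation described in this subsection is given below: shift the close to build a next-day target, drop the final row, and split chronologically. The function name, the `DEMO` key, and the toy data are illustrative assumptions, not the notebook's actual code.

```python
import pandas as pd

def make_supervised_split(df: pd.DataFrame, feature_cols, train_frac: float = 0.8) -> dict:
    """Create a next-day target and split chronologically without shuffling."""
    data = df.copy()
    data["Target"] = data["Close"].shift(-1)   # tomorrow's close becomes today's label
    data = data.dropna(subset=["Target"])      # the last row has no next-day value
    split = int(len(data) * train_frac)        # first 80% of rows go to training
    X, y = data[feature_cols], data["Target"]
    return {
        "X_train": X.iloc[:split],
        "X_test": X.iloc[split:],
        "y_train": y.iloc[:split],
        "y_test": y.iloc[split:],
    }

# Hypothetical 11-day series; one dictionary entry per ticker.
prices = pd.DataFrame({"Close": [float(p) for p in range(10, 21)]})
prices["High"] = prices["Close"] + 1
datasets = {"DEMO": make_supervised_split(prices, ["Close", "High"])}
print({k: len(v) for k, v in datasets["DEMO"].items()})
```

With 2,478 usable rows, `int(2478 * 0.8)` gives 1,982 training rows and leaves 496 for testing, which is consistent with the DXG.VN counts reported above.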

Figure 7. Data splitting

3.2. Model Selection Rationale

To evaluate model performance in stock price prediction, four supervised learning
algorithms were trained and compared: Linear Regression (LR), Decision Tree (DT),
Random Forest (RF), and Support Vector Regressor (SVR) with a linear kernel for speed
and interpretability.

For each stock, models were trained on (X_train, y_train) and evaluated on
(X_test, y_test). Performance was assessed using MAE, MSE, and R² — capturing
average error, penalized error magnitude, and explained variance, respectively. Each
model was re-fitted per stock to account for differences in market behavior.

This evaluation framework provided insight into how each algorithm performed
across different data volumes and volatility patterns, helping identify the most suitable
model for each stock.
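
The per-stock comparison loop can be sketched with scikit-learn as shown below. The hyperparameters (`random_state`, `n_estimators`) and the synthetic trending series are assumptions for illustration; the report does not list the settings used in the notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

def evaluate_models(X_train, y_train, X_test, y_test) -> dict:
    """Fit the four regressors on one stock and score them on its test set."""
    models = {
        "Linear Regression": LinearRegression(),
        "Decision Tree": DecisionTreeRegressor(random_state=0),
        "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
        "SVR (linear)": SVR(kernel="linear"),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "MAE": mean_absolute_error(y_test, pred),
            "MSE": mean_squared_error(y_test, pred),
            "R2": r2_score(y_test, pred),
        }
    return results

# Synthetic upward-trending "close" series (hypothetical data, not a real ticker).
rng = np.random.default_rng(0)
close = 10 + 0.1 * np.arange(200) + rng.normal(0, 0.2, 200)
X, y = close[:-1].reshape(-1, 1), close[1:]   # today's close predicts tomorrow's
split = int(len(X) * 0.8)                     # chronological 80/20 split
results = evaluate_models(X[:split], y[:split], X[split:], y[split:])
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On a series that keeps trending beyond its training range, tree-based models cannot predict values above the largest target they saw during training, which may be one reason such models lag behind linear ones when test prices move to new levels.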

3.3. Feature Engineering

A set of eight technical indicators commonly used in finance and research was
selected to train the forecasting model. These include EMA_10 and EMA_20 for short-
and medium-term trends, MACD and its Signal Line for momentum, Bollinger Upper
Band for price range, ATR for market volatility, High price for daily fluctuation
amplitude, and Closing price.
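
As an illustration of how these indicators are commonly derived from price columns, consider the sketch below. The window lengths (12/26/9 for MACD, 20 days and 2 standard deviations for the Bollinger band, 14 days for ATR) are conventional defaults assumed here, not settings confirmed by the report, and the ATR uses a simple rolling mean rather than Wilder's smoothing.

```python
import numpy as np
import pandas as pd

def add_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Append EMA, MACD, Signal Line, Bollinger upper band, and ATR columns."""
    out = df.copy()
    close = out["Close"]
    out["EMA_10"] = close.ewm(span=10, adjust=False).mean()
    out["EMA_20"] = close.ewm(span=20, adjust=False).mean()
    # MACD: fast EMA minus slow EMA; the Signal Line is an EMA of the MACD itself.
    out["MACD"] = (
        close.ewm(span=12, adjust=False).mean() - close.ewm(span=26, adjust=False).mean()
    )
    out["Signal_Line"] = out["MACD"].ewm(span=9, adjust=False).mean()
    # Bollinger upper band: 20-day mean plus two 20-day standard deviations.
    out["Upper_Band"] = close.rolling(20).mean() + 2 * close.rolling(20).std()
    # True range: the largest of the three candidate ranges for each day.
    prev_close = close.shift(1)
    true_range = pd.concat(
        [
            out["High"] - out["Low"],
            (out["High"] - prev_close).abs(),
            (out["Low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)
    out["ATR"] = true_range.rolling(14).mean()
    return out

# Flat toy series (hypothetical): every indicator has an easily checked value.
flat = pd.DataFrame({"High": [101.0] * 40, "Low": [99.0] * 40, "Close": [100.0] * 40})
features = add_indicators(flat)
print(features.tail(1).T)
```

On the flat series the EMAs stay at 100, MACD is 0, the upper band collapses onto the price, and ATR equals the constant 2-point daily range.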

Feature selection focused on avoiding mathematical and conceptual overlap
among technical indicators. For example, EMA and SMA both use moving averages,
while MACD and RSI are both momentum-based. Including highly correlated features can
increase multicollinearity, skew learning, and reduce interpretability. Thus, only
indicators with statistical and conceptual independence were selected.
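
One simple way to screen for overlapping features is a pairwise correlation filter, sketched below. The 0.95 threshold, the greedy keep-first rule, and the column names are assumptions for illustration; the report does not specify the exact procedure behind Figure 8.

```python
import numpy as np
import pandas as pd

def drop_correlated(features: pd.DataFrame, threshold: float = 0.95) -> list:
    """Greedily keep columns whose absolute correlation with every kept column is below threshold."""
    corr = features.corr().abs()
    kept = []
    for col in features.columns:
        # Keep the column only if it is not a near-duplicate of one already kept.
        if all(corr.loc[col, k] < threshold for k in kept):
            kept.append(col)
    return kept

rng = np.random.default_rng(1)
a = rng.normal(size=200)
demo = pd.DataFrame(
    {
        "a": a,
        "b": 2 * a + rng.normal(scale=0.01, size=200),  # nearly a copy of "a"
        "c": rng.normal(size=200),                      # independent of "a"
    }
)
kept = drop_correlated(demo)
print(kept)
```

Because the rule is greedy, the column order decides which member of a correlated pair survives, so features judged more interpretable should come first.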

Figure 8. Feature Selections

4. Model Training

The four models were applied to each stock, using historical features to predict
next-day closing prices. Predicted and actual values during the test period were visualized
on line graphs, helping to assess how well each model captured trends, fluctuations, and
patterns over time.

4.1. DXG.VN

Figure 9. Model Training for DXG.VN

Linear Regression and SVR produce steady, smooth forecasts close to the real prices.
Decision Tree and Random Forest, by contrast, show signs of overfitting in periods of
high fluctuation. Hence, Linear Regression is the best model for DXG in terms of
generalization and long-term predictive power.

4.2. PDR.VN

Figure 10. Model Training for PDR.VN

The forecasts of Linear Regression and SVR are smooth and close to the price
movements. Decision Tree and Random Forest are jagged, which indicates possible
overfitting. Accordingly, Linear Regression is the most appropriate choice for
PDR, balancing accuracy and stability.

4.3. VHM.VN

Figure 11. Model Training for VHM.VN

Linear Regression fits well and tracks the real price closely. SVR also performs
well, though its predictions are occasionally slightly off. Decision Tree and Random
Forest are unresponsive and miss short-term variations. Visually, Linear Regression
is the most effective model for VHM because it captures both long-term
and short-term trends well.

4.4. NVL.VN

Figure 12. Model Training for NVL.VN

Both Linear Regression and SVR produce natural, smooth price predictions.
Decision Tree and Random Forest tend to flatten fluctuations, which loses detailed
signals. Based on this graph, Linear Regression is the best choice for NVL thanks to
its high accuracy and smooth predictions.

4.5. KDH.VN

Figure 13. Model Training for KDH.VN

All models closely followed the actual price, but Linear Regression and SVR were
smoother and more stable. Decision Tree was slightly noisy, while Random Forest tended
to overfit. Thus, Linear Regression was the best model for KDH due to its stable
performance and practical simplicity.

Graphically, Linear Regression and Support Vector Regressor showed the best
results, with predicted lines closely following real prices. However, a final decision
cannot be made on visual results alone. To choose the most appropriate model, the
evaluation metrics must also be taken into account.

5. Evaluation & selection

According to the No Free Lunch theorem (Wolpert & Macready, 1997), no single
model works best for all cases, so four different models were tested on the same data. To
ensure objective selection, three common regression evaluation metrics were used:

5.1. MAE (Mean Absolute Error)

Figure 14. Mean Absolute Error (MAE) - Training Model

Based on MAE, Linear Regression was the most accurate and stable, especially for
DXG, NVL, and KDH. Decision Tree and Random Forest performed poorly on VHM
and PDR, likely due to overfitting. SVR was steady but less effective than Linear
Regression.

5.2. MSE (Mean Squared Error)

Figure 15. Mean Squared Error (MSE) - Training Model

Linear Regression recorded the lowest and least variable MSE on DXG, NVL, and KDH,
which further confirms its effectiveness. By contrast, Decision Tree and Random Forest
performed poorly on VHM and PDR, probably because of overfitting. SVR was comparable
to Linear Regression but did not outperform it.

5.3. R² (R-squared)

Figure 15. R² (R-squared) - Training Model

Linear Regression achieved strong R² scores across most stocks, including DXG,
NVL, VHM, and KDH, showing strong explanatory power. SVR performed slightly
lower, while Decision Tree and Random Forest struggled with PDR and VHM due to
poor generalization on volatile data. Overall, Linear Regression was the most reliable in
terms of R².
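
For reference, the three metrics used in this section can be written out directly. The sketch below uses made-up numbers, not the report's results, and also shows how R² becomes negative when a model does worse than simply predicting the mean of the test values.

```python
import numpy as np

def regression_metrics(y_true, y_pred) -> dict:
    """Compute MAE, MSE, and R² from their textbook definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                       # squared error of the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # squared error of the mean baseline
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": float(np.mean(err ** 2)),
        "R2": float(1 - ss_res / ss_tot),
    }

actual = [10.0, 12.0, 14.0, 16.0]  # hypothetical prices
close_fit = regression_metrics(actual, [11.0, 12.0, 13.0, 16.0])
flat_fit = regression_metrics(actual, [9.0, 9.0, 9.0, 9.0])
print(close_fit)  # MAE 0.5, MSE 0.5, R2 0.9
print(flat_fit)   # MAE 4.0, MSE 21.0, R2 -3.2
```

The second case mimics a model that predicts a flat level below every actual price: its squared error exceeds that of the mean baseline, so R² falls below zero.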

6. 5-Day Forecasting (03/07/2025 – 08/07/2025)

Figure 16. 5-Day Forecasting (03/07/2025 – 08/07/2025)

The code snippet shown below applies Linear Regression, the best-performing
model, to predict stock prices over a five-trading-day interval
(03/07/2025 – 08/07/2025). The result contains the predicted closing prices of the
five real estate stocks.
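
The report does not spell out how a one-day-ahead model is rolled forward over several days. One common approach is recursive forecasting, where each prediction is fed back as an input for the next step. The sketch below is a deliberately simplified stand-in and not the notebook's method: it fits ordinary least squares on lagged closes only (no technical indicators), and `n_lags`, `horizon`, and the toy series are assumptions.

```python
import numpy as np

def recursive_forecast(close, n_lags: int = 3, horizon: int = 5) -> list:
    """Fit a linear model on lagged closes, then roll it forward `horizon` steps."""
    close = np.asarray(close, dtype=float)
    # Each training row holds the n_lags closes before the target, plus an intercept.
    rows = np.array([close[i - n_lags:i] for i in range(n_lags, len(close))])
    X = np.column_stack([rows, np.ones(len(rows))])
    y = close[n_lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    history = list(close[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        features = np.append(history[-n_lags:], 1.0)
        next_close = float(features @ coef)
        forecasts.append(next_close)
        history.append(next_close)  # feed the prediction back in as an input
    return forecasts

# Toy series rising by exactly 0.5 per day (hypothetical data).
series = 10 + 0.5 * np.arange(30)
forecast = recursive_forecast(series)
print([round(v, 2) for v in forecast])
```

Because every step reuses earlier predictions, errors compound over the horizon, which fits the pattern reported in the following subsections where the largest misses fall on the final forecast day.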

Figure 17. 5-Day Forecasting Graph

6.1. DXG.VN

Table 1. 5-Day Forecasting Metrics - DXG.VN

In the initial days the model predicted fairly accurately, but it strongly
under-predicted on July 8, when the real price rose sharply. Linear Regression
was not responsive enough to sudden changes and could not follow the
rapid upward movement.

6.2. PDR.VN

Table 2. 5-Day Forecasting Metrics - PDR.VN

The model tracked this stock more closely, and the error was minimal throughout
the period. On July 7 the forecast was slightly over-optimistic, yet the model followed
the actual trend relatively closely, indicating moderate fluctuations.

6.3. VHM.VN

Table 3. 5-Day Forecasting Metrics - VHM.VN

The model frequently underestimated the stock price, most notably on July 8,
when the price rose sharply. This indicates that the model failed to keep pace with
rapid growth and was conservative when predicting highly volatile stocks.

6.4. NVL.VN

Table 4. 5-Day Forecasting Metrics - NVL.VN

The prediction matched the actual price closely during the first few days, but it
failed to capture the falling trend on July 8, which resulted in an overestimate.
The model assumed a continuing upward trend, whereas the market actually
turned down.

6.5. KDH.VN

Table 5. 5-Day Forecasting Metrics - KDH.VN

As with VHM, the model consistently underestimated the actual price,
particularly when the price grew significantly towards the end of the
period. The model is appropriate only in a stable market and is not
sensitive enough to capture strong rebounds.

7. Discussion of Potential Causes for the Identified Trends

The patterns in forecast accuracy across the five real estate stocks may
be explained by a few general factors.

The volatility and price range of each stock were highly influential. The model
generalised well on stocks with steady, moderate variance but produced
larger errors on stocks with sudden spikes or volatile prices.

The linearity of the selected model, Linear Regression, limited its ability to
detect nonlinear dynamics or sudden shifts in the market. The model
followed general trends but struggled to respond to abrupt directional
changes, particularly towards the end of the forecast period.

The model relies only on technical signals derived from past prices and does not
incorporate external qualitative data such as financial news or policy changes. This left
it less sensitive to near-term market drivers that can move real estate stocks.

Lastly, the length and quality of the data affected model performance. Stocks
with longer and more consistent historical records could be modeled more
accurately and generalized better. Forecasts were less reliable for stocks
with limited or volatile datasets.

Together, these factors point to the need for more adaptive modeling
strategies and richer input features to improve predictive performance
in a complex and volatile market such as Vietnam's real estate sector.

CHAPTER III. CONCLUSION

1. Findings

While no single model proved to be universally optimal, Linear Regression and
Support Vector Regression consistently performed well across most tickers. Among
these, Linear Regression demonstrated the lowest MAE and MSE, along with relatively
high R-squared values, indicating strong predictive ability in stable market conditions.

In contrast, Decision Tree and Random Forest models occasionally suffered from
overfitting, particularly when market patterns were noisy or inconsistent. In some cases,
these models even produced negative R-squared scores, suggesting poor generalization
on the test set. The performance gaps among models highlighted the inherent limitations
of traditional machine learning models, particularly their reduced effectiveness in
short-term forecasts under conditions of high volatility or sudden, news-driven market
shifts.

Overall, the modeling pipeline, which included feature engineering, technical
indicator extraction, and sequential train-test splitting, was successful in generating
reliable baseline forecasts for relatively stable periods. However, the ability to capture
sharp price movements remains limited under the current framework.

2. Recommendations and Future Work

For future development, it is recommended to explore deep learning architectures,
especially LSTM (Long Short-Term Memory) models, which are well-suited for
capturing temporal dependencies and could offer improved accuracy for volatile stocks
such as NVIDIA.
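
As a pointer toward that next phase, sequence models such as LSTMs typically expect inputs shaped as (samples, timesteps, features). The sketch below shows one way to window a price series into that shape with NumPy. The function name and window length are hypothetical, and no deep learning library is involved.

```python
import numpy as np

def make_windows(series, timesteps: int):
    """Slice a 1-D series into overlapping windows and their next-step targets."""
    series = np.asarray(series, dtype=float)
    # Window i covers series[i : i + timesteps]; its target is the value right after it.
    X = np.stack([series[i:i + timesteps] for i in range(len(series) - timesteps)])
    y = series[timesteps:]
    return X[..., np.newaxis], y  # trailing axis = one feature per time step

X_seq, y_seq = make_windows(np.arange(10), timesteps=3)
print(X_seq.shape, y_seq.shape)  # (7, 3, 1) (7,)
```

Each window ends just before its target, so the chronological ordering used for the train-test split in this report carries over unchanged.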

Consider integrating quantitative data (historical stock prices or technical
indicators) with qualitative inputs such as news headlines, market sentiment, and social
media signals. This multimodal approach could enhance model responsiveness to
external shocks and market narratives.

Development of hybrid models (e.g., Bi-directional LSTM or BERT combined
with numerical features) is suggested to better capture both sequential dependencies and
semantic patterns from text-based data sources.

Finally, implementing an early warning system that reacts to real-time information
such as surprise product releases (e.g., the launch of DeepSeek) would increase model
adaptability and forecasting robustness in high-impact, short-horizon scenarios.

REFERENCES

1. Phan Nam. (2022, May 10). Tín dụng bất động sản: “Nắn” chứ không nên “siết.” [Real estate credit: "Steer" rather than "tighten."] Nhịp Sống Kinh Tế Việt Nam & Thế Giới. https://vneconomy.vn/techconnect//tin-dung-bat-dong-san-nan-chu-khong-nen-siet.htm?utm_source=chatgpt.com

2. ThS. Nguyễn Thị Hoa. (2020). Thị trường bất động sản Việt Nam trong cơn bão Covid-19: Đón chờ lực bật mới. [Vietnam's real estate market in the Covid-19 storm: Awaiting a new rebound.] Consosukien.vn. https://consosukien.vn/thi-truong-bat-dong-san-viet-nam-trong-con-bao-covid-19-don-cho-luc-bat-moi.htm

3. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893

4. Domingos, P. (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10), 78–87. https://doi.org/10.1145/2347736.2347755

5. Murphy, J. J. (1999). Technical analysis of the financial markets: A comprehensive guide to trading methods and applications. New York Institute of Finance.

6. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. https://www.deeplearningbook.org/

7. Kaufman, P. J. (2013). Trading systems and methods (5th ed.). Wiley.

8. Zhang, X., Qu, Y., & Li, R. (2017). Stock price prediction via discovering multi-frequency trading patterns. Proceedings of the 26th International Conference on World Wide Web, 1231–1240. https://doi.org/10.1145/3097983.3098131

9. Géron, A. (2019). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly Media.

10. [AIDE300] GROUP 4 - MIDTERM COLAB NOTEBOOK

GOOGLE COLAB

https://colab.research.google.com/drive/1sBobAz4spemHli97SZjQGWN0h7wILyCF?usp=sharing&fbclid=IwY2xjawLiAGFleHRuA2FlbQIxMQABHpU_d-DujWbtIJ8_VH5YgYAOOlQKGY8v60sMqXqxDzpQNu9ahaSl1sdw6RLb_aem_sD0Com7R9MOQh7IgUeiv2g

Chapter 5 - Numerical Methods in Heat Conduction
1 page
Econometrics for Non-Econ Majors
No ratings yet
Econometrics for Non-Econ Majors
2 pages
Karachi LTE1800 Model Tuning - Cluster Comparison
No ratings yet
Karachi LTE1800 Model Tuning - Cluster Comparison
18 pages
Deep Learning Unit I II MCQ
No ratings yet
Deep Learning Unit I II MCQ
2 pages
N1
No ratings yet
N1
2 pages
Partial Differential Equations Guide
No ratings yet
Partial Differential Equations Guide
2 pages
Analytical Models in Parallel Computing
No ratings yet
Analytical Models in Parallel Computing
82 pages
14the Normal Distribution - Worksheet
No ratings yet
14the Normal Distribution - Worksheet
10 pages
DM Recurrence Relation
No ratings yet
DM Recurrence Relation
32 pages
Post-Quantum Lattice-Based Secure Reconciliation Enabled Key Agreement Protocol For IoT
No ratings yet
Post-Quantum Lattice-Based Secure Reconciliation Enabled Key Agreement Protocol For IoT
13 pages
(J22) - A Limited-Preview Filtered B-Spline Approach To Tracking Control - With Application To Vibration-Induced Error Compensation of A 3D Printer
No ratings yet
(J22) - A Limited-Preview Filtered B-Spline Approach To Tracking Control - With Application To Vibration-Induced Error Compensation of A 3D Printer
10 pages
City House Price Prediction: Mini Project Report
No ratings yet
City House Price Prediction: Mini Project Report
3 pages
TRANSIENT STABILITY ANALYSIS in ETAP
No ratings yet
TRANSIENT STABILITY ANALYSIS in ETAP
10 pages
Cse3004 Design-Analysis-Of-Algorithm LT 1.0 1 Cse3004
No ratings yet
Cse3004 Design-Analysis-Of-Algorithm LT 1.0 1 Cse3004
2 pages
Thermodynamics for Students
No ratings yet
Thermodynamics for Students
19 pages
Grade 10 Probability Worksheet
No ratings yet
Grade 10 Probability Worksheet
4 pages
Program Evaluation and Review Technique
No ratings yet
Program Evaluation and Review Technique
7 pages
Decision Trees for Data Scientists
No ratings yet
Decision Trees for Data Scientists
13 pages
PhD Computer Science Syllabus
No ratings yet
PhD Computer Science Syllabus
5 pages
01-Introduction To Soft Computing PDF
100% (2)
01-Introduction To Soft Computing PDF
61 pages
Dmouj
No ratings yet
Dmouj
40 pages

(Aide300) Group 4 - Final Report

FOREIGN TRADE UNIVERSITY

HO CHI MINH CAMPUS

Course name: Artificial Intelligence in Era of Digital Transformation

Date: 14/07/2025 – Class code: 24 – Course code: AIDE300

No. Full Name ID Peer Evaluation

1 Nguyễn Mai Anh 2312255005 100%

2 Nguyễn Thiên Trang 2312255072 100%

3 Võ Lâm Thanh Trúc 2312255074 100%

Grade (in number) Grade (in words)

Examiner 1’s signature Examiner 2’s signature

1. Executive Summary

2.2. Report Objective

- Compare the effectiveness of 4 machine learning models: Linear Regression,

- Calculation of technical indicators (e.g., EMA_10, EMA_20, MACD, Signal Line,

The dataset spans varying historical depths (2015–2023), reflecting real-world

1. Data Cleaning Process

Figure 1. Data Cleaning Code

2. Data Description

Figure 3. Distribution of Closing Prices (2015–2025)

Figure 5. 20-Day Exponential Moving Averages (EMA) of Stock Prices
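As a rough illustration of how the EMA series in Figure 5 and the EMA_10, EMA_20, MACD, and Signal Line indicators named in the report objectives are conventionally computed, the sketch below uses plain Python. The helper names (`ema`, `macd`), the seeding of the EMA with the first price, the 12/26/9 MACD periods, and the sample prices are assumptions based on the standard definitions, not details confirmed by the report.

```python
def ema(prices, span):
    """Exponential moving average with smoothing factor alpha = 2 / (span + 1).

    Assumption: the series is seeded with the first price.
    """
    if not prices:
        return []
    alpha = 2.0 / (span + 1)
    out = [float(prices[0])]
    for price in prices[1:]:
        # New value blends the current price with the previous EMA.
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line); 12/26/9 are the conventional periods."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)  # Signal Line = EMA of the MACD line
    return macd_line, signal_line


closes = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5, 11.3, 11.9]  # made-up prices
ema_20 = ema(closes, 20)
macd_line, signal_line = macd(closes)
```

The recursion here is the same one pandas applies in `Series.ewm(span=..., adjust=False).mean()`, so a notebook that works with DataFrames could swap that call in for the `ema` helper.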

Figure 6. Rolling Standard Deviation of Daily Returns for Selected Stocks
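The quantity plotted in Figure 6 can be sketched as follows. This is a minimal illustration that assumes simple (not logarithmic) daily returns and a sample standard deviation over a trailing window; the function names, the window length, and the prices are hypothetical, since the report excerpt does not state them.

```python
from statistics import stdev


def daily_returns(prices):
    """Simple daily returns: r_t = p_t / p_(t-1) - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def rolling_std(values, window):
    """Sample standard deviation over each full trailing window.

    The result has len(values) - window + 1 entries; shorter leading
    windows are skipped rather than padded.
    """
    return [
        stdev(values[end - window:end])
        for end in range(window, len(values) + 1)
    ]


closes = [20.0, 20.4, 19.9, 20.1, 20.8, 20.6, 21.0]  # made-up prices
returns = daily_returns(closes)
volatility = rolling_std(returns, window=3)  # window length is an assumption
```

A higher value in `volatility` marks a stretch where daily returns swung more widely, which is how a rolling standard deviation is usually read as a volatility proxy.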

3. Model Development

Figure 7. Data splitting

3.2. Model Selection Rationale

To evaluate model performance in stock price prediction, four supervised learning

3.3. Feature Engineering

Feature selection focused on avoiding mathematical and conceptual overlap

Figure 8. Feature Selections

4. Model Training

Figure 9. Model Training for DXG.VN

Figure 10. Model Training for PDR.VN

5. Evaluation & Selection

5.1. MAE (Mean Absolute Error)

Figure 14. Mean Absolute Error (MAE) - Training Model

5.2. MSE (Mean Squared Error)

Figure 15. R² (R-squared) - Training Model
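For reference, the three evaluation metrics named in this section follow the standard definitions sketched below. The function names and the sample values are illustrative assumptions; the report's own implementation is not shown in this excerpt.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of (actual - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """R-squared: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all actual values are identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0]     # made-up closing prices
predicted = [1.0, 2.0, 4.0]  # made-up model outputs
scores = (mae(actual, predicted), mse(actual, predicted), r2(actual, predicted))
```

Lower MAE and MSE indicate a closer fit, with MSE penalizing large misses more heavily, while an R² closer to 1 means the model explains more of the variance in the actual prices.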

6. 5-Day Forecasting (03/07/2025 – 08/07/2025)

Figure 16. 5-Day Forecasting (03/07/2025 – 08/07/2025)

Table 1. 5-Day Forecasting Metrics - DXG.VN

Table 2. 5-Day Forecasting Metrics - PDR.VN

Table 3. 5-Day Forecasting Metrics - VHM.VN

The model frequently underestimated the stock price most

Table 4. 5-Day Forecasting Metrics - NVL.VN

Table 5. 5-Day Forecasting Metrics - KDH.VN

7. Discussion of Potential Causes for the Identified Trends

A combination of these reasons points to more adaptive modeling

While no single model proved to be universally optimal, Linear Regression and

Overall, the modeling pipeline, which included feature engineering, technical

2. Recommendations and Future Work

For future development, it is recommended to explore deep learning architectures

Consider integrating quantitative data (historical stock prices or technical

Finally, implementing an early warning system that reacts to real-time information

