Stock Market Time Series Analysis
Scientific Programming
Volume 2022, Article ID 4758698, 12 pages
https://doi.org/10.1155/2022/4758698
Research Article
Research on Stock Price Time Series Prediction Based on Deep
Learning and Autoregressive Integrated Moving Average
Daiyou Xiao (1) and Jinxia Su (2)
(1) School of Finance, Central University of Finance and Economics, Beijing, China
(2) School of Business, Central University of Finance and Economics, Beijing, China
Received 7 December 2021; Revised 24 January 2022; Accepted 21 February 2022; Published 31 March 2022
Copyright © 2022 Daiyou Xiao and Jinxia Su. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
Unlike traditional algorithms and models, machine learning is a systematic and comprehensive application of computer algorithms and statistical models, and it has been widely used in many fields. In finance, machine learning is mainly used to study the future trend of capital market prices. In this paper, to predict stock time-series data, we apply traditional models and machine learning models to the linear and nonlinear parts of the forecasting problem, respectively. First, stock samples from 2010 to 2019 on the New York Stock Exchange are collected. Next, the ARIMA (autoregressive integrated moving average) model and the LSTM (long short-term memory) neural network model are trained to predict stock prices and stock price correlations. Finally, we evaluate the proposed model with several indicators, and the experiment results show that (1) stock prices and stock price correlations are accurately predicted by the ARIMA and LSTM models; (2) compared with ARIMA, the LSTM model performs better in prediction; and (3) the ARIMA-LSTM ensemble model significantly outperforms the other benchmark methods. Therefore, our proposed method provides theoretical support and a methodological reference for investors trading in the China stock market.
can be expected to gain by choosing stocks with different β coefficients. Moreover, the stock index and its constituent stock prices often move in sync in the global stock market. Therefore, besides predicting the stock index and single stock prices, better portfolio strategies can be worked out by forecasting the correlation coefficients of the constituent stocks of the stock index for higher returns on investment.

Based on all of this, this paper takes the highly representative S&P 500 stock index and its constituent stocks as the research object. It forecasts the future trend of the S&P 500 stock index through forecast models and then predicts the correlation coefficients between its constituent stocks and the stock index, so as to formulate an investment strategy that investors can refer to, to a certain extent.

Over the past few decades, much social science research has focused on predicting social and economic development trends with quantitative methods. Many feasible methods in time-series analysis, each with advantages and disadvantages, can be interpreted as techniques for using past data to build forecasts and strategies for future values.

First, research on linear models: As early as the 1990s, the ARIMA (autoregressive integrated moving average) method was already used by scholars to forecast in the capital market. Some researchers used ARIMA and its coefficients to predict stock market data [3] and found that the result was better than the prediction under the null hypothesis of random fluctuations in the base value. The ARIMA model has been used in many fields, including temperature prediction, electricity price prediction, and wind speed prediction. Some studies adopted the ARIMA time-series process in their research [4]. Yang et al. selected the Shanghai Composite Index to build an ARIMA model [5]. Kim and Sayama developed a new method for forecasting the future trend of the S&P 500 index by establishing a complex network from the time series underlying the S&P 500 index and then linking the network to the interconnected weights [6]. The study showed that adding network measurements to ARIMA can improve the prediction accuracy. Khashei and Hajirahimi hold that the time series in a hybrid model is divided into two parts, linear and nonlinear [7]. Therefore, ARIMA and MLP (multilayer perceptron) models are chosen to build hybrid models. They also found that, on the whole, the ANN-ARIMA hybrid model can achieve more accurate results. Unggara et al. used the Firefly algorithm to optimize the ARIMA (p, d, q) model and determined the best ARIMA model by looking for the smallest AIC (Akaike information criterion) value [8]. As a result, the ARIMA model optimized by the Firefly algorithm has better forecasting performance.

Second, research on neural network models: The LSTM (long short-term memory) network, which has achieved further success in processing large data sets, is mainly used in deep learning. Although the LSTM model is limited in the number of inputs, Siami-Namini and Namin attempted to use LSTM on financial data sets [9]. Experiment results indicate that the proposed method performs well in predicting economic and financial time series. Other researchers put forward a stock price prediction method using deep learning models [10]: 14 different DL methods similar to LSTM are applied to S&P BSE-BANKEX stock index data and are capable of forecasting one or even four steps ahead. It is found that the DL methods proposed in their research obtain good prediction results for stock prices. Joo II and Seung-ho proposed a stock price forecast model using a bidirectional LSTM recurrent neural network, which adds a hidden layer in the opposite direction of the data flow to overcome the limitations of previous RNN-based models [11]. It was found that, compared with the unidirectional LSTM recurrent neural network, the stock price prediction model using the bidirectional LSTM recurrent neural network has higher accuracy. To remove the high noise in stock data, researchers applied the wavelet threshold denoising method to preprocess the initial data sets [12]. In their study, the soft/hard threshold method used for data preprocessing has a significant effect on noise suppression. Based on this research, a new multioptimal combination wavelet transform (MOCWT) was proposed, and the research showed that MOCWT is more accurate in forecasting than traditional methods. Researchers also proposed an LSTM model and applied it to intraday stock forecasts [13]. Chen and Ge explored the forecasting mechanism of stock price movement based on LSTM and found that it significantly improved the forecasting performance [14].

Third, research on hybrid models: G. Peter Zhang used a hybrid ARIMA and ANN method to study time series forecasting [15]. Narendra Babu and Eswara Reddy proposed a linear hybrid model that can simultaneously maintain the prediction accuracy and the trend of the data [16]. Baek and Kim proposed a novel data augmentation method for stock market index prediction based on the ModAugNet framework [17]. The method includes an overfitting prevention LSTM module and a prediction LSTM module, and analysis shows that the test performance depends entirely on the latter. An ensemble method combining LSTM with GARCH is proposed in [18]; it has high predictive ability and good applicability. Chen et al. proposed a new ensemble model for portfolio selection problems with skewness and kurtosis [19].

Through the analysis of recent literature, it can be found that domestic and foreign forecasting models can be roughly divided into linear, nonlinear, and hybrid models. In general, the current research status at home and abroad can be summarized as follows: research on linear models mainly focuses on the ARIMA model. In recent research, many researchers have more confidence in the predictive performance of nonlinear models than in that of linear models. The hybrid model is generally regarded as the best predictive model of the three: it can not only process the linear part of time series data but also has better processing capabilities for its nonlinear part. Therefore, in our study, a single method is first used to predict the trend of stock indexes, and then a hybrid one is adopted to predict the correlation coefficients between stock indexes and their constituent stocks, so as to provide investors with guidance to profit to a certain extent.
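The linear/nonlinear split behind the hybrid models surveyed above is commonly written as the following decomposition. This is a sketch of the standard formulation in the hybrid ARIMA-ANN literature such as [15]; the symbols L_t, N_t, and e_t are notation introduced here for illustration and do not appear elsewhere in this paper.

```latex
% y_t: observed series; L_t: linear component; N_t: nonlinear component
y_t = L_t + N_t
% Step 1: a linear model (e.g. ARIMA) gives \hat{L}_t; its residual is
e_t = y_t - \hat{L}_t
% Step 2: a nonlinear model (e.g. a neural network) fitted on past residuals gives \hat{N}_t
% Step 3: the combined forecast is
\hat{y}_t = \hat{L}_t + \hat{N}_t
```

Under this reading, the residual e_t carries whatever structure the linear model could not capture, which is why it serves as the input of the nonlinear model.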
where $y_t$ and $\varepsilon_t$ are the actual value and the random error at time period $t$, respectively; $\Phi_i$ ($i = 1, 2, \ldots, p$) and $\theta_j$ ($j = 1, 2, \ldots, q$) are the model parameters; $p$ and $q$, both integers, are the orders of the model mentioned earlier; and the random errors $\varepsilon_t$ are assumed to be independent and identically distributed with mean 0 and constant variance $\sigma^2$. Equation (1) covers several important special cases of the ARIMA family. If $q = 0$, equation (1) reduces to an AR model of order $p$. When $p = 0$, the model reduces to an MA model of order $q$. The model order (p, q) is the key link in ARIMA model construction, as it determines the accuracy of model prediction. The orders of the AR and MA operations are denoted p and q, respectively. These two parameters need to be determined from the autocorrelation function (ACF) graph.

ARIMA includes the following steps:

Step 1: Data diagnosis and check: In the first step, it is necessary to check the stationarity of the given time series data, which is essential to improve the accuracy of forecasting. A stationary time series is a time series whose statistical properties, such as mean, variance, and covariance, do not depend on time.

Step 2: Model parameter estimation: In order to stabilize a nonstationary time series, differencing of a proper degree d is performed on it, and the stationarity test is performed again; this process is continued until a stationary series is obtained. d is a positive integer that gives the degree of differencing. If the difference operation is performed d times, the integration parameter of the ARIMA model is set to d, and then the obtained stationary data are identified. In this process, the autocorrelation graph (ACF graph) and partial autocorrelation graph (PACF graph) are determined.

Step 3: Model identification and selection: After ensuring that the input variable is a stationary series, the parameter d has been determined. Next, estimation algorithms are used to find the coefficients most suitable for the selected ARIMA model. Then the AIC standard or BIC standard is used to test the model and select the minimum

2.2. Long Short-Term Memory Model. Many researchers found that different models are good at dealing with different types of prediction problems. This provides a basis for using the ARIMA-LSTM hybrid model, which contains both linear and nonlinear parts, to produce better results than a single method. Figure 1 shows the internal structure of an LSTM memory cell.

Our study used the standard LSTM, which includes four interacting neural network layers (forget gate, input gate, input candidate gate, and output gate).

$$f_t = \sigma\left(W_f \times [h_{t-1}, x_t] + b_f\right), \tag{2}$$

where σ represents the sigmoid activation function:

$$\sigma(x) = \frac{1}{1 + e^{-x}}. \tag{3}$$

Then a new cell state $C_t$ is obtained with the help of the input gate; this state serves as the updated cell state passed to the next time step. The input gate uses σ as its activation function and, together with the candidate layer, produces $i_t$ and $\tilde{C}_t$ as outputs. $i_t$ determines which features of $\tilde{C}_t$ are reflected in $C_t$.

$$\begin{aligned} i_t &= \sigma\left(W_i \times [h_{t-1}, x_t] + b_i\right), \\ \tilde{C}_t &= \tanh\left(W_c \times [h_{t-1}, x_t] + b_c\right). \end{aligned} \tag{4}$$

The σ function outputs a value in the range 0 to 1, and the tanh function outputs a value in the range −1 to 1.

Next, the hidden state $h_t$ is obtained from $o_t$ and $C_t$, where $o_t$ is decided by the output gate.

$$o_t = \sigma\left(W_o \times [h_{t-1}, x_t] + b_o\right), \tag{5}$$

$$C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t, \tag{6}$$

$$h_t = o_t \times \tanh\left(C_t\right). \tag{7}$$

Equations (6) and (7) produce $C_t$ and $h_t$, which are passed to the next time step. The experiment in this article is a regression problem, and the range of the output value of the proposed model is −1 to 1; therefore, the last element is activated by the tanh function.
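The gate equations (2) and (4)-(7) can be sketched as a single time step in plain Python. This is a minimal illustration rather than the authors' implementation: the function names `sigmoid`, `affine`, and `lstm_step` and the layout of the `params` dictionary are assumptions introduced here, and a practical model would rely on a deep learning library instead.

```python
import math


def sigmoid(x):
    """Equation (3): squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, z, b):
    """Return W z + b for a weight matrix W (list of rows) and a vector z."""
    return [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """Run one LSTM time step following equations (2) and (4)-(7).

    `params` is a hypothetical dict mapping "f", "i", "c", "o" to (W, b)
    pairs; each W has one row per hidden unit and one column per entry of
    the concatenated vector [h_{t-1}, x_t].
    """
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    W_f, b_f = params["f"]
    W_i, b_i = params["i"]
    W_c, b_c = params["c"]
    W_o, b_o = params["o"]
    f_t = [sigmoid(a) for a in affine(W_f, z, b_f)]       # forget gate, eq. (2)
    i_t = [sigmoid(a) for a in affine(W_i, z, b_i)]       # input gate, eq. (4)
    c_cand = [math.tanh(a) for a in affine(W_c, z, b_c)]  # candidate state, eq. (4)
    o_t = [sigmoid(a) for a in affine(W_o, z, b_o)]       # output gate, eq. (5)
    # Cell state update, eq. (6), and hidden state, eq. (7).
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# One hidden unit and one input feature, all weights zero (illustrative values only).
zero_params = {gate: ([[0.0, 0.0]], [0.0]) for gate in ("f", "i", "c", "o")}
h_t, c_t = lstm_step([1.0], [0.0], [1.0], zero_params)
print(h_t, c_t)
```

With all-zero weights every gate outputs 0.5 and the candidate state is 0, so the cell state is simply halved at each step; this makes the roles of equations (6) and (7) easy to check by hand.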
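[Figure 1: internal structure of the LSTM cell, with inputs x_t, h_{t−1}, and C_{t−1}, gates f_t, i_t, and o_t, candidate state C̃_t, and outputs h_t and C_t.]
[Figure: diagram with surviving labels ARIMA Model, ARIMAweight, ARIMAerror, and LSTMerror.]
[Figure 3 plot area: x-axis in dates, from 2011/1/4 to 2019/1/4.]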
stock index forecasting models, ARIMA and LSTM, are constructed at first. The S&P 500 stock index is selected as the empirical data, and daily trading data are taken over the sample interval from January 1, 2010, to December 31, 2019, giving 2519 data points in total. Among them, the first 90% are used for model training, and the remaining 10% are used for model prediction. The S&P 500 stock index sequence is shown in Figure 3. It can be seen from the figure that, within the selected time range, the S&P 500 Index generally shows a steady increasing trend.

Figure 3: S&P 500 stock index closing sequence.

3.2. Comparative Analysis of Stock Index Forecast Model Results. After obtaining the prediction data set, the aforementioned test methods are used in this study to evaluate the predictions of each forecasting method. The following table shows the loss values obtained for the predictions of the ARIMA model and the LSTM neural network model.

Table 1: The experiment results of ARIMA and LSTM model forecasting.

Model   MSE       MAE       RMSE
ARIMA   0.000101  0.007333  0.043788
LSTM    0.000096  0.007184  0.028828

Table 1 shows the fitting results based on the loss values of each model's predictions under different loss functions. It can be seen from the table that the loss values of the LSTM model are all smaller than those of the ARIMA model. This is because the LSTM model can not only describe the nonlinear relationships in time series data but also has certain processing capabilities for the linear part, despite being less stable than the ARIMA model. However, generally speaking, both models achieve very low loss values, indicating that both perform relatively well in prediction accuracy. Figure 4 shows the predicted results using LSTM and ARIMA, respectively.

[Figure 4: two panels, "Actual V.S. Predicted using LSTM" and "Actual V.S. Predicted using ARIMA"; y-axis: Normalised Value, from 0.65 to 1.00.]

3.3. The Design of Stock Price Correlation Coefficient Prediction Ensemble Model

3.3.1. The Design of ARIMA for Stock Price Correlation Coefficient Prediction

(1) In the experiment of correlation coefficient prediction, the adjusted closing prices of the constituent stocks of the S&P 500 index are selected, and the sample interval is again set from January 1, 2010, to December 31, 2019, using daily transaction records on the New York Stock Exchange. Data are mainly acquired through web crawling with Python's Beautiful Soup library. The trading data of the constituent stocks originate from the Quandl database, and the industry information of the constituent stocks is from Wikipedia.

After preprocessing the data, the program randomly selects 150 stocks from the remaining 446 assets and calculates the correlation coefficient of each pair of assets in a 100-day time window. In order to diversify the data, 5 sets of data are set up in this
article, with starting points 20 days apart: day 1, day 21, day 41, day 61, and day 81. Each starting point corresponds to a rolling 100-day window, advancing in 100-day time steps until the end of the data set. In this process, a total of 55,875 time series were generated for training (150 stocks give 150 × 149 / 2 = 11,175 pairs, and 5 starting points give 5 × 11,175 = 55,875), and each series has 24 time steps. The development, test1, and test2 sets are produced from these 55,875 × 24 data sets. In the model evaluation stage, this paper divides the data as follows to achieve forward optimization.

(2) The parameters of the model should be determined before fitting the ARIMA (p, d, q) model, of which d is the easiest to determine. Differencing aims to make the series finally used a time series that tends to be stationary, which can improve forecasting accuracy. As mentioned in the previous section, the S&P 500 Index and its constituent stocks generally show a steady increasing trend. The data tend to be stationary after one difference, so the parameter d here can be set to 1. The parameters p and q are determined from the ACF and PACF of the data.

When the ACF or PACF becomes zero after a certain order, this is called truncation. The running results show that most data sets show an oscillating trend, as shown in Table 2. There are also other notable patterns, including rising/falling trends, occasional large drops while the correlation coefficient is otherwise stable, and stable periods with mixed oscillations. Although the ACF and PACF plots show that most of the data sets are close to white noise, the plots show that five groups of parameters can be effectively used in the prediction of the ARIMA model. These five groups are used in this article to fit the ARIMA model, and a total of 55,875 data sets are trained. Moreover, for each data set, we select the model with the smallest AIC value after training.

AIC (Akaike information criterion) is a commonly used test standard for the prediction performance of ARIMA models. The expression for AIC is as follows:

$$\mathrm{AIC} = -2\ln(L) + 2N, \tag{14}$$

where L represents the maximized value of the likelihood function and N represents the number of parameters.

The AIC standard was proposed by the Japanese statistician Akaike and is named after him. To evaluate the performance of the ARIMA model with the AIC standard, the maximum likelihood value and the number of model parameters are used to judge its prediction effect. Specifically, the larger the maximum likelihood value, the better the prediction effect; theoretically speaking, the more model parameters are set, the easier it is to fit the relationships in the data and the better the fit will be. However, too many parameters also complicate the model structure, which may lead to more difficulties in parameter estimation, thereby reducing the model's prediction accuracy. Therefore, the ideal ARIMA model should be the optimal combination of maximum likelihood and number of parameters. The AIC standard takes both indicators into account and can therefore evaluate the ARIMA model comprehensively. Accordingly, when optimizing the ARIMA model, the parameters with the smallest AIC value are selected.

If the ARIMA model is used to predict future data, the generated data follow the ARIMA model. In other words, the underlying process generating the time series is assumed to have only a linear correlation structure, and the nonlinear relationships in the experiment data cannot be described. The ARIMA method therefore has certain limitations in predicting complex real-world problems. In this regard, the NN model can be employed to analyze the nonlinear parts that the ARIMA model cannot deal with.

After fitting the ARIMA model to the linear part of the data, this article generates a new data set by calculating the residual values of the remaining nonlinear part over every 21 time steps, as shown in Figure 5. Since the input is the nonlinear residuals left by the ARIMA model, the residual distributions of the X and Y data sets all fall
between 0 and 1. The newly generated X and Y data sets will be used as the input of the subsequent nonlinear LSTM model for training.

3.3.2. Forecast Design Based on LSTM Stock Price Correlation Coefficient. (1) Data Selection and Acquisition: After the ARIMA model processes the linear part of the pairs formed from the 150 randomly generated assets, the remaining nonlinear part is calculated as the residual value and used as the input of the LSTM model, as shown in Figure 5.

The input data set of the LSTM model is also divided into X and Y train sets, X and Y development sets, and two X and Y test sets, test set 1 and test set 2. The input data are stored in the X and Y data sets as shown in Figure 6. Each X data set is a 55,874 × 20 matrix, and each X time series corresponds to a Y value.

[Figure 5: Residual data distribution of training set. x-axis: residual, from −4 to 4; y-axis from 0.0 to 0.6.]
[Figure 6 residue: raw data with start days d1, d21, d41, d61, d81 (5 sections); stocks A and B over T1 to T2400 in 100-step windows; 11175 pairs; output ŷ.]

(2) Training for LSTM Model: The model structure constructed in this paper is an improved LSTM model based on RNN, which contains 25 units. The final outputs of the cells are combined into a single value by a fully connected layer. This value is then output as the final predicted value through a doubled tanh activation function, which can be simply understood as the tanh function magnified by two times. Figure 7 shows the simplified architecture of the method.

3.4. Prediction Results Analysis

3.4.1. Forecasting Performance Evaluation. This paper aims to fit the parameters of the model so that the optimal parameters can be applied to predict various assets in different time periods. Therefore, only the first window is trained, and the trained model is then applied to the data of the three other time intervals: the validation set and the two test sets. In addition, when the correlation coefficient predictions of the model in the two test periods are relatively ideal, some classic financial prediction models are selected for comparison with the model in this article. The MSE and MAE values of four financial models are calculated in this article.
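The loss functions used throughout the evaluation have standard definitions, sketched below in plain Python. The function names and the sample values are illustrative assumptions introduced here; they are not data from the experiments.

```python
import math


def _check(actual, predicted):
    """Reject empty inputs and inputs of different lengths."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mse(actual, predicted):
    """Mean squared error: average of the squared differences."""
    _check(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error: average of the absolute differences."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Made-up correlation coefficients, used only to exercise the functions.
actual = [0.2, 0.4, 0.1]
predicted = [0.1, 0.5, 0.1]
print(mse(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
```

Because RMSE is the square root of MSE, the two always rank models identically; MAE can rank them differently because it penalizes large errors less heavily.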
the S&P 500 index component stocks in the linear aspect as the first step, and then the nonlinear residual values left by the first step are used as the input data of the LSTM model. Finally, the model is established, trained, and tested. The final prediction results of the correlation coefficients between the 150 randomly generated asset portfolios and the S&P 500 index over the next 20 time steps are shown in Figure 8.

[Figure 8: Prediction results of correlation coefficient. x-axis: time step, 0 to 20; y-axis: correlation coefficient, −0.2 to 0.6.]

3.5. Control Group Forecasting Model. Predicting the results by the hybrid model alone is not enough to show that the model has advantages in forecasting research objects such as correlation coefficients. In order to compare the proposed hybrid model with other models on the accuracy of financial sequence forecasting, other commonly used forecasting models are introduced as the reference group. Many studies have shown that the full-sequence model performs poorly in predicting financial sequences, so three other commonly used prediction models are also discussed and compared with the prediction results of the hybrid model.

3.5.1. Full-Sequence Model (FS). Adopting the full-sequence algorithm is the easiest way to estimate the portfolio correlation. All the past correlation values are used in the model to predict the future correlation coefficient.

$$\rho_{ij}^{(t)} = \rho_{ij}^{(t-1)}. \tag{15}$$

direction. In order to quantify the volatility of assets and market returns, it is necessary to specify the market returns themselves. This specification is called the "market model."

$$R_{i,t} = \alpha_i + \beta_i R_{m,t} + \varepsilon_{i,t}, \tag{17}$$

where $R_{i,t}$ represents the return of asset i at time t; in the same way, $R_{m,t}$ represents the return of the market m at time t; $\alpha_i$ represents the excess return of asset i after risk adjustment; $\beta_i$ represents the sensitivity of asset i to the market; and $\varepsilon_{i,t}$ represents the residual return of asset i at time t, also called the error term. So there is

$$E(\varepsilon_i) = 0, \qquad \mathrm{Var}(\varepsilon_i) = \sigma_{\varepsilon_i}^2, \qquad \mathrm{Cov}(R_i, R_j) = \rho_{ij}\sigma_i\sigma_j = \beta_i\beta_j\sigma_m^2, \tag{18}$$

so that

$$\rho_{ij}^{(t)} = \frac{\beta_i \beta_j \sigma_m^2}{\sigma_i \sigma_j}.$$
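As a rough illustration of how the market model yields a correlation estimate, the sketch below estimates each β by ordinary least squares and then applies ρ_ij = β_i β_j σ_m² / (σ_i σ_j), which follows from equation (18). The helper names and the sample returns are hypothetical, and OLS estimation of β is an assumption made here; the paper does not state how β is estimated.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance of two equally long return series (needs >= 2 points)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def beta(asset, market):
    """OLS slope of asset returns on market returns: Cov(R_i, R_m) / Var(R_m)."""
    return covariance(asset, market) / covariance(market, market)


def single_index_correlation(r_i, r_j, r_m):
    """Correlation implied by the market model: beta_i beta_j sigma_m^2 / (sigma_i sigma_j)."""
    var_m = covariance(r_m, r_m)
    sigma_i = math.sqrt(covariance(r_i, r_i))
    sigma_j = math.sqrt(covariance(r_j, r_j))
    return beta(r_i, r_m) * beta(r_j, r_m) * var_m / (sigma_i * sigma_j)


# Hypothetical daily returns, used only to exercise the functions.
market = [0.010, -0.020, 0.015, 0.005, -0.010]
asset_a = [0.018, -0.045, 0.030, 0.012, -0.015]
asset_b = [0.004, -0.012, 0.009, 0.001, -0.007]
print(single_index_correlation(asset_a, asset_b, market))
```

With OLS betas the expression equals the product of each asset's correlation with the market, so two assets that are exact linear functions of the market return get an implied correlation of +1 or −1.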
[Figure: training curves over 0 to 350 epochs; y-axis label: Mean Absolute Error. Upper panels: TRAIN_MAE and DEV_MAE; TRAIN_MSE and DEV_MSE. Lower panels: TEST1_MAE and TEST2_MAE; TEST1_MSE and TEST2_MSE.]
accuracy of the ensemble method has been improved, and the model can be extended to other applications of stock market prediction.

4. Conclusion

First, the two single models have good applicability to one-dimensional data. Loss functions are used to evaluate the prediction results of the proposed models, and we found that both the ARIMA and LSTM models have low loss function values in stock index prediction. Comparing the loss function values of all methods indicates that the LSTM model is superior to the ARIMA model on all three loss function indexes. Moreover, the prediction accuracy of the ARIMA-LSTM hybrid model is better than that of the other financial models. In this paper, we proposed a hybrid ARIMA-LSTM model in which linearity is filtered out by ARIMA modeling and nonlinear trends are predicted by an LSTM recurrent neural network. The loss function test results show that the MSE, MAE, and RMSE of the ARIMA-LSTM hybrid model are smaller than those of the other control models. Therefore, the ARIMA-LSTM model is feasible for predicting the correlation coefficients used in portfolio optimization. Although the prediction results in this paper are basically consistent with the results expected before the experiment, time series before 2010 are not considered, because only data after 2010 are selected. Therefore, the model's ability to predict the special financial situations before 2010 needs to be further tested. Moreover, as financial anomalies and noise are common, not all special trends can be covered by the model. Therefore, as a next step, researchers need to further study how to deal with black swan events in the financial world.

Data Availability

The experimental data of this research are available from the corresponding author upon request.

Conflicts of Interest

All the authors declared that they have no conflicts of interest regarding this study.

References

[1] Z. Bao and C. Wang, "A multi-agent knowledge integration process for enterprise management innovation from the perspective of neural network," Information Processing & Management, vol. 59, no. 2, Article ID 102873, 2022.
[2] S. Deng, X. Huang, J. Shen, H. Yu, and C. Wang, "Prediction and trading in crude oil markets using multi-class classification and multi-objective optimization," IEEE Access, vol. 7, no. 99, p. 1, 2019.
[3] G. Caginalp and G. Constantine, "Statistical inference and modelling of momentum in stock prices," Applied Mathematical Finance, vol. 2, no. 4, 1995.
[4] T. Zheng, J. Farrish, and M. Kitterlin, "Performance trends of hotels and casino hotels through the recession: an ARIMA with intervention analysis of stock indices," Journal of Hospitality Marketing & Management, vol. 25, no. 1, pp. 49–68, 2016.
[5] B. Yang, C. Li, D. Wang, and X. He, "Research on the Risk of Shanghai Composite Index Based on VaR and GARCH Model," in Proceedings of the 2017 3rd International Conference on Economics, Social Science, Arts, Education and Management Engineering (ESSAEME 2017), Huhhot, China, January, 2017.
[6] M. Kim and H. Sayama, "Predicting stock market movements using network science: an information theoretic approach," Applied Network Science, vol. 2, no. 1, p. 35, 2017.
[7] M. Khashei and Z. Hajirahimi, "A comparative study of series arima/mlp hybrid models for stock price forecasting," Communications in Statistics-Simulation and Computation, vol. 48, no. 9, pp. 2625–2640, 2019.
[8] I. Unggara, A. Musdholifah, and K. S. Anny, "Optimization of ARIMA forecasting model using firefly algorithm," IJCCS (Indonesian Journal of Computing and Cybernetics Systems), vol. 13, no. 2, 2019.
[9] S. Siami-Namini and A. S. Namin, "Forecasting Economics and Financial Time Series: ARIMA vs. LSTM," Papers, 2018, https://arxiv.org/abs/1803.06386.
[10] A. Jayanth Balaji, D. S. Harish Ram, and B. B. Nair, "Applicability of deep learning models for stock price forecasting an empirical study on BANKEX data," Procedia Computer Science, vol. 143, pp. 947–953, 2018.
[11] T. Joo II and C. Seung-Ho, "Stock prediction model based on bidirectional LSTM recurrent neural network," Journal of Korea Institute of Information, Electronics, and Communication Technology, vol. 11, no. 2, pp. 204–208, 2018.
[12] X. Liang, Z. Ge, L. Sun, M. He, and H. Chen, "LSTM with wavelet Transform based data preprocessing for stock price prediction," Mathematical Problems in Engineering, vol. 2019, Article ID 1340174, 8 pages, 2019.
[13] S. Borovkova and I. Tsiamas, "An ensemble of LSTM neural networks for high-frequency stock market classification," SSRN Electronic Journal, vol. 01, 2018.
[14] S. Chen and L. Ge, "Exploring the attention mechanism in LSTM-based Hong Kong Stock price Movement Prediction," Taylor & Francis Journals, Milton Park, UK, 2019.
[15] G. P. Zhang, "Time series forecasting using a hybrid ARIMA and neural network model," Neurocomputing, vol. 50, 2003.
[16] C. Narendra Babu and B. Eswara Reddy, "Prediction of selected Indian stock using a partitioning-interpolation based ARIMA-GARCH model," Applied Computing and Informatics, vol. 11, no. 2, pp. 130–143, 2015.
[17] Y. Baek and H. Y. Kim, "ModAugNet: a new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module," Expert Systems with Applications, vol. 113, no. DEC, pp. 457–480, 2018.
[18] H. Y. Kim and C. H. Won, "Forecasting the volatility of stock price index: a hybrid model integrating LSTM with multiple GARCH-type models," Expert Systems with Applications, vol. 103, pp. 25–37, 2018.
[19] B. Chen, J. Zhong, and Y. Chen, "A hybrid approach for portfolio selection with higher-order moments: empirical