Paper 3
Energy Reports
journal homepage: www.elsevier.com/locate/egyr
Research paper
Article info

Article history:
Received 10 May 2022
Received in revised form 22 August 2022
Accepted 28 September 2022
Available online 18 October 2022

Keywords:
Peak load forecasting
Support vector machine
Adaptive differential evolution
Multivariate empirical mode decomposition
Convergence rate

Abstract

Peak load forecasting plays an integral part in the planning and operation of energy plants, helping utility companies and policymakers devise reliable and stable power infrastructure. However, the electricity load profile is a complex signal because consumer behavior is non-linear and stochastic. Therefore, a robust forecasting model with strong abilities to capture stochastic and non-linear behavior is required to estimate the demand capacity accurately. To handle these uncertainties, this paper proposes a hybrid model that integrates multivariate empirical mode decomposition (MEMD) and an adaptive differential evolution (ADE) algorithm with a support vector machine (SVM). MEMD decomposes the multivariate data over time, effectively extracting the unique information from exogenous variables at different time frequencies while keeping computational efficiency high. The ADE algorithm obtains and tunes appropriate parameters for the SVM model, avoids becoming trapped in local optima, and returns accurate forecasting results. Consequently, the proposed MEMD-ADE-SVM forecasting model simultaneously achieves good accuracy (93.145%), stability, and convergence rate. A historical load dataset from the independent system operator New England (ISO-NE) energy sector is analyzed to verify the MEMD-ADE-SVM hybrid model. The results show that, for day-ahead and week-ahead electricity peak load forecasting, the developed MEMD-ADE-SVM model outperforms the following benchmark frameworks in terms of accuracy, stability, and convergence rate: an SVR-based model hybridizing variational mode decomposition, the chaotic mapping mechanism, and the grey wolf optimizer (VMD-SVR-CGWO); an SVM based on data preprocessing and the whale optimization algorithm (DCP-SVM-WO); an intelligent optimized SVR model based on variational mode decomposition and the fast Fourier transform (VMD-FFT-IOSVR); an SVR model based on multivariate empirical mode decomposition and particle swarm optimization (EMD-SVR-PSO); and MEMD-ADE-LSTM.
© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).
https://doi.org/10.1016/j.egyr.2022.09.188
2352-4847/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
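To make the abstract's idea of an evolutionary search over the three SVM parameters more concrete, the following minimal sketch runs a classic DE/rand/1/bin loop in pure Python. It is not the ADE variant proposed in this paper. The function names, the toy objective standing in for a validation error over (C, ε, γ), the search bounds, and the control settings are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def differential_evolution(
    objective: Callable[[List[float]], float],
    bounds: Bounds,
    pop_size: int = 20,
    scale: float = 0.6,
    crossover: float = 0.9,
    generations: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Classic DE/rand/1/bin minimiser (illustrative, not the paper's ADE)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current parent i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    gene = pop[a][d] + scale * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    gene = min(max(gene, lo), hi)  # clip to the search box
                else:
                    gene = pop[i][d]
                trial.append(gene)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:  # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


# Hypothetical optimum for (C, epsilon, gamma); assumed, not taken from the paper.
TARGET = [10.0, 0.1, 0.5]


def toy_error(params: List[float]) -> float:
    """Toy stand-in for a validation error such as MAPE."""
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))


best_params, best_error = differential_evolution(
    toy_error, bounds=[(0.1, 100.0), (0.001, 1.0), (0.01, 5.0)]
)
```

In a real setting the toy objective would be replaced by the error of an SVM trained with the candidate parameters, which is far more expensive to evaluate; that cost is one reason the convergence rate of the optimizer matters.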
M. Zulfiqar, M. Kamran, M.B. Rasheed et al. Energy Reports 8 (2022) 13333–13352
algorithms. DE, modified DE, and enhanced DE have often been used in the economic-environment dispatch problems of energy systems (Santos Coelho and Mariani, 2006; Yuan et al., 2008, 2009). DE has yielded more satisfactory solutions than other evolutionary algorithms such as the GA or PSO. Owing to SVM's extensive theoretical foundations and inference capabilities (Li et al., 2020; Zhang et al., 2021; Al-Musaylh et al., 2018b; Hafeez et al., 2019), this study uses SVM as the modeler. Furthermore, we employ adaptive differential evolution (ADE) to optimize the SVM's hyperparameters and thereby improve forecast accuracy. However, it is worth stating that the priority of this study is to tackle the uncertainties in the load data and to establish the effectiveness of the designed hybrid framework, rather than to evaluate variant implementations based on other optimization methods such as a genetic algorithm (GA) (Moazzami et al., 2013), fruit fly optimization (FFO) (Zhang et al., 2018), comprehensive learning particle swarm optimization (CLPSO) (Hu et al., 2014), and modified artificial bee colony (MABC) (Li et al., 2015) for fluctuations in energy demand.
This study refers to the devised novel MEMD-based hybrid model as MEMD-ADE-SVM. In brief, the multivariate channels are first input into the devised MEMD-ADE-SVM framework and decomposed simultaneously by the MEMD technique. After that, three hyperparameters of the SVM are optimized by ADE. Finally, the ADE-based SVM is used to build a model and forecast each component extracted by the MEMD approach. The actual load dataset from the ISO New England (ISO-NE) energy sector is used to assess the performance of the presented framework and of the comparative frameworks. The predictions of all components are combined to obtain the final forecast. The analysis shows that the developed hybrid model accurately estimates the day-ahead and week-ahead PLF.

1.1. Real contributions

A robust hybrid day-ahead modeling framework for PLF, MEMD-ADE-SVM, has been devised to effectively combine stochastic scheduling with multi-dimensional input variables and to minimize the losses caused by underestimation or overestimation in energy systems. Although several studies have analyzed forecasting reliability and significance, no studies devoted to PLF that use MEMD for peak load data decomposition together with SVM have been documented in the literature, as evidenced by Table 1. The real contributions of this study are outlined as follows.

1. A transition towards hybridization: A new MEMD-ADE-SVM framework is developed that integrates the MEMD and ADE algorithms with SVMs. MEMD allows multivariate data decomposition to efficiently capture unique information between related variables at different time frequencies during multivariate decomposition over time. The ADE system proactively selects and adjusts the SVM hyperparameters to enhance prediction accuracy and stability while boosting the rate of convergence. The hybridization of the MEMD and ADE algorithms makes it possible to apply SVM technology effectively. Unlike other decomposition methods, the proposed MEMD system decomposes historical loads and meteorological variables simultaneously. This dynamically manages the non-linearity and non-stationarity of peak loads, efficiently retrieving various features from different levels of time frequencies that are relevant to accurate forecasting of the day-ahead and week-ahead peak load.
2. Appropriate hyperparameter modification and adaptation through the ADE technique: The SVM framework has three hyperparameters: the ε-insensitive loss function parameter (ε), the kernel function parameters, and the parameter that represents the trade-off between training errors and function flatness. These variables strongly affect the stable performance of SVMs in PLF. Selecting and modifying these variables for accurate and consistent execution is challenging. The presented ADE algorithm is merged with the SVM model to address the problem of hyperparameters that are difficult to tune. By combining the ADE algorithm with the SVM model, optimal hyperparameter selection and adjustment are achieved.
3. Innovative performance evaluation criteria: Four typical statistical metrics are used: mean absolute percentage error (MAPE), directional accuracy (DA), root mean square error (RMSE), and R-squared (R2). These four evaluation criteria can be used as a baseline for energy system decision-making. The purpose of the evaluation criteria is to test the framework's effectiveness and confirm its applicability. Furthermore, two statistical tests, the analysis of variance (ANOVA) (Xiong et al., 2014) and the Diebold–Mariano (DM) test (Li et al., 2020), validate the MEMD-ADE-SVM hybrid model against other competing models in forecasting reliability and accuracy while identifying significant differences on the testing datasets.

1.2. Design goals

The designed MEMD-ADE-SVM framework aims to forecast day-ahead and week-ahead peak loads efficiently and accurately. To do this, we must process the raw data, identify the appropriate features, and carefully tune the classifier. As a result, the metrics listed below are indispensable for assessing the performance of the presented system.

• Accuracy and convergence rate: These are the core goals of our devised framework.
• Dimensional reduction rate: In this devised framework, the performance of MEMD directly influences the accuracy of classification.
• Time-efficiency: Applied in PLF, the framework should run fast.

1.3. Paper organization

The rest of the paper is organized as follows. The literature survey is discussed in Section 2. Section 3 explains the devised MEMD-ADE-SVM model and the methods used in it. Section 4 explains the research formulation by describing the dataset, performance metrics, and experimental implementation, while Sections 5 and 6 present the simulation results and discussion, respectively. Finally, Section 7 concludes this work by outlining limitations and potential future directions.

2. Literature survey

STLF usually covers the hourly forecasting period and is essential for grid decision-making. Statistical and ML models are generally used in the STLF literature. To better understand the current STLF models, they are split into two sub-streams: models with univariate time series load data and models with multivariate time series load data.
Researchers note that hybrid models that use time–frequency analysis are considered promising because of the advantages of a data pre-filtering approach in extracting the unique characteristics of time-series data. Time–frequency studies determine the affinity between the most effective physical quantities (time and frequency domain). Hybrid systems extend the current paradigm of modeling decomposed ensembles by relying on the actual properties of time series data. This is a step beyond the above systems. Moazzami, Khodabakhshian (Moazzami et al., 2013) use
Table 1
Brief summary of the recent and relevant literature, considering frameworks, objectives, limitations, advantages, and performance metrics (accuracy, convergence rate, and computational complexity).

STLF framework | Objectives | Time scale | Limitations | Advantages | Accuracy | Convergence rate | Computational complexity
SVM-ANN with K-Medoids clustering (Haq et al., 2020) | Peak load forecasting | Short-term (30 min) | Improved accuracy with complex framework | The framework has large complexity | High | Low | High
Weather information based ELF of a bulk power system (Kazemzadeh et al., 2020) | Accuracy improvement for effective performance of bulk power system | Daily | This framework is suitable and quite effective only for bulk power system | Performance increased by integrating exogenous parameters | High | Low | High
ANN-based forecasting framework (Heydari et al., 2020) | Accuracy improved by reducing RMSE | Daily | Accuracy achieved at the expense of convergence | Convergence is decreased due to sigmoid function and model complexity | High | Low | High
A big data approach for ELF (Hu et al., 2014) | Forecast accuracy improvement for scalable models | Daily | The framework has complex structure and slow convergence rate | Forecast accuracy is improved at the expense of high complexity | High | Low | High
Intelligent model for ELF based on SVM and FFI algorithm (Khalid and Javaid, 2020) | Distribution energy generation forecasting and structure analysis | Daily | The framework is designed for short horizon of prediction | High accuracy and better generalization is achieved at the cost of framework complexity | High | Low | High
LSTM-RNN based LF (Amjady et al., 2010) | Accuracy improvement to facilitate the residential consumers | Hourly | Proposed framework improved forecasting accuracy | Accuracy is improved while convergence is compromised | High | Low | High
ANN based DE-PSO (Sakurai et al., 2019) | Day ahead forecasting considering outliers | Daily | Accuracy increased with compromising convergence rate | Improved accuracy in comparison with traditional PSO | High | Low | High
ANN's wavelet decomposition (WLD) and GA optimization for PLF with univariate time series meteorological data to decompose the load time series, taking into account the low- and high-frequency dimensions of the ANN. By determining the detection target, the prediction accuracy has been improved for grasping complex features at different frequencies. However, pre-selecting the basis function wavelets is also very complicated. In recent years, empirical mode decomposition (EMD) has gained increasing attention as a way to overcome the defects of wavelets (Huang et al., 1998). For example, Al-Musaylh and Deo (Al-Musaylh et al., 2018b) used an improved empirical adaptive noise mode decomposition combined with SVR optimized by a two-phase PSO approach for PLF using univariate time series load data. Recently, EMD studies have developed theoretical treatments for bivariate (Rilling et al., 2007), trivariate (ur Rehman and Mandic, 2009), and multivariate (Rehman and Mandic, 2010) patterns, respectively. Multivariate EMD (MEMD) is an ingenious, adaptable, multiscale decomposition approach for multivariate data stemming from EMD (Huang et al., 1998; Rehman and Mandic, 2010). EMD and MEMD handle non-linear, inconsistent, uncertain, unstable, and unsteady signals and can decompose the original data into intrinsic mode functions (IMFs). However, as mentioned above, various aspects such as environment variables, days of the week, holidays, and consumer social interactions affect the energy load (Hu et al., 2015b,a). The complexity of the energy system suggests that it is not enough to consider only the past load, as univariate PLF does. This study deals with nonlinear and transient load sequences and their influencing factors, considering the usefulness of MEMD. MEMD was proposed by Rehman and Mandic to decompose multivariate load series simultaneously for more accurate predictions (Rehman and Mandic, 2010). The results show that MEMD has improved performance compared to EMD. Nevertheless, it is difficult to estimate all the aspects that influence load. Temperature is a meteorological factor, based on comprehensive literature research (Hu et al., 2015a; Sobhani et al., 2020; Selakov et al., 2014; Jang et al., 2020). In this study, historical data consisting of load and temperature are considered as input variables to reduce the computational cost of the hybrid models. The MEMD algorithm decomposes the multidimensional input variables to extract IMFs with similar frequencies and multidimensional residuals. PLF is the output of the predictive target. Researchers consider ML hybrid approaches to be potent techniques for dealing with the non-stationary, non-linear, and transient characteristics of peak loads, given the inadequate predictive control of univariate time series and the inherent limitations of individual models. Table 1 shows various short-term PLF models.
The emerging hybrid and integrated predictive models are intelligent solutions that fully exploit the preferred features of single models to provide excellent efficiency (Raza and Khosravi, 2015; Yu et al., 2019). For example, a hybrid framework based on a wavelet neural network (WNN) and improved differential evolution (IDE) is devised for PLF (Liao, 2014). The applicability of the presented framework is confirmed by comparing it in practice with other frameworks such as ANN-GA, ANN evolutionary programming (ANN-EP), and ANN-PSO. The authors developed a combined model of repulsion PSO (RPSO) and the SVM algorithm for PLF (Dai and Zhao, 2020). The presented composite framework is validated on Singapore's historical data and compared to traditional training frameworks in terms of accuracy. A hybrid framework combining a nonlinear AR model, GA, and an exogenous NN is devised for STLF (Jawad et al., 2018). The proposed framework is optimized using statistics and pattern recognition-based schemes. GA is used to select the weights and biases for NN training. The framework is validated by comparison with existing mean and regression tree models with external inputs. The author presented a resilient STLF model with an automated data preprocessing and prefiltering strategy for ELF of distribution feeders (Huyghues-Beaufond et al., 2020). The previous
day's building level LF model was proposed based on DL (Cai et al., 2019). The devised DL model is validated against the accuracy of established models. The novel hybrid model VMD-LSTM-BO has been developed (He et al., 2019). This model is considered superior to current models in stability and accuracy. In Wu et al. (2019), the authors proposed an improved hybrid framework based on GNRR and the multi-purpose cuckoo search algorithm (CSA). The developed model used real-time load data from the Australian energy market operator (AEMO) to evaluate forecast accuracy against the benchmark frameworks. The author developed a Neural Elman (NE) network-based forecast engine to forecast the future load of the SG. The proposed intelligent optimization algorithm optimally adapts the biases and weights of this network to obtain precise forecasts (Vrablecová et al., 2018). The authors provided an STLF model based on SSVR (Li et al., 2018). The foremost objective is to enhance the efficacy and accuracy of comparative forecasts. The output of the forecasting module is passed through the optimization engine, and the accuracy and efficiency of relative predictions are enhanced by fine-tuning the parameters. However, it improves forecast accuracy at the expense of computational complexity. The developed framework attains higher accuracy than other frameworks. Besides, an integrated SVR algorithm and chaotic krill herd (CKH) framework is proposed for load forecasting time series (Zhang et al., 2020). However, the results acquired are neither stable nor accurate. A hybrid framework based on SVR and DE is designed to improve prediction accuracy by modifying the SVR's hyperparameters (Zhang et al., 2016). The proposed model performs better than typical regression models, SVR, ANN, and back-propagation (BP) ANN models. A model combining fruit fly (Ff) optimization and SVR was developed in Cao and Wu (2016) to solve the problem of parameter selection and improve the accuracy of PLF. Also, a new method hybridizing SVR with the firefly optimization (FFO) is developed in Kavousi-Fard et al. (2014) and Xiao et al. (2016) to ensure accurate PLF by optimally tuning the hyperparameters.
The aforementioned hybrid frameworks can be deemed promising and practical in enhancing forecast accuracy by suitably modifying hyperparameters. However, the authors of these articles concentrate either on optimizing bias initialization and random weights or on appropriately adjusting and selecting hyperparameters. Also, none of these models considered accuracy, rate of convergence, and stability simultaneously. From extensive analysis and investigation, we inferred that addressing only one factor (bias initialization and random weight optimization, or appropriate hyperparameter setting and selection) and only one criterion (convergence rate, accuracy, or stability) is insufficient. Therefore, a robust hybrid model is needed to overcome the problems of current models while improving predictive accuracy and stability with fast convergence rates.
From the recent relevant work above, we can draw three conclusions: (i) No predictive model is perfect and versatile in all respects, but some frameworks are appropriate for some objectives and conditions. (ii) There are problems with overfitting. Overfitting means that the model performs well in training but poorly in prediction. (iii) There is a trade-off between the prediction accuracy and the convergence rate: increasing prediction accuracy affects the convergence rate, and the reverse is also true.
Several studies have compared forecasting models and identified the top-performing models for electricity PLF. However, most of these have targeted distinct areas. No analysis involving established or recently developed hybrid models has concentrated exhaustively on PLF. Hence, through the comparative analysis of diverse high-level forecasting frameworks, this research can present more complete and longer-term trends in procedure and policy than previous research. It can be confidently inferred from the literature that vast improvement has been made in ELF for energy management. However, the existing techniques perform poorly when dealing with big data. Their control parameters are hard to tune, which results in high computational complexity and slow convergence because redundancy, irrelevancy, and dimensionality reduction are not addressed. Moreover, the literature mentioned above does not simultaneously cater to forecast accuracy and convergence rate. A fast and accurate framework is needed to resolve such concerns. In Shiri et al. (2015), the integration of the gradient descent (GD) algorithm with an SVM-based framework is devised. This framework has high computational complexity and has difficulty converging. Some authors focus on feature selection algorithms such as traditional classifier decision trees (DTs) and artificial neural networks (ANN) (Jiang et al., 2016). However, DTs suffer from overfitting, which means that a DT performs excellently in training but poorly in prediction. The ANN has restricted generalization capacity and its convergence is difficult to control. In Wang et al. (2017), the authors framed a hybrid feature selection, extraction, and classification-based model for ELF. However, this method has high system complexity and converges poorly. Therefore, a new hybrid forecasting model has been devised. The developed framework strives to establish high-quality daily to weekly load forecasts with a relatively high convergence rate for SG decision-making.

3. Proposed MEMD-ADE-SVM framework

This work proposes a novel hybrid framework based on the MEMD, SVM, and ADE algorithms for day-ahead and week-ahead PLF. The devised model has three main parts: (i) an MEMD-based data decomposition module, (ii) a forecaster based on SVM, and (iii) an optimizer based on the ADE algorithm.
The whole workflow of the devised MEMD-ADE-SVM framework is presented in Fig. 1. In this study, there are three key goals for PLF:
• The first is simultaneous preprocessing of multidimensional time series using a time–frequency process. This can significantly improve the forecasting of future peak loads relative to a univariate approach.
• The second is to avoid overfitting, train the developed model with excellent generalization capabilities, and reduce the computational complexity of modeling and prediction.
• Finally, we need to consider the optimal modeling parameters to ensure the optimal solution for our PLF.

The following are the relevant steps of the proposed ADE-SVM based framework, as illustrated in Fig. 1:
Step 1: Using the MEMD, the multivariate channel Y = {temperature, valley, mean, peak} is initially decomposed into m multivariate Imf components and a residue rf. For high accuracy, time-domain studies can take into account the possible characteristics of factors that affect energy load patterns. In contrast, frequency-domain analysis allows us to search more precisely for features specific to our dataset. Fig. 3 shows a time-domain study applied to a real-time dataset to capture trends in power load. When MEMD and the time–frequency method are used to decompose time-domain signals into a finite number of waves of different frequencies, these irregular signals become stable, predictable signals of different frequencies rather than interrelated variables, so information can be extracted and predictions made effectively.
Step 2: ADE is used to optimize the hyperparameters of the SVM. Three SVM parameters, namely (C, ϵ, γ), are carefully tuned against MAPE on the training sets to ensure SVM prediction accuracy. Before analyzing and predicting each constituent with SVM, the characteristics of Imf and rf derived from the original signal Y are
investigated to find the best SVM settings. The developed ADE is a random search heuristic method based on population differences. It is an innovative and efficient technique that was developed on the basis of the GA. The function of the ADE algorithm is to optimize the SVM parameters, that is, to identify the optimal parameter combination (C, ϵ, γ) so that the SVM model performs best in classification. The SVM prediction model uses the SVR parameters that correspond to the best global solution.
Step 3: SVM is used in this study to build the system and forecast each element retrieved using the MEMD approach, with the optimum SVM hyperparameters found by ADE. The prediction of each element extracted in Step 1 is thus obtained, and these predictions must then be recombined to obtain the final energy PL estimate.
In a nutshell, the multivariate channels are first input to the proposed MEMD-ADE-SVM model and simultaneously decomposed using MEMD. Next, ADE optimizes the three hyperparameters of the SVM. Finally, the ADE-based SVM is used to build the framework and forecast each feature extracted by the MEMD method. The predictions for all features are combined to obtain the final prediction.

3.1. MEMD based decomposition module

2010). The conventional EMD approach can decompose a complicated univariate load series into a finite set of Imf components and a residue rf. However, different load time series (TS) may yield a different number of Imf elements, and the components of different TS may not always correspond to the same frequency. Matching the differing Imf components obtained from different TS is computationally costly. To overcome these inherent weaknesses of EMD, the MEMD method is applied in this study to improve forecasting accuracy while significantly lowering the computational cost. MEMD makes a significant contribution by estimating the local mean of n-dimensional signals.
In this work, four input variates are considered for day-ahead and week-ahead PLF: the mean peak load, valley load, temperature, and peak load. The proposed MEMD can simultaneously decompose the p-variate inputs Y into k multivariate IMFs (Imf(q)) and a multivariate residue (rk(q)). Y is presented in Eq. (1):

\[ Y = \{ Y_1(q), Y_2(q), \ldots, Y_p(q) \} \tag{1} \]

where each component Imf_j(q) represents \( \{ \mathrm{Imf}_j^1(q), \mathrm{Imf}_j^2(q), \ldots, \mathrm{Imf}_j^m(q) \} \) (j = 1, ..., k), while the residue rk(q) represents
Fig. 2. MEMD algorithm flow chart. Loop-1 is used to test through shifting criterion while loop-2 is used to determine through stopping criterion.
where v denotes the number of extrema of the projected signal less than 3. Eq. (4) shows that a mode is excluded when the projected signal has inadequate extrema.
5. The components Imf(q) are extracted in sequence from high to low frequency. As shown in Fig. 2, Loop 1 uses the sifting criterion to test whether the transfer function a(q) has been obtained or whether the projections of the input signals along the d-th directions must be calculated again, and Loop 2 uses the stopping criterion to determine whether r(q) is the residue or is used to obtain the next multivariate Imf(q).

Through the above decomposition procedure, the multidimensional signal \( \{Y(q)\}_{q=1}^{L} \) can be expressed as in Eq. (5):

\[ Y(q) = \sum_{j=1}^{m} \mathrm{Imf}_j(q) + r_k(q) \tag{5} \]

where m is the total number of Imf components obtained (j = 1, ..., m) and the residue is rk.
The day-ahead multivariate historical data comprising electricity load and temperature are taken from ISO-NE for the PLF work:

\[ Y(q) = \{ Y^{pl}(q), Y^{V}(q), Y^{ml}(q), Y^{\tau}(q) \}, \quad (q = 1, \ldots, L) \tag{6} \]

where each component Imf_j(q) of length L represents \( \{ \mathrm{Imf}_j^{pl}(q), \mathrm{Imf}_j^{V}(q), \mathrm{Imf}_j^{ml}(q), \mathrm{Imf}_j^{\tau}(q) \} \) (j = 1, ..., m), while the residue rk of length L represents \( \{ r_k^{pl}(q), r_k^{V}(q), r_k^{ml}(q), r_k^{\tau}(q) \} \). The result of the decomposition is shown in Fig. 3, which shows the decomposition of the ISO-NE load data using the MEMD algorithm into ten multivariate Imf elements, ordered from the highest to the lowest frequency, and one rf element.

3.2. SVM based forecaster

SVMs are generally used to address problems of non-linear and unsteady prediction. Due to its robust theoretical framework and potential for effective generalization, SVM is used as the modeler to confirm the efficacy of the MEMD-based time–frequency strategy applied to day-ahead and week-ahead PLF. In this paper, we model the classification problem mathematically as in Eq. (7):

\[ f(x, y) = \sum_{k=1}^{D} y_k Z_k(x) + \gamma \tag{7} \]
Fig. 3. Decomposition of ISO-NE load data using the MEMD algorithm resulting in ten multivariate Imf elements with higher to the lowest frequency and one rf
element.
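The additive form of Eq. (5) is what allows each component to be forecast separately and the forecasts to be summed afterwards. The short sketch below illustrates this under stated assumptions: the component values are toy numbers rather than ISO-NE data, the function names are hypothetical, and a naive persistence forecaster stands in for the tuned SVM.

```python
from typing import Callable, List, Sequence

Series = List[float]


def reconstruct(imfs: Sequence[Series], residue: Series) -> Series:
    """Eq. (5): Y(q) = sum_j Imf_j(q) + r(q), evaluated sample by sample."""
    total = list(residue)
    for imf in imfs:
        if len(imf) != len(total):
            raise ValueError("all components must share the same length L")
        total = [t + v for t, v in zip(total, imf)]
    return total


def persistence_forecast(component: Series) -> float:
    """Hypothetical stand-in forecaster: repeat the last observed value."""
    return component[-1]


def aggregate_forecast(
    imfs: Sequence[Series],
    residue: Series,
    forecaster: Callable[[Series], float] = persistence_forecast,
) -> float:
    """Forecast every component separately, then sum the forecasts."""
    return sum(forecaster(c) for c in list(imfs) + [residue])


# Toy components (illustrative numbers only, one channel shown).
imfs = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5]]
residue = [10.0, 11.0, 12.0, 13.0]

signal = reconstruct(imfs, residue)
next_value = aggregate_forecast(imfs, residue)
```

With the persistence stand-in, the aggregated forecast simply equals the last value of the reconstructed signal; replacing `persistence_forecast` with a model trained per component is where a tuned forecaster would enter.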
The objective of SVM is to define a hyperplane in the D-dimensional feature space that separates the data points. In this study, the hyperplane is defined by Eq. (7). The regularized risk function RF is then defined in Eq. (8):

\[ RF(y) = \frac{\sum_{k=1}^{D} \left| L_k^{a} - f(x, y) \right|_{\varepsilon} + \sigma \lVert y \rVert^{2}}{D}, \tag{8} \]

where σ is the feature selection regulating threshold, ε is the insensitive loss function parameter, and \( L_k^{a} \) is the targeted load consumption pattern. The parameter y must be obtained through minimization of this RF. The robust error function x is calculated in Eq. (9):

\[ x = \begin{cases} 0 & \text{if } \left| L_j^{a} - f(x, y) \right| < \varepsilon \\ \left| L_j^{a} - f(x, y) \right| & \text{otherwise.} \end{cases} \tag{9} \]

Eq. (9) employs a function to minimize Eq. (8) and can be modeled as in Eq. (10):

\[ f(x, \pi, \pi^{*}) = \sum_{k=1}^{N} (\pi_k^{*} - \pi_k) K^{*}(x, x_k) + \gamma, \tag{10} \]

where π* ≥ 0 for all values of k. K*(x, w) is the SVM kernel function that shows the multiplication of radial basis KPCA in the feature space f*, as in Eq. (11):

\[ K^{*}(x, w) = \sum_{k=1}^{D} Z_k(x) Z_k(w) \tag{11} \]

In an infinite feature space, K* eliminates the need to calculate the features Zk explicitly. By maximizing the quadratic form, π and π* can be obtained in Eq. (12):

\[ R(\pi^{*}, \pi) = -\varepsilon \sum_{k=1}^{N} (\pi_k^{*} + \pi_k) + \sum_{k=1}^{N} L_k^{a} (\pi_k^{*} - \pi_j) \cdots \]

The ELF pattern is fed into the optimizer unit, which improves accuracy by dropping other errors.

3.3. ADE

This subsystem intends to improve predictive performance even further by reducing the RF. Since the value of the RF returned by the SVM-based forecaster is the smallest within its limits, the optimizer unit is combined with the SVM-based forecasting module to minimize the RF further. The optimization module applies RF minimization as its objective function. This function depends on hyperparameters including the insensitive loss function Lf, the cost penalty CP, and the kernel K. Optimizing these hyperparameters is essential for efficient, accurate, and effective LF. Scholars have used a variety of methods to improve hyperparameters, including cross-validation, back-propagation (BP), and gradient descent (GD) (Kumar et al., 2016). However, these strategies have high dimensionality and have difficulty converging. DE is favored over these optimization methods for two reasons: (i) it avoids premature convergence, and (ii) it provides superior search ability. The authors used an efficient adaptation of DE (EDE) that was presented in Storn and Price (1997). The study in Amjady et al. (2010) is enhanced in terms of trial vector generation reliability and convergence rate. As a result, the proposed adaptive DE is used with the SVM model to optimize control parameters. The following is a brief discussion.
The trial vector (B) for the jth individual in iteration t is described in Amjady et al. (2010) and presented in Eq. (13):

\[ B_t(j, k) = \begin{cases} m_t(j, k) & \text{if } r_m(j) \le ff(m_t(j)) \\ p_t(j, k) & \text{if } r_m(k) > ff(m_t(j)) \end{cases} \tag{13} \]

where mt(j, k) represents the mutant vector and pt(j, k) represents the parent vector. In Eq. (13), ff() represents the fitness
function with a range of 0 to 1, and rm represents a random number with a range of 0 to 1. The next generation offspring $B_{t+1}(j)$ is generated based on $p_t(j)$ and $m_t(j)$, as presented in Eq. (14):

$$
B_{t+1}(j, k) = \begin{cases} B_t(j, k), & \text{if } Rf(B_t(j)) \leq ff(p_t(j)) \\ p_t(j, k), & \text{otherwise.} \end{cases} \tag{14}
$$

It is clear from Eqs. (13) and (14) that the former B influences the selection of the offspring for the following generation t + 1, which is based on the rm and ff() functions. The EDE method in Amjady et al. (2010) compares rm with ff() to update load values. This erratic updating of the load is a significant problem. This issue is addressed by eliminating the reliance of the offspring selection on the genuinely random quantity. The update of the load values is instead decided by comparing the ff of the future load value with the previous load value. As a result, the new load values become more suitable, increasing forecast accuracy. The designed adjustments to Eq. (13) are as follows:

$$
B_t(j, k) = \begin{cases} m_t(j, k), & \text{if } \dfrac{P_t(j)}{P_t(j_{max})} \leq ff(M_t(j)) \\ p_t(j, k), & \text{if } \dfrac{P_t(j)}{P_t(j_{max})} > ff(M_t(j)) \end{cases} \tag{15}
$$

The ff of the parent and mutant vectors, according to this perspective, is defined in Amjady et al. (2010) and presented in Eqs. (16) and (17):

$$
ff(M_t(j)) = \frac{\dfrac{1}{Rf(M_t(j))}}{\dfrac{1}{Rf(M_t(j))} + \dfrac{1}{Rf(P_t(j))}} \tag{16}
$$

$$
ff(P_t(j)) = \frac{\dfrac{1}{Rf(P_t(j))}}{\dfrac{1}{Rf(P_t(j))} + \dfrac{1}{Rf(M_t(j))}} \tag{17}
$$

It is assumed for the ff of Eqs. (16) and (17) that every arithmetic operation, such as division and addition, requires 1 unit of time to complete. Each of Eqs. (16) and (17) then requires 5 units of time per iteration, and the total number of iterations for the EDE method is 100, according to Amjady et al. (2010). Over these iterations, EDE calculates 1 ff in 500 time units and 2 ffs in 1000 time units. As a consequence, the ffs in Eqs. (16) and (17) are modified to minimize processing time and boost generalization ability as follows:

$$
ff(M_t(j)) = \frac{Rf(P_t(j))}{Rf(M_t(j)) + Rf(P_t(j))} \tag{18}
$$

$$
ff(P_t(j)) = \frac{Rf(M_t(j))}{Rf(P_t(j)) + Rf(M_t(j))} \tag{19}
$$

Using Eqs. (18) and (19), the approach calculates 2 ffs over 100 iterations in 400 time units. As a result, the convergence performance of the EDE method used in Amjady et al. (2010) is improved.

3.4. Proposed benchmark model MEMD-ADE-LSTM for comparison

The comprehensive flow of the devised benchmark MEMD-ADE-LSTM framework, against which the proposed model is compared, is shown in detail in Fig. 4. The LSTM-based benchmark model works as follows: LSTM is used to verify the framework and forecast each component extracted by the MEMD approach, and ADE determines the optimal control parameters of the LSTM. Hence, an improved forecast of each component extracted in Step 1 can be acquired, and these forecasts are reconstructed to obtain the final electricity PLF. In short, the multivariate channels are first fed into the MEMD-ADE-LSTM model and decomposed simultaneously using the MEMD technique. The hyper-parameters of the LSTM are then optimized by ADE. Finally, the ADE-based LSTM establishes a model and forecasts each component extracted by the MEMD technique. The estimates of each component are combined to obtain the final prediction.

4. Research formation

4.1. Datasets description

A real-world load dataset from ISO-NE (ISO, 2022) is used to validate the excellence of the devised MEMD-ADE-SVM hybrid
framework. The main reason for using these samples is that ISO-NE is an independent, not-for-profit corporation responsible for keeping electricity flowing across the six New England states. The first two-thirds of the daily historical data is used as training samples, while the rest is used to validate and test the model that estimates the PL. Table 3 shows the experimental distribution of the data sets. For the multi-dimensional actual series, consider valley load, peak load, mean load and temperature, represented as $y^{Vl}$, $y^{Pl}$, $y^{Ml}$, and $y^{T}$, respectively. Therefore, a multi-dimensional vector ($y_q$) is represented as in Eq. (20):

$$
y_q = \left\{ y_q^{Pl}, y_q^{Vl}, y_q^{Ml}, y_q^{T} \right\}, \quad (q = 1, \ldots, \tau) \tag{20}
$$

Table 3
The duration of time and the volume of the experimental data sets.
Data origin   Time span                              Size of samples   Training dataset   Testing dataset
ISO-NE        January 1, 2017 to December 31, 2020   2308              1615               692

The vector is composed of the influencing parameters that represent the input on the qth day, while $y^{Pl}_{q+1}$ reflects the result on the next day. The dataset of the proposed model is represented in Eq. (21):

$$
D = \left\{ (Y_j, Z_j) \in \left( \mathbb{R}^{4u} \times \mathbb{R} \right) \right\}_{j=u}^{\tau} \tag{21}
$$

where

$$
Y_j = \left[ y_j^{Pl}, y_j^{Vl}, y_j^{Ml}, y_j^{T}, \ldots, y_{j-u+1}^{Pl}, y_{j-u+1}^{Vl}, y_{j-u+1}^{Ml}, y_{j-u+1}^{T} \right] \tag{22}
$$

The input, which specifies the delay values of the historical data, is depicted in Eq. (22), and $Z_j = y^{Pl}_{j+1}$ is set as the output of $Y_j$. Through $Y_j$ and $Z_j$, $\tau - u + 1$ I/O pairs are developed for modeling and prediction. The embedded dimension u is picked by trial and error; Section 4.3 contains more information about it.

4.2. Performance metrics

Four statistical measures are chosen to calculate the prediction accuracy of the developed MEMD-ADE-SVM hybrid framework: mean absolute percentage error (MAPE), root mean square error (RMSE), R-squared ($R^2$), and directional accuracy (DA). The four evaluation criteria can be used as a baseline for energy system decision-making. $R^2$ is defined by Eq. (23) and is widely used to assess the goodness of fit of various benchmarks on the same test dataset. The greater the value of $R^2$, the better the predicting performance of the framework (Zhang et al., 2021; He et al., 2019). Eq. (24) defines RMSE, a term commonly used to quantify the squared error between the actual and forecasted loads. The closer the forecasted value is to the true value, the smaller the RMSE (He et al., 2020; Dewangan et al., 2020; Kumar et al., 2016). MAPE is defined in Eq. (25), and it is frequently used to compute the average absolute error between actual and predicted loads relative to the actual load. The smaller the MAPE value, the better the prediction performance of the model (Memarzadeh and Keynia, 2021; He et al., 2020; Dewangan et al., 2020). DA is expressed in Eq. (26); it measures the accuracy of the forecast direction and provides investors with the current trend (Wang et al., 2019).

$$
R^2 = 1 - \frac{\sum_{q=1}^{M} \left( X_q - \hat{X}_q \right)^2}{\sum_{q=1}^{M} \left( X_q - \bar{X} \right)^2} \tag{23}
$$

$$
RMSE = \sqrt{\frac{\sum_{q=1}^{M} \left( X_q - \hat{X}_q \right)^2}{M}} \tag{24}
$$

$$
MAPE = \frac{1}{M} \sum_{q=1}^{M} \left| \frac{X_q - \hat{X}_q}{X_q} \right| \times 100 \tag{25}
$$

$$
DA = \frac{\sum_{q=2}^{M} u_q}{M - 1} \times 100, \quad q = 2, \ldots, M \tag{26}
$$

$$
\text{s.t.} \quad u_q = \begin{cases} 1, & \text{if } \left( X_q - X_{q-1} \right) \left( \hat{X}_q - \hat{X}_{q-1} \right) \geq 0 \\ 0, & \text{otherwise} \end{cases} \tag{27}
$$

where M denotes the size of the testing dataset, $X_q$ and $\hat{X}_q$ are the actual and forecast PLs at the qth point in time, respectively, $u_q$ indicates whether the forecast direction agrees with the observed direction, and $\bar{X}$ is the mean of the actual PL. In addition, a quantitative testing strategy is used to evaluate the expected validity of the developed MEMD-ADE-SVM hybrid framework against other competing models. First, a powerful preliminary test, ANOVA, tests the null hypothesis of equivalence to determine whether there is a significant performance gap between all the models compared (Xiong et al., 2014). Secondly, the Diebold–Mariano (DM) test checks for a significant difference in predictive performance between the developed hybrid framework and other equivalent frameworks at certain probability values, as defined in Eq. (28) (Li et al., 2020). DM tests are also used to overcome the limitation of stochastic sample variance and to assess the prediction errors of a framework relative to other frameworks, which provides stability throughout the analysis.

$$
DM = \frac{l_{mean}}{l_{std}}, \quad \text{s.t.} \quad \begin{cases} F^{c} = \left[ c_1, c_2, \ldots, c_\tau \right] \\ F^{d} = \left[ d_1, d_2, \ldots, d_\tau \right] \\ l_j = (c_j)^2 - (d_j)^2 \\ l_{mean} = \dfrac{\sum_{j=1}^{\tau} l_j}{\tau} \\ l_{std} = \sqrt{\dfrac{\sum_{j=1}^{\tau} \left( l_j - l_{mean} \right)^2}{\tau - 1}} \\ j = 1, \ldots, \tau \end{cases} \tag{28}
$$

4.3. Experimental implementation

The devised MEMD-ADE-SVM hybrid framework is run in the MATLAB R2020b environment, where MEMD plays a vital role in enhancing forecasting accuracy. ADE is used to find the best SVM parameters to enhance prediction performance. ADE settings are chosen by trial and error. LIBSVM (Version 3.24), an SVM library offered by Chang and Lin (2011), is used to implement SVM. To enhance forecasting accuracy, three SVM hyperparameters, C, ε, and γ, are fine-tuned in the training phase using ADE-based hyperparameter optimization. The search space of the parameters is defined as C ∈ [0.1, 1000], γ ∈ [0.001, 1000], and ε ∈ [0.001, 0.1]. The average MAPE is used as the fitness function of ADE to produce and assess the optimal SVM parameters. The lower the MAPE value, the better the modeling and prediction of the candidate solution. The possible size of the embedded dimension is specified from 1 to 16 throughout the training process. To find the appropriate embedded dimension, prediction accuracy must be traded off against computing time in a real-world sample. As a result, the best value u = 6 is chosen, as shown in Table 4 and Fig. 5.

5. Experimental results

5.1. Simulation for data analysis

Hourly historical load data gathered from the ISO New England Control Area from 2017 to 2020, comprising more than 5000 records and freely available to the public (ISO, 2022), are used.
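The lagged input/output pairs of Eqs. (20)–(22) can be illustrated with a short sketch. It is not the authors' MATLAB implementation: the function name `build_lagged_dataset`, the column order of the input array, and the toy data are assumptions made only for illustration.

```python
import numpy as np


def build_lagged_dataset(series, u):
    """Build (Y_j, Z_j) pairs in the spirit of Eqs. (20)-(22).

    series: array of shape (n_days, 4); the column order
            [peak, valley, mean, temperature] is an assumption of this sketch.
    u:      embedded dimension (number of past days in each input vector).
    """
    series = np.asarray(series, dtype=float)
    n_days = series.shape[0]
    inputs, targets = [], []
    # Day j needs u days of history and a next-day peak load as its target.
    for j in range(u - 1, n_days - 1):
        window = series[j - u + 1 : j + 1][::-1]  # days j, j-1, ..., j-u+1
        inputs.append(window.reshape(-1))         # 4u features per sample
        targets.append(series[j + 1, 0])          # next-day peak load
    return np.array(inputs), np.array(targets)


# Toy data: 10 days x 4 channels of made-up values (not ISO-NE data).
toy = np.arange(40, dtype=float).reshape(10, 4)
Y, Z = build_lagged_dataset(toy, u=6)
print(Y.shape, Z.shape)
```

With u = 6, as selected in Section 4.3, each input vector holds 4u = 24 values. In this sketch the last available day is dropped because it has no next-day target.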
Fig. 5. Achieved trade off between forecasting accuracy and computational cost for optimal u in real world samples. The best one u = 6.
Fig. 6. Significant variation of historical load waveforms for peak loads, valley loads and average loads, and the correlation between daily mean historical loads and
mean daily temperatures.
Fig. 7. Regeneration of Imf & rf of each channel to investigate the patterns of fluctuations in energy demand inherent in evolution.
Table 6
Evaluation of the devised and the benchmark frameworks for 15 April 2020 with hourly resolution in terms of MAPE (%).
Proposed and benchmark forecasting models
Hours VMD-FFT-IOSVR DCP-SVM-WO EMD-SVR-PSO VMD-SVR-CGWO MEMD-ADE-LSTM Proposed
MAPE (%) MAPE (%) MAPE (%) MAPE (%) MAPE (%) MAPE (%)
0 2.4 2.1 1.9 1.4 1.1 0.9
1 2.3 1.9 1.8 1.5 1.2 0.7
2 2.2 2 1.7 1.6 1.3 0.8
3 2.1 1.9 1.9 1.3 1.1 0.5
4 2.1 1.8 1.6 1.4 1.4 0.9
5 2 1.8 1.8 1.3 1.3 0.7
6 1.9 1.75 1.7 1.4 1.4 0.7
7 2.7 1.7 1.5 1.3 1.5 0.6
8 3.1 1.65 1.6 1.2 1.3 0.6
9 2.5 1.6 1.5 1.3 1.3 0.8
10 2.4 1.55 1.4 1.3 1.3 0.6
11 2.6 1.5 1.9 1.2 1.2 0.5
12 2.6 2.3 1.9 1.2 1.2 0.4
13 2.7 2.4 2 1.3 1.3 0.5
14 2.7 2.5 2.1 1.7 1.1 0.6
15 2.8 2.1 2.2 1.1 1.1 0.7
16 2.8 2 1.8 1.2 1.2 0.4
17 2.9 1.8 1.7 1.7 1.2 0.3
18 2.9 1.9 1.6 1.3 1.3 0.6
19 3 1.8 1.5 1.4 1.4 0.5
20 3.1 1.8 1.8 1.5 1.1 0.4
21 3.2 1.7 1.7 1.7 1.1 0.5
22 2.7 1.6 1.6 1.6 1 0.3
23 2.8 2.2 2.2 1.3 0.9 0.2
Average 2.63 1.92 1.77 1.39 1.22 0.57
Table 7
Evaluation of the devised and the benchmark frameworks for the week time horizon of 04/16/2020 to 04/22/2020 in terms of MAPE
(%).
Proposed and benchmark forecasting models
Days VMD-FFT-IOSVR DCP-SVM-WO EMD-SVR-PSO VMD-SVR-CGWO MEMD-ADE-LSTM Proposed
MAPE (%) MAPE (%) MAPE (%) MAPE (%) MAPE (%) MAPE (%)
Monday 2.6 2 1.7 1.3 1.3 0.5
Tuesday 2.5 1.9 1.9 1.6 1.4 0.8
Wednesday 2.4 1.8 1.8 1.7 1.2 0.7
Thursday 2.7 1.5 2.1 1.7 1.3 0.5
Friday 2.8 2.2 1.5 1.2 1.4 0.4
Saturday 2.7 2 1.7 1.1 1.2 0.6
Sunday 2.7 2.1 1.8 1.1 1.1 0.5
Average 2.63 1.93 1.78 1.38 1.27 0.57
Fig. 8. Evaluation of devised and benchmark frameworks of ISO-NE energy sector hourly load dataset. (a) Day ahead forecasting; (b) Week ahead forecasting
(16-04-2020 to 22-04-2020.).
Table 8
The predictive accuracy of proposed and other frameworks in the real-world testing sets considering
historical load along-with temperature.
Models Input variables DA (%) MAPE (%) RMSE R2
Load Temperature
SVM ✓ ✓ 70.3165 4.231 443.132 0.743
LSTM ✓ ✓ 75.12 3.124 402.1 0.798
VMD-FFT-IOSVR ✓ ✓ 81.431 2.471 345.871 0.867
DCP-SVM-WO ✓ ✓ 84.231 2.141 184.241 0.8771
EMD-SVR-PSO ✓ ✓ 90.231 0.881 210.918 0.8923
VMD-SVR-CGWO ✓ ✓ 91.227 0.823 180.918 0.9013
MEMD-ADE-LSTM ✓ ✓ 92.227 0.819 178.218 0.9231
Proposed ✓ ✓ 93.145 0.786 112.147 0.9612
Fig. 9. Forecast accuracy of the devised and other benchmark frameworks in real world testing data.
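The four columns of Table 8 follow the definitions in Eqs. (23)–(27). A minimal sketch of these metrics is given below; the function names and the sample numbers are hypothetical and serve only to show how each quantity is computed.

```python
import numpy as np


def r_squared(actual, forecast):
    """Eq. (23): one minus residual sum of squares over total sum of squares."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    ss_res = np.sum((actual - forecast) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rmse(actual, forecast):
    """Eq. (24): root of the mean squared error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))


def mape(actual, forecast):
    """Eq. (25): mean absolute percentage error, in percent."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)


def directional_accuracy(actual, forecast):
    """Eqs. (26)-(27): share of steps whose forecast and actual moves agree."""
    actual_move = np.diff(np.asarray(actual, float))
    forecast_move = np.diff(np.asarray(forecast, float))
    return float(np.mean(actual_move * forecast_move >= 0) * 100.0)


# Made-up peak loads for four days (illustrative only).
actual = [100.0, 110.0, 105.0, 120.0]
forecast = [98.0, 112.0, 108.0, 118.0]
print(r_squared(actual, forecast), rmse(actual, forecast),
      mape(actual, forecast), directional_accuracy(actual, forecast))
```

`np.diff` yields M − 1 differences, so taking their mean matches the division by M − 1 in Eq. (26).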
compared to the MEMD-ADE-SVM model. However, these hybrid frameworks perform better than a single model due to the time–frequency resolution algorithm. In addition, MEMD-ADE-LSTM offers excellent performance in DA and R2 as compared to EMD-SVR-PSO. This is due to the ADE optimization technique. Owing to ADE-based optimization, the devised MEMD-ADE-SVM framework is more accurate than the relevant benchmark models. The hybrid model performed better than the other analyzed models on the error metrics. Similarly, the VMD-based framework VMD-SVR-CGWO proposed in Zhang and Hong (2021) is considered. The univariate STLF procedure based on SVR optimized by PSO is also reproduced as a model in this paper. This report accentuates the significance of irrelevant variables and the efficacy of multivariate decomposition predictions. Table 8 and Fig. 9 show that the proposed model attains the best error performance.

5.5. Evaluation of convergence rate

A comparative analysis of the devised SVM-based model and the benchmark models DCP-SVM-WO, VMD-FFT-IOSVR, EMD-PSO-SVR, VMD-SVR-CGWO, and MEMD-ADE-LSTM in terms of convergence rate is shown in Fig. 10(a). There is a trade-off between rate of convergence and prediction accuracy. The accuracy of the VMD-FFT-IOSVR strategy has been enhanced compared to other models, including the proposed model. Since the optimization engine is integrated into the VMD-FFT-IOSVR strategy, this improved accuracy is achieved at the expense of
longer execution times. Fig. 10(a) shows that the execution time has increased from 15 s to 109 s as the optimization module is integrated into the forecasting module. The developed model has reduced execution time for the following reasons: (i) it provides mostly abstract attributes as input to the training and prediction engine, reducing network training time; (ii) it uses kernel functions; (iii) it uses ADE instead of the PSO, CGWO, and FFT algorithms because its convergence speed is relatively fast. The proposed STLF model reduced the running time from 109 s to 46 s due to adaptations in current models. In contrast, the SVM model does not have a built-in optimizer, so SVM outperforms the other models in evaluating the rate of convergence. This behavior is clearly illustrated in Fig. 10(a).

Table 9
Stochastic noise coefficients before and after MEMD.
Noise coefficient   Before MEMD   After MEMD
Q                   1.49e−3       1.81e−4
L                   1.02e−5       1.32e−6
B                   4.21e−4       4.87e−5
K                   6.35e−3       9.12e−4
R                   5.21e−2       4.20e−3

Table 10
The DM test values between the devised and other seven models in the ISO-NE real-world testing set.
Frameworks (Individual and Hybrids)   ISO-NE: DM   P-value
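The DM values summarized in Table 10 are based on the statistic of Eq. (28). The sketch below computes it for two error series, taking the loss differential as the difference of squared forecast errors; that choice of loss, the function name `dm_statistic`, and the sample errors are assumptions of this illustration rather than details confirmed by the paper.

```python
import numpy as np


def dm_statistic(errors_c, errors_d):
    """Ratio of mean to sample standard deviation of the loss differential.

    errors_c, errors_d: forecast errors of two competing models on the
    same test points. The squared-error loss is an assumption of this sketch.
    """
    c = np.asarray(errors_c, dtype=float)
    d = np.asarray(errors_d, dtype=float)
    loss_diff = c ** 2 - d ** 2          # l_j
    l_mean = loss_diff.mean()            # l_mean
    l_std = loss_diff.std(ddof=1)        # l_std with tau - 1 in the denominator
    return float(l_mean / l_std)


# Made-up forecast errors for two hypothetical models.
errors_model_c = [1.0, 2.0, 3.0]
errors_model_d = [1.0, 1.0, 1.0]
print(dm_statistic(errors_model_c, errors_model_d))
```

A positive value indicates that the first model has the larger average squared error. The textbook DM statistic additionally scales the mean by the standard error of the mean (a factor of the square root of τ); the sketch follows the mean-over-standard-deviation form of Eq. (28).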
Table 11
Evaluation of actual and forecasted PL in terms of MAPE and computational time of Individual frameworks (LSTM and SVM) and hybrid frameworks (DCP-SVM-WO,
VMD-FFT-IOSVR, EMD-PSO-SVR, MEMD-ADE-LSTM, VMD-SVR-CGWO) for day-ahead and week-ahead time horizon.
Frameworks without and with MEMD/ EMD/ VMD and optimization modules
LSTM SVM EMD-PSO-SVR VMD-FFT-IOSVR MEMD-ADE-LSTM VMD-SVR-CGWO MEMD-ADE-SVM
τ (s) MAPE (%) τ (s) MAPE (%) τ (s) MAPE (%) τ (s) MAPE (%) τ (s) MAPE (%) τ (s) MAPE (%) τ (s) MAPE (%)
Day ahead 170 2.25 185 2.2 242 2.98 228 1.65 224 1.25 221 0.89 218 0.786
Week ahead 185 2.25 198 2.2 310 2.98 289 1.65 276 1.25 267 0.89 256 0.786
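One ingredient the paper credits for lower processing time is the modified fitness function of Section 3.3. Eqs. (18) and (19) are algebraically identical to Eqs. (16) and (17); only the number of arithmetic operations changes. The sketch below checks this numerically. The function names and the sample Rf values are hypothetical, and the operation counts in the comments follow the one-unit-per-operation convention stated in the text.

```python
def ff_original(rf_mutant, rf_parent):
    """Eqs. (16)-(17): fitness from reciprocals of the risk function.

    Per ff: three reciprocals (numerator and both denominator terms),
    one addition and one division, i.e. 5 operations.
    """
    ff_m = (1.0 / rf_mutant) / (1.0 / rf_mutant + 1.0 / rf_parent)
    ff_p = (1.0 / rf_parent) / (1.0 / rf_parent + 1.0 / rf_mutant)
    return ff_m, ff_p


def ff_simplified(rf_mutant, rf_parent):
    """Eqs. (18)-(19): same values with one addition and one division per ff."""
    ff_m = rf_parent / (rf_mutant + rf_parent)
    ff_p = rf_mutant / (rf_parent + rf_mutant)
    return ff_m, ff_p


# Made-up risk-function values for a mutant and its parent.
rf_m, rf_p = 2.0, 6.0
print(ff_original(rf_m, rf_p))
print(ff_simplified(rf_m, rf_p))

# Operation budget over 100 iterations for both fitness values.
iterations = 100
units_original = iterations * 2 * 5    # 1000 time units
units_simplified = iterations * 2 * 2  # 400 time units
print(units_original, units_simplified)
```

In both forms the vector with the lower risk receives the higher fitness, and the two fitness values sum to one.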
• The selection of DL architectures: Till now, different DL architectures have been used and applied in the literature to solve complex problems. However, there is no motivation or documentation on why these architectures have been established (Bera et al., 2014).

• Lack of benchmarking results: There are few benchmark results in the literature, such as the studies in Reddy et al. (2016), Badem et al. (2017), Chen et al. (2018). In these studies, the authors have involved different deep architectures and compared the results with DTs and BPs to produce the best training results. In addition, the loss of information in any system under analysis can influence the stability of the whole system. This issue must be benchmarked to determine the best performance that can be obtained.

• The cost of implementing the architecture: Feature extraction can be done beforehand, and then the suitable algorithm can be applied as in Gadekallu et al. (2020), Junbo et al. (2015), Liu et al. (2016). This procedure strives to lessen the required training time and computational power.

• Reasonable run-time: The high dimensionality of some datasets with many parameters in some DL architectures, such as the DNN model, represents a challenge for DL to acquire an accurate DNN in a reasonable run time.

• Overfitting in DNNs: In complex applications, many parameters are related to the unseen dataset. This can cause a difference between the error on the training dataset and the error on the new unseen dataset. However, the efficiency of the DL model-based ANN can be evaluated by its ability to perform on unseen datasets.

• The optimization of hyper-parameters: The hyper-parameters are those whose value is defined before the learning process. Any modification in these parameters affects the performance of the DL model.

• High hardware performance is required: High processing power is needed to deal with a real-world application using DL solutions. Therefore, engineers and specialists are trying to develop multi-core, high-performing GPUs and similar processing units like the recently upcoming Tensor Processing Units (TPUs).

• Lack of flexibility: DL models can yield accurate and efficient solutions to a disseminated problem. On the other hand, the shallow network architectures are highly specialized to specific application domains.

7. Conclusions

PLF plays a vital role in balancing power distribution, economics, and safe and reliable energy system operations. Accurate PLF reduces energy grid failure, ameliorates costs and risks, improves energy grid security, and helps policymakers in optimal planning and decision making, making the energy grid cost-effective and environment friendly. Moreover, it provides a strategic understanding of the current energy situation to evaluate the energy imported or exported. On this note, most of the existing literature focuses on accuracy improvement. However, considering only the accuracy index is insufficient; convergence rate and stability indices are indispensable. These indices are equally crucial in forecasting. Thus, a novel hybrid framework MEMD-ADE-SVM is devised in this research. The proposed hybrid framework has a novel ADE algorithm for appropriate parameter selection and tuning of SVM. Meanwhile, the MEMD approach simultaneously decomposes historical loads and meteorological variables adaptively to handle the non-linearity and non-stationarity of the day and week ahead PL. It effectively extracts various features at different levels of time frequencies associated with predicting the next day's peak load more accurately. The purpose is to simultaneously acquire high accuracy, excellent stability, and fast convergence. The proposed framework is evaluated in terms of accuracy, stability, and convergence rate by comparing it with benchmark frameworks such as DCP-SVM-WO, VMD-FFT-IOSVR, EMD-PSO-SVR, MEMD-ADE-LSTM, and VMD-SVR-CGWO. From the experimental results, the achieved DAs are 81.43% (VMD-FFT-IOSVR), 84.23% (DCP-SVM-WO), 90.231% (EMD-PSO-SVR), 91.227% (VMD-SVR-CGWO), 92.22% (MEMD-ADE-LSTM), and 93.145% (proposed framework). Therefore, the proposed framework would be the most appropriate option for policymakers and decision-makers to use for load forecasting to ensure power systems' reliable and safe operation.

7.1. Limitations and future work

Energy grids are expected to become more convoluted with the evolution of emerging renewable energy (RE) technologies. The uncertainties of SG systems are growing, as many aspects may affect electricity demand. This paper concentrates not on the future load demand from a long-term perspective but on the short-term load fluctuation. The forecasting framework devised in this research does not evaluate other related factors but is only based on the detailed historical short-term load. Many key impacts may be missing, and there are also substantial research gaps there. From the standpoint of life cycle assessment (LCA), research on the whole system from "cradle to grave" is introduced, which can be used in forecasting models. Moreover, scientific scenarios can be introduced to combine long-term and short-term forecasting, and more work must be done in related fields. Follow-up studies could be performed in future work, including but not confined to:

• The forecasting model can evaluate additional characteristics or parameters to enhance the PLF's efficacy.

• Research on energy systems, notably the use of RE, must be studied so that the distribution and structures of future RE can be well known, which is a crucial aspect for PLF.

• Paying close attention to developing data cleaning technologies to deal with irregular and unstable short-term load data so that the adverse impacts of noise can be effectively handled.
• A dynamic model selection strategy could be evaluated when selecting the weights of hybrid or combined models.

• LCA-based modeling and design analysis can be oriented into forecasting models.

• More case studies in different SG systems could be done to demonstrate the scalability of the proposed forecasting model.

CRediT authorship contribution statement

M. Zulfiqar: Conceptualization, Methodology/Study design, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review and editing, Visualization. M. Kamran: Conceptualization, Validation, Formal analysis, Investigation, Resources, Writing – original draft, Writing – review and editing, Visualization, Supervision, Project administration. M.B. Rasheed: Conceptualization, Methodology/Study design, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review and editing, Visualization, Supervision, Project administration, Funding acquisition. T. Alquthami: Conceptualization, Validation, Formal analysis, Investigation, Resources, Writing – original draft, Writing – review and editing, Visualization, Supervision, Project administration, Funding acquisition. A.H. Milyani: Conceptualization, Validation, Investigation, Resources, Writing – review and editing, Project administration, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

Acknowledgment

This project has received funding from the European Union Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No 754382, GOT ENERGY TALENT. Furthermore, the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia has also funded this project, under grant no. (RG-34-135-42). Therefore, the authors gratefully acknowledge technical and financial support from the Ministry of Education and King Abdulaziz University, DSR, Jeddah, Saudi Arabia.

References

Ahmad, T., Chen, H., 2019. Nonlinear autoregressive and random forest approaches to forecasting electricity load for utility energy management systems. Sustainable Cities Soc. 45, 460–473.
Al-Musaylh, M.S., Deo, R.C., Adamowski, J.F., Li, Y., 2018a. Short-term electricity demand forecasting with MARS, SVR and ARIMA models using aggregated demand data in Queensland, Australia. Adv. Eng. Inform. 35, 1–16.
Al-Musaylh, M.S., Deo, R.C., Li, Y., Adamowski, J.F., 2018b. Two-phase particle swarm optimized-support vector regression hybrid model integrated with improved empirical mode decomposition with adaptive noise for multiple-horizon electricity demand forecasting. Appl. Energy 217, 422–439.
Amjady, N., Keynia, F., Zareipour, H., 2010. Short-term load forecast of microgrids by a new bilevel prediction strategy. IEEE Trans. Smart Grid 1 (3), 286–294.
Bäck, T., Foussette, C., Krause, P., 2013. Contemporary Evolution Strategies, Vol. 86. Springer.
Badem, H., Basturk, A., Caliskan, A., Yuksel, M.E., 2017. A new efficient training strategy for deep neural networks by hybridization of artificial bee colony and limited–memory BFGS optimization algorithms. Neurocomputing 266, 506–526.
Bae, C., Kang, K., Liu, G., Chung, Y.Y., 2016. A novel real time video tracking framework using adaptive discrete swarm optimization. Expert Syst. Appl. 64, 385–399.
Bashir, T., Haoyong, C., Tahir, M.F., Liqiang, Z., 2022. Short term electricity load forecasting using hybrid prophet-LSTM model optimized by BPNN. Energy Rep. 8, 1678–1686.
Bera, S., Misra, S., Rodrigues, J.J., 2014. Cloud computing applications for smart grid: A survey. IEEE Trans. Parallel Distrib. Syst. 26 (5), 1477–1494.
Cai, M., Pipattanasomporn, M., Rahman, S., 2019. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Appl. Energy 236, 1078–1088.
Cao, G., Wu, L., 2016. Support vector regression with fruit fly optimization algorithm for seasonal electricity consumption forecasting. Energy 115, 734–745.
Chang, C.-C., Lin, C.-J., 2011. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2 (3), 1–27.
Chen, J., Zeng, G.-Q., Zhou, W., Du, W., Lu, K.-D., 2018. Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization. Energy Convers. Manage. 165, 681–695.
Chen, Y., Zhang, F., Berardi, U., 2020. Day-ahead prediction of hourly subentry energy consumption in the building sector using pattern recognition algorithms. Energy 211, 118530.
Coelho, V.N., Coelho, I.M., Coelho, B.N., Reis, A.J., Enayatifar, R., Souza, M.J., Guimarães, F.G., 2016. A self-adaptive evolutionary fuzzy model for load forecasting problems on smart grid environment. Appl. Energy 169, 567–584.
Dai, Y., Zhao, P., 2020. A hybrid load forecasting model based on support vector machine with intelligent methods for feature selection and parameter optimization. Appl. Energy 279, 115332.
Darwish, A., Hassanien, A.E., Das, S., 2020. A survey of swarm and evolutionary computing approaches for deep learning. Artif. Intell. Rev. 53 (3), 1767–1812.
Deng, C., Zhang, X., Huang, Y., Bao, Y., 2021. Equipping seasonal exponential smoothing models with particle swarm optimization algorithm for electricity consumption forecasting. Energies 14 (13), 4036.
Dewangan, C.L., Singh, S., Chakrabarti, S., 2020. Combining forecasts of day-ahead solar power. Energy 202, 117743.
Fan, G.-F., Peng, L.-L., Zhao, X., Hong, W.-C., 2017. Applications of hybrid EMD with PSO and GA for an SVR-based load forecasting model. Energies 10 (11), 1713.
Gadekallu, T.R., Khare, N., Bhattacharya, S., Singh, S., Maddikunta, P.K.R., Srivastava, G., 2020. Deep neural networks to predict diabetic retinopathy. J. Ambient Intell. Humaniz. Comput. 1–14.
Gong, M., Liu, J., Li, H., Cai, Q., Su, L., 2015. A multiobjective sparse feature learning model for deep neural networks. IEEE Trans. Neural Netw. Learn. Syst. 26 (12), 3263–3277.
Guan, Y., Li, D., Xue, S., Xi, Y., 2021. Feature-fusion-kernel-based Gaussian process model for probabilistic long-term load forecasting. Neurocomputing 426, 174–184.
Hafeez, G., Javaid, N., Riaz, M., Ali, A., Umar, K., Iqbal, Z., 2019. Day ahead electric load forecasting by an intelligent hybrid model based on deep learning for smart grid. In: Conference on Complex, Intelligent, and Software Intensive Systems. Springer, pp. 36–49.
Haq, E.U., Lyu, X., Jia, Y., Hua, M., Ahmad, F., 2020. Forecasting household electric appliances consumption and peak demand based on hybrid machine learning approach. Energy Rep. 6, 1099–1105.
Haq, M.R., Ni, Z., 2019. A new hybrid model for short-term electricity load forecasting. IEEE Access 7, 125413–125423.
He, F., Zhou, J., Feng, Z.-k., Liu, G., Yang, Y., 2019. A hybrid short-term load forecasting model based on variational mode decomposition and long short-term memory networks considering relevant factors with Bayesian optimization algorithm. Appl. Energy 237, 103–116.
He, F., Zhou, J., Mo, L., Feng, K., Liu, G., He, Z., 2020. Day-ahead short-term load probability density forecasting method with a decomposition-based quantile regression forest. Appl. Energy 262, 114396.
Hernandez, L., Baladron, C., Aguiar, J.M., Carro, B., Sanchez-Esguevillas, A.J., Lloret, J., Massana, J., 2014. A survey on electric power demand forecasting: future trends in smart grids, microgrids and smart buildings. IEEE Commun. Surv. Tutor. 16 (3), 1460–1495.
Heydari, A., Nezhad, M.M., Pirshayan, E., Garcia, D.A., Keynia, F., De Santoli, L., 2020. Short-term electricity price and load forecasting in isolated power grids based on composite neural network and gravitational search optimization algorithm. Appl. Energy 277, 115503.
Hu, Z., Bao, Y., Chiong, R., Xiong, T., 2015a. Mid-term interval load forecasting Niu, D., Ji, Z., Li, W., Xu, X., Liu, D., 2021. Research and application of a
using multi-output support vector regression with a memetic algorithm for hybrid model for mid-term power demand forecasting based on secondary
feature selection. Energy 84, 419–431. decomposition and interval optimization. Energy 234, 121145.
Hu, Z., Bao, Y., Xiong, T., 2014. Comprehensive learning particle swarm optimization based memetic algorithm for model selection in short-term load forecasting using support vector regression. Appl. Soft Comput. 25, 15–25.
Hu, Z., Bao, Y., Xiong, T., Chiong, R., 2015b. Hybrid filter–wrapper feature selection for short-term load forecasting. Eng. Appl. Artif. Intell. 40, 17–27.
Huang, X., Hong, S.H., Li, Y., 2017. Hour-ahead price based energy management scheme for industrial facilities. IEEE Trans. Ind. Inform. 13 (6), 2886–2898.
Huang, N.E., Shen, Z., Long, S.R., Wu, M.C., Shih, H.H., Zheng, Q., Yen, N.-C., Tung, C.C., Liu, H.H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 454 (1971), 903–995.
Huyghues-Beaufond, N., Tindemans, S., Falugi, P., Sun, M., Strbac, G., 2020. Robust and automatic data cleansing method for short-term load forecasting of distribution feeders. Appl. Energy 261, 114405.
ISO, 2022. ISO: ISO New England Data, URL http://www.iso-ne.com/.
Jacob, M., Neves, C., Vukadinović Greetham, D., 2020. Forecasting and Assessing Risk of Individual Electricity Peaks. Springer Nature.
Jang, Y., Byon, E., Jahani, E., Cetin, K., 2020. On the long-term density prediction of peak electricity load with demand side management in buildings. Energy Build. 228, 110450.
Jawad, M., Ali, S.M., Khan, B., Mehmood, C.A., Farid, U., Ullah, Z., Usman, S., Fayyaz, A., Jadoon, J., Tareen, N., et al., 2018. Genetic algorithm-based non-linear auto-regressive with exogenous inputs neural network short-term and medium-term uncertainty modelling and prediction for electrical load and wind speed. J. Eng. 2018 (8), 721–729.
Jiang, H., Wang, K., Wang, Y., Gao, M., Zhang, Y., 2016. Energy big data: A survey. IEEE Access 4, 3844–3861.
Junbo, T., Weining, L., Juneng, A., Xueqian, W., 2015. Fault diagnosis method study in roller bearing based on wavelet transform and stacked auto-encoder. In: The 27th Chinese Control and Decision Conference. 2015 CCDC, IEEE, pp. 4608–4613.
Kavousi-Fard, A., Samet, H., Marzbani, F., 2014. A new hybrid modified firefly algorithm and support vector regression model for accurate short term load forecasting. Expert Syst. Appl. 41 (13), 6047–6056.
Kazemzadeh, M.-R., Amjadian, A., Amraee, T., 2020. A hybrid data mining driven algorithm for long term electric peak load and energy demand forecasting. Energy 204, 117948.
Khalid, R., Javaid, N., 2020. A survey on hyperparameters optimization algorithms of forecasting models in smart grid. Sustainable Cities Soc. 61, 102275.
Kumar, N., et al., 2016. Market clearing price prediction using ANN in Indian electricity markets. In: 2016 International Conference on Energy Efficient Technologies for Sustainability. ICEETS, IEEE, pp. 454–458.
Li, Y., Che, J., Yang, Y., 2018. Subsampled support vector regression ensemble for short term electric load forecasting. Energy 164, 160–170.
Li, K., Ma, Z., Robinson, D., Lin, W., Li, Z., 2020. A data-driven strategy to forecast next-day electricity usage and peak electricity demand of a building portfolio using cluster analysis, Cubist regression models and particle swarm optimization. J. Clean. Prod. 273, 123115.
Li, S., Wang, P., Goel, L., 2015. Short-term load forecasting by wavelet transform and evolutionary extreme learning machine. Electr. Power Syst. Res. 122, 96–103.
Liao, G.-C., 2014. Hybrid improved differential evolution and wavelet neural network with load forecasting problem of air conditioning. Int. J. Electr. Power Energy Syst. 61, 673–682.
Liu, Y., Gao, F., 2020. Ultra-short-term forecast of power load based on load characteristics and embedded system. Microprocess. Microsyst. 103460.
Liu, S., Tian, L.-X., 2013. The study of long-term electricity load forecasting based on improved grey prediction model. In: 2013 International Conference on Machine Learning and Cybernetics, Vol. 2. IEEE, pp. 653–656.
Liu, D., Zeng, L., Li, C., Ma, K., Chen, Y., Cao, Y., 2016. A distributed short-term load forecasting method based on local weather information. IEEE Syst. J. 12 (1), 208–215.
Massaoudi, M., Refaat, S.S., Chihi, I., Trabelsi, M., Oueslati, F.S., Abu-Rub, H., 2021. A novel stacked generalization ensemble-based hybrid LGBM-XGB-MLP model for short-term load forecasting. Energy 214, 118874.
Memarzadeh, G., Keynia, F., 2021. Short-term electricity load and price forecasting by a new optimal LSTM-NN based prediction algorithm. Electr. Power Syst. Res. 192, 106995.
Moazzami, M., Khodabakhshian, A., Hooshmand, R., 2013. A new hybrid day-ahead peak load forecasting method for Iran’s National Grid. Appl. Energy 101, 489–501.
Nalcaci, G., Özmen, A., Weber, G.W., 2019. Long-term load forecasting: Models based on MARS, ANN and LR methods. CEJOR Cent. Eur. J. Oper. Res. 27 (4), 1033–1049.
Raza, M.Q., Khosravi, A., 2015. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew. Sustain. Energy Rev. 50, 1352–1372.
Reddy, K.K., Sarkar, S., Venugopalan, V., Giering, M., 2016. Anomaly detection and fault disambiguation in large flight data: A multi-modal deep auto-encoder approach. In: Annual Conference of the PHM Society, Vol. 8, no. 1.
Rehman, N., Mandic, D.P., 2010. Multivariate empirical mode decomposition. Proc. Royal Soc. A 466 (2117), 1291–1302.
Rilling, G., Flandrin, P., Gonçalves, P., Lilly, J.M., 2007. Bivariate empirical mode decomposition. IEEE Signal Process. Lett. 14 (12), 936–939.
Sakurai, D., Fukuyama, Y., Iizaka, T., Matsui, T., 2019. Daily peak load forecasting by artificial neural network using differential evolutionary particle swarm optimization considering outliers. IFAC-PapersOnLine 52 (4), 389–394.
Santos Coelho, L.d., Mariani, V., 2006. ERRATUM-correction to "combining of chaotic differential evolution and quadratic programming for economic dispatch optimization with valve-point effect". IEEE Trans. Power Syst. 21 (3), 1465.
Selakov, A., Cvijetinović, D., Milović, L., Mellon, S., Bekut, D., 2014. Hybrid PSO–SVM method for short-term load forecasting during periods with significant temperature variations in city of Burbank. Appl. Soft Comput. 16, 80–88.
Shiri, A., Afshar, M., Rahimi-Kian, A., Maham, B., 2015. Electricity price forecasting using support vector machines by considering oil and natural gas price impacts. In: 2015 IEEE International Conference on Smart Energy Grid Engineering. SEGE, IEEE, pp. 1–5.
Sideratos, G., Ikonomopoulos, A., Hatziargyriou, N.D., 2020. A novel fuzzy-based ensemble model for load forecasting using hybrid deep neural networks. Electr. Power Syst. Res. 178, 106025.
Sobhani, M., Hong, T., Martin, C., 2020. Temperature anomaly detection for electric load forecasting. Int. J. Forecast. 36 (2), 324–333.
Storn, R., Price, K., 1997. Differential evolution–A simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11 (4), 341–359.
Talaat, M., Farahat, M., Mansour, N., Hatata, A., 2020. Load forecasting based on grasshopper optimization and a multilayer feed-forward neural network using regressive approach. Energy 196, 117087.
ur Rehman, N., Mandic, D.P., 2009. Empirical mode decomposition for trivariate signals. IEEE Trans. Signal Process. 58 (3), 1059–1068.
Vesterstrom, J., Thomsen, R., 2004. A comparative study of differential evolution, particle swarm optimization, and evolutionary algorithms on numerical benchmark problems. In: Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No. 04TH8753), Vol. 2. IEEE, pp. 1980–1987.
Vrablecová, P., Ezzeddine, A.B., Rozinajová, V., Šárik, S., Sangaiah, A.K., 2018. Smart grid load forecasting using online support vector regression. Comput. Electr. Eng. 65, 102–117.
Wang, R., Wang, J., Xu, Y., 2019. A novel combined model based on hybrid optimization algorithm for electrical load forecasting. Appl. Soft Comput. 82, 105548.
Wang, K., Xu, C., Zhang, Y., Guo, S., Zomaya, A.Y., 2017. Robust big data analytics for electricity price forecasting in the smart grid. IEEE Trans. Big Data 5 (1), 34–45.
Wood, D.A., 2022. Trend decomposition aids short-term countrywide wind capacity factor forecasting with machine and deep learning methods. Energy Convers. Manage. 253, 115189.
Wu, Z., Zhao, X., Ma, Y., Zhao, X., 2019. A hybrid model based on modified multi-objective Cuckoo search algorithm for short-term load forecasting. Appl. Energy 237, 896–909.
Xiao, L., Shao, W., Wang, C., Zhang, K., Lu, H., 2016. Research and application of a hybrid model based on multi-objective optimization for electrical load forecasting. Appl. Energy 180, 213–233.
Xiao, F., Wang, S., Fan, C., 2017. Mining big building operational data for building cooling load prediction and energy efficiency improvement. In: 2017 IEEE International Conference on Smart Computing. SMARTCOMP, IEEE, pp. 1–3.
Xiong, T., Bao, Y., Hu, Z., 2014. Interval forecasting of electricity demand: A novel bivariate EMD-based support vector regression modeling framework. Int. J. Electr. Power Energy Syst. 63, 353–362.
Yu, Z., Niu, Z., Tang, W., Wu, Q., 2019. Deep learning for daily peak load forecasting–A novel gated recurrent neural network combining dynamic time warping. IEEE Access 7, 17184–17194.
Yuan, X., Wang, L., Yuan, Y., Zhang, Y., Cao, B., Yang, B., 2008. A modified differential evolution approach for dynamic economic dispatch with valve-point effects. Energy Convers. Manage. 49 (12), 3447–3453.
Yuan, X., Wang, L., Zhang, Y., Yuan, Y., 2009. A hybrid differential evolution method for dynamic economic dispatch with valve-point effects. Expert Syst. Appl. 36 (2), 4042–4048.
Zhang, W., Chen, Q., Yan, J., Zhang, S., Xu, J., 2021. A novel asynchronous deep reinforcement learning model with adaptive early forecasting method and reward incentive mechanism for short-term load forecasting. Energy 236, 121492.
Zhang, F., Deb, C., Lee, S.E., Yang, J., Shah, K.W., 2016. Time series forecasting for building energy consumption using weighted support vector regression with differential evolution optimization technique. Energy Build. 126, 94–103.
Zhang, Z., Ding, S., Sun, Y., 2020. A support vector regression model hybridized with chaotic krill herd algorithm and empirical mode decomposition for regression task. Neurocomputing 410, 185–201.
Zhang, Z., Hong, W.-C., 2021. Application of variational mode decomposition and chaotic grey wolf optimizer with support vector regression for forecasting electric loads. Knowl.-Based Syst. 228, 107297.
Zhang, X., Wang, J., Zhang, K., 2017. Short-term electric load forecasting based on singular spectrum analysis and support vector machine optimized by Cuckoo search algorithm. Electr. Power Syst. Res. 146, 270–285.
Zhang, J., Wei, Y.-M., Li, D., Tan, Z., Zhou, J., 2018. Short term electricity load forecasting using a hybrid model. Energy 158, 774–781.