Wavelet Based Forecasting
Krishna B, P. C. Nayak and others, National Institute of Hydrology
Abstract
A new hybrid model combining wavelets and an Artificial Neural Network (ANN), called the wavelet neural network (WNN) model, is proposed in this study and applied to time series modeling of river flow. The time series of daily river flow of the Malaprabha River basin (Karnataka state, India) was analyzed with the WNN model. The observed time series is decomposed into sub-series using the discrete wavelet transform, and the appropriate sub-series are then used as inputs to the neural network for forecasting hydrological variables. The hybrid WNN model was compared with standard ANN and AR models. The WNN model provided a good fit to the observed data, especially the peak values during the testing period. The benchmark results showed that the hybrid model estimated the hydrograph properties better than the ANN and AR models.
time series property [17]. In order to raise the forecast precision and lengthen the forecast lead time, an alternative model should be envisaged. In this paper, a new hybrid model called the wavelet neural network (WNN) model, which is a combination of wavelet analysis and ANN, is proposed. The advantage of the wavelet technique is that it provides a mathematical process for decomposing a signal into multiple levels of detail, which can then be analyzed. Wavelet analysis can effectively diagnose the main frequency component of a signal and extract local information from the time series. In the past decade, wavelet theory has been introduced into signal processing analysis, and wavelet transforms have been successfully applied to wave data analysis and other ocean engineering applications [18,19].

In recent years, wavelet theory has also been introduced into the field of hydrology [17,20-22], and wavelet analysis has been identified as a useful tool for describing both rainfall and runoff time series [21,23-26]. By coupling the wavelet method with the traditional AR model, the Wavelet-Autoregressive model (WARM) was developed for annual rainfall prediction [27]. Coulibaly [28] used wavelet analysis to identify and describe variability in annual Canadian stream flows and to gain insights into the dynamical link between the stream flows and the dominant modes of climate variability in the Northern Hemisphere. Owing to the similarity between wavelet decomposition and a one-hidden-layer neural network, the idea of combining wavelets and neural networks has recently resulted in the formulation of the wavelet neural network, which has been used in various fields [29]. Results show that the training and adaptation efficiency of the wavelet neural network is better than that of other networks. Dongjie [30] used a combination of neural networks and wavelet methods to predict groundwater levels. Aussem [31] used a Dynamical Recurrent Neural Network (DRNN) on each resolution scale of the sunspot time series resulting from the wavelet-decomposed series, with the Temporal Recurrent Back-propagation (TRBP) algorithm. Partal [32] used a conjunction model (wavelet-neuro-fuzzy) to forecast daily precipitation in Turkey: the observed daily precipitation series is decomposed into sub-series using the Discrete Wavelet Transform (DWT), and the appropriate sub-series are then used as inputs to neuro-fuzzy models for forecasting daily precipitation.

In this paper, an attempt has been made to forecast the time series of daily river flow by developing wavelet neural network (WNN) models. The time series of river flow was decomposed into wavelet sub-series by the discrete wavelet transform. A neural network model was then constructed with the wavelet sub-series as input and the original time series as output. Finally, the forecasting performance of the WNN model was compared with that of the ANN and AR models.

2. Methods of Analysis

2.1. Wavelet Analysis

Wavelet analysis is an advanced tool in signal processing that has attracted much attention since its theoretical development [33]. Its use has increased rapidly in communications, image processing and optical engineering as an alternative to the Fourier transform for preserving local, non-periodic and multiscaled phenomena. The difference between wavelets and Fourier transforms is that wavelets can provide the exact locality of any change in the dynamical patterns of the sequence, whereas Fourier transforms concentrate mainly on its frequency. Moreover, the Fourier transform assumes infinite-length signals, whereas wavelet transforms can be applied to any kind and size of time series, even when the sequences are not homogeneously sampled in time [34]. In general, wavelet transforms can be used to explore, denoise and smoothen time series, and to aid forecasting and other empirical analyses.

Wavelet analysis is the breaking up of a signal into shifted and scaled versions of the original (or mother) wavelet. In wavelet analysis, the use of a fully scalable modulated window solves the signal-cutting problem. The window is shifted along the signal, and for every position the spectrum is calculated. This process is then repeated many times with a slightly shorter (or longer) window for every new cycle. In the end, the result is a collection of time-frequency representations of the signal, all with different resolutions; because of this collection of representations we can speak of a multiresolution analysis. By decomposing a time series into time-frequency space, one is able to determine both the dominant modes of variability and how those modes vary in time. Wavelets have proven to be a powerful tool for the analysis and synthesis of data from long-memory processes: wavelets are strongly connected to such processes in that the same shapes repeat at different orders of magnitude. The ability of wavelets to simultaneously localize a process in the time and scale domains results in many dense matrices being represented in a sparse form.

2.2. Discrete Wavelet Transform (DWT)

The basic aim of wavelet analysis is both to determine the frequency (or scale) content of a signal and to assess the temporal variation of this frequency content. This property is in complete contrast to Fourier analysis, which allows for the determination of the frequency content of a signal but fails to determine its time dependence. The wavelet transform is therefore the tool of choice when signals are characterized by localized high-frequency events or by a large number of scale-variable processes. Because of its localization properties in both time and scale, the wavelet transform allows the time evolution of processes at different scales in the signal to be tracked.

The wavelet transform of a time series f(t) is defined as

    f(a, b) = (1 / sqrt(|a|)) * Integral[ f(t) psi((t - b) / a) dt ]    (1)

where psi(t) is the basic wavelet with effective length (t) that is usually much shorter than the target time series f(t). The variables a and b are, respectively, the scale or dilation factor, which determines the characteristic frequency so that its variation gives rise to a "spectrum", and the translation in time, whose variation represents the "sliding" of the wavelet over f(t). The wavelet spectrum is thus customarily displayed in the time-frequency domain. For low scales, i.e. when |a| < 1, the wavelet function is highly concentrated (shrunken, compressed), with frequency contents mostly in the higher frequency bands. Inversely, when |a| > 1 the wavelet is stretched and contains mostly low frequencies. For small scales we thus obtain a more detailed view of the signal (also known as a "higher resolution"), whereas for larger scales we obtain a more general view of the signal structure.

The original signal X(n) passes through two complementary filters (a low-pass and a high-pass filter) and emerges as two signals, the approximations (A) and the details (D). The approximations are the high-scale, low-frequency components of the signal; the details are the low-scale, high-frequency components. Normally, the low-frequency content of the signal (approximation, A) is the most important part: it demonstrates the signal identity. The high-frequency component (detail, D) is nuance. The decomposition process can be iterated, with successive approximations being decomposed in turn, so that one signal is broken down into many lower-resolution components (Figure 1).

2.3. Mother Wavelet

The choice of the mother wavelet depends on the data to be analyzed. The Daubechies and Morlet wavelets are the commonly used "mother" wavelets. Daubechies wavelets exhibit a good trade-off between parsimony and information richness; they produce identical events across the observed time series in so many different fashions that most prediction models cannot recognize them well [35]. Morlet wavelets, on the other hand, have a more consistent response to similar events, but have the weakness of generating many more inputs than the Daubechies wavelets for the prediction models.

2.4. Artificial Neural Network (ANN)

An ANN can be defined as a system or mathematical model consisting of many nonlinear artificial neurons running in parallel, which can be arranged in one or multiple layers. Although the concept of artificial neurons was first introduced by McCulloch and Pitts [36], the major applications of ANNs have arisen only since the development of the back-propagation method of training by Rumelhart [37]. Following this development, ANN research has resulted in the successful solution of some complicated problems not easily solved by traditional modeling methods, even when the quality or quantity of data is very limited. ANN models are "black box" models with particular properties that are greatly suited to dynamic nonlinear system modeling. The main advantage of this approach over traditional methods is that it does not require the complex nature of the underlying process to be explicitly described in mathematical form. ANN applications in hydrology vary from real-time to event-based modeling.

The most popular ANN architecture in hydrologic modeling is the multilayer perceptron (MLP) trained with the BP algorithm [8,9]. A multilayer perceptron network consists of an input layer, one or more hidden layers of computation nodes, and an output layer.
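The iterative filtering cascade described in Section 2.2 can be sketched with the PyWavelets library (a tool chosen for this illustration; the paper does not name an implementation). A three-level decomposition with the Daubechies-5 wavelet yields one approximation (A3) and three detail sub-series, and the inverse transform recovers the signal exactly. The synthetic "flow" series is illustrative only.

```python
import numpy as np
import pywt  # PyWavelets

# Synthetic "daily flow" series: a seasonal cycle plus noise (illustrative only).
rng = np.random.default_rng(0)
t = np.arange(365)
flow = 50 + 30 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 5, t.size)

# Three-level DWT with Daubechies-5: one approximation (A3) and three
# detail coefficient sets (D3, D2, D1), ordered coarse to fine.
coeffs = pywt.wavedec(flow, "db5", level=3)
A3, D3, D2, D1 = coeffs

# The inverse transform reconstructs the original signal
# (possibly one padded sample longer, hence the slice).
recon = pywt.waverec(coeffs, "db5")
print(np.allclose(recon[: flow.size], flow))  # prints True (up to float tolerance)
```

The coefficient sets get shorter at each level because each filtering stage halves the sampling rate, which is exactly the multiresolution view described above.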
The number of input and output nodes is determined by the nature of the actual input and output variables. The number of hidden nodes, however, depends on the complexity of the mathematical nature of the problem, and is determined by the modeler, often by trial and error. The input signal propagates through the network in a forward direction, layer by layer. Each hidden and output node processes its input by multiplying each of its input values by a weight, summing the products and then passing the sum through a nonlinear transfer function to produce a result. For the training process, in which the weights are selected, the neural network uses the gradient descent method to modify the randomly initialized weights of the nodes in response to the errors between the actual output values and the target values. This process is referred to as training or learning; it stops when the errors are minimized or another stopping criterion is met. The BPNN can be expressed as

    Y = f(WX + theta)    (2)

where X = input or hidden node value; Y = output value of the hidden or output node; f(.) = transfer function; W = weights connecting the input to hidden, or hidden to output, nodes; and theta = bias (or threshold) for each node.

3. Method of Network Training

The Levenberg-Marquardt (LM) method was used for training the network. The LM method is a modification of the classic Newton algorithm for finding an optimum solution to a minimization problem. In practice, LM is faster and finds better optima for a variety of problems than most other methods [38]. The method also takes advantage of internal recurrence to dynamically incorporate past experience into the training process [39].

The Levenberg-Marquardt algorithm is given by

    X(k+1) = X(k) - [J^T J + mu*I]^(-1) J^T e    (3)

where X is the vector of neural network weights, J is the Jacobian matrix of the performance criterion to be minimized, mu is a learning rate that controls the learning process, and e is the residual error vector. If the scalar mu is very large, the above expression approximates gradient descent with a small step size, while if it is very small, the above expression becomes the Gauss-Newton method using the approximate Hessian matrix. The Gauss-Newton method is faster and more accurate near an error minimum; hence mu is decreased after each successful step and increased only when a step increases the error. Levenberg-Marquardt has large computational and memory requirements, and thus it can only be used in small networks, but it is faster and less easily trapped in local minima than other optimization algorithms.

3.1. Selection of Network Architecture

Increasing the number of training patterns provides more information about the shape of the solution surface, and thus increases the potential level of accuracy that can be achieved by the network. A large training pattern set, however, can sometimes overwhelm certain training algorithms, thereby increasing the likelihood of an algorithm becoming stuck in a local error minimum. Consequently, there is no guarantee that adding more training patterns leads to an improved solution. Moreover, there is a limit to the amount of information that can be modeled by a network comprising a fixed number of hidden neurons, and the time required to train a network increases with the number of patterns in the training set. The critical aspect is the choice of the number of nodes in the hidden layer, and hence the number of connection weights.

Based on physical knowledge of the problem and statistical analysis, different combinations of antecedent values of the time series were considered as input nodes. The output node is the time series value to be predicted one step ahead. The time series data were standardized to zero mean and unit variance, and then normalized into the range 0 to 1. The activation functions used for the hidden and output layers were the logarithmic sigmoid and the pure linear function, respectively. To decide the number of hidden neurons, a trial-and-error procedure started with two hidden neurons, and the number of hidden neurons was increased up to 10 with a step size of 1 in each trial. For each set of hidden neurons, the network was trained in batch mode to minimize the mean square error at the output layer. To check for over-fitting during training, cross validation was performed by keeping track of the efficiency of the fitted model. Training was stopped when there was no significant improvement in the efficiency, and the model was then tested for its generalization properties. Figure 2 shows the multilayer perceptron (MLP) neural network architecture when the original signal is taken as input to the network.

3.2. Method of Combining Wavelet Analysis with ANN

The decomposed details (D) and approximation (A) were taken as inputs to the neural network structure as shown in Figure 3. In Figure 3, i is the level of decomposition, varying from 1 to I; j is the number of antecedent values, varying from 0 to J; and N is the length of the time series. To obtain the optimal weights (parameters) of the neural network structure, the LM back-propagation algorithm was used to train the network, with a standard MLP using a logarithmic sigmoid transfer function for the hidden layer and a pure linear transfer function for the output layer.
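The combination described in Section 3.2 can be sketched end to end: decompose the series, reconstruct each sub-series at full length, feed the standardized sub-series values at time t to an MLP, and target the original flow at t + 1. PyWavelets and scikit-learn's MLPRegressor are stand-ins chosen for this illustration (scikit-learn offers L-BFGS rather than the Levenberg-Marquardt training used in the paper), and the series is synthetic.

```python
import numpy as np
import pywt
from sklearn.neural_network import MLPRegressor

# Illustrative "daily flow": two seasonal cycles plus noise (not study data).
rng = np.random.default_rng(1)
t = np.arange(730)
flow = 50 + 30 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 3, t.size)

wavelet, level = "db5", 3  # the paper's choice: Daubechies-5 at level 3
coeffs = pywt.wavedec(flow, wavelet, level=level)

def component(k):
    """Reconstruct sub-series k (A3, D3, D2 or D1) at full signal length."""
    cs = [c if i == k else np.zeros_like(c) for i, c in enumerate(coeffs)]
    return pywt.waverec(cs, wavelet)[: flow.size]

subs = np.column_stack([component(k) for k in range(len(coeffs))])
# Linearity of the inverse DWT: the sub-series sum back to the signal.
assert np.allclose(subs.sum(axis=1), flow)

# Standardize inputs (zero mean, unit variance), as the paper does.
X = (subs - subs.mean(axis=0)) / subs.std(axis=0)
X, y = X[:-1], flow[1:]  # sub-series at t  ->  flow at t + 1

mlp = MLPRegressor(hidden_layer_sizes=(3,), activation="logistic",
                   solver="lbfgs", max_iter=5000, random_state=0)
mlp.fit(X[:500], y[:500])
print("hold-out R^2:", round(mlp.score(X[500:], y[500:]), 3))
```

Three hidden neurons mirror the count the study settles on; in a real application the antecedent sub-series values x[t-1], ..., x[t-j] would be appended as additional inputs, as in Table 1.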
[Figure 4. Map of the study area, marking the Khanapur raingauge, the OB well and the GD station.]
The decomposition process can be iterated, with successive approximations being decomposed in turn, so that one signal is broken down into many lower-resolution components; scales from 1 to 10 with different sliding window amplitudes were tested. In this context, dealing with a very irregular signal shape, an irregular wavelet, the Daubechies wavelet of order 5 (DB5), was used at level 3. Consequently, D1, D2 and D3 were the detail time series and A3 was the approximation time series.

An ANN was constructed in which the sub-series {D1, D2, D3, A3} at time t are the inputs and the original time series at time t + T is the output, where T is the forecast lead time. The input nodes are the antecedent values of the time series, as presented in Table 1. The Wavelet Neural Network (WNN) model was formed in which the weights are learned with a feed-forward neural network using the back-propagation algorithm. The number of hidden neurons for the BPNN was determined by a trial-and-error procedure.

Table 1. Model inputs.

Model I:   X(t) = f(x[t-1])
Model II:  X(t) = f(x[t-1], x[t-2])
Model III: X(t) = f(x[t-1], x[t-2], x[t-3])
Model IV:  X(t) = f(x[t-1], x[t-2], x[t-3], x[t-4])
Model V:   X(t) = f(x[t-1], x[t-2], x[t-3], x[t-4], x[t-5])

5. Results and Discussion

To forecast the river flow at the Khanapur gauging station of the Malaprabha River (Figure 4), daily stream flow data of 11 years were used. The first seven years (1986-92) of data were used for calibration of the model, and the remaining four years (1993-96) were used for validation. The model inputs (Table 1) were decomposed by wavelets, the decomposed sub-series were taken as inputs to the ANN, and the original river flow value one day ahead was the output. The ANN was trained using back-propagation with the LM algorithm. The optimal number of hidden neurons was determined as three by a trial-and-error procedure.

The performance of the various models estimated to forecast the river flow is presented in Table 2. From Table 2, low RMSE values (8.24 to 18.60 m3/s) were found for the WNN models compared with the ANN and AR(1) models, and the WNN models estimated the peak values of river flow with reasonable accuracy (the peak flow during the study was 774 m3/s). From Table 2, the WNN model with four antecedent values of the time series gave the minimum RMSE (8.24 m3/s), a high correlation coefficient (0.987), the highest efficiency (> 97%) and a high PI value of 0.881 during the validation period. Model IV of the WNN was therefore selected as the best-fit model to forecast the river flow one day in advance.

Table 2. Goodness-of-fit statistics of the forecasted river flow for the calibration and validation periods.

                 Calibration                          Validation
Model      RMSE (cumecs)  R      COE (%)  PI     RMSE (cumecs)  R      COE (%)  PI
WNN
Model I        17.76     0.952    90.76   0.627      18.60     0.931    86.69   0.395
Model II        8.66     0.989    97.80   0.911      10.46     0.979    95.78   0.809
Model III       9.44     0.986    97.38   0.894      10.33     0.979    95.89   0.814
Model IV        6.07     0.995    98.91   0.956       8.18     0.987    97.38   0.881
Model V         5.90     0.994    98.97   0.959      10.01     0.980    96.14   0.825
ANN
Model I        26.09     0.894    80.07   0.195      23.45     0.889    78.86   0.039
Model II       25.39     0.900    81.13   0.238      23.15     0.893    79.40   0.064
Model III      26.07     0.895    80.10   0.196      23.17     0.893    79.36   0.062
Model IV       25.96     0.896    80.27   0.203      23.16     0.898    79.37   0.062
Model V        25.32     0.901    81.23   0.242      23.73     0.887    78.35   0.061

An analysis to assess the potential of each model to preserve the statistical properties of the observed flow series reveals that the flow series computed by the WNN model reproduces the first three statistical moments (i.e. mean, standard deviation and skewness) better than that computed by the ANN model. The values of the first three moments for the observed and modeled flow series during the validation period are presented in Table 3 for comparison. AR model performance is not presented in Table 3 because of its low efficiency compared with the other models. Table 3 also gives the percentage error in the annual peak flow estimates for the validation period for both models. From Table 3 it is found that the WNN model improves the annual peak flow estimates, with the error limited to 18%, whereas the ANN model tends to underestimate the peak flow, with up to 55% error in peak estimation.

Table 3. Statistical moments of the observed and modeled river flow series during the validation period.

Statistical moment         Year   Observed (cumecs)    WNN      ANN
Mean                       1993        45.84           46.00    46.06
                           1994        64.23           63.78    63.12
                           1995        30.92           31.98    31.25
                           1996        31.08           31.21    33.02
Standard deviation         1993        65.67           63.83    61.01
                           1994        85.32           82.14    77.36
                           1995        54.89           56.39    53.76
                           1996        40.94           40.38    44.69
Skewness                   1993         2.85            2.79     2.09
                           1994         1.98            1.81     1.45
                           1995         3.20            3.29     2.84
                           1996         2.21            2.27     2.26
Peak value (% error        1993       432.00           -0.49   -28.53
in peak estimation)        1994       430.61           -9.23   -35.62
                           1995       360.65          -17.36   -30.88
                           1996       224.65            2.50   -54.80

Figure 5 shows the observed and modeled hydrographs for the WNN and ANN models. The values modeled by the WNN model matched the observed values closely, whereas the ANN model underestimated them. The distribution of the error along the magnitude of river flow computed by the WNN and ANN models during the validation period is presented in Figure 6: the estimation of peak flow by the WNN model is very good, as its error is minimal compared with that of the ANN model. Figure 7 shows the scatter plot between the observed and modeled flows for the WNN and ANN models; the flows forecasted by the WNN models lie very close to the 45-degree line. From this analysis, it is worth mentioning that the performance of the WNN model was much better than that of the ANN and AR models in forecasting the river flow one day in advance.

6. Conclusions

This paper reports a hybrid model, the wavelet-based neural network (WNN) model, for time series modeling of river flow. The proposed model is a combination of wavelet analysis and an artificial neural network. The wavelet transform decomposes the time series into multiple levels of detail, supporting multi-resolution analysis.
Figure 5. Plot of observed and modeled hydrographs for (a) WNN and (b) ANN models for the validation period.

Figure 6. Distribution of error plots along the magnitude of river flow for (a) WNN model and (b) ANN model during the validation period.
It can effectively diagnose the main frequency component of the signal and abstract local information from the time series. The proposed WNN model was applied to the daily river flow of the Malaprabha river basin in Belgaum district of Karnataka State, India. The time series data of river flow were decomposed into sub-series by the DWT; each sub-series plays a distinct role in the original time series. The appropriate sub-series of the variable were used as inputs to the ANN model, with the original time series of the variable as output. From the current study it is found that the proposed wavelet neural network model is better at forecasting the river flow in the Malaprabha basin. In the analysis, the original signals are represented at different resolutions by the discrete wavelet transformation; therefore, the WNN forecasts are more accurate than those obtained directly from the original signals.

Figure 7. Scatter plot between observed and modeled river flow during the validation period.

7. References

[1] H. Raman and N. Sunil Kumar, "Multivariate Modeling of Water Resources Time Series Using Artificial Neural Networks," Journal of Hydrological Sciences, Vol. 40, No. 4, 1995, pp. 145-163. doi:10.1080/02626669509491401

[2] H. R. Maier and G. C. Dandy, "Determining Inputs for Neural Network Models of Multivariate Time Series," Microcomputers in Civil Engineering, Vol. 12, 1997, pp. 353-368.

[3] M. C. Deo and K. Thirumalaiah, "Real Time Forecasting Using Neural Networks," In: R. S. Govindaraju and A. Ramachandra Rao, Eds., Artificial Neural Networks in Hydrology, Kluwer Academic Publishers, Dordrecht, 2000, pp. 53-71.

[4] B. Fernandez and J. D. Salas, "Periodic Gamma Autoregressive Processes for Operational Hydrology," Water Resources Research, Vol. 22, No. 10, 1986, pp. 1385-1396. doi:10.1029/WR022i010p01385

[5] S. L. S. Jacoby, "A Mathematical Model for Non-Linear Hydrologic Systems," Journal of Geophysics Research, Vol. 71, No. 20, 1966, pp. 4811-4824.

[6] J. Amorocho and A. Brandstetter, "A Critique of Current Methods of Hydrologic Systems Investigations," Eos Transactions of AGU, Vol. 45, 1971, pp. 307-321.

[7] S. Ikeda, M. Ochiai and Y. Sawaragi, "Sequential GMDH Algorithm and Its Applications to River Flow Prediction," IEEE Transactions of System Management and Cybernetics, Vol. 6, No. 7, 1976, pp. 473-479. doi:10.1109/TSMC.1976.4309532

[8] ASCE Task Committee, "Artificial Neural Networks in Hydrology-I: Preliminary Concepts," Journal of Hydrologic Engineering, Vol. 5, No. 2, 2000(a), pp. 115-123.

[9] ASCE Task Committee, "Artificial Neural Networks in Hydrology-II: Hydrologic Applications," Journal of Hydrologic Engineering, Vol. 5, No. 2, 2000(b), pp. 124-137.

[10] D. W. Dawson and R. Wilby, "Hydrological Modeling Using Artificial Neural Networks," Progress in Physical Geography, Vol. 25, No. 1, 2001, pp. 80-108.

[11] S. Birikundavy, R. Labib, H. T. Trung and J. Rousselle, "Performance of Neural Networks in Daily Stream Flow Forecasting," Journal of Hydrologic Engineering, Vol. 7, No. 5, 2002, pp. 392-398. doi:10.1061/(ASCE)1084-0699(2002)7:5(392)

[12] P. Hettiarachchi, M. J. Hall and A. W. Minns, "The Extrapolation of Artificial Neural Networks for the Modeling of Rainfall-Runoff Relationships," Journal of Hydroinformatics, Vol. 7, No. 4, 2005, pp. 291-296.

[13] E. J. Coppola, M. Poulton, E. Charles, J. Dustman and F. Szidarovszky, "Application of Artificial Neural Networks to Complex Groundwater Problems," Journal of Natural Resources Research, Vol. 12, No. 4, 2003(a), pp. 303-320.

[14] E. J. Coppola, F. Szidarovszky, M. Poulton and E. Charles, "Artificial Neural Network Approach for Predicting Transient Water Levels in a Multilayered Groundwater System under Variable State, Pumping and Climate Conditions," Journal of Hydrologic Engineering, Vol. 8, No. 6, 2003(b), pp. 348-359.

[15] P. C. Nayak, Y. R. Satyaji Rao and K. P. Sudheer, "Groundwater Level Forecasting in a Shallow Aquifer Using Artificial Neural Network Approach," Water Resources Management, Vol. 20, No. 1, 2006, pp. 77-90. doi:10.1007/s11269-006-4007-z

[16] B. Krishna, Y. R. Satyaji Rao and T. Vijaya, "Modeling Groundwater Levels in an Urban Coastal Aquifer Using Artificial Neural Networks," Hydrological Processes, Vol. 22, No. 12, 2008, pp. 1180-1188. doi:10.1002/hyp.6686

[17] D. Wang and J. Ding, "Wavelet Network Model and Its