Visual Analysis of STime Data Predictions With DL Models
Article
Visual Analysis of Spatiotemporal Data Predictions with Deep
Learning Models
Hyesook Son 1 , Seokyeon Kim 1 , Hanbyul Yeon 1 , Yejin Kim 1 , Yun Jang 1, * and Seung-Eock Kim 2
1 Computer Engineering and Convergence Engineering for Intelligent Drone, Sejong University,
Seoul 05006, Korea; [email protected] (H.S.); [email protected] (S.K.); [email protected] (H.Y.);
[email protected] (Y.K.)
2 Civil and Environmental Engineering, Sejong University, Seoul 05006, Korea; [email protected]
* Correspondence: [email protected]
Abstract: A deep learning model delivers different predictions depending on its input, and the
characteristics of that input can strongly affect the output. When predicting data that are
measured with sensors in multiple locations, it is necessary to train a deep learning model with
spatiotemporal characteristics of the data. Additionally, since not all of the data measured
together increase the accuracy of the deep learning model, we need to utilize the correlation
characteristics between the data features. However, it is difficult to interpret how the deep
learning output depends on the input characteristics. Therefore, it is necessary to analyze how
the input characteristics affect prediction results to interpret deep learning models. In this
paper, we propose a visualization system to analyze deep learning models with air pollution data.
The proposed system visualizes the predictions according to the input characteristics. The input
characteristics include space-time and data features, and we apply temporal prediction networks,
including gated recurrent units (GRU) and long short-term memory (LSTM), and a spatiotemporal
prediction network (convolutional LSTM) as deep learning models. We interpret the output
according to the characteristics of input to show the effectiveness of the system.
Keywords: spatiotemporal; air quality; deep learning
Citation: Son, H.; Kim, S.; Yeon, H.; Kim, Y.; Jang, Y.; Kim, S.-E. Visual Analysis of
Spatiotemporal Data Predictions with Deep Learning Models. Appl. Sci. 2021, 11, 5853.
https://doi.org/10.3390/app11135853
Academic Editor: Kwan-Hee Yoo
Received: 16 May 2021
Accepted: 21 June 2021
Published: 24 June 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open
access article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Spatiotemporal data contain feature information, such as temporal and spatial information,
at the same time [1]. Therefore, spatiotemporal correlation patterns are often
utilized together in prediction models. Spatiotemporal prediction models are applied in
various fields, such as traffic, weather, social media, flights, and human migration. However,
creating a prediction model is challenging because each field has a different degree
and type of spatiotemporal correlation and complexity [2]. Different means of recording
spatiotemporal data and different data formats make predictions more complicated. Radar
echo data and air pollutant data have different recording schemes and data formats. Radar
echo data are signals reflected from objects, such as raindrops. Radar echo data sets can
be collected in the form of a two-dimensional image sequence in a regular grid. On the
other hand, air pollutant data are recorded with air-condition information from sensors.
Most air pollutant data are continuously recorded in time but have uneven spatial information,
due to irregular sensor locations, which is more complicated for spatiotemporal
pattern extraction.
In machine learning [3], the machine is trained using data and algorithms to learn
how to perform a task. Deep learning [4] is considered an evolution of machine learning,
which uses a programmable neural network that empowers the machine to make decisions
without guidance from humans. There are two methods in machine learning, including
supervised learning and unsupervised learning. The main difference between these two
is the use of labeled data sets. Supervised learning utilizes labeled input and output
data, while unsupervised learning does not. Deep learning models can be applied for
temporal pattern prediction. Typically, recurrent neural networks (RNNs) use recurrent
computations to train temporal patterns from historical sequence information and produce
predictions. Many studies have predicted spatiotemporal data with gated
recurrent unit (GRU) networks and long short-term memory (LSTM) networks with RNN
structures [5–7], which have a looping constraint on the hidden layer of the artificial neural
network (ANN). Preprocessing is required to feed spatiotemporal data into
RNN architectures. Since RNNs do not consider the spatial structure, the spatial
information within the data may be dropped during the preprocessing.
A spatiotemporal predictive deep learning model was proposed to resolve the problem
that RNNs do not consider the spatial structure. The convolutional LSTM network [8]
recognizes the spatiotemporal correlation by combining the LSTM layer and the convolutional
layer. Although this deep learning model predicts spatiotemporal data adequately,
it is difficult to understand how the incorporation of spatial information in the input data
can improve the predictive performance of the deep learning model, just by reviewing the
accuracy. Since the spatial information contained in each feature of the data is different,
the prediction performance also varies, according to the feature selection. In addition to
the feature selection, the incorporation of spatial information, such as grid structure, also
affects the deep learning performance. Therefore, it is challenging to interpret deep learning
results that depend on input characteristics such as feature selection, temporal correlation,
and spatial correlation. The more difficult the deep learning model is to interpret, the more
time-consuming the modeling process is. Hence, it is necessary to develop a system that
allows us to quickly interpret the output of the deep learning model, according to the input
characteristics. The contributions of our work are as follows.
• We develop a visualization system to support the interpretation of outputs from deep
learning models.
• We propose multiple feature selection functionalities with temporal and spatial infor-
mation.
• Our system enables us to perform prediction modeling by visualizing information,
such as correlations between variables, temporal autocorrelation, and spatial
autocorrelation.
• We evaluate our system through prediction modeling for a spatiotemporal air pollu-
tant data set.
We expect that our system supports us in understanding deep learning modeling and
exploring the results with data and parameters interactively for prediction improvements.
2. Related Work
Many researchers desire to understand how deep learning models are trained, how
model representations are interpreted, and how deep learning supports decision making [9].
The idea of model understanding in machine learning is divided into interpretability and
explainability [10]. Interpretability is the understanding of the state transitions that occur while
changing input or algorithm parameters in machine learning models. Explainability is the
interpretation of the internal mechanisms of machine learning models in human-understandable
terms.
In visualization and visual analytics (VA) areas, some studies have been proposed to
support the design and debugging of models by applying VA to an interactive machine
learning workflow [9]. In the area of model interpretation, visual analytics has focused
on understanding the structure of models [11], analyzing the performance of predictive
models [12], identifying misclassified instances [13–15], and comparing the performance
of multiple predictive models [16]. To explain the structure of the model, node-link di-
agrams [17], drawing directed graphs [11], and directed acyclic graphs [18] are applied.
Wongsuphasawat et al. [11] presented a TensorFlow graph visualizer to assist in under-
standing machine learning architectures. Liu et al. [18] proposed a visual analytics system
to understand and diagnose a convolutional neural network, using a directed acyclic
graph. Although many visual analysis systems support machine learning modeling, most
are limited to classification models. Therefore, we believe that our system assists us in
understanding deep learning modeling while improving spatiotemporal predictions.
The performance analysis of the predictive model includes studies to explore the
combination of input features [19] and to improve the quality of the labeled data [13,20].
Xiang et al. [13] introduce a system for correcting false labels in training data, using hi-
erarchical visualization with incremental t-distributed stochastic neighbor embedding
(t-SNE). If we can observe the cause and consequence of the predictive model in interactive
machine learning, the explainable AI (XAI) must be able to analyze why the model makes
such a decision [21]. To understand the internal mechanism, researchers detect errors or
weight changes observed in specific output changes during the learning process based
on the performance metrics [22]. Comprehensive theoretical studies of the role of visual
analytics in deep learning have been conducted, and it is possible to interpret various deep
learning models, such as CNN [23], DNN [24], RNN [25,26], LSTM [27,28], and DQN [29].
Spinner et al. [22] also presented an interactive and explainable visual analytics framework
for understanding machine learning models. They can diagnose and improve the limita-
tions of the designed model through quality monitoring, provenance tracking, and model
comparison in the TensorBoard environment.
In the field of statistics, time-series data predictions are mainly performed with the
autoregressive model, moving average model, and autoregressive integrated moving average (ARIMA)
model. In machine learning studies, the RNN and LSTM are known to be suitable for time
series prediction. LSTM models can be constructed according to the layer layout, structure,
connectivity, and combination with other neural networks. Typical LSTM models are
Vanilla LSTM [30], Stacked LSTM [31], Bidirectional LSTM [32], etc. Although the LSTM
model generally outperforms the ARIMA model in time series prediction [33], the ARIMA
model outperforms the LSTM in time series data with strong seasonal factors [34]. Studies
for the interpretation of LSTMs and RNNs were published in the visual analytics com-
munity. Tang et al. [35] visualized the behavior of LSTM and GRU in speech recognition
and presented that LSTM has long-term memory but is more sensitive to noise than RNN.
Strobelt et al. [36] provided a visual tool to improve the performance of LSTM models
with the exploration and summarization of long-term dependencies in time series and
sequence data. Since our data have temporal features, we employ LSTM and GRU for deep
learning modeling.
Spatial interpolation estimates the unobserved data inside the sampled area with the
observed data [37]. Spatial interpolation is generally applied for visualization, mainly by
computing the pixel values from pixel-based data [38]. Many algorithms were developed
for interpolation, including nearest-neighbor interpolation, bilinear interpolation, and
bicubic interpolation [39]. Inverse distance weighted (IDW) interpolation assumes that values
become more similar as the data points become closer to each other [40]. IDW interpolation
estimates the value of an unknown point by weighting each observation inversely with its
distance [41]. IDW interpolation assigns continuous weights to all observations, while
nearest-neighbor interpolation assigns a weight of 1 only to the nearest data point. Linear
interpolation is a simple interpolation that estimates data linearly. We
can use cubic interpolation to reduce the discontinuities caused by linear interpolation.
Cubic interpolation produces smoother data than linear interpolation or nearest-neighbor
interpolation. As a high-order interpolation, radial basis function (RBF) is employed for
more accurate interpolation of unstructured data. The RBF interpolation can be constructed
in an artificial neural network by using RBFs as activation functions [42]. In this work, we
apply cubic, linear RBF, and nearest-neighbor techniques for spatial interpolation.
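To make the weighting schemes described above concrete, the following minimal Python sketch contrasts IDW with nearest-neighbor interpolation. It is an illustration under stated assumptions, not the implementation used in this work: the function names, the power parameter of 2, and the two sample stations are hypothetical.

```python
import math


def idw_interpolate(stations, query, power=2.0):
    """Estimate the value at `query` from (x, y, value) station tuples.

    Each station is weighted by 1 / distance**power (power=2 is an assumed default).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            return value  # the query coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def nearest_neighbor(stations, query):
    """Return the value of the closest station (weight 1 on it, 0 elsewhere)."""
    closest = min(stations, key=lambda s: math.hypot(query[0] - s[0], query[1] - s[1]))
    return closest[2]


# Hypothetical stations: (x, y, PM value)
stations = [(0.0, 0.0, 10.0), (4.0, 0.0, 30.0)]
print(idw_interpolate(stations, (1.0, 0.0)))   # blends both stations
print(nearest_neighbor(stations, (1.0, 0.0)))  # copies the closest station
```

With these two stations, IDW pulls the estimate toward the closer station while still using the farther one, whereas nearest-neighbor interpolation ignores the farther station entirely.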
Prediction of spatiotemporal data is generally performed considering both the tempo-
ral and spatial feature points. Deep learning algorithms that are mainly used for space-time
data prediction include LCRN [43] and convolutional LSTM (ConvLSTM) [8]. LCRN has
a structure in which CNN and LSTM are sequentially connected. In the LCRN structure,
the spatiotemporal data inputs are trained for the spatial feature points with the CNN
and the temporal feature points with the LSTM. Johan et al. [44] presented PVNet, using
the LCRN structure. PVNet predicts photovoltaic power by training numerical weather
information, including irradiance, cloud, temperature, the clear sky model and a power
model, calculated with the persistence model. LCRN contains a sequential connection
structure between CNN and LSTM, while ConvLSTM includes convolution operations
within the cells of LSTM. ConvLSTM trains spatiotemporal data by performing convolution
operations as soon as input data are inserted into LSTM cells. ConvLSTM has faster com-
putational speed and has higher performance than LCRN in many studies. Yuan et al. [45]
conducted a study on the traffic accident prediction problem, using the ConvLSTM model.
They predicted data by applying a spatial ensemble to the results predicted by ConvLSTM.
The proposed model shows a much higher prediction accuracy than the conventional
method. He et al. [46] proposed STCNN using ConvLSTM for long-term traffic predictions.
The proposed model combines the weekly ConvLSTM prediction result and the daily
Skip-ConvLSTM prediction result for CNN training to identify the periodic pattern of
traffic. Lin et al. [47] proposed a ConvLSTM-based spatiotemporal temperature deviation
prediction model (PredTemp). They compared the predictions with ConvLSTM, using tem-
perature deviation data, and with ConvLSTM, using both precipitation and temperature
deviation data. To utilize spatiotemporal features, we also include ConvLSTM for deep
learning modeling.
3. Data Description
Particulate matter (PM) is a particle that is generated naturally or artificially and is
contained in the air as an aerosol. The most commonly used PM parameters include PM10 ,
whose diameter is 10 micrometers or less, and PM2.5 , whose diameter is 2.5 micrometers
or less. PM is a fine particle that floats in the air and is a respirable substance that has a
significant impact on health. Many countries around the world treat PM as an environmental
issue. In October 2013, the World Health Organization (WHO) and the International
Agency for Research on Cancer (IARC) classified PM as a Group 1 carcinogen, due to its
high toxicity. According to the State of Global Air [48] released in 2018, 33.7% of the world
was exposed to household air pollution in 2016, and the death toll associated with PM2.5
reached 4.1 million by 2016.
PM tends to float in the air and propagate with the flow of the atmosphere. The smaller
the particles, the longer they stay in the air. The diffusion rate varies depending on the particle
composition. PM forecasting is a challenge in climate forecasting, as PM shows different
patterns depending on the climate of each country. PM data are the densities of
particulate matter, such as PM2.5 and PM10, collected from ground stations. In general, it is
desirable for the stations to be evenly distributed throughout the country, but they usually
tend to be concentrated in major cities and towns. The distribution is therefore not uniform,
which makes it challenging to predict such spatiotemporal data.
In this paper, we compare the performances of deep learning models to predict
air pollutant data as spatiotemporal data. We utilize air pollutant data provided by
kweather [49]. Data were collected from 413 discrete stations in Seoul, South Korea.
The collected data include PM2.5 , PM10 , noise, temperature, and humidity, and we utilize
data that were measured every hour for 75 days from 5 September 2019, to 18 November
2019. We examined the missing data as preprocessing and removed 16 days of data. We
also scaled all the data, using min–max scaling. To properly apply deep learning models,
the models are trained with the training data, and the model parameters are tuned with the
validation data. Then, the model performance is evaluated with the test data, which are
unbiased. We randomly separated the data sets into 991, 212, and 213 h for the training data
set, validation data set, and test data set, respectively, at a ratio of 7:1.5:1.5. In this paper,
we design PM prediction models using these data sets and compare the PM prediction
performance depending on the data feature selection and temporal and spatial correlations
with deep learning models.
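The preprocessing arithmetic above can be checked with a short Python sketch. It is only an illustration: the helper names, the fixed random seed, and the choice to split by hourly index are assumptions, and the exact splitting procedure used for the experiments is not specified beyond being random.

```python
import random


def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]


def split_hours(n_hours, ratios=(0.7, 0.15, 0.15), seed=0):
    """Randomly split hourly indices into training, validation, and test sets."""
    indices = list(range(n_hours))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, assumed choice
    n_train = int(n_hours * ratios[0])
    n_val = int(n_hours * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


n_hours = (75 - 16) * 24  # 75 days minus 16 removed days, measured hourly
train, val, test = split_hours(n_hours)
print(n_hours, len(train), len(val), len(test))
print(min_max_scale([12.0, 30.0, 48.0]))
```

The 59 remaining days give 1416 hourly records, and a 7:1.5:1.5 split with the remainder assigned to the test set reproduces the 991, 212, and 213 h reported above.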
state h_{t−1}. Additionally, b_f, b_i, b_o, and b_g are biases for the four layers. The ⊗ is an
element-wise matrix multiplication. The current short-term state h_t is affected by the long-term
state c_{t−1}, and the current long-term state c_t is calculated based on the long-term state
c_{t−1} at the previous time and the input gate i_t at the present time. LSTM resolves the
long-term dependence problem in RNN by transmitting the long-term state and prevents
the vanishing of the gradient, using tanh as a cell activation function.
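The state updates described above can be sketched as a single LSTM step in plain Python. This follows a commonly used formulation and is offered as a hedged illustration: the helper names, the parameter layout (one tuple of input weights, recurrent weights, and bias per layer), and the zero-valued example weights are assumptions rather than the configuration used in this work.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W_x, W_h, b, x, h):
    """Compute W_x·x + W_h·h + b for list-of-lists weight matrices."""
    return [
        sum(w * xi for w, xi in zip(row_x, x))
        + sum(w * hi for w, hi in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'o', 'g' to (W_x, W_h, b)."""
    f = [sigmoid(v) for v in affine(*params["f"], x, h_prev)]    # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], x, h_prev)]    # input gate
    o = [sigmoid(v) for v in affine(*params["o"], x, h_prev)]    # output gate
    g = [math.tanh(v) for v in affine(*params["g"], x, h_prev)]  # candidate values
    # Long-term state: keep part of c_prev and add the gated candidate.
    c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c_prev, i, g)]
    # Short-term state: gated tanh of the new long-term state.
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


# Hypothetical example: 2 input features, 1 hidden unit, all weights zero.
zero = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero for name in "fiog"}
h, c = lstm_step([0.3, 0.7], [0.0], [2.0], params)
print(h, c)
```

With all weights at zero every gate evaluates to 0.5 and the candidate to 0, so the long-term state is simply halved, which makes the role of the forget gate easy to see.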
The GRU algorithm utilizes only one state vector h_t and controls both the forget gate
and input gate with one gate controller, z_t. The GRU is presented as follows [51].
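One common formulation of a GRU step can be sketched in plain Python as follows. This is a hedged illustration and not necessarily the exact form given in [51]: the helper names, the parameter layout, and the convention that z_t weights the previous state while (1 − z_t) weights the candidate are assumptions, and some references swap the roles of z_t and (1 − z_t).

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W_x, W_h, b, x, h):
    """Compute W_x·x + W_h·h + b for list-of-lists weight matrices."""
    return [
        sum(w * xi for w, xi in zip(row_x, x))
        + sum(w * hi for w, hi in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def gru_step(x, h_prev, params):
    """One GRU step; params maps 'z', 'r', 'g' to (W_x, W_h, b)."""
    z = [sigmoid(v) for v in affine(*params["z"], x, h_prev)]  # single gate controller
    r = [sigmoid(v) for v in affine(*params["r"], x, h_prev)]  # reset gate
    gated_h = [rv * hv for rv, hv in zip(r, h_prev)]
    g = [math.tanh(v) for v in affine(*params["g"], x, gated_h)]  # candidate state
    # z acts as the forget gate and (1 - z) as the input gate (assumed convention).
    return [zv * hv + (1.0 - zv) * gv for zv, hv, gv in zip(z, h_prev, g)]


# Hypothetical example: 2 input features, 1 hidden unit, all weights zero.
zero = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero for name in "zrg"}
h = gru_step([0.3, 0.7], [0.8], params)
print(h)
```

Because one controller sets both how much of the previous state is kept and how much of the candidate is written, the GRU needs only a single state vector.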
technique, and these techniques usually produce excellent approximations for regularly
distributed stations.
[Figure 1 image: panels (a)–(k). Legends: model accuracy (d); LISA (b, j); prediction (f); ground truth (g); residual (h).]
Figure 1. Our visualization system for analyzing deep learning models. (a) is the scatterplot of
the correlation and probability distribution between input variables. (b) shows spatial autocorrela-
tion (Moran’s I) of the selected variable. (c) presents line density map with temporal autocorrelation.
The Sankey diagram supports the modeling of the spatiotemporal prediction by combining features,
deep learning models, and interpolation models in (d). (e) presents our prediction modeling param-
eter settings. (f) presents interpolated predictions with the nearest neighbor algorithm. (g) shows
the observed data. (h) presents the errors between the observed data and predictions. (i) shows the
standard deviation of prediction over time. (j) presents the LISA visualization. (k) The box plots
show temporal predictions with the actual observed values.
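For readers unfamiliar with the spatial statistics named in the caption, the following Python sketch computes global Moran's I and the local Moran statistic that underlies LISA maps. It is a minimal illustration using the standard definitions: the binary adjacency weights and the four sample stations are hypothetical, and the spatial weights used in the system are not specified here.

```python
def morans_i(values, weights):
    """Global Moran's I for station values and a spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


def local_morans_i(values, weights):
    """Local Moran's I per station (the statistic behind LISA maps)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n
    return [dev[i] / m2 * sum(weights[i][j] * dev[j] for j in range(n)) for i in range(n)]


# Hypothetical layout: four stations in a row, each adjacent to its neighbors.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, 5.0, 5.0]    # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]  # neighbors always differ
print(morans_i(clustered, w), morans_i(alternating, w))
print(local_morans_i(clustered, w))
```

Positive values indicate that similar measurements cluster in space, and negative values indicate that neighboring stations tend to differ.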
5.1. Analysis Based on Correlation and Time Lag Settings at Initial State
First of all, the correlations between variables can be identified in the scatter plot
matrix in (a). The scatter plot shows the features that correlate strongly with the PM2.5
that we attempt to predict. The Pearson correlation coefficient between PM2.5 and PM10
is close to 1, and the scatter plot shows a strong linear correlation, which confirms that
PM10 has the highest correlation with PM2.5 . Therefore, we can attempt to predict PM2.5
by inserting PM2.5 and PM10 features together in the GRU network and the LSTM network.
Our system supports three time lags as an input time range, including 6, 24, and 72 h. The
results are summarized in Table 1. Overall, it is difficult to say that any of the six network
models has good predictive performance. Note that we observe the high correlation between
PM2.5 and PM10 within our data, and this is also reported in the study by Zhou et al. [54].
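The correlation check described above can be sketched with a plain Python implementation of the Pearson coefficient. The sample readings below are invented for illustration only and are not taken from the data set; the function and variable names are likewise hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented sample readings, for illustration only.
features = {
    "PM10": [15.0, 32.0, 44.0, 61.0, 70.0],
    "temperature": [21.0, 19.0, 20.0, 17.0, 18.0],
    "humidity": [40.0, 55.0, 50.0, 62.0, 58.0],
}
pm25 = [10.0, 20.0, 30.0, 40.0, 50.0]

# Rank candidate input features by the strength of their linear relation to PM2.5.
ranking = sorted(features, key=lambda name: abs(pearson(features[name], pm25)), reverse=True)
print(ranking)
```

Ranking by the absolute coefficient treats strong negative and strong positive linear relations alike, which is one reasonable way to shortlist input features.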
Now, we compare the model performance with different time lags. In both GRU and
LSTM networks, when only the parameters of PM2.5 and PM10 are selected, setting the
time lag to 6 h produces lower MAPE than 24 or 72 h. Since the visualization shown in
Figure 2 is proposed to set an appropriate time lag, we check that the autocorrelation of
each variable changes according to the time lag. We observe the temporal autocorrelation
graphs of PM2.5 and PM10 in Figure 2 to infer the cause for these results. Since the temporal
autocorrelation of PM2.5 and PM10 mostly decreases as the time lag grows, we can infer that the
accuracy tends to decrease for long time lags. In other words, when only these two variables are
used, including data from far in the past may degrade the prediction performance.
We can try two approaches to improve the performance of the GRU and LSTM. First,
the models are fixed with GRU and LSTM and features are reselected for the training.
Second, we fix the selected features and apply another model, such as the ConvLSTM.
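To illustrate what the time lag setting means for the network input, the following Python sketch builds input windows of a given length and pairs each with the next value as the prediction target. The function name and the synthetic series are assumptions; only the lag values of 6, 24, and 72 h come from the text.

```python
def make_windows(series, time_lag):
    """Pair each run of `time_lag` past values with the value that follows it."""
    inputs, targets = [], []
    for end in range(time_lag, len(series)):
        inputs.append(series[end - time_lag:end])  # the previous `time_lag` hours
        targets.append(series[end])                # the hour to predict
    return inputs, targets


# Synthetic hourly series, for illustration only.
series = [float(t) for t in range(100)]
for lag in (6, 24, 72):
    inputs, targets = make_windows(series, lag)
    print(lag, len(inputs), len(inputs[0]))
```

A longer lag gives each sample more history but leaves fewer training samples from the same series, which is one practical cost of a large time lag.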
Table 1. Prediction accuracy of different time lags and models with PM2.5 and PM10 for gated recur-
rent units (GRU) and long short term memory (LSTM) with mean absolute percentage error (MAPE).
Figure 2. The temporal autocorrelations of all variables are visualized over time lags. Humidity,
noise, and temperature tend to have high autocorrelations every 24 h. However, PM2.5 and PM10 do
not have repeated temporal autocorrelations.
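The temporal autocorrelation plotted over time lags can be computed as in the following Python sketch. It uses the standard sample autocorrelation and a synthetic series with a 24 h cycle; the function name and the series are assumptions made for illustration, not the measured data.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Synthetic signal with a daily cycle, standing in for a variable such as temperature.
daily = [math.sin(2.0 * math.pi * t / 24.0) for t in range(240)]
for lag in (6, 12, 24, 48):
    print(lag, round(autocorrelation(daily, lag), 3))
```

A variable with a daily rhythm shows autocorrelation peaks at multiples of 24 h, whereas a variable without such a rhythm shows no repeated peaks.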
the prediction. Therefore, we train PM2.5 again with temperature and humidity features,
which have high linear coefficients next to PM10 . The results are summarized in Table 2 and
visualized in Figure 3. We observe that the model with PM2.5 , humidity, and temperature
produces more accurate prediction than one with only PM2.5 and PM10 as presented in
Figure 3a,b. The fixed model with the same features predicts PM2.5 differently according
to the time lags, as shown in Figure 3c–e. Although the average MAPE with the time lag
of 6 h is lower than that with the time lag of 24 h, we observe that the time lag of 24 h
produces lower errors overall in the map visualizations.
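The MAPE used to compare the models can be computed as in the following Python sketch. The function name and the sample numbers are hypothetical, and the sketch assumes that no observed value is zero, since MAPE is undefined in that case.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (assumes no actual value is zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Invented observations and predictions, for illustration only.
observed = [100.0, 200.0, 50.0]
forecast = [110.0, 180.0, 55.0]
print(mape(observed, forecast))
```

Because MAPE is an average over all stations and hours, a lower average does not rule out larger errors at individual locations, which is why the map views are inspected alongside it.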
Table 2. Prediction accuracy of different time lags and models with PM2.5, temperature, and humidity
for gated recurrent units (GRU) and long short-term memory (LSTM) with mean absolute percentage
error (MAPE).
Figure 3. The visualizations of PM2.5 predictions. (a,b) Visualizations of prediction results with
different features. (c–e) Results with different time lags.
In the results after selecting the new feature set, we observe that the MAPE becomes
smaller, compared to the previous feature selection. One reason for this is that duplicated
6. Discussions
In this paper, we propose an approach to select the appropriate features and deep
learning model by analyzing correlations, spatial correlations, and temporal correlations for
spatiotemporal data prediction. We evaluate our system with spatiotemporal air pollution
data to generate the prediction model. We take the past data (t_1, ..., t_{n−1}) as input and
predict the current data at t_n as an output. The prediction results are compared in the
map visualizations. The evaluation in Section 5 is intended to demonstrate the deep learning
modeling procedure that improves the prediction results through the system. Note that we
show the modeling procedure rather than the best results in this paper. The limitations of
our approach are as follows.
For feature selection, our system provides the Pearson correlations between variables,
temporal autocorrelation with the time lag, and spatial autocorrelation with LISA visual-
ization. However, the extension to spatial filtering and feature extraction during the data
analysis can enhance the quality of feature selection. Although our approach can be useful
for identifying and predicting global trends in the overall data, our system tends to neglect
the local characteristics. For example, we can filter the areas by considering geographic
characteristics and environmental conditions. In the case of PM2.5 , the frequency of occur-
rence may vary according to the density of factories in neighboring areas, and the diffusion
7. Conclusions
In this paper, we proposed a visualization system that can analyze deep learning
models. We proposed an approach to select the appropriate features and deep learning
model by analyzing correlations, spatial correlations, and temporal correlations for
spatiotemporal data prediction. We analyzed deep learning-based prediction models with an
air pollutant data set, which represents an irregularly distributed spatiotemporal data set.
Our system allows us to explain the reason for the low performance of a deep learning
model in the aspect of spatial and temporal correlations. We believe that our approach
supports us in understanding the parameter settings and improving deep learning models
for spatiotemporal data. It is possible to extend our system to include more deep learn-
ing models and explain the predicted results, which is crucial in deep learning research.
However, our model has some limitations, including the lack of feature extraction and of
hyperparameter setting for deep learning networks. To overcome these limitations, we plan to
add spatial filtering, apply feature extraction techniques, including PCA, LDA, and t-SNE,
and apply a DCRNN architecture by transforming the extracted feature into a directional
graph form. We also plan to apply the RBF network to the ConvLSTM neural network in
the future.
Author Contributions: All authors contributed to this study. H.S., S.K., H.Y., and Y.K. developed
the system and wrote the article. S.-E.K. and Y.J. supervised the project and wrote the article. All
authors have read and agreed to the published version of the manuscript.
Funding: This work was supported in part by the Basic Research Program through the National
Research Foundation of Korea (NRF) funded by the MSIT (2019R1A4A1021702) and in part by
Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded
by the Korea government (MSIT) (No. 2019-0-00374, Development of Big data and AI based Energy
New Industry type Distributed resource Brokerage System).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Cressie, N.; Shi, T.; Kang, E.L. Fixed rank filtering for spatio-temporal data. J. Comput. Graph. Stat. 2010, 19, 724–745. [CrossRef]
2. Cheng, X.; Zhang, R.; Zhou, J.; Xu, W. Deeptransport: Learning spatial-temporal dependency for traffic condition forecasting. In
Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; IEEE:
Piscataway, NJ, USA, 2018; pp. 1–8.
3. Bishop, C.M. Pattern Recognition and Machine Learning (Information Science and Statistics); Springer: Berlin/Heidelberg,
Germany, 2006.
4. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [CrossRef] [PubMed]
5. Li, X.; Peng, L.; Yao, X.; Cui, S.; Hu, Y.; You, C.; Chi, T. Long short-term memory neural network for air pollutant concentration
predictions: Method development and evaluation. Environ. Pollut. 2017, 231, 997–1004. [CrossRef] [PubMed]
6. Tao, Q.; Liu, F.; Li, Y.; Sidorov, D. Air Pollution Forecasting Using a Deep Learning Model Based on 1D Convnets and Bidirectional
GRU. IEEE Access 2019, 7, 76690–76698. [CrossRef]
7. Huang, C.J.; Kuo, P.H. A deep cnn-lstm model for particulate matter (PM2.5) forecasting in smart cities. Sensors 2018, 18, 2220.
[CrossRef] [PubMed]
8. Xingjian, S.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning
approach for precipitation nowcasting. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC,
Canada, 7–12 December 2015; pp. 802–810.
9. Hohman, F.M.; Kahng, M.; Pienta, R.; Chau, D.H. Visual analytics in deep learning: An interrogative survey for the next frontiers.
IEEE Trans. Vis. Comput. Graph. 2018, 25, 2674–2693. [CrossRef] [PubMed]
10. Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artif. Intell. 2019, 267, 1–38. [CrossRef]
11. Wongsuphasawat, K.; Smilkov, D.; Wexler, J.; Wilson, J.; Mane, D.; Fritz, D.; Krishnan, D.; Viégas, F.B.; Wattenberg, M. Visualizing
dataflow graphs of deep learning models in tensorflow. IEEE Trans. Vis. Comput. Graph. 2017, 24, 1–12. [CrossRef] [PubMed]
12. Dingen, D.; van’t Veer, M.; Houthuizen, P.; Mestrom, E.H.; Korsten, E.H.; Bouwman, A.R.; Van Wijk, J. RegressionExplorer:
Interactive exploration of logistic regression models with subgroup analysis. IEEE Trans. Vis. Comput. Graph. 2018, 25, 246–255.
[CrossRef]
13. Xiang, S.; Ye, X.; Xia, J.; Wu, J.; Chen, Y.; Liu, S. Interactive Correction of Mislabeled Training Data. In Proceedings of the 2019
IEEE Conference on Visual Analytics Science and Technology (VAST), Vancouver, BC, Canada, 20–25 October 2019; pp. 57–68.
14. Migut, M.; Worring, M. Visual exploration of classification models for risk assessment. In Proceedings of the 2010 IEEE
Symposium on Visual Analytics Science and Technology, Salt Lake City, UT, USA, 25–26 October 2010; IEEE: Piscataway, NJ,
USA, 2010; pp. 11–18.
15. Ming, Y.; Qu, H.; Bertini, E. RuleMatrix: Visualizing and understanding classifiers with rules. IEEE Trans. Vis. Comput. Graph.
2018, 25, 342–352. [CrossRef]
16. Yu, W.; Yang, K.; Bai, Y.; Yao, H.; Rui, Y. Visualizing and comparing convolutional neural networks. arXiv 2014, arXiv:1412.6631.
17. Harley, A.W. An interactive node-link visualization of convolutional neural networks. In International Symposium on Visual
Computing; Springer: Cham, Switzerland, 2015; pp. 867–877.
18. Liu, M.; Shi, J.; Li, Z.; Li, C.; Zhu, J.; Liu, S. Towards better analysis of deep convolutional neural networks. IEEE Trans. Vis.
Comput. Graph. 2016, 23, 91–100. [CrossRef] [PubMed]
19. Mühlbacher, T.; Piringer, H. A partition-based framework for building and validating regression models. IEEE Trans. Vis. Comput.
Graph. 2013, 19, 1962–1971. [CrossRef] [PubMed]
20. Bernard, J.; Zeppelzauer, M.; Sedlmair, M.; Aigner, W. VIAL: A unified process for visual interactive labeling. Vis. Comput. 2018,
34, 1189–1207. [CrossRef]
21. Ribeiro, M.T.; Singh, S.; Guestrin, C. Why should i trust you?: Explaining the predictions of any classifier. In Proceedings of the
22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August
2016; ACM: New York, NY, USA, 2016; pp. 1135–1144.
22. Spinner, T.; Schlegel, U.; Schäfer, H.; El-Assady, M. explAIner: A visual analytics framework for interactive and explainable
machine learning. IEEE Trans. Vis. Comput. Graph. 2019, 26, 1064–1074. [CrossRef] [PubMed]
23. Liu, D.; Cui, W.; Jin, K.; Guo, Y.; Qu, H. Deeptracker: Visualizing the training process of convolutional neural networks. ACM
Trans. Intell. Syst. Technol. (TIST) 2019, 10, 6. [CrossRef]
24. Wang, Q.; Yuan, J.; Chen, S.; Su, H.; Qu, H.; Liu, S. Visual Genealogy of Deep Neural Networks. IEEE Trans. Vis. Comput. Graph.
2019, 26, 3340–3352. [CrossRef] [PubMed]
25. Kwon, B.C.; Choi, M.J.; Kim, J.T.; Choi, E.; Kim, Y.B.; Kwon, S.; Sun, J.; Choo, J. RetainVis: Visual analytics with interpretable and
interactive recurrent neural networks on electronic medical records. IEEE Trans. Vis. Comput. Graph. 2018, 25, 299–309. [CrossRef] [PubMed]
26. Ming, Y.; Cao, S.; Zhang, R.; Li, Z.; Chen, Y.; Song, Y.; Qu, H. Understanding hidden memories of recurrent neural networks. In
Proceedings of the 2017 IEEE Conference on Visual Analytics Science and Technology (VAST), Phoenix, AZ, USA, 3–6 October
2017; IEEE: Piscataway, NJ, USA, 2017; pp. 13–24.
27. Ming, Y.; Xu, P.; Cheng, F.; Qu, H.; Ren, L. ProtoSteer: Steering Deep Sequence Model with Prototypes. IEEE Trans. Vis. Comput.
Graph. 2019, 26, 238–248. [CrossRef] [PubMed]
28. Liu, M.; Liu, S.; Su, H.; Cao, K.; Zhu, J. Analyzing the noise robustness of deep neural networks. arXiv 2018, arXiv:1810.03913.
29. Wang, J.; Gou, L.; Shen, H.W.; Yang, H. DQNViz: A visual analytics approach to understand deep Q-networks. IEEE Trans. Vis.
Comput. Graph. 2018, 25, 288–298. [CrossRef] [PubMed]
30. Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures.
Neural Netw. 2005, 18, 602–610. [CrossRef] [PubMed]
31. Cui, Z.; Ke, R.; Pu, Z.; Wang, Y. Deep stacked bidirectional and unidirectional LSTM recurrent neural network for network-wide
traffic speed prediction. In Proceedings of the 6th International Workshop on Urban Computing (UrbComp 2017), Halifax, NS,
Canada, 14 August 2017.
32. Huang, Z.; Xu, W.; Yu, K. Bidirectional LSTM-CRF models for sequence tagging. arXiv 2015, arXiv:1508.01991.
33. Siami-Namini, S.; Namin, A.S. Forecasting economics and financial time series: ARIMA vs. LSTM. arXiv 2018, arXiv:1803.06386.
34. Han, J.H. Comparing Models for Time Series Analysis. Bachelor’s Thesis, University of Pennsylvania, Philadelphia, PA,
USA, 2018.
35. Tang, Z.; Shi, Y.; Wang, D.; Feng, Y.; Zhang, S. Memory visualization for gated recurrent neural networks in speech recognition.
In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans,
LA, USA, 5–9 March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 2736–2740.
36. Strobelt, H.; Gehrmann, S.; Pfister, H.; Rush, A.M. LSTMVis: A tool for visual analysis of hidden state dynamics in recurrent neural
networks. IEEE Trans. Vis. Comput. Graph. 2017, 24, 667–676. [CrossRef] [PubMed]
37. Li, J.; Heap, A.D. Spatial interpolation methods applied in the environmental sciences: A review. Environ. Model. Softw. 2014,
53, 173–189. [CrossRef]
38. Revesz, P.; Li, L. Constraint-based visualization of spatial interpolation data. In Proceedings of the Sixth International Conference
on Information Visualisation, London, UK, 10–12 July 2002; IEEE: Piscataway, NJ, USA, 2002; pp. 563–569.
39. Wolberg, G. Digital Image Warping, 1st ed.; IEEE Computer Society Press: Washington, DC, USA, 1994.
40. Li, L.; Revesz, P. Interpolation methods for spatio-temporal geographic data. Comput. Environ. Urban Syst. 2004, 28, 201–227.
[CrossRef]
41. Mitas, L.; Mitasova, H. Spatial interpolation. In Geographic Information Systems: Principles, Techniques, Management and Applications;
Longley, P., Goodchild, M.F., Maguire, D.J., Rhind, D.W., Eds.; Wiley: New York, NY, USA, 1999; pp. 481–492.
42. Park, J.; Sandberg, I.W. Universal approximation using radial-basis-function networks. Neural Comput. 1991, 3, 246–257.
[CrossRef] [PubMed]
43. Donahue, J.; Anne Hendricks, L.; Guadarrama, S.; Rohrbach, M.; Venugopalan, S.; Saenko, K.; Darrell, T. Long-term recurrent
convolutional networks for visual recognition and description. In Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2625–2634.
44. Mathe, J.; Miolane, N.; Sebastien, N.; Lequeux, J. PVNet: A LRCN Architecture for Spatio-Temporal Photovoltaic Power Forecasting
from Numerical Weather Prediction. arXiv 2019, arXiv:1902.01453.
45. Yuan, Z.; Zhou, X.; Yang, T. Hetero-ConvLSTM: A deep learning approach to traffic accident prediction on heterogeneous
spatio-temporal data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining,
London, UK, 19–23 August 2018; ACM: New York, NY, USA, 2018; pp. 984–992.
46. He, Z.; Chow, C.Y.; Zhang, J.D. STCNN: A Spatio-Temporal Convolutional Neural Network for Long-Term Traffic Prediction. In
Proceedings of the 2019 20th IEEE International Conference on Mobile Data Management (MDM), Hong Kong, China, 10–13 June
2019; IEEE: Piscataway, NJ, USA, 2019; pp. 226–233.
47. Lin, H.; Hua, Y.; Ma, L.; Chen, L. Application of ConvLSTM Network in Numerical Temperature Prediction Interpretation. In
Proceedings of the 2019 11th International Conference on Machine Learning and Computing, Zhuhai, China, 22–24 February
2019; ACM: New York, NY, USA, 2019; pp. 109–113.