Journal of Information Security and Applications 41 (2018) 1–11

Contents lists available at ScienceDirect

Journal of Information Security and Applications

journal homepage: www.elsevier.com/locate/jisa

Identification of malicious activities in industrial internet of things based on deep learning models
Muna AL-Hawawreh, Nour Moustafa∗, Elena Sitnikova
School of Engineering and Information Technology, University of New South Wales, Australian Defence Force Academy (ADFA), Canberra, Australia

Article info

Article history:

Keywords:
Industrial internet of things (IIoT)
Internet industrial control systems (IICSs)
Deep learning
Auto-encoder

Abstract

Internet Industrial Control Systems (IICSs) that connect technological appliances and services with physical systems have become a new direction of research as they face different types of cyber-attacks that threaten their success in providing continuous services to organizations. Such threats cause firms to suffer financial and reputational losses and the stealing of important information. Although Network Intrusion Detection Systems (NIDSs) have been proposed to protect against them, they have the difficult task of collecting information for use in developing an intelligent NIDS which can proficiently detect existing and new attacks. In order to address this challenge, this paper proposes an anomaly detection technique for IICSs based on deep learning models that can learn and validate using information collected from TCP/IP packets. It includes a consecutive training process executed using a deep auto-encoder and deep feedforward neural network architecture which is evaluated using two well-known network datasets, namely, the NSL-KDD and UNSW-NB15. As the experimental results demonstrate that this technique can achieve a higher detection rate and lower false positive rate than eight recently developed techniques, it could be implemented in real IICS environments.
© 2018 Elsevier Ltd. All rights reserved.

∗ Corresponding author.
E-mail address: [email protected] (N. Moustafa).

https://doi.org/10.1016/j.jisa.2018.05.002
2214-2126/© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Cyberspace plays a key role in contemporary societies and economies as the internet has changed the ways in which people and organizations communicate and conduct business electronically [1]. Therefore, different devices, applications, and services, which link the virtual and physical worlds, are included in the new term the ‘Industrial Internet of Things’ (IIoT) [2]. The interoperability of Information Technology (IT) and Operational Technology (OT) exposes industrial environments that depend on closed and proprietary communication protocols to diverse types of anomalous activities [3]. IIoTs are connected to the internet via the TCP/IP protocol in the forms of Machine-to-machine (M2M) and Machine-to-people (M2P) using specific IIoT protocols, for example, Message Queue Telemetry Transport (MQTT) and Advanced Message Queuing Protocol (AMQP) [4]. There have been substantial increases in the numbers of loopholes and vulnerabilities in IICSs that can be breached using several sophisticated attack techniques, whereby attackers attempt to exploit these systems in order to achieve their aims of stealing valuable information and/or financial funds, and/or corrupting device resources [5]. It is expected that cyber threats to the IIoT/IICSs will cost up to $90 trillion by 2030 if the cyber-security domain cannot find promising control solutions for preventing them [6].

With the number of IIoT devices and applications rapidly increasing, protecting critical infrastructures (i.e., IICSs) is becoming a more critical issue for business [7]. In IIoT environments, malware, which leverages zero-day vulnerabilities, is one of the most common threats, whereby attackers infect critical devices in order to control and modify their operations using different methodologies, such as Advanced Persistent Threat (APT), Denial of Service (DoS) and Distributed DoS (DDoS). For example, the Stuxnet worm attacked the Iranian nuclear program in 2010, Iranian hackers penetrated the ICS of New York’s dam in 2013, BlackEnergy malware was directly responsible for power outages for at least 80,000 customers in Ukraine in 2015 and, more recently, an SFG malware attack targeted European energy companies [10,11]. These malicious activities have proven that the ‘security by obscurity’ or traditional cyber-security mechanisms, including security policies, authentication, firewalls and signature-based Intrusion Detection Systems (IDSs), are no longer appropriate schemes for achieving efficient protection for critical infrastructure.

To detect IIoT attacks, a Network IDS (NIDS), which is the second line of defense after firewall, antivirus and access control systems, has to be deployed [8]. It is defined as a software and/or hardware mechanism used to monitor and detect suspicious events throughout networked systems [9], with its methodology categorized as either signature- or anomaly-based detection. The former identifies existing intrusions by comparing incoming traffic against a blacklist of known rules/signatures but cannot detect new attacks, while the latter can detect known and new attacks but also produces a large number of errors [8]. An anomaly-based IDS could be a powerful technique if its methodology could successfully detect known and unknown attacks that attempt to breach IIoTs [8,9]. According to the literature, IDSs have been built based on classical machine-learning and data-mining techniques [12,13], rules-based models [14], artificial intelligence approaches [15] and statistical models [40]. However, these methods often produce high False Positive Rates (FPRs) due to overlapping between legitimate and anomalous observations.

In this study, we propose an effective Anomaly Detection System (ADS) for IICSs using deep-learning models to reduce FPRs as much as possible, because these models can automatically analyze raw network data to discover abnormal patterns efficiently. A deep-learning technique is very effective, as it can deal with high dimensionality and determine the latent structure from unlabeled data [34]. More importantly, in the training phase, it conducts a consecutive training process using the unsupervised Deep Auto-Encoder (DAE) algorithm to learn normal network behaviors and produce the optimal parameters (i.e., weights and biases). Then, a standard supervised deep neural network model uses the estimated parameters of the DAE model for effectively tuning its parameters and classifying network observations. The proposed technique is evaluated on two benchmark datasets, the NSL-KDD [38] and UNSW-NB15 [39,44,45], because they are widely used in recent work for assessing ADSs. The experimental results reveal the superiority of the proposed technique compared with different network intrusion detection mechanisms, which indicates its effectiveness for deployment in real-world IICS environments.

The remainder of this paper is organized as follows. Section 2 provides an overview of the most relevant literature concerning ADSs in industrial control systems and the IoT. Section 3 discusses the use of deep learning as an ADS. In Section 4, details of the design of the proposed model and its deployment in an IIoT environment are presented. Descriptions of the datasets and evaluation metrics used are presented in Section 5, and the experimental results are discussed in Section 6. Finally, the conclusion is presented in Section 7.

2. Background and related work

This section explains the ADS technology and its approaches in IoT and industrial environments. Moreover, we focus on the approaches of shallow and deep networks, which are related to our proposed technique, demonstrating their capability of identifying suspicious activity.

2.1. ADS technology

An ADS is a fundamental security control mechanism which acts as a sniffer and decision engine for monitoring network traffic and identifying abnormal activities [18]. We focus on one that establishes a profile from normal data and considers any variation from it an attack because it can detect both known and unknown (zero-day) attacks [8,19]. Some ADSs have been introduced in IIoTs; for example, Shang et al. [20] proposed a Particle Swarm Optimization (PSO) technique-based ADS for improving the efficiency of the One Class Support Vector Machine (OCSVM) model by extracting packets of the Modbus/TCP communication protocol for training and validating the model. Similarly, Maglaras and Jiang [21] developed an IDS/ADS based on this model which was trained on offline data using the network traces collected from a SCADA environment. Silva and Schukat [22] used a K-NN classifier to build an IDS based on the context of the Modbus/TCP protocol. Although the above mechanisms achieved reasonable performances to some extent, they were dedicated to specific protocols with high FPRs.

Stewart et al. [23] proposed an adaptive IDS for fitting the dynamic architectures of SCADA systems using different OCSVM models to choose the most appropriate one for effectively detecting different attacks. However, this system consumed a high amount of computational resources while executing and produced a high false alarm rate for detection. Shang et al. [24] proposed an ADS for discovering attacks that penetrated the Modbus/TCP protocol by extracting different features of communication activities from SCADA systems which were used by an SVM algorithm to classify attacks. However, the detection process was not sufficiently effective for detecting abnormal behaviors.

Maglaras and Jiang [25] combined the OCSVM model and recursive K-means clustering algorithm to avoid the influence of kernel parameters on the OCSVM for effectively detecting network attacks. An IDS for a critical infrastructure based on an Artificial Neural Network (ANN) mechanism, which used error back-propagation and Levenberg-Marquardt functions to train a multi-layer perceptron ANN to detect abnormal network traffic, was presented by Linda et al. [26]. Hodo et al. [27] adopted an ANN to detect DoS/DDoS attacks in IoTs using a simulated network, and Chen et al. [28] proposed an artificial immune-based distributed IDS for IoT systems. More recently, Marsden et al. [56] proposed a Probability Risk Identification based Intrusion Detection System (PRI-IDS) mechanism by inspecting network traffic of the Modbus TCP/IP protocol for detecting replay attacks. However, these schemes produced high false alarm rates and had difficulty recognizing some new attacks.

2.2. Deep networks for IDS

IDSs have been studied using shallow and deep networks for detecting abnormal observations from the host- and network-based systems [47–50,54]. A shallow network is an ANN that often consists of one or two hidden layers, whilst a deep network comprises many hidden layers with different architectures. Deep learning is one of the most popular machine-learning techniques that academic and industrial researchers use due to its capability of learning a computational process in depth that mimics the natural behaviors of a human’s brain [29].

Deep learning can be categorized into different types depending on its architectural design which consists of hierarchical layers of non-linear processing levels [17]. According to Hodo et al. [54], deep networks are classified based on their architecture into generative and discriminative; the generative architecture models a joint probability distribution for observed data with their classes. There are four types of generative models, which are Recurrent Neural Networks (RNN), Deep Belief Network (DBN), Auto-Encoder (AE), and Deep Boltzmann Machine (DBM). The discriminative architecture, which models the posterior distributions of classes conditioned on the observed data, comprises RNN and Convolutional Neural Network (CNN) [54]. These models are described as follows.

• Generative deep architectures
  RNN is considered as a supervised or unsupervised learning model. The core theory behind it is that information is linked in long sequences via a layer-by-layer connection with a feedback loop. There is a directed cycle between its layers that increases its reliability, with the capability of creating an internal memory for storing data of the previous input. RNN has two types: Elman and Jordan, based on the way of layer connections. Elman consists of three layers (i.e., input, hidden, and output) in addition to the context layer. The hidden layer is connected to the context layer after each
feed-forward pass, and once the learning rules are applied, a copy of the previous hidden-unit values is saved in the context units. A Jordan network is like an Elman network, but its context units are fed from the output units.
  DAE is used for learning efficient coding in an unsupervised manner. The simplest architecture of DAE involves an input layer, more than one hidden layer and an output layer that has the same number of neurons as the input layer for reconstruction.
  DBM is an undirected probabilistic generative model. It consists of energy and stochastic units for the overall network for producing binary results. A Restricted Boltzmann Machine (RBM) is applied to reduce hidden layers, which does not allow intra-layer connections between hidden units. Training a stack of DBM on unlabeled data as the input of the next layer and inserting a layer for discrimination can lead to constructing an architecture of DBN.
  DBN consists of multiple hidden layers, where connections exist between layers but not between units within each layer. It is a composition of unsupervised and supervised learning networks. The unsupervised model is learned by a greedy layer-by-layer connection at a time, whereas the supervised network is one or more layers linked for classifying tasks.

• Discriminative deep architectures
  RNN utilizes discriminative power for a classification task, and this occurs when the output of the model is labeled data in a sequence with the input.
  CNN is a space invariant multi-layer perceptron ANN, which is biologically inspired by the organization of the animal visual cortex. It has many hidden layers, which typically consist of convolutional layers, pooling layers, fully connected layers and a normalization layer. The convolutional layers share many weights and so have few parameters, which makes the CNN easier to train compared with other models with the same number of hidden units.

Many recent research studies [46–53] have applied deep learning techniques for IDSs. A study by Alom et al. [46] used DBN which adopted the greedy layer-by-layer learning algorithm to learn each stack of RBM at a time to identify intrusion activities. Similarly, Gao et al. [47] suggested the use of the DBN technique to build an IDS. In the training phase, the greedy layer-by-layer algorithm was used for pre-training and fine-tuning the model. In [48], a deep auto-encoder algorithm was utilized to reduce the data dimensions as a pre-stage of classifying network data. The ANN mechanism was adopted as a classifier to evaluate the efficiency of the auto-encoder compared with the Principal Component Analysis (PCA), kernel-PCA and factor analysis algorithms. The results revealed the technique’s efficiency for detecting network attacks. Li et al. [49] presented a hybrid malicious code detector based on deep learning. In the first step, the auto-encoder was used for decreasing the data dimensions, and the unsupervised DBN model was applied to discover network attacks. Chuan-Long et al. [50] proposed an IDS based on RNN to classify the collected data. The experiments were conducted on a different number of hidden nodes and learning rate values. The technique accomplished a reasonable performance with a setting of 80 hidden nodes and a learning rate of 0.1, but its computational processing was high. A flexible NIDS using a Self-Taught Learning (STL) algorithm was presented by Niyaz et al. [43]. The sparse auto-encoder was applied to learn a good feature representation while the soft-max regression technique was utilized for classifying the network data. The proposed model performed well in the evaluation process.

Tang et al. [51] built an IDS using a simple DFN which consists of three hidden layers; the model was trained and tested on the best six features selected from the NSL-KDD dataset. Seok et al. [52] utilized the CNN technique-based IDS for recognizing malware. In [53], the author proposed an ensemble method for IDS using different DFN architectures that contain shallow auto-encoder networks, DBN, DNN, and an extreme learning machine. The method was evaluated on the NSL-KDD dataset, and the experiment results showed a good performance of detecting abnormal observations from network data.

From the discussion above, it is observed that deep learning techniques could considerably improve the performance of designing a reliable IDS for IICSs with higher detection accuracy and low false alarm rates. This is the motivation for utilizing deep learning models in this study, owing to their ability to extract features automatically through an in-depth analysis of network data and to detect outlier patterns in data as suspicious vectors. Our proposed DAE-DFFNN-based ADS technique contains a DAE algorithm to pre-train the DFFNN model that classifies network observations by ranking the parameter values of the DAE. It has the capability of discovering a good representation for network data and converting the high dimensional data to low dimensional using the decreased layer in the DAE-DFFNN model, as detailed in the following section.

3. Proposed ADS-based deep learning for IICSs

This study applies different architectures of deep-learning models to develop an efficient ADS for IIoT environments. In the training phase, a DAE algorithm learns using normal network observations to create the initialization parameters (i.e., weights and biases) and learn a deep representation of normal behaviors. These parameters are used as an initialization stage for training a standard Deep Feed Forward Neural Network (DFFNN) to discover existing and new attack instances. In the testing phase, the DFFNN is used to recognize malicious vectors. Different hidden nodes in the technique can proficiently learn a deep feature representation and capture the most important features by converting the high dimensions of data to low dimensions based on the decreased hidden layer. The details of the proposed ADS technique are explained in the following three subsections.

3.1. Deep feed-forward neural network (DFFNN)

Typically, a DFFNN is defined as an ANN technique that has an input layer, more than one hidden layer and an output layer with direct connections without a cycle between them [30]. Each hidden layer of the nodes represents abstracted features based on the previous level’s output which are automatically determined and collected in several layers to generate the outputs. To train this technique, a stochastic gradient descent back-propagation algorithm [42] is used.

In this deep-learning algorithm, the input data feeds into an input layer and is then propagated to the hidden layer, the output from which is a non-linear transformation of the data that passes to the output layer. A loss function or back-propagation error [31], which is the difference between the predicted and actual output, is calculated to evaluate the model’s performance and its value propagated backward through the hidden layers to update the weights. Calculations of the loss function are conducted based on single or mini-batch samples of, rather than all, the training data, with the weights updated after each sample is processed in order to properly fit the model.

The supervised training process in this algorithm depends on the randomness of the initialization of the neural network’s parameters which tends to place the model at local minima solutions with poor regularization [33]. To have better convergence properties and improve the results of supervised learning, pre-training
unsupervised techniques, in particular an AE, can be used to create the initialization parameters.

3.2. Deep auto-encoder (DAE)

A DAE is a feed-forward neural network algorithm for learning efficient coding using an unsupervised technique [34]. It creates a representation of a set of data ($x$) by learning an approximation of the identity function, where the output ($\hat{x}$) is similar to the input ($x$), that is, $x \to \hat{x}$. Its schematic structure consists of vectors ($x^{(i)}$) in the input layer and more than one hidden layer with a non-linear activation function. The hidden layers are used to learn a compressed representation of the input data via fewer neurons than the input layer. As a result, it learns the most important features, reduces the dimensionality and represents an abstraction of the input data. Ultimately, the output layer ($\hat{x}^{(i)}$) is an approximate representation of the input layer.

The simplest architecture of an AE consists of an input layer, a hidden layer, and an output layer. Assuming that the training data has $n$ samples, where each $x^{(i)}$ ($i \in \{1, \dots, n\}$) is a $d_0$-dimensional feature vector, the Tanh activation function is used [34] and computed by

$$T(t) = \frac{1 - e^{-2t}}{1 + e^{-2t}} \tag{1}$$

The AE algorithm has two main parts, an encoder and a decoder [16,35]. To map the input vector ($x^{(i)}$) into a hidden layer representation ($z^{(i)}$), a deterministic mapping called an encoder process ($f_\theta$) is used [16], and the dimensionality of $x^{(i)}$ is reduced to provide the correct number of codes as

$$f_\theta(x^{(i)}) = T(W x^{(i)} + b) \tag{2}$$

where $W$ is a weight matrix of size $d_0 \times d_h$, $d_h$ the number of neurons in a hidden layer ($d_h < d_0$), $b$ the bias vector, $T$ the Tanh activation function and $\theta$ the mapping parameters $[W, b]$.

To reconstruct the input as an approximation ($\hat{x}^{(i)}$), the result of the hidden layer’s representation is mapped back, and the decoder process is computed by the deterministic mapping ($g_{\theta'}$) as

$$\hat{x}^{(i)} = g_{\theta'}(z^{(i)}) = T(W' z^{(i)} + b') \tag{3}$$

where $W'$ is a $d_h \times d_0$ weight matrix, $b'$ a bias vector and $\theta'$ the mapping parameters $[W', b']$.

The input is formed in a compressed representation to fit the hidden layer, the data in which is then used as input to reconstruct the original data. The training process minimizes the reconstruction error (i.e., the difference between the original data and its low-dimensional reconstruction), which is calculated for a single or mini-batch training sample ($s$) by

$$E(x, \hat{x}) = \frac{1}{2} \sum_{i=1}^{s} \lVert x^{(i)} - \hat{x}^{(i)} \rVert^{2} \tag{4}$$

$$\theta = \{W, b\} = \arg\min_{\theta} E(x, \hat{x}) \tag{5}$$

The DAE architecture depicted in Fig. 1 contains three hidden layers: an encoder, a bottleneck (which consists of fewer nodes than the previous layers and is used to represent the input data with a non-linear dimensionality reduction, where the number of nodes represents the number of dimensions) and a decoder. This model potentially operates using a non-linear Principal Component Analysis (PCA) technique for reducing dimensionality [36]. In order to identify legitimate and suspicious activities in IICSs, we propose an ADS based on deep learning that includes training and testing phases, as discussed below.

Fig. 1. Architecture of DAE network.

3.3. Training and testing phases of ADS-based deep learning

The DFFNN and DAE approaches discussed above are the fundamental mechanisms used to build the proposed ADS-based deep-learning technique; the structures of networks in the DAE-DFFNN model are depicted in Fig. 2. In the training phase, given an unlabeled normal training dataset A, and labeled training dataset B, where AB, a DAE with a bottleneck layer is trained based on only normal records (A) without any anomalous vectors to learn and discover the most important feature representations for normal
patterns. It is trained using all the data, where the input to the network ($x^{(i)}$) is passed through three hidden layers, including the bottleneck one, to reconstruct it ($\hat{x}^{(i)}$), where $W_e$, $W_n$ and $W_f$ are the weights of the DAE, DFFNN and final prediction model, as depicted in Fig. 2.

Fig. 2. Proposed architecture of DAE-DFFNN model based ADS for IICSs.

In the encoding step, the input layer in the first hidden layer is processed using Eqs. (1) and (2). In the bottleneck step, a low-dimensional non-linear transformation of the input features is executed to extract important representative features from the network data. Then, in the decoding step, the last hidden layer in the bottleneck feature is used to approximately replicate the input using Eqs. (1) and (3), and stochastic gradient descent back-propagation is used to reduce the loss function, that is, the mean square error between ($x^{(i)}$) and ($\hat{x}^{(i)}$), using Eqs. (4) and (5). The key procedures for unsupervised training of the proposed ADS-based deep learning are presented in Algorithm 1.

Algorithm 1: Unsupervised training phase for proposed ADS-based deep learning.
Input: training dataset (A) with number of samples ($n$) of ($x^{(i)}$), where $i \in \{1, \dots, n\}$.
Output: parameters $\theta = \{W, b\}$
Begin
  Initialize $\{W, b\}$;
  repeat
    For each record ($x^{(i)}$), do
      compute the activation ($z^{(i)}$) at the hidden layer and give output ($\hat{x}^{(i)}$) to the output layer.
      compute the training error $E(x^{(i)}, \hat{x}^{(i)})$.
      Back-propagate $E$ and update parameters $\theta = \{W, b\}$;
    End
  until converged
end

Then, the trained model is used as the starting point, providing the initial weights and biases for training the supervised deep network, and the network model is tuned using the labeled training dataset ($B(x^{(i)}, y^{(i)})$). The same steps as previously followed are applied to learn and validate the proposed ADS technique on dataset B, which includes normal and anomalous vectors that test the accuracy of the method. In more detail, the network model is trained based on the stochastic gradient descent back-propagation mechanism to minimize the loss function, with the mean square error calculated from the difference between the values of the target output ($y^{(i)}$) and predicted output ($g_\theta(x^{(i)})$), where $g_\theta$ is the hypothesis function that yields an estimated output.

In the testing phase, after the parameters are automatically learned in the training phase, the new dataset sample ($C \notin \{A, B\}$) is tested based on the final constructed network model. Each input record ($\hat{x}^{(i)}$) is passed to the input layer with the initialization weights and bias ($\theta_f = \{W_f, b_f\}$) adopted, and then the input data is processed through the hidden layers. Finally, the output layer predicts the class of the input data as either normal or attack based on the estimated value of the loss function of each class.

4. Suggested framework for applying proposed ADS in IICSs

This study proposes an efficient anomaly detection model for protecting IICS environments against malicious activities. As illustrated in Fig. 3, its architecture consists of training and testing steps.

Fig. 3. Architecture of proposed ADS-based Deep Learning. Components shown: training data (labeled data, unlabeled normal data); basic pre-processing (features transformation, features normalization); learning algorithms (unsupervised deep learning, supervised deep learning); trained model; testing data; basic pre-processing (features conversion, features normalization); classifier; predicted data label (normal or attack).

Data pre-processing, which includes feature transformation and normalization and is the first stage in the proposed ADS-based deep-learning mechanism, inspects and selects important information from the large-scale data in an IIoT environment.

• Feature transformation: as the proposed model accepts only numerical features, each symbolic feature value is converted into a numerical one; for example, the NSL-KDD dataset has many symbolic attributes such as protocol types with nominal values like ICMP, TCP, and UDP which are mapped into 1, 2 and 3, respectively.
• Feature normalization: since deep learning depends on weights, the different feature scales can bias data into particular layers which may cause certain weights to update faster than others [32]. Consequently, it is necessary to handle this issue using a statistical normalization whereby the Z-score function for each feature value ($v^{(i)}$) is performed using

  $$Z^{(i)} = \frac{v^{(i)} - \mu}{\sigma} \tag{6}$$

  where $\mu$ is the mean of the $n$ values for a given feature ($v^{(i)}$, $i \in \{1, 2, 3, \dots, n\}$) and $\sigma$ the standard deviation.

As network data occupies a high-dimensional space, it is essential to reduce its dimensionality in order to lower the computational resources needed and so design a lightweight and scalable ADS technique [36]. Consequently, the proposed DAE-DFFNN model is utilized to reduce the high dimensions into low ones using a central decreased layer. In more detail, there is a non-linear function in the model that encodes a large number of features into the lower feature set in the decreased hidden layer, so feature reduction is applied without the need for human knowledge. The target of the DAE-DFFNN feature reduction is to eliminate the ambiguous structure in the input distribution and find well-formed representations in terms of higher-level learned features that are also filtered and reduced.

Since a suspicious activity is determined by any change in the normal state of the network, analyzing normal behaviors is significant for facilitating the detection process. Therefore, an unsupervised learning process is applied to the normal data observations to estimate the initialization parameters of the weights and biases as a given input of the standard DFFNN for decreasing the processing time of building this model. The parameters are also re-tuned in the supervised deep-learning method using the labeled data (i.e., normal and malicious), with the final training model evaluated based on new data samples obtained during the testing phase, as explained in the aforementioned sections.

The placement of an IDS in the IIoT environment is critical for ensuring that the environment is secured against any malicious activities. As industrial systems transmit data for end-users and/or cloud storage through the IoT gateway which includes internet-protocol connectivity (e.g., IP, TCP, UDP or HTTP), this gateway is a crucial location in which to deploy the proposed anomaly-IDS based on the deep-learning algorithm in an IIoT environment. Like other IDSs, the proposed system uses a set of cooperative and major components: a sniffing and monitoring unit; databases; and analyzer and response units. A general view of the structure of this IDS is presented in Fig. 4.

The main components of the proposed ADS are described below.

• Sniffing and monitoring unit: a sniffer, which is implemented in the gateway to monitor and collect the traffic exchanged between it and the external network via the internet, can be embedded in either the software or hardware to obtain the sent and received packets, which it stores in files that are then passed to the raw traffic database. Three databases, raw traffic, behavior, and log, implemented in our IDS are installed and retained in the cloud storage at the network’s edge (i.e., fog computing storage). The first stores the raw network traffic collected by the sniffing unit, the second contains a list of previous datasets which is considered a profile history of the network, while the third stores the new signatures of detected attacks and is used to continuously feed the behavior database.
• Analyzer unit: this is an important component in an IDS which consists of data processing and detection models.
  ➢ Data processing model: since an enormous amount of data can be extracted from packets and processing it is extremely challenging in terms of the processing power, resources and time required, the raw data should be passed to this model to convert it to useful information. In it, the collected network traffic is analyzed and gathered according to the size of the time window or flow, such as the source and destination IPs, source and destination ports, and protocol type. Moreover, the basic features are extracted for the data flow and the data converted to a uniform format, a process handled directly by the proposed ADS based on the deep learning algorithm in order to reduce the data’s dimensionality. Therefore, this processing model assists in the decision-
M. AL-Hawawreh et al. / Journal of Information Security and Applications 41 (2018) 1–11 7

IIoT Gateway Internet

Cloud Storage

Raw Traffic Behavior Log

Database Database Database

Industrial Environment Data Processing

Sniffing & Detection Model
Monitoring Unit Model
Actuator Sensor
Analyzer Unit

Alert System

Response Unit
Intrusion Detection System

Fig. 4. Structure of deployment proposed ADS.

making process and prevents any bias and confusion for the Root (U2R), Remote to Local (R2L) and Normal [37,38]. However,
anomaly detector. although having been used widely in IDSs, it is outdated [40].
➢ Detection Model—the result obtained from the processing Therefore, to effectively evaluate our proposed work, a new
model is sent to the detection model for a decision to be dataset called UNSW-NB15 is used. It reflects real modern normal
made for each data group, that is, the flow or window size. behaviors and contains contemporary synthesized attack activities
This model is used to detect known and unknown attacks [39,44,45]. It has 257,673 records (i.e., 93,0 0 0 normal and 164,673
by learning from the behavior database whereby, if any of attacks), each with 41 features and a class label. There are ten dif-
the input data does not match normal network behavior, it ferent class labels, one normal and nine attacks, namely, Fuzzers,
is classified as an attack. Details of the detection process are Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shell-
discussed in Section 3. code, and Worms.
• Response unit—this contains the alert system model and log
database. The ADS alerts the system administrator to take the 5.2. Evaluation metrics
appropriate action when any abnormal activity is detected in
the network, with the new signature for a specific attack type The proposed ADS is evaluated on the two datasets in terms
stored in the log database which, in turn, is used to feed the of the accuracy, detection rates, and FPRs extracted from the con-
behavior database. fusion matrix terms; True Positive (TP) and False Negative (FN),
which indicate the numbers of attack observations correctly iden-
tified as anomalous and incorrectly identified as normal, respec-
5. Descriptions of datasets and evaluation metrics tively, and True Negative (TN) and False Positive (FP) which show
the numbers of normal observations correctly identified as normal
5.1. Datasets and incorrectly identified as attack, respectively. The main evalua-
tion metrics are calculated as follows.
Since a dataset plays a vital role in testing, analyzing and eval-
uating the behavior of a detection system, a good-quality one not • Accuracy identifies the total number of observations correctly
only produces efficient results for an offline system but is also po- identified with respect to the total number of observations and
tentially effective when deployed in a real environment. Most re- is calculated by
searchers have used the popular NSL-KDD dataset, an amended TP + TN
version of the KDD CUP 99 one which solved the main problems Accuarcy = , (7)
TP + TN + FP + FN
of KDD CUP 99 by eliminating its redundant records and select-
ing numbers of records from it in proportion to their percentages. • The Detection Rate (DR) is the ratio of attack to normal obser-
After pre-processing, it consists of 148,517 records (i.e., 77,054 nor- vations correctly classified and defined as
mal and 71,460 attacks), each of which contains 41 features and TP
a class label. There are five classes, namely, Probing, DoS, User to Det ection Rat e = , (8)
TP + FN
• The False Positive Rate (FPR) is the proportion of normal observations that are incorrectly classified as attacks and is calculated by

\[ \mathrm{FPR} = \frac{FP}{FP + TN}. \tag{9} \]

6. Experimental results and discussion

6.1. Performance assessment and comparison

The proposed ADS technique is implemented and evaluated using the R programming language. The experiment is conducted on both datasets with all features, and the important features are automatically adopted using the decreased layer of the DAE-DFFNN model. We use the full NSL-KDD dataset (i.e., 77,054 normal and 71,460 attack records) and different samples from the UNSW-NB15 one (i.e., 93,000 normal and 92,000 attack records), with 20% (which represents 40% of the normal records), 60% and 20% of the samples used for unsupervised learning, supervised learning and testing, respectively.

Based on the experiments, we adopted the network structures and parameters that produce the highest DR and lowest FPR. The best network structure of the proposed model was used for both datasets as follows: one input layer (41 nodes), three hidden layers (10, 3, 10 nodes) and an output layer (41 nodes) for the DAE model, and an output layer with 2 nodes for the DFFNN model. The Tanh activation function was used with 100 epochs, a learning rate of 0.002, an annealing rate of 2e-6, a momentum start of 0.2, a stable momentum of 0.4, a momentum ramp of 1e7 and L1 and L2 regularizations of L1 = L2 = 1e-6 for the UNSW-NB15 dataset, and a learning rate of 0.0015 and a momentum start of 0.2 for the NSL-KDD dataset.

The results obtained from the proposed ADS model based on deep learning, presented in Table 1, show that it performs better for the NSL-KDD than the UNSW-NB15 dataset.

Table 1
Evaluation of performances for two datasets.

Dataset      Accuracy   Detection rate   FPR
NSL-KDD      98.6%      99%              1.8%
UNSW-NB15    92.4%      93%              8.2%

In Fig. 5, the ROC curves indicate the performances of the proposed ADS for the two datasets in terms of the TPR (i.e., detection rate or sensitivity) and FPR (i.e., fall-out), with the areas under the red and green curves showing that its levels of accuracy for the NSL-KDD and UNSW-NB15 datasets are 98.4% and approximately 92.5%, respectively.

Fig. 5. (a) NSL-KDD dataset ROC curve; (b) UNSW-NB15 dataset ROC curve.

The detection rates for the types of records in the NSL-KDD and UNSW-NB15 datasets using the proposed ADS based on the deep-learning technique are shown in Fig. 6. In the left-hand chart, the results for the NSL-KDD dataset demonstrate that the proposed model can determine the record types DoS, Normal, Probe, R2L and U2R with detection rates of 99.8%, 99.5%, 98.7%, 93.6% and 71.4%, respectively. The results for the UNSW-NB15 dataset in the right-hand chart show that the detection rates for the record types Analysis, Backdoor, DoS, Exploits, Fuzzer, Generic, Normal, Reconnaissance, Shellcode, and Worms are 83.3%, 91.8%, 95.1%, 96%, 60%, 99.5%, 98.9%, 96.8%, 81.1% and 76%, respectively. Although some results, such as those for U2R in the NSL-KDD dataset and Fuzzer and Worms in the UNSW-NB15 one, are not high, the proposed model demonstrates good overall performance in detecting the record types in both datasets.

6.2. Time cost for proposed ADS model

We analyze the proposed ADS model based on deep learning in terms of the total elapsed and CPU times it requires. As explained in Section 3, it consists of two main phases, training and testing. In the former, it is passed through two levels. Firstly, unsupervised learning is used to obtain the initialization parameters (i.e., weights and biases); for the NSL-KDD dataset, this takes approximately 194.37 seconds to process 30,783 records over 100 epochs, with approximately 0.31 s of CPU time, and, for the UNSW-NB15 dataset, 227.30 seconds to process 37,280 records over 100 epochs, with 0.25 s of CPU time. Secondly, the supervised learning takes approximately 194.67 seconds to process 88,119 records over 100 epochs, with approximately 0.13 s of CPU time, for the NSL-KDD dataset and, for the UNSW-NB15 one, 119.03 seconds to process 110,776 records over 100 epochs, with 0.14 s of CPU time.

In the testing phase, for the NSL-KDD dataset, the proposed model takes 2.25 seconds to predict 29,615 records, that is, approximately 13,162 records per second, with 0.06 s of CPU time and, for the UNSW-NB15 dataset, 5.55 seconds to predict 36,944 records, that is, approximately 6657 records per second, with 0.01 s of CPU time. It is worth mentioning that this time includes the tasks of reducing the data dimensions and learning and extracting the most crucial features, which makes our proposed model a practical solution for detecting intrusive activities in a real IIoT environment. The total elapsed and CPU times for the NSL-KDD and UNSW-NB15 datasets are presented in Table 2.
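As a quick check of the testing-phase throughput quoted in Section 6.2, the short Python sketch below divides the number of predicted records by the elapsed testing time listed in Table 2. It is only an illustration: the names `TESTING_PHASE` and `records_per_second` are hypothetical, and it assumes that the elapsed time is the wall-clock time for the whole test batch.

```python
# Testing-phase figures from Section 6.2 and Table 2:
# dataset name -> (records predicted, elapsed time in seconds).
TESTING_PHASE = {
    "NSL-KDD": (29_615, 2.25),
    "UNSW-NB15": (36_944, 5.55),
}


def records_per_second(records: int, elapsed_seconds: float) -> float:
    """Average throughput, assuming elapsed time covers the whole batch."""
    return records / elapsed_seconds


throughput = {
    name: records_per_second(records, elapsed)
    for name, (records, elapsed) in TESTING_PHASE.items()
}

for name, rate in throughput.items():
    print(f"{name}: about {rate:.0f} records per second")
```

Rounded to whole records, these ratios agree with the per-second figures quoted for the two datasets.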
Fig. 6. (a) Detection rates for NSL-KDD dataset classes; (b) Detection rates for UNSW-NB15 dataset classes.

Table 2
Time cost (elapsed and CPU time in seconds) for both datasets.

             Unsupervised training phase   Supervised training phase    Testing phase
             (100 epochs)                  (100 epochs)
Dataset      Elapsed time   CPU time       Elapsed time   CPU time      Elapsed time   CPU time
NSL-KDD      194.37         0.31           194.67         0.13          2.25           0.06
UNSW-NB15    227.30         0.25           119.03         0.14          5.55           0.01

6.3. Comparative study

To illustrate the effectiveness of our proposed ADS model, we compare its performance with those of eight recently developed anomaly detection techniques, namely, the Filter-based Support Vector Machine (F-SVM) [13], Computer Vision Technique (CVT) [41], Dirichlet Mixture Model (DMM) [40], Triangle Area Nearest Neighbors (TANN) [12], DBN [46], RNN [50], DNN [51], and Ensemble-DNN [53]. Table 3 shows the results achieved by our proposed model compared with the other models tested on the NSL-KDD dataset in terms of detection rate and false positive rate. Our proposed model clearly obtains the best results, with 99% DR and 1.8% FPR. The first four models achieved a reasonable performance in detecting intrusive activities after a feature selection process. F-SVM adopted mutual information to address the linear and non-linear data features, which were then used with the SVM for recognizing attacks. Nevertheless, this model still needs to optimize the search strategy to improve the efficiency of the IDS. CVT and TANN used the PCA technique to reduce the data dimensions.

Table 3
Comparison of the performance of the proposed model with eight techniques for the NSL-KDD dataset.

Technique            Detection rate   FPR
F-SVM [13]           92.2%            8.7%
CVT [41]             95.3%            5.6%
DMM [40]             97.2%            2.4%
TANN [12]            91.1%            9.4%
DBN [46]             95.1%            4.5%
RNN [50]             73%              3.6%
DNN [51]             76%              15%
Ensemble-DNN [53]    98%              14.7%
Proposed Model       99%              1.8%

TANN used K-means clustering based on only attack data to obtain good feature representations of attack behaviors. Both models rely on estimating the correlations between normal and abnormal instances, and this will face a problem in dealing with sophisticated attacks, which mimic normal behavior, as they overlap with normal patterns. DMM utilized some statistical properties to specify the boundaries between normal and abnormal behavior. In this model, the PCA technique selected the best features, and the model achieved a high detection rate. However, this method needs further statistical analysis to ensure that it fits all possible normal events that could happen in network production systems.

The last four models adopted different deep learning architectures in designing IDSs. The DBN and Ensemble-DNN techniques utilized an unsupervised pre-training model based on a greedy layer-by-layer learning algorithm. In the DBN model, to learn the weights between layers, the RBMs in the stack were trained one at a time. After pre-training the two RBMs, the parameters were tuned using the discriminative layer, which is trained on the labeled data for classification.

The Ensemble-DNN model used the auto-encoder and extreme learning machine alongside the RBM-based DBN to learn data representations accurately. These models learned the hierarchical representation of the network data with high performance in terms of DR and FPR. However, they have the drawbacks of the computational costs related to training the DBN and of an ambiguous way of optimizing the network structure, which is based on an approximation of maximum-likelihood training. The RNN model, which used a loop in its architecture, performed less well because of vanishing and exploding gradient problems, but it works better in other domains, such as noise cancellation and character recognition [55]. The final model is IDS-DNN, which used only the basic concept of deep learning to detect intrusive activities after selecting the best six features in the NSL-KDD dataset; because the model was built from random initial weights and biases, it achieved poorer convergence and performance.

Our model performs better than the above techniques for several reasons. It automatically provides dimensionality reduction and feature extraction without human intervention, tasks which take a great deal of time and effort with most classical machine-learning algorithms, and it has the capability to infer the unknown structure of normal network behavior through unsupervised and supervised training phases, thereby learning and obtaining good representations. Also, as previously discussed, it takes a suitable amount of processing time to train and predict data records. Furthermore, its
performance continually improves when the structure of the deep learning model is trained with more data.

The proposed model differs from previous IDSs based on deep learning in that it uses a simple mathematical algorithm (DAE) for unsupervised learning, which estimates the parameters in a suitable range; these parameters are the input of the supervised DFFNN, allowing it to be built effectively and efficiently. In addition, the model, through the decreased hidden layer, learns and explores the high-level features, reduces the dimensionality of the data automatically, and represents the crucial features well. Therefore, these traits ensure that our proposed model is appropriate for deployment in a real industrial environment containing a massive amount of unlabeled and unstructured data.

6.4. Pros and cons of proposed anomaly-IDS based on deep learning

The proposed ADS has several advantages. Firstly, it can easily detect normal and attack behaviors in an IIoT environment because the model is trained on normal behavior in the first training phase, using the unsupervised deep learning algorithm, and thereby obtains good representations of normal traffic. Secondly, it conducts an additional training process based on normal and attack behaviors to strengthen the system and ensure that it is capable of detecting sophisticated attacks. Thirdly, it provides an automated process for feature engineering which minimizes the time and effort required, and makes it more effective when deployed in a real environment. Finally, it depends on parameters which are tuned only in the training phase and has a tolerance for noisy and outlier data.

Although a disadvantage of this model is that choosing its parameters for the training phase is not a trivial process, this is not considered a major problem given the current availability of powerful and fast hardware. Also, while it cannot deal with non-numeric features or the original ranges of feature values, we overcome this issue by using feature transformation and normalization steps in the pre-processing stage.

7. Conclusion

In this paper, an ADS model for detecting intrusive activities in IIoT environments using the data collected from TCP/IP traffic is proposed. It uses deep-learning methods for unsupervised learning with automatic dimensionality reduction and a good representation of normal network patterns. This yields well-initialized rather than random parameters for the supervised training of the DFFNN, which is then used to tune these parameters further. The proposed DAE-DFFNN model can successfully build and extract important features which enhance its overall performance. The final constructed model is tested on different data samples from the NSL-KDD and UNSW-NB15 datasets, with the results revealing that it achieves the highest detection rate and fewest false alarms compared with some techniques developed in recent studies. In future, we will extend this work to train this algorithm on real data collected from IIoT systems to demonstrate the efficiency of its implementation.

We also present an approach for deploying this proposed ADS model based on the deep-learning algorithm in the real-world IIoT environment by: firstly, collecting the TCP/IP traffic using the sniffer implemented in the gateway; and, secondly, pre-processing and analyzing the collected data in order to reveal any intrusive activity. Further modifications and ideas for deploying and validating this model in a real environment, and extending this work to handle different protocols, will be considered in future.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.jisa.2018.05.002.

References

[1] Sherasiya T, Upadhyay H, Patel H. A survey: intrusion detection system for internet of things. J. Comput Sci Eng 2016;5:91–8.
[2] Drath R, Horch A. Industry 4.0: hit or hype? [Industry forum]. IEEE Ind Electron Mag 2014:56–8.
[3] Shahzad A, Kim G, Elgamoudi A. Secure IoT platform for industrial control systems. In: Platform technology and service (PlatCon), international conference on. IEEE; 2017. p. 1–6.
[4] Katsikeas S, Fysarakis K, Miaoudakis A, Van Bemten A, Askoxylakis I, Papaefstathio I, Plemenos A. Lightweight and secure industrial IoT communications via the MQ telemetry transport protocol. Symposium on computers and communications conference. IEEE; 2017.
[5] Stouffer K, Falco J, Scarfone K. Guide to industrial control systems (ICS) security. NIST special publication; 2011. p. 6–16.
[6] Atlantic Council. http://publications.atlanticcouncil.org/cyberrisks//.
[7] Sitnikova E, Foo E, Vaughn B. The power of hands-on exercises in SCADA cyber security education. In: IFIP world conference on information security education. Springer; 2009. p. 83–94.
[8] Abraham A, Grosan C, Martin-vide C. Evolutionary design of intrusion detection. Int J Netw Secur 2007:328–39.
[9] Modi C, Patel D, Borisaniya B, Patel H, Patel A, Rajarajan M. A survey of intrusion detection techniques in cloud. J Netw Comput Appl 2013:42–57.
[10] Tzokatziou G, Maglaras A, Janicke H, He Y. Exploiting SCADA vulnerabilities using a human interface device. Int J Adv Comput Sci Appl 2015:234–41.
[11] Kushner D. The real story of stuxnet. In: IEEE spectrum 50; 2013. p. 48–53.
[12] Tsai F, Lin Y. A triangle area based nearest neighbors approach to intrusion detection. Pattern Recognit 2010;43:222–9.
[13] Ambusaidi A, He X, Nanda P, Tan Z. Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans Comput 2016;65:2986–98.
[14] Moustafa N, Slay J. A hybrid feature selection for network intrusion detection systems: central points. arXiv preprint arXiv:1707.05505; 2017.
[15] Kim J, Bentley P, Aickelin U, Greensmith J, Tedesco G, Twycross J. Immune system approaches to intrusion detection – a review. Nat Comput 2007:413–66.
[16] Hardy W, Chen L, Hou S, Ye Y, Li X. DL4MD: a deep learning framework for intelligent malware detection. In: Proceedings of the international conference on data mining (DMIN). The steering committee of the world congress in computer science, computer engineering and applied computing (WorldComp); 2016. p. 61–7.
[17] Huang W, Song G, Hong H, Xie K. Deep architecture for traffic flow prediction: deep belief networks with multitask learning. In: IEEE transactions on intelligent transportation systems; 2014. p. 2191–201.
[18] Nadiammai G, Hemalatha M. Effective approach toward intrusion detection system using data mining techniques. Egypt Inf J 2013:37–50.
[19] Mustafa N, Slay J. The evaluation of network anomaly detection systems: statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 dataset. Inf Secur J 2016:18–31.
[20] Shang W, Zeng P, Wan M, Li L, An P. Intrusion detection algorithm based on OCSVM in industrial control system. Secur Commun Netw 2016:1040–9.
[21] Maglaras A, Jiang J. Intrusion detection in SCADA systems using machine learning techniques. In: Science and information conference (SAI). IEEE; 2014. p. 626–31.
[22] Silva P, Schukat M. On the use of K-NN in intrusion detection for industrial control systems. In: 13th international conference on information technology and telecommunication; 2014. p. 103–6.
[23] Stewart B, Rosa L, Maglaras A, Cruz T, Ferrag M, Simoes P, Janicke H. A novel intrusion detection mechanism for SCADA systems that automatically adapts to changes in network topology. Ind Netw Intell Syst 2017:1–12.
[24] Shang W, Cui J, Wan M, An P, Zeng P. Modbus communication behavior modeling and SVM intrusion detection method. In: Proceedings of the 6th international conference on communication and network security. ACM; 2016. p. 80–5.
[25] Maglaras A, Jiang J. Ocsvm model combined with k-means recursive clustering for intrusion detection in scada systems. In: Heterogeneous networking for quality, reliability, security and robustness (QShine), 10th international conference on. IEEE; 2014. p. 133–4.
[26] Linda O, Vollmer T, Manic M. Neural network based intrusion detection system for critical infrastructures. In: Neural networks, international joint conference on. IEEE; 2009. p. 1827–34.
[27] Hodo E, Bellekens X, Hamilton A, Dubouilh L, Iorkyase E, Tachtatzis C, Atkinson R. Threat analysis of iot networks using artificial neural network intrusion detection system. In: Networks, computers and communications (ISNCC), international symposium on. IEEE; 2016. p. 1–6.
[28] Chen R, Liu M, Chen C. An artificial immune-based distributed intrusion detection model for the Internet of Things. In: Advanced materials research; 2012. p. 165–8.
[29] Van Dijk C, Williams P. The history of artificial intelligence. Expert systems in auditing Palgrave Macmillan UK; 1990. 21-16.
[30] Tang A, Mhamdi L, McLernon D, Zaidi R, Ghogho M. Deep learning approach for network intrusion detection in software defined networking. In: Wireless networks and mobile communications (WINCOM), international conference on. IEEE; 2016. p. 258–63.
[31] Svozil D, Kvasnička V, Pospichal J. Introduction to multi-layer feed-forward neural networks. Elsevier; 1997. p. 43–62.
[32] Recht B, Re C, Wright S, Niu F. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In: Advances in neural information processing systems; 2011. p. 693–701.
[33] Erhan D, Bengio Y, Courville A, Manzagol A, Vincent P, Bengio S. Why does unsupervised pre-training help deep learning? J Mach Learn Res 2010:625–60.
[34] Tao X, Kong D, Wei Y, Wang Y. A big network traffic data fusion approach based on fisher and deep auto-encoder. Information 2016:20–30.
[35] Lv Y, Duan Y, Kang W, Li Z, Wang Y. Traffic flow prediction with big data: a deep learning approach. IEEE Trans Intell Transp Syst 2015:865–73.
[36] Yousefi-Azar M, Varadharajan V, Hamey L, Tupakula U. Autoencoder-based feature learning for cyber security applications. In: Neural networks (IJCNN), 2017 international joint conference on. IEEE; 2017. p. 3854–61.
[37] Rathore S, Saxena A, Manoria M. Intrusion detection system on KDDCup99 dataset: a survey. Int J Comput Sci Inf Tech 2015.
[38] Tavallaee M, Bagheri E, Lu W, Ghorbani A. A detailed analysis of the KDD CUP 99 data set. In: Computational intelligence for security and defense applications. CISDA 2009. IEEE symposium on. IEEE; 2009. p. 1–6.
[39] Moustafa N, Slay J. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: Military communications and information systems conference (MilCIS). IEEE; 2015. p. 1–6.
[40] Moustafa N, Creech G, Slay J. Big data analytics for intrusion detection system: statistical decision-making using finite Dirichlet mixture models. In: Data Analytics and Decision Support for Cybersecurity. Springer; 2017. p. 127–56.
[41] Tan Z, Jamdagni A, He X, Nanda P, Liu P, Hu J. Detection of denial-of-service attacks based on computer vision techniques. IEEE Trans Comput 2015:2519–33.
[42] Li M, Zhang T, Chen Y, Smola J. Efficient mini-batch training for stochastic optimization. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2014. p. 661–70.
[43] Niyaz Q, Sun W, Javaid A, Alam M. A deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI international conference on bio-inspired information and communications technologies (formerly BIONETICS). ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering); 2016. p. 21–6.
[44] Moustafa N, Slay J. The evaluation of network anomaly detection systems: statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Inf Secur J 2016;25:18–31.
[45] Moustafa N, Slay J. The significant features of the UNSW-NB15 and the KDD99 data sets for network intrusion detection systems. Building analysis datasets and gathering experience returns for security (BADGERS), 2015 4th international workshop on. IEEE; 2015.
[46] Alom MdZ, Bontupalli V, Taha. Intrusion detection using deep belief networks. In: Aerospace and electronics conference (NAECON), 2015 national. IEEE; 2015. p. 339–44.
[47] Gao N, Gao L, Gao Q, Wang H. An intrusion detection model based on deep belief networks. In: Advanced cloud and big data (CBD), 2014 second international conference on. IEEE; 2014. p. 247–52.
[48] Abolhasanzadeh B. Nonlinear dimensionality reduction for intrusion detection using auto-encoder bottleneck features. In: Information and knowledge technology (IKT), 2015 7th conference on. IEEE; 2015. p. 1–5.
[49] Li Y, Rong M, Runhai J. A hybrid malicious code detection method based on deep learning. Int J Secur Appl Methods 2015;9(5).
[50] Chuan-long Y, Yue-fei Z, Jin-long F, Xin-zheng H. A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 2017:1–7.
[51] Tang T, Mhamdi L, McLernon D, Zaidi S, Ghogho M. Deep learning approach for network intrusion detection in software defined networking. In: Wireless networks and mobile communications (WINCOM), 2016 international conference on. IEEE; 2016. p. 258–63.
[52] Seok S, Howon K. Visualized malware classification based-on convolutional neural network. J Korea Inst Inf Secur Cryptol 2016;26(1):197–208.
[53] Ludwig S. Intrusion detection of multiple attack classes using a deep neural net ensemble. 2017 IEEE symposium series on computational intelligence; 2017.
[54] Hodo E, Xavier B, Andrew H, Christos T, Robert A. Shallow and deep networks intrusion detection system: a taxonomy and survey. arXiv preprint arXiv:1701.02145; 2017.
[55] Lipton Z, Berkowitz J, Elkan C. A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv:1506.00019; 2015.
[56] Marsden T, Moustafa N, Sitnikova E, Creech G. Probability risk identification based intrusion detection system for SCADA systems. arXiv preprint arXiv:1711.02826; 2017.

No ratings yet
Intrusion Detection System For IoT Environments Using Machine Learning Techniques
7 pages
RP 3
No ratings yet
RP 3
9 pages
AS IDS: Anomaly and Signature Based IDS For The Internet of Things
No ratings yet
AS IDS: Anomaly and Signature Based IDS For The Internet of Things
26 pages
ICCAD25 Paper 7737
No ratings yet
ICCAD25 Paper 7737
5 pages
A Hybrid Approach For Efficient Feature Selection
No ratings yet
A Hybrid Approach For Efficient Feature Selection
44 pages
AI-Powered IDS for IoT Security
No ratings yet
AI-Powered IDS for IoT Security
6 pages
Deep Learning Based Detection For Cyber Attacks in Iot Networks: A Distributed Attack Detection Framework
No ratings yet
Deep Learning Based Detection For Cyber Attacks in Iot Networks: A Distributed Attack Detection Framework
24 pages
NIDS-CNNLSTM Network Intrusion Detection Classification Model Based On Deep Learning
No ratings yet
NIDS-CNNLSTM Network Intrusion Detection Classification Model Based On Deep Learning
14 pages
Intrusion Detection and Mitigation Framework For SDN Controlled IoTs Network
No ratings yet
Intrusion Detection and Mitigation Framework For SDN Controlled IoTs Network
5 pages
Arpit Jain Seminar Report
No ratings yet
Arpit Jain Seminar Report
31 pages
Intrusion Detection in Iot Networks Using Ai Based Approach
No ratings yet
Intrusion Detection in Iot Networks Using Ai Based Approach
15 pages
1 s2.0 S1877050920317804 Main
No ratings yet
1 s2.0 S1877050920317804 Main
6 pages
Blockchain and Deep Learning-Based IDS For Securing SDN-Enabled Industrial IoT Environments
No ratings yet
Blockchain and Deep Learning-Based IDS For Securing SDN-Enabled Industrial IoT Environments
6 pages
Research 2
No ratings yet
Research 2
12 pages
IoT Anomaly Detection via Logistic Regression
No ratings yet
IoT Anomaly Detection via Logistic Regression
5 pages
A Sample Article Using IEEEtran Cls For IEEE Journals and Transactions
No ratings yet
A Sample Article Using IEEEtran Cls For IEEE Journals and Transactions
7 pages
IoT/IIoT Attack Detection Method
No ratings yet
IoT/IIoT Attack Detection Method
8 pages
Intrusion Detection System For Internet of Things Based On A Machine Learning Approach
No ratings yet
Intrusion Detection System For Internet of Things Based On A Machine Learning Approach
6 pages
Electronics: Federated Machine Learning To Enable Intrusion Detection Systems in Iot Networks
No ratings yet
Electronics: Federated Machine Learning To Enable Intrusion Detection Systems in Iot Networks
19 pages
A Hybrid Approach For Efficient Feature Selection in Anomaly Intrusion Detection For Iot Networks
No ratings yet
A Hybrid Approach For Efficient Feature Selection in Anomaly Intrusion Detection For Iot Networks
43 pages
Machine Learning Based Intrusion Detection Systems For IoT Applications
No ratings yet
Machine Learning Based Intrusion Detection Systems For IoT Applications
24 pages
Anomaly Detection On Iot Network Using Deep Learning
No ratings yet
Anomaly Detection On Iot Network Using Deep Learning
14 pages
Review Paper Published
No ratings yet
Review Paper Published
5 pages
OrionVX-Datasheet 20250722
No ratings yet
OrionVX-Datasheet 20250722
5 pages
WP Ot Ciso Solution Selection
No ratings yet
WP Ot Ciso Solution Selection
6 pages
Calero Fabian
No ratings yet
Calero Fabian
159 pages
Orch UserGuide R952
No ratings yet
Orch UserGuide R952
846 pages
Security Guide 9.5 Release Latest
No ratings yet
Security Guide 9.5 Release Latest
114 pages
Aruba Data Sheet Edgeconnect Solution 121620
No ratings yet
Aruba Data Sheet Edgeconnect Solution 121620
8 pages
SB Ot Cybersecurity Assurance With Fortideceptor
No ratings yet
SB Ot Cybersecurity Assurance With Fortideceptor
5 pages
Top 12 Cybersecurity Careers 2023
No ratings yet
Top 12 Cybersecurity Careers 2023
14 pages
Integrated Network and Security Operation Center: A Systematic Analysis
100% (1)
Integrated Network and Security Operation Center: A Systematic Analysis
18 pages
Tuning Logstash Garbage Collection For High Throughput in A Monitoring Platform
No ratings yet
Tuning Logstash Garbage Collection For High Throughput in A Monitoring Platform
7 pages
Advanced Distribution Management System (ADMS) Evaluations With Private LTE Communication Networks
No ratings yet
Advanced Distribution Management System (ADMS) Evaluations With Private LTE Communication Networks
12 pages
OSINT Prompt Ideas for Analysts
No ratings yet
OSINT Prompt Ideas for Analysts
124 pages
99
No ratings yet
99
1 page
Smart Grid Communication Survey
No ratings yet
Smart Grid Communication Survey
26 pages
The Using of Analytical Platform For Telecommunication Network Events Forecasting
No ratings yet
The Using of Analytical Platform For Telecommunication Network Events Forecasting
2 pages
Industry 4.0: CIOT Collaboration Insights
No ratings yet
Industry 4.0: CIOT Collaboration Insights
7 pages
Peti Plantilla
No ratings yet
Peti Plantilla
3 pages
Electric Vehicles Standards Charging Inf
No ratings yet
Electric Vehicles Standards Charging Inf
27 pages
DC Microgrids Part II A Review of Power
No ratings yet
DC Microgrids Part II A Review of Power
24 pages
ICT Module 1 CSS NC-II
No ratings yet
ICT Module 1 CSS NC-II
27 pages
MC 16 Short - Rev. 2016-07
No ratings yet
MC 16 Short - Rev. 2016-07
34 pages
Vijaymentor - Vijaymentor - Whoami
No ratings yet
Vijaymentor - Vijaymentor - Whoami
11 pages
N (0:1:40) A 1.2 F 0.1 X A Cos (2 Pi F N) Stem (N, X,'r','filled') Xlabel ('TIME') Ylabel ('AMPLITUDE')
No ratings yet
N (0:1:40) A 1.2 F 0.1 X A Cos (2 Pi F N) Stem (N, X,'r','filled') Xlabel ('TIME') Ylabel ('AMPLITUDE')
7 pages
Intro to Quality Assurance
No ratings yet
Intro to Quality Assurance
12 pages
3C03 Quality Check 201707008
No ratings yet
3C03 Quality Check 201707008
4 pages
Python Developer Job at Azurity
No ratings yet
Python Developer Job at Azurity
2 pages
Matsit Cleanup Presentation
No ratings yet
Matsit Cleanup Presentation
10 pages
Block Chain - Document (1.0)
No ratings yet
Block Chain - Document (1.0)
27 pages
MikroTik VPN - Back To Home - RouterOS - MikroTik Documentation
No ratings yet
MikroTik VPN - Back To Home - RouterOS - MikroTik Documentation
4 pages
Ram Concept
100% (1)
Ram Concept
558 pages
UH300 Series HMI Specs & Features
No ratings yet
UH300 Series HMI Specs & Features
2 pages
Sapan AGRAWAL-1
No ratings yet
Sapan AGRAWAL-1
2 pages
CSL QB
No ratings yet
CSL QB
8 pages
Pure Install Guide
100% (1)
Pure Install Guide
14 pages
PAFM250W Rev4.2
No ratings yet
PAFM250W Rev4.2
5 pages
Milling
No ratings yet
Milling
34 pages
DAP-3662 A1 Manual v1.10 (WW)
No ratings yet
DAP-3662 A1 Manual v1.10 (WW)
95 pages
Sony NW-A1000 Service Manual v1.0 2005
No ratings yet
Sony NW-A1000 Service Manual v1.0 2005
58 pages
Mercury Marine SC5000 Manual
100% (1)
Mercury Marine SC5000 Manual
66 pages
Saul Griffith
No ratings yet
Saul Griffith
3 pages
Ata Chapters 100 Memorize
No ratings yet
Ata Chapters 100 Memorize
3 pages
BPI-Company Profile
No ratings yet
BPI-Company Profile
19 pages
Design Thinking Tools Guide
No ratings yet
Design Thinking Tools Guide
25 pages
Cs Core - Call Flows
100% (2)
Cs Core - Call Flows
16 pages
Free Grants Database for Auckland Non-Profits
No ratings yet
Free Grants Database for Auckland Non-Profits
4 pages
AI, Machine Learning and Deep Learning A Security Perspective (Fei Hu) (Z-Library)
No ratings yet
AI, Machine Learning and Deep Learning A Security Perspective (Fei Hu) (Z-Library)
347 pages
Gulf Super Duty VLE 15W40
No ratings yet
Gulf Super Duty VLE 15W40
1 page
Role of Information Systems in Indian Railways
No ratings yet
Role of Information Systems in Indian Railways
21 pages
DAP HMI User Manual Ver2.01
No ratings yet
DAP HMI User Manual Ver2.01
158 pages



Fig. 1. Architecture of DAE network.

Fig. 2. Proposed architecture of DAE-DFFNN model based ADS for IICSs.

Fig. 3. Architecture of proposed ADS-based Deep Learning. [Diagram labels: Training Data, Basic PreProcessing; Testing Data, Basic PreProcessing, Trained Model]

Fig. 4. Structure of deployment proposed ADS. [Diagram labels: IIoT Gateway Internet; Raw Traffic Behavior Log; Industrial Environment Data Processing]

Table 1

NSL-KDD and UNSW-NB15 datasets are 98.4% and approximately

The proposed ADS technique is implemented and evaluated us-

Dataset Unsupervised training phase Supervised training phase Testing phase

NSL-KDD 194.37 0.31 194.67 0.13 2.25 0.06

Table 3 with normal patterns. DMM utilized some statistical properties to
