Security in Computing and Communications
5th International Symposium, SSCC 2017
Manipal, India, September 13–16, 2017
Proceedings
Communications in Computer and Information Science 746
Commenced Publication in 2007
Founding and Former Series Editors:
Alfredo Cuzzocrea, Xiaoyong Du, Orhun Kara, Ting Liu, Dominik Ślęzak,
and Xiaokang Yang
Editorial Board
Simone Diniz Junqueira Barbosa
Pontifical Catholic University of Rio de Janeiro (PUC-Rio),
Rio de Janeiro, Brazil
Phoebe Chen
La Trobe University, Melbourne, Australia
Joaquim Filipe
Polytechnic Institute of Setúbal, Setúbal, Portugal
Igor Kotenko
St. Petersburg Institute for Informatics and Automation of the Russian
Academy of Sciences, St. Petersburg, Russia
Krishna M. Sivalingam
Indian Institute of Technology Madras, Chennai, India
Takashi Washio
Osaka University, Osaka, Japan
Junsong Yuan
Nanyang Technological University, Singapore, Singapore
Lizhu Zhou
Tsinghua University, Beijing, China
More information about this series at http://www.springer.com/series/7899
Editors
Sabu M. Thampi, Indian Institute of Information Technology and Management Kerala (IIITMK), Trivandrum, Kerala, India
Gregorio Martínez Pérez, University of Murcia, Murcia, Spain
Carlos Becker Westphall, Federal University of Santa Catarina, Florianópolis, Santa Catarina, Brazil
Jiankun Hu, RMIT University, Melbourne, VIC, Australia
Chun I. Fan, National Sun Yat-sen University, Kaohsiung, Taiwan
Félix Gómez Mármol, University of Murcia, Murcia, Spain
Preface
These proceedings contain papers selected for presentation at the 5th International
Symposium on Security in Computing and Communications (SSCC 2017). SSCC aims
to provide an opportunity to bring together researchers and practitioners from both
academia and industry to exchange knowledge and discuss research findings. The
symposium was held in Manipal Institute of Technology, Manipal University,
Karnataka, India, during September 13–16, 2017. SSCC 2017 was co-located with the
International Conference on Applied Soft Computing and Communication Networks
(ACN 2017).
In response to the call for papers, 84 papers were submitted to the symposium. These
papers were evaluated on the basis of their significance, novelty, and technical quality.
A double-blind review process was conducted to ensure that the author names and
affiliations were unknown to the Technical Program Committee (TPC). Each paper was
reviewed by the members of the TPC and finally, 21 regular papers and 13 short papers
were selected for presentation at the symposium (an acceptance ratio of approximately 40%).
The organization of the symposium benefited from the efforts of many individuals.
We would like to thank the TPC members and external reviewers for their timely
expertise in carefully reviewing the submissions. We would like to thank the general
chair and members of the Advisory Committee for their support. We express our most
sincere thanks to all keynote speakers who shared with us their expertise and
knowledge.
Special thanks to members of the Organizing Committee for their time and effort in
organizing the symposium. We wish to thank all the authors who submitted papers and
all participants and contributors to fruitful discussions. Finally, we would like to
acknowledge Springer for the active cooperation and timely production of the
proceedings.
Chief Patron
Ramdas M. Pai Manipal University, India
Patrons
H.S. Ballal Manipal University, India
B.H.V. Pai MIT, Manipal University, India
G.K. Prabhu MIT, Manipal University, India
Narayan Sabhahit Manipal University, India
V. Surendra Shetty Manipal University, India
H. Vinod Bhat Manipal University, India
Advisory Committee
John F. Buford Avaya Labs Research, USA
Mauro Conti SPRITZ Security and Privacy Research Group,
University of Padua, Italy
Xavier Fernando Ryerson University, Canada
David Naccache ENS Paris, France
Prasad Naldurg IBM Research India, Bangalore
Anand R. Prasad NEC, Japan
Bimal Kumar Roy R.C. Bose Centre for Cryptology and Security,
Indian Statistical Institute, India
Somitra Kr. Sanadhya IIIT Delhi, India
Zhili Sun Institute for Communication Systems (ICS),
University of Surrey, UK
Shambhu J. Upadhyaya State University of New York at Buffalo, USA
V.N. Venkatakrishnan University of Illinois at Chicago, USA
Guojun Wang Central South University, China
General Chair
Sudip Misra Indian Institute of Technology, Kharagpur, India
Program Chair
Gregorio Martínez Pérez University of Murcia, Spain
Program Co-chairs
Chun-I Fan National Sun Yat-sen University, Taiwan
Félix Gómez Mármol University of Murcia, Spain
Jiankun Hu RMIT University, Australia
Ryan Ko University of Waikato, New Zealand
Publicity Chair
Carlos Becker Westphall Federal University of Santa Catarina, Brazil
Organizing Chair
Hareesha K.S. Manipal Institute of Technology (MIT) - Manipal
University, India
Organizing Co-chairs
Balachandra Manipal Institute of Technology, Manipal University,
India
Ashalatha Nayak Manipal Institute of Technology, Manipal University,
India
Organizing Secretaries
Renuka A. Manipal Institute of Technology, Manipal University,
India
Preetham Kumar Manipal Institute of Technology, Manipal University,
India
Poornima P.K. Manipal Institute of Technology, Manipal University,
India
Contents
Deep Learning for Network Flow Analysis and Malware Classification . . . . . 226
R.K. Rahul, T. Anjali, Vijay Krishna Menon, and K.P. Soman
ASLR and ROP Attack Mitigations for ARM-Based Android Devices . . . . . . 350
Vivek Parikh and Prabhaker Mateti
An Android Application for Secret Image Sharing with Cloud Storage . . . . . 399
K. Praveen, G. Indu, R. Santhya, and M. Sethumadhavan
Diversity-aware, Cost-effective Network Security Hardening

1 Introduction
With the advent of Internet technology, today's computer networks have grown rapidly in both size and complexity. Moreover, cyber attacks are also on the rise, prompting the need for cyber defense analysis. Even though critical industry resources are assumed to be well secured within a well-administered network, a single vulnerability in an Internet-facing server or client-side application can be used as a pivot point (launching pad) to compromise network resources incrementally. Essentially, potential adversaries can combine multiple network vulnerabilities to progressively compromise critical network resources, resulting in multistage, multi-host attacks. Therefore, security analysts must consider the cause-consequence relationships between the existing vulnerabilities to secure the network. To determine the relations and interactions among the exploitable network vulnerabilities, attack graphs [1–5] have been proposed in the literature.
Various attack path length-based metrics [6–9] have been proposed in the literature to assess the security posture of a computer network. The problem with path length-based metrics is that they do not consider the cause-consequence relationship between vulnerabilities and treat every exploitable vulnerability equally. However, each kind of vulnerability in a network poses a different amount of resistance to the attacker during its exploitation. The cumulative probability metric [10] and the cumulative attack resistance metric [11] assess the attacker's likelihood of successful vulnerability exploitation and the attacker's effort in terms of the resistance posed by the vulnerabilities, respectively. However, neither metric considers the exploit diversity along the attack path(s). Chen et al. [12] considered diversity among the network vulnerabilities as one of the factors while calculating the network risk. However, since an attacker follows only one attack path at a time, exploit diversity among all network vulnerabilities is not a good criterion; the authors should instead have considered vulnerability diversity along the attack path.
Yigit et al. [13] proposed a metric to assess the security risk of a given network. The authors sum up the path probabilities of all the goal-oriented attack paths to measure the risk of a given network. However, they do not consider the exploit diversity along the attack path. Suh-Lee and Jo [14] used the proximity
of the un-trusted network and the potential security risk(s) of the neighboring
hosts as important risk conditions to assess the security risk of each vulnera-
bility in a given system. However, they do not consider critical network risk
conditions such as the cause-consequence relationship between the exploitable
vulnerabilities and the exploit diversity along the attack paths.
In this paper, we introduce a new metric for assessing the security risk of a given network. To this end, we consider the resistance and the success probability of each goal-oriented attack path (in an attack graph) that reaches the predetermined target (i.e., the critical resource). First, the risk of each goal-oriented attack path is computed and then summed up to measure the potential risk of the entire network. Secondly, the contribution of each initial condition and exploit in the attack graph that contributes to the goal-oriented attack paths is calculated. Then, the effective cost of removing each of them is estimated. Finally, candidate exploits/initial conditions are identified for removal as a network hardening strategy. The entire process is repeated until all the attack paths are removed or the security budget is depleted. Based on the metric's recommendations, we obtain a network hardening solution that brings maximum security to the network at minimum cost.
The organization of the paper is as follows. Section 2 discusses the existing
work on metrics available in the attack graph literature. Section 3 reviews the
popular attack graph model and provides a running example. In Sect. 4 we pro-
pose a new network risk assessment metric and also discuss how it will be useful
in identifying the network hardening solution that brings maximum security to
the network with minimum cost. Section 5 presents the results for the running
example. Finally, Sect. 6 closes with conclusions and directions for future work.
2 Related Work
Earlier efforts on security metrics, for example CVSS [15,16] and CWSS [17], focused on assigning a numeric score to individual reported vulnerabilities or software weaknesses based on the known facts about them. Vulnerabilities with higher severity scores are given top priority during the process of network hardening. However, an attacker may combine (correlate) less severe vulnerabilities (based on their cause-consequence relationship) to penetrate the network and compromise critical resources incrementally. Such causal relationships between system vulnerabilities are at the heart of ever-increasing multistage, multi-host attacks [18,19].
Cumulative probability-based attack graph metric [10] considers the causal
relationship between the network vulnerabilities for measuring the overall prob-
ability of an attacker successfully exploiting a vulnerability from her initial posi-
tion. The likelihood of occurrence of each attack path is used to evaluate the
network security. Similar to [10], cumulative attack resistance [11] for each attack
goal (here, critical resource) provides a quantitative measure of how likely the
attack goal can be achieved. The complexity of exploiting each attack path is
used to assess the security posture of a target network. The downside of the proposed metric [11] is that the authors evaluated their work by assigning hypothetical resistance values to the individual vulnerabilities, which is not acceptable for realistic networks and hence limits its usage. Later on, Ghosh and Ghosh [20] resolved the problem of computing the individual resistance value of each vulnerability in the system. Although the work in [10,11] considers the causal relationship between vulnerabilities, it does not consider the exploit diversity along the attack path(s). As a matter of fact, multiple occurrences of an already exploited vulnerability along the attack path(s) ease the attacker's job. When launching the same type of attack for the second time, the adversary benefits from the experience and tools accumulated during the first launch of the attack [11]. In particular, the adversary does not have to engineer a new exploit to take advantage of the repeated vulnerabilities and hence can save effort and time. In other words, she can use previously engineered exploits with little or no modification.
Chen et al. [12] used diversity among the network vulnerabilities and attack path length as important risk conditions to assess the security risk of a network. Here, the length of the attack path(s) signifies the attacker's effort, and exploit diversity indicates her knowledge about the different exploitation technologies. However, since the attacker can follow only one attack path, diversity among all network vulnerabilities is not a good factor to consider; the authors should instead have considered vulnerability diversity along the attack path. The second factor they took into account is the attack path length, in terms of the number of vulnerabilities an attacker has to exploit to reach and compromise the critical resource. The number of vulnerabilities along the attack path is not a good criterion either, as it does not capture the attacker's effort. The fundamental problem with [12] is that the authors do not consider the exploit diversity along the attack paths. Wang et al. [21,22] used diversity among the network services along the attack paths (in a resource graph generated for a given network) to measure the robustness of a network against zero-day attacks: the smaller the count, the less robust the network is to potential zero-day attacks, and vice versa. This idea of service diversity along the attack path(s) motivated us to consider exploit diversity along the attack paths in an attack graph generated for well-known vulnerabilities. Suh-Lee and Jo [14] used the proximity of the untrusted network and the risk posed by the neighboring hosts as important risk conditions to assess the security risk of each vulnerability in a system. However,
the approach followed in [14] does not consider the cause-consequence relation-
ship between the exploitable vulnerabilities. Work of Chen et al. [12], Yigit et al.
[13], Suh-Lee and Jo [14], Wang et al. [21,22], and Albanese et al. [23] motivated
us to consider various parameters such as vulnerability resistance, exploit prob-
ability, and exploit diversity along the goal-oriented attack paths for network
risk scoring.
Fig. 1. Attack graph G for the test network (adapted from [12]). Each exploit is shown as a box, initial and post-conditions as plain text, the attacker's initial position as a circle, and her final target as a double circle.
4 Proposed Solution
By traversing the attack graph G (shown in Fig. 1), we extract all the valid, goal-oriented attack paths that end in a given critical resource. If each of these multistage attack scenarios is eliminated, then the critical resource (here, Host2) in the enterprise network becomes secure. Usually, there is a vast solution space available for removing all the multistep attack scenarios, since different initial conditions and exploits can be chosen for removal. The security analyst needs to take into account the cost involved in disabling initial conditions or patching vulnerabilities to harden the network at minimum cost. However, it is sometimes not possible at all to disable or patch certain initial conditions or exploits because of the side effects incurred, as discussed in Sect. 3. Moreover, the cost of disabling initial conditions or patching vulnerabilities is constrained by the security budget, since the cost of completely securing the network may be unacceptable. Even when the likelihood of potential multistage attacks cannot be wiped out entirely under the organization's budget constraints, it can still be decreased to a great extent. Therefore, the security risk of a given network needs to be measured to assess its security strength, and the administrator can then decide how much protection she needs to provide.
To identify all the goal-oriented attack scenarios which end at the predetermined critical resource (here, Host2), we use a backward algorithm in Phase I of our proposal. Hence, the exploits which cannot help the adversary reach the target (i.e., the critical resource) are never explored. Moreover, we also benefit from the logic used by the forward algorithm and discard the attack scenarios which do not start from the attacker's initial position (i.e., user(0)). Consequently, the complexity of Phase II is decreased further, as only essential, goal-oriented attack paths are considered for removal. In general, the attack graph for a given network may contain cyclic paths. However, the attacker does not usually opt for such cyclic paths [12] during network compromise, as she never relinquishes her privileges on already compromised host(s). During the extraction of attack path(s) (using the backward algorithm), if an exploit is encountered that was previously covered in the current attack path, then the extraction of that attack path is aborted to avoid cyclic attack path(s).
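To make Phase I concrete, the following sketch (our illustration, not the authors' implementation) enumerates goal-oriented, cycle-free attack paths by walking backward from the goal exploit; the graph encoding (PRED, START, GOAL) and all names are assumptions made for the example.

```python
from typing import Dict, List, Set

# Assumed encoding of the attack graph (illustrative only):
#   PRED[e]  = exploits whose post-conditions enable exploit e
#   START    = exploits enabled directly by the attacker's initial position user(0)
#   GOAL     = the exploit that grants the goal condition on the critical resource
PRED: Dict[str, List[str]] = {}
START: Set[str] = set()
GOAL = "local_bof(2)"          # hypothetical goal exploit yielding root(2) on Host2

def backward_paths(node: str, suffix: List[str], paths: List[List[str]]) -> None:
    """Walk backward from `node`; abort a branch as soon as an exploit repeats (cycle)."""
    if node in suffix:
        return                                  # cyclic attack path: discard
    suffix = [node] + suffix
    if node in START:
        paths.append(suffix)                    # valid, goal-oriented attack path
        return
    for prev in PRED.get(node, []):             # only exploits that can reach the target
        backward_paths(prev, suffix, paths)

def goal_oriented_paths() -> List[List[str]]:
    paths: List[List[str]] = []
    backward_paths(GOAL, [], paths)
    return paths
```

Because the walk starts at the goal, exploits from which the target is unreachable are never visited, matching the pruning described above.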
In particular, the security risk of a network depends on several factors. The first is the number of goal-oriented attack paths to the target resource, denoted m. The availability of more goal-oriented attack paths signifies more opportunity for an adversary to compromise critical resources. The second is the length of an attack path, in terms of the number of vulnerabilities that need to be exploited successfully to reach the target. The longer the attack path, the greater the endurance an adversary needs to reach the target resource. However, each type of vulnerability along the attack path poses a different level of difficulty (resistance) to an attacker during its exploitation. Hence, each type of vulnerability has a different success probability, which in turn can be approximated by the average time (mean time to compromise) or the computational complexity required to successfully compromise the critical resource in a network [24]. We use the individual success probability of each exploit, Eprob(Ei) [20], which is obtained from the CVSS Temporal Score [15,16] and given as input to calculate the overall success probability of the attack paths. The success probability (or the resistance) of an attack path is a more appropriate criterion than the attack path length, since attackers usually circumvent longer paths. As the attack path length increases, the overall probability of successful execution of that path typically decreases. However, there can be longer attack paths in a network which consist of easy-to-exploit vulnerabilities and have a higher success probability than shorter attack paths which include difficult-to-exploit vulnerabilities. In this regard, the success probability of an attack path Aj (i.e., Aprob(Aj)) can be obtained by multiplying the success probabilities of each exploit in that path. The third factor is the number of different kinds of vulnerabilities along the attack path; more types of vulnerabilities along the attack path usually indicate that an adversary needs more knowledge about different exploitation technologies. Hence, we define the security risk posed by each goal-oriented attack path Aj in a given network N as:
Risk(A_j) = w · (1 / A_rest(A_j)) + (1 − w) · A_prob(A_j)    (1)
where w and (1 − w) signify the weights given to the factors attacker endurance and attacker knowledge, respectively. The likelihood of a goal-oriented attack path Aj, A_prob(Aj), is obtained by multiplying the success probabilities of each exploit encountered along that path:
A_prob(A_j) = ∏_{E_i ∈ A_j} E_prob(E_i)    (2)
whereas the overall resistance posed by an attack path Aj (i.e., A_rest(Aj)) to an adversary is found by summing up the individual resistance values of the vulnerabilities present in that path:
A_rest(A_j) = Σ_{E_i ∈ A_j} E_rest(E_i)    (3)
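Equations (1)–(3), together with the network-level risk M used in Sect. 5 (the sum of the path risks), translate directly into code. The following is a minimal sketch under our own naming; it omits the exploit-diversity/effort-reduction adjustment discussed in Sect. 5.

```python
from math import prod
from typing import Dict, List

def path_prob(path: List[str], e_prob: Dict[str, float]) -> float:
    """Eq. (2): product of the per-exploit success probabilities along the path."""
    return prod(e_prob[e] for e in path)

def path_resistance(path: List[str], e_rest: Dict[str, float]) -> float:
    """Eq. (3): sum of the per-exploit resistance values along the path."""
    return sum(e_rest[e] for e in path)

def path_risk(path, e_prob, e_rest, w: float = 0.5) -> float:
    """Eq. (1): weighted mix of attacker endurance (1/resistance) and success likelihood."""
    return w * (1.0 / path_resistance(path, e_rest)) + (1.0 - w) * path_prob(path, e_prob)

def network_risk(paths: List[List[str]], e_prob, e_rest, w: float = 0.5) -> float:
    """Network-level risk M: the sum of the risks of all goal-oriented attack paths."""
    return sum(path_risk(p, e_prob, e_rest, w) for p in paths)
```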
Like [13], our network hardening solution selects an exploit Ei or initial condition Ik with minimum effective cost β for removal, provided its removal cost α does not exceed the remaining security budget γ. If an exploit Ei is chosen for patching, the attack paths which encompass Ei are removed, and hence these paths are no longer available to the adversary. However, if an initial condition Ik is disabled, then the attack paths which include exploit(s) enabled by Ik are removed. Hence, the selected exploit Ei or initial condition Ik eliminates its contribution to the attack paths and reduces the security risk M. In this way, by considering the effective cost β for the removal of an exploit or an initial condition, critical resources are secured with a minimal security budget. Selecting an initial condition or an exploit for elimination in each iteration based on the minimum effective cost β guarantees that the highest security risk reduction is achieved per unit cost. The network security risk M is then re-computed, and Phase II continues until the risk M is zero or the security budget γ is depleted.
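Phase II is essentially a greedy, budget-constrained loop. The sketch below is our own rendering with assumed helper names: contribution_of(x, paths) stands for the contribution con(x) of a candidate to the network risk, and path removal is simplified to dropping every path that contains (or is enabled by) the selected candidate.

```python
from typing import Callable, Dict, List, Tuple

def harden(paths: List[List[str]],
           removal_cost: Dict[str, float],
           contribution_of: Callable[[str, List[List[str]]], float],
           budget: float) -> Tuple[List[str], List[List[str]], float]:
    """Greedy Phase II: repeatedly remove the exploit/initial condition with the smallest
    effective cost beta = alpha / contribution, while its cost alpha fits the budget."""
    removed: List[str] = []
    while paths and budget > 0:
        candidates = []
        for x, alpha in removal_cost.items():
            con = contribution_of(x, paths)
            if con > 0 and alpha <= budget:              # affordable and still useful
                candidates.append((alpha / con, alpha, x))
        if not candidates:
            break                                        # nothing affordable contributes to M
        _beta, alpha, best = min(candidates)
        removed.append(best)
        budget -= alpha
        paths = [p for p in paths if best not in p]      # simplification: drop covered paths
    return removed, paths, budget
```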
5 A Case Study
To illustrate the proposed cost-effective network hardening strategy, we have
used a well-known network example [12,18,19] from the attack graph literature.
The corresponding attack graph G generated for the adapted network is shown
in Fig. 1. As shown in the attack graph G, there are 11 exploits and 7 initial
conditions. The success probability (likelihood of vulnerability exploitation) and
the cost of removal of each vulnerability are given in Table 1. We assume the cost of disabling each initial condition is 10 units, and user(0) is the only initial condition that cannot be disabled, since it signifies the attacker's initial location/position and also her privileges on the attacking machine. The total security budget set aside for the enterprise network security is 25 units. In practice, the success probability values for each well-known vulnerability can be calculated from the CVSS Temporal Score [15,20], whereas the cost of removing a vulnerability or initial condition is provided by security experts. However, in this study, we assign cost values to each exploitable vulnerability and initial condition to illustrate the operation of the proposed network hardening method more clearly.
Table 2. Attack paths in the attack graph G and their respective values for path length, number of distinct vulnerabilities (along the path), success probability A_prob(A_j), resistance A_rest(A_j), and risk Risk(A_j). Here, the number of steps represents the total number of vulnerabilities that need to be exploited by an adversary along the attack path.
Each goal-oriented attack path in Table 2 ends in the goal condition root(2). The number of distinct exploits in each attack path is shown in column 3, while columns 4, 5, and 6 show the cumulative probability, cumulative resistance, and risk of each goal-oriented attack path, respectively. The cumulative success probability of a goal-oriented attack path is obtained by multiplying the success probabilities of all the exploits belonging to that path, provided exploit diversity is taken into account. For instance, the probability of occurrence of the first attack path A1 is A_prob(A1) = 0.8 × 0.8 × 0.8888 × 0.5 = 0.2844, whereas the resistance posed by attack path A1 is A_rest(A1) = 0.25 + 0.25 + 0.125 + 1 = 1.625. Supposing attacker endurance w = 0.5 and attacker effort reduction factor a = 0.5 (i.e., the reduction in the attacker's effort due to the repetition of already exploited vulnerabilities along the attack path), we use Eq. 1 to compute the security risk of each goal-oriented attack path in G. The risk of the whole network (i.e., M) is determined using Eq. 4. Table 2 shows the valid, goal-oriented attack paths and their respective values for success probability, resistance, and risk.
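A quick arithmetic check of the A1 figures quoted above, using Eq. (1) with w = 0.5; the Risk(A1) value printed here is computed from the stated inputs rather than copied from Table 2.

```python
probs = [0.8, 0.8, 0.8888, 0.5]        # per-exploit success probabilities along A1
rests = [0.25, 0.25, 0.125, 1.0]       # per-exploit resistance values along A1
w = 0.5

a_prob = 1.0
for p in probs:
    a_prob *= p                         # Eq. (2): ~0.2844
a_rest = sum(rests)                     # Eq. (3): 1.625
risk = w * (1 / a_rest) + (1 - w) * a_prob   # Eq. (1): ~0.45
print(round(a_prob, 4), a_rest, round(risk, 4))
```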
For the original network setting N, the security risk M equals 2.2847 and is measured by summing up the risks posed by all five goal-oriented attack paths. As neither the network security risk M nor the organizational security budget γ is zero, we can execute Phase II of our proposed system hardening algorithm.
In the very first iteration, the effective cost β of each initial condition and exploit that contributes to the goal-oriented attack paths is calculated. Next, an initial condition or exploit which has the minimum effective cost (β) and a removal cost lower than the original security budget γ is chosen for elimination. To compute the effective cost β of an initial condition or exploit, the corresponding removal cost α is divided by its contribution (Eq. 7). Table 3 shows the contribution (con(x)) and effective cost (β) values of each exploit and initial condition for each iteration. For the exploits or initial conditions that do not contribute to M, we do not compute the effective cost. As evident from Table 3, in the first iteration the exploit ftp_rhosts(0,1) is selected for removal, because its removal cost is not larger than the security budget γ and it has the minimum effective cost β among all exploits and initial conditions. As neither the residual risk M nor the remaining security budget γ is zero after the removal of ftp_rhosts(0,1), we proceed to the second iteration.
Table 3. Contribution and effective cost (β) of each exploit (Ei) and initial condition (Ik).
In the second iteration, the initial condition sshd(0,1) is chosen for removal, as it has the minimum effective cost and its disabling cost is smaller than the remaining security budget. Upon completion of the second iteration, neither M nor γ is zero; therefore, we proceed with the third iteration, in which the exploit ftp_rhosts(0,2) is chosen for removal.
Therefore, with the total cost of 24 units, we can harden the network such
that there is no single path available to an adversary to compromise Host2 .
However, if only initial conditions are considered while hardening the network, as in [18,25,26], the overall system hardening cost would be 30 units instead of 24. Disabling all the initial conditions provides the additional gain of completely securing the network. On the other hand, these approaches ([18,25,26]) to network security hardening are not adaptive and do not let the security administrator control the overall cost of network hardening in a flexible manner. Therefore, considering both exploits and initial conditions for removal in our technique helps the administrator converge to the minimum cost requirement of an organization in a budget-aware manner.
To conclude, similar to [12,13], our proposed network hardening solution
allows the balance between network security posture improvement and the result-
ing incurred cost to be adjusted by the security analyst in a cost and context-
aware manner. Therefore, our method of network hardening is complementary
to the existing attack graph-based network hardening solutions.
6 Conclusion
In this paper, we have proposed a diversity-aware metric (M) to assess the
security risk of a given network and presented a cost-effective network hardening
solution. The proposed metric M determines the security posture of a given net-
work. The proposed network hardening solution facilitates cost-controlled net-
work immunization by taking into account both initial conditions and exploits for
removal. As opposed to existing solutions ([12,13]), we consider the attacker's effort reduction factor (due to the repetition of the same vulnerability along the attack path(s)) while protecting the critical resources. Further, like [13], our network hardening solution considers the organization's security budget constraints while securing the critical network resources. Such a viable hardening solution, obtained under the given security budget constraint, improves the security posture of a network. As part of future work, the complexity of the proposed algorithm needs to be analyzed rigorously. Moreover, we propose to study the reduction in the attacker's work factor (vulnerability resistance) caused by the repetition of vulnerabilities along the attack paths; this reduction in work factor (i.e., the attacker's effort) will differ for different types of vulnerabilities.
References
1. Jha, S., Sheyner, O., Wing, J.: Two formal analyses of attack graphs. In: Proceed-
ings of the 15th IEEE Workshop on Computer Security Foundations, CSFW 2002,
pp. 49–63. IEEE Computer Society, Washington (2002)
2. Sheyner, O., Haines, J., Jha, S., Lippmann, R., Wing, J.: Automated generation
and analysis of attack graphs. In: Proceedings of the IEEE Symposium on Security
and Privacy, pp. 273–284 (2002)
3. Ou, X., Boyer, W.F.: A scalable approach to attack graph generation. In: Pro-
ceedings of the 13th ACM Conference on Computer and Communications Security
(CCS), pp. 336–345. ACM Press (2006)
4. Jajodia, S., Noel, S.: Topological vulnerability analysis: a powerful new approach
for network attack prevention, detection, and response. In: Proceedings of Algo-
rithms, Architectures, and Information System Security, pp. 285–305. Indian Sta-
tistical Institute Platinum Jubilee Series (2009)
5. Ghosh, N., Ghosh, S.: A planner-based approach to generate and analyze minimal
attack graph. Appl. Intell. 36, 369–390 (2012)
6. Phillips, C., Swiler, L.P.: A graph-based system for network-vulnerability analysis.
In: Proceedings of the 1998 Workshop on New Security Paradigms, NSPW 1998,
pp. 71–79. ACM, New York (1998)
7. Ortalo, R., Deswarte, Y., Kaaniche, M.: Experimenting with quantitative evalua-
tion tools for monitoring operational security. IEEE Trans. Softw. Eng. 25, 633–650
(1999)
8. Li, W., Vaughn, R.: Cluster security research involving the modeling of network
exploitations using exploitation graphs. In: Proceedings of the 6th IEEE Inter-
national Symposium on Cluster Computing and the Grid, CCGRID 2006, vol. 2,
p. 26 (2006)
9. Idika, N., Bhargava, B.: Extending attack graph-based security metrics and aggre-
gating their application. IEEE Trans. Dep. Secur. Comp. 9, 75–85 (2012)
10. Wang, L., Islam, T., Long, T., Singhal, A., Jajodia, S.: An attack graph-based
probabilistic security metric. In: Atluri, V. (ed.) DBSec 2008. LNCS, vol. 5094, pp.
283–296. Springer, Heidelberg (2008). doi:10.1007/978-3-540-70567-3 22
11. Wang, L., Singhal, A., Jajodia, S.: Measuring the overall security of network
configurations using attack graphs. In: Barker, S., Ahn, G.-J. (eds.) DBSec
2007. LNCS, vol. 4602, pp. 98–112. Springer, Heidelberg (2007). doi:10.1007/
978-3-540-73538-0 9
12. Chen, F., Liu, D., Zhang, Y., Su, J.: A scalable approach to analyzing network
security using compact attack graphs. J. Netw. 5 (2010)
13. Yigit, B., Gür, G., Alagöz, F.: Cost-aware network hardening with limited budget
using compact attack graphs. In: Proceedings of the IEEE Military Communica-
tions Conference, pp. 152–157 (2014)
14. Suh-Lee, C., Jo, J.: Quantifying security risk by measuring network risk condi-
tions. In: 2015 Proceedings of the 14th International Conference on Computer and
Information Science (ICIS), pp. 9–14. IEEE/ACIS (2015)
15. Mell, P., Scarfone, K., Romanosky, S.: Common vulnerability scoring system. IEEE
Secur. Priv. 4, 85–89 (2006)
16. FIRST: Common vulnerability scoring system v3.0: Spec. Doc., June 2015
17. MITRE: Common weakness scoring system (2016). https://cwe.mitre.org/cwss/
18. Wang, L., Noel, S., Jajodia, S.: Minimum-cost network hardening using attack
graphs. Comput. Commun. 29, 3812–3824 (2006)
19. Keramati, M., Asgharian, H., Akbari, A.: Cost-aware network immunization frame-
work for intrusion prevention. In: Proceedings of the IEEE International Confer-
ence on Computer Applications and Industrial Electronics (ICCAIE), pp. 639–644
(2011)
20. Ghosh, N., Ghosh, S.: An approach for security assessment of network configu-
rations using attack graph. In: Proceedings of the International Conference on
Networks & Communications, pp. 283–288 (2009)
21. Wang, L., Jajodia, S., Singhal, A., Noel, S.: k -zero day safety: measuring the
security risk of networks against unknown attacks. In: Gritzalis, D., Preneel, B.,
Theoharidou, M. (eds.) ESORICS 2010. LNCS, vol. 6345, pp. 573–587. Springer,
Heidelberg (2010). doi:10.1007/978-3-642-15497-3 35
22. Wang, L., Jajodia, S., Singhal, A., Cheng, P., Noel, S.: k-zero day safety: a network
security metric for measuring the risk of unknown vulnerabilities. IEEE Trans.
Dependable Secure Comput. 11, 30–44 (2014)
23. Albanese, M., Jajodia, S., Noel, S.: Time-efficient and cost-effective network hard-
ening using attack graphs. In: IEEE/IFIP International Conference on Dependable
Systems and Networks (DSN 2012), pp. 1–12 (2012)
24. Wang, L., Singhal, A., Jajodia, S.: Toward measuring network security using attack
graphs. In: Proceedings of the 2007 ACM Workshop on Quality of Protection. QoP
2007, pp. 49–54. ACM, New York (2007)
25. Man, D., Wu, Y., Yang, Y.: A method based on global attack graph for network
hardening. In: Proceedings of the 4th International Conference on Wireless Com-
munications, Networking and Mobile Computing, pp. 1–4 (2008)
26. Islam, T., Wang, L.: A heuristic approach to minimum-cost network hardening
using attack graph. In: Proceedings of the New Technologies, Mobility and Security,
pp. 1–5 (2008)
Fast Verification of Digital Signatures in IoT
1 Introduction
The term 'Internet of Things' (IoT) was coined in 1999 by Kevin Ashton. 'Internet' refers to the interconnectivity of devices to create a network, and 'Things' refers to the objects or devices that have the capability to connect to the Internet. The Internet of Things can be defined in many ways [2,10,15,31]. One way of defining it is as 'a network of sensors and smart devices which sense data that is further processed and analysed in a ubiquitous network.' IoT has seen rapid development in recent years because of its 'smartness'. The various applications of IoT include the smart city [5,17], the smart home [6,16], and smart health [1]. These applications have millions of devices generating large volumes of data.
Sensors are used for monitoring various physical conditions such as temperature, sound, and pressure. The network of these distributed sensing objects is collectively referred to as a Wireless Sensor Network (WSN). WSN nodes are deployed widely in various applications because of their low cost and low power consumption. WSN edge nodes act as gateways or bridges between the sensor network and the Internet, as depicted in Fig. 1. These gateway nodes collect data from the sensor nodes and normalize the information received for further processing and storage, and they are also responsible for providing security: they authenticate a sensor node before any exchange of data. Together, these edge nodes have more energy and computation power
for processing than individual sensor nodes. Hence they play the role of a firewall, providing security both for the sensor nodes and toward the Internet.
Fig. 1. Gateway nodes bridging the sensor nodes and the Internet/cloud.
Security is a major concern in IoT, since millions of devices sense and communicate large volumes of private and sensitive data. There are a number of fundamental security capabilities that an IoT system should possess, since the sensor nodes are particularly vulnerable to threats. Therefore an IoT security standard must address challenges such as scalability, privacy, and authentication. IoT is a combination of various networks, where various sensor nodes generate heterogeneous sets of data. Therefore, building a standard, secure, and reliable system for IoT is still a challenge.
Most of the threats are categorised into three major categories:
There are various ways to overcome these threats: by implementing security protocols such as TLS and SSL, and by providing digital certificate standards and Certificate Authorities (CAs), which are based on Public Key Infrastructure (PKI). Before processing any data, the authenticity of the sender has to be established by verifying the sender's digital signature. There are many standard digital signature algorithms, such as the RSA digital signature, DSA, and ECDSA, which satisfy the CIA (Confidentiality, Integrity, Authentication) triad properties.
2 Related Work
There has been a lot of research on the security of IoT in recent times [25,28–30]. Many researchers have worked on standardizing security protocols for IoT, but due to the diversity of its applications, it is difficult to standardize the security architecture. Various lightweight authentication schemes have been proposed to reduce the computation load and computation time [13,14,20,21] on IoT devices.
Many digital signature schemes [7,19,22,26] have been proposed for checking the authenticity, integrity, and non-repudiation properties. There has been research on improving the signature verification time through batch verification [8], and many batch verification techniques for RSA digital signatures [3,12], DSA signatures [11,24], ECDSA signatures [27], etc. have been proposed. To the best of our knowledge, no standard, efficient batch signature verification technique has been introduced for IoT so far.
3 Definitions
In this section we provide formal definitions of various notions.
It is required that, except with negligible probability over (pk, sk) output by Gen(1^n), it holds that Vrfy_pk(m, Sign_sk(m)) = 1 for every (legal) message m. A signature s on m is considered valid if Vrfy_pk(m, s) = 1.
4 Proposed Method
As IoT devices exchange huge volumes of information, providing end-to-end authentication between the sensor nodes is critical. In our work, we reduce the verification time required for the authentication of these millions of nodes in IoT. Batch verification of signatures already reduces the total verification time, but in order to reduce it further, we apply parallelism along with batch verification. As explained earlier, the edge nodes in IoT can distribute the verification and processing load among themselves, as in the cluster considered for our study.
Parallel processing has the advantage of reduced computation time and cost. Therefore, in our study we implement parallel processing for three batch verification algorithms, A1 [12], A2 [23], and A3 [3], for the RSA digital signature scheme. We use MPI (Message Passing Interface) [9] to distribute the load among the different processors in the workstation cluster. MPI provides a library specification for efficient parallel message passing, offering advantages such as portability, efficiency, and flexibility across various platforms.
Algorithm A1: Harn's algorithm [12] verifies t RSA signatures at once by checking
(∏_{i=1}^{t} s_i)^e = ∏_{i=1}^{t} h(m_i) mod n    (1)
From this equation it is clear that, after receiving the signatures s_i, on the LHS all the s_i values are multiplied together and the result is raised to the public key e. On the RHS, the hash value h(m_i) of each message is generated and these values are multiplied together. If the LHS and RHS match, all the signatures are valid; otherwise, one or more invalid signatures exist in the given batch.
Algorithm A2: This algorithm, proposed by Hwang et al., is a modification of Algorithm A1 that improves its security. The proposed equation to batch verify the signatures is
(∏_{i=1}^{t} s_i^{v_i})^e = ∏_{i=1}^{t} h(m_i)^{v_i} mod n    (2)
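To make the two checks concrete, here is a small self-contained sketch of batch verification for A1 and A2 over toy RSA parameters; the tiny primes, the use of SHA-256 as h, and all helper names are our assumptions for illustration, not the original schemes' parameters.

```python
import hashlib
import secrets
from math import prod

# Toy RSA parameters for illustration only (real deployments use >= 2048-bit moduli).
p, q, e = 1_000_003, 1_000_033, 65_537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def h(m: bytes) -> int:                       # hash-and-sign style message digest
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

def sign(m: bytes) -> int:
    return pow(h(m), d, n)

def batch_verify_a1(msgs, sigs) -> bool:
    """A1: (prod s_i)^e == prod h(m_i) (mod n)."""
    lhs = pow(prod(sigs) % n, e, n)
    rhs = prod(h(m) for m in msgs) % n
    return lhs == rhs

def batch_verify_a2(msgs, sigs, v_bits: int = 16) -> bool:
    """A2: verifier-chosen random small exponents v_i weight each signature."""
    v = [secrets.randbits(v_bits) | 1 for _ in sigs]
    lhs = pow(prod(pow(s, vi, n) for s, vi in zip(sigs, v)) % n, e, n)
    rhs = prod(pow(h(m), vi, n) for m, vi in zip(msgs, v)) % n
    return lhs == rhs

msgs = [b"reading-1", b"reading-2", b"reading-3"]
sigs = [sign(m) for m in msgs]
print(batch_verify_a1(msgs, sigs), batch_verify_a2(msgs, sigs))   # True True
```

With valid signatures both checks pass; Sect. 6 explains why A1's unweighted check can be fooled.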
4.3 Workflow
The aim of our work is to reduce the computation load on a single node during signature verification, since IoT sensor nodes have limited capacity. The verification load is distributed among the available nodes through parallel processing, which reduces both computation time and load.
In the proposed system for batch verification, a server node performs the task of scheduling the batch verification jobs among the available gateway nodes and generates the final results. To emulate this scenario, we have designed and implemented a 7-node cluster system for the batch verification of digital signatures. It may be noted that each cluster node has larger capacity and computation power than a gateway node: gateway nodes have either dual- or quad-core 500 MHz–1 GHz processors, so each processor of our cluster system is equivalent to two gateway nodes.
In Fig. 2, we can observe that the master distributes the load to the other workstations, and the communication happens through MPI. Each workstation gets a set of signatures which have to be verified through batch verification; the public key needed for verification is available to every workstation.
Fig. 2. Master/slave workstation cluster: the master node distributes signature batches to the slave nodes (node 1 to node n) over MPI.
Gateway nodes have more computing power than the sensors or IoT devices; every gateway node can process data from around 2000 sensors. Therefore, to handle more load, i.e., to process more data from sensors, higher processing power is needed.
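A minimal sketch of the described master/slave distribution using mpi4py; the placeholder batch_verify, the stand-in data, and the single scatter/gather round are our simplifications of the workflow (in practice each node would run one of the A1–A3 checks from the previous sketch).

```python
# Run with, e.g.:  mpiexec -n 7 python parallel_batch_verify.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

def batch_verify(chunk):
    """Placeholder for an RSA batch check (A1/A2/A3) over one sub-batch."""
    msgs, sigs = chunk
    return len(msgs) == len(sigs)              # stand-in verdict

if rank == 0:
    # Master: split the incoming (message, signature) batch into `size` sub-batches.
    msgs = [f"reading-{i}".encode() for i in range(7000)]
    sigs = [b"sig"] * len(msgs)                # stand-in signatures
    chunks = [(msgs[i::size], sigs[i::size]) for i in range(size)]
else:
    chunks = None

my_chunk = comm.scatter(chunks, root=0)        # each node receives one sub-batch
my_result = batch_verify(my_chunk)             # nodes verify their sub-batches in parallel
results = comm.gather(my_result, root=0)       # master collects the per-node verdicts

if rank == 0:
    print("all sub-batches valid:", all(results))
```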
5 Results
The verification times obtained for Algorithm A1 are shown in Table 1. The table clearly indicates that, as the number of workstations increases, the verification time required for a batch of signatures decreases accordingly. It can also be seen that, as the batch size increases, the verification time increases. We can also observe the performance gain. The verification time for batch size 24 remains almost the same for all seven machines because the time needed to verify such a small batch is very small.
Case 2: For Algorithm A2, the details of the time required are given in Table 2, for the same input. For 7 machines, the performance gain is almost 6–6.5 times. The verification time increases slightly for this algorithm, since the number of modular exponentiations increases, but the difference is negligible when compared to the security provided.
Case 3: For Algorithm A3, the details of the time required are given in Table 3, which shows similar results. There is not much difference in the number of exponentiation operations compared to Algorithm A2, but Algorithm A3 is more secure.
6 Security Analysis
Since our study focuses on three batch verification techniques for RSA digital signatures, in this section we analyse and compare the security aspects of the three techniques. Algorithm A1 by L. Harn is prone to an adaptive chosen-message attack. This can be explained as follows. If an attacker wants to send a set of messages m_1, m_2, ..., m_t, he first generates fake signatures s_1', s_2', ..., s_t' for the t messages such that s_i' = s_i · a_i mod n, where i = 1, 2, ..., t and ∏_{i=1}^{t} a_i = 1 mod n, and sends them across. At verification, this set of signatures is successfully verified as a batch and the verifier fails to detect the fake signatures.
In another attack, the sender generates the signatures s_1 = h(m_3)^d, s_2 = h(m_1)^d, s_3 = h(m_2)^d, etc., which are successfully verified as a batch. In both attacks, however, the invalid signatures are identified if they are verified individually.
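The first flaw is easy to reproduce with the toy parameters from the earlier sketch: scaling valid signatures by factors a_i whose product is 1 mod n leaves the A1 batch check untouched while every individual check fails (again an illustration under assumed parameters, not the authors' code).

```python
import hashlib
import secrets
from math import prod

p, q, e = 1_000_003, 1_000_033, 65_537        # same toy parameters as before
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
h = lambda m: int.from_bytes(hashlib.sha256(m).digest(), "big") % n

msgs = [b"m1", b"m2", b"m3"]
sigs = [pow(h(m), d, n) for m in msgs]

# Forgery: scale the signatures by a_i with prod(a_i) = 1 (mod n).
a1 = secrets.randbelow(n - 2) + 2
a2 = pow(a1, -1, n)                            # works when gcd(a1, n) == 1
fake = [sigs[0] * a1 % n, sigs[1] * a2 % n, sigs[2]]

batch_ok = pow(prod(fake) % n, e, n) == prod(h(m) for m in msgs) % n
individual_ok = all(pow(s, e, n) == h(m) for m, s in zip(msgs, fake))
print(batch_ok, individual_ok)                 # True False: batch passes, individuals fail
```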
To improve the security of Algorithm A1 and overcome its flaws, Algorithm A2 was introduced. However, this technique too is vulnerable to attack: the chance of verifying an invalid signature as valid is 50%. A dishonest signer chooses a w such that w^2 = 1 mod n and generates the invalid signatures s_i' = s_i · w mod n. The probability of choosing an even random number is 50%, and therefore the probability of accepting an invalid signature as valid is 50%. This technique also increases the number of modular exponentiation operations at the verifier during batch verification, so the extra security comes at a cost of additional computation.
As IoT has millions of sensor devices sending information across the network, there is a need to provide security and authentication to preserve the integrity and privacy of that information. Our idea of accelerating batch verification techniques therefore significantly reduces the time needed to verify millions of signatures, which is a significant advantage for the digital world. This aids 'smart' projects such as the smart city and smart healthcare.
For our experimental results, we considered the batch verification techniques introduced for the RSA digital signature scheme, since it is the first scheme for which a batch verification strategy was introduced and it is easy to interpret. We plan to extend our experimental results to the various batch verification techniques introduced for DSA and ECDSA, and we look forward to implementing and studying a batch verification strategy for Type 2 signatures.
References
1. Amendola, S., Lodato, R., Manzari, S., Occhiuzzi, C., Marrocco, G.: RFID tech-
nology for IoT-based personal healthcare in smart spaces. IEEE Internet Things
J. 1(2), 144–152 (2014)
2. Atzori, L., Iera, A., Morabito, G.: The internet of things: a survey. Comput. Netw.
54(15), 2787–2805 (2010)
3. Bao, F., Lee, C.-C., Hwang, M.-S.: Cryptanalysis and improvement on batch ver-
ifying multiple rsa digital signatures. Appl. Math. Comput. 172(2), 1195–1200
(2006)
4. Bellare, M., Garay, J.A., Rabin, T.: Fast batch verification for modular exponen-
tiation and digital signatures. In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS,
vol. 1403, pp. 236–250. Springer, Heidelberg (1998). https://doi.org/10.1007/
BFb0054130
5. Cocchia, A.: Smart and digital city: a systematic literature review. In: Dameri,
R.P., Rosenthal-Sabroux, C. (eds.) Smart City. PI, pp. 13–43. Springer, Cham
(2014). https://doi.org/10.1007/978-3-319-06160-3 2
6. Du, K.-K., Wang, Z.-L., Hong, M.: Human machine interactive system on smart
home of IoT. J. China Univ. Posts Telecommun. 20, 96–99 (2013)
7. Even, S., Goldreich, O., Micali, S.: On-line/off-line digital signatures. J. Cryptol.
9(1), 35–67 (1996)
8. Fiat, A.: Batch RSA. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp.
175–185. Springer, New York (1990). https://doi.org/10.1007/0-387-34805-0 17
9. Gropp, W., Lusk, E., Doss, N., Skjellum, A.: A high-performance, portable imple-
mentation of the MPI message passing interface standard. Parallel Comput. 22(6),
789–828 (1996)
10. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of things (IoT): a
vision, architectural elements, and future directions. Future Gener. Comput. Syst.
29(7), 1645–1660 (2013)
11. Harn, L.: Batch verifying multiple DSA-type digital signatures. Electron. Lett.
34(9), 870–871 (1998)
12. Harn, L.: Batch verifying multiple RSA digital signatures. Electron. Lett. 34(12),
1219–1220 (1998)
13. Hernandez-Ramos, J.L., Pawlowski, M.P., Jara, A.J., Skarmeta, A.F., Ladid,
L.: Toward a lightweight authentication and authorization framework for smart
objects. IEEE J. Sel. Areas Commun. 33(4), 690–702 (2015)
14. Jan, M.A., Nanda, P., He, X., Tan, Z., Liu, R.P.: A robust authentication scheme
for observing resources in the internet of things environment. In: 2014 IEEE 13th
International Conference on Trust, Security and Privacy in Computing and Com-
munications (TrustCom), pp. 205–211. IEEE (2014)
15. Jia, X., Feng, Q., Fan, T., Lei, Q.: Rfid technology and its applications in internet
of things (IoT). In: 2012 2nd International Conference on Consumer Electronics,
Communications and Networks (CECNet), pp. 1282–1285. IEEE (2012)
16. Jie, Y., Pei, J.Y., Jun, L., Yun, G., Wei, X.: Smart home system based on IoT
technologies. In: 2013 Fifth International Conference on Computational and Infor-
mation Sciences (ICCIS), pp. 1789–1791. IEEE (2013)
17. Jin, J., Gubbi, J., Marusic, S., Palaniswami, M.: An information framework for
creating a smart city through internet of things. IEEE Internet Things J. 1(2),
112–121 (2014)
18. Katz, J., Lindell, Y.: Introduction to Modern Cryptography. CRC Press, Boca
Raton (2014)
19. Lamport, L.: Constructing digital signatures from a one-way function. Technical
report CSL-98, SRI International Palo Alto (1979)
20. Lee, J.-Y., Lin, W.-C., Huang, Y.-H.: A lightweight authentication protocol for
internet of things. In: 2014 International Symposium on Next-Generation Elec-
tronics (ISNE), pp. 1–2. IEEE (2014)
21. Liu, J., Xiao, Y., Chen, C.P.: Authentication and access control in the internet of
things. In: 2012 32nd International Conference on Distributed Computing Systems
Workshops (ICDCSW), pp. 588–592. IEEE (2012)
22. Merkle, R.C.: Method of providing digital signatures, US Patent 4,309,569, 5
January 1982
23. Min-Shiang, H., Cheng-Chi, L., Yuan-Liang, T.: Two simple batch verifying mul-
tiple digital signatures. In: Qing, S., Okamoto, T., Zhou, J. (eds.) ICICS 2001.
LNCS, vol. 2229, pp. 233–237. Springer, Heidelberg (2001). https://doi.org/10.
1007/3-540-45600-7 26
24. Naccache, D., M’Raı̈hi, D., Vaudenay, S., Raphaeli, D.: Can D.S.A. be improved?
— Complexity trade-offs with the digital signature standard —. In: De Santis, A.
(ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 77–85. Springer, Heidelberg (1995).
https://doi.org/10.1007/BFb0053426
25. Riahi, A., Challal, Y., Natalizio, E., Chtourou, Z., Bouabdallah, A.: A systemic
approach for IoT security. In: 2013 IEEE International Conference on Distributed
Computing in Sensor Systems (DCOSS), pp. 351–355. IEEE (2013)
26. Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures
and public-key cryptosystems. Commun. ACM 21(2), 120–126 (1978)
27. Shao, Z.: Batch verifying multiple DSA-type digital signatures. Comput. Netw.
37(3), 383–389 (2001)
28. Xu, T., Wendt, J.B., Potkonjak, M.: Security of IoT systems: design challenges and
opportunities. In: Proceedings of the 2014 IEEE/ACM International Conference
on Computer-Aided Design, pp. 417–423. IEEE Press (2014)
29. Zhang, Z.K., Cho, M.C.Y., Wang, C.W., Hsu, C.W., Chen, C.K., Shieh, S.: Iot
security: ongoing challenges and research opportunities. In: 2014 IEEE 7th Inter-
national Conference on Service-Oriented Computing and Applications (SOCA),
pp. 230–234. IEEE (2014)
30. Zhao, K., Ge, L.: A survey on the internet of things security. In: 2013 9th Interna-
tional Conference on Computational Intelligence and Security (CIS), pp. 663–667.
IEEE (2013)
31. Zhu, Q., Wang, R., Chen, Q., Liu, Y., Qin, W.: IoT gateway: bridging wireless sensor networks into Internet of Things. In: 2010 IEEE/IFIP 8th International Confer-
ence on Embedded and Ubiquitous Computing (EUC), pp. 347–352. IEEE (2010)
Efficient and Provably Secure Pairing Free
ID-Based Directed Signature Scheme
1 Introduction
Identity-based (ID-based) cryptography reduces key management complexity and eliminates the need for digital certificates by creating the public key from an entity's public identity. A reliable third party called the Key Generation Centre (KGC) generates the private key using the entity's public key.
In an ordinary signature scheme, the validity of a signature on a message can be verified by anyone. However, this public verifiability is not desirable in some applications where the signed message is sensitive to the signature receiver. To deal with specific application scenarios, such as signatures on medical records, tax information, and personal or business transactions, one may use directed signatures. In a directed signature scheme, the validity of a signature can be verified only by the designated verifier (receiver). If a difference of opinion occurs between the signer and the designated verifier, both can prove the correctness of the signature to a third party.
1.2 Motivation
The above-mentioned ID-based directed signature schemes are designed using bilinear pairings over elliptic curves, and a pairing operation costs roughly 20 times as much as a scalar multiplication over an elliptic curve group. Hence most of these schemes are less efficient and cannot be applied efficiently in practice. Also, Elliptic Curve Cryptography (ECC) provides high security with smaller key sizes, so computation time, storage space, and bandwidth consumption become much smaller with these small keys. According to the National Institute of Standards and Technology (NIST), to achieve a high security level such as that of 256-bit AES (a symmetric algorithm), RSA needs a 15360-bit key, whereas ECC needs only 521 bits. Similarly, for an 80-bit symmetric security level, RSA needs a 1024-bit key, whereas ECC needs only 160 bits. Hence, schemes with a general hash function under Elliptic Curve Cryptography (ECC) in a pairing-free environment are more desirable, achieving high efficiency with the same security. This motivated us to design a pairing-free directed signature scheme in the identity-based framework.
2 Preliminaries
In this section we briefly describe the fundamental concepts that are required in the
proposed scheme.
3 Proposed IDBDS Scheme
In this section we propose our efficient identity-based directed signature (IDBDS) scheme and prove its security.
Proof: Let n be an ECDL challenger that is given a random instance (Q = sP) of the ECDL problem in G, for a randomly chosen s ∈ Z_q; its goal is to compute s. Let ADV be an adversary who interacts with n by performing oracle queries as modelled in [12]. We now show that n can solve the ECDLP using ADV. During the simulation, n needs to guess the target identity of ADV; without loss of generality, n takes ID* as the target identity of ADV on a message m*.
– Initialization Phase: Algorithm n sets P_Pub = Q = sP, runs Setup to generate the remaining public parameters, and gives them together with P_Pub to ADV.
– Query Phase: In this phase, ADV performs its oracle queries and n responds as follows.
Queries on oracle H1 (H1(IDi, Ri, P_Pub)): A list L1, with records of the form (IDi, Ri, P_Pub, l1i), is maintained by n. After receiving a query on H1(IDi, Ri, P_Pub), if a record (IDi, Ri, P_Pub, l1i) exists in L1, n returns l1i. Otherwise, n picks a random l1i, adds it to L1, and returns l1i.
Sometimes ADV may query the public key component corresponding to an identity IDi, i.e., ADV wants to know the actual Ri corresponding to IDi. n does the following.
(i) If IDi = ID*, n sets Ri = sP = P_Pub, where s is unknown to n and P_Pub is the ECDL instance that n wants to solve. n stores the record (IDi, Ri, ⊥, P_Pub, l1i) in L1 and returns Ri to ADV.
(ii) If IDi ≠ ID*, n chooses ri ∈ Z_q, sets Ri = ri P − l1i P_Pub, stores the record (IDi, Ri, ri, P_Pub, l1i) in L1, and returns Ri to ADV.
Queries on oracle H2 (H2(m, IDs, IDv, Us, Rs)): A list L2, with records of the form (m, IDs, IDv, Us, Rs, l2i), is maintained by n. After receiving an H2 query on (m, IDs, IDv, Us, Rs), if a record (m, IDs, IDv, Us, Rs, l2i) exists in L2, n returns l2i; otherwise, n picks a random l2i ∈ Z_q, adds (m, IDs, IDv, Us, Rs, l2i) to L2, and returns l2i.
Queries on oracle H3 (H3(m, IDs, IDv, Us, Rs, l2i)): A list L3, with records of the form (m, IDs, IDv, Us, Rs, l2i, l3i), is maintained by n. After receiving a query on H3(m, IDs, IDv, Us, Rs, l2i), n returns l3i if the record exists in L3. Otherwise, n picks a random l3i ∈ Z_q, adds (m, IDs, IDv, Us, Rs, l2i, l3i) to L3, and returns l3i.
Key Extraction Oracle (KExt(IDi)): When ADV makes this query on an identity IDi, n does the following. If IDi = ID*, n aborts. Otherwise (if IDi ≠ ID*), n sets di = ri and returns di to ADV.
Signing Oracle: When n receives a query on (IDs, m) with a verifier IDv, n first makes queries on the H1, H2, H3 oracles and recovers the records (IDi, Ri, P_Pub, l1i), (m, IDs, IDv, Us, Rs, l2i), and (m, IDs, IDv, Us, Rs, l2i, l3i) from L1, L2, L3, respectively. n then generates two random numbers r1i, r2i ∈ Z_q and sets ki = r1i, Vi = (r1i P − (Ri + l1i P_Pub) l2i) l3i^{-1}, Ui = r2i Rv, and Wi = r2i P.
DVerify Oracle (DV(ID_i)): ADV submits (ID_s, ID_v, m) and σ_i = (R_i, W_i, V_i, k_i) to n. n first recovers (ID_v, R_v, P_Pub, l_1i) from the L1 list and continues as follows.
(i) If ID_v ≠ ID*, n computes U_i = r_v W_i and then recovers the entries l_2i = H2(m, ID_s, ID_v, U_i, R_i) and l_3i = H3(m, ID_s, ID_v, U_i, R_i, l_2i) from the L2 and L3 lists.
If these entries do not exist, n selects l_2i, l_3i ∈ Zq* and defines H2(m, ID_s, ID_v, U_i, R_i) = l_2i and H3(m, ID_s, ID_v, U_i, R_i, l_2i) = l_3i. n then verifies Eq. (1) to check the validity of σ_i = (R_i, W_i, V_i, k_i) and returns either 1 (valid) or 0 (invalid) to ADV.
(ii) If ID_v = ID*, n works through all possible entries H2(m, ID_s, ID_v, U_i, R_i) and H3(m, ID_s, ID_v, U_i, R_i, l_2i) for some U_i:
• For each possible entry H2(m, ID_s, ID_v, U_i, R_i) = l_2i and H3(m, ID_s, ID_v, U_i, R_i, l_2i) = l_3i for some U_i, n evaluates Eq. (1). If the verification returns 1 (valid), then n returns 1 (valid) to ADV.
• If the above procedure does not lead n to return an answer for ADV, n returns 0 (invalid) to ADV.
PVerify Oracle (PV(ID_i)): ADV submits (ID_s, ID_v, m) and σ_i = (R_i, W_i, V_i, k_i) to n. n follows the same procedure as in the simulation of the DVerify oracle. The only difference is: when n judges σ_i = (R_i, W_i, V_i, k_i) to be valid (i.e., returns 1 in the DVerify oracle), it returns Aid = U_i = r_2i R_v = r_v W_i = Y_i (say) to ADV. When n judges σ_i = (R_i, W_i, V_i, k_i) to be invalid (i.e., returns 0 in the DVerify oracle), it returns ⊥ to ADV.
– Forgery: Finally, ADV outputs (ID_s*, ID_v*, m*, σ_i*) as its forgery, where σ_i* = (R_i*, W_i*, V_i*, k_i*).
If ID_i ≠ ID_s*, n stops the simulation. Otherwise, let σ_i^(1) = (R_i, W_i, V_i^(1), k_i^(1)) denote σ_i* = (R_i*, W_i*, V_i*, k_i*). By the Forking Lemma [9], n repeats the simulation with the same random tape but different choices of H2, H3; ADV will output another two forgeries σ_i^(j) = (R_i, W_i, V_i^(j), k_i^(j)) for j = 2, 3, and Eq. (1) holds. Hence

(k_i^(j) P − (R_i + l_1i P_Pub) l_2i^(j)) (l_3i^(j))^{-1} = V_i^(j)   for j = 1, 2, 3.
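To make the reconstructed relations above concrete, the following Python sketch checks the algebra of the simulated signing oracle and of Eq. (1), using plain integer arithmetic modulo a prime q as a stand-in for the elliptic-curve group (so P is simply the residue 1 and k·P means k mod q). The variable names mirror the hash outputs and randomness above; this is only a consistency check of the reconstructed formulas under these assumptions, not an implementation of the scheme.

    # Consistency check of the signing/verification algebra reconstructed above,
    # using integers mod q instead of an elliptic-curve group (P := 1).
    import random

    q = 2**61 - 1                       # a prime, standing in for the group order
    P = 1                               # "generator": scalar multiplication k*P is k mod q

    s = random.randrange(1, q)          # master key (unknown to the simulator in the proof)
    P_pub = (s * P) % q                 # system public key

    # Key extraction for a non-target identity: R_i = r_i*P - l1*P_pub, d_i = r_i
    r_i, l1 = random.randrange(1, q), random.randrange(1, q)
    R_i = (r_i * P - l1 * P_pub) % q
    d_i = r_i                           # then d_i*P = R_i + l1*P_pub

    assert (d_i * P) % q == (R_i + l1 * P_pub) % q

    # Simulated signature: k = r1, V = (r1*P - (R_i + l1*P_pub)*l2) * l3^{-1}
    r1, l2, l3 = (random.randrange(1, q) for _ in range(3))
    k = r1
    V = ((r1 * P - (R_i + l1 * P_pub) * l2) * pow(l3, -1, q)) % q

    # Eq. (1) as reconstructed: (k*P - (R_i + l1*P_pub)*l2) * l3^{-1} == V
    lhs = ((k * P - (R_i + l1 * P_pub) * l2) * pow(l3, -1, q)) % q
    assert lhs == V
    print("Eq. (1) holds for the simulated signature")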
4 Efficiency Analysis
In this section we compare our scheme with the relevant schemes [12, 13, 16] in terms
of computation and communication cost. Various cryptographic operations and their
conversions are presented in Table 1 [2]. The detailed comparison of our IDBDS
scheme with other directed signature schemes is presented in Table 2. From Table 2, it is clear that all the existing directed signature schemes use bilinear pairings, whereas our IDBDS scheme does not. The security of our scheme is proven under the hardness of the ECDL problem. Hence, our proposed scheme is computationally more efficient than all the other schemes.
5 Conclusions
In this paper, we have presented a novel and efficient IDBDS scheme over elliptic curves in a pairing-free environment. This is the first ID-based directed signature scheme in a pairing-free setup. All the existing directed signature schemes in the ID-based setting use bilinear pairings, and the computation of a bilinear pairing is the most expensive operation. The proposed scheme does not use pairings, and hence it is computationally more efficient than the well-known existing directed signature schemes. The proposed scheme is secure under the assumption that the ECDLP is hard. Hence, it can be applied in many applications, such as signatures on medical records or tax information, where the message is sensitive to the signature receiver.
Table 2. Comparison of the proposed IDBDS scheme with the related schemes

Sun et al. (2008): Signing cost 3T_SM + 1T_BP + 2T_MTPH + 1T_PA; DVerify cost 4T_BP + 1T_MTPH; PVerify cost 3T_BP + 1T_MTPH; Total cost 899.12 T_MM; Signature length 3|G1|; Without pairing: No; Hard problem: CDH & DBDH
B.U.P. et al. (2009): Signing cost 3T_SM + 2T_BP + 2T_MTPH + 1T_H + 1T_PEX; DVerify cost 3T_BP + 1T_MTPH + 1T_H + 1T_PEX; PVerify cost 2T_BP + 1T_MTPH + 1T_H + 1T_PEX; Total cost 942.5 T_MM; Signature length 2|G1| + |Zq|; Without pairing: No; Hard problem: CDH & DBDH
Zhang et al. (2009): Signing cost 6T_SM + 1T_BP + 2T_H + 1T_XOR + 6T_PEX; DVerify cost 2T_SM + 6T_BP + 2T_H + 1T_XOR + 2T_PEX; PVerify cost 2T_SM + 4T_BP + 1T_H + 1T_PEX; Total cost 1595 T_MM; Signature length 4|G1|; Without pairing: No; Hard problem: CDH & DBDH
Our scheme: Signing cost 3T_SM + 2T_H + 2T_PA; DVerify cost 5T_SM + 2T_H + 3T_PA + 1T_INV; PVerify cost 4T_SM + 2T_H + 2T_PA + 1T_INV; Total cost 372.04 T_MM; Signature length 3|G1| + |Zq|; Without pairing: Yes; Hard problem: ECDL
Acknowledgements. The authors are grateful to the reviewers and sincerely thank them for their valuable suggestions. This work is supported by WOS-A, DST, Govt. of India under grant No. SR/WOS-A/PM-1033/2014 (G).
References
1. Diffie, W., Hellman, M.E.: New directions in cryptography. IEEE Trans. Inf. Theor. 22, 644–
654 (1976)
2. Islam, S.K., Biswas, G.P.: A Pairing free identity-based authenticated group key agreement
protocol for imbalanced mobile networks. Ann. Telecommun. 67, 547–558 (2012). Springer
3. Ismail, E.S., Abu-Hassan, Y.: A directed signature scheme based on discrete logarithm
problems. Jurnal Teknologi 47(C), 37–44 (2007)
4. Ku, J., Yun, D., Zheng, B., Wei, S.: An efficient ID-based directed signature scheme from
optimal eta pairing. In: Li, Z., Li, X., Liu, Y., Cai, Z. (eds.) ISICA 2012. CCIS, pp. 440–448.
Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34289-9_49
5. Laguillaumie, F., Paillier, P., Vergnaud, D.: Universally convertible directed signatures. In:
Roy, B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp. 682–701. Springer, Heidelberg
(2005). https://doi.org/10.1007/11593447_37
6. Lal, S., Kumar, M.: A directed signature scheme and its applications (2004). http://arxiv.org/
abs/cs/0409035
7. Lim, C.H., Lee, P.J.: Modified Maurer-Yacobi’s scheme and its applications. In: Seberry, J.,
Zheng, Y. (eds.) AUSCRYPT 1992. LNCS, vol. 718, pp. 308–323. Springer, Heidelberg
(1993). https://doi.org/10.1007/3-540-57220-1_71
8. Lu, R., Cao, Z.: A directed signature scheme based on RSA assumption. Int. J. Netw. Secur.
2(3), 182–421 (2006)
9. Pointcheval, D., Stern, J.: Security arguments for digital signatures and blind signatures.
J. Crypt. 13(3), 361–369 (2000)
10. Ramlee, N.N., Ismail, E.S.: A new directed signature scheme with hybrid problems. Appl.
Math. Sci. 7(125), 6217–6225 (2013)
11. Shamir, A.: Identity-based cryptosystems and signature schemes. In: Blakley, G.R., Chaum,
D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 47–53. Springer, Heidelberg (1985). https://
doi.org/10.1007/3-540-39568-7_5
12. Sun, X., Li, J., Chen, G., Yung, S.: Identity-based directed signature scheme from bilinear
pairings. https://eprint.iacr.org/2008/305.pdf
13. Uma Prasada Rao, B., Vasudeva Reddy, P., Gowri, T.: An efficient ID-based directed
signature scheme from bilinear pairings. https://eprint.iacr.org/2009/617.pdf
14. Wang, Y.: Directed signature based on identity. J. Yulin Coll. 15(5), 1–3 (2005)
15. Wei, Q., He, J., Shao, H.: Directed signature scheme and its application to group key initial
distribution. In: ICIS-2009, Seoul, Korea, pp. 24–26. ACM (2009)
16. Zhang, J., Yang, Y., Niu, X.: Efficient provable secure ID-based directed signature scheme
without random oracle. In: Yu, W., He, H., Zhang, N. (eds.) ISNN 2009. LNCS, vol. 5553,
pp. 318–327. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01513-7_34
User Authentication Scheme for Wireless Sensor
Networks and Internet of Things Using LU
Decomposition
1 Introduction
The sensor nodes of WSNs or the Internet of Things (IoT), which measure different parameters (temperature, pressure, humidity, light, etc.) of the environment and mutually transmit the processed data to the users or gateway, are confined to tiny computational capacity, small-scale memory, moderate transmission range and short-lived battery power. It is an essential and challenging task for WSNs to accomplish better security using lightweight cryptography on these resource-constrained sensor devices. User authentication is one of the significant needs for WSNs' emerging technologies (remotely monitoring a patient's body condition, electronic devices of industry and smart homes, the possibility of attacks on a battleground, natural calamities, forest fires, etc.). Authenticating users who connect to the WSNs is the process of validating their identity (based on one or more factors such as the user's inherence, possession, or knowledge) using a sensor device.
A secure user validation scheme of WSNs offers various known security features
such as efficient user’s password update mechanism, secure session key estab-
lishment, confidentiality, integrity, availability, non-repudiation, freshness and
mutual authentication of the user, sensor and gateway. A secure WSN resists various well-known security attacks such as sensor node and user identity impersonation attacks, replay attacks, denial-of-service and man-in-the-middle attacks, and stolen smart card attacks.
2 Related Work
Akyildiz et al. [1] analyzed many aspects of WSNs and discussed many open
research issues of WSNs. In 2006, Watro et al. [2] proposed the public-key based scheme TinyPK for securing WSNs, which provides mutual authentication and withstands the sensor impersonation attack. In 2006, Wong et al. [3] suggested a secure hash function based authentication scheme, but it does not support mutual authenticity and session key establishment between the user and the sensor. In 2007, Tseng et al. [4] specified that Watro et al.'s [2] and Wong et al.'s [3] schemes are vulnerable to replay and forgery attacks. Tseng et al. improved Wong et al.'s scheme and
recommended password update mechanism. In 2008, Lee [5] revealed that Wong
et al. scheme exhibits more computational overhead on sensor node compared to
gateway node and improved Wong et al. scheme with less computation overhead
of sensor node. In 2008, Ko [6] indicated that Tseng et al.’s scheme does not con-
tribute mutual authentication and proposed mutual authenticity and timestamp
based scheme. In 2009, Vaidya et al. [7] proposed mutual authentication scheme
with formal verification. In 2009, Das [8] developed a secure mechanism to pro-
vide authenticity using smart card and user’s password (two factor) but it does
not offer a session key between the user and the sensor node. In 2010, Khan and Alghathbar [9] identified the gateway node bypass attack, insider attack and lack of a password update mechanism in Das's [8] scheme and improved Das's scheme by including a password update and mutual authentication technique. In 2010, Yuan et al. [10] provided a biometric-based scheme, but it is unprotected against node capture and denial-of-service attacks. In 2012, Yoo et al. [11] designed a scheme that provides a secure session key and mutual authentication. In 2013, Xue et al. [12] designed a mutual authentication scheme based on temporal information. However, in 2014, Jiang et al. [13] revealed that Xue et al.'s scheme is susceptible to stolen smart card and privileged-insider attacks. In 2015, Das [14] suggested a fuzzy extractor based authentication scheme which resists well-known security attacks on WSNs and has more security features compared to the Althobaiti et al. (2013) [15] scheme.
The outline of this paper is as follows. In Sect. 1, we introduce the basic characteristics, applications and important security features of WSNs. Section 2 consists of the literature survey. In Sect. 3, we explain the notation and mathematical expressions used for designing the protocol. Section 4 presents the techniques we use for the proposed user authentication and session key establishment mechanism. In Sect. 5, we perform the security analysis. Section 6 presents the comparison of computational overhead with other existing protocols. Eventually, in Sect. 7, we present the conclusions of our paper.
Some basic notations which we use for designing our protocol are listed in the following Table 1.
Notations Description
Ui ith User
IDUi Identity of Ui
P WUi Password of Ui
Bi Bio-metric information of Ui
SNj j th Sensor Node
SCUi Smart card of Ui
GW N The gateway node
h(.) A collision resistant one - way hash function
n Maximum numbers of Users and Sensor Nodes in WSNs
LO n × n Lower triangular matrix
UP n × n Upper triangular matrix
LOij Element of LO matrix at ith row and j th column
M at n × n Symmetric matrix such that M at = LO × U P
M atij Element of M at at row i and column j
LOr (Ui ) Row matrix securely assign to Ui
LOr (SNj ) Row matrix securely assign to SNj
U Pc (Ui ) Column matrix assign to Ui
U Pc (SNj ) Column matrix assign to SNj
Gen(.) Generator procedure of Fuzzy Extractor
Rep(.) Reproduction procedure of Fuzzy Extractor
T Error tolerance limit of Fuzzy Extractor
TUi , TGW N , TSNj Current timestamps of Ui , GW N, SNj respectively
T , T , T Current time at GW N, SNj , Ui respectively
Z+ Set of positive integers
|| A string concatenation operator
⊕ A bitwise XOR operator
ΔT Maximum transmission delay
× Matrix multiplication Operator
A Adversary
where (s, s′) ←R A indicates that the pair (s, s′) is randomly chosen by A, and Pr represents the probability of the event (s, s′) ←R A with execution time t.
To design a secure and efficient user validation protocol of WSNs, we use the
concept of fuzzy extractor [17] for authenticating the user and LU decomposition
for establishing the session key between user and sensor node.
In this section, we first describe the concept of the fuzzy extractor and an efficient way of using LU decomposition for establishing the session key. Afterwards, we propose the pre-deployment scheme for the user, sensor and gateway, the procedure for registering the user Ui, and the mechanism of login, authentication and session key establishment between Ui and SNj. Finally, we describe the user's credential update mechanism.
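The fuzzy extractor interface used in this paper (Gen producing a key σ and public helper data τ, Rep reproducing σ from a noisy biometric reading) can be illustrated with a toy secure-sketch construction based on a repetition code. The Python sketch below is only an illustration of the Gen/Rep interface under assumed parameters (40-bit biometric, 5x repetition), not the construction of Dodis et al. [17].

    # Toy fuzzy extractor: Gen(B) -> (sigma, tau), Rep(B', tau) -> sigma when B' is close to B.
    # Uses a 5x repetition code as the secure sketch; illustration only.
    import hashlib, secrets

    REP = 5                                                         # each key bit repeated 5 times

    def gen(bio_bits):
        key_bits = [secrets.randbelow(2) for _ in range(len(bio_bits) // REP)]
        codeword = [b for b in key_bits for _ in range(REP)]
        tau = [b ^ c for b, c in zip(bio_bits, codeword)]           # public helper data
        sigma = hashlib.sha256(bytes(key_bits)).hexdigest()
        return sigma, tau

    def rep(noisy_bits, tau):
        codeword = [b ^ t for b, t in zip(noisy_bits, tau)]
        key_bits = [int(sum(codeword[i:i + REP]) > REP // 2)        # majority decode per block
                    for i in range(0, len(codeword), REP)]
        return hashlib.sha256(bytes(key_bits)).hexdigest()

    B = [secrets.randbelow(2) for _ in range(40)]
    sigma, tau = gen(B)
    B_noisy = list(B); B_noisy[3] ^= 1; B_noisy[27] ^= 1            # noise within tolerance
    assert rep(B_noisy, tau) == sigma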
Here, for LOr(Ui) and UPc(SNj), the values of i and j represent the ith user and the jth sensor node respectively; i and j are also equal to the number of elements of the row matrix LOr(Ui) and the column matrix UPc(SNj) respectively. LOr(Ui)k represents the kth element of LOr(Ui).
LOr(SNj) × UPc(Ui) =
  Σ_{k=0..i} LOr(SNj)_k × UPc(Ui)_k,  if j ≥ i
  Σ_{k=0..j} LOr(SNj)_k × UPc(Ui)_k,  otherwise
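The idea behind the LU-decomposition-based key establishment can be sketched in a few lines of Python: a symmetric matrix Mat = LO × UP is pre-computed, each party holds one row of LO and one column of UP, and the two dot products agree on the shared key element. In the sketch below, choosing UP = LO^T is just one simple way to obtain a symmetric Mat; the scheme itself may construct LO and UP differently, and the dimensions and values are illustrative assumptions.

    # Toy LU-based key pre-distribution: Mat = LO x UP is symmetric, so
    # LOr(Ui).UPc(SNj) == LOr(SNj).UPc(Ui) == Mat[i][j] serves as the pairwise key.
    import random

    n = 4
    LO = [[random.randint(1, 9) if c <= r else 0 for c in range(n)] for r in range(n)]
    UP = [[LO[c][r] for c in range(n)] for r in range(n)]          # UP = LO^T (upper triangular)

    def mat(i, j):
        return sum(LO[i][k] * UP[k][j] for k in range(n))

    i, j = 1, 3                                                    # user Ui, sensor SNj
    key_user   = sum(LO[i][k] * UP[k][j] for k in range(min(i, j) + 1))   # LOr(Ui).UPc(SNj)
    key_sensor = sum(LO[j][k] * UP[k][i] for k in range(min(i, j) + 1))   # LOr(SNj).UPc(Ui)
    assert key_user == key_sensor == mat(i, j) == mat(j, i)
    print("shared key element:", key_user)

Because the stored row and column matrices only keep their non-zero elements, the truncated sums above (up to min(i, j)) already equal the full dot products.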
Storage Analysis. Let len be the length in bits of each keying element of LO or UP, and let z be the number of bits needed to represent the n − 1 zero elements. Then the total memory required to store keys as per Choi et al.'s scheme [18] is

Γ_[18] = 2 × n^2 × len.

The total memory required to store keys as per Pathan et al.'s scheme [19] is

Γ_[19] = len × Σ_{i=1}^{n} i + n × (2 × z) = len × n(n + 1)/2 + n × (2 × z).

The total memory required to store keys in our scheme is

Γ_our = len × Σ_{i=1}^{n} i = len × n(n + 1)/2.
Therefore, we can say that Γour < Γ[19] < Γ[18] .
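The three storage expressions can be checked numerically with a few lines of Python; n, len_ and z below correspond to the symbols defined above, and the example values are illustrative assumptions only.

    # Storage (in bits) required by the three key pre-distribution variants above.
    def gamma_choi(n, len_):            # Choi et al. [18]: 2 * n^2 * len
        return 2 * n * n * len_

    def gamma_pathan(n, len_, z):       # Pathan et al. [19]: len * n(n+1)/2 + n * 2z
        return len_ * n * (n + 1) // 2 + n * (2 * z)

    def gamma_ours(n, len_):            # proposed: len * n(n+1)/2
        return len_ * n * (n + 1) // 2

    n, len_, z = 100, 64, 8             # example parameters (illustrative only)
    assert gamma_ours(n, len_) < gamma_pathan(n, len_, z) < gamma_choi(n, len_)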
In order to retrieve data from SNj , Ui gets authenticated using SCi , IDUi ,
P WUi , noisy bio-metric information Bi and fuzzy extractor function Rep(.).
Afterwards, the GW N verifies the credentials of Ui and sends a secure message
to SNj for establishing a secure session with Ui . SNj verifies the message and
establishes the key with Ui . Table 3 describes the authentication and key sharing
mechanism in detail.
We provide a mechanism for the user Ui to change his/her password and biometric information before an adversary (who can steal the user's credentials without his/her knowledge) gets an opportunity to use them. The procedure for updating the credentials is shown in Table 4.
5 Security Analysis
To validate the security features of our protocol, we first perform an informal analysis considering major and minor attacks in WSNs. Afterwards, we implement our scheme using the Security Protocol Description Language and evaluate our security claims using the Scyther tool [22]. For automated validation of the protocol using the AVISPA tool [21], we use the High-Level Protocol Specification Language (HLPSL). Finally, we do the logical verification of the protocol using BAN logic [23].
Attack Based on Stolen Smart Card: Our scheme is safe against the stolen smart card of a legitimate user Ui because an adversary A cannot extract the secret credentials PW_Ui, σi, LOr(Ui), etc. without having the authentic biometric credential Bi of Ui.
Replay Attack: The timestamps T_Ui, T_GWN, T_SNj are stored in the variables m1, m3, m4 after secure hashing; therefore an adversary A cannot perform a replay attack using the messages M1, M2, M3.
Notations Description
Pr , Qr Principals like Ui , GW N, and SNj
St Statements like TUi , TGW N , α, β etc.
K Secret key or data like KGSNj , XUi etc.
Pr |≡ St   Pr believes St, or Pr is permitted to believe St
Pr ◁ St   Pr has received a message containing St and it can read or repeat St
Pr |∼ St   Pr once said St; Pr sent a message containing St and it could be fresh or old
#(St)   St is fresh and it has not been sent before
Pr ↔(XUi) Qr   XUi is a secret and it is only known to Pr and Qr, and perhaps to the trusted principals
<St>St1   St1 is a secret and its presence gives the identity of whoever generates <St>St1
Security feature   Yoo et al. [11]   Sun et al. [16]   Xue et al. [12]   Jiang et al. [13]   Althobaiti et al. [15]   Ours
SF1 No Yes No Yes Yes Yes
SF2 Yes No No No No Yes
SF3 No No No No No Yes
SF4 No No No No Yes Yes
SF5 No No No No No Yes
SF6 Yes Yes No No Yes Yes
Note: SF1–SF6 are the security features. SF1: resists the attack based on stolen smart card; SF2: secure password updating; SF3: secure biometric information updating; SF4: non-repudiation; SF5: formal security analysis; SF6: no privileged-insider attack.
6 Performance Comparison
Table 7 shows the comparison of our proposed protocol based on security features, and it indicates that our protocol is relatively more secure than the existing protocols. Table 8 presents the computational cost comparison; it shows that our scheme incurs a lower computational cost on all three entities, i.e., Ui, GWN and SNj.
7 Conclusion
In this paper, we first discussed the security issues involved in the sensor nodes of WSNs and proposed a user validation and session key sharing scheme using a smart card, fuzzy extractor and matrix decomposition operation. Afterwards, we performed the security analysis and verification using widely accepted and robust tools such as AVISPA and Scyther. To ensure the correctness of the security features involved in the protocol, we performed the logical verification using BAN logic. Finally, we did a comparative analysis of our protocol with other existing protocols based on security features and computational overhead, which indicates that our protocol is secure and efficient.
References
1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor net-
works: a survey. Comput. Netw. 38(4), 393–422 (2002)
2. Watro, R., Kong, D., Cuti, S.F., Gardiner, C., Lynn, C., Kruus, P.: TinyPK: secur-
ing sensor networks with public key technology. In: ACM Workshop on Security
of Ad Hoc and Sensor Networks, Washington DC, USA, pp. 59–64. ACM Press
(2004)
3. Wong, K.H., Zheng, Y., Cao, J., Wang, S.: A dynamic user authentication scheme
for wireless sensor networks. In: Proceedings of 2006 IEEE International Conference
on Sensor Networks, Ubiquitous, and Trustworthy Computing, Taichung, Taiwan,
pp. 1–9 (2006)
4. Tseng, H.R., Jan, R.H., Yang, W.: An improved dynamic user authentication
scheme for wireless sensor networks. In: Proceedings of IEEE Global Telecommu-
nications Conference (GLOBECOM 2007), Washington, DC, USA, pp. 986–990
(2007)
5. Lee, T.H.: Simple Dynamic user authentication protocols for wireless sensor net-
works. In: The Second International Conference on Sensor Technologies and Appli-
cations, pp. 657–660 (2008)
User Authentication Scheme for WSNs and Internet of Things 53
6. Ko, L.C.: A novel dynamic user authentication scheme for wireless sensor networks.
In: IEEE International Symposium on Wireless Communication Systems (ISWCS
2008), pp. 608–612 (2008)
7. Vaidya, B., Silva, J.S., Rodrigues, J.J.: Robust dynamic user authentication scheme
for wireless sensor networks. In: Proceedings of the 5th ACM Symposium on QoS
and Security for Wireless and Mobile Networks (Q2SWinet 2009), Tenerife, Spain,
pp. 88–91 (2009)
8. Das, M.L.: Two-factor user authentication in wireless sensor networks. IEEE Trans.
Wireless. Comm. 8, 1086–1090 (2009)
9. Khan, M.K., Alghathbar, K.: Cryptanalysis and security improvements of “two-
factor user authentication in wireless sensor networks”. Sensors 10(3), 2450–2459
(2010)
10. Yuan, J., Jiang, C., Jiang, Z.: A biometric-based user authentication for wireless
sensor networks. Wuhan Univ. J. Nat. Sci. 15(3), 272–276 (2010)
11. Yoo, S.G., Park, K.Y., Kim, J.: A Security-performance-balanced user authentica-
tion scheme for wireless sensor networks. Int. J. Distrib. Sens. Netw. 2012, 1–11
(2012)
12. Xue, K., Ma, C., Hong, P., Ding, R.: A temporal-credential-based mutual authen-
tication and key agreement scheme for wireless sensor networks. J. Netw. Comput.
Appl. 36(1), 316–323 (2013)
13. Jiang, Q., Ma, J., Lu, X., Tian, Y.: An efficient two-factor user authentication
scheme with unlinkability for wireless sensor networks. Peer-to-Peer Netw. Appl.
8, 1070–1081 (2014). doi:10.1007/s12083-014-0285-z
14. Das, A.K.: A secure and effective biometric-based user authentication scheme for
wireless sensor networks using smart card and fuzzy extractor. Int. J. Commun.
Syst. (2015). doi:10.1002/dac.2933
15. Althobaiti, O., Al-Rodhaan, M., Al-Dhelaan, A.: An efficient biometric authen-
tication protocol for wireless sensor networks. Int. J. Distrib. Sens. Netw. 2013,
1–13 (2013). Article ID 407971
16. Sun, D.Z., Li, J.X., Feng, Z.Y., Cao, Z.F., Xu, G.Q.: On the security and improve-
ment of a two-factor user authentication scheme in wireless sensor networks. Pers.
Ubiquit. Comput. 17(5), 895–905 (2013)
17. Dodis, Y., Reyzin, L., Smith, A.: Fuzzy extractors: how to generate strong keys
from biometrics and other noisy data. In: Cachin, C., Camenisch, J.L. (eds.)
EUROCRYPT 2004. LNCS, vol. 3027, pp. 523–540. Springer, Heidelberg (2004).
doi:10.1007/978-3-540-24676-3_31
18. Choi, S.J., Youn, H.Y.: An efficient key pre-distribution scheme for secure dis-
tributed sensor networks. In: Enokido, T., Yan, L., Xiao, B., Kim, D., Dai,
Y., Yang, L.T. (eds.) EUC 2005. LNCS, vol. 3823, pp. 1088–1097. Springer,
Heidelberg (2005). doi:10.1007/11596042_111
19. Pathan, A.K., Dai, T.T., Hong, C.S.: An efficient LU decomposition-based key pre-
distribution scheme for ensuring security in wireless sensor networks. In: Proceed-
ings of The Sixth IEEE International Conference on Computer and Information
Technology, CIT 2006, p. 227 (2006)
20. Dolev, D., Yao, A.: On the security of public key protocols. IEEE Trans. Inf. Theor.
29(2), 198–208 (1983)
21. AVISPA. http://www.avispa-project.org/
22. Cremers, C.: Scyther - semantics and verification of security protocols. Ph.D. dis-
sertation, Eindhoven University of Technology, Netherlands (2006)
23. Burrows, M., Abadi, M., Needham, R.M.: A logic of authentication. Proc. Roy.
Soc. Lond. 426, 233–271 (1989)
Detection of Zeus Bot Based on Host
and Network Activities
1 Introduction
Bots are one of the most critical hazards on the Internet. The main feature that differentiates a bot from other types of malware is the command and control channel through which the Zeus bot is controlled by its bot master. The commands vary depending on the motivation of the botnet. The motive of the Zeus bot is to steal user credentials and send them to its bot master. Zeus botnets use keystroke logging and form-grabbing attacks that target bank data, account logins, and private user data. The information gathered by the Zeus botnet is used for online identity theft, credit card theft, and more. The functionalities of the Zeus bot are as follows:
• Copies the original executable to another location, executes the copied file and deletes the original.
• Downloads the config.bin file from the bot master and executes it.
• Steals user credentials from the infected system.
• Sends the stolen data back to the bot master.
2 Related Works
BotSwat [1] is used to differentiate bot programs and benign programs in a host
system. It is done by monitoring the commands executed in the host machine.
The final judgment is done by checking whether the input data of the executed
command in the host is received from the network or not. Generally C&C channel
is used by the botmaster to send commands to the bot infected host. Thus
monitoring the network of the host system can detect the bot at the host level.
BotTracer [2] detects bots in three stages. A bot has three features at its onset: automatic startup of the bot without any user action, establishment of a command and control channel with its bot master, and local or remote attacks by the bot. BotTracer detects the bot using these three phases: by capturing the three features during the bot's execution, the bot is detected.
BotTee [3] captures and analyses the runtime call behaviour of the bot during its execution. It recognizes the behavior triggered by every command, irrespective of the syntax of different bot protocols. The approach is based on intercepting Windows API system calls from a list of popular calls. When a bot starts executing, the API calls are compared to a set of call patterns.
Detection of botnets using combined host and network-level information is
explained in [4]. The host level information consists of registry, file system, and
network stack. The network level information consists of network behavior fac-
tors by analyzing the net flow of the system. The host and network information
are used to determine the infected host through clustering and correlation.
EFFORT [5] proposes an approach to correlate information from the host and network levels and designs an architecture to coordinate the monitoring modules. Implementation and evaluation are done on real-world benign and malicious programs. Five modules were designed for bot detection: the Human-Process-Network correlation analysis module, the Process reputation analysis module, the System resource exposure analysis module, the Network information trading analysis module, and the Correlation engine.
4 Proposed System
Based on the analysis performed on the functionalities of the Zeus bot, the
following modules have been identified for detection of Zeus bot:
Figure 3 provides a diagrammatic representation for the work flow of the pro-
posed system. The modules are designed based on the functionalities of the Zeus
bot. Once the system starts booting the Folder monitoring module is started.
When the executable file is found, the Host Network monitoring module and
API hook monitoring module are triggered, and also value true is passed into
Integrated Decision module. The captured network traffic is compared with pre-
defined patterns. If the network traffic matches the Configuration file download
pattern and the HTTP POST message pattern, the value true is passed on to the
Integrated decision module. The API hook monitor runs parallel to Host Net-
work monitoring module. If the monitored API hooks match with the API hooks
used by Zeus bot, (for stealing credentials) value true is passed into Integrated
decision module. When all three module conditions are true in the Integrated decision module, the host system is classified as infected by the Zeus bot.
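The integrated decision described above is essentially a conjunction of the three monitoring signals. The short Python sketch below (function and variable names are hypothetical, not the paper's implementation) shows one way such a module could be wired together.

    # Hypothetical integrated decision module: the host is flagged as Zeus-infected
    # only when all three monitoring modules report a positive result.
    def integrated_decision(exe_in_roaming, traffic_matches_pattern,
                            api_hooks_match, bot_master_ip=None):
        """Return True and raise an alert only when all three module conditions hold."""
        if exe_in_roaming and traffic_matches_pattern and api_hooks_match:
            print("ALERT: Zeus bot detected; bot master IP:", bot_master_ip)
            return True
        return False

    integrated_decision(True, True, True, bot_master_ip="192.0.2.10")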
In this module, the folder in which the bot executable stores itself is monitored. After analyzing the Zeus bot, it was found that the bot executable creates a folder with a random name inside the Roaming folder. Inside this folder, the copy of the original executable is stored with a random name; other, benign folders did not contain any files with the .exe extension. Thus, by monitoring the Roaming folder, the executable file can be traced.
and user32.dll. By monitoring the API hooks used by the currently running
processes and comparing them with the list of API hooks used by Zeus bot we
can decide if the currently running process is the Zeus bot process.
4.5 Implementation
Figure 6 shows the experimental setup constructed in a lab environment. There are two LAN connections. LAN 1 consists of the bot master's server. LAN 2 has three host machines, one of which is infected by the Zeus bot; the other two are benign hosts. The host systems are Windows systems. Host 1 is infected by the Zeus bot through a drive-by-download attack. After the system is infected, the first communication takes place with the bot master, requesting the configuration file. After obtaining the configuration file, the master can send commands to the infected system. The commands are actions performed for malicious purposes. All these actions take place in the background, and the user is not aware that the system has been compromised.
To monitor the folder in which the bot executable stores itself, a program was written that continuously monitors the Roaming folder for the presence of an executable file.
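A minimal Python sketch of such a Roaming-folder monitor is given below; the polling interval and the use of the %APPDATA% environment variable (which resolves to the Roaming folder on Windows) are illustrative assumptions, not the paper's own implementation.

    # Poll the Windows Roaming folder for newly appearing .exe files (illustrative sketch).
    import os, time

    def monitor_roaming(poll_seconds=5):
        roaming = os.path.expandvars(r"%APPDATA%")    # ...\AppData\Roaming on Windows
        seen = set()
        while True:
            for dirpath, _dirs, files in os.walk(roaming):
                for name in files:
                    if name.lower().endswith(".exe"):
                        path = os.path.join(dirpath, name)
                        if path not in seen:
                            seen.add(path)
                            print("executable found in Roaming:", path)
            time.sleep(poll_seconds)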
Next, the host network is monitored with the help of Wireshark. Logs are extracted once every 5 minutes and saved as pcap files. These files are then converted into .csv files. A program then compares the obtained .csv file with another .csv file containing the predefined communication pattern for the Zeus configuration file download or the HTTP POST message.
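One possible way to compare the exported .csv traffic log against a predefined Zeus communication pattern (configuration-file download followed by an HTTP POST towards the same destination) is sketched below in Python; the column names ("Destination", "Method", "Uri") and the pattern-file format are assumptions for illustration, not the paper's exact implementation.

    # Compare a Wireshark-exported CSV against a predefined Zeus communication pattern.
    import csv

    def load_rows(path):
        with open(path, newline="") as f:
            return list(csv.DictReader(f))

    def matches_zeus_pattern(traffic_csv, pattern_csv):
        """Return the destination IP if every (method, uri-suffix) pair in the pattern
        is seen towards the same destination in the captured traffic."""
        traffic = load_rows(traffic_csv)
        pattern = load_rows(pattern_csv)            # e.g. rows with columns: method,uri
        by_dst = {}
        for row in traffic:
            by_dst.setdefault(row["Destination"], []).append(row)
        for dst, rows in by_dst.items():
            if all(any(r["Method"] == p["method"] and r["Uri"].endswith(p["uri"])
                       for r in rows) for p in pattern):
                return dst                           # candidate bot master address
        return None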
Java code is written to monitor the API hooks of the currently running
processes. A comparison is done with the list that contains the API hooks that
are used by Zeus bot to steal credentials.
files matched, the corresponding destination address was noted and API hook
monitoring was triggered.
The Host Network monitoring module successfully captured the network traf-
fic automatically, compared with the predefined patterns, and passed value true
to the Integrated decision module and extracted the destination IP address which
was the IP address of the bot master.
API Hook Monitoring:
Table 1 gives the contents of the various dll processes and Application Program
Interfaces used for stealing the credentials from the Zeus bot infected system.
After monitoring all the three modules individually the Integrated module
checked for three conditions: presence of executable file in Roaming folder, pat-
tern matching for network traffic with predefined communication pattern, and
API hooks of running processes matching the predefined list of API hooks used
by Zeus bot. When all three conditions were true an alert message was triggered
with the IP address of the bot master. Immediately after this the bot executable
was deleted from the Roaming folder.
6 Conclusion
The Zeus bot can be detected using a combined three-pronged approach that monitors host and network activities. The main contributions of this work include specific Folder monitoring, Network traffic monitoring, API Hook monitoring, and an Integrated Decision making module for identifying whether an executable is the Zeus bot. It was seen that the presence of a .exe file in the Roaming folder, keeping track of HTTP GET and HTTP POST messages, and the monitoring of nineteen API hooks were the necessary conditions to identify and confirm the presence of Zeus bot activity in a host system. The detection was performed by monitoring the three modules in real time.
References
1. Stinson, E., Mitchell, J.C.: Characterizing bots’ remote control behavior. In:
Hämmerli, B.M., Sommer, R. (eds.) DIMVA 2007. LNCS, vol. 4579, pp. 89–108.
Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73614-1_6
2. Liu, L., Chen, S., Yan, G., Zhang, Z.: BotTracer: execution-based bot-like mal-
ware detection. In: Wu, T.-C., Lei, C.-L., Rijmen, V., Lee, D.-T. (eds.) ISC 2008.
LNCS, vol. 5222, pp. 97–113. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85886-7_7
3. Park, Y., Reeves, D.S.: Identification of bot commands by runtime execution mon-
itoring. In: Annual Computer Security Applications Conference, ACSAC 2009, pp.
321–330. IEEE, December 2009
4. Zeng, Y., Hu, X., Shin, K.G.: Detection of botnets using combined host and network-
level information. In: 2010 IEEE/IFIP International Conference on Dependable Sys-
tems and Networks (DSN), pp. 291–300. IEEE, June 2010
5. Shin, S., Xu, Z., Gu, G.: EFFORT: efficient and effective bot malware detection. In:
2012 Proceedings of INFOCOM, pp. 2846–2850. IEEE, March 2012
6. Ji, Y., Li, Q., He, Y., Guo, D.: Overhead analysis and evaluation of approaches to
host-based bot detection. Int. J. Distrib. Sensor Netw. (2015)
7. Thejiya, V., Radhika, N., Thanudhas, B.: J-Botnet detector: a java based tool for
HTTP botnet detection. Int. J. Sci. Res. (IJSR) 5(7), 282–290 (2016)
8. Bharathula, P., Mridula Menon, N.: Equitable machine learning algorithms to probe
over P2P botnets. In: Das, S., Pal, T., Kar, S., Satapathy, S.C., Mandal, J.K. (eds.)
Proceedings of the 4th International Conference on Frontiers in Intelligent Comput-
ing: Theory and Applications (FICTA) 2015. AISC, vol. 404, pp. 13–21. Springer,
New Delhi (2016). https://doi.org/10.1007/978-81-322-2695-6_2
An Asymmetric Key Based Efficient
Authentication Mechanism for Proxy Mobile
IPv6 Networks
1 Introduction
(i) PMIPv6 does not need any modification in the protocol stack of IPv6
devices.
(ii) PMIPv6 eliminates tunneling overhead over the wireless link.
(iii) PMIPv6 also reduces signaling overhead as the mobile node (MN) does not
need to participate in mobility-related signaling.
cache entry for the corresponding MN. When the MN attaches to a new MAG, the AAA server executes the authentication procedure and uses the MN's identity (i.e., MN-ID) to authenticate the MN using the security protocols deployed in the access network. The new MAG (MAG2) sends a PBU registration message to the LMA for updating the MN's current location within the network. Upon receiving such a message, the LMA sends back a proxy binding acknowledgement (PBA) message that includes the MN's home network prefix (HNP) to the new MAG. The LMA then creates a binding cache entry for the MN and sets up a bi-directional tunnel to the new MAG (MAG2). The new MAG now sends the router advertisement (RA) message to the MN and then acts as the serving MAG for the MN. Data packets destined for the MN come from the correspondent node (CN) to the MN via the bi-directional tunnel between MAG2 and the LMA.
The hand off delay for PMIPv6 is substantially reduced compared to MIPv6.
However, PMIPv6 handover procedure incorporates some inefficient authentica-
tion mechanism [5,11]. On the other hand, most of the research work on PMIPv6
[4–7] attempt to improve the PMIPv6 handover procedure by decreasing the
hand-off delay. Very few works consider security threats to PMIPv6 network. The
researchers in [8] propose a symmetric key based secure fast handover scheme
called SF-PMIPv6 that reduces handover delay compared to the existing authen-
tication schemes [5,11] for PMIPv6 networks and resolves the packet loss problem.
However, compromising the key by a single MAG in SF-PMIPv6 may be a critical
security threat to the PMIPv6 network as authentication among various network
entities in SF-PMIPv6 is based on a single pre-shared symmetric key between all
the MAGs and AAA server. Thus, in this paper we propose an asymmetric key
based efficient authentication scheme for PMIPv6 handover procedure to reduce
handover latency compared to other existing authentication based PMIPv6 han-
dover procedures. This paper has been organised as follows. Literature review is
done in Sect. 2. Section 3 presents our proposed authentication scheme AKEAuth
and the integrated AKEAuth-PMIPv6 handoff technique. Section 4 provides secu-
rity analysis of our proposed scheme and Sect. 5 presents the numerical analysis.
Finally, we conclude in Sect. 6 and present our future goals.
2 Related Work
buffer mechanism to resolve the packet loss problem. However, the authentica-
tion scheme in SF-PMIPv6 is less secure as authentication among the AAA
server and all the MAGs is based on a single pre-shared symmetric key. The
secure password authentication mechanism (SPAM) proposed for PMIPv6 han-
dover procedure proposed in [12] involves a complicated authentication proce-
dure that executes two separate mutual authentications. One is between the MN
and the MAG and the other is between the MAG and the LMA. Although the
integration of SPAM along with the bicasting scheme into PMIPv6 handover
procedure [12] can resolve the packet loss as well as out-of-sequence problems
but it increases over-all handover latency compared to other existing techniques.
In addition, SPAM stores the authentication related parameters of a user into
a smart card which is highly susceptible to attack by adversary when the user
inserts the smart card along with id and password into card reader in order
to access mobility related services. Another secret key based mutual authen-
tication mechanism which uses separate secret key for authentication between
each different pair of network entities, is proposed in [14]. However, maintain-
ing a separate key for each MAG by the LMA would obviously create a huge
burden on the LMA. On the other hand, researchers in [15] have proposed a public key based authentication mechanism called PKAuth for PMIPv6 net-
works that comprise multiple domains considering both inter-PMIPv6-domain
handover as well as intra-PMIPv6 handover. However, all the PMIPv6 network
entities and the MNs in PKAuth use certificates to distribute their public keys
among themselves rather than relying on the AAA server. In this paper we pro-
pose an efficient authentication scheme which is to be integrated with PMIPv6
handover procedure to reduce overall handover latency.
This section at first describes our proposed authentication scheme and then
presents its seamless integration with PMIPv6 handoff technique to prevent
various attacks in PMIPv6 networks. Our proposed authentication mechanism
is named as asymmetric key based efficient authentication (AKEAuth)
scheme for PMIPv6 handover procedure.
For a security parameter k, the AAA server and MAG generate the system parameters as given below.
4. Set a random number s ∈ Zq* as the master key and set Ppub = s·P as the system public key.
5. Set four cryptographic hash functions H1: {0,1}* × G → Zq*, H2: {0,1}* × G^2 → {0,1}^k, H3: {0,1}* × G^3 → {0,1}^k, and H4: {0,1}* × G^4 → {0,1}^k.
6. Represent the system parameters as params = (Fq, E, G, P, Ppub, H1, H2, H3, H4) while keeping s secret.
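A compact Python sketch of this setup is given below. For readability the elliptic-curve group E(Fq) is replaced by integer arithmetic modulo q (with P := 1), and H1–H4 are folded into a single SHA-256 helper; both are illustrative assumptions rather than the parameters a real deployment would use.

    # Sketch of the system-parameter setup (integer arithmetic mod q stands in
    # for the elliptic-curve group; H1..H4 are modelled with SHA-256).
    import hashlib, secrets

    q = 2**61 - 1                     # prime group order (illustrative)
    P = 1                             # "generator"; scalar multiplication k*P is k mod q
    s = secrets.randbelow(q - 1) + 1  # master key, kept secret
    P_pub = (s * P) % q               # system public key

    def H(tag, *parts):               # stand-in for the hash functions H1..H4
        data = tag.encode() + b"|".join(str(p).encode() for p in parts)
        return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

    params = {"q": q, "P": P, "P_pub": P_pub}   # published; s is never published
    print(params)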
The main feature of the proposed AKEAuth scheme is the use of asymmetric
key rather than symmetric key in authentication process to provide high security
in PMIPv6 networks. The proposed authentication procedure consists of two
parts. The first part is initial authentication procedure between the AAA server
and the MN. The second part is authentication procedure performed locally
between the MN and the MAG. Table 1 lists the notations that are used in the
proposed authentication scheme.
Table 1. Notations
Identification Description
q A large prime number
G A cyclic additive group of order q
P The generator of G
Zq∗ {1, 2, ..., q − 1}
SKM N A secret number chosen by MN
IDM N The identity of the MN
rAAA A random number chosen by AAA server
CIDM N The dynamic identity of the MN
(SM N , SKM N ) The private key of the MN
(P KM N , CM N ) The public key of the MN
s The private key of the MAG
Ppub The system public key
r1M N , r2M N Random numbers chosen by MN
The initial authentication procedure between the MN and the AAA server is
described below.
(i) The mobile node first chooses a secret number, denoted SK_MN, where SK_MN ∈ Zq*, and computes C_MN = SK_MN·P. Afterwards, the MN sends its identity (ID_MN) and the newly computed value C_MN, i.e., (ID_MN, C_MN), to the AAA server.
(ii) Upon receiving (ID_MN, C_MN) from the MN, the AAA server selects a random number r_AAA, where r_AAA ∈ Zq*, and then computes the following values: R_AAA = r_AAA·P + C_MN and d_AAA = (H1(ID_MN, R_AAA)·s − r_AAA) mod q. It also stores the identity of the MN, i.e., ID_MN, for future authentication required by a MAG of the same LMD. The AAA server then sends the two newly computed values (R_AAA, d_AAA) to the MN.
(iii) After receiving (R_AAA, d_AAA) from the AAA server, the MN computes S_MN = (d_AAA − SK_MN) mod q and PK_MN = S_MN·P. Afterwards, the MN uses the pair (S_MN, SK_MN) as its private key and (PK_MN, C_MN) as its public key in the subsequent authentication process.
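Under the reconstruction of the formulas above (in particular R_AAA = r_AAA·P + C_MN), the issued key pair can be sanity-checked with the same integer-mod-q stand-in used earlier; this is only a check of the reconstructed algebra, not the protocol implementation, and the identity string is illustrative.

    # Consistency check of the reconstructed MN key issuance (integer mod-q stand-in).
    import hashlib, secrets

    q = 2**61 - 1
    P = 1
    s = secrets.randbelow(q - 1) + 1          # AAA master key
    P_pub = (s * P) % q

    def H1(id_mn, R):
        return int.from_bytes(hashlib.sha256(f"{id_mn}|{R}".encode()).digest(), "big") % q

    # MN side
    SK_MN = secrets.randbelow(q - 1) + 1
    C_MN = (SK_MN * P) % q
    ID_MN = "mn-001"

    # AAA side
    r_AAA = secrets.randbelow(q - 1) + 1
    R_AAA = (r_AAA * P + C_MN) % q            # reconstructed formula
    d_AAA = (H1(ID_MN, R_AAA) * s - r_AAA) % q

    # MN side
    S_MN = (d_AAA - SK_MN) % q
    PK_MN = (S_MN * P) % q

    # With this reconstruction, the public key is verifiable from public values:
    assert PK_MN == (H1(ID_MN, R_AAA) * P_pub - R_AAA) % q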
(i) MAG1 sends a proxy handover initial (Proxy HI) message to the target
MAG (i.e., MAG2). This Proxy HI message includes the MNs profile (i.e.,
IDM N ) and the target MAGs address.
(ii) MAG2 responds by sending a proxy handover acknowledgement (Proxy
HACK) message to MAG1.
(iii) After getting Proxy HI, MAG1 begins to store data in its buffer until it
receives the DeReg PBU message from the LMA via MAG2.
(iv) When the MN moves outside the transmission range of MAG1 and comes
with-in the communication range of target MAG, i.e., MAG2, it sends RS
and authentication information to the MAG2.
(v) Upon receiving the RS message from the MN, MAG2 sends a DeReg PBU message for MAG1 and a PBU message for itself to the LMA.
(vi) After receiving the PBU message, the LMA sends the PBA message con-
taining the HNP of the MN as well as DeRegPBA for MAG1 to MAG2.
(vii) MAG2 forwards DeReg PBA to MAG1.
(viii) Upon receiving the DeRegPBA message, MAG1 forwards the buffered
packet to MAG2 and MAG2 stores it in its own buffer.
(ix) After sending the PBA message to MAG2, the LMA forwards to MAG2 all data packets destined for the MN, which MAG2 buffers. MAG2 checks the sequence number of the first packet it receives from the LMA and stores all packets in the proper order.
(x) After successful completion of the proposed authentication mechanism by
MAG2, MAG2 sends back RA with authentication information to MN.
(xi) LMA updates the binding cache entry with the MNs current location, and
sets up a bi-directional tunnel to the new MAG (i.e., MAG2). By this bidi-
rectional tunnel between LMA and MAG2 and associated routing states in
both LMA and MAG2, MN data plane is managed. Downlink packets sent
to the Mobile Node from outside of the LMD arrive at LMA, which for-
wards them by the tunnel to MAG2. After decapsulation, MAG2 sends the
packets to the MN directly through the access link. Uplink packets which
are originated in the MN are sent to the LMA from the MAG2 through
the tunnel, and are then forwarded to the destination by the LMA.
(MAG1) and new MAG (MAG2), because when MN transmits RS with the
authentication information to MAG2, MAG1 utilizes its buffer to store packets
coming from LMA. So time taken for authentication in MAG2 and latency due
to PBU and PBA is reduced as packet transmission is not stopped. The whole
process occurs in handover phase.
4 Security Analysis
In this section, we shortly describe the security analysis of the proposed scheme.
We show that the proposed authentication scheme can provide various security
features like insider attack prevention, mutual authentication, confidentiality as
well as it can prevent replay attack and domino effect as explained below.
(i) Insider attack resistance: Our scheme can resist insider attack and pro-
vide user anonymity. The insider attack can affect all computer security
elements and range from stealing sensitive and valuable data to injecting
Trojan viruses into a system or network. A mobile node (MN) residing in an LMD may be malicious, but the AAA server and the MAG both check, using the values ID_MN and C_MN, whether the MN residing in the LMD is genuine or malicious.
(ii) Mutual Authentication: The proposed authentication procedure
described in Subsect. 3.2 shows that the MN and the MAG authenticate each other before the MN is provided mobility related services by the MAG.
Thus, mutual authentication is ensured by the proposed scheme.
(iii) Replay Attack Prevention: In our proposed authentication scheme,
whenever the MN joins LMD or moves from one MAGs network to another
MAGs network within the same LMD, MN and new MAG authenticates
each other by checking some newly computed key values and then estab-
lish some new session key. Thus, our proposed authentication scheme can
ensure prevention of replay attack even if some messages are replayed as the
session key included in those messages would not remain valid afterwards.
(iv) Confidentiality: Confidentiality is guaranteed by our proposed scheme
AKEAuth by using the secret session key established between the MN and
the MAG for encrypting some important messages before their exchange
between the MN and the MAG. Exchange of encrypted messages between
the MN and the MAG can easily prevent attack from eavesdropping.
(v) Domino Effect Prevention: Although our proposed scheme relies on
AAA server-based key management, it can prevent domino effect, which
means the compromise of the secret session key by one MAG is always local-
ized and never affects the other parts of the network. Unlike SF-PMIPv6
in which some single secret key is pre-shared between all MAGs and AAA
server and PKAuth in which some secret key is shared by several MAGs
that are mentioned by the MN, our proposed scheme AKEAuth can com-
pletely prevent the domino effect as new secret session key is established
between the MN and each new MAG.
74 S. Biswas et al.
5 Numerical Analysis
This section compares the performance of our proposed handoff technique AKEAuth-PMIPv6 with that of other existing handoff techniques for PMIPv6 networks, such as SF-PMIPv6 and PKAuth-PMIPv6, in terms of computational cost, handover latency and signaling cost. The integrated SPAM-
PMIPv6 handover procedure proposed in [12] has not been considered for
performance comparisons as it adopts a complicated authentication procedure
consisting of two separate mutual authentications.
Procedure                        MN                               MAG                             AAA
Initial registration procedure   C_ran + 3·C_k                    NA                              C_h + C_ran
Authentication procedure         3·C_h + 2·C_k + C_ran + C_XOR    4·C_h + C_k + C_ran + C_XOR     NA
Fig. 3. Variation in average handover latency with respect to latency between MAG
and LMA.
Fig. 4. Variation in average handover latency with respect to latency between AAA
server and MAG.
6 Conclusion
This paper proposes an asymmetric key based simplified authentication scheme
which is named as asymmetric key based efficient authentication mechanism
for Proxy Mobile IPv6 Networks to provide high security in PMIPv6 networks.
Numerical analysis shows that our proposed AKEAuth-PMIPv6 handoff tech-
nique reduces the handover latency compared to the other existing handoff
References
1. Johnson, D., Perkins, C., Arkko, J.: Mobility Support in IPv6. IETF RFC 3775,
June 2004
2. Gundavelli, S., Leung, K., Devarapalli, V., Chowdhury, K., Patil, B.: Proxy Mobile
IPv6. RFC 5213, August 2008
3. Lei, J., Fu, X.: Evaluating the benefits of introducing PMIPv6 for localized mobility
management. In: Proceedings of the IEEE International Wireless Communications
Mobile Computing Conference, August 2008
4. Xia, F., Sarikaya, B.: Mobile node agnostic fast handovers for Proxy Mobile IPv6.
Draft-xia-netlmm-fmip-mnagno-02, IETF draft, November 2007
5. Ryu, S., Kim, M., Mun, Y.: Enhanced fast handovers for Proxy Mobile IPv6. In:
Proceedings of IEEE International Conference on Computational Science and Its
Applications (ICCSA), pp. 39–43, July 2009
6. Lee, J.H., Kim, Y.D., Lee, D.: Enhanced handover process for Proxy Mobile IPv6.
In: Proceedings of IEEE International Conference on Multimedia and Ubiquitous
Engineering (MUE), p. 15, August 2010
7. Park, J.W., Kim, J.I., Koh, S.J.: Q-PMIP: query-based proxy mobile IPv6. In:
Proceedings of IEEE International Conference on Advanced Communication Tech-
nology (ICACT), pp. 742–745, February 2011
8. Chuang, M.C., Lee, J.F.: SF-PMIPv6: a secure fast handover mechanism for Proxy
Mobile IPv6 Networks. J. Syst. Softw. 86, 437–448 (2013)
9. Kong, K.S., Lee, W., Han, Y.H., Shin, M.K., You, H.R.: Mobility management for
All-IP mobile networks: mobile IPv6 vs. Proxy Mobile IPv6. IEEE Wirel. Commun.
2, 36–45, April 2008
10. Kong, K.S., Lee, W., Han, Y.H., Shin, M.K.: Handover latency analysis of a
network-based localized mobility management protocol. In: Proceedings of IEEE
International Conference on Communications (ICC), pp. 5838–5843, May 2008
11. Tie, L., He, D.: A certificated-based binding update mechanism for Proxy Mobile
IPv6 protocol. In: Proceedings of IEEE Asia Pacific Conference on Postgraduate
Research in Microelectronics and Electronics, pp. 333–336, January 2009
12. Chuang, M.C., Lee, J.F., Chen, M.C.: SPAM: a secure password authentication
mechanism for seamless handover in Proxy Mobile IPv6 Networks. IEEE Syst. J.
7(1), 102–113 (2013)
13. Sun, H., Wen, Q., Zhang, H., Jin, Z.: A novel remote user authentication and key
agreement scheme for mobile client-server environment. Appl. Math. Inf. Sci. 7(4),
1365–1374 (2013)
78 S. Biswas et al.
14. Ben Ameur, S., Zarai, F., Smaoui, S., Obaidat, M.S., Hsiao, K.F.: A lightweight
mutual authentication mechanism for improving fast PMIPV6-based network
mobility scheme. In: 4th IEEE International Conference on Network Infrastruc-
ture and Digital Content, Beijing, China, pp. 61–68 (2014)
15. Kim, J., Song, J.: A public key based PMIPv6 authentication scheme. In: 2014
IEEE/ACIS 13th International Conference on Computer and Information Science
(ICIS), Taiyuan, pp. 5–10 (2014)
User Authentication Scheme for Wireless Sensor
Networks and Internet of Things Using Chinese
Remainder Theorem
1 Introduction
The sensor nodes of WSNs [1] or the IoT, which measure different parameters (temperature, pressure, humidity, light, etc.) of the environment and mutually transmit the processed data over the wireless medium to the users or gateway, are confined to tiny computational capacity, small-scale memory, moderate transmission range and short-lived battery power (e.g. a 7.7 MHz 8-bit ATmega128 processor, 4 KB RAM, 128 KB ROM, 512 KB EEPROM, a 250 kbaud data rate, 2 AA batteries). Therefore, it is not feasible to implement traditional cryptographic algorithms on these resource-constrained sensor devices. However, user authentication is one of the significant needs for WSNs' emerging technologies (remotely monitoring a patient's body condition, electronic devices of industry and smart homes, the possibility of attacks on a battleground, natural calamities, forest fires, etc.). Authenticating users who connect to the WSNs is the process of validating identity (based on one or more factors such as the user's inherence, possession, or knowledge) using sensor devices. A secure user validation scheme of
WSNs offers various known security features such as an efficient user password update mechanism, secure session key establishment, confidentiality, integrity, availability, non-repudiation, freshness and mutual authentication of the user, sensor and gateway. A secure WSN resists various well-known security attacks such as sensor node and user identity impersonation attacks, replay attacks, denial-of-service and man-in-the-middle attacks, and stolen smart card attacks.
2 Related Work
In 2002, Akyildiz et al. [1] surveyed many aspects of WSNs and discussed many
open research issues of WSNs. In 2004, Benenson et al. [26] presented a user
authentication and access control mechanism for WSNs. In 2006, Watro et al. [2] offered the public-key based scheme TinyPK for securing WSNs, which provides mutual authentication and withstands the sensor impersonation attack. In 2006, Wong et al. [3] suggested a secure hash function based authentication scheme, but it does not support mutual authenticity and session key establishment between the user and the sensor. In 2007, Tseng et al. [4] specified that Watro et al.'s and Wong et al.'s schemes are vulnerable to replay and forgery attacks. Tseng et al. improved Wong et al.'s scheme and recommended a password update mechanism. In 2008, Lee [5] revealed that Wong et al.'s scheme exhibits more computational overhead on the sensor node compared to the gateway node and improved Wong et al.'s scheme with less computation overhead on the sensor node. In 2008, Ko [6] indicated that Tseng et al.'s scheme does not provide mutual authentication and proposed a mutual authenticity and timestamp based scheme. In 2009, Vaidya et al. [7] presented a mutual
authentication scheme with formal verification. In 2009, Das [8] developed a
secure mechanism to provide authenticity using smart card and user’s password
(two factor) but it does not offer session key between user and sensor node. In
2010, Khan and Alghathbar [9] identified the gateway node bypass attack, insider attack and lack of a password update mechanism in Das's [8] scheme and improved Das's scheme by including a password update and mutual authentication technique. In 2010, Yuan et al. [10] provided a biometric-based scheme, but it is unprotected against node capture and denial-of-service attacks. In 2012, Yoo et
al. [11] designed a scheme that provides secure session key and mutual authenti-
cation. In 2013, Xue et al. [12] designed a mutual authentication scheme based
on temporal information. However, in 2014, Jiang et al. [13] revealed that Xue
et al.’s scheme is susceptible to stolen smart card and privileged insider attack.
In 2015, Das [14] suggested a fuzzy extractor based authentication scheme which resists well-known security attacks on WSNs and has more security features compared to the Althobaiti et al. (2013) [15] scheme.
The outline of this paper is as follows. Section 1 introduces the basic characteristics, applications and important security features of WSNs. Section 2 consists of the literature survey. Section 3 explains the notation and mathematical expressions which we use for designing the protocol. Section 4 reviews Das's user authentication scheme. Section 5 presents the cryptanalysis of Das's scheme. Section 6 describes our proposed scheme. Section 7 performs the security analysis. Section 8 indicates the performance comparison. Eventually, Sect. 9 concludes our paper.
Some basic notations which we use for designing our protocol are listed in the following Table 1.
Notations Description
Ui ith User
IDUi Identity of Ui
P WUi Password of Ui
Bi Bio-metric information of Ui
SNj j th Sensor Node
SCUi Smart card of Ui
GW N The gateway node
h(.) A collision resistant one - way hash function
Gen(.) Generator procedure of Fuzzy Extractor
Rep(.) Reproduction procedure of Fuzzy Extractor
T Error tolerance limit of Fuzzy Extractor
TUi , TGW N , TSNj Current timestamps of Ui , GW N, SNj respectively
T , T ”, T ” Current time at GW N, SNj , Ui respectively
|| A string concatenation operator
⊕ A bitwise XOR operator
ΔT Maximum transmission delay
A Adversary
5.1 Presumption
• Sensor nodes may not be fitted with tamper-resistant hardware, and if a node is captured by an adversary, all the prominent and confidential information stored in its memory can be accessed by the adversary. Even if the sensor nodes are tamper-resistant, the adversary can learn the information stored in memory by measuring the power consumption of the captured sensor nodes.
• The base station or gateway cannot be compromised by the adversary.
• The adversary can intercept the public communication channel, inject packets and replay previously transmitted packets.
• The adversary can capture the smart card of a user and extract the sensitive information stored in the card through a power analysis attack.
• We consider that the WSN consists of a few users (with smart cards which can be captured or stolen by the adversary A), hundreds of sensor nodes (which can be captured by A) and a gateway (which is trusted and cannot be compromised by A).
Stolen Smart Card Attacks. The adversary A ascertains the values {τi, ei, ri, BEi, f*, h(.), Gen(.), Rep(.), T} from the stolen SCi by measuring the power consumption of the smart card [25]. A then computes BEi ⊕ ri = [h(IDi || σi) ⊕ K] ⊕ [h(IDi || σi) ⊕ eki] = K ⊕ eki.
The adversary A finds out the values of K and eki by implementing one of
the following three mechanisms:
1. Derives the values of K and eki using frequency analysis of the stream ciphers
BEi, ri and BEi ⊕ ri.
2. Eavesdrops R and Eeki(R, T, IDSNj) and implements a known-plaintext
attack to find out the value of eki. Thereafter, A finds out the value of K =
eki ⊕ (K ⊕ eki).
3. Steals the biometric information Bi' of Ui (where d(Bi, Bi') ≤ T) and finds
out the value of σi = Rep(Bi', τi). Eavesdrops the value of IDi from the public
communication channel and then evaluates eki = BEi ⊕ h(IDi || σi) and
K = ri ⊕ h(IDi || σi). This is possible because eki is not protected by the
password PWi.
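To make the XOR cancellation above concrete, the following is a minimal Python sketch with toy byte strings; the helper xor_bytes and the sample values standing in for K, eki and h(IDi || σi) are illustrative assumptions, not values from the scheme.

import hashlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Bitwise XOR of two equal-length byte strings
    return bytes(x ^ y for x, y in zip(a, b))

# Toy values standing in for the scheme's secrets (illustrative only)
h_id_sigma = hashlib.sha1(b"ID_i" + b"sigma_i").digest()   # h(IDi || sigma_i)
K   = hashlib.sha1(b"gateway-secret").digest()             # long-term secret K
eki = hashlib.sha1(b"per-user-key").digest()               # per-user key ek_i

BE_i = xor_bytes(h_id_sigma, eki)   # BEi = h(IDi || sigma_i) XOR eki (stored on the card)
r_i  = xor_bytes(h_id_sigma, K)     # ri  = h(IDi || sigma_i) XOR K   (stored on the card)

# An adversary holding only the card contents recovers K XOR eki:
leaked = xor_bytes(BE_i, r_i)
assert leaked == xor_bytes(K, eki)  # the hash term cancels out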
6 Proposed Scheme
GWN generates a key rSNj for each sensor node SNj and a key rUi for each user
Ui, where the rSNj and rUi are pairwise relatively prime integers. GWN generates a system
of simultaneous congruences (using the Chinese Remainder Theorem) such as:

    Xold ≡ xi^old mod rUi,   Xold ≡ xi^old mod rSNj,
    Xnew ≡ xi^new mod rUi,   Xnew ≡ xi^new mod rSNj
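As a side illustration of how such a system of congruences is solved, here is a minimal Python sketch of the Chinese Remainder Theorem construction under the assumption of pairwise coprime moduli; the moduli and residues below are toy values chosen only for this example and are not parameters of the scheme.

from math import prod

def crt(residues, moduli):
    # Solve X ≡ residues[k] (mod moduli[k]) for pairwise coprime moduli
    M = prod(moduli)
    X = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        X += r * Mi * pow(Mi, -1, m)   # pow(Mi, -1, m) is the modular inverse of Mi mod m
    return X % M

# Toy example: r_Ui = 7, r_SNj = 11; the gateway wants X ≡ 3 (mod 7) and X ≡ 5 (mod 11)
X = crt([3, 5], [7, 11])
assert X % 7 == 3 and X % 11 == 5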
For authenticated key establishment, Ui provides IDUi, PWUi, and the noisy
biometric information Bi as input to the Rep() function of the fuzzy extractor.
Then Ui, GWN, and SNj follow steps 4, 5, 6, 7, 8, and 9 consecutively as
proposed in Table 5.
User (Ui):
Ui puts SCUi into the card reader and provides IDUi, PWUi, Bi.
The smart card computes σi = Rep(Bi, τi),
IPBi = h(IDUi || PWUi || h(σi)), rUi = α ⊕ IPBi, β' = h(IPBi || rUi), and verifies β = β'.
It then computes Xold = γ ⊕ h(IDUi || IPBi).
Ui provides a new password PWUi^new and new biometric information Bi^new.
(σi^new, τi^new) = Gen(Bi^new), IPBi^new = h(IDUi || PWUi^new || h(σi^new)),
α^new = IPBi^new ⊕ rUi, β^new = h(IPBi^new || rUi), γ^new = h(IDUi || IPBi^new) ⊕ Xold.
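The following is a minimal Python sketch of the card-side computation just described, assuming SHA-1 for h(.) and modelling the fuzzy-extractor procedure Gen(.) as an opaque stub; the stub bodies, helper names, and all concrete values (including the toy card state set up first) are illustrative assumptions rather than the scheme's actual parameters.

import hashlib, os

def h(*parts: bytes) -> bytes:
    # h(.) modelled here as SHA-1 over the concatenation of its arguments
    return hashlib.sha1(b"".join(parts)).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def Gen(B: bytes):
    # Stub fuzzy extractor: returns (sigma, tau); a real Gen/Rep pair corrects biometric noise
    return h(B), os.urandom(20)

# Toy setup standing in for values already stored on the card: alpha, beta, gamma, tau_i
ID, PW, B = b"ID_Ui", b"old-password", b"biometric-sample"
sigma, tau_i = Gen(B)
IPB = h(ID, PW, h(sigma))
r_Ui = os.urandom(20)
alpha, beta = xor(IPB, r_Ui), h(IPB, r_Ui)
X_old = os.urandom(20)
gamma = xor(h(ID, IPB), X_old)

# Card-side update, following the steps listed above
sigma_chk = Gen(B)[0]                            # stands in for Rep(Bi, tau_i)
IPB_chk = h(ID, PW, h(sigma_chk))
r_rec = xor(alpha, IPB_chk)                      # recover r_Ui from alpha
assert h(IPB_chk, r_rec) == beta                 # verify beta = beta'
X_rec = xor(gamma, h(ID, IPB_chk))               # recover X_old from gamma

PW_new, B_new = b"new-password", b"biometric-sample"
sigma_new, tau_new = Gen(B_new)
IPB_new = h(ID, PW_new, h(sigma_new))
alpha_new = xor(IPB_new, r_rec)
beta_new = h(IPB_new, r_rec)
gamma_new = xor(h(ID, IPB_new), X_rec)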
7 Security Analysis
To verify the security features present in our protocol, we first perform an informal
analysis considering major and minor attacks in WSNs. Afterward, we implement
our scheme using the Security Protocol Description Language and evaluate our
security claims using the Scyther tool [22]. For automated validation of the protocol
using the AVISPA tool [21], we use the High-Level Protocols Specification Language.
Finally, we perform the logical verification of the protocol using BAN logic [23].
E = NS × (59.2 + 28.6n) µJoule.
But in our scheme we eliminate A at the initial level, which saves a total
energy of (E − Eh) µJoule, where Eh is the energy required for computing
and verifying the hash value m1 = h(xi^old || TUi). The energy required by the
SHA-1 hash function is 5.9 µJoule/byte [24]. Hence, our scheme withstands
energy-exhausting attacks.
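As a rough illustration of the quantities above, the following Python lines evaluate both energy terms; the values NS = 100 forwarding nodes, n = 24 message bytes, a 24-byte hash input, and the doubling used to cover both computing and verifying the hash are assumptions made only for this example.

# Energy model from the text: E = NS * (59.2 + 28.6 * n) microjoules,
# SHA-1 hashing costs about 5.9 microjoules per byte [24].
NS = 100          # assumed number of sensor nodes forwarding the forged request
n = 24            # assumed message size in bytes
hash_input = 24   # assumed size in bytes of x_i_old || T_Ui

E = NS * (59.2 + 28.6 * n)      # energy wasted if the forged request floods the network
E_h = 2 * 5.9 * hash_input      # assumed: one hash to compute m1 and one to verify it
print(f"E = {E:.1f} uJ, E_h = {E_h:.1f} uJ, saving = {E - E_h:.1f} uJ")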
• Stolen Smart Card Attack. To defend against attacks based on a stolen SCi, we
keep the secret credentials of Ui in SCi protected with the fuzzy extractor mechanism.
An adversary A can extract the values of α, β, γ from the stolen SCi using
power analysis attacks. But it is hard for the adversary A to find out the values of the
secret credentials such as rUi, σi, PWUi without knowing the biometric
information and password of the user Ui. Therefore, our scheme resists
stolen SCi attacks.
• Man-in-the-Middle Attack. To avoid man-in-the-middle attacks, we
ensure mutual authentication among Ui, SNj, and GWN by verifying the
secret parameters m1, m2, m3, m4. The parameters m1, m2, m3, m4
also ensure message integrity.
• Replay Attack. Verification of the timestamps TUi, TSNj, TGWN along with
their hashed values protects against replay attacks.
• Impersonation Attack. The verification of the legitimate biometric information
Bi (using the fuzzy extractor) and the password PWUi at the time of user
authentication ensures that an adversary A cannot impersonate the user Ui.
•   GWN |≡ #(TUi), GWN |≡ Ui |~ TUi
    ─────────────────────────────────
    GWN |≡ Ui |≡ TUi
    That is, if GWN believes TUi is fresh and GWN believes Ui once said
    TUi, then GWN believes Ui believes TUi.

•   GWN |≡ SNj ↔(rSNj) GWN, GWN ◁ <TSNj>rSNj
    ─────────────────────────────────────────
    GWN |≡ SNj |~ TSNj
    That is, if GWN believes the secret rSNj is shared with SNj and GWN sees
    <TSNj>rSNj, then GWN believes SNj once said TSNj.
Notation        Description
Pr, Qr          Principals such as Ui, GWN, and SNj
St              Statements such as TUi, TGWN, α, β, etc.
K               A secret key or datum such as KGSNj, XUi, etc.
Pr |≡ St        Pr believes St, or Pr is permitted to believe St
Pr ◁ St         Pr has received a message containing St and can read or repeat St
Pr |~ St        Pr once said St; Pr sent a message containing St, which could be fresh or old
#(St)           St is fresh; it has not been sent before
Pr ↔(K) Qr      K is a secret known only to Pr and Qr, and perhaps to the trusted principals
<St>St1         St1 is a secret and its presence identifies whoever generated <St>St1
8 Performance Comparison
Table 9 shows the comparison based on security features, and it indicates that our
protocol is relatively secure compared to the existing protocols. Table 10 presents
the computational cost comparison; it shows that our scheme is suitable for
secure WSNs and IoT.
Security feature   Sun et al. [16]   Xue et al. [12]   Jiang et al. [13]   Althobaiti et al. [15]   Our scheme
SF1                Yes               No                Yes                 Yes                      Yes
SF2                No                No                No                  No                       Yes
SF3                No                No                No                  No                       Yes
SF4                No                No                No                  Yes                      Yes
SF5                No                No                No                  No                       Yes
SF6                Yes               No                No                  Yes                      Yes
Note: SF1–SF6 are the security features. SF1: resistance to stolen smart card
attacks; SF2: secure password updating; SF3: secure biometric information
updating; SF4: non-repudiation; SF5: formal security analysis; SF6: resistance
to privileged-insider attacks.
9 Conclusion
In this paper, we first discussed the security issues involved in the sensor nodes of
WSNs and identified vulnerabilities in Das's user authentication scheme.
Based on the security requirements of WSNs, we proposed an efficient authenticated
key exchange mechanism using the concepts of the fuzzy extractor and the Chinese
Remainder Theorem. We then performed the security analysis of our scheme using
widely accepted automated verification tools such as AVISPA and Scyther, followed
by logical verification using BAN logic. Finally, we carried out the computational
analysis and demonstrated a comparative analysis with respect to computational
overhead and security features, which indicates that our scheme is secure and
effective. In the future, we aim to propose a hyperelliptic-curve-based authenticated
key exchange scheme for WSNs and IoT.
References
1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor net-
works: a survey. Comput. Netw. 38(4), 393–422 (2002)
2. Watro, R., Kong, D., Cuti, S.F., Gardiner, C., Lynn, C., Kruus, P.: TinyPK: secur-
ing sensor networks with public key technology. In: ACM Workshop on Security of
Ad Hoc and Sensor Networks, pp. 59–64. ACM Press, Washington, DC (2004)
3. Wong, K.H., Zheng, Y., Cao, J., Wang, S.: A dynamic user authentication scheme
for wireless sensor networks. In: Proceedings of 2006 IEEE International Conference
on Sensor Networks, Ubiquitous, and Trustworthy Computing, Taichung, Taiwan,
pp. 1–9 (2006)
4. Tseng, H.R., Jan, R.H., Yang, W.: An improved dynamic user authentication
scheme for wireless sensor networks. In: Proceedings of IEEE Global Telecommu-
nications Conference (GLOBECOM 2007), Washington, DC, USA, pp. 986–990
(2007)
5. Lee, T.H.: Simple dynamic user authentication protocols for wireless sensor net-
works. In: The Second International Conference on Sensor Technologies and Appli-
cations, pp. 657–660 (2008)
6. Ko, L.C.: A novel dynamic user authentication scheme for wireless sensor networks.
In: IEEE International Symposium on Wireless Communication Systems (ISWCS
2008), pp. 608–612 (2008)
7. Vaidya, B., Silva, J.S., Rodrigues, J.J.: Robust dynamic user authentication scheme
for wireless sensor networks. In: Proceedings of the 5th ACM Symposium on QoS
and Security for wireless and mobile networks (Q2SWinet 2009), Tenerife, Spain,
pp. 88–91 (2009)
8. Das, M.L.: Two-factor user authentication in wireless sensor networks. IEEE Trans.
Wireless. Comm. 8, 1086–1090 (2009)
9. Khan, M.K., Alghathbar, K.: Cryptanalysis and security improvements of two-
factor user authentication in wireless sensor networks. Sensors 10(3), 2450–2459
(2010)
10. Yuan, J., Jiang, C., Jiang, Z.: A biometric-based user authentication for wireless
sensor networks. Wuhan Univ. J. Nat. Sci. 15(3), 272–276 (2010)
11. Yoo, S.G., Park, K.Y., Kim, J.: A security-performance-balanced user authentica-
tion scheme for wireless sensor networks. Int. J. Distrib. Sens. Netw. 2012, 1–11
(2012)
12. Xue, K., Ma, C., Hong, P., Ding, R.: A temporal-credential-based mutual authen-
tication and key agreement scheme for wireless sensor networks. J. Netw. Comput.
Appl. 36(1), 316–323 (2013)
13. Jiang, Q., Ma, J., Lu, X., Tian, Y.: An efficient two-factor user authentication
scheme with unlinkability for wireless sensor networks. Peer-to-Peer Network. Appl.
8(6), 1070–1081 (2014). doi:10.1007/s12083-014-0285-z
14. Das, A.K.: A secure and effective biometric-based user authentication scheme for
wireless sensor networks using smart card and fuzzy extractor. Int. J. Commun.
Syst. (2015). doi:10.1002/dac.2933
15. Althobaiti, O., Al-Rodhaan, M., Al-Dhelaan, A.: An efficient biometric authen-
tication protocol for wireless sensor networks. Int. J. Distrib. Sens. Netw. 1–13,
Article ID 407971 (2013)
16. Sun, D.Z., Li, J.X., Feng, Z.Y., Cao, Z.F., Xu, G.Q.: On the security and improve-
ment of a two-factor user authentication scheme in wireless sensor networks. Pers.
Ubiquit. Comput. 17(5), 895–905 (2013)
17. Dodis, Y., Reyzin, L., Smith, A.: Fuzzy extractors: how to generate strong keys
from biometrics and other noisy data. In: Cachin, C., Camenisch, J.L. (eds.)
EUROCRYPT 2004. LNCS, vol. 3027, pp. 523–540. Springer, Heidelberg (2004).
doi:10.1007/978-3-540-24676-3 31
18. Choi, S.J., Youn, H.Y.: An efficient key pre-distribution scheme for secure dis-
tributed sensor networks. In: Enokido, T., Yan, L., Xiao, B., Kim, D., Dai,
Y., Yang, L.T. (eds.) EUC 2005. LNCS, vol. 3823, pp. 1088–1097. Springer,
Heidelberg (2005). doi:10.1007/11596042 111
19. Pathan, A.K., Dai, T.T., Hong, C.S.: An efficient LU decomposition-based key pre-
distribution scheme for ensuring security in wireless sensor networks. In: Proceed-
ings of The Sixth IEEE International Conference on Computer and Information
Technology, CIT 2006, p. 227 (2006)
20. Dolev, D., Yao, A.: On the security of public key protocols. IEEE Trans. Inf. Theory
29(2), 198–208 (1983)
21. AVISPA. http://www.avispa-project.org/
22. Cremers, C.: Scyther - Semantics and Verification of Security Protocols, Ph.D.
dissertation, Eindhoven University of Technology, Netherlands (2006)
23. Burrows, M., Abadi, M., Needham, R.M.: A logic of authentication. Proc. Royal
Soc. Lond. 426, 233–271 (1989)
24. Wander, A., Gura, N., Eberle, H., Gupta, V., Shantz, S.: Energy analysis of public-
key cryptography on small wireless devices. In: Proceedings of the IEEE PerCom,
Kauai, HI, pp. 324–328, March 2005
25. Kocher, P., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.)
CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999). doi:10.
1007/3-540-48405-1 25
26. Benenson, Z., Gartner, F., Kesdogan, D.: User authentication in sensor networks.
In: Proceedings of the Workshop on Sensor Networks. Lecture Notes Informatics
Proceedings Informatik (2004)
27. Shi, W., Gong, P.: A new user authentication protocol for wireless sensor networks
using elliptic curves cryptography. Int. J. Distrib. Sens. Netw., 730–831 (2013)
28. Choi, Y., Lee, D., Kim, J., Jung, J., Nam, J., Won, D.: Security enhanced user
authentication protocol for wireless sensor networks using elliptic curve cryptog-
raphy. Sensors 14, 10081–10106 (2014)
A Ringer-Based Throttling Approach
to Mitigate DDoS Attacks
1 Introduction
goal of the client is to calculate the point in a collection of points that produces
the same value as the challenge. So, the proposed solution is an extension of the
idea of ringer-based cheating detection to throttling of the DDoS client, with the
polynomial acting as a one-way function.
The rest of the paper is organized as follows. In Sect. 2, we present a detailed
discussion of the problem. In Sect. 3, we present a brief overview of existing
methodologies. In Sect. 4, we present our proposed solution. In Sect. 5,
we describe our implementation results. Section 6 includes a probabilistic
analysis of the effectiveness of the proposed throttling scheme. Finally, we conclude
the paper in Sect. 7.
2 Problem Description
DDoS is a mechanism devised by illegitimate clients to gain access in the server
environment by flooding the system with packets that keep the system resources
engaged long enough to deny service to legitimate clients requesting the resources
and eventually gaining control of the server. Victims of DDoS attack consists of
not only targeted system but also of all compromised systems maliciously used
and controlled by the attacker.
To elaborate, we will consider a hypothetical server capable of serving 10,000
requests per second. Assume there are two legitimate clients asking for service
request at the rate of 500 requests per second. Since, the request rate is below
the rate at which server can process the request, the server serves the request of
these clients.
Now, assume that an adversarial client starts sending requests to the server at
the rate of 20,000 requests per second. Due to the increase in requests received,
the server is overloaded. Hence, in the queue maintained by the server, the number
of requests from the adversary exceeds the number of requests from the legitimate
clients.
In a distributed attack scenario, assume that another malicious client is sending
requests at the rate of 20,000 packets per second to the server. Now, 41,000
packets per second are competing for the server's resources, thereby increasing
the probability that the legitimate clients are denied service. Eventually, as the
attack intensifies and the server exhausts her resources, she can no longer serve any
requests and ends up either hanging or crashing. So, there is a need to prevent the
malicious clients from overwhelming the server with requests so that the legitimate
clients are not denied service.
3 Related Work
Juels [7,12] proposed to solve the problem of DDoS by the use of cryptographic
puzzles. Once the server determines that it is under attack, it starts
sending puzzles to all its clients, which are to be solved within a specified time
interval. On solving the puzzle, the clients are given access to the server resources.
An adversary will take more time because he will have to solve a large number of
puzzles, one for each of his requests. However, in this scheme the legitimate client
will be denied access if his solution does not reach the server within the expected
time duration. In the scheme by Aura [3], the efficiency of the client puzzles is
improved by reducing the length of the puzzle and the number of hash operations
needed to verify the solution. Abadi [1] proposed the use of memory-bound functions
in cryptographic puzzles; memory bound is a condition wherein the time needed to
complete a computational problem is decided mainly by the amount of memory
required to hold the data. Back [4] contributed a Hashcash-based solution in which
the client computes a token that can be used as a proof-of-work; Hashcash is a
function that is efficiently verifiable but expensive to compute. The main drawbacks
of the client-puzzle approach were setting the difficulty of the puzzle in the presence
of an attacker with unspecified computing power and integrating the puzzles with
existing mechanisms. Wang [17] used the concept of puzzle auctions to enable every
client to “bid” for resources by tuning the difficulty of the puzzle it solves and to
adapt its bidding strategy in response to apparent attacks. However, the main issues
with this approach were adjusting the difficulty with respect to the unknown
computing power of the attacker and the significant load the puzzles impose on
legitimate clients. Along similar lines, schemes such as those in [5,11,15] were
proposed, and a brief survey of related schemes and current DDoS trends is given
in [2,9,13,14,18]. However, foreseeing the significant increase in the computing power
of processors, the emphasis shifted toward more resilient solutions to DDoS attacks.
Thus, the focus moved to NP-hard problems, exploiting their hardness to counter
DDoS attacks. In one such approach, Darapureddi et al. [6] used the Discrete
Logarithm Problem as the hard problem to strategically rule out malicious packets;
in this approach, prime numbers, generators, and finite fields are generated using a
combination of the IP address and a timestamp. In the scheme due to Syed [10], the
Integer Factorization Problem is used as the hard problem to throttle the illegitimate
clients.
All the above schemes that employ computationally hard problems for throttling
the DDoS client require the problem instance to be such that problem formulation
and result verification take little effort compared to the actual computation.
Generating an instance of a computationally hard problem can, in the worst case,
be as costly as solving the problem. Also, the availability of such problem instances
is affected by various factors such as the input size. All these issues need to be
addressed to guarantee scalability in addition to the effectiveness of the solution.
In this paper, we propose a ringer-based approach to throttle DDoS attacks. The
concept of a ringer is widely used for detecting malicious behaviour across many
distributed computing platforms, as in [8,16], to segregate illegitimate requests from
legitimate ones.
4 Proposed Solution
We address the problem of DDoS by providing the client a polynomial to evaluate
in order to generate a validation value. This validation value is then verified at the
server end. With this, the computation time required at the client end is of the
order O(m·n!), where n is the number of coefficients of the polynomial and m is
the number of points at which the polynomial is evaluated by the client before
being granted access to the service.
We stress that our solution can effectively throttle the malicious client so as
to reduce the impact of the attack on the server, while taking constant time to
generate the polynomial sequence. Thus, it imposes a significant computational load
on the malicious clients seeking access to the server, with little or no cost to the
server itself.
Polynomial evaluation is the process of computing, for a given input from the
domain of a polynomial function, the corresponding value in its range.
Threshold is the permitted limit on the number of requests the server can handle
without stressing her resources.
In this section, we present notations that are used throughout the paper. The
concrete proposed throttling solution is also presented.
Notations:
When the server receives requests in a normal scenario, wherein the number
of requests reaching the server is well below the maximum it can serve,
the server operates normally without any need to invoke the solution. However,
once the number of requests at the server end crosses the threshold, the solution
is invoked, which includes the steps mentioned in the previous section. As explained,
the solution allows only those requests to go through to the server whose users could
successfully compute the correct value corresponding to the validation value for
that request.
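To illustrate the challenge–response exchange just described, the following is a simplified Python sketch in which the server publishes a polynomial, a point set X, and a validation value, and the client searches X for the point that evaluates to it; the degree, modulus, set size, and helper names are assumptions for this example only, and the paper's actual construction makes the client-side search substantially more expensive (of the order O(m·n!)).

import random

def poly_eval(coeffs, x, p):
    # Evaluate the polynomial with the given coefficients at x, modulo p (Horner's rule)
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc

def make_challenge(degree=4, m=10_000, p=2**31 - 1):
    # Server side: pick a random polynomial, a point set X, and a secret point x_star in X
    coeffs = [random.randrange(1, p) for _ in range(degree + 1)]
    X = random.sample(range(p), m)
    x_star = random.choice(X)
    return coeffs, X, poly_eval(coeffs, x_star, p), p

def solve_challenge(coeffs, X, validation_value, p):
    # Client side: evaluate the polynomial over X until the validation value is matched
    for x in X:
        if poly_eval(coeffs, x, p) == validation_value:
            return x
    return None

coeffs, X, v, p = make_challenge()
assert poly_eval(coeffs, solve_challenge(coeffs, X, v, p), p) == v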
These steps are bound to consume a large number of computation cycles, and
since each request will have a new validation value, triggering our solution results
in a slowdown in the number of illegitimate requests per unit time an attacker can
send. As the distributed attack deepens, we can increase the degree of the polynomial
and the size of the set X without putting much load on the server. Each time a client
initiates a service request, the server treats it as a new request and does not try to
distinguish between fake and genuine requests as such. The attacking client
who tries to flood an enormous number of requests towards the server must compute
and send a response corresponding to each validation value.
Due to the high computational cost involved in obtaining the desired response
for a validation value, the number of requests that can successfully go through
to the server per unit time is limited, leading to a sudden drop in server
utilization. The same computational burden is imposed on a genuine client, but
since the number of requests generated by the genuine client is very low, this
leads only to a tolerable delay in service for the genuine client. Moreover, the degree
of the polynomial can be reduced to lessen the computational burden in case the
genuine client is a mobile device limited by storage and/or power constraints.
Increasing the degree of the polynomial only increases the time it takes for the
clients to find the right value.
Table 1. Time taken (ms) by different browsers to obtain correct validation value.
In Figs. 1(a), (b) and 2(a), the plots of server load against time depict the
reduction in the relative number of fake packets after the solution is invoked, for
polynomials of degree 2, 3 and 4, respectively. The quantity Server Load in the
graphs is the number of packets the server processes at a given time instant.
Fig. 2. (a) Solution impact with degree 4 and (b) Server overhead of the solution.
As the degree of the polynomial increases, the rate at which the attack packets
are eliminated also increases. So it can be seen that the decline in the relative
number of attack packets is faster for the degree-4 polynomial compared
with the degree-3 polynomial, and so on. This is because the illegitimate client
has to process as many polynomials as the number of requests it wants to
flood the server with, which slows the attacking client down, thus reducing the
number of requests reaching the server. With an increase in the degree of the
polynomial, the computation time keeps increasing.
Figure 2(b) shows the computation overhead our solution imposes on the server
for each polynomial degree. We see that as the degree increases, the
computation overhead on the server also increases. However, the computation
load on the server is low compared to previously developed schemes.
In Fig. 3(a), we present the effectiveness of our solution in throttling
the DDoS client. It clearly highlights that the higher the degree of the polynomial,
the faster the throttling of illegitimate clients. In order to understand the behaviour
of our solution when the attacker is intelligent, we developed an intelligent
illegitimate client, in Java using Eclipse, that could compute the validation value.
We analysed the duration for which the attacker can continue the attack before she
gets totally exhausted, along with the impact of changing the degree of the polynomial.
Fig. 3. (a) Solution effectiveness and (b) Packets falsely marked valid (false negative).
Thus, in Fig. 3(b), we present the delay our solution took to throttle all the
malicious requests made to the server.
Figure 4 depicts the performance of our solution when the malicious client is
capable of computing the validation value. It shows the total packets generated
by the intelligent adversary when the solution is invoked.
P(w') = (1/n!) · ( w'/m + (1 − w'/m) · 1/(m − w') ) = (w' + 1)/(n!·m)        (3)
The above probability offers a fair chance, approaching P(1), only after the value
w' surpasses n!. This means that to obtain a success probability any larger than 1/m,
the attacking client has to work on the polynomial; that is, both P(w) and P(w') are
proportional to w and w' respectively, i.e., to the amount of work done in each case.
So the client has to work on more and more points from X to attain a probability
of success higher than that of a completely random guess, which is our objective in
throttling the DDoS client.
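For completeness, the closed form on the right-hand side of Eq. (3) follows from the bracketed expression by straightforward algebra; the short verification below introduces nothing beyond the quantities already defined.

\[
P(w') \;=\; \frac{1}{n!}\left(\frac{w'}{m} + \left(1-\frac{w'}{m}\right)\frac{1}{m-w'}\right)
       \;=\; \frac{1}{n!}\left(\frac{w'}{m} + \frac{m-w'}{m}\cdot\frac{1}{m-w'}\right)
       \;=\; \frac{1}{n!}\cdot\frac{w'+1}{m}
       \;=\; \frac{w'+1}{n!\,m}.
\]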
We present a comparative study of the proposed scheme against the existing ones
with respect to strength, scalability, and robustness. A performance comparison is
also presented. We find that the proposed scheme is scalable, efficient, and relies on
simpler assumptions. Moreover, the proposed scheme is strong and imposes a
relatively low computational burden on the server for computing the challenge. Ours
is the only throttling scheme based on problems that are not NP-hard (Table 2).
7 Conclusion
Distributed Denial of Service (DDoS) is a serious attack on a server's reputation
in which the attacker engages the server's resources by overwhelming the server
with a huge number of service requests. Throttling such clients is an effective
defence against these attacks. As a result of throttling, the cost of sending attack
packets becomes huge, and the effect of the attack can be reduced to within a tolerable
limit for the server. In this paper, we address the problem of DDoS attacks by
throttling the attacker using the polynomial evaluation problem. In previous methods,
the server throttles illegitimate clients by sending challenges based on computationally
hard problems, and ends up generating problem instances that are costly to
both generate and verify. In our solution, we propose to use a combinatorial problem
whose cost grows faster than an exponential function. Thus, with a small change,
such as increasing the degree of the polynomial at the server side, we are
capable of generating a massive computational overhead on the client machines.
Thus, compared to previous schemes, our solution imposes a heavier computational
duty on the clients with less overhead on the server, even when the solution
needs to be intensified as the attack deepens. The accompanying probabilistic
analysis suggests the effectiveness of our method against an intelligent attacker who
wishes to gain control of the server by doing a small number of computations. The
probability of gaining access to the server by doing a limited number of computations
is even worse than that of a completely random guess.
References
1. Abadi, M., Burrows, M., Manasse, M., Wobber, T.: Moderately hard, memory-bound
functions. ACM Trans. Internet Technol. (TOIT) 5(2), 299–327 (2005)
2. Ali, S.T., Sultana, A., Jangra, A.: Mitigating DDoS attack using random integer
factorization. In: 2016 Fourth International Conference on Parallel, Distributed and
Grid Computing (PDGC), pp. 699–702, December 2016
3. Aura, T., Nikander, P., Leiwo, J.: DOS-resistant authentication with client puz-
zles. In: Christianson, B., Malcolm, J.A., Crispo, B., Roe, M. (eds.) Security Proto-
cols 2000. LNCS, vol. 2133, pp. 170–177. Springer, Heidelberg (2001). doi:10.1007/
3-540-44810-1 22
4. Back, A., et al.: Hashcash-a denial of service counter-measure. Technical report
(2002)
5. Crosby, S.A., Wallach, D.S.: Denial of service via algorithmic complexity attacks. In:
USENIX Security, vol. 2 (2003)
6. Darapureddi, A., Mohandas, R., Pais, A.R.: Throttling DDoS attacks using discrete
logarithm problem. In: Proceedings of the 2010 International Conference on Security
and Cryptography (SECRYPT), pp. 1–7. IEEE (2010)
7. Dean, D., Stubblefield, A.: Using client puzzles to protect TLS. In: USENIX Security
Symposium, vol. 42 (2001)
8. Golle, P., Mironov, I.: Uncheatable distributed computations. In: Naccache, D. (ed.)
CT-RSA 2001. LNCS, vol. 2020, pp. 425–440. Springer, Heidelberg (2001). doi:10.
1007/3-540-45353-9 31
9. Gu, Q., Liu, P.: Denial of service attacks. In: Bidgoli, H. (ed.) Handbook of Com-
puter Networks: Distributed Networks, Network Planning, Control, Management,
and New Trends and Applications, vol. 3, pp. 454–468. Wiley, Hoboken (2007)
10. Gujjunoori, S., Syed, T.A., Madhu Babu, J., Darapureddi, A., Mohandas, R., Pais,
A.R.: Throttling DDoS attacks. In: Proceedings of the 2009 International Conference
on Security and Cryptography (SECRYPT), pp. 121–126. INSTICC Press (2009)
11. Jin, C., Wang, H., Shin, K.G.: Hop-count filtering: an effective defense against
spoofed DDoS traffic. In: Proceedings of the 10th ACM Conference on Computer
and Communications Security, pp. 30–41. ACM (2003)
12. Juels, A., Brainard, J.G.: Client puzzles: a cryptographic countermeasure against
connection depletion attacks. In: NDSS 1999, pp. 151–165 (1999)
13. Li, X., Wang, Y., Zhang, Y.: Session initiation protocol denial of service attack
throttling. US Patent Application 13/944,156, 22 January 2015. https://www.google.com/
patents/US20150026793
14. Malialis, K., Kudenko, D.: Multiagent router throttling: decentralized coordinated
response against DDoS attacks. In: IAAI (2013)
15. Mirkovic, J., Prier, G., Reiher, P.: Attacking DDoS at the source. In: Proceedings of
the 10th IEEE International Conference on Network Protocols, pp. 312–321. IEEE
(2002)
16. Sion, R.: Query execution assurance for outsourced databases. In: Proceedings of the
31st International Conference on Very Large Data Bases, VLDB 2005, pp. 601–612.
VLDB Endowment (2005)
17. Wang, X., Reiter, M.K.: Defending against denial-of-service attacks with puzzle auc-
tions. In: Proceedings of Symposium on Security and Privacy, pp. 78–92. IEEE (2003)
18. Wong, F., Tan, C.X.: A survey of trends in massive DDoS attacks and cloud-based
mitigations. Int. J. Netw. Secur. Appl. 6(3), 57 (2014)
NPSO Based Cost Optimization for Load
Scheduling in Cloud Computing
1 Introduction
Cloud computing is a fast-growing computing technology providing a large set of
resources. It is a growing computing paradigm that offers higher scalability, better
software and hardware management, and greater flexibility among the resources. Cloud
computing provides a virtual, distributed computing environment to users under the
pay-as-you-use model, and it can be accessed over a large geographical area. The
resources can be executed on heterogeneous computers, and the cloudlets are executed
in parallel on the virtual machines. The tasks consume a large amount of resources in a
distributed environment and involve handling complexities and time (response, execution,
and transfer). The allocation of the heterogeneous resources to virtual machines is
performed by load schedulers and is known as load scheduling.
Load scheduling refers to providing and allocating resources to the available
tasks in the virtual distributed system. It is performed to solve problems such as
starvation, system failure, and deadlock [1, 3], and it deals with the minimization of the
total cost. The main purpose of load balancing is to increase the performance of the
system and achieve cost effectiveness through proper exploitation and allocation
of the resources to the tasks. Job scheduling based on swarm intelligence has
significant importance in this environment. A swarm is a collection of particles
or objects [7, 8]. The swarm technique deals with division of labour and the generation of
good solutions available in the system. The biggest advantage of using a swarm-
intelligence-based technique is that it involves both positive and negative
feedback. Particle Swarm Optimization (PSO) is a type of optimization that uses a
self-adaptive global search mechanism for workflow scheduling [9, 13]. This optimization
approach draws on ideas from various algorithms such as genetic algorithms (GAs),
Simulated Annealing (SA), and Ant Colony Optimization (ACO), among others [10, 11].
PSO is applied in cloud computing for faster data retrieval at very low cost in the
system [12, 16]. This paper discusses a New Particle Swarm Optimization algorithm
(NPSO) for load scheduling in cloud computing using a new fitness function for the
total cost evaluation. The CloudSim simulator is used for the implementation [2].
This paper is arranged in the following manner. Section 2 provides a brief review
of particle swarm optimization based load scheduling. In Sect. 3, we introduce the newly
proposed algorithmic approach. Section 4 demonstrates the experimental setup along
with the results and analysis of the algorithms, and finally Sect. 5 provides the
conclusions and future scope.
In this algorithm, the particles are initially placed at random positions. Every
particle generates a fitness value based on specific parameters [14, 15], which
guides the traversal from one particle to another. The next particle to be executed
depends on the existing position and the velocity of the particle. All the particles are
traversed until the stopping condition is met. The cost is calculated as the total sum of
all the execution and transfer costs. Compared with static algorithms [17, 18], this
approach yields a lower cost.
The proposed approach is known as the New Particle Swarm Optimization (NPSO) for
load scheduling. It uses a new cost function for the calculation of the total cost
(execution and transfer). This algorithm also includes a storage capability, achieved
by storing the values of the best particle in the specific iteration as well as in the
search space. It allows maximum exploitation of the resources in the search
space. Our new fitness function depends on the VM cost involved as well as the VM
time.
Total Cost(M) is specified as the total cost over all the particles that are assigned,
and it is used to calculate the fitness of each particle. The NPSO performs the cost
computation using the updated fitness function and the total cost evaluation function.
Total Cost(M) is calculated from the following transfer and execution costs of the
cloudlets:
    Cex(M)j = Σk wkj,   ∀ M(k) = j                                          (1)

    Ctr(M)j = Σk1∈T Σk2∈T dM(k1),M(k2) · ek1,k2,   ∀ M(k1) = j and M(k2) ≠ j   (2)
Cex(M)j specifies the execution cost, Ctr(M)j is the transfer cost of the cloudlets, and
Ctot(M)j denotes the sum of the execution cost and the transfer cost of the cloudlets on
VM j. The maximum value of the total cost among all VMs is taken as the
Total Cost(M) value.
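The following Python sketch computes these quantities for a toy mapping of cloudlets to VMs; the mapping M, the workload values w, the data-size values e, and the per-link cost matrix d are small made-up values used only to illustrate Eqs. (1), (2) and the max-over-VMs rule.

# Toy instance: 4 cloudlets mapped onto 2 VMs
M = {0: 0, 1: 0, 2: 1, 3: 1}          # cloudlet k -> VM j
w = {0: 5.0, 1: 3.0, 2: 4.0, 3: 6.0}  # w_kj: execution cost of cloudlet k on its VM
e = {(0, 2): 2.0, (1, 3): 1.5}        # e_k1,k2: data transferred between dependent cloudlets
d = [[0.0, 1.0], [1.0, 0.0]]          # d_j1,j2: per-unit transfer cost between VMs

def total_cost(M, w, e, d, num_vms=2):
    cost = []
    for j in range(num_vms):
        c_ex = sum(w[k] for k in M if M[k] == j)                       # Eq. (1)
        c_tr = sum(d[M[k1]][M[k2]] * e[(k1, k2)]                       # Eq. (2)
                   for (k1, k2) in e if M[k1] == j and M[k2] != j)
        cost.append(c_ex + c_tr)                                       # C_tot per VM
    return max(cost)                                                   # Total Cost(M)

print(total_cost(M, w, e, d))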
Figure 1 shows the flowchart of CloudSim and NPSO applied to scheduling the
cloudlets (tasks) onto virtual machines. First, a datacenter is created and a list of N
cloudlets is initialized with the data size, execution cost, and transfer cost. Then the
number of virtual machines is initialized along with parameters such as MIPS (million
instructions per second), RAM, the execution cost of a task on each VM, and the
transfer cost among VMs.
Fig. 1. Flowchart of New Particle Swarm optimization based on cost function (NPSO) for load
scheduling in cloud computing
Then the PSO algorithm is applied. First, a fixed number of particles are randomly
distributed in the search space and the fitness value of each particle is computed based on
the new improved fitness function. It is the weighted sum of the VMcost and VMtime
parameters of the cloudlets allocated to the virtual machines. VMtime is evaluated
using Eqs. (6) and (7). The improved fitness function used in the proposed approach is
calculated by Eq. (9).
    Tex(M)j = Σk (wk / mipsj),   ∀ M(k) = j                (6)

    Total Time(M) = maxj Tex(M)j,   ∀ j ∈ P                (7)
Thus, the minimization is now performed on the improved fitness function given
above, taking both the time and cost into consideration.

    α ∈ [0, 1]                                             (10)

The parameter α is used to provide the weighted sum of cost and time. These
values help in finding the best particles among all the particles in the system, i.e.,
pbest(i, t), and gbest(t), the global best value among the particles in the current
iteration. The pbest(i, t) and gbest(t) values are calculated as:
    pbest(i, t) = arg min k=1,...,t [f(Pi(k))],   i ∈ {1, 2, ..., Np}        (11)
Here, i represents the index of the particle, Np represents the total number of
particles, f symbolizes the fitness function, P denotes the position, and t is the current
iteration. The velocity and position of the next particle are calculated in the following
manner.
    Pi(t + 1) = Pi(t) + Vi(t + 1)                          (14)
where the velocity of particle i at iteration t is represented as Vi(t), and the velocity of
particle i at iteration (t + 1) is denoted as Vi(t + 1). The coefficients c1 and c2 are the
acceleration coefficients, r1 and r2 denote random values between 0 and 1, and ω
represents the inertia weight. Pi(t) specifies the current position of particle i at
iteration t and Pi(t + 1) denotes its position at iteration (t + 1). On the basis of these
values, the particles move to the next position Pi(t + 1) with the updated velocity
Vi(t + 1) from the previous position Pi(t) at velocity Vi(t). The particles' positions are
updated until the iteration-count condition is fulfilled, and then the best position is
returned.
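To make the update loop concrete, here is a minimal Python sketch of the standard PSO velocity and position update consistent with Eq. (14); the velocity rule shown (inertia plus pbest and gbest attraction terms) is the conventional PSO form and, together with the toy one-dimensional fitness and the parameter values, is an assumption made for illustration rather than the paper's exact formulation.

import random

def fitness(x):
    # Toy one-dimensional fitness to be minimized (stand-in for the weighted cost/time function)
    return (x - 3.0) ** 2

w_inertia, c1, c2 = 0.7, 1.5, 1.5   # assumed inertia weight and acceleration coefficients
particles = [random.uniform(-10, 10) for _ in range(25)]
velocities = [0.0] * 25
pbest = particles[:]                                 # personal best positions
gbest = min(particles, key=fitness)                  # global best position

for t in range(100):
    for i in range(25):
        r1, r2 = random.random(), random.random()
        # Velocity update: inertia + attraction toward pbest and gbest
        velocities[i] = (w_inertia * velocities[i]
                         + c1 * r1 * (pbest[i] - particles[i])
                         + c2 * r2 * (gbest - particles[i]))
        particles[i] += velocities[i]                # position update, Eq. (14)
        if fitness(particles[i]) < fitness(pbest[i]):
            pbest[i] = particles[i]
    gbest = min(pbest, key=fitness)

print(round(gbest, 3))   # converges near 3.0 for this toy fitness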
The cloudlets, after being assigned specific values, are passed to the respective
virtual machines (VMs) for execution. The datacenter executes the cloudlets on the
virtual machines. The datacenter broker, known as NetDatacenterBroker, helps in
initializing and shutting down the datacenter in the cloud. The cost of execution
incurred is specified as the maximum value of the total computation involved in the
NetCloudletSpaceSharedScheduler class. The total transfer cost is specified in the
NetworkHost class for each host. This total cost includes the sum of the total execution
time and the total transfer time of all the cloudlets on a virtual machine (VM). The New
Particle Swarm Optimization algorithm (NPSO) in the cloud is based on the new cost
evaluation function; the total cost calculation uses the same Eqs. (1)–(4). This paper
implements the PSO scheduling given by Buyya et al. on Cloud Labs, but a contradiction
was found in that paper for the total cost evaluation using CloudSim; our algorithm
implements the correct cost evaluation strategy provided in the paper to obtain correct
results. The cost computation in the cloud is performed in the manner defined using an
example in Fig. 2 in CloudSim. This new cost evaluation approach provides a minimized
cost, calculated on the basis of the execution and transfer costs of the cloudlets and
the VMs.
The proposed approach gives a minimized total cost, including the execution and
transfer costs among the cloudlets, using the particles on the virtual machines in the
cloud for optimal (minimized) load scheduling in the system. This is a meta-heuristic
approach providing greater exploitation of the search space using memory-based
functionality. The detailed results are explained in tabulated and graphical form in
the next section.
The above-discussed scheduling heuristics using swarm intelligence, viz. PSO and
NPSO, are implemented in the CloudSim simulator, which helps in designing and
processing the new cost computation function. The NetworkCloudSim simulator is
based on CloudSim. The proposed approach is implemented by extending the available
classes and creating a new improved fitness function and cost evaluation function in the
simulator, using the JSwarm package for the particles and their properties. Twenty-five
particles are defined in the search space. These particles include various values such as
inertia, maximum position, minimum position, and velocity. The number of cloudlets
running on 8 VMs is 10, 15, and 20. The results are computed on a large set of
iterations ranging from 10 to 100 and 100 to 1000. The cloudlets and VMs include the
features and characteristics provided by the system, such as MIPS (millions of
instructions per second), bandwidth, transfer cost, and execution cost, which are used
for the calculation of the total cost incurred by the system. The total costs computed
on the set of iterations for the existing PSO and the proposed New PSO for cost
optimization are given in Table 1 for 10 cloudlets on 8 VMs, Table 2 for 15 cloudlets on
8 VMs, and Table 3 for 20 cloudlets on 8 VMs.
Table 1. Comparison of total cost for 10 cloudlets in PSO and NPSO algorithm
Iterations PSO New PSO
10 144670.354 19721.955
20 154718.388 22671.367
30 146023.003 22231.076
40 151117.032 19150.000
50 153934.604 20809.476
60 146671.563 22364.729
70 150630.778 23605.914
80 144320.104 19472.795
90 150043.444 22231.076
100 155538.952 24862.478
200 143757.944 23889.320
300 155469.762 18361.303
400 154534.972 21721.068
500 148303.411 19307.616
600 143482.749 23337.075
700 148181.188 22558.027
800 145980.430 18035.190
900 145713.846 23430.972
1000 150183.341 21122.244
Table 2. Comparison of total cost for 15 cloudlets in PSO and NPSO algorithm
No. of Iterations PSO New PSO
10 251281.517 30162.991
20 248417.269 31208.575
30 248960.089 36607.071
40 247411.824 36607.071
50 249097.514 30655.160
60 252863.341 28856.305
70 233301.076 36607.071
80 240275.467 36201.780
90 238414.151 27192.867
100 246693.713 29867.619
200 257046.548 28406.375
300 241913.925 28961.424
400 250686.566 31208.575
500 255247.494 31374.876
600 255640.518 31260.997
700 240621.622 30162.991
800 234732.638 29630.981
900 245081.605 36607.071
1000 245697.967 33854.779
Table 3. Comparison of total cost for 20 cloudlets in PSO and NPSO algorithm
No. of Iterations PSO New PSO
10 368649.602 54551.860
20 354478.049 46299.038
30 345540.580 39905.970
40 356553.370 41206.459
50 353046.854 39521.912
60 355353.891 48829.189
70 351187.753 41673.579
80 357996.221 44268.773
90 371324.589 38517.034
100 368448.353 39170.780
200 359849.885 49303.231
300 347655.150 34995.054
400 350306.997 47062.315
500 365131.362 39821.958
600 366225.261 34995.054
700 369217.780 37408.506
800 366704.318 41700.048
900 362957.682 40466.853
1000 363249.969 38177.916
Figures 3, 4 and 5 depict the graphical analysis of the total cost of 10, 15, and
20 cloudlets on 8 VMs versus the number of iterations. This comparison is performed
for PSO and the New Particle Swarm Optimization Algorithm (NPSO) for load
scheduling.
Fig. 3. Analysis of total cost based on cost function for 10 cloudlets in PSO and NPSO
algorithms
Fig. 4. Analysis of total cost based on cost function for 15 cloudlets in PSO and NPSO
algorithms.
Fig. 5. Analysis of total cost based on cost function for 20 cloudlets in PSO and NPSO
algorithms.
The statistical analysis of the results generates the mean, standard deviation,
minimum and maximum values given in Table 4.
Table 4. Descriptive statistics of PSO and NPSO algorithms for 10, 15 and 20 cloudlets
Statistic            Cloudlets   PSO          New PSO
Mean 10 149119.782 21520.193
15 246493.939 31864.977
20 359677.771 41993.448
Standard deviation 10 4187.166 3177.125
15 6849.873 2354.763
20 7944.934 5159.619
Minimum 10 143482.749 18035.190
15 233301.076 27192.867
20 345540.580 34995.054
Maximum 10 155538.952 24862.478
15 257046.548 36607.071
20 371324.589 54551.860
Figures 6 and 7 graphically analyze the descriptive statistics (mean and standard
deviation) of the total cost for the different numbers of cloudlets, comparing the PSO
and NPSO algorithms.
Thus, we find that the New Particle Swarm Optimization algorithm (NPSO), using a
new cost calculation function in the cloud computing environment, provides better
results for the load scheduling problem.
Fig. 6. Mean of total cost for number of cloudlets in PSO and NPSO algorithms
Fig. 7. Standard deviation of total cost for number of cloudlets in PSO and NPSO algorithms
5 Conclusion
This paper identified load scheduling as an important factor for processing data in
the cloud computing environment between cloudlets and VMs. The PSO algorithm,
which is meta-heuristic in nature and based on the swarm intelligence technique for
load scheduling, has been explained; it is based on the fitness values of the particles
and the forces acting on them. The proposed New Particle Swarm Optimization
(NPSO) approach uses a new cost evaluation function on the cloudlets for the total
cost and provides higher cost optimization for 10, 15, and 20 cloudlets. A descriptive
statistical analysis of the total cost results is showcased. The results of the proposed
NPSO approach have been compared with the prevailing PSO scheduling algorithm
in graphical and tabular form. The proposed New Particle Swarm Optimization
approach gives more realistic and minimized results (total cost), focusing on the
essence of load scheduling. Future work includes developing a new fitness function
for further cost reduction in the cloud using different simulators and real-time hosts.
References
1. Buyya, R., Pandey, S., Vecchiola, C.: Cloudbus toolkit for market-oriented cloud
computing. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) CloudCom 2009. LNCS, vol.
5931, pp. 24–44. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10665-1_4
2. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: IEEE International Conference
on Neural Networks, vol. 4, pp. 1942–1948. IEEE (1995)
3. http://en.wikipedia.org/wiki/Cloud_computing
4. http://en.wikipedia.org/wiki/Load_balancing_(computing)
5. Pandey, S., Buyya, R., et al.: A particle swarm optimization based heuristic for scheduling
workflow applications in cloud computing environments. In: 24th IEEE International
Conference on Advanced Information Networking and Applications, pp. 400–407 (2010)
6. Tsai, C.W., Joel, J.P., Rodrigues, C.: Metaheuristic scheduling for cloud: a survey. IEEE
Syst. J. 8(1), 279–291 (2014)
7. Chaudhary, D., Chhillar, R.S.: A new load balancing technique for virtual machine cloud
computing environment. Int. J. Comput. Appl. 69(23), 37–40 (2013)
8. Chaudhary, D., Kumar, B.: Analytical study of load scheduling algorithms in cloud
computing. In: IEEE International Conference on Parallel, Distributed and Grid Computing
(PDGC), pp. 7–12 (2014)
9. Chaudhary, D., Kumar, B.: An analysis of the load scheduling algorithms in the cloud
computing environment: a survey. In: IEEE 9th International Conference on Industrial and
Information Systems (ICIIS), pp. 1–6 (2014)
10. Kang, Q., He, H.: A novel discrete particle swarm optimization algorithm for meta-task
assignment in heterogeneous computing systems. Microprocess. Microsyst. 35(1), 10–17
(2011)
11. Pacini, E., Mateos, C., Garino, C.G.: Distributed job scheduling based on swarm
intelligence: a survey. Comput. Electr. Eng. 40, 252–269 (2014). Elsevier
12. Garg, S.K., Buyya, R.: Network CloudSim: modelling parallel applications in cloud
simulations. In: 4th IEEE/ACM International Conference on Utility and Cloud Computing
(UCC 2011), Melbourne, Australia. IEEE CS Press (2011)
13. Kumar, D., Raza, Z.: A PSO based VM resource scheduling model for cloud computing. In:
IEEE International Conference on Computational Intelligence and Communication Tech-
nology (CICT), pp. 213–219 (2015)
14. Bhardwaj, S., Sahoo, B.: A particle swarm optimization approach for cost effective SaaS
placement on cloud. In: International Conference on Computing, Communication and
Automation (ICCCA), pp. 686–690 (2015). doi:10.1109/CCAA.2015.7148462
15. He, X., Ren, Z., Shi, C., Fang, J.: A novel load balancing strategy of software-defined
cloud/fog networking in the Internet of Vehicles. China Commun. 13(Suppl. 2), 140–149
(2016)
16. Agnihotri, M., Sharma, S.: Execution analysis of load balancing particle swarm optimization
algorithm in cloud data center. In: Fourth International Conference on Parallel, Distributed
and Grid Computing (PDGC), Waknaghat, pp. 668–672 (2016)
17. Gupta, S.R., Gajera, V., Jana, P.K.: An effective multi-objective workflow scheduling in
cloud computing: a PSO based approach. In: Ninth International Conference on Contem-
porary Computing (IC3), Noida, pp. 1–6 (2016)
18. Riletai, G., Jing, G.: Improved PSO algorithm for energy saving research in the double layer
management mode of the cloud platform. In: 2016 IEEE International Conference on Cloud
Computing and Big Data Analysis (ICCCBDA), Chengdu, pp. 257–262 (2016)
19. Fan, C., Wang, Y., Wen, Z.: Research on Improved 2D-BPSO-based VM-container hybrid
hierarchical cloud resource scheduling mechanism. In: 2016 IEEE International Conference
on Computer and Information Technology (CIT), Nadi, pp. 754–759 (2016)
20. Somasundaram, T.S., Govindarajan, K., Kumar, V.S.: Swarm Intelligence (SI) based
profiling and scheduling of big data applications. In: 2016 IEEE International Conference on
Big Data (Big Data), Washington, DC, pp. 1875–1880 (2016)
21. Ibrahim, E., El-Bahnasawy, N.A., Omara, F.A.: Task scheduling algorithm in cloud
computing environment based on cloud pricing models. In: 2016 World Symposium on
Computer Applications & Research (WSCAR), Cairo, pp. 65–71 (2016)
22. Mao, C., Lin, R., Xu, C., He, Q.: Towards a trust prediction framework for cloud services
based on PSO-driven neural network. IEEE Access 5, 2187–2199 (2017)
Multi-sink En-Route Filtering Mechanism
for Wireless Sensor Networks
1 Introduction
Wireless Sensor Networks (WSNs) comprise a large number of sensor nodes which are
very limited in computational and memory resources. Sensor nodes are used to
sense the nearby environment where they are deployed. Because of these sensing
capabilities, sensor nodes are deployed in hostile environments, for example for
military monitoring and industrial sensing, for sensing and tracking purposes [1].
When a WSN is deployed, the sensor nodes sense the environment and send
the data to the sink (data collection node). Sensing nodes deployed in hostile and
unattended environments can be easily compromised, which can hamper the
overall security of the network. These compromised nodes can send forged or bogus
reports into the network, which unnecessarily increases the network traffic
and can also cause the sink to take wrong decisions or raise false alarms. These
compromised nodes can also launch various DoS attacks, which can jeopardize
the normal working of the network. Other possible attacks are the selective forwarding
attack [1], where a compromised node drops legitimate reports passing through
it, and the report disruption attack [1], where compromised nodes contaminate the
authentication data in legitimate reports. Therefore, it is of utmost importance to
drop these false reports from the network as soon as possible to decrease the effect
of any attack on the network.
To reduce the effect of the attacks discussed above and to filter false and
forged reports, many en-route-filtering-based techniques [2–7] have been proposed.
In these techniques, when an event happens it is sensed by multiple sensor nodes,
and all the nodes in a cell collaborate to form and endorse the report. Each
intermediate forwarding node verifies whether the endorsements included in the
report are genuine or not; detection of an incorrect endorsement leads to dropping
of the report. Finally, the sink can check whether the reports are genuine
or not.
All en-route filtering methods have mainly three phases: the key exchange phase,
the en-route filtering phase, and the sink verification phase. In the key exchange phase,
nodes exchange keys with the intermediate forwarding nodes on the path to the sink.
In the en-route filtering phase, intermediate nodes filter and forward the reports toward
the sink. In the sink verification phase, the sink acts as the final goalkeeper for the whole
network, collecting and verifying all the reports. The majority of the research
has been done in the key exchange phase of en-route filtering. Many techniques [2–7]
have been proposed for the key exchange phase, which can be grouped into two major
categories: symmetric cryptography based key exchange (SCBKE) and asymmetric
cryptography based key exchange (ASCBKE). The majority of en-route-filtering-based
techniques are symmetric cryptography based. All of these use message
authentication codes (MACs) derived from symmetric keys shared between multiple
nodes, and each legitimate report should carry a certain minimum number of valid
MACs (a minimal sketch of this endorsement-and-verification step is given below).
On the other hand, asymmetric cryptography based techniques use signatures
which can be verified by intermediate nodes and the sink. These techniques do not
require any pre-shared keys and mainly use elliptic curve cryptography
[8] and Shamir's threshold cryptography [8] to generate signatures. No alteration
has been made by any technique in the second phase of en-route filtering; thus, the
majority of techniques are susceptible to attacks like selective forwarding and
report disruption in the network. Sink verification requires either key exchange
with all the nodes to check the authenticity of reports, or it relies on signatures to
filter false reports.
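The following Python sketch illustrates the symmetric-key endorsement and en-route MAC check common to SCBKE schemes such as SEF; the partition layout, key sizes, the set of endorsing partitions, and the helper names are assumptions chosen only to illustrate the idea, and the global key pool is held in one place here merely to simulate both sides.

import hmac, hashlib, os

NUM_PARTITIONS = 10                                          # assumed key pool partitions
keys = {p: os.urandom(16) for p in range(NUM_PARTITIONS)}    # one key per partition (toy pool)

def endorse(event: bytes, partitions):
    # Sensing nodes in the cell attach one MAC per distinct key partition they hold
    return [(p, hmac.new(keys[p], event, hashlib.sha256).digest()) for p in partitions]

def en_route_check(event: bytes, macs, node_partition: int) -> bool:
    # A forwarding node can only verify a MAC whose partition key it holds;
    # a report carrying an invalid MAC for that partition is dropped.
    for p, tag in macs:
        if p == node_partition:
            return hmac.compare_digest(tag, hmac.new(keys[p], event, hashlib.sha256).digest())
    return True   # no key in common: the node forwards the report unverified

report = b"event: intrusion detected in cell (3,4)"
macs = endorse(report, partitions=[0, 2, 5, 7, 9])           # assumed five endorsing partitions
assert en_route_check(report, macs, node_partition=2)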
In this paper, we alter the en-route filtering phase without changing the key
exchange phase and the sink verification phase. For the key exchange phase we use the
SEF [2] and LBRS [3] techniques. We alter the en-route filtering mechanism, in which
only a single sink was present, by introducing new sinks into the network to gather
reports from all the nodes. In a nutshell, the large network is divided into many smaller
networks, where each smaller network has an independent sink and uses a different set
of keys. The proposed changes are tested with the SEF and LBRS schemes, but they
can also be applied to other filtering schemes to obtain the same results. The
contributions of the paper are as follows:
The rest of the paper is organized as follows: Sect. 2 discusses the en-route
filtering technique including all the proposed changes, Sect. 3 gives a detailed
analysis of the proposed changes with simulation results, Sect. 4 compares the
altered technique with related work, and Sect. 5 gives a discussion of the existing
and proposed techniques. Finally, future work and conclusions are discussed in
Sect. 6.
2 En-Route Filtering
keys are the keys exchanged with a few chosen verifiable cells. For the selection of these
cells, each node decides its upstream region, where the upstream region represents the
remote cells whose reports this node may forward. To decide this region, the node uses
a beam which starts at the sink, passes through the particular node at its centre, and
stretches to the end of the network. From all the cells which lie within this beam, the
node probabilistically chooses some cells with which it exchanges location-guided keys.
Location-bound keys are used for endorsing the events in a cell, and location-guided
keys are used to verify the reports from remote cells.
original network into 4 equal squares. Each smaller network has 750 nodes
spread over a 700 m square field. Each smaller network contains about 63 cells
with the same node density and a sink node at the centre, and each smaller network
is assigned a different set of keys. All other parameters are the same as discussed above.
With the creation of smaller networks there is a considerable decrease in the key
overhead of the LBRS scheme, but this reduction in the number of keys does not have
much effect on the overall filtering efficiency of LBRS. The filtering efficiency of SEF is
also the same in both cases. This shows that forged and false data will be dropped
within the same number of hops in both cases, but in the smaller networks genuine
reports travel fewer hops than in the larger network, which helps save energy in the
network. The smaller networks also have more resiliency against compromised nodes,
and the effect of the selective forwarding attack is also decreased. A detailed analysis
and simulation results are given in the next subsections.
    ph = 1 − (1 − p1)^H                                    (1)
In Eq. 1 we can see that there is no radius parameter, which means that decreasing the size of the network has no effect on the keys stored in the nodes. The first network model had 3k nodes and a single sink in the center. Using simulations we found that each packet travels around 16 hops on average. Using this value in the above formula with the other given values gives a filtering efficiency of 97.18%. The second network model had 4 smaller networks, each having 750 nodes and a separate sink. Simulation results showed that in this setup each packet travels around 9 hops on average. Using this value gives around 86.5% filtering efficiency, which is well within acceptable limits. The unfiltered reports can be filtered by the sink in the sink verification phase.
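To make the arithmetic behind these figures concrete, the following sketch evaluates Eq. (1) for both network models. The per-hop detection probability p1 = 0.2 is an assumption inferred from the quoted percentages, not a value stated in this excerpt.

```python
# Sketch (not from the paper): reproducing the filtering-efficiency numbers of Eq. (1).
def filtering_efficiency(p1: float, hops: int) -> float:
    """Expected fraction of forged reports dropped within `hops` hops (Eq. 1)."""
    return 1.0 - (1.0 - p1) ** hops

p1 = 0.2  # assumed per-hop filtering probability, inferred from the reported numbers
print(f"single sink (H=16): {filtering_efficiency(p1, 16):.2%}")  # ~97.2%
print(f"four sinks  (H= 9): {filtering_efficiency(p1,  9):.2%}")  # ~86.6%
```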
The filtering efficiency of LBRS is analyzed by the filtering position of the false reports. The attacker only has keys for creating a single MAC and would have to forge the remaining MACs. These forged MACs will be checked and dropped by the forwarding intermediate nodes. The probability that a forwarding node has the particular key which the attacker has tried to forge is given by the above equation, where k represents the number of keys assigned to each sensor node, m−1 is the number of MACs the attacker has to forge, and N is the total number of keys assigned in the whole network. In this technique, however, k is not constant and is decided according to the upstream region of a particular node. Moreover, the total number of keys also depends on the total number of cells in the network and the node density in each cell.
In the key exchange overhead subsection we discussed that the key exchange overhead decreases considerably if we decrease the size of the network. This means each node now stores fewer keys. This gives the intuition that filtering efficiency should decrease, because the number of keys stored by each individual node has decreased and forged reports should now travel more hops without being detected. But this is not the case. The number of keys decreases substantially if we decrease the upstream region of the node, but the probability of choosing only the intermediate upstream cells increases. This means each node will have fewer keys, but these keys will mainly come from intermediate upstream region cells. In the long run this ensures almost the same efficiency as a node having more keys, which has a slightly lower probability of choosing intermediate upstream cells. Moreover, in the above equation, the value of k decreases if we reduce the size of the network, but the value of N also decreases with it, since the smaller network has fewer cells. This can further be molded into a new equation [3] which gives the expected fraction of false reports dropped within H given hops: p_h = 1 − ∏_{i=1}^{H} (1 − p_1).
The first network model had 3k nodes and a single sink in the center. Using simulations we found that each packet travels around 16 hops on average. Using this value in the above equation with the other given values gives a filtering efficiency of 92%. The second network model had 4 smaller networks, each having 750 nodes and a separate sink. Simulation results showed that in this setup each packet travels around 9 hops on average. This value gives around 84.5% filtering efficiency, which is well within acceptable limits. This decrease in efficiency is not caused by the smaller number of keys stored by the nodes but by the addition of new sinks: reports originating from cells very near a sink do not pass through enough intermediate hops to be verified as either genuine or forged. Such reports are instead verified by the sink itself. So if we denote the number of such cells around a sink by x, then increasing the number of sinks to 4 increases these cells to 4x, each sink having an equal share of them. In the smaller networks there are therefore 4x cells in total which cannot guarantee en-route filtering, but any unfiltered forged report can finally be checked and dropped by the sink.
if only one sink was present, all packets traveled around 4000 hops in total. This number decreases substantially if we increase the number of sinks to 4, in which case all packets only require 2300 hops to reach their respective sinks. In this case we have not done any en-route filtering of packets, so each packet, forged or genuine, travels all H hops.
To further reduce the energy requirements we introduce the en-route filtering mechanism, by which forged reports can be filtered as early as possible. Early detection and dropping of forged reports helps reduce the overall energy requirements of the whole network. In the previous subsection we showed that decreasing the size of the network does not have much effect on filtering efficiency, so forged reports are filtered with the same efficiency. Thus, both when the network is large and when it is divided, en-route filtering takes the same energy. We therefore gain energy only for packets that are genuine or that are not caught by the en-route filtering mechanism; this is achieved by decreasing the number of hops a genuine or undetected report has to travel to reach the sink. For the simulation, we vary the network traffic to contain different mixes of genuine and forged reports and measure the energy requirements in each case. Figures 2 and 3 give the energy consumption of SEF and LBRS when the network is large and when we divide it into smaller networks. In both figures we can see that we save a lot of energy with smaller networks compared to the larger network.
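As a rough illustration of where the savings come from, the sketch below compares total forwarding energy for the two setups under a varying traffic mix. The per-hop cost and the average hop count of forged reports are hypothetical parameters for illustration, not simulation outputs from the paper.

```python
# Illustrative sketch: energy spent forwarding a mix of genuine and forged reports.
def total_energy(n_reports, forged_fraction, hops_genuine, hops_forged, e_per_hop=1.0):
    forged = n_reports * forged_fraction
    genuine = n_reports - forged
    return e_per_hop * (genuine * hops_genuine + forged * hops_forged)

# Forged reports are dropped after roughly the same number of hops in both setups
# (hops_forged=5 is an assumed value); the saving comes from genuine reports
# travelling fewer hops to a nearer sink (16 vs. 9 hops, as quoted above).
for forged_fraction in (0.0, 0.25, 0.5):
    large = total_energy(1000, forged_fraction, hops_genuine=16, hops_forged=5)
    small = total_energy(1000, forged_fraction, hops_genuine=9,  hops_forged=5)
    print(f"forged={forged_fraction:.0%}  single sink={large:.0f}  divided={small:.0f}")
```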
3.4 Resiliency
4 Related Work
To filter false data from the network, many en-route filtering techniques [2–7] have been proposed. For example, SEF [2] is a global key pool based technique and LBRS [3] is a location-based key exchange technique. Other techniques include DEFS [4], which is a hash-based technique, and PCREF [5], which is a polynomial-based technique. All the above techniques are symmetric cryptography based key exchange techniques; they need pre-shared keys, hash generators or polynomials to implement en-route filtering. Other techniques such as CCEF [6] and LBCT [7] are asymmetric cryptography based techniques. These techniques do not require any pre-exchanged keys, and authentication of reports is done on the basis of signatures. All the above techniques have a single sink situated either in the center or at one end of the network. None of them has tried to decrease the size of the network.
Some work has also been done on techniques which collect data using multiple sinks. The technique in [10] implemented a network with multiple sinks, where nodes relay data to the closest sink. This decreases the node-to-sink distance and helps save energy, but it does not divide the larger network into smaller networks. The technique in [11] also implemented multiple sinks in the network, and additionally divided the network into smaller ones, giving better resiliency against compromised nodes. These two techniques are only designed to handle genuine data, so if any false data is sent by a node, it travels all the way to the sink, wasting energy. We therefore propose a multi-sink en-route filtering technique which can check and drop false data. Our technique also reduces the hop count of genuine reports to the sink.
5 Discussion
All existing en-route filtering techniques, including LBRS and SEF, are single-sink based false data filtering techniques. They can be compared to a centralized environment where all data collection and decision making is done at a single point. With our technique we propose a distributed version of the same problem. Our technique divides the large network into smaller networks, each having an independent sink. All the smaller networks can act as independent networks, so the sink in each smaller network can take independent decisions for its network, and there is no need to collect the data from all the sinks at a single central point.
If in any scenario we want the data from all the sinks collected at a single point, we can alter the proposed technique and make all the sinks send their data to a single master sink. There are several ways to send the data from multiple sinks to a single sink:
– All the sinks use the same network to send the data to a particular chosen master sink.
– All the sinks have more powerful radio capabilities, so they can communicate with each other and send data to a single collection point. This turns the network into a 2-tier architecture where sensor nodes communicate with each other at the lower level and all sinks communicate with each other at the upper level.
– A wired backbone network could be set up where all the sinks are connected to the master sink.
– A mobile sensor node could be used to collect the data from multiple sinks. In this case, the mobile node periodically visits all the sinks and collects the data from them.
around 40% of the energy and also decreases the key exchange overhead. Our technique also increases the resiliency of the network against compromised nodes and limits the effect of the selective forwarding attack. As future work, we plan to devise a new key exchange method which works more effectively in the smaller networks. We also intend to find a way to effectively collect the data from all the sinks.
References
1. Kumar, A., Pais, A.R.: En-route filtering techniques in wireless sensor networks: a
survey. Wirel. Pers. Commun. 96, 697–739 (2017)
2. Ye, F., Luo, H., Lu, S., Zhang, L.: Statistical en-route filtering of injected false
data in sensor networks. IEEE J. Sel. Areas Commun. 23(4), 839–850 (2005)
3. Yang, H., Ye, F., Yuan, Y., Lu, S., Arbaugh, W.: Toward resilient security in
wireless sensor networks. In: Proceedings of the 6th ACM International Symposium
on Mobile Ad Hoc Networking and Computing, pp. 34–45. ACM (2005)
4. Yu, Z., Guan, Y.: A dynamic en-route scheme for filtering false data injection in
wireless sensor networks. In: SenSys, vol. 5, pp. 294–295 (2005)
5. Yang, X., Lin, J., Yu, W., Moulema, P.M., Fu, X., Zhao, W.: A novel en-route
filtering scheme against false data injection attacks in cyber-physical networked
systems. IEEE Trans. Comput. 64(1), 4–18 (2015)
6. Yang, H., Lu, S.: Commutative cipher based en-route filtering in wireless sensor
networks. In: 2004 IEEE 60th Vehicular Technology Conference, VTC2004-Fall,
vol. 2, pp. 1223–1227. IEEE (2004)
7. Zhang, Y., Liu, W., Lou, W., Fang, Y.: Location-based compromise-tolerant secu-
rity mechanisms for wireless sensor networks. IEEE J. Sel. Areas Commun. 24(2),
247–260 (2006)
8. Hankerson, D., Menezes, A.J., Vanstone, S.: Guide to elliptic curve cryptography.
Springer Science & Business Media, Heidelberg (2006)
9. Levis, P., et al.: TinyOS: an operating system for sensor networks. In: Weber,
W., Rabaey, J.M., Aarts, E. (eds.) Ambient Intelligence, pp. 115–148. Springer,
Heidelberg (2005). https://doi.org/10.1007/3-540-27139-2 7
10. Vincze, Z., Vida, R., Vidacs, A.: Deploying multiple sinks in multi-hop wireless
sensor networks. In: IEEE International Conference on Pervasive Services, pp. 55–
63. IEEE (2007)
11. Ciciriello, P., Mottola, L., Picco, G.P.: Efficient routing from multiple sources to
multiple sinks in wireless sensor networks. In: European Conference on Wireless
Sensor Networks, pp. 34–50. Springer (2007)
Security Schemes for Constrained Application
Protocol in IoT: A Precise Survey
1 Introduction
The basic idea behind the Internet of Things (IoT) is connecting various kinds of electronic devices to the Internet with the aim of building a worldwide distributed system of interconnected physical objects. For constructing such a global network, where all these nodes should be able to communicate and interact with each other efficiently, software architectures that provide scalability, simplicity and interoperability of communication are required. Due to unreliable congestion control, TCP shows very low performance in wireless networks; therefore, the connection-less UDP is mostly used in the IoT. One approach which fulfills these requirements is the architectural style of Representational State Transfer (REST), which provides a guideline for designing large-scale distributed applications.
The security of IoT is a very crucial topic, because it concerns the data transmitted over the network; this data may be sensitive and personal as well
are asynchronous message exchanges, low header overhead and parsing complexity, support for URIs (Universal Resource Identifiers) and content-types, and simple proxy and caching capabilities [3].
The structure of CoAP is divided into two layers, the message layer and the request/response layer. The first layer is in charge of controlling the message exchange over UDP between two nodes, while the second layer carries the requests and responses, which hold the respective codes, in order to maintain message delivery, for example handling messages that arrive out of order, are lost, or are duplicated. Figure 1 illustrates the design of CoAP. CoAP is a solid instrument with rich features, for example simple stop-and-wait re-transmissions, duplicate detection and multicast support. CoAP uses a short fixed-length binary header and options, and messages are encoded in a simple binary format. The methods supported in CoAP follow the RESTful model: GET, POST, PUT and DELETE [3].
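For readers unfamiliar with CoAP's request/response layer, a minimal client request might look as follows. This is an illustrative sketch using the aiocoap Python library (not code from the surveyed works), and the URI is a placeholder.

```python
# Minimal CoAP GET request using the aiocoap library; the URI is hypothetical.
import asyncio
from aiocoap import Context, Message, GET

async def main():
    protocol = await Context.create_client_context()          # CoAP client endpoint
    request = Message(code=GET, uri="coap://example.com/sensors/temperature")
    response = await protocol.request(request).response       # wait for the response
    print(response.code, response.payload)

asyncio.run(main())
```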
The CoAP message format is shown in Fig. 2. The CoAP start header contains a version number (V), a message type number (T), a token length (TKL), a code (C) and a Message ID (MID). Since CoAP uses the unreliable UDP, senders can ask receivers to confirm the reception of a message by declaring it as confirmable. The TKL field defines the size of the token, which enables the asynchronous message exchange; based on this token, requests and responses can be matched. The CoAP start header concludes with a Message ID, an identifier for linking a reset or an acknowledgement message to its confirmable message. The next element of the CoAP header is the token value. This value can be
CoAP, as noted, lacks a built-in security mechanism, and hence security for CoAP requires an external security scheme or mechanism, much as HTTP relies on Transport Layer Security (TLS) [8]. As widely used and mentioned by the IETF and the CoRE working group, security is implemented using Datagram Transport Layer Security (DTLS) or IPsec [1]. DTLS provides features such as confidentiality, integrity, authentication and non-repudiation in the network using AES/CCM. DTLS mainly operates at the transport layer of the protocol stack. DTLS was initially intended for traditional networks; over time it has been ported to constrained devices, but this results in a heavyweight protocol. DTLS headers are also too long to fit in a single IEEE 802.15.4 maximum transmission unit (MTU) [15]. The computational overhead of the DTLS handshake leads to high energy consumption because of the use of RSA-based cryptography.
DTLS is a derived protocol, obtained by modifying the Transport Layer Security (TLS) protocol, and it is applied at the application layer. DTLS records are 8 bytes longer than in TLS, and 13 bytes of extra overhead per datagram are incurred by DTLS after the handshake is completed, making it costly for constrained nodes [13]. For an incoming message during the handshake, the protocol decompresses and decrypts it in order to verify it, while for an outgoing message the protocol applies the encryption algorithm, adds the message authentication code (MAC) and compresses the message. Figure 3 shows the DTLS handshake mechanism between a client and a server.
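As a back-of-the-envelope illustration of this per-datagram cost, the snippet below computes the share of a datagram consumed by the 13-byte DTLS record header for a few assumed CoAP payload sizes; the payload sizes are illustrative, not measurements from the cited works.

```python
# Rough overhead estimate: 13-byte DTLS record header per datagram (post-handshake).
DTLS_RECORD_HEADER = 13  # bytes added to every datagram, as stated above

for payload in (16, 64, 256):  # assumed CoAP message sizes in bytes
    overhead = DTLS_RECORD_HEADER / (payload + DTLS_RECORD_HEADER)
    print(f"{payload:4d}-byte message -> {overhead:.1%} of the datagram is DTLS header")
```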
The security of CoAP is still under discussion, even though DTLS has been adopted as a protection layer. The open debate concerns the substantial computation cost and the heavy handshake, which causes message fragmentation. Many works have proposed solutions based on compressed DTLS, which are addressed in the following sections. Moreover, key management is another drawback of CoAP security, which is a common issue in all protocols. Raza et al. have proposed adopting 6LoWPAN header size reduction for DTLS [13]. They combined compressed DTLS with the 6LoWPAN standard, achieving an enormous reduction in the number of additional security bits.
using a symmetric key, which is the pre-shared key configured in the devices a priori. Key management is done using a polynomial scheme that guarantees sharing of a bivariate secret by the domain manager of the network. Here, the keys serve as root key material in the MIKEY derivation, whereas secure communication is guaranteed by the DTLS record layer.
Research has been extensive in securing the payload of the message by various means, but securing the meta-data of the message is of equal importance. Nguyen et al. [11] proposed a message authentication framework for CoAP messages, as there is a privacy issue regarding the meta-information even when the payload of the message is secured. Protecting only the payload or a certain data format still leaves a trail for an attacker to manipulate meta-data, which is a crucial part of a CoAP message. A distinction between the header parts of CoAP is needed in order to differentiate meta-data from payload, which is not available by default; the proposed research provides a distinction between the CoAP start header and the CoAP header.
Considering an MITM model, an attacker can intrude due to known DTLS vulnerabilities. The scheme is complementary to transport layer security. The RESTful CoAP message authentication protects and ensures authenticity by applying the following steps:
– CoAP still has high energy consumption, data loss and delay, as DTLS has a heavy packet size.
| Sr. no. | Research article topic | Year | Security scheme | Key management | Authentication mechanism | Message security | End-to-end security | Header compression | Protection from attacks |
| 1 | 6LoWPAN compressed DTLS for CoAP [15] | 2012 | - | - | - | - | - | Yes | - |
| 2 | Securing IP based IoT with HIP and DTLS [4] | 2013 | Host Identity Protocol | AMIKEY | Yes | Yes | Yes | No | Yes |
| 3 | LITHE [14] | 2013 | DTLS | Yes | Yes | Yes | Yes | Yes | Yes |
| 4 | Secure multicast transmission [7] | 2013 | Batch Signature Verify | Public Key | Yes | Yes | Yes | - | Yes |
| 5 | Securing communication in … | 2013 | IPSec | No | Yes | Yes | Yes | - | Yes |
– Being a request/response protocol implies four round trips for initial authentication.
– DTLS specifies elliptic curve cryptography for key management, but the ECC technique deserves a second thought as its practicality is questionable.
– Multicast messaging, a prime feature of CoAP that is essential in IoT environments, cannot be performed using DTLS.
– DTLS lacks support for group key management.
5 Conclusion
In this paper we surveyed and studied different techniques that are associated with the Constrained Application Protocol to guarantee secure communication in the Internet of Things. We found that DTLS is mentioned as the standard mechanism for securing the CoAP protocol, and it provides the necessary security to some extent. However, some modifications are still required to reduce the cost of this heavy protocol with respect to its handshake mechanism and packet size. We also came across various other security schemes that are lightweight but not yet standardized. The message authentication scheme studied provides protection for the meta-data as well, which is an add-on for improving security in CoAP. We also state various issues that still persist and need to be addressed to provide overall security for CoAP over IoT. Some techniques mentioned here are evaluated and verified to provide efficient results and reliability in securing CoAP. We expect this survey to provide valuable contributions and proper insights by documenting a very dynamic area of research. This will be helpful to researchers in evolving new solutions for securing the IoT.
References
1. Arkko, J., Keränen, A.: CoAP security architecture (2011)
2. Bhattacharyya, A., Bose, T., Bandyopadhyay, S., Ukil, A., Pal, A.: Less: light-
weight establishment of secure session: a cross-layer approach using CoAP and
DTLS-PSK channel encryption. In: 2015 IEEE 29th International Conference on
Advanced Information Networking and Applications Workshops (WAINA), pp.
682–687. IEEE (2015)
3. Bormann, C., Hartke, K., Shelby, Z.: The Constrained Application Protocol
(CoAP). RFC 7252, June 2014
4. Garcia-Morchon, O., Keoh, S.L., Kumar, S., Moreno-Sanchez, P., Vidal-Meca, F.,
Ziegeldorf, J.H.: Securing the IP-based internet of things with HIP and DTLS. In:
Proceedings of the Sixth ACM Conference on Security and Privacy in Wireless and
Mobile Networks, pp. 119–124. ACM (2013)
5. Granjal, J., Monteiro, E., Silva, J.S.: Security for the internet of things: a survey
of existing protocols and open research issues. IEEE Commun. Surv. Tutor. 17(3),
1294–1312 (2015)
6. Ishaq, I., Hoebeke, J., Van den Abeele, F., Moerman, I., Demeester, P.: Group com-
munication in constrained environments using CoAP-based entities. In: 2013 IEEE
International Conference on Distributed Computing in Sensor Systems (DCOSS),
pp. 345–350. IEEE (2013)
7. Salem Jeyaseelan, W.R., Hariharan, S.: Secure multicast transmission. In: 2013
Fourth International Conference on Computing, Communications and Networking
Technologies (ICCCNT), pp. 1–4. IEEE (2013)
8. Karagiannis, V., Chatzimisios, P., Vazquez-Gallego, F., Alonso-Zarate, J.: A survey
on application layer protocols for the internet of things. Trans. IoT Cloud Comput.
3(1), 11–17 (2015)
9. King, J., Awad, A.I.: A distributed security mechanism for resource-constrained
IoT devices. Informatica 40(1), 133 (2016)
10. Lakkundi, V., Singh, K.: Lightweight DTLS implementation in CoAP-based inter-
net of things. In: 2014 20th Annual International Conference on Advanced Com-
puting and Communications (ADCOM), pp. 7–11. IEEE (2014)
11. Nguyen, H.V., Iacono, L.L.: REST-ful CoAP message authentication. In: 2015
International Workshop on Secure Internet of Things (SIoT), pp. 35–43. IEEE
(2015)
12. Park, J., Kang, N.: Lightweight secure communication for CoAP-enabled inter-
net of things using delegated DTLS handshake. In: 2014 International Conference
on Information and Communication Technology Convergence (ICTC), pp. 28–33.
IEEE (2014)
13. Rahman, R.A., Shah, B.: Security analysis of IoT protocols: a focus in CoAP. In:
2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC),
pp. 1–7. IEEE (2016)
14. Raza, S., Shafagh, H., Hewage, K., Hummen, R., Voigt, T.: Lithe: lightweight
secure CoAP for the internet of things. IEEE Sens. J. 13(10), 3711–3720 (2013)
Security Schemes for Constrained Application Protocol in IoT 145
15. Raza, S., Trabalza, D., Voigt, T.: 6LoWPAN compressed DTLS for CoAP. In: 2012
IEEE 8th International Conference on Distributed Computing in Sensor Systems
(DCOSS), pp. 287–289. IEEE (2012)
16. Shaheen, S.H., Yousaf, M.: Security analysis of DTLS structure and its applica-
tion to secure multicast communication. In: 2014 12th International Conference on
Frontiers of Information Technology (FIT), pp. 165–169. IEEE (2014)
17. Sheng, Z., Yang, S., Yifan, Y., Vasilakos, A., Mccann, J., Leung, K.: A survey
on the IETF protocol suite for the internet of things: standards, challenges, and
opportunities. IEEE Wirel. Commun. 20(6), 91–98 (2013)
18. Skarmeta, A.F., Hernandez-Ramos, J.L., Moreno, M.V.: A decentralized approach
for security and privacy challenges in the internet of things. In: 2014 IEEE World
Forum on Internet of Things (WF-IoT), pp. 67–72. IEEE (2014)
19. Ukil, A., Bandyopadhyay, S., Bhattacharyya, A., Pal, A., Bose, T.: Auth-lite: lightweight M2M authentication reinforcing DTLS for CoAP. In: 2014 IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pp. 215–219. IEEE (2014)
20. Ukil, A., Bandyopadhyay, S., Bhattacharyya, A., Pal, A., Bose, T.: Lightweight
security scheme for IoT applications using CoAP. Int. J. Pervasive Comput. Com-
mun. 10(4), 372–392 (2014)
Jordan Center Segregation: Rumors in Social
Media Networks
Abstract. Social media networks have gained a lot of popularity among people, and they rapidly spread rumors concerning a variety of human affairs. Nowadays people tend to hype things over social media for publicity or promotion, which is the prime source for all online deception activities. The data shared in the midst of social media may spread bogus news online; sooner or later these items are sorted off the record as rumors, but in the meantime the rumor may have done an adequate amount of damage to its subject. Current rumor segregation practice aims at no more than identifying the rumor in the region, days after it first appears. The anticipated model serves as a precise way of isolating a rumor by calculating the initial source of the rumor using the Jordan source center with the SI, SIR, and SIRI infection models. The Jordan source center is the best optimal source estimator, robust with respect to the error rate, infection rates and other parameters when compared with other centrality estimators. It helps in finding the source of a common social media rumor and proceeding further to cleanse the infections and trim down their forged impact over social media networks.
1 Introduction
spreads virally over all other cohesive social networks. There is a high probability that people will believe such fake news, especially during emergency situations or instances of disaster. So there is a strong possibility of fake news spreading among social network users, which is commonly referred to as an online rumor. For example, much fake news related to the missing Malaysian flight MH-370 spread in social media, creating a number of panic circumstances among the masses [12, 14].
Internet users pass this information deliberately to their community, while the cyber criminals who initiate the news succeed in creating chaos. Because common people believe this fake news and start panicking, it might lead to falls in market shares, riots, gang wars, damage to public property, death, murder and many more fatal situations. The spreading of rumors in social media is increasing day by day and has a variety of interesting grounds to harvest.
In today’s digitally connected world it is very difficult to recognize rumors and set apart non-rumor news in social media, due to the mass occurrence and the hasty speed at which they propagate. This is because the contents are not verified properly; people immediately forward whatever they receive, aiming only at the number of likes or shares as a deciding factor to spread the same. So before forwarding anything we have to verify the content properly and also think about the adverse effects of false data. Monitoring all the tweets, shares, and posts is not an easy task in an era where access to the Internet is more or less freely available to all. But in order to identify the rumor in the news, it is very important to collect and correlate all the relevant news which is currently circulating in social media. Here we concentrate on both analyzing the rumor and identifying the source, i.e., the individual who starts spreading the fake news in the social network. If a disease spreads from one person to another it is commonly referred to as an infection; here, the news spreading from one user to another in social networks is also referred to as an infection.
There are three basic infection models in social media: the Susceptible Infected (SI) model, the Susceptible Infected Recovered (SIR) model and the Susceptible Infected Recovered Infected (SIRI) model. Usually we model social networks using graph theory as G = (V, E), where V is the set of vertices and E is the set of edges of the graph G. In the Susceptible Infected (SI) model, a node that has possibly infected nodes as its neighbors is known as a susceptible node. In terms of social media, friends of an infected individual who believe some rumor and share posts related to the spreading rumor are referred to as infected nodes, while the surrounding individuals who are likely to believe the rumor in the future and start sharing its posts are known as susceptible nodes. In the Susceptible Infected Recovered (SIR) model, an infected node can get recovered: a person who shared the post believing the rumor removes that post after discovering it is fake news, and is then referred to as a recovered node. The Susceptible Infected Recovered Infected (SIRI) model captures the typical scenario in which, after recovering, nodes are infected yet again by the same rumor: an individual believes some rumor by sharing a post (infected), later removes it (recovered), and even after that re-shares the same rumor when additional evidence is provided (infected again). In this manuscript, we take the most talked-about event in the social network and monitor the tweets, posts, and shares to sort it out as rumor or non-rumor news.
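To make the three infection models concrete, the following toy sketch (not from the paper) simulates SI/SIR spreading on a random graph with networkx; the graph size and the probabilities beta and gamma are illustrative assumptions.

```python
# Toy SI/SIR spreading simulation on a graph; parameters are illustrative only.
import random
import networkx as nx

def spread_step(G, state, beta=0.3, gamma=0.1, allow_recovery=False):
    """One synchronous step: susceptible neighbours of infected nodes may become
    infected; in the SIR variant, infected nodes may recover."""
    new_state = dict(state)
    for node in G:
        if state[node] == "I":
            for nb in G.neighbors(node):
                if state[nb] == "S" and random.random() < beta:
                    new_state[nb] = "I"
            if allow_recovery and random.random() < gamma:
                new_state[node] = "R"
    return new_state

G = nx.erdos_renyi_graph(50, 0.1, seed=1)
state = {n: "S" for n in G}
state[0] = "I"                                   # the rumor source
for _ in range(10):
    state = spread_step(G, state, allow_recovery=True)   # SIR; set False for SI
print(sum(v == "I" for v in state.values()), "infected nodes after 10 steps")
```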
From the identified rumor we found the center (source of the rumor) using the Jordan center calculation, and showed that the Jordan center is the best among its nearest competitors, namely closeness, degree and betweenness centrality, in sorting out the source.
2 Related Works
Fake news always spreads rapidly, while genuine news takes some time to gain popularity. Especially in social media, the origin and behavior of rumors are a nightmare for data analysts. The process is divided into two phases: the first phase starts with identifying whether the data is a rumor or non-rumor by simple segregation, and the second phase is the calculation of the center (rumor source), with the inference that the Jordan center is the best among all other centrality measure algorithms.
Towards detecting rumors in social media [4]: here, social media is used as a medium to spread information among the users. Along with general information, false data also spreads among the online users. This kind of unverified news, propagating in social media along with some fake evidence, creates many problems, especially during emergency situations like Hurricane Sandy in 2012. Such rumors spread more quickly than the real facts in social media during a crisis. The PHEME project aims at analyzing these rumors by tracing their source in social media and verifying their truthfulness. In this rumor analysis, the authors initially collected all the tweets about Hurricane Sandy, together with the replies and retweets of each initial tweet. An atomic unit of these tweets is referred to as a thread. The rumor analysis is done by a human assessor who manually reads through the tweets and determines the rumor for the particular event, which is referred to as annotation.
To facilitate this annotation task, a specific tool is used which visualizes the timeline of tweets. Annotators use this tool to analyze the tweets and mark a message as a rumor, isolate it as a non-rumor, or keep it in a suspicious, unverified state. The tool also includes an interface that allows the annotator to revise, rename, or move threads to another category. The authors used Twitter's streaming API to collect tweets for any particular ongoing event. The creation of datasets with the annotated conversations and the collection of threads helps in identifying the rumor or non-rumor state of any particular circulating story in social media.
Identifying rumors and their sources in social networks [3]: a rumor is a piece of information that propagates through social networks carrying many false claims. Rumors can propagate to a large number of users with incorrect information as their source. These false claims are spread by unknown nodes, and it is very difficult to tell the original source of the fake information. This paper discusses the problem of identifying rumors and their sources in social networks. A social network is modeled as a directed graph G = (V, E), where V is the set of all people and E is the set of edges representing the flow of information between specific individuals. A set of pre-selected nodes are termed monitor nodes (M). For investigation purposes, a piece of dummy information is sent and each monitor is expected to report whether it received it or not. If received, it is referred to as a positive monitor (M+); nodes which have not received it are referred to as negative monitors (M−). For each node, the reachability and distance are calculated with respect to both the positive and negative monitors. Once sorted, they are compared with selection methods such as Randomness, Inter-Monitor Distance (Dist), Number of Incoming Edges (NI), NI + Dist, Betweenness Centrality (BC) and BC + Dist.
Then the information is concluded to be a rumor or non-rumor by using the Greedy Source Set Size (GSSS) algorithm and the Maximal Distance of Greedy Information Propagation (MDGIP) monitor selection method. When there is a large difference in the number of sources between rumor and non-rumor, it is clear that the actual rumors are separated; if the difference is too small, there exists some inaccuracy and redundancy in the procedures followed to sort them out. Logistic regression is used to classify rumor and non-rumor accurately, with the first half of the data used as the training set and the second half as the testing set.
Automatic Detection and Verification of Rumors in Twitter [1] explains the rise of social media and how it greatly affects the scope of journalism and similar news reporting areas. The social media platform is not only used for sharing genuine news but also tends to proliferate much unconfirmed fake news. In the Boston bombing scenario, many rumors were spread on Twitter which brought huge confusion and problems for the people in that locality. A tool is developed to detect rumors and to check their trustworthiness. The system has two major subsystems, namely rumor detection and rumor verification. The rumor detection subsystem, referred to as 'Hearsift', works on the actual collection of tweets about a specific event. It is sub-classified into two major parts, an assertion detector and hierarchical clustering. The raw tweets are fed into the assertion detector, which filters only the assertion tweets. Assertions belong to the class of speech acts, giving a multi-class classification problem, where a dataset is created and categorized based on the semantic and syntactic features of the source. The output of the assertion detector is fed as input to the clustering module, from which the user is able to get the collection of all the clusters. Clusters contain the messages propagated through Twitter in a multitude of cascades, which can fetch the established rumor as the output.
The next major module is the rumor verification module, referred to as the rumor gauge model. The output of the rumor detector is fed as input to the rumor gauge, which extracts time-series features about the linguistic content, user identity and propagation dynamics of the rumors. These are then passed to a Hidden Markov Model (HMM) trained on annotated rumors, which helps to find the truthfulness of the rumors.
Eccentricity and centrality in networks [8] builds on the concept of centrality discovered by Camille Jordan and later introduced as a model for social network analysis, including the path center of a graph. In this model, finding the center of the graph is treated as an operations research (OR) problem: choosing a site for a facility so as to minimize the response time to any other location. It is solved by finding the set of nodes whose total distance to all other nodes is least, i.e., the 'median' of the graph, and then the set of nodes whose maximum distance to any other node is least, i.e., the 'center' of the graph. A graph G consists of a finite non-empty set V = V(G) of nodes together with a set E = E(G) of edges joining certain nodes. A path in G is an alternating sequence v0, e1, v1, e2, ..., vn−1, en, vn. A cycle is obtained by joining the initial and terminal nodes of a path by an edge. The length of a path is the number of edges it contains. The distance d(u, v) is the length of the shortest path joining u and v. The eccentricity e(v) of a node v in a connected graph G is the maximum distance d(u, v) over all u. The diameter of a graph G is the maximum eccentricity of a node, and the radius r(G) is the minimum eccentricity of the nodes.
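These definitions are easy to check on a toy graph; the following sketch (not from the cited paper) uses networkx to compute eccentricity, radius, diameter and the center of a small path graph.

```python
# Eccentricity, radius, diameter and (Jordan) center on a toy path graph.
import networkx as nx

G = nx.path_graph(5)                 # nodes 0-1-2-3-4
print(nx.eccentricity(G))            # e(v): {0: 4, 1: 3, 2: 2, 3: 3, 4: 4}
print(nx.radius(G))                  # r(G) = minimum eccentricity = 2
print(nx.diameter(G))                # maximum eccentricity = 4
print(nx.center(G))                  # nodes with minimum eccentricity: [2]
```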
The classical theorem of Jordan determines the location of the center when the given graph is a tree. Three main theorems of Jordan are stated:
Theorem 1. The center of a tree consists of either a single node or a pair of adjacent
nodes.
Theorem 2. The center C(G) of any connected graph G lies within a block of G.
Theorem 3. The center of any network lies in a single block.
Linear algorithms for finding the Jordan center and path center of a tree [7]: in this paper, linear algorithms for finding the Jordan center are suggested, along with the steps to calculate the weighted Jordan center and the path center of a tree. The algorithm relies on a canonical representation of a tree. The paper also shows that the path center of a tree T consists of a unique sub-path of T. The distance d(u, v) between two vertices u and v in a graph G is the length of a shortest path between them; a shortest path between any two vertices is called a geodesic. The eccentricity e(v) of a vertex in a connected graph G is the length of the longest geodesic from v.
On the universality of the Jordan center for estimating the rumor source in a social network [2], the base paper of our model, considers rumor spreading sources in social networks. While identifying the source, the investigator only observes which nodes are infected and has no clue about the model parameters. The source estimator is applicable to the Susceptible Infected (SI) model, the Susceptible Infected Recovered (SIR) model and the Susceptible Infected Recovered Infected (SIRI) model. In the SI model, a node gets infected by its infected neighbors; in the SIR model, an infected node one way or another recovers from the infection; and in the SIRI model, the infected node recovers initially and later gets infected again. The main aim of the paper is to show that the Jordan center is an optimal rumor source estimator applicable to all the above-mentioned models. The Jordan center does not depend on parameters like the infection rate, recovery rate and re-infection rate, so it is regarded as a universal source estimator. A Jordan center is defined as a node in G that has minimum eccentricity. Simulation results using both synthetic and real-world networks are used to evaluate the performance of the Jordan source estimator; the results show that the Jordan center outperforms the distance, closeness and betweenness centrality based heuristic mechanisms [6].
A distributed algorithm for the graph center problem [5] suggests a distributed algorithm for finding the center of a social graph. The algorithm is based on three sub-algorithms, namely test connectivity, all-pairs shortest path and leader election, which operate in different layers. Locating the center between the nodes helps in placing a resource at the center of a graph, which minimizes the cost of sharing the resource with other locations. The main idea of this algorithm is to find the center node in a distributed environment. The test layer provides information on whether the network is connected or isolated after infection, the routing layer computes the MinMax distance, also termed the eccentricity value, and the center layer computes the center using the minimum eccentricity value, which is ultimately the Jordan center.
Enquiring minds: early detection of rumors in social media from enquiry posts [10] demonstrates how trending rumors are identified by finding entire clusters of the actual posts and isolating malicious rumor clusters [13]. The rumor clusters can be found through signature text referred to as enquiry phrases. A technique is developed that searches for enquiry phrases as early as possible and then separates the posts that do not contain these phrases; the posts are then clustered appropriately and finally ranked according to the classification results.
The rumor detection procedure has the following five major steps in practice: Identify Signal Tweets, Identify Signal Clusters, Detect Statements, Capture Non-Signal Tweets and Rank Candidate Rumor Clusters. To process cases with billions of records, such as the Boston Marathon bombing datasets, the experiment is performed on a typical Hadoop cluster. The main components of the framework include filtering, clustering and retrieval algorithms implemented by means of the Apache Pig framework.
3 Proposed Methodology
Considering the virally spreading news in social networks and the intention to identify rumors, the following model is formulated. From the identified rumor we calculate the center using the Jordan center, as the Jordan center is proven to be the best among other centralities like degree, closeness and betweenness centrality algorithms. The proposed dataflow method has two routines, namely identifying the data as a rumor and collecting the supporting datasets [18, 20] (Fig. 1).
An individual shares or tweets a detailed set of posts in their account. With the help of this complete data, we identify the posts containing verified rumors. In order to collect those posts, we process them using the streaming API by tracking all the main keywords with the relevant hashtags. An Application Programming Interface (API) is a set of protocols or routines for building an application. Using the streaming API, a persistent, live API session is established between the server and the client; it pushes the data whenever it becomes publicly available and notifies the client automatically if any new tweets or posts arrive in the user space (Fig. 2).
The dataset from the Twitter streaming API is collected and stored in a database for future reference. In the Twitter streaming API, apps are created by logging in with a Twitter account and, in the second stage, the OAuth interface provides an Access Token, Access Token Secret, Consumer Key and Consumer Key Secret. These are the four keys provided to the user in order to authenticate while collecting data. They give the end client a second level of access, authorizing the client to the server on behalf of the owner. The Twitter streaming API collects all information based on user-specified keywords. For example, given the keyword 'Boston Bombing', it collects all the data including the person's name, ID, followers list, number of people who retweeted, replies to tweets and retweets, and also their hashtags. Monitoring the rumor is facilitated by the actual number of retweets. We then started collecting the data for the particular story circulating in social media, i.e., the Boston Marathon Bombing, through the streaming API [17].
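A minimal keyword-tracking collector along these lines, assuming the Tweepy library's v3-style streaming interface, might look as follows; the credential strings are placeholders for the four keys described above, and the output filename is hypothetical.

```python
# Hedged sketch of keyword-based tweet collection with Tweepy's v3-style streaming API.
import json
import tweepy

class RumorListener(tweepy.StreamListener):
    def on_status(self, status):
        # Append each matching tweet (user, text, retweet/reply info) as one JSON line.
        with open("boston_tweets.json", "a") as f:
            f.write(json.dumps(status._json) + "\n")

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_KEY_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
stream = tweepy.Stream(auth=auth, listener=RumorListener())
stream.filter(track=["Boston Bombing"])   # track the event keywords / hashtags
```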
The technical details of the proposed system rely solely on applying the standard API imports officially provided by the social media network providers. Implementation begins with tagging and annotating specific search keywords on the grounds of the three models (SI, SIR, SIRI); after tagging, posts by individuals or retweets through friends are noticed by the API collector interface. To be precise about the implementation, and as a proof of concept of the anticipated scheme, only a particular rumor circulating at the time of execution was collected to classify the rumor [15].
• Annotation: The data is collected through the Twitter streaming API and stored in a JSON file. The annotation master collects all the anticipated data and stores the JSON file in the user-specified folder. A manual annotation task is carried out here by individually reading and bookmarking the annotation tags. A human has to read through the text file of tweets to determine the rumors for training the engine. The major characteristics considered for sorting a post as a rumor are the number of retweets and the replies to any particular tweet that exceed the time limit.
• Classification of Text: Once the text is annotated manually, we classify it as rumor or non-rumor. The Python library TextBlob is used, as it helps us classify the text by creating a simple classifier. The training and testing datasets created are then fed into the Naive Bayes classifier, where the dataset is trained and tested to finally classify the text as a rumor, as in Fig. 3 (a minimal classifier sketch is given after the figure).
Fig. 3. Flow diagram for anticipated rumor segregation in social media networks
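A minimal sketch of this classification step, assuming TextBlob's NaiveBayesClassifier and purely hypothetical training tuples (not the annotated Boston dataset itself), is shown below.

```python
# Minimal TextBlob Naive Bayes classification sketch; training tuples are hypothetical.
from textblob.classifiers import NaiveBayesClassifier

train = [
    ("BREAKING: suspect arrested, police deny earlier reports", "rumor"),
    ("Official statement released on the marathon incident", "non-rumor"),
    ("Is it true that a third blast happened?? please confirm", "rumor"),
    ("Road closures announced for the memorial event", "non-rumor"),
]
test = [
    ("unconfirmed reports of another explosion downtown", "rumor"),
    ("city confirms reopening of the finish line area", "non-rumor"),
]

cl = NaiveBayesClassifier(train)
print(cl.classify("people saying there was a second suspect, no source given"))
print("accuracy on the toy test set:", cl.accuracy(test))
```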
Before the rumor detection part, the preliminary dataset isolation accumulates the tweets through the Twitter streaming API and annotates them manually, based on the replies and the number of retweets, in order to identify the rumor source. If the annotator feels a post is a rumor, it is marked 'red'; if it is a non-rumor, it is labeled 'green'; and if they are in a dilemma, the tag is 'orange'. The data is then trained and marked off the record using the well-established TextBlob classifier [19].
Calculating the Jordan Center.
• Dataset Collection: Once the text is classified as rumor or non-rumor, each set is exported as a comma-separated file (csv) in order to calculate the Jordan center, verify the integrity of the rumor, and finally weigh up the exact origin of the rumor source.
• Segregation Component: Only after finding the rumor should the centrality (source) of the appropriate suspicious node be calculated, so the Jordan center estimation steps are performed only after sanitizing the junk from the tweets collected in the previous phase. Finally, the graph center estimation algorithm is applied to discover the accurate starting node, as shown below.
• Graph Center Estimation: The Gephi framework is used to calculate the centralities; specifically, in a social networking scenario it helps to calculate centralities, view data and compute shortest path values. Here, the csv file is given as input to the Gephi window and the centralities are calculated, showing the Jordan center to be the best among all the other centralities like degree, closeness and betweenness centrality.
The diagram below shows the relationships (edges) between nodes, illustrating how the nodes in social media are interconnected with each other. The Boston bombing event dataset is collected from Twitter using the Twitter streaming API as mentioned in phase 1 and processed with the help of Gephi identifiers. This is a good example for all the infection models in a typical social network.
The Jordan, betweenness, closeness and degree centrality calculation procedures are as follows [9, 16]:
– Jordan Centrality: It is based on the minimum eccentricity value. The node's largest distance to any other node is taken and the centrality is calculated using the following formula:

  C_J(x) = 1 / max_{y≠x} D(x, y)

– Betweenness Centrality: It is based on the number of shortest paths that pass through a particular vertex, and it is calculated as:

  C_B(z) = 0.001 + (2 / ((n−1)(n−2))) · Σ_{x≠z} Σ_{y≠z} g_{xy}(z) / g_{xy}    (2)

  where g_{xy} is the number of shortest paths between x and y and g_{xy}(z) is the number of those paths passing through z.
– Closeness Centrality: It is the inverse of the average distance from a node to all other nodes:

  C_C(x) = (n−1) / Σ_{y≠x} D(x, y) = 1 / AVG_{y≠x} D(x, y)

– Degree Centrality: It is the number of outgoing edges of a particular node, and it is calculated using:

  C_D(v) = Deg(v)
Only the indispensable models and formulas that are mandatory for segregating the rumor source are considered above. All other basic centrality formulas are skipped here, trusting that the prerequisite basic centrality calculations are already known to the annotator from the early stages of the review.
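For comparison purposes, the following sketch (not the Gephi/RStudio workflow used in this work) picks the minimum-eccentricity node and the top-ranked nodes of the other centralities on a toy graph with networkx; the graph stands in for the subgraph of infected nodes and is purely illustrative.

```python
# Comparing the Jordan-center estimate with degree, closeness and betweenness centrality.
import networkx as nx

G = nx.balanced_tree(r=2, h=3)   # illustrative stand-in for the infected subgraph

dc = nx.degree_centrality(G)
cc = nx.closeness_centrality(G)
bc = nx.betweenness_centrality(G)

print("Jordan center (min eccentricity):", min(G.nodes, key=lambda v: nx.eccentricity(G, v)))
print("highest degree centrality:      ", max(dc, key=dc.get))
print("highest closeness centrality:   ", max(cc, key=cc.get))
print("highest betweenness centrality: ", max(bc, key=bc.get))
```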
After getting the nodes and their relationships, the next step is to identify the source. With the help of Gephi we are able to generate the various centrality values. In the figure below, the maximum eccentricity value, i.e., the diameter value, is calculated, and the expected best diameter value is highlighted in Table 1 [7].
In order to segregate the Jordan center we take the minimum eccentricity value, which is four. From this we obtain the source from the collected sample dataset of the Boston bombing.
Fig. 4. Minimum eccentricity value (Jordan center) plotted using Gephi's report agent.
While the graph estimation value is plotted using Gephi in Fig. 4, we calculate additional features using RStudio with the same Boston bombing dataset. Initially we import the dataset, and the centrality values for in-degree, out-degree, closeness, betweenness and minimum eccentricity are calculated as shown in Fig. 5.
In RStudio the minimum eccentricity value is set to one by default. Here we calculate the Jordan center value, which is proven to be the best among the other centrality calculation algorithms.
By segregating all the centrality measure values using RStudio and Gephi, the graph above (Fig. 6) shows that the Jordan center is the best among the other centralities like betweenness, closeness and degree centrality in social networks.
All the tweets related to our live case study are collected manually. Annotating them based on the scenario, as distinguished by the real-world annotators involved in the incident, is advantageous over working on standard archived datasets from the past. Compared with other centrality measures, Jordan provides the best results, with a large extent of classification and a sound calculation of the source, as rendered in the table above on sorting out the rumor source.
The main aim of this project is to categorize a post as rumor or non-rumor for a particular event circulating in social media networks. From the basic steps of identifying a rumor, we then move on to exactly identifying the source using the appropriate centrality measure algorithm. The survey on the history of the Jordan center and other centrality algorithms paved the way for sorting out the precise rumor in social networks. Keeping this as the base, we have classified tweets as rumor or non-rumor with the typical example of the Boston Marathon bombing event. By using the centrality measure segregation, it is verified that the Jordan center is the best among other centralities like betweenness, closeness and degree centrality.
In order to handle real-time data and large amounts of data, the extraction and isolation of rumors is achieved via a Hadoop interface [9]. Flume helped us automate the process of rumor classification, and Hive modules fetch the dataset for dynamic processing of tweets obtained from the Twitter API [10]. The collection of tweets related to the death of Tamil Nadu CM Ms. Jayaram Jayalalithaa is performed in Hadoop, to work with the real-time social graph and to handle the huge amount of data collected in a stipulated span of time [11]. Apache Hadoop is installed to facilitate this model and the appropriate modules have been deployed to collect the dataset. Thereafter, any such incident should be predicted using the model, and an early alert on whether it is a rumor or not will be posted well in advance to avoid the havoc caused by it [21]. This should help keep rumors away and disinfect the existing rumors that circulate liberally in and around all social media networks.
References
1. Vosoughi, S.: Automatic detection and verification of rumors on Twitter. Doctoral
dissertation, Massachusetts Institute of Technology (2015)
2. Luo, W., Tay, W.P., Leng, M., Guevara, M.K.: On the universality of the Jordan center for
estimating the rumor source in a social network. In: 2015 IEEE International Conference on
Digital Signal Processing (DSP), pp. 760–764. IEEE, July 2015
3. Seo, E., Mohapatra, P., Abdelzaher, T.: Identifying rumors and their sources in social
networks. In: SPIE Defense, Security, and Sensing, p. 83891I. International Society for
Optics and Photonics, May 2012
4. Zubiaga, A., Liakata, M., Procter, R., Bontcheva, K., Tolmie, P.: Towards detecting rumours
in social media. arXiv preprint arXiv:1504.04712 (2015)
158 R. Krithika et al.
5. Song, L.: A Distributed Algorithm for Graph Center Problem. Complexity Research Group,
BT Exact, Martlesham (2003)
6. Luo, W., Tay, W.P., Leng, M.: How to identify an infection source with limited
observations. IEEE J. Sel. Top. Sig. Process. 8(4), 586–597 (2014)
7. Hedetniemi, S.M., Cockayne, E.J., Hedetniemi, S.T.: Linear algorithms for finding the
Jordan center and path center of a tree. Transp. Sci. 15(2), 98–114 (1981)
8. Hage, P., Harary, F.: Eccentricity and centrality in networks. Soc. Netw. 17(1), 57–63 (1995)
9. Louni, A., Santhanakrishnan, A., Subbalakshmi, K.P.: Identification of source of rumors in
social networks with incomplete information. arXiv preprint arXiv:1509.00557 (2015)
10. Zhao, Z., Resnick, P., Mei, Q.: Enquiring minds: early detection of rumors in social media
from enquiry posts. In: Proceedings of the 24th International Conference on World Wide
Web, pp. 1395–1405. ACM, May 2015
11. Danthala, M.K.: Tweet analysis: twitter data processing using Apache Hadoop. Int. J. Core
Eng. Manage. (IJCEM) 1, 94–102 (2015)
12. Viswanath, B., Mislove, A., Cha, M., Gummadi, K.P.: On the evolution of user interaction in
Facebook. In: Proceeding of the 2nd ACM Workshop on Online Social Networks (2009)
13. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social
network. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining (2003)
14. Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature
393(6684), 440–442 (1998)
15. Corcoran, M.: Death by cliff plunge, with a push from twitter. New York Times, 12 July
2009
16. Freeman, L.C., Borgatti, S.P., White, D.R.: Centrality in valued graphs: a measure of
betweenness based on network flow. Soc. Netw. 13, 141–154 (1991)
17. Upadhyay, A., Mao, L., Goda Krishna, M.: Mining Data from Twitter (PDF)
18. Vinodhini, G., Chandrasekaran, R.M.: Sentiment analysis and opinion mining: a survey. Int.
J. Adv. Res. Comput. Sci. Softw. Eng. 2(6), 282–292 (2012). ISSN: 2277 128X
19. Devi, G.R., Veena, P.V., Kumar, M.A., Soman, K.P.: Entity extraction for Malayalam social
media text using structured skip-gram based embedding features from unlabeled data.
Procedia Comput. Sci. 93, 547–553 (2016)
20. Sanjay, S.P., Anand Kumar, M., Soman, K.P.: AMRITA_CEN-NLP@ FIRE 2015: CRF
based named entity extractor for Twitter Microposts. In: FIRE Workshops, pp. 96–99 (2015)
21. Mahalakshmi, R., Suseela, S.: Big-SoSA: social sentiment analysis and data visualization on
big data. Int. J. Adv. Res. Comput. Commun. Eng. 4(4), 304–306 (2015). ISSN: 2278-1021
Honeyword with Salt-Chlorine Generator
to Enhance Security of Cloud User Credentials
1 Introduction
Data breaches can happen in the cloud due to a lack of identity authentication and unauthorized account access. Cloud users' login credentials, such as user passwords, can be stolen by attackers, and the user account can be hijacked, which leads to data breach issues. Data breach, insufficient identity management and false credentials are major issues in the cloud listed by the Cloud Security Alliance (CSA) [2]. Passwords are used to enhance the entry-level security of a user account. A password grants access to all online IT resources such as computers, email or a server on a network. Passwords are very sensitive information used to authenticate users. The login credentials of the users are stored in a password database; if a password is exposed by attacking the password
database is compromised, the user data can be breached. Millions of user passwords have been cracked, and companies such as LinkedIn, Yahoo and eBay have been attacked [3]. Data breaches have always been outside the users' control, but it is imperative to create passwords that can withstand cracking and other password-retrieval attacks; resisting such attacks depends on the complexity of the password.
The password chosen by a user for an account is hashed and stored in the cloud server's password database. Hashing is commonly believed to be irreversible, but this no longer holds in the current scenario: many advanced tools can recover the original password from its hash. Moreover, many cloud services still use weak methods for storing user credentials; for example, LinkedIn hashed its users' passwords with the SHA-1 algorithm, and the eHarmony system used MD5 without salting [4]. To increase the difficulty of recovering hashed passwords, salts are used. A salt is a pseudo-random string added at the beginning or end of the original password; a salted password produces a stronger hash, and the difficulty of inverting it grows with the complexity of the salt.
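As a baseline for the discussion above, the following is a minimal sketch of conventional salted hashing in Python; the 16-byte salt length, the SHA-256 hash function and the storage format are illustrative assumptions, not taken from the paper.

import hashlib
import os

def hash_password(password, salt=None):
    # Generate a pseudo-random salt if none is supplied (length is illustrative)
    if salt is None:
        salt = os.urandom(16)
    # Hash the salt concatenated with the password
    digest = hashlib.sha256(salt + password.encode('utf-8')).hexdigest()
    return salt, digest

def verify_password(password, salt, stored_digest):
    # Recompute the salted hash and compare it with the stored value
    return hashlib.sha256(salt + password.encode('utf-8')).hexdigest() == stored_digest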
Passwords hashed in the usual way remain weak against hash crackers, whereas password hashing with salting is a more appropriate way to secure passwords, since the salted hash is much harder to revert to plain text. Honeywords are decoy passwords used to strengthen password files: they are fake passwords placed in the password database to deceive attackers who compromise it [1]. This paper provides a novel honeyword mechanism combined with complex salt generation that focuses on the credential security of cloud users.
Section 2 surveys honeyword generation methods and hashing. Section 3 describes the complex salt-chlorine algorithm. Section 4 presents the newly enhanced honeyword with salt-chlorine mechanism, Sect. 5 analyzes the security properties and compares the proposed approach with existing methods, and Sect. 6 concludes the paper.
Honeywords are decoy passwords used to strengthen password files: fake passwords placed in the password database to deceive attackers who compromise it. Honeywords look like normal user-selected passwords, so it is difficult for an attacker to distinguish honeywords from true user passwords in a stolen password file. The decoy passwords and the user's original password are stored together in the password database, and the words in this list are called sweetwords. The user's real password is placed at a random position in the sweetword list, and its index k is stored by the honeychecker, an auxiliary service that checks whether a submitted password is the true password or a honeyword. The password system at the server end does not itself know which entry in the sweetword list is the real password. When someone tries to log in with a sweetword, the password checker verifies that the entered password exists in the sweetword list and, if so, sends the index j of that sweetword to the honeychecker, where j is the index of the password entered by the attacker or legitimate user. The honeychecker then verifies whether j = k: if the indices match, the user is authenticated; otherwise a honeyword has been submitted and an alarm is raised. There are several
Honeywords can also be hashed while hashing the real password, forming a sweetword list of hash values.
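A minimal sketch of the password checker and honeychecker logic described above, in Python; the helper names are hypothetical, and salting is omitted here for brevity. It illustrates only the j = k index check, not the authors' implementation.

import hashlib

class Honeychecker:
    # Auxiliary service that stores only the index k of the real password per user
    def __init__(self):
        self.real_index = {}

    def register(self, user_id, k):
        self.real_index[user_id] = k

    def check(self, user_id, j):
        # True only if the submitted sweetword index matches the stored real index
        return self.real_index.get(user_id) == j

def login(user_id, password, sweetword_hashes, honeychecker):
    # Password checker: look the submitted password up in the hashed sweetword list
    digest = hashlib.sha256(password.encode('utf-8')).hexdigest()
    if digest not in sweetword_hashes:
        return False                         # not a sweetword at all
    j = sweetword_hashes.index(digest)       # index of the matching sweetword
    if honeychecker.check(user_id, j):
        return True                          # real password: authenticate
    raise RuntimeError('Honeyword submitted - raising an intrusion alarm')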
Fig. 1. Salt-chlorine generation: pseudo-random string generator → ASCII convertor → shuffler → filter → symbol translator → salt-chlorine.
Procedure Salt-Chlorine
  string ← GenRand()                 /* generate a pseudo-random string */
  len ← length(string)
  convert string to chararray[0 .. len-1]
  for i = 0 to len-1 do
    ascii_value[i] ← ASCII(chararray[i])
  end for
  value ← shuffler(ascii_value)
  for i = 0 to len-1 do
    if value[i] < 32 then
      value[i] ← value[i] + 32
    else if value[i] > 127 then
      while value[i] > 127 do
        value[i] ← value[i] - 95
      end while
    end if
  end for
  for i = 0 to len-1 do
    translator[i] ← symbol(value[i])
  end for
  convert translator to string
  salt-chlorine ← string
When a random string is generated by the PRSG, it is converted into a character array and the ASCII equivalent of each character is stored in ascii_value[i], where i varies from 0 to len (the length of the string). All the values then undergo a shuffling process and the shuffled values are stored in value[i]. The filter checks the range of each value: if a value is less than 32, 32 is added to it; if value[i] is greater than 127, 95 is subtracted repeatedly until the value is no longer greater than 127. After filtering, all the values lie in the range 32 to 127. Each value[i] is then translated to its symbol equivalent, and the final set of characters is converted into the salt-chlorine string, as shown in Fig. 1. The combination of the honeyword technique and the salt-chlorine technique yields a highly secure system for protecting cloud users' login credentials.
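The following is a runnable sketch of the salt-chlorine procedure under the reading above, in Python; the string length and the alphanumeric character pool used in the PRSG step are assumptions made for illustration.

import random
import string

def salt_chlorine(length=16):
    # PRSG step: generate a pseudo-random alphanumeric string
    rand_string = ''.join(random.choices(string.ascii_letters + string.digits, k=length))
    # Convert every character to its ASCII value
    values = [ord(c) for c in rand_string]
    # Shuffle the ASCII values
    random.shuffle(values)
    # Filter: force every value into the printable range 32..127
    filtered = []
    for v in values:
        if v < 32:
            v += 32
        while v > 127:
            v -= 95
        filtered.append(v)
    # Symbol translation: map the filtered values back to characters
    return ''.join(chr(v) for v in filtered)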
The cloud user Ui authenticates his account with the password Pi. The randomized salt-chlorine generator generates a complex salt-chlorine sci for the password Pi, and Pi and sci are hashed together to produce an irreversible hash h(Pi + sci) for the user account. The randomized salt-chlorine generator not only produces an SC for the user password, it also produces complex SCs for the honeyword generator. Honeyword generator
Fig. 2. Proposed authentication flow: the cloud user's password pi is hashed with its salt-chlorine, h(pi+sc), and compared by the password checker against the sweetword hashes h(wsc1), ..., h(wscn) in the password DB; the index of the matching sweetword is sent to the honeychecker, which authenticates the user on a match and generates an alert otherwise.
password is very low. When someone tries to log in with a password pi, the password is hashed and the password checker in the cloud server compares the hashed password with the list of passwords in the password database. If the password matches one of the sweetwords, its index is sent to the honeychecker; if the index of the submitted password matches the index stored by the honeychecker, the user is authenticated, otherwise an intrusion is reported, as shown in Fig. 2.
5 Security Analysis
In this section the security of the proposed honeyword with salt-chlorine scheme is compared with existing systems. The password file with hashed honeywords and the real password (sweetwords) is shown in Table 1 (column 1). These sweetwords can be recovered as plaintext with hash-cracking tools: the raw-passwords column shows the reverted hashes, together with the type of hash function (column 2) used to hash each particular password. Table 1 thus shows the current level of protection of a password database. Even with the honeyword mechanism, if all sweetwords are reversed and the raw text is retrieved, the attacker can guess the user's password from related information about the user and attack the system with the sweetword of highest probability. To crack the hashes the attacker only needs simple hash-cracking tools; even online hash crackers are enough. After obtaining the cracked password the attacker can easily hijack the account. So although the existing system challenges the attacker to a moderate level, it does not provide a fully secure system, and various kinds of attacks can still be mounted against it: for example, the honeychecker can be attacked after the password file has been compromised, so that the attacker steals the index of the real hashed password, retrieves the original password and impersonates an authenticated user. To avoid these insecurities and to enhance the security of the cloud password database, sweetwords hashed with salt-chlorine are implemented in this paper. The salt-chlorines generated are complex salts which undergo the processes of shuffling and filtering.
A hash can be reverted to the original plain text using hash-cracking tools. The plaintext retrieval from a hash using the tool Cain and Abel is shown in Fig. 3. There exist many online and offline password-cracking tools that revert hashes by launching
dictionary and brute-force attacks. The hashes reverted in that figure are the compromised user credentials, and the dictionary attack used to retrieve them is shown in Fig. 4. Hash values produced by combining a salt with a password are hard for attackers to revert, and the challenge for the attacker increases with the complexity of the salt. The proposed model provides a very complex salt, named salt-chlorine, which makes such attacks infeasible when it is hashed with the passwords. In Fig. 3, the last four hashes are the hash values produced by users' passwords together with their salt-chlorines, and they could not be reverted by the hash-cracking attacks.
A complete salt generation example is shown in Table 2 for the sweetword Angel31. The plain text Angel31 and the salt-chlorine oQQL[,FeiCJ/nz> are hashed together using the SHA-256 algorithm, which produces the irreversible hash value given below:
ff9e104128000457355efbdde88185f18779f6e56f8fd7895c4442e2618dc842.
Table 3 shows the results of hash-cracking attacks attempted on the sweetword list when the sweetwords are hashed with salt-chlorine: hashes generated with the salt-chlorine could not be reverted by the hash-cracking attacks. The result of cracking a hash with SC is shown in Table 3.
Table 4 compares honeyword generator models [1]; the guessing probability of the passwords and the security of the approaches are both categorized. In the model that uses existing users' passwords as honeywords, if all the hashes are cracked the guessing probability is low, since every sweetword in the list is a real password, but the users run the risk of having their passwords compromised without their knowledge. The attacker may also try to compromise other systems with the cracked passwords and launch several attacks via other user accounts, so involving other users' passwords in the model is a major drawback. The proposed approach instead focuses on increasing the complexity of password guessing for the attacker with the salt-chlorine technique. The novel honeyword with SC model preserves the privacy of all cloud users, increases the overhead for brute-force attackers, and avoids dictionary lookups by providing hashes with a complex salt-chlorine.
6 Conclusion
This paper proposed a honeyword with salt-chlorine (complex salt) generator model to protect the cloud password database. The security of the proposed model was analyzed and compared with existing models. The honeyword with complex salt generator protects the password file from attackers and greatly increases the complexity of password cracking for brute-force attackers. Since the passwords are hashed with salt-chlorines, other hash-cracking attacks cannot be mounted against the proposed model. The cracking probability decreases to an extremely low level, since the time taken to crack a password hashed with a salt-chlorine is extremely high, and the added complexity of cracking all n sweetwords makes such attacks infeasible. The proposed approach enhances the cloud entry-level security and protects the login credentials of cloud users from attackers.
References
1. Erguler, I.: Achieving flatness: selecting the honeywords from existing user passwords.
IEEE Trans. Dependable Secure Comput. 13, 284–295 (2016)
2. Cloud Security Alliance: The Treacherous 12- Cloud Computing Top Threats in 2016,
February 2016
3. Vance, A.: If your password is 123456, just make it hackme. New York Times, January 2010
4. Brown, K.: The dangers of weak hashes. SANS Institute InfoSec Reading Room, Maryland,
US, pp. 1–22, November 2013
5. Juels, A., Rivest, R.L.: Honeywords: making password-cracking detectable. In: Proceedings
of the ACM SIGSAC Conference on Computer and Communications Security, pp. 145–160
(2013)
6. Bojinov, H., Bursztein, E., Boyen, X., Boneh, D.: Kamouflage: loss-resistant password
management. In: Proceedings of the 15th European Symposium on Research in Computer
Security, pp. 286–302 (2010)
Multi Class Machine Learning Algorithms
for Intrusion Detection - A Performance Study
1 Introduction
Drastic developments in network technologies have made everyone depend on the Internet for almost everything. Applications of the Internet include banking, shopping, education, communication, business and so on. Hence the network is vulnerable to different security threats such as Denial of Service (DoS) attacks, routing attacks, Sybil attacks, probing, etc. These cannot be handled by state-of-the-art security mechanisms, namely authentication techniques, key-management techniques and security protocols, alone; hence there is a need for an Intrusion Detection System (IDS).
In the existing literature, different machine learning based intrusion detection techniques are available, namely Naive Bayes [1,2], Neural Networks [3], Support Vector Machines [4] and ensemble based methods [5]; a hybrid detection technique is also proposed in [6].
Even though a lot of literature is available on machine learning based intrusion detection, very limited literature is available on the performance comparison of machine learning algorithms for multi-class classification. This paper is an extension of the prior work of Belavagi and Muniyal [7], where the performance of different machine learning algorithms was compared on the NSL-KDD dataset
based on binary classification, whereas this paper discusses the performance comparison of different multi-class machine learning classification techniques for intrusion detection on the same dataset.
The paper is organized as follows. In Sect. 2 the Intrusion Detection System and the dataset used for it are discussed. Research work related to intrusion detection systems is discussed in Sect. 3. In Sect. 4 the various machine learning techniques used are discussed. Section 5 describes the framework and algorithm used for intrusion detection. In Sect. 6 the results obtained are analyzed, and Sect. 7 gives the overall conclusion and future scope.
In this paper, prediction analysis for the different class labels is done by considering the standard intrusion detection dataset NSL-KDD [9]. The dataset has forty-two attributes; the forty-second attribute is the label, which stores the labels of five classes. These are categorized as one normal class and four attack classes based on the behavior of the network. Specific types of abnormal activities are further grouped as User to Root (U2R), Denial of Service (DoS), Probe and Remote to Local (R2L).
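For illustration, the individual record labels of the dataset can be grouped into these five classes roughly as follows; the specific attack names below are a commonly used NSL-KDD/KDD'99 grouping and are an assumption, not a list taken from this paper.

# Hypothetical grouping of NSL-KDD record labels into the five classes
CATEGORY = {
    'normal': 'normal',
    # Denial of Service
    'neptune': 'DoS', 'smurf': 'DoS', 'back': 'DoS', 'teardrop': 'DoS', 'pod': 'DoS', 'land': 'DoS',
    # Probe
    'satan': 'Probe', 'ipsweep': 'Probe', 'nmap': 'Probe', 'portsweep': 'Probe',
    # Remote to Local
    'guess_passwd': 'R2L', 'ftp_write': 'R2L', 'imap': 'R2L', 'phf': 'R2L',
    'warezmaster': 'R2L', 'warezclient': 'R2L', 'multihop': 'R2L', 'spy': 'R2L',
    # User to Root
    'buffer_overflow': 'U2R', 'rootkit': 'U2R', 'loadmodule': 'U2R', 'perl': 'U2R',
}

def to_category(label):
    # Map a raw record label to one of: normal, DoS, Probe, R2L, U2R
    return CATEGORY.get(label, 'unknown')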
3 Related Work
A recent survey on ensemble and hybrid classifier based intrusion detection systems was presented by Aburomman et al. [10]. They considered homogeneous and heterogeneous ensemble methods and suggested that voting-based ensemble methods give satisfactory results; they considered bagging, boosting, stacking and mixtures of experts to create ensemble classifiers. A comparison of the various existing IDS technologies based on detection methodologies and detection approaches is given by Liao et al. [11]. Bukhtoyarov and Zhukov [12] proposed a neural network based distributed classifier to handle network intrusions, using the confidence level as the measure for decision making.
Their method works better in comparison with the existing ensemble techniques, and they concluded that further performance improvement is possible by formalizing the choice of the threshold value.
Intrusion detection using a support vector machine was proposed by Enache and Patriciu [13]. They noted that the performance of the SVM depends on the attributes used and hence applied a swarm intelligence technique to select the best features. The proposed model was tested on the NSL-KDD dataset, and the authors claim that it has a better detection rate and a lower false positive rate than a regular SVM.
Panda et al. [14] used a combination of different classifiers to identify intrusions. A supervised machine learning technique is used to filter the data, and a decision tree classifier is applied to the NSL-KDD dataset to identify whether a network activity is an intrusion or not; however, the model works only for binary classification.
Koc et al. [15] proposed a multi-class classifier based on a hidden Naive Bayes model to identify intrusions. This method identifies Denial of Service attacks with good accuracy compared to other attacks.
An intrusion detection technique using a Support Vector Machine (SVM) was proposed by Li et al. [16]. They also used a feature removal method to improve efficiency, selecting the nineteen best features, and experimented on the KDD-CUP99 dataset; however, the dataset used in the proposed method is very small.
A lightweight IDS was proposed by Sivatha Sindhu et al. [17]. The method mainly focuses on pre-processing the data so that only the important attributes are used; the first step is to remove redundant data so that the learning algorithms give unbiased results. Bahri et al. [18] proposed an ensemble based on the Greedy-Boost approach for both anomaly and misuse intrusion detection, using an aggregation decision classifier to reduce the time of intrusion detection. The authors claim that the proposed method has a minimal number of false alarms (false positives) and undetected attacks (false negatives).
The aim of this paper is to design an intrusion detection model that identifies multiple attacks. The predictive model is built using Logistic Regression, Gaussian Naive Bayes, Random Forest and Support Vector Machine classifiers, and the performance of these techniques is analyzed. These classification algorithms are discussed below.
The hypothesis for logistic regression is shown in Eq. 1, where g(θ^T x) is a logistic function; this function is defined in Eq. 2. Combining Eqs. 1 and 2, the hypothesis can also be represented as in Eq. 3.
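The equations themselves are not reproduced in the text above; under the usual logistic regression conventions, the forms being referred to are:

h_θ(x) = g(θ^T x)                          (1)
g(z) = 1 / (1 + e^(−z))                    (2)
h_θ(x) = 1 / (1 + e^(−θ^T x))              (3)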
5 Methodology
The overall methodology followed for the prediction of intrusions is shown in Fig. 1. In the preprocessing step all categorical data are converted to numerical data suitable for machine learning techniques. Then the ten best features are selected out of the forty-two features using a decision tree. The preprocessed data are then divided into testing data and training data, and an intrusion detection model is trained on the training data to predict multiple class labels. The models considered for intrusion detection are Gaussian Naive Bayes, Logistic Regression, Support Vector Machine and Random Forest classifiers. These models are used to predict the multi-class labels, namely DoS, U2R, R2L, Probe and normal, on the test data, and the predicted labels are compared using the parameters accuracy, precision, recall and F1-score.
The following algorithm is used to build the models. Coding was done in Python on an Intel Core i5-3230M with 4 GB RAM. Initially the dataset is made suitable for the machine learning algorithms. Then the best features are selected using a C4.5 decision tree classifier, which uses the Gini index as the measure. After this the dataset is divided into training and testing sets. The models built with GNB, LR, SVM and RF are trained, the labels of the testing data are predicted for the multiple classes, and the performance of the models is compared based on accuracy, precision, recall and F1-score.
Fig. 1. Framework
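A minimal sketch of this pipeline using scikit-learn is given below; the file name, the 70/30 split and the use of scikit-learn's Gini-based CART tree in place of C4.5 are illustrative assumptions rather than details from the paper.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score

# Load the NSL-KDD records (hypothetical CSV with a 'label' column)
data = pd.read_csv('nslkdd_train.csv')
X = pd.get_dummies(data.drop(columns=['label']))   # categorical -> numerical
y = data['label']                                  # normal, DoS, Probe, R2L, U2R

# Select the ten most important features with a Gini-based decision tree
tree = DecisionTreeClassifier(criterion='gini', random_state=0).fit(X, y)
top10 = X.columns[tree.feature_importances_.argsort()[::-1][:10]]
X = X[top10]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    'GNB': GaussianNB(),
    'LR': LogisticRegression(max_iter=1000),
    'SVM': SVC(),
    'RF': RandomForestClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    print(name, 'accuracy:', accuracy_score(y_test, pred))
    print(classification_report(y_test, pred))     # per-class precision/recall/F1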
Initially the categorical data are converted into numerical data. Then the redundant features are eliminated and the ten best features are selected using the C4.5 decision tree technique. The best attributes selected include Protocol type, Service, Serror rate, Srv-diff-host rate, Dst-host-count, Dst-host-same-srv-rate, Dst-host-diff-srv-rate, Dst-host-srv-diff-host-rate and Dst-host-srv-rerror-rate.
Table 1. Average accuracy of machine learning algorithms for multi-class intrusion detection

The multi-class performance of SVM, GNB, LR and RF in identifying DoS, U2R, R2L and Probe attacks is shown in Fig. 2. The performance measures considered are precision, recall and F1-score. From Fig. 2 it can be seen that Random Forest shows good performance in identifying all four attack classes compared to the other learning techniques. All the machine learning algorithms show very poor performance in identifying R2L attacks; the main reason is that the amount of data available for the R2L attack in the NSL-KDD dataset is limited. GNB has the lowest performance in identifying all four attacks, and SVM works better than LR.
The experiment is conducted by considering the four different machine learning algorithms to predict normal behavior and the attack classes U2R, R2L, Probe and DoS. Based on the prediction of each class, the accuracy of each algorithm is computed, and the performance of the machine learning algorithms is also analyzed based on the average accuracy, as shown in Table 1. From Table 1 it can be seen that Random Forest shows the best average accuracy and Gaussian Naive Bayes the least with respect to identification of the different classes of intrusions and normal behavior, whereas the average accuracy of the Support Vector Machine is better than that of Logistic Regression.
7 Conclusion
References
1. Mukherjee, S., Sharma, N.: Intrusion detection using naive bayes classifier with
feature reduction. Procedia Technol. 4, 119–128 (2012). 2nd International Confer-
ence on Computer, Communication, Control and Information Technology (C3IT-
2012), 25–26 February, 2012. http://www.sciencedirect.com/science/article/pii/
S2212017312002964
2. Panda, M., Patra, M.R.: Semi-Naïve Bayesian method for network intrusion detec-
tion system. In: Leung, C.S., Lee, M., Chan, J.H. (eds.) ICONIP 2009. LNCS,
vol. 5863, pp. 614–621. Springer, Heidelberg (2009). https://doi.org/10.1007/
978-3-642-10677-4 70
3. Devaraju, S., Ramakrishnan, S.: Performance comparison for intrusion detection
system using neural network with KDD dataset. ICTACT J. Soft Comput. 4(3),
743–752 (2014)
4. Khan, L., Awad, M., Thuraisingham, B.: A new intrusion detection system using
support vector machines and hierarchical clustering. VLDB J. 16(4), 507–521
(2007). http://dx.doi.org/10.1007/s0077800600025
5. Gaikwad, D.P., Thool, R.C.: Intrusion detection system using bagging ensemble
method of machine learning. In: 2015 International Conference on Computing Com-
munication Control and Automation, pp. 291–295, February 2015
6. Leite, A., Girardi, R.: A hybrid and learning agent architecture for network intru-
sion detection. J. Syst. Softw. 130, 59–80 (2017). http://www.sciencedirect.com/
science/article/pii/S0164121217300183
7. Belavagi, M.C., Muniyal, B.: Performance evaluation of supervised machine learn-
ing algorithms for intrusion detection. Procedia Comput. Sci. 89, 117–123 (2016).
http://www.sciencedirect.com/science/article/pii/S187705091631081X
8. Mitchell, T.M.: Machine Learning, 1st edn. McGraw-Hill Inc., New York (1997)
9. NSL-KDD dataset. Accessed Dec 2015
10. Aburomman, A., Reaz, M.: A survey of intrusion detection systems based on
ensemble and hybrid classifiers. Comput. Secur. 65, 135–152 (2017)
11. Liao, H.J., Lin, C.H.R., Lin, Y.C., Tung, K.Y.: Intrusion detection system: a com-
prehensive review. J. Netw. Comput. Appl. 36(1), 16–24 (2013)
12. Bukhtoyarov, V., Zhukov, V.: Erratum: ensemble-distributed approach in classifi-
cation problem solution for intrusion detection systems. In: Corchado, E., Lozano,
J.A., Quintián, H., Yin, H. (eds.) IDEAL 2014. LNCS, vol. 8669, p. E1. Springer,
Cham (2014). https://doi.org/10.1007/978-3-319-10840-7 60
13. Enache, C., Patriciu, V.V.: Intrusions detection based on support Vector machine
optimized with swarm intelligence. In: 2014 IEEE 9th IEEE International Sympo-
sium on Applied Computational Intelligence and Informatics (SACI), pp. 153–158,
May 2014
14. Panda, M., Abraham, A., Patra, M.R.: A hybrid intelligent approach for network
intrusion detection. Procedia Eng 30, 1–9 (2012). International Conference on
Communication Technology and System Design 2011. http://www.sciencedirect.
com/science/article/pii/S1877705812008375
15. Koc, L., Mazzuchi, T.A., Sarkani, S.: A network intrusion detection sys-
tem based on a hidden Naive Bayes multiclass classifier. Expert Syst. Appl.
39(18), 13492–13500 (2012). http://www.sciencedirect.com/science/article/pii/
S0957417412008640
16. Li, Y., Xia, J., Zhang, S., Yan, J., Ai, X., Dai, K.: An efficient intrusion detection
system based on support vector machines and gradually feature removal method.
Expert Syst. Appl. 39(1), 424–430 (2012). http://www.sciencedirect.com/science/
article/pii/S0957417411009948
17. Sindhu, S.S.S., Geetha, S., Kannan, A.: Decision tree based light weight intrusion
detection using a wrapper approach. Expert Syst. Appl. 39(1), 129–141 (2012).
http://www.sciencedirect.com/science/article/pii/S0957417411009080
18. Bahri, E., Harbi, N., Huu, H.N.: Approach based ensemble methods for better
and faster intrusion detection. In: Herrero, Á., Corchado, E. (eds.) CISIS 2011.
LNCS, vol. 6694, pp. 17–24. Springer, Heidelberg (2011). https://doi.org/10.1007/
978-3-642-21323-6 3
19. Murphy, K.P.: Machine Learning: A Probabilistic Perspective. The MIT Press,
Cambridge (2012)
20. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). http://www.cs.
colorado.edu/grudic/teaching/CSCI5622-2004/RandomForests-ML-Journal.pdf
Symmetric Key Based Secure Resource Sharing
1 Introduction
on the derivation of the decryption key by the users. To reduce the storage at
the users we use a logarithmic-keying approach where, for a set of R resources,
each user needs to store only O(log R) keys.
Now, if a user is granted access to a resource, the central authority selects a unique subset of keys from this pool of keys, derives the function F and the corresponding public information. We design two key distribution approaches for implementing the function F. In the first approach, the central authority randomly selects a secret polynomial of degree O(log R). An arbitrary point on this polynomial is chosen for the encrypting key of the resource and made public. For each user, the central authority evaluates the polynomial at the keys of this user, and all the evaluated values from all the users are published as public information. To derive the decryption key, the user needs to interpolate the polynomial and evaluate it at the particular point published by the central authority. Since deriving a polynomial of degree O(log R) requires O(log R) points, each user can easily compute the polynomial using the corresponding set of keys and the corresponding public information.
In the second approach, the central authority computes the decryption key as an XOR of values derived from the keys selected for the users sharing the resource. For each user, the central authority takes the keys of this user and passes each key through a one-way function along with a public value P that is specific to this user. The hashing is necessary to prevent leaking the actual values of the keys, and the public value acts as a salt against dictionary-based attacks. The central authority then XORs all the hash values from all the users, and this combined hash value is used as the encryption key after appropriate expansion or reduction of its length, depending on the requirement. In order to derive this key, from the properties of XOR, each user needs the XOR of all the hash values contributed by the remaining users. Since the basic key derivation operations are one-way hashing and XOR, this scheme is very efficient and can even be implemented in hardware.
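A minimal sketch of this second approach, assuming SHA-256 as the one-way function and a fixed 32-byte key length; the helper names are hypothetical, not the authors' implementation.

import hashlib
from functools import reduce

def user_contribution(key, public_value):
    # One-way hash of a user's key together with the user-specific public value (salt)
    return hashlib.sha256(public_value + key).digest()

def resource_key(contributions):
    # The encryption key is the XOR of all users' hash contributions
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), contributions)

def derive_key_for_user(own_key, own_public, others_xor):
    # A user combines its own contribution with the published XOR of all the others
    own = user_contribution(own_key, own_public)
    return bytes(x ^ y for x, y in zip(own, others_xor))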
Key Contributions. Our major contributions are as follows. (a) We devise an efficient key distribution approach for securing shared resources using a public-private information model. (b) We implement our model using two efficient key distribution approaches that require the users to store only a number of keys logarithmic in the number of authorized resources. (c) We show that
the cost of handling user dynamics is better than existing approaches without
compromising security. We have used key derivation cost, user storage, size of
public-information and membership handling costs as metrics to evaluate our
approaches. (d) The generic nature of our approach shows that more key distri-
bution protocols are possible within this model and hence, there is further scope
for expanding our approach to newer application domains.
Organization. The paper is organized as follows. In Sect. 2, we describe our
system model and identify security requirements. In Sect. 3, we describe our
framework in detail. We analyze the security of our framework in Sect. 5. In
Sect. 4, we present the experimental results obtained from our framework and
compare them with existing schemes. In Sect. 6, we conclude the paper and
describe some future work.
2 System Model
In this section, we describe the problem background in detail and the system
model. We also state our assumptions towards solving the problem. We conclude
the section by describing some related work in this area.
2.1 Background
Applications with shared resources can be classified into two broad classes based on user behavior: those that require all the users to be online at the time of sharing, e.g., video conferencing, and those that do not have this requirement, e.g., secure databases and file systems. For the sake of simplicity, we denote the former class of applications as online applications and the latter as offline applications. For online applications, the number of shared resources and the degree of sharing is small, e.g., a group of users may subscribe to one or more multicast sessions.
This problem has been studied extensively and many good solutions have been
proposed in the literature [18,25,26]. In this work, we focus on the security of
the offline applications and propose a framework to secure such applications.
that, in the current literature, some special topologies of the hierarchy graph, such as a tree [15,20] or graphs of a certain partial order dimension [2], are studied for simplicity. For any node u ∈ G and R ∈ S_u, let d(u, R) denote the length of a shortest (directed) path from u to a node v ∈ G because of which u can access the object R. Let d(u) = max_{R ∈ S_u} d(u, R). The depth of a hierarchy graph G, denoted d(G), is defined as max_{u ∈ V(G)} d(u).
Normally, in an access graph all the users at the higher levels can access all the lower-level resources. However, there are some special cases of access hierarchies where the users at the higher layers are restricted from accessing all the lower-level resources. This restriction is specified in terms of the depth of the hierarchy they can access, and hence such hierarchies are called limited-depth hierarchies. For a mechanism that works for a limited-depth hierarchy, we say that each vertex v in the graph is associated with a number ℓ(v) indicating that v is allowed to access resources that can be reached by a (directed) path of length at most ℓ(v). For a user u, we denote by cap(u) the set of resources that u can access. Similarly, for a resource r, we denote by acl(r) the set of users u such that r ∈ cap(u). We extend this notation naturally when dealing with sets of users and resources.
proportional to the depth of the access graph. Dynamic access control operations such as addition/deletion of edges and addition/deletion of a class are also handled efficiently by updating only the public information - a feature that was not available in earlier schemes. They also presented techniques to minimize the key derivation time at the expense of user storage [2]; the idea is to add some extra edges and extra nodes, called dummy nodes, based on the dimension of a poset. However, as pointed out in [2], the computational complexity of finding the dimension of a given poset diagram is not known. Our approach provides a key derivation scheme which, on average, performs comparably to or better than [2,3].
In [6,11], the authors describe a scheme that combines techniques from discrete logarithms and polynomial interpolation. However, the user storage cost is high and support for dynamic operations requires costly polynomial interpolation. In [28], the authors present a scheme using polynomials over finite fields; however, the degree of the polynomial kept secret at the object store is very high, key derivation involves computations with such large-degree polynomials and takes time proportional to the depth of access, and the cost of rekeying under dynamic updates is also high. In [9], the authors attempt a unification of most of the existing schemes by identifying their central attributes, such as: node based, direct key based, and iterative.
Once the database is treated as a service [13], it is easy to envision that database operations can be outsourced, bringing in a host of security issues. In [10,24], the authors apply the key derivation approach of [3] to secure databases. However, to use key derivation schemes for problems like secure databases, the access hierarchy needs to be built up-front: a virtual hierarchy of users is created and the scheme of [3] is used. But, as noted in [10,24], the computational cost of generating this virtual hierarchy can be quite high.
The key shortcoming of the key derivation techniques is that applying these solutions is difficult if the access hierarchy is not available upfront or is not explicitly specified. Such a scenario occurs in secure databases or in access control matrices, which implies that, in order to use the key derivation techniques, the access hierarchy structure first needs to be constructed for these applications. Although techniques for constructing access hierarchies have been proposed [10,24], they are expensive and place additional pre-processing overhead on the system. The main advantage of the key derivation schemes is that the user storage is minimal, O(1); however, the computational complexity of key derivation is proportional to the depth of the hierarchy, which can be O(N), where N is the number of application users. Thus, the key derivation techniques reduce the storage complexity but increase the computation required to derive the required keys.
From this discussion, we note that there is a possible trade-off between user storage and the complexity of key derivation. Given these shortcomings, we note that it is relatively easier to consider a hierarchy and flatten it before deploying any key distribution technique. Flattening a hierarchy simply means that we consider each resource individually and group all the users
sharing that resource. Using this approach, we are able to address those scenarios where the hierarchies are not readily available, which is the case in most practical user-level databases. Special hierarchies, such as limited-depth hierarchies where the resources given to a user fall between two levels of the hierarchy, can also be addressed with a flattened hierarchy. Next, we describe our approach, in which the access hierarchy graph is flattened, thereby eliminating the need for the expensive pre-processing step. To secure the flat access structure, we describe storage-efficient key distribution techniques that are based on the public-private model, which means that the decryption key of a resource is a function of some public information and the user's secret information.
3 Our Approach
In this section we describe the proposed framework for shared resource access. In Sect. 3.1, we describe our approach for flattening the access hierarchy of a given organization. In Sect. 3.2, we describe our basic key distribution approach for securing resources known to a single user using the storage-efficient logarithmic keying approach. In Sect. 3.3, we enhance our basic key distribution using Shamir's secret sharing technique and describe the solution for securing all resources shared by the users.
The process of flattening the hierarchy can be seen as computing the transitive closure of the graph G. There are several algorithms for computing the transitive closure of a given directed graph [8]. Given the graph G, its transitive closure, denoted G*, has at most n^2 edges if G has n vertices. The central authority therefore first computes the graph G*. Note that the process for computing the transitive closure changes only slightly for limited-depth hierarchies, as the depth of each user is noted before computing the closure. The outcome of computing the transitive closure is that, for every user u, cap(u) is known and, similarly, for every resource r, acl(r) is known.
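A minimal sketch of this flattening step, computing reachability from every node of an adjacency-list graph by depth-first traversal; the graph representation is an assumption made for illustration.

def transitive_closure(adj):
    # adj: dict mapping each node to the list of nodes it points to directly
    closure = {}
    for start in adj:
        reachable, stack = set(), list(adj[start])
        while stack:
            v = stack.pop()
            if v not in reachable:
                reachable.add(v)
                stack.extend(adj.get(v, []))
        closure[start] = reachable          # cap(start): everything start can reach
    return closure

# Example: user u1 -> u2 -> resource r1 implies r1 in cap(u1)
caps = transitive_closure({'u1': ['u2'], 'u2': ['r1'], 'r1': []})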
Now, to secure single-owner resources with the logarithmic keying technique, each user stores at most O(log m) symmetric keys, where m is the number of resources to which he has access. To encrypt a resource, the user selects a unique subset of these keys, computes the XOR of the keys in it and uses this value to encrypt the resource. To enable sharing of resources, each user who has access to the resource needs the decrypting key. We use Shamir's secret sharing approach to encode the encrypting (symmetric) key of the shared resource: the encrypting key can be computed locally using a subset of the O(log m) keys held by each user together with some public information, which can be stored in the resource store as resource metadata.
We note that the above key distribution approach can be generalized. Instead of choosing a fixed-size subset of keys based on the binary identifier, the central authority can choose a unique but smaller subset of keys for each resource. The key distribution can then be stated as follows: the central authority generates a unique pool of keys for each user, and from this pool, for each resource in cap(u), it selects a unique subset of keys to encrypt the resource. Since the subset of keys is unique by construction, no two resources are encrypted
with the same key. Moreover, the pool of keys need not be larger than O(2 log m), since it can be shown that the number of subsets of size k, i.e., C(2 log m, k), for some k < log m, is greater than m, where k can be chosen appropriately using Stirling's approximation [8].
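Under this reading (a pool of 2 log m master keys per user, with logarithms to base 2), a quick check that enough distinct subsets exist is the standard bound C(n, k) ≥ (n/k)^k: taking n = 2 log m and k = log m gives

C(2 log m, log m) ≥ (2 log m / log m)^(log m) = 2^(log m) = m,

so at least m distinct key subsets of size log m are available, one per resource.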
Reducing Storage Further. In the key derivation techniques [1–3,5,9,17,19,21,28] the user needs to store only one key, albeit at the cost of higher computation when deriving the decryption key. We note that the logarithmic keying can be replaced with a scheme that requires the user to store only one master key K. The scheme is as follows: to encrypt a resource Ri with identifier IDi, the central authority computes the encrypting key as KRi = HK(IDi), where H denotes a secure keyed one-way hash function. Since the user holds the master key K, he can compute the encrypting key by performing one secure one-way hash computation. However, in the logarithmic keying scheme the user only needs to perform log m XOR operations, where m is the number of resources, and this computation is much faster than one secure hash computation even when m is as large as 2^12. This illustrates that reducing storage invariably increases the computation required for key derivation; the logarithmic keying scheme reduces the key derivation overhead while increasing the storage complexity only slightly.
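A minimal sketch of this single-master-key variant, using HMAC-SHA256 as the keyed one-way function H_K; the choice of HMAC and the byte encodings are assumptions for illustration.

import hmac
import hashlib

def resource_key(master_key, resource_id):
    # K_Ri = H_K(ID_i): derive the per-resource encrypting key from the master key
    return hmac.new(master_key, resource_id.encode('utf-8'), hashlib.sha256).digest()

# Example: both the central authority and the user derive the same key for resource 'R42'
k = resource_key(b'\x01' * 32, 'R42')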
The logarithmic keying approach works only if a single user accesses the resources. To secure shared resources we apply Shamir's secret sharing scheme coupled with the logarithmic keying approach: we encrypt the resource with a secret that can be generated locally from the individual subsets of keys held by the users. Our final solution has two main steps.
Step 1: Key Distribution. Notice that for each user u it holds that |cap(u)| ≤ m. To provide keys for the resources, the central authority, using the scheme described in Sect. 3.2, picks 2 log m keys uniformly and independently at random from the field F. Note that, for all practical purposes, it can be assumed that the keys chosen for each user are all distinct; these 2 log m keys form the master keys of the user. For each user u and every resource r ∈ cap(u), the central authority then allocates a different subset of log m keys, chosen independently and uniformly at random from the set of master keys at u. These are denoted S_u^r = {K_{u,1}^r, K_{u,2}^r, ..., K_{u,log m}^r} and are called the keys of resource r at user u. The central authority thus holds |acl(r)| · log m keys for resource r.
Step 2: Key Management. Given the key distribution from Step 1, we now apply ideas from Shamir's influential paper [22] to complete the solution. The central authority chooses a polynomial fr of degree log m for resource r and uses it to encode the encrypting key of the resource. The encrypting key k(r) of the resource r is computed as follows. The CA chooses a point on the polynomial fr, say p(r), computes fr(p(r)) and publishes p(r), fr(p(r)). The decryption key k(r) is set to fr(0), and the polynomial fr(.) is kept secret by the CA. Now, the CA implements a log m–out of–|acl(r)| · log m threshold secret sharing scheme to enable a user to interpolate this polynomial. To this end, for each user u ∈ acl(r), the CA evaluates fr at each of the keys in S_u^r, and these values are made public. To access a resource r, a user u uses the subset S_u^r together with the corresponding public information, i.e., the evaluations of fr(.) on the key set S_u^r, to interpolate the polynomial fr(.). Finally, the user evaluates the interpolated polynomial at 0 to recover the encrypting key k(r) = fr(0).
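A minimal sketch of this step over a toy prime field; the modulus, the polynomial degree and the use of exactly degree + 1 shares (standing in for the log m keys at the user plus the published point) are simplifying assumptions, not the authors' parameters.

import random

P = 2**61 - 1                           # toy prime modulus (assumption)

def make_polynomial(secret, degree):
    # f(0) = secret; the remaining coefficients are random field elements
    return [secret] + [random.randrange(P) for _ in range(degree)]

def evaluate(coeffs, x):
    # Horner evaluation of f(x) modulo P
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % P
    return result

def interpolate_at_zero(points):
    # Lagrange interpolation of f(0) from degree + 1 points (x_i, y_i)
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

# CA side: k(r) = f(0); publish the evaluations of f at the user's keys
degree = 4                              # stands in for log m
f = make_polynomial(secret=123456789, degree=degree)
user_keys = random.sample(range(1, P), degree + 1)
public_shares = [(k, evaluate(f, k)) for k in user_keys]

# User side: interpolate at 0 from the keys and the public evaluations
assert interpolate_at_zero(public_shares) == 123456789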
Storage and Computational Complexity. The storage complexity of each user is O(log m) and that of the central authority is O(n log m), where n is the total number of users. We denote the average degree of resource sharing by mr, i.e., on average mr users share a particular resource. Given this information, the public information required per shared resource is mr log m values. The complexity of interpolating a polynomial of degree log m, to extract the encryption key, is O(log^2 m) operations using the well-known Lagrange method. The main advantage of our scheme is the small degree of the polynomial compared to other schemes [6,11,28] based on polynomial interpolation.
Our framework allows for efficient updates to handle changes to the user set,
changes to the hierarchy, and changes to the resource set. We now describe the
operations required to address each of these events.
Addition of a User. Suppose that a new user u, along with the list of objects he can access, is given. Let Ra denote the set of resources for which acl(r) changes. The central authority chooses the set of master keys for u independently and uniformly at random from the field F. For each resource r ∈ cap(u), the CA also picks the subset of keys S_u^r. Now, for every resource r ∈ Ra, the central authority evaluates the polynomial fr(.) at the points in S_v^r for every new member v ∈ acl(r) and makes these evaluations public. No change to the polynomial or
the key k(r) of the object is required. Unlike other schemes, adding a user is very easy in our framework, as only a few more evaluations of polynomials are required. We note that adding a user in an access hierarchy can be easily modeled in our approach by considering the incremental transitive closure.
Revoking a User. In this case, the central authority has to essentially change the polynomial for each of the affected resources. For every such resource r, the CA chooses a different polynomial of degree log m and recomputes the public information for each user in acl(r). The CA need not change p(r) or the keys of the users; it only has to change the encryption key of r.
Addition of an Authorization. This corresponds to adding a resource r to
cap(u) for a user u. The CA associates a subset of keys Sur , evaluates fr at the
points (keys) in Sur , and the resulting values are made public.
Revoking an Authorization. In this case, only one resource is affected. To
handle this change, the central authority chooses a different polynomial and
proceeds as in the previous case.
Addition of a Resource. We now consider the case when a new resource r
is added to the resource store. In this case, let us assume that acl(r) is also
provided. For each user u in acl(r), the CA associates the set Sur and informs u
of the same. The CA then chooses a polynomial fr (.) and computes the required
public information.
Extensions to Limited-Depth Hierarchies. Note that the above frame-
work can work seamlessly for limited-depth hierarchies. Instead of finding the
transitive closure of the access graph, we simply find the graph H such that
(u, v) ∈ E(H) if and only if dG (u, v) ≤ d(u) and apply our approach.
4 Experimental Results
We performed three experiments. First, we compared the average number of operations required in our proposed framework against the scheme described in [3], which we refer to as Atallah's scheme. Second, to evaluate the efficiency of our framework in various settings, we describe a profiling of organizations and evaluate our framework on each of these profiles. Finally, we compared the storage overhead of the proposed framework with that of Atallah's scheme. The implementation was in C++ on a general-purpose Linux PC.
Comparison of Operations. We experimented with the number of users ranging from n = 100 to n = 1000. As we need to generate access graphs for Atallah's scheme, we used random graphs with a diameter between log n and 2 log n, and we used the transitive closure of the same graphs for evaluating our framework. The number of resources varies from 100 to 1000. To measure the average cost of accessing a shared resource, we computed the cost of accessing random resources at randomly chosen users using our framework and Atallah's scheme. For Atallah's scheme, we used SHA-1 as the hash function for key derivation.
Fig. 1. (a) Comparison of the computational cost (number of operations vs. number of users) of our scheme with the scheme of Atallah et al. [3]. (b) Profiling results of the proposed scheme for the T, B and M profiles with log m = 10 and log m = 16.
To abstract away from the specifics of the different implementations, the cost was measured as the average number of operations performed to derive the key of a randomly chosen resource; for our framework, the number of operations required to interpolate a set of points was measured. The results were averaged over 25 trials. Figure 1(a) shows the results of the experimentation; we can see that, on average, our framework requires a smaller number of operations as the size of the system grows.
Profiling and Efficiency of the Framework. We describe a practical profiling of different organizations that enables us to evaluate our framework in diverse settings; we note that this profiling can be used to evaluate other key derivation techniques as well. The profiling is based on the distribution of users across the organizational hierarchy and has four models: Bottom-heavy, Top-heavy, Middle-heavy and Uniform. The Bottom-heavy model corresponds to an organizational structure with many users at the lower levels of the hierarchy. Similarly, a Top-heavy model has more users at the top of the hierarchy, a Middle-heavy model has more users in the middle, and in the Uniform model the users are spread evenly across the organization. Note that most organizations fall into these categories and hence can easily be modeled by these profiles. We experimented with user sizes n = 2^10 and n = 2^16. Figure 1(b) shows the results of the experiments. For example, the line labeled T, log m = 10 gives the average number of operations measured for a Top-heavy organization with n = 2^10 users; in the figure, B stands for Bottom-heavy and M for Middle-heavy. We can clearly see the variation in the number of operations. In the case of a Top-heavy hierarchy, where the degree of sharing is typically small, the number of operations even for 2^16 users is around 200. This can be contrasted with a Bottom-heavy hierarchy with a larger degree of sharing, where the number of operations increases but still stays under 250. Predictably, the lines for the Middle-heavy profile fall between the Top-heavy and Bottom-heavy cases.
Fig. 2. The average public storage (log scale) required in Atallah's scheme [3] and our scheme, as the number of users grows from 0 to 1000.
Storage Comparison. In Fig. 2, we show the public storage of our scheme against Atallah's scheme. For small hierarchies, our scheme requires a higher amount of public storage, but as the number of users increases, the public storage in our scheme becomes comparable to that of Atallah's scheme. We note that public storage is necessary in such applications, as it serves to reduce the amount of secure communication between the CA and the users.
5.1 Soundness
Notice that, since each resource is associated with an access polynomial along the lines of Shamir's secret sharing scheme, any valid user can always access a resource. This holds because the user simply needs to use the subset of keys associated with the required resource from his key ring; since the user also knows the public information required to interpolate the resource polynomial, the user can access the resource.
Any scheme for shared resource access has to guarantee that a group of malicious users cannot pool their secrets (keys) to derive access to any resource that they cannot otherwise access. If a solution is resistant to any group of up to k colluding users, we call the solution k-collusion resistant; the parameter k is often called the degree of collusion resistance of the solution. In the following, we show that our scheme is collusion resistant to degree n, where n is the number of users in the system.
To make the presentation formal, we need to define some notions. We follow the model introduced by Atallah et al. [3] and consider adversaries that can actively corrupt any node; when a node is corrupted by an adversary, the adversary obtains all the keys owned by the corrupted node.
We also assume that keys assigned to the users are chosen uniformly at random
from all possible keys in the field F.
We let the adversary know the access graph G and its transitive closure; in effect, the adversary knows cap(u) for every user u and acl(r) for every resource r. For a given set C of corrupted nodes, let cap(C) = ∪_{u∈C} cap(u). Let us fix any resource r ∉ cap(C); the goal of the adversary is to access r. For this purpose, imagine an oracle O that knows the key k(r) for r. The adversary creates a key (or a set of keys) k(r') and presents it to O. The adversary is successful (wins) if k(r') = k(r).
While our description above uses an adaptive adversary, it can be noted that the power of an adaptive adversary is the same as that of a static adversary, so in the rest of the presentation we work with a static adversary, which we call A.
From the above description, it is clear that the advantage of the adversary A is tied to its ability to come up with the right polynomial. However, as stated in Shamir's paper [22], even if a single point is unknown it is difficult, in the information-theoretic sense, to determine the polynomial. In our case, the adversary A does not know any single point completely: it can know only the images but not the pre-images. For each possible x ∈ F for each of the pre-images, A can construct a polynomial, and all of these |F|^(log m) polynomials are equally likely to be the correct polynomial for resource r. Hence, A cannot win with non-negligible probability, as |F| is large enough.
References
1. Akl, S.G., Taylor, P.D.: Cryptographic solution to a problem of access control in
a hierarchy. ACM Trans. Comput. Syst. 1(3), 239–248 (1983)
2. Atallah, M.J., Blanton, M., Frikken, K.B.: Key management for non-tree access
hierarchies. In: Proceedings of ACM SACMAT, pp. 11–18 (2006)
3. Atallah, M.J., Frikken, K.B., Blanton, M.: Dynamic and efficient key management
for access hierarchies. In: Proceedings of ACM CCS, pp. 190–202 (2005)
4. Castiglione, A., Santis, A.D., Masucci, B., Palmieri, F., Huang, X., Castiglione,
A.: Supporting dynamic updates in storage clouds with the AKL–Taylor scheme.
Inf. Sci. 387, 56–74 (2017)
5. Chang, C.C., Buehrer, D.J.: Access control in a hierarchy using a one-way trap
door function. Comput. Math. Appl. 26(5), 71–76 (1993)
6. Chen, T.S., Chen, H.J.: How-Rernlina: a novel access control scheme based on
discrete logarithms and polynomial interpolation. J. Ya-Deh Univ. 8(1), 49–56
(1999)
7. Chu, C.K., Chow, S.S., Tzeng, W.G., Zhou, J., Deng, R.H.: Key-aggregate cryp-
tosystem for scalable data sharing in cloud storage. IEEE Trans. Parallel Distrib.
Syst. 25(2), 468–477 (2014)
8. Cormen, T., Leiserson, C., Rivest, R., Stein, C.: Introduction to Algorithms, 2nd
edn. McGraw Hill, New York (2001)
9. Crampton, J., Martin, K., Wild, P.: On key assignment for hierarchical access
control. In: Proceedings of the 19th IEEE workshop on Computer Security Foun-
dations, pp. 98–111 (2006)
10. Damiani, E., di Vimercati, S.D.C., Foresti, S., Jajodia, S., Paraboschi, S., Samarati,
P.: Selective data encryption in outsourced dynamic environments. Electron. Notes
Theor. Comput. Sci. 168, 127–142 (2007)
11. Das, M., Saxena, A., Gulati, V., Pathak, D.: Hierarchical key management schemes
using polynomial interpolation. SIGOPS Oper. Syst. Rev. 39(1), 40–47 (2005)
12. Gouda, M.G., Kulkarni, S.S., Elmallah, E.S.: Logarithmic keying of communication
networks. In: Datta, A.K., Gradinariu, M. (eds.) SSS 2006. LNCS, vol. 4280, pp.
314–323. Springer, Heidelberg (2006). doi:10.1007/978-3-540-49823-0 22
13. Hacigümüs, H., Mehrotra, S., Iyer, B.R.: Providing database as a service. In: ICDE,
pp. 29–38 (2002)
14. Jeng, F.G., Wang, C.M.: A practical and dynamic key management for a user
hierarchy. J. Zhejiang Univ. Sci. A 7(3), 296–301 (2006)
15. Liaw, H., Wang, S., Lei, C.: A dynamic cryptographic key assignment scheme in a
tree structure. Comput. Math. Appl. 25(6), 109–114 (1993)
16. Lin, C.H., Lee, W., Ho, Y.K.: An efficient hierarchical key management scheme
using symmetric encryptions. In: 19th International Conference on Advanced Infor-
mation Networking and Applications (AINA 2005), vol. 2, pp. 399–402 (2005)
17. MacKinnon, S.J., Taylor, P.D., Meijer, H., Akl, S.G.: An optimal algorithm for
assigning cryptographic keys to control access in a hierarchy. IEEE Trans. Comput.
34(9), 797–802 (1985)
18. Naor, D., Naor, M., Lotspiech, J.: Revocation and tracing schemes for stateless
receivers. In: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, pp. 41–62. Springer,
Heidelberg (2001). doi:10.1007/3-540-44647-8 3
19. Ray, I., Ray, I., Narasimhamurthi, N.: A cryptographic solution to implement
access control in a hierarchy and more. In: Proceedings of ACM SACMAT, pp.
65–73 (2002)
20. Sandhu, R.S.: Cryptographic implementation of a tree hierarchy for access control.
Inf. Process. Lett. 27(2), 95–98 (1988)
21. Santis, A.D., Ferrara, A.L., Masucci, B.: Cryptographic key assignment schemes
for any access control policy. Inf. Process. Lett. 92(4), 199–205 (2004)
22. Shamir, A.: How to share a secret. Commun. ACM 22, 612–613 (1979)
23. Tang, S., Li, X., Huang, X., Xiang, Y., Xu, L.: Achieving simple, secure and efficient
hierarchical access control in cloud computing. IEEE Trans. Comput. 65(7), 2325–
2331 (2016)
24. di Vimercati, S.D.C., Samarati, P.: Data privacy problems and solutions. In: Pro-
ceedings of the Third International Conference on Information Systems Security
(ICISS), pp. 180–192 (2007)
25. Waldvogel, M., Caronni, G., Sun, D., Weiler, N., Plattner, B.: The VersaKey frame-
work: versatile group key management. IEEE JSAC 17, 1614–1631 (1999)
26. Wong, C.K., Gouda, M., Lam, S.S.: Secure group communications using key graphs.
IEEE/ACM Trans. Netw. 8, 16–30 (2000)
27. Yang, C., Li, C.: Access control in a hierarchy using one-way functions.
Comput. Secur. 23, 659–664 (2004)
28. Zou, X., Karandikar, Y., Bertino, E.: A dynamic key management solution to access
hierarchy. Int. J. Netw. Manag. 17, 437–450 (2007)
Prevention of PAC File Based Attack
Using DHCP Snooping
1 Introduction
A proxy server in an organisation reduces the bandwidth used on the shared channel.
It can increase performance and can be used for load balancing within the
organisation. As the number of users increases, configuring proxy settings on every
end system becomes a difficult task for an administrator. A Proxy Auto-Config (PAC)
file [1] is used to locate the web proxy servers; it can be stored on a web server or
on a proxy server depending upon the number of users.
The automation of PAC file configuration in web clients is performed using
WPAD [2]. The PAC file, written in JavaScript with a mandatory function
FindProxyForURL [1], decides whether the host connects to the remote server through
a proxy or directly. When the user requests a URL through the browser, the PAC file
is retrieved from the web server using the WPAD protocol.
The WPAD protocol searches for the PAC file using the following methods:
– Using DHCP: the web browser requests the PAC file location from the DHCP
server. The DHCP server must be configured with option 252 containing the PAC
file location.
– Using DNS: DNS WPAD is a method for detecting a PAC file via discovery
by leveraging the network name of the user computer and using a consistent
DNS configuration and PAC script file name.
The response from the DHCP server can be spoofed [3] by an attacker within the
network, i.e., a man in the middle between the victim and the DHCP server, so the
PAC file is retrieved from an attacker-controlled location. Even if the user requests
a web server running HTTPS [4], the attacker is able to see the URL visited because
the attack happens before the end-to-end connection is established.
This paper is organized as follows. Related work is described in Sect. 2. The
problem in the WPAD protocol is outlined in Sect. 3. The proposed system and its
implementation are described in Sect. 4. Section 5 discusses the results and Sect. 6
concludes the paper.
2 Related Work
In the man-in-the-middle attack by name collision, the user inadvertently leaks
internal domain name requests, which lets an attacker create name collisions for
those queries by registering the domain name ‘company.ntld’ [5] in a new gTLD [6]
.ntld [7]. This name collision attack can cause all web traffic of an Internet user
to be redirected to a Man-In-The-Middle (MITM) proxy automatically right
after the web browser is launched. The underlying problem of this attack
is internal namespace WPAD query leakage.
In the Browser Cache Poisoning (BCP) attack, a one-time MITM attack on a user's
HTTPS session is used to substitute cached resources with malicious ones. Browsers
are highly inconsistent in their caching policies when loading resources over SSL
connections with invalid certificates [8].
In an insecure HTTP connection an attacker can perform a MITM attack; the use of
an HTTPS enforcer helps establish a secure connection over HTTPS. The mitigation is
based on collecting a static list of URLs maintained by different user agents and
using a Squid proxy server as a daemon that checks each URL against the static
list [9].
In the ARP cache poisoning attack, MAC addresses are collected by broadcasting ARP
requests and the caches of two hosts are poisoned. For mitigation of ARP cache
spoofing, security features such as DHCP snooping and Dynamic ARP Inspection (DAI)
are enabled in the network switch [10].
In sniffing and propagating malware through WPAD deception, the attacker
impersonates a WPAD web server when clients request the PAC file through the
Windows NetBIOS Name Service (NBNS) protocol in a local area network [11].
In the name collision attack the internal DNS namespace is leaked to the outside
domain name servers. The attacker needs to collect the leaked domains and register
them in order to redirect the user's traffic. This can be controlled in three ways:
by reserving new registrations based on NXD (non-existent domain) traffic, by
filtering requests before they enter the public namespace, and by running a
background process that filters domains within the network. BCP replaces JavaScript
files with malicious ones having the same URL, so if the user visits the same
website again, the malicious cache provided by the attacker is loaded into the
browser; the attack is prevented by running a script that checks the freshness and
integrity of resources from a website. The SHS-HTTPS enforcer redirects traffic
through a Squid proxy server if it is an HTTP connection and checks it against
preloaded lists that are synchronised periodically. In a LAN, WPAD web server
impersonation is done by the attacker using the NBNS protocol, responding with a
malicious PAC file.
3 Problem Outline
The attack can be performed in any network, including public Wi-Fi networks where
HTTPS is essential for the end user. Figure 1 depicts the working of the WPAD
feature with the host and the DHCP [12] server. The WPAD feature is exploited so
that certain browser requests are exposed to attacker-controlled code; the web site
addresses the user visits become known to the attacker.
When a host contacts DHCP to connect to a network, the attacker acts as a DHCP
server and sends a malicious response containing the path of the PAC file. Besides
assigning IP addresses, DHCP can thus be used to set up the proxy settings the
browser uses to access URLs [13]. In this attack the browser receives the PAC file,
and this PAC file decides the proxy for each URL. The attacker can modify the PAC
file so as to redirect the traffic to an attacker-controlled phishing site. A proxy
server controlled by the attacker receives the request from the user. The attacker
receives the entire URL because it gets the request before an end-to-end connection
is established with the HTTPS protocol. If the PAC file location is not configured
in DHCP, the browser looks for it in the DNS and NetBIOS settings.
Figure 2 shows the normal functioning of accessing a web server through a web
browser. The user sends a DHCPINFORM message to a legitimate server. The server
then serves the path where the PAC file resides. The user sends an HTTP GET request
to the web server for the PAC file. The PAC file is served by the web server and
contains the IP address of the proxy server to be used for the URLs.
Figure 3 depicts the attack scenario. The attacker acts as a man in the middle,
i.e., impersonates a DHCP server and responds to the client's request before the
legitimate server's message reaches the user; this is called DHCP spoofing. The user
retrieves the PAC file from the attacker's web server, the traffic is redirected to
the attacker-controlled proxy server, and the attacker is able to view the URLs the
user is visiting. This attack can be analysed using the packet sniffing tool
Wireshark.
4 Proposed System
Figure 4 depicts the basic architecture for mitigating the attack on the WPAD
protocol by using DHCP snooping deployed in the switch. The snooping feature is
deployed in the switch and groups the ports as trusted and untrusted. The port
connected to the DHCP server is made trusted and all others untrusted. This method
also works with multiple DHCP servers.
In DHCP snooping, the switch defends the network from rogue DHCP servers. The
switch checks the messages that pass through the network and acts like a firewall.
A DHCP snooping table (or DHCP binding database) is created by the switch for
monitoring and is used by the switch to filter messages. The database keeps track
of the DHCP addresses that are assigned to ports and filters packets from untrusted
ports: a packet from an untrusted port is dropped if its MAC address does not match
the MAC address recorded in the database.
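To make the filtering rule concrete, the following is a minimal sketch (not the actual switch implementation) of the binding-table logic described above; the port names, the Packet structure and the binding-table contents are illustrative assumptions.

```python
from dataclasses import dataclass

TRUSTED_PORTS = {"Gi0/1"}        # port facing the legitimate DHCP server (assumed name)

@dataclass
class Packet:
    in_port: str
    src_mac: str
    src_ip: str
    is_dhcp_reply: bool          # DHCPOFFER/DHCPACK, including option 252 answers

# port -> (mac, ip) learned from legitimate DHCP leases (the snooping/binding table)
binding_table = {"Gi0/5": ("00:0c:29:aa:bb:cc", "192.168.73.131")}

def snoop(pkt: Packet) -> bool:
    """Return True to forward the packet, False to drop (and log) it."""
    # Rule 1: DHCP server replies are accepted only on trusted ports.
    if pkt.is_dhcp_reply and pkt.in_port not in TRUSTED_PORTS:
        print(f"drop: rogue DHCP reply on untrusted port {pkt.in_port}")
        return False
    # Rule 2: packets on untrusted ports must match the recorded binding.
    bound = binding_table.get(pkt.in_port)
    if pkt.in_port not in TRUSTED_PORTS and bound and bound[0] != pkt.src_mac:
        print(f"drop: MAC {pkt.src_mac} does not match binding on {pkt.in_port}")
        return False
    return True

# A spoofed DHCP reply from the attacker on a host-facing port is dropped,
# while a normal client packet that matches its binding is forwarded.
print(snoop(Packet("Gi0/5", "00:0c:29:de:ad:01", "192.168.1.2", True)))      # False
print(snoop(Packet("Gi0/5", "00:0c:29:aa:bb:cc", "192.168.73.131", False)))  # True
```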
200 K.R. Atul and K.P. Jevitha
Figure 5 shows how the proposed system is implemented. The host systems and the
DHCP server are connected through a switch in which the snooping feature is
available at the access layer. Automatic proxy configuration must be enabled in the
web browser of each host system.
A DHCP server is installed to serve the path of the PAC file. ISC DHCP is open
source software used to provide IP addresses. After installing the ISC DHCP server,
it is configured by specifying the network interfaces in
/etc/default/isc-dhcp-server and the IP range, default gateway, name server and
subnet mask in the /etc/dhcp/dhcpd.conf file. Moreover, it should be configured to
serve the PAC file location by specifying option 252 with the path on the web
server.
In the second module a web server is created for hosting the PAC file. If the
Apache web server is used, an .htaccess file is created in the root directory
specifying the MIME type ‘application/x-ns-proxy-autoconfig’ for the PAC file. The
PAC file is then uploaded to the root directory with a .dat extension.
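A hedged sketch of the configuration artefacts described in these two modules is given below; the option name wpad-url, the addresses, the subnet and the proxy port are illustrative assumptions rather than the exact values used in the experiment.

```
# /etc/dhcp/dhcpd.conf -- publish the PAC file location via DHCP option 252
option wpad-url code 252 = text;
subnet 172.17.128.0 netmask 255.255.255.0 {
  range 172.17.128.100 172.17.128.200;
  option routers 172.17.128.1;
  option wpad-url "http://172.17.128.52/wpad.dat";
}

# .htaccess in the web server root -- MIME type expected for PAC files
AddType application/x-ns-proxy-autoconfig .dat

# wpad.dat -- a minimal PAC file; FindProxyForURL points browsers at the proxy
function FindProxyForURL(url, host) {
  return "PROXY 172.17.128.52:3128; DIRECT";
}
```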
In the third module a proxy server is set up locally for the end users. This is
done with the Squid proxy or the Google DNS servers, i.e., 8.8.8.8 and 8.8.4.4. The
proxy address is provided within the PAC file.
In the fourth module, DHCP snooping has to be enabled in the switch that connects
the hosts and the DHCP server. Snooping prevents rogue DHCP servers by assigning
ports as trusted and untrusted. The switch port connected to the DHCP server is
made trusted and may deliver DHCP replies (e.g., to DHCPINFORM messages) to any
host; all other ports are made untrusted. If any packet violation is detected, the
packets are dropped at the switch and the events are logged.
The attacker hijacks the request made by the client and serves the malicious PAC
file by spoofing the response from the legitimate server. The attacker, having the
IP address 192.168.1.2, delivers this PAC file by running a bogus DHCP server. The
request for the PAC file, i.e., wpad.dat, made by the client with IP address
192.168.73.131 is answered by the attacker, as shown in Fig. 6.
As shown in Fig. 7, the request is hijacked by the attacker and forwarded to the
Google server 8.8.8.8 on port 53 for fetching the requested website.
Figure 8 shows the client, port number and protocol from which the malicious PAC
file is requested. Figure 9 shows the list of URLs the client visits.
As shown in Fig. 10, the PAC file is requested by the host with IP address
172.17.128.121 from the web server with IP address 172.17.128.52 via an HTTP GET
request after the DHCP response has been spoofed. Since ‘Automatically detect
settings’ is turned on in the host system, the malicious PAC file is served by the
attacker.
6 Conclusion
In this paper a mitigation strategy based on DHCP snooping is proposed for the WPAD
protocol, which is used to configure proxy settings on end systems. This feature
can leak URLs to an attacker when the PAC file is retrieved from the attacker's web
server, because the attacker is able to spoof the DHCP response that carries the
path of the PAC file. This work explains the attack on WPAD using DHCP in detail.
The snooping feature in the switch can be deployed to prevent bogus DHCP servers by
grouping the ports as trusted and untrusted.
References
1. Introduction to PAC files — FindProxyForURL. http://findproxyforurl.com/
pac-file-introduction/
2. Gauthier, P., Cohen, J., Dunsmuir, M.: Draft-ietf-wrec-wpad-01 - Web Proxy Auto-
Discovery Protocol (1999). https://tools.ietf.org/html/draft-ietf-wrec-wpad-01
3. Alok: Spoofing Attacks DHCP Server Spoofing - 24355 - The Cisco Learning Net-
work, June 2014. https://learningnetwork.cisco.com/people/alokbharadwaj
4. Rescorla, E.: RFC 2818 - HTTP Over TLS (May 2000). https://tools.ietf.org/
html/rfc2818
5. Chen, Q.A., Osterweil, E., Thomas, M., Mao, Z.M.: MitM Attack by Name Colli-
sion: Cause Analysis and Vulnerability Assessment in the New gTLD Era (2016)
6. ICANN — Archives — Top-Level Domains (gTLDs). http://archive.icann.org/en/
tlds/
7. Delegated Strings — ICANN New gTLDs. https://newgtlds.icann.org/en/
program-status/delegated-strings
204 K.R. Atul and K.P. Jevitha
8. Jia, Y., Chen, Y., Dong, X., Saxena, P., Mao, J., Liang, Z.: Man-in-the- browser-
cache: persisting HTTPS attacks via browser cache poisoning. Comput. Secur. 55,
62–80 (2015)
9. Sugavanesh, B., Hari Prasath, R., Selvakumar, S.: SHS-HTTPS enforcer: enforcing
HTTPS and preventing MITM attacks. ACM SIGSOFT Softw. Eng. Notes 38(6),
1–4 (2013)
10. Mangut, H.A., Al-Nemrat, A., Benzad, C., Tawil, A.R.H.: ARP cache poi-
soning mitigation and forensics investigation. In: 2015 IEEE on Trust-
com/BigDataSE/ISPA, vol. 1, pp. 1392–1397. IEEE, August 2015
11. Li, D., Liu, C., Cui, X., Cui, X.: Sniffing and propagating malwares through WPAD
deception in LANs. In: Proceedings of the 2013 ACM SIGSAC Conference on
Computer and Communications Security, pp. 1437–1440. ACM, November 2013
12. Droms, R.: Dynamic Host Configuration Protocol (1997). https://www.ietf.org/
rfc/rfc2131.txt
13. Introduction to WPAD — FindProxyForURL. http://findproxyforurl.com/
wpad-introduction/
A Quasigroup Based Synchronous Stream
Cipher for Lightweight Applications
1 Introduction
Cryptography is the art of converting readable data into an unintelligible form, so
that it can be read and processed only by those for whom it is intended.
Cryptosystems can be divided into public key and symmetric key cryptosystems.
Public key systems use a public key for encryption and a private key for
decryption. Symmetric cryptographic algorithms can be either stream cipher or block
cipher based. In stream cipher based systems, the message is processed bit by bit,
allowing higher throughput; in block ciphers, the message is processed in blocks of
fixed length. Symmetric key cryptosystems make use of a single key for both
encryption and decryption.
The increased need for security and space reduction has led to the development of
many lightweight algorithms that provide adequate security within the space
constraints. The block cipher PRESENT [1], the stream ciphers Grain [2] and
Trivium, and the hash function Hash-One [3] are some among the many examples.
Almost all available crypto primitives make use of the properties of algebraic
structures like groups, rings and fields, but recent research shows that
non-algebraic structures can also provide the required security. Quasigroups are
one of the non-algebraic structures that can be used for the design of
cryptographic systems [4]. Quasigroups can be used for the development of
encryption schemes
with less computational requirements since only simple lookup operations are
required.
There are many cryptosystems based on quasigroups. The lack of algebraic properties
makes these systems resistant to various attacks. They can be used for the
construction of S-boxes [5], block ciphers [6], stream ciphers [7], pseudo-random
number generators [8], hash functions [9], etc. Many of these systems are proven to
be resistant to algebraic and structural attacks. The security of these
cryptosystems relies on the quasigroups used in the construction. Edon80 [7] is a
stream cipher that makes use of four quasigroups of order 4 in its construction; it
managed to find a place among the phase 3 eSTREAM candidates. Also, Edon-R [9], a
hash function based on quasigroups, was a candidate in round two of the NIST hash
function competition. There are plenty of S-box constructions based on quasigroups:
a quasigroup of order 2^n is itself a 2n → n S-box and can be transformed into a
2n → 2n S-box by linear transformations.
The rest of the paper is organized as follows. Section 2 deals with the general
description of quasigroups and their properties. Section 3 contains the proposed
design, Sect. 4 presents the design rationale of the proposed scheme, and Sect. 5
contains the security analysis.
2 Quasigroups
A quasigroup (Q, ∗) is a groupoid in which, for all a, b ∈ Q, there exist unique
x, y ∈ Q such that a ∗ x = b and y ∗ a = b. When Q contains a finite number of
elements, the main body of the Cayley table of the quasigroup forms a Latin square,
whose rows and columns are permutations of the elements of Q.
∗ 1 2 3 4
1 1 4 2 3
2 2 3 1 4
3 3 2 4 1
4 4 1 3 2
Different authors use quasigroups with different properties in their constructions,
but quasigroups suitable for cryptographic use generally have little structure,
such as shapeless quasigroups.
For an n-ary quasigroup operation α with inverse operations αi, the following
identities hold, where x_1^n denotes the sequence x1, x2, . . . , xn:
α(x_1^{i−1}, αi(x_1^n), x_{i+1}^n) = xi
αi(x_1^{i−1}, α(x_1^n), x_{i+1}^n) = xi, for i = 1, 2, . . . , n.
4 Design Rationale
The building blocks of the proposed design are carefully chosen and connected in
such a way that the overall cipher resists known generic attacks on stream ciphers
and produces random-looking output.
4.1 Quasigroups
Usually quasigroups are constructed using Latin squares, that will be the main
body of the multiplication table. But this method is applicable only during the
construction of small order quasigroups. In practical applications quasigroups of
higher order such as 216 , 264 .... are needed. They can be constructed using the
extended Feistel network mechanism [10]. The quasigroups generated using this
method will be shapeless, which is a desirable property for the construction of
crypto primitives.
Let (G, ⊕) be an abelian group, let f : G → G be a permutation and let
A, B, C ∈ G be constants. The extended Feistel network F_{A,B,C} : G^2 → G^2,
defined for every (l, r) ∈ G^2 [10], induces a quasigroup operation on G^2:
X ∗_{F_{A,B,C}} Y = F_{A,B,C}(X ⊕ Y) ⊕ Y
4.2 S-box
The proposed design uses a 256 × 256 quasigroup. The permutation function
in the extended feistel network mechanism has been replaced with an S-box in
the design to increase the non-linearity of the cipher. The S-box used in the
construction is a 4-bit S-box. The S-box is constructed using quasigroups using
linear transformations [5]. Table 2 shows the S-box used for the construction.
The above S-box has the following cryptographic properties:
– Differential uniformity = 4
– Balanced
– Robustness = 0.75
– Non-linearity = 4
x    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
S(x) : 13  9 15 12 11  5  7  6  3  8 14  2  0  1  4 10
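As an illustration of Sects. 4.1 and 4.2, the sketch below builds an order-256 quasigroup operation from the 4-bit S-box of Table 2. It assumes the Mileva–Markovski form of the extended Feistel network, F_{A,B,C}(l, r) = (r ⊕ A, l ⊕ B ⊕ f(r ⊕ C)) over G = Z_2^4, with the illustrative constants A = B = C = 0; the actual constants and composition used in the cipher are not reproduced here.

```python
# S-box of Table 2 (a permutation of Z_2^4) playing the role of the permutation f.
SBOX = [13, 9, 15, 12, 11, 5, 7, 6, 3, 8, 14, 2, 0, 1, 4, 10]
A, B, C = 0, 0, 0                    # illustrative constants in G = Z_2^4

def extended_feistel(l, r):
    """Assumed F_{A,B,C}: G^2 -> G^2 with XOR as the group operation."""
    return r ^ A, l ^ B ^ SBOX[r ^ C]

def qmul(x, y):
    """Quasigroup operation on 8-bit symbols: X * Y = F_{A,B,C}(X xor Y) xor Y."""
    z = x ^ y
    l, r = extended_feistel(z >> 4, z & 0xF)
    return ((l << 4) | r) ^ y

# Sanity check: every row and every column of the 256 x 256 Cayley table is a
# permutation, i.e. the table is a Latin square and (Z_2^8, qmul) is a quasigroup.
assert all(len({qmul(x, y) for y in range(256)}) == 256 for x in range(256))
assert all(len({qmul(x, y) for x in range(256)}) == 256 for y in range(256))
print("order-256 quasigroup constructed from the 4-bit S-box")
```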
In the initialization phase the output k1 is generated from the four-symbol key
frame [a1, a2, a3, a4] and the three-symbol IV frame [b1, b2, b3]. The output k2 is
generated from the next key and IV frames [a2, a3, a4, a5] and [b2, b3, b4].
Continuing in this way, the remaining symbols are generated:
k13 by key and IV frames [a13 , a14 , a15 , a16 ], [b13 , b14 , b15 ]
k14 by key and IV frames [a14 , a15 , a16 , b1 ], [b14 , b15 , b16 ]
k15 by key and IV frames [a15 , a16 , b1 , k14 ], [b15 , b16 , k14 ]
k16 by key and IV frames [a16 , b1 , k14 , k15 ], [b16 , k14 , k15 ]
k17 by key and IV frames [b1 , k14 , k15 , k16 ], [k14 , k15 , k16 ]
k18 by key and IV frames [k14 , k15 , k16 , k17 ], [k15 , k16 , k17 ]
One can note here that the symbols ki, i = 18, 19, . . . do not explicitly depend on
any of the key symbols [a1, a2, . . . , a16], which motivates using these symbols as
keystream symbols and generating the ciphertext as described in Sect. 3.2. The
number of rounds in the initialization phase (18 here) can be increased to make the
scheme more secure at the cost of efficiency.
5 Security Analysis
The cipher is subjected to various statistical and structural tests for analyz-
ing the security. NIST-STS test suite [13] is used to find the randomness of
the keystream. The structural test [14] involved key/keystream correlation test,
IV/keystream correlation test, frame correlation test and diffusion test.
The NIST Statistical Test Suite (NIST-STS) can be used to evaluate the amount of
randomness introduced in the system. It contains 15 standard tests. The NIST-STS
package gives a p-value and a Success/Failure status for each particular test [13].
Table 3 shows the parameters chosen for the NIST-STS tests.
Upon completion of each test, a p-value is obtained which lies between 0 and 1
(both included). The p-values for the various tests performed on the design are
shown in Table 4 and clearly indicate that the design shows no deviation from
random behavior. Conventionally, a p-value > 0.01 is accepted as a success, while a
p-value < 0.01 is considered a failure.
It should be noted that the NIST statistical tests were not originally designed to
test the security of stream ciphers, but rather to evaluate the randomness
properties of finite sequences, so they do not consider the internal structure or
the key and IV loading phases of a cipher. To address this, the cipher is also
subjected to the following structural tests.
The NIST tests check the randomness of the generated sequence, while the structural
tests examine the correlation between the key/IV and the generated keystream. Four
structural tests were performed: the key/keystream correlation test, the
IV/keystream correlation test, the frame correlation test and the diffusion test.
The key/keystream correlation test finds the correlation between the key and the
generated keystream for a fixed IV. The IV/keystream correlation test measures the
correlation between the IV and the keystream when the key is fixed. The frame
correlation test considers the correlation between keystreams generated for
consecutive values of the IV. The diffusion test measures the diffusion of the IV
and key bits within the keystream. Evaluations are done based on the Chi-Square
Goodness of Fit test.
IV/Keystream Correlation. Sets of 128-bit random IVs are generated. For each of
these random IVs the corresponding 128-bit keystream is generated with a fixed
128-bit key. The keystreams are then XORed with their corresponding IVs and the
weights of the results are found. These weights are divided into the following 5
classes: 0–58, 59–62, 63–65, 66–69, 70–128. The frequencies are compared against
their expected values and the Chi-Square Goodness of Fit test is performed. A high
or low correlation between the IV and the keystream could help an attacker generate
the keystream without knowledge of the key.
Frame Correlation. The objective of this test is to find the correlation between
frames generated with similar IVs [14]. In this test, for a random IV and key of
length 128 bits, a keystream of length L = 256 bits is generated. This procedure is
repeated N = 2^10 times with incremented values of the IV. The generated keystreams
are used to construct a 2^10 × 256 matrix. The column weights of this matrix are
calculated and classified into 5 intervals: 0–498, 499–507, 508–516, 517–525, and
526–1024. The frequency of each class is compared against the expected frequency
and the Chi-Square Goodness of Fit test is applied.
Diffusion Test. The diffusion property is satisfied if and only if each key and IV
bit has an effect on the keystream [14,15]; any change in the key or IV bits should
produce random-looking changes within the keystream. A random 128-bit key and IV
are chosen for the test and a keystream of length 256 is generated using them. By
changing each bit of the key and IV, new keystreams are generated and XORed with
the original keystream. With these vectors, a 256 × 256 matrix is generated, and
this procedure is repeated 1024 times. The resulting matrices are added over the
real numbers and the values are classified into the following intervals: 0–498,
499–507, 508–516, 517–525, and 526–1024. The obtained results are evaluated against
the expected values and the Chi-Square Goodness of Fit test is performed to obtain
the p-value.
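The evaluation step shared by the frame correlation and diffusion tests can be sketched as follows; the binomial model for the column weights and the use of random bits as a stand-in for the cipher output are assumptions of this illustration, not part of the cipher.

```python
import numpy as np
from scipy.stats import binom, chisquare

N, L = 1024, 256                       # rows (keystreams) and columns (bits)
classes = [(0, 498), (499, 507), (508, 516), (517, 525), (526, 1024)]

def class_counts(weights):
    """Count how many column weights fall into each class interval."""
    return np.array([np.sum((weights >= lo) & (weights <= hi)) for lo, hi in classes])

# Observed: column weights of the matrix of generated keystreams.
# Random bits stand in here for the cipher output (illustration only).
keystreams = np.random.randint(0, 2, size=(N, L))
observed = class_counts(keystreams.sum(axis=0))

# Expected: L columns distributed over the classes according to Binomial(N, 1/2),
# which is the ideal behaviour of an unbiased keystream.
probs = np.array([binom.cdf(hi, N, 0.5) - binom.cdf(lo - 1, N, 0.5)
                  for lo, hi in classes])
expected = probs * L

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.3f}, p-value = {p_value:.4f}")   # success if p > 0.01
```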
5.3 Results
NIST-STS. For the analysis, 100 keystream samples are generated, each having 10^6
bits. The keystream samples are written to a single file and analyzed. The results
for each test are summarized in Table 4. It can be seen from the table that the
p-value for the generated keystream is greater than 0.01 for every test; thus the
generated sequence can be considered a random sequence.
Structural Tests. The result of the chi-square test is used to generate the
p-value. The p-value lies between 0 and 1, and a value > 0.01 is considered a
success. Failure of the key/keystream test would result in revising the key
initialization phase, while failure of the other tests would result in revising the
IV loading phase. In the analysis, 100 such p-values are generated for each test,
and the average p-value for each test is given in Table 5.
Test p-value
Frequency 0.1835
Block frequency 0.3538
Cumulative sums 0.3289
Approximate entropy 0.8681
Serial 0.3633
Linear complexity 0.5317
Runs 0.0497
Longest run 0.3116
Rank 0.6898
FFT 0.8615
Non overlapping template 0.9781
Overlapping template 0.4358
Universal 0.9783
Test p-value
Key/Keystream correlation 0.8708
IV/Keystream correlation 0.9572
Frame correlation 0.7029
Diffusion test 0.8351
It can be seen that the p-value is greater than the threshold and thus the
proposed scheme passes all the structural tests.
6 Conclusion
In this paper, we propose a synchronous stream cipher based on quasigroups which
appears to be well suited for lightweight applications. The randomness of the
cipher output is analyzed using the NIST Statistical Test Suite and various
structural tests. The cipher is believed to be resistant to the known generic
attacks on stream ciphers.
References
1. Bogdanov, A., Knudsen, L.R., Leander, G., Paar, C., Poschmann, A., Robshaw,
M.J.B., Seurin, Y., Vikkelsoe, C.: PRESENT: an ultra-lightweight block cipher.
In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466.
Springer, Heidelberg (2007). doi:10.1007/978-3-540-74735-2_31
2. Hell, M., Johansson, T., Maximov, A., Meier, W.: A stream cipher proposal: grain-
128. In: 2006 IEEE International Symposium on Information Theory, pp. 1614–
1618. IEEE (2006)
3. Mukundan, P.M., Manayankath, S., Srinivasan, C., Sethumadhavan, M.: Hash-one:
a lightweight cryptographic hash function. IET Inf. Secur. 10(5), 225–231 (2016)
4. Markovski, S.: Design of crypto primitives based on quasigroups. Quasigroups
Related Syst. 23(1), 41–90 (2015)
5. Mihajloska, H., Gligoroski, D.: Construction of optimal 4-bit s-boxes by quasi-
groups of order 4. In: The Sixth International Conference on Emerging Security
Information, Systems and Technologies, SECURWARE (2012)
6. Battey, M., Parakh, A.: An efficient quasigroup block cipher. Wireless Pers. Com-
mun. 73(1), 63–76 (2013)
7. Gligoroski, D., Markovski, S., Knapskog, S.J.: The stream cipher Edon80. In:
Robshaw, M., Billet, O. (eds.) New Stream Cipher Designs. LNCS, vol. 4986,
pp. 152–169. Springer, Heidelberg (2008). doi:10.1007/978-3-540-68351-3_12
8. Battey, M., Parakh, A., Mahoney, W.: A new quasigroup based random number
generator. In: Proceedings of the International Conference on Security and Man-
agement (SAM), p. 1. The Steering Committee of the World Congress in Computer
Science, Computer Engineering and Applied Computing (WorldComp) (2013)
9. Gligoroski, D., Markovski, S., Kocarev, L.: Edon-R, an infinite family of crypto-
graphic hash functions. IJ Netw. Secur. 8(3), 293–300 (2009)
10. Mileva, A., Markovski, S.: Shapeless quasigroups derived by feistel orthomor-
phisms. Glasnik matematički 47(2), 333–349 (2012)
11. Petrescu, A.: Applications of quasigroups in cryptography. In: Proceedings of Inter-
Eng (2007)
12. Chakrabarti, S., Pal, S.K., Gangopadhyay, S.: An improved 3-quasigroup based
encryption scheme. In: ICT Innovations 2012, p. 173 (2012). Web Proceedings
ISSN 1857-7288
13. Rukhin, A., Soto, J., Nechvatal, J., Smid, M., Barker, E.: A statistical test suite
for random and pseudorandom number generators for cryptographic applications.
Technical report, Booz-Allen and Hamilton Inc Mclean Va (2001)
14. Turan, M.S., Doganaksoy, A., Calık, C.: Statistical analysis of synchronous stream
ciphers. In: Stream Ciphers Revisited, SASC 2006 (2006)
15. Srinivasan, C., Lakshmy, K.V., Sethumadhavan, M.: Measuring diffusion in stream
ciphers using statistical testing methods. Defence Sci. J. 62(1), 6 (2012)
Security Analysis of Key Management Schemes
Based on Chinese Remainder Theorem Under
Strong Active Outsider Adversary Model
1 Introduction
Our Contribution
Given the more realistic active outsider adversarial model, key management schemes
should fundamentally be secure against an active adversary. Chinese remainder
theorem (CRT) based key management schemes were designed to reduce the rekey
message size (communication cost) and the computation cost at the user level.
– In this work, we prove that the CRT-based schemes are not secure against an
active outsider adversary. In particular, we analyze the scheme by Zheng et al.
[15], the scheme by Zhou et al. [16,17] and the scheme by Joshi et al. [7]. There
is no specific reason for choosing these schemes, as all the CRT-based key
management schemes use the concept of the secure lock [4].
– We explain why the schemes are insecure and how to make them secure against an
active adversary. We conclude that the cost required to make the CRT-based schemes
secure is equivalent to the cost of a new group creation, which is exorbitant as it
requires n secure channels for n users on every rekeying event.
Zheng et al. [15] have proposed a CRT-based scheme. We show that the scheme is
insecure against an active outsider adversary.
Let the group users be U = {u1, u2, . . . , un}. A random private key ki for user ui
is chosen by the key server from a collection of pairwise relatively prime
integers, so gcd(ki, kj) = 1 for i ≠ j, 1 ≤ i, j ≤ n. The key server randomly
chooses a group key K and establishes the system of congruences X ≡ li mod ki for
1 ≤ i ≤ n, where li is the value of K ⊕ ki; the unique solution S of this system
is broadcast to the group.
When a new user un+1 joins the group, the key server randomly chooses a key kn+1
from the set of relatively prime integers such that gcd(ki, kj) = 1 for i ≠ j,
1 ≤ i, j ≤ n + 1. Also, a new group key must be chosen in order to maintain
backward secrecy. The key server chooses K′ as the new group key and computes the
unique solution S′ to the following system of congruences:
X ≡ l1 mod k1, X ≡ l2 mod k2, . . . , X ≡ ln mod kn, X ≡ ln+1 mod kn+1
where li is the value of K′ ⊕ ki for i = 1, . . . , n + 1. Then the key server
broadcasts S′. All the users, including the newly joining user un+1, can get the
key K′ by computing S′ mod ki, 1 ≤ i ≤ n + 1, which gives li, and then computing
li ⊕ ki, which gives K′.
Now we consider the scenario of a user leaving the group. The key server should
then change the group key to ensure forward secrecy, so that the leaving user will
not be able to access future communications of the group. Suppose user u2 wants to
leave the group. A new group key K″ is chosen by the key server, which computes the
unique solution S″ to the following system of congruences:
X ≡ l1 mod k1, X ≡ l3 mod k3, . . . , X ≡ ln mod kn, X ≡ ln+1 mod kn+1
where li is the value of K″ ⊕ ki. Each user ui, i = 1, 3, . . . , n + 1, can get the
key K″ by computing S″ mod ki, which gives li, and then computing li ⊕ ki, which
gives K″. The leaving user u2 cannot obtain the group key K″, since the equation
X ≡ l2 mod k2 is excluded from the system of congruences used to compute the CRT
solution.
The above solution using the CRT satisfies the forward and backward secrecy
requirements under the passive adversarial attack model.
Key server chooses the keys of the nodes such that all the keys are pairwise
relatively prime. A user tree with the private key assigned to each node is called
a key tree. Figure 1 shows the key tree for 8 = 23 users. The private key set of
user ui , 1 ≤ i ≤ n is all the keys on the path from leaf corresponding to ui to the
root. These keys are given to users ui securely by the key server. For example,
private key set of u1 is {k7 ,k3 ,k1 ,k0 } where, k1 is the private key of user u1 .
Suppose, multicast has to be done to users u1 , u2 , u3 , . . . , u8 . Then, the key
server broadcasts the identity 0 of the root node. All the users will check whether
the key corresponding to identity 0 is present. In this example, all the users have
the key k0 in their private set. So, they communicate using k0 .
Suppose multicast is to be done among the users u1, . . . , u5 and u8. There is no
common key among the private key sets of u1, . . . , u5 and u8, i.e., these users do
not have any key in common. So the key server chooses a random group key K, sets up
the congruences X ≡ li mod ki with li being the value of K ⊕ ki, and obtains the
unique solution S, which it broadcasts. Each of these users ui can get K by
computing S mod ki to get li and then li ⊕ ki. When u6 joins this multicast group,
the key server similarly chooses a new group key K′, computes the solution S′ of
the congruences with li = K′ ⊕ ki for i = 1, . . . , 6, 8, and broadcasts S′; each
user obtains K′ and erases K.
Suppose user u1 now leaves the group. The group key K′ should be changed. A new
group key K″ is chosen by the key server. Further, the server computes the solution
S″ to the following system of congruences:
X ≡ l2 mod k2, X ≡ l3 mod k3, X ≡ l4 mod k4,
X ≡ l5 mod k5, X ≡ l6 mod k6, X ≡ l8 mod k8
where li is the value of K″ ⊕ ki for i = 2, . . . , 6, 8. The key server broadcasts
S″. Each of the users u2, u3, u4, u5, u6 and u8 can obtain K″ by computing
S″ mod ki to get li and then computing li ⊕ ki to get K″. Each user erases K′.
Consider the current group with users u2, . . . , u6 and u8 and current group key
K″. This state of the group is the result of u6 joining the initial multicast group
of users u1, . . . , u5, u8 and of user u1 leaving the group u1, . . . , u6, u8. An
active adversary A can compromise any legitimate user among u2–u5 and u8. Also, the
adversary A has access to S, S′ and S″. Suppose A compromises the user u5. Now, u5
holds only k5 and K″, so the adversary can access all communications encrypted
using K″.
We show that the adversary A can also access the communications encrypted with K′
and K. Since A has access to k5, A can get K′ by computing l5 ≡ S′ mod k5 and
l5 ⊕ k5. Likewise, since A has access to k5, A can get K by computing l5 ≡ S mod k5
and l5 ⊕ k5. So, by compromising a legitimate user of the group, the adversary A
not only accesses communications encrypted with the current group key K″ but also
past communications of the group encrypted with K′ and K.
A CRT-based scalable key transport protocol is proposed by Joshi et al. [7].
Suppose the users u1, u2, . . . , un want to communicate secretly. The key server
generates n + 1 key pairs (ki, mi) for 0 ≤ i ≤ n such that the mi are pairwise
relatively prime, i.e., gcd(mi, mj) = 1 for all i ≠ j. The pair (k0, m0) is not
given to any user. The key pair (ki, mi) is securely communicated to user ui. Let Z
be the data that the server wants to communicate securely. The key server chooses a
secret key S to encrypt Z for the group.
The server broadcasts ES(Z) together with the CRT combination
X = (∑i Si · Yi · Ci) mod M, where
Si = Eki(S),  M = ∏i mi,  Yi = M/mi,  Ci = Yi^{-1} mod mi.
After two rekeying events with new group secrets S′ and then S″, the keys held by
user ui are ki, mi and S″. Suppose that all the users have erased the keys S and S′
after the rekeying events.
The adversary A has access to ES(Z), X, ES′(Z′), X′, and ES″(Z″), X″. Suppose A
compromises user u2. By compromising u2, A gets k2, m2 and S″, and so is able to
obtain all the communications encrypted with S″. We show that A also gets access to
past communications; note that this happens even after u2 has erased S and S′.
Using k2, A can compute X′ mod m2 to get S′2 and then compute Dk2(S′2) to get S′.
So all communications encrypted with S′, such as ES′(Z′), are accessed by A; the
same holds for S and ES(Z) using X.
All the CRT-based schemes are based on the generic “Secure Lock” concept [4]. In
this section, we describe the generic CRT-based group key management scheme,
establish that it is not secure against an active adversary, and then show how to
make it secure against such an adversary.
Let U = {u1, u2, u3, . . . , un} be the set of users who want to communicate
securely. A group key should be shared among the users in U for secure
communication.
– The key server generates a key ki and securely sends ki to ui, for 1 ≤ i ≤ n.
ki is the secret key of user ui shared with the key server.
– The key server chooses the group key K and sets up the system of congruences
X ≡ l1 mod k1, X ≡ l2 mod k2, . . . , X ≡ ln mod kn, where li = Eki(K) for
1 ≤ i ≤ n and E is a symmetric encryption algorithm.
– The key server computes the unique solution of the set of congruences,
S = (∑_{i=1}^{n} li · Mi · Ni) mod N,
where N = ∏_{i=1}^{n} ki, Mi = N/ki and Ni = Mi^{-1} mod ki, and broadcasts S.
– Suppose a new user un+1 joins the group. A new group key K′ is chosen by the key
server. The key server generates kn+1 such that gcd(ki, kj) = 1 ∀i, j ∈ [1, n + 1]
with i ≠ j, securely gives kn+1 to user un+1, sets up the new system of congruences
X ≡ li mod ki, 1 ≤ i ≤ n + 1, such that li = Eki(K′), computes the solution S′ of
this system and broadcasts S′.
– Each user ui, 1 ≤ i ≤ n + 1, can get K′ by computing S′ mod ki to get li and then
computing Dki(li) to get K′. This ensures backward secrecy.
– Suppose a user ui, 1 ≤ i ≤ n + 1, wants to leave the group.
– W.l.o.g. let u1 be the user leaving the group. Then the key server should
generate a new group key K″ to be shared with the users uj, j ∈ [2, n + 1]. The key
server sets up the following system of congruences:
X ≡ l2 mod k2, X ≡ l3 mod k3, . . . , X ≡ ln+1 mod kn+1
where lj = Ekj(K″) for j ∈ [2, n + 1], computes the solution S″ of these equations
and broadcasts S″. The users uj, except u1, can obtain the new group key K″ by
computing S″ mod kj to get lj and then computing Dkj(lj) to get K″. All the users
except u1 can now communicate securely using K″. Note that user u1 cannot compute
K″ as the key server excluded X ≡ l1 mod k1 from the system of congruences. This
ensures forward secrecy.
A compromised user, say u2, will still store k2. Note that S and S′ were obtained
by encrypting K and K′ with k2, respectively, so A will be able to obtain these
keys. To make them inaccessible to A, S and S′ should not have been computed using
k2; in general, the same private key of u2 should not be reused across rekeying
events.
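A small numeric sketch of the generic secure-lock rekeying and of this observation is given below; the toy multiplicative cipher E, the prime moduli and the group-key values are illustrative assumptions only, standing in for a real symmetric cipher and realistically sized keys.

```python
from math import prod

A = 7                                   # public constant of the toy cipher

def E(k, K): return (K * A) % k         # l_i = E_{k_i}(K); invertible since gcd(A, k_i) = 1
def D(k, l): return (l * pow(A, -1, k)) % k

def crt_lock(user_keys, group_key):
    """Secure lock: S = (sum of l_i * M_i * N_i) mod N, the CRT combination."""
    N = prod(user_keys)
    S = 0
    for k in user_keys:
        M = N // k                      # M_i = N / k_i
        S += E(k, group_key) * M * pow(M, -1, k)   # N_i = M_i^{-1} mod k_i
    return S % N

def unlock(S, k):
    """Member step: S mod k_i gives l_i, then D_{k_i}(l_i) gives the group key."""
    return D(k, S % k)

keys = [101, 103, 107, 109, 113, 127]   # pairwise coprime private keys k_1..k_6
K, K1, K2 = 37, 53, 71                  # group keys K, K', K'' across two rekeying events
S  = crt_lock(keys,     K)              # initial group {u_1, ..., u_6}
S1 = crt_lock(keys,     K1)             # after a join event (the k_i are reused)
S2 = crt_lock(keys[1:], K2)             # after u_1 leaves (k_1 excluded)

k2 = keys[1]                            # adversary compromises u_2: learns k_2 and K''
assert unlock(S2, k2) == K2             # current group key, as expected
assert unlock(S1, k2) == K1 and unlock(S, k2) == K   # ...but the past keys leak too
print("past group keys K and K' recovered from S and S' using only k2")
```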
Now, we give a secure version of the above scheme and comment on its performance
and security. Consider a group of users u1, u2, . . . , un with current group key K,
and suppose un+1 joins the group. The key server should then choose a new set of
keys k′1, k′2, . . . , k′n and kn+1 such that gcd(k′i, k′j) = 1 for i ≠ j,
i, j ∈ [1, n], and gcd(k′i, kn+1) = 1 for i ∈ [1, n], and obtain the solution S′ to
the following set of equations:
X ≡ l1 mod k′1, X ≡ l2 mod k′2, . . . , X ≡ ln mod k′n, X ≡ ln+1 mod kn+1
where li = Ek′i(K′) and K′ is the new group key. The key kn+1 chosen by the key
server is securely given to un+1. To obtain K′, each user ui must hold the new key
k′i, for i ∈ [1, n]. The only way the key server can communicate the new keys k′i
to the users ui, i ∈ [1, n], is by sending them securely to each ui, say by using a
public key cryptosystem. This requires n encryptions by the key server; each user
needs to do one decryption to get k′i, then uses k′i to obtain K′ and erases the
old key ki.
When a user leaves, the key server should follow the same process, excluding the
leaving user. So the cost of rekeying increases drastically, which reduces the
performance of the key management scheme. However, the adversary will not be able
to access past group keys, as all the keys are changed on every rekeying event.
Other schemes proposed for group key management based on the Chinese remainder
theorem [5,11] are more efficient. However, none of them is secure against an
active outsider adversary.
6 Conclusion
References
1. Aparna, R., Amberker, B.B.: A key management scheme for secure group commu-
nication using binomial key trees. Int. J. Netw. Manag. 20(6), 383–418 (2010)
2. Burton, D.: Elementary number theory (2011). https://books.google.co.in/books?
id=3KiUCgAAQBAJ
3. Chen, Y.R., Tygar, J.D., Tzeng, W.G.: Secure group key management using uni-
directional proxy re-encryption schemes. In: INFOCOM, pp. 1952–1960. IEEE
(2011)
4. Chiou, G.H., Chen, W.T.: Secure broadcasting using the secure lock. IEEE Trans.
Software Eng. 15(8), 929–934 (1989)
5. Guo, C., Chang, C.C.: An authenticated group key distribution protocol based on
the generalized chinese remainder theorem. Int. J. Commun. Syst. 27(1), 126–134
(2014)
6. Jho, N.-S., Hwang, J.Y., Cheon, J.H., Kim, M.-H., Lee, D.H., Yoo, E.S.: One-way
chain based broadcast encryption schemes. In: Cramer, R. (ed.) EUROCRYPT
2005. LNCS, vol. 3494, pp. 559–574. Springer, Heidelberg (2005). https://doi.org/
10.1007/11426639_33
7. Joshi, M.Y., Bichkar, R.S.: Scalable key transport protocol using chinese remainder
theorem. In: Thampi, S.M., Atrey, P.K., Fan, C.-I., Perez, G.M. (eds.) SSCC 2013.
CCIS, vol. 377, pp. 397–402. Springer, Heidelberg (2013). https://doi.org/10.1007/
978-3-642-40576-1_39
8. Naor, D., Naor, M., Lotspiech, J.: Revocation and tracing schemes for stateless
receivers. In: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, pp. 41–62. Springer,
Heidelberg (2001). https://doi.org/10.1007/3-540-44647-8_3
9. Rafaeli, S., Hutchison, D.: A survey of key management for secure group commu-
nication. ACM Comput. Surv. 35(3), 309–329 (2003)
10. Sherman, A.T., McGrew, D.A.: Key establishment in large dynamic groups using
one-way function trees. IEEE Trans. Software Eng. 29(5), 444–458 (2003)
11. Vijayakumar, P., Bose, S., Kannan, A.: Chinese remainder theorem based cen-
tralised group key management for secure multicast communication. IET Inf.
Secur. 8(3), 179–187 (2014)
12. Wong, C.K., Gouda, M., Lam, S.S.: Secure group communications using key graphs.
IEEE/ACM Trans. Networking 8(1), 16–30 (2000)
13. Xu, S.: On the security of group communication schemes based on symmetric key
cryptosystems. In: Proceedings of the 3rd ACM Workshop on Security of Ad Hoc
and Sensor Networks, New York, USA, pp. 22–31 (2005)
14. Xu, S.: On the security of group communication schemes. J. Comput. Secur. 15(1),
129–169 (2007)
15. Zheng, X., Huang, C.T., Matthews, M.: Chinese remainder theorem based group
key management. In: Proceedings of the 45th Annual Southeast Regional Confer-
ence, ACM-SE 45, pp. 266–271. ACM, New York (2007)
16. Zhou, J., Ou, Y.: Key tree and Chinese remainder theorem based group key
distribution scheme. In: Hua, A., Chang, S.-L. (eds.) ICA3PP 2009. LNCS,
vol. 5574, pp. 254–265. Springer, Heidelberg (2009). https://doi.org/10.1007/
978-3-642-03095-6_26
17. Zhou, J., Ou, Y.: Key tree and Chinese remainder theorem based group key distri-
bution scheme. J. Chin. Inst. Eng. 32(7), 967–974 (2009)
18. Zou, X., Dai, Y.S., Bertino, E.: A practical and flexible key management mechanism
for trusted collaborative computing. In: INFOCOM, pp. 538–546. IEEE (2008)
Deep Learning for Network Flow Analysis
and Malware Classification
1 Introduction
The scale and density of network traffic have been growing rapidly through the
years. The protocols, designed largely around the TCP/IP model established in the
early days of the Internet, lack the features required for such traffic analysis.
Most protocol classification systems today depend mainly on parameters such as port
numbers, static headers, IP addresses, etc. But as new protocols, which are being
designed every day, do not follow the rule of port registration, the situation is
worsening for traffic analysers and network administrators [12].
In the case of network applications, the traditional way to classify them using
meta traffic information was based on limited behavioral properties used to define
heuristic features. These features again include port numbers, transmission rate
and frequency, application and protocol header information, etc. [18]. With the
advent of mobile and web applications, this scenario is at its worst. Along with
this, administrators also face issues like tunneling, random port usage, proxies
and encryption that make detection and classification almost impossible [16].
Similarly, traditional classification of malware is done mainly with heuristic and
behavioral signatures that struggle to keep up with malware evolution. A malware
signature is an algorithm or hash that uniquely identifies a specific virus. It has
been shown that all viruses in a family share common behaviour, so a single generic
signature can be created for them. However, malware authors always try to confuse
antivirus software by writing polymorphic and metamorphic malware that constantly
changes known signatures and thus fools the system. To avoid such evasive behavior,
a flow and code feature based analysis, or data-driven analysis, is necessary for
network applications, protocols and malware. Behavioral signatures can be mimicked,
copied, changed or tampered with, but data signatures are abstract and cannot be
manipulated that easily [4].
In 2015, Microsoft hosted a competition on Kaggle with the goal of classifying
malware into their respective families based on their content and characteristics.
Microsoft provided a set of malware samples representing 9 different malware
families. Each malware sample had an ID, a 20-character hash value that uniquely
identifies the sample, and a class, an integer label representing the malware
family to which the sample belongs: (1) Ramnit, (2) Lollipop, (3) Kelihos ver3,
(4) Vundo, (5) Simda, (6) Tracur, (7) Kelihos ver1, (8) Obfuscator.ACY, (9) Gatak
[7]. The dataset includes files containing the hexadecimal representation of the
malware's executable machine code; each file is composed of byte count, address,
record type, data and checksum fields.
Together, the three tasks (protocol classification, application classification and
malware classification) can be defined as multi-class classification problems,
which makes them machine learnable. Selecting and processing the right features
from a mass of unintelligible data is a near impossible task, which makes the above
problems an ideal case for applying deep learning [4].
Deep learning is a new, disruptive machine learning strategy in which feature
extraction is done by the machine itself from the given data for the best possible
classification. These features, though not necessarily orthogonal, significantly
enhance the accuracy of classification or regression compared to hand-crafted
features [8,14]. Supervised learning algorithms include logistic regression, the
multilayer perceptron, deep convolutional networks, etc. Semi- or unsupervised
learning includes stacked auto encoders, restricted Boltzmann machines (RBMs), deep
belief networks (DBNs), etc. [13,15]. We approach the above problems with a
convolutional neural network (CNN) combined with auto encoders and tune the network
performance.
A Convolutional Neural Network (CNN) is a form of feed-forward neural network in
which the connectivity between neurons resembles the structure of the animal visual
cortex, whose individual neurons are organized so that they respond to overlapping
regions tiling the visual field [1,6]. CNNs are composed of three types of layers:
convolutional, pooling and fully connected. A CNN can treat any data as an image,
and this characteristic allows users to encode certain properties into the
architecture. A CNN convolves several small filters over the input image,
subsamples the resulting space of filter activations and repeats these steps until
enough high-level features are left. It then applies a standard feed-forward neural
network to the resulting features [2].
The other main method used for feature extraction is the auto encoder. Auto
encoders are made to generate a set of features that can be reverse transformed to
yield back the original input; this is called bidirectional training. The network
has the same input and output, from which it back-propagates and learns [6]. In
essence, these can be used for the kind of kernel-type feature mapping normally
applied to non-linear or non-separable data in traditional machine learning.
In order to collect packet data, Wireshark and Tshark are used. Wireshark [10] is
open source software for analyzing network packets. Only HTTP, SSL and SMTP
protocol packets are selected from the entire collection of captured packets. Since
classification with the entire payload is not computationally easy, only the
metadata or packet attributes are taken for the experiment; the metadata contain
partial information about the payload. The collected data packets are converted
into comma-separated values, which act as the input to the deep learning
architecture. The data packets are trimmed to a uniform-length vector of 1024 bytes
and the data is converted to decimal format, so it can be easily fed to a network
programmatically [12].
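A minimal sketch of this preprocessing step is shown below; the output file name, the example byte strings and the label scheme are illustrative assumptions.

```python
import csv

VECTOR_LEN = 1024

def to_fixed_vector(raw, length=VECTOR_LEN):
    """Trim or zero-pad the packet bytes and return them as decimal integers."""
    trimmed = list(raw[:length])
    return trimmed + [0] * (length - len(trimmed))

# Stand-ins for captured packet metadata as (label, raw attribute bytes).
samples = [(0, b"\x16\x03\x01\x00\xa5\x01\x00\x00\xa1"),       # e.g. SSL handshake bytes
           (1, b"GET /index.html HTTP/1.1\r\nHost: example")]  # e.g. HTTP header bytes

with open("packets.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    for label, raw in samples:
        writer.writerow([label] + to_fixed_vector(raw))        # 1 label + 1024 values
```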
Table 1. CNN architecture used in the Classification Processes for Specific Tasks
The entire payloads were fed into an auto encoder and features were extracted. The
architecture of the auto encoder used in the experiment is shown in Fig. 1. The
packet attributes of the three protocols are given to the designed auto encoder.
The network has an input of length 1024 and a 512-node middle layer; tanh is used
as the activation function in all three layers. The loss function for training was
taken as the root mean square error between the outputs of the final nodes and the
inputs. The network is trained using batched stochastic gradient descent, which is
faster than updating individually after each data point. The middle-layer samples
are then fed into the CNN, which is further trained with the data labels as the
ground truth.
Feature extraction with the auto encoder is computationally heavy in the training
stage but is a one-time process. Once the network has been trained, features can be
extracted from data frames with simple computations such as matrix multiplication.
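A minimal PyTorch sketch of the 1024–512–1024 auto encoder described above (tanh activations, squared reconstruction error, batched SGD) is given below; the batch size, learning rate, number of epochs and random stand-in data are illustrative and not the paper's settings.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=1024, hid_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.Tanh())
        self.decoder = nn.Sequential(nn.Linear(hid_dim, in_dim), nn.Tanh())

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()                      # squared error; its root is what is monitored

# Stand-in for the preprocessed packet vectors, scaled to [-1, 1] to suit tanh.
packets = torch.rand(10_000, 1024) * 2 - 1
loader = DataLoader(TensorDataset(packets), batch_size=128, shuffle=True)

for epoch in range(5):                        # a few epochs, for illustration only
    for (batch,) in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch), batch) # reconstruct the input itself
        loss.backward()
        optimizer.step()

# The 512-dimensional middle-layer activations are the features fed to the CNN.
features = model.encoder(packets[:10])
print(features.shape)                         # torch.Size([10, 512])
```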
The only difference between the architectures of the auto encoders used for
application classification and protocol classification is the input vector size,
which is 2048 for the application classifier. The number of nodes in the hidden
layer is 512, as in the above case. The samples from the middle layer were taken as
the features for classification and fed into the CNN.
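Since Table 1 with the exact CNN architecture is not reproduced here, the following PyTorch sketch only illustrates a small 1-D CNN consuming the 512-dimensional auto encoder features, using the dropout of 0.9 and the learning rate of 0.0001 discussed in the next section; the layer sizes and filter counts are assumptions.

```python
import torch
import torch.nn as nn

class FlowCNN(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Dropout(p=0.9),
            nn.Linear(32 * 128, 64), nn.ReLU(), nn.Linear(64, num_classes),
        )

    def forward(self, x):                     # x: (batch, 512) feature vectors
        return self.classifier(self.conv(x.unsqueeze(1)))

model = FlowCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

features = torch.randn(8, 512)                # stand-in for auto encoder outputs
labels = torch.randint(0, 3, (8,))
loss = loss_fn(model(features), labels)       # one illustrative training step
loss.backward()
optimizer.step()
print(float(loss))
```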
The preprocessed samples of metadata for the three protocols are given to the CNN.
Out of 75,000 samples, 52,500 packets are used for training and the remaining
22,500 for testing. The results show classification on the test data: 83.78%
accuracy is obtained for 2000 iterations. The same data set is given to the auto
encoder with a softmax layer for classification; however, this gave lower accuracy,
with a best measure of 75.57% for 1700 iterations.
The data containing payloads from three different network applications is given to
the CNN described in Table 1. Different parameters of the CNN were changed and the
changes in accuracy were observed.
In the experiment stage, we tried three different learning rates. Initially it was
fixed at 0.01 and the accuracy was 20.12% for 1000 epochs. Then we changed the
learning rate to 0.001 and obtained an accuracy of 45.20% for the same number of
iterations. So we decreased the learning rate again to 0.0001 and obtained a high
accuracy of 84.26%. We then fixed the learning rate at 0.0001 and varied the
dropout values. Since the training takes much time here, the number of epochs was
fixed at 200. The dropout value was changed gradually from 0.1 to 0.9 and we could
observe an evident increase in accuracy: the accuracy for 0.1 dropout is 62.50% and
for 0.9 it is 91.90%. We therefore fixed the dropout at 0.9 for our architecture.
After choosing the values for learning rate and dropout, we changed the number of
epochs from 200 to 2000 and observed the results, which are plotted in Fig. 2. For
2000 epochs the accuracy obtained was 95.50%. The class-wise accuracies for three
different numbers of epochs are given in Fig. 3. All these results were obtained by
classification using only the CNN. The next type of classification combines an auto
encoder with the existing CNN: the data points were fed directly into the auto
encoder and the extracted features were piped to the CNN. The results obtained with
these two methods are compared to the existing results [17] in Table 2.
The accuracy obtained with the feature-extracted CNN is high compared to the other
two methods, giving a class-wise accuracy of 100% for both the browser and chat
classes (Table 2).
Fig. 2. Training epochs vs Accuracy graph with a learning rate of 0.0001 and dropout
of 0.9
Class Existing results(%) [17] Accuracy for CNN(%) CNN + Auto encoder(%)
Browser 88.60 97.97 100
Chat 99.80 100 100
Torrent 98.70 94.16 98.90
Overall 95.70 97.37 99.63
Fig. 5. Training epochs vs Accuracy graph for malware classification with learning rate
of 0.001 and dropout of 0.9. Average Testing Accuracy is 94.91%
4 Conclusion
Deep-learned features are abstract in nature and, unlike traditional hand-crafted
features, cannot be attributed to any specific measure of the data entities, such
as network traffic and malware, which generate huge amounts of data. From the
results of our experiments we can conclude that in each case the data carries more
information than what is humanly visible, such as basic statistical behaviour, port
associations, header information and format. The obvious benefits are that these
features are also invisible to the attacker, and data fingerprints like these
cannot be manipulated. For malware we have used the binary executable code, and for
the network traffic we used the transmitted payload inside each packet. We
speculate that, since code profiles seldom change even for the trickiest
polymorphic malware, a static pre-trained model that can be integrated with the
firmware will suffice. The case of network traffic is very similar: newer and
proprietary protocols/applications are mostly derived from existing ones and can be
caught since their flow signatures will remain more or less static. We also
observed that, among the three applications, torrent gives the lowest accuracy,
which we speculate is due to the tunnelling and encrypting behaviour of torrent
transmissions. Even under these conditions, we believe they can be identified
accurately in real time if more data is used to train the network.
5 Future Work
Presently the classification was done only using a CNN and an auto encoder. In
future, it can be further extended to RNNs and LSTMs, as the transmitted data might
have auto-correlations or sequential behaviour. The experiment was done by
collecting packets over a small network, which can be expanded to larger ones. More
applications can be tried, especially ones with proprietary protocols.
Classification can also be performed on botnets to identify infections in real
time. The malware classification has a lot of room for improvement, and more data
can be channelled toward this goal. The main use of this malware model is within
disassemblers or firmware profilers, which can see the actual code passed for
execution; any code suspected to be malicious can be filtered, or at least
quarantined, prior to its actual execution. In the same way, a network traffic
filter can be set up on bridges and routers based on learned models trained on
malicious or congestion-causing traffic to do selective load shedding.
References
1. Convolutional neural network. https://en.wikipedia.org/wiki/Convolutional_neural_network. Accessed 10 May 2017
2. Deep learning. https://en.wikipedia.org/wiki/Deep_learning. Accessed 29 Nov 2016
3. Tcpdump. http://www.tcpdump.org/tcpdump_man.html. Accessed 27 Apr 2017
4. Anjali, T., Menon, V.K., Soman, K.P.: Network application identification using
deep learning. In: 6th IEEE International Conference on Communication and Sig-
nal Processing (2017, accepted)
5. Drew, J., Moore, T., Hahsler, M.: Polymorphic malware detection using sequence
classification methods, pp. 81–87 (2016)
6. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444
(2015)
7. Microsoft: Kaggle malware data. https://www.kaggle.com/c/malware-
classification/data. Accessed 11 May 2017
8. Nagananthini, C., Yogameena, B.: Crowd disaster avoidance system (CDAS) by
deep learning using extended center symmetric local binary pattern (XCS-LBP)
texture features. In: Raman, B., Kumar, S., Roy, P.P., Sen, D. (eds.) Proceedings
of International Conference on Computer Vision and Image Processing. AISC, vol.
459, pp. 487–498. Springer, Singapore (2017). doi:10.1007/978-981-10-2104-6_44
9. Nataraj, L., Karthikeyan, S., Jacob, G., Manjunath, B.S.: Malware images: visu-
alization and automatic classification. In: Proceedings of the 8th International
Symposium on Visualization for Cyber Security, VizSec 2011, pp. 4:1–4:7. ACM
(2011). http://doi.acm.org/10.1145/2016904.2016908
10. Orebaugh, A., Ramirez, G., Beale, J.: Wireshark & ethereal network protocol ana-
lyzer toolkit (2006)
11. Athira, S., Mohan, R., Poornachandran, P., Soman, K.P.: Automatic modulation
classification using convolutional neural network. IJCTA 9(16), 7733–7742 (2016)
12. Rahul, R.K., Menon, V.K., Soman, K.P.: Network protocol classification using deep
learning. In: 6th IEEE International Conference on Communication and Signal
Processing (2017, accepted)
13. Soman, K., Diwakar, S., Ajay, V.: Data Mining: Theory and Practice [WITH CD].
PHI Learning Pvt. Ltd., Delhi (2006)
14. Soman, K., Loganathan, R., Ajay, V.: Machine learning with SVM and other kernel
methods. PHI Learning Pvt. Ltd., Delhi (2009)
15. Team, T.D.: Deep learning tutorials. http://deeplearning.net/tutorial/. Accessed
29 Nov 2016
16. Tongaonkar, A., Keralapura, R., Nucci, A.: Challenges in network application iden-
tification (2012)
17. Wang, Z.: The applications of deep learning on traffic identification. BlackHat USA
(2015)
18. Zander, S., Nguyen, T., Armitage, G.: Automated traffic classification and appli-
cation identification using machine learning, pp. 250–257 (2005)
Kernel Modification APT Attack Detection
in Android
Abstract. Android is one of the most secure and widely used operating
systems for the mobile platform. Most Android devices support rooting and
the installation of custom ROMs and kernels. This feature makes Android
devices vulnerable to kernel-modification advanced persistent threat (APT)
attacks, which cannot be detected using existing tools and methods.
This paper presents the implementation details of a kernel-modification
APT attack performed on an Android device and proposes a new method
for detecting it. The proposed system uses control flow analysis
of the kernel binary code for detecting the APT: the control flow graph of
the genuine kernel is compared with the control flow graph of the device
kernel, and the APT is detected based on signatures.
1 Introduction
OS. This type of threat has a very high impact and is very hard to identify.
After a successful attack, all built-in security features of Android will work in
favor of the attacker and will prevent the detection of the attack by any security
software running on the device.
Nowadays, buying mobile phones from online shopping sites is very common.
Many users are ready to buy from any seller offering mobiles at a lower price,
without checking the seller's authenticity. This behavior of Android customers opens a
great opportunity for an attacker to carry out an APT attack over a large number
of users. The attacker can easily register on any of the shopping sites and
sell products that contain a malicious kernel. If an attacker modifies the
kernel and implements the attack in the Android kernel, then the lifetime of the
attack is almost equal to the lifetime of the device, and no security software running
on the phone can detect this kind of attack. Implementing such an APT
attack is itself challenging. This paper presents the steps for implementing a
kernel-modification APT attack and the implementation details of the APT attack
performed on the GT S-6102 kernel.
We propose a new method for detecting the kernel-modification APT attack
on Android devices. The proposed system uses control flow analysis of the kernel
binary code: the control flow graph of the genuine kernel is compared with the
control flow graph of the device kernel, and the APT is detected based on
signatures generated for the detection mechanism.
The rest of the paper is organized as follows. Section 2 presents a brief survey
of existing work. Section 3 discusses the implementation steps of the
kernel-modification APT attack on Android. Section 4 presents the proposed
system for kernel-modification APT attack detection. Section 5 discusses the
implementation and results. Section 6 presents conclusions and future work.
2 Literature Survey
We were not able to find any major research work related to kernel-modification
APT attack implementation or APT detection in the literature, but many
works related to binary-code analysis and Android malware detection are
available.
Levine et al. [1] presented a framework to detect and classify rootkits and
discussed a methodology for determining whether a system has been infected with a
kernel-level rootkit. Using their tool, once an infection is established, administrators
can create new signatures for kernel-level rootkits in order to detect them. Although
they used a cyclical redundancy check (CRC) checksum for faster and less
memory-intensive comparison of file contents, this comparison only tells that a current
program file differs from its original. The above approach can be
extended for detection of APTs in Android. You and Noh [2] proposed an Android-platform-based Linux kernel rootkit. They discussed rootkits that exploit the Android
kernel by taking advantage of LKM (loadable kernel module) and /dev/kmem
device-access technology, and the danger such rootkit attacks would bring.
Some of these methods can be used for implementing APT components.
Isohara et al. [3] proposed a system for Android malware detection that performs
kernel-based behavioral analysis. Liu et al. [4] discussed different methods and tools
for analyzing binary code; among these, static binary-code analysis tools can be used
in the proposed system for analyzing the binary code. Bergeron et al. [5] proposed a
technique for static analysis of binary code that addresses the problem of static slicing
on binary executables for the purpose of malicious code detection in COTS
components. Rubanov et al. [6] proposed a system for runtime verification of Linux
kernel modules that uses call interception for verifying the kernel. They developed a
framework called KEDR for kernel verification, but their framework cannot be used
for analyzing the entire system; it can only analyze a single function. The KEDR tool
can be used for collecting control flow information.
Many of the existing methods for binary code analysis can be extended to
analyze the Android kernel binary code. Static binary analysis tools can be
used to create a control flow graph of the entire kernel, and the generated control
flow graphs can be used for APT detection.
modified kernel is cross-compiled using the ARM toolchain. After compilation,
the compressed kernel image (zImage) is generated. Steps 3–6 are performed
to create boot.img from the zImage and the ramdisk image: the original ramdisk
image is separated from the genuine boot.img and combined with the modified
zImage. In the seventh step, the device is booted into bootloader/recovery/downloading
mode, and in the last step the newly created boot.img is copied to the device and
flashed into it using different tools.
4 Detection of APT
Kernel-modification APT attacks can be efficiently detected using control flow
analysis of the kernel binary code. The control flow analysis compares the control
flow graph of the genuine kernel with the control flow graph of the kernel extracted
from the device and detects the APT based on signatures generated for the detection
mechanism. This section presents the proposed mechanism for detecting the
kernel-modification APT attack in Android using control flow analysis of the kernel
binary code, together with the implementation details of the proposed system.
Notation        Meaning
B               Genuine kernel boot image (boot.img)
B′              boot.img from the device
V               vmlinux for the genuine kernel B
Z               Genuine kernel zImage
Z′              zImage extracted from the device kernel boot.img
I               Uncompressed kernel image for the genuine kernel
I′              Uncompressed kernel image for the device kernel
H               Hash code calculated from I
H′              Hash code calculated from I′
X               Hex code for the genuine kernel
X′              Hex code for the kernel extracted from the device
M               Function mapping file for the genuine kernel
M′              Function mapping file for the kernel extracted from the device
L               Device driver function call details
G′              Function call graph of the kernel extracted from the device
G               Function call graph of the genuine kernel
S               Signature file
A1              Two-dimensional array storing nodes from G′ and the corresponding matching nodes in G
A2              Array storing the non-matching nodes of G′
A3              Two-dimensional array storing nodes from G′ and the corresponding matching nodes in G found during edge matching
R               For storing results and a detailed log during APT detection
Max             Maximum number of nodes to be searched
threshold       Amount of similarity required to consider two nodes the same
edge threshold  Amount of similarity in edges required to consider two nodes the same
Graph matching for APT detection (Algorithm 2): In the function
call graph, each node represents a function and each edge represents a function
call. The following algorithm compares two function call graphs and also
checks the similarity of the functions.
The graph generated from the kernel image contains addresses of functions
as nodes instead of function names; due to this, the graph matching
is not straightforward. The function call graph may also contain multiple similar
trees, which makes the graph matching even more difficult. For these reasons,
the matching algorithm should check the function length, the content of the
function, and the number of function calls initiated from the function when
finding the matching function for each node. In the first step the algorithm
creates the hex code (X′) for the kernel image (I′); it then performs Matching,
AdvancedMatching, and EdgeMatching for comparing the graphs G and G′.
Algorithm for matching two graphs (Algorithm 3): The Matching algorithm
finds, for each node in graph G′, the matching node in graph G (both nodes can
then be considered similar). It compares the nodes based on the length of the
function, the number of edges, and the code difference. The attacker can insert,
remove, and modify functions in the kernel; due to this there may be a slight
change in the order of functions in the binary code, while the remaining order
will be similar in both the genuine kernel image and the kernel image extracted
from the device. The algorithm uses this property to improve accuracy and speed.
It checks nodes based on the order of their existence, keeping the indexes of nodes
in i and j (i is used for storing the index of nodes in graph G′ and j is used for
storing the index of nodes in graph G).
Algorithm 2: GraphMatching
Input: Function call graph G′, kernel image I′
Data: S(G, H, I, M, L, X), array A1, array A2, array A3, R
Result: modified array A1 containing nodes and corresponding matching
nodes, modified array A2 containing non-matching nodes, modified
array A3 containing nodes and corresponding matching nodes found
during AdvancedMatching, modified R
1 Generate hex code X′ from I′
2 Matching(G′, I′, X′, M′) // creates the matching list A1 and the non-matching list A2
3 AdvancedMatching(G′, I′, X′, M′)
4 EdgeMatching(G′, I′, X′, M′)
5 return
Algorithm 3: Matching
Data: S(G, H, I, M, L, X), array A1, array A2, maximum number of nodes to
be searched Max, code distance threshold value threshold
Input: G′, I′, X′, M′
Result: modified array A1 containing nodes and corresponding matching
nodes, modified array A2 containing non-matching nodes
1 i ← 0 // i represents the node number in graph G′
2 j ← 0 // j represents the node number in graph G
3 while i < number of nodes in graph G′ do
4   if i is not visited then
5     if j < total number of nodes in graph G and CompareNodes(G′, i, j, X′, M′) = 1 then
6       /* node i of G′ and node j of G are matching */
        DFSMatching(i, j, G′, M′, X′);
7     else
8       Compare each node j1 of graph G having |j − j1| ≤ Max/2 with the
        node i of G′ in ascending order of |j − j1|
9       if node j1 got matched with the node i then
10        DFSMatching(i, j1, G′, M′, X′)
11        j ← j1 + 1
12  i ← i + 1
13 GenerateNonMatchingList(G′)
Algorithm for comparing two nodes (Algorithm 4): This algorithm compares
node i of function call graph G′ with node j of function call graph G.
Steps 1 to 4 calculate the function lengths and the number of function calls for
nodes i and j. Step 5 calculates the hex-code difference between the functions
corresponding to nodes i and j. Step 6 checks whether the function lengths and
the number of function calls are the same and whether the code difference is
less than threshold. If all the conditions in step 6 are satisfied, then the nodes
Algorithm 4: CompareNodes
Input: Function call graphs G′, i, j, X′, M′
Data: S(G, H, I, M, L, X), array A1, code distance threshold value threshold
Result: modified array A1 containing nodes and corresponding matching
nodes, modified array A2 containing non-matching nodes
1 l1 ← function length of node i in graph G′ from M′
2 l2 ← function length of node j in graph G from M
3 e1 ← number of edges from node i of graph G′
4 e2 ← number of edges from node j of graph G
5 c1 ← hex-code difference between node i of G′ and node j of G using X′ and X
6 if l1 = l2 and e1 = e2 and c1 ≤ threshold then
7   Add node i of G′ and node j of G to the matching list A1 with the hex-code
    difference c1
8   return 1
9 return 0
i and j are added to the matching list A1 and the algorithm returns 1. If the
conditions are not satisfied, the algorithm returns 0.
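A minimal Python sketch of this node comparison follows; the node attributes (function length, out-degree, raw function bytes) and the byte-difference measure are illustrative assumptions, since the paper does not fix the exact distance used.

```python
# Sketch of CompareNodes: two functions are treated as the same node when their
# lengths and out-degrees match and their byte-level difference stays under a threshold.
from dataclasses import dataclass

@dataclass
class FuncNode:
    length: int          # function length taken from the mapping file M / M'
    out_degree: int      # number of function calls made by this function
    code: bytes          # raw bytes of the function, cut from the hex dump X / X'

def code_difference(a: bytes, b: bytes) -> int:
    """Count differing byte positions (a simple stand-in for the hex-code difference)."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

def compare_nodes(dev: FuncNode, gen: FuncNode, threshold: int) -> bool:
    """Return True when the device-kernel node matches the genuine-kernel node."""
    if dev.length != gen.length or dev.out_degree != gen.out_degree:
        return False
    return code_difference(dev.code, gen.code) <= threshold
```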
DFSMatching algorithm (Algorithm 5): If two functions are the same, then the
function calls made by those functions are also the same. Using this property, the
DFS matching algorithm finds matching functions among the nodes adjacent to an
already matched pair. This is done to improve both the speed and the accuracy of
node matching.
Algorithm 5: DFSMatching
Data: S(G, H, I, M, L, X), array A1, array A2, code distance threshold value
threshold
Input: i, j, G′, M′, X′
Result: modified array A1 containing nodes and corresponding matching
nodes, modified array A2 containing non-matching nodes
1 Mark node i as visited
2 for each node m adjacent to node i in graph G′ do
3   find the similar adjacent node n in graph G
4   if m is not visited and CompareNodes(G′, m, n, X′, M′) = 1 then
5     DFSMatching(m, n, G′, M′, X′)
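The rough sketch below shows how the windowed scan of Algorithm 3 and the DFS propagation of Algorithm 5 could fit together; the graph representation (adjacency lists of node indices), the comparison callback, and the window size are illustrative assumptions rather than the paper's exact implementation.

```python
# Sketch of Matching + DFSMatching over two call graphs given as adjacency lists.
# nodes_dev / nodes_gen: per-index node descriptors; adj_dev / adj_gen: index -> list
# of called indices; compare(dev_node, gen_node) -> bool plays the role of CompareNodes.

def match_graphs(nodes_dev, adj_dev, nodes_gen, adj_gen, compare, max_window):
    matches = {}                  # device node index -> genuine node index (array A1)
    visited = set()

    def dfs_match(i, j):
        visited.add(i)
        matches[i] = j
        # Matching functions should make matching calls, so pair callees in order.
        for m, n in zip(adj_dev.get(i, []), adj_gen.get(j, [])):
            if m not in visited and compare(nodes_dev[m], nodes_gen[n]):
                dfs_match(m, n)

    j = 0
    for i in range(len(nodes_dev)):
        if i in visited:
            j += 1
            continue
        if j < len(nodes_gen) and compare(nodes_dev[i], nodes_gen[j]):
            dfs_match(i, j)
        else:
            # Scan candidates around j, nearest first, up to max_window/2 away.
            for d in range(1, max_window // 2 + 1):
                for j1 in (j - d, j + d):
                    if 0 <= j1 < len(nodes_gen) and compare(nodes_dev[i], nodes_gen[j1]):
                        dfs_match(i, j1)
                        j = j1
                        break
                else:
                    continue
                break
        j += 1
    non_matching = [i for i in range(len(nodes_dev)) if i not in matches]   # array A2
    return matches, non_matching
```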
Algorithm 6: GenerateNonMatchingList
Input: Function call graph G′
Data: array A1, array A2
Result: modified array A2 containing non-matching nodes
1 i ← 0
2 while i < number of nodes in graph G′ do
3   if node i of graph G′ ∉ A1 then
4     add node i to A2
5 return
Algorithm 7: AdvancedMatching
Data: S(G, H, I, M, L, X), array A1, array A2, code distance threshold value
threshold
Input: G′, I′, X′, M′
Result: modified array A1 containing nodes and corresponding matching
nodes, modified array A2 containing non-matching nodes
1 for each node i of G′ ∈ A2 do
2   if i < total number of nodes in graph G then
3     j ← i
4   else
5     j ← total number of nodes in graph G
6   if CompareNodes(G′, i, j, X′, M′) ≠ 1 then
7     Compare each node j1 of the graph G with the node i of graph G′ in
      ascending order of |j − j1|
8     if CompareNodes(G′, i, j1, X′, M′) = 1 then
9       Matching node found
10 A2 ← ∅
11 GenerateNonMatchingList(G′)
Algorithm 8: EdgeMatching
Data: S(G, H, I, M, L, X), array A1, array A2, code-distance threshold value
threshold, edge threshold, array A3
Input: G′, I′, X′, M′
Result: modified array A3 containing nodes and corresponding matching nodes
1 for each node i of G′ ∈ A2 do
2   l1 ← length of function i in G′ from M′
3   e ← number of edges starting from node i in graph G′
4   if l1 > 100 and e > 0 then
5     for each node k adjacent to i in graph G′ do
6       if a matching node for k exists in A1 then
7         n ← matching node for k from A1
8         Name of node k ← name of node n
having the same number of edges for node matching. This algorithm compares
the nodes based on the edge difference and the code difference, and it can be
used to identify matching nodes for nodes that have a small number of
function-call changes.
Algorithm for signature matching (Algorithm 9): The signature matching
algorithm checks the following things. Primarily, it checks whether all functions
in each device driver have been properly identified; if a function is not identified,
some modification to the device driver must be present. Secondly, in device drivers
some functions such as ioctl, open, and close are only called from user space and
are never called from inside the kernel, so in the second step the algorithm checks
whether any such unexpected function calls are present. One property of almost all
Android APT attacks is that they use two or more device drivers (at least one input
component and one output component of the device) to implement the attack; for
example, capturing images automatically using the camera cannot be considered an
APT attack until the data is sent to the attacker over the network. Due to this, the
algorithm checks whether any function calls more than one device driver (input and
output) in a function call tree identified as modified. Android device components
can be classified as pure input devices, pure output devices, and devices having both
input and output facilities, such as communication components and the file system.
The algorithm should check the following things in each modified tree.
Algorithm 9: SignatureMatching
Data: S(G, H, I, M, L, X), array A1, array A2, A3, R
Result: modified R
1 flag ← 0
2 for each function i ∈ L do
3   if node i ∉ A1 or node i ∉ A3 then
4     /* Check the existence of node i in graph G by checking the matching-node
        list A1 and the edge-matching list A3 */
5     Add details of node i to R
19 if flag = 1 then
20   return 1
21 else
22   return 0
1. Whether there are function calls to pure input device drivers and pure output
device drivers in the modified tree.
2. Whether there are function calls to a pure input device driver and an input-output
device driver, for example camera and network.
3. Whether there are function calls to more than one input-output device, for
example file operations and network operations.
In step 1 the algorithm initializes a flag to 0; this flag is used to indicate the
existence of an APT. In steps 2 to 6, each function i in the device-driver function
details list L is searched for in the arrays A1 and A3; if the function i is found in
neither A1 nor A3, the missing function's details are added to R. Steps 8 to 23 are
performed for each node j of graph G′ in A2: for each node j, all subtrees having
node j in graph G′ are generated and steps 9 to 19 are carried out. For each tree,
steps 10 to 12 check whether any function belonging to L is called; if so, the
function call is considered an unexpected
function call and its details are added to R. After checking unexpected function
calls, the algorithm starts checking for the APT. For detecting the APT, the
algorithm checks for the existence of function calls to both an input and an output
device driver in each modified tree by performing steps 13 to 18. If such calls are
present, the APT is detected, the flag is set to 1, and the details are added to R.
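A small sketch of the core signature check follows: walking a modified call tree and testing whether it reaches both an input-type and an output-type driver. The driver categorization sets, the function names, and the tree representation are hypothetical and only illustrate the idea.

```python
# Sketch of the APT signature check on one modified call tree.
# tree: adjacency list (function name -> list of called function names).
INPUT_DRIVERS = {"camera_read", "mic_capture", "touch_read"}     # hypothetical names
OUTPUT_DRIVERS = {"net_send", "fs_write", "usb_write"}           # hypothetical names

def reachable(tree, root):
    """Collect every function reachable from root in the modified call tree."""
    seen, stack = set(), [root]
    while stack:
        f = stack.pop()
        if f in seen:
            continue
        seen.add(f)
        stack.extend(tree.get(f, []))
    return seen

def is_apt_signature(tree, root):
    """Flag the tree when it touches at least one input driver and one output driver."""
    called = reachable(tree, root)
    return bool(called & INPUT_DRIVERS) and bool(called & OUTPUT_DRIVERS)

# Example: a modified function that grabs camera frames and ships them over the network.
tree = {"suspicious_fn": ["camera_read", "helper"], "helper": ["net_send"]}
print(is_apt_signature(tree, "suspicious_fn"))   # True -> APT signature detected
```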
source site. The source code was modified to include the required functionality
and then compiled using the ARM compiler. The ramdisk image was extracted from
the genuine boot.img using the umkbootimg tool, and a new boot.img was created by
combining the ramdisk image and the zImage using the mkbootimg tool [7].
The boot.img was then flashed onto the device using a kernel flasher tool. There
are two different implementations, one for image capturing and the other for video
capturing.
Image capturing attack. This attack captures pictures using the camera at
frequent intervals. The attack performs the entire capturing procedure
automatically and stores the captured YCbCr image on the SD card. The captured
image has a resolution of 640 × 480 and is converted into JPEG format after it is
transferred off the device. The image captured by the camera APT attack is given
in Fig. 1.
Video capturing attack. This attack makes the kernel capture 30 s of video
every 3 min. The captured video has a resolution of 320 × 240 in YCbCr format.
After capturing each frame, the program automatically stores it on the SD card.
The raw video is then transferred to a computer and converted into MP4 format.
Figure 2 shows one frame of the video captured by the attack.
References
1. Levine, J., Grizzard, J.B., Owen, H.L.: Detecting and categorizing kernel-level rootk-
its to aid future detection. IEEE Secur. Priv. 4(1), 24–32 (2006)
2. You, D.H., Noh, B.N.: Android platform based linux kernel rootkit. In: 2011 6th
International Conference on Malicious and Unwanted Software (MALWARE), pp.
79–87. IEEE (2011)
3. Isohara, T., Takemori, K., Kubota, A.: Kernel-based behavior analysis for android
malware detection. In: 2011 Seventh International Conference on Computational
Intelligence and Security (CIS), pp. 1011–1015. IEEE (2011)
4. Liu, K., Tan, H.B.K., Chen, X.: Binary code analysis. Computer 46(8), 60–68 (2013)
5. Bergeron, J., Debbabi, M., Erhioui, M.M., Ktari, B.: Static analysis of binary code
to isolate malicious behaviors. In: Proceedings of the IEEE 8th International Work-
shops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WET
ICE 1999), pp. 184–189. IEEE (1999)
6. Rubanov, V.V., Shatokhin, E.A.: Runtime verification of linux kernel modules based
on call interception. In: 2011 IEEE Fourth International Conference on Software
Testing, Verification and Validation (ICST), pp. 180–189. IEEE (2011)
7. Xda developers: compiled mkbootimg and unpack/repack linux scripts for
boot.img. http://forum.xda-developers.com/nexus-s/development/hack-compiled-
mkbootimg-unpack-repack-t891333 (2016). Accessed 01 June 2016
Opaque Predicate Detection by Static Analysis
of Binary Executables
1 Introduction
Among the numerous methods employed to make reverse engineering hard,
opaque predicates belong to a special class of approaches, because they can
seamlessly be integrated into a program along with any of the numerous other
obfuscation methods available, such as virtualization and packing. It is therefore
crucial to understand how opaque predicates work and to formulate an efficient
method to detect their presence in a given program. Although there are
multiple methods [5,7] currently available for the detection of opaque predicates,
new methods are being discovered for generating new classes of opaque
predicates [9], leveraging the limitations of previous work on opaque predicate
detection. In this paper we address the limitations of LOOP [7] that stem from
its underlying use of dynamic analysis.
Opaque predicates are a special sub-class of predicates that are constant
expressions: they always evaluate to only true or only false, depending on the way
they are constructed. These predicates are termed "opaque" because their
behavior cannot easily be determined by an analyst [4].
The presence of opaque predicates is relatively trivial to spot during manual
analysis by a human reverse engineer: by running or debugging the program, one
can recognize patterns and identify opaque predicates by noticing that some
branches are never executed. However, this takes a lot of effort and manpower for
a single obfuscated program, so an automated approach to opaque predicate
detection is crucial. The existence of opaque predicates effectively cripples all
naive automated analysis of binary programs [2], because they are used to insert
huge chunks of junk code that never get executed. Unless an analyzer can identify
whether a branch ever gets executed, it will end up analyzing large amounts of
junk code that has no functional effect on the program.
Opaque predicates also have a very detrimental effect on decompilers, because
the decompiled output would contain too much irrelevant junk code, making it
hard to gain any meaningful information from it.
Opaque predicates are extensively used by malicious programs to evade
signature-based detection by anti-virus software. Polymorphic malware generally
encrypts itself to evade static signature-based detection, but it is vulnerable to
detection in memory, since the code must be decrypted before execution. So,
polymorphic malware authors employ opaque predicates in their code so that the
generated signature is different [5] even when the same code is loaded into memory.
2 Background
Opaque predicates are boolean expressions that always evaluate to true or always
to false. They are either tautologies, which evaluate to true irrespective of the
inputs, or contradictions, which evaluate to false irrespective of the inputs.
Opaque predicates can also be called constant boolean expressions.
These boolean expressions are used to construct bogus conditional statements
such as if-else structures, switch-case statements, or loops. In a more general sense,
opaque predicates introduce fake conditional jumps into the program. Each of these
conditional jumps results in a branching of the execution. Since an opaque predicate
makes sure that only one branch is ever taken, the other branch can be used to
introduce junk code into the program that is never executed during the course of
program execution. This results in an increased code size.
∀x ∈ Z : (x × (x + 1))³ ≡ 0 (mod 2)    (1)
A simple example of an opaque predicate is shown in Eq. (1). Irrespective of
the value of x, x × (x + 1) always results in an even number, and the cube of
an even number is again even. An opaque predicate can be constructed by checking
whether the result of this expression is odd or even. The presence of these opaque
predicates can be further obfuscated by additional transformations or even by
another opaque predicate.
The previous work, LOOP [7], employs a similar concept of using SMT solvers
to detect opaque predicates. We propose a number of improvements over the idea
introduced by LOOP. Its entire opaque predicate detection system works by
symbolic execution of the obfuscated program, which involves dynamic analysis.
Although LOOP is a very powerful tool, it carries the same shortcomings as
dynamic analysis [9]: the architecture of LOOP has the inherent limitation that the
presence of opaque predicates is checked only along the execution path of the
program. So, depending on the values of the inputs (disk, network, stdin), the
detection rate may vary [1], and there is no direct way of increasing the code
coverage. We address this particular issue by using static analysis to iterate over
the conditions instead of dynamic analysis.
3 Proposed Solution
We detect the presence of opaque predicates with the help of a Satisfiability
Modulo Theories (SMT) solver. In our case, a boolean satisfiability solver is enough
to classify whether a particular boolean expression is an opaque predicate. There
are a number of solvers available; we use the STP solver [8] (Simple Theorem
Prover) for our purposes.
A satisfiability solver maintains a stack of boolean expressions given as input.
The system of equations is then checked to determine whether it has a solution [6].
If there is at least one solution that satisfies the system of equations, the system is
modelled and the solution is returned; otherwise, the system of equations is
considered unsatisfiable.
We propose a two-phase approach for the detection of opaque predicates
through static analysis of the obfuscated binaries. The two phases are:
– Phase 1:
• A system which extracts all the predicates present in the program, from
all the control flow structures such as if-else, switch-case, and loops, as
mathematical expressions.
– Phase 2:
• A decision engine which accepts a predicate as input and returns true if
it is opaque.
• It returns false if it is not an opaque predicate.
We start at the location of the conditional jump and keep stepping backwards
through the code until we reach the declaration statements of all the symbols
contributing to the expression. As we backtrace, at each step backwards we check
whether the expanded expression is made up of temporary registers or whether all
the values are loaded from memory.
The aforementioned assumption works because whatever input is received,
whether from the network or from disk, is stored on the stack or the heap. So, any
reference to a memory location can be counted as a variable declaration in the
high-level language. The 'isDeclaration' function detects whether any of the
operands in the instruction currently under consideration references a memory
location; this can be checked easily by cross-checking the operands against the
known list of general-purpose registers. Otherwise, the IL instruction assignment at
that particular location is substituted into the master expression. This process
continues until the loop terminates, at which point the reconstructed predicate
expression is returned and passed on to the next phase to check whether it is an
opaque predicate or not.
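To make the backtracing step concrete, here is a toy sketch over a simplified three-address IL. It is purely illustrative and does not use any real disassembler API: assignments are substituted backwards into the branch condition until only memory operands (the stand-ins for variable declarations) and constants remain.

```python
# Toy backtrace over a simplified IL: each instruction is (dest, expr); memory operands
# such as "[rbp-8]" play the role of variable declarations and stop the substitution.
import re

il = [
    ("eax", "[rbp-8]"),          # register loaded from memory: a "declaration"
    ("ebx", "eax * (eax + 1)"),
    ("ecx", "ebx % 2"),
    ("cond", "ecx == 0"),        # predicate guarding the conditional jump
]

def reconstruct(il, target="cond"):
    defs = dict(il)
    expr = defs[target]
    while True:
        # registers still present in the expression that have an IL definition
        regs = [r for r in re.findall(r"\b[a-z]{3}\b", expr) if r in defs and r != target]
        if not regs:
            return expr          # only memory operands and constants are left
        for r in set(regs):
            expr = re.sub(rf"\b{r}\b", f"({defs[r]})", expr)

print(reconstruct(il))  # ((([rbp-8]) * (([rbp-8]) + 1)) % 2) == 0
```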
The predicate expressions generated in the previous phase are checked to
determine whether they are opaque predicates. We use STP (Simple Theorem
Prover) [6] to model our opaque predicate decision engine.
STP: STP, or Simple Theorem Prover, is a type of SAT solver that offers a
satisfiability-check functionality. To run a sat check, it takes a system of equations
as input and returns:
– sat, if there is at least one solution for which the expression holds true;
– unsat, if there is no possible solution for which the expression evaluates to true.
X ⊂ Z, ∀x ∈ X, f(x) = 1    (2)
X = Z    (4)
f(x)                  f̄(x)
x ≤ 3                 x > 3
x² ≥ 0                x² < 0
(x + x) mod 2 = 0     (x + x) mod 2 ≠ 0
The generated complementary function f̄ is fed to the SAT solver and a sat check
is run. If the result is sat, the function f has solutions in both X and Y, so it is not
an opaque predicate. If the sat check returns unsat, there are no solutions for f̄, so
all the elements of Z are solutions for f and f is an opaque predicate. The entire
process is summarized as Algorithm 3.
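As an illustration of this sat/unsat test, the sketch below checks the third predicate from the table above with the Z3 SMT solver's Python bindings, used here only as a convenient stand-in for the STP solver that the paper actually employs (the z3-solver package is assumed to be installed).

```python
# Sketch of the sat/unsat opaqueness test: assert the complementary predicate and
# let the solver decide; unsat means the original predicate holds for every input.
from z3 import Int, Solver, Not, unsat

x = Int("x")
predicate = (x + x) % 2 == 0           # candidate opaque predicate f(x)

s = Solver()
s.add(Not(predicate))                  # assert the complementary predicate f-bar
if s.check() == unsat:
    print("opaque predicate: f(x) holds for every x")
else:
    print("not opaque, counterexample:", s.model())
```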
4 Results
Our opaque predicate detection algorithm, OPDetect, was run against the same
set of opaque predicate equations used by LOOP, and our algorithm runs several
times faster in terms of runtime, as shown in Table 2.
References
1. Schrittwieser, S., et al.: Protecting software through obfuscation: can it keep pace
with progress in code analysis? ACM Comput. Surveys 49(1), 4 (2016)
2. Banescu, S., Ochoa, M., Pretschner, A.: A frame- work for measuring software obfus-
cation resilience against automated attacks. In: 2015 IEEE/ACM 1st International
Workshop on Software Protection (SPRO), pp. 45–51. IEEE (2015)
3. Breaking Down Binary Ninja's Low Level IL (2017). http://bit.ly/binjaIL
4. Collberg, C.: Surreptitious Software. In: Opaque Predicates, pp. 246–253 (2009)
5. Dalla Preda, M., Madou, M., De Bosschere, K., Giacobazzi, R.: Opaque predicates
detection by abstract interpretation. In: Johnson, M., Vene, V. (eds.) AMAST 2006.
LNCS, vol. 4019, pp. 81–95. Springer, Heidelberg (2006). doi:10.1007/11784180_9
6. Ganesh, V., Dill, D.L.: A decision procedure for bit-vectors and arrays. In: Damm,
W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 519–531. Springer,
Heidelberg (2007). doi:10.1007/978-3-540-73368-3_52
7. Ming, J. et al.: Loop: Logic-oriented opaque predicate detection in obfuscated binary
code. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and
Communications Security, pp. 757–768. ACM (2015)
8. STP - Simple Theorem Prover (2008). https://github.com/stp/stp
9. Xu, D., Ming, J., Wu, D.: Generalized dynamic opaque predicates: a new control flow
obfuscation method. In: Bishop, M., Nascimento, A.C.A. (eds.) ISC 2016. LNCS,
vol. 9866, pp. 323–342. Springer, Cham (2016). doi:10.1007/978-3-319-45871-7_20
An Overview on Spora Ransomware
1 Introduction
– The infection was done via a pop-up displayed on the website requesting the
download of a Chrome Font Pack update. When this website was accessed (only
with the Google Chrome browser), all of its textual content was rendered as
incomprehensible characters, which is why the pop-up asking for a Chrome font
pack update was displayed. Clicking on the download button downloads an
application named Chrome Font vx.xx.exe, in which the version x.xx was modified
on each download. During our analysis we were able to download 8 different
samples (different hashes) of Spora.
– The downloaded samples were stored on other websites, generally university
and school websites; in our case we found some Colombian schools and
universities. Each sample was downloaded via a POST request to one of the Spora
storage websites, and this POST was different in each download (for example a
POST to new.php, free.php, or next.php).
The infection method used by this variant of Spora is known by the name
EITest Chrome Font Update. A few days before we observed the compromised
website, an article [3] was published on this subject explaining this method of
propagation. We refer to this article to summarize the propagation method:
– First, the EITest actors hack a legitimate website and add JavaScript code
at the end of the page. This code makes the page look like an encoded page and
then displays the pop-up alert supposedly needed to see the page properly. Figure 1
shows the extracted EITest script that causes the fake Chrome pop-up on the
compromised website.
– When a visitor visits the compromised website, the script makes the page
unreadable and asks for a Chrome font pack update. The downloaded program
does not start automatically, and the victim must manually execute it to be
infected.
We note that Brad Duncan published three posts on his page [4] about this
method of propagation.
For all samples, we were not able to find any information in the PE sections, nor
any interesting strings (e.g. "ransom note" strings) inside the Spora samples, that
would immediately tell us whether these samples were ransomware or another kind
of malicious program. PEiD did not suggest any packer for any of the samples,
except for the two samples in G2, which produced the signature "fasm -> Tomasz
Grysztar", for the flat assembler developed by Tomasz Grysztar. Because of
asInvoker in the resource
section, all samples will run with the same permission as the process that started
them, and they can potentially be elevated to a higher permission level by selecting
Run as Administrator.
3 Behavioral Analysis
We first built an isolated malware analysis sandbox environment in VirtualBox
within which to examine the behavior of the 8 samples. This environment included
a set of test data files with different file extensions and the user document folders
(Desktop, Documents, ...) within a Windows 7 virtual machine (VM) connected to
a USB drive containing miscellaneous data (the machine/USB drive specifications
are not important), with no antivirus installed and with the VirtualBox Guest
Additions installed. All executions of these samples ran with local administrator
privileges, and their first execution was performed without any connection to the
outside. For monitoring we used Process Monitor.
1. Despite the absence of communication, the encryption of the target files was
carried out. This situation differs from the PrincessLocker ransomware [1].
2. The clickable sample was not deleted. This behavior is similar to
PrincessLocker's behavior [1] and different from that of the TeslaCrypt
ransomware [2]. We will show that the self-reproduction was performed while
keeping the clickable file.
3. At the end of the Spora execution, the HTML page C:\Users\MyPc\
Appdata\Roaming\FRF4-78ETG-TZTHA-TXHZT-REXYY.html was displayed, informing
the victim that all target files had been encrypted using the RSA-1024 algorithm.
The name of the HTML page is the ID of the victim. The use of the RSA-1024
algorithm suggests that the public key was stored inside the binary. Indeed, we can
assume the following scenario according to [5]: the sample begins by generating a
random key for symmetric encryption (generally AES); after encrypting the target
files, the ransomware uses the embedded public key to encrypt the random key, and
sometimes the encrypted random key also serves as an ID key. After paying the
ransom, the victim sends the encrypted key to the attacker or publishes it in a
pre-agreed place (see Footnote 3). The attacker uses his private key to unlock the
random symmetric key and sends it back to the victim. Note that the Emsisoft
Spora variant analyzed in [6] is close to our hypothesis.
4. The clickable button Authorization on the HTML page was used to
communicate with the web page spora.biz; indeed, it was a POST request to the
C&C. When the victim clicked the link, base64 data (see Footnote 4) was submitted
automatically, as shown in Fig. 3. It was a POST of two values, u and b; the value
Footnote 3: Spora performs only one communication with the C&C, by a POST request.
Footnote 4: Note that the data in Fig. 3 is in URL format.
Footnote 5: During our analysis we found that the developers of Spora were reactive.
The communication with the C&C was done only via the HTML page after file
encryption, so a detection based only on the network requests exchanged between
the ransomware and the C&C before the infection procedure (as for CryptoWall
[15]) has no effect on Spora detection. The HTML page code shows that the Spora
developers determine the target machine's language using JavaScript: the page is
displayed in Russian if the system language is Russian, and in English otherwise.
We therefore assume that Spora primarily targets Russian internet users (see
Footnote 6). Concerning the data sent, the JavaScript used in this HTML code is
responsible for the POST discussed above (Fig. 4). At the time of writing, the C&C
(spora.bz) was offline; MalwareHunterTeam published a tweet [16] stating that
Spora's team had registered a new domain, torifyme.com. We tried to redirect our
collected infection from spora.bz to torifyme.com, and it worked, because they have
a single server where they receive all communications from the victims.
was SUCCESS; this file is immediately suspect because it was opened at the
beginning of execution and was not closed until the end of execution. This
operation was followed by ReadFile(Offset:0, Length:4); here Spora tries to read
4 bytes from the beginning of this file. Normally the file is empty on a first
infection, as it has just been created by the previous CreateFile operation, and we
obtained END OF FILE as the result. This is summarized in part 1 of Fig. 5.
The file 1624817891 was fixed across all executions on the same machine;
indeed, its name is the serial number shown by running the dir command in the
cmd.exe command prompt. Spora wrote to this file with two WriteFile operations
(part 2 of Fig. 5). The previous ReadFile of 4 bytes was not a random choice: the
first WriteFile in part 2 of Fig. 5 shows that Spora writes the same 4 bytes
searched for by the ReadFile operation, so these 4 bytes act as a marker.
Furthermore, these operations were carried out at the beginning of the execution
of this sample, so we can assume that these 4 bytes are used to manage
overinfection. Spora then continued its execution without closing this file.
After these operations, Spora listed the target directories (without encrypting
the target files) using CreateFile, QueryDirectory, and sometimes ReadFile. This
listing was generally in alphabetical order, and some directories, such as Program
Files (x86), Windows, and Program Files, were not listed. The listing started with
the C: drive, followed by any removable drive, and finally all mounted shared
directories; the D: drive could not be accessed by this sample (because D: was
reserved as a media drive for the VirtualBox guest). This file listing at the
beginning is similar to the PrincessLocker infection [1]: PrincessLocker performs a
listing to search for targets, which are then cached in its memory in order to
encrypt them later, one by one. The activity of searching/listing through all files
and directories is suspicious for any unknown program executing on a machine.
For ransomware detection, this suspicious behavior can be useful as
3.3 Encryption
Without repeating the previous target listing, this sample directly accessed
the target locations (the locations of the target files discovered during the
file-system search phase had been cached). Among the target extensions
we found .log, .sqlite, .bmp, .jpg, .zip, .rar, .cfg, .msg, .tar,
.bin, .cab, and .wmv.
The encryption process of Spora didn’t perform the following behaviors:
The encryption in Spora was done directly on the target file; no new file was
created to receive the encrypted data. As shown in part 1 of Fig. 7, all target files
had at least two ReadFile calls: ReadFile(Offset:[EndOfFile-(128+4)],Length:128)
and ReadFile(Offset:[EndOfFile-4],Length:4). The first ReadFile reads 128 bytes
starting at the end of the target file minus (128 + 4) bytes, and the second reads
the last 4 bytes of the file. Furthermore, Spora performs for each file at least two
WriteFile calls (part 2 of Fig. 7), appending 128 bytes and then 4 bytes to the end
of the file: WriteFile(Offset:[EndOfFile],Length:128) and WriteFile
(Offset:[EndOfFile+128],Length:4). The two WriteFile calls are used to label each
file as encrypted, and the two ReadFile calls are used to check this label so as not
to encrypt an already encrypted file. Note that the two WriteFile calls constitute a
behavior indicator for detecting this sample, because it appends 128 bytes followed
by 4 bytes to the end of each target file; the two ReadFile calls can be added to
this indicator as well.
By reverse-engineering part of this sample, we found that the encryption routine
is no different from that of the previous version of Spora [6]; that Malwarebytes
write-up explains the method used by Spora to encrypt the target files. For each
target file, a new individual AES key is generated and used to encrypt the mapped
file content. This method of infection makes Spora detection more difficult, because
the encrypted data is written into the same target file, without renaming the file,
without creating a new file to receive the encrypted data, and without ordinary
WriteFile operations writing the ciphertext into the body of the target file. The
exported representation of the individual key is encrypted with a previously
generated RSA key and then stored at the end of the encrypted file (first WriteFile),
followed by the CRC32 of this encrypted representation (second WriteFile).
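As a simple illustration of how this two-write label could be turned into a scanner check, the sketch below reads the last 132 bytes of a file and verifies whether the final 4 bytes are the CRC32 of the preceding 128 bytes; the little-endian byte order of the stored CRC is an assumption on our part.

```python
# Sketch: test whether a file carries the Spora-style trailer described above
# (128-byte encrypted key blob followed by its 4-byte CRC32).
import struct
import zlib

def has_spora_trailer(path: str) -> bool:
    with open(path, "rb") as f:
        f.seek(0, 2)                        # jump to end of file
        if f.tell() < 132:
            return False
        f.seek(-132, 2)
        key_blob = f.read(128)              # what the first WriteFile appended
        (stored_crc,) = struct.unpack("<I", f.read(4))   # assumed little-endian
    return (zlib.crc32(key_blob) & 0xFFFFFFFF) == stored_crc
```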
The encryption process finished with four WriteFile operations to the file
C:\MyPc\AppData\Roaming\1624817891, without a CloseFile. We note that
Spora wrote some data to this file after each task: the first write was at the
beginning of the infection, then after the target-file listing, and now after the
infection of the targets. Writing data to another file after each task therefore adds
another detection indicator for this sample. After that, Spora created a new process
via Process Create to start WMIC.exe with the command
"C:\Windows\SysWOW64\wbem\WMIC.exe" process call create
"cmd.exe /c vssadmin.exe delete shadows /quiet /all"; this command executed
vssadmin.exe to delete all shadow volume copies on the target machine. This
operation was followed by ReadFile and WriteFile operations on the file
C:\MyPc\AppData\Roaming\1624817891; Spora then launched the default browser to
display the ransom-note page and returned again to the file
C:\MyPc\AppData\Roaming\1624817891 with two WriteFile operations, without a
CloseFile. The next step was the self-reproduction process.
self-reproduction process was performed; this process was followed by running the
copy to start/continue the infection process and by the deletion of the original
ransomware. Spora, in contrast, performed the self-reproduction after the
infection/encryption process without removing the clickable/original ransomware.
Figure 8 shows the process of self-reproduction in Spora.
As shown in part 1 of Fig. 8, Spora started by checking the existence of the file
C:\7072899c5ddb69209.exe; the result in a first infection was NAME NOT FOUND.
Note that the file name 7072899c5ddb69209.exe is related to the target machine,
like the previous name seen, because it was fixed in each infection by Spora on the
same machine. After that, it created a copy of the clickable/original ransomware;
this copy had the same size (part 2 of Fig. 8) and the same MD5 as the original
ransomware. Furthermore, Spora made this copy hidden through a
SetBasicInformationFile HN operation. The self-reproduction in Spora and
TeslaCrypt is only a copy process, not an evolution (see Footnote 7). Note that the
same operations were performed to copy the ransomware to the Desktop, shared
directories, and any USB drive connected to the target machine.
Spora afterwards performed further operations: for every directory on the C:
drive, the Desktop, USB drives, and shared directories (except sub-directories), a
CreateFile followed by a SetBasicInformationFile with HN in FileAttributes, in
order to hide the directory. To verify that the directories were indeed hidden, these
operations were followed by a QueryBasicInformationFile; if so, Spora performed
the operation CreateFile Desired Access: Read Attributes, Disposition: Open to
verify the existence of a shortcut to this directory; the result was NAME NOT FOUND
in a first infection, so this shortcut was created by a subsequent CreateFile and
WriteFile. The result of this task was that all the mentioned directories were
hidden and shortcuts with the same names as these directories were created, with
the option Reduced window in Run and
C:\Windows\system32\cmd.exe /c start explorer.exe "<directory>" & type
"7072899c5ddb69209.exe">"%temp%\7072899c5ddb69209.exe"&&"%temp%\7072899c5ddb69209.exe"
in the Target option.
Footnote 7: Generally an evolution contains functions added beyond the mere copy;
for more information about self-reproduction (copy or evolution) we refer to [17].
it is not performed, the ransomware will encrypt any new target file created
between two infections on this machine independently of previous/following
infections (for each infection there is an ID and a ransom to pay). It is controlled
by the presence of a signature introduced by the ransomware (registry keys, a
particular file, etc.). The overinfection management in Spora is not clear; it was
limited to creating shortcuts for any new directory created between two infections.
The shortcuts were created without directory-listing operations, directly after the
second ReadFile (Fig. 9). We obtained a new infection (new ID) by removing the
file C:\MyPc\AppData\Roaming\1624817891, by replacing it with a similar file from
another infection, or by modifying the first 234 bytes of this file. This file (precisely
these 234 bytes) is responsible for determining whether Spora performs an infection
or an overinfection. We summarize the process of infection and overinfection in
Fig. 10.
In this part we discuss the behaviors that can be used for Spora detection
according to some indicators proposed in recent works on ransomware detection.
An indicator is a monitored ransomware behavior that can be used in its detection.
Kharraz et al. [10] studied the behavior on a target machine of ransomware
discovered between 2006 and 2014; following the same approach, they published a
second work [12] on ransomware detection. They suggest that monitoring abnormal
file-system activity yields many indicators for ransomware detection, precisely by
describing the interaction between the ransomware and the file system. They also
discussed detection based on the use of encryption mechanisms. These two
behaviors can be used for the detection of Spora: indeed, by using the Windows
Crypto API, each file gets a new, individual AES key used to encrypt the mapped
(see Footnote 8) file content. The encrypted exported representation of the
individual key and the CRC32 of this result are stored at the end of the encrypted
file, which generates two WriteFile operations for each target file. In the same vein
of monitoring the Crypto API, Kolodenker et al. [8] proposed PayBreak, which
protects against the threats posed by crypto-based ransomware by observing the
use of the symmetric session keys and holding them in escrow; we think that
Footnote 8: File mapping is a file-system behavior.
PayBreak would be able to hold the keys used by Spora. Scaife et al. [9] developed
a detector based on the behaviors exchanged between ransomware and its targets;
they divided the collected behaviors into two groups of indicators, primary and
secondary. In the first group we find:
– File type changes: monitoring this behavior provides an indicator for Spora
detection; in fact, Spora encrypts the entire file including its magic number, which
normally allows its type to be identified. Using the Linux file command we have:
file(target file i) ≠ file(Encrypted(target file i)).
– Similarity measurement: Spora uses AES, which produces an output totally
different from its input; these changes to the content can be flagged using
similarity-preserving hash functions.
– Shannon entropy: the data after encryption by Spora has high entropy (a
quick way to check this is sketched below).
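A minimal sketch of the Shannon entropy check mentioned in the last bullet follows; the threshold of about 7.5 bits per byte is a commonly used heuristic, not a value taken from the paper.

```python
# Sketch: byte-level Shannon entropy of a file; encrypted content sits close to 8 bits/byte.
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_encrypted(path: str, threshold: float = 7.5) -> bool:
    with open(path, "rb") as f:
        return shannon_entropy(f.read()) > threshold
```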
– We collected 8 different samples of the new version of Spora from the
compromised website, submitted them for the first time to VirusTotal, and
analyzed them by static and behavioral analysis.
– We extracted some behaviors that can be used for Spora detection.
– We showed that self-reproduction is also an indicator that can be used to
increase detection efficiency or to limit Spora propagation after infection; this
indicator can be generalized to other ransomware.
– We discussed the behavior of Spora according to the indicators proposed in
recent ransomware detection works; in fact, our analysis is the first to discuss and
evaluate these indicators (see Footnote 10).
References
1. Yassine, L., Souidi, E.M.: PrincessLocker analysis. In: International Conference on
Cyber Security and Protection of Digital Service, London (2017). https://doi.org/
10.1109/CyberSecPODS.2017.8074854
2. Rascagneres, P.: Analyse du rançongiciel TeslaCrypt. Misc mag N89 (2016)
3. Abrams, L.: Fake Chrome Font Pack Update Alerts Infecting Visitors with Spora
Ransomware, BleepingComputer blog (2017)
4. Duncan, B.: Eitest Hoeflertext Chrome popup leads to Spora Ransomware,
malware-traffic-analysis blog (2017)
5. Orman, H.: Evil offspring-ransomware and crypto technology. IEEE Internet Com-
put. 20, 89–94 (2016)
6. Hasherezade: Explained: Spora ransomware, malwarebytes blog (2017)
7. Cimpanu, C.: Spora Ransomware Works Offline, Has the Most Sophisticated Pay-
ment Site as of Yet, bleepingcomputer blog (2017)
8. Kolodenker, E., et al.: PayBreak: defense against cryptographic ransomware. In:
ASIA CCS 2017 (2017). https://doi.acm.org/10.1145/3052973.3053035
9. Scaife, N., et al.: CryptoLock(and drop it): stopping ransomware attacks on user
data. In: IEEE 36th International Conference on Distributed Computing Systems
(2016). https://doi.org/10.1109/ICDCS.2016.46
10. Kharraz, A., Robertson, W., Balzarotti, D., Bilge, L., Kirda, E.: Cutting the gor-
dian knot: a look under the hood of ransomware attacks. In: Almgren, M., Gulisano,
V., Maggi, F. (eds.) DIMVA 2015. LNCS, vol. 9148, pp. 3–24. Springer, Cham
(2015). https://doi.org/10.1007/978-3-319-20550-2_1
11. Continella, A., et al.: ShieldFS: a self-healing, ransomware-aware filesystem. In:
Proceedings of the 32nd Annual Conference on Computer Security Applications
(2016)
12. Kharraz, A., et al.: UNVEIL: A Large-Scale, Automated Approach to Detect-
ing ransomware. USENIX Security 2016 (2016). https://doi.org/10.1109/SANER.
2017.7884603
13. Hahn, K.: Spora - the Shortcut Worm that is also a Ransomware, G DATA Security
Blog, gdatasoftware blog (2017)
14. Coldshell: Spora-id, github (2017). https://gist.github.com/coldshell/
6204919307418c58128bb01baba6478f
Footnote 10: We discuss the proposed indicators, not the tools.
15. Cabaj, K., Mazurczyk, W.: Using Software-Defined Networking for Ransomware
Mitigation: The Case of CryptoWall. IEEE Network (2016). https://doi.org/10.
1109/MNET.2016.1600110NM
16. MalwareHunterTeam: Spora’s team registered a new domain (2017). https://
twitter.com/malwrhunterteam/status/841564703881068544
17. Filiol, E.: Computer Viruses: From Theory to Applications. Springer, Heidelberg
(2005)
Pattern Generation and Test Compression
Using PRESTO Generator
Abstract. The proposed work presents a test pattern generator for built-in self-test
(BIST) based applications, together with test data compression. Test patterns are
produced with desired levels of toggling, and improved fault coverage is obtained
compared with BIST-based pseudorandom pattern generators (PRPGs). The pattern
generator comprises a pseudorandom pattern generation unit, a toggle generation
and control unit, and a hold register unit. The preselected toggling (PRESTO)
generator allows user-defined levels of toggling. The pattern generator is a linear
finite state machine which drives a phase shifter that reduces the correlation
between patterns. This paper also proposes a test compression method which raises
the compression efficiency beyond that obtained by conventional compression
techniques. It does not need any core logic modifications such as test point
insertion, and thus the compression technique is non-intrusive. This hybrid
technique of BIST together with test compression achieves fault coverage above
90%. Experimental results are obtained for the ISCAS 85, ISCAS 89, and ITC 99
standard benchmark circuits. The PRESTO generator can also function effectively
as a decompressor, and hence area is reduced.
1 Introduction
Low-power design with high performance has become the main challenge in today's
very large scale integration (VLSI) design. Many power reduction techniques are
available, but they are largely concentrated on power usage during normal-mode
operation rather than test-mode operation, whereas in most cases test mode
consumes more power than normal mode. Toggling of the nodes of the circuit under
test (CUT) contributes heavily to the test-mode power consumption, being larger
than the switching activity of the nodes when they operate in normal mode. The
main objective of manufacturing test even today is to provide reliability and high
quality for semiconductor products. Test solutions and market conditions are
evolving to meet these objectives, and the factors driving the evolution are
semiconductor technology, the design process, and design characteristics. New
defect types demand new design-for-test methods. Test compression was introduced
in the last decade, gained popularity, and became today's main test methodology,
but technology is changing rapidly and test compression may not be able to follow
these changes in the next decade. As a solution, another prominent DFT technique,
logic built-in self-test (LBIST), is used along with test compression, and thereby it
is possible to achieve the advantages of both techniques.
The bandwidth between the external tester and the chip is small and is a main issue in IC
testing. Every new technology generation has a higher integration density than the
previous one, which results in larger designs and more faults. High fault coverage
requires the detection of delay faults and other fault types apart from stuck-at
faults, and for that the test pattern requirement is high [10]. External testing, the
conventional method, stores test patterns and test responses in automatic test equipment (ATE),
but the ATE has the disadvantages of limited speed, low I/O
channel bandwidth, and limited memory. So the small tester-chip bandwidth is often a
major factor in deciding the speed of testing; the maximum testing speed is
the speed at which the test data can be transferred. BIST and test data compression are
techniques to overcome this problem, and the combination of BIST and test data
compression has become a main research area [3, 14]. The automatic test pattern
generator (ATPG) patterns are compressed and stored on the chip and later, for testing,
they are decompressed using the existing BIST hardware [8]. Techniques that
use compressed weights to embed the deterministic stimuli are proposed in [11]. Code-based
schemes are the conventional compression methods, in which patterns are
encoded into a set of code words: the data is divided into symbols, each symbol is
encoded using its specific code word, and decompression is achieved by converting
the code words in the compressed data back into symbols. In run-length coding, runs of
consecutive 0s are encoded using fixed-length code words, and the lengths of the runs of
0s are increased using a cyclical scan architecture [12]. Golomb coding encodes runs of
consecutive 0s with variable-length code words, which achieve effective encoding even
though they need synchronization between the tester and the chip [9]. Frequency-directed
run-length (FDR) coding further optimizes the test patterns. Dictionary coding partitions
the data into n-bit symbols and uses a dictionary to store the symbols; the n-bit symbols
are encoded using b-bit code words, provided b is less than n [3]. Huffman coding
partitions the data into n-bit symbols and assigns code words depending on the frequency
of occurrence: symbols that occur more frequently receive shorter code words, and
symbols that occur less frequently receive longer code words [3].
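As a toy illustration of the code-based idea, the following sketch encodes runs of 0s with a simplified fixed-width code; it assumes the bit stream ends with a 1 and is only an illustration, not the FDR or Golomb coders of [9, 12].

```python
# Toy run-length coder for runs of 0s, assuming the bit stream ends with a 1;
# a full-valued code word means "maximum-length run of 0s, no terminating 1 yet".
def rle_encode(bits, width=4):
    full = (1 << width) - 1
    codes, run = [], 0
    for b in bits:
        if b == 0:
            run += 1
            if run == full:
                codes.append(full)   # run continues into the next code word
                run = 0
        else:
            codes.append(run)        # run of 'run' zeros terminated by a 1
            run = 0
    return codes

def rle_decode(codes, width=4):
    full = (1 << width) - 1
    bits = []
    for c in codes:
        bits.extend([0] * c)
        if c != full:
            bits.append(1)
    return bits

# Example: [0, 0, 1, 0, 1, 1] -> codes [2, 1, 0] -> decodes back to the original bits.
```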
Various techniques exist for reducing the power of test pattern generators. In the
scheme of [2] for scan-based BIST, the modes and modules that consume high power are
identified and appropriate modifications are made in the design to achieve power
reduction. This method reduces the power, but suffers from area overhead. For low
switching activity, a dual-speed LFSR (DS-LFSR) can be used as the pattern generator [4].
The technique provides good fault coverage, but suffers from slight area overhead. In
[5], the PRPG is driven by a modified clock scheme. Another method uses biasing
logic to supply inputs to the scan chains: an LFSR drives the scan chains, but a k-input
AND gate and a T flip-flop are placed between them as biasing logic. The T flip-flop
holds the same value until it receives a '1' at its input; when a '1' is fed, the output is
inverted. So unless a '1' appears, the same pattern is fed to the scan chain. The chance
of having a '1' at the flip-flop input depends on the AND gate, so the factor that controls
the transitions at the CUT inputs is the fan-in of the AND gate: if k is the AND gate
fan-in, then 1/2^k is the probability that the T flip-flop output becomes '1'. To obtain
very few transitions, the number of AND gate inputs should be high, and vice versa [6].
In the bit-swapping LFSR, bit swapping is used to decrease the switching between
patterns: one LFSR output is made the select line, and when the select line is zero the
neighbouring bits are swapped, whereas if the select line is one the pattern is not
changed [7]. All these techniques concentrate on reducing power between patterns.
Code-based methods make use of the correlation among the specified bits and are not
efficient in handling don't-care bits. As a result, the CUT may not see the test patterns as
distinct ones, which results in low fault coverage. The embedded deterministic test
(EDT) based compressor has low linear dependency because it uses the PRESTO generator.
EDT uses hybrid testing, combining both BIST and ATE: reduced pattern application
time and low dependence on external equipment are benefits of BIST, while the ATE
ensures determinism in the patterns, and low on-chip area is achieved.
The PRPG is suitable for low-power (LP) BIST applications. Through its preselected
toggling (PRESTO) levels, the switching activity of the generated patterns is reduced and
thereby the power dissipation is also reduced. The PRPG also functions as an effective
LP decompressor, so the hybrid method of BIST and test compression is achieved.
Hence an environment is created to obtain a hybrid solution by merging LBIST and
test compression.
2 Basic Architecture
The structure of the basic PRESTO generator is shown in Fig. 1. The PRPG is n bits wide
and is connected to a phase shifter that feeds the scan chains; this arrangement forms
the generator's kernel.
The toggle generating and control unit and the hold register unit together control
the output of the LFSR and feed the phase shifter. An LFSR or a ring generator can serve as
the pattern generator. Between the PRPG and the phase shifter, n hold latches are placed.
They are controlled by an n-bit toggle control register, which in turn receives input from
an n-bit shift register. The hold latches have two modes of operation, toggle mode and
hold mode. In toggle mode, the enable input of the latch is high and the latch passes the
data from the PRPG to the phase shifter, whereas in hold mode the enable signal is low,
the PRPG output is held, and a constant value is passed to the phase shifter. Each phase
shifter output is formed by XOR-ing three of the hold latches. The toggle control register
holds 0s and 1s: a 0 indicates the hold mode of the corresponding latch and a 1 indicates
toggle mode. The shift register feeds the control register with new values for every
pattern. The shift register is fed by an OR gate, and the values are chosen in a
probabilistic manner with a programmable set of weights derived from the original PRPG.
The probability that a k-input AND gate produces a 1 at its output is 0.5^k, so the
probabilities with which 1s are produced are 0.5, 0.25, 0.125 and 0.0625. One of the four
AND gates is selected by a switching register, which enables only one AND gate at a time,
and that value is passed through the OR gate. For example, if 0001 is the switching code,
the first AND gate is selected and the toggle control register will contain 50% 1s. A
four-input NOR gate disables the low-power mode when the switching register holds 0000.
When operating in weighted random mode, the amount of 1s in the control register is
kept at a stable level by the switch level selector. Consequently, the fraction of scan
chains in LP mode remains roughly the same, though the chains that toggle change from
one pattern to another.
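The behaviour described above can be pictured with the simplified software sketch below. It only models the per-channel hold/toggle decision; the on-chip weighted AND/OR network is replaced by software random bits with the same probability 0.5^k, the phase shifter is omitted, and the LFSR taps are illustrative, so this is an illustration rather than the hardware design.

```python
# Behavioral sketch of the basic PRESTO idea: an LFSR drives hold latches; an
# n-bit toggle control register, refilled for each pattern with weighted-random
# bits, decides per channel whether the latch follows the LFSR (toggle mode) or
# holds its previous value (hold mode).
import random

def lfsr_step(state, taps, n):
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & ((1 << n) - 1)

def presto_patterns(n=8, taps=(7, 5, 4, 3), k=2, n_patterns=4, pattern_len=16, seed=1):
    random.seed(seed)
    state = 1                      # non-zero LFSR seed
    latches = [0] * n
    patterns = []
    for _ in range(n_patterns):
        # refill the toggle control register: each bit is 1 with probability 0.5**k,
        # mimicking the k-input AND gate selected by the switching register
        control = [int(all(random.getrandbits(1) for _ in range(k))) for _ in range(n)]
        shifted = []
        for _ in range(pattern_len):
            state = lfsr_step(state, taps, n)
            for i in range(n):
                if control[i]:                 # toggle mode: follow the LFSR bit
                    latches[i] = (state >> i) & 1
                # hold mode: the latch keeps its previous value
            shifted.append(list(latches))
        patterns.append(shifted)
    return patterns
```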
Figure 2 shows a fully employable PRESTO generator. This scheme helps to achieve
low-toggling test patterns with more flexibility. The shifting period of a test pattern is
split into an alternating sequence of hold and toggle intervals. A T flip-flop is
used to achieve this functionality: it toggles only when it is fed a 1 and otherwise
remains in its previous state. When the T flip-flop output is 0, the hold latches
enter hold mode, the PRPG values are held, and the phase shifter receives constant
values. To achieve this, AND gates are placed between the toggle control register and the
OR gates. When the flip-flop output is 1, the hold latches enter toggle mode and the LFSR
outputs reach the phase shifter; at this time the AND gates are transparent to the toggle
control register values. The toggle and hold registers are four bits wide and decide how
long the generator stays in each mode. To flip from one mode to the other, the T flip-flop
must output a 1. The multiplexers are driven by the hold and toggle registers, and their
select signal is the output of the T flip-flop. The values fed to the AND gates are chosen
in a probabilistic manner with a programmable set of weights derived from the original
PRPG. When the device enters toggle mode, it waits for a 1 at the T flip-flop output in
order to enter hold mode, which in turn is decided by the hold and toggle registers. Test
patterns with low toggling can thus be achieved while preserving the principle of
operation of the basic solution.
4 Decompressor Structure
The automatic test pattern generator (ATPG) patterns are treated as Boolean variables
for compressing the test cubes. Input variables are injected into the decompressor at
locations specified by the primitive polynomial. The symbolic expression of each scan
cell is a linear function of the injected input variables. Knowing the polynomial
implemented by the ring generator, the phase shifter structure, the locations of the
injection sites, and the number of shift cycles, the linear equations of the scan cells with
specified values can be formed. Consequently, by solving these linear equations, the
compressed patterns can be obtained. By scanning the compressed patterns through the
decompressor, a pattern matching the ATPG output is reproduced. The unspecified bits
receive either '0' or '1' depending on the decompressor structure. Often the test cube for a
particular fault cannot be compressed because of a large number of specified bits or
because of linear dependency among the specified bits. Those faults are retargeted and a
new test cube is generated, which makes the compression algorithm complex.
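The equation-solving step can be pictured with a small GF(2) Gaussian-elimination routine. This is only a schematic sketch of that algebra with illustrative data structures, not the compression tool itself.

```python
# Schematic GF(2) Gaussian elimination: 'rows' holds one equation per specified
# scan cell (coefficients over the injected variables), 'rhs' the specified values;
# returns one satisfying variable assignment (the compressed seed) or None when
# the specified bits are inconsistent (the retargeting case described above).
def solve_gf2(rows, rhs):
    rows = [list(r) for r in rows]
    rhs = list(rhs)
    n_vars = len(rows[0])
    pivot_row_of = {}
    r = 0
    for c in range(n_vars):
        pr = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pr is None:
            continue                               # no pivot in this column
        rows[r], rows[pr] = rows[pr], rows[r]
        rhs[r], rhs[pr] = rhs[pr], rhs[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
                rhs[i] ^= rhs[r]
        pivot_row_of[c] = r
        r += 1
    if any(rhs[i] for i in range(r, len(rows))):   # a 0 = 1 row: no seed exists
        return None
    # free variables default to 0; pivot variables take the reduced right-hand side
    return [rhs[pivot_row_of[c]] if c in pivot_row_of else 0 for c in range(n_vars)]
```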
6 Validating Experiments
compression for the Huffman encoding, and column EDT has the number of test
patterns after compression for embedded deterministic test. The compression
achieved for the ISCAS 85 circuits is shown in Table 4, Table 5 shows the compression
achieved for the ISCAS 89 circuits, and Table 6 shows the compression achieved for the
ITC 99 circuits. EDT produces superior results for all the circuits.
The power and area obtained from the Synopsys Design Compiler for the basic PRESTO
generator and the fully employable PRESTO generator are shown in Table 10.
7 Conclusion
The PRESTO generator produces pseudorandom patterns with low switching
activity and allows user-defined levels of toggling. The control signals produce distinct
patterns, so high fault coverage can be achieved. The PRESTO generator can also
function effectively as a decompressor; thus a combined technique that uses both LBIST
and test compression can be implemented, yielding a hybrid solution that combines the
advantages of both techniques. This technique can overcome the problem of the low
bandwidth between the tester and the chip encountered when testing with external
equipment. As future work, the LFSR can be modified to further reduce the switching of
patterns in the scan chains and thereby achieve low-power EDT.
References
1. Filipek, M., Mukherjee, N., Mrugalski, G.: Low power programmable PRPG with test
compression capabilities. IEEE Trans. Very Large Scale Integr. 23(6), 1063–1076 (2015)
2. Gerstendorfer, S., Wunderlich, H.: Minimized power consumption for scan-based BIST. In:
Proceedings of International Test Conference (ITC), pp. 77–84 (1999)
3. Touba, N.A.: Survey of test vector compression techniques. IEEE Design Test 23(4), 294–
303 (2006)
4. Wang, S., Gupta, S.K.: DS-LFSR: a BIST TPG for low switching activity. IEEE Trans.
Comput. Aided Design Integr. Circ. Syst. 21(7), 842–851 (2002)
5. Girard, P., Guiller, L., Landrault, C., Pravossoudovitch, S., Wunderlich, H.-J.: A modified
clock scheme for a low power BIST test pattern generator. In: Proceedings of the 19th
IEEE VLSI Test Symposium (VTS), pp. 306–311 (2001)
6. Wang, S., Gupta, S.K.: LT-RTPG: a new test-per-scan BIST TPG for low switching activity.
IEEE Trans. Comput. Aided Design Integr. Circ. Syst. 25(8), 1565–1574 (2006)
7. Abu-Issa, A.S., Quigley, S.F.: Bit-swapping LFSR for low-power BIST. Electron. Lett.
44(6), 401–402 (2008)
8. Das, D., Touba, N.A.: Reducing test data volume using external/LBIST hybrid test patterns.
In: Proceedings of International Test Conference (ITC), pp. 115–122 (2000)
9. Chandra, A., Chakrabarty, K.: System-on-a-chip test-data compression and decompression
architectures based on Golomb codes. IEEE Trans. Comput. Aided Design 20(3), 355–368
(2001)
10. Anita, J.P., Sudheesh, P.: Test power reduction and test pattern generation for multiple faults
using zero suppressed decision diagrams. Int. J. High Perform. Syst. Archit. 6(1), 51–60
(2016)
11. Hakmi, A.-W., et al.: Programmable deterministic built-in self-test. In: Proceedings of
IEEE VLSI Test Symposium (VTS), pp. 1–9 (2007)
12. Jas, A., Touba, N.A.: Test vector compression via cyclical scan chains and its application to
testing core-based designs. In: Proceedings of International Test Conference, pp. 458–464
(1998)
13. Rajski, J., Tyszer, J., Kassab, M., Mukherjee, N.: Embedded deterministic test. IEEE Trans.
CAD 23, 776–792 (2004)
14. Asokan, A., Anita, J.P.: Burrows wheeler transform based test vector compression for digital
circuits. Indian J. Sci. Technol. 9(30) (2016)
Challenges in Android Forensics
1 Introduction
Smartphones are capable of doing a multitude of tasks not possible with conven-
tional phones. We can now send emails, engage in video chats, access satellite
navigation and remain connected to the outside world 24 × 7. These devices are
key sources of evidence in criminal investigations. Criminals routinely
use their phones, with encrypted messages, and are now savvy enough to wipe
out the traces of their activities. Even if the phones are confiscated, it is very hard
for law-enforcement agencies to extract data from those devices, and any attempt
to access the device memory by brute force could potentially wipe out
all the data. This paper identifies the challenges faced by the investigating
agencies. We focus on Android devices because they are now dominant: Gartner [35]
reports that Android holds 81% of the global smartphone market, iOS about 18%
and Windows 0.3%. So there is a good chance that a seized device will be an
Android smartphone.
2 Background
Android operating system was developed by Google for touchscreen mobile
devices. Android source code is open-source. Figure 1 shows the Android architecture.
Early Android devices used YAFFS2 file systems for flash storage. Current
Android kernels use EXT4 file systems. The HAL layer bridges the gap between
hardware and software and allows applications to communicate with the specific
device drivers. The Android Runtime (ART) virtual machine is specifically designed for
devices with low processing power and has lower memory requirements than traditional
desktop virtual machines. The Android native libraries handle different types of data and
are written in assembly, C and C++. The Android Framework is the layer that
applications directly interact with. The Application layer is the topmost layer in the
Android architecture. Forensically, the Application layer is the most important part of the
architecture because all user data are stored and generated in this layer.
/boot. This partition contains the boot image, which includes the kernel and the
ramdisk. It is responsible for booting into the system. However, we can still
enter the system through recovery by pressing a key combination as the device powers up.
/system. This partition contains the system files which are installed when a ROM
image is flashed. It is akin to the Windows C: drive where all OS files are
stored.
The Android Debug Bridge, adb, is part of the Android Software Development
Kit. It is a CLI tool that connects a Linux or Windows PC as a client to an
Android device as a server; it can pull or push files and can even invoke a
shell on the Android device. Figure 2 illustrates the options that must be enabled
in the developer settings in Android to access the device via adb.
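For illustration, a minimal workstation-side wrapper around the standard adb client might look as follows; the device paths are examples only, and USB debugging is assumed to be enabled already.

```python
# Illustrative wrapper around the adb client (assumes adb is on PATH and USB
# debugging is enabled, as in Fig. 2); paths shown are examples only.
import subprocess

def adb(*args):
    """Run one adb command and return its textual output."""
    return subprocess.run(["adb", *args], capture_output=True, text=True).stdout

if __name__ == "__main__":
    print(adb("devices"))                                       # list connected devices
    print(adb("shell", "getprop", "ro.build.version.release"))  # Android version
    adb("pull", "/sdcard/DCIM", "./evidence/DCIM")              # copy files off the device
```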
3 Android Forensics
Forensics Investigations can be broadly divided into (i) Proactive Forensic Inves-
tigations and (ii) Reactive Forensic Investigations.
data, device data, SIM card data, usage history, and application and sensor data.
The data can be sent via GSM messages, a PAN such as Bluetooth, WLAN or the cellular
network. The collected data can then be classified in a taxonomy of evidence
types such as identity evidence, location evidence, time evidence, context evidence,
motivation evidence and means evidence. The data can then be analyzed and presented
in court.
Grover [7] built the first proactive Android forensics tool of its kind, DroidWatch,
which collects data with user consent and uploads it to a remote server. It was mainly
designed for BYOD (Bring Your Own Device) settings, where organizations allow
employees to bring their own devices for work. DroidWatch monitors the device using
content observers, broadcast receivers and alarms from the Android Framework. The
main drawback is that these are susceptible to tampering.
Walnycky [2] surveyed the network forensics of the 20 most popular Android
messaging apps. Only 4 apps passed the privacy test. The TextMe app might be a
potential Trojan, and some apps such as MessageMe, MeetMe and Oovoo even send
messages over the network in plain text. Full video reconstruction was possible for
Tango, Nimbuzz and MessageMe. WhatsApp passed the test. MITM
(man-in-the-middle) attacks are perfectly possible in some apps. The authors used a
program called Datapp to generate the report. However, with the advent of end-to-end
encryption it is now much harder to analyze the network packets of messengers.
Karpisek [3] studied the calling feature of WhatsApp: the authors first de-synchronised
the WhatsApp handshake and then monitored the entire handshaking procedure.
WhatsApp was using the OPUS [20] voice codec over RTP (Real-time Transport
Protocol). They were able to observe the entire call process, in which connections with
at least 8 WhatsApp servers were made during the calls.
Vrizlynn [12] devised a live memory forensics technique for mobile phones,
creating a framework composed of a message script generator, the UI/Application
Exerciser Monkey, a memory acquisition tool (memgrab) and a memory dump analyzer.
Messages are intercepted in real time from the shared memory regions. By performing a
memory dump after each set of messages was sent, with varying time intervals, they
were able to recover up to 100% of outgoing messages and 75% of incoming messages
from the memory dumps. The system is infeasible in a forensic scenario because taking
memory dumps consumes significant system resources, and because smartphones are
not static, optimal performance cannot be expected everywhere.
A proactive Android forensic ROM has been developed by Aiyyappan [15]
and Karthik [16]. Aiyyappan [15] ported the inotifywait package to Android.
File events are tracked by the inotify tool; the inotify source was compiled using NDK
programming into a native shared-object library. The native function accepts a directory
to track, and all file events in it are tracked (a sketch of the idea is given below). He also
created a forensic examiner toolkit which runs on a Linux machine to image, recover and
collect device-specific data from the cloud. Some amount of stealth was also enabled
using hidepid = 2, whereby users can only see their own processes and process IDs are
hidden from /proc. The tool uses part of the AFFT [39] code. It has options to get data
from the cloud and retrieve all the collected data.
Karthik [16] created an Android APK that extensively tracks user activities such as
GPS, sensor data, Wi-Fi metadata, SMS and call recording; a keylogger was also
included in the APK. All this is configured when the ROM is flashed. After that there is
no dialog whatsoever; the app runs in stealth mode, saves all the forensically relevant
data in a /forensic partition and opportunistically uploads it to the cloud.
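The idea of inotify-style file-event tracking can be sketched in user space with the Python 'watchdog' package; this is only an analogue for illustration and does not reproduce the NDK-compiled, on-device library described above.

```python
# User-space analogue of inotify-style file-event tracking using 'watchdog'
# (illustrative only; the directory path is a placeholder).
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class ForensicLogger(FileSystemEventHandler):
    def on_any_event(self, event):
        # record event type and path; a real tool would also timestamp and hash
        print(f"{event.event_type}: {event.src_path}")

def track(directory="/tmp/watched"):
    observer = Observer()
    observer.schedule(ForensicLogger(), directory, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```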
data using the Physical Analyzer software of Cellebrite. It succeeded in the case
of WhatsApp but failed in the case of Viber; manual analysis of the Viber folder was
needed. Pretty much everything was extracted, such as chat messages, images and
videos with timestamps. However, the data in the internal SD card was encrypted, and
the tool was not tested after the data had been deleted. Whether the tool was able to
extract data from unallocated space is unknown.
Lamine [6] proposed acquiring a device image of MTD devices using the
nanddump tool, which can collect NAND data. The tool was designed for the YAFFS2
file system, which Google has now stopped using. The authors targeted the user data
partition. Image carving was done using tools such as scalpel, which was able to recover
data about searched Google Maps locations and connected Wi-Fi hotspots. Modern
Android devices now use eMMC devices, which are flash block devices, instead of
MTD, so file systems such as EXT4 can easily be supported, which was not possible
earlier.
Sylve [5] created a tool which parses the kernel iomem resource structure
to find the physical memory ranges and performs physical-to-virtual address
translation. The tool was able to read all pages in each memory range and write
them to a file. The memory dump was written directly from the kernel to limit the
amount of interaction with user space and to prevent contamination of system
and network buffers. The resulting image was then examined using the Volatility [33]
memory forensics toolkit, to which the authors added ARM address-space support. The
authors also developed two Volatility plugins: one mimics the contents of the
/proc/iomem file and the other acquires selected memory mappings from a specified
user-land process.
Andriotis [10] used open-source tools such as adb to perform live analysis of network
buffers on Android devices. The researchers rooted the phone and installed the su and
busybox binaries. A physical image of the device was then taken, and the main and
events ring buffers were analysed. They tested this by sending files via Bluetooth and
Wi-Fi to another device and taking an image after 30 min; after that they did a factory
reset and took another device image. The process was repeated with time intervals of
30 min, 6 h and 12 h. The experimental results differed from device to device because
some devices had larger buffer sizes than others. They were able to recover the names of
the objects sent, MAC IDs, bytes sent and timestamps. They were also able to recover
the Wi-Fi connections made by the suspect. However, on examination of the system
image, many more artifacts, such as browsing history, caches and cloud storage, could
be extracted.
Quang [4] analysed the extraction of cloud-based data from Android devices. They
used a Nexus S, rooted the phone and loaded a custom image into volatile RAM to avoid
modifications to the internal partitions. They then collected a physical image via a
custom boot image and analyzed it. They were able to extract the private app
repositories of Dropbox, Box and OneDrive. The private app storage folders contained
files and SQLite databases holding user tokens, OAuth tokens and secret keys. The XML
files revealed the list of objects stored, the timestamps of when they were accessed, and
email addresses.
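The examination of such private app databases can be illustrated with a small SQLite walk-through; the database path and the fact that interesting tables exist at all are placeholders here, not the actual layout of Dropbox, Box or OneDrive.

```python
# Hypothetical sketch of listing and sampling an app's private SQLite database;
# the path and schema below are placeholders, not a real app's layout.
import sqlite3

def dump_tables(db_path="com.example.cloudapp/databases/accounts.db"):
    con = sqlite3.connect(db_path)
    try:
        tables = [r[0] for r in con.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        print("tables:", tables)
        # a real examination would now query token/account tables found above
        for table in tables:
            for row in con.execute(f"SELECT * FROM {table} LIMIT 5"):
                print(table, row)
    finally:
        con.close()
```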
A survey of commercially available forensics tools was done by Nihar
[17] and Venkateswara [18] to determine the data extraction capabilities of the
tools. Several free tools are available, such as the SleuthKit-based forensic toolkit [39]
and Volatility [33], while tools such as Andriller [34] are trial versions. Paid tools
such as the Universal Forensic Extraction Device (UFED) [32] provide standalone
tablets as well as devices for logical and physical data extraction. The authors made a
comparative study of the tools available in the market; however, no information was
given on how these tools perform on various devices in cases where the device is
locked, the data has been deleted, or the device is encrypted.
4 Challenges
The challenges faced by forensic investigators are increasing day by day. There
is growing pressure from privacy activist groups to make the Android platform
more secure by using end-to-end encryption and stronger encryption algorithms
for encrypting device data. Criminals use these to their own advantage to communicate
securely over the network and to encrypt their devices.
Criminals are increasingly becoming tech savvy and use various apps which provide
encryption as well as secure communication. This is becoming a problem for security
agencies, as they are not able to intercept the communication channel. Some of the apps
used by criminal networks are as follows:
Mappr [21]. An app that can change the location data on photos, so they do not
reveal where they were actually taken.
Cryptophone [27]. An Android-based phone with enhanced security features such as
encrypted calls using 256-bit AES and Twofish, in addition to 4096-bit
Diffie-Hellman key generation for each call.
Telegram [28]. An encrypted mobile messaging app that can host different channels
where members can talk in a group setting.
Firechat [26]. An app that connects to nearby devices which have FireChat installed,
through Wi-Fi or Bluetooth, and builds a "mesh network" that allows messages to be
passed to other devices in the vicinity without using any cell phone tower.
Wickr [25]. An end-to-end encrypted messaging app that allows users to send
messages which self-destruct after a time limit. It uses strong encryption and deletes
all metadata such as geotags and timestamps. It also includes a secure shredder to
erase attached files and prevent their recovery.
SSE Universal Encryption [24]. An encryption app which can encrypt texts, files and
directories and provides a password vault. It encrypts with the 256-bit AES algorithm
and also gives the user an option to delete the data sources after encryption.
to recover data. In addition, apps such as Uninstallit can wipe out all app-related data
from Android phones, and in those scenarios hardware acquisition may be the only
option.
Even after the investigator has access to the device bootloader and can connect the
device via adb, if the suspect factory-resets the phone then the chances of data recovery
are minimal. Figure 4 shows the data which can be recovered using SleuthKit [39] from
a device that has been wiped. The device was used extensively for two weeks prior to
the factory reset; a physical acquisition was then done and the image was analyzed.
SleuthKit was able to extract the partition data, as seen in Fig. 3. However, when
email and user-data recovery was attempted, it could only provide one email ID and 6
image files, as seen in Figs. 6 and 7, which is considerably less than the data originally
stored on the device prior to the factory reset. In these scenarios, hardware memory
acquisition may be the only option left for forensic investigators.
Sylve [5] attempted to identify the barriers to getting volatile memory data
from Android devices. The main barrier is the large number of kernel versions
currently running on Android devices, some of which are proprietary; investigators
therefore face difficulty creating kernel patches, since the symvers file containing the
CRCs of all kernel symbols is not available. Module compilation also requires the kernel
configuration file (.config), which can be acquired either from the device or from kernel
distributions. Last but not least, Android security features such as FDE [22] (Full-Disk
Encryption) and FBE [23] (File-Based Encryption) can hinder the acquisition of
evidence files from the suspect's device. Android 5.0 onwards supports FDE, which
encrypts the whole /data partition of the device with a single key. Starting from Android
7.0, Google has also implemented FBE in Android phones, which encrypts files with
different keys that can be unlocked independently. This means investigators will have to
search for multiple decryption keys for multiple files, which can slow the process of
investigation.
References
1. Mahajan, A., Dahiya, M.S., Sanghvi, H.P.: Forensic analysis of instant messenger
applications on Android devices. arXiv preprint arXiv:1304.4915 (2013)
2. Walnycky, D., Baggili, I., Marrington, A., Moore, J., Breitinger, F.: Network and
device forensic analysis of Android social-messaging applications. Digit. Invest. 14,
77–84 (2015). Elsevier
3. Karpisek, F., Baggili, I., Breitinger, F.: WhatsApp network forensics: decrypting
and understanding the WhatsApp call signaling messages. Digit. Invest. 15, 110–
118 (2015). Elsevier
4. Do, Q., Martini, B., Choo, K.-K.R.: A cloud-focused mobile forensics methodology.
IEEE Cloud Comput. 2, 60–65 (2015). IEEE
5. Sylve, J., Case, A., Marziale, L., Richard, G.G.: Acquisition and analysis of volatile
memory from android devices. Digit. Invest. 8, 175–184 (2012). Elsevier
6. Aouad, L.M., Kechadi, T.M.: Android forensics: a physical approach. In: Proceed-
ings of the International Conference on Security and Management (SAM), The
Steering Committee of The World Congress in Computer Science, Computer Engi-
neering and Applied Computing (WorldComp) (2012)
7. Grover, J.: Android forensics: automated data collection and reporting from a
mobile device. Digit. Invest. 10, 12–20 (2013). Elsevier
8. Akarawita, I.U., Perera, A.B., Atukorale, A.: ANDROPHSY-forensic framework
for Android. In: 2015 Fifteenth International Conference on Advances in ICT for
Emerging Regions (ICTer). IEEE (2015)
9. Freiling, F., Spreitzenbarth, M., Schmitt, S.: Forensic analysis of smartphones: the
Android Data Extractor Lite (ADEL). In: Proceedings of the Conference on Digital
Forensics, Security and Law. Association of Digital Forensics, Security and Law
(2011)
10. Andriotis, P., Oikonomou, G., Tryfonas, T.: Forensic analysis of wireless networking
evidence of Android smartphones. In: IEEE international workshop on Information
forensics and security (WIFS), pp. 109–114. IEEE (2012)
11. Müller, T., Spreitzenbarth, M.: FROST. In: Jacobson, M., Locasto, M., Mohassel,
P., Safavi-Naini, R. (eds.) ACNS 2013. LNCS, vol. 7954, pp. 373–388. Springer,
Heidelberg (2013). https://doi.org/10.1007/978-3-642-38980-1_23
12. Thing, V.L.L., Ng, K.-Y., Chang, E.-C.: Live memory forensics of mobile phones.
Digit. Invest. 7, 74–82 (2010). Elsevier
13. Mylonas, A., Meletiadis, V., Tsoumas, B., Mitrou, L., Gritzalis, D.: Smartphone
forensics: a proactive investigation scheme for evidence acquisition. In: Gritzalis,
D., Furnell, S., Theoharidou, M. (eds.) SEC 2012. IAICT, vol. 376, pp. 249–260.
Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30436-1_21
14. Vidas, T., Zhang, C., Christin, N.: Toward a general collection methodology for
android devices. Digit. Invest. 8, 14–24 (2011). Elsevier
15. Aiyyappan, P.S.: Android forensic support framework. Masters Thesis, Advisor:
Prabhaker Mateti, Amrita Vishwa Vidyapeetham, Ettimadai, Tamil Nadu, India
(2015). http://cecs.wright.edu/∼pmateti/Students/
16. Rao, M.K.: Proactive forensic support for Android devices. Masters thesis, Advisor:
Prabhaker Mateti, Amrita Vishwa Vidyapeetham, Ettimadai, Tamil Nadu, India
(2016). http://cecs.wright.edu/∼pmateti/Students/
17. Roy, N., Ranjan, K., Anshul, K., Aneja, L.: Android phone forensic: tools and
techniques. In: International Conference on Communication and Automation, pp.
605–610. IEEE (2016)
18. Rao, V., Chakravarthy, A.S.N.: Survey on Android forensic tools and methodolo-
gies. Int. J. Comput. Appl. 154, 17–21 (2016). Foundation of Computer Science
(FCS), New York
19. Hoog, A.: Android Forensics: Investigation, Analysis and Mobile Security for
Google Android. Elsevier, Amsterdam (2011)
20. Valin, J.-M., Maxwell, G., Terriberry, T.B., Vos, K.: High-quality, low-delay music
coding in the Opus codec. arXiv preprint arXiv:1602.04845 (2016)
21. Mappr - Latergram Location Editor for Instagram. On the iTunes App Store
22. Full-Disk Encryption, Android Open Source Project. https://source.android.com/
security/encryption/full-disk
23. File-Based Encryption, Android Open Source Project. https://source.android.
com/security/encryption/file-based/
24. Secret Space Encryptor is a password manager, text encryption and file encryption
all-in-one solution. http://www.paranoiaworks.mobi/
25. The Wickr instant messaging app allows users to exchange end-to-end encrypted
and content-expiring messages. https://wickr.com/
26. FireChat is a proprietary mobile app, developed by Open Garden, which uses
wireless mesh networking to enable smartphones to connect via Bluetooth, WiFi.
https://www.opengarden.com/firechat.html/
27. The GSMK CryptoPhone 500i is an Android-based secure mobile phone with 360
mobile device security for secure messaging and voice over IP communication on
any network. http://www.cryptophone.de/en/products/mobile/cp500i
28. Telegram is a free cloud-based instant messaging service. Telegram also provides
optional end-to-end-encrypted messaging. http://telegram.org/
29. RIFF Box - “Best JTAG Box in this Galaxy.” http://www.riffbox.org/
30. HTCI Chip-Off Forensic Tools. http://forensicstore.com/product/
forensic-hardware/htci-chip-off-tools/
31. The UP-828P Series universal programmer to acquire data from a variety of
flash storage devices. http://www.teeltech.com/mobile-device-forensic-software/
up-828-programmer/
32. UFED Touch platform is a portable digital forensics solution. http://www.
cellebrite.com/Mobile-Forensics/Products/ufed-touch2/
33. The Volatility Foundation - Open Source Memory Forensics. http://www.
volatilityfoundation.org/
34. Andriller performs read-only, forensically sound, non-destructive acquisition from
Android devices. https://www.andriller.com/
35. Gartner says worldwide sales of smartphones grew 7 percent in the fourth quarter
of 2016. http://www.gartner.com/newsroom/id/3609817
36. Andy, G.: Whatsapp just switched on end-to-end encryption for hundreds of mil-
lions of users. http://www.wired.com/2014/11/whatsapp-encrypted-messaging/
37. Timberg, C., Miller, G.: FBI blasts Apple, Google for locking police out of phones,
The Washington Post (2014)
38. Platform Architecture. https://source.android.com/images/android framework
39. Android Free Forensic Toolkit. https://n0where.net/android-free-forensic-toolkit/
Current Consumption Analysis of AES
and PRESENT Encryption Algorithms
in FPGA Using the Welch Method
1 Introduction
Encryption uses a set of methods and techniques for encoding and decoding data in
order to guarantee that these data are protected and accessible only to authorized persons.
The development of cryptographic algorithms of compact size, with low cost and low
consumption, has been a focus of research in the area. AES is a widely used encryption
algorithm with several implemented architectures, including architectures for applications
that require reduced resource consumption [1, 2]. PRESENT is a lightweight block cipher,
standardized in ISO/IEC 29192-2:2012 for applications that require low resources [3, 4].
An FPGA (Field Programmable Gate Array) is a device composed basically of a set of
logic blocks organized in matrix form, programmable via software, and widely used today
due to its rapid prototyping and high performance for many applications.
The modeling of variables that represent the energy consumption of cryptographic
devices, either to compare ciphers or as part of a side-channel attack process to obtain
part of the cryptographic key, has also been an object of research, according to [5, 6].
This paper presents an analysis of current consumption through standard curves
obtained using the Welch method for the AES and PRESENT encryption algorithms
implemented in a Xilinx Artix-7 family FPGA (Digilent Basys 3 board), with
measurements made using a prototype with an Adafruit INA219 sensor (Texas Instruments
chip) and an Arduino Uno microcontroller.
The article is organized as follows: Sect. 1 (Introduction) presents the subject and its
contributions; Sect. 2 gives an overview of FPGA implementations of AES and
PRESENT; Sect. 3 analyses related works; Sect. 4 addresses the methodological
procedures of this work; Sect. 5 presents the results and their analysis; finally, Sect. 6
presents the conclusions and future work.
2 FPGA Implementations
This section gives a brief account of the architectures of the AES and PRESENT
algorithms implemented in FPGA in this work.
It is important to note that the last round of AES encryption differs from the others in
that it does not perform the MixColumns operation. In this work, a version of AES
encryption based on [8] was implemented using VHDL and the Vivado Design Suite
(Xilinx) software tool, synthesized for the Basys 3 board FPGA (Artix-7 family).
Figure 1 illustrates the AES architecture employed. The AES version implemented
(Fig. 1) encrypts a plaintext in 11 rounds, using a 128-bit data block and parameterized
with a 128-bit key (the AES standard).
• Current State block (ESTATE): holds the current state of the cipher; for the first
round, ESTATE is the plaintext.
• Round-key generation (80-bit key): the sub-keys (round keys) used in each iteration
(addRoundKey) are produced by a specific round-by-round update process in which
the 64 MSBs of the key register are extracted as the round key after permutation,
substitution (S-box) and an XOR with the round counter (a minimal sketch of this
key schedule is given below).
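The following is a minimal software sketch of the standard PRESENT-80 round-key generation summarized in the item above; it is an illustration of the published algorithm, not the VHDL implemented in this work.

```python
# Sketch of PRESENT-80 round-key generation: rotate the 80-bit key register left
# by 61, apply the PRESENT S-box to its 4 MSBs, XOR the round counter into bits 19..15.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def present80_round_keys(key, rounds=32):
    mask80 = (1 << 80) - 1
    k = key & mask80
    round_keys = []
    for i in range(1, rounds + 1):
        round_keys.append(k >> 16)                    # the 64 MSBs form the round key
        if i < rounds:                                # the last register update is unused
            k = ((k << 61) | (k >> 19)) & mask80      # rotate left by 61 positions
            k = (SBOX[k >> 76] << 76) | (k & ((1 << 76) - 1))  # S-box on the 4 MSBs
            k ^= i << 15                              # XOR the round counter into bits 19..15
    return round_keys
```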
Figure 2 illustrates the architecture of the PRESENT cipher implemented.
The PRESENT version used in this work is based on [3, 9]; it encrypts a plaintext in
32 rounds and operates with 64-bit blocks for the text and an 80-bit key.
3 Related Works
circuit reflects the aggregate activity of its individual elements; for example, the
switching of the transistors may differ according to the data being processed, which is
reflected in a different consumption. In this context, the application of statistical
techniques that search for a correlation between the consumption data gathered during
the encryption process and other known data, aiming at obtaining the secret key, is
known as Differential Power Analysis.
According to [14], Differential Power Analysis (DPA) collects information about
the energy consumption of a physical system and statistically models these data to
obtain information that can break the cryptographic system.
Differential Power Analysis attacks on lightweight cryptographic algorithms were
performed in [15]. Different optimized architectures of the AES, Camellia, xTEA,
HIGHT and PRESENT algorithms were subjected to the attacks. The implementations
were performed on low-cost Spartan-3 FPGA hardware (Xilinx). The results showed
that architectures that use registers to store data and keys are more susceptible to attacks.
Algorithms that use a greater number of XOR operations also showed greater
vulnerability to attacks.
In [10], DPA attacks were analyzed on an area-optimized AES version. The
results showed that the low-power version of AES is more susceptible to Differential
Power Analysis. The work also proposed a design with integrated low-drop-out
regulators to increase the resistance of compact AES to DPA attacks.
This review shows that most of these works analyze energy consumption variables
by describing the pattern at run time. The present work, in addition to measuring the
current consumption of the AES and PRESENT implementations, analyzes the
characteristic curve generated during the encryption process using the power spectral
density estimator known as the Welch method.
4 Methods
This section describes the steps and procedures used to create the current-measurement
prototype, the architecture of the AES and PRESENT implementations used for the
experiments, the Welch mathematical model, and the conditions under which the tests
were performed.
(Figure: measurement setup: FPGA running the AES/PRESENT encryption rounds; INA219 current sensor; Arduino microcontroller that reads the current sensor and provides a serial interface.)
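Assuming, purely for illustration, that the Arduino firmware streams one current reading per line over its serial port (the actual acquisition protocol is not specified here), the host-side acquisition could be sketched as follows.

```python
# Hypothetical host-side acquisition sketch using pyserial; port, baud rate and
# line format are illustrative assumptions, not the authors' firmware protocol.
import serial

def read_current_samples(port="/dev/ttyACM0", baud=115200, n_samples=100):
    samples = []
    with serial.Serial(port, baud, timeout=1) as ser:
        while len(samples) < n_samples:
            line = ser.readline().decode(errors="ignore").strip()
            if line:
                try:
                    samples.append(float(line))   # one INA219 reading (mA) per line
                except ValueError:
                    pass                          # skip malformed lines
    return samples
```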
One characteristic of the Welch method is its ability to smooth the spectrum of a
signal, reducing the variance between the estimators and thereby giving a better
representation of the standard form of the acquired signals.
In this work, specific Matlab functions were used to compute the Welch-method
frequency response of the consumption data.
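An equivalent sketch in Python using scipy.signal.welch (the original processing was done in Matlab) for estimating and normalizing the power spectral density of a measured current trace is shown below; the sampling rate and segment length are placeholder parameters.

```python
# Sketch: Welch PSD of a current trace, normalized to its peak value
# (fs and nperseg are illustrative assumptions).
import numpy as np
from scipy.signal import welch

def normalized_welch(current_samples, fs=10e3, nperseg=256):
    f, pxx = welch(np.asarray(current_samples), fs=fs, nperseg=nperseg)
    return f, pxx / pxx.max()
```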
5 Analysis of Results
The collected results are evaluated on the basis of a significant number of samples for
each proposed measurement condition. For each round of encryption (AES with 11
rounds and PRESENT with 32 rounds), 100 samples of the total measured data
(approximately 160 KB for AES and 80 KB for PRESENT) were selected; the overall
mean of the dynamic state and the average for each round of encryption were then
calculated. The average consumption for the static condition was also calculated, but
over 500 samples. The FPGA clock speed configured for the simulations and
measurements was 10 kHz.
Data on the resources consumed by the FPGA and the performance of the implemented
encryption algorithms are shown in Table 1.
Table 1.

Cipher     Flip-Flops   LUTs   Slices   Latency (cycles)   Max Freq. (MHz)   Throughput (Gbps)   Efficiency (Mbps/slice)
AES        260          1336   372      11                 185.1             2.153               5.78
PRESENT    151          209    65       32                 346.0             0.692               10.64
The results show AES with high efficiency, but at the cost of approximately 5.7 times
more slices, while PRESENT uses a reduced number of resources, only 65 slices, but
needs approximately 3 times more latency cycles to encrypt a plaintext. These results
correspond to the maximum frequency that each design can achieve in the FPGA used.
Figure 5 shows the current consumption data for AES (a) and PRESENT (b) during
the encryption simulation.
From the graph, it can be seen that the variation in current consumption between the
encryption rounds is very small, no greater than 0.02 mA for both algorithms. Figure 6
shows this difference in more detail during the encryption of one block of text.
The results show AES with a greater consumption than PRESENT, which is
consistent with the implemented AES architecture, because it performs more complex
operations and also uses 128-bit blocks, while PRESENT encrypts 64-bit blocks; on the
other hand, the number of rounds needed to encrypt a plaintext is much smaller for AES,
11 rounds versus 32 for PRESENT.
Figure 7 shows a detailed comparison between the mean consumption in the static
and dynamic states.
In the static condition (which represents the leakage current of the circuit), the AES
consumption was 66.8 mA whereas that of PRESENT was 10.8 mA, i.e. the static
consumption of PRESENT was about 83.8% lower than that of AES.
Figure 8 shows the current consumption data of the implementations in the frequency
domain after application of the Welch method and subsequent normalization. The data
show the characteristic curves generated for AES and PRESENT.
Fig. 8. Normalized Welch current estimation of (a) AES and (b) PRESENT
6 Conclusions
This work presented an analysis of the energy consumption (current) of the AES and
PRESENT algorithms implemented in FPGA, comparing the hardware resources used
and the representations of the current in the time and frequency domains, the latter based
on the normalized Welch method for spectral density estimation.
From the data obtained in the experiment, it is concluded that, for the architectures
implemented in the FPGA, AES consumes approximately 5.7 times more slices than
PRESENT and has a current consumption approximately 33.6% higher, together with a
higher efficiency for AES compared with PRESENT. This difference can be explained by
the purpose for which each algorithm was developed: AES offers greater robustness and
works with a larger volume of data, while PRESENT is designed for ultra-lightweight
applications, which require a minimal implementation area and process a smaller amount
of data.
With respect to the comparison between the current represented in the time domain
and the current modeled by the normalized Welch method, it was demonstrated that
characteristic current-consumption curves can be obtained for AES and PRESENT,
making it easy to visually differentiate the two encryption algorithms.
As future work, we suggest measurements for different AES and PRESENT designs,
for example with other key sizes, as well as for other encryption algorithms, in order to
compare the responses generated by the Welch method and to confirm a behavioral
model that can easily aid in identifying a given encryption algorithm and can also
contribute to a side-channel attack process. Another suggestion is to measure at different
encryption frequencies, with equipment specific to this purpose.
References
1. Bogdanov, A., Mendel, F., Regazzoni, F., Rijmen, V., Tischhauser, E.: ALE: AES-based
lightweight authenticated encryption. In: Moriai, S. (ed.) FSE 2013. LNCS, vol. 8424,
pp. 447–466. Springer, Heidelberg (2014). doi:10.1007/978-3-662-43933-3_23
2. Chodowiec, P., Gaj, K.: Very compact FPGA implementation of the AES algorithm. In:
Walter, C.D., Koç, Ç.K., Paar, C. (eds.) CHES 2003. LNCS, vol. 2779, pp. 319–333.
Springer, Heidelberg (2003). doi:10.1007/978-3-540-45238-6_26
3. Bogdanov, A., Knudsen, L.R., Leander, G., Paar, C., Poschmann, A., Robshaw, M.J.B.,
Seurin, Y., Vikkelsoe, C.: PRESENT: an ultra-lightweight block cipher. In: Paillier, P.,
Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466. Springer, Heidelberg
(2007). doi:10.1007/978-3-540-74735-2_31
4. Tay, J.J., Wong, M.L.D., Wong, M.M., Zhang, C., Hijazin, I.: Compact FPGA implemen-
tation of PRESENT with Boolean S-Box. In: 2015 6th Asia Symposium on Quality
Electronic Design (ASQED), pp. 144–148. IEEE (2015). doi:10.1109/ACQED.2015.
7274024
5. Masoumi, M., Mohammadi, S.: A new and efficient approach to protect AES against
differential power analysis. In: 2011 World Congress on Internet Security (WorldCIS),
pp. 59–66. IEEE (2011)
6. Örs, S.B., Oswald, E., Preneel, B.: Power-analysis attacks on an FPGA – first experimental
results. In: Walter, C.D., Koç, Ç.K., Paar, C. (eds.) CHES 2003. LNCS, vol. 2779, pp. 35–
50. Springer, Heidelberg (2003). doi:10.1007/978-3-540-45238-6_4
7. Moreno, E.D., Pereira, F.D., Chiaramonte, R.B.: Software and Hardware Encryption.
Novatec, São Paulo (2005)
8. Palmeira, S.I.N., Góis, A.C.D.S., Dias, W.R.A., Moreno, E.D.: An Implementation of AES
algorithm in FPGA. In: 14th Microelectronics Students Forum (SForum), at Federal
University of Sergipe, Aracaju, Brazil (2014)
9. Gajewski, K.: Present a lightweight block cipher. In: Open Cores (2014). https://opencores.
org/project,present
10. Singh, A., Kar, M., Ko, J.H., Mukhopadhyay, S.: Exploring power attack protection of
resource constrained encryption engines using integrated low-drop-out regulators. In: 2015
IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED),
pp. 134–139. IEEE (2015). doi:10.1109/ISLPED.2015.7273503
11. Deng, L., Sobti, K., Zhang, Y., Chakrabarti, C.: Accurate area, time and power models for
FPGA-based implementations. J. Signal Process. Syst. 63(1), 39–50 (2011). doi:10.1007/
s11265-009-0387-7
12. Batina, L., Das, A., Ege, B., Kavun, E.B., Mentens, N., Paar, C., Verbauwhede, I., Yalçın,
T.: Dietary recommendations for lightweight block ciphers: power, energy and area analysis
of recently developed architectures. In: Hutter, M., Schmidt, J.-M. (eds.) RFIDSec 2013.
LNCS, vol. 8262, pp. 103–112. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41332-
2_7
13. Kocher, P., Jaffe, J., Jun, B., Rohatgi, P.: Introduction to differential power analysis.
J. Cryptograph. Eng. 1(1), 5–27 (2011). doi:10.1007/s13389-011-0006-y
14. Tang, M., Qiu, Z., Yang, M., Cheng, P., Gao, S., Liu, S., Meng, Q.: Evolutionary ciphers
against differential power analysis and differential fault analysis. Sci. China Inf. Sci. 55, 1–
15 (2012). doi:10.1007/s11432-012-4615-6
15. Yalla, P.S.: Differential power analysis on light weight implementations of block ciphers.
Doctoral dissertation, George Mason University (2009)
Spiral Model for Digital Forensics
Investigation
1 Introduction
Digital forensics is the scientific analysis of digital crimes. As the world goes
digital, physical crimes have also shifted to the digital level. Money is no longer stolen
only from a bank ATM; rather, it is stolen over a network or by other means with the
help of a computer or other digital devices. Crimes which involve digital means such as
computers, cell phones or other peripheral devices come under the category of digital
forensics, of which computer forensics is a subdivision. The term forensics implies
similarity to conventional crimes; the only difference is that we are dealing with the
digital domain. Hence, digital investigation has to be done for digital crimes. The
analysis of suspected digital
crimes, investigating them to determine the criminal, and gathering proper evidence is
known as the process of digital forensics. Digital forensics is therefore becoming very
popular nowadays as a means of restricting digital crime. Just as a normal investigation
is carried out for physical crimes, digital forensics is also defined by a proper format.
However, as of now there is no properly defined method that can be followed efficiently
to perform a digital investigation: some methods fit a particular scenario but not other
cases, and some methods are defined for particular digital devices such as computers but
are not applicable to other digital media such as cell phones. This paper provides a
solution to these problems. It presents a model that fits and is suitable for all scenarios as
well as for all digital devices. The model is very flexible in nature and consists of
different paths which can be followed in case we get stuck in any situation. The biggest
advantage of this model is that it uses previous conditions, or previous iterations, to
determine the further steps to be followed. Hence, if the previous process is successful,
we can move on to the next step as defined; otherwise, we have the option of modifying
the next step according to the condition of the previous step. This flexibility between
iterations makes the model generic for all scenarios and for all digital devices as well.
In 2008 it was recorded in the USA that about 98% of documents were created
electronically, and approximately 85% of 66 million dollars was lost by the US
government due to cyber-related crimes. Digital forensics is the use of scientific methods
for the identification, preservation, collection, validation, analysis, interpretation,
documentation and presentation of digital evidence, so that it can be produced properly
in a court of law. Digital evidence is data that provides a link between the cause of a
crime and the criminal. Digital evidence is fragile in nature: it can be modified, altered or
updated by the criminal, just like fingerprints in a physical crime. In a digital forensic
investigation, the first and foremost task is to collect evidence, just as evidence is
collected for physical crimes. This task is performed by trained professionals; it is
similar to collecting fingerprints at a physical crime scene. For digital crimes a backup of
the data is taken on some form of mass storage medium such as a floppy disk, CD or
hard drive. While doing this, one thing must be kept in mind: the network must be
disconnected, so that there is no possibility of malicious software getting in and altering
the images or whatever data has been collected. After collecting the evidence, the next
step is to keep a backup of that data or those images, so that a copy of the original data
exists which, if required, can be compared later to know whether the original evidence
has been altered. The third step is to prepare a document describing the crime scene
properly, so that a person can analyze the crime scene even if he was not present at the
spot where the crime occurred. The fourth step is to keep the evidence safe so that
nobody can alter it; for this, an MD5 or SHA1 hash code is generated for the data or
images and stored in the database (a minimal sketch of this step is given at the end of
this section). The final step is to generate a hypothesis for the crime. For example, if a
file was found on a drive, the hypothesis could be that the malicious file was first
downloaded from the Internet and stored in the download folder, and from that folder the
file was copied to some other drive. Finally, all the evidence collected, together with a
suitable hypothesis against the criminal, must be presented in the court of law for
adjudication.
Section 1 was the introduction to digital forensics; Sect. 2 describes the literature
survey and the background work done so far; Sect. 3 presents the proposed model for
dealing with digital forensics problems, so as to successfully perform a digital
investigation; Sect. 4 gives the advantages of the proposed model over previous models;
Sect. 5 concludes the paper. The last section provides the references.
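The following is a minimal sketch of the integrity-hashing step mentioned above: computing MD5 and SHA1 digests of an acquired image so that later copies can be verified against them.

```python
# Sketch: compute MD5 and SHA1 digests of an evidence image in chunks so that
# subsequent copies can be compared against the stored digests.
import hashlib

def hash_evidence(image_path, chunk_size=1 << 20):
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(image_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```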
2 Background
Phase 1:
As described in Fig. (1), Mark M. Pollitt [MP95] proposed four steps for digital
forensics: acquisition, identification, evaluation, and admission as evidence, so that the
evidence can be documented in a court of law. Their outputs are media (physical
context), information (legal context), and evidence, respectively. However, no
generalized process was available to follow for each and every case.
As described in Fig. (2), Farmer and Venema [FV99] defined a methodology for
digital forensics as "secure and isolate, record the scene, conduct a systematic approach
for evidence, collect and package evidence, and maintain a chain of custody", but the
drawback was that it was defined mainly for UNIX forensic procedures. Mandia and
Prosise [MP01] overcame the drawbacks of the previous methodologies by defining the
steps as "pre-incident preparation, detection of incidents, initial response, response
strategy formulation, duplication, investigation, security measure implementation,
network monitoring, recovery, reporting, and follow-up". However, this method was
applicable only to Windows NT/2000, UNIX and Cisco routers; another drawback was
that it was not applicable to all digital devices, such as personal digital assistants,
peripheral devices, cell phones or future digital technology. Then came the standard
abstract model by the U.S. Department of Justice [TWG01], which includes "collection,
examination, analysis, and reporting". This model overcame the earlier drawbacks by
defining a generic method which could be applied to all digital devices, but its analysis
phase was ambiguous, so the model was not properly defined. Finally came the
milestone that allowed future research to proceed in a well-planned manner: the base was
given by the Digital Forensic Research Workshop [DFRW01], which, surprisingly, was
held by academics rather than law enforcement. It identifies the steps as "identification,
Phase 2: 2004–2007
As described in Fig. (6), the framework proposed by Ciardhuain [EMCI04] could be considered the most complete one up to that date, since it includes the activities “awareness, authorization, planning, notification, search and identify, collection, transport, storage, examination, hypotheses, presentation, proof, defense and dissemination”. It provides a basis for the development of tools and techniques to support the work of digital investigators. Baryamureeba and Tushabe [EDIP04] made some additions to the Integrated Digital Investigation Model [IDIM03] and removed one of its disadvantages by drawing a clear distinction between the primary and the secondary crime scene through the addition of two phases, the trace-back phase and the dynamite phase. It also makes those phases linear instead of iterative.
As described in Fig. (7), the Hierarchical Objectives-Based Framework for the Digital Investigations Process [HOF04] by Beebe and Clark was a multi-tiered model, in contrast to the single-tier models discussed so far. The phases of the first tier are “preparation, incident response, data collection, data analysis, presentation and incident closure”. In the second tier, the data analysis phase is further organized into a survey phase, an extract phase and an examine phase. The analysis tasks are structured using the concept of objective-based tasks. This framework offers unique benefits in the areas of practicality and specificity.
Fig. 7. Hierarchical objectives based framework for the digital investigations process
As described in Fig. (8), in 2004 Carrier and Spafford [EBD04] proposed a model consisting of three phases, named the Preservation, Search and Reconstruction phases. The reconstruction phase is the construction of hypotheses to develop and test the evidence collected at the crime scene, so this model was based entirely on the causes and effects of events. However, the completeness of each phase was not clearly specified, and hence it was not clear whether the framework was sufficient.
As described in Fig. (9), Rubin, Yun and Gaertner [CRI05] carried on the work of Carrier [EBD04], [IDIM03] and Beebe [HOF04] by introducing the concepts of seek knowledge and case relevance. Seek knowledge refers to the investigative clues that drive the analysis of the data. Case relevance is the degree to which a piece of information helps answer the “who, what, where, when, why and how” questions of a criminal investigation [CRI05]. There are several levels of case relevance: “Absolutely Irrelevant, Probably Irrelevant, Possibly Irrelevant, Possibly Case-Relevant, Probably Case-Relevant”. A paper on the visualization of data in intrusion detection systems and network forensic situations was proposed by Erbacher, Christensen and Sunderberg [VFTP06]; it argued that different visualization techniques are required for different kinds of analysis and that they have to be integrated at the end in order to reach a final conclusion. Kent, Chevalier, Grance and Dang [GIF06] published four basic steps for digital forensics, “Collection, Examination, Analysis and Reporting”, which are very similar to [MP01]. First, the data collected from the media is transformed into a format that forensic tools can understand; then the data is turned into information through analysis; finally, the information is converted into evidence during the reporting phase, for presentation in a court of law. The Computer Forensic Field Triage Process Model [CFFTPM06] was derived from the IDIP framework [IDIM03]. It works on the principle of performing the digital investigation on-site, in the field, instead of taking snapshots back to the lab for examination. Its major advantages are the short time frame required to conduct the investigation and its practical, pragmatic nature; its drawback is that it cannot be applied to all situations.
As described in Fig. (10), the Framework for a Digital Forensic Investigation by Kohn, Eloff and Olivier [FDFI06] comes with three basic stages for a digital investigation: “preparation, investigation and presentation”. The number of steps is reduced to three because in all the previous models the phases overlapped one another and differed only in terminology. Hence, this model has the advantage
that it merges the redundant steps and, moreover, can easily be expanded to include additional phases in the future.
As described in Fig. (11), the Common Process Model for Incident and Computer Forensics [CPM07], proposed by Freiling and Schwittay, was introduced to combine the advantages of both incident response and computer forensics in order to improve the overall investigation process. This framework consists of “Pre-Incident Preparation, Pre-Analysis, Analysis and Post-Analysis”. The pre-analysis phase contains all steps and activities performed before the actual analysis, such as collecting evidence. Post-analysis consists of activities such as producing the documentation for the court of law. The actual analysis, such as investigating the collected images, is performed in the analysis phase. Hence, the model combines the features of incident response, performed in pre- and post-analysis, with computer forensics, performed in the actual analysis.
Fig. 11. Common process model for incident and computer forensics
Phase 3: 2008–2014
As described in Fig. (12), Perumal [DFIMP09] introduced two important stages into the digital forensics investigation model, live data collection and static data acquisition, so as to focus on fragile evidence. The Digital Forensic Process Model proposed by Cohen [TSDFE10] breaks the process into seven phases as “Identification,
This model draws its inspiration from the spiral model for software development which
has the basic characteristic of cyclic approach for incrementally growing a system’s
degree of definition and implementation while decreasing its degree of risk. It is equally applicable to the digital forensic process, as it can be described by the following generalized, additive iterations:
• Determine objectives, alternatives and constraints
• Evaluate alternatives, identify, resolve risks
• Develop, verify, next-level phase
• Plan next phase
As described in Fig. (14), when the investigation process begins, the investigator performs activities implied by a circuit around the spiral in a clockwise direction, starting at the centre. Five iterations are proposed in the model, based on the five phases of the Common Phases of Computer Forensics Investigation Process Model.
Iteration 1: Preparation
The process starts with the investigator forming an approach strategy based on previous knowledge or prior experience. This phase involves planning the course of action for the investigation based on the chosen strategy and gathering the requirements.
Iteration 2: Acquisition
Based on the chosen approach strategy and the pre-analysis, the next steps for acquisition are selected. Risk analysis is performed on all the available steps and the ones that best suit the situation are chosen. The set of actions chosen for carrying out the acquisition is called Approach 1. Identification of evidence and preservation of evidence are then carried out according to Approach 1.
Iteration 3: Case-Specific Analysis
Based on the evidence gathered in the previous step, a set of actions is again chosen, after risk analysis, to carry out the case-specific analysis. This set of actions is termed Approach 2; examination of evidence, hypothesis creation and reconstruction of the crime scene are done according to it.
Iteration 4: Presentation
According to the outcome of the case-specific analysis, steps for the presentation of evidence are now chosen from the table, after risk analysis has been performed on them. This set of actions is grouped as Approach 3. The evidence is admitted in the court of law, and all the proof and defense are presented.
Iteration 5: Final Step
Approach 1, Approach 2 and Approach 3 are grouped into one, together with any steps that could have been added to make the investigation smoother; this is termed the Final
Approach. After the incident response has been recorded and the evidence has been returned, the whole process is documented and a review is made of what could have been done differently to reach the conclusion faster or more efficiently.
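To make the cyclic, risk-driven character of these iterations concrete, the following is a purely illustrative Java sketch (not part of the original model): each turn of the spiral applies the same four activities to the next phase, and placeholder methods stand in for the investigator's approach selection and risk analysis.

public class SpiralInvestigation {
    enum Phase { PREPARATION, ACQUISITION, CASE_SPECIFIC_ANALYSIS, PRESENTATION, FINAL_STEP }

    // Placeholder: the investigator evaluates candidate actions for this phase.
    static String chooseApproach(Phase phase, String previousOutcome) {
        return "approach selected for " + phase + " given: " + previousOutcome;
    }

    // Placeholder: risk analysis decides whether the current direction is still sound.
    static boolean riskAcceptable(String approach) {
        return true;
    }

    public static void main(String[] args) {
        String outcome = "initial strategy from prior experience";
        for (Phase phase : Phase.values()) {
            String approach;
            do {
                approach = chooseApproach(phase, outcome); // determine objectives and alternatives
            } while (!riskAcceptable(approach));           // evaluate alternatives, resolve risks
            outcome = "result of " + approach;             // develop and verify this iteration
            System.out.println("Plan next phase using: " + outcome);
        }
    }
}

The point of the loop is simply that the outcome of one phase feeds the approach selection of the next, which is the flexibility the model claims over a fixed, linear sequence.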
The major drawback of all the previous models is that the same fixed model had to be followed for every investigation, irrespective of the situation. The proposed model gives the investigator the flexibility to decide how to carry the investigation forward in the next phase based on the information gathered in the previous phase; in effect, the investigator can build a custom model according to the requirements of the crime committed. Another drawback of the other methods is that a single generalized strategy is needed that applies to all digital devices, such as computers and cell phones. Because the proposed method is flexible and can be changed at any time as required, it is applicable to all devices. In addition, the spiral model contributes a risk-analysis factor, which helps determine whether what has been done so far is heading in the right direction. For example, when a hypothesis is proposed to reconstruct the crime scene, risk analysis can be performed at that very stage to judge whether the investigation is proceeding correctly and to estimate whether the methodology is going to work. Due to all these advantages, the proposed method compares favourably with the other methods. A final advantage of using the spiral model is that it fits well with agile methodology: software following the spiral model is developed quickly so that a prototype can be analyzed as soon as possible, because the spiral model provides a fast and efficient way to compare the software product with the business requirements. Similarly, here the spiral model provides a fast and efficient way to analyze the investigation by comparing the hypothesis with the documented crime scene, so that the investigation can be re-analyzed quickly and altered to fit the crime scene if the hypothesis turns out to be wrong. In a nutshell, this model overcomes the drawbacks of the previous models in the best possible and smoothest manner.
Digital forensics needs a set of tools, along with a proper methodology, to perform a digital investigation and to produce evidence in a court of law. Digital data is numerical: it is generally represented in binary as 0 and 1 bits, and these bits are usually written to a hard disk. The hard disk therefore constitutes the physical evidence in digital forensics, but we are more interested in the digital evidence. The data written in binary format to the hard disk represents a state of the hard disk, and this state changes as soon as more data is written to
it. The basic motive of digital forensics is to preserve this state in the form of evidence when a digital crime happens, so that it can be produced in a court of law. To carry out the digital investigation of digital evidence, a particular framework has to be followed. The framework consists of gathering all the evidence with the help of experts, so that the original evidence is not disturbed. After collection, the evidence is documented and stored properly, and a backup is protected using the MD5 or SHA-1 hashing algorithms and stored in a database. After the preservation of evidence, hypothesis generation starts: a hypothesis of the crime is generated according to the evidence obtained. Finally, the documents are presented in the court of law. Even then, the presentation fails in many cases for reasons such as an incorrect hypothesis, evidence altered through improper handling, or a backup of the evidence that was not created properly. As in the waterfall model of software engineering, it is then infeasible to start again from the first phase and repeat the complete process. The solution proposed by software engineering for such situations is the spiral model, and we follow the same model here. The proposed model therefore has a spiral structure in which these phases are traversed again and again, so that the digital forensic investigation is performed properly. It also provides the flexibility of modifying the next phase according to the drawbacks or shortcomings of the previous phase. Hence, this model is well suited to digital forensic investigation of all kinds of devices, such as computers and mobile phones, and, above all, it is applicable to all scenarios, including cloud computing and the Big Data domain [22, 23].
References
[MP95] Pollitt, M.M.: Computer forensics: an approach to evidence in cyberspace. In:
National Information System Security Conference (1995)
[FV99] Farmer, D., Venema, W.: Computer Forensics Analysis Class Handouts (1999)
[MP01] Mandia, K., Prosise, C.: Incident Response. Osborne/McGraw-Hill (2001)
[TWG01] Technical Working Group for Electronic Crime Scene Investigation: Electronic
Crime Scene Investigation: A Guide for First Responders (2001)
[DFRW01] Digital Forensics Research Workshop. A Road Map for Digital Forensics
Research (2001)
[ADFM02] Reith, M., Carr, C., Gunsch, G.: An examination of digital forensic models. Int.
J. Digit. Evid. 1(3), 1–12 (2002)
[IDIM03] Carrier, B., Spafford, E.: Getting physical with the investigative process. Int.
J. Digital Evidence (2003)
[CADII03] Stephenson, P.: A Comprehensive Approach to Digital Incident Investigation.
Elsevier Information Security Technical report (2003)
[EMCI04] Ciardhuain, S.O.: An extended model of cybercrime investigations. Int. J. Digit.
Evid. 3(1), 1–22 (2004)
[EDIP04] Baryamureeba, V., Tushabe, F.: The enhanced digital investigation process
model. In: DFRWS (2004)
[HOF04] Beebe, N., Clark, J.: A hierarchical objectives based framework for the digital
investigations process. In: DFRWS (2004)
[EBD04] Carrier, B., Spafford, E.: An event based digital forensic investigation
framework. In: DFRWS (2004)
[CRI05] Rubin, G., Yun, C., Gaertner, M.: Case-relevance information investigation:
binding computer intelligence to the current computer forensic framework. Int.
J. Digit. Evid. 4(1), 1–13 (2005)
[VFTP06] Erbacher, R.F., Christensen, K., Sunderberg, A.: Visual forensic techniques and
processes (2006)
[FDFI06] Kohn, M., Eloff, J.H.P., Olivier, M.S.: Framework for a digital forensic
investigation. In: Proceedings of Information Security South Africa (ISSA) (2006)
[GIF06] Kent, K., Chevalier, S., Grance, T., Dang, H.: Guide to Integrating Forensics into
Incident Response. NIST Special Publication 800-86 (2006)
[CFFTPM06] Rogers, M.K., Goldman, J., Mislan, R., Wedge, T., Debrota, S.: Computer
forensics field triage process model. In: Conference on Digital Forensics Security
and Law (2006)
[CPM07] Freiling, F., Schwittay, B.: A common process model for incident response and
computer forensics. In: Conference on IT Incident Management and IT Forensics
(2007)
[DFIMP09] Perumal, S.: Digital Forensic Model based on Malaysian Investigative Process
(2009)
[TSDFE10] Cohen, F.: Toward a science of digital forensic evidence examination. In: Chow,
K.-P., Shenoi, S. (eds.) DigitalForensics 2010. IFIP IAICT, vol. 337, pp. 17–35.
Springer, Heidelberg (2010). doi:10.1007/978-3-642-15506-2_2
[SDFIM11] Agarwal, A., Gupta, M., Gupta, S., Gupta, C.: Systematic digital forensic
investigation model. Int. J. Comput. Sci. Secur. 5(1), 118–131 (2011)
[22] Jones, A., Vidalis, S., Abouzakhar, N.: Information security and digital forensics
in the world of cyber physical systems. In: Eleventh International Conference on
Digital Information Management (2016)
[23] Jones, J., Etzkorn, L.: Analysis of digital forensics live system acquisition
methods to achieve optimal evidence preservation. In: SoutheastCon (2016)
Smart-Lock Security Re-engineered
Using Cryptography and Steganography
1 Introduction
The domain of the Internet of Things (IoT) has shown significant capability to drastically change the technological world. IoT systems include computing and household devices as well as sensors; thanks to IoT, household devices can be controlled with a tap on a mobile screen. Gartner has predicted that the number of IoT devices will reach about 20.4 billion by the year 2020 [1]. IoT devices have made people's lives easier in a number of ways. Nonetheless, security experts have expressed concerns about the threats and vulnerabilities that these devices bring along, termed the 'Insecurity of Things'.
Mobile devices that connect to smart locks using the Bluetooth Low Energy (BLE) protocol are vulnerable to various security attacks, such as the Man-in-the-Middle (MITM) attack. BLE is a power-efficient technology capable of transferring data between smart-phones and IoT devices. In a MITM attack, an intruder impersonates a receiver and takes hold of the communication between two parties; such attacks have been found to be feasible over the BLE protocol.
When home automation and security are under consideration, locks, whether mechanical or electronic, are a necessity. The problem associated with any physical lock, however, is key handling and management: humans tend to be forgetful and multiple keys have to be managed, which is why mechanical locks were replaced by electromagnetic locks. Yet this still did not address the issue of remote accessibility. In the age of smart-phones and a hyper-connected world, it is essential to be able to control locks remotely from hand-held devices, and smart locks have been introduced to address this need. The issue is that, despite the promise of accessibility, ease of use and comfort associated with smart locks, security remains a constant threat. The problem, then, is to tackle the security threats and attacks on IoT-based smart locks.
The ongoing research in the field of the Internet of Things and the BLE protocol relies heavily on cryptography; the Advanced Encryption Standard (AES) is used for encryption and decryption. However, research has identified attacks that remain possible despite such cryptographic protection, such as MITM and masquerade attacks. Moreover, a few papers use one-time passwords (OTPs) to secure the communication, but OTP generation is an intensive task that depends on network bandwidth and therefore suffers from latency issues.
This paper investigates the working of the BLE protocol and highlights the underlying architecture designed for communication over BLE. In addition, its vulnerabilities are studied, and a synthesis of cryptographic and steganographic techniques is implemented to prevent MITM attacks on the BLE protocol. Such a combined approach tackles the shortcomings of cryptography and steganography taken individually, while preserving the advantages of each.
The paper is organized as follows. Section 2.1 focuses on the architecture of BLE as well as its existing vulnerabilities. Section 2.2 throws more light on the MITM attack and its relevance to the BLE protocol. Section 2.3 reviews steganography as a possible solution to the existing problems in BLE. The existing solutions in the sphere of IoT devices and the BLE protocol are presented in Sect. 3. In Sect. 4, a combination of steganography with cryptography is proposed as a possible solution. The actual implementation of the system is described in Sect. 5, followed by a discussion of the results in Sect. 6. The article is concluded in Sect. 7.
2 Related Work
BLE is a wireless technology which consumes little energy and supports short-range communication. It can be used in various fields such as entertainment, health and sports. BLE devices are easy to maintain and can work for years on coin-cell batteries [3]. Although low-power technologies such as ZigBee, 6LoWPAN and Z-Wave have made their mark in the market, BLE has greater deployment expectations [2].
Security at the Link Layer. Authentication and encryption are done using the Counter with Cipher Block Chaining-Message Authentication Code (CCM) algorithm and a 128-bit AES block cipher. When a connection uses both encryption and authentication, a 4-byte Message Integrity Check (MIC) is appended to the data channel PDU, and the Payload and MIC fields are then encrypted. Authenticated data can also be passed over an unencrypted channel by using digital signatures; an algorithm based on the 128-bit AES block cipher generates the signature [2]. A counter is given as one of the inputs to this algorithm, which protects against replay attacks. If the receiver successfully verifies the signature, the data is assumed to have been sent by a trusted source.
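As a rough illustration of the encrypt-and-append-a-MIC pattern described above: Java's default providers do not expose CCM, so the sketch below substitutes AES/GCM purely to show the authenticated-encryption idea. It is not the BLE CCM construction, and the real link-layer MIC is only 4 bytes, whereas the tag here is 16 bytes.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

public class LinkLayerSketch {
    public static void main(String[] args) throws Exception {
        // 128-bit session key, standing in for the BLE link-layer key.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        byte[] nonce = new byte[12];                  // per-packet nonce (playing the counter role)
        new SecureRandom().nextBytes(nonce);
        byte[] pdu = "unlock-request".getBytes("UTF-8");

        // Encrypt the payload; the authentication tag plays the role of the MIC.
        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce));
        byte[] payloadPlusTag = enc.doFinal(pdu);

        // The receiver verifies the tag while decrypting; a modified packet is rejected.
        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, nonce));
        byte[] recovered = dec.doFinal(payloadPlusTag);
        System.out.println(new String(recovered, "UTF-8"));
    }
}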
For communication over BLE, pairing is an important task. Pairing in BLE is done in three phases. In the first phase, the devices announce their input-output capabilities. Subsequently, a Short Term Key (STK) is generated for the secure distribution of the key material required in the next phase. First, both devices agree on a Temporary Key (TK), using Out-of-Band communication, Passkey Entry or Just Works; based on the TK and random values generated by both devices, the STK is derived. In the next phase, each end-point sends the other three 128-bit keys: the Long Term Key (LTK), the Connection Signature Resolving Key (CSRK) and the Identity Resolving Key (IRK). The Long Term Key is used for link-layer encryption and authentication, the Connection Signature Resolving Key performs data signing at the ATT layer, and the Identity Resolving Key generates a private address based on the public address of the device. The STK generated in Phase 2 is used for encryption while these three keys are distributed. In all three phases, the message exchange is carried out by the Security Manager Protocol (SMP).
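The STK derivation in the second phase can be pictured, very roughly, as an AES-128 encryption under the TK of material taken from the two exchanged random values. The sketch below follows only that shape; the exact s1 construction, byte ordering and protocol framing defined in the Bluetooth specification are deliberately omitted.

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

public class StkSketch {
    // Roughly: STK = AES-128_TK(half of slaveRandom || half of masterRandom).
    static byte[] deriveStk(byte[] tk, byte[] slaveRandom, byte[] masterRandom) throws Exception {
        byte[] block = new byte[16];
        System.arraycopy(slaveRandom, 0, block, 0, 8);   // simplified: the real s1 fixes which halves and in what order
        System.arraycopy(masterRandom, 0, block, 8, 8);
        Cipher aes = Cipher.getInstance("AES/ECB/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(tk, "AES"));
        return aes.doFinal(block);
    }

    public static void main(String[] args) throws Exception {
        SecureRandom rng = new SecureRandom();
        byte[] tk = new byte[16];        // TK agreed via Just Works / Passkey Entry / OOB
        byte[] srand = new byte[16];
        byte[] mrand = new byte[16];
        rng.nextBytes(srand);
        rng.nextBytes(mrand);
        System.out.println("STK length: " + deriveStk(tk, srand, mrand).length + " bytes");
    }
}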
In order to better understand the working of MITM attacks, the survey in [4] was reviewed. The MITM attack is a prominent attack in computer security and a pressing concern for security experts and academia. MITM targets the data flowing between two victims, thereby attacking the confidentiality and integrity of the data itself.
Fig. 1. MITM exchange methodology

In a MITM attack, the intruder has access to the communication channel between two victims, enabling him to manipulate the messages flowing through it; the attack is visualized in Fig. 1. Specifically, the victims try to establish a secure communication by exchanging their public keys (P1 and P2) with each other. The attacker intercepts the public keys P1 and P2 and, in response, sends its own public key (P3) to both victims. Consequently, victim 1 encrypts its message using the attacker's public key (P3) and sends it to victim 2 (message E1). Since the public key used for encryption was the attacker's, decryption has to be carried out with the attacker's private key. The attacker intercepts E1 and decrypts it using the corresponding private key. The attacker then encrypts some plain-text message using victim 2's public key and transmits it to victim 2 (message E2). When victim 2 is able to decrypt the messages sent by victim 1, it means that the attacker has deceived both victim parties into believing that they are communicating over a secure channel.
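The key-substitution step of this exchange can be reproduced in a few lines. The sketch uses RSA key pairs purely for brevity and is not tied to BLE pairing; it only shows the attacker decrypting E1 with its own private key and re-encrypting the message for victim 2.

import javax.crypto.Cipher;
import java.security.KeyPair;
import java.security.KeyPairGenerator;

public class MitmSketch {
    static byte[] crypt(int mode, java.security.Key key, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        c.init(mode, key);
        return c.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair victim2 = gen.generateKeyPair();   // owns P2
        KeyPair attacker = gen.generateKeyPair();  // owns P3

        // Victim 1 believes P3 is victim 2's public key and encrypts with it (message E1).
        byte[] e1 = crypt(Cipher.ENCRYPT_MODE, attacker.getPublic(), "open the lock".getBytes("UTF-8"));

        // The attacker decrypts E1, may alter it, then re-encrypts under the real P2 (message E2).
        byte[] plain = crypt(Cipher.DECRYPT_MODE, attacker.getPrivate(), e1);
        byte[] e2 = crypt(Cipher.ENCRYPT_MODE, victim2.getPublic(), plain);

        // Victim 2 decrypts E2 successfully and suspects nothing.
        System.out.println(new String(crypt(Cipher.DECRYPT_MODE, victim2.getPrivate(), e2), "UTF-8"));
    }
}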
A MITM attack can be carried out over various communication channels such as UMTS, Long-Term Evolution (LTE), Wi-Fi, GSM, Bluetooth, and Near Field Communication (NFC), and it aims to compromise the confidentiality and integrity of the exchanged data.
There are at least three ways of characterizing MITM attacks, based on:
1. Impersonation techniques
2. Communication channel in which the attack is executed.
3. Location of intruder and victim in the network.
2.3 Steganography
3 Existing Solutions
4 Proposed Solution
The security and privacy of any information traveling across a channel that promotes open communication is a major problem. Hence, in order to prevent unauthenticated and unwarranted access and usage, confidentiality and integrity are needed. Of the many methods available, steganography and cryptography are two of the most used: the first hides the very existence of the information, while the second scrambles the data itself [14]. In the case of cryptography, the data is transformed into another, incomprehensible format which is then sent over the network. In the case of steganography, a stego-file such as an image, text, audio or video file is used as a carrier in which the message is embedded, and the stego-file is then transferred over the communication channel. This paper is based on harnessing the advantages of both approaches.
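A minimal sketch of that combination, assuming a PNG cover image on disk (the file names and the fixed key below are hypothetical): the secret is first encrypted with AES, and the ciphertext bits are then hidden in the least-significant bits of the cover image's blue channel. Extraction is simply the reverse walk over the same pixels followed by AES decryption.

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

public class StegoSketch {
    // Encrypt first (cryptography), then hide the ciphertext in pixel LSBs (steganography).
    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes("UTF-8");        // hypothetical 128-bit key
        Cipher aes = Cipher.getInstance("AES/ECB/PKCS5Padding");  // ECB only to keep the sketch short
        aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        byte[] cipherText = aes.doFinal("unlock".getBytes("UTF-8"));

        BufferedImage img = ImageIO.read(new File("wallpaper.png")); // hypothetical cover image
        int bit = 0;
        for (int y = 0; y < img.getHeight() && bit < cipherText.length * 8; y++) {
            for (int x = 0; x < img.getWidth() && bit < cipherText.length * 8; x++) {
                int rgb = img.getRGB(x, y);
                int b = (cipherText[bit / 8] >> (7 - (bit % 8))) & 1; // next ciphertext bit
                img.setRGB(x, y, (rgb & ~1) | b);                     // overwrite the blue-channel LSB
                bit++;
            }
        }
        ImageIO.write(img, "png", new File("wallpaper-stego.png"));  // visually identical stego-image
    }
}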
5 Implementation
5.1 System Design
In order to create a remote-controlled system for accessing the electromagnetic lock, the system was first designed as shown in Fig. 4. It consists of four main components: an Android smart-phone, a Raspberry Pi, an electromagnetic lock and a server. The Bluetooth and WiFi modules in Android are fairly robust, with good documentation support. Of the multiple versions of the Raspberry Pi, the latest Pi 3 Model B has built-in WiFi and Bluetooth 4.0 (BLE). The server is needed to log system usage, providing future scope for analytics and for understanding usage patterns and statistics.

Fig. 4. System design
As shown in Fig. 5, the mains 230 V alternating current is fed to a miniature circuit breaker (MCB), which breaks the circuit during a power failure or short circuit and thus prevents damage to the internal circuit components. A power adapter converts the 230 V supply to the 5 V needed by the Raspberry Pi. The Raspberry Pi provides 3.3 V with respect to its ground as output on the general-purpose input/output (GPIO) pins. A relay acts as the switch for accessing the lock; the relay circuit, however, works on a 12 V supply provided by an SMPS. The circuit is completed by connecting the electromagnetic lock in series with the relay and joining the grounds of the Raspberry Pi and the electromagnetic lock.
Fig. 6. Image size vs total time taken
Fig. 7. Image size vs BLE transfer time
To evaluate the efficiency of the BLE protocol, the time needed only for transferring the image was tracked first. The BLE transfer time accounts for the time needed to send the image from the Android smart-phone to the Raspberry Pi 3 over the BLE protocol. A fairly linear relationship was found between image size and BLE transfer time, so smaller images are transferred faster, as shown in Fig. 7.
Table 1 summarizes the relationship between the image size, its dimensions and the time needed; it shows a direct relationship between the image size and the time needed for processing the image.
Fig. 8. Wallpaper
Figures 8 and 9 show how difficult it is for an attacker to find differences between the two images. With no visual difference between the pre-processing and post-processing images, the requirement of providing an additional layer of security to the existing IoT system is satisfied.
Fig. 9. Airplane
7 Conclusion
This paper has reviewed existing security threats in the sphere of IoT, the vulnerabilities of the BLE protocol and related work on MITM attacks. From the study of the BLE protocol, various issues were found, including the possibility of a MITM attack. Existing solutions involve SMS one-time passwords, cryptography and steganography, yet a few vulnerabilities persist. According to the study of these techniques, a combination of cryptography and steganography eliminates the disadvantages of the individual methods while retaining their advantages. An implementation of such a methodology can aid research in the field of IoT security and fortify the future of BLE-enabled IoT devices.
References
1. Gartner Says 8.4 Billion Connected. Gartner.com (2017). Accessed 8 June 2017
2. Gomez, C., Oller, J., Paradells, J.: Overview and evaluation of bluetooth low
energy: an emerging low-power wireless technology. Sensors 12(12), 11734–11753
(2012). doi:10.3390/s120911734
3. Al Hosni, S.H.: Bluetooth low energy: a survey. Int. J. Comput. Appl. (0975–8887)
162(1) (2017)
4. Conti, M., Dragoni, N., Lesyk, V.: A survey of man in the middle attacks. IEEE
Commun. Surv. Tutor. 18(3), 2027–2051 (2016)
5. Green, I.: DNS spoofing by the man in the middle (2005). http://www.sans.org/
rr/whitepapers/dns/1567.php
6. Fisher, D., et al.: New Attack Finds AES Keys Several Times Faster Than Brute
Force. Threatpost — The first stop for security news (2017). Accessed 25 Jan 2017
7. Thiyagarajan, P., Aghila, G., Venkatesan, V.P.: Stepping up internet banking secu-
rity using dynamic pattern based image steganography. In: Abraham, A., Mauri,
J.L., Buford, J.F., Suzuki, J., Thampi, S.M. (eds.) ACC 2011. CCIS, vol. 193, pp.
98–112. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22726-4 12
Android Proactive Forensics
1 Introduction
Smartphones have become essential not only to law-abiding citizens but also to criminals and terrorists. The typically small screen makes the phone a mobile, pocketable device that is physically ever present with its owner. The equipment included in the phone, such as GPS (location gathering), cameras (still photos and videos), microphones (voice and ambient sound recording), WiFi, Bluetooth and NFC networking, the various sensors, ever-increasing persistent storage (eMMC), and multicore CPUs (e.g., the 64-bit Snapdragon 808), is capable of recording all kinds of data that is highly usable as evidence in a forensic investigation (Fig. 1).
Proactive forensics anticipatorily collects evidence data. The pfs service will con-
stantly monitor the device, gather the information, store it in a stealthy location
Fig. 1. pfs based forensic investigation
Fig. 2. pfs architecture
Fig. 3. Forensic phases
within the device, and opportunistically upload to the cloud. The size of such col-
lected data may overwhelm the capacity of the device. So, pfs caches the tip of
this iceberg of data and stores the rest in the cloud. If a forensic investigation never
becomes necessary, all this effort was wasted. Proactive forensics takes this excess
resource usage risk. At the end of the day, the traditional forensics is always doable.
2 Background
Digital Mobile Forensics: Books such as Tamma and Tindall (2015) and
Hoog (2011) are good introductions to the (non-proactive, i.e., reactive) Android
forensics field. But, do note the year of their publication, and that Android
changes rapidly.
Android Development Overview: Familiarity with the following books
is expected. For Android internals: (Yaghmour 2013) or (Elenkov 2014); for
Android APK development: (Annuzzi Jr. et al. 2016). In Android, when one
app wishes to invoke another, it uses Broadcast Intents. A Content Provider
presents data to other applications.
1 https://github.com/psaiyappan/ and https://github.com/mrkarthik07/.
3 Proactive Forensics
Operating Systems, in wide use, have never included any support for forensics.
We wish to change that. Perhaps we should call pfs Proactive Provenance. For
any given item x of interest, its provenance is a history of values that x held
from the beginning to the present time. The history of x is valuable especially
when it is synchronized with those of other items.
Proactive digital forensics is the act of storing time-stamped and labeled data that could serve as the evidence needed to prove various accusations. Storing the provenance of the entire system state would be ideal, but it is not achievable because of size and network usage.
emails, calendar data, photos, videos, GPS locations, browser history, cookies,
search keywords, dictionary content, installed APK details, applications’ data,
and keystrokes. Traditional forensics cannot collect this highly dynamic data.
Changes in the data of (i) and (ii) is forensically noteworthy.
Incremental Imaging: In our custom ROM (Mateti and Students 2015), the pfs service constantly monitors the device, gathers the information listed above and selected in a configuration file, and caches it in a stealthy location within the device. We expect to be able to harvest encryption keys among this data. Data deleted from the device can sometimes be reconstructed with the help of service providers, but this is highly problematic; to dramatize, what if the device were thrown into water or fire? Reactive smartphone forensics can only gather what is left over in a captured device. This is a non-issue in proactive forensics.
Immune to Obfuscation: Users are also becoming increasingly knowledgeable.
Obfuscation tools and anti-forensics toolkits such as Shah (2010) provide hin-
drances to traditional forensics. Almost no app does encryption without using
libraries and syscall based encryption. We can intercept both.
Activity Monitoring: Consider what can be deduced from the uploaded
stream: What places did the owner visit? For how long? How many times a
day is the phone used? How is the phone held, at what angle? Etc. The data
gathered can be found with specific modification/creation/deletion dates.
Reactive Forensics: Although proactive forensics does change the physical/virtual memory and file volume footprints, it does not otherwise interfere with traditional reactive forensics.
4 Design of PFS
Figure 2 shows the architecture of our pfs (proactive forensics service) as an
Android built-in.
The pfs service gets hold of data in four different ways: (i) from Android APIs, which provide data such as phone logs, SMS logs, camera events and GPS data; (ii) from the inotify tool and FileObserver, which deliver file system events; (iii) from SQLite database files: Android applications store their private data in SQLite databases, and these files are the main source of information (e.g., phone logs are stored in the file /data/data/com.android.providers.contacts/databases/contacts2.db). We collect these SQLite files both on a schedule and event-driven by their updates, so that no data is missed. (iv) Other sources include the collection of keystrokes on the device via a stealthy keylogger.
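For source (ii), a minimal FileObserver sketch (watching the contacts database directory mentioned above) could look like the following; a real pfs service would queue the changed file for copying rather than merely log the event.

import android.os.FileObserver;
import android.util.Log;

public class DbWatcher extends FileObserver {
    private static final String DIR =
            "/data/data/com.android.providers.contacts/databases"; // directory to watch

    public DbWatcher() {
        // Report writes and file-close-after-write events in the databases directory.
        super(DIR, FileObserver.MODIFY | FileObserver.CLOSE_WRITE);
    }

    @Override
    public void onEvent(int event, String path) {
        // Called on the observer thread; pfs would schedule the file for collection here.
        Log.i("pfs", "event " + event + " on " + DIR + "/" + path);
    }
}

// Usage (e.g., in the pfs service): new DbWatcher().startWatching();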
5 Implementation of PFS
To detrmine the feasibilty of proactive forensics, we built not only the pfs but
also the tools described below.
5.4.3 Contacts
The ContactsContract.Data content provider stores contact details, such as
addresses, phone numbers, and email addresses using a three-tier data model to
store data, associate it with a contact, and aggregate it to a single person using
ContactsContract subclasses: Data, RawContacts, and Contacts.
User events, such as GPS location change, SMS and call events, are tracked
by using broadcast receivers. The events such as browser data, calendar data,
dictionary words are obtained by content observers.
Call Logs: Android can access call logs (Fig. 8) with just a few lines of code.

Cursor managedCursor = managedQuery(CallLog.Calls.CONTENT_URI, null, null, null, null);
Call Recording: The call recorder (Fig. 9) is built with broadcast receivers; it waits for TelephonyManager.EXTRA_STATE to change and then starts recording.

String extraState = intent.getStringExtra(TelephonyManager.EXTRA_STATE);
SMS/MMS apps work with the SEND and SENDTO intents (Fig. 5). To extract the array of SmsMessage objects packaged within the SMS broadcast Intent bundle, we use the "pdus" key to extract the PDUs (protocol data units). Each SmsMessage contains the SMS message details, including getOriginatingAddress (phone number), getTimestampMillis, and getMessageBody.
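A sketch of that extraction inside a broadcast receiver, using the older "pdus" extra of the SMS-received broadcast, might look like this:

import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.os.Bundle;
import android.telephony.SmsMessage;
import android.util.Log;

public class SmsLogger extends BroadcastReceiver {
    @Override
    public void onReceive(Context context, Intent intent) {
        Bundle bundle = intent.getExtras();
        if (bundle == null) return;
        Object[] pdus = (Object[]) bundle.get("pdus");   // raw protocol data units
        for (Object pdu : pdus) {
            SmsMessage sms = SmsMessage.createFromPdu((byte[]) pdu);
            Log.i("pfs", sms.getOriginatingAddress() + " @ "
                    + sms.getTimestampMillis() + ": " + sms.getMessageBody());
        }
    }
}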
GPS: The GPS-info activity is implemented with the Google Maps API, where the locations the phone has travelled through can be seen as a trail on the map (Figs. 12, 13, and 14).
Tracking Cell Location Changes: We can get notifications whenever the
current cell location changes by overriding onCellLocationChanged on a Phone
State Listener. The onCellLocationChanged handler receives a CellLocation
object that includes methods for extracting different location information based
on the type of phone network. In the case of a GSM network, the cell ID (getCid)
Fig. 8. Call logs
Fig. 9. Recorded calls
Fig. 10. Videos recorded
Fig. 11. com.android.externalstorage
Fig. 12. GPS track line
Fig. 13. Recorded locations
Fig. 14. GPS and network locations
Fig. 15. URLs visited
and the current location area code (getLac) are available. For CDMA networks,
we can obtain the current base station ID (getBaseStationId) and the latitude
(getBaseStationLatitude) and longitude (getBaseStationLongitude) of that
base station.
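A sketch of such a listener, distinguishing GSM and CDMA cells exactly as described above (registration with the TelephonyManager is shown as a comment):

import android.telephony.CellLocation;
import android.telephony.PhoneStateListener;
import android.telephony.cdma.CdmaCellLocation;
import android.telephony.gsm.GsmCellLocation;
import android.util.Log;

public class CellTracker extends PhoneStateListener {
    @Override
    public void onCellLocationChanged(CellLocation location) {
        if (location instanceof GsmCellLocation) {
            GsmCellLocation gsm = (GsmCellLocation) location;
            Log.i("pfs", "GSM cell id=" + gsm.getCid() + " lac=" + gsm.getLac());
        } else if (location instanceof CdmaCellLocation) {
            CdmaCellLocation cdma = (CdmaCellLocation) location;
            Log.i("pfs", "CDMA bsid=" + cdma.getBaseStationId()
                    + " lat=" + cdma.getBaseStationLatitude()
                    + " lon=" + cdma.getBaseStationLongitude());
        }
    }
}

// Register with: telephonyManager.listen(new CellTracker(), PhoneStateListener.LISTEN_CELL_LOCATION);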
Browser Artifacts: The Browser provider can give the default browser's usage details (Fig. 15). BOOKMARKS_URI gives the history of visited and bookmarked URLs; using SEARCHES_URI we can get the history of search terms.
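On the older Android releases targeted by pfs, a sketch of dumping the visited-URL history through this provider might be (the Browser provider was deprecated in later Android versions):

import android.content.Context;
import android.database.Cursor;
import android.provider.Browser;
import android.util.Log;

public class BrowserDump {
    // Reads title, URL and visit date from the (pre-Marshmallow) Browser provider.
    static void dumpHistory(Context context) {
        String[] cols = { Browser.BookmarkColumns.TITLE,
                          Browser.BookmarkColumns.URL,
                          Browser.BookmarkColumns.DATE };
        Cursor c = context.getContentResolver()
                .query(Browser.BOOKMARKS_URI, cols, null, null, null);
        if (c == null) return;
        while (c.moveToNext()) {
            Log.i("pfs", c.getString(0) + " -> " + c.getString(1) + " @ " + c.getLong(2));
        }
        c.close();
    }
}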
Calendar Data: The Calendar Content Provider includes an Intent-based
mechanism that allows common actions without the need for special permis-
sions using the Calendar application. Each table is exposed from within the
CalendarContract class, including Calendars, Events, Instances, Attendees,
and Reminders.
Video Recording: Video is stealthily recorded and saved onto the SD card (Fig. 10).
Keylogger: We wrote a fully functional keyboard.
6 Related Work
There is almost no prior work on proactive Android forensics work. Hence, this
section covers areas that any proactive forensics must interface with.
Smartphone Forensics: There is considerable work on iOS, and others.2 We
are, of course, focused on Android. Mylonas et al. (2012) and Grover (2013)
explain the term proactive and its significance.
2 E.g., see A Glimpse of iOS 10 from a Smartphone Forensic Perspective, by Heather Mahalik, September 17, 2016, http://www.forensicswiki.org/wiki/Blackberry_Forensics, and https://www.gillware.com/forensics/windows-phone-forensics.
Linux Forensics: There is an enormous body of free and open source Linux forensics software, in languages ranging over C/C++, Java, and Python.3 Our expectation is that nearly all of this code can be ported to Android, with varying degrees of ease. We selected inotifywait (McGovern 2012) and ported it to Android; its counterpart in the Android framework is FileObserver.
Android Forensics: Android forensics is not only continuing the tradition of Linux FOSS but also giving rise to commercial tools. Here we briefly describe a selection of FOSS work. SourceForge lists4 6500+ "Android forensics tools", but many of them are not forensics tools at all. The slides by Carlo (2016) do describe "Android Forensics with Free/Open Source Tools". DroidWatch (Grover 2013) calls itself an enterprise monitoring tool, but it is an automated forensic tool that frequently sends useful data to a web server. The file volume forensic tool by Zimmermann et al. (2012) targets yaffs2 and is now obsolete because all recent Android devices have switched over to eMMC and ext4. The "Open Source Android Forensics Toolkit"5 is good even though it was developed as an undergraduate senior design project.
App Forensics: WhatsApp has attracted a good amount of forensic analy-
ses: WhatsApp Xtract 20126 and papers (Anglano 2014; Karpisek et al. 2015;
Shortall and Azhar 2015; Azfar et al. 2016; Anglano et al. 2016; Shuaibu and
Bala 2016), and theses (Thakur 2013; Terpstra 2013). Skype too has attracted
forensic attention. The tool named Skype Xtractor7 is a Python 2.7 applica-
tion written for the forensics focused distribution named Deft Linux.8 There is
another tool named Skyperious9 .
Device Imaging is considered in (Macht 2013; Kong 2015; Guido 2016). It is
worth mentioning that Android devices do not fully wipe themselves out even
after a factory reset (Simon and Anderson 2015).
Stealth File Systems: Much work has been done in stealth file systems. For lack
of space, we limit ourselves to just citing a select few papers: (Hokke et al. 2015;
Lengyel et al. 2014; Peinado and Kim 2016; Neuner et al. 2016).
7 Evaluation
Contribution to Lag: Our GPS tracking, background syncs and video record-
ing can cause the device to never sleep or at times cause noticeable lag in the
3 E.g., see Kali Linux https://www.kali.org/, which even has a boot option for forensics, https://en.wikipedia.org/wiki/List_of_digital_forensics_tools, http://forensicswiki.org/wiki/Tools, and http://linoxide.com/linux-how-to/forensics-tools-linux/ (July 20, 2016).
4 https://sourceforge.net/directory/os:linux/?q=android%20forensics%20tools.
5 https://sourceforge.net/projects/osaftoolkit/.
6 http://blog.digital-forensics.it/2012/05/whatsapp-forensics.html.
7 http://www.slideshare.net/AlessandroRossetti/deftcon2013-ngskype.
8 http://www.deftlinux.net/.
9 https://suurjaak.github.io/Skyperious/.
running of applications. Some devices also suffer from a low-memory issue and perform poorly once they reach around 80% of their storage capacity, especially with larger, not-yet-uploaded videos. The lag caused by cloud upload is less visible to the user, as uploading is done opportunistically.
Impact on Battery Consumption: Recording of video, WiFi-based uploading to the cloud, GPS tracking and carelessly held wake locks all significantly drain the battery. Running the process in the background and implementing event logging carefully keep battery use light. The stealth we might otherwise have can be lost because the drain is often noticeable.
Hide Forensic Processes: Extraction of data and uploading to the cloud should be stealthy (Fig. 11); normal users can easily detect such activity with apps available on the Play store. The typical process-listing command ps uses the /proc file system to obtain process details. We chose not to rootkit-edit ps to conceal a process, but instead to hide the /proc/PID/ folders of specific PIDs.10 The hidepid mount option of /proc hides processes and their information from other users. It accepts three values. The default is hidepid = 0, where any user can see the processes running in the background. With hidepid = 1, a normal user does not see other users' processes in ps, top, etc., but can still see their process IDs in /proc. With hidepid = 2, users can only see their own processes, and the other process IDs are hidden from /proc as well.
8 Conclusion
We implemented a forensic framework for Android smartphones. It is proactive
in the sense that it anticipates data that could become useful as evidence and
saves the data on a stealth file location. As the gathered data will grow to a
size that cannot be stored within the device, we opportunistically upload this
evidence to a cloud storage facility. Our work supports forensic investigators in
all phases (Fig. 3) of mobile forensics.
Our programming work is open sourced on GitHub/[blinded]11 . A custom
ROM (Mateti and Students 2015) we built includes the new proactive forensics
support service. Aiyyappan (2015) designed and implemented the pfs service,
and portions of data gathering. Rao (2016) designed and implemented data
10 https://sysdig.com/blog/hiding-linux-processes-for-fun-and-profit/.
11 https://github.com/psaiyappan/ and https://github.com/mrkarthik07/.
References
Aiyyappan, P.S.: Android Forensic Support Framework. Master’s thesis, Amrita
Vishwa Vidyapeetham, Ettimadai, Tamil Nadu, India (2015). http://cecs.wright.
edu/∼pmateti/Students/. Advisor: Prabhaker Mateti
Anglano, C.: Forensic analysis of WhatsApp messenger on android smartphones. Digit.
Invest. 11(3), 201–213 (2014)
Anglano, C., Canonico, M., Guazzone, M.: Forensic analysis of the chat secure instant
messaging application on android smartphones. Digit. Invest. 19, 44–59 (2016)
Annuzzi Jr., J., Darcey, L., Conder, S.: Introduction to Android Application Develop-
ment: Android Essentials, 5 edn., p. 672. Pearson Education, Hoboken (2016)
Azfar, A., Choo, K.-K.R., Liu, L.: An android communication app forensic taxonomy.
J. Forensic Sci. 61(5), 1337–1350 (2016)
Carlo, A.D.: Android Forensics with Free/Open Source Tools (2016). www.slideshare.
net
CyberPunk. Android Free Forensic Toolkit (2015). http://n0where.net/
Android-free-forensic-toolkit
Elenkov, N.: Android Security Internals: An In-Depth Guide to Android’s Security
Architecture. No Starch Press, San Francisco (2014)
Google, Com. android.os.FileObserver Class. Google.com (201x).
AOSP/../java/android/os/FileObserver.java
Grover, J.: Automated data collection and reporting from a mobile device. Digit. Invest.
10, S12–S20 (2013). https://github.com/jgrover/DroidWatch
Guido, M., Buttner, J., Grover, J.: Rapid differential forensic imaging of mobile devices.
Digit. Invest. 18, S46–S54 (2016)
Hazra, S.: Stealth File Systems for Proactive Forensics on Android. Master’s the-
sis, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India (2017). http://cecs.
wright.edu/∼pmateti/Students/. Subproject: FUSE-based Mounting of Cloud Stor-
age. Advisor: Prabhaker Mateti
Hokke, O., Kolpa, A., van den Oever, J., Walterbos, A., Pouwelse, J.: A Self-Compiling
Android Data Obfuscation Tool (2015). arXiv:1502.01625
Hoog, A.: Android Forensics: Investigation, Analysis and Mobile Security for Google Android.
Syngress/Elsevier, Amsterdam (2011)
Karpisek, F., Baggili, I., Breitinger, F.: WhatsApp network forensics: decrypting and
understanding the WhatsApp call signaling messages. Digit. Invest. 15, 110–118
(2015)
Kong, J.: Data Extraction on MTK-based android mobile phone forensics. J. Digit.
Forensics Secur. Law: JDFSL 10(4), 31 (2015)
Lengyel, T.K., Maresca, S., Payne, B.D., Webster, G.D., Vogl, S., Kiayias, A.: Scala-
bility, Fidelity and stealth in the DRAKVUF dynamic malware analysis system. In:
30th Annual Computer Security Applications Conference, pp. 386–395. ACM (2014)
Macht, H.: Live Memory Forensics on Android with Volatility. Master’s thesis,
Friedrich-Alexander University Erlangen-Nuremberg (2013)
Mateti, P.: Design and Construction of a new Highly Secure Android ROM. Technical
report, Amrita Viswa Vidyapeetham and Wright State University, Ettimadai, Tamil
Nadu, India; Dayton, OH, USA (2015). http://cecs.wright.edu/∼pmateti/Students/
Theses/
McGovern, R.: inotifywait for Android (2012). https://github.com/mkttanabe/
inotifywait-for-Android
Mylonas, A., Meletiadis, V., Tsoumas, B., Mitrou, L., Gritzalis, D.: Smartphone foren-
sics: a proactive investigation scheme for evidence acquisition. In: Gritzalis, D., Fur-
nell, S., Theoharidou, M. (eds.) SEC 2012. IAICT, vol. 376, pp. 249–260. Springer,
Heidelberg (2012). doi:10.1007/978-3-642-30436-1 21
Neuner, S., Voyiatzis, A.G., Schmiedecker, M., Brunthaler, S., Katzenbeisser, S.,
Weippl, E.R.: Time is on my side: steganography in filesystem metadata. Digit.
Invest. 18, S76–S86 (2016)
Peinado, M., Kim, T.: System and Method for Providing Stealth Memory. US Patent
9,430,402 (2016)
Rao, K.M.: Proactive Forensic Support for Android Devices. Master’s thesis, Amrita
Vishwa Vidyapeetham, Ettimadai, Tamil Nadu, India (2016). http://cecs.wright.
edu/∼pmateti/Students/. Advisor: Prabhaker Mateti
Shah, C.: An Analysis. Technical report, McAfee.com. https://blogs.mcafee.com/
mcafee-labs/zeus-crimeware-toolkit/
Shortall, A., Azhar, M.A.H.B.: Forensic acquisitions of WhatsApp data on popular
mobile platforms. In: 2015 Sixth International Conference on Emerging Security
Technologies (EST), pp. 13–17. IEEE (2015)
Shuaibu, M.Z., Bala, A.: WhatsApp forensics and its challenges for android smart-
phone. Global J. Adv. Eng. Technol. Sci. 8 (2016)
Simon, L., Anderson, R.: Security analysis of android factory resets. In: 3rd Mobile
Security Technologies Workshop (MoST) (2015)
Tamma, R., Tindall, D.: Learning Android Forensics. Packt Publishing, Birmingham
(2015)
Terpstra, M.: WhatsApp & Privacy. Master’s thesis, Radboud University Nijmegen,
Netherlands (2013)
Thakur, N.S.: Forensic Analysis of WhatsApp on Android Smartphones. Master’s the-
sis, University of New Orleans (2013)
Yaghmour, K.: Embedded Android: Porting, Extending, and Customizing. O'Reilly Media Inc., Sebastopol (2013)
Zimmermann, C., Spreitzenbarth, M., Schmitt, S., Freiling, F.C.: Forensic analysis of
YAFFS2. In: Sicherheit, pp. 59–69 (2012)
ASLR and ROP Attack Mitigations
for ARM-Based Android Devices
1 Introduction
The objective of the work reported here is to secure Android devices from
ASLR and ROP attacks. The existing (2017) Zygote model permits these attacks.
The Stagefright (Drake 2015) exploit is based on ROP attacks.
1.1 Organization
Background needed for this paper is provided in Sect. 2. Section 3 describes tools and techniques that we use in our design. Section 6 covers related work and evaluation. We conclude the paper with Sect. 7.
2 Background
2.1 ASLR
2.2 ROP
Return-oriented programming (ROP) is a growing class of exploits in which the exploit consists of re-using code that is already present in the virtual memory of the system.
Return-oriented programming was introduced in 2007 by Shacham (2007), who used it to bypass executable-stack defenses such as DEP and stack canaries by jumping into valid code regions, avoiding the need to place shellcode on the stack. The technique bypasses the hardware-enforced NX (non-executable) protection on Intel, XN on ARM, and software-enforced DEP (Data Execution Prevention). ROP was also shown to be applicable on the ARM architecture (Kornau 2010).
The most common type of attack is Return-to-LibC (Ret2LibC), where an attacker exploits a buffer overflow to redirect control flow to a library function already present in the system. To perform a Ret2LibC attack, the
attacker must overwrite a return address on the stack with the address of a
library function. Additionally, the attacker must place the arguments of the
library function on the stack in order to control the execution of the library
function.
There is yet another approach if the attacker already has access to memory-modification routines such as VirtualProtect on Windows or mprotect on Linux: compose ROP shellcode that makes a call to such a routine.
We ran ARM shellcode on Android and found that ROP shellcode/payloads are perfectly possible on ARM.
Load/Store. (i) Loading a Constant (ii) Loading from Memory (iii) Storing to
Memory
Control Flow. Unconditional Jump, Conditional Jumps
Arithmetic & Logic. Add, Exclusive OR (XOR), And, Or, Not, Shift and
Rotate
The ropper1 tool discovered a staggering 1112 ROP gadgets in the ARM /bin/ls binary. This shows that the incentive for attackers is much the same on the ARM architecture as on the x86(-64) architecture.
1 https://github.com/sashs/Ropper.
0x0000fee4: adc.w r0, r0, r3; bx lr;
0x0001540e: adc.w r1, r1, r4, lsl #20; orr.w r1, r1, r5; pop {r4, r5, pc};
0x0001529e: adc.w r2, r2, r2; it hs; subhs.w r0, r0, r1; mov r0, r2; bx lr;
0x0001540a: adcs r0, r0, #0; adc.w r1, r1, r4, lsl #20; orr.w r1, r1, r5; pop {r4, r5, pc};
0x0001551a: adcs r1, r1; it hs; orrhs r1, r1, #0x80000000; pop {r4, r5, pc};
Listing 1. ROP gadgets found in ARM /bin/ls
3 Binary Instrumentation
Dynamic Binary Instrumentation (DBI) (Backes et al. 2016) is an introspec-
tion technique in which a binary is re-built just before being run with added
hooks that invoke certain callback routines. These callback functions help iden-
tify events that happen during execution. These hooks can be called at various
3.2 Valgrind
Valgrind3 is a Dynamic Binary Analysis (DBA) tool that uses a DBI framework to check memory allocation, detect deadlocks and profile applications. Valgrind loads the application into a process, disassembles the application code, adds the instrumentation code needed for analysis, assembles it back and executes the application. It uses a Just-In-Time (JIT) compiler to embed the instrumentation code into the application.
3.3 DynamoRIO
DynamoRIO4 is also a DBI tool; it is supported on Linux, Windows and Android.
This database contains information, gathered through crowdsourcing, about all the apps that are being exploited. Such apps will be blacklisted and will not be allowed to run.
4.6 Algorithm 1
The following algorithm is adapted.5

for each IMAGE
    for each BLOCK in IMAGE
        insert BLOCK in BLOCKLIST
        for each INSTRUCTION in BLOCK
            if INSTRUCTION is RETURN or BRANCH
                insert code to retrieve SAVED_EIP from the stack
                insert CALL to ROP_VALIDATE(SAVED_EIP) before INSTRUCTION

ROP_VALIDATE(SAVED_EIP):
    if SAVED_EIP not in BLOCKLIST
        exit with error warning

Listing 2. Instrument Program
Notes: (i) The type of branch has not been specified. It has to be an indirect
branch. (ii) Code to retrieve saved EIP might not be easy to construct.
4.7 Algorithm 2
The following algorithm is adapted.6

PRE-PROCESSING STEP
for_each image in process
    for_each bb in image
        Bblist.push_back(BBInfo(bb))

DETECTION STEP
for_each ins in Program
    if IsIndirectBranchOrCall(ins)
5 http://www.talosintelligence.com/.
6 http://public.avast.com/caro2011/.
This algorithm works for all types of branches and not just return instruc-
tions. But, as a side-effect, there is greater performance overhead.
4.8 Algorithm 3
1. Find all valid call sites in a binary (e.g., by parsing its export tables) and add them to a block list.
2. If a BB ends with a call or ret instruction, then (insert code to) check the starting address of the next BB. It should either be a valid call site (from the block list) or an address just below a valid call site; in ARM this would be PC+4.
3. Else continue.
4. Note: the BBs are computed before the image begins execution.
4.9 Algorithm 4
4.10 Algorithm 5
4.11 Algorithm 6
4.12 Algorithm 7
1. At the end of each basic block, we check whether a ret (return) instruction is taken (in ARM the equivalent is POP PC or MOV PC, LR) by analyzing the branch instruction at the end of the basic block.
2. If a return instruction is taken, check the top of the stack (or the LR register in the case of ARM). Let the address be A.
3. If Disassemble(A-4) == call instruction, then it is a genuine function returning to where it was called from (see the sketch after this list).
4. Else, alert the cloud module about an ROP attack.
5. Note: a basic block always ends with a direct/indirect branch.
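The check in steps 2–4 can be expressed independently of any DBI framework. The following Java sketch only simulates the logic: a map from addresses to already-disassembled mnemonics stands in for the binary, and a return target is accepted only when the instruction four bytes before it is a call (BL/BLX on ARM).

import java.util.HashMap;
import java.util.Map;

public class RopCheckSketch {
    // Stand-in for a disassembler: address -> mnemonic of the instruction at that address.
    static final Map<Long, String> disasm = new HashMap<>();

    // A return to address A is genuine only if the instruction at A-4 is a call.
    static boolean isLegitimateReturn(long returnTarget) {
        String previous = disasm.get(returnTarget - 4);
        return previous != null && previous.startsWith("bl");
    }

    public static void main(String[] args) {
        disasm.put(0x1000L, "bl 0x2000");        // a call site
        disasm.put(0x1004L, "mov r0, r4");       // the address the call should return to
        disasm.put(0x3000L, "pop {r4, r5, pc}"); // a gadget ending, not preceded by a call
        disasm.put(0x3004L, "add r0, r0, r1");

        System.out.println(isLegitimateReturn(0x1004L)); // true: returns just after a call
        System.out.println(isLegitimateReturn(0x3004L)); // false: would be flagged as ROP
    }
}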
To make DBI faster, we can use selective binary instrumentation that targets only
critical applications on the device, which have a higher probability of being
exploited in the wild. We are aware of the performance hit; that is why we plan to
use selective instrumentation (i.e., the binary will not be instrumented every time
it runs; for example, if a binary has run 100 times, it will not be instrumented on
the 101st run, or it will be instrumented at random times). Researchers have also
used dynamic binary instrumentation in the past to detect ROP attacks; specifically
for ARM/Android, prior work has used Valgrind-based instrumentation to detect ROP.
7 https://github.com/benwaffle/DynamoRIO-shadow-stack.
5 Implementation
5.1 Analysis of Zygote
Whenever a new app has to be launched, Zygote forks off its own process. The base
Zygote process has already loaded most of the core framework libraries, so after
forking, the child process inherits all the libraries associated with the parent
Zygote process. After creating a process, Zygote does several things such as
assigning a group id to the app. The init.rc file contains a service entry of the
form service zygote /system/bin/app_process -Xzygote /system/bin --zygote
--start-system-server, so init runs /system/bin/app_process. The aim is to improve
the security of the current Zygote implementation. We will test existing Android
ASLR exploits against the current security implementations and demonstrate that
they are not nearly enough to secure the device. Then we will demonstrate how
Zygote4 can stop those exploits owing to its new and improved defenses against ROP,
which is actually possible on Android ARM.
Due to the Zygote design, the effectiveness of ASLR is undermined, as all processes
forked from Zygote inherit the same memory layout of the shared libraries. Also,
multiple copies of the same process share the exact same code layout. This greatly
reduces the security provided by ASLR and makes the device vulnerable to several
kinds of attacks against randomization.
6 Related Work
Lee et al. (2014) demonstrate the weakness of the existing Android Zygote
implementation and introduce a replacement, Morula, for the (security-wise) weak
Zygote model. Morula promises full ASLR support for the processes spawned. Unlike
Zygote, which only uses fork() to create new processes, Morula uses fork() along
with exec() to bring full ASLR support to Android platforms. In existing Android
systems, apps have the same code base address even when ASLR is present. In Morula,
a pool of Zygote processes is maintained at all times; when a new app starts, the
process inherits from any one of these Zygote processes. Legacy apps tend to make
heavy use of native code, which is loaded through JNI-like interfaces.
The authors further highlight the weakened ASLR model of Android by devising two
real exploits on Android apps. These exploits break ASLR and achieve ROP on current
systems. They then designed Morula as a countermeasure and implemented it as an
extension to the Android OS. By leaking an address in his/her own process, an
attacker can relate the address to another app that needs to be exploited, thus
obtaining a memory disclosure vulnerability. This works because Zygote causes two
child processes to have the same memory layout, so revealing memory of one sibling
process also reveals the corresponding memory location in another sibling process.
Shetti (2015) further enhanced the Morula framework. ASLR on 32-bit systems is weak
even when best randomization practices are followed, because sufficient entropy is
simply unattainable in a 32-bit address space. The same cannot be said for 64-bit
architectures, where the wider address size (8 bytes) offers a considerable
advantage; due to the huge virtual address space, it becomes relatively easy to
improve the existing ASLR techniques on 64-bit systems. The idea of dynamic offset
randomization is quite effective against de-randomization exploits on the Android
Zygote. Shetti's work enhances Morula's process-creation model and randomization,
highlights shortcomings of the Morula framework, and proposes Zygote3, an
enhancement of Morula that features dynamic offset randomization and base-pointer
randomization.
kBouncer (Pappas et al. 2013) uses the last branch recording functionality provided
by the Intel architecture, relying on hardware support to trace indirect branches.
It inspects the history of indirect branches taken at every system call. The
implementation consists of three components: an offline gadget extraction and
analysis toolkit, a userspace layer, and a kernel module that mediates all system
calls passed to the kernel and also logs all the indirect branches taken by the
application.
7 Conclusion
In this report, we showed how the Android Zygote is still not secure and proposed
our new framework as a countermeasure. Our framework promises to be an enhancement
over past work involving Zygote. The design is based on an open-source ideology and
can serve as a better alternative to traditional ROP protections. One of the
challenges we face is the process slowdown introduced by dynamic binary
instrumentation; we expect to solve such issues as we make further progress in our
research. To make DBI faster, we can use selective binary instrumentation that
targets only critical applications on the device, which have a higher probability
of being exploited in the wild.
References
Drake, J.: Stagefright: scary code in the heart of Android. BlackHat USA,
August 2015. Slides: https://www.blackhat.com/docs/us-15/materials/
us-15-Drake-Stagefright-Scary-Code-In-The-Heart-Of-Android.pdf, video: https://
www.youtube.com/watch?v=71YP65UANP0
Shacham, H., Page, M., Pfaff, B., Goh, E.-J., Modadugu, N., Boneh, D.: On the effec-
tiveness of address-space randomization. In: Proceedings of the 11th ACM Confer-
ence on Computer and Communications Security, pp. 298–307. ACM (2004). http://
www.hovav.net/dist/asrandom.pdf
Shacham, H.: The geometry of innocent flesh on the bone: return-into-LIBC without
function calls (on the x86). In: Proceedings of the 14th ACM Conference on Com-
puter and Communications Security, pp. 552–561. ACM (2007)
Kornau, T.: Return oriented programming for the ARM architecture. Master’s the-
sis, Ruhr-Universitat Bochum, Germany (2010). http://zynamics.com/downloads/
kornau-tim-diplomarbeit-rop.pdf
Davi, L., Dmitrienko, A., Sadeghi, A.-R., Winandy, M.: Privilege escalation attacks
on android. In: Burmester, M., Tsudik, G., Magliveras, S., Ilić, I. (eds.) ISC
2010. LNCS, vol. 6531, pp. 346–360. Springer, Heidelberg (2011). doi:10.1007/
978-3-642-18178-8 30
Pappas, V., Polychronakis, M., Keromytis, A.D.: Transparent ROP exploit mitigation
using indirect branch tracing. Presented as Part of the 22nd USENIX Security Sym-
posium (USENIX Security 2013), pp. 447–462 (2013)
Checkoway, S., Davi, L., Dmitrienko, A., Sadeghi, A.-R., Shacham, H., Winandy, M.:
Return-oriented programming without returns. In: Proceedings of the 17th ACM
Conference on Computer and Communications Security, pp. 559–572. ACM (2010).
http://cseweb.ucsd.edu/∼hovav/dist/noret-ccs.pdf
Backes, M., Bugiel, S., Schranz, O., von Styp-Rekowsky, P., Weisgerber, S.:
ARTist: the Android runtime instrumentation and security toolkit. arXiv preprint
arXiv:1607.06619 (2016)
Huang, Z., Zheng, T., Liu, J.: A dynamic detection method against ROP attack on
ARM platform. In: Proceedings of the Second International Workshop on Software
Engineering for Embedded Systems, pp. 51–57. IEEE Press (2012)
Lee, B., Lu, L., Wang, T., Kim, T., Lee, W.: From Zygote to Morula: fortifying weak-
ened ASLR on Android. In: IEEE Symposium on Security and Privacy (2014)
Shetti, P.: Enhancing the security of Zygote/Morula in Android Lollipop. Master’s
thesis, Amrita Vishwa Vidyapeetham, Ettimadai, Tamil Nadu, India, June 2015.
Advisor: Prabhaker Mateti. http://cecs.wright.edu/∼pmateti/Students/
Carlini, N., Wagner, D.: ROP is still dangerous: breaking modern defenses. In: USENIX
Security, vol. 14 (2014)
CBEAT: Chrome Browser Extension
Analysis Tool
1 Introduction
2 Background
content_script field
The content_script field contains a subfield js listing the names of the JavaScript
files that need to be executed when a match to a URL is found. Extensions can track
the URLs visited by the user in the browser. Once a URL match is found in the list of
URLs present in the matches subfield, the JavaScript files listed in the js field
are executed.
The content_script field is usually declared as follows:
"content_scripts": [
{
"matches": ["http://www.google.com/*"],
"js": ["jquery.js", "myscript.js"]
}
]
Permissions field
The permissions field is one of the most important fields of the manifest.json file.
The Chrome APIs to be used by the extension are specified under the permissions
field. Permissions can be either known strings such as “tabs” or a match pattern
that gives access to one or more hosts. An example of a host declaration is
“http://*.google.com/”.
The permissions field is usually declared as follows:
"permissions": [
"tabs", "geolocation", "http://www.google.com/*"
]
Taint Analysis
An operation, or series of operations, that uses the value of some object, say x, to
derive a value for another object, say y, creates a flow from x to y. Information
flows from object x to object y when information stored in x is transferred to y.
The object x is called the source, and the object to which the information flows,
i.e., y, is called the sink. Taint analysis is a form of information flow analysis.
If the information entering through a source is considered untrustworthy, a taint
tag is added to it. Thus, when information flows from a source to a sink, the flow
can be identified as tainted based on this tag.
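As a rough illustration of the taint-tag idea, independent of any particular tool, the Python sketch below tags a value read from a source and reports when a tagged value reaches a sink; the Tainted wrapper is purely illustrative, and the source and sink names are borrowed from the motivating example that follows.

class Tainted:
    """Wraps a value together with a taint tag naming its source."""
    def __init__(self, value, source):
        self.value, self.source = value, source

def read_source(name):
    # Stand-in for an untrusted source (e.g. a value read from the page).
    return Tainted("user@example.com", source=name)

def derive(x, f):
    # An operation on x creates a flow from x to the result,
    # so the result inherits x's taint tag.
    return Tainted(f(x.value), x.source) if isinstance(x, Tainted) else f(x)

def write_sink(name, x):
    # Report a tainted flow when a tagged value reaches the sink.
    if isinstance(x, Tainted):
        print(f"tainted flow: {x.source} -> {name}")

email = read_source("document.getElementById")
upper = derive(email, str.upper)           # flow: email -> upper
write_sink("addEventListener", upper)      # prints the source-to-sink path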
Motivating Example
Figure 1 shows an example JavaScript code snippet with the source
document.getElementById and the sink addEventListener.
3 Related Works
As users widely adopt Chrome extensions, a key concern is how to make them secure;
hence, security researchers have shown interest in this area. There is much
important research on browser extension security and on information flow in
JavaScript separately. An in-depth review of the related literature identifies an
existing gap: there is no single tool that combines browser extension security and
information flow in JavaScript.
ANDROMEDA, an accurate and scalable security analysis of web applications, is
proposed by Tripp et al. [1]. ANDROMEDA produces a precise and modular security
analysis of web applications in a demand-driven manner by tracing vulnerable
information flows, without constructing a representation of the entire target
application. A web application together with its associated libraries is given as
input to the ANDROMEDA algorithm to validate it against conditions in the form of
security guidelines. A security guideline contains three pieces of information in
the form (Src, Dwn, Snk), where Src, Dwn, and Snk are conditions written as patterns
for the corresponding sources, downgraders, and sinks in the target application,
respectively. A method call or field dereference is considered a pattern match.
Vulnerabilities are reported for flows between a source and a sink matching a rule,
unless a downgrader from the rule's Dwn set resolves the flow.
HULK: Eliciting Malicious Behavior in Browser Extensions, by Kapravelos et al. [2],
is a tool for detecting malicious behaviour in Chrome extensions. It relies on
dynamic execution of extensions and uses the technique of HoneyPages, specially
created web pages intended to fulfil the structural conditions that trigger a given
extension. By means of this procedure, it can directly detect malicious behaviour
that inserts new iframe or div elements. Along with this, a fuzzer is used to steer
the execution of event handlers registered by extensions.
Analyzing Information Flow in JavaScript-based Browser Extensions, by Dhawan
et al. [3], proposes SABRE, a system to analyze extensions by tracking in-browser
information flow. SABRE associates a tag with each in-memory JavaScript object in
the browser, which indicates whether the object contains sensitive information.
SABRE is implemented by altering SpiderMonkey, the JavaScript interpreter in
Firefox, to trace information flow; to include the security labels, the JavaScript
object representations within SpiderMonkey were modified.
An Evaluation of the Google Chrome Extension Security Architecture, by Carlini
et al. [4], identified the vulnerabilities present in Chrome extensions. Using
black-box testing and source code analysis, they reviewed Chrome extensions to
identify vulnerabilities. Their work concentrated on three main types of
vulnerabilities: vulnerabilities that extensions add to websites, vulnerabilities in
content scripts, and vulnerabilities in core extensions. Apart from providing
information on the vulnerabilities, they also listed some major defenses that could
mitigate them.
The existing works mainly focus either on analysis of the Chrome extension security
architecture or on analysis of information flow in JavaScript, but not both. As
there is at present no holistic approach to analyzing Chrome extensions, this paper
introduces CBEAT, a Chrome Browser Extension Analysis Tool that combines manifest
analysis with JavaScript analysis to give a holistic analysis of Chrome extensions.
4 Proposed Framework
Taint Analysis. These scripts are executed only when a match to one of the URLs
present in the matches subfield is found. The matches subfield is analysed to check
which URLs are tracked by the extension. This helps in identifying whether all of
the user's browsing activity is tracked by the extension. If the content scripts
are executed on a match of <all_urls>, it implies that the extension will execute
its scripts on any URL the user visits. Table 1 contains the list of content_script
matches that keep track of all the URLs visited by the user.
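A minimal sketch of this manifest check is shown below. Only <all_urls> is stated explicitly in the text; the extra scheme-wildcard patterns are assumed equivalents, and the manifest is parsed with Python's standard json module.

import json

# Patterns that match every URL the user visits. <all_urls> is stated in the
# paper; the scheme wildcards are assumed equivalents.
TRACK_ALL = {"<all_urls>", "*://*/*", "http://*/*", "https://*/*"}

def tracks_all_urls(manifest_text: str) -> bool:
    """Return True if any content_scripts entry matches every URL."""
    manifest = json.loads(manifest_text)
    for entry in manifest.get("content_scripts", []):
        if any(m in TRACK_ALL for m in entry.get("matches", [])):
            return True
    return False

example = '{"content_scripts": [{"matches": ["<all_urls>"], "js": ["a.js"]}]}'
print(tracks_all_urls(example))   # True: the extension runs on every page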
the reaching tags are obtained. Finally the tainted information flow path from source to
sink is formed.
Sources and Sinks
Taint analysis starts with defining the set of sources and sinks. This is the most
important task of the analysis, since the information flows are found based on the
sources and sinks given as input. The sources and sinks that have the potential to
expose user-private data are matched against those present in the extensions. This
helps in identifying whether an extension contains such sources and sinks. Out of a
total of 155 sources and 225 sinks considered, a few are shown in Fig. 3.
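The matching step itself can be approximated by a simple textual scan of the extension's JavaScript, as in the sketch below; the two short lists stand in for the paper's 155 sources and 225 sinks, which are not reproduced here.

# Illustrative entries only; the full lists used by the tool are much larger.
SOURCES = ["document.getElementById", "chrome.tabs.query", "chrome.history.search"]
SINKS   = ["addEventListener", "chrome.storage.sync.set"]

def find_api_uses(js_code, names):
    """Return the API names from `names` that occur in the extension code."""
    return [n for n in names if n in js_code]

js = "chrome.tabs.query({}, t => chrome.storage.sync.set({visited: t}));"
print(find_api_uses(js, SOURCES))  # ['chrome.tabs.query']
print(find_api_uses(js, SINKS))    # ['chrome.storage.sync.set']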
5 Experimental Evaluation
CBEAT is assessed with a set of extensions from the Chrome Web Store spread across
various genres such as accessibility, fun, productivity, and social, with the number
of downloads by users ranging from hundreds to millions. This variety is chosen to
validate the results across all types of extensions. The extensions are: AlphaText,
Browser Clock, Calculator, CliMate, Currency converter, Docs Online Viewer, Emoji
Keyboard, ESI Stylish, Guru, Honey, Lazarus, Liner, MercuryReader, Music Bubbles,
News Factory, Noisli, Notepad, Planyway, Playmoss, Rebrandly, Remove Redirects,
SmoothScroll, Spoiler Spoiler, Stock Portfolio and Tagboard.
Extensions classified as low expose negligible user-private data. Extensions
classified as medium expose moderate user-private data. Similarly, extensions
classified as high expose considerable user-private data. Table 5 presents the
classification of the extensions.
Figure 6 shows that 40% of the extensions are classified as low, 32% as medium, and
28% as high in exposing the user's private and sensitive data.
The information flows in Table 4 are from the sources and sinks that violate user
privacy by exposing user-private data from the extensions.
Figure 10 shows that the largest numbers of sources and sinks are found in the
chrome.tabs API, followed by chrome.storage, chrome.bookmarks, chrome.contextMenus,
chrome.cookies, chrome.history, chrome.webNavigation, chrome.webRequest,
chrome.background, chrome.browsingData, chrome.fileBrowserHandler and
chrome.geolocation.
6 Conclusion
In the current era, the growing need to use browser extensions is unavoidable. The
possibility that browser extensions are collecting user-private data is also a fact.
The prevailing misunderstanding among users of Chrome extensions that their browsing
experience is safe and secure will continue. This makes users vulnerable, and they
will need to live with such vulnerability in the days to come. The only option
available to users is to use a tool to identify such malicious browser extensions.
Developing a tool capable of identifying sophisticated, powerful browser extensions
that expose user-private data is a challenge. Extensive research helped to develop
CBEAT, a tool that can perform manifest analysis and JavaScript static taint
analysis. This will benefit the security research community as well as the user
community, as it provides a holistic analysis of Chrome extensions.
Among the tested Chrome extensions, this paper finds that the most used permission
is storage, that 53% of the extensions have persistent background scripts, and that
36% of the extensions match <all_urls>. Out of all the information flows, 2%–20%
are from sources and sinks that expose user-private data from the extensions.
Finally, 40% of the extensions are classified as low, 32% as medium, and 28% as
high in exposing the user's private and sensitive data.
References
1. Tripp, O., Pistoia, M., Cousot, P., Cousot, R., Guarnieri, S.: ANDROMEDA: accurate and
scalable security analysis of web applications. In: Cortellessa, V., Varró, D. (eds.) FASE
2013. LNCS, vol. 7793, pp. 210–225. Springer, Heidelberg (2013). doi:10.1007/978-3-642-
37057-1_15
2. Kapravelos, A., Grier, C., Chachra, N., Kruegel, C., Vigna, G., Paxson, V.: Hulk: eliciting
malicious behavior in browser extensions. In: USENIX Security, pp. 641–654, August 2014
3. Dhawan, M., Ganapathy, V.: Analyzing information flow in JavaScript-based browser
extensions. In: Computer Security Applications Conference, ACSAC 2009 Annual, pp. 382–
391. IEEE, December 2009
4. Carlini, N., Felt, A.P., Wagner, D.: An evaluation of the Google Chrome extension security
architecture. In: USENIX Security Symposium, pp. 97–111, 8 August 2012
5. http://wala.sourceforge.net/wiki/index.php/UserGuide:CallGraph
6. http://wala.sourceforge.net/wiki/index.php/Main_Page#Welcome_to_the_T.J._Watson_
Libraries_for_Analysis_.28WALA.29
7. http://wala.sourceforge.net/javadocs/trunk/com/ibm/wala/ssa/SSAInstruction.html
8. http://wala.sourceforge.net/wiki/index.php/UserGuide:DataflowSolvers
9. Kildall, G.A.: A unified approach to global program optimization. In: Proceedings of the 1st
annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages,
pp. 194–206. ACM, October 1973
10. Dev, P.A., Jevitha, K.P.: STRIDE based analysis of the chrome browser extensions API. In:
Satapathy, S.C., Bhateja, V., Udgata, S.K., Pattnaik, P.K. (eds.) Proceedings of the 5th
International Conference on Frontiers in Intelligent Computing: Theory and Applications. AISC,
vol. 516, pp. 169–178. Springer, Singapore (2017). doi:10.1007/978-981-10-3156-4_17
11. Arunagiri, J., Rakhi, S., Jevitha, K.P.: A systematic review of security measures for web
browser extension vulnerabilities. In: Suresh, L.P., Panigrahi, B.K. (eds.) Proceedings of the
International Conference on Soft Computing Systems. AISC, vol. 398, pp. 99–112. Springer,
New Delhi (2016). doi:10.1007/978-81-322-2674-1_10
12. Shahanas, P., Jevitha, K.P.: Static analysis of Firefox OS privileged applications to detect
permission policy violations. Int. J. Control Theor. Appl. 3085–3093 (2016)
Hardware Trojan Detection Using Effective
Test Patterns and Selective Segmentation
Abstract. Hardware Trojans (HTs) have become a major threat to the modern
fabless semiconductor industry. This has raised serious concerns over integrated
circuit (IC) outsourcing. HT detection and diagnosis are challenging due to the
diversity of HTs, the large number of gates in modern ICs, intrinsic process
variation (PV) in IC design, and the high cost of testing. An efficient HT detection
and diagnosis scheme based on selective segmentation is proposed in this work.
It divides the large circuit into small sub-circuits and applies consistency
analysis of gate-level properties. In addition, transition probability (TP)
estimation for each node is employed, and segmentation is performed on the nodes
with the least probable transitions. To further enhance detection, optimized
test vectors are chosen during the procedure. Based on the selected segments,
HTs are detected correctly by tracing gate-level properties.
1 Introduction
Power analysis of the circuit after segmentation is a key technique used to find the
presence of extra gates without the need for logical analysis of the circuit. A
number of recently published works have concentrated on this method for Trojan
detection. There are two main drawbacks in the general segmentation process: one is
the overlapping of many segments, since the circuit cannot be perfectly divided; the
second is time consumption due to extra segments having redundant nodes. This work
presents a new approach to HT detection and diagnosis that employs selective
segmentation of the circuit using the transition probability of gates.
The main contributions of this paper include the following. Initially the transition
probability of each node is calculated and a threshold value is set; based on the
threshold value, the nodes are classified into low, medium, and high probability in
[1]. In [2], the segmentation of the circuit is done using the classification
obtained in [1], so that there is very little overlap between segments, leading to
fewer segments and increased efficiency. The method of selecting logic gates in the
logical fan-out cone is used to optimize the number of test nodes. In [3], a
backtracking algorithm is used to find the test vectors that trigger each and every
node, so that during the testing process a minimal number of vectors can be applied
to reduce time consumption. This is based on the fact that a circuit consisting of n
nodes can be tested using at most 2n vectors, because each node needs only two input
combinations to make its output ‘0’ or ‘1’.
The circuit under consideration is subjected to segmentation by the method in [2],
and power analysis is done for each segment in order to check for consistency. The
power analysis is aided by the reduced test pattern set obtained from [3], so that
nodes in the segment are triggered more frequently and the testing process takes
less time to complete. Overall, the time required is greatly reduced and the
accuracy of finding a Trojan is also high.
2 Related Works
Recently a number of approaches have been proposed for HT detection. A scalable and
efficient HT detection method based on segmentation and consistency analysis was
used in [1, 2] to determine the locations of the HTs in the circuit, along with a
self-consistency based approach to minimize the required number of power
measurements in HT detection and diagnosis. The idea of segmentation and detection
of Trojans using gate-level characteristics (GLC) such as leakage power is obtained
from this work. The idea of increasing the activity of low-activity nodes by
providing appropriate test vectors is mentioned in [2]; here, the time taken for
Trojan detection is reduced by triggering low-activity nodes, since more vectors are
otherwise required to activate those nodes. Fan-out cone analysis is used, which
explains the triggering of low-TP nodes by triggering the parent node, thereby
optimizing the number of MUXs or vectors (in our case). The analysis of physical
characteristics of a circuit and Trojan detection by exploiting the time delay
between input and output is used to detect the presence of any extra gate. In [4],
the division of the circuit into smaller segments and the power analysis of each
segment are explained in detail; the controllability of the nodes using appropriate
test vectors is also stated in that work. The classification and design of hardware
Trojans is described in [5]. The fingerprint analysis technique with region-based
segmentation is explained in [6].
3 Methodology
3.1 Calculation of Transition Probability
The calculation of the transition probability (TP) of the nodes in the netlist
involves several steps. First, the algorithm analyses the netlist and identifies the
functionality of each individual gate. The nodes of the gates are assigned the
corresponding TPs, which are computed mathematically using Table 1. Considering a
two-input AND gate with identical input signal probabilities of 0.5, the probability
of the output node being 1 is 0.25 and the probability of the output node being 0 is
0.75, so the TP for this output node is 0.25 × 0.75 = 0.1875. Similarly, the TPs are
calculated for all nodes in the circuit. The TPs in this work are calculated using
the expressions below (Fig. 1).
Fig. 1. TP Calculation
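Assuming independent inputs and the usual signal-probability formulas for basic gates (which is how tables like Table 1 are typically derived), the computation can be sketched as follows; it reproduces the 0.1875 value for the two-input AND gate from the text.

from functools import reduce

def p_one(gate, input_probs):
    """Probability that the gate output is 1, assuming independent inputs."""
    prod = lambda xs: reduce(lambda a, b: a * b, xs, 1.0)
    if gate == "AND":  return prod(input_probs)
    if gate == "OR":   return 1.0 - prod(1.0 - p for p in input_probs)
    if gate == "NAND": return 1.0 - prod(input_probs)
    if gate == "NOR":  return prod(1.0 - p for p in input_probs)
    if gate == "NOT":  return 1.0 - input_probs[0]
    if gate == "XOR":  # two-input XOR
        a, b = input_probs
        return a * (1 - b) + b * (1 - a)
    raise ValueError(gate)

def transition_probability(gate, input_probs):
    """TP = P(output = 1) * P(output = 0)."""
    p1 = p_one(gate, input_probs)
    return p1 * (1.0 - p1)

print(transition_probability("AND", [0.5, 0.5]))   # 0.1875, as in the text
print(transition_probability("OR",  [0.5, 0.5]))   # 0.1875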
other outputs do not fall under the first level. If any gate has inputs from a lower
level (the (n − 1)th level) and a higher level (the nth level), then its node falls
in the level next to the higher level (the (n + 1)th level). This process of
levelizing continues until all the nodes of the circuit fall under some level.
Averaging the transition probability at all levels
In many recent works, the threshold is taken to be 0.5: any node below 0.5 is
considered a low-TP node and any node above 0.5 is considered a high-TP node. In
order to improve the accuracy of locating low-TP nodes, a new method of averaging
the TPs at all levels is used. The threshold transition probability in this work is
calculated as the average of the transition probabilities at each level; the
normalized value obtained in the previous step is taken as the threshold transition
probability of the whole circuit. The threshold set by this technique identifies
low-TP nodes better than the nominal value of 0.5 used in previous works.
Identification of least transition probable nodes
With the threshold probability calculated, the final step is the identification of
the least probable nodes. Nodes with transition probability values less than the
threshold are considered the least transition probable (LTP) nodes. Since most HTs
are implanted at least transition probable nodes, these nodes have a higher
probability of being infected by an HT than other nodes. Thus, the identification of
the least-TP nodes speeds up the process of HT detection and also improves the
accuracy to a great extent (Table 2).
3.4 Detection of HT
The presence of an HT in the circuit is detected using power analysis of the
segmented circuit. All the segments obtained after segmentation are checked for
variations in power; any slight variation in the power metrics confirms the presence
of an HT. If no HT is detected in any of these segments, then the nodes with
transition probability higher than the previously considered nodes are analyzed.
These nodes are categorized as medium transition probable nodes; for this, the
threshold value is increased by a suitable amount. The process is repeated until the
HT is detected.
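The consistency check itself reduces to comparing per-segment power figures against a reference measurement, as in the hedged sketch below; the golden (Trojan-free) reference, the relative tolerance, and the numbers are illustrative assumptions, not parameters taken from the paper.

def suspicious_segments(golden_power, measured_power, tolerance=0.05):
    """Flag segments whose power deviates from the golden reference by more
    than `tolerance` (relative), hinting at extra circuitry in that segment."""
    flagged = []
    for seg, ref in golden_power.items():
        dev = abs(measured_power[seg] - ref) / ref
        if dev > tolerance:
            flagged.append((seg, round(dev, 3)))
    return flagged

# Illustrative numbers only (e.g. microwatts per segment).
golden   = {"s1": 12.0, "s2": 9.5, "s3": 14.2}
measured = {"s1": 12.1, "s2": 11.3, "s3": 14.3}
print(suspicious_segments(golden, measured))   # [('s2', 0.189)] -> inspect s2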
The results obtained from the proposed method, shown in Table 4 (containing the
number of LTP nodes, segments formed, and test patterns generated), show that the
number of nodes of interest is reduced considerably, thereby decreasing the time
taken for testing. The LTP nodes are grouped together into segments to aid the
process of HT detection. The effective test patterns generated by the proposed
method minimize the number of test patterns and aid in faster HT detection.
The data in Fig. 2 clearly show the variation in the dynamic power of the segments
between the uninfected circuit and the circuit infected with a functional Trojan.
The variation in power confirms the presence of extra circuitry, probably a Trojan,
in the corresponding segment.
Fig. 2. Data showing power variations in Trojan infected segments of benchmark circuits
An accurate method for hardware Trojan detection has been proposed in this work
using selective segmentation and effective test patterns, and it proves to be time
efficient. The results show that there is a better chance of finding a Trojan (if
any) by this method.
However, this method is applicable only to circuits with combinational logic. There
is great scope for further research to improve the existing algorithm to test
complex circuits involving sequential logic.
From the results in Table 5, obtained using the PrimeTime tool, one can infer that
there is a considerable difference in the leakage power of the infected segment. The
total power of the circuit and the percentage of power consumed by each segment also
show a difference, which aids in identifying the affected segment and detecting the
HT. The segments s1 to s8 show the segmented dynamic power signature of the circuit
whose effective test patterns are applied to the simulation software.
References
1. Wei, S., Potkonjak, M.: Self-consistency and consistency-based detection and diagnosis of
malicious circuitry. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 22(9), 1845–1853
(2013)
2. Wei, S., Potkonjak, M.: Malicious circuitry detection using fast timing characterization via
test points. In: Proceedings of the IEEE International Symposium on HOST, pp. 113–118,
June 2013
3. Zhou, B., Zhang, W., Thambipillai, S., Jin, J.T.K.: Cost-efficient acceleration of hardware
trojan detection through fan-out cone analysis and weighted random pattern technique. IEEE
Trans. Comput.-Aided Des. Integr. Circuits Syst. 35(5), 792–805 (2016)
4. Wong, J.S.J., Cheung, P.Y.K.: Timing measurement platform for arbitrary black-box circuits
based on transition probability. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 21(12),
2307–2320 (2013)
5. Wei, S., Potkonjak, M.: Scalable hardware Trojan diagnosis. IEEE Trans. Very Large Scale
Integr. (VLSI) Syst. 20(6), 1049–1057 (2012)
6. Sree Ranjani, R., Nirmala Devi, M.: Malicious hardware detection and design for trust: an
analysis. Elektrotehniski Vestn. 84(1–2), 7–16 (2017)
7. Saran, T., Sree Ranjani, R., Nirmala Devi, M.: A region based fingerprinting for hardware
Trojan detection and diagnosis. In: 4th International Conference on Signal Processing and
Integrated Networks (SPIN). IEEE (2017)
Estimation and Tracking of a Ballistic Target
Using Sequential Importance
Sampling Method
1 Introduction
The need for tracking ballistic targets arose during the testing of missiles before
war. These tracking models are used both practically and theoretically [1]. Since
the equations of motion of the target are non-linear, not all filters can be used.
In the early days the Kalman filter was widely used, since it is cheaper and easier,
but because many models are non-linear the Kalman filter cannot handle them [2]. The
most practical usage is for military purposes. Tracking can also be applied to old
satellites and to debris entering the Earth's atmosphere: older satellites are not
removed, which leads to an increase in the number of objects in space, and the
debris of the older ones floating in space tends to re-enter the Earth's atmosphere
[3]. This model can be used to track the time and path of these pieces in order to
predict the landing point on Earth. Our aim is to formulate the filtering problem of
target tracking and the noise affecting it [4]. Various filters can be used to track
the target.
2 Existing Methods
\hat{S}_{k+1|k+1} = \hat{S}_{k+1|k} + K_{k+1}\left(Z_{k+1} - H\,\hat{S}_{k+1|k}\right) \qquad (4)
3 Objective
The objective of this paper is to track an air-launched ballistic missile. This is
approached by studying all the filters [10]. The theoretical discussion identifies
the best filter for the system by including all approximations in the filtering
algorithms.
4 Proposed Work
A missile following a parabolic path is studied with air drag in order to make the
model realistic. An object is launched from one point on Earth to another along a
ballistic flight path. Drag force and gravity are the kinematic forces acting on the
ballistic target [11]. The centrifugal acceleration effect, lift force, earth spin,
projectile spin, and wind force are ignored due to their negligible effect on the
trajectory [12]. The Earth is assumed to be flat so that orthogonal coordinates can
be used as the reference frame.
Now the measurement and state models of the ballistic target have to be formulated.
When considering a high-speed body entering the atmosphere from a very high
altitude, the radar measures the range, bearing, and speed [13]. Due to the
non-linearity of the motion, this becomes more complex. Three forces acting on this
body are accounted for:
1. Aerodynamic drag (a function of the vehicle's speed), which varies nonlinearly with altitude
2. Gravity, which accelerates the vehicle towards the center of the Earth
3. Random buffeting forces
The trajectory initially follows a ballistic path, but as the atmospheric density
increases the drag effect increases, and when the motion of the vehicle is almost
vertical it starts to decelerate. This makes tracking difficult. Without air drag
the trajectory would be five times longer than the observed one. Radar noise and
system noise along the path are therefore taken into account. The position of the
ballistic object in Cartesian coordinates is shown in Fig. 1.
Here x and y are the reference axes, x0 and y0 are the target coordinates at the
initial time t0, V is the velocity of the target, and a is the elevation angle of
the target.
The state vector S_k and the noise w are shown in Eqs. (7) and (8).
S_k \triangleq \begin{bmatrix} x_k & \dot{x}_k & y_k & \dot{y}_k \end{bmatrix}^{T} \qquad (7)
The radar takes measurements periodically at a time interval T. The drag force is
given by
\frac{g\,\rho\,v^{2}}{2\beta}
where \rho is the air density and \beta is the ballistic coefficient.
\sin\!\left(\operatorname{arctg}\,\frac{x}{y}\right) = \frac{x}{\sqrt{x^{2}+y^{2}}} \qquad (13)
Here q is the noise intensity parameter. The noise accounts for all forces that have
not been considered in the model and for the deviation of the model from reality.
The radar measures the range r and the elevation angle a. The radar is located at
the origin (0, 0). \sigma_r is the error standard deviation for the range, and
\sigma_a is the error standard deviation for the elevation.
Radar measurements are converted to Cartesian coordinates as in Eqs. (17) and (18),
and the measurement-equation vector components are given in Eqs. (19) and (20).
d = r\cos a \qquad (17)
h = r\sin a \qquad (18)
Z_k = \begin{bmatrix} d_k \\ h_k \end{bmatrix} \qquad (19)
H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \qquad (20)
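These conversions are straightforward to express in code; the short sketch below applies Eqs. (17) and (18) to a single radar return, with the numeric values chosen purely for illustration.

import math

def radar_to_cartesian(r, alpha):
    """Convert a radar measurement (range r, elevation alpha in radians)
    into the down-range / height pair of Eqs. (17)-(18)."""
    d = r * math.cos(alpha)   # horizontal distance, Eq. (17)
    h = r * math.sin(alpha)   # altitude,            Eq. (18)
    return d, h

# Illustrative measurement: 50 km range at a 30-degree elevation.
d, h = radar_to_cartesian(50_000.0, math.radians(30))
print(round(d, 1), round(h, 1))   # 43301.3 25000.0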
V_k is the noise on the measured Cartesian coordinates; it does not depend on the
process noise W_k. The variance is given by Eqs. (21), (22) and (23).
5 Particle Filter
The approximation error can be kept very small when using this algorithm. The sample
weights are updated in every iteration; doing so yields a set of samples close to
the true value with a small error. Finally, the weighted mean can be taken as the
best estimate of the state vector.
Let the state vector and the observation vector be given by Eqs. (24) and (25),
where l(k) and m(k) are the system noise and the observation noise, respectively.
The observation equation is often written as a conditional likelihood, p(y_t | x_t),
and the state equation as p(x_{t+1} | x_t). Both of these distributions typically
depend on the state parameters.
The algorithm starts with the generation of an initial set of samples, called
particles, with N being the total number of particles. These particles are
distributed over the region where the state vector is assumed to lie.
The first step of the prediction phase is to pass each of the initial particles
through the system model. This generates a new set of particles for the next time
step k. According to the conditional likelihood of the observation, the weights of
the particles are updated as shown in Eq. (26).
\hat{w}_k^{\,i} = \hat{w}_{k-1}^{\,i}\; p\!\left(y_k \mid x_k^{i}\right) \qquad (26)
Now we have a new particle set. The estimated value of the state vector is given by
Eq. (28).
\hat{x}_k = \sum_{i=1}^{N} \hat{w}_k^{\,i}\, x_k^{i} \qquad (28)
Now the particles with insignificant weights are discarded, whereas the particles
with larger weights are represented by more particles around them. This process is
called resampling.
The most computationally demanding and crucial part of the PF is the resampling
step. Hence a suitable choice should be made here, as the method benefits the system
by reducing the complexity and also improving the accuracy of the resampling step.
The most common resampling algorithms are systematic resampling, multinomial
resampling, residual resampling, and stratified resampling [10].
After every measurement update, the resampling step is performed. This is done to
reduce the computational effort by discarding the particles with lower weights.
The resampling process should be carried out only when \hat{N}_{eff} falls below a
threshold. After resampling, this particle-filter procedure is repeated N times to
build up a new set of particles. The new set of particles consists of samples of the
required probability density function.
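For concreteness, a compact sequential-importance-sampling filter with systematic resampling is sketched below; the scalar random-walk model, the Gaussian likelihood, and all numeric parameters are illustrative stand-ins for the ballistic dynamics and radar measurement model described above, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)

def systematic_resample(weights):
    """Return indices of resampled particles (systematic resampling)."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(weights), positions)

def particle_filter(measurements, n_particles=500, q=0.5, r=1.0):
    """SIS particle filter for a scalar random-walk state observed in noise."""
    particles = rng.normal(0.0, 5.0, n_particles)        # initial particle cloud
    weights = np.full(n_particles, 1.0 / n_particles)
    estimates = []
    for z in measurements:
        particles = particles + rng.normal(0.0, q, n_particles)   # prediction
        weights = weights * np.exp(-0.5 * ((z - particles) / r) ** 2)  # likelihood
        weights = weights / weights.sum()
        estimates.append(np.sum(weights * particles))     # weighted-mean estimate
        n_eff = 1.0 / np.sum(weights ** 2)                # effective sample size
        if n_eff < n_particles / 2:                       # resample when degenerate
            idx = systematic_resample(weights)
            particles = particles[idx]
            weights = np.full(n_particles, 1.0 / n_particles)
    return np.array(estimates)

true_state = np.cumsum(rng.normal(0.0, 0.5, 50))          # simulated trajectory
zs = true_state + rng.normal(0.0, 1.0, 50)                 # noisy measurements
est = particle_filter(zs)
print(round(float(np.mean(np.abs(est - true_state))), 3))  # mean absolute error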
6 Noise Analysis
Practically, the errors tend to zero because of the nonlinear dynamic state
equation. The standard deviation and mean of the estimation error are averaged over
several iterations [13]. The Monte Carlo simulation technique has a wide range of
scope and impact in computational science; it derives its name from the casinos in
Monte Carlo. It uses random numbers to sample a process, and it works particularly
well for processes with a known probability structure but unknown (difficult to
determine) results.
All simulations are done using MATLAB. From Fig. 3 we can infer that the particle
filter algorithm produces accurate results with high fidelity; the error margin is
very low compared to other estimation methods. Figure 4 shows that the MSE is high
for the KF compared to the PF even when the SNR values are increased. From Fig. 5 we
can infer that increasing the number of iterations gives lower error and therefore
better results. Similarly, Fig. 6 illustrates that increasing the number of
particles also increases the accuracy and reduces the error.
Fig. 4. MSE versus SNR graph of Kalman Filter and particle filter
8 Conclusion
Many different filters can be used to track the object, but the particle filter
appears to be more efficient according to the theory given above. Almost all the
filters report their estimated error in the form of the filter covariance. The tests
are carried out based on the standard deviation of the estimation error.
References
1. Singh, N.K., Bhaumik, S., Bhattacharya, S.: A comparison of several non-linear filters for
ballistic missile tracking on re-entry. In: 2016 IEEE First International Conference on
Control, Measurement and Instrumentation (CMI), pp. 459–463 (2016)
2. Patral, N., Sadhu, S., Ghoshae, T.K.: Adaptive state estimation for ballistic object tracking
with nonlinear model and state dependent process noise. In: 1st IEEE International
Conference on Power Electronics. Intelligent Control and Energy Systems
(ICPEICES-2016), pp. 1–5 (2016)
3. Safarinejadian, B., Mohammadnia, F.: Distributed weighted averaging-based robust cubature
Kalman filter for state estimation of nonlinear systems in wireless sensor networks. In: 6th
International Conference on Computer and Knowledge Engineering (lCCKE 2016), 20–21
October 2016, pp. 66–71 (2016)
4. Farina, A., Immediata, S., Timmoneri, L.: Impact of ballistic target model uncertainty on
IMMUKF and IMM-EKF tracking accuracies. In: 14th European Signal Processing
Conference (EUSIPCO 2006), 4–8 September 2006, pp. 1–5 (2006)
5. Benvenuti, B., Farina, A., Ristic, B.: Estimation accuracy of a landing point of a ballistic
target. In: Proceedings of International Conference Fusion 2002, Washington D.C., pp. 2–9,
May 2002
6. Gokkul Nath, T.S., Sudheesh, P., Jayakumar, M.: Tracking inbound enemy missile for
interception from target aircraft using extended Kalman filter. In: Mueller, P., Thampi, S.M.,
Alam Bhuiyan, M.Z., Ko, R., Doss, R., Alcaraz Calero, J.M. (eds.) SSCC 2016. CCIS, vol.
625, pp. 269–279. Springer, Singapore (2016). doi:10.1007/978-981-10-2738-3_23
7. Mehra, R.: A comparison of several non-linear filters for re-entry vehicle tracking. IEEE
Trans. Autom. Control AC 16(4), 307–319 (1971)
8. Farina, A., Ristic, B., Benvenuti, D.: Tracking a ballistic target: comparison of several
nonlinear filters. IEEE Trans. Aerosp. Electron. Syst. 38(3), 854–867 (2002)
9. Kumar, K.S., Dustakar, N.R., Jatoth, R.K.: Evolutionary computational tools aided extended
Kalman filter for ballistic target tracking. In: 2010 3rd International Conference on Emerging
Trends in Engineering and Technology (ICETET), 19–21 November 2010 (2010)
10. Wu, C., Han, C.: Strong tracking finite-difference extended Kalman filtering for ballistic
target tracking. In: 2007 IEEE International Conference on Robotics and Biomimetics
(ROBIO), December 2007
11. Lin, Y.-P., Lin, C.-L., Suebsaiprom, P., Hsieh, S.-L.: Estimating evasive acceleration for
ballistic targets using an extended state observer. IEEE Trans. Aerosp. Electron. Syst. 52(1),
337–349 (2016)
12. Zhao, Z., Chen, H., Chen, G., Kwan, C., Rong Li, X.: Comparison of several ballistic target
tracking filters. In: 2006 American Control Conference, Minneapolis, MN, p. 6 (2006).
doi:10.1109/ACC.2006.165654
13. Domuta, I., Palade, T.P.: Adaptive Kalman Filter for target tracking in the UWB networks.
In: 2016 13th Workshop on Positioning, Navigation and Communications (WPNC),
Bremen, pp. 1–6 (2016). doi:10.1109/WPNC.2016.7822855
14. Vikranth, S., Sudheesh, P., Jayakumar, M.: Nonlinear tracking of target submarine using
Extended Kalman Filter (EKF). In: Mueller, P., Thampi, S.M., Alam Bhuiyan, M.Z., Ko, R.,
Doss, R., Alcaraz Calero, J.M. (eds.) SSCC 2016. CCIS, vol. 625, pp. 258–268. Springer,
Singapore (2016). doi:10.1007/978-981-10-2738-3_22
15. Doucet, A., de Freitas, N., Gordon, N.J. (eds.): Sequential Monte Carlo Methods in Practice.
Springer, New York (2001)
16. Julier, S., Uhlmann, J., Durrant-Whyte, H.F.: A new method for the non linear
transformation of means and covariances in filters and estimators. IEEE Trans. Autom.
Control AC 45(3), 477–482 (2000)
An Android Application for Secret Image
Sharing with Cloud Storage
Abstract. The use of online cloud storage via smartphones has become
popular in today's world. It allows people to store large volumes of data
in the cloud and to access them from anywhere. Individuals rely upon
Cloud Storage Providers (CSPs) such as Amazon, Dropbox, Google Drive,
and Firebase to store their information in the cloud due to the lack of
storage space on their mobile phones. The main concern in cloud storage
is privacy: to obtain it, confidentiality, integrity, and availability have
to be maintained. This paper addresses the development of a new Android
application that enables cloud users to store a geotagged secret image in
the form of shares in various CSPs and to reconstruct the secret image by
combining the shares. This key idea provides security to the stored data.
In this paper we also propose a (1, k, n) secret image sharing scheme
constructed using a (k−1, n−1) secret image sharing scheme. An image
encryption scheme is also used as a building block for mitigating
collusive attacks by CSPs. We have also implemented our apk in the
scenario of distributing shares by the dealer to a group of participants
within a single CSP.
1 Introduction
Portability and easy data backup are basic requirements for data storage, and these
are provided by cloud storage technology [19]. Public cloud storage is a technology
where data is stored on remote servers and services are made available to users via
the Internet. This service allows users to store files online so that they can be
accessed from anywhere at any time. It is maintained, operated, and managed by Cloud
Storage Providers (CSPs) based on virtualization techniques. Every cloud user has
unique credentials for storing and managing their information. Some CSPs provide
storage space up to a certain limit for free, beyond which one has to pay. Many CSPs
provide drag-and-drop and auto-sync of data between
the local devices and the cloud. Some of the CSPs are Dropbox, Google Drive, and
Firebase. Dropbox offers 2 GB of storage, the lowest compared to the other CSPs,
while Google Drive provides 5 GB. Firebase provides user authentication, cloud
messaging, crash reporting, notifications, etc.
Storing our data in the cloud introduces a new set of security challenges, although
handling public cloud storage typically has a lower risk profile than a private
server in the back of the office. There are mitigation techniques such as
encryption, secret sharing, and hashing for protecting data from security breaches.
Splitting data into several chunks and storing parts of it on multiple cloud
providers preserves data confidentiality and integrity and ensures availability
[15]. For availability, one can create replicas of the secret shares and distribute
them among multiple resource providers, and also create dummy shares to detect
whether outsiders are intercepting them [18].
Nowadays, the use of smartphones has increased rapidly. Android is one of the
leading operating systems in the mobile market, and a recent survey says that
Android has an 88% market share. Beyond being a mobile device, it can do many things
that a PC cannot. In today's world, mobile cloud storage has gained wide popularity
for storing and sharing data. Storing data in the cloud also saves phone storage
space: many Android phones suffer from very limited storage, and by keeping data in
the cloud, that memory can be allocated to apps for other purposes, thus improving
the performance and efficiency of the phone. Android provides various applications
(apks) that support cloud storage and sharing; currently, there are many apks in the
market which allow uploading files to multiple clouds, such as the Cloudii apk [7].
Here in this paper, we propose an apk to upload the geotagged secret image
shares to multiple clouds. Geotagging has become a popular feature on several
social media platforms which helps to capture GPS information at the time the
photo is taken. The secret sharing scheme is a technique used for securely sharing
data between the users. The idea of (k, n) threshold secret sharing scheme was
introduced by Adi Shamir [1] in 1979. This scheme was based on the polynomial
interpolation technique. The idea is to divide a secret in to n shares such that
it will be reconstructed only by k shares and not by less than k shares [16].
Here we depend on multiple CSPs for storing the shares of Geotagged secret
image which in turn help us as a prevention of single point of failure unlike
encrypting the image and storing in a single CSP. The cloud storage is more
secure and the risk level is also too low when compared to the local storage. But
with the multiplication of CSPs and sub-contractors in many countries, intricate
legal issues arise, as well as another fundamental issue: trust. Telling whether
trust should be placed in CSPs falls back onto end-users, with the implied costs
[13]. If the user distributes multiple secrets, reconstruction independence can be
maintained by independently [17]. By this way we could download the shares
from any of the k CSPs for reconstructing the secret image. Also if one server
is not available we can upload and share images via other CSPs. Additionally,
utilizing a multi-cloud deployment strategy can typically provide users with a
simple, easy interface for accessing and taking advantage of the public cloud's
scalability as needed through the apk. The protection of content using secret
sharing schemes in multi-cloud storage is addressed in [9,10]. Shamir's secret
sharing algorithm has a good foundation that provides an excellent platform for
proofs and applications [11]; the scheme's security rests on the fact that at least
k points are needed to uniquely reconstruct a polynomial of degree k − 1 [21]. A
technique to outsource a database to public clouds using Shamir's secret-sharing
scheme, and then provide privacy-preserving algorithms for performing search and
fetch, equijoin, and range queries using MapReduce, is discussed in [12]. In order
to provide privacy and also ensure security, two types of secure cloud computing are
considered: one with a trusted third party (TTP) and one without a TTP, in a more
efficient way [14]. A notable work on the development of Android apks that uses
secret sharing to split a file and then stores each of the shares on a separate
remote storage service was done at Newcastle University [8]. However, the
integration of a secret image sharing scheme with multi-cloud storage functionality
into an Android apk is addressed here for the first time in the literature, compared
to other related works.
One disadvantage of the above proposal is the (less probable) scenario in which any
k shares stored over multiple CSPs, when combined, would disclose the secret image
to the CSPs. In order to mitigate this, we propose a (1, k, n) secret image sharing
scheme using a (k − 1, n − 1) secret image sharing scheme and an image encryption
scheme as building blocks. There are studies in the literature on constructing
shares for binary images using deterministic [2] and probabilistic [3,4] (1, k, n)
visual cryptographic schemes [20]. Let us divide the n shares generated using the
(1, k, n) secret image sharing scheme into two sets, E = {e0} and
R = {r1, r2, r3, ..., rn−1}. The reconstruction of the secret image is done using
the e0 share from set E and any (k − 1) of the (n − 1) shares from set R. So the
(n − 1) shares from set R can be stored in multiple CSPs, and the e0 share from set
E can be stored as replicas in our own multiple private clouds, which mitigates the
single point of failure. Thus, when any k shares stored over multiple CSPs are
combined, they do not disclose the secret image to the CSPs. For the implemented apk
we have used one of the efficient (k, n) secret image sharing schemes, by Thien and
Lin [5], and the image encryption scheme by Alhusainy [6] from the literature.
The paper is organized as follows. Section 2 explains the (k, n) secret image
sharing scheme of Thien and Lin [5] and the image encryption scheme by Alhusainy
[6]. Section 3 presents a detailed explanation of our apk, which implements a
(1, k, n) secret image sharing model. Section 4 shows the implementation of our apk
for distribution of shares by a dealer to a group of participants in a single CSP.
Conclusions are given in Sect. 5.
2 Background
2.1 (k, n) Secret Image Sharing Scheme by Thien and Lin
Initially, this (k, n) secret image sharing algorithm divides the secret grey-level
image into m blocks, where m = l/k and l is the total number of pixels in the
grey-level image. Then all grey values between 251 and 255 in each block are
truncated to 250. For each dth block (1 ≤ d ≤ m), we define the (k − 1)-degree
polynomial
S_d(y) = \left(p_d^{0} + p_d^{1}\,y + \cdots + p_d^{k-1}\,y^{k-1}\right) \bmod 251,
where p_d^{0}, p_d^{1}, \ldots, p_d^{k-1} are the pixels of the dth block. The n
shares for the dth block are then S_d(1), S_d(2), S_d(3), \ldots, S_d(n). Thus k
pixels in a block are converted into a single pixel, so each share contains m pixels
in total. During the reconstruction phase, any k of the values
S_d(1), S_d(2), S_d(3), \ldots, S_d(n) are used with Lagrange interpolation [1] to
find the pixels of the dth block.
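A small sketch of this block-wise sharing over GF(251) is given below for the (2, 3) case used later in the implementation; the Lagrange interpolation is done modulo 251, and the two pixel values are illustrative.

# Thien-Lin style (k, n) sharing of one k-pixel block over the prime 251.
P = 251

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def make_shares(block, n):
    """Treat the k pixels of `block` (each already truncated to <= 250) as
    polynomial coefficients and evaluate the polynomial at y = 1..n."""
    return [sum(c * pow(y, i, P) for i, c in enumerate(block)) % P
            for y in range(1, n + 1)]

def reconstruct(points):
    """Recover the k coefficients (pixels) from k (y, share) points by
    Lagrange interpolation modulo P."""
    k = len(points)
    coeffs = [0] * k
    for j, (yj, sj) in enumerate(points):
        num, denom = [1], 1                      # product over m != j of (y - ym)
        for m, (ym, _) in enumerate(points):
            if m == j:
                continue
            num = poly_mul(num, [-ym % P, 1])
            denom = denom * (yj - ym) % P
        inv = pow(denom, P - 2, P)               # modular inverse of the denominator
        for i, c in enumerate(num):
            coeffs[i] = (coeffs[i] + sj * c * inv) % P
    return coeffs

block = [120, 37]                        # one 2-pixel block (k = 2)
shares = make_shares(block, 3)           # one share pixel per participant
print(shares)
print(reconstruct([(1, shares[0]), (3, shares[2])]))  # -> [120, 37]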
2.2 Image Encryption Scheme by Alhusainy

Initially, this encryption algorithm divides the secret grey-level image into m
blocks B0, B1, B2, B3, ..., Bm, each of size 16 × 16 bytes. Then a secret key SK0 of
size 16 × 16 bytes is selected at random, and block B0 is encrypted with SK0. For
encrypting the remaining blocks, different secret keys are generated from SK0. The
abstract way of encrypting the blocks B0, B1, B2, B3, ..., Bm is
E(Bi) = Transposition(Substitution(Bi, SKi)), and the same steps are applied in
reverse order on the encrypted block E(Bi) for decryption. The following operation
is performed during encryption and decryption to construct the next secret key
block: SKi+1 = Transposition(Substitution(E(Bi), SKi)). So the encryption/decryption
of block Bi+1 is done only after encrypting/decrypting block Bi (a rough sketch of
this chaining appears after the list below). The detailed explanation of the
algorithm and its advantages are given in [6]. The advantages are:
– It can encrypt a grey-level secret image of any size w × h with a 16 × 16-byte key.
– The algorithm is equally secure compared to the Data Encryption Standard and the
  Advanced Encryption Standard when analysing the results of visual and statistical
  tests, signal-to-noise ratio, peak signal-to-noise ratio, and normalized mean
  absolute error.
– The time taken for encryption is less when compared to other methods.
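The chaining described above can be sketched as follows; the substitution (byte-wise XOR) and transposition (fixed rotation) placeholders are deliberately simplistic stand-ins for the operations defined in [6] and serve only to show how each encrypted block drives the derivation of the next key.

import os

BLOCK = 256   # 16 x 16 bytes

def substitution(block: bytes, key: bytes) -> bytes:
    # Placeholder substitution: byte-wise XOR with the key block.
    return bytes(b ^ k for b, k in zip(block, key))

def transposition(block: bytes) -> bytes:
    # Placeholder transposition: rotate the block by a fixed offset.
    return block[13:] + block[:13]

def encrypt_blocks(blocks, sk0: bytes):
    """E(B_i) = Transposition(Substitution(B_i, SK_i)),
    SK_{i+1} = Transposition(Substitution(E(B_i), SK_i))."""
    key, out = sk0, []
    for b in blocks:
        eb = transposition(substitution(b, key))
        out.append(eb)
        key = transposition(substitution(eb, key))   # evolve the key
    return out

def decrypt_blocks(enc_blocks, sk0: bytes):
    """Reverse the per-block steps, evolving the key the same way."""
    key, out = sk0, []
    for eb in enc_blocks:
        # Undo the rotation, then the XOR (which is its own inverse).
        out.append(substitution(eb[-13:] + eb[:-13], key))
        key = transposition(substitution(eb, key))
    return out

sk0 = os.urandom(BLOCK)
blocks = [os.urandom(BLOCK) for _ in range(4)]        # stand-in image blocks
assert decrypt_blocks(encrypt_blocks(blocks, sk0), sk0) == blocks
print("round-trip OK")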
The credentials of the user are stored in the cloud named Firebase (since user
authentication is provided by Firebase) at the time of initial registration.
Whenever the user enters their information, it is verified against the data stored
in the cloud. If the credentials match, the user is successfully logged into the
application, which allows the user to upload and download images. The user has to
choose whether he/she wants to upload or download a picture. If the user opts to
upload a picture, he/she chooses whether the picture is to be captured live or
selected from the gallery where existing images are stored. If the picture is to be
captured live, this is done by enabling the camera feature of the application, which
also tags the GPS location in it. The captured image is then separated into shares
using the (1, k, n) secret image sharing scheme, and the shares are stored in the
gallery. If the user prefers to upload the share images, he/she can choose them
directly from the gallery for storage in the separate clouds Dropbox, Firebase,
Google Drive, etc. The major goal of multi-cloud is to provide “computing”,
“storage”, and “software” as a service [22]. Fig. 1 shows the architecture of our
apk.
The idea behind this GPS camera is as follows. When the user wants to upload a live image, he/she selects the “Take Photo” option in the Android apk, as shown in Fig. 9. Normally, choosing “Take Photo” triggers the built-in camera through the android.hardware.Camera library, but the developer cannot change the behavior of the built-in camera. To add extra features, the developer therefore has to implement another camera instead of calling the built-in one. In this application we make use of a secondary camera, the GPS camera, which performs geotagging: it obtains the GPS location information of the image, i.e., the latitude and longitude of the position from which the image is taken. The latitude and longitude are obtained through the android.location package. Whenever the user takes a photo, the location details, i.e., the current address of the user at the time the photo is taken, are tagged to the image, and a map of the current location is shown as in Fig. 2. The image is captured along with this information.
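In the apk itself the fix comes from the android.location API and the image is written by the custom camera. As a language-neutral illustration only, the sketch below shows how a decimal latitude/longitude pair can be converted into the degrees/minutes/seconds rationals that EXIF GPS tags expect; the field names follow the EXIF GPS IFD convention, and actually embedding them in the captured JPEG is left to the platform's EXIF facilities.

```python
# Illustration only: convert a decimal GPS fix (as obtained on Android through
# android.location) into EXIF-style GPS fields. Writing these fields into the
# captured JPEG is platform specific and omitted here.

def to_dms_rationals(value):
    """Decimal degrees -> ((deg, 1), (min, 1), (sec*100, 100)) EXIF rationals."""
    value = abs(value)
    deg = int(value)
    minutes = int((value - deg) * 60)
    seconds = round(((value - deg) * 60 - minutes) * 60 * 100)
    return ((deg, 1), (minutes, 1), (seconds, 100))

def gps_exif_fields(lat, lon):
    return {
        "GPSLatitudeRef":  "N" if lat >= 0 else "S",
        "GPSLatitude":     to_dms_rationals(lat),
        "GPSLongitudeRef": "E" if lon >= 0 else "W",
        "GPSLongitude":    to_dms_rationals(lon),
    }

print(gps_exif_fields(13.35, 74.79))   # a fix near Manipal, for example
```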
Fig. 2. Geotagging
4. Generate (n − 1) shares ER1 , ER2 ,....., ER(n−1) from ERed , (n − 1) shares EG1 ,
EG2 ,....., EG(n−1) from EGreen and (n − 1) shares EB1 , EB2 ,....., EB(n−1) from
EBlue using (k − 1, n − 1) secret image sharing scheme [5].
5. Then combine the grey levels (ER1 , EG1 , EB1 ), (ER2 , EG2 , EB2 ),....., (ER(n−1) ,
EG(n−1) , EB(n−1) ) to form the color images EGI1 , EGI2 ,....., EGIn−1 .
6. Then store EGI1 , EGI2 ,....., EGIn−1 into CSP1 , CSP2 ,....., CSPn−1 , respectively (a minimal sketch of steps 4–6 is given below).
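The sketch below illustrates steps 4–6. It assumes flat lists of pixel values for the encrypted planes ERed, EGreen and EBlue, reduces pixel values modulo the prime 251 as in Thien and Lin's scheme [5], and abstracts the actual cloud upload into a hypothetical store_in_csp helper.

```python
# Sketch of steps 4-6: (k-1, n-1) Thien-Lin style sharing of the three encrypted
# colour planes, recombination into colour share images EGI_1..EGI_{n-1}, and
# distribution to the CSPs. store_in_csp is a hypothetical upload helper.

P = 251  # prime modulus used by Thien and Lin's scheme [5]

def thien_lin_shares(plane, k, n):
    """Split a flat list of pixel values into n shares with threshold k."""
    shares = [[] for _ in range(n)]
    for i in range(0, len(plane), k):
        coeffs = [v % P for v in plane[i:i + k]]      # one polynomial per k pixels
        for x in range(1, n + 1):                     # evaluate at x = 1..n
            y = sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P
            shares[x - 1].append(y)
    return shares

def make_colour_shares(e_red, e_green, e_blue, k, n):
    r = thien_lin_shares(e_red,   k - 1, n - 1)       # step 4
    g = thien_lin_shares(e_green, k - 1, n - 1)
    b = thien_lin_shares(e_blue,  k - 1, n - 1)
    return [list(zip(r[i], g[i], b[i])) for i in range(n - 1)]   # step 5

def distribute(colour_shares, csps):
    for egi, csp in zip(colour_shares, csps):         # step 6
        store_in_csp(csp, egi)                        # hypothetical upload call
```

With k = 3 and n = 4, this reduces to the (2, 3) sharing used in our implementation, where each grey-level share carries one value per pair of pixels.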
Based on the above algorithm, it is evident that only three 16 × 16-byte key shares are stored in our own private cloud, while the (n − 1) shares of the GI, each of size 3 × w × h bytes, are stored in multiple CSPs. Thus the bulk of the data is outsourced to the public cloud and only a small amount is kept in our own private cloud or devices, which preserves the privacy of the secret image. The user interface of our apk is shown in Fig. 3. We have implemented a (1, 3, 4) secret image sharing scheme using the (2, 3) secret image sharing scheme of Thien and Lin [5] and the image encryption scheme of Al-Husainy [6]. Since we use the (2, 3) scheme of Thien and Lin [5], the image shares are half the size of the geotagged image, as shown in Fig. 3. First the geotagged image is encrypted with three 16 × 16-byte key shares. Then the three shares are generated and stored in Firebase, Google Drive and Dropbox, as shown in Figs. 4, 5 and 6, respectively. Regarding the implementation, our apk creates the shares of the geotagged image and stores them into any one of the CSPs (Dropbox, Google Drive or Firebase) in a single run. To upload a share to another CSP, the apk has to be started again. Support for uploading shares to multiple CSPs in a single run is in progress.
5 Conclusion
This paper proposes a novel Android apk that integrates a secret image sharing method and multi-cloud storage functionality into a single architecture with a geotagging feature. The apk makes it possible to quickly upload shares of geotagged secret pictures to multiple CSPs, independent of location and time, whenever mobile data or Wi-Fi is available. The (1, k, n) secret image sharing scheme proposed in this paper mitigates the privacy problem that arises when multiple CSPs collude to recover the cloud user's original secret. Moreover, a large image is encrypted using a key of limited size, which reduces the burden of key storage on the device or private cloud. The apk also supports a scenario in which the dealer creates secret shares from the image and distributes them to a group of participants through Dropbox.
References
1. Shamir, A.: How to share a secret. Commun. ACM 22(11), 612–613 (1979)
2. Arumugam, S., Lakshmanan, R., Nagar, A.K.: On (k, n)*-visual cryptography
scheme. Des. Codes Crypt. 1–10 (2012)
3. Praveen, K., Rajeev, K., Sethumadhavan, M.: On the extensions of (k, n)*-visual
cryptographic schemes. In: Martı́nez Pérez, G., Thampi, S.M., Ko, R., Shu, L.
(eds.) SNDS 2014. CCIS, vol. 420, pp. 231–238. Springer, Heidelberg (2014).
https://doi.org/10.1007/978-3-642-54525-2 21
4. Praveen, K., Sethumadhavan, M.: A probabilistic essential visual cryptographic
scheme for plural secret images. In: Kumar Kundu, M., Mohapatra, D.P., Konar,
A., Chakraborty, A. (eds.) Advanced Computing, Networking and Informatics-
Volume 2. SIST, vol. 28, pp. 225–231. Springer, Cham (2014). https://doi.org/10.
1007/978-3-319-07350-7 25
5. Thien, C.C., Lin, J.C.: Secret image sharing. Comput. Graph. 26(5), 765–770
(2002)
6. Al-Husainy, M.A.F.: A novel image encryption algorithm based on the extracted
map of overlapping paths from the secret key. RAIRO-Theor. Inf. Appl. 50(3),
241–249 (2016)
7. https://apkpure.com/cloudii/com.getcloudii.android
8. https://www.futurelearn.com/courses/cyber-security/0/steps/19605
9. Chong, J., Wong, C.J., Ha, S., Chiang, M.: CYRUS: Towards client defined Cloud
storage. In: Proceedings of EuroSys (2015)
10. Pundkar, S.N., Shekokar, N.: Cloud computing security in multi-clouds using
Shamir’s secret sharing scheme. In: Electrical, Electronics, and Optimization Tech-
niques (ICEEOT), pp. 392–395 (2016)
11. Muhil, M., Krishna, U.H., Kumar, R.K., Anita, E.M.: Securing multi-cloud using
secret sharing algorithm. Procedia Comput. Sci. 50, 421–426 (2015)
12. Dolev, S., Li, Y., Sharma, S.: Private and secure secret shared MapRe-
duce (Extended abstract). In: Ranise, S., Swarup, V. (eds.) DBSec 2016.
LNCS, vol. 9766, pp. 151–160. Springer, Cham (2016). https://doi.org/10.1007/
978-3-319-41483-6 11
13. Attasena, V., Harbi, N., Darmont, J.: A novel multi-secret sharing approach
for secure data warehousing and on-line analysis processing in the cloud. arXiv
preprint arXiv:1701.05449 (2017)
14. Yang, C.N., Lai, J.B., Fu, Z.: Protecting user privacy for cloud computing by
bivariate polynomial based secret sharing. CIT J. Comput. Inf. Technol. 23(4),
341–355 (2015)
15. Morozan, I.: A new model to provide security in cloud computing. Vrije Universiteit
16. Takahashi, S., Iwamura, K.: Secret sharing scheme suitable for cloud computing.
In: 2013 IEEE 27th International Conference on Advanced Information Networking
and Applications (AINA), pp. 530–537. IEEE, March 2013
17. Takahashi, S., Kobayashi, S., Kang, H., Iwamura, K.: Secret sharing scheme for
cloud computing using IDs. In: 2013 IEEE 2nd Global Conference on Consumer
Electronics (GCCE), pp. 528–529. IEEE, October 2013
18. Pal, D., Khethavath, P., Thomas, J.P., Chen, T.: Multilevel threshold secret shar-
ing in distributed cloud. In: Abawajy, J.H., Mukherjea, S., Thampi, S.M., Ruiz-
Martı́nez, A. (eds.) SSCC 2015. CCIS, vol. 536, pp. 13–23. Springer, Cham (2015).
https://doi.org/10.1007/978-3-319-22915-7 2
19. Wu, H.L., Chang, C.C.: A robust image encryption scheme based on RSA and
secret sharing for cloud storage systems. J. Inf. Hiding Multimedia Sig. Process.
6(2), 288–296 (2015)
20. Dong, X., Jiadi, Y., Luo, Y., Chen, Y., Xue, G., Li, M.: P2E: privacy-preserving
and effective cloud data sharing service. In: 2013 IEEE Global Communications
Conference (GLOBECOM), pp. 689–694. IEEE, December 2013
21. Dautrich, J.L., Ravishankar, C.V.: Security limitations of using secret sharing for
data outsourcing. In: Cuppens-Boulahia, N., Cuppens, F., Garcia-Alfaro, J. (eds.)
DBSec 2012. LNCS, vol. 7371, pp. 145–160. Springer, Heidelberg (2012). https://
doi.org/10.1007/978-3-642-31540-4 12
22. Kaufman, L.M.: Data security in the world of cloud computing. IEEE Secur. Priv.
7(4) (2009)
Tracking of GPS Parameters
Using Particle Filter
Abstract. For the proper functioning of the GPS system, it is important that GPS receivers track the code and carrier effectively. A GPS receiver calculates the time taken for a signal to propagate from a satellite by comparing the “pseudo random code” it generates with the code contained in the signal received from the satellite, so the code must be tracked effectively before the two become out of phase. The tracking module continuously synchronizes the acquired satellite signal with locally generated code and carrier frequencies. A Kalman filter is commonly used to track these parameters. To improve the efficiency of estimation and to obtain faster and more accurate results, a particle filter (PF) is proposed, which further reduces the complexity compared with the Kalman filter.
1 Introduction
The global positioning system (GPS) is a satellite-based network. The system provides a precise three-dimensional position and velocity estimate of a person or an object anywhere on earth. It operates by tracking the time-of-arrival of spread spectrum signals. Each satellite transmits radio signals that provide the precise position and other parameters [1]. Using the method of triangulation, GPS receivers calculate the user's accurate location. A GPS receiver operates in two portions: hardware and software. Tracking and acquisition fall in the software part of the receiver. The first objective of the software part is to determine the availability of the satellite signal [2]. After this task is completed, the GPS receiver tracks the code and carrier components of the signal. Tracking uses a delay lock loop (DLL) and a Costas loop, where the former tracks the coarse/acquisition (C/A) code sequence and the latter tracks the carrier of the received satellite signal [3]. The output of the loops is the decoded navigation message. Using pseudorange measurements from these loops, the position of the user is calculated. The code and carrier must be in lock for a receiver to keep tracking. The loops begin to lose lock whenever the signal becomes weak; when this happens the receiver cannot track further until the signal becomes stronger again.
The GPS space vehicles (satellites) transmit two carrier frequencies, L1 and L2, known as the primary and secondary frequencies respectively [4]. The carrier frequencies are modulated by spread spectrum codes with a unique pseudorandom noise sequence associated with each satellite and by the navigation data message [5]. BPSK modulation is used. All satellites transmit on the same carrier frequencies, but their signals do not interfere with each other because of the PRN code modulation. The satellite signals are separated and detected by a technique called code division multiple access (CDMA) [6], which is a type of spread spectrum multiple access. The GPS signal is demodulated through a series of steps: first acquisition, followed by tracking and then demodulation. Acquisition is carried out to identify the satellites visible to the user and determines the frequency and code phase of the signal. The frequency of the signal from each satellite deviates from its nominal value, and after down-conversion the GPS signal frequency lies at the IF. The code phase denotes the point where the C/A code starts.
2 Tracking Channel
The signal received from the satellite is always a mixture of the carrier signal, the PRN code and the navigation data [7]. Tracking a channel is important because the coarse values of frequency and code phase obtained during acquisition need to be refined. To obtain the position of a GPS receiver, the navigation data must be extracted from this mixture. This is done by the tracking channel, which generates two replicas, one for the carrier and one for the code, as shown in Fig. 1. After the receiver is synchronized with the received signal, it has to keep operating in the locked state with the code sequence of the incoming message signal. Pseudorandom noise codes are deterministic sequences generated with the help of a clocked feedback shift register.
Fig. 1. Tracking channel (input: incoming message signal; output: navigation data).
3 Code Tracking
Code tracking is used to obtain an exact replica of the incoming code and is usually realized as a delay lock loop (DLL) [7]. Three code replicas, offset from one another by half a chip, are generated and correlated with the incoming signal. The DLL keeps the locally generated PN sequence aligned in time with the received direct sequence. To estimate the time delay between the received and local signals, the received signal is correlated with the local PN sequence [8]. From a security point of view, spoofing is a method of creating false signals that feed incorrect data, such as time and location, to receivers. To prevent spoofing, manufacturers should employ encryption technologies that make such signals very difficult to forge.
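The early–prompt–late correlation at the heart of the DLL can be sketched as follows. The half-chip spacing follows the description above, while the code generation, sampling and the normalized early-minus-late discriminator are simplified placeholders rather than a full receiver channel.

```python
import numpy as np

# Sketch of a DLL discriminator: correlate the incoming samples with early,
# prompt and late code replicas spaced half a chip apart, then form a
# normalized early-minus-late discriminator. The code array is a stand-in,
# not a real C/A generator.

def correlate(signal, replica):
    return abs(np.dot(signal, replica))

def dll_discriminator(signal, code, code_phase, samples_per_chip):
    half_chip = samples_per_chip // 2
    early  = np.roll(code, code_phase - half_chip)
    prompt = np.roll(code, code_phase)
    late   = np.roll(code, code_phase + half_chip)
    e, p, l = (correlate(signal, r) for r in (early, prompt, late))
    return (e - l) / (e + l)   # sign tells which way to shift the local replica
```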
4 Carrier Tracking
For data demodulation, an exact replica of the carrier wave is generated with the help of a frequency lock loop or a phase locked loop. The carrier and the PRN code of the input signal are wiped off when the first two multiplications are carried out. The frequency of the locally generated carrier wave is adapted according to the feedback given by the change in phase error.
5 Costas Loop
A Costas loop is generally used by receivers to reconstruct a carrier reference from an input signal whose carrier component is totally suppressed [9]. The Costas loop is widely used for carrier tracking because it is insensitive to 180° phase shifts and is little affected by phase transitions caused by the message data. The phase error of the local carrier wave is given by Eq. (1) [10, 11]:

$$\varphi = \arctan\!\left(\frac{Q_p}{I_p}\right) \qquad (1)$$

In the Costas loop shown in Fig. 2, the input signal is first multiplied by the locally generated carrier and then by its 90° phase-shifted version.
Fig. 2. Block diagram of carrier tracking using the Costas loop.
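A compact sketch of the Costas discriminator of Eq. (1) follows: the incoming samples are multiplied by the local carrier and by its 90°-shifted copy, averaged as a crude low-pass filter, and the resulting arctan(Qp/Ip) phase error is fed back to the local carrier phase. The loop gain is an illustrative placeholder, not a designed loop filter.

```python
import numpy as np

# One Costas-loop update over an integration interval: I/Q arms, crude
# low-pass filtering by averaging, arctan discriminator of Eq. (1),
# and phase feedback with an illustrative loop gain.

def costas_step(samples, t, f_carrier, phase, loop_gain=0.1):
    local_i = np.cos(2 * np.pi * f_carrier * t + phase)
    local_q = np.sin(2 * np.pi * f_carrier * t + phase)
    i_p = np.mean(samples * local_i)           # in-phase prompt correlation
    q_p = np.mean(samples * local_q)           # quadrature prompt correlation
    phase_error = np.arctan(q_p / i_p)         # Eq. (1); insensitive to 180° flips
    return phase + loop_gain * phase_error     # updated local carrier phase
```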
6 Signal Model
The received GPS signal can be modelled as in Eq. (2):

$$s(t) = \sqrt{2P_s}\, d(t)\, PN(t)\, \cos(2\pi f_c t + \theta) \qquad (2)$$

where
Ps is the power of the transmitted signal,
d(t) is the binary phase-shift keyed (BPSK) navigation data,
fc is the carrier frequency,
PN(t) is the pseudorandom noise (PRN) code.
The pseudorandom noise code PN(t) is modelled as shown in Eq. (3):

$$PN(t) = \sum_{m=-\infty}^{+\infty} \sum_{K=0}^{L_{ca}-1} C_K\, P_{T_c}(t - KT_c - mT_{ca}) \qquad (3)$$

where
Tca is the period of the C/A PRN sequence, measured in seconds,
Tc is the chip duration, given by $T_c = T_{ca}/L_{ca}$.
The state space equations are as shown in Eqs. (4) and (5):

$$\begin{pmatrix} \theta_k \\ F_k \\ \Delta F_k \end{pmatrix} = \begin{pmatrix} 1 & \Delta t & \Delta t^2/2 \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \theta_{k-1} \\ F_{k-1} \\ \Delta F_{k-1} \end{pmatrix} \qquad (4)$$

$$Z_k = \begin{pmatrix} 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} \theta_k \\ F_k \\ \Delta F_k \end{pmatrix} \qquad (5)$$
where
θk is the phase change of the received carrier,
Fk is the carrier frequency, determined from the rate of change of the carrier phase,
ΔFk is the derivative of the carrier frequency, which varies linearly with time.
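A short sketch of the state model of Eqs. (4) and (5) is given below: the state [θk, Fk, ΔFk] is propagated by the transition matrix and only the carrier phase is observed. The interval Δt and the initial state are illustrative values, not parameters taken from the paper.

```python
import numpy as np

# Propagate the carrier state of Eqs. (4)-(5): phase, frequency and frequency
# rate evolve under the transition matrix; only the phase change is measured.

dt = 1e-3                                      # integration interval (illustrative)
A = np.array([[1.0, dt, dt ** 2 / 2],
              [0.0, 1.0, dt],
              [0.0, 0.0, 1.0]])                # transition matrix of Eq. (4)
H = np.array([[1.0, 0.0, 0.0]])                # measurement matrix of Eq. (5)

state = np.array([0.0, 50.0, 0.5])             # [theta, F, dF], illustrative start
for _ in range(5):
    state = A @ state                          # state propagation
    z = H @ state                              # measured phase change Z_k
    print(float(z[0]))
```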
7 Estimation Methods
7.1 Kalman Filter
A Kalman filtering (KF) based estimator is extensively proposed in [18, 19]. The methodology of the KF based estimator and the equations of its algorithm are also discussed in this paper.
The tracking loops cannot be used in all circumstances; they have some flaws. Loops generally use filters of fixed bandwidth, which makes them unsuitable for high user dynamics [12]. The order of the tracking loop filter determines the dynamics that the loop can track with zero steady-state error, so the designer is left with a trade-off. One remedy is to increase the filter bandwidth: wider bandwidths improve the loop's ability to track high user dynamics, but they also make the loop more susceptible to noise. The Kalman filter, whose gain varies with time, was therefore used. When this filter is given the relevant process and measurement noise matrices, it can easily distinguish the signal from the noise [13].
The Kalman filter equations are as shown in Eqs. (6) and (7):

$$x_k = A x_{k-1} + B u_k + w_{k-1} \qquad (6)$$

$$z_k = H x_k + V_k \qquad (7)$$
where
A is an n×n matrix that relates the state at instant k−1 to the present instant k,
B is a matrix that relates the control input uk to the predicted state x,
H is a matrix that relates the measurement zk to the state,
wk−1 and Vk are random variables representing the process and measurement noise, respectively.
The noises are assumed to be white and Gaussian. Even though the Kalman filter is widely used, it is optimal only for estimating linear systems with Gaussian noise [14]. When the system becomes nonlinear, particle filters, which are more flexible, are used instead.
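A minimal Kalman filter step built on Eqs. (6) and (7), usable with the carrier state model of Eqs. (4) and (5), is sketched below; the control term of Eq. (6) is omitted and the process and measurement covariances Q and R are illustrative placeholders.

```python
import numpy as np

# One predict/update cycle of the Kalman filter of Eqs. (6)-(7) with a
# time-varying gain. Q and R (process / measurement covariances) are
# illustrative placeholders; no control input is used.

def kalman_step(x, P, z, A, H, Q, R):
    # Predict (Eq. 6 without the control term)
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update with the measurement z (Eq. 7)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain, varies with time
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```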
7.2 Particle Filter
For prediction and estimation of the posterior density function we have two models, the system model and the measurement model, both in probabilistic form. In a Bayesian approach, all models and state variations are generally considered in probabilistic form. The particle filter is a recursive filtering approach with two stages, prediction and update, which use the system model and the measurement model respectively.
For estimation we define a vector xk that represents the state of the system at instant k, as shown in Eq. (8):

$$x_k = f_k(x_{k-1}, V_k) \qquad (8)$$

where Vk represents the Gaussian noise present and fk is a nonlinear, time-varying function [16]. The state can be estimated using noisy measurements zk, which are governed by Eq. (9):

$$z_k = h_k(x_k, n_k) \qquad (9)$$

The measurements up to instant k are denoted by z1:k. Estimation is carried out by computing the probability distribution p(xk | z1:k) recursively in two steps.
Prediction step:
p(xk | z1:k−1) is computed from p(xk−1 | z1:k−1) at instant k−1, as shown in Eq. (10):

$$p(x_k \mid z_{1:k-1}) = \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid z_{1:k-1})\, dx_{k-1} \qquad (10)$$
Update step:
The prior estimate is updated with the new measurement zk to obtain the posterior state estimate, as shown in Eq. (11):

$$p(x_k \mid z_{1:k}) = \frac{p(z_k \mid x_k)\, p(x_k \mid z_{1:k-1})}{p(z_k \mid z_{1:k-1})} \qquad (11)$$
The problem is that we cannot directly compute or operate on the functions fk and hk, so we resort to an approximate method, sequential importance sampling (SIS). The aim of SIS is to find the posterior distribution at instant k−1, p(x0:k−1 | z1:k−1), with a set of samples (known as particles) and to update the particles repeatedly so that an approximate posterior distribution is achieved [17]. Particles are generated by taking samples from a proposal distribution q(x) and updating them with respect to the target distribution p(x). The weight of each particle is denoted wi and is obtained by the relation shown in Eq. (12):

$$w^i = \frac{p(x^i)}{q(x^i)} \qquad (12)$$
The particle weights are recursively updated using the system and measurement models of Eqs. (8) and (9), which results in the update shown in Eq. (14):

$$w_k^i = w_{k-1}^i\, \frac{p(z_k \mid x_k^i)\, p(x_k^i \mid x_{k-1}^i)}{q(x_k^i \mid x_{0:k-1}^i, z_{1:k})} \qquad (14)$$
In practice, we face the degeneracy problem [13]: after a few iterations only a few particles retain significant weights while the rest have negligible weights. The effective sample size is given by Eq. (15):

$$N_{eff} = \frac{1}{\sum_{i=1}^{N} (w_k^i)^2} \qquad (15)$$

When Neff is small the weights have a large variance, which implies more degeneracy.
The steps of particle filtering are as follows (a minimal sketch implementing them is given after the list):
(a) Initialize a set of Np particles using any suitable random distribution and assign each particle an initial weight of 1/Np.
(b) Obtain the nonlinear/linear state update and measurement equations used for estimation.
(c) Using these equations, estimate the state xk at step k.
(d) Update the weights of the particles as shown in Eq. (16):

$$w_n^p = w_{n-1}^p\, \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(z_n - x_n)^2 / 2\sigma^2} \qquad (16)$$
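The sketch below follows steps (a)–(d): particles are initialized with equal weights 1/Np, propagated through a state model, reweighted with the Gaussian likelihood of Eq. (16), and resampled when the effective sample size of Eq. (15) becomes small. The scalar random-walk state model and the noise levels are illustrative, not the exact simulation setup used here.

```python
import numpy as np

# Bootstrap-style particle filter following steps (a)-(d). The scalar
# random-walk state model and the noise levels are illustrative only.

rng = np.random.default_rng(0)
Np, sigma = 200, 0.5                            # particle count, measurement noise std

particles = rng.normal(0.0, 1.0, Np)            # (a) initialise particles ...
weights = np.full(Np, 1.0 / Np)                 # ... with equal weights 1/Np

def pf_step(particles, weights, z):
    # (b)-(c): propagate the particles through the (here: random walk) state model
    particles = particles + rng.normal(0.0, 0.1, Np)
    # (d): Gaussian likelihood weight update, Eq. (16)
    lik = np.exp(-(z - particles) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    weights = weights * lik
    weights = weights / weights.sum()
    # Resample when the effective sample size of Eq. (15) drops too low
    n_eff = 1.0 / np.sum(weights ** 2)
    if n_eff < Np / 2:
        idx = rng.choice(Np, size=Np, p=weights)
        particles, weights = particles[idx], np.full(Np, 1.0 / Np)
    return particles, weights, np.sum(weights * particles)   # state estimate
```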
8 Simulation Results
In this section we study the simulation results obtained. All simulations were carried out in MATLAB. From Fig. 4 we can infer that increasing the number of iterations reduces the error and therefore gives better results. Similarly, Fig. 5 shows that increasing the number of particles also increases the accuracy and reduces the error, but increasing the number of particles and iterations increases the computation time, as illustrated in Tables 1 and 2. From Fig. 6 we can infer that the particle filter algorithm produces accurate results with high fidelity; its error margin is very low compared with other estimation methods. Figure 7 shows that the MSE is higher for the Kalman filter (KF) than for the particle filter (PF) even when the SNR is increased. The increase in computational time therefore gives a trade-off between accuracy and computation time: faster results come at the cost of accuracy, and more accurate results take more time.

Fig. 4. Comparison of SNR versus MSE for 100, 150 and 200 iterations.
Fig. 5. Comparison of SNR versus MSE for 50, 100 and 200 particles.
Fig. 6. Output graph of estimated phase change versus received phase change.
Fig. 7. Plot of MSE versus SNR for the Kalman filter and the particle filter.
9 Conclusion
The tracking of code and carrier using the particle filtering method has been discussed in this paper. The phase change in a GPS receiver is estimated for better communication between the satellite and the user. Among the different techniques discussed, the proposed particle filtering method produces better and more accurate results, as supported by MATLAB simulations. From the above results it can be concluded that the particle filter is superior and provides high fidelity and statistical efficiency, even though its computational cost is higher than that of the other methods.
References
1. Ward, P.M.: GPS Receivers, Receiver Signals and Principals of Operation. The Abdus
Salam International Centre for Theoretical Physics, January 1997
2. Kim, S.-J., Iltis, R.A.: STAP for GPS Receiver Synchronization. IEEE Trans. Aerosp. Elec-
tron. Syst. 40(1), 132–144 (2004)
3. Soubielle, J., Fijalkow, I., Duvaut, P., Bibaut, A.: GPS Positioning in a Multipath
Environment. IEEE Trans. Signal Process. 50(1), 141–150 (2002)
4. Matosevic, M., Salcic, Z., Berber, S.: A Comparison of Accuracy Using a GPS and a
Low-Cost DGPS. IEEE Trans. Instrum. Meas. 55(5), 1677–1683 (2006)
5. Al Rashed, M.A., Oumar, O.A., Singh, D.: A real time GSM/GPS based tracking system
based on GSM mobile phone
6. Enge, P., Misra, P.: Scanning the Issue/Technology. Proc. IEEE 87(1), 1–13 (1999)
7. Misra, R., Palod, S.: Code and Carrier Tracking Loops for GPS C/A Code. Int. J. Pure Appl.
Sci. Technol. 6(1), 1–20 (2011)
8. Wilde, A.: The Generalized Delay Locked Loop. Wirel. Pers. Commun. 8, 113–130 (1998)
9. Cahn, C.R.: Improving Frequency Acquisition of a Costas Loop. IEEE Trans. Commun. 25(12), 1453–1459 (1977)
10. Simon, M.K.: The Effects of Residual Carrier on Costas Loop Performance as Applied to the Space Shuttle Orbiter S-Band Uplink. IEEE Trans. Commun. 26(11), 1542–1548 (1978)
11. Simon, M.K.: Tracking Performance of Costas Loop with Hard-Limited In-phase Channel.
IEEE Trans. Commun. 26(4), 420–432 (1978)
12. Kumar, J.P., Rarotra, N., Maheswari, U.: Design and Implementation of Kalman Filter for
GPS Receivers. Indian J. Sci. Technol. 8(25), 1–5 (2015)
13. Lashley, M.: Kalman Filter Based Tracking Algorithms for Software GPS Receivers. Master's thesis, Auburn University (2006)
14. Doucet, A., de Freitas, N., Gordon, N.: An Introduction to Sequential Monte Carlo Methods.
In: Doucet, A., de Freitas, N., Gordon, N. (eds.) Sequential Monte Carlo Methods in
Practice. ISS. Springer, New York (2001). doi:10.1007/978-1-4757-3437-9_1
15. Arulampalam, M.S., et al.: A tutorial on particle filters for online nonlinear/non-Gaussian
Bayesian tracking. IEEE Trans. Signal Process. 50(2), 174–188 (2002)
16. Yang, T., Mehta, P.G., Meyn, S.P.: Feedback particle filter for a continuous-time Markov
chain. IEEE Trans. Autom. Control 61(2), 556–561 (2016)
17. Welch, G., Bishop, G.: An Introduction to the Kalman Filter. University of North Carolina at Chapel Hill (1995)
18. Seshadri, V., Sudheesh, P., Jayakumar, M.: Tracking the variation of tidal stature using Kalman filter. In: International Conference on Circuit, Power and Computing Technologies (ICCPCT 2016) (2016)
19. Nair, N., Sudheesh, P., Jayakumar, M.: 2-D tracking of objects using Kalman filter. In: International Conference on Circuit, Power and Computing Technologies (ICCPCT 2016) (2016)
Author Index