Lecture Notes in Computer Science 8413
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, UK
Takeo Kanade, USA
Josef Kittler, UK
Jon M. Kleinberg, USA
Alfred Kobsa, USA
Friedemann Mattern, Switzerland
John C. Mitchell, USA
Moni Naor, Israel
Oscar Nierstrasz, Switzerland
C. Pandu Rangan, India
Bernhard Steffen, Germany
Doug Tygar, USA
Demetri Terzopoulos, USA
Gerhard Weikum, Germany
Volume Editors
Erika Ábrahám
RWTH Aachen University
Aachen, Germany
E-mail: [email protected]
Klaus Havelund
Jet Propulsion Laboratory
California Institute of Technology
Pasadena, CA, USA
E-mail: [email protected]
Foreword

ETAPS 2014 was the 17th instance of the European Joint Conferences on The-
ory and Practice of Software. ETAPS is an annual federated conference that
was established in 1998, and this year consisted of six constituting conferences
(CC, ESOP, FASE, FoSSaCS, TACAS, and POST) including eight invited speak-
ers and two tutorial speakers. Before and after the main conference, numerous
satellite workshops took place and attracted many researchers from all over the
globe.
ETAPS is a confederation of several conferences, each with its own Program
Committee (PC) and its own Steering Committee (if any). The conferences cover
various aspects of software systems, ranging from theoretical foundations to pro-
gramming language developments, compiler advancements, analysis tools, formal
approaches to software engineering, and security. Organizing these conferences
in a coherent, highly synchronized conference program enables participation in an exciting event, with the possibility to meet many researchers working in different directions in the field and to easily attend the talks of different conferences.
The six main conferences together received 606 submissions this year, 155 of
which were accepted (including 12 tool demonstration papers), yielding an over-
all acceptance rate of 25.6%. I thank all authors for their interest in ETAPS, all
reviewers for the peer reviewing process, the PC members for their involvement,
and in particular the PC co-chairs for running this entire intensive process. Last
but not least, my congratulations to all authors of the accepted papers!
ETAPS 2014 was greatly enriched by the invited talks of Geoffrey Smith
(Florida International University, USA) and John Launchbury (Galois, USA),
both unifying speakers, and the conference-specific invited speakers (CC) Benoı̂t
Dupont de Dinechin (Kalray, France), (ESOP) Maurice Herlihy (Brown
University, USA), (FASE) Christel Baier (Technical University of Dresden, Ger-
many), (FoSSaCS) Petr Jančar (Technical University of Ostrava, Czech Repub-
lic), (POST) David Mazières (Stanford University, USA), and finally (TACAS)
Orna Kupferman (Hebrew University Jerusalem, Israel). Invited tutorials were
provided by Bernd Finkbeiner (Saarland University, Germany) and Andy Gor-
don (Microsoft Research, Cambridge, UK). My sincere thanks to all these speak-
ers for their great contributions.
For the first time in its history, ETAPS returned to a city where it had been
organized before: Grenoble, France. ETAPS 2014 was organized by the Univer-
sité Joseph Fourier in cooperation with the following associations and societies:
ETAPS e.V., EATCS (European Association for Theoretical Computer Science),
EAPLS (European Association for Programming Languages and Systems), and
EASST (European Association of Software Science and Technology). It had
support from the following sponsors: CNRS, Inria, Grenoble INP, PERSYVAL-
Lab, Université Joseph Fourier, and Springer-Verlag.
The overall planning for ETAPS is the responsibility of the Steering Commit-
tee (SC). The ETAPS SC consists of an executive board (EB) and representa-
tives of the individual ETAPS conferences, as well as representatives of EATCS,
EAPLS, and EASST. The Executive Board comprises Gilles Barthe (satellite
events, Madrid), Holger Hermanns (Saarbrücken), Joost-Pieter Katoen (chair,
Aachen and Twente), Gerald Lüttgen (treasurer, Bamberg), and Tarmo Uustalu
(publicity, Tallinn). Other current SC members are: Martı́n Abadi (Santa Cruz
and Mountain View), Erika Ábrahám (Aachen), Roberto Amadio (Paris), Chris-
tel Baier (Dresden), Saddek Bensalem (Grenoble), Giuseppe Castagna (Paris),
Albert Cohen (Paris), Alexander Egyed (Linz), Riccardo Focardi (Venice), Björn
Franke (Edinburgh), Stefania Gnesi (Pisa), Klaus Havelund (Pasadena), Reiko
Heckel (Leicester), Paul Klint (Amsterdam), Jens Knoop (Vienna), Steve Kre-
mer (Nancy), Pasquale Malacaria (London), Tiziana Margaria (Potsdam), Fabio
Martinelli (Pisa), Andrew Myers (Boston), Anca Muscholl (Bordeaux), Catuscia
Palamidessi (Palaiseau), Andrew Pitts (Cambridge), Arend Rensink (Twente),
Don Sannella (Edinburgh), Vladimiro Sassone (Southampton), Ina Schäfer (Braun-
schweig), Zhong Shao (New Haven), Gabriele Taentzer (Marburg), Cesare Tinelli
(Iowa), Jan Vitek (West Lafayette), and Lenore Zuck (Chicago).
I sincerely thank all ETAPS SC members for all their hard work in making the
17th ETAPS a success. Moreover, thanks to all speakers, attendants, organizers
of the satellite workshops, and Springer for their support. Finally, many thanks
to Saddek Bensalem and his local organization team for all their efforts enabling
ETAPS to return to the French Alps in Grenoble!
Preface

This volume contains the proceedings of TACAS 2014: the 20th International
Conference on Tools and Algorithms for the Construction and Analysis of Sys-
tems. TACAS 2014 took place during April 7–11, 2014, in Grenoble, France. It
was part of ETAPS 2014: the 17th European Joint Conferences on Theory and
Practice of Software.
TACAS is a forum for researchers, developers, and users interested in rigor-
ously based tools and algorithms for the construction and analysis of systems.
The research areas covered by TACAS include, but are not limited to, formal
methods, software and hardware specification and verification, static analysis,
dynamic analysis, model checking, theorem proving, decision procedures, real-
time, hybrid and stochastic systems, communication protocols, programming
languages, and software engineering. TACAS provides a venue where common
problems, heuristics, algorithms, data structures, and methodologies in these
areas can be discussed and explored.
TACAS 2014 solicited four kinds of papers: three types of full-length papers (15 pages), namely research papers, case study papers, and regular tool papers, as well as short tool demonstration papers (6 pages).
This year TACAS attracted a total of 161 paper submissions, divided into
117 research papers, 11 case study papers, 18 regular tool papers, and 15 tool
demonstration papers. Each submission was refereed by at least three reviewers.
Papers by PC members were refereed by four reviewers. 42 papers were accepted
for presentation at the conference: 26 research papers, 3 case study papers, 6
regular tool papers, and 7 tool demonstration papers. This yields an overall
acceptance rate of 26.1 %. The acceptance rate for full papers (research + case
study + regular tool) was 24.0 %.
TACAS 2014 also hosted the Competition on Software Verification again, in
its third edition. This volume includes an overview of the competition results,
and short papers describing 11 of the 15 tools that participated in the competi-
tion. These papers were reviewed by a separate Program Committee, and each
included paper was refereed by at least four reviewers. The competition was
organized by Dirk Beyer, the Competition Chair. A session in the TACAS pro-
gram was reserved for presenting the results (by the Chair) and the participating
verifiers (by the developer teams).
Steering Committee
Rance Cleaveland University of Maryland, USA
Holger Hermanns Saarland University, Germany
Kim Guldstrand Larsen Aalborg University, Denmark
Bernhard Steffen TU Dortmund, Germany
Lenore Zuck University of Illinois at Chicago, USA
Program Committee
Erika Ábrahám RWTH Aachen University, Germany (Co-chair)
Christel Baier Technische Universität Dresden, Germany
Saddek Bensalem Verimag/UJF, France
Nathalie Bertrand Inria/IRISA, France
Armin Biere Johannes Kepler University, Austria
Nikolaj Bjørner Microsoft Research, USA (Tool Chair)
Alessandro Cimatti Fondazione Bruno Kessler, Italy
Rance Cleaveland University of Maryland, USA
Cindy Eisner IBM Research - Haifa, Israel
Martin Fränzle Carl von Ossietzky University Oldenburg,
Germany
Patrice Godefroid Microsoft Research, Redmond, USA
Susanne Graf Verimag, France
Orna Grumberg Technion, Israel
Klaus Havelund NASA/JPL, USA (Co-chair)
Boudewijn Haverkort University of Twente, The Netherlands
Gerard Holzmann NASA/JPL, USA
Barbara Jobstmann Verimag/CNRS, France and EPFL,
Switzerland
Joost-Pieter Katoen RWTH Aachen University, Germany and
University of Twente, The Netherlands
Kim Guldstrand Larsen Aalborg University, Denmark
Roland Meyer TU Kaiserslautern, Germany
Corina Pasareanu NASA Ames Research Center, USA
Doron Peled Bar Ilan University, Israel
Paul Pettersson Mälardalen University, Sweden
Nir Piterman University of Leicester, UK
Jaco van de Pol University of Twente, The Netherlands
Sriram Sankaranarayanan University of Colorado Boulder, USA
Natasha Sharygina Università della Svizzera Italiana, Switzerland
Additional Reviewers
Aarts, Fides
Abd Elkader, Karam
Afzal, Wasif
Alberti, Francesco
Aleksandrowicz, Gadi
Alt, Leonardo
Andres, Miguel
Arbel, Eli
Arenas, Puri
Aştefănoaei, Lacramioara
Aucher, Guillaume
Bacci, Giorgio
Bacci, Giovanni
Badouel, Eric
Balasubramanian, Daniel
Bernardo, Marco
Bollig, Benedikt
Bortolussi, Luca
Bouajjani, Ahmed
Bozga, Marius
Bozzano, Marco
Bradley, Aaron
Bruintjes, Harold
Chakraborty, Souymodip
Chen, Xin
Corbineau, Pierre
Corzilius, Florian
Csallner, Christoph
D’Ippolito, Nicolas
Dalsgaard, Andreas Engelbredt
Dang, Thao
David, Alexandre
de Paula, Flavio M.
de Ruiter, Joeri
Defrancisco, Richard
Dehnert, Christian
Derevenetc, Egor
Invited Contribution
Variations on Safety
Orna Kupferman
Tool Demonstrations
SACO: Static Analyzer for Concurrent Objects
Elvira Albert, Puri Arenas, Antonio Flores-Montoya,
Samir Genaim, Miguel Gómez-Zamalloa, Enrique Martin-Martin,
German Puebla, and Guillermo Román-Dı́ez
Case Studies
On the Correctness of a Branch Displacement Algorithm
Jaap Boender and Claudio Sacerdoti Coen
Variations on Safety

Orna Kupferman
Hebrew University, School of Engineering and Computer Science, Jerusalem 91904, Israel
[email protected]
1 Introduction
Today’s rapid development of complex and safety-critical systems requires reliable veri-
fication methods. In formal verification, we verify that a system meets a desired property
by checking that a mathematical model of the system meets a formal specification that
describes the property. Of special interest are properties asserting that the observed be-
havior of the system always stays within some allowed region, in which nothing “bad”
happens. For example, we may want to assert that every message sent is acknowledged
in the next cycle. Such properties of systems are called safety properties. Intuitively, a
property ψ is a safety property if every violation of ψ occurs after a finite execution of
the system. In our example, if in a computation of the system a message is sent with-
out being acknowledged in the next cycle, this occurs after some finite execution of the
system. Also, once this violation occurs, there is no way to “fix” the computation.
In order to formally define what safety properties are, we refer to computations of a
nonterminating system as infinite words over an alphabet Σ. Consider a language L of
infinite words over Σ. A finite word x over Σ is a bad prefix for L iff for all infinite
words y over Σ, the concatenation x·y of x and y is not in L. Thus, a bad prefix for L is
a finite word that cannot be extended to an infinite word in L. A language L is a safety
language if every word not in L has a finite bad prefix. For example, if Σ = {0, 1},
then L = {0ω , 1ω } is a safety language. Indeed, every word not in L contains either the
sequence 01 or the sequence 10, and a prefix that ends in one of these sequences cannot
be extended to a word in L.¹

¹ The definition of safety we consider here is given in [1,2]; it coincides with the definition of limit closure defined in [12], and is different from the definition in [26], which also refers to the property being closed under stuttering.
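To make the definition concrete, here is a minimal Python sketch (ours, purely illustrative) of the bad-prefix test for this particular language: a finite word over {0, 1} is a bad prefix for L = {0ω, 1ω} exactly when it already mixes the two letters.

def is_bad_prefix(x: str) -> bool:
    # Bad-prefix test for L = {0^w, 1^w}: a finite word has no extension in L
    # iff it contains both letters (equivalently, it contains 01 or 10).
    return "0" in x and "1" in x

assert not is_bad_prefix("0000")  # can still be extended to 0^w
assert is_bad_prefix("0001")      # contains 01, so no extension lies in L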
The interest in safety started with the quest for natural classes of specifications. The
theoretical aspects of safety have been extensively studied [2,28,29,33]. With the grow-
ing success and use of formal verification, safety has turned out to be interesting also
from a practical point of view [14,20,23]. Indeed, the ability to reason about finite pre-
fixes significantly simplifies both enumerative and symbolic algorithms. In the first,
safety circumvents the need to reason about complex ω-regular acceptance conditions.
For example, methods for temporal synthesis, program repair, or parametric reasoning
are much simpler for safety properties [18,32]. In the second, it circumvents the need
to reason about cycles, which is significant in both BDD-based and SAT-based meth-
ods [5,6]. In addition to a rich literature on safety, researchers have studied additional
classes, such as liveness and co-safety properties [2,28].
The paper surveys several extensions and variations of safety. We start with bounded
and checkable properties – fragments of safety properties that enable even simpler
reasoning. We proceed to a reactive setting, where safety properties require the system
to stay in a region of states that is both allowed and from which the environment cannot
force it out. Finally, we describe a probability-based approach for defining different
levels of safety. The survey is based on the papers [24], with Moshe Y. Vardi, [21],
with Yoad Lustig and Moshe Y. Vardi, [25], with Sigal Weiner, and [10], with Shoham
Ben-David.
2 Preliminaries
Linear Temporal Logic. The logic LTL is a linear temporal logic. Formulas of LTL are
constructed from a set AP of atomic propositions using the usual Boolean operators
and the temporal operators G (“always”), F (“eventually”), X (“next time”), and U
(“until”). Formulas of LTL describe computations of systems over AP . For example,
the LTL formula G(req → F ack ) describes computations in which every position in
which req holds is eventually followed by a position in which ack holds. Thus, each
LTL formula ψ corresponds to a language, denoted ||ψ||, of words in (2AP )ω that satisfy
it. For the detailed syntax and semantics of LTL, see [30]. The model-checking problem
for LTL is to determine, given an LTL formula ψ and a system M , whether all the
computations of M satisfy ψ.
General methods for LTL model checking are based on translation of LTL formulas
to nondeterministic Büchi word automata. By [36], given an LTL formula ψ, one can
construct an NBW Aψ over the alphabet 2AP that accepts exactly all the computations
that satisfy ψ. The size of Aψ is, in the worst case, exponential in the length of ψ.
Given a system M and an LTL formula ψ, model checking of M with respect to ψ is
reduced to checking the emptiness of the product of M and A¬ψ [36]. This check can
be performed on-the-fly and symbolically [7,35], and the complexity of model checking
that follows is PSPACE, with a matching lower bound [34].
It is shown in [2,33,22] that when ψ is a safety formula, we can assume that all the
states in Aψ are accepting. Indeed, Aψ accepts exactly all words all of whose prefixes
have at least one extension accepted by Aψ , which is what we get if we define all
the states of Aψ to be accepting. Thus, safety properties can be recognized by NLWs.
Since every NLW can be determinized into an equivalent DLW by applying the subset
construction, all safety formulas can be translated to DLWs.
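As an illustration of this last step (our sketch, not part of the survey), the subset construction for looping automata can be written in a few lines of Python; the dict-based encoding is an assumption of the sketch, and the correctness of the construction over infinite words rests on König's lemma (if every prefix has a run, an infinite run exists).

def determinize_looping(alphabet, delta, init):
    # Subset construction for a nondeterministic looping word automaton (NLW),
    # in which all states are accepting.  delta: dict (state, letter) -> set of states.
    # In the resulting DLW the empty subset is a rejecting sink; every other
    # subset is accepting, so a word is accepted iff its run avoids the empty set.
    start = frozenset([init])
    det_delta, seen, todo = {}, {start}, [start]
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(q2 for q in S for q2 in delta.get((q, a), ()))
            det_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    accepting = {S for S in seen if S}
    return seen, det_delta, start, accepting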
3 Interesting Fragments
In this section we discuss two interesting fragments of safety properties: clopen (a.k.a.
bounded) properties, which are useful in bounded model checking, and checkable prop-
erties, which are useful in real-time monitoring.
Bounded model checking methodologies check the correctness of a system with respect
to a given specification by examining computations of a bounded length. Results from
set-theoretic topology imply that sets in Σ ω that are both open and closed (clopen sets)
are bounded: membership in a clopen set can be determined by examining a bounded
number of letters in Σ.
In [24] we studied safety properties from a topological point of view. We showed
that clopen sets correspond to properties that are both safety and co-safety, and showed
that when clopen specifications are given by automata or LTL formulas, we can point
to a bound and translate the specification to bounded formalisms such as bounded LTL
and cycle-free automata.
Bounding Clopen Properties. Our goal in this section is to identify a bound for a
clopen property given by an automaton. Consider a clopen language L ⊆ Σ ω . For
a finite word x ∈ Σ ∗ , we say that x is undetermined with respect to L if there are
y ∈ Σ ω and z ∈ Σ ω such that x · y ∈ L and x · z ∉ L. As shown in [24], every word in
Σ ω has only finitely many prefixes that are undetermined with respect to L. It follows
that L is bounded: there are only finitely many words in Σ ∗ that are undetermined with
respect to L. For an integer k, we say that L is bounded by k if all the words x ∈ Σ ∗
such that |x| ≥ k are determined with respect to L. Moreover, since L is bounded, a minimal DLW that recognizes L must be cycle-free. Indeed, otherwise we could pump a cycle to obtain infinitely many undetermined prefixes. Let diameter (L) be the diameter of
the minimal DLW for L.
Lemma 1. A clopen ω-regular language L ⊆ Σ ω is bounded by diameter (L).
Proof: Let A be the minimal deterministic looping automaton for L. Consider a word
x ∈ Σ ∗ with |x| ≥ diameter (L). Since A is cycle free, its run on x either reaches
an accepting sink, in which case x is a good prefix, or it does not reach an accepting
sink, in which case, by the definition of diameter (A), we cannot extend x to a word
accepted by A, thus x is a bad prefix.
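The argument can be read off directly from a run of the DLW; the following Python sketch (ours; the representation with a partial transition function and an accepting sink is an assumption) classifies a finite word as a good prefix, a bad prefix, or undetermined. For a clopen L, Lemma 1 says that the answer is never "undetermined" once |x| ≥ diameter(L).

def classify_prefix(delta, init, acc_sink, word):
    # delta is a partial transition function: dict (state, letter) -> state.
    # 'bad'  : the run dies (no transition), so no extension of the word is in L;
    # 'good' : the accepting sink is reached, so every extension is in L;
    # otherwise the prefix is still undetermined.
    q = init
    for a in word:
        if (q, a) not in delta:
            return "bad"
        q = delta[(q, a)]
        if q == acc_sink:
            return "good"
    return "undetermined"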
For a language L, the in index of L, denoted inindex (L), is the minimal num-
ber of states that an NBW recognizing L has. Similarly, the out index of L, denoted
outindex (L), is the minimal number of states that an NBW recognizing comp(L) has.
Proof: Assume by way of contradiction that there is a word x ∈ Σ ∗ such that |x| ≥
inindex (L) · outindex (L) and x is undetermined with respect to L. Thus, there are
suffixes y and z such that x · y ∈ L and x · z ∉ L. Let A1 and A2 be nondeterminis-
tic looping automata such that L(A1 ) = L, L(A2 ) = comp(L), and A1 and A2 have
inindex (L) and outindex (L) states, respectively. Consider two accepting runs r1 and
r2 of A1 and A2 on x · y and x · z, respectively. Since |x| ≥ inindex (L) · outindex (L),
there are two prefixes x[1, . . . , i] and x[1, . . . , j] of x such that i < j and both runs re-
peat their state after these two prefixes; i.e., r1 (i) = r1 (j) and r2 (i) = r2 (j). Consider
the word x′ = x[1, . . . , i] · x[i + 1, . . . , j]ω . Since A1 is a looping automaton, the run r1 induces an accepting run r1′ of A1 on x′. Formally, for all l ≤ i we have r1′(l) = r1(l), and for all l > i, we have r1′(l) = r1(i + ((l − i) mod (j − i))). Similarly, the run r2 induces an accepting run of A2 on x′. It follows that x′ is accepted by both A1 and A2, contradicting the fact that L(A2 ) = comp(L(A1 )).
[Figure omitted: the DBW A of Example 1, over the alphabet {0, 1, 2}, with states q0, q1, q2 and an accepting sink qac.]
At first sight, it seems that the same considerations applied in Lemma 1 can be used
in order to prove that the width of a checkable language is bounded by the diameter
of the smallest DBW recognizing the language. Indeed, it appears that in an accepting
run, the traversal through the minimal good prefix should not contain a cycle. This
impression, however, is misleading, as demonstrated in the DBW A from Example 1,
where a traversal through the subword 120 contains a cycle, and similarly for 010. The
diameter of the DBW A is 3, so it does not constitute a counterexample to the conjecture
that the diameter bounds the width, but the problem remains open in [21], and the
tightest bound proven there depends on the size of A and not only on its diameter, and
is not even linear. Intuitively, it follows from an upper bound on the size of a DBW that
recognizes minimal bad prefixes of L. Formally, we have the following.
Recall that safety is defined with respect to languages over an alphabet Σ. Typically,
Σ = 2AP , where AP is the set of the system’s atomic propositions. Thus, the definition
and studies of safety treat all the atomic propositions as equal and do not distinguish
between input and output signals. As such, they are suited for closed systems – ones
that do not maintain an interaction with their environment. In open (also called reactive)
systems [19,31], the system interacts with the environment, and a correct system should
satisfy the specification with respect to all environments. A good way to think about
the open setting is to consider the situation as a game between the system and the
environment. The interaction between the players in this game generates a computation,
and the goal of the system is that only computations that satisfy the specification will
be generated.
Technically, one has to partition the set AP of atomic propositions to a set I of input
signals, which the environment controls, and a set O of output signals, which the system
controls. An open system is then an I/O-transducer – a deterministic automaton over
the alphabet 2I in which each state is labeled by an output in 2O . Given a sequence
of assignments to the input signals (each assignment is a letter in 2I ), the run of the
transducer on it induces a sequence of assignments to the output signals (that is, letters
in 2O ). Together these sequences form a computation, and the transducer realizes a
specification ψ if all its computations satisfy ψ [31].
The transition from the closed to the open setting modifies the questions we typically
ask about systems. Most notably, the synthesis challenge, of generating a system that
satisfies the specification, corresponds to the satisfiability problem in the closed setting
and to the realizability problem in the open setting. As another example, the equiva-
lence problem between LTL specifications is different in the closed and open settings
[16]. That is, two specifications may not be equivalent when compared with respect
to arbitrary systems on I ∪ O, but be open equivalent; that is, equivalent when com-
pared with respect to I/O-transducers. To see this, note for example that a satisfiable
yet non-realizable specification is equivalent to false in the open but not in the closed
setting.
As mentioned above, the classical definition of safety does not distinguish between
input and output signals. The definition can still be applied to open systems, as a special
case of closed systems with Σ = 2I∪O . In [11], Ehlers and Finkbeiner introduced reac-
tive safety – a definition of safety for the setting of open systems. Essentially, reactive
safety properties require the system to stay in a region of states that is both allowed and
from which the environment cannot force it out. The definition in [11] is by means of
sets of trees with directions in 2I and labels in 2O . The use of trees naturally locates reactive safety between linear and branching safety. In [25], we suggested an equivalent yet differently presented definition, which explicitly uses realizability, and studied the theoretical aspects of reactive safety and other reactive fragments of specifications. In this
section, we review the definition and results from [25].
4.1 Definitions
We model open systems by transducers. Let I and O be finite sets of input and output
signals, respectively. Given x = i0 · i1 · i2 · · · ∈ (2I )ω and y = o0 · o1 · o2 · · · ∈ (2O )ω ,
we denote their composition by x ⊕ y = (i0 , o0 ) · (i1 , o1 ) · (i2 , o2 ) · · · ∈ (2I∪O )ω . An
I/O-transducer is a tuple T = I, O, S, s0 , η, L, where S is a set of states, s0 ∈ S is
an initial state, η : S × 2I → S is a transition function, and L : S → 2O is a labeling
function. The run of T on a (finite or infinite) input sequence x = i0 · i1 · i2 · · · , with
ij ∈ 2I , is the sequence s0 , s1 , s2 , . . . of states such that sj+1 = η(sj , ij ) for all
j ≥ 0. The computation of T on x is then x ⊕ y, for y = L(s0 ) · L(s1 ) · L(s2 ) · · · Note
that T is responsive and deterministic (that is, it suggests exactly one successor state for
each input letter), and thus T has a single run, generating a single computation, on each
input sequence. We extend η to finite words over 2I in the expected way. In particular,
η(s0 , x), for x ∈ (2I )∗ is the |x|-th state in the run on x. A transducer T induces a
strategy f : (2I )∗ → 2O such that for all x ∈ (2I )∗ , we have that f (x) = L(η(s0 , x)).
Given an LTL formula ψ over I ∪ O, we say that ψ is I/O-realizable if there is a finite-
state I/O-transducer T such that all the computations of T satisfy ψ [31]. We then say
that T realizes ψ. When it is clear from the context, we refer to I/O-realizability as
realizability, or talk about realizability of languages over the alphabet 2I∪O .
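A small Python rendering of these definitions (ours, for illustration only) models an I/O-transducer as a Moore machine and exposes its run, its computation, and the induced strategy f.

class Transducer:
    # T = <I, O, S, s0, eta, lab>; eta: dict (state, input letter) -> state,
    # lab: dict state -> output letter (an assignment in 2^O).
    def __init__(self, s0, eta, lab):
        self.s0, self.eta, self.lab = s0, eta, lab

    def run(self, inputs):
        # States s0, s1, s2, ... visited while reading i0, i1, ...,
        # with s_{j+1} = eta(s_j, i_j).
        states = [self.s0]
        for i in inputs:
            states.append(self.eta[(states[-1], i)])
        return states

    def computation(self, inputs):
        # The word x (+) y, pairing each i_j with the output lab(s_j).
        states = self.run(inputs)
        return [(i, self.lab[s]) for i, s in zip(inputs, states)]

    def strategy(self, inputs):
        # f(x) = lab(eta(s0, x)): the output produced after reading the finite word x.
        return self.lab[self.run(inputs)[-1]]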
Since the realizability problem corresponds to deciding a game between the system
and the environment, and the game is determined [15], realizability is determined too,
in the sense that either there is an I/O-transducer that realizes ψ (that is, the system
wins) or there is an O/I-transducer that realizes ¬ψ (that is, the environment wins).
Note that in an O/I-transducer the system and the environment “switch roles” and the
system is the one that provides the inputs to the transducer. A technical detail is that
in order for the setting of O/I-realizability to be dual to the one in I/O-realizability
we need, in addition to switching the roles and negating the specification, to switch
the player that moves first and consider transducers in which the environment initiates
the interaction and moves first. Since we are not going to delve into constructions, we
ignore this point, which is easy to handle.
Let I and O be sets of input and output signals, respectively. Consider a language
L ⊆ (2I∪O )ω . For a finite word u ∈ (2I∪O )∗ , let Lu = {s : u · s ∈ L} be the set of all
infinite words s such that u · s ∈ L. Thus, if L describes a set of allowed computations,
then Lu describes the set of allowed suffixes of computations starting with u.
We say that a finite word u ∈ (2I∪O )∗ is a system bad prefix for L iff Lu is not
realizable. Thus, a system bad prefix is a finite word u such that after traversing u,
the system does not have a strategy to ensure that the interaction with the environment
would generate a computation in L. We use sbp(L) to denote the set of system bad
prefixes for L. Note that by determinacy of games, whenever Lu is not realizable by the
system, then its complement is realizable by the environment. Thus, once a bad prefix
has been generated, the environment has a strategy to ensure that the entire generated
behavior is not in L.
A language L ⊆ (2I∪O )ω is a reactive safety language if every word not in L has
a system bad prefix. Below are two examples, demonstrating that a reactive safety lan-
guage need not be safe. Note that the other direction does hold: Let L be a safe language.
Consider a word w ∉ L and a bad prefix u ∈ (2I∪O )∗ of w. Since u is a bad prefix, the set Lu is empty, and is therefore unrealizable, so u is also a system bad prefix. Thus,
every word not in L has a system bad prefix, implying that L is reactively safe.
In the closed settings, the set bad-pref (L) is closed under finite extensions for all lan-
guages L ⊆ Σ ω . That is, for every finite word u ∈ bad-pref (L) and finite extension
v ∈ Σ ∗ , we have that u · v ∈ bad-pref (L). This is not the case in the reactive setting:
Theorem 3. System bad prefixes are not closed under finite extension.
Recall that reasoning about safety properties is easier than reasoning about general
properties. In particular, rather than working with automata on infinite words, one can
model check safety properties using automata (on finite words) for bad prefixes. The
question is whether and how we can take advantage of reactive safety when the specifi-
cation is not safe (but is reactively safe). In [11], the authors answered this question positively and described a translation from reactively safe to safe formulas. The trans-
lation is by means of nodes in the tree in which a violation starts. The translation from
[25] we are going to describe here uses realizability explicitly, which we find simpler.
For a language L ⊆ (2I∪O )ω , we define close(L) = L ∩ {w : w has no system bad
prefix for L}. Equivalently, close(L) = L \ {w : w has a system bad prefix for L}.
Intuitively, we obtain close(L) by defining all the finite extensions of sbp(L) as bad
prefixes. It is thus easy to see that sbp(L) ⊆ bad-pref (close(L)).
As an example, consider again the specification ψ = G(err → X fix ) ∧ FG¬err ,
with I = {fix }, O = {err }. An infinite word contains a system bad prefix for ψ iff it
has a position that satisfies err . Accordingly, close(ψ) = G¬err . As another example,
let us add to O the signal ack , and let ψ = G(err → X (fix ∧ F ack )), with I = {fix },
O = {err , ack }. Again, ψ is reactively safe and an infinite word contains a system bad
prefix for ψ iff it has a position that satisfies err . Accordingly, close(ψ) = G¬err .
Our definition of close(L) is sound, in the following sense:
Theorem 4. A language L ⊆ (2I∪O )ω is reactively safe iff close(L) is safe.
While L and close(L) are not equivalent, they are open equivalent [16]. Formally,
we have the following.
Theorem 5. For every language L ⊆ (2I∪O )ω and I/O-transducer T , we have that
T realizes L iff T realizes close(L).
It is shown in [11] that given an LTL formula ψ, it is possible to construct a determin-
istic looping word automaton for close(ψ) with doubly-exponential number of states.
In fact, as suggested in [23], it is then possible to generate also a deterministic automa-
ton for the bad prefixes of close(ψ). Note that when L is not realizable, we have that close(L) = ∅, since the empty prefix is then already a system bad prefix.
Safety is a binary notion. A property may or may not satisfy the definition of safety.
In this section we describe a probability-based approach for defining different levels
of safety. The origin of the definition is a study of vacuity in model checking [4,23].
Vacuity detection is a method for finding errors in the model-checking process when
the specification is found to hold in the model. Most vacuity algorithms are based on
checking the effect of applying mutations on the specification. It has been recognized
that vacuity results differ in their significance. While in many cases vacuity results
are valued as highly informative, there are also cases in which the results are viewed as
meaningless by users. In [10], we suggested a method for an automatic ranking of vacu-
ity results according to their level of importance. Our method is based on the probability
of the mutated specification to hold in a random computation. For example, two natural
mutations of the specification G(req → F ready) are G(¬req), obtained by mutating
the subformula ready to false, and GF ready , obtained by mutating the subformula
req to true. It is agreed that vacuity information about satisfying the first mutation is
more alarming than information about satisfying the second. The framework in [10] for-
mally explains this, as the probability of G(¬req) to hold in a random computation is 0,
whereas the probability of GF ready is 1. In this section we suggest using probability also for defining levels of safety.
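To spell out the two values just mentioned (our rendering of the standard argument, under the usual assumption that each atomic proposition holds independently with probability 1/2 at every position of a random computation): the probability that ¬req holds at each of the first n positions is (1/2)^n, so Pr[G(¬req)] = lim_{n→∞} (1/2)^n = 0. Dually, for every n the probability that ready fails at all positions from n onward is lim_{m→∞} (1/2)^m = 0, so each event "F ready from position n onward" has probability 1, and GF ready, the countable intersection of these events, has probability 1 as well.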
References
1. Alpern, B., Schneider, F.B.: Defining liveness. IPL 21, 181–185 (1985)
2. Alpern, B., Schneider, F.B.: Recognizing safety and liveness. Distributed Computing 2,
117–126 (1987)
3. Barringer, H., Goldberg, A., Havelund, K., Sen, K.: Rule-based runtime verification. In: Stef-
fen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 44–57. Springer, Heidelberg
(2004)
4. Beer, I., Ben-David, S., Eisner, C., Rodeh, Y.: Efficient detection of vacuity in ACTL formu-
las. In: Grumberg, O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 279–290. Springer, Heidelberg
(1997)
5. Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic model checking without BDDs. In:
Cleaveland, W.R. (ed.) TACAS 1999. LNCS, vol. 1579, pp. 193–207. Springer, Heidelberg
(1999)
6. Bloem, R., Gabow, H.N., Somenzi, F.: An algorithm for strongly connected component anal-
ysis in n log n symbolic steps. In: Johnson, S.D., Hunt Jr., W.A. (eds.) FMCAD 2000. LNCS,
vol. 1954, pp. 37–54. Springer, Heidelberg (2000)
7. Courcoubetis, C., Vardi, M.Y., Wolper, P., Yannakakis, M.: Memory efficient algorithms for
the verification of temporal properties. FMSD 1, 275–288 (1992)
8. Courcoubetis, C., Yannakakis, M.: The complexity of probabilistic verification. J. ACM 42,
857–907 (1995)
9. d’Amorim, M., Roşu, G.: Efficient monitoring of omega-languages. In: Etessami, K., Raja-
mani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 364–378. Springer, Heidelberg (2005)
10. Ben-David, S., Kupferman, O.: A framework for ranking vacuity results. In: Van Hung, D.,
Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp. 148–162. Springer, Heidelberg (2013)
11. Ehlers, R., Finkbeiner, B.: Reactive safety. In: Proc. 2nd GANDALF. Electronic Proceedings
in TCS, vol. 54, pp. 178–191 (2011)
12. Emerson, E.A.: Alternative semantics for temporal logics. TCS 26, 121–130 (1983)
13. Fagin, R.: Probabilities in finite models. Journal of Symb. Logic 41(1), 50–58 (1976)
14. Filiot, E., Jin, N., Raskin, J.-F.: An antichain algorithm for LTL realizability. In: Bouajjani,
A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 263–277. Springer, Heidelberg (2009)
15. Gale, D., Stewart, F.M.: Infinite games of perfect information. Ann. Math. Studies 28,
245–266 (1953)
16. Greimel, K., Bloem, R., Jobstmann, B., Vardi, M.: Open implication. In: Aceto, L., Damgård,
I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008,
Part II. LNCS, vol. 5126, pp. 361–372. Springer, Heidelberg (2008)
17. Gumm, H.P.: Another glance at the Alpern-Schneider characterization of safety and liveness
in concurrent executions. IPL 47, 291–294 (1993)
18. Harel, D., Katz, G., Marron, A., Weiss, G.: Non-intrusive repair of reactive programs. In:
ICECCS, pp. 3–12 (2012)
19. Harel, D., Pnueli, A.: On the development of reactive systems. In: Logics and Models of
Concurrent Systems, NATO ASI, vol. F-13, pp. 477–498. Springer (1985)
20. Havelund, K., Roşu, G.: Synthesizing monitors for safety properties. In: Katoen, J.-P.,
Stevens, P. (eds.) TACAS 2002. LNCS, vol. 2280, pp. 342–356. Springer, Heidelberg (2002)
21. Kupferman, O., Lustig, Y., Vardi, M.Y.: On locally checkable properties. In: Hermann, M.,
Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 302–316. Springer, Heidel-
berg (2006)
22. Kupferman, O., Vardi, M.Y.: Model checking of safety properties. In: Halbwachs, N., Peled,
D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 172–183. Springer, Heidelberg (1999)
23. Kupferman, O., Vardi, M.Y.: Model checking of safety properties. FMSD 19(3), 291–314
(2001)
24. Kupferman, O., Vardi, M.Y.: On bounded specifications. In: Nieuwenhuis, R., Voronkov, A.
(eds.) LPAR 2001. LNCS (LNAI), vol. 2250, pp. 24–38. Springer, Heidelberg (2001)
25. Kupferman, O., Weiner, S.: Environment-friendly safety. In: Biere, A., Nahir, A., Vos, T.
(eds.) HVC 2012. LNCS, vol. 7857, pp. 227–242. Springer, Heidelberg (2013)
26. Lamport, L.: Logical foundation. In: Alford, M.W., Hommel, G., Schneider, F.B., Ansart,
J.P., Lamport, L., Mullery, G.P., Zhou, T.H. (eds.) Distributed Systems. LNCS, vol. 190,
pp. 19–30. Springer, Heidelberg (1985)
27. Manna, Z., Pnueli, A.: The anchored version of the temporal framework. In: de Bakker, J.W.,
de Roever, W.-P., Rozenberg, G. (eds.) Linear Time, Branching Time and Partial Order in
Logics and Models for Concurrency. LNCS, vol. 354, pp. 201–284. Springer, Heidelberg
(1989)
28. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems: Specifica-
tion. Springer (1992)
29. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems: Safety.
Springer (1995)
30. Pnueli, A.: The temporal semantics of concurrent programs. TCS 13, 45–60 (1981)
31. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proc. 16th POPL, pp. 179–
190 (1989)
32. Pnueli, A., Shahar, E.: Liveness and acceleration in parameterized verification. In: Emerson,
E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 328–343. Springer, Heidelberg
(2000)
33. Sistla, A.P.: Safety, liveness and fairness in temporal logic. Formal Aspects of Computing 6,
495–511 (1994)
34. Sistla, A.P., Clarke, E.M.: The complexity of propositional linear temporal logic. Journal of
the ACM 32, 733–749 (1985)
35. Touati, H.J., Brayton, R.K., Kurshan, R.: Testing language containment for ω-automata using
BDD’s. I & C 118(1), 101–109 (1995)
36. Vardi, M.Y., Wolper, P.: Reasoning about infinite computations. I & C 115(1), 1–37 (1994)
Decision Procedures for Flat Array Properties

F. Alberti, S. Ghilardi, and N. Sharygina
1 Introduction
results and their applications to concrete decision problems for array programs
annotated with assertions or postconditions.
We examine Flat Array Properties in two different settings. In one case, we
consider Flat Array Properties over the theory of arrays generated by adding
free function symbols to a given theory T modeling both indexes and elements
of the arrays. In the other one, we take into account Flat Array Properties over
a theory of arrays built by connecting two theories TI and TE describing the
structure of indexes and elements. Our decidability results are fully declarative
and parametric in the theories T, TI , TE . For both settings, we provide suffi-
cient conditions on T and TI , TE for achieving the decidability of Flat Array
Properties. Such hypotheses are widely met by theories of interest in practice,
like Presburger arithmetic. We also provide suitable decision procedures for Flat
Array Properties of both settings. Such procedures reduce the decidability of
Flat Array Properties to the decidability of T -formulæ in one case and TI - and
TE -formulæ in the other case.
We further show, as an application of our decidability results, that the safety
of an interesting class of programs handling arrays or strings of unknown length is
decidable. We call this class of programs simple0A -programs : this class covers non-
recursive programs implementing for instance searching, copying, comparing,
initializing, replacing and testing functions. The method we use for showing
these safety results is similar to a classical method adopted in the model-checking
literature for programs manipulating integer variables (see for instance [7,9,12]):
we first assume flatness conditions on the control flow graph of the program and
then we assume that transitions labeling cycles are “acceleratable”. However,
since we are dealing with array manipulating programs, acceleration requires
specific results that we borrow from [3]. The key point is that the shape of
most accelerated transitions from [3] matches the definition of our Flat Array
Properties (in fact, Flat Array Properties were designed precisely in order to
encompass such accelerated transitions for arrays).
From the practical point of view, we tested the effectiveness of state-of-the-art SMT-solvers in checking the satisfiability of some Flat Array Properties aris-
ing from the verification of simple0A -programs. Results show that such tools fail
or timeout on some Flat Array Properties. The implementation of our decision
procedures, once instantiated with the theories of interests for practical applica-
tions, will likely lead, therefore, to further improvements in the areas of practical
solutions for the rigorous analysis of software and hardware systems.
Plan of the Paper. The paper starts by recalling in Section 2 required back-
ground notions. Section 3 is dedicated to the definition of Flat Array Properties.
Section 3.1 introduces a decision procedure for Flat Array Properties in the case
of a mono-sorted theory ARR1 (T ) generated by adding free function symbols to
a theory T . Section 3.2 discusses a decision procedure for Flat Array Properties
in the case of the multi-sorted array theory ARR2 (TI , TE ) built over two theories
TI and TE for the indexes and elements (we supply also full lower and upper
complexity bounds for the case in which TI and TE are both Presburger arith-
metic). In Section 4 we recall and adapt required notions from [3], define the
class of flat0 -programs and establish the requirements for achieving the decid-
ability of reachability analysis on some flat0 -programs. Such requirements are
instantiated in Section 4.1 in the case of simple0A -programs, array programs with
flat control-flow graph admitting definable accelerations for every loop. In Sec-
tion 4.2 we position the fragment of Flat Array Properties with respect to the
actual practical capabilities of state-of-the-art SMT-solvers. Section 5 compares
our results with the state of the art, in particular with the approaches of [8,15].
2 Background
We use lower-case latin letters x, i, c, d, e, . . . for variables; for tuples of vari-
ables we use bold face letters like x, i, c, d, e . . . . The n-th component of a tuple
c is indicated with cn and | − | may indicate tuple length (so that we have
c = c1 , . . . , c|c| ). Occasionally, we may use free variables and free constants in-
terchangeably. For terms, we use letters t, u, . . . , with the same conventions as
above; t, u are used for tuples of terms (however, tuples of variables are assumed
to be distinct, whereas the same is not assumed for tuples of terms - this is useful
for substitutions notation, see below). When we write u = v, we assume that the two tuples have equal length, say n (i.e., n := |u| = |v|), and that u = v abbreviates the formula u1 = v1 ∧ · · · ∧ un = vn .
With E(x) we denote that the syntactic expression (term, formula, tuple
of terms or of formulæ) E contains at most the free variables taken from the
tuple x. We use lower-case Greek letters φ, ϕ, ψ, . . . for quantifier-free formulæ
and α, β, . . . for arbitrary formulæ. The notation φ(t) identifies a quantifier-free
formula φ obtained from φ(x) by substituting the tuple of variables x with the
tuple of terms t.
A prenex formula is a formula of the form Q1 x1 . . . Qn xn ϕ(x1 , . . . , xn ), where
Qi ∈ {∃, ∀} and x1 , . . . , xn are pairwise different variables. Q1 x1 · · · Qn xn is the
prefix of the formula. Let R be a regular expression over the alphabet {∃, ∀}.
The R-class of formulæ comprises all and only those prenex formulæ whose prefix
generates a string Q1 · · · Qn matched by R.
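A tiny Python check of this convention (ours, not from the paper), writing E for ∃ and A for ∀, so that, e.g., the ∃∗∀-class used later corresponds to the regular expression E*A:

import re

def in_prefix_class(prefix: str, klass: str) -> bool:
    # prefix: the quantifier prefix of a prenex formula as a string over
    # 'E' (exists) and 'A' (forall); klass: a regular expression over {E, A}.
    # The formula belongs to the R-class iff its prefix is matched by R.
    return re.fullmatch(klass, prefix) is not None

assert in_prefix_class("EEA", "E*A")       # exists exists forall
assert not in_prefix_class("EAA", "E*A")   # a second forall leaves the class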
According to the SMT-LIB standard [22], a theory T is a pair (Σ, C), where
Σ is a signature and C is a class of Σ-structures; the structures in C are called
the models of T . Given a Σ-structure M, we denote by S M , f M , P M , . . . the
interpretation in M of the sort S, the function symbol f , the predicate symbol P ,
etc. A Σ-formula α is T -satisfiable if there exists a Σ-structure M in C such that
α is true in M under a suitable assignment to the free variables of α (in symbols,
M |= α); it is T -valid (in symbols, T |= α) if its negation is T -unsatisfiable. Two
formulæ α1 and α2 are T -equivalent if α1 ↔ α2 is T -valid; α1 T -entails α2 (in
symbols, α1 |=T α2 ) iff α1 → α2 is T -valid. The satisfiability modulo the theory
T (SM T (T )) problem amounts to establishing the T -satisfiability of quantifier-
free Σ-formulæ. All theories T we consider in this paper have decidable
SM T (T )-problem (we recall that this property is preserved when adding free
function symbols, see [13, 26]).
A theory T = (Σ, C) admits quantifier elimination iff for any arbitrary Σ-
formula α(x) it is always possible to compute a quantifier-free formula ϕ(x) that is T -equivalent to α(x).
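As a concrete instance (our example; it uses the divisibility predicates Dn that also appear in the examples below): in Presburger arithmetic enriched with such predicates, the formula ∃x (y = 2x) can be effectively replaced by the quantifier-free formula D2 (y), and ∃x (y = 2x + 1) by ¬D2 (y). In general, eliminating quantifiers may require introducing exactly these divisibility predicates, which is the classical reason for having them in the signature.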
Let T = (Σ, C) be a theory; the theory ARR1 (T ) of arrays over T is obtained from
T by adding to it infinitely many (fresh) free unary function symbols. This means
that the signature of ARR1 (T ) is obtained from Σ by adding to it unary function
symbols (we use the letters a, a1 , a2 , . . . for them) and that a structure M is a
model of ARR1 (T ) iff (once the interpretations of the extra function symbols are
disregarded) it is a structure belonging to the original class C.
For array theories it is useful to introduce the following notation. We use a for
a tuple a = a1 , . . . , a|a| of distinct ‘array constants’ (i.e. free function symbols);
if t = t1 , . . . , t|t| is a tuple of terms, the notation a(t) represents the tuple (of
length |a| · |t|) of terms a1 (t1 ), . . . , a1 (t|t| ), . . . , a|a| (t1 ), . . . , a|a| (t|t| ).
ARR1 (T ) may be highly undecidable, even when T itself is decidable (see [17]);
thus it is mandatory to limit the shape of the formulæ we want to try to decide.
A prenex formula or a term in the signature of ARR1 (T ) are said to be flat iff for
every term of the kind a(t) occurring in them (here a is any array constant), the term t is a variable.

1 This is useful in the analysis of programs, when pointers to the memory (modeled as an array) are stored into array variables.
Non-flat formulæ can be flattened via rewriting steps of the kind

φ(a(t), ...) ⇝ ∃x (x = t ∧ φ(a(x), ...))    or    φ(a(t), ...) ⇝ ∀x (x = t → φ(a(x), ...)),
and consequently they may alter the quantifiers prefix of a formula. Thus it must
be kept in mind (when interpreting the results below) that the flattening transformation cannot be applied to arbitrary occurrences of terms without exiting from
the class that is claimed to be decidable. When we indicate a flat quantifier-free
formula with the notation ψ(x, a(x)), we mean that such a formula is obtained
from a Σ-formula of the kind ψ(x, z) (i.e. from a quantifier-free Σ-formula where
at most the free variables x, z can occur) by replacing z by a(x).
Step I. Let
F := ∃c ∀i.ψ(i, a(i), c, a(c))
be a ∃∗ ∀-flat ARR1 (T )-sentence, where ψ is a quantifier-free Σ-formula. Sup-
pose that s is the length of a and t is the length of c (that is, a = a1 , . . . , as
and c = c1 , . . . , ct ). Let e = el,m (1 ≤ l ≤ t, 1 ≤ m ≤ s) be a tuple of length s · t of fresh variables and consider the ARR1 (T )-formula:

F1 := ∃c ∃e ∀i. ( ψ(i, a(i), c, e) ∧ ⋀1≤l≤t, 1≤m≤s am (cl ) = el,m )
Step III. Let d be a fresh tuple of variables of length s; check the T -satisfiabi-
lity of
F3 := ∃c ∃e ∀i ∃d. [ ψ(i, d, c, e) ∧ ⋀1≤l≤t, 1≤m≤s (i = cl → dm = el,m ) ]
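As a small end-to-end illustration (ours; the paper prescribes no particular implementation), the two steps can be replayed on the one-array, one-constant instance F = ∃c ∀i. a(i) ≤ a(c), read as "c points to a maximal element". Step I introduces e for a(c); Step III produces a pure arithmetic sentence that any solver handling quantified linear integer arithmetic, for example Z3 through its Python API, can discharge:

from z3 import Ints, And, Implies, Exists, ForAll, Solver

c, e, i, d = Ints('c e i d')

# F  = exists c. forall i. a(i) <= a(c)                               (s = 1, t = 1)
# F1 = exists c, e. forall i. (a(i) <= e  and  a(c) = e)              -- Step I
# F3 = exists c, e. forall i. exists d. (d <= e and (i = c -> d = e)) -- Step III,
#      where d stands for the otherwise unconstrained value a(i)
F3 = Exists([c, e], ForAll([i], Exists([d], And(d <= e, Implies(i == c, d == e)))))

s = Solver()
s.add(F3)
print(s.check())  # expected: sat, so the flat sentence F is ARR1(P)-satisfiable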
(I) ∀i. a(i) = i;
(II) ∀i1 ∀i2 . (i1 ≤ i2 → a(i1 ) ≤ a(i2 ));
(III) ∃i1 ∃i2 . (i1 ≤ i2 ∧ a(i1 ) > a(i2 ));
(IV) ∀i1 ∀i2 . a(i1 ) = a(i2 );
(V) ∀i. (D2 (i) → a(i) = 0);
(VI) ∃i ∀j. (a1 (j) < a2 (3i)).
The flat formula (I) is not well-typed, hence it is not allowed in ARR2 (P, P); however,
it is allowed in ARR1 (P). Formula (II) expresses the fact that the array a is sorted: it is
flat but not monic (because of the atom i1 ≤ i2 ). On the contrary, its negation (III) is
flat and monic (because i1 , i2 are now existentially quantified). Formula (IV) expresses
that the array a is constant; it is flat and monic (notice that the universally quantified
variables i1 , i2 both occur in a(i1 ) = a(i2 ) but the latter is an ELEM atom). Formula
(V) expresses that a is initialized so to have all even positions equal to 0: it is monic
and flat. Formula (VI) is monic but not flat because of the term a2 (3i) occurring in it;
however, in 3i no universally quantified variable occurs, so it is possible to produce by
flattening the following sentence

∃i ∃k ∀j. (k = 3i ∧ a1 (j) < a2 (k)),

which is logically equivalent to (VI), is flat, and still lies in the ∃∗ ∀-class. Finally, as
a more complicated example, notice that the following sentence
∃k ∀i. (D2 (k) ∧ a(k) = ‘\0‘ ∧ (D2 (i) ∧ i < k → a(i) = ‘b‘) ∧ (¬D2 (i)∧i < k → a(i) = ‘c‘))
is monic and flat: it says that a represents a string of the kind (bc)∗ .
by substituting each term in the tuple a(b)∗a(c) with the constant occupying
the corresponding position in the tuple e.
Step IV. Let B be a full Boolean satisfying assignment for the atoms of the formula

F3 := ψ̄(b, c, e) ∧ ⋀dm ,dn ∈ b∗c ⋀1≤l≤s (dm = dn → el,m = el,n )
and let ψ̄I (b, c), ψ̄E (e) be the (conjunction of the) sets of literals of sort
INDEX and ELEM, respectively, induced by B.
Proof. We use exponentially bounded domino systems for the reduction [6,19]; see [2]
for details.
Based on the decidability results described in the previous section, we can now
achieve important decidability results in the context of reachability analysis for
programs handling arrays of unbounded length. As a reference theory, we shall
use ARR1 (P+ ) or ARR2 (P+ , P+ ), where P+ is P enriched with free constant sym-
bols and with definable predicate and function symbols. We do not enter into
more details concerning what a definable symbol is (see, e.g., [25]), we just un-
derline that definable symbols are nothing but useful macros that can be used to
formalize case-defined functions and SMT-LIB commands like if-then-else. The
addition of definable symbols does not compromise quantifier elimination, hence
decidability of P+ . Below, we let T be ARR1 (P+ ) or ARR2 (P+ , P+ ).
Henceforth, v will denote the variables of the programs we will
analyze. Formally, v = a, c where, according to our conventions, a is a tuple of
array variables (modeled as free unary function symbols of T in our framework)
and c a tuple of scalar variables; the latter can be modeled as variables in the
logical sense - in ARR2 (P+ , P+ ) we can model them either as variables of sort
INDEX or as free constants of sort ELEM.
A state-formula is a formula α(v) of T representing a (possibly infinite) set of
configurations of the program under analysis. A transition formula is a formula
of T of the kind τ (v, v′ ), where v′ is obtained by copying the variables in v
and adding a prime to each of them. For the purpose of this work, programs will
be represented by their control-flow automaton.
procedure initEven ( a[N ] , v ) :
    l1 : for (i = 0; i < N ; i = i + 2) a[i] = v;
    l2 : for (i = 0; i < N ; i = i + 2) assert(a[i] = v);

Fig. 1. The initEven procedure (a) and its control-flow graph (b)
We indicate by src, L, trg the three projection functions on E; that is, for
e = (li , τj , lk ) ∈ E, we have src(e) = li (this is called the ‘source’ location of
e), L(e) = τj (this is called the ‘label’ of e) and trg(e) = lk (this is called the
‘target’ location of e).
τ1 := i′ = 0
τ2 := i < N ∧ a′ = λj. if (j = i) then v else a(j) ∧ i′ = i + 2
τ3 := i ≥ N ∧ i′ = 0
τ4 := i < N ∧ a(i) = v ∧ i′ = i + 2
τ5 := i ≥ N
τE := i < N ∧ a(i) ≠ v
The procedure initEven can be formalized as the control-flow graph depicted in Fig. 1(b),
where L = {linit , l1 , l2 , l3 , lerror }.
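A compact Python rendering of this control-flow graph (ours; the edge endpoints are read off Fig. 1(b) together with the transition formulas above) shows the edge set E and the three projections:

# Each edge is a triple (source location, label, target location);
# src, L and trg are just the three component projections.
E = [
    ("linit", "tau1", "l1"),      # i' = 0
    ("l1",    "tau2", "l1"),      # body of the first loop (self-loop)
    ("l1",    "tau3", "l2"),      # i >= N, i' = 0
    ("l2",    "tau4", "l2"),      # body of the second loop (self-loop)
    ("l2",    "tau5", "l3"),      # i >= N: normal termination
    ("l2",    "tauE", "lerror"),  # i < N and a(i) != v: assertion violated
]

def src(edge): return edge[0]
def L(edge):   return edge[1]
def trg(edge): return edge[2]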
reachability problem for P. This problem, given well-known limiting results, is
not decidable for an arbitrary program P. The consequence is that, in general,
reachability analysis is sound, but not complete, and its incompleteness mani-
fests itself in (possible) divergence of the verification algorithm (see, e.g., [1]).
To gain decidability, we must first impose restrictions on the shape of the
transition formulæ, for instance we can constrain the analysis to formulæ falling
within decidable classes like those we analyzed in the previous section. This is
not sufficient however, due to the presence of loops in the control flow. Hence
we assume flatness conditions on such control flow and “accelerability” of the
transitions labeling self-loops. This is similar to what is done in [7, 9, 12] for
integer variable programs, but since we handle array variables we need specific
restrictions for acceleration. Our result for the decidability of the safety of an-
notated array programs builds upon the results presented in Section 3 and the
acceleration procedure presented in [3].
We first give the definition of flat0 -program, i.e., programs with only self-
loops for which each location belongs to at most one loop. Subsequently we will
identify sufficient conditions for achieving the full decidability of the reachability
problem for flat0 -programs.
We can now formally show that the reachability problem for simple0A -programs
is decidable, by instantiating Theorem 4 with the results obtained so far.
Theorem 5. The unbounded reachability problem for simple0A -programs is de-
cidable.
Proof. By prenex transformations, distributions of universal quantifiers over con-
junctions, etc., it is easy to see that the decidable classes covered by Corollary 1
or Theorem 3 are closed under conjunctions. Since the acceleration of a simplek -
assignment fits inside these classes (just eliminate definitions via λ-abstractions
by using universal quantifiers), Theorem 4 applies.
there is some similarity with [18], although (contrary to [18]) we consider purely
syntactically specified classes of formulæ. We provided a complexity analysis
of our decision procedures. We also showed that the decidability of Flat Array
Properties, combined with acceleration results, allows us to obtain a sound and
complete procedure for checking the safety of a class of programs with arrays.
The modular nature of our solution makes our contributions orthogonal with
respect to the state of the art: we can enrich P with various definable or even
not definable symbols [24] and get from our Theorems 1,2 decidable classes
which are far from the scope of existing results. Still, it is interesting to notice
that also the special cases of the decidable classes covered by Corollary 1 and
Theorem 3 are orthogonal to the results from the literature. To this aim, we
make a closer comparison with [8,15]. The two fragments considered in [8,15] are
characterized by rather restrictive syntactic constraints. In [15] it is considered
a subclass of the ∃∗ ∀-fragment of ARR1 (T ) called SIL, Single Index Logic. In
this class, formulæ are built according to a grammar allowing (i) as atoms only
difference logic constraints and some equations modulo a fixed integer and (ii) as
universally quantified subformulæ only formulæ of the kind ∀i.φ(i) → ψ(i, a(i +
k̄)) (here k̄ is a tuple of integers), where φ, ψ are conjunctions of atoms (in
particular, no disjunction is allowed in ψ). On the other side, SIL includes some
non-flat formulæ, due to the presence of constant increment terms i + k̄ in the
consequents of the above universally quantified implications. Similar restrictions
are in [16]. The Array Property Fragment described in [8] is basically a subclass
of the ∃∗ ∀∗ -fragment of ARR2 (P, P); however universally quantified subformulæ
are constrained to be of the kind ∀i.φ(i) → ψ(a(i)), where in addition the INDEX
part φ(i) must be a conjunction of atoms of the kind i ≤ j, i ≤ t, t ≤ i (with
i, j ∈ i and where t does not contain occurrences of the universally quantified
variables i). These formulæ are flat but not monic because of the atoms i ≤ j.
From a computational point of view, a complexity bound for SATMONO has
been shown in the proof of Theorem 1, while the complexity of the decision pro-
cedure proposed in [15] is unknown. On the other side, both SATMULTI and the
decision procedure described in [8] run in NExpTime (the decision procedure
in [8] is in NP only if the number of universally quantified index variables is
bounded by a constant N ). Our decision procedures for quantified formulæ are
also partially different, in spirit, from those presented so far in the SMT commu-
nity. While the vast majority of SMT-Solvers address the problem of checking
the satisfiability of quantified formulæ via instantiation (see, e.g., [8, 11, 14, 23]),
our procedures – in particular SATMULTI – are still based on instantiation, but the
instantiation refers to a set of terms enlarged with the free constants witnessing
the guessed set of realized types.
From the point of view of the applications, providing a full decidability result
for the unbounded reachability analysis of a class of array programs is what
differentiates our work from other contributions such as [1, 3].
References
1. Alberti, F., Bruttomesso, R., Ghilardi, S., Ranise, S., Sharygina, N.: Lazy abstrac-
tion with interpolants for arrays. In: Bjørner, N., Voronkov, A. (eds.) LPAR-18.
LNCS, vol. 7180, pp. 46–61. Springer, Heidelberg (2012)
2. Alberti, F., Ghilardi, S., Sharygina, N.: Decision procedures for flat array
properties. Technical Report 2013/04, University of Lugano (October 2013),
http://www.inf.usi.ch/research_publication.htm?id=77
3. Alberti, F., Ghilardi, S., Sharygina, N.: Definability of accelerated relations in a
theory of arrays and its applications. In: Fontaine, P., Ringeissen, C., Schmidt,
R.A. (eds.) FroCoS 2013. LNCS, vol. 8152, pp. 23–39. Springer, Heidelberg (2013)
4. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T.,
Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV
2011. LNCS, vol. 6806, pp. 171–177. Springer, Heidelberg (2011)
5. Behrmann, G., Bengtsson, J., David, A., Larsen, K.G., Pettersson, P., Yi, W.:
UPPAAL implementation secrets. In: Damm, W., Olderog, E.-R. (eds.) FTRTFT
2002. LNCS, vol. 2469, pp. 3–22. Springer, Heidelberg (2002)
6. Börger, E., Grädel, E., Gurevich, Y.: The classical decision problem. Perspectives
in Mathematical Logic. Springer, Berlin (1997)
7. Bozga, M., Iosif, R., Lakhnech, Y.: Flat parametric counter automata. Fundamenta
Informaticae (91), 275–303 (2009)
8. Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: Emer-
son, E.A., Namjoshi, K.S. (eds.) VMCAI 2006. LNCS, vol. 3855, pp. 427–442.
Springer, Heidelberg (2006)
9. Comon, H., Jurski, Y.: Multiple counters automata, safety analysis and pres-
burger arithmetic. In: Vardi, M.Y. (ed.) CAV 1998. LNCS, vol. 1427, pp. 268–279.
Springer, Heidelberg (1998)
10. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
11. Detlefs, D.L., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program check-
ing. Technical Report HPL-2003-148, HP Labs (2003)
12. Finkel, A., Leroux, J.: How to compose Presburger-accelerations: Applications to
broadcast protocols. In: Agrawal, M., Seth, A.K. (eds.) FSTTCS 2002. LNCS,
vol. 2556, pp. 145–156. Springer, Heidelberg (2002)
13. Ganzinger, H.: Shostak light. In: Voronkov, A. (ed.) CADE 2002. LNCS (LNAI),
vol. 2392, pp. 332–346. Springer, Heidelberg (2002)
14. Ge, Y., de Moura, L.: Complete instantiation for quantified formulas in satisfi-
ability modulo theories. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS,
vol. 5643, pp. 306–320. Springer, Heidelberg (2009)
15. Habermehl, P., Iosif, R., Vojnar, T.: A logic of singly indexed arrays. In: Cervesato,
I., Veith, H., Voronkov, A. (eds.) LPAR 2008. LNCS (LNAI), vol. 5330, pp. 558–
573. Springer, Heidelberg (2008)
16. Habermehl, P., Iosif, R., Vojnar, T.: What else is decidable about integer arrays?
In: Amadio, R.M. (ed.) FOSSACS 2008. LNCS, vol. 4962, pp. 474–489. Springer,
Heidelberg (2008)
17. Halpern, J.Y.: Presburger arithmetic with unary predicates is Π¹₁-complete. J.
Symbolic Logic 56(2), 637–642 (1991), doi:10.2307/2274706
18. Ihlemann, C., Jacobs, S., Sofronie-Stokkermans, V.: On local reasoning in verifica-
tion. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp.
265–281. Springer, Heidelberg (2008)
19. Lewis, H.R.: Complexity of solvable cases of the decision problem for the predicate
calculus. In: 19th Ann. Symp. on Found. of Comp. Sci., pp. 35–47. IEEE (1978)
20. Nieuwenhuis, R., Oliveras, A.: DPLL(T) with Exhaustive Theory Propagation and
Its Application to Difference Logic. In: Etessami, K., Rajamani, S.K. (eds.) CAV
2005. LNCS, vol. 3576, pp. 321–334. Springer, Heidelberg (2005)
21. Oppen, D.C.: A superexponential upper bound on the complexity of Presburger
arithmetic. J. Comput. System Sci. 16(3), 323–332 (1978)
22. Ranise, S., Tinelli, C.: The Satisfiability Modulo Theories Library, SMT-LIB
(2006), http://www.smt-lib.org
23. Reynolds, A., Tinelli, C., Goel, A., Krstić, S., Deters, M., Barrett, C.: Quantifier
instantiation techniques for finite model finding in SMT. In: Bonacina, M.P. (ed.)
CADE 2013. LNCS, vol. 7898, pp. 377–391. Springer, Heidelberg (2013)
24. Semënov, A.L.: Logical theories of one-place functions on the set of natural num-
bers. Izvestiya: Mathematics 22, 587–618 (1984)
25. Shoenfield, J.R.: Mathematical logic. Association for Symbolic Logic, Urbana
(2001) (reprint of the 1973 second printing)
26. Tinelli, C., Zarba, C.G.: Combining nonstably infinite theories. J. Automat. Rea-
son. 34(3), 209–238 (2005)
SATMC: A SAT-Based Model Checker
for Security-Critical Systems
1 Introduction
With the convergence of the social, cloud, and mobile paradigms, information
and communication technologies are affecting our everyday personal and working
lives to unprecedented depth and scale. We routinely use online services that stem
from the fruitful combination of mobile applications, web applications, cloud
services, and/or social networks. Sensitive data handled by these services often
flows across organizational boundaries, putting both the privacy of users and the
assets of organizations at risk.
Solutions (e.g., security protocols and services) that aim to securely combine
the ever-growing ecosystem of online services are already available. But they
are notoriously difficult to get right. Many security-critical protocols and ser-
vices have been designed and developed only to be found flawed years after their
deployment. These flaws are usually due to the complex and unexpected inter-
actions of the protocols and services as well as to the possible interference of
malicious agents. Since these weaknesses are very difficult to spot by traditional
verification techniques (e.g., manual inspection and testing), security-critical sys-
tems are a natural target for formal methods.
[Figure: SATMC architecture. Recoverable labels: User; domain-specific connectors (Security API connector, BPMN connector, Security Protocol connector); ASLan specification; output format; SATMC; NuSMV; MiniSAT.]
As shown in the same figure, SATMC leverages NuSMV to generate the SAT encoding for
the LTL formulae and MiniSAT [22] to solve the SAT problems.
Structure of the Paper. In the next section we present some success stories
related to the application domains wherein SATMC has been so far employed.
In Section 3 we provide the formal framework and in Section 4 we illustrate
how the ASLan specification language can be used to specify security-critical
systems, the abilities of the attackers, and the security goals. In Section 5 we
present the bounded model checking procedure implemented in SATMC and the
architecture of the tool, and we conclude in Section 6 with some final remarks.
2 Success Stories
SATMC has been successfully used to support the security analysis and testing
in a variety of industry-relevant application domains: security protocols, business
processes, and security APIs.
Security Protocols. Security protocols are communication protocols aiming
to achieve security assurances of various kinds through the usage of crypto-
graphic primitives. They are key to securing distributed information infrastruc-
tures, including—and most notably—the Web. The SAML 2.0 Single Sign-On
protocol [21] (SAML SSO, for short) is the established standard for cross-domain
browser-based SSO for enterprises. Figure 2 shows the prototypical use case for
the SAML SSO that enables a Service Provider (SP) to authenticate a Client (C)
via an Identity Provider (IdP): C asks SP to provide the resource at URI (step
1). SP then redirects C to IdP with the authentication request AReq(ID, SP),
where ID uniquely identifies the request (steps 2 and 3). IdP then challenges C
to provide valid credentials (gray dashed arrow). If the authentication succeeds,
IdP builds and sends C a digitally signed authentication assertion ({AA}K−1IdP , i.e.,
AA signed with IdP's private key) embedded into an HTTP form (step 4). This form also includes some script
that automatically posts the message to SP (step 5). SP checks the assertion
and then delivers the requested resource to C (step 6). Upon successful execu-
tion of the protocol, C and SP are mutually authenticated and the resource
has been confidentially delivered to C. To achieve this, SAML SSO—as most
of the application-level protocols—assumes that the communication between C
and SP as well as that between C and IdP is carried over unilateral SSL/TLS
communication channels. It must be noted that even with secure communication
channels in place, designing and developing application-level security protocols
such as the SAML SSO remains a challenge. These protocols are highly con-
figurable, are described in bulky natural language specifications, and deviations
from the standard may be dictated by application-specific requirements of the
organization.
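To make the message flow just described more concrete, the following minimal Python sketch replays steps 1–6 with all cryptography abstracted away. Every name in it (Client, IdP, SP, the AReq dictionary, the sig placeholder) is illustrative only and is not part of SATMC, ASLan, or the SAML standard.

```python
# Toy replay of the SP-initiated SAML SSO flow described above; the digital
# signature on the authentication assertion AA is modeled abstractly.
import secrets


class IdP:
    def __init__(self, name):
        self.name = name
        self.key = secrets.token_hex(16)          # stands in for IdP's private key

    def authenticate(self, client, areq):
        # Steps 3-4: challenge C (abstracted away), then issue a signed AA.
        assert client.credentials_ok()
        aa = {"subject": client.name, "in_response_to": areq["id"]}
        return (aa, f"sig[{self.key[:8]}]")       # abstract digital signature


class SP:
    def __init__(self, name):
        self.name = name

    def request_resource(self, uri):
        # Steps 1-2: build AReq(ID, SP) and redirect C to the IdP.
        return {"id": secrets.token_hex(8), "sp": self.name, "uri": uri}

    def consume(self, signed_aa, areq):
        # Steps 5-6: check the assertion, then deliver the resource.
        aa, sig = signed_aa
        if sig.startswith("sig[") and aa["in_response_to"] == areq["id"]:
            return f"resource {areq['uri']} for {aa['subject']}"
        return None


class Client:
    def __init__(self, name):
        self.name = name

    def credentials_ok(self):
        return True


idp, sp, c = IdP("idp.example.org"), SP("sp.example.org"), Client("alice")
areq = sp.request_resource("/doc")        # steps 1-2
signed_aa = idp.authenticate(c, areq)     # steps 3-4
print(sp.consume(signed_aa, areq))        # steps 5-6
```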
The SAML-based SSO for Google Apps in operation until June 2008 deviated
from the standard in a few, seemingly minor ways. By using SATMC, we discov-
ered an authentication flaw in the service that allowed a malicious SP to mount
a severe man-in-the-middle attack [6]. We readily informed Google and the US-
CERT of the problem. In response to our findings Google developed a patch and
asked their customers to update their applications accordingly. A vulnerability
report was then released by the US-CERT.2 The severity of the vulnerability has
been rated High by NIST.3 By using SATMC we also discovered an authen-
tication flaw in the prototypical SAML SSO use case [5]. This flaw paves the way
to launching Cross-Site Scripting (XSS) and Cross-Site Request Forgery (XSRF)
attacks, as witnessed by a new XSS attack that we identified in the SAML-based
SSO for Google Apps. We reported the problem to OASIS which subsequently
released an errata addressing the issue.4
We also used SATMC at SAP as a back-end for security protocol analysis
and testing (AVANTSSAR [1] and SPaCIoS [28]) to assist development teams
in the design and development of the SAP NetWeaver SAML Single Sign-On
(SAP NGSSO) and SAP OAuth 2.0 solutions. Overall, more than one hundred
different protocol configurations and corresponding formal models have been
analyzed, showing that both SAP NGSSO and SAP OAuth2 services are indeed
well designed.
2 http://www.kb.cert.org/vuls/id/612636
3 http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2008-3891
4 http://tools.oasis-open.org/issues/browse/SECURITY-12
[Fig. 3. Travel request approval business process. Recoverable labels: tasks Request Travel, Approve Travel, Approve Budget, Notify Requestor; roles Staff and Manager; separation-of-duty annotations SoD1, SoD2, SoD3.]
All in all, SATMC has been key to the analysis of various security protocols
of industrial complexity, leading to the discovery of a number of serious flaws,
including a vulnerability in a “patched” version of the optimistic fair-exchange
contract signing protocol developed by Asokan, Shoup, and Waidner [3] and in a
protocol for strong authentication based on the GSM infrastructure [7].
Business Processes. A Business Process (BP) is a workflow of activities
whose execution aims to accomplish a specific business goal. BPs must be care-
fully designed and executed so as to comply with security and regulatory
requirements. Figure 3 illustrates a simple example of a BP for travel
request approval: a staff member may issue a travel request. Both the reason
and the budget of the travel must be approved by managers. Afterwards, the
requesting user is notified whether the request is granted or not. The BP shall
ensure a number of authorization requirements: only managers shall be able to
approve the reason and budget of travel requests. Moreover, a manager should
not be allowed to approve her own travel request (Separation of Duty). Finally,
the manager that approves the travel reason should not get access to the details
of the travel budget (and vice versa) to perform her job (Need-to-Know Princi-
ple). Checking whether a given BP of real-world complexity complies with these
kinds of requirements is difficult.
SATMC has been used to model check BPs against high-level authorization
requirements [10]. Moreover, SATMC lies at the core of a Security Validation
prototype for BPs developed by the Product Security Research unit at SAP.
This prototype can integrate off-the-shelf Business Process Management (BPM)
systems (e.g., SAP NetWeaver BPM and Activiti) to support BP analysts in the
evaluation of BP compliance. It enables a BP analyst to easily specify the security
goals and triggers SATMC via a translation of the BP workflow, data, security
policy and goal into ASLan. As soon as a flaw is discovered, it is graphically
rendered to the analyst [20,11].
Security APIs. A Security API is an Application Program Interface that allows
untrusted code to access sensitive resources in a secure way. Figure 4 shows a few
methods of the Java interface5 for the security API defined by the PKCS#11 stan-
dard [25]. The Java interface and its implementation allow access to the PKCS#11
modules of smart cards or other hardware security modules (HSM) where
5 http://javadoc.iaik.tugraz.at/pkcs11_wrapper/1.2.15/iaik/pkcs/pkcs11/wrapper/PKCS11.html
sensitive resources (e.g., cryptographic keys, pin numbers) can be stored. These
resources can be associated with attributes (cf. C_SetAttributeValue) stating,
e.g., whether they can be extracted from the device or not, whether a certain
key can be used to wrap (encrypt) another key, etc. For instance, if an object is
set to be non-extractable, then it cannot be reset and become extractable again.
More generally, changes to the attribute values and access to sensitive resources
must comply with the policy that the security API is designed to enforce, and this must
hold for any possible sequence of invocations of the methods offered by the API
(C_Decrypt, C_WrapKey, etc.).
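As a toy illustration of the attribute rule just mentioned (once an object is non-extractable it must never become extractable again), the following Python sketch monitors a sequence of attribute updates. The class, method and attribute names are invented for illustration; they do not correspond to the IAIK wrapper or to any real PKCS#11 implementation.

```python
# Minimal policy monitor for the "non-extractable stays non-extractable" rule.
class TokenObject:
    def __init__(self):
        self.attrs = {"extractable": True, "wrap": False, "sensitive": True}

    def set_attribute(self, name, value):
        # Reject any update that would re-enable extraction of the object.
        if name == "extractable" and value and not self.attrs["extractable"]:
            raise PermissionError("policy violation: cannot re-enable extractable")
        self.attrs[name] = value


obj = TokenObject()
obj.set_attribute("extractable", False)      # allowed
try:
    obj.set_attribute("extractable", True)   # must be rejected
except PermissionError as err:
    print(err)
```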
SATMC lies at the core of Tookan [18], a tool capable of automatically detecting
and reproducing policy violations in commercially available cryptographic security
tokens by exploiting vulnerabilities in their RSA PKCS#11-based APIs. Tookan
can automatically reverse-engineer real PKCS#11 tokens, deduce their function-
alities, construct formal models of the API for the SATMC model checker, and
then execute the attack traces found by SATMC directly against the actual to-
ken. Tookan has been able to detect a variety of severe attacks on a number of
commercial tokens (e.g., SecurID800 by RSA, CardOS V4.3 B by Siemens) [23].
Cryptosense6 is a spin-off recently established on top of the success of Tookan.
3 Formal Framework
symbol uniquely associated with the rule) for n ≥ 0, and v1 , . . . , vn are the
variables in L; it is required that the variables occurring in R also occur in
L.
– H is a set of Horn clauses, i.e., expressions of the form (h ←− B) labelled by c(v1 , . . . , vn ),
where h is a fact, B is a finite set of facts, c is a Horn clause name (i.e., an
n-ary function symbol uniquely associated with the clause) for n ≥ 0, and
v1 , . . . , vn are the variables occurring in the clause.
– C ⊆ L is a set of closed formulae called constraints.
πi |=σ f iff f σ ∈ π(i)H (f is a fact)
πi |=σ t1 = t2 iff t1 σ and t2 σ are the same term
πi |=σ ¬φ iff πi ̸|=σ φ
πi |=σ φ ∨ ψ iff πi |=σ φ or πi |=σ ψ
πi |=σ F(φ) iff there exists j ≥ i such that πj |=σ φ
πi |=σ Gφ iff for all j ≥ i, πj |=σ φ
πi |=σ Oφ iff there exists 0 ≤ j ≤ i such that πj |=σ φ
πi |=σ ∃x.φ iff there exists t ∈ T such that πi |=σ[t/x] φ
where σ[t/x] is the assignment that associates x with t and all other vari-
ables y with σ(y). The semantics of the remaining connectives and temporal
operators, as well as of the universal quantifier, is defined analogously. Let
M1 = ⟨I1 , R1 , H1 , C1 ⟩ and M2 = ⟨I2 , R2 , H2 , C2 ⟩. The parallel composition of
M1 and M2 is the model M1 ∥ M2 = ⟨I1 ∪ I2 , R1 ∪ R2 , H1 ∪ H2 , C1 ∪ C2 ⟩. Let
M = ⟨I, R, H, C⟩ be a model and φ ∈ L. We say that φ is valid in M , in symbols
M |= φ, if and only if π |=σ φ for all initialized paths π of M and all assignments
σ such that π |=σ ψ for all ψ ∈ C.
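The following Python sketch mirrors the satisfaction relation above on a finite prefix of a path, restricted to ground facts and to the connectives shown (no terms, assignments, Horn closure, or constraints). It is only meant to illustrate the semantics and is not SATMC code.

```python
# Evaluate a formula at position i of a finite path given as a list of fact sets.
def holds(path, i, phi):
    op = phi[0]
    if op == "fact":                      # pi_i |= f  iff  f is in pi(i)
        return phi[1] in path[i]
    if op == "not":
        return not holds(path, i, phi[1])
    if op == "or":
        return holds(path, i, phi[1]) or holds(path, i, phi[2])
    if op == "F":                         # eventually: some j >= i
        return any(holds(path, j, phi[1]) for j in range(i, len(path)))
    if op == "G":                         # globally: all j >= i (on this prefix)
        return all(holds(path, j, phi[1]) for j in range(i, len(path)))
    if op == "O":                         # once: some 0 <= j <= i
        return any(holds(path, j, phi[1]) for j in range(0, i + 1))
    raise ValueError(f"unknown operator {op}")


path = [{"ik(m)"}, {"ik(m)", "sent(a,a,b,m,c)"}, {"rcvd(b,a,m,c)"}]
print(holds(path, 0, ("F", ("fact", "rcvd(b,a,m,c)"))))   # True
print(holds(path, 2, ("O", ("fact", "ik(m)"))))           # True
```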
Fact                        Meaning
Domain-independent facts:
  sent(s, b, a, m, c)       s sent m on c to a pretending to be b
  rcvd(a, b, m, c)          m (supposedly sent by b) has been received on c by a
  contains(d, ds)           d is a member of ds
  ik(m)                     the intruder knows m
Protocols:
  stater (j, a, ts)         a plays r, has internal state ts, and can execute step j
Business Processes:
  pa(r, t)                  r has the permission to perform t
  ua(a, r)                  a is assigned to r
  executed(a, t)            a executed t
  granted(a, t)             a is granted to execute t
APIs:
  attrs(as)                 security token has attributes as
Legenda: s, a, b: agents; m: message; c: channel; r: role; j: protocol step; ts: list of terms; t: task; d: data; ds: set of data; o: resource object; as: set of attributes
MS ∥ MI |= G    (1)
stating that if an agent r has a valid contract, then we ask o to possess a valid
contract relative to the same contractual text txt and secret commitment nO .
Finally, the separation of duty property SoD3 exemplified in Figure 3 can be
expressed as the following LTL formula:
This goal states that if an agent A has executed the task approve travel then
he should not execute the task approve budget.
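The formula itself is not reproduced in this excerpt; one plausible formalization of SoD3 (an assumption on our part, using the past operator O introduced above and illustrative task constants approve_travel and approve_budget) is

    G ∀A. ( executed(A, approve_budget) → ¬ O executed(A, approve_travel) )

i.e., whenever an agent approves a budget, that agent must not have previously approved the travel.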
traces of length up to kmax exist. Notice that it is possible to set kmax to ∞, but
then the procedure may not terminate (i.e., it is a semi-decision procedure). It
is worth noticing that even though the planning graph may represent spurious
execution paths, the encoding in SAT is precise and thus false positives are not
returned by SATMC.
The Planning Graph. A planning graph is a sequence of layers Γi for i =
0, . . . , k, where each layer Γi is a set of facts concisely representing the set of
states {S : S ⊆ Γi }. The construction of a planning graph for M goes
beyond the scope of this paper and the interested reader is referred to [9] for
more details. For the purpose of this paper it suffices to know that (i) Γ0 is set
to the initial state of M , (ii) if S is reachable from the initial state of M in i
steps, then S ⊆ Γi , for i = 0, . . . , k, and (iii) Γi ⊆ Γi+1
for i = 0, . . . , k − 1, i.e., the layers in the planning graph grow monotonically.
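The following Python sketch is illustrative only: it ignores the rule-name and Horn-clause bookkeeping of the construction in [9] and simply shows how layers can be grown and why they are monotonic, namely each layer adds the effects of every rule whose preconditions are already contained in the previous layer.

```python
# Toy planning-graph layers over ground facts; a rule is (preconditions, added facts).
def planning_graph(init_facts, rules, k):
    layers = [set(init_facts)]                 # Gamma_0
    for _ in range(k):
        cur = layers[-1]
        nxt = set(cur)                         # ensures Gamma_i is a subset of Gamma_{i+1}
        for pre, add in rules:
            if pre <= cur:                     # all preconditions already in the layer
                nxt |= add
        layers.append(nxt)
    return layers


rules = [({"ik(m)"}, {"sent(i,a,b,m,c)"}),
         ({"sent(i,a,b,m,c)"}, {"rcvd(b,a,m,c)"})]
for i, gamma in enumerate(planning_graph({"ik(m)"}, rules, 3)):
    print(i, sorted(gamma))
```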
Encoding the Model. The first step is to add a time-index to the rules and
facts to indicate the state at which the rules apply or the facts hold. Facts and
rules are thus indexed by 0 through k. If p is a fact, a rule, or a Horn clause and
i is an index, then pi is the corresponding time-indexed propositional variable.
If p = p1 , . . . , pn is a tuple of facts, rules, or Horn clauses and i is an index,
then pi = pi1 , . . . , pin is the corresponding time-indexed tuple of propositional
variables. The propositional formula [[M ]]0 is I(f 0 , hc0 ), while [[M ]]k , for k > 0,
is of the form:
I(f 0 , hc0 ) ∧ ⋀0≤i<k Ti (f i , ρi , hci , f i+1 , hci+1 )    (2)
where f , ρ, and hc are tuples of facts, rules, and Horn clauses, respectively. The
formula I(f 0 , hc0 ) encodes the initial state whereas the formula Ti (f i , ρi , hci ,
f i+1 , hci+1 ) encodes all the possible evolutions of the system from step i to step
i + 1. The encoding of the system follows the approach proposed in [9], adapted
from an encoding technique originally introduced for AI planning, extended to
support Horn clauses.
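A rough sketch of this time-indexed unrolling, using Z3 Booleans as the propositional variables f^i and omitting the rule and Horn-clause variables ρ^i and hc^i, is shown below. The contents of the placeholder I and T_i are invented and do not reflect SATMC's actual encoding.

```python
from z3 import Bool, And, Not, Implies, Solver

k = 3
fact_names = ["ik_m", "rcvd_b_m"]                          # toy fact alphabet
# f[i][n] is the time-indexed propositional variable "fact n holds at step i".
f = [{n: Bool(f"{n}@{i}") for n in fact_names} for i in range(k + 1)]

def initial(f0):
    # Placeholder for I(f^0): the intruder knows m, nothing received yet.
    return And(f0["ik_m"], Not(f0["rcvd_b_m"]))

def trans(fi, fj):
    # Placeholder for T_i(f^i, f^{i+1}): here facts simply persist.
    return And(Implies(fi["ik_m"], fj["ik_m"]),
               Implies(fi["rcvd_b_m"], fj["rcvd_b_m"]))

# Formula (2), restricted to fact variables.
unrolling = And(initial(f[0]), *(trans(f[i], f[i + 1]) for i in range(k)))
s = Solver()
s.add(unrolling, f[k]["rcvd_b_m"])       # e.g., ask for a path where rcvd holds at step k
print(s.check())                         # sat here, since nothing forbids rcvd at step k
```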
Grounding First-order LTL Formulae. Planning graphs are also key to
turn any first-order LTL formula ψ into a propositional LTL formula ψ0 such
that if π is an execution path of M with k or less states that violates ψ0 , then
π violates also ψ, and vice versa. This allows us to reduce the BMC problem
for any first-order LTL formula ψ to the BMC for a propositional LTL formula
ψ0 (module Goal Grounding) which can in turn be reduced to SAT by us-
ing the techniques available in the literature. The functionalities of the module
PLTL2SAT are currently given by the NuSMV model checker, used as a plugin
by SATMC. From the key properties of the planning graph described in Sec-
tion 5 it is easy to see that if a fact does not occur in Γk , then it is false in all
states reachable from the initial state in k steps and this leads to the following
fact.
6 Conclusions
References
1. Armando, A., et al.: The AVANTSSAR Platform for the Automated Validation of
Trust and Security of Service-Oriented Architectures. In: Flanagan, C., König, B.
(eds.) TACAS 2012. LNCS, vol. 7214, pp. 267–282. Springer, Heidelberg (2012)
2. Armando, A., et al.: The AVISPA Tool for the Automated Validation of Internet
Security Protocols and Applications. In: Etessami, K., Rajamani, S.K. (eds.) CAV
2005. LNCS, vol. 3576, pp. 281–285. Springer, Heidelberg (2005)
3. Armando, A., Carbone, R., Compagna, L.: LTL Model Checking for Security Proto-
cols. In: 20th IEEE Computer Security Foundations Symposium (CSF), pp. 385–396.
IEEE Computer Society (2007)
4. Armando, A., Carbone, R., Compagna, L.: LTL Model Checking for Security Pro-
tocols. In: JANCL, pp. 403–429. Hermes Lavoisier (2009)
5. Armando, A., Carbone, R., Compagna, L., Cuéllar, J., Pellegrino, G., Sorniotti,
A.: An Authentication Flaw in Browser-based Single Sign-On Protocols: Impact
and Remediations. Computers & Security 33, 41–58 (2013)
6. Armando, A., Carbone, R., Compagna, L., Cuéllar, J., Tobarra, L.: Formal Analysis
of SAML 2.0 Web Browser Single Sign-On: Breaking the SAML-based Single Sign-
On for Google Apps. In: Shmatikov, V. (ed.) Proc. ACM Workshop on Formal
Methods in Security Engineering, pp. 1–10. ACM Press (2008)
7. Armando, A., Carbone, R., Zanetti, L.: Formal Modeling and Automatic Security
Analysis of Two-Factor and Two-Channel Authentication Protocols. In: Lopez, J.,
Huang, X., Sandhu, R. (eds.) NSS 2013. LNCS, vol. 7873, pp. 728–734. Springer,
Heidelberg (2013)
8. Armando, A., Compagna, L.: SATMC: A SAT-Based Model Checker for Security
Protocols. In: Alferes, J.J., Leite, J. (eds.) JELIA 2004. LNCS (LNAI), vol. 3229,
pp. 730–733. Springer, Heidelberg (2004)
IC3 Modulo Theories via Implicit Predicate Abstraction

Abstract. We present a novel approach for generalizing the IC3 algorithm for
invariant checking from finite-state to infinite-state transition systems, expressed
over some background theories. The procedure is based on a tight integration of
IC3 with Implicit (predicate) Abstraction, a technique that expresses abstract tran-
sitions without explicitly computing the abstract system and is incremental with
respect to the addition of predicates. In this scenario, IC3 operates only at the
Boolean level of the abstract state space, discovering inductive clauses over the
abstraction predicates. Theory reasoning is confined within the underlying SMT
solver, and applied transparently when performing satisfiability checks. When the
current abstraction allows for a spurious counterexample, it is refined by discov-
ering and adding a sufficient set of new predicates. Importantly, this can be done
in a completely incremental manner, without discarding the clauses found in the
previous search.
The proposed approach has two key advantages. First, unlike current SMT
generalizations of IC3, it can handle a wide range of background theories
without relying on ad-hoc extensions, such as quantifier elimination or theory-
specific clause generalization procedures, which might not always be available,
and can moreover be inefficient. Second, compared to a direct exploration of the
concrete transition system, the use of abstraction gives a significant performance
improvement, as our experiments demonstrate.
1 Introduction
IC3 [5] is an algorithm for the verification of invariant properties of transition systems.
It builds an over-approximation of the reachable state space, using clauses obtained by
generalization while disproving candidate counterexamples. In the case of finite-state
systems, the algorithm is implemented on top of Boolean SAT solvers, fully leveraging
their features. IC3 has proved to be extremely effective, and it is a core
component of state-of-the-art engines for hardware verification.
There have been several attempts to lift IC3 to the case of infinite-state systems,
for its potential applications to software, RTL models, timed and hybrid systems [9],
although the problem is in general undecidable. These approaches are set in the frame-
work of Satisfiability Modulo Theories (SMT) [1] and hereafter are referred to as IC3
Modulo Theories [7,18,16,25]: the infinite-state transition system is symbolically de-
scribed by means of SMT formulas, and an SMT solver plays the same role as the
SAT solver in the discrete case. The key difference is the need in IC3 Modulo Theories
for specific theory reasoning to deal with candidate counterexamples. This led to the
development of various techniques, based on quantifier elimination or theory-specific
clause generalization procedures. Unfortunately, such extensions are typically ad-hoc,
and might not always be applicable in all theories of interest. Furthermore, being based
on the fully detailed SMT representation of the transition systems, some of these solu-
tions (e.g. based on quantifier elimination) can be highly inefficient.
We present a novel approach to IC3 Modulo Theories, which is able to deal with
infinite-state systems by means of a tight integration with predicate abstraction (PA)
[12], a standard abstraction technique that partitions the state space according to the
equivalence relation induced by a set of predicates. In this work, we leverage Implicit
Abstraction (IA) [23], which makes it possible to express abstract transitions without
explicitly computing the abstract system, and is fully incremental with respect to the addition of
new predicates. In the resulting algorithm, called IC3+IA, the search proceeds as if
carried out in an abstract system induced by the set of current predicates P – in fact,
IC3+IA only generates clauses over P. The key insight is to exploit IA to obtain an
abstract version of the relative induction check. When an abstract counterexample is
found, as in Counterexample-Guided Abstraction Refinement (CEGAR), it is simu-
lated in the concrete space and, if spurious, the current abstraction is refined by adding
a set of predicates sufficient to rule it out.
The proposed approach has several advantages. First, unlike current SMT general-
izations of IC3, IC3+IA can handle a wide range of background theories without
relying on ad-hoc extensions, such as quantifier elimination or theory-specific clause
generalization procedures. The only requirement is the availability of an effective tech-
nique for abstraction refinement, for which various solutions exist for many important
theories (e.g. interpolation [15], unsat core extraction, or weakest precondition). Sec-
ond, the analysis of the infinite-state transition system is now carried out in the abstract
space, which is often as effective as an exact analysis, but also much faster. Finally, the
approach is completely incremental, without having to discard or reconstruct clauses
found in the previous iterations.
We experimentally evaluated IC3+IA on a set of benchmarks from heterogeneous
sources [2,14,18], with very positive results. First, our implementation of IC3+IA is
significantly more expressive than the SMT-based IC3 of [7], being able to handle
not only the theory of Linear Rational Arithmetic (LRA) like [7], but also those of
Linear Integer Arithmetic (LIA) and fixed-size bit-vectors (BV). Second, in terms of
performance IC3+IA proved to be uniformly superior to a wide range of alternative
techniques and tools, including state-of-the-art implementations of the bit-level IC3
algorithm ([11,22,3]), other approaches for IC3 Modulo Theories ([7,16,18]), and tech-
niques based on k-induction and invariant discovery ([14,17]). A remarkable property
of IC3+IA is that it can deal with a large number of predicates: in several benchmarks,
hundreds of predicates were introduced during the search. Considering that an explicit
computation of the abstract transition relation (e.g. based on All-SMT [19]) often be-
comes impractical with a few dozen predicates, we conclude that IA is fundamental to
scalability, allowing for efficient reasoning in a fine-grained abstract space.
The rest of the paper is structured as follows. In Section 2 we present some back-
ground on IC3 and Implicit Abstraction. In Section 3 we describe IC3+IA and prove
its formal properties. In Section 4 we discuss the related work. In Section 5 we ex-
perimentally evaluate our method. In Section 6 we draw some conclusions and present
directions for future work.
2 Background
2.1 Transition Systems
Our setting is standard first order logic. We use the standard notions of theory, satisfi-
ability, validity, logical consequence. We denote formulas with ϕ, ψ, I, T, P , variables
with x, y, and sets of variables with X, Y , X̄, X′ . Unless otherwise specified, we work
on quantifier-free formulas, and we refer to 0-arity predicates as Boolean variables, and
to 0-arity uninterpreted functions as (theory) variables. A literal is an atom or its nega-
tion. A clause is a disjunction of literals, whereas a cube is a conjunction of literals.
If s is a cube l1 ∧ . . . ∧ ln , with ¬s we denote the clause ¬l1 ∨ . . . ∨ ¬ln , and vice
versa. A formula is in conjunctive normal form (CNF) if it is a conjunction of clauses,
and in disjunctive normal form (DNF) if it is a disjunction of cubes. With a little abuse
of notation, we might sometimes denote formulas in CNF C1 ∧ . . . ∧ Cn as sets of
clauses {C1 , . . . , Cn }, and vice versa. If X1 , . . . , Xn are sets of variables and ϕ is a
formula, we might write ϕ(X1 , . . . , Xn ) to indicate that all the variables occurring in ϕ
are elements of ⋃i Xi . For each variable x, we assume that there exists a corresponding
variable x′ (the primed version of x). If X is a set of variables, X′ is the set obtained
by replacing each element x with its primed version (X′ = {x′ | x ∈ X}), X̄ is the set
obtained by replacing each x with x̄ (X̄ = {x̄ | x ∈ X}), and X^n is the set obtained by
adding n primes to each variable (X^n = {x^n | x ∈ X}).
Given a formula ϕ, ϕ′ is the formula obtained by adding a prime to each variable
occurring in ϕ. Given a theory T, we write ϕ |=T ψ (or simply ϕ |= ψ) to denote that
the formula ψ is a logical consequence of ϕ in the theory T.
A transition system (TS) S is a tuple S = ⟨X, I, T ⟩ where X is a set of (state)
variables, I(X) is a formula representing the initial states, and T (X, X′ ) is a formula
representing the transitions. A state of S is an assignment to the variables X. A path of
S is a finite sequence s0 , s1 , . . . , sk of states such that s0 |= I and, for all i, 0 ≤ i < k,
(si , s′i+1 ) |= T .
Given a formula P (X), the verification problem denoted with S |= P is the problem
to check if for all paths s0 , s1 , . . . , sk of S, for all i, 0 ≤ i ≤ k, si |= P . Its dual is the
reachability problem, which is the problem to find a path s0 , s1 , . . . , sk of S such that
sk |= ¬P . P (X) represents the “good” states, while ¬P represents the “bad” states.
Inductive invariants are central to solving the verification problem. P is an inductive
invariant iff (i) I(X) |= P (X); and (ii) P (X) ∧ T (X, X′ ) |= P (X′ ). A weaker notion
is given by relative inductive invariants: given a formula φ(X), P is inductive relative
to φ iff (i) I(X) |= P (X); and (ii) φ(X) ∧ P (X) ∧ T (X, X′ ) |= P (X′ ).
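Both entailment checks can be discharged with an SMT solver by testing unsatisfiability of the negated query. A minimal sketch, assuming Z3's Python API and a toy counter system that is not taken from the paper:

```python
# X = {x}, I = (x = 0), T = (x' = x + 1), candidate invariant P = (x >= 0).
from z3 import Int, Solver, And, Not, unsat

x, xp = Int("x"), Int("xp")          # xp plays the role of x'
I = x == 0
T = xp == x + 1
P, Pp = x >= 0, xp >= 0              # P(X) and P(X')

def entails(lhs, rhs):
    s = Solver()
    s.add(lhs, Not(rhs))             # lhs |= rhs  iff  lhs and not(rhs) is unsat
    return s.check() == unsat

print(entails(I, P))                 # (i)  I(X) |= P(X)
print(entails(And(P, T), Pp))        # (ii) P(X) and T(X, X') |= P(X')
```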
SMT case in [7,16]. In the following, we present its main ideas, following the descrip-
tion of [7]. For brevity, we have to omit several important details, for which we refer to
[5,7,16].
Let S and P be a transition system and a set of good states as in §2.1. The IC3
algorithm tries to prove that S |= P by finding a formula F (X) such that: (i) I(X) |=
F (X); (ii) F (X) ∧ T (X, X′ ) |= F (X′ ); and (iii) F (X) |= P (X).
In order to construct an inductive invariant F , IC3 maintains a sequence of formulas
(called trace) F0 (X), . . . , Fk (X) such that: (i) F0 = I; (ii) Fi |= Fi+1 ; (iii) Fi (X) ∧
T (X, X′ ) |= Fi+1 (X′ ); (iv) for all i < k, Fi |= P . Therefore, each element of the
trace Fi+1 , called frame, is inductive relative to the previous one, Fi . IC3 strengthens
the frames by finding new relative inductive clauses by checking the unsatisfiability of
the formula:
RelInd(F, T, c) := F ∧ c ∧ T ∧ ¬c′ .    (1)
More specifically, the algorithm proceeds incrementally, by alternating two phases:
a blocking phase, and a propagation phase. In the blocking phase, the trace is analyzed
to prove that no intersection between Fk and ¬P (X) is possible. If such intersection
cannot be disproved on the current trace, the property is violated and a counterexample
can be reconstructed. During the blocking phase, the trace is enriched with additional
formulas, which can be seen as strengthening the approximation of the reachable state
space. At the end of the blocking phase, if no violation is found, Fk |= P .
The propagation phase tries to extend the trace with a new formula Fk+1 , moving
forward the clauses from preceding Fi ’s. If, during this process, two consecutive frames
become identical (i.e. Fi = Fi+1 ), then a fixpoint is reached, and IC3 terminates with
Fi being an inductive invariant proving the property.
In the blocking phase IC3 maintains a set of pairs (s, i), where s is a set of states that
can lead to a bad state, and i > 0 is a position in the current trace. New formulas (in the
form of clauses) to be added to the current trace are derived by (recursively) proving
that a set s of a pair (s, i) is unreachable starting from the formula Fi−1 . This is done
by checking the satisfiability of the formula RelInd(Fi−1 , T, ¬s). If the formula is un-
satisfiable, then ¬s is inductive relative to Fi−1 , and IC3 strengthens Fi by adding ¬s
to it1 , thus blocking the bad state s at i. If, instead, (1) is satisfiable, then the overap-
proximation Fi−1 is not strong enough to show that s is unreachable. In this case, let p
be a subset of the states in Fi−1 ∧ ¬s such that all the states in p lead to a state in s in
one transition step. Then, IC3 continues by trying to show that p is not reachable in one
step from Fi−2 (that is, it tries to block the pair (p, i − 1)). This procedure continues
recursively, possibly generating other pairs to block at earlier points in the trace, until
either IC3 generates a pair (q, 0), meaning that the system does not satisfy the property,
or the trace is eventually strengthened so that the original pair (s, i) can be blocked.
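A very compressed sketch of one blocking step, again assuming Z3's Python API and a toy system, is shown below. Generalization, the proof-obligation queue, and the propagation phase are omitted, so this only illustrates how the RelInd check of formula (1) is used; it is not a faithful IC3 implementation.

```python
from z3 import Int, And, Not, Solver, substitute, unsat

x, xp = Int("x"), Int("xp")                   # x' is modeled by xp
T = xp == x + 1                               # toy transition relation
frames = [x == 0, x >= 0]                     # F0 = I, and a candidate F1

def prime(f):
    return substitute(f, (x, xp))

def rel_ind(F, c):
    # RelInd(F, T, c) := F and c and T and not(c'); unsat means that c is
    # inductive relative to F and can be added to the next frame.
    s = Solver()
    s.add(F, c, T, Not(prime(c)))
    return s.check() == unsat

bad_cube = x < 0                              # a cube of states leading to a violation
c = Not(bad_cube)                             # the clause we try to push
if rel_ind(frames[0], c):
    frames[1] = And(frames[1], c)             # block the cube at index 1
    print("blocked:", frames[1])
else:
    print("must first block a predecessor cube at the previous frame")
```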
A key difference between the original Boolean IC3 and its SMT extensions in [7,16]
is in the way sets of states to be blocked or generalized are constructed. In the blocking
phase, when trying to block a pair (s, i), if the formula (1) is satisfiable, then a new
pair (p, i − 1) has to be generated such that p is a cube in the preimage of s wrt. T .
In the propositional case, p can be obtained from the model μ of (1) generated by
1 ¬s is actually generalized before being added to Fi . Although this is fundamental for the effectiveness of IC3, we do not discuss it for simplicity.
the SAT solver, by simply dropping the primed variables occurring in μ. This cannot
be done in general in the first-order case, where the relationship between the current
state variables X and their primed version X′ is encoded in the theory atoms, which in
general cannot be partitioned into a primed and an unprimed set. The solution proposed
in [7] is to compute p by existentially quantifying (1) and then applying an under-
approximated existential elimination algorithm for linear rational arithmetic formulas.
Similarly, in [16] a theory-aware generalization algorithm for linear rational arithmetic
(based on interpolation) was proposed, in order to strengthen ¬s before adding it to Fi
after having successfully blocked it.
The main idea of IC3+IA is to mimic how IC3 would work on the abstract state space
defined by a set of predicates P, but using IA to avoid quantifier elimination to compute
the abstract transition relation. Therefore, clauses, frames and cubes are restricted to
have predicates in P as atoms. We call these clauses, frames and cubes respectively
P-clauses, P-formulas, and P-cubes. Note that for any P-formula φ (and thus also for
P-cubes and P-clauses), φ = φ[XP /P] ∧ ∃X.(⋀p∈P xp ↔ p(X)).
The key point of IC3+IA is to use an abstract version of the check (1) to prove that
an abstract clause c is inductive relative to the abstract frame F :
Proof. Suppose s |= AbsRelInd(F, T, c, P). Let us denote with t and t the projections
of s respectively over X ∪ X and over X ∪ X . Then t |= T and therefore t |= T. Since
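The formal definition of AbsRelInd is cut off in this excerpt. The Z3-based sketch below shows one reading of it that is consistent with condition 3 of Lemma 1 further down: the concrete transition relation is evaluated on fresh copies of the state variables, which are forced to agree with the abstract-side variables only on the predicates in P (via EQ_P). The toy system and all names are our own assumptions, not the paper's definition.

```python
from z3 import Int, And, Not, Solver, substitute, unsat

x, xp = Int("x"), Int("xp")          # abstract-side variables X and X'
xb, xbp = Int("xb"), Int("xbp")      # concrete copies Xbar and Xbar'
T = xbp == xb + 1                    # concrete transition T(Xbar, Xbar')
preds = [lambda v: v >= 0, lambda v: v >= 5]   # P

def eq_p(a, b):
    # EQ_P(a, b): a and b agree on every predicate of P.
    return And(*[p(a) == p(b) for p in preds])

def abs_rel_ind(F, c):
    s = Solver()
    s.add(F, c, eq_p(x, xb), T, eq_p(xp, xbp), Not(substitute(c, (x, xp))))
    return s.check() == unsat

F0 = And(x >= 0, Not(x >= 5))        # P-formula abstracting the initial states x = 0
c = x >= 0                           # a P-clause
print(abs_rel_ind(F0, c))            # True: c is inductive relative to F0 abstractly
```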
The IC3+IA algorithm is shown in Figure 2. IC3+IA has the same structure as the
IC3 algorithm described in [11]. Additionally, it keeps a set of predicates P, which are used
to compute new clauses. The only points where IC3+IA differs from IC3 (shown in
red in Fig. 2) are in picking P-cubes instead of concrete states, the use of AbsRelInd
instead of RelInd, and in the fact that a spurious counterexample may be found and, in
that case, new predicates must be added.
More specifically, the algorithm consists of a loop, in which each iteration is divided
into the blocking and the propagation phase. The blocking phase starts by picking a cube
c of predicates representing an abstract state in the last frame violating the property.
This is recursively blocked along the trace by checking if AbsRelInd(Fi−1 , T, ¬c, P)
Fig. 2. High-level description of IC3+IA (with changes wrt. the Boolean IC3 in red)
to either find a real counterexample or to refine the abstraction, adding new predicates
to P. Technically, IC3+IA finds a set of counterexamples π = (s0 , 0); . . . ; (sk , k) in-
stead of a single counterexample, as described in [7] (i.e. this behaviour depends on the
generalization of a cube performed by ternary simulation or don’t care detection). We
simulate π as usual via bounded model checking.
Formally, we encode all the paths of S
up to k steps restricted to π with: I(X 0 ) ∧ ⋀i<k T (X i , X i+1 ) ∧ ¬P (X k ) ∧ ⋀i≤k si (X i ).
If the formula is satisfiable, then there exists a concrete counterexample that witnesses
S |= P , otherwise π is spurious and we refine the abstraction adding new predicates.
The refine(I, T, P, π) procedure is orthogonal to IC3+IA, and can be carried out with
several techniques, like interpolation, unsat core extraction or weakest precondition, for
which there is a wide literature. The only requirement of the refinement is to remove the
spurious counterexamples π. In our implementation we used interpolation to discover
predicates, similarly to [15].
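A sketch of this simulation check, with an invented toy system and toy abstract cubes, assuming Z3's Python API (and including the ¬P(X^k) conjunct as in our reading of the encoding above):

```python
from z3 import Int, Not, Solver, sat

k = 2
xs = [Int(f"x{i}") for i in range(k + 1)]      # X^0 .. X^k
init = xs[0] == 0
steps = [xs[i + 1] == xs[i] + 1 for i in range(k)]

def prop(v):                                   # property P
    return v <= 10

cubes = [xs[0] >= 0, xs[1] >= 0, xs[2] >= 0]   # abstract states s_0 .. s_k (toy)

s = Solver()
s.add(init, *steps, *cubes, Not(prop(xs[k])))
print("concrete counterexample" if s.check() == sat else "spurious: refine")
```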
Also, note that in our approach the set of predicates increases monotonically after a
refinement (i.e. we always add new predicates to the existing set of predicates). Thus,
the transition relation is monotonically strengthened (i.e. since P ⊆ P′ , T̂P′ → T̂P ). This
allows us to keep all the clauses in the IC3+IA frames after a refinement, enabling a
fully incremental approach.
3.4 Correctness
Lemma 1 (Invariants). The following conditions are invariants of IC3+IA:
1. F0 = I;
2. for all i < k, Fi |= Fi+1 ;
3. for all i < k, Fi (X) ∧ EQP (X, X̄) ∧ T (X̄, X̄′ ) ∧ EQP (X′ , X̄′ ) |= Fi+1 (X′ );
4. for all i < k, Fi |= P .
Proof. Condition 1 holds, since initially F0 = I, and F0 is never changed. We prove
that the conditions (2-4) are loop invariants for the main IC3+IA loop (line 4). The
invariant conditions trivially hold when entering the loop.
Then, the invariants are preserved by the inner loop at line 5. The loop may change the
content of a frame Fi+1 adding a new clause c while recursively blocking a cube (p, i +
1). c is added to Fi+1 if the abstract relative inductive check AbsRelInd(Fi , T, c, P)
holds. Clearly, this preserves the conditions 2-3. In the loop the set of predicates P may
change at line 8. Note that the invariant conditions still hold in this case. In particular,
3 holds because if P ⊆ P′ , then EQP′ |= EQP . When the inner loop ends, we are
guaranteed that Fk |= P̂P holds. Thus, condition 4 is preserved when a new frame is added
to the abstraction in line 10. Finally, the propagation phase clearly maintains all the in-
variants (2-4), by the definition of abstract relative induction AbsRelInd(Fi , T, c, P ).
Lemma 2. If IC3+IA (I, T, P, P) returns true, then ŜP |= P̂P .
Proof. The invariant conditions of the IC3 algorithm hold for the abstract frames:
1) F̂0 = I; and, for all i < k, 2) F̂i |= F̂i+1 ; 3) F̂i ∧ T̂ |= F̂i+1′ ; and 4) F̂i |= P̂ .
Conditions 1), 2), and 4) follow from Lemma 1, since I, P , and Fi are P-cubes.
Condition 3) follows from Lemma 1, since T̂ = ∃X̄, X̄′ . EQP (X, X̄) ∧ T (X̄, X̄′ ) ∧
EQP (X′ , X̄′ ) by definition.
Proof. If IC3+IA (I, T, P, P) returns true, then ŜP |= P̂P by Lemma 2, and thus
S |= P . If IC3+IA (I, T, P, P) returns false, then the simulation of the abstract coun-
terexample in the concrete system succeeded, and thus S ̸|= P .
Lemma 3 (Abstract counterexample). If IC3+IA finds a counterexample π, then π
is a path of S violating P.
Proof. Let us consider the case in which, at a certain iteration of the main loop, P is as
defined in the premises of the theorem. At every following iteration of the loop, IC3+IA
either finds an abstract counterexample π or strengthens a frame Fi with a new P-clause.
The first case is not possible, since, by Lemma 3, π would be a path of S violating the
property. Therefore, at every iteration, IC3+IA strengthens some frame with a new P-
clause. Since the number of P-clauses is finite and, by Lemma 1, for all i, Fi |= Fi+1 ,
IC3+IA will eventually find that Fi = Fi+1 for some i and return true.
4 Related Work
This work combines two lines of research in verification, abstraction and IC3.
Among the existing abstraction techniques, predicate abstraction [12] has been suc-
cessfully applied to the verification of infinite-state transition systems, such as soft-
ware [20]. Implicit abstraction [23] was first used with k-induction to avoid the explicit
computation of the abstract system. In our work, we exploit implicit abstraction in IC3
to avoid theory-specific generalization techniques, widening the applicability of IC3 to
transition systems expressed over some background theories. Moreover, we provided
the first integration of implicit abstraction in a CEGAR loop.
The IC3 [5] algorithm has been widely applied to the hardware domain [11,6] to
prove safety and also as a backend to prove liveness [4]. In [24], IC3 is combined with
a lazy abstraction technique in the context of hardware verification. The approach has
some similarities with our work, but it is limited to Boolean systems, it uses a “visible
variables” abstraction rather than PA, and applies a modified concrete version of IC3
for refinement.
Several approaches adapted the original IC3 algorithm to deal with infinite-state
systems [7,16,18,25]. The techniques presented in [7,16] extend IC3 to verify systems
described in the linear real arithmetic theory. In contrast to both approaches, we do
not rely on theory-specific generalization procedures, which may be expensive (such as
quantifier elimination [7]) or may hinder some of the IC3 features, like generalization
(e.g. the interpolant-based generalization of [16] does not exploit relative induction).
Moreover, IC3+IA searches for a proof in the abstract space. The approach presented
in [18] is restricted to timed automata since it exploits the finite partitioning of the
region graph. While we could restrict the set of predicates that we use to regions, our
technique is applicable to a much broader class of systems, and it also allows us to apply
conservative abstractions. IC3 was also extended to the bit-vector theory in [25] with
an ad-hoc extension that may not handle some bit-vector operators efficiently. Our
approach, instead, is not specific to bit-vectors.
5 Experimental Evaluation
We have implemented the algorithm described in the previous section in the SMT exten-
sion of IC3 presented in [7]. The tool uses MathSAT [8] as its backend SMT solver, and
takes as input either a symbolic transition system or a system with an explicit control-
flow graph (CFG), in the latter case invoking a specialized “CFG-aware” variant of
IC3 (TreeIC3, also described in [7]). The discovery of new predicates for abstraction
refinement is performed using the interpolation procedures implemented in MathSAT,
following [15]. In this section, we experimentally evaluate the effectiveness of
our new technique. We will call our implementation of the various algorithms as fol-
lows: IC3(LRA) is the “concrete” IC3 extension for Linear Rational Arithmetic (LRA)
as presented in [7]; TreeIC3+ITP(LRA) is the CFG-based variant of [7], also work-
ing only over LRA, and exploiting interpolants whenever possible2 ; IC3+IA(T) is IC3
with Implicit Abstraction for an arbitrary theory T; TreeIC3+IA(T) is the CFG-based
IC3 with Implicit Abstraction for an arbitrary theory T.
All the experiments have been performed on a cluster of 64-bit Linux machines with
a 2.7 Ghz Intel Xeon X5650 CPU, with a memory limit set to 3Gb and a time limit of
1200 seconds (unless otherwise specified). The tools and benchmarks used in the ex-
periments are available at https://es.fbk.eu/people/griggio/papers/
tacas14-ic3ia.tar.bz2.
In the first part of our experiments, we evaluate the impact of Implicit Abstraction for
the performance of IC3 modulo theories. In order to do so, we compare IC3+IA(LRA)
and T REE IC3+IA(LRA) against IC3(LRA) and T REE IC3+ITP(LRA) on the same
set of benchmarks used in [7], expressed in the LRA theory. We also compare both
2 See [7] for more details.
[Fig. 3. LRA results: two scatter plots (IC3+IA(LRA) and TreeIC3+IA(LRA) on the y-axes, run times in seconds, 0.1 to 1000) and a plot of the number of solved instances, with the following summary table.]

Algorithm/Tool        # solved   Tot time
IC3+IA(LRA)              82        5836
TreeIC3+IA(LRA)          75        8825
TreeIC3+ITP(LRA)         70       10478
Z3                       66        2923
IC3(LRA)                 62        9637
variants against the SMT extension of IC3 for LRA presented in [16] and implemented
in the Z 3 SMT solver.3
The results are reported in Figure 3. In the scatter plots at the top, safe instances
are shown as blue squares, and unsafe ones as red circles. The plot at the bottom re-
ports the number of solved instances and the total accumulated execution time for
each tool. From the results, we can clearly see that using abstraction has a very sig-
nificant positive impact on performance. This is true for both the fully symbolic and
the CFG-based IC3, but it is particularly important in the fully symbolic case: not only
IC3+IA(LRA) solves 20 more instances than IC3(LRA), but it is also more than one
order of magnitude faster in many cases, and there is no instance that IC3(LRA) can
solve but IC3+IA(LRA) can’t. In fact, Implicit Abstraction is so effective for these
benchmarks that IC3+IA(LRA) outperforms also T REE IC3+IA(LRA), even though
IC3(LRA) is significantly less efficient than T REE IC3+ITP(LRA). One of the rea-
sons for the smaller performance gain obtained in the CFG-based algorithm might be
3 We used the Git revision 3d910028bf of Z3.
that T REE IC3+ITP(LRA) already tries to avoid expensive quantifier elimination op-
erations whenever possible, by populating the frames with clauses extracted from in-
terpolants, and falling back to quantifier elimination only when this fails (see [7] for
details). Therefore, in many cases T REE IC3+ITP(LRA) and T REE IC3+IA(LRA)
end up computing very similar sets of clauses. However, implicit abstraction still
helps significantly in many instances, and there is only one problem that is solved by
T REE IC3+ITP(LRA) but not by T REE IC3+IA(LRA). Moreover, both abstraction-
based algorithms outperform all the other ones, including Z 3.
We also tried a traditional CEGAR approach based on explicit predicate abstraction,
using a bit-level IC3 as model checking algorithm and the same interpolation procedure
of IC3+IA(LRA) for refinement. As we expected, this configuration ran out of time or
memory on most of the instances, and was able to solve only 10 of them.
Finally, we did a preliminary comparison with a variant of IC3 specific for timed
automata, ATMOC [18]. We randomly selected a subset of the properties provided with
ATMOC, ignoring the trivial ones (i.e. properties that are 1-step inductive or with a coun-
terexample of length < 3). IC3+IA(LRA) performs very well also in this case, solving
100 instances in 772 seconds, while ATMOC solved 41 instances in 3953 seconds (Z 3
and IC3(LRA) solved 100 instances in 1535 seconds and 46 instances in 3347 seconds
respectively). For lack of space we do not report the plots.
Impact of Number of Predicates. The refinement step may introduce more predicates
than those actually needed to rule out a spurious counterexample (e.g. the interpolation-
based refinement adds all the predicates found in the interpolant). In principle, such re-
dundant predicates might significantly hurt performance. Using the implicit abstraction
framework, however, we can easily implement a procedure that identifies and removes
(a subset of) redundant predicates after each successful refinement step. Suppose that
IC3+IA finds a spurious counterexample trace π = (s0 , 0); . . . ; (sk , k) with the set of
predicates P, and that refine(I, T, P, π) finds a set Pn of new predicates. The reduction
procedure exploits the encoding of the set of paths of the abstract system SP∪Pn up to
k steps, BMCkP∪Pn . If P ∪ Pn are sufficient to rule out the spurious counterexample,
BMCkP∪Pn is unsatisfiable. We ask the SMT solver to compute the unsatisfiable core of
BMCkP∪Pn , and we keep only the predicates of Pn that appear in the unsatisfiable core.
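The sketch below shows the mechanics of such an unsat-core-based reduction using Z3 assumption literals. The BMC formula and the constraints attached to the candidate predicates are toy placeholders; only the marker/unsat-core pattern is meant to be representative.

```python
from z3 import Int, Bool, Implies, And, Solver, unsat

x0, x1 = Int("x0"), Int("x1")
# Toy stand-in for the (satisfiable) encoding of the spurious abstract trace.
base = And(x0 == 0, x1 < 0)

# Candidate new predicates, each with an invented placeholder constraint.
candidates = {
    "p_nonneg": Implies(x0 >= 0, x1 >= 0),
    "p_parity": (x0 % 2 == 0) == (x1 % 2 == 0),
}

s = Solver()
s.add(base)
markers = {name: Bool(name) for name in candidates}
for name, constraint in candidates.items():
    s.add(Implies(markers[name], constraint))   # guard each constraint by a marker

assert s.check(*markers.values()) == unsat      # the new predicates rule the trace out
core_names = {str(m) for m in s.unsat_core()}
print("kept:", [n for n in candidates if n in core_names])   # typically ['p_nonneg']
```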
In order to evaluate the effectiveness of this simple approach, we compare two ver-
sions of IC3+IA(LRA) with and without the reduction procedure. Perhaps surprisingly,
although the reduction procedure is almost always effective in reducing the total number
of predicates, the effects on the execution time are not very big. Although redundancy
removal seems to improve performance for the more difficult instances, overall the two
versions of IC3+IA(LRA) solve the same number of problems. However, this shows
that the algorithm is much less sensitive to the number of predicates added than ap-
proaches based on an explicit computation of the abstract transition relation e.g. via
All-SMT, which often show also in practice (and not just in theory) an exponential
increase in run time with the addition of new predicates. IC3+IA(LRA) manages to
solve problems for which it discovers several hundred predicates, reaching a peak
of 800 predicates and solving most of the safe instances with more than a hundred predi-
cates. These numbers are typically way out of reach for explicit abstraction techniques,
which blow up with a few dozen predicates.
[Fig. 4. BV results: plot of the number of solved instances (legend: TreeIC3+IA(BV), IC3+IA(BV), ABC-dprove, Tip, IC3ref, ABC-pdr), with the following summary table.]

Algorithm/Tool      # solved   Tot time
TreeIC3+IA(BV)         150       7056
IC3+IA(BV)             150      12753
ABC-dprove             120       4298
Tip                    119       6361
IC3ref                 110       9041
ABC-pdr                 75       6447
In the second part of our experimental analysis, we evaluate the effectiveness of Implicit
Abstraction as a way of applying IC3 to systems that are not supported by the methods
of [7], by instantiating IC3+IA(T) (and T REE IC3+IA(T)) over the theories of Linear
Integer Arithmetic (LIA) and of fixed-size bit-vectors (BV).
IC3 for BV. For evaluating the performance of IC3+IA(BV) and T REE IC3+IA(BV),
we have collected over 200 benchmark instances from the domain of software verifica-
tion. More specifically, the benchmark set consists of: all the benchmarks used in §5.1,
but using BV instead of LRA as background theory; the instances of the bitvector
set of the Software Verification Competition SV-COMP [2]; the instances from the test
suite of InvGen [13], a subset of which was used also in [25].
We have compared the performance of our tools with various implementations of
the Boolean IC3 algorithm, run on the translations of the benchmarks to the bit-level
Aiger format: the PDR implementation in the ABC model checker (ABC- PDR) [11],
T IP [22], and IC3 REF [3], the new implementation of the original IC3 algorithm as
described in [5]. Finally, we have also compared with the DPROVE algorithm of ABC
(ABC- DPROVE), which combines various different techniques for bit-level verification,
including IC3.4 We also tried Z 3, but it ran out of memory on most instances. It seems
that Z 3 uses a Datalog-based engine for BV, rather than PDR.
The results of the evaluation on BV are reported in Figure 4. As we can see, both
IC3+IA(BV) and T REE IC3+IA(BV) outperform the bit-level IC3 implementations.
In this case, the CFG-based algorithm performs slightly better than the fully-symbolic
one, although they both solve the same number of instances.
IC3 for LIA. For our experiments on the LIA theory, we have generated benchmarks
using the Lustre programs available from the webpage of the K IND model checker for
Lustre [14]. Since such programs do not have an explicit CFG, we have only evaluated
4 We used ABC version 374286e9c7bc, Tip 4ef103d81e, and IC3ref 8670762eaf.
[Fig. 5. LIA results: plot of the number of solved instances against total time (0.01 to 10000 s; legend: IC3+IA(LIA), Z3, pKind, Kind), with the following summary table.]

Algorithm/Tool   # solved   Tot time
IC3+IA(LIA)         933       2064
Z3                  875       1654
PKind               859        720
Kind                746       8493
IC3+IA(LIA), by comparing it with Z 3 and with the latest versions of K IND as well as
its parallel version P K IND [17].5 The results are summarized in Figure 5. Also in this
case, IC3+IA(LIA) outperforms the other systems.
6 Conclusion
In this paper we have presented IC3+IA, a new approach to the verification of infinite
state transition systems, based on an extension of IC3 with implicit predicate abstrac-
tion. The distinguishing feature of our technique is that IC3 works in an abstract state
space, since the counterexamples to induction and the relative inductive clauses are ex-
pressed with the abstraction predicates. This is enabled by the use of implicit abstraction
to check (abstract) relative induction. Moreover, the refinement in our procedure is fully
incremental, allowing to keep all the clauses found in the previous iterations.
The approach has two key advantages. First, it is very general: the implementations
for the theories of LRA, BV, and LIA have been obtained with relatively little effort.
Second, it is extremely effective, being able to efficiently deal with large numbers of
predicates. Both advantages are confirmed by the experimental results, obtained on a
wide set of benchmarks, also in comparison against dedicated verification engines.
In the future, we plan to apply the approach to other theories (e.g. arrays, non-linear
arithmetic) investigating other forms of predicate discovery, and to extend the technique
to liveness properties.
Acknowledgments. This work was carried out within the D-MILS project, which is
partially funded under the European Commission’s Seventh Framework Programme
(FP7).
5 We used version 1.8.6c of Kind and PKind. PKind differs from Kind because it runs k-induction and an automatic invariant generation procedure in parallel. We ran Kind with options “-compression -n 100000” and PKind with options “-compression -with-inv-gen -n 100000”.
References
1. Barrett, C.W., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In:
Handbook of Satisfiability, vol. 185, pp. 825–885. IOS Press (2009)
2. Beyer, D.: Second Competition on Software Verification - (Summary of SV-COMP 2013).
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 594–609. Springer,
Heidelberg (2013)
3. Bradley, A.: IC3ref, https://github.com/arbrad/IC3ref
4. Bradley, A., Somenzi, F., Hassan, Z., Zhang, Y.: An incremental approach to model checking
progress properties. In: Proc. of FMCAD (2011)
5. Bradley, A.R.: SAT-Based Model Checking without Unrolling. In: Jhala, R., Schmidt, D.
(eds.) VMCAI 2011. LNCS, vol. 6538, pp. 70–87. Springer, Heidelberg (2011)
6. Chockler, H., Ivrii, A., Matsliah, A., Moran, S., Nevo, Z.: Incremental formal verification of
hardware. In: Proc. of FMCAD (2011)
7. Cimatti, A., Griggio, A.: Software Model Checking via IC3. In: Madhusudan, P., Seshia,
S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 277–293. Springer, Heidelberg (2012)
8. Cimatti, A., Griggio, A., Schaafsma, B.J., Sebastiani, R.: The MathSAT5 SMT Solver. In:
Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 93–107. Springer,
Heidelberg (2013)
9. Cimatti, A., Mover, S., Tonetta, S.: Smt-based scenario verification for hybrid systems. For-
mal Methods in System Design 42(1), 46–66 (2013)
10. Clarke, E., Grumberg, O., Long, D.: Model Checking and Abstraction. ACM Trans. Program.
Lang. Syst. 16(5), 1512–1542 (1994)
11. Een, N., Mishchenko, A., Brayton, R.: Efficient implementation of property-directed reach-
ability. In: Proc. of FMCAD (2011)
12. Graf, S., Saı̈di, H.: Construction of Abstract State Graphs with PVS. In: Grumberg, O. (ed.)
CAV 1997. LNCS, vol. 1254, pp. 72–83. Springer, Heidelberg (1997)
13. Gupta, A., Rybalchenko, A.: InvGen: An efficient invariant generator. In: Bouajjani, A.,
Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 634–640. Springer, Heidelberg (2009)
14. Hagen, G., Tinelli, C.: Scaling Up the Formal Verification of Lustre Programs with SMT-
Based Techniques. In: Cimatti, A., Jones, R.B. (eds.) FMCAD, pp. 1–9. IEEE (2008)
15. Henzinger, T., Jhala, R., Majumdar, R., McMillan, K.: Abstractions from proofs. In: POPL,
pp. 232–244 (2004)
16. Hoder, K., Bjørner, N.: Generalized property directed reachability. In: Cimatti, A., Sebas-
tiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 157–171. Springer, Heidelberg (2012)
17. Kahsai, T., Tinelli, C.: Pkind: A parallel k-induction based model checker. In: Barnat, J.,
Heljanko, K. (eds.) PDMC. EPTCS, vol. 72, pp. 55–62 (2011)
18. Kindermann, R., Junttila, T., Niemelä, I.: SMT-based induction methods for timed systems.
In: Jurdziński, M., Ničković, D. (eds.) FORMATS 2012. LNCS, vol. 7595, pp. 171–187.
Springer, Heidelberg (2012)
19. Lahiri, S.K., Nieuwenhuis, R., Oliveras, A.: SMT techniques for fast predicate abstraction.
In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 424–437. Springer, Heidel-
berg (2006)
20. McMillan, K.L.: Lazy Abstraction with Interpolants. In: Ball, T., Jones, R.B. (eds.) CAV
2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006)
21. Sharygina, N., Tonetta, S., Tsitovich, A.: The synergy of precise and fast abstractions for
program verification. In: SAC, pp. 566–573 (2009)
22. Sorensson, N., Claessen, K.: Tip, https://github.com/niklasso/tip
23. Tonetta, S.: Abstract Model Checking without Computing the Abstraction. In: Cavalcanti,
A., Dams, D.R. (eds.) FM 2009. LNCS, vol. 5850, pp. 89–105. Springer, Heidelberg (2009)
24. Vizel, Y., Grumberg, O., Shoham, S.: Lazy abstraction and SAT-based reachability in hard-
ware model checking. In: Cabodi, G., Singh, S. (eds.) FMCAD, pp. 173–181. IEEE (2012)
25. Welp, T., Kuehlmann, A.: QF BV model checking with property directed reachability. In:
Macii, E. (ed.) DATE, pp. 791–796 (2013)
SMT-Based Verification of Software Countermeasures
against Side-Channel Attacks
1 Introduction
Security analysis of the hardware and software systems implemented in embedded de-
vices is becoming increasingly important, since an adversary may have physical access
to such devices and therefore can launch a whole new class of side-channel attacks,
which utilize secondary information resulting from the execution of sensitive algo-
rithms on these devices. For example, the power consumption of a typical embedded
device executing the instruction tmp=text⊕key depends on the value of the secret
key [12]. This value can be reliably deduced using a statistical method known as differ-
ential power analysis (DPA [10,19]). In recent years, many commercial systems in the
embedded space have shown weaknesses against such attacks [16,14,1].
A common mitigation strategy against such attacks is to use randomization techniques
to remove the statistical dependency between the sensitive data and the side-channel in-
formation. This can be done in multiple ways. Boolean masking, for example, uses an
XOR operation of a random number r with a sensitive variable a to obtain a masked (ran-
domized) variable: am = a ⊕ r [1,17]. Later, the sensitive variable can be restored by a
second XOR operation with the same random number: am ⊕ r = a. Other randomization-
based countermeasures have used additive masking (am = a + r mod n), multiplicative
masking (am = a ∗ r mod n), and application-specific code transformations such as
RSA blinding (am = a · r^e mod N).
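As a minimal, self-contained illustration of Boolean and additive masking (a Python sketch with hypothetical helper names, not taken from the paper):

import secrets

def boolean_mask(a, r):
    # Boolean masking: am = a XOR r.
    return a ^ r

def boolean_demask(am, r):
    # A second XOR with the same random number restores the sensitive value.
    return am ^ r

def additive_mask(a, r, n=256):
    # Additive masking: am = (a + r) mod n.
    return (a + r) % n

def additive_demask(am, r, n=256):
    return (am - r) % n

a = 0x3C                       # sensitive value (8-bit)
r = secrets.randbelow(256)     # uniformly random mask
assert boolean_demask(boolean_mask(a, r), r) == a
assert additive_demask(additive_mask(a, r), r) == a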
o1 = k ∧ (r1 ∧ r2)    o2 = k ∨ (r1 ∧ r2)    o3 = k ⊕ (r1 ∧ r2)    o4 = k ⊕ (r1 ⊕ r2)

k r1 r2 | o1 o2 o3 o4
0  0  0 |  0  0  0  0
0  0  1 |  0  0  0  1
0  1  0 |  0  0  0  1
0  1  1 |  0  1  1  0
1  0  0 |  0  1  1  1
1  0  1 |  0  1  1  0
1  1  0 |  0  1  1  0
1  1  1 |  1  1  0  1

Fig. 1. Masking examples: o1, o2, o3 are not perfectly masked, but o4 is perfectly masked
2 Preliminaries
In this section, we define the type of side-channel attacks considered in this paper and
review the notion of perfect masking.
Side-Channel Attacks. Following the notation used by Blömer et al. [4], we assume
that the program to be verified implements a function c ← enc(x, k), where x is the
plaintext, k is the secret key, and c is the ciphertext. Let I1 (x, k, r), I2 (x, k, r), . . .,
It (x, k, r) be the sequence of intermediate computation results inside the function,
where r is an s-bit random number in the domain {0, 1}^s. The purpose of using r is
to make all intermediate results statistically independent of the secret key (k).
When enc(x, k) is a linear function in the Boolean domain, masking and de-masking
are straightforward. However, when enc(x, k) is a non-linear function, masking and de-
masking often require a complete redesign of the implementation. This manual design
process is both labor intensive and error prone, and currently there is a lack of
automated tools to assess how secure a countermeasure really is.
We assume that an adversary knows the pair (x, c) of plaintext and ciphertext in
c ← enc(x, k). For each pair (x, c), the adversary also knows the joint distribution of at
most d intermediate computation results I1 (x, k, r), . . . , Id (x, k, r), through access to
some aggregated quantity such as the power dissipation. However, the adversary does
not have access to r, which is produced by a true random number generator. The goal
of the adversary is to compute the secret key (k). In embedded computing, for instance,
these are realistic assumptions. In their seminal work, Kocher et al. [10] demonstrated
that for d = 1 and 2, the sensitive data can be reliably deduced using a statistical method
known as differential power analysis (DPA).
Perfect Masking. Given a pair (x, k) of plaintext and secret key for the function
enc(x, k), and d intermediate results I1(x, k, r), . . . , Id(x, k, r), we use Dx,k(R) to
denote the joint distribution of ⟨I1, . . . , Id⟩, while assuming that the s-bit random num-
ber r is uniformly distributed in the domain {0, 1}^s. Following Blömer et al. [4], we
do not put restrictions on the technical capability of an adversary. As long as there is
information leak, we consider the implementation to be vulnerable.
Definition 1. Given an implementation of function enc(x, k) and a set of intermediate
results {Ii(x, k, r)}, we say that the implementation is order-d perfectly masked if, for
all d-tuples ⟨I1, . . . , Id⟩, we have
Dx,k(R) = Dx′,k′(R) for any two pairs (x, k) and (x′, k′).
The notion of perfect masking used here is more accurate than the notion of sensitivity [2].
There, an intermediate result is considered to be sensitive if (1) it depends on at least
one secret input and (2) it is independent of any random input. We have demonstrated
the difference between them using the example in Fig. 1, where o1,o2,o3,o4 are all
insensitive, but only o4 is perfectly masked. In general, if an intermediate result is per-
fectly masked, it is guaranteed to be insensitive. However, an insensitive intermediate
result may not be perfectly masked.
To check for violations of perfect masking, we need to decide whether there exists a
d-tuple ⟨I1, . . . , Id⟩ such that Dx,k(R) ≠ Dx′,k′(R) for some (x, k) and (x′, k′). Here,
the main challenge is to compute Dx,k(R). We will present our solution in Section 3.
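To make Definition 1 concrete, the following brute-force sketch (plain Python, no SMT solver; the Fig. 1 example has no plaintext input, so only the key bit varies) enumerates the distribution of each function over the mask bits and confirms that only o4 is perfectly masked:

from itertools import product

# The four candidate maskings from Fig. 1 (k is the key bit, r1 and r2 the mask bits).
FUNCS = {
    "o1": lambda k, r1, r2: k & (r1 & r2),
    "o2": lambda k, r1, r2: k | (r1 & r2),
    "o3": lambda k, r1, r2: k ^ (r1 & r2),
    "o4": lambda k, r1, r2: k ^ (r1 ^ r2),
}

def ones(f, k):
    # Number of mask assignments (r1, r2) for which f evaluates to logical 1.
    return sum(f(k, r1, r2) for r1, r2 in product((0, 1), repeat=2))

for name, f in FUNCS.items():
    verdict = "perfectly masked" if ones(f, 0) == ones(f, 1) else "NOT perfectly masked"
    print(name, verdict)          # only o4 is reported as perfectly masked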
In this work, we focus on verifying security-critical programs, e.g. those that im-
plement cryptographic algorithms, as opposed to arbitrary software programs. (Our
method would be too expensive for verifying general-purpose software.) In general,
the class of programs that we consider here does not have input-dependent control flow,
meaning that we can easily remove all the loops and function calls from the code using
standard loop unrolling and function inlining techniques. Furthermore, the program can
be transformed into a branch-free representation, where the if-else branches are merged.
Finally, since all variables are bounded integers, we can convert the program to a purely
Boolean program through bit-blasting. Therefore, in this paper, we shall present our
new verification method on the bit-level representation of a branch-free program. Our
goal is to verify that all intermediate bits of the program are perfectly masked.
Fig. 2. Example: a program and its graphic representation (⊕ denotes XOR; ∧ denotes AND)
keys, r1 and r2 are random variables with independent and uniform distribution in
{0, 1}, and c is the computation result. The objective of masking is to make the power
consumption of the device executing this code independent from the values of the secret
keys. This masking scheme originated from Blömer et al. [4]. The return value c is
logically equivalent to (k1 ∧ k2) ⊕ (r1 ∧ r2). The corresponding demasking function,
which is not shown in the figure, is c ⊕ (r1 ∧ r2). Therefore, demasking would produce
a result that is logically equivalent to the desired value (k1 ∧ k2).
Our method will determine if all the intermediate variables of the program are per-
fectly masked. We use the Clang/LLVM compiler to parse the input Boolean program
and construct the data-flow graph, where the root represents the output and the leaf
nodes represent the input bits. Each internal node represents the result of a Boolean
operation of one of the following types: AND, OR, NOT, and XOR. For the example
in Fig. 2, our method starts by parsing the program and creating a graph representation.
This is followed by traversing the graph in a topological order, from the program inputs
(leaf nodes) to the return value (root node). For each internal node, which represents
an intermediate result, we check whether it is perfectly masked. The order in which we
check the internal nodes is as follows: n1, n2, n3, n4, n5, n6, n7, n8, and finally, c.
The Theory. As the starting point, we mark all the plaintext bits in x as public, the
key bits in k as secret, and the mask bits in r as random. Then, for each intermediate
computation result I(x, k, r) of the program, we check whether it is perfectly masked.
Following Definition 1, we formulate this check as a satisfiability problem as follows:
∃x. ∃k, k′. ( Σ_{r∈{0,1}^s} I(x, k, r) ≠ Σ_{r∈{0,1}^s} I(x, k′, r) )
Here, x represents the plaintext bits, k and k′ represent two different valuations of the
key bits, and r is the random number uniformly distributed in the domain {0, 1}^s, where
s is the number of random bits. For any fixed (x, k, k′),
– Σ_{r∈{0,1}^s} I(x, k, r) is the number of satisfying assignments for I(x, k, r), and
– Σ_{r∈{0,1}^s} I(x, k′, r) is the number of satisfying assignments for I(x, k′, r).
Assuming that r is uniformly distributed in the domain {0, 1}^s, these two summations
indicate the probabilities of I being logical 1 under the two key values k and k′.
If the above formula is satisfiable, there exists a plaintext x and two different keys
(k, k′) such that the distribution of I(x, k, r) differs from the distribution of I(x, k′, r).
In other words, some information about the secret key is leaked through I, and therefore
we say that I is not perfectly masked. If the above formula is unsatisfiable, then such
information leakage is not possible, and therefore we say that I is perfectly masked.
Another way to understand the above satisfiability problem is to look at its negation.
Instead of checking the satisfiability of the formula above, we check the validity of the
formula below:
∀x. ∀k, k′. ( Σ_{r∈{0,1}^s} I(x, k, r) = Σ_{r∈{0,1}^s} I(x, k′, r) )
If this formula is valid – meaning that it holds for all valuations of x, k and k′ – then
we say that I is perfectly masked.
The Encoding. Let Φ denote the SMT formula to be created for checking intermediate
result I(x, k, r). Let s be the number of random bits in r. Our encoding method ensures
that Φ is satisfiable if and only if I is not perfectly masked. We define Φ as follows:
Φ := ( ⋀_{r=0}^{2^s−1} Ψ_k^r ) ∧ ( ⋀_{r=0}^{2^s−1} Ψ_{k′}^r ) ∧ Ψ_b2i ∧ Ψ_sum ∧ Ψ_diff ,
where the subformulas are defined as follows:
– Program logic (Ψ_k^r): Each subformula Ψ_k^r encodes a copy of the functionality of
I(x, k, r), with the random variable r set to a concrete value in {0, . . . , 2^s − 1} and
the key set to value k or k′. All copies share the same plaintext variable x.
– Boolean-to-int (Ψ_b2i): It encodes the conversion of the Boolean-valued output of
I(x, k, r) to an integer (true becomes 1 and false becomes 0), so that the integer
values can be summed up later to compute Σ_{r∈{0,1}^s} I(x, k, r).
– Sum-up-the-1s (Ψ_sum): It encodes the two summations of the logical 1s in the out-
puts of the 2^s program logic copies, one for I(x, k, r) and the other for I(x, k′, r).
– Different sums (Ψ_diff): It asserts that the two summations should have different
results.
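As an illustration only, the sketch below prototypes this encoding with Z3's Python API (the paper's implementation uses Yices); the function perfect_masking_check and the way the bit-level function I is passed in are assumptions of this sketch, not the authors' interface, and the plaintext is omitted for brevity:

from itertools import product
from z3 import And, BoolVal, Bools, If, Solver, Sum, Xor, sat

def perfect_masking_check(I, num_keys, num_rands):
    # I maps (key_bits, rand_bits), two lists of Z3 Boolean terms, to a Z3 Boolean term.
    # Returns a model with two key valuations that yield different distributions,
    # or None if the node is perfectly masked.
    keys_a = Bools(" ".join("k%d" % i for i in range(num_keys)))
    keys_b = Bools(" ".join("k%d_" % i for i in range(num_keys)))
    ones_a, ones_b = [], []
    # Psi_k^r: one copy of the program logic per concrete value of r;
    # Psi_b2i: each Boolean output is converted to the integer 0 or 1.
    for bits in product((False, True), repeat=num_rands):
        r = [BoolVal(b) for b in bits]
        ones_a.append(If(I(keys_a, r), 1, 0))
        ones_b.append(If(I(keys_b, r), 1, 0))
    s = Solver()
    # Psi_sum and Psi_diff: the two summations must differ.
    s.add(Sum(ones_a) != Sum(ones_b))
    return s.model() if s.check() == sat else None

# Node n8 of the running example: (r1 & (k2 xor r2)) xor (r2 & (k1 xor r1)).
def n8(key_bits, rand_bits):
    k1, k2 = key_bits
    r1, r2 = rand_bits
    return Xor(And(r1, Xor(k2, r2)), And(r2, Xor(k1, r1)))

print(perfect_masking_check(n8, num_keys=2, num_rands=2))  # satisfiable: not perfectly masked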
The Running Example. Consider node n8 in Fig. 2 as the node under verification. The
function is defined as n8 = (r1 & (k2 xor r2)) xor (r2 & (k1 xor r1)).
The SMT formula that our method generates – by instantiating r1r2 to 00, 01, 10,
and 11 – is the conjunction of all of the formulas listed below:
n8_1 = (0 & (k2 xor 0)) xor (0 & (k1 xor 0)) // four copies of I(k, r)
n8_2 = (0 & (k2 xor 1)) xor (1 & (k1 xor 0))
n8_3 = (1 & (k2 xor 0)) xor (0 & (k1 xor 1))
n8_4 = (1 & (k2 xor 1)) xor (1 & (k1 xor 1))
n8_1’ = (0 & (k2’ xor 0)) xor (0 & (k1’ xor 0)) // four copies of I(k’,r)
n8_2’ = (0 & (k2’ xor 1)) xor (1 & (k1’ xor 0))
n8_3’ = (1 & (k2’ xor 0)) xor (0 & (k1’ xor 1))
n8_4’ = (1 & (k2’ xor 1)) xor (1 & (k1’ xor 1))
(( num1 = 1 ) & n8_1 ) | ((num1=0) & not n8_1 ) // convert bool to integer
(( num2 = 1 ) & n8_2 ) | ((num2=0) & not n8_2 )
(( num3 = 1 ) & n8_3 ) | ((num3=0) & not n8_3 )
(( num4 = 1 ) & n8_4 ) | ((num4=0) & not n8_4 )
(( num1’ = 1 ) & n8_1’) | ((num1’=0) & not n8_1’) // convert bool to integer
(( num2’ = 1 ) & n8_2’) | ((num2’=0) & not n8_2’)
(( num3’ = 1 ) & n8_3’) | ((num3’=0) & not n8_3’)
(( num4’ = 1 ) & n8_4’) | ((num4’=0) & not n8_4’)
(num1 + num2 + num3 + num4) != (num1’ + num2’ + num3’ + num4’) // the check
We solve the conjunction of the above formulas using an off-the-shelf SMT solver
called Yices [6]. In this particular example, the formula is satisfiable. For example, one
of the satisfying assignments is k1k2=00 and k1’k2’=01. We shall show in the next
section that, when the key bits are 00, the probability for n8 to be logical 1 is 0%; but
when the key bits are 01, the probability is 50%. This makes it vulnerable to first-order
DPA attacks. Therefore, n8 is not perfectly masked.
Our encoding can be easily extended to implement this new check. In practice, most
countermeasures assume that the adversary has access to the side-channel leakage of
either one or two intermediate results, which corresponds to first-order and second-
order attacks. In our actual implementation, we handle both first-order and second-order
attacks. In our experiments, we also evaluate our new method on verifying countermea-
sures against both first-order and second-order attacks (where d = 1 or 2).
Consider the automated verification of our running example in Fig. 2. For each internal
node I, we first identify all the transitive fan-in nodes of I in the program to form a code
region for the subsequent SMT solver based analysis. In the worst case, the extracted
code region should start from the instruction (node) to be verified, and cover all the
transitive fan-in nodes on which it depends. Then, the extracted code region is given
to our SMT based verification procedure, whose goal is to prove (or disprove) that the
node is statistically independent of the secret key.
Following a topological order, our method starts with node n1, which is defined in
Line 3 of the program in Fig. 2. The extracted code region consists of n1 = k1 ⊕ r1
itself. Since it involves only one key and one random variable in the XOR operation,
a simple static analysis can prove that it is perfectly masked. Therefore, although we
could have verified it using SMT, we skip it for efficiency reasons. Such simple static
analysis is able to prove that n2, n4 and n6 are also perfectly masked.
Next, we check if n3 is perfectly masked. The truth table of n3 is shown in Fig. 4
(left). In all four valuations of k1 and k2, the probability of n3 being logical 1 is 25%.
Therefore, n3 is perfectly masked. When we apply our SMT based method, the solver
is not able to find any satisfying assignment for k1 and k2 under which the probability
distributions of n3 are different. Note that our method does not check the probability of
the output being logical 0, since having an equal probability distribution of logical 1 is
equivalent to having an equal probability distribution for logical 0.
k1 k2 r1 r2 | n3 | n8 | c
0  0  0  0  |  0 |  0 | 0
0  0  0  1  |  0 |  0 | 0
0  0  1  0  |  0 |  0 | 0
0  0  1  1  |  1 |  0 | 1
0  1  0  0  |  0 |  0 | 0
0  1  0  1  |  0 |  0 | 0
0  1  1  0  |  1 |  1 | 0
0  1  1  1  |  0 |  1 | 1
1  0  0  0  |  0 |  0 | 0
1  0  0  1  |  1 |  1 | 0
1  0  1  0  |  0 |  0 | 0
1  0  1  1  |  0 |  1 | 1
1  1  0  0  |  1 |  0 | 1
1  1  0  1  |  0 |  1 | 1
1  1  1  0  |  0 |  1 | 1
1  1  1  1  |  0 |  0 | 0

Fig. 4. The truth tables for internal nodes n3, n8, and c of the example program in Fig. 2
The verification steps for nodes n5 and n7 are similar to that of n3 – all of them are
perfectly masked.
Next, we check if n8 is perfectly masked. The proof would fail because, as shown in
the truth table in Fig. 4 (middle), the probability for n8 to be logical 1 is not the same
under different valuations of the keys. For example, if the keys are 00, then n8 would
be 0 regardless of the values of the random variables. Recall that we have shown the
detailed SMT encoding for n8 in Section 3. Using our method, the solver can quickly
find two configurations of the key bits (for example, 00 and 11) under which the prob-
abilities of n8 being logical 1 are different. Therefore, n8 is not perfectly masked.
The remaining node is c, whose truth table is shown in Fig. 4 (right). Similar to n8,
our SMT based method will be able to show that it is not perfectly masked.
It is worth pointing out that the result of applying the Sleuth method [2] would have
been different. Although n8 and c are clearly vulnerable to first-order DPA attacks, the
Sleuth method, based on the notion of sensitivity, would have classified them as “se-
curely masked.” This demonstrates a major advantage of our new method over Sleuth.
I2 := I1 ⊕ de-mask(x, k, r)
mask2 := rnew ⊕ mask(x, k, r) ⊕ de-mask(x, k, r)
Before verifying mask2, if we have already proved that I2 is perfectly masked, and rnew
is a new random variable not used elsewhere, then for the purpose of checking mask2
only, we can substitute I2 with rnew while verifying mask2.
Fig. 5. Incremental verification: applying the SMT based analysis to a small fan-in region only
Due to associativity of the ⊕ operator, reordering the masking and demasking oper-
ations would not change the logical result. For example, in Fig. 5, the instruction being
verified is in mask2(). Since the newly added random variable rnew is not used inside
mask() or de-mask(), or in the support of I3 , we can replace the entire fan-in cone of
I2 by a new random variable rdummy (or even rnew itself) while verifying mask2().
We shall see in the experimental results section that such opportunities are abundant in
real-world applications. Therefore, in this subsection, we present a sound algorithm for
extracting a small code region from the fan-in cone of the node under verification.
Our algorithm relies on some auxiliary data structures associated with the current
node i under verification: supportV[i], uniqueM[i] and perfectM[i].
– supportV[i] is the set of inputs in the support of the function of node i.
– uniqueM[i] is the set of random inputs that each reaches i along only one path.
– perfectM[i] is a subset of uniqueM[i] where each random variable, by itself, guar-
antees that node i is perfectly masked.
These tables can be computed by a traversal of the program nodes as described in Algo-
rithm 1. For example, for node I1 in Fig. 5, supportV[I1 ]= {x, k, r, rnew }, uniqueM[I1 ]
= {r, rnew }, and perfectM[I1 ]= {rnew }, assuming r is not repeated in the mask block.
For node I2 , we have supportV[I2 ]= {x, k, r, rnew }, uniqueM[I2 ]= {rnew }, since r
reaches I2 twice and so may have been de-masked, and perfectM[I2 ]= {rnew }.
Algorithm 1. Computing the auxiliary tables for all internal nodes of the program.
1. supportV[i] ← { v } for each input node i with variable v
2. uniqueM[i] ← { v } for each input node i with random mask variable v
3. perfectM[i] ← { v } for each input node i with random mask variable v
4. for each (internal node i in a leaf-to-root topological order) {
5. L ← LeftChild(i)
6. R ← RightChild(i)
7. supportV[i] ← supportV[L] ∪ supportV[R]
8. uniqueM[i] ← (uniqueM[L] ∪ uniqueM[R]) \ (supportV[L] ∩ supportV[R])
9. if (i is an XOR node)
10. perfectM[i] ← uniqueM[i] ∩ (perfectM[L]∪perfectM[R])
11. else
12. perfectM[i] ← { }
13. }
Our idea of extracting a small code region for SMT based analysis is formalized in
Algorithm 2. Given the node i under verification, and uniqueM[i] as the set of random
variables that each reaches i along only one path, we call GetRegion(i, uniqueM[i])
to compute the region. Inside GetRegion, uniqueM[i] is passed as the parameter uniqueMATi.
More specifically, we start by checking each transitive fan-in node n of the current node
i. If n is a leaf node (Line 2), then we add n and the input variable v to the region. If
n is not a leaf node, we check if there is a random variable r ∈ uniqueMATi that, by
itself, can perfectly mask node n (Line 4). In Fig. 5, for example, rnew, by itself, can
uniformly mask node I2. If such a random variable r exists, then we add the pair (n, r) to
the region and return – skipping the entire fan-in cone of n. Otherwise, we recursively
invoke GetRegion to traverse the two child nodes of n.
Algorithm 2. Extracting a code region for node i for the subsequent SMT based analysis.
1. GetRegion(n, uniqueMATi) {
2. if (n is an input node with variable v)
3. region.add ← (n, v)
4. else if (∃ random variable r ∈ perfectM[n] ∩ uniqueMATi)
5. region.add ← (n, r)
6. else
7. region.add ← (n, {})
8. region.add ← GetRegion(n.Left, uniqueMATi)
9. region.add ← GetRegion(n.Right, uniqueMATi)
10. return region
11. }
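For concreteness, the following plain-Python sketch mirrors Algorithms 1 and 2 over a hypothetical DAG representation (nodes stored as tuples, binary gates only, unary NOT omitted for brevity); it illustrates the table computation and region extraction, not the actual implementation:

def compute_tables(nodes, order, random_vars):
    # Algorithm 1: nodes[name] is ('input', var) for leaves or (op, left, right)
    # with op in {'xor', 'and', 'or'}; order is a leaf-to-root topological order.
    supportV, uniqueM, perfectM = {}, {}, {}
    for i in order:
        node = nodes[i]
        if node[0] == "input":
            v = node[1]
            supportV[i] = {v}
            uniqueM[i] = {v} if v in random_vars else set()
            perfectM[i] = {v} if v in random_vars else set()
        else:
            op, L, R = node
            supportV[i] = supportV[L] | supportV[R]
            uniqueM[i] = (uniqueM[L] | uniqueM[R]) - (supportV[L] & supportV[R])
            perfectM[i] = uniqueM[i] & (perfectM[L] | perfectM[R]) if op == "xor" else set()
    return supportV, uniqueM, perfectM

def get_region(nodes, perfectM, n, uniqueMATi, region=None):
    # Algorithm 2: extract the code region for the node under verification;
    # the initial call is get_region(nodes, perfectM, i, uniqueM[i]).
    if region is None:
        region = []
    node = nodes[n]
    candidates = perfectM[n] & uniqueMATi
    if node[0] == "input":                 # leaf node: add the input variable
        region.append((n, node[1]))
    elif candidates:                       # some r perfectly masks n by itself
        region.append((n, next(iter(candidates))))
    else:                                  # otherwise recurse into both children
        region.append((n, None))
        get_region(nodes, perfectM, node[1], uniqueMATi, region)
        get_region(nodes, perfectM, node[2], uniqueMATi, region)
    return region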
The Overall Algorithm. Algorithm 3 shows the overall flow of our incremental verifi-
cation method. Given the program and the lists of secret, random and plaintext variables,
our method systematically scans through all the internal nodes from the inputs to the
return value. For each node i, our method first extracts a small code region (Line 4).
Then, we invoke the SMT based analysis. If the node is not perfectly masked, we add it
to the list of bad nodes.
– Both operands are perfectly masked. This guarantees that we find all the imperfectly
masked instructions that result from an initial imperfectly masked instruction.
To further optimize the performance of Algorithm 3, we implement a method for
identifying random variables that are don’t cares for the node i under verification, and
use the information to reduce the cost of the SMT based analysis. Prior to the SMT
encoding, for each random variable r ∈supportV[i], we check if the value of r can
ever affect the output of i. If the answer is no, then r is a don’t care. During our SMT
encoding, we will set r to logical 0 rather than treat r as a random variable, to reduce
the size of the SMT formula. This can lead to a significant performance improvement
since the formula size is exponential in the number of relevant random variables.
We check whether r ∈ supportV[i] is a don't care for node i by constructing a SAT
formula and solving it using the SMT solver. The SAT formula is defined as follows:
Ψ_region^{r=0} ∧ Ψ_region^{r=1} ∧ Ψ_diffO ,
where Ψ_region^{r=0} encodes the program logic of the region with the random bit r set to
0, Ψ_region^{r=1} encodes the program logic of the region with the random bit r set to 1,
and Ψ_diffO asserts that the outputs of these two copies differ. If the above formula is
unsatisfiable, then r is a don't care for node i.
6 Experiments
We have implemented our method in a verification tool called SC Sniffer, based on the
LLVM compiler and the Yices SMT solver [6]. It runs in two modes: monolithic and
incremental. The monolithic mode applies our SMT based encoding to the entire fan-in
cone of each node in the program, whereas the incremental method tries to restrict the
SMT encoding to a localized region. In addition, we implemented the Sleuth method [2]
for experimental comparison. The main difference is that our method not only checks
whether a node is masked (as in Sleuth), but also checks whether it is perfectly masked,
i.e. it is statistically independent of the secret key.
We have evaluated our tool on some recently proposed countermeasures. Our exper-
iments were designed to answer the following research questions:
– How effective is our new method? We know that in theory, the new method is more
accurate than the Sleuth method. But does it have a significant advantage over the
Sleuth method in practice?
– How scalable is our new method, especially in verifying applications of realistic
code size and complexity? We have extended our SMT based method with incre-
mental verification. Is it effective in practice?
Table 1 shows the statistics of the benchmarks. Column 1 shows the name of each
benchmark example. Column 2 shows a short description of the implemented algo-
rithm. Column 3 shows the number of lines of code – here, each instruction is a bit
level operation. Column 4 shows the number of nodes that represent the intermediate
computation results. Columns 5-7 show the number of input bits that are the secret key,
the plaintext, and the random variable, respectively.
Table 1. The benchmark statistics: in addition to the program name and a short description, we
show the total lines of code, the numbers of intermediate nodes and the various inputs
Table 2. Experimental results: comparing our SC Sniffer method with the Sleuth method [2]
The benchmarks are classified into three groups. The first group of test cases (P1 to
P5) are taken from the Sleuth benchmark [2], all of which contain intermediate variables
that are not masked at all. More specifically, P1 is the masking key whitening code on
Page 12 of the Sleuth paper. P2 is the AES8 example, a smart card implementation of
AES resistant to power analysis, originated from Herbst et al. [8]. P3 is the code on
Page 13 of the Sleuth paper, also originated from Herbst et al. [8]. P4 is the code on
Page 18 of the Sleuth paper, originated from Messerges [13]. P5 is the code on Page 18
of the Sleuth paper, originated from Goubin [7].
The second group of test cases (P6 to P11) are examples where most of the interme-
diate variables are masked, but none of the masking schemes is perfect. P6 and P7 are
the two examples used by Blömer et al. [4] (on Page 7). P8 and P9 are the SHA3 MAC-
Keccak computation reordered examples, originated from Bertoni et al. [3] (Eq. 5.2 on
Page 46). P10 and P11 are two experimental masking schemes for the Chi function in
SHA3, none of which is perfectly masked.
The third group of test cases (P12 to P17) comes from the regeneration of MAC-
Keccak reference code submission to NIST in the SHA-3 competition [15]. There are a
total of 285k lines of Boolean operation code. The difference among these test cases is
that they are protected by various countermeasures, some of which are perfectly masked
(e.g. P12) whereas others are not.
Table 2 shows the experimental results run on a machine with a 3.4 GHz Intel i7-
2600 CPU, 4 GB RAM, and a 32-bit Linux OS. We have compared the performance
of three methods: Sleuth, New (monolithic), and New (incremental). Here, Sleuth is
the method proposed by Bayrak et al. [2], while the other two are our own method.
In this table, Column 1 shows the name of each test program. Columns 2-5 show the
results of running Sleuth, including whether the program passed the check, the number
of nodes that failed the check, and the total number of nodes checked. Columns 6-9 show the
results of running our new monolithic method. Here, mem-out means that the method
requires more than 4 GB of RAM. Columns 10-14 show the results of running our new
incremental method. Here, we also show the number of SMT based masking checks
made, which is often much smaller than the number of nodes checked, because many
of them are resolved by our static analysis.
First, the results show that our new algorithm is more accurate than Sleuth in deciding
whether a node is securely masked. Every node that failed the security check of Sleuth
would also fail the security check of our new method. However, there are many nodes
that passed the check of Sleuth, but failed the check of our new method. These are
the nodes that are masked, but their probability distributions are still dependent on the
sensitive inputs – in other words, they are not perfectly masked.
Second, the results show that our incremental method is significantly more scalable
than the monolithic method. On the first two groups of test cases, where the programs
are small, both methods can complete, and the difference in run time is small. However,
on large programs such as the Keccak reference code, the monolithic method could not
finish since it quickly ran out of the 4 GB RAM, whereas the incremental method can
finish in a reasonable amount of time. Moreover, although the Sleuth method imple-
ments a significantly simpler (and hence weaker) check, it is also based on a monolithic
verification approach. Our results in Table 2 show that, on large examples, our incre-
mental method is significantly faster than Sleuth.
As a measurement of the scalability of the
algorithms, we have conducted experiments
on a 1-bit version of test program P1 for 1
to 10 encryption rounds. In each parameter-
ized version, the input for each round is the
output from the previous round. We ran the
experiment twice, once with an unmasked in-
struction in each round, and once with all
instructions perfectly masked. The results of
the two experiments are almost identical, and therefore, we only plot the result for the
perfectly masked version. In the right figure, the x-axis shows the program size, and the
y-axis shows the verification time in seconds. Among the three methods, our incremen-
tal method is the most scalable.
7 Conclusions
We have presented the first fully automated method for formally verifying whether a
software implementation is perfectly masked by uniformly random inputs, and there-
fore is secure against power analysis based side-channel attacks. Our new method re-
lies on translating the verification problem into a set of constraint solving problems,
which can be decided by off-the-shelf solvers such as Yices. We have also presented
an incremental checking procedure to drastically improve the scalability of the SMT
based algorithm. We have conducted experiments on a large set of recently proposed
countermeasures. Our results show that the new method is not only more precise than
existing methods, but also scalable for practical use.
References
1. Balasch, J., Gierlichs, B., Verdult, R., Batina, L., Verbauwhede, I.: Power analysis of Atmel
CryptoMemory – recovering keys from secure EEPROMs. In: Dunkelman, O. (ed.) CT-RSA
2012. LNCS, vol. 7178, pp. 19–34. Springer, Heidelberg (2012)
2. Bayrak, A.G., Regazzoni, F., Novo, D., Ienne, P.: Sleuth: Automated verification of software
power analysis countermeasures. In: Bertoni, G., Coron, J.-S. (eds.) CHES 2013. LNCS,
vol. 8086, pp. 293–310. Springer, Heidelberg (2013)
3. Bertoni, G., Daemen, J., Peeters, M., Assche, G.V., Keer, R.V.: Keccak implementation
overview, http://keccak.neokeon.org/Keccak-implementation-3.2.pdf
4. Blömer, J., Guajardo, J., Krummel, V.: Provably secure masking of AES. In: Handschuh, H.,
Hasan, M.A. (eds.) SAC 2004. LNCS, vol. 3357, pp. 69–83. Springer, Heidelberg (2004)
5. Clarke, E.M., Grumberg, O., Peled, D.A.: Model Checking. MIT Press, Cambridge (1999)
6. Dutertre, B., de Moura, L.: A fast linear-arithmetic solver for DPLL(T). In: Ball, T., Jones,
R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 81–94. Springer, Heidelberg (2006)
7. Goubin, L.: A sound method for switching between boolean and arithmetic masking. In:
Koç, Ç.K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 3–15. Springer,
Heidelberg (2001)
8. Herbst, C., Oswald, E., Mangard, S.: An AES smart card implementation resistant to power
analysis attacks. In: Zhou, J., Yung, M., Bao, F. (eds.) ACNS 2006. LNCS, vol. 3989, pp.
239–252. Springer, Heidelberg (2006)
9. Joye, M., Paillier, P., Schoenmakers, B.: On second-order differential power analysis. In:
Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 293–308. Springer, Heidelberg
(2005)
10. Kocher, P.C., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.) CRYPTO
1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999)
11. Li, B., Wang, C., Somenzi, F.: A satisfiability-based approach to abstraction refinement in
model checking. Electronic Notes in Theoretical Computer Science 89(4) (2003)
12. Mangard, S., Oswald, E., Popp, T.: Power Analysis Attacks - Revealing the Secrets of Smart
Cards. Springer (2007)
13. Messerges, T.S.: Securing the AES finalists against power analysis attacks. In: Schneier, B.
(ed.) FSE 2000. LNCS, vol. 1978, pp. 150–164. Springer, Heidelberg (2001)
14. Moradi, A., Barenghi, A., Kasper, T., Paar, C.: On the vulnerability of FPGA bitstream en-
cryption against power analysis attacks: Extracting keys from Xilinx Virtex-II FPGAs. In:
ACM Conference on Computer and Communications Security, pp. 111–124 (2011)
15. NIST. Keccak reference code submission to NIST’s SHA-3 competition (Round 3),
http://csrc.nist.gov/groups/ST/hash/sha-3/Round3/
documents/Keccak FinalRnd.zip
16. Paar, C., Eisenbarth, T., Kasper, M., Kasper, T., Moradi, A.: Keeloq and side-channel
analysis-evolution of an attack. In: FDTC, pp. 65–69 (2009)
17. Prouff, E., Rivain, M.: Masking against side-channel attacks: A formal security proof. In:
Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 142–159.
Springer, Heidelberg (2013)
18. Sabelfeld, A., Myers, A.C.: Language-based information-flow security. IEEE Journal on Se-
lected Areas in Communications 21(1), 5–19 (2003)
19. Taha, M., Schaumont, P.: Differential power analysis of MAC-Keccak at any key-length.
In: Sakiyama, K., Terada, M. (eds.) IWSEC 2013. LNCS, vol. 8231, pp. 68–82. Springer,
Heidelberg (2013)
20. Wang, C., Hachtel, G.D., Somenzi, F.: Abstraction Refinement for Large Scale Model
Checking. Springer (2006)
21. Yang, Z., Wang, C., Ivančić, F., Gupta, A.: Mixed symbolic representations for model check-
ing software programs. In: Formal Methods and Models for Codesign, pp. 17–24 (July 2006)
Detecting Unrealizable Specifications
of Distributed Systems
1 Introduction
The goal of program synthesis, and systems engineering in general, is to build sys-
tems that satisfy a given specification. Sometimes, however, this goal is unattain-
able, because the conditions of the specification are impossible to satisfy in an
This work was partially supported by the German Research Foundation (DFG) as
part of SFB/TR 14 AVACS and by the Saarbrücken Graduate School of Computer
Science, which receives funding from the DFG as part of the Excellence Initiative of
the German Federal and State Governments.
Related Work. To the best of the authors’ knowledge, there has been no at-
tempt in the literature to characterize unrealizable specifications for distributed
systems beyond the restricted class of architectures with decidable synthesis
problems, such as pipelines and rings [3, 4]. By contrast, there is a rich litera-
ture concerning unrealizability for open systems, that is, single-process systems
interacting with the environment [7, 8, 9]. In robotics, there have been recent at-
tempts to analyze unrealizable specifications [10]. Those results also focus on
the reason for unsatisfiability, while our approach tries to determine whether a specifi-
cation is unrealizable in the first place. Moreover, they only consider the simpler non-distributed
synthesis of GR(1) specifications, which is a subset of LTL. There are other
approaches concerning unrealizable specifications in the non-distributed setting
that also use counterexamples [11, 12]. There, the system specifications are as-
sumed to be correct and the information from the counterexamples is used
to modify environment assumptions in order to make the specifications realiz-
able. The Byzantine Generals’ Problem is often used as an illustration for the
knowledge-based reasoning in epistemic logics, see [13] for an early formaliza-
tion. Concerning the synthesis of fault-tolerant distributed systems, there is an
approach to synthesize fault-tolerant systems in the special case of strongly con-
nected system architectures [14].
Fig. 1. Example system architectures: independent processes, a pipeline architecture, and a join architecture.
2 Distributed Realizability
A specification is realizable if there exists an implementation that satisfies the
specification. For distributed systems, the realizability problem is typically stated
with respect to a specific system architecture. Figure 1 shows some typical exam-
ple architectures: an architecture consisting of independent processes, a pipeline
architecture, and a join architecture. The architecture describes the communi-
cation topology of the distributed system. For example, an edge from x to y
labeled with b indicates that b is a shared variable between processes x and y,
where x writes to b and y reads b. The classic distributed realizability problem is
to decide whether there exists an implementation (or strategy) for each process
in the architecture, such that the joint behavior satisfies the specification. In
this paper, we are furthermore interested in the synthesis of fault-tolerant dis-
tributed systems, where the processes and the communication between processes
may become faulty.
In order to have a uniform and precise definition for the various realizabil-
ity problems of interest, we use a logical representation. Extended coordination
logic (ECL) [15] is a game-based extension of linear-time temporal logic (LTL).
ECL uses the strategy quantifier ∃C s to express the existence of an implemen-
tation for a process output s based on input variables C.
ECL Syntax. ECL formulas contain two types of variables: the set C of input
(or coordination) variables, and the set S of output (or strategy) variables. In
addition to the usual LTL operators Next (◯), Until (U), and Release (R), ECL has
the strategy quantifier ∃C s, which introduces an output variable s whose values
must be chosen based on the inputs in C. The syntax is given by the grammar
ϕ ::= x | ¬x | ϕ ∨ ϕ | ϕ ∧ ϕ | ◯ϕ | ϕ U ϕ | ϕ R ϕ | ∃C s. ϕ | ∀C s. ϕ ,
ECL Semantics. We give a quick definition of the ECL∃ semantics for for-
mulas in PNF and refer the reader to [15] for details and for the semantics
of full ECL. The semantics is based on trees as a representation for strate-
gies and computations. Given a finite set of directions Υ and a finite set of
labels Σ, a (full) Σ-labeled Υ-tree T is a pair ⟨Υ*, l⟩, where l : Υ* → Σ as-
signs each node υ ∈ Υ* a label l(υ). For two trees T and T′, we define the
joint valuation T ⊕ T′ to be the widened tree with the union of both la-
bels. We refer to [15] for a formal definition. A path σ in a Σ-labeled Υ-tree
T is an ω-word σ0 σ1 σ2 . . . ∈ Υ^ω and the corresponding labeled path σ^T is
(l(ε), σ0)(l(σ0), σ1)(l(σ0 σ1), σ2)(l(σ0 σ1 σ2), σ3) . . . ∈ (Υ × Σ)^ω.
For a strategy variable s that is bound by some quantifier QC s. ϕ, we refer to
C as the scope of s, denoted by Scope(s). The meaning of a strategy variable s is
a strategy or implementation fs : (2^Scope(s))* → 2^{s}, i.e., a function that maps a
history of valuations of input variables to a valuation of the output variable s. We
represent the computation of a strategy fs as the tree ⟨(2^Scope(s))*, fs⟩, where fs
serves as the labeling function (cf. Fig. 2(a)–(b)). ECL∃ formulas are interpreted
over computation trees, which are the joint valuations of the computations for
Fig. 2. In (a) and (b) we sketch example strategies for y and x satisfying the ECL∃
formula ∃∅ y. ∃{a} x. □(◯x ↔ a) ∧ □(y ↔ ◯¬y). In (c) we visualize the resulting
computation tree on which the body (LTL) formula is evaluated.
the strategies belonging to the strategy variables in S, i.e., ⊕_{s∈S} ⟨(2^Scope(s))*, fs⟩
(cf. Fig. 2(c)). Given an ECL∃ formula Q∃. ϕ in prenex normal form over strategy
variables S and coordination variables C, the formula is satisfied if there exists
a computation tree T (over S), such that all paths in T satisfy the LTL formula
ϕ, i.e., ∀σ ∈ (2^C)^ω. σ^T, 0 ⊨ ϕ, where the satisfaction of an LTL formula on a
labeled path σ^T at position i ≥ 0 is defined as usual.
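To make the strategy-tree semantics more tangible, the following plain-Python sketch (hypothetical names, in the spirit of Fig. 2) represents each strategy as a function from histories of input valuations to an output valuation and computes the joint labels along one input path:

from typing import FrozenSet, List, Tuple

Valuation = FrozenSet[str]

def strategy_x(history: Tuple[Valuation, ...]) -> Valuation:
    # Strategy for x with Scope(x) = {a}: set x iff a held in the previous step.
    return frozenset({"x"}) if history and "a" in history[-1] else frozenset()

def strategy_y(history: Tuple[Valuation, ...]) -> Valuation:
    # Strategy for y with Scope(y) = {}: y alternates, starting with y set.
    return frozenset({"y"}) if len(history) % 2 == 0 else frozenset()

def labeled_prefix(inputs: List[Valuation]) -> List[Valuation]:
    # Joint labels along one input path: at step i, each strategy sees the history
    # of its own scope restricted to the first i input valuations.
    labels = []
    for i in range(len(inputs)):
        labels.append(strategy_x(tuple(inputs[:i])) | strategy_y((frozenset(),) * i))
    return labels

path = [frozenset({"a"}), frozenset(), frozenset({"a"}), frozenset(), frozenset()]
print(labeled_prefix(path))   # y and x alternate: {y}, {x}, {y}, {x}, {y}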
which of its inputs come from a faulty node. Since z must produce the same
output on two paths it cannot distinguish, the implementation of z contradicts
the specification in either architecture.
Theorem 6 (Correctness). Let Φ = Q∃. ⋀_{1≤i≤n} (ϕpathi → ϕi) be an ECL∃ formula
over coordination variables C and strategy variables S whose system architecture AΦ
is acyclic after removing common feedback edges. Φ is unsatisfiable if there exist
functions K1, . . . , Kn : C → ℕ such that the QPTL formula
unsatfault(Φ, K1, . . . , Kn) is satisfiable.
Example. We consider again the Byzantine Generals’ Problem with three nodes
g1 , g2 , and g3 . The first general is the commander who forwards the input v that
states whether to attack the enemy or not. The encoding as ECL∃ formula is
Φbgp := ∃{v} g12, g13. ∃{c12} g23. ∃{c13} g32. ∃{c12, c32} g2. ∃{c13, c23} g3.
(operational2,3 → consensus2,3) ∧ ⋀_{i∈{2,3}} (operational1,i → correctvali) ,
where the quantification prefix introduces the strategies for the generals g2 and
g3 , as well as the communication between the three generals as depicted in the
architecture in Fig. 4(a). Note that we omit the vote of the commander g1 as it is
not used in the specification. In the temporal part, we specify which failures can
occur. The first conjunct, corresponding to Fig. 4(b), states that the commander
is faulty (operational2,3 ) which implies that the other two generals have to reach a
consensus whether to attack or not (consensus2,3 ). The other two cases, depicted
in Fig. 4(c)–(d), are symmetric and state that whenever one general is faulty the
other one should agree on the decision made by the commander. The QPTL
encoding unsatfault(Φbgp, K1, K2, K3) is given as
∃ paths({v}, K). ∀ strategies({g12, g13}, K). ∃ paths({c12, c13}, K).
∀ strategies({g23, g32}, K). ∃ paths({c23, c32}, K). ∀ strategies({g2, g3}, K).
consistent({g12, g13, g23, g32, g2, g3}, K) →
( ⋀_{π∈branches(C,K1)} operational2,3(π) ∧ ⋁_{π∈branches(C,K1)} ¬consensus2,3(π) ) ∨
( ⋀_{π∈branches(C,K2)} operational1,3(π) ∧ ⋁_{π∈branches(C,K2)} ¬correctval3(π) ) ∨
( ⋀_{π∈branches(C,K3)} operational1,2(π) ∧ ⋁_{π∈branches(C,K3)} ¬correctval2(π) ) .
Fig. 4. The Byzantine Generals' architecture. Figure (a) shows the architecture in
case all generals are loyal. Figures (b)–(d) show the possible failures, indicated by the
dashed communication links.
Presently available QPTL solvers were unable to handle even small instances
of our problem. We therefore simplify the problem using the following steps.
Instead of checking the QPTL formula directly, we encode the formula as an
equivalent monadic second-order logic of one successor (S1S) formula using a
straightforward translation. We then interpret the S1S formula as a WS1S
formula, which can be checked using the WS1S solver Mona [16]. Some of our
smaller instances were solved by Mona, but the Byzantine Generals’ Problem
failed due to memory constraints in the BDD library.
Taking the simplifications even further, we not only bound the number of
paths but also the length of the paths by translating the problem to the satisfia-
bility problem of quantified Boolean formulas (QBF). The encoding translates a
QPTL variable x to Boolean variables x0 , . . . , xk−1 , each representing one step
in the system where k is the length of the paths. We build the QBF formula
by unrolling the QPTL formula for k-steps: Each variable in the quantification
prefix of the QPTL formula is transformed into k Boolean variables in the QBF
prefix, e.g., the 3-unrolling of ∃x. ∀y. ϕ is ∃x0 , x1 , x2 . ∀y0 , y1 , y2 . ϕunroll . The un-
rolling of the remaining LTL formula is given by the expansion law for Until,
ϕ U ψ ≡ ψ ∨ (ϕ ∧ ϕ U ψ). After the unrolling, the QBF formula is transformed
into Conjunctive Normal Form (CNF) and encoded in the QDIMACS file for-
mat, that is the standard format for QBF solvers. Already with this encoding
we could solve more examples than using the WS1S approach.
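The k-step unrolling can be sketched directly from this expansion law; the fragment below (plain Python, formulas as nested tuples, and an assumed "false at the bound" cut-off for ◯ and U) is only an approximation of the actual translation:

def unroll(phi, t, k):
    # Unroll an LTL formula at step t for a bound of k steps.
    # Formulas: ('var', name), ('not', f), ('and', f, g), ('or', f, g),
    # ('next', f), ('until', f, g). Variables become indexed names like 'x@3'.
    op = phi[0]
    if op == "var":
        return ("var", "%s@%d" % (phi[1], t))
    if op == "not":
        return ("not", unroll(phi[1], t, k))
    if op in ("and", "or"):
        return (op, unroll(phi[1], t, k), unroll(phi[2], t, k))
    if op == "next":
        # Beyond the bound there is no next step: treat it as false.
        return unroll(phi[1], t + 1, k) if t + 1 < k else ("false",)
    if op == "until":
        # Expansion law: f U g = g or (f and next(f U g)), cut off at the bound.
        if t >= k:
            return ("false",)
        f, g = phi[1], phi[2]
        return ("or", unroll(g, t, k),
                      ("and", unroll(f, t, k), unroll(phi, t + 1, k)))
    raise ValueError(op)

# 3-step unrolling of (a U b) starting at step 0.
print(unroll(("until", ("var", "a"), ("var", "b")), 0, 3))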
In this simple translation, one cause of high complexity is the consis-
tency conditions between the strategy variables across different paths. However,
most of these variables are not used for the counterexample itself but appear
only in the consistency condition. One optimization removes these unnecessary
variables from the encoding. Therefore, we collect all strategy variables and
(when possible) their temporal occurrences from the LTL specification. For every
used strategy variable we build the dependency graph that contains all variables
which can influence the outcome of the strategy. In the last step, we remove all
variables that are not contained in any dependency graph.
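One way to realize this optimization, sketched here with hypothetical data structures, is to compute the backward-reachable dependency set of every strategy variable that occurs in the specification and to drop any variable that appears in no such set:

def prune_variables(used_strategy_vars, depends_on):
    # used_strategy_vars: strategy variables occurring in the LTL specification.
    # depends_on: maps each variable to the set of variables that can influence it.
    # Returns the set of variables that may be removed from the encoding.
    keep = set()
    stack = list(used_strategy_vars)
    while stack:                       # backward reachability over the dependency graph
        v = stack.pop()
        if v in keep:
            continue
        keep.add(v)
        stack.extend(depends_on.get(v, ()))
    return set(depends_on) - keep      # variables in no dependency graph

# Example: g2 is used in the specification and depends on c12 and c32.
deps = {"g2": {"c12", "c32"}, "c12": {"g12"}, "c32": {"g32"}, "g3": {"c13", "c23"}}
print(prune_variables({"g2"}, deps))   # {'g3'}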
6 Completeness
Proposition 1 states that the characterization of unsatisfiable formulas with
counterexamples is complete. Our method, however, searches for counterex-
amples involving only a bounded number of external paths, and the follow-
ing example shows that this leads to incompleteness. Consider the ECL∃ formula
Φinf := ∃∅ y. ϕinf with temporal specification ϕinf := ◇(y ≠ x), where x
is a free coordination variable. Φinf is unsatisfiable because for every strategy
fy : ∅∗ → 2{y} there exists a path σ ∈ (2{x} )ω that simulates exactly the output
of the strategy, as the formula is evaluated over the full binary x-tree. Assume
for contradiction that a finite set of paths P ⊆ (2{x} )ω suffices to satisfy ¬ϕinf
against any strategy fy . Interpreting the outcome of the strategy as a path and
considering all possible strategies gives us a full binary tree T . Let ρ be a path
from T that is not contained in P (after renaming y in ρ to x). Such a path must
exist because there are infinitely many different paths in T. Choose the strategy
fyρ that belongs to ρ. For all paths in P it holds that ◇(y ≠ x) and thus no
path satisfies ¬ϕinf.
However, in practice finite external counterexamples are sufficient to detect
many errors in specifications. In this section we give a semantic characterization
of the finite path satisfiability based only on the LTL specification.
Consider an ECL∃ formula Φ = Q∃. ϕpath → ϕ. We assume w.l.o.g. that ϕ only
contains coordination variables Ce ⊆ C that are not used as a channel, as otherwise
one could replace a variable c ∈ C \ Ce by the strategy variable corresponding to
the channel. The semantics of the LTL formula ¬ϕ, denoted by ⟦¬ϕ⟧, gives us
a language L ⊆ (2^S × 2^Ce)^ω. From L we obtain the relation R ⊆ (2^S)^ω × (2^Ce)^ω
between paths of strategy variables and paths of coordination variables. We say
that an LTL formula ψ over variables S × Ce admits finite external paths if there
exists a function r : (2S )ω → (2Ce )ω such that (1) for all σ ∈ (2S )ω it holds that
r(σ) = ρ ⇔ σ R ρ, and (2) {r(σ) | σ ∈ (2S )ω } is finite.
Let RAψ be the deterministic Rabin word automaton for the LTL formula ψ. RAψ
contains a path split if there exists a state q in the automaton where (1) there are
two outgoing edges labeled with (s, p) and (s′, p′) such that s = s′ and p ≠ p′, and
(2) from q we can build accepting runs visiting q infinitely often and containing
exclusively the (s, p)-edge or the (s′, p′)-edge.
Theorem 7. An LTL formula ψ over variables S × Ce admits finite external
paths if and only if the automaton RAψ has no path split.
7 Experimental Results
We have carried out our experiments on a 2.6 GHz Opteron system. For solving
the QBF instances, we used a combination of the QBF preprocessor Bloqqer [17]
in version 031 and the QBF solver DepQBF [18] in version 1.0. For solving the
WS1S instances, we used Mona [16] in version 1.4-15.
Table 1 demonstrates that the Byzantine Generals’ Problem remains, despite
the optimizations described above, a nontrivial combinatorial problem: we need
to find a suitable set of paths for every possible combination of the strategies
of the generals. The bound given in the first column reads as follows: The first
component is the number of branchings for the input variable v in all three
architectures. The last three components state the number of branchings for the
outputs of the faulty nodes in their respective architectures. For example, bound
(1, 1, 0, 0) means that we have two branches for v, c12 , and c13 , while we have only
one branch for c23 and c32 . More precisely, starting from always zero functions
K1 , K2 , K3 , the bound (1, 1, 0, 0) sets K1 (v) = K2 (v) = K3 (v) = K1 (c12 ) =
K1 (c13 ) = 1 and K2 (c23 ) = K3 (c32 ) = 0. To prove the unrealizability, we need
one branching for the input v and one branching for every coordination variable
that serves as a shared variable for a faulty node, i.e., the bound (1, 1, 1, 1). The
number of branches and thereby the formula size grows exponentially with the
number of branchings for the input variables.
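The mapping from a bound tuple to the functions K1, K2, K3 can be written down directly; the short sketch below (representing the K functions as dictionaries that default to zero, an assumption of this sketch) reproduces the (1, 1, 0, 0) example:

from collections import defaultdict

def bounds_from_tuple(b):
    # b = (bv, b1, b2, b3): branchings for v in all three architectures and for the
    # outputs of the faulty node in each of the three failure architectures.
    bv, b1, b2, b3 = b
    K1 = defaultdict(int, {"v": bv, "c12": b1, "c13": b1})
    K2 = defaultdict(int, {"v": bv, "c23": b2})
    K3 = defaultdict(int, {"v": bv, "c32": b3})
    return K1, K2, K3

K1, K2, K3 = bounds_from_tuple((1, 1, 0, 0))
print(K1["v"], K1["c12"], K1["c13"], K2["c23"], K3["c32"])   # 1 1 1 0 0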
The CAP Theorem for two nodes is encoded as the ECL∃ formula
∃{req1} com1. ∃{req1, chan2} out1. ∃{req2} com2. ∃{req2, chan1} out2.
(□(chan1 = com1) → ((out1 = out2) ∧ ((req1 ∨ req2) ↔ □(out1 ∨ out2)))) ∧
(□(chan2 = com2) → ((out1 = out2) ∧ ((req1 ∨ req2) ↔ □(out1 ∨ out2)))) .
The architecture is similar to Fig. 1(a) with the difference that there is a direct
communication channel between the two processes (chan1 , chan2 ). The formula
states that the system should be available and consistent despite an failure of
one process. Table 2 shows that our method is able to find conflicts in a speci-
fication with an architecture up to 50 nodes within reasonable time. When we
drop either Consistency, Availability, or Partition tolerance, the corresponding
instances (AP, CP, and CA) become satisfiable. Hence, our tool does not find
counterexamples in these cases.
Discussion. In the following, we evaluate the different encodings that we have
used. There does not exist an algorithm that decides whether a given ECL∃
formula is unsatisfiable. We used a sound approach where we bound the number
of paths and encoded the problem in QPTL. The reason for incompleteness was
shown in Sec. 6; in some cases one may need infinitely many paths to show unsatis-
fiability. Our encoding in WS1S (Mona) loses the ability to find counterexample
paths of infinite length: e.g., the ECL∃ formula ∃∅ y. (y ↔ x) with free
coordination variable x is unsatisfiable, and two paths that are infinitely of-
ten different are sufficient to prove it. The QPTL encoding is capable of finding
these paths while the WS1S encoding is not. However, Mona could not solve any
satisfiable instance given in Tables 1 and 2. Lastly, for the translation to QBF we
not only restrict ourselves to paths of finite length (WS1S), but also bound
the paths to length k, where k is an additional parameter. With this encoding
we approximate the reactive behavior of our system by a finite prefix. It turned
out that despite this restriction we could prove unsatisfiability for many in-
teresting specifications. In practice, one would first use the QBF abstraction in
order to find “cheap” counterexamples. After hitting the number of paths that
the QBF solver can no longer handle within reasonable time, one proceeds with
more costly abstractions like the WS1S encoding.
8 Conclusion
We introduced counterexamples for distributed realizability and showed how to
automatically derive counterexamples from given specifications in ECL∃ . We
used encodings in QPTL, WS1S, and QBF. Our experiments showed that the
QBF encoding was the most efficient. Even problems with high combinatorial
complexity, such as the Byzantine Generals’ Problem, are handled automati-
cally. Given that QBF solvers are likely to improve in the future, even larger
instances should become tractable. Possible future directions include building a
set of benchmarks, evaluating more solvers, and using the information about a
counterexample given by QBF certification [19] to build counterexamples for the
specification. As the bound given for the encoding is not uniform, i.e., there is a
bound for each coordination variable, and as the performance depends on the
chosen bound, it is crucial to find suitable heuristics that rank the
importance of the coordination variables. Also, more types of failures could be
incorporated into our model, e.g., variations of the failure duration such as transient
or intermittent failures. Lastly, it would also be conceivable to use similar methods to
derive a larger class of infinite counterexamples.
References
1. Lamport, L., Shostak, R.E., Pease, M.C.: The byzantine generals problem. ACM
Trans. Program. Lang. Syst. 4(3), 382–401 (1982)
2. Pnueli, A., Rosner, R.: Distributed reactive systems are hard to synthesize. In:
Proc. FOCS 1990, pp. 746–757 (1990)
3. Kupferman, O., Vardi, M.Y.: Synthesizing distributed systems. In: LICS, pp. 389–
398. IEEE Computer Society (2001)
4. Finkbeiner, B., Schewe, S.: Uniform distributed synthesis. In: LICS, pp. 321–330.
IEEE Computer Society (2005)
5. Finkbeiner, B., Schewe, S.: Bounded synthesis. International Journal on Software
Tools for Technology Transfer 15(5-6), 519–539 (2013)
6. Brewer, E.A.: Towards robust distributed systems (abstract). In: PODC, p. 7.
ACM (2000)
7. Church, A.: Logic, arithmetic and automata. In: Proc. 1962 Intl. Congr. Math.,
Upsala, pp. 23–25 (1963)
8. Abadi, M., Lamport, L., Wolper, P.: Realizable and unrealizable specifications of
reactive systems. In: Ronchi Della Rocca, S., Ausiello, G., Dezani-Ciancaglini, M.
(eds.) ICALP 1989. LNCS, vol. 372, pp. 1–17. Springer, Heidelberg (1989)
9. Kupferman, O., Vardi, M.Y.: Synthesis with incomplete information. In: Proc. of
ICTL (1997)
10. Raman, V., Kress-Gazit, H.: Analyzing unsynthesizable specifications for high-
level robot behavior using LTLMoP. In: Gopalakrishnan, G., Qadeer, S. (eds.)
CAV 2011. LNCS, vol. 6806, pp. 663–668. Springer, Heidelberg (2011)
11. Li, W., Dworkin, L., Seshia, S.A.: Mining assumptions for synthesis. In: MEM-
OCODE, pp. 43–50. IEEE (2011)
12. Chatterjee, K., Henzinger, T.A., Jobstmann, B.: Environment assumptions for syn-
thesis. In: van Breugel, F., Chechik, M. (eds.) CONCUR 2008. LNCS, vol. 5201,
pp. 147–161. Springer, Heidelberg (2008)
13. Halpern, J.Y., Moses, Y.: Knowledge and common knowledge in a distributed
environment. In: PODC, pp. 50–61. ACM (1984)
14. Dimitrova, R., Finkbeiner, B.: Synthesis of fault-tolerant distributed systems. In:
Liu, Z., Ravn, A.P. (eds.) ATVA 2009. LNCS, vol. 5799, pp. 321–336. Springer,
Heidelberg (2009)
15. Finkbeiner, B., Schewe, S.: Coordination logic. In: Dawar, A., Veith, H. (eds.) CSL
2010. LNCS, vol. 6247, pp. 305–319. Springer, Heidelberg (2010)
16. Henriksen, J.G., Jensen, J.L., Jørgensen, M.E., Klarlund, N., Paige, R., Rauhe,
T., Sandholm, A.: Mona: Monadic second-order logic in practice. In: Brinksma,
E., Steffen, B., Cleaveland, W.R., Larsen, K.G., Margaria, T. (eds.) TACAS 1995.
LNCS, vol. 1019, pp. 89–110. Springer, Heidelberg (1995)
17. Biere, A., Lonsing, F., Seidl, M.: Blocked clause elimination for QBF. In: Bjørner,
N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS, vol. 6803, pp. 101–115.
Springer, Heidelberg (2011)
18. Lonsing, F., Biere, A.: DepQBF: A dependency-aware QBF solver. JSAT 7(2-3),
71–76 (2010)
19. Balabanov, V., Jiang, J.H.R.: Unified QBF certification and its applications. For-
mal Methods in System Design 41(1), 45–65 (2012)
Synthesizing Safe Bit-Precise Invariants
1 Introduction
The problem of program safety (or reachability) verification is to decide whether
a given program can violate an assertion (i.e., can reach a bad state). The prob-
lem is reducible to finding either a finite counter-example, or a safe inductive
invariant that certifies unreachability of a bad state. The problem of bit-precise
program safety, Safety(BV), further requires that the program operations are
represented soundly relative to low-level bit representation of data. Arguably,
verification techniques that are not bit-precise are unsound, and do not reflect
the actual behavior of a program. Unlike many other problems in software veri-
fication, bit-precise verification (without memory allocation and concurrency) is
decidable. However, in practice it appears to be more challenging than verifica-
tion of programs relative to integers or rationals (both undecidable).
The recent decade has seen amazing progress in SAT solvers, in Satisfiabil-
ity Modulo Theory of Bit-Vectors, SMT(BV), and in Bounded Model Checkers
This material is based upon work funded and supported by the Department of De-
fense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the
operation of the Software Engineering Institute, a federally funded research and de-
velopment center. This material has been approved for public release and unlimited
distribution. DM-0000869. The second and third authors are financially supported
by SFI PI grant BEACON (09/IN.1/I2618), and by FCT grants ATTEST (CMU-
PT/ELE/0009/2009) and POLARIS (PTDC/EIA-CCO/123051/2010).
(BMC) based on these techniques. A SAT solver decides whether a given propo-
sitional formula is satisfiable. Current solvers can handle very large problems
and are routinely used in many industrial applications (including Hardware and
Software verification). SMT(BV) extends SAT-solver techniques to the theory of
bit-vectors – that is, to propositional formulas whose atoms are predicates about bit-
vectors. Most successful SMT(BV) solvers (e.g., Boolector [6], STP [17], Z3 [12],
MathSAT [9]) are based on reducing the problem to SAT via pre-processing
and bit-blasting. The bit-blasting step takes a BV formula ϕ and constructs
an equivalent propositional formula ψ, where each propositional variable of ψ
corresponds to a bit of some bit-vector variable of ϕ. The more important pre-processing step typically consists of equisatisfiable reductions that reduce the size of the input formula. While the pre-processor is not as powerful as the SAT solver (the pre-processor is typically required to run in polynomial time), it need not maintain equivalence, only equisatisfiability. The pre-processing phase of SMT(BV) solvers is crucial for their performance. For example, in our experiments with Boolector, the difference between straightforward bit-blasting and pre-processing is several orders of magnitude.
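To make the bit-blasting step concrete, the following is a minimal sketch in the Z3 Python API (an illustration of our own, not the internal pipeline of any of the solvers mentioned above): a small bit-vector goal is simplified and then reduced to a goal over per-bit Boolean variables.

from z3 import BitVec, Goal, Then, Tactic

x, y = BitVec('x', 8), BitVec('y', 8)
g = Goal()
g.add(x + y == 1, x > y)                                # a small BV formula phi
blasted = Then(Tactic('simplify'), Tactic('bit-blast'))(g)   # pre-process, then bit-blast
print(blasted[0])                                       # Boolean constraints over per-bit variables of x and y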
There has also been tremendous progress in applying those techniques to
program verification. In particular, there are several mature Bounded Model
Checkers, including CBMC [10], LLBMC [32], and ESBMC [11], that decide
the existence of bounded bit-precise counterexamples in C programs. These tools ultimately reduce BMC to SAT, either via their own custom
bit-blasting and pre-processing steps (e.g., CBMC) or by leveraging SMT(BV)
solvers described above (e.g., LLBMC). While BMC tools are great at finding
counterexamples (even in industrial applications), proving bit-precise safety, i.e.,
synthesizing a bit-precise invariant, remains a challenge. For example, none of the
tools submitted to the Software Verification Competition in 2013 (SVCOMP’13)
are both bit-precise and effective at invariant synthesis.
As we described above, Safety(BV) is decidable. In fact, it is reducible to the safety
problem over propositional logic, Safety(Prop), via the simple bit-blasting men-
tioned above. Thus, the naive solution is to reduce Safety(BV) to Safety(Prop)
and decide it using tools for propositional verification. This, however, does not
scale. Our experiments with Z3/PDR (the Model Checker of Z3) show that the
approach is ineffective for almost all benchmarks in SVCOMP’13. The main is-
sue is that the reduction of Safety(BV) to Safety(Prop) is incompatible with the
pre-processing techniques that make bit-blasting for SMT(BV) so effective.
An alternative approach of lifting effective Model Checking techniques from the propositional level to BV appears to be difficult, with only a few somewhat suc-
cessful attempts (e.g., [26,19]). For example, techniques based on interpolation
(e.g., [31,27,1]) require word-level interpolation for BV [25,19] that satisfies additional properties (e.g., sequence and tree properties) [21], while techniques based on PDR [22] require novel word-level inductive generalization strategies.
Both are difficult problems in themselves.
Thus, instead of lifting existing techniques, we are interested in finding a
way to use existing verification engines to improve scalability of the naive
2 Preliminaries
We assume some familiarity with program verification, logic, SMT and SAT.
When P is UNSAFE and s ∈ Bad is a reachable state, the path from s0 ∈ Init
to s ∈ Bad is called a counterexample (CEX).
A transition system P is SAFE if and only if there exists a formula Inv, called
a safe invariant, that satisfies the following conditions:
Init(v) → Inv(v)      Inv(v) ∧ Tr(v, u) → Inv(u)      Inv(v) → ¬Bad(v)      (2)
A formula Inv that satisfies the first two conditions is called an invariant of
P , while a formula Inv that satisfies the third condition is called safe. A safety
verification problem is to decide whether a transition system P is SAFE or
UNSAFE. Thus, a safety verification problem is equivalent to the problem of
establishing the existence of a safe invariant. In SAT-based Model Checking, the
verification problem is decided by iteratively synthesizing an invariant Inv or
finding a CEX.
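As a concrete illustration (a toy transition system of our own, not one of the benchmarks), the three conditions in (2) can be checked for a candidate Inv by testing that the negation of each implication is unsatisfiable; a minimal sketch in the Z3 Python API:

from z3 import Int, Solver, Implies, And, Not, substitute, unsat

v, u = Int('v'), Int('u')
Init = (v == 0)
Tr   = (u == v + 2)
Bad  = (v < 0)
Inv  = (v >= 0)                                   # candidate safe invariant
Inv_u = substitute(Inv, (v, u))                   # Inv over the next-state variable

def valid(f):
    s = Solver()
    s.add(Not(f))                                 # valid iff the negation is unsatisfiable
    return s.check() == unsat

assert valid(Implies(Init, Inv))                  # initiation:  Init(v) -> Inv(v)
assert valid(Implies(And(Inv, Tr), Inv_u))        # consecution: Inv(v) /\ Tr(v,u) -> Inv(u)
assert valid(Implies(Inv, Not(Bad)))              # safety:      Inv(v) -> ~Bad(v)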
where x[32] and y[32] are bit-vector and z propositional variables, the corresponding LA formula InitW(x, y, z) is
over the target theory TT (e.g., BV∗ (32) from Example 1), we first attempt to
solve P with a solver for Safety(TT) under heuristically chosen resource limits².
If the solver fails to prove or disprove the safety of P , we pick a working theory
TW , and a pair of corresponding mappings MT →W and MW →T (e.g., TW = LA
and the mappings are as in Example 1). Then, we attempt to verify the safety of
PW = MT →W (P ) = (U, MT →W (Init ), MT →W (Tr ), MT →W (Bad )), where U are
the fresh variables introduced by MT →W , using a solver for Safety(TW ). Since
PW is in general neither an under- nor an over-approximation of P, the (un)safety of
the former does not imply the (un)safety of the latter. Since the focus of this
paper is on synthesis of invariants for verification, we omit the detailed discus-
sion of how to handle the UNSAFE status of PW . One option is to simply return
UNKNOWN, as in Algorithm 1. Alternatively, the CEX for PW can be mapped
to TT via MW →T and checked on P — if the mapped CEX is also a CEX for
P , return UNSAFE. Otherwise, the mapping can be refined to eliminate the
CEX, and the safety verification of PW under the new mapping repeated. If,
on the other hand, PW is safe, we take the safe invariant Inv W of PW , and
translate it back to the target theory TT to obtain a candidate-invariant for-
mula Cand = MW →T (Inv W ). If Cand is a safe invariant of P , then the safety
of P is established, and the algorithm returns SAFE. Otherwise, we attempt
to compute a subformula Cand I of Cand that is an invariant of P — this is
done in the function ComputeMIS on line 13 of Algorithm 1, which we describe
in detail in Section 3.2. Once an invariant of P is obtained, we restrict the tran-
sition relation of P by replacing the formula Tr (u, v) in P with the formula
Cand I (u) ∧ Tr (u, v) ∧ Cand I (v), and attempt to verify the safety of the new
² This step is optional on the first iteration of the main loop of Algorithm 1.
transition system (the next iteration of the main loop). Since Cand I is an actual invariant of P, the (un)safety of the strengthened transition system implies the (un)safety of the input system P.
This verification framework can be instantiated in numerous ways and leaves a
number of open heuristic choices. We postpone the description of an instantiation
of the framework used in our experiments to Section 4.
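The following is a high-level sketch of our reading of the main loop just described; all arguments besides P are hypothetical helpers standing in for the solvers, the mappings MT→W and MW→T, ComputeMIS, and the strengthening of the transition relation.

def misper(P, solve_T, solve_W, to_W, to_T, compute_mis, is_safe_invariant, strengthen):
    """Sketch of the loop: solve_T/solve_W run (resource-limited) safety solvers for
    the target and working theories, to_W/to_T are the mappings M_T->W and M_W->T,
    is_safe_invariant checks the conditions in (2)."""
    while True:
        status, _ = solve_T(P)                  # try the target theory under resource limits
        if status in ('SAFE', 'UNSAFE'):
            return status
        status_w, inv_w = solve_W(to_W(P))      # verify the working-theory version P_W
        if status_w != 'SAFE':
            return 'UNKNOWN'                    # (alternatively, replay the CEX of P_W on P)
        cand = to_T(inv_w)                      # candidate invariant Cand for P
        if is_safe_invariant(cand, P):
            return 'SAFE'
        cand_i = compute_mis(cand, P)           # subformula Cand_I that is an invariant of P
        P = strengthen(P, cand_i)               # restrict Tr with Cand_I and iterate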
the model returned by the solver must falsify one or more lemmas in L(v). These
lemmas are then removed both from L(u) and from L(v), and the test is repeated.
The process continues until for some subset L′ ⊆ L, L′(u) ∧ Tr(u, v) |= L′(v). The final subset L′ is obviously inductive. Furthermore, for any set of lemmas L′′ ⊆ L \ L′ there must have been a point in the execution of the algorithm where it obtained a model for a formula L′(u) ∧ L′′(u) ∧ Tr(u, v) that falsifies at least one lemma in L′′(v), as otherwise this lemma would be included in L′. Hence, L′ is maximal, and therefore is a MIS of L.
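A minimal sketch of the eager computation over toy integer lemmas (an example of our own; the tool works with bit-vector lemmas), using the Z3 Python API:

from z3 import Int, Solver, And, Or, Not, substitute, is_true, sat

x = Int('x')                                   # state variable (template)
u, v = Int('u'), Int('v')                      # current- and next-state copies
lemmas = [x >= 0, x <= 5, x % 2 == 0]          # hypothetical lemma set L
Tr = (v == u + 2)                              # toy transition relation Tr(u, v)

def L(var, idxs):
    return [substitute(lemmas[i], (x, var)) for i in idxs]

def eager_mis(idxs):
    while True:
        if not idxs:
            return idxs                        # the empty set is trivially inductive
        s = Solver()
        s.add(And(L(u, idxs)), Tr, Or([Not(l) for l in L(v, idxs)]))
        if s.check() != sat:
            return idxs                        # L'(u) /\ Tr(u,v) |= L'(v): inductive
        m = s.model()
        # eager step: drop every lemma whose next-state copy the model falsifies,
        # from both the premise and the consequent
        idxs = [i for i in idxs
                if is_true(m.evaluate(substitute(lemmas[i], (x, v)), model_completion=True))]

print(eager_mis(list(range(len(lemmas)))))     # prints [0, 2]: the bound x <= 5 is not inductive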
In the lazy approach to MIS computation (e.g., [16,24]), when the set L is
not inductive, the lemmas in the consequent L(v) that are falsified by the model
of L(u) ∧ Tr (u, v) are initially removed only from L(v). The process continues
until for some L′ ⊆ L, L(u) ∧ Tr(u, v) |= L′(v) — notice that the premise still contains all of the lemmas of L. We refer to such sets L′ as semi-inductive with respect to L and Tr. Observe that the semi-inductive subset L′ obtained in this manner is maximal and also maximum, by the argument analogous to that used to establish the uniqueness of MISes. Once the maximum semi-inductive subset L′ of L is computed, the lemmas excluded from L′ are removed from L(u), and the algorithm checks whether L′(u) ∧ Tr(u, v) |= L′(v), i.e., whether L′ is inductive. If not, the algorithm repeats the process, by first computing a maximum semi-inductive subset of L′, then checking its inductiveness, and so on. The eventually obtained inductive subset of L is the MIS of L — this can
be justified in essentially the same way as for the eager approach.
One potential advantage of the lazy approach is that, since, compared to the
eager approach, there are often more lemmas in the premises, the SMT/SAT
solver is likely to work with stronger formulas. Furthermore, if a solver retains
information between invocations — for example, derived facts and history-based
heuristic parameters, as in incremental SAT solvers — more information can be
reused between iterations, thus speeding up the MIS computation.
One additional feature of the lazy approach, pointed out and used in [24], is
that the computation of semi-inductive subsets can be reduced to the computa-
tion of Minimal Unsatisfiable Subformulas (MUSes), or, more precisely, to the
computation of group-MUSes (recall Definition 1). This observation is particu-
larly useful in cases when satisfiability problem in the theory that defines the
invariants can be soundly reduced to propositional satisfiability, as it allows us to leverage the large body of recent work and tools for the computation of MUSes (e.g., [2,30,34]). We take advantage of this observation in the implementation of our framework since, in our case, the invariants are quantifier-free formulas over
(a sub-theory of) the theory of bit-vectors, and the satisfiability of such formulas
can be soundly reduced to SAT via bit-blasting. The reduction to group-MUS
computation and the overall MIS extraction flow are presented below.
Computing MISes with Group-MUSes. For a set of lemmas L = {L1 , . . . ,
Ln } and a transition relation formula Tr , we first rewrite the formula L(u) ∧
Tr (u, v)∧¬L(v), used to check the inductiveness of L, as a formula AL,Tr defined
in the following way:
AL,Tr = ⋀_{Li∈L} (prei → Li(u)) ∧ Tr(u, v) ∧ ⋁_{Li∈L} (posti ∧ ¬Li(v))      (3)
where prei and posti for i ∈ [1, n] are fresh propositional variables, one for each
lemma Li ∈ L. One of the purposes of these variables is similar to that of the
indicator variables used in assumption-based incremental SAT solving (cf. [14])
— the variables can be used to emulate the removal of lemmas from formulas
L(u) and L(v). Setting prei to true (resp. false) causes the lemma Li to be
included (resp. excluded) from L(u), while setting posti to true (resp. false) has
the same effect on the lemma Li in L(v). The names of the variables reflect the
fact that they control either the “precondition” or the “postcondition” lemmas.
With this in mind, a computation of the MIS of L with respect to Tr can
be implemented on top of an incremental SMT solver by loading the formula
AL,Tr into the solver, and checking the satisfiability of the formula under a set
of assumptions. For example, the set L is inductive if and only if the formula
is unsatisfiable under the assumptions ⋃_{i∈[1,n]} {prei, posti}. When a lemma Li ∈ L
needs to be removed from L(u) and/or L(v), we simply assert the formula (¬prei )
and/or (¬posti ) to the solver.
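A sketch of this assumption-based use of (3) over toy integer lemmas (an example of our own, rather than the bit-vector setting of the tool), using Z3's check-with-assumptions interface in the Python API:

from z3 import Int, Bool, Solver, Implies, And, Or, Not, substitute, unsat

x, u, v = Int('x'), Int('u'), Int('v')
lemmas = [x >= 0, x % 2 == 0]                  # hypothetical lemma set L
Tr = (v == u + 2)
n = len(lemmas)

pre  = [Bool('pre_%d' % i)  for i in range(n)]
post = [Bool('post_%d' % i) for i in range(n)]

s = Solver()
s.add(And([Implies(pre[i], substitute(lemmas[i], (x, u))) for i in range(n)]))   # /\ (pre_i -> L_i(u))
s.add(Tr)                                                                         # Tr(u, v)
s.add(Or([And(post[i], Not(substitute(lemmas[i], (x, v)))) for i in range(n)]))   # \/ (post_i /\ ~L_i(v))

# L is inductive iff A_{L,Tr} is unsatisfiable under all pre_i and post_i assumed true.
print(s.check(pre + post) == unsat)            # True for this toy L and Tr
# "Removing" lemma i from L(u) or L(v) amounts to asserting Not(pre[i]) or Not(post[i]).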
However, as explained above, our intention is to take advantage of proposi-
tional MUS extractors, using the fact that quantifier-free bit-vector formulas can
be soundly converted to propositional logic. The pre and post variables serve a
purpose in this context as well. Assume that we have a polytime computable
function B2P, which, given a quantifier-free formula FBV over the theory BV and a set of propositional variables X = {x1, . . . , xk} that occur in FBV, returns a propositional formula FProp = B2P(FBV, X), in CNF, with the following property: for any assignment τ to the variables in X, the formula FBV[τ] is satisfiable if and only if so is the formula FProp[τ]. Following [29], we say that the formulas FBV and FProp are var-equivalent on X in this case. Note that var-equivalence of FBV and FProp on X does not imply that FProp contains all variables of X — for example, FProp = ⊤ is var-equivalent to FBV if FBV[τ] is satisfiable for every assignment τ for X.
Now, for a set of lemmas L = {L1 , . . . , Ln } and a transition relation Tr
over BV, let AL,Tr be the formula defined in (3), let Pre = {prei | i ∈ [1, n]},
Post = {posti | i ∈ [1, n]}. Consider the group-CNF formula GL,Tr constructed
in the following way:
GL,Tr = G0 ∪ G1 ∪ · · · ∪ Gn , where:
G0 = CL,Tr ∪ {(prei ) | i ∈ [1, n]}, with CL,Tr = B2P (AL,Tr , Pre ∪ Post)
Gi = {(¬posti )} for i ∈ [1, n]
That is, the group G0 of GL,Tr is the formula CL,Tr — a CNF formula var-
equivalent to AL,Tr on the set Pre ∪ Post — together with the positive unit
clauses for pre variables. Each group Gi in GL,Tr consists of a single negative
unit clause for the variable posti .
variables is slightly more technical. Assume that in the first iteration of the lazy
MIS computation algorithm a maximal semi-inductive set L′ of L is computed, and that L′ ⊂ L. At this point, some of the lemmas L(u) (i.e., the precondition lemmas) have to be removed from L. One possibility is to build a new formula AL′,Tr analogously to that in equation (3), apply the function B2P to it, and proceed with the computation of the maximum semi-inductive subset of L′. An alternative is to re-use the CNF formula CL,Tr, obtained by translating the original formula AL,Tr via B2P, and simply add negative unit clauses (¬prei) and (¬posti) for each of the lemmas removed from L. This way we avoid re-invoking B2P, and open up the possibility of reusing more information between the invocations of the group-MUS extractor. As the group-CNF formula GL,Tr does need to be modified between iterations by taking into account the removal of some of the lemmas, for a set L′ ⊆ L of remaining lemmas we define the group-CNF formula GL,L′,Tr as follows:
The pseudocode of the MIS computation algorithm based on the ideas pre-
sented above is presented in Algorithm 2. Given a set of BV lemmas L and a
transition relation formula Tr , the algorithm constructs the formula AL,Tr , de-
fined in (3), and converts the formula to CNF using a var-equivalence preserving
function B2P. The set L′ that will eventually represent the resulting MIS is initialized to L. The main loop of the algorithm reflects the outer loop of the lazy MIS computation approach. On every iteration, the maximum semi-inductive subset of L′ is computed via the reduction to group-MUS computation, as justified by Proposition 1. If the group-MUS is empty, then, according to Proposition 1, the set L′ itself is inductive, and, therefore, based on the correctness of the lazy MIS computation algorithm, is the MIS of L. Otherwise, L′ is updated
to the computed maximum semi-inductive subset represented by the extracted
Fig. 1. Performance of Z3/PDR and MISper for the target theories BV∗ (32) (left) and
BV∗ (16) (right) in terms of CPU runtime. Timeout of 1800 seconds is represented by
the dashed (green) lines; orders of magnitude are represented by diagonals.
group-MUS (line 9). Note that the removal of the lemmas from the premise for-
mula L(u) performed at this stage during the lazy MIS computation is implicit
in the construction of the group-CNF formula GL,L′,Tr in the next iteration of the main loop (cf. (4)). The termination of the algorithm is guaranteed by the fact that on every iteration at least one lemma is removed from L′, and so, in the worst case, there will be an iteration of the main loop with L′ = ∅. Since, in this case, L′ is inductive, by Proposition 1 the computed group-MUS will be ∅,
and the algorithm terminates.
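The following sketch summarizes our reading of this loop; both helper functions are hypothetical, wrapping the construction of GL,L′,Tr and a propositional group-MUS extractor (e.g., a MUSer2-style tool [3]).

def mis_via_group_mus(lemma_count, groups_for, group_mus):
    """Hypothetical helpers: groups_for(keep) builds the group-CNF G_{L,L',Tr} for the
    remaining lemma indices keep; group_mus(groups) returns the indices of the groups
    in an extracted group-MUS."""
    keep = set(range(lemma_count))            # L', initialized to all of L
    while True:
        mus = group_mus(groups_for(keep))
        if not mus:
            return keep                       # empty group-MUS: L' is inductive, hence the MIS
        keep -= set(mus)                      # drop the MUS lemmas; the remaining ones form the
                                              # maximum semi-inductive subset, rechecked next round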
Table 1. Performance of Z3/PDR and MISper for the target theories BV∗ (32) and
BV∗ (16). Within each horizontal section, the first row (all) presents the data for all
214 instances, while the second row (unsol.) presents the data for those instances that
were not solved by Z3/PDR. “Solved” means that the tool returned SAFE within the
timeout/memout of 1800 sec/4 GB. Column Z3/PDR shows the data for Z3/PDR —
each cell contains the number of solved instances (#sol), followed by the average and the
median of the CPU times on the solved instances (avg/med). Column MISper displays
the same data for MISper. Column MISper:Cand displays the data for instances solved
by MISper by proving the safety of the candidate invariant Cand (Alg. 1, line 12).
Column MISper:MIS displays the data for instances solved by MISper by computing
MIS of Cand , and invoking Z3/PDR on strengthened system (Alg. 1, lines 13-14). For
example, the first row in the table shows that out of 214 instances, Z3/PDR solved 116,
while MISper solved 174, out of which 165 were solved immediately after the conversion
of LA invariant to BV∗ (32), and 9 were solved after extracting invariants.
bit width  inst.   count  Z3/PDR             MISper             MISper:Cand      MISper:MIS
                          #sol(avg/med)      #sol(avg/med)      #sol(avg/med)    #sol(avg/med)
32         all     214    116(127.54/8.27)   174(28.34/0.43)    165(8.50/0.42)   9(391.95/133.94)
32         unsol.  98     —                  58(75.90/1.03)     52(21.89/0.70)   6(544.05/366.18)
16         all     214    165(176.69/8.20)   182(69.32/0.38)    165(8.37/0.36)   17(660.91/399.32)
16         unsol.  49     —                  18(624.79/376.24)  6(50.80/21.45)   12(911.78/1094.58)
from the set of SAFE benchmarks used in 2013 Competition on Software Verifica-
tion, SVCOMP’13³. We translated the benchmarks to Horn SMT formulas over
the theories BV∗ (32) and BV∗ (16) (recall Example 1), after replacing the unsup-
ported bit-vector operations by fresh variables — hence, the resulting systems
are an over-approximation of the original programs⁴. We compared the perfor-
mance of Z3/PDR engine with that of MISper, instantiated with the theory of
linear arithmetic (LA) as a working theory TW . All experiments were performed
on Intel Xeon X3470, 32 GB, running Linux 2.6. For each experiment, we set a
CPU time limit of 1800 seconds, and a memory limit of 4 GB.
The scatter plots in Figure 1, complemented by Table 1, summarize the re-
sults of our experiments. In 32-bit experiments, MISper solved all 116 instances
solved by Z3/PDR, and an additional 58 on which Z3/PDR exceeded the allot-
ted resources (174 in total). Furthermore, judging from the scatter plot (left),
on the vast majority of instances MISper was at least one order of magnitude
faster than Z3/PDR, and, in some cases, the performance improvement exceeded
three orders of magnitude. The 16-bit benchmarks were, not surprisingly, easier
for Z3/PDR than the 32-bit ones, and so it succeeded in solving significantly more
problems (165). Nevertheless, MISper significantly outperforms Z3/PDR in this
setting as well, solving 17 more benchmarks, and still demonstrating multiple
orders of magnitude performance improvements. We found only one instance
solved by Z3/PDR, but unsolved by MISper (exceeded time limit). To summa-
rize, the results clearly demonstrate the effectiveness of the proposed framework.
³ http://sv-comp.sosy-lab.org/2013.
⁴ The benchmarks are available at http://bitbucket.org/arieg/misp.
5 Conclusion
In this paper, we introduced a bit-precise program verification framework MISper.
The key idea behind the framework is to transfer, at least partially, information
obtained during the verification of an unsound approximation of the original
program in the form of bit-precise invariants. We describe a novel approach to
computing such invariants that allows us to take advantage of state-of-the-art
propositional MUS extractors. The results of the experiments with our proto-
type implementation of the framework suggest that the proposed approach is
promising. Furthermore, the verification tool FrankenBit [20] that integrates our
prototype implementation of MISper with LLBMC [32], has won two awards at
the 2014 Competition on Software Verification (SVCOMP’14).
References
1. Albarghouthi, A., Gurfinkel, A., Chechik, M.: From Under-Approximations to
Over-Approximations and Back. In: Flanagan, C., König, B. (eds.) TACAS 2012.
LNCS, vol. 7214, pp. 157–172. Springer, Heidelberg (2012)
2. Belov, A., Lynce, I., Marques-Silva, J.: Towards efficient MUS extraction. AI Com-
mun. 25(2) (2012)
3. Belov, A., Marques-Silva, J.: MUSer2: An Efficient MUS Extractor. JSAT 8(1/2)
(2012)
4. Beyer, D., Löwe, S., Novikov, E., Stahlbauer, A., Wendler, P.: Precision reuse for
efficient regression verification. In: ESEC/SIGSOFT FSE (2013)
5. Bradley, A.R.: SAT-Based Model Checking without Unrolling. In: Jhala, R.,
Schmidt, D. (eds.) VMCAI 2011. LNCS, vol. 6538, pp. 70–87. Springer, Heidel-
berg (2011)
6. Brummayer, R., Biere, A.: Boolector: An Efficient SMT Solver for Bit-Vectors and
Arrays. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505,
pp. 174–177. Springer, Heidelberg (2009)
7. Bryant, R.E., Kroening, D., Ouaknine, J., Seshia, S.A., Strichman, O., Brady,
B.A.: Deciding Bit-Vector Arithmetic with Abstraction. In: Grumberg, O., Huth,
M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 358–372. Springer, Heidelberg (2007)
8. Chockler, H., Ivrii, A., Matsliah, A., Moran, S., Nevo, Z.: Incremental formal ver-
ification of hardware. In: FMCAD (2011)
9. Cimatti, A., Griggio, A., Schaafsma, B.J., Sebastiani, R.: The MathSAT5 SMT
Solver. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp.
93–107. Springer, Heidelberg (2013)
10. Clarke, E., Kroening, D., Lerda, F.: A Tool for Checking ANSI-C Programs.
In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 168–176.
Springer, Heidelberg (2004)
11. Cordeiro, L., Fischer, B., Marques-Silva, J.: SMT-Based Bounded Model Checking
for Embedded ANSI-C Software. IEEE Trans. Software Eng. 38(4) (2012)
12. de Moura, L., Bjørner, N.: Z3: An Efficient SMT Solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
13. Eén, N., Mishchenko, A., Brayton, R.K.: Efficient implementation of property di-
rected reachability. In: FMCAD (2011)
14. Eén, N., Sörensson, N.: Temporal induction by incremental SAT solving. Electr.
Notes Theor. Comput. Sci. 89(4) (2003)
15. Fedyukovich, G., Sery, O., Sharygina, N.: Function Summaries in Software Up-
grade Checking. In: Eder, K., Lourenço, J., Shehory, O. (eds.) HVC 2011. LNCS,
vol. 7261, pp. 257–258. Springer, Heidelberg (2012)
16. Flanagan, C., Leino, K.R.M.: Houdini, an Annotation Assistant for ESC/Java. In:
Oliveira, J.N., Zave, P. (eds.) FME 2001. LNCS, vol. 2021, pp. 500–517. Springer,
Heidelberg (2001)
17. Ganesh, V., Dill, D.L.: A Decision Procedure for Bit-Vectors and Arrays. In:
Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 519–531.
Springer, Heidelberg (2007)
18. Godlin, B., Strichman, O.: Regression verification. In: DAC (2009)
19. Griggio, A.: Effective word-level interpolation for software verification. In: FMCAD
(2011)
20. Gurfinkel, A., Belov, A.: FrankenBit: Bit-Precise Verification with Many Bits
(Competition Contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014.
LNCS, vol. 8413, pp. 408–411. Springer, Heidelberg (2014)
21. Gurfinkel, A., Rollini, S.F., Sharygina, N.: Interpolation properties and SAT-based
model checking. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172,
pp. 255–271. Springer, Heidelberg (2013)
22. Hoder, K., Bjørner, N.: Generalized Property Directed Reachability. In: Cimatti,
A., Sebastiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 157–171. Springer, Hei-
delberg (2012)
23. Kahsai, T., Ge, Y., Tinelli, C.: Instantiation-Based Invariant Discovery. In: Bobaru,
M., Havelund, K., Holzmann, G.J., Joshi, R. (eds.) NFM 2011. LNCS, vol. 6617,
pp. 192–206. Springer, Heidelberg (2011)
24. Komuravelli, A., Gurfinkel, A., Chaki, S., Clarke, E.M.: Automatic Abstraction
in SMT-Based Unbounded Software Model Checking. In: Sharygina, N., Veith, H.
(eds.) CAV 2013. LNCS, vol. 8044, pp. 846–862. Springer, Heidelberg (2013)
25. Kroening, D., Weissenbacher, G.: Lifting Propositional Interpolants to the Word-
Level. In: FMCAD (2007)
26. Kroening, D., Weissenbacher, G.: Interpolation-Based Software Verification with
Wolverine. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806,
pp. 573–578. Springer, Heidelberg (2011)
27. Kuncak, V., Rybalchenko, A. (eds.): VMCAI 2012. LNCS, vol. 7148. Springer,
Heidelberg (2012)
28. Lahiri, S.K., Hawblitzel, C., Kawaguchi, M., Rebêlo, H.: SYMDIFF: A Language-
Agnostic Semantic Diff Tool for Imperative Programs. In: Madhusudan, P., Seshia,
S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 712–717. Springer, Heidelberg (2012)
29. Lang, J., Liberatore, P., Marquis, P.: Propositional Independence: Formula-
Variable Independence and Forgetting. J. Artif. Intell. Res. (JAIR) 18 (2003)
30. Marques-Silva, J., Janota, M., Belov, A.: Minimal Sets over Monotone Predicates in
Boolean Formulae. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044,
pp. 592–607. Springer, Heidelberg (2013)
31. McMillan, K.L.: Lazy Abstraction with Interpolants. In: Ball, T., Jones, R.B. (eds.)
CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006)
32. Merz, F., Falke, S., Sinz, C.: LLBMC: Bounded Model Checking of C and C++
Programs Using a Compiler IR. In: Joshi, R., Müller, P., Podelski, A. (eds.) VSTTE
2012. LNCS, vol. 7152, pp. 146–161. Springer, Heidelberg (2012)
33. Nadel, A.: Boosting minimal unsatisfiable core extraction. In: FMCAD (2010)
34. Nadel, A., Ryvchin, V., Strichman, O.: Efficient MUS Extraction with Resolution.
In: FMCAD (2013)
PEALT: An Automated Reasoning Tool
for Numerical Aggregation of Trust Evidence
1 Introduction
Trust is a fundamental factor that influences decisions pertaining to human inter-
actions, be they social or economic in nature. Mayer et al. [11] offer a definition of
trust as “... the willingness to be vulnerable, based on positive expectation about
the behavior of others.” These expectations of the trustor would be informed
by trust signals exchanged with the trustee of a planned interaction. Trust has
an economic incentive: it avoids the use of costly measures that guarantee as-
surance in the absence of trust-enabled interaction. We note that assurance is
the established means of realizing “IT security”. Traditionally, trust signals (e.g.
body language) could be observed both in spatial and temporal proximity to
a planned interaction. Modern IT infrastructures, however, disembed agents in
space and in time from such signals and interaction resources, making it hard to
use existing trust mechanics such as those proposed in [17] in this setting [10].
This identifies a need for a calculus in which trust and distrust signals can
be expressed and aggregated to support decision making in a variety of applica-
tions (e.g. financial transactions, software installations, and run-time monitoring
of hardware). In our proposed methodology, signals of trust or distrust have no
effect in their absence but evaluate to a score in their presence. These scores
may be determined by techniques suitable for the types of signals, e.g. machine
learning if signals are features, metrics if signals indicate trustworthiness of IT
infrastructures, etc. This then makes it challenging to devise a calculus for com-
bining scores of different types in a manner that articulates the expectations in
trust-mediated interactions. Let us give some examples of this.
Trust of an individual in an online transaction will depend, amongst other
things, on the monetary value of that transaction, the reputation of the seller,
and contextual information such as recommendations from friends. IT infras-
tructures in highly dynamic and volatile environments such as military operating
theatres can no longer be secured in a binary “secure or insecure” manner. They
have to react to risks in agile manners [1], suggesting the use of compositional
metrics for run-time trust management. Similarly, run-time systems may want
to monitor executing code by measuring signals from execution characteristics –
such as the threat level of parsed input (e.g. input such as meta-data may serve
as an attack surface [19]), the domain of a remote procedure call, etc. – and ag-
gregate such evidence to control execution paths. We refer to [8] for a case study
of such execution control in the Scala programming language, where methods are
annotated with expectation blocks – a precursor of the language Peal [4] – whose
aggregation computes what corresponds to the score of a policy set in Peal.
These examples suggest that a trust calculus needs to express evidence that is not
only rooted in trust (e.g. an asset value), needs to be extensible for domain-
specific expressions of signals (e.g. those of a social network), and requires a
means of calculating trust from observed signals (e.g. compositional metrics). In
[4], such a language Peal was proposed in which signals are abstract predicates
whose truth triggers a score, and where score aggregation captures reasoning
about levels of trust. In [4], several analyses were also defined that assess if trust
calculations perform as expected by specifiers. Verification of trust calculations
is thus a key ingredient of such an approach, and the focus of this paper.
We here express the analysis of Peal expressions as constraints that can be an-
alyzed with the SMT solver Z3, and so capture logical dependencies of (dis)trust
signals. Specifically, we refine and extend the language Peal of [4] to support a
richer calculus, we implement analyses proposed in [4] in the SMT solver Z3 on
this richer language via two different methods of automated Z3 code generation
in PEALT, and we experimentally explore the trade-offs of both methods.
Outline of paper. Section 2 contains background on Peal and the SMT solver Z3.
Design and implementation of PEALT are outlined in Section 3. In Section 4, we
describe two methods for converting conditions used in analyses into Z3 input.
The validation of PEALT via experiments and other activities is reported in
Section 5. Section 6 contains related work, and Section 7 concludes the paper.
2 Background
its declared score sj , has no effect if predicate qj is false (no signal), and has
score sj as effect otherwise (signal present). Policies pol have form as in (1) and contain zero or more rules, a default score s, and an aggregation operator op. Policy pi returns default score s if all its rules have false predicates; otherwise it returns the result of applying op to all scores sj of true predicates qj.
Fig. 1. Syntax of Peal, where q ranges over some language of predicates, and th and score range over real numbers (potentially restricted by domains or analysis methods)
The design of Peal is layered as in [4]. Supported aggregation operators are min (e.g.
for distrust signals), max (e.g. for trust signals), + (e.g. for accumulative sig-
nals), and ∗ (e.g. for aggregating independent probabilistic evidence). Policies
are composed into policy sets (pSet) using max and min. Finally, policy sets are
compared to thresholds th using inequalities in conditions cond. The intuition
is that scores and thresholds are real numbers but that some analysis methods
may constrain the ranges of said values. The latter is one reason why the PEALT
input language under-specifies such design choices. The meaning of policy com-
position is context-dependent. For example, if a condition th < min(pS1, pS2) is
used in support of recommending an action, then min acts as a pessimistic
composition since the score of any of its arguments may falsify this condition.
SMT solver Z3. Satisfiability modulo theories [5] is supported with robust and
powerful tools that combine the state of the art of deductive theorem proving
with that of SAT solving for propositional logic. The SMT solver Z3 has a
declarative input language for defining constants, functions, and assertions about
them [12]. Figure 2 shows Z3 input code to illustrate that language and its
principal analysis directives. On the left, constants of Z3 type Bool and Real are
declared. Then an assertion defines that the Boolean constant q1 means that x
is less than y + 1, and the next assertion insists that q1 be true. The directives
check-sat and get-model instruct Z3 to find a witness of the satisfiability of
the conjunction of all visible assertions, and to report such a witness (called a
model). On the right, we see what Z3 reports for the input on the left: sat states
that there is a model; other possible replies are unsat (there cannot be a model),
and unknown (Z3 does not know whether or not a model exists).
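For readers unfamiliar with Z3's API, the following minimal sketch (written in the Z3 Python API rather than the textual input language of Figure 2) expresses the same kind of input: declare constants, bind a Boolean to a linear constraint, and ask for a satisfying model.

from z3 import Bool, Real, Solver, sat

q1 = Bool('q1')
x, y = Real('x'), Real('y')

s = Solver()
s.add(q1 == (x < y + 1))   # q1 means "x is less than y + 1"
s.add(q1)                  # insist that q1 be true
if s.check() == sat:       # corresponds to the check-sat directive
    print(s.model())       # corresponds to get-model: report a witness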
Fig. 2. Left: Z3 input with directives to find and generate a model. Right: Z3 output
for this input, a model that makes all input assertions true. (Both edited to save space.)
POLICIES
b1 = min ((companyDevice 0.1) (uncertifiedOrigin 0.2) (nonMatchingHash 0.2)) default 1
b2 = + ((downloadWithBrowserX 0.1) (useIOS 0.2) (useLinux 0.1) (recentPatch 0.1)) default 0
POLICY_SETS
pSet = min(b1, b2)
CONDITIONS
cond1 = 0.2 < pSet
cond2 = 0.1 < pSet
DOMAIN_SPECIFICS
(declare-const numberOfDaysSinceLastPatch Real)
(assert (= recentPatch (< numberOfDaysSinceLastPatch 7)))
ANALYSES
ana1 = always_true? cond1
ana2 = equivalent? cond1 cond2
4 Z3 Code Generation
Our tool only generates code for conditions that are used, i.e., that are declared
in the input panel and occur in at least one declared analysis as argument.
Let c1 be the declared name of such a condition for declaration c1 = cond. We
generate Z3 code that declares c1 as Z3 type Bool and adds an assert statement
that binds the name c1 to φ[cond] via (assert (= c1 φ[cond])) where φ[cond] is
Z3 code for the logical formula generated for condition cond.
The code generated for φ[cond] explicitly or implicitly lists all signal scenarios
that may occur if we ignore any logical dependencies between signals. This means
that we delegate to our analysis backend, the Z3 SMT solver, the task of only
generating scenarios in analyses that are also logically feasible. We now describe
two methods for generating Z3 code for φ[cond], starting with the explicit one.
φ[min(pS1, pS2) ≤ th] ≝ φ[pS1 ≤ th] ∨ φ[pS2 ≤ th]      (2)
φ[max(pS1, pS2) ≤ th] ≝ φ[pS1 ≤ th] ∧ φ[pS2 ≤ th]      (3)
φ[th < min(pS1, pS2)] ≝ φ[th < pS1] ∧ φ[th < pS2]      (4)
φ[th < max(pS1, pS2)] ≝ φ[th < pS1] ∨ φ[th < pS2]      (5)

Q1(pol, cond) ≝ (s ≤ th, cond = pol ≤ th) ∨ (th < s, cond = th < pol)
Q2(pol, cond) ≝ (th < s, cond = pol ≤ th) ∨ (s ≤ th, cond = th < pol)
Q3(op, cond) ≝ (op ∈ {+, max}, cond = pol ≤ th) ∨ (op ∈ {∗, min}, cond = th < pol)
Q4(op, cond) ≝ (op = ∗, cond = pol ≤ th) ∨ (op = +, cond = th < pol)

φ[cond] ≝ (¬q1 ∧ · · · ∧ ¬qn) ∨ φ_op^ndf[cond]      (when Q1(pol, cond) is true)      (6)
φ[cond] ≝ (q1 ∨ · · · ∨ qn) ∧ φ_op^ndf[cond]      (when Q2(pol, cond) is true)      (7)
φ_op^ndf[cond] ≝ ¬φ_op^ndf[dual(cond)]      (when Q3(op, cond) is true)      (8)
φ_max^ndf[th < pol] ≝ ⋁_{i | th < si} qi          φ_min^ndf[pol ≤ th] ≝ ⋁_{i | si ≤ th} qi      (9)
φ_op^ndf[cond] ≝ ⋁_{X ∈ Mop} ⋀_{i ∈ X} qi      (when Q4(op, cond) is true)      (10)

Fig. 4. Explicit code generation (recursively): pol has form as in (1); predicates Q1 to Q4 drive the compilation logic; the computation of sets Mop is detailed in Figure 5
enum+(X, acc, index, op) {
  if (th < acc) { output X; }
  else {
    j = index − 1;
    while ((0 ≤ j) ∧ (th < op(acc, tj))) {
      enum+(X ∪ {j}, op(acc, sj), j, op);
      j = j − 1; }}}

enum∗(X, acc, index, op) {
  if (acc ≤ th) { output X; }
  else {
    j = index − 1;
    while ((0 ≤ j) ∧ (op(acc, tj) ≤ th)) {
      enum∗(X ∪ {j}, op(acc, sj), j, op);
      j = j − 1; }}}

Fig. 5. The algorithm enum+ (top) computes M+, where scores si are sorted in ascending order; the algorithm enum∗ (bottom) computes M∗, where si are sorted in descending order. Initial call context is ({}, 0, n, +) for enum+ and ({}, 1, n, ∗) for enum∗.
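For intuition, the following is a simplified, unpruned Python sketch of the enumeration behind enum+: it outputs the minimal index sets X whose selected scores sum to more than th, assuming the scores are sorted in ascending order (Figure 5 additionally prunes branches via the bounds tj).

def enum_plus(scores, th):
    # scores must be sorted in ascending order, as required for enum+
    results = []
    def rec(X, acc, index):
        if th < acc:
            results.append(X)                 # X already exceeds th; do not extend further
            return
        for j in range(index - 1, -1, -1):    # try adding smaller-indexed (smaller) scores
            rec(X | {j}, acc + scores[j], j)
    rec(frozenset(), 0.0, len(scores))
    return results

# e.g. enum_plus([0.1, 0.2, 0.3], 0.25) returns [frozenset({2}), frozenset({0, 1})]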
in (9) have the intended meaning for all sign combinations. Z3 code generated
for the PEALT input in Figure 3 is shown in Figure 6. PEALT uses the push
and pop directives of Z3 in order to add constraints specific to an analysis onto
the top of the assertion visibility stack that Z3 maintains, and to discharge these
assertions before turning to the next analysis. The Z3 code generated for analyses
is verbatim the same for the symbolic code generation to which we turn next.
Fig. 6. Explicitly generated code for input from Figure 3 (hand edited to save space)
Symbolic code generation. This method also binds the name c1 of declaration
c1 = cond to its condition via (assert (= c1 φ[cond])). But for each policy pi
occurring in cond, it also declares a constant cond_p_i of Z3 type Bool and then generates φ[cond] as a positive Boolean formula over the constants cond_p_i. This process follows the same logic as for explicit code generation in (2) to (5). For each declared constant cond_p_i of Z3 type Bool, it then adds an assert statement (assert (= cond_p_i φ[cond_p_i])) that defines the meaning of cond_p_i.
For policies pi of form as in (1), the code generated is similar to the one of the
explicit method when op equals min or max – we refer to [7] for further details.
Let op equal ∗ or + and policy pi occur in at least one condition within some
declared analysis. Then the code generation for φ[cond_p_i] in Figure 7 trades off
the space complexity of enumerating elements in M+ and M∗ with the time
complexity of solving real-valued inequalities in the Z3 SMT solver. For each
predicate qj within pi, we declare a constant p_i_score_q_j of Z3 type Real, and add two assertions that, combined, model that the value of p_i_score_q_j is sj iff
qj is true, and that this value equals the unit of + (respectively, ∗) iff qj is
false. This means that we can precisely model the effect of the non-default case
(when at least one qj is true) by aggregating all values p_i_score_q_j with op, and
by comparing that aggregated result to the threshold in the specified manner
(< or ≥). Crucially, the values of p_i_score_q_j for predicates that happen to be
false won’t contaminate this aggregated value as they are units for operator op.
The encoding for symbolic code generation is therefore linear in the size of
cond. Using this encoding, we can now express φ[cond_p_i] in Z3 by directly encoding the “operational” semantics of cond_p_i: either the default score satisfies the inequality and all policy predicates are false, or at least one policy predicate is true and the aggregation of all values p_i_score_q_j with op satisfies the inequality.
These Z3 declarations and expressions are stated in Figure 7.
Fig. 7. Top: declarations for p_i_score_q_j where s_j is sj, and <unit> is 0.0 for + policies pi and 1.0 for ∗ policies pi. Bottom: Z3 code for φ[cond_p_i] for the first case in (1); comparison operator cop is < for th < pi or ≥ for th ≥ pi, and th denotes th.
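A minimal sketch of this symbolic encoding for a single hypothetical + policy with three rules (written in the Z3 Python API rather than the generated Z3 text):

from z3 import Bool, Real, Solver, Implies, And, Or, Not, Sum, RealVal

q = [Bool('q%d' % j) for j in range(3)]            # rule predicates
s = [RealVal(v) for v in ('0.1', '0.2', '0.1')]    # rule scores (hypothetical)
default, th = RealVal(0), RealVal('0.2')

score = [Real('p1_score_q%d' % j) for j in range(3)]
enc = Solver()
for j in range(3):
    enc.add(Implies(q[j], score[j] == s[j]))       # q_j true  -> score_j = s_j
    enc.add(Implies(Not(q[j]), score[j] == 0))     # q_j false -> score_j = unit of +

# th < p1 holds either via the default score (all predicates false) or via the
# aggregated non-default score (at least one predicate true).
cond_p1 = Or(And(th < default, And([Not(qj) for qj in q])),
             And(Or(q), th < Sum(score)))
enc.add(cond_p1)
print(enc.check())                                 # prints sat: some signal scenario makes 0.2 < p1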
Analyses. Analysis implies? checks whether the first condition logically implies
the second one, which is a form of policy refinement. Analyses always false?
and satisfiable? are “equivalent” but capture different intent of the user, ditto
for analysis equivalent? versus analysis different?. A typical use of analysis
different? is to check whether conditions differ for 0.5 < pSet and 0.6 < pSet,
i.e. whether pSet is sensitive to the increase of threshold value from 0.5 to 0.6.
Witness generation. For each declared analysis, Z3 will try to decide it when
running PEALT. If the Z3 output is unsat, then we know that there is no witness
to the query – e.g. for always true? this would mean that Z3 decides that the
condition cannot be false, and so the answer is “yes, always true”. If the Z3
output is sat, then we report the correct answer (e.g. for always true? we say
“no, not always true”) and generate supporting evidence for this answer. For
explicit code generation, the generated models tend to be very short (few crucial
truth values of predicates qi and supporting values of variables used to define
these qi if applicable). PEALT post-processes this raw Z3 output to extract this information in pretty-printed form; an example thereof is seen in Figure 8. For symbolic code generation, models list truth values for almost all declared
predicates qi that occur in at least one ∗ or + policy. The reason for this seems
to stem from the assertions we declare for variables p_i_score_q_j in Figure 7.
We mean to investigate how to shorten such evidence in future work.
Result of analysis [ana1 = always_true? cond1]
cond1 is NOT always true
For example, when useLinux is true, recentPatch is true,
nonMatchingHash is true, companyDevice is false
Fig. 8. Sample of pretty printed evidence for satisfiability witness computed from ex-
plicitly generated code for always true? from Figure 3 (hand edited to save space)
5 Validation
We report experimental results for code generation methods and execution of
generated code on random and non-random analyses. We also discuss other tool
validation activities we conducted. All experiments were run on a test server
with two 6-core Intel E5 CPUs running at 2.5 GHz and 48 GB of RAM.
Non-random benchmark. We use condition 0.5 < pmv(n) with + policy pmv(n) ,
default score 0, and n many rules each with score 1/n. The condition is true
when more than half of the predicates are true (“majority voting”). There are
no logical dependencies of predicates in pmv(n) and the size of M+ is exponential
in n. We can explicitly generate Z3 input code for values of n up to 27 (when the code takes up half a gigabyte), and code generation takes more than five minutes for n = 23. By comparison, we could symbolically generate such code and
verify that this condition is true, within five minutes each, for n up to 49408.
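As a small sanity check of our own (not part of the reported experiments), one can confirm for a fixed small n that the encoding of 0.5 < pmv(n) agrees with the majority-voting reading:

from z3 import Bool, Sum, If, Q, Solver, Not, Or, unsat

n = 5
q = [Bool('q%d' % j) for j in range(n)]
score = Sum([If(qj, Q(1, n), 0) for qj in q])             # + policy, each rule scores 1/n
pmv = If(Or(q), score, 0)                                 # default score 0 if no rule fires
cond = Q(1, 2) < pmv
majority = Sum([If(qj, 1, 0) for qj in q]) > n // 2

s = Solver()
s.add(Not(cond == majority))                              # look for a disagreeing scenario
assert s.check() == unsat                                 # none exists: the two readings agree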
that randomly generates a policy set pSet, two conditions th < pSet and th +
δ < pSet and analyses the first one with always true?, the second one with
always false?, and then applies different? to both conditions. Predicates are
randomly selected from a pool of p many predicates (with n ≤ p). Scores are
chosen from [0, 1] uniformly at random. In pSet, there are n policies for each
operator op of Peal (i.e. 4n policies in total) and each op policy has mop many
rules. For the maximal k with 2k ≤ 4n, we combine 2k policies using alternating
max and min compositions on their full binary parse tree; the result is further
composed with the remaining 4n−2k policies (if applicable) by grouping these in
min pairs, and by adding these pairs in alternating min and max compositions
to the binary policy tree. This stress tests policy composition above and beyond
what one would expect in practical specifications.
We then conducted three experiments that share an execution and termination
logic: experimental input to randPeal has only one degree of freedom and we use
unbounded binary search to see (within granularity of 10 and for five randomly
generated condition pairs) whether both code generation methods can generate
Z3 code within five minutes, and whether Z3 can perform each analysis within
that same time frame. If this fails for one of these condition pairs, we stop binary
expansion and go to a bisection mode to find the boundary.
Experiment 1 picks for operator min input headers 1, x, 1, 1, 1, 3x, 0.5, 0.1 so
it explores how many (x) rules a sole min policy can handle within five minutes.
The same evaluation is done for the other three operators. We also investigated
a variant of this experiment – Exp 1 (DS) – for which we also add as many
assertions as there are declared predicates in the conditions, as described in
[7]. This uses a function calledBy that models method call graphs with at
most one incoming edge (using a forall axiom in Z3 code) and declares a
third of these predicates to mean that a specific method is called. The other two
thirds define predicates as linear inequalities between real, respectively integer,
variables (which may stem from method input headers) – please see [7] for details.
Experiment 2 picks for operator min the input headers n, c, 1, 1, 1, 3c, 0.5, 0.1
where c equals x/10 for the boundary value of x found in Experiment 1. We here
explore how many min policies we can handle for a sizeable number of rules.
The same evaluation is done for the other three operators. Experiment 3 picks
for operator min input headers n, n, 1, 1, 1, 3n, 0.5, 0.1 so that we explore how
many (the n) min policies with the same number of rules we can handle within
five minutes. The same evaluation is done for the other three operators.
Results of these experiments are displayed in Figure 9. In their discussion we
need to recognize that random analyses can have very different analysis times
for the same configuration type. So a termination “boundary” does not mean
that we cannot verify larger instances within five minutes, it just means that we
encountered an instance at the reported boundary that took longer than that.
In the first experiment, Z3 code generation seems faster than execution of
that Z3 code. We also see that up to two million rules can be handled for min
and max for both code generation methods within two minutes. For ∗, explicit
code generation seems to be one order of magnitude better than symbolic code
generation, although the Z3 execution in the latter case appears to be faster.
For +, on the other hand, symbolic code generation now seems to be an order
of magnitude better than the explicit one – handling thousands of rules in just
over two minutes. When we add the domain-specific constraints in Exp 1 (DS),
we notice that min and max can only handle about seven-thousand rules in a
similar amount of time (compared to two million beforehand). The results for ∗
for both methods and for + for explicit code generation seem about the same
as without domain-specific constraints. But + now only can handle less than
two-thousand rules for symbolic code generation. In the second experiment, the
number of rules used for max and min is about two-hundred thousand. We can
deal with about fifty policies with that many rules within five minutes, noting
that code generation now takes more time. It is noteworthy that explicit code
generation can handle over sixty-thousand ∗ policies with 12 rules each, but that
this drops to less than twenty-thousand + policies; the symbolic approach does
not scale that well in comparison. In the third experiment, both methods can
handle between two to three thousand policies with that many rules for max
and min. For operators ∗ and +, the explicit method spends most of its time
in code generation whereas the symbolic one spends the bulk of its time in Z3
execution. For operator ∗, explicit code generation is still about an order of
magnitude better whereas for + it is not significantly better.
Ideally, we would like to extend these experiments to larger data points. But
such an attempt quickly reaches the memory boundary of our powerful server in
explicit code generation. We also believe that practical case studies would not
use more than a few dozen or hundreds of rules for each + and ∗ policy declared,
and so both approaches may actually work well then.
Software validation and future work. We have not yet encountered a Z3 output
unknown for PEALT analyses, although this is easy to achieve by adding complex
constraints as domain specifics. We validated both code generation methods
by running them side by side on randomly generated analyses and checking
whether they would produce conflicting answers (unsat and sat). During the
development of PEALT, we encountered a few of these conflicts which helped to
identify implementation bugs. Of course, this does not mean that we proved the
correctness of our Z3 code generator (written in Scala), and doing so would be
unwise as this generator will evolve with the tool language. Therefore, we want
to independently verify the evidence computed by Z3, in future work. This will
also verify that no double rounding errors in Z3 corrupted analysis outcomes.
In future work, we also want to understand whether we can construct proofs for
outputs unsat such that these proofs are meaningful for the analyses in question.
6 Related Work
The language in Figure 1 extends that in [4]: it supports policies without rules,
∗ policies, negative and non-constant scores for symbolic code generation, and
logical dependencies of predicates qi within PEALT. The symbolic code genera-
tion in PEALT uses the same enumeration process for + and ∗ on minimal index
sets (and not maximal ones as in [4]). PEALT implements most analyses of [4]
with logical dependencies, leaving more complex ones of [4] for future work.
The determination of scores is a fundamental concern in our approach, and
where PEALT is meant to provide confidence in such scorings and their implica-
tions. The process of arriving at scores depends on the application domain; we
offer two examples thereof from the literature. TrustBAC [3] extends role-based
access control with levels of trust, scores in [−1, 1], that are bound to roles in
RBAC sessions. These levels are derived from a trust vector that reflects user be-
havior, user recommendations, and other sources. No analysis of these levels and
their implications is offered. In [16], we see an example of how a sole score may
reflect the integrity of an information infrastructure, as a formula that accounts
for known vulnerabilities, threats that can exploit such vulnerabilities, and the
likelihood for each vulnerability to exist in the given infrastructure. We should
keep in mind that any such metrics are heuristics, and so it is important to an-
alyze their impact on decision making, especially if other factors also influence
such decisions. PEALT allows us, in principle, to conduct such analyses. Extant
work enriches security elements with quantities, e.g. credential chains [18], secu-
rity levels [15], trust-management languages [2], reputation [9], and combinations
of reputation and trust [13,14]. But we are not aware of substantial tool support
for analyzing the effect of such enrichments when combined with other aspects
of evidence. Shinren [6] offers the ability to reason about both trust and distrust
explicitly and in a declarative manner, with the support of priority composition
122 M. Huth and J.H.-P. Kuo
operators for layers of trust and distrust. Although Peal is in principle expressive
enough to encode most of this functionality, doing so would not constitute good
engineering practice: this is a good example for when conditions of Peal would
be expressions to be composed in upstream languages such as Shinren.
7 Conclusions
We have created a tool PEALT in which one can study different mechanisms of
aggregating numerical trust evidence. We extended the policy-composition lan-
guage Peal of [4] and modified the generation of verification conditions reported
in [4] for Peal conditions to make them dischargeable with an SMT solver. We
proposed two different means of generating such verification conditions and dis-
cussed both conceptual and experimental advantages and disadvantages of such
methods. The explicit method compiles away any references to numerical values
and so arrives at a purely logical formulation. The price for this may be an explo-
sion in the length of the resulting formula and in the restriction of score ranges for
certain policy composition operators (e.g. multiplication). The symbolic method
creates formulas of size only linear in the conditions but shifts the compu-
tational burden to Z3 and its reasoning about linear arithmetic. Both methods
delegate to Z3 logical feasibility checks of trust scenarios discovered in analyses.
Our current PEALT prototype supports verification of policy refinement, vacu-
ity checking, sensitivity analysis of thresholds in conditions, and non-constant
scores (for symbolic code generation) to express metrics. We think PEALT is
a good example of the benefits that can be gained by connecting to a powerful
back-end such as the SMT solver Z3 for analyses. The version of the source code
used in this paper is available on https://bitbucket.org/jimhkuo/pealt.
References
1. Announcement of Cybersecurity Collaborative Research Alliance. Press Release,
US Army Research Laboratory (October 15, 2013)
2. Bistarelli, S., Martinelli, F., Santini, F.: A semantic foundation for trust man-
agement languages with weights: An application to the RT family. In: Rong, C.,
Jaatun, M.G., Sandnes, F.E., Yang, L.T., Ma, J. (eds.) ATC 2008. LNCS, vol. 5060,
pp. 481–495. Springer, Heidelberg (2008)
3. Chakraborty, S., Ray, I.: TrustBAC: integrating trust relationships into the RBAC
model for access control in open systems. In: Proceedings of the Eleventh ACM
Symposium on Access Control Models and Technologies, SACMAT 2006, pp. 49–
58. ACM, New York (2006)
4. Crampton, J., Huth, M., Morisset, C.: Policy-based access control from numerical
evidence. Tech. Rep. 2013/6, Imperial College London, Department of Computing
(October 2013) ISSN 1469-4166 (Print), ISSN 1469-4174 (Online)
5. De Moura, L., Bjørner, N.: Satisfiability modulo theories: introduction and appli-
cations. Commun. ACM 54(9), 69–77 (2011)
6. Dong, C., Dulay, N.: Shinren: Non-monotonic trust management for distributed
systems. In: Nishigaki, M., Jøsang, A., Murayama, Y., Marsh, S. (eds.) IFIPTM
2010. IFIP AICT, vol. 321, pp. 125–140. Springer, Heidelberg (2010)
7. Huth, M., Kuo, J.H.P.: PEALT: A reasoning tool for numerical aggregation of trust
evidence. Tech. Rep. 2013/7, Imperial College London, Department of Computing
(2013) ISSN 1469-4166 (Print)
8. Huth, M., Kuo, J.H.-P.: Towards verifiable trust management for software execu-
tion (extended abstract). In: Huth, M., Asokan, N., Čapkun, S., Flechais, I., Coles-
Kemp, L. (eds.) TRUST 2013. LNCS, vol. 7904, pp. 275–276. Springer, Heidelberg
(2013)
9. Jøsang, A., Ismail, R.: The beta reputation system. In: Proceedings of the 15th
Bled Conference on Electronic Commerce, Bled, Slovenia, June 17-19 (2002)
10. Kirlappos, I., Sasse, M.A., Harvey, N.: Why trust seals don’t work: A study of user
perceptions and behavior. In: Katzenbeisser, S., Weippl, E., Camp, L.J., Volkamer,
M., Reiter, M., Zhang, X. (eds.) TRUST 2012. LNCS, vol. 7344, pp. 308–324.
Springer, Heidelberg (2012)
11. Mayer, R., Davis, J., Schoorman, F.D.: An integrative model of organizational
trust. Academy of Management Review 20(3), 709–734 (1995)
12. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
13. Mui, L.: Computational Models of Trust and Reputation: Agents, Evolutionary
Games, and Social Networks. Ph.D. thesis, Massachusetts Institute of Technology
(2002)
14. Muller, T., Schweitzer, P.: On beta models with trust chains. In: Fernández-Gago,
C., Martinelli, F., Pearson, S., Agudo, I. (eds.) IFIPTM. IFIP AICT, vol. 401, pp.
49–65. Springer, Heidelberg (2013)
15. Ni, Q., Bertino, E., Lobo, J.: Risk-based access control systems built on fuzzy infer-
ences. In: Proceedings of the 5th ACM Symposium on Information, Computer and
Communications Security, ASIACCS 2010, pp. 250–260. ACM, New York (2010),
http://doi.acm.org/10.1145/1755688.1755719
16. Nurse, J.R.C., Creese, S., Goldsmith, M., Rahman, S.S.: Supporting human
decision-making online using information-trustworthiness metrics. In: Marinos, L.,
Askoxylakis, I. (eds.) HAS/HCII 2013. LNCS, vol. 8030, pp. 316–325. Springer,
Heidelberg (2013)
17. Riegelsberger, J., Sasse, M.A., McCarthy, J.D.: The mechanics of trust: A frame-
work for research and design. Int. J. Hum.-Comput. Stud. 62(3), 381–422 (2005)
18. Schwoon, S., Jha, S., Reps, T.W., Stubblebine, S.G.: On generalized authorization
problems. In: CSFW, pp. 202–218. IEEE Computer Society (2003)
19. Shapiro, R., Bratus, S., Smith, S.W.: “Weird Machines” in ELF: A Spotlight on
the Underappreciated Metadata. In: Proceedings of the 7th USENIX Workshop on
Offensive Technologies (WOOT 2013), 12 pages. USENIX (2013)
GRASShopper
Complete Heap Verification with Mixed Specifications
1 Introduction
into developing tool support for automated verification of programs against SL specifi-
cations [3,4,9,27]. The cores of such tools are specialized theorem provers for checking
entailments between SL assertions [2, 6, 7, 21]. Much of the work on such provers aims
at decidable fragments of separation logic to guarantee a robust user experience.
Despite the elegance of separation logic, there are certain situations where it is more
appropriate to express specifications in classical logic. This includes, for example, sit-
uations in which data structures exhibit complex sharing or involve constraints about
data, e.g., arithmetic constraints. Reasoning about such constraints is not directly sup-
ported by SL theorem provers. The question is then how to extend these provers without
giving up on decidability and completeness guarantees.
Typically, theory reasoning is realized by using a satisfiability modulo theories (SMT)
solver that is integrated with the SL entailment procedure [5]. However, the interplay
between SL reasoning and theory reasoning is intricate, e.g. equalities inferred by the
theory solvers must be propagated back to the SL solver. Guaranteeing completeness of
such a combined procedure is brittle and often involves the reimplementation of infras-
tructure that is already provided by the SMT solver.
In our previous work, we developed a new approach for checking SL entailments
that reduces to checking satisfiability of formulas expressed in a decidable fragment of
first-order logic [22]. We refer to this fragment as the logic of graph reachability and
stratified sets (GRASS). Formulas in this logic express properties of the structure of
graphs, such as whether nodes in the graph are inter-reachable, as well as properties of
sets of nodes. The combination of these two features enables a natural encoding of the
semantics of SL assertions. The advantage of this approach is that we can now delegate
all reasoning to the SMT solver, exploiting existing infrastructure for combinations [18]
and extensions [25] of first-order theories to handle reasoning about data robustly.
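As a rough illustration of why this delegation works (this is not the GRASS encoding itself, and it ignores reachability), the set-based reading of a separating conjunction of two list segments can be handed to Z3 directly via its Python API:

  from z3 import DeclareSort, Const, Consts, SetSort, SetUnion, SetIntersect, EmptySet, IsMember, Solver

  Node = DeclareSort('Node')
  X, Y, FP = Consts('X Y FP', SetSort(Node))   # footprints of the two conjuncts
  y = Const('y', Node)

  s = Solver()
  s.add(FP == SetUnion(X, Y))                   # overall footprint
  s.add(SetIntersect(X, Y) == EmptySet(Node))   # separation: disjoint footprints
  s.add(IsMember(y, Y), IsMember(y, X))         # a node claimed to lie in both segments
  print(s.check())                              # unsat: separation rules this out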
In this paper, we present GRASShopper, a tool which extends our previous work with
support for local reasoning. Inspired by implicit dynamic frames [20, 24], we present a
translation of programs with mixed separation logic and first-order logic specifications
to programs with GRASS specifications. The translation and verification of the resulting
program is fully automated. The key challenge in this approach is to ensure that the en-
coding of SL assertions and the support for local reasoning remains within a decidable
logic. To this end, we present a decidable extension of the GRASS logic that suffices to
express that reachability information concerning heap paths outside the footprint of a
code fragment is preserved by the execution of that code fragment.
We implemented the decision procedure for our extension of GRASS on top of
the SMT solver Z3 [8] and integrated this decision procedure into GRASShopper. We
used the tool to automatically verify list-manipulating programs such as sorting algo-
rithms whose specifications involve constraints on data. We further considered pro-
grams whose specifications are difficult to express in decidable SL fragments alone.
One example is the find operation of a union/find data structure. The postcondition of
this operation must describe a heap region that consists of an unbounded number of list
segments. With our approach we can easily express this postcondition using a quantified
constraint in classical logic, while using SL assertions to describe the precondition. The
seamless yet robust combination of separation logic and classical logic in a specification
language that supports local reasoning is the key contribution of this work.
  procedure split(x: Node, y: Node, ghost lb: int, ghost ub: int) returns (rx: Node, pivot: Node)
    requires blseg(x, y, lb, ub) ∗ x ≠ y;
    ensures blseg(rx, pivot, lb, pivot.data) ∗ blseg(pivot, y, pivot.data, ub);
    ...
a separate procedure split. After splitting, quicksort recursively calls itself on the two
sublists and concatenates the two sorted list segments.
We provide the specification of split but not its implementation. It is shown in Fig. 2.
The specification is agnostic to implementation details such as whether only the data
values are reordered in the list or the entire nodes. Multiple ensures, respectively, re-
quires clauses in a procedure contract are implicitly connected by spatial conjunction.
The procedure split also demonstrates the convenience of a specification language
that allows mixing of separation logic and reachability logic. The conjunct Btwn(next,
rx, pivot, y) in the second ensures clause is a predicate in our logic GRASS. The pred-
icate states that the node pivot lies between rx and y on the direct next path connecting
the two nodes. That is, the two list segments described by the first ensures clause do
not form a panhandle list. A panhandle list can occur if y is a dangling pointer to an
unallocated node and split allocates that node and inserts it into the list segment from
rx to pivot, thereby creating a cycle. Without the additional reachability constraint, the
specification of split would be too weak to prove the correctness of quicksort because
the final sorted list segment returned by quicksort must be acyclic. If we used either only
separation logic or only reachability logic, the specification of procedure split would be
considerably more complicated (assuming we stayed inside decidable fragments).
The verification of the input program provided to GRASShopper proceeds in three steps:
first we translate the program to an equivalent program whose specification is expressed
solely in our first-order logic fragment GRASS; in the second step we encode the trans-
lated program into verification conditions (also expressed in GRASS) using standard
verification condition generation; finally we decide the generated verification condi-
tions using our GRASS solver. All three steps are fully automated in GRASShopper.
We now explain these steps using the quicksort procedure as a running example.
We first describe the translation of the input program to a GRASS program. The trans-
lation must capture the semantics of Hoare triples in separation logic and preserve the
ability to reason about correctness locally. For a Hoare triple {P} C {Q} to be valid in
separation logic, the precondition P must subsume the footprint of the program frag-
ment C. That is, P specifies the portion of memory that C is allowed to access. This
semantics enables local reasoning, which is distilled into the so-called frame rule. The
frame rule states that if {P} C {Q} is valid, then so is {P ∗ F} C {Q ∗ F} for any SL
assertion F . That is, C does not affect the state of memory regions disjoint from its
footprint. The assertion F is referred to as the frame of the rule application.
The frame rule enables compositional symbolic execution of program fragments.
For example in quicksort, the symbolic state after the call to split in line 13 is de-
scribed by the postcondition of split. The first subsequent recursive call to quicksort
then only operates on the first sublist blseg(rx,pivot,lb,ub) of that symbolic state, leav-
ing blseg(pivot,y,lb,ub) in the frame. The frame rule then implies that this second sublist
is not modified by the first recursive call. All such applications of the frame rule for
procedure calls are made explicit in the GRASS program.
The translation to a GRASS program proceeds one procedure at a time. Each result-
ing procedure is equivalent to its counterpart in the input program, modulo auxiliary
ghost state. This auxiliary ghost state makes the semantics of separation logic specifica-
tions explicit and encodes the applications of the frame rule. Figure 3 shows the result
of the translation for the quicksort procedure. The translation works as follows.
Alloc. First, we introduce a global ghost variable Alloc (line 2), which is used to model
allocation and deallocation instructions. That is, at any point of execution, Alloc denotes
the set of all Node objects that are currently allocated on the heap.
Footprints and Implicit Frame Inference. Each procedure maintains its own footprint
throughout its execution using the dedicated local ghost variable FP. That is, at any point
of a procedure’s execution, FP contains the set of all heap nodes that the procedure
has permission to access or modify at that point. Each heap access or modification
is therefore guarded by an assert statement that checks whether the modification is
permitted by the current footprint (see, e.g., lines 25 and 29). The translation maintains
the invariant that footprints contain only allocated nodes. That is, both allocation and
deallocation instructions affect FP.
For each procedure call, the footprint of the caller is passed to the callee and the
callee returns the new footprint of the caller. That is, it is the callee’s responsibility
to inform the caller about allocation and deallocation operations that affect the caller’s
footprint. For this purpose, each procedure is instrumented with an additional ghost
input parameter FP_Caller and an additional ghost return parameter FP_Caller’.
The contract of the translated procedure governs the transfer of permissions between
caller and callee via the exchanged footprints and ties the footprints to the translations
of the separation logic specifications in the original procedure contract. The initial value
of FP in the translated procedure is determined by the footprint of the separation logic
assertions in the precondition of the input procedure, which itself must be a subset of
the caller's footprint (line 16).
Note that the ghost variable FP is declared as an implicit ghost input parameter of
the procedure (line 13). The semantics of an implicit ghost parameter is that it is ex-
istentially quantified across the entire procedure contract¹. That is, during verification
condition generation, the precondition of the contract is asserted at the call site with all
implicit ghost parameters existentially quantified. When the solver checks the gener-
ated verification condition for this assertion, it needs to find a witness for FP, thereby
implicitly inferring the frame of the procedure call that is used in the application of the frame rule.
¹ We adhere to the usual semantics of procedure contracts where input parameters occurring in ensures clauses refer to the initial values of these parameters.
Fig. 3 (excerpt). Translation of the quicksort procedure (line numbers as referenced in the text):

  16    ...
  23    free ensures Frame(old(Alloc), FP, old(next), next) ∧ Frame(old(Alloc), FP, old(data), data);
  24    { FP_Caller := FP_Caller \ FP;
  25      assert x ≠ y ⟹ x ∈ FP;
  26      if (x ≠ y ∧ x.next ≠ y) {
  27        var pivot: Node, z: Node;
  28        rx, pivot, FP := split(x, y, lb, FP);
  29        assert pivot ∈ FP;
  30        rx, FP := quicksort(rx, pivot, lb, pivot.data, FP);
  31        z, FP := quicksort(pivot.next, y, pivot.data, ub, FP);
  32        pivot.next := z;
  33      } else { rx := x; }
After the precondition has been asserted, it is assumed with the implicit
ghost parameters replaced by fresh Skolem constants. These Skolem constants then also
occur in the assumed postcondition at the call site.
Encoding the Frame Rule. The free requires and ensures clauses in the contract con-
stitute the actual encoding of the frame rule. The free annotation means that the corre-
sponding clause does not need to be checked but can be freely assumed by the callee,
respectively, caller. These clauses follow from the soundness of the frame rule and the
invariants concerning Alloc and the footprints that are guaranteed by the translation. We
discuss the most important parts of the encoding in more detail:
– First, consider the ensures clause in line 20: blseg_fp(rx, y) == (Alloc ∩ FP) ∪ (Alloc \ old(Alloc)). This clause states that the footprint of the postcondition, denoted by blseg_fp(rx, y), accounts for all memory in the initial footprint that has not been deallocated, and all memory that has been freshly allocated (but not
deallocated again) during execution of quicksort. This clause thus implies that the
procedure does not leak memory.
– Next, consider the ensures clause in line 21: FP_Caller' == (FP_Caller \ FP) ∪ (Alloc ∩ FP) ∪ (Alloc \ old(Alloc)). This clause states that the new footprint of the caller, FP_Caller', is the caller's old footprint with the initial footprint of quicksort replaced by quicksort's final footprint (as defined in line 20).
– Finally, the clause in line 23 states that the fields next and data are not modified in
the frame of the call. We express this using the predicate Frame. The frame of the
call is given by the set old(Alloc) \ FP. We discuss the predicate Frame in more detail in the next section, as the choice of its encoding is crucial for the completeness of our translation; the footprint equations above are replayed on concrete values in the sketch after this list.
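The following lines replay the reconstructed footprint equations of lines 20, 21, and 23 on a made-up heap, using plain Python sets; the concrete addresses are invented and serve only to make the bookkeeping tangible.

  old_Alloc = {1, 2, 3, 4}                   # allocated cells at entry
  FP        = {2, 3}                         # initial footprint of quicksort
  FP_Caller = {2, 3, 4}                      # caller's footprint at the call site

  Alloc = (old_Alloc - {3}) | {5}            # cell 3 is freed, cell 5 is allocated

  post_fp       = (Alloc & FP) | (Alloc - old_Alloc)   # line 20: {2, 5}
  FP_Caller_new = (FP_Caller - FP) | post_fp           # line 21: {2, 4, 5}
  frame         = old_Alloc - FP                       # line 23: {1, 4}, untouched by the call
  print(post_fp, FP_Caller_new, frame)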
The translation to GRASS programs that we outlined in the previous section would
then be complete if we considered an axiomatic semantics where GRASS formulas
are interpreted in a first-order logic with transitive closure. Transitive closure enables
² As well as the quantified implicit ghost parameter FP in call-site checks of preconditions.
Fig. 4. Two of the possible heaps at the call site on line 14: (a) y does not reach nodes in the footprint; (b) y reaches the footprint (panhandle list). The footprint of the recursive call to quicksort and the portion of the frame that belongs to the caller's footprint are enclosed in dotted boxes. Solid black edges denote next pointers, dashed black edges indicate next paths, and solid red edges represent the ep function.
Effective Propositional Fragment (EPR). The EPR fragment (aka the Bernays-Schönfinkel-Ramsey class) consists of formulas in which universally quantified variables do
not occur below function symbols. This fragment can be decided quite efficiently us-
ing Z3’s model-based quantifier instantiation mechanism. Hence, all EPR formulas are
passed directly to Z3. For formulas that are not in EPR, we make a finer distinction.
Stratified Sort Fragment. If universally quantified variables appear below function sym-
bols, then instantiating these variables may create new ground terms, which in turn can
be used for instantiation, causing the SMT solver to diverge. One special case, though,
are axioms satisfying stratified sort restrictions [1]. Examples of such formulas are the
quantified constraints in the predicates blseg_struct and bslseg_struct of Figure 3. The
sort of the quantified variables z and w is Node, while the sort of the instantiated terms
z.data and w.data is int. Since we do not quantify over int variables, the generated ground
terms do not enable new quantifier instantiations. Formulas in the stratified sort frag-
ment are directly passed to Z3.
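A minimal sketch of such a stratified constraint in Z3's Python API (the axiom and sorts are invented for illustration): the bound variable ranges over Node, the ground terms it creates are integers, so instantiation cannot produce new Node terms and the formula can be passed to Z3 as is.

  from z3 import DeclareSort, IntSort, Function, Const, ForAll, And, Solver

  Node = DeclareSort('Node')
  data = Function('data', Node, IntSort())
  z = Const('z', Node)
  x = Const('x', Node)

  s = Solver()
  s.add(ForAll([z], And(data(z) >= 1, data(z) <= 10)))   # stratified: Node -> int only
  s.add(data(x) > 10)                                    # contradicts the axiom
  print(s.check())                                       # unsat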
Local Theory Extensions. The remaining quantified constraints are more difficult. In
general, we provide no completeness guarantee for our handling of quantifiers because
we allow users to specify unrestricted quantified pure constraints in their specifications.
However, we can guarantee completeness for specifications written in separation logic
for linked lists mixed with quantifier-free pure GRASS constraints (as well as some
types of user-specified quantified constraints). We designed our translation carefully
so that the remaining quantified formulas are in decidable fragments (in particular, the
frame and theory axioms). To decide these fragments, we build on local theory exten-
sions [25]. Local theory extensions are described by axioms for which instantiation can
be restricted to ground terms appearing in the verification condition (or some finite set
of ground terms that can be computed from this formula). We preprocess such axioms
by partially instantiating all variables below function symbols with the relevant sets of
ground terms. The partially instantiated axioms are then in the EPR fragment and passed
to Z3. We discuss one example of a local theory extension in more detail below. To re-
duce the number of generated partial instances, we compute the congruence closure for
the ground part of the verification condition to group ground terms into equivalence
classes. We then only need to consider one representative term per equivalence class
during instantiation.
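The effect of such partial instantiation can be sketched as follows (a toy axiom and toy ground terms, not GRASShopper's actual frame or theory axioms): the quantifier is replaced by finitely many ground instances, and the resulting quantifier-free problem is passed to Z3.

  from z3 import DeclareSort, IntSort, Function, Consts, And, Solver

  Node = DeclareSort('Node')
  data = Function('data', Node, IntSort())
  a, b = Consts('a b', Node)

  ground_vc = And(a == b, data(a) >= 5)        # ground part of a toy verification condition

  # Axiom "data(z) >= 0", instantiated only with the Node ground terms of the VC;
  # with congruence closure, one representative of the class {a, b} would suffice.
  instances = [data(t) >= 0 for t in (a, b)]

  s = Solver()
  s.add(ground_vc, *instances)
  s.add(data(b) < 0)                           # negated proof goal
  print(s.check())                             # unsat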
Example 3. One example of a local theory extension is the theory extension defining the
entry point functions in Section 3.2 together with the generated frame axioms concern-
ing ep. Note that in all models of this extension, the entry point function is idempotent
for fixed X and f . Hence, we only need to instantiate these axioms once for each Node
ground term x. One potential problem may arise from the interactions between the ep
functions for different footprint sets and fields. That is, instantiating one ep term for
one X, f and ground term t may expose a new entry point e = ep(X', f', ep(X, f, t)) for another pair X', f' such that, in some model, e is different from all previously gen-
erated ground terms. However, such a situation cannot occur if all footprints are defined
by a union of a bounded number of list segments. This holds true for separation logic of
linked lists. Even in the general case, the counterexamples that witness incompleteness
are rather degenerate and we doubt they can occur in actual program executions.
  procedure union(x: Node, y: Node, ghost root_x: Node, ghost root_y: Node,
                  implicit ghost X: set<Node>, implicit ghost Y: set<Node>)
    requires lseg_set(x, root_x, X)  lseg_set(y, root_y, Y);
    ...
more difficult. For instance, in the union procedure if x and y are in different equivalence
classes, then the two paths in the data structure are disjoint. However, if they are in the
same class, then their paths may be partially shared. It is difficult to express this in
traditional SL fragments without explicitly distinguishing the two cases. We can cover
both cases conveniently using the spatial connective for nondisjoint union.
Structural Constraints Expressed in First-Order Logic. When path compaction is
used in the find procedure, then the postcondition of find is not expressible in terms of
a bounded number of inductive predicates. The reason is that path compaction turns a
list segment of unbounded length into an unbounded number of points-to predicates.
Therefore, expressing the postcondition requires some form of universal quantification.
We can express this quite easily using the constraint F = ∀ z ∈ X :: z.next = root_x,
where X is the initial footprint of the procedure described by an SL assertion. Note
that the additional predicate acc(X) in the postcondition specifies that X is also the final
footprint of the procedure. Hence, F only constrains the structure of the heap region that
is captured by the footprint. Note that this example also uses implicit ghost parameters
of procedures to existentially quantify over the explicit footprint X.
When mixing separation logic and classical logic, additional well-formedness
checks are needed to guarantee that reachability predicates and other heap-dependent
pure formulas do not constrain heap regions outside of the footprint that is specified by
the nonpure SL assertions. Otherwise, the application of the frame rule would become
unsound. However, these additional checks can be automated in the same manner as the
checks of the actual verification conditions.
  Benchmarks    # LOC   # VCs   time in s
  SLL (loop)     156      56      1.9
  SLL (rec.)     142      70      3.1

Fig. 6. The left-hand side shows the summary of the experiments for the collections of correct benchmarks as well as some benchmarks that contain bugs in the code or specification. The right-hand side shows the generated counterexample for the underspecified quicksort program.
clearly shows the panhandle list. The full counterexample also includes valuations for
the footprint sets of the caller and callee. The final footprint FP_Caller’ returned by split
is Loc!0, Loc!1, Loc!2, Loc!3, Loc!4, Loc!8 and the footprint that was expected by the
postcondition of quicksort is Loc!2, Loc!4. The two sets should be equal.
References
1. Abadi, A., Rabinovich, A., Sagiv, M.: Decidable fragments of many-sorted logic. In: Der-
showitz, N., Voronkov, A. (eds.) LPAR 2007. LNCS (LNAI), vol. 4790, pp. 17–31. Springer,
Heidelberg (2007)
2. Berdine, J., Calcagno, C., O’Hearn, P.W.: A decidable fragment of separation logic. In: Lo-
daya, K., Mahajan, M. (eds.) FSTTCS 2004. LNCS, vol. 3328, pp. 97–109. Springer, Hei-
delberg (2004)
3. Berdine, J., Calcagno, C., O’Hearn, P.W.: Smallfoot: Modular automatic assertion checking
with separation logic. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.-P. (eds.)
FMCO 2005. LNCS, vol. 4111, pp. 115–137. Springer, Heidelberg (2006)
4. Berdine, J., Cook, B., Ishtiaq, S.: SLAYER: Memory Safety for Systems-Level Code. In:
Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 178–183. Springer,
Heidelberg (2011)
5. Botincan, M., Parkinson, M.J., Schulte, W.: Separation logic verification of C programs with
an SMT solver. Electr. Notes Theor. Comput. Sci. 254, 5–23 (2009)
6. Bouajjani, A., Drăgoi, C., Enea, C., Sighireanu, M.: Accurate invariant checking for pro-
grams manipulating lists and arrays with infinite data. In: Chakraborty, S., Mukund, M. (eds.)
ATVA 2012. LNCS, vol. 7561, pp. 167–182. Springer, Heidelberg (2012)
7. Cook, B., Haase, C., Ouaknine, J., Parkinson, M., Worrell, J.: Tractable reasoning in a
fragment of separation logic. In: Katoen, J.-P., König, B. (eds.) CONCUR 2011. LNCS,
vol. 6901, pp. 235–249. Springer, Heidelberg (2011)
8. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J.
(eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)
9. Dudka, K., Peringer, P., Vojnar, T.: Predator: A practical tool for checking manipulation of
dynamic data structures using separation logic. In: Gopalakrishnan, G., Qadeer, S. (eds.)
CAV 2011. LNCS, vol. 6806, pp. 372–378. Springer, Heidelberg (2011)
10. GRASShopper tool web page, http://cs.nyu.edu/wies/software/grasshopper (last accessed: October 2013)
11. Immerman, N., Rabinovich, A., Reps, T., Sagiv, M., Yorsh, G.: The boundary between de-
cidability and undecidability for transitive-closure logics. In: Marcinkowski, J., Tarlecki, A.
(eds.) CSL 2004. LNCS, vol. 3210, pp. 160–174. Springer, Heidelberg (2004)
12. Iosif, R., Rogalewicz, A., Simacek, J.: The tree width of separation logic with recursive
definitions. In: Bonacina, M.P. (ed.) CADE 2013. LNCS, vol. 7898, pp. 21–38. Springer,
Heidelberg (2013)
13. Itzhaky, S., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Effectively-propositional
reasoning about reachability in linked data structures. In: Sharygina, N., Veith, H. (eds.) CAV
2013. LNCS, vol. 8044, pp. 756–772. Springer, Heidelberg (2013)
14. Itzhaky, S., Lahav, O., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Modular rea-
soning on unique heap paths via effectively propositional formulas. In: POPL (2014)
15. Jacobs, B., Smans, J., Philippaerts, P., Vogels, F., Penninckx, W., Piessens, F.: VeriFast: A
powerful, sound, predictable, fast verifier for C and java. In: Bobaru, M., Havelund, K., Holz-
mann, G.J., Joshi, R. (eds.) NFM 2011. LNCS, vol. 6617, pp. 41–55. Springer, Heidelberg
(2011)
16. Lahiri, S., Qadeer, S.: Back to the future: revisiting precise program verification using SMT
solvers. In: POPL (2008)
17. Leino, K.R.M., Müller, P., Smans, J.: Verification of concurrent programs with chalice. In:
Aldini, A., Barthe, G., Gorrieri, R. (eds.) FOSAD 2007/2008/2009. LNCS, vol. 5705, pp.
195–222. Springer, Heidelberg (2009)
18. Nelson, G., Oppen, D.C.: Simplification by cooperating decision procedures. ACM
TOPLAS 1(2), 245–257 (1979)
19. O’Hearn, P., Reynolds, J., Yang, H.: Local reasoning about programs that alter data struc-
tures. In: Fribourg, L. (ed.) CSL 2001 and EACSL 2001. LNCS, vol. 2142, pp. 1–19.
Springer, Heidelberg (2001)
20. Parkinson, M.J., Summers, A.J.: The relationship between separation logic and implicit dy-
namic frames. Logical Methods in Computer Science 8(3) (2012)
21. Pérez, J.A.N., Rybalchenko, A.: Separation logic + superposition calculus = heap theorem
prover. In: PLDI, pp. 556–566. ACM (2011)
22. Piskac, R., Wies, T., Zufferey, D.: Automating Separation Logic Using SMT. In: Sharygina,
N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 773–789. Springer, Heidelberg (2013)
23. Qiu, X., Garg, P., Stefanescu, A., Madhusudan, P.: Natural proofs for structure, data, and
separation. In: PLDI, pp. 231–242 (2013)
24. Smans, J., Jacobs, B., Piessens, F.: Implicit dynamic frames: Combining dynamic frames and
separation logic. In: Drossopoulou, S. (ed.) ECOOP 2009. LNCS, vol. 5653, pp. 148–172.
Springer, Heidelberg (2009)
25. Sofronie-Stokkermans, V.: Hierarchic reasoning in local theory extensions. In: Nieuwenhuis,
R. (ed.) CADE 2005. LNCS (LNAI), vol. 3632, pp. 219–234. Springer, Heidelberg (2005)
26. Totla, N., Wies, T.: Complete instantiation-based interpolation. In: POPL. ACM (2013)
27. Yang, H., Lee, O., Berdine, J., Calcagno, C., Cook, B., Distefano, D., O’Hearn, P.W.: Scalable
shape analysis for systems code. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123,
pp. 385–398. Springer, Heidelberg (2008)
Alternating Runtime and Size Complexity
Analysis of Integer Programs
1 Introduction
There exist numerous methods to prove termination of imperative programs,
e.g., [2, 6, 8, 9, 12, 13, 15–17, 19, 25, 33–35]. In many cases, however, termination is
not sufficient, but the program should terminate in reasonable (e.g., (pseudo-)
polynomial) time. To prove this, it is often crucial to derive (possibly non-linear)
bounds on the values of variables that are modified repeatedly in loops.
We build upon the well-known observation that rank functions for termina-
tion proofs also provide a runtime complexity bound [3, 4, 6, 7, 32]. However, this
only holds for proofs using a single rank function. Larger programs are usually
handled by a disjunctive [16,28,35] or lexicographic [6,12,13,17,19,21,23,25] com-
bination of rank functions. Here, deriving a complexity bound is much harder.
To illustrate this, consider the program below and a variant where the instruction "x = x + i" is removed. For both variants, the lexicographic rank function ⟨f1, f2⟩ proves termination, where f1 measures states by the value of i and f2 is just the value of x.

  while i > 0 do
    i = i − 1
    x = x + i
  done
  while x > 0 do
    x = x − 1
  done

However, the program without the instruction "x = x + i" has linear runtime, while the program above has quadratic runtime. The crucial difference between the two programs is in the size of x after the first loop.
To handle such effects, we introduce a novel modular approach which alter-
nates between finding runtime bounds and finding size bounds. In contrast to
standard invariants, our size bounds express a relation to the size of the variables
at the program start, where we measure the size of integers by their absolute
values. Our method derives runtime bounds for isolated parts of the program
Supported by the DFG grant GI 274/6-1.
and uses these to deduce (often non-linear) size bounds for program variables at
certain locations. Further runtime bounds can then be inferred using size bounds
for variables that were modified in preceding parts of the program. By splitting
the analysis in this way, we only need to consider small program parts in each
step, and the process continues until all loops and variables have been handled.
For the example, our method proves that the first loop is executed linearly
often using the rank function i. Then, it deduces that i is bounded by the size of
its initial value |i₀| in all loop iterations. Combining these bounds, it infers that x is incremented by a value bounded by |i₀| at most |i₀| times, i.e., x is bounded by the sum of its initial size |x₀| and |i₀|². Finally, our method detects that the second loop is executed x times, and combines this with our bound |x₀| + |i₀|² on x's value when entering the second loop. In this way, we can conclude¹ that the program's runtime is bounded by |i₀| + |i₀|² + |x₀|. This novel combination
of runtime and size bounds allows us to handle loops whose runtime depends on
variables like x that were modified in earlier loops. Thus, our approach succeeds
on many programs that are beyond the reach of previous techniques.
Sect. 2 introduces the basic notions for our approach. Then Sect. 3 and Sect. 4
present our techniques to compute runtime and size bounds, respectively. Sect. 5
discusses related work and provides an extensive experimental evaluation. Proofs
for all theorems as well as several extensions of our approach can be found in [14].
2 Preliminaries

Consider the program below. For an input list x, the loop at location ℓ1 creates a list y by reversing the elements of x. The loop at location ℓ2 iterates over the list y and increases each element by the sum of its successors. So if y was [5, 1, 3], it will be [5 + 1 + 3, 1 + 3, 3] after the second loop. This example is a representative for methods using several algorithms in sequence.

  Input: List x
  ℓ0 :  List y = null
  ℓ1 :  while x ≠ null do
          y = new List(x.val, y)
          x = x.next
        done
        List z = y
  ℓ2 :  while z ≠ null do
          List u = z.next
  ℓ3 :    while u ≠ null do
            z.val += u.val
            u = u.next
          done
          z = z.next
        done

We regard sequential imperative integer programs with (potentially non-linear) arithmetic and unbounded non-determinism. Our approach is compatible with methods that abstract features like heap usage to integers [2, 4, 15, 19, 29, 34]. So the above program could
be abstracted automatically to the integer program below. Here, list variables
are replaced by integer variables that correspond to the lengths of the lists.
We fix a (finite) set of program variables V = {v1 , . . . , vn } and represent inte-
ger programs as directed graphs. Nodes are program locations L and edges are
program transitions T. The set L contains a canonical start location ℓ0. W.l.o.g., we assume that no transition leads back to ℓ0 and that all transitions in T are reachable from ℓ0. All transitions originating in ℓ0 are called initial transitions.
¹ Since each step of our method over-approximates the runtime or size of a variable, we actually obtain the bound 2 + |i₀| + max{|i₀|, |x₀|} + |i₀|², cf. Sect. 4.2.
The transitions are labeled by formulas over the variables V and primed post-variables V' = {v1', . . . , vn'} which represent the values of the variables after the transition. In the program graph below, we represent these formulas by imperative commands. For instance, t3 is labeled by the formula z > 0 ∧ u' = z − 1 ∧ x' = x ∧ y' = y ∧ z' = z. We used standard invariant-generation techniques (based on the Octagon domain [30]) to propagate simple integer invariants, adding the condition z > 0 to the transitions t4 and t5.

  [Program graph of the abstracted integer program: locations ℓ0–ℓ3 with transitions t0: y = 0; t1: if (x > 0) then y = y + 1, x = x − 1; t2: if (x ≤ 0) then z = y; and transitions t3, t4, t5 for the loops at ℓ2 and ℓ3.]
• for all (ℓ, τ, ℓ') ∈ T, we have τ ⇒ (Pol(ℓ))(v1, . . . , vn) ≥ (Pol(ℓ'))(v1', . . . , vn')
• for all (ℓ, τ, ℓ') ∈ T', we have τ ⇒ (Pol(ℓ))(v1, . . . , vn) > (Pol(ℓ'))(v1', . . . , vn') and τ ⇒ (Pol(ℓ))(v1, . . . , vn) ≥ 1

Here, T' ⊆ T denotes the set of transitions whose use the PRF is meant to bound.
The constraints on a PRF Pol are the same constraints needed for termination proofs, allowing us to re-use existing PRF synthesis techniques and tools. They imply that the transitions in T' can only be used a limited number of times, as each application of a transition from T' decreases the measure, and no transition increases it. Hence, if the program is called with input m1, . . . , mn, no transition t ∈ T' can be used more often than (Pol(ℓ0))(m1, . . . , mn) times. Consequently, Pol(ℓ0) is a runtime bound for the transitions in T'. Note that no such bound is obtained for the remaining transitions in T.
In the program from Sect. 2, we could use Pol₁ with Pol₁(ℓ) = x for all ℓ ∈ L, i.e., we measure configurations by the value of x. No transition increases this measure and t1 decreases it. The condition x > 0 ensures that the measure is positive whenever t1 is used, i.e., T' = {t1}. Hence Pol₁(ℓ0) (i.e., the value x at the beginning of the program) is a bound on the number of times t1 can be used.
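These constraints for the transition t1 of the example can be discharged by an SMT solver. A minimal sketch in Z3's Python API (the transition encoding is written by hand here and is not KoAT output):

  from z3 import Ints, Implies, And, prove

  x, y, xp, yp = Ints('x y xp yp')             # xp, yp stand for the post-values x', y'

  # Transition t1: if (x > 0) then y' = y + 1 and x' = x - 1
  tau = And(x > 0, yp == y + 1, xp == x - 1)

  # Pol1(l) = x for every location l: t1 must strictly decrease the measure
  # and keep it >= 1 (the remaining transitions are checked for non-increase).
  prove(Implies(tau, And(x > xp, x >= 1)))     # prints "proved"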
Such PRFs lead to a basic technique for inferring time bounds. As mentioned
in Sect. 2, to obtain a modular approach afterwards, we only allow weakly mono-
tonic functions as complexity bounds. For any polynomial p ∈ Z[v1 , . . . , vn ], let
[p] result from p by replacing all coefficients and variables with their absolute
value (e.g., for Pol₁(ℓ0) = x we have [Pol₁(ℓ0)] = |x| and if p = 2 · v1 − 3 · v2
then [p] = 2 · |v1 | + 3 · |v2 |). As [p](m1 , . . . , mn ) ≥ p(m1 , . . . , mn ) holds for all
m1 , . . . , mn ∈ Z, this is a sound approximation, and [p] is weakly monotonic. In
our example, the initial runtime approximation R0 can now be refined to R1 ,
with R1(t1) = [Pol₁(ℓ0)] = |x| and R1(t) = R0(t) for all other transitions t.
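The [·] operation is straightforward to compute. A small sketch (not KoAT's implementation), with polynomials represented as dictionaries from exponent tuples to coefficients:

  from math import prod

  def abs_poly(p):
      # [p]: replace every coefficient by its absolute value.
      return {mono: abs(c) for mono, c in p.items()}

  def eval_abs(p, args):
      # Evaluate [p] by plugging in the absolute values of the arguments.
      return sum(abs(c) * prod(abs(a) ** e for a, e in zip(args, mono))
                 for mono, c in p.items())

  p = {(1, 0): 2, (0, 1): -3}                  # p = 2*v1 - 3*v2
  print(abs_poly(p))                           # [p] = 2*|v1| + 3*|v2|
  print(eval_abs(p, (-1, 2)))                  # 8, which indeed bounds p(-1, 2) = -8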
Theorem 5 (Complexities from PRFs). Let R be a runtime approximation and Pol be a PRF for T. Let⁴ R'(t) = [Pol(ℓ0)] for all t ∈ T' and R'(t) = R(t) for all other t ∈ T. Then, R' is also a runtime approximation.
1 does not increase its absolute value. The bound max{0, |x| − 1} would also be
allowed, but our approach does not compute better global size bounds from it.
To track how variables influence each other, we construct a result variable graph (RVG) whose nodes are the result variables. An RVG for our example is shown below. Here, we display local size bounds in the RVG to the left of the result variables, separated by "≥" (e.g., "|x| ≥ |t1, x'|" means Sl(t1, x') = |x|). The RVG has an edge from a result variable |t̃, ṽ'| to |t, v'| if the transition t̃ can be used directly before t and if ṽ occurs in the local size bound Sl(t, v'). Such an edge means that the size of ṽ in the post-location of the transition t̃ may influence the size of v in t's post-location.

  |x| ≥ |t0, x'|   0 ≥ |t0, y'|       |z| ≥ |t0, z'|   |u| ≥ |t0, u'|
  |x| ≥ |t1, x'|   |y|+1 ≥ |t1, y'|   |z| ≥ |t1, z'|   |u| ≥ |t1, u'|
  |x| ≥ |t2, x'|   |y| ≥ |t2, y'|     |y| ≥ |t2, z'|   |u| ≥ |t2, u'|
  |x| ≥ |t3, x'|   |y| ≥ |t3, y'|     |z| ≥ |t3, z'|   |z| ≥ |t3, u'|
  |x| ≥ |t4, x'|   |y| ≥ |t4, y'|     |z| ≥ |t4, z'|   |u| ≥ |t4, u'|
  |x| ≥ |t5, x'|   |y| ≥ |t5, y'|     |z| ≥ |t5, z'|   |u| ≥ |t5, u'|

To state which variables may influence a function f ∈ C, we define its active variables as actV(f) = {vi ∈ V | ∃m1, . . . , mn, mi' ∈ N. f(m1, . . . , mi, . . . , mn) ≠ f(m1, . . . , mi', . . . , mn)}. Let pre(t) denote the transitions that may precede t in evaluations, i.e., pre(t) = {t̃ ∈ T | ∃v0, ℓ, v. (ℓ0, v0) →* ∘ →t̃ ∘ →t (ℓ, v)}. While pre(t) is undecidable in general, there exist
several techniques to compute over-approximations of pre(t), cf. [19, 21]. For ex-
ample, one can disregard the formulas of the transitions and approximate pre(t)
by all transitions that end in t’s source location.
Definition 8 (RVG). Let Sl be a local size approximation. An RVG has T's result variables as nodes and the edges {(|t̃, ṽ'|, |t, v'|) | t̃ ∈ pre(t), ṽ ∈ actV(Sl(t, v'))}.
For the transition t2 which sets z = y, we obtain Sl(t2, z') = |y|. Hence, we have actV(Sl(t2, z')) = {y}. The program graph implies pre(t2) = {t0, t1}, and thus, our RVG contains edges from |t0, y'| to |t2, z'| and from |t1, y'| to |t2, z'|.
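Constructing the RVG edges from pre(t) and the active variables is a simple computation; the following sketch rebuilds the two edges above from hand-written data (it is not KoAT's code):

  pre  = {'t2': {'t0', 't1'}}                  # over-approximation of pre(t2)
  actV = {('t2', "z'"): {'y'}}                 # actV(Sl(t2, z')) = {y}

  edges = {((tp, v + "'"), rv)
           for rv, act in actV.items()
           for tp in pre.get(rv[0], set())
           for v in act}
  print(sorted(edges))                         # edges from |t0, y'| and |t1, y'| to |t2, z'|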
Each SCC of the RVG represents a set of result variables that may influence
each other. To lift the local approximation Sl to a global one, we consider each
SCC on its own. We treat the SCCs in topological order, reflecting the data flow.
As usual, an SCC is a maximal subgraph with a path from each node to every
other node. An SCC is trivial if it consists of a single node without an edge to
itself. In Sect. 4.1, we show how to deduce global bounds for trivial SCCs and in
Sect. 4.2, we handle non-trivial SCCs where transitions are applied repeatedly.
transitions. For example, regard the trivial SCC with the result variable |t0, y'|. As 0 ≥ |t0, y'| holds, its global size bound is also 0, and we set S(t0, y') = 0.
Next, we consider trivial SCCs α = |t, v'| with incoming edges from other SCCs. Now Sl(α)(m) is an upper bound on the size of v after using the transition t in a configuration where the sizes of the variables are at most m. To obtain a global bound, we replace m by upper bounds on t's input variables. The edges leading to α come from result variables |t̃, vi'| where t̃ ∈ pre(t) and vi ∈ actV(Sl(α)). Thus, a bound for the result variable α = |t, v'| is obtained by applying Sl(α) to S(t̃, v1'), . . . , S(t̃, vn'), for all t̃ ∈ pre(t).
As an example consider the result variable |t2, z'|. Its local size bound is Sl(t2, z') = |y|. To express this bound in terms of the input variables, we consider the predecessors |t0, y'| and |t1, y'| of |t2, z'| in the RVG. So Sl(t2, z') must be applied to S(t0, y') and S(t1, y'). If SCCs are handled in topological order, one already knows that S(t0, y') = 0 and S(t1, y') = |x|. Thus, S(t2, z') = max{0, |x|} = |x|.
Thm. 9 presents the resulting procedure SizeBounds. Based on the current approximation (R, S), it improves the global size bound for the result variable in a trivial SCC of the RVG. Non-trivial SCCs will be handled in Thm. 10.
Theorem 9 (SizeBounds for Trivial SCCs). Let (R, S) be a complexity approximation, let Sl be a local size approximation, and let {α} ⊆ RV be a trivial SCC of the RVG. We define S'(α') = S(α') for α' ≠ α and
• S'(α) = Sl(α), if α = |t, v'| for some initial transition t
• S'(α) = max{Sl(α)(S(t̃, v1'), . . . , S(t̃, vn')) | t̃ ∈ pre(t)}, otherwise
Then SizeBounds(R, S, {α}) = S' is also a size approximation.
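For the running example, the second case of Thm. 9 amounts to the small computation below (hand-written bounds, evaluated for a concrete input size |x| = 7; an illustration, not KoAT's implementation):

  x_size = 7                                         # concrete stand-in for |x|
  S = {('t0', "y'"): 0, ('t1', "y'"): x_size}        # global bounds computed earlier
  pre_t2 = ['t0', 't1']

  def Sl_t2_z(y_size):                               # local bound Sl(t2, z') = |y|
      return y_size

  # S(t2, z') = max over all predecessors of Sl applied to their global bounds
  S[('t2', "z'")] = max(Sl_t2_z(S[(tp, "y'")]) for tp in pre_t2)
  print(S[('t2', "z'")])                             # max{0, |x|} = 7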
In the following, local size bounds like 2 · |x| are not handled because we
= 2 + |i| + max{|i|, |x|} + |i|², i.e., it is linear in |x| and quadratic in |i|.
Our approach builds upon well-known basic concepts (like lexicographic rank
functions), but uses them in a novel way to obtain a more powerful technique
than previous approaches. In particular, in contrast to previous work, our
approach deals with non-linear information flow between different program parts.
To evaluate our approach, we implemented a prototype KoAT and compared
it with PUBS [3, 4] and Rank [6]. We also contacted the authors of SPEED [24]
and Loopus [37], but were not able to obtain these tools. We did not compare
KoAT to ABC [11], RAML [26], or r-TuBound [27], as their input or analysis goals
differ considerably from ours. As benchmarks, we collected 682 programs from
the literature on termination and complexity of integer programs. These include
all 36 examples from the evaluation of Rank, all but one of the 53 examples used
to evaluate PUBS,7 all 27 examples from the evaluations of SPEED, and the ex-
amples from the current paper (which can be handled by KoAT, but not by PUBS
or Rank). Where examples were available as C programs, we used the tool KITTeL
[19] to transform them into integer programs automatically. The collection con-
tains 48 recursive examples, which cannot be analyzed with Rank, and 20 exam-
ples with non-linear arithmetic, which can be handled by neither Rank nor PUBS.
The remaining examples are compatible with all tested tools. All examples, the
results of the three tools, and a binary of KoAT are available at [1].
The table below illustrates how often each tool could infer a specific runtime bound for the example set. Here, 1, log n, n, n log n, n², n³, and n^{>3} represent their corresponding asymptotic classes and EXP is the class of exponential functions. In the column "Time", we give the average runtime on those examples where the respective tool was successful. The average runtime on those 65 examples where all tools succeeded was 0.5 s for KoAT, 0.2 s for PUBS, and 0.6 s for Rank. The benchmarks were executed on a computer with 6 GB of RAM and an Intel i7 CPU clocked at 3.07 GHz, using a timeout of 60 seconds for each example. A longer timeout did not yield additional results.

          1   log n    n   n log n   n²   n³   n^{>3}   EXP   Time
  KoAT   121     0    145     0      59    3      3      0    1.1 s
  PUBS   116     5    131     5      22    7      0      6    0.8 s
  Rank    56     0     19     0       8    1      0      0    0.5 s
On this collection, our approach was more powerful than the two other tools
and still efficient. In fact, KoAT is only a simple prototype whose efficiency could
still be improved considerably by fine-tuning its implementation. As shown in
[1], there are 77 examples where KoAT infers a bound of a lower asymptotic
class than PUBS, 548 examples where the bounds are in the same class, and 57
examples where the bound of PUBS is (asymptotically) more precise than KoAT’s.
Similarly, there are 259 examples where KoAT is asymptotically more precise than
Rank, 410 examples where they are equal, and 13 examples where Rank is more
precise. While KoAT is the only one of the three tools that can also handle non-linear
arithmetic, even when disregarding the 20 examples with non-linear arithmetic,
KoAT can detect runtime bounds for 325 examples, whereas PUBS succeeds only
for 292 programs and Rank only finds bounds for 84 examples.
A limitation of our implementation is that it only generates (possibly non-
linear) PRFs to detect polynomial bounds. In contrast, PUBS uses PRFs to find
logarithmic and exponential complexity bounds as well [3]. Such an extension
⁷ We removed one example with undefined semantics.
could also be directly integrated into our method. Moreover, we are restricted to
weakly monotonic bounds in order to allow their modular composition. Another
limitation is that our size analysis only handles certain forms of local size bounds
in non-trivial SCCs of the result variable graph. For that reason, it often over-
approximates the sizes of variables that are both incremented and decremented
in the same loop. Due to all these imprecisions, our approach sometimes infers
bounds that are asymptotically larger than the actual asymptotic costs.
Our method is easily extended. In [14], we provide an extension to handle
(possibly recursive) procedure calls in a modular fashion. Moreover, we show how
to treat other forms of bounds (e.g., on the number of sent network requests)
and how to compute bounds for separate program parts in advance or in parallel.
Future work will be concerned with refining the precision of the inferred run-
time and size approximations and with improving our implementation (e.g., by
extending it to infer also non-polynomial complexities). Moreover, instead of ab-
stracting heap operations to integers, we intend to investigate an extension of our
approach to apply it directly to programs operating on the heap. Finally, simi-
lar to the coupling of COSTA with the tool KeY in [5], we want to automatically
certify the complexity bounds found by our implementation KoAT.
Acknowledgments. We thank A. Ben-Amram, B. Cook, C. von Essen, C. Otto
for valuable discussions and C. Alias and S. Genaim for help with the experiments.
References
1. http://aprove.informatik.rwth-aachen.de/eval/IntegerComplexity/
2. Albert, E., Arenas, P., Codish, M., Genaim, S., Puebla, G., Zanardini, D.: Termi-
nation analysis of Java Bytecode. In: Barthe, G., de Boer, F.S. (eds.) FMOODS
2008. LNCS, vol. 5051, pp. 2–18. Springer, Heidelberg (2008)
3. Albert, E., Arenas, P., Genaim, S., Puebla, G.: Closed-form upper bounds in static
cost analysis. JAR 46(2), 161–203 (2011)
4. Albert, E., Arenas, P., Genaim, S., Puebla, G., Zanardini, D.: Cost analysis of
object-oriented bytecode programs. TCS 413(1), 142–159 (2012)
5. Albert, E., Bubel, R., Genaim, S., Hähnle, R., Puebla, G., Román-Dı́ez, G.: Verified
resource guarantees using COSTA and KeY. In: Khoo, S.-C., Siek, J.G. (eds.) PEPM
2011, pp. 73–76. ACM Press (2011)
6. Alias, C., Darte, A., Feautrier, P., Gonnord, L.: Multi-dimensional rankings, pro-
gram termination, and complexity bounds of flowchart programs. In: Cousot, R.,
Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 117–133. Springer, Heidelberg
(2010)
7. Avanzini, M., Moser, G.: A combination framework for complexity. In: van Raams-
donk, F. (ed.) RTA 2013. LIPIcs, vol. 21, pp. 55–70. Dagstuhl Publishing (2013)
8. Bagnara, R., Mesnard, F., Pescetti, A., Zaffanella, E.: A new look at the automatic
synthesis of linear ranking functions. IC 215, 47–67 (2012)
9. Ben-Amram, A.M., Genaim, S.: On the linear ranking problem for integer linear-
constraint loops. In: Giacobazzi, R., Cousot, R. (eds.) POPL 2013, pp. 51–62. ACM
Press (2013)
10. Ben-Amram, A.M., Jones, N.D., Kristiansen, L.: Linear, polynomial or exponential?
Complexity inference in polynomial time. In: Beckmann, A., Dimitracopoulos, C.,
Löwe, B. (eds.) CiE 2008. LNCS, vol. 5028, pp. 67–76. Springer, Heidelberg (2008)
11. Blanc, R., Henzinger, T.A., Hottelier, T., Kovács, L.: ABC: Algebraic bound compu-
tation for loops. In: Clarke, E.M., Voronkov, A. (eds.) LPAR 2010. LNCS (LNAI),
vol. 6355, pp. 103–118. Springer, Heidelberg (2010)
12. Bradley, A.R., Manna, Z., Sipma, H.B.: Linear ranking with reachability. In: Etes-
sami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 491–504. Springer,
Heidelberg (2005)
13. Brockschmidt, M., Cook, B., Fuhs, C.: Better termination proving through cooper-
ation. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 413–429.
Springer, Heidelberg (2013)
14. Brockschmidt, M., Emmes, F., Falke, S., Fuhs, C., Giesl, J.: Alternating runtime and
size complexity analysis of integer programs. Tech. Rep. AIB 2013-12, RWTH Aachen
(2013), available from [1] and from http://aib.informatik.rwth-aachen.de
15. Brockschmidt, M., Musiol, R., Otto, C., Giesl, J.: Automated termination proofs
for Java programs with cyclic data. In: Madhusudan, P., Seshia, S.A. (eds.) CAV
2012. LNCS, vol. 7358, pp. 105–122. Springer, Heidelberg (2012)
16. Cook, B., Podelski, A., Rybalchenko, A.: Termination proofs for systems code. In:
Schwartzbach, M., Ball, T. (eds.) PLDI 2006, pp. 415–426. ACM Press (2006)
17. Cook, B., See, A., Zuleger, F.: Ramsey vs. Lexicographic termination proving.
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 47–61.
Springer, Heidelberg (2013)
18. Debray, S., Lin, N.: Cost analysis of logic programs. TOPLAS 15, 826–875 (1993)
19. Falke, S., Kapur, D., Sinz, C.: Termination analysis of C programs using compiler
intermediate languages. In: Schmidt-Schauß, M. (ed.) RTA 2011. LIPIcs, vol. 10,
pp. 41–50. Dagstuhl Publishing (2011)
20. Fuhs, C., Giesl, J., Middeldorp, A., Schneider-Kamp, P., Thiemann, R., Zankl, H.:
SAT solving for termination analysis with polynomial interpretations. In: Marques-
Silva, J., Sakallah, K.A. (eds.) SAT 2007. LNCS, vol. 4501, pp. 340–354. Springer,
Heidelberg (2007)
21. Fuhs, C., Giesl, J., Plücker, M., Schneider-Kamp, P., Falke, S.: Proving termination
of integer term rewriting. In: Treinen, R. (ed.) RTA 2009. LNCS, vol. 5595, pp.
32–47. Springer, Heidelberg (2009)
22. Giesl, J., Ströder, T., Schneider-Kamp, P., Emmes, F., Fuhs, C.: Symbolic evalu-
ation graphs and term rewriting: A general methodology for analyzing logic pro-
grams. In: De Schreye, D., Janssens, G., King, A. (eds.) PPDP 2012, pp. 1–12.
ACM Press (2012)
23. Giesl, J., Thiemann, R., Schneider-Kamp, P., Falke, S.: Mechanizing and improving
dependency pairs. JAR 37(3), 155–203 (2006)
24. Gulwani, S., Mehra, K.K., Chilimbi, T.M.: SPEED: Precise and efficient static es-
timation of program computational complexity. In: Shao, Z., Pierce, B.C. (eds.)
POPL 2009, pp. 127–139. ACM Press (2009)
25. Harris, W.R., Lal, A., Nori, A.V., Rajamani, S.K.: Alternation for termination. In:
Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 304–319. Springer,
Heidelberg (2010)
26. Hoffmann, J., Aehlig, K., Hofmann, M.: Multivariate amortized resource analysis.
TOPLAS 34(3) (2012)
27. Knoop, J., Kovács, L., Zwirchmayr, J.: r-TuBound: Loop bounds for WCET analy-
sis (Tool paper). In: Bjørner, N., Voronkov, A. (eds.) LPAR 2012. LNCS, vol. 7180,
pp. 435–444. Springer, Heidelberg (2012)
28. Lee, C.S., Jones, N.D., Ben-Amram, A.M.: The size-change principle for program
termination. In: Hankin, C., Schmidt, D. (eds.) POPL 2001, pp. 81–92. ACM Press
(2001)
29. Magill, S., Tsai, M.H., Lee, P., Tsay, Y.K.: Automatic numeric abstractions for
heap-manipulating programs. In: Hermenegildo, M.V., Palsberg, J. (eds.), POPL
2010, pp. 211–222 (2010)
30. Miné, A.: The Octagon abstract domain. HOSC 19(1), 31–100 (2006)
31. Navas, J., Mera, E., López-Garcı́a, P., Hermenegildo, M.V.: User-definable resource
bounds analysis for logic programs. In: Dahl, V., Niemelä, I. (eds.) ICLP 2007.
LNCS, vol. 4670, pp. 348–363. Springer, Heidelberg (2007)
32. Noschinski, L., Emmes, F., Giesl, J.: Analyzing innermost runtime complexity of
term rewriting by dependency pairs. JAR 51(1), 27–56 (2013)
33. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear rank-
ing functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp.
239–251. Springer, Heidelberg (2004)
34. Spoto, F., Mesnard, F., Payet, É.: A termination analyser for Java Bytecode based
on path-length. TOPLAS 32(3) (2010)
35. Tsitovich, A., Sharygina, N., Wintersteiger, C.M., Kroening, D.: Loop summariza-
tion and termination analysis. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS
2011. LNCS, vol. 6605, pp. 81–95. Springer, Heidelberg (2011)
36. Wilhelm, R., Engblom, J., Ermedahl, A., Holsti, N., Thesing, S., Whalley, D.B.,
Bernat, G., Ferdinand, C., Heckmann, R., Mitra, T., Mueller, F., Puaut, I.,
Puschner, P.P., Staschulat, J., Stenström, P.: The worst-case execution-time prob-
lem: overview of methods and survey of tools. TECS 7(3), 36:1–36:53 (2008)
37. Zuleger, F., Gulwani, S., Sinn, M., Veith, H.: Bound analysis of imperative pro-
grams with the size-change abstraction. In: Yahav, E. (ed.) SAS 2011. LNCS,
vol. 6887, pp. 280–297. Springer, Heidelberg (2011)
Proving Nontermination via Safety
1 Introduction
The problem of proving program nontermination represents an interesting com-
plement to termination as, unlike safety, termination’s falsification cannot be
witnessed by a finite trace. While the problem of proving termination has now
been extensively studied, the search for reliable and scalable methods for proving
nontermination remains open.
In this paper we develop a new method of proving nontermination based on a
reduction to safety proving that leverages the power of existing tools. An iterative
algorithm is developed which uses counterexamples to a fixed safety property
to refine an underapproximation of a program. With our approach, existing
safety provers can now be employed to prove nontermination of programs that
previous techniques could not handle. Not only does the new approach perform
better, it also leads to nontermination proving tools supporting programs with
nondeterminism, for which previous tools had little support.
Limitations. Our proposed nontermination procedure can only prove
nontermination. On terminating programs the procedure is
likely to diverge (although some heuristics are proposed which aim to avoid this). While our method could be extended to further programming language features (e.g. heap, recursion), in practice the supported features of an underlying safety prover determine applicability. Our implementation uses a safety prover for non-recursive programs with linear integer arithmetic commands.

Example. Before discussing our procedure in a formal setting, we begin with the simple example given below.

  if (k ≥ 0)
    skip;
  else
    i := −1;

  while (i ≥ 0) {
    i := nondet();
  }

  i := 2;

In this program the command i := nondet() represents non-
deterministic value introduction into the variable i. The loop in this program
[Fig. 1(a)–(f): the example program and successive underapproximations obtained by instrumenting it with assume statements such as assume(k ≥ 0 ∧ i ≥ 0) and assume(i ≥ 0), with assert(false) marking the loop exit; (f) shows the resulting stem and loop.]
loop are still reachable. The path violating the assertion is our desired path to
the loop which we refer to as stem. Fig. 1(f) shows the stem and the loop.
Finally we need to ensure that the assume statement in Fig. 1(f) can always
be satisfied with some i by any reachable state from the restricted pre-state.
This is necessary: our underapproximations may accidentally have eliminated
not only the paths to the loop’s exit location, but also all of the nonterminating
paths inside the loop. Once this check succeeds we have proved nontermination.
3 Algorithm
Our nontermination proving procedure Prover is detailed in Fig. 3. Its input is
a program P given by its CFG, and a loop to be considered for nontermination.
To prove nontermination of the entire program P we need to find only one
nonterminating loop L. This can be done in parallel. Alternatively, the procedure
In some cases our refinement is too weak, leading to divergence. The difficulty is that the same loop path may be considered repeatedly, but at each instance the loop is unrolled for one additional iteration. To avoid this problem we impose a limit n on the number of counterexample paths that traverse the same locations (possibly with more and more repetitions); we call such paths repeating. If we
reach this limit, we use the subprocedure Strengthen to strengthen the precon-
dition, inspired by a heuristic by Cook and Koskinen [8]. Here we again calculate
a precondition, but when we have found ψp , we quantify out all the variables that
are written to after ψp and apply quantifier elimination (QE) to get ρp . We then
refine with ¬ρp . This leads to a more aggressive pruning of the transition rela-
tion. This heuristic can lead to additional incompleteness.
0
Example. Consider the instrumented program whose control-flow graph is sketched below.

[Control-flow graph of the instrumented program: locations 0–6, with 6 the Error location; the edges are labelled with ϕ, i < 0, i ≥ 0, k ≥ 0, k < 0, skip, and i := i − 1.]

Suppose we have initially ϕ ≡ i ≥ 0. We might get cex1: 0 → 1 → 2 → 3 → 5 → 1 → 6 as a first counterexample. The Refine procedure finds the weakest precondition k ≥ 0 ∧ i = 0 at location 1. Adding its negation to ϕ and simplifying the formula gives us ϕ ≡ (i ≥ 0) ∧ (k < 0 ∨ i ≥ 1). Now we may get cex2: 0 → 1 → 2 → 3 → 5 → 1 → 2 → 3 → 5 → 1 → 6 as the next counterexample, and Refine updates ϕ ≡ (i ≥ 0) ∧ (k < 0 ∨ i ≥ 2). Now we may get cex3: 0 → 1 → 2 → 3 → 5 → 1 → 2 → 3 → 5 → 1 → 2 → 3 → 5 → 1 → 6 as the next counterexample. Note that cex1, cex2, cex3 are repeating counterexamples, and if we just use the Refine procedure, Underapproximate gets stuck in an infinite sequence of counterexamples. Now Strengthen identifies the repeating counterexamples, considers cex1, and calculates the weakest precondition ψ1 ≡ k ≥ 0 ∧ i = 0. It then existentially quantifies out the variable i, as it gets modified later along cex1. We get ∃i. k ≥ 0 ∧ i = 0, and quantifier elimination yields ρ1 ≡ k ≥ 0. Clearly ψ1 entails ρ1. Adding ¬ρ1 to ϕ and simplifying the formula we get ϕ ≡ i ≥ 0 ∧ k < 0. Now all repeating counterexamples are eliminated, the program is safe, and we have obtained a closed recurrence set witnessing nontermination of the original program.
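To make the Strengthen computation above concrete, here is a small sketch of the quantifier-elimination step in the Z3 Python API; the choice of solver and API is ours for illustration and is not prescribed by the paper.

    # Illustration only: Strengthen's QE step on the example above.
    from z3 import Ints, Exists, Tactic, And, Not, simplify

    i, k = Ints('i k')
    psi1 = And(k >= 0, i == 0)                        # weakest precondition psi1 of cex1
    rho1 = Tactic('qe')(Exists([i], psi1)).as_expr()  # quantify out i, eliminate: k >= 0
    phi = i >= 0                                      # initial underapproximation constraint
    phi_new = simplify(And(phi, Not(rho1)))           # refine with the negation of rho1
    print(rho1, phi_new)                              # k >= 0, and i >= 0 /\ not(k >= 0)

Running this yields ρ1 ≡ k ≥ 0 and a refined constraint equivalent to i ≥ 0 ∧ k < 0, matching the example.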
In the Underapproximate procedure, once there are no further counterexamples to safety of P, we know that in P the loop exit is not reachable. The procedure returns the final underapproximation (denoted by P′) that is safe. When Underapproximate returns to Prover, we check if in P′ the original loop L after refinements has a closed recurrence set. We refer to the refined loop as L′. In order to check the existence of a closed recurrence set, we first need to ensure that L′ is reachable in P′ even after the refinements. We again pose this problem as a safety/reachability problem. This time we mark the header node of L′ as an error location in P′ and hope that P′ is unsafe. If P′ is safe then clearly we have failed to prove nontermination and we report the result as unknown. If P′ is unsafe, then the counterexample to its safety is a path to the header of L′. We enumerate all such paths to the header of L′ in a set Π (generated lazily
weakest invariant true can be sufficient to prove validity of (6). In this example as well we can easily prove that true → ∃i. i ≥ 0 is valid.
Moreover, consider the program in Fig. 4. Suppose ϕ ≡ (j ≤ 3 ∨ i = 9) ∧ (j ≥ 4 ∨ i = 11). Using an invariant generator, we obtain the location invariant i = 10 at location 2. Then (6) becomes i = 10 → ∃j. (j ≤ 3 ∨ i = 9) ∧ (j ≥ 4 ∨ i = 11). Clearly the formula is not valid. In this case Validate returns false.
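For illustration (ours, not the authors' implementation), the Validate obligation (6) for this example can be discharged with an off-the-shelf SMT solver: the implication is valid iff its negation is unsatisfiable.

    # Illustration only: the Validate check on the Fig. 4 example.
    from z3 import Ints, Implies, Exists, And, Or, Not, Solver, unsat

    i, j = Ints('i j')
    phi = And(Or(j <= 3, i == 9), Or(j >= 4, i == 11))
    obligation = Implies(i == 10, Exists([j], phi))   # invariant i = 10 implies a successor exists
    s = Solver()
    s.add(Not(obligation))
    print("valid" if s.check() == unsat else "not valid")   # prints "not valid"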
If Validate returns true, we are sure that every reachable state at the nondeterministic assignment node in L′ has a successor along the edge. At this point, we report nontermination and return the final underapproximation P′ of P as a proof of nontermination for P: P′ is a closed recurrence set.
Note that as invariants are overapproximations, we may report unknown in
some cases even when the discovered underapproximation actually does have a
closed recurrence set. However, the check is essential to retain soundness.
Theorem 3 (Correctness of Prover for Nontermination). Let P be a program and L a loop in P. Suppose Prover(P, L) = (Nonterminating, P′). Then P is nonterminating.
5 Related Work
Automatic tools for proving nontermination of term rewriting systems include
[14,23]. However, while nontermination analysis for term rewriting considers the
entire state space as legitimate initial states for a (possibly infinite) evaluation
sequence, our setting also factors in reachability from the initial states.
Static nontermination analysis has also been investigated for logic programs
(e.g. [24,31]). Most related to our setting are techniques for constraint-logic pro-
grams (CLPs) [25]. Termination tools for CLPs (e.g. [25]) can in cases be used to
prove nontermination of imperative programs (e.g. Julia [26] can show nontermi-
nation for Java Bytecode programs if the abstraction to CLPs is exact, but gives
no witness like a recurrence set to the user). The main difficulty for imperative
programs is that typically overapproximating abstractions (in general unsound
for nontermination) are used for converting languages like Java and C to CLPs.
TNT [16] uses a characterization of nontermination by recurrence sets. We
build upon this notion and introduce closed recurrence sets in our formalization,
as an intermediate concept during our nontermination proof search. In contrast
to us, TNT is restricted to programs with periodic “lasso-shaped” counterexam-
ples to termination. We support unbounded nondeterminism in the program’s
transition relation, whereas TNT is restricted to deterministic commands.
The tool Invel [30] analyzes nontermination of Java programs using a com-
bination of theorem proving and invariant generation. However, Invel does not
provide a witness for nontermination. Like Brockschmidt et al. [5], we were un-
able to obtain a working version of Invel. Note that in the empirical evaluation
by Brockschmidt et al. [5], the AProVE tool (which we have compared against)
subsumed Invel on Invel’s data set. Finally, Invel is only applicable to deter-
ministic (integer) programs, yet our approach allows nondeterminism as well.
Atig et al. [1] describe a technique for proving nontermination of multi-
threaded programs, via a reduction to nontermination reasoning for sequential
programs. Our work complements Atig et al., as we provide improvements to
the underlying sequential tools that future multithreaded tools can make use of.
The tool TRex [19] combines existing nontermination proving techniques with
a Terminator-like [9] iterative procedure. Our new method should complement
TRex nicely, as ours is more powerful than the underlying nontermination prov-
ing approach previously used [16].
AProVE [13] uses SMT to prove nontermination of Java programs [5]. First
nontermination of a loop regardless of its context is shown, then reachability of
this loop with suitable values. Drawbacks are that they require recurrence sets
to be singletons (after program slicing) or the loop conditions to be invariants.
Gurfinkel et al. [18] present the CEGAR-based model checker Yasm which
supports arbitrary CTL properties, such as EG pc ≠ END, denoting nontermination. Yasm implements a method of both under- and over-approximating the
input program. Unfortunately, together with the author of Yasm we were not
able to get the tool working on our examples [17]. We suspect that our approach
will be faster, as it uses current safety proving techniques, i.e. Impact [20] rather
than Slam-style technology [2]. This is a feature of our approach: any off-the-
shelf software model checker can be turned into a nontermination prover.
Nontermination proving for finite-state systems is essentially a question of
safety [3]. Nontermination and/or related temporal logics are also supported for
more expressive systems, e.g. pushdown automata [28].
Recent work on CTL proving for programs uses an off-the-shelf nontermina-
tion prover [8]. We use a few steps when treating nondeterminism which look
similar to the approach from [8]. The key difference is that our work provides a
nontermination prover, whereas the previous work requires one off-the-shelf.
Gulwani et al. [15] make a claim (their Claim 3) that is similar to our own; their claim is false, however, as a nondeterministic program can be constructed that serves as a counterexample. Much of the subtlety in our approach comes from our method of dealing with nondeterminism.
6 Experiments

We compared our implementation against the following nontermination provers:

– TNT [16]: the original tool was not available, and thus we have reimplemented its constraint-based algorithm with Z3 [11] as the SMT backend.
– AProVE [13], via the Java Bytecode frontend, using the SMT-based non-
termination analysis by Brockschmidt et al. [5].
– Julia [29], which implements an approach via a reduction to constraint logic
programming described by Payet and Spoto [26].
Fig. 5. Evaluation success overview, showing the number of problems solved for each
tool. Here (a) represents the results for known nonterminating examples, (b) is known
terminating examples, (c) is (previously) unknown examples.
On average, a program in our test suite has 18.4 nodes (max. 427 nodes) and 2.4 loops (max. 120 loops).
Unfortunately each tool requires a different machine configuration, and thus a
direct comparison is difficult. Experiments with our procedure were performed on
a dualcore Intel Core 2 Duo U9400 (1.4 GHz, 2 GB RAM, Windows 7). TNT was
run on Intel Core i5-2520M (2.5 GHz, 8 GB RAM, Ubuntu Linux 12.04). We ran
AProVE on Intel Core i7-950 (3.07 GHz, 6 GB RAM, Debian Linux 7.2). Note
that the TNT/AProVE machines are significantly faster than the machine our new procedure was run on, so comparisons between the tools should be adjusted accordingly. For Julia, an unknown cloud-based configuration was used. All tools
were run with a timeout of 60s. When a tool returned early with no definite
result, we display this in the plots using the special “NR” (no result) value.
We ran three sets of experiments: (a) all the examples previously known to be
nonterminating, (b) all the examples previously known to be terminating, and
(c) all the examples where no previous results are known. With (a) we assess
the efficiency of the algorithm, (b) is used to demonstrate its soundness, and (c)
checks if our algorithm scales well on relatively large and complicated examples.
The results of the three sets of experiments are given in Fig. 5, which shows for
each tool and for each set (a)–(c) the numbers of benchmarks with nontermina-
tion proofs (“Nonterm”), timeouts (“TO”), and no results (“No Res”). (Proofs
of termination, found by AProVE and Julia, are also listed as “No Res”.)
On the 89 deterministic instances of our benchmark set, our implementation proves nontermination of 33 examples, and TNT of 21 examples. We have also experimented with different values for the number of repeated paths before invoking Strengthen. The results are reported in Fig. 6 (runtimes are for successful nontermination proofs).

Fig. 6. Repeated paths before calling Strengthen

    # Paths   Nonterm   Time [s]
    2         133       272
    4         133       301
    6         129       264
    ∞         123       272
Fig. 7 charts the difference in power and performance between our imple-
mentation and TNT in a scatter plot, in log scale. Here we have included all
programs from (a)–(c). Each ‘x’-mark in the plot represents an example from the
benchmark. The x-axis gives the runtime of TNT and the y-axis the runtime of our procedure on the same example. Points under the
diagonal are in favor of our procedure. Thus, the more ‘x’-marks there are in the
lower-righthand corner, the better our tool has performed.
Discussion. Figs. 5(a&c) demonstrate that our technique is overwhelmingly the
most successful tool (Fig. 5(b) confirms simply that no tool has demonstrable
7 Conclusion
We have introduced a new method of proving nontermination. The idea is to split
the reasoning in two parts: a safety prover is used to prove that a loop in an
underapproximation of the original program never terminates; meanwhile failed
safety proofs are used to calculate the underapproximation. We have shown that
nondeterminism can be easily handled in our framework while previous tools
often fail. Furthermore, we have shown that our approach leads to performance
improvements against previous tools where they are applicable.
Our technique is not restricted to linear integer arithmetic: given suitable tools for safety proving and for precondition inference, our approach is in principle applicable to any program setting (note that the Strengthen procedure is just an optimization). As future work, heap programs are a highly promising candidate for nontermination analysis via abduction tools for separation logic [6].
References
1. Atig, M.F., Bouajjani, A., Emmi, M., Lal, A.: Detecting fair non-termination in
multithreaded programs. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS,
vol. 7358, pp. 210–226. Springer, Heidelberg (2012)
2. Ball, T., Rajamani, S.K.: The SLAM toolkit. In: Berry, G., Comon, H., Finkel, A.
(eds.) CAV 2001. LNCS, vol. 2102, pp. 260–264. Springer, Heidelberg (2001)
3. Biere, A., Artho, C., Schuppan, V.: Liveness checking as safety checking. In: Proc.
FMICS 2002 (2002)
4. Brockschmidt, M., Cook, B., Fuhs, C.: Better termination proving through cooper-
ation. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 413–429.
Springer, Heidelberg (2013)
5. Brockschmidt, M., Ströder, T., Otto, C., Giesl, J.: Automated detection of non-
termination and NullPointerExceptions for Java Bytecode. In: Beckert, B., Dami-
ani, F., Gurov, D. (eds.) FoVeOOS 2011. LNCS, vol. 7421, pp. 123–141. Springer,
Heidelberg (2012)
6. Calcagno, C., Distefano, D., O’Hearn, P.W., Yang, H.: Compositional shape anal-
ysis by means of bi-abduction. J. ACM 58(6), 26 (2011)
7. Chen, H.-Y., Cook, B., Fuhs, C., Nimkar, K., O’Hearn, P.: Proving nontermination
via safety. Technical Report RN/13/23, UCL (2014)
8. Cook, B., Koskinen, E.: Reasoning about nondeterminism in programs. In: Proc.
PLDI 2013 (2013)
9. Cook, B., Podelski, A., Rybalchenko, A.: Terminator: Beyond safety. In: Ball, T.,
Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 415–418. Springer, Heidelberg
(2006)
10. Cook, B., See, A., Zuleger, F.: Ramsey vs. Lexicographic termination proving.
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 47–61.
Springer, Heidelberg (2013)
11. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
12. Dijkstra, E.W.: A Discipline of Programming. Prentice-Hall (1976)
13. Giesl, J., Schneider-Kamp, P., Thiemann, R.: AProVE 1.2: Automatic termina-
tion proofs in the dependency pair framework. In: Furbach, U., Shankar, N. (eds.)
IJCAR 2006. LNCS (LNAI), vol. 4130, pp. 281–286. Springer, Heidelberg (2006)
14. Giesl, J., Thiemann, R., Schneider-Kamp, P.: Proving and disproving termina-
tion of higher-order functions. In: Gramlich, B. (ed.) FroCos 2005. LNCS (LNAI),
vol. 3717, pp. 216–231. Springer, Heidelberg (2005)
15. Gulwani, S., Srivastava, S., Venkatesan, R.: Program analysis as constraint solving.
In: Proc. PLDI 2008 (2008)
16. Gupta, A., Henzinger, T.A., Majumdar, R., Rybalchenko, A., Xu, R.-G.: Proving
non-termination. In: Proc. POPL 2008 (2008)
17. Gurfinkel, A.: Private communication (2012)
18. Gurfinkel, A., Wei, O., Chechik, M.: Yasm: A software model-checker for verifica-
tion and refutation. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144,
pp. 170–174. Springer, Heidelberg (2006)
19. Harris, W.R., Lal, A., Nori, A.V., Rajamani, S.K.: Alternation for termination. In:
Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 304–319. Springer,
Heidelberg (2010)
20. McMillan, K.L.: Lazy abstraction with interpolants. In: Ball, T., Jones, R.B. (eds.)
CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006)
21. Merz, F., Falke, S., Sinz, C.: LLBMC: Bounded model checking of C and C++ pro-
grams using a compiler IR. In: Joshi, R., Müller, P., Podelski, A. (eds.) VSTTE
2012. LNCS, vol. 7152, pp. 146–161. Springer, Heidelberg (2012)
22. Nelson, G.: A generalization of Dijkstra’s calculus. ACM TOPLAS 11(4) (1989)
23. Payet, É.: Loop detection in term rewriting using the eliminating unfoldings. Theor.
Comput. Sci. 403(2-3) (2008)
24. Payet, É., Mesnard, F.: Nontermination inference of logic programs. ACM
TOPLAS 28(2) (2006)
25. Payet, É., Mesnard, F.: A non-termination criterion for binary constraint logic
programs. TPLP 9(2) (2009)
26. Payet, É., Spoto, F.: Experiments with non-termination analysis for Java Bytecode.
In: Proc. BYTECODE 2009 (2009)
27. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes:
The Art of Scientific Computing. Cambridge Univ. Press (1989)
28. Song, F., Touili, T.: Pushdown model checking for malware detection. In: Flana-
gan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 110–125. Springer,
Heidelberg (2012)
29. Spoto, F., Mesnard, F., Payet, É.: A termination analyzer for Java bytecode based
on path-length. ACM TOPLAS 32(3) (2010)
30. Velroyen, H., Rümmer, P.: Non-termination checking for imperative programs. In:
Beckert, B., Hähnle, R. (eds.) TAP 2008. LNCS, vol. 4966, pp. 154–170. Springer,
Heidelberg (2008)
31. Voets, D., De Schreye, D.: A new approach to non-termination analysis of logic
programs. In: Hill, P.M., Warren, D.S. (eds.) ICLP 2009. LNCS, vol. 5649, pp.
220–234. Springer, Heidelberg (2009)
Ranking Templates for Linear Loops
1 Introduction
These linear ranking templates can be used as a 'construction kit' for composing linear ranking templates that enable more complex ranking functions (Subsection 4.4). Moreover, variations on the linear ranking templates presented here can be used, and completely different templates could be conceived.
Our method is described in Section 5 and can be summarized as follows. The
input is a linear loop program as well as a linear ranking template. From these we construct a constraint on the parameters of the template. With Motzkin's
Theorem we can transform the constraint into a purely existentially quantified
constraint (Subsection 5.1). This ∃-constraint is then passed to an SMT solver
which checks it for satisfiability. A positive result implies that the program termi-
nates. Furthermore, a satisfying assignment will yield a ranking function, which
constitutes a termination argument for the given linear loop program.
Related approaches invoke Farkas’ Lemma for the transformation into ∃-
constraints [2,5,6,7,14,20,24,25]. The piecewise and the lexicographic ranking
template contain both strict and non-strict inequalities, yet only non-strict in-
equalities can be transformed using Farkas’ Lemma. We solve this problem by
introducing the use of Motzkin’s Transposition Theorem, a generalization of
Farkas’ Lemma. As a side effect, this also enables both strict and non-strict inequalities in the program syntax. To our knowledge, all of the aforementioned
methods can be improved by the application of Motzkin’s Theorem instead of
Farkas’ Lemma.
Our method is complete in the following sense. If there is a ranking function
of the form specified by the given linear ranking template, then our method will
discover this ranking function. In other words, the existence of a solution is never
lost in the process of transforming the constraint.
In contrast to some related methods [14,20] the constraint we generate is not
linear, but rather a nonlinear algebraic constraint. Theoretically, this constraint
can be decided in exponential time [10]. Much progress on nonlinear SMT solvers
has been made and present-day implementations routinely solve nonlinear con-
straints of various sizes [16].
A related setting to linear loop programs are linear lasso programs. These
consist of a linear loop program and a program stem, both of which are specified
by boolean combinations of affine-linear inequalities over the program variables.
Our method can be extended to linear lasso programs through the addition of
affine-linear inductive invariants, analogously to related approaches [5,7,14,25].
2 Preliminaries
In this paper we use K to denote a field that is either the rational numbers Q or
the real numbers R. We use ordinal numbers according to the definition in [15].
The first infinite ordinal is denoted by ω; the finite ordinals coincide with the
natural numbers, therefore we will use them interchangeably.
∃λ ∈ K^m ∃μ ∈ K^k. λ ≥ 0 ∧ μ ≥ 0 ∧ λ^T A + μ^T C = 0 ∧ λ^T b + μ^T d ≤ 0 ∧ (λ^T b < 0 ∨ μ ≠ 0)    (M2)

for some finite index set I, some matrices A_i ∈ K^(n×m_i), C_i ∈ K^(n×k_i), and some vectors b_i ∈ K^(m_i) and d_i ∈ K^(k_i). The linear loop program LOOP(x, x′) is called conjunctive iff there is only one disjunct, i.e., #I = 1.
    while (q > 0):
        if (y > 0):
            q := q − y − 1;
        else:
            q := q + y − 1;

(q > 0 ∧ y > 0 ∧ y′ = y ∧ q′ = q − y − 1)
∨ (q > 0 ∧ y ≤ 0 ∧ y′ = y ∧ q′ = q + y − 1)
3 Ranking Templates
A ranking template is a template for a well-founded relation. More specifically, it
is a parametrized formula defining a relation that is well-founded for all assign-
ments to the parameters. If we show that a given program’s transition relation
LOOP is a subset of an instance of this well-founded relation, it must be well-
founded itself and thus we have a proof for the program’s termination. Moreover,
an assignment to the parameters of the template gives rise to a ranking func-
tion. In this work, we consider ranking templates that can be encoded in linear
arithmetic.
We call a formula whose free variables contain x and x′ a relation template. Each free variable other than x and x′ in a relation template is called a parameter. Given an assignment ν to all parameter variables of a relation template T(x, x′), the evaluation ν(T) is called an instantiation of the relation template T. We note that each instantiation of a relation template T(x, x′) defines a binary relation.
When specifying templates, we use parameter variables to define affine-linear functions. For notational convenience, we will write f(x) instead of the term s_f^T x + t_f, where s_f ∈ K^n and t_f ∈ K are parameter variables. We call f an affine-linear function symbol.
Example 2. We call the following template with parameters D = {δ} and affine-
linear function symbols F = {f } the Podelski-Rybalchenko ranking template [20].
The next lemma states that we can prove termination of a given linear loop
program by checking that this program’s transition relation is included in an
instantiation of a linear ranking template.
Lemma 1. Let LOOP be a linear loop program and let T be a linear ranking template with parameters D and affine-linear function symbols F. If there is an assignment ν to D and F such that the formula

    ∀x, x′. LOOP(x, x′) → ν(T)(x, x′)    (2)

is valid, then the linear loop program LOOP terminates.

Proof. By definition, ν(T) is a well-founded relation, and (2) is valid iff the relation LOOP is a subset of ν(T). Thus LOOP must be well-founded.
Example 3. Consider the terminating linear loop program LOOP from Example 1. A ranking function for LOOP is ρ : R² → ω, defined as follows.

    ρ(q, y) = ⌈q⌉ if q > 0, and 0 otherwise.

Here ⌈·⌉ denotes the ceiling function that assigns to every real number r the smallest natural number that is larger or equal to r. Since we consider the natural numbers to be a subset of the ordinals, the ranking function ρ is well-defined.
For better readability we used this notation which does not explicitly refer to
δ. In our presentation the step size δ is always clear from the context in which
an ordinal ranking equivalent f is used.
Example 4. Consider the linear loop program LOOP(x, x′) from Example 1. For δ = 1/2 and f(q) = q + 1, the ordinal ranking equivalent of f with step size δ is

    f̂(q, y) = ⌈2(q + 1)⌉ if q + 1 > 0, and 0 otherwise.
The assignment from Example 4 to δ and f makes the implication (2) valid.
In order to invoke Lemma 1 to show that the linear loop program given in
Example 1 terminates, we need to prove that the Podelski-Rybalchenko ranking
template is a linear ranking template. We use the following technical lemma.
Lemma 3. Let f be an affine-linear function of step size δ > 0 and let x and x′ be two states. If f(x) > 0 and f(x) − f(x′) > δ, then f̂(x) > 0 and f̂(x) > f̂(x′).

Proof. From f(x) > 0 it follows that f̂(x) > 0. Therefore f̂(x) > f̂(x′) in the case f̂(x′) = 0. For f̂(x′) > 0, we use the fact that f(x) − f(x′) > δ to conclude that f(x)/δ − f(x′)/δ > 1 and hence f̂(x) > f̂(x′).
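As an informal sanity check of Lemma 3 (our addition, not part of the paper), the following randomly tests the claim for the function and step size of Example 4, using the ordinal ranking equivalent suggested there, f̂(x) = ⌈f(x)/δ⌉ when f(x) > 0 and 0 otherwise.

    # Random sanity check of Lemma 3 for f(q) = q + 1 and delta = 1/2.
    import math, random

    delta = 0.5
    f = lambda q: q + 1
    f_hat = lambda q: math.ceil(f(q) / delta) if f(q) > 0 else 0

    random.seed(0)
    for _ in range(100000):
        x, x2 = random.uniform(-10, 10), random.uniform(-10, 10)
        if f(x) > 0 and f(x) - f(x2) > delta:
            assert f_hat(x) > 0 and f_hat(x) > f_hat(x2)
    print("no counterexample found")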
Example 5. Consider the linear loop program from Figure 1. Every execution
can be partitioned into two phases: first y increases until it is positive and then
q decreases until the loop condition q > 0 is violated. Depending on the initial
values of y and q, either phase might be skipped altogether.
⋀_{i=1}^{k} δ_i > 0                                              (3)
∧ ⋁_{i=1}^{k} f_i(x) > 0                                          (4)
∧ f_1(x′) < f_1(x) − δ_1                                          (5)
∧ ⋀_{i=2}^{k} ( f_i(x′) < f_i(x) − δ_i ∨ f_{i−1}(x) > 0 )          (6)
Let (x, x′) ∈ T. By Lemma 2, we need to show that ρ(x′) < ρ(x). From (4) follows that ρ(x) > 0. Moreover, there is an i such that f_i(x) > 0 and f_j(x) ≤ 0 for all j < i. By (5) and (6), f_j(x′) ≤ 0 for all j < i, because f_j(x′) < f_j(x) − δ_j ≤ 0 − δ_j ≤ 0, since f_ℓ(x) ≤ 0 for all ℓ < j.
If f_i(x′) ≤ 0, then ρ(x′) ≤ ω·(k − i) < ω·(k − i) + f̂_i(x) = ρ(x). Otherwise, f_i(x′) > 0 and from (6) follows f_i(x′) < f_i(x) − δ_i. By Lemma 3, f̂_i(x) > f̂_i(x′) for the ordinal ranking equivalent of f_i with step size δ_i. Hence
q > 0 ∧ q′ = q + z − 1 ∧ z′ = −z
Here, the sign of z is alternated in each iteration. The function ρ(q, y, z) = q
is decreasing in every second iteration, but not decreasing in each iteration.
Example 8. Consider the following linear loop program.
(q > 0 ∧ y > 0 ∧ y′ = 0)
∨ (q > 0 ∧ y ≤ 0 ∧ y′ = y − 1 ∧ q′ = q − 1)
For a given input, we cannot give an upper bound on the execution time: starting
with y > 0, after the first loop execution, y is set to 0 and q is set to some arbitrary value, as no restriction on q′ applies in the first disjunct. In particular,
this value does not depend on the input. The remainder of the loop execution
then takes q iterations to terminate.
However we can prove the program’s termination with the 2-phase ranking
function constructed from f1 (q, y) = y and f2 (q, y) = q.
δ > 0                                                                        (8)
∧ ⋀_{i=1}^{k} ⋀_{j=1}^{k} ( g_i(x) < 0 ∨ g_j(x′) < 0 ∨ f_j(x′) < f_i(x) − δ )  (9)
∧ ⋀_{i=1}^{k} f_i(x) > 0                                                      (10)
∧ ⋁_{i=1}^{k} g_i(x) ≥ 0                                                      (11)
The function ρ is well-defined, because according to (11), the set {f_i(x) | g_i(x) ≥ 0} is not empty. Let (x, x′) ∈ T and let i and j be indices such that ρ(x) = f̂_i(x) and ρ(x′) = f̂_j(x′). By definition of ρ, we have that g_i(x) ≥ 0 and g_j(x′) ≥ 0, and (9) thus implies f_j(x′) < f_i(x) − δ. According to Lemma 3 and (10), this entails f̂_j(x′) < f̂_i(x) and therefore ρ(x′) < ρ(x). Lemma 2 now implies that T is well-founded.
Example 9. Consider the following linear loop program.
(q > 0 ∧ p > 0 ∧ q < p ∧ q′ = q − 1)
∨ (q > 0 ∧ p > 0 ∧ p < q ∧ p′ = p − 1)
In every loop iteration, the minimum of p and q is decreased by 1 until it be-
comes negative. Thus, this program is ranked by the 2-piece ranking function
constructed from f_1(p, q) = p and f_2(p, q) = q with step size δ = 1/2 and discrimi-
nators g1 (p, q) = q −p and g2 (p, q) = p−q. Moreover, this program does not have
a multiphase or lexicographic ranking function: both p and q may increase with-
out bound during program execution due to non-determinism and the number
of switches between p and q being the minimum value is also unbounded.
∧ ⋁_{i=1}^{k} f_i(x′) < f_i(x) − δ_i    (16)
ρ(x) = Σ_{i=1}^{k} ω^{k−i} · f̂_i(x)    (17)
Let (x, x′) ∈ T. From (14) follows f_j(x) > 0 for all j, so ρ(x) > 0. By (16) and Lemma 3, there is a minimal i such that f̂_i(x′) < f̂_i(x). According to (15), f_1(x′) ≤ f_1(x) and hence inductively f_j(x′) ≤ f_j(x) for all j < i, since i was minimal.
ρ(x′) = Σ_{j=1}^{k} ω^{k−j} · f̂_j(x′) ≤ Σ_{j=1}^{i−1} ω^{k−j} · f̂_j(x) + Σ_{j=i}^{k} ω^{k−j} · f̂_j(x′)
       < Σ_{j=1}^{i−1} ω^{k−j} · f̂_j(x) + ω^{k−i} · f̂_i(x) ≤ ρ(x)
⋀_{i=1}^{k} ⋀_{j} δ_{i,j} > 0
∧ ⋀_{i=1}^{k} ⋁_{j} f_{i,j}(x) > 0
∧ ⋀_{i=1}^{k−1} [ ( f_{i,1}(x′) ≤ f_{i,1}(x) ∧ ⋀_{j≥2} ( f_{i,j}(x′) ≤ f_{i,j}(x) ∨ f_{i,j−1}(x) > 0 ) )
      ∨ ⋁_{t=1}^{i−1} ( f_{t,1}(x′) < f_{t,1}(x) − δ_{t,1} ∧ ⋀_{j≥2} ( f_{t,j}(x′) < f_{t,j}(x) − δ_{t,j} ∨ f_{t,j−1}(x) > 0 ) ) ]
∧ ⋁_{i=1}^{k} ( f_{i,1}(x′) < f_{i,1}(x) − δ_{i,1} ∧ ⋀_{j≥2} ( f_{i,j}(x′) < f_{i,j}(x) − δ_{i,j} ∨ f_{i,j−1}(x) > 0 ) )
Input: linear loop program LOOP and a list of linear ranking templates T
Output: a ranking function for LOOP or null if none is found
foreach T ∈ T do:
    let ϕ = ∀x, x′. LOOP(x, x′) → T(x, x′)
    let ψ = transformWithMotzkin(ϕ)
    if SMTsolver.checkSAT(ψ):
        let (D, F) = T.getParameters()
        let ν = getAssignment(ψ, D, F)
        return T.extractRankingFunction(ν)
return null
Fig. 3. Our ranking function synthesis algorithm described in pseudocode. The func-
tion transformWithMotzkin transforms the ∃∀-constraint ϕ into an ∃-constraint ψ as
described in Subsection 5.1.
LOOP(x, x′) ≡ ⋁_{i∈I} A_i (x x′) ≤ b_i
T(x, x′) ≡ ⋀_{j∈J} ⋁_{ℓ∈L_j} T_{j,ℓ}(x, x′)
We prove the termination of LOOP by solving the constraint (2). This constraint
is implicitly existentially quantified over the parameters D and the parameters
corresponding to the affine-linear function symbols F .
∀x, x′. ⋀_{i∈I} ⋀_{j∈J} ( A_i (x x′) ≤ b_i → ⋁_{ℓ∈L_j} T_{j,ℓ}(x, x′) )    (18)
First, we transform the constraint (18) into an equivalent constraint of the form
required by Motzkin’s Theorem.
∀x, x′. ⋀_{i∈I} ⋀_{j∈J} ¬( A_i (x x′) ≤ b_i ∧ ⋀_{ℓ∈L_j} ¬T_{j,ℓ}(x, x′) )    (19)
Now, Motzkin’s Transposition Theorem will transform the constraint (19) into
an equivalent existentially quantified constraint.
This ∃-constraint is then checked for satisfiability. If an assignment is found,
it gives rise to a ranking function. Conversely, if no assignment exists, then there
cannot be an instantiation of the linear ranking template and thus no ranking
function of the kind formalized by the linear ranking template. In this sense our
method is sound and complete.
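As a purely illustrative companion to this description, one can also hand the ∃∀-constraint (2) directly to an SMT solver that supports quantifiers, here for the loop of Example 1 and the Podelski-Rybalchenko template; we assume that template has its usual form δ > 0 ∧ f(x) > 0 ∧ f(x′) < f(x) − δ, since its statement falls on a page not reproduced above. Such a direct query may or may not be solved in practice, which is precisely why the method above first eliminates the universal quantifiers with Motzkin's theorem, yielding a purely existential (nonlinear) constraint.

    # Illustration only: solving the EA-constraint (2) directly with Z3's quantifier support.
    from z3 import Reals, ForAll, Implies, And, Or, Solver, sat

    q, y, q1, y1 = Reals('q y q1 y1')            # q1, y1 stand for the primed variables
    sq, sy, t, delta = Reals('sq sy t delta')    # parameters of f(q, y) = sq*q + sy*y + t

    def f(a, b):
        return sq * a + sy * b + t

    LOOP = Or(And(q > 0, y > 0, y1 == y, q1 == q - y - 1),
              And(q > 0, y <= 0, y1 == y, q1 == q + y - 1))
    T = And(delta > 0, f(q, y) > 0, f(q1, y1) < f(q, y) - delta)   # assumed PR template

    s = Solver()
    s.add(ForAll([q, y, q1, y1], Implies(LOOP, T)))
    if s.check() == sat:
        print(s.model())   # e.g. sq = 1, sy = 0 corresponds to the ranking function f(q, y) = q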
6 Related Work
The first complete method of ranking function synthesis for linear loop programs through constraint solving was due to Podelski and Rybalchenko [20]. Their approach considers termination arguments in the form of affine-linear ranking func-
tions and requires only linear constraint solving. We explained the relation to
their method in Example 2.
Bradley, Manna, and Sipma propose a related approach for linear lasso pro-
grams [5]. They introduce affine-linear inductive supporting invariants to handle
the stem. Their termination argument is a lexicographic ranking function with
each component corresponding to one loop disjunct. This not only requires non-
linear constraint solving, but also an ordering on the loop disjuncts. The authors
extend this approach in [6] by the use of template trees. These trees allow each
lexicographic component to have a ranking function that decreases not neces-
sarily in every step, but eventually.
In [14] the method of Podelski and Rybalchenko is extended. Utilizing sup-
porting invariants analogously to Bradley et al., affine-linear ranking functions
are synthesized. Due to the restriction to non-decreasing invariants, the gener-
ated constraints are linear.
A collection of example-based explanations of constraint-based verification
techniques can be found in [24]. This includes the generation of ranking functions,
interpolants, invariants, resource bounds and recurrence sets.
In [4] Ben-Amram and Genaim discuss the synthesis of affine-linear and lex-
icographic ranking functions for linear loop programs over the integers. They
prove that this problem is generally co-NP-complete and show that several spe-
cial cases admit a polynomial time complexity.
References
1. Albert, E., Arenas, P., Genaim, S., Puebla, G.: Closed-form upper bounds in static
cost analysis. J. Autom. Reasoning 46(2), 161–203 (2011)
2. Alias, C., Darte, A., Feautrier, P., Gonnord, L.: Multi-dimensional rankings, pro-
gram termination, and complexity bounds of flowchart programs. In: Cousot, R.,
Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 117–133. Springer, Heidelberg
(2010)
3. Ben-Amram, A.M.: Size-change termination, monotonicity constraints and ranking
functions. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 109–
123. Springer, Heidelberg (2009)
4. Ben-Amram, A.M., Genaim, S.: Ranking functions for linear-constraint loops. In:
POPL (2013)
5. Bradley, A.R., Manna, Z., Sipma, H.B.: Linear ranking with reachability. In: Etes-
sami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 491–504. Springer,
Heidelberg (2005)
6. Bradley, A.R., Manna, Z., Sipma, H.B.: The polyranking principle. In: Caires, L.,
Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS,
vol. 3580, pp. 1349–1361. Springer, Heidelberg (2005)
7. Colón, M.A., Sankaranarayanan, S., Sipma, H.B.: Linear invariant generation using
non-linear constraint solving. In: Hunt Jr., W.A., Somenzi, F. (eds.) CAV 2003.
LNCS, vol. 2725, pp. 420–432. Springer, Heidelberg (2003)
8. Cook, B., Fisher, J., Krepska, E., Piterman, N.: Proving stabilization of biological
systems. In: Jhala, R., Schmidt, D. (eds.) VMCAI 2011. LNCS, vol. 6538, pp.
134–149. Springer, Heidelberg (2011)
9. Cook, B., Podelski, A., Rybalchenko, A.: Terminator: Beyond safety. In: Ball, T.,
Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 415–418. Springer, Heidelberg
(2006)
10. Grigor’ev, D.Y., Vorobjov Jr., N.N.: Solving systems of polynomial inequalities in
subexponential time. Journal of Symbolic Computation 5(1-2), 37–64 (1988)
11. Gulwani, S., Zuleger, F.: The reachability-bound problem. In: PLDI, pp. 292–304
(2010)
12. Gupta, A., Henzinger, T.A., Majumdar, R., Rybalchenko, A., Xu, R.G.: Proving
non-termination. In: POPL, pp. 147–158 (2008)
13. Harris, W.R., Lal, A., Nori, A.V., Rajamani, S.K.: Alternation for termination. In:
Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 304–319. Springer,
Heidelberg (2010)
14. Heizmann, M., Hoenicke, J., Leike, J., Podelski, A.: Linear ranking for linear lasso
programs. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp.
365–380. Springer, Heidelberg (2013)
15. Jech, T.: Set Theory, 3rd edn. Springer (2006)
16. Jovanović, D., de Moura, L.: Solving non-linear arithmetic. In: Gramlich, B., Miller,
D., Sattler, U. (eds.) IJCAR 2012. LNCS, vol. 7364, pp. 339–354. Springer, Hei-
delberg (2012)
17. Kroening, D., Sharygina, N., Tonetta, S., Tsitovich, A., Wintersteiger, C.M.: Loop
summarization using abstract transformers. In: Cha, S(S.), Choi, J.-Y., Kim,
M., Lee, I., Viswanathan, M. (eds.) ATVA 2008. LNCS, vol. 5311, pp. 111–125.
Springer, Heidelberg (2008)
18. Kroening, D., Sharygina, N., Tsitovich, A., Wintersteiger, C.M.: Termination anal-
ysis with compositional transition invariants. In: Touili, T., Cook, B., Jackson, P.
(eds.) CAV 2010. LNCS, vol. 6174, pp. 89–103. Springer, Heidelberg (2010)
19. Leike, J.: Ranking function synthesis for linear lasso programs. Master’s thesis,
University of Freiburg, Germany (2013)
20. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear rank-
ing functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp.
239–251. Springer, Heidelberg (2004)
21. Podelski, A., Rybalchenko, A.: Transition invariants. In: LICS, pp. 32–41 (2004)
22. Podelski, A., Rybalchenko, A.: Transition predicate abstraction and fair termina-
tion. In: POPL, pp. 132–144 (2005)
23. Podelski, A., Wagner, S.: A sound and complete proof rule for region stability of
hybrid systems. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007.
LNCS, vol. 4416, pp. 750–753. Springer, Heidelberg (2007)
24. Rybalchenko, A.: Constraint solving for program verification: Theory and practice
by example. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174,
pp. 57–71. Springer, Heidelberg (2010)
25. Sankaranarayanan, S., Sipma, H.B., Manna, Z.: Constraint-based linear-relations
analysis. In: Giacobazzi, R. (ed.) SAS 2004. LNCS, vol. 3148, pp. 53–68. Springer,
Heidelberg (2004)
26. Schrijver, A.: Theory of linear and integer programming. Wiley-Interscience series
in discrete mathematics and optimization. Wiley (1999)
FDR3 — A Modern Refinement Checker for CSP
1 Introduction
2 CSP
CSP [1,2,3] is a process algebra in which programs or processes that communicate
events from a set Σ with an environment may be described. We sometimes
structure events by sending them along a channel. For example, c.3 denotes
the value 3 being sent along the channel c. Further, given a channel c the set
{|c|} ⊆ Σ contains those events of the form c.x .
The simplest CSP process is the process STOP that can perform no events.
The process a → P offers the environment the event a ∈ Σ and then behaves
like P. The process P □ Q offers the environment the choice of the events offered by P and by Q and is not resolved by the internal action τ. P ⊓ Q non-deterministically chooses which of P or Q to behave like. P ▷ Q initially behaves like P, but can timeout (via τ) and then behaves as Q.
P [A‖B] Q allows P and Q to perform only events from A and B respectively and forces P and Q to synchronise on events in A ∩ B. P ‖A Q allows P and Q to run in parallel, forcing synchronisation on events in A and arbitrary interleaving of events not in A. The interleaving of two processes, denoted P ||| Q, runs P and Q in parallel but enforces no synchronisation. P \ A behaves as P but hides any events from A by transforming them into the internal event τ. This event does not synchronise with the environment and thus can always occur. P[[R]] behaves as P but renames the events according to the relation R. Hence, if P can perform a, then P[[R]] can perform each b such that (a, b) ∈ R, where the choice (if more than one such b) is left to the environment (like □). P △ Q initially behaves like P but allows Q to interrupt at any point and perform a visible event, at which point P is discarded and the process behaves like Q. P Θ_A Q initially behaves like P, but if P ever performs an event from A, P is discarded and P Θ_A Q behaves like Q. Skip is the process that immediately terminates.
function Refines(S, I, M)
    done ← {}                              ▹ the set of states that have been visited
    current ← {(root(S), root(I))}         ▹ states to visit on the current ply
    next ← {}                              ▹ states to visit on the next ply
    while current ≠ {} do
        for (s, i) ← current \ done do
            Check if i refines s according to M
            done ← done ∪ {(s, i)}
            for (e, i′) ∈ transitions(I, i) do
                if e = τ then next ← next ∪ {(s, i′)}
                else
                    t ← transitions(S, s, e)
                    if t = {} then Report trace error   ▹ S cannot perform the event
                    else
                        {s′} ← t
                        next ← next ∪ {(s′, i′)}
        current ← next
        next ← {}
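The pseudocode above translates almost line-for-line into an executable sketch; the following Python transcription is ours and elides the per-pair refinement check, assuming the specification S is normalised so that each (state, event) pair has at most one successor.

    # Illustration only: breadth-first exploration of the product of S and I, as in the figure above.
    TAU = "tau"

    def refines(spec_trans, impl_trans, spec_root, impl_root):
        # spec_trans(s) and impl_trans(i) yield (event, successor) pairs
        done = set()
        current = {(spec_root, impl_root)}
        while current:
            nxt = set()
            for (s, i) in current - done:
                # here one would check that i refines s in the chosen model M
                done.add((s, i))
                for (e, i2) in impl_trans(i):
                    if e == TAU:
                        nxt.add((s, i2))
                    else:
                        t = {s2 for (e2, s2) in spec_trans(s) if e2 == e}
                        if not t:
                            return ("trace error", s, i, e)   # S cannot perform the event
                        (s2,) = t                             # normalised: unique successor
                        nxt.add((s2, i2))
            current = nxt
        return "refines"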
theorems of digitisation [17]. In order to support this, FDR3 also supports the
prioritise operator [3,18], which has other interesting applications as shown
there.
function Worker(S, I, M, w)
    done_w, current_w, next_w ← {}, {}, {}
    finished_w ← true
    if hash(root(S), root(I)) = w then
        current_w ← {(root(S), root(I))}
        finished_w ← false
    while ⋁_{w′ ∈ Workers} ¬finished_{w′} do
        Wait for other workers to ensure the plys start together
        finished_w ← true
        for (s, i) ← current_w \ done_w do
            finished_w ← false
            Check if i refines s according to M
            done_w ← done_w ∪ {(s, i)}
            for (i′, e) ∈ transitions(I, i) do
                if e = τ then
                    w′ ← hash(s, i′) mod #Workers
                    next_{w′} ← next_{w′} ∪ {(s, i′)}
                else
                    t ← transitions(S, s, e)
                    if t = {} then Report Trace Error
                    else
                        {s′} ← t
                        w′ ← hash(s′, i′) mod #Workers
                        next_{w′} ← next_{w′} ∪ {(s′, i′)}
        Wait for other workers to finish their ply
        current_w ← next_w
        next_w ← {}
Fig. 2. Each worker in a parallel refinement check executes the above function. The
set of all workers is given by Workers. Hash(s, i ) is an efficient hash function on the
state pair (s, i ). All other functions are as per Figure 1.
all of the above sets [19], primarily because this allowed checks to efficiently use
disk-based storage when RAM was exhausted (in contrast to, e.g. hash tables,
where performance often decays to the point of being unusable once RAM has
been exhausted). This brings the additional benefit that inserts into done (from
current ) can be performed in sorted order. Since B-Trees perform almost op-
timally under such workloads, this makes insertions into the done tree highly
efficient. To improve efficiency, inserts into the next tree are buffered, with the
buffer being sorted before insertion. The storage that the B-Tree uses is also
compressed, typically resulting in memory requirements being halved.
consideration is minimising memory usage. In fact, this becomes even more critical
in the parallel setting since memory will be consumed at a far greater rate: with 16
cores, FDR3 can visit up to 7 billion states per hour consuming 70GB of storage.
Thus, we need to allow checks to exceed the size of the available RAM. Given the
above, B-Trees are a natural choice for storing the sets.
All access to the done and current B-Trees is restricted to the worker who
owns those B-Trees, meaning that there are no threading issues to consider. The
next B-Trees are more problematic: workers can generate node pairs for other
workers. Thus, we need to provide some way of accessing the next B-Trees of
other workers in a thread-safe manner. Given the volume of data that needs to
be put into next (which can be an order of magnitude greater than the volume
put into done), locking the tree is undesirable. One option would be to use fine-
grained locking on the B-Tree, however this is difficult to implement efficiently.
Instead of using complex locks, we have generalised the buffering that is used
to insert into next under the single-threaded algorithm. Each worker w has a set
of buffers, one for each other worker, and a list of buffers it has received from
other workers that require insertion into this worker's next. When a buffer of worker w for worker w′ ≠ w fills up, w immediately passes it to the target worker.
Workers periodically check the stack of pending buffers to be flushed, and when
a certain size is exceeded, they perform a bulk insert into next by performing a
n-way merge of all of the pending buffers to produce a single sorted buffer.
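A small sketch (ours, not FDR3's code) of this buffering scheme is given below: other workers hand over sorted buffers, and the owning worker bulk-inserts them into its next structure via an n-way merge; a plain sorted Python list stands in for FDR3's compressed B-Tree.

    import bisect
    import heapq

    class NextSet:
        def __init__(self):
            self.items = []        # kept sorted; plays the role of the next B-Tree
            self.pending = []      # sorted buffers received from other workers

        def receive(self, buffer):
            # called when another worker's outgoing buffer for this worker fills up
            self.pending.append(sorted(buffer))

        def flush_pending(self):
            # bulk insert: n-way merge of the pending buffers into one sorted stream
            for pair in heapq.merge(*self.pending):
                bisect.insort(self.items, pair)
            self.pending.clear()

    ns = NextSet()
    ns.receive([(3, 1), (1, 2)])
    ns.receive([(2, 5)])
    ns.flush_pending()
    print(ns.items)   # [(1, 2), (2, 5), (3, 1)]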
One potential issue this algorithm could suffer from is uneven distribution
amongst the workers. We have not observed this problem: the workers have
terminated at roughly the same time. If necessary this could be addressed by
increasing the number of partitions, with workers picking a partition to work on.
We give experimental results that show the algorithm is able to achieve a near
linear speed up in Section 6.
Related Work There have been many algorithms proposed for parallelising BFS,
e.g. [20,21,22,23]. In general, these solutions do not attempt to optimise memory usage or performance once RAM has been exhausted to the same degree.
The authors of [20] parallelised the FDR2 refinement checker for cluster sys-
tems that used MPI. The algorithm they used was similar to our algorithm in
that nodes were partitioned amongst the workers and that B-Trees were used
for storage. The main difference comes from the communication of next : in their
approach this was deferred until the end of each round where a bulk exchange
was done, whereas in our model we use a complex buffer system.
The authors of [21] propose a solution that is optimised for performing a BFS
on sparse graphs. This uses a novel tree structure to efficiently (in terms of time)
store the bag of nodes that are to be visited on the next ply. This was not suitable
for FDR since it does not provide a general solution for eliminating duplicates
in next , which would cause FDR3 to use vastly more memory.
The author of [23] enhances the Spin Model Checker [9] to support parallel
BFS. In this solution, which is based on [24], done is a lock-free hash-table and is
shared between all of the workers, whilst new states are randomly assigned to a
number of subsets which are lock-free linked lists. This approach is not suitable
for FDR since hash-tables are known not to perform well once RAM has been
exhausted (due to their essentially random access pattern). Storing next in a
series of linked-lists is suitable for Spin since it can efficiently check if a node is
in done using the lock-free hash-table. This is not the case for FDR, since there
is no way of efficiently checking if a node is in the done B-Tree of a worker.
5 Compiler
As outlined in Section 3, the compiler is responsible for converting syntactic
processes into GLTSs. This is a difficult problem due to the generality of CSP
since operators can be combined in almost arbitrary ways. In order to allow the
processes to be represented efficiently, FDR3 has a number of different GLTS
types as described in Section 5.1, and a number of different way of construct-
ing each GLTS, as described in Section 5.2. In Section 5.3 we detail the new
algorithm that the compiler uses to decide which of FDR3’s representations of
GLTSs to use. This is of critical importance: if FDR3 were to choose the wrong
representation this could cause the time to check a property and the memory
requirements to greatly increase.
5.1 GLTSs
FDR3 has two main representations of GLTSs: Explicit and Super-Combinator ma-
chines. Explicit machines require memory proportional to the number of states and
transitions during a refinement check. In contrast, Super-Combinator machines only
require storage proportional to the number of states, since the transitions can be
computed on-the-fly. Equally, it takes longer to calculate the transitions of a Super-
Combinator machine than the corresponding Explicit machine.
An Explicit GLTS is simply a standard graph data structure. Nodes in an
Explicit GLTS are process states whilst the transitions are stored in a sorted list.
A Super-Combinator machine represents the LTS by a series of component LTSs
along with a list of rules to combine the transitions of the components. Nodes
for a Super-Combinator machine are tuples, with one entry for each component
machine. For example, a Super-Combinator for P ||| Q consists of the components
P, Q and the rules:
{(1 ↦ a, a) | a ∈ αP ∪ {τ}} ∪ {(2 ↦ a, a) | a ∈ αQ ∪ {τ}}
where αX is the alphabet of the process X (i.e. the set of events it can perform).
These rules describe how to combine the actions of P and Q into actions of the
whole machine. A single rule is of the form (f , e) where f is a partial function
from the index of a component machine (e.g. in the above example, 1 represents
P ) to the event that component must perform. e is the event the overall machine
performs if all components perform their required events.
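The rule format can be made concrete with a toy sketch (ours, not FDR3 internals): a rule (f, e) fires when every component named by f can perform its required event, and the combined machine then performs e. For P ||| Q each rule names a single component, so the components never synchronise.

    import itertools

    def combined_transitions(components, state, rules):
        # components[idx]: function (local_state, event) -> set of successor local states
        # state: tuple of local states; rules: iterable of (f, e) with f a dict {index: event}
        result = []
        for f, e in rules:
            per_component = []
            for idx, a in f.items():
                succs = components[idx](state[idx], a)
                per_component.append([(idx, s2) for s2 in succs])
            # the rule fires only if every component named by f can perform its event
            for choice in itertools.product(*per_component):
                new_state = list(state)
                for idx, s2 in choice:
                    new_state[idx] = s2
                result.append((e, tuple(new_state)))
        return result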
Rules can also be split into formats, which are sets of rules. For example, a
Super-Combinator for P ; Q would start in format 1, which has the rules:
{(1 ↦ a, a, 1) | a ∈ αP ∪ {τ}, a ≠ ✓} ∪ {(1 ↦ ✓, τ, 2)}.
The second format has the rules: {(2 ↦ a, a, 2) | a ∈ αQ ∪ {τ}}. Thus, the first format allows P to perform visible events and stay in format 1 (as indicated by the third element of the tuple), but if P performs ✓ and terminates, the second format is started, which allows Q to perform visible events.
Rules can also specify that component machines should be restarted. For ex-
ample, to represent P = X ; P as a Super-Combinator, there needs to be a way of restarting the process X after a ✓. Thus, we add to the rules a list of components
whose states should be discarded and replaced by their root states:
The first rule set allows X to perform non-✓ events as usual. However, if X ever performs a ✓, this is converted into a τ and component 1 (i.e. X) is restarted.
FDR also recursively combines the rules for Super-Combinator machines. For
example, (P ||| Q ) ||| R is not represented as two different Super-Combinator
machines, but instead the rules for P ||| Q and · ||| R are combined. This
process is known as supercompilation. As you might expect from the name, super-
combinators are closely related to combinator operational semantics: the “super”
essentially co-incides with the joining together using supercompilation.
5.2 Strategies
There are several different strategies that FDR3 can use to construct Explicit or
Super-Combinator machines from syntactic processes. These strategies differ in
the type of processes that they can support (e.g. some cannot support recursive
processes), the time they take to execute and the type of the resulting GLTS.
The low-level is the simplest strategy and supports any process. An Explicit
LTS is constructed simply by directly applying CSP’s operational semantics.
The high-level compiles a process to a Super-Combinator. This is not able to
compile recursive processes, such as P = a → P . The supercombinator rules are
directly constructed using the operational semantics of CSP.
The mixed-level is a hybrid of the low and high-level strategies where, in-
tuitively, non-recursive parts of processes are compiled as per the high-level
strategy whilst recursive parts are compiled as per the low-level strategy. For
example, consider P = a → P □ b → (X ||| Y): compiling X ||| Y at the high-level is preferable since it does not require the cartesian product of X and Y to be formed. If P is compiled at the mixed-level, X ||| Y is compiled at the high-level, and a → P □ b → · is compiled into an Explicit machine. These are wrapped in a Super-Combinator machine that starts X ||| Y when the Explicit machine performs the b. The supercombinator has two formats, the first with the rules: {({1 ↦ a}, a, 1), ({1 ↦ b}, b, 2)} and the second with: {({2 ↦ a}, a, 2) | a ∈ α(X ||| Y) ∪ {τ}}. Thus, when the first process performs
b, the Super-Combinator moves to the second format in which X ||| Y is run.
The next section formalises the set of process that can be compiled in this way.
The recursive high-level strategy is new in FDR3. This compiles to a Super-
Combinator machine and allows some recursive processes (which we formalise
Fig. 3. The algorithm FDR3 uses to decide how to compile syntactic processes
Figure 3 defines a function Strategy(P , r ) that returns the strategy that should
be used to compile the syntactic process P in a context that is discarded by
events of event type r . Informally, given a process P and an event type r this
firstly recursively visits each of its arguments, passing down an appropriate event
restriction (which is computed using discards for on arguments and turnedOnBy
for off arguments). It may also force some arguments to be low-level if the
restriction becomes None. Then, a compilation strategy for P is computed by
considering the preferences of the operator, whether the operator is recursive
and the deduced strategies for the arguments. The overriding observation behind
this choice is that compilation at high is only allowed when the process is non-
recursive, and when there is no surrounding context (i.e. r = Anything).
FDR2 has support for Explicit and Super-Combinator GLTSs, along with a GLTS
definition for each CSP operator (e.g. external choice etc). We believe that the
FDR3 representation is superior, since it requires fewer GLTS types to be main-
tained and because it makes the GLTSs independent of CSP, making other pro-
cess algebras easier to support. As mentioned in Section 5.2, FDR2 did not make
use of the recursive high-level, and was unable to automatically compile processes
such as P = (X ||| Y ) ; P at the high-level. We have found that the recursive
high-level has dramatically decreased compilation time on many examples.
The biggest difference is in the algorithm that each uses to compile syntac-
tic processes. FDR2 essentially used a series of heuristics to accomplish this and
would always start trying to compile the process at its preferred level, backtrack-
ing where necessary. This produced undesirable behaviour on certain processes.
We believe that since the new algorithm is based on the operational semantics of
CSP, it is simpler and can be easily applied to other CSP-like process algebras.
6 Experiments
[Fig. 4. (a), (b): speedup as a function of the number of workers (up to 32) for the bully.7, solitaire.0 and tnonblock.7 checks; (c): state-visiting rate over time.]
stores the difference between keys. The extra memory used for the parallel ver-
sion is for extra buffers and the fact that the B-Tree blocks do not compress as
well.
The speed-ups that Figures 4a and 4b exhibit between 1 and 32 workers vary
according to the problem. solitaire is sped up by a factor of 15 which is almost
optimal given the 16 cores. Conversely, tnonblock.7 is only sped up by a factor
of 9 because it has many small plys, meaning that the time spent waiting for
other workers at the end of a ply is larger.
Figure 4c shows how the speed that FDR3 visits states at changes during
the course of verifying knightex.3.11, which required 300GB of storage (FDR3
used 110GB of memory as a cache and 190GB of on-disk storage). During a
refinement check, the rate at which states are explored will decrease because the
B-Trees increase in size. Observe that there is no change in the decrease of the
state visiting rate after memory is exceeded. This demonstrates that B-Trees are
effectively able to scale to use large amounts of on-disk storage.
Figure 5 compares the performance of FDR3, Spin, DiVinE and LTSmin. For
in-memory checks Spin, DiVinE and LTSmin complete the checks up to three
Fig. 5. A comparison between FDR3, Spin, DiVinE and LTSmin. knightex.3.10 has 2035 × 10^6 states and 6786 × 10^6 transitions.
times faster than FDR3 but use up to four times more memory. We believe that
FDR3 is slower because supercombinators are expensive to execute in comparison
to the LTS representations that other tools use, and because B-Trees are slower
to insert into than hashtables. FDR3 was the only tool that was able to complete
knightex.3.11 which requires use of on-disk storage; Spin, DiVinE and LTSmin
were initially fast, but dramatically slowed once main memory was exhausted.
7 Conclusions
In this paper we have presented FDR3, a new refinement checker for CSP. We
have described the new compiler that is more efficient, more clearly defined
and produces better representations than the FDR2 compiler. Further, we have
detailed the new parallel refinement-checking algorithm that is able to achieve
a near-linear speed-up as the number of cores increases whilst ensuring efficient
memory usage. Further, we have demonstrated that FDR3 is able to scale to
enormous checks that far exceed the bounds of memory, unlike related tools.
This paper concentrates on parallelising refinement checks on shared-memory
systems. It would be interesting to extend this to support clusters instead: this
would allow even larger checks to be run. It would also be useful to consider
how to best parallelise checks in the failures-divergence model. This is a difficult
problem, in general, since this uses a depth-first search to find cycles.
FDR3 is available for 64-bit Linux and Mac OS X from
https://www.cs.ox.ac.uk/projects/fdr/. FDR3 is free for personal use or academic
research, whilst commercial use requires a licence.
Concurrent Depth-First Search Algorithms
Gavin Lowe
1 Introduction
2 Tarjan’s Algorithm
1 var index = 0
2 // Set node's index and lowlink, and add it to the stacks
3 def addNode(node) = {
4   node.index = index; node.lowlink = index; index += 1
5   controlStack.push(node); tarjanStack.push(node)
6 }
7 addNode(startNode)
8 while (controlStack.nonEmpty){
9   val node = controlStack.top
10   if (node has an unexplored edge to child){
11     if (child previously unseen) addNode(child)
12     else if (child is in tarjanStack) node.updateLowlink(child.index)
13     // otherwise, child is complete, nothing to do
14   }
15   else{ // backtrack from node
16     controlStack.pop
17     if (controlStack.nonEmpty) controlStack.top.updateLowlink(node.lowlink)
18     if (node.lowlink == node.index){
19       start new SCC
20       do{
21         w = tarjanStack.pop; add w to SCC; mark w as complete
22       } until (w == node)
23     }
24   }
25 }
Also, each node has a status: either complete (when it has been placed in an
SCC), in-progress (when it has been encountered but not yet been placed in an
SCC), or unseen (when it has not yet been encountered).
Tarjan’s Algorithm is normally described recursively; however, we consider
here an iterative version. We prefer an iterative version for two reasons: (1) as
is well known, iteration is normally more efficient than recursion; (2) when we
move to a concurrent version, we will want to suspend searches; this will be
easier with an iterative version. We use a second stack, denoted controlStack,
that corresponds to the control stack of the recursive version, and keeps track
of the nodes to backtrack to.
We present the sequential Tarjan’s Algorithm for finding SCCs (Problem 1)
in Figure 1. The search starts from the node startNode. When an edge is explored
to a node that is already in the stack, the low-link of the edge’s source is updated
(line 12). Similarly, when the search backtracks, the next node’s low-link is up-
dated (line 17). On backtracking from a node, if its low-link equals its index, all
the nodes above it on the Tarjan stack form an SCC, and so are removed from
that stack and collected (lines 18–23).
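For concreteness, the iterative scheme of Figure 1 can be rendered in C++ roughly as
follows. This is only a sketch, not the paper's implementation: the graph is assumed
to be given as adjacency lists over nodes 0..n−1, and, as in the figure, only the part
of the graph reachable from startNode is processed.

    #include <algorithm>
    #include <vector>

    // Iterative Tarjan's Algorithm in the style of Figure 1: an explicit control
    // stack replaces the recursion, and a second (Tarjan) stack collects the SCCs.
    std::vector<std::vector<int>> tarjanSCCs(const std::vector<std::vector<int>>& graph,
                                             int startNode) {
        const int UNSEEN = -1;
        int n = static_cast<int>(graph.size()), counter = 0;
        std::vector<int> index(n, UNSEEN), lowlink(n, 0), nextEdge(n, 0);
        std::vector<bool> onTarjanStack(n, false), complete(n, false);
        std::vector<int> controlStack, tarjanStack;
        std::vector<std::vector<int>> sccs;

        auto addNode = [&](int v) {                         // lines 3-6 of Figure 1
            index[v] = lowlink[v] = counter++;
            controlStack.push_back(v); tarjanStack.push_back(v);
            onTarjanStack[v] = true;
        };

        addNode(startNode);
        while (!controlStack.empty()) {
            int node = controlStack.back();
            if (nextEdge[node] < static_cast<int>(graph[node].size())) {
                int child = graph[node][nextEdge[node]++];   // an unexplored edge
                if (index[child] == UNSEEN) addNode(child);                  // line 11
                else if (onTarjanStack[child])                               // line 12
                    lowlink[node] = std::min(lowlink[node], index[child]);
                // otherwise child is complete: nothing to do                  (line 13)
            } else {                                         // backtrack from node
                controlStack.pop_back();
                if (!controlStack.empty())                                   // line 17
                    lowlink[controlStack.back()] =
                        std::min(lowlink[controlStack.back()], lowlink[node]);
                if (lowlink[node] == index[node]) {          // lines 18-23: emit an SCC
                    sccs.emplace_back();
                    int w;
                    do {
                        w = tarjanStack.back(); tarjanStack.pop_back();
                        onTarjanStack[w] = false; complete[w] = true;
                        sccs.back().push_back(w);
                    } while (w != node);
                }
            }
        }
        return sccs;
    }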
The following observation will be useful later.
Observation 1. 1. For each node in the tarjanStack, there’s a path in the graph
to each subsequent node in the tarjanStack.
2. For any node n in the tarjanStack, if n is nearer the top of that stack than
controlStack.top, then there is a path from n to controlStack.top (and hence the
two nodes are in the same SCC).
3. If nodes n and l are such that n.lowlink = l.index, then all the nodes between n
and l in the tarjanStack are in the same SCC.
If, instead, we are interested in finding cycles (Problem 2) then: (1) at line 12,
if node == child then we mark the node as in a cycle; and (2) after line 22, if the
SCC has more than one node, we mark all its nodes as in a cycle.
If we are interested in finding lassos (Problem 3) then: (1) at line 12, we
immediately mark node and all the other nodes in the Tarjan stack as being in
a lasso; and (2) if we encounter a complete node (line 13), if it is in a lasso, we
mark all the nodes in the Tarjan stack as being in a lasso.
1 var index = 0
2 // Set node's index, lowlink and search, and add it to the stacks
3 def addNode(node) = {
4   node.index = index; node.lowlink = index; index += 1
5   node.search = thisSearch; controlStack.push(node); tarjanStack.push(node)
6 }
7 addNode(startNode)
8 while (controlStack.nonEmpty){
9   val node = controlStack.top
10   if (node has an unexplored edge to child){
11     if (child previously unseen) addNode(child)
12     else if (child is in tarjanStack) node.updateLowlink(child.index)
13     else if (child is not complete) // child is in-progress in a different search
14       suspend waiting for child to complete
15     // otherwise, child is complete, nothing to do
16   }
17   else{ // backtrack from node
18     controlStack.pop
19     if (controlStack.nonEmpty) controlStack.top.updateLowlink(node.lowlink)
20     if (node.lowlink == node.index){
21       start new SCC
22       do{
23         w = tarjanStack.pop; add w to SCC
24         mark w as complete and unblock any searches suspended on it
25       } until (w == node)
26     }
27   }
28 }
c2 and n2 , and between c3 and n3 are all in the same SCC, by Observation 1(1);
we denote this SCC by “C”.
Let t1 be the top of the Tarjan stack of s1 : t1 might equal n1 ; or s1 might
have backtracked from t1 to n1 . Note that all the nodes between n1 and t1 are
in the same SCC as n1 , by Observation 1(2), and hence in the SCC C. Similarly,
let t2 and t3 be the tops of the other Tarjan stacks; all the nodes between n2
and t2 , and between n3 and t3 are likewise in C.
Let l1 be the earliest node of s1 known (according to the low-links of s1 ) to
be in the same SCC as c1 : l1 is the earliest node reachable by following low-
links from the nodes between c1 and t1 (inclusive), and then (perhaps) following
subsequent low-links; equivalently, l1 is the last node in s1 that is no later than c1
and such that all low-links of nodes between l1 and t1 are at least l1 (a simple
traversal of the Tarjan stack can identify l1 ). Hence all the nodes from l1 to t1
are in the SCC C (by Observation 1(3)). Let l2 and l3 be similar.
Consider the graph G formed by transforming the original graph by adding
edges from n1 to l2, and from n3 to l1, as illustrated in Figure 3 (middle top
and middle bottom).
[Fig. 3: the three searches s1, s2, s3 with nodes li, ci, ni, ti on their Tarjan stacks;
blocking edges, lowest low-links and the added edges (left and middle panels), and the
stacks after the nodes are transferred to s3 (right panel).]
It is clear that the transformed graph has precisely the
same SCCs as the original, since all the nodes below l1 , l2 and l3 in the figure
are in the same SCC C. Consider the following scenario for the transformed
graph: the search s3 explores via nodes l3 , c3 , n3 (backtracking from t3 ), l1 ,
c1 , n1 (backtracking from t1 ), l2 , c2 , n2 (backtracking from t2 ), and then back
to c3 ; meanwhile, the searches s1 and s2 reach l1 and l2 , respectively, and are
suspended.
We transform the stacks to be compatible with this scenario, as illustrated in
Figure 3 (right), by transferring the nodes from l1 to n1 , and from l2 to n2 onto
the stack of search s3 . Note, in particular, that the indexes and lowlinks have to
be updated appropriately (letting δ1 = s3 .index − l1 .index, we add δ1 onto the
index and lowlink of each node transferred from s1 and update s3 .index to be one
larger than the greatest of the new indexes; we then repeat with s2 ).
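A minimal sketch of this re-indexing step, with hypothetical Node and Search records
(not the paper's classes): the segment of the blocked search's stack from li downwards
is handed to the receiving search, and the indexes and lowlinks are shifted so that
they extend the receiving search's numbering.

    #include <algorithm>
    #include <vector>

    struct Node   { int index = 0, lowlink = 0, search = 0; };
    struct Search { int id = 0, index = 0; };   // index = next index to hand out

    // Shift the transferred nodes by delta = dst.index - l.index, as in the text;
    // the caller is also expected to append them to dst's control and Tarjan stacks.
    void transferNodes(std::vector<Node*>& transferred, Search& dst) {
        if (transferred.empty()) return;
        int delta = dst.index - transferred.front()->index;  // front() plays the role of l_i
        int maxIndex = dst.index;
        for (Node* nd : transferred) {
            nd->index   += delta;
            nd->lowlink += delta;
            nd->search   = dst.id;
            maxIndex = std::max(maxIndex, nd->index);
        }
        dst.index = maxIndex + 1;   // one larger than the greatest new index
    }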
We then resume search s3 . We start by considering the edge from n2 to c3 ,
and so update the lowlink of n2 . Searches s1 and s2 remain suspended until l1
and l2 are completed.
We now consider the other two problems. If we are interested in finding cycles
(Problem 2) then we adapt the algorithm as for the sequential algorithm: (1) at
line 12, if node == child then we mark the node as in a cycle; and (2) after line 25,
if the SCC has more than one node, we mark all its nodes as in a cycle.
If we are interested in finding lassos (Problem 3) then we again adapt the
algorithm as for the sequential algorithm: (1) at line 12, we immediately mark
node and all the other nodes in the Tarjan stack as being in a lasso; and (2) if
we encounter a complete node (line 15), if it is in a lasso, we mark all the
nodes in the Tarjan stack as being in a lasso. Further, if a search encounters an
in-progress node (line 13), if that node is in a lasso, then there is no need to
suspend the search: instead all the nodes in the Tarjan stack can also be marked
as in a lasso. Similarly, when a node is marked as being in a lasso, any search
blocked on it can be unblocked; when such a search is unblocked, all the nodes
in its Tarjan stack can also be marked as in a lasso. Finally, the procedure for
reducing blocking cycles can be greatly simplified, using the observation that all
the nodes in the Tarjan stacks are in a lasso: the search that discovered the cycle
(s3 in the example) marks all its nodes as in a lasso, and so unblocks the search
blocked on it (s2 in the example); that search similarly marks its nodes as in a
lasso, and so on.
4 Implementation
¹ Available from http://www.cs.ox.ac.uk/people/gavin.lowe/parallelDFS.html.
Each node n includes a field blocked : List[Search], storing the searches that have
encountered this node and are blocked on it. When the node is completed, those
searches can be resumed (line 24 of Figure 2). Note that testing whether n is
complete (line 13 of Figure 2) and updating blocked has to be done atomically.
In addition, each suspended search has a field waitingFor, storing the node it is
waiting on.
We record which searches are blocked on which others in a map suspended
from Search to Search, encapsulated in a Suspended object. The Suspended object
has an operation suspend(s: Search, n: Node) to record that s is blocked on n.
When s suspends blocked by a node of s′, we detect if this would create a
blocking cycle by transitively following the suspended map to see if it includes a
blocking path from s′ to s. If so, nodes are transferred to s, and s is resumed
as outlined in the previous section. This is delicate. Below, let sb be one of the
searches from which nodes are transferred and that remains blocked.
1. Each node’s search, index and lowlink are updated, as described in the previous
section.
2. Each sb with remaining nodes has its waitingFor field updated to the appro-
priate node of s (the li nodes of Figure 3); and those nodes have their blocked
fields updated.
3. The suspended map is updated: each sb that has had all its nodes transferred
is removed; each other sb is now blocked by s; and any other search that
was blocked on one of the nodes transferred to s is now also blocked on s.
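The cycle check performed by the Suspended object can be sketched as follows
(illustrative names only; the real object also carries out the transfer and must do
all of this atomically with respect to concurrent suspend calls).

    #include <unordered_map>

    using SearchId = int;   // hypothetical identifier for a Search

    // The suspended map sends each blocked search to the search it is blocked on.
    // Before recording that s is to be blocked on (a node of) sPrime, walk the map
    // from sPrime; if the walk reaches s, suspending would close a blocking cycle.
    // The walk terminates because, by invariant, the map is currently cycle-free.
    bool wouldCreateBlockingCycle(const std::unordered_map<SearchId, SearchId>& suspended,
                                  SearchId s, SearchId sPrime) {
        SearchId cur = sPrime;
        while (true) {
            if (cur == s) return true;                 // blocking path from sPrime to s
            auto it = suspended.find(cur);
            if (it == suspended.end()) return false;   // cur is not blocked: no cycle
            cur = it->second;
        }
    }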
and consider the case that n’ is one of the nodes transferred to s. We argue that
the resulting race is benign. The second call will not create a blocking cycle
(since only the sink search of the reverse arborescence, s, can create a block-
ing cycle); this will be correctly detected, even in the half-updated state. Fur-
ther, suspended(s’) gets set correctly: if suspend(s’,n’) sets suspended(s’) to n’.search
before suspend(s,n) updates n’.search, then the latter will subsequently update
suspended(s’) to s (in item 3); if suspend(s,n) sets n’.search to s before suspend(s’,n’)
reads it, then both will set suspended(s’) to s.
4.2 Scheduling
Our implementation uses a number of worker threads (typically one per processor
core), which execute searches. We use a Scheduler object to provide searches for
workers, thereby implementing a form of task-based parallelism.
The Scheduler keeps track of searches that have been unblocked as a result of
the blocking node becoming complete (line 24 of Figure 2). A dormant worker
can resume one of these. (Note that when a search is unblocked, the update
to the Scheduler is done after the updates to the search itself, so that it is not
resumed in an inconsistent state.)
The algorithm can proceed in one of two different modes: rooted, where the
search starts at a particular node, but the state space is not known in advance;
and unrooted, where the state space is known in advance, and new searches can
start at arbitrary nodes. In an unrooted search, the Scheduler keeps track of all
nodes from which no search has been started. A dormant worker can start a new
search at one of these (assuming it has not been reached by another search in
the meantime). Similarly, in a rooted search the Scheduler keeps track of nodes
encountered in the search but not yet expanded: when a search encounters a
new node n, it passes n’s previously unseen successors, except the one it will
consider next, to the Scheduler. Again, a dormant worker can start a new search
from such a node.
4.3 Enhancements
We now describe a few details of our implementation that have an effect upon
efficiency.
We use a map from node identifiers (Ints) to Node objects that store infor-
mation about nodes. We have experimented with many representations of this
map. Our normal implementation is based on the hash table described by Laar-
man et al. in [7]. However, this implementation uses a fixed-size table, rather
than resizing the table, thus going against the design of FDR (we have extended
the hash table to allow resizing, but this makes the implementation somewhat
slower). On some problems (including our experiments on random graphs in the
next section), the implementation works better with a sharded hash table³ with
open addressing. Even with these implementations, the algorithms spend about
40% of their time within this map. (Other implementations are worse; using a
Java ConcurrentHashMap increases the running time by a factor of two!)
³ A sharded hash table can be thought of as a collection of M individual hash tables,
each with its own lock; an entry with hash value h is stored in the table with
index h mod M.
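A minimal sketch of the sharded idea from footnote 3 (this is neither the table of
Laarman et al. nor the prototype's own implementation): M ordinary hash tables,
each protected by its own lock, with the shard chosen by the hash value modulo M.

    #include <array>
    #include <cstddef>
    #include <functional>
    #include <mutex>
    #include <unordered_map>

    template <typename K, typename V, std::size_t M = 64>
    class ShardedMap {
        struct Shard {
            std::mutex lock;
            std::unordered_map<K, V> table;
        };
        std::array<Shard, M> shards;
        std::size_t shardOf(const K& key) const { return std::hash<K>{}(key) % M; }
    public:
        // Returns true if the key was newly inserted, i.e., it had not been seen before.
        bool insert(const K& key, const V& value) {
            Shard& s = shards[shardOf(key)];
            std::lock_guard<std::mutex> guard(s.lock);
            return s.table.emplace(key, value).second;
        }
        bool contains(const K& key) {
            Shard& s = shards[shardOf(key)];
            std::lock_guard<std::mutex> guard(s.lock);
            return s.table.count(key) != 0;
        }
    };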
It is clearly advantageous to avoid suspending searches, if possible. Therefore,
the implementation tries to choose (at line 10 of Figure 2) a child node that is
not in-progress in a different search, if one exists.
Some nodes have no successors. It is advantageous, when starting a search
from such a node, to avoid creating a Search object with its associated stacks,
but instead to just mark the node as complete and to create a singleton SCC
containing it.
5 Experiments
In this section we report the results of timing experiments. The experiments were
carried out on an eight-core machine (an Intel Xeon E5620) with 12GB of
RAM. Each of the results is averaged over ten runs, after a warm-up round.
We have performed timing experiments on a suite of CSP files. We have
extracted the graphs of τ -transitions for all implementation processes in the
FDR3 test suite (including most of the CSP models from [15,16,1]) and the
CSP models from [10]. The top of Figure 4 gives statistics about a selection
of the graphs with between 200,000 and 5,000,000 states (we omit eleven such,
in the interests of space), plus a slightly smaller file tring2.1 which we discuss
below.⁴ For each graph we give the number of states (i.e. nodes), the number
of transitions (i.e. edges), the number of SCCs, the size of the largest SCC, the
number of trivial SCCs (with a single state), the number of states on a loop,
and the number of states on a lasso.
The bottom of Figure 4 gives corresponding timing results. For each of the
three problems, we give times (in ms) for each of the concurrent and sequential
algorithms, and the ratio between them (which represents the speed-up factor).
The penultimate row gives totals for these running times, and their ratios. The
final row gives data for tring2.1. Even on a single-threaded program, the JVM
uses a fair amount of concurrency. The sequential algorithm typically uses about
160% of a single core (as measured by top). Hence the maximum speed-up one
should expect is a factor of about five.
We have performed these experiments in unrooted mode, because it more
closely simulates our main intended use within FDR, namely for detecting di-
vergences (i.e. τ -lassos) during failures-divergences checks. Such a check performs
a breadth-first search of the product of the system and specification processes;
for each pair of states encountered, if the specification state does not allow a
divergence, then FDR checks that the system state does not have a divergence.
The overall effect is normally that a lasso search is started at every reachable
system state.
The concurrent algorithms normally give significant speed-ups. Further, the
speed-up tends to be larger for larger graphs, particularly for graphs with more
⁴ The file matmult.6 contains no τ-transitions, only visible transitions.
Fig. 4. Results for tests on CSP files: statistics about the graphs, and timing results
transitions. However, beyond a few million states, the speed-ups drop off again,
I believe because of issues of memory contention.
The results for tring2.1 deserve comment. This graph has a large SCC, ac-
counting for over 70% of the states. The first two concurrent algorithms consider
the nodes of this SCC sequentially and so (because the concurrent algorithms
are inevitably more complex) are slightly slower than the sequential algorithms.
However, the algorithm for lassos gives more scope for considering the nodes of
this SCC concurrently, and therefore gives a speed-up.
The above point is also illustrated in Figure 5. This figure considers a number
of random graphs, each with N = 200,000 states. For each pair of nodes n and n′,
an edge is included from n to n′ with probability p; this gives an expected number
of edges equal to N²p. (Note that such graphs do not share many characteristics
with the graphs one typically model checks!)
Fig. 6. Speed-ups on CSP files as a function of the number of worker threads
the three algorithms for various values of p; the tables give statistical information
about the graphs considered (giving averages, rounded to the nearest integer in
each case). For p greater than about 0.000005, the graph has a large SCC, and
the algorithms for SCCs and loops become less efficient. However, the algorithm
for finding lassos becomes progressively comparatively more efficient as p, and
hence the number of edges, increases; indeed, for higher values of p, the speed-up
plateaus at about 5.
It is worth noting that graphs corresponding to the τ -transitions of CSP
processes rarely have very large SCCs. The graph tring2.1 corresponds to a CSP
process designed for checking in the traces model, as opposed to the failures-
divergences model, so the problems considered in this paper are not directly
relevant to it.
Figure 6 considers how the speed up varies as a function of the number of
worker threads. It suggests that the algorithm scales well.
6 Conclusions
In this paper we have presented three concurrent algorithms for related problems:
finding SCCs, loops and lassos in a graph. The algorithms give appreciable speed-
ups, typically by a factor of about four on an eight-core machine.
It is not surprising that we fall short of a speed-up equal to the number of
cores. As noted above, the JVM uses a fair amount of concurrency even on
single-threaded programs. Also, the concurrent algorithms are inevitably more
complex than the sequential ones. Further, I believe that they are slowed down
by contention for the memory bus, because the algorithms frequently need to
read data from RAM.
I believe there is some scope for reducing the memory contention, in particular
by reducing the size of Node objects: many of the attributes of Nodes are neces-
sary only for in-progress nodes, so could be stored in the relevant Search object.
Further, I intend to investigate whether it’s possible to reduce the amount of
locking of objects done by the prototype implementation.
We intend to incorporate the lasso and SCC algorithms into the FDR3 model
checker. In particular, it will be interesting to see whether the low-level nature
of C++ (in which FDR3 is implemented) permits optimisations that give better
memory behaviour.
As noted earlier, a large proportion of the algorithms’ time is spent within the
map storing information about nodes. I would like to experiment with different
implementations.
Related Work. We briefly discuss here some other concurrent algorithms address-
ing one or more of our three problems. We leave an experimental comparison
with these algorithms for future work.
Gazit and Miller [5] describe an algorithm based upon the following idea.
The basic step is to choose an arbitrary pivot node, and calculate its SCC as the
intersection of its descendants and ancestors; these descendants and ancestors can
be calculated using standard concurrent algorithms. This basic step is repeated
with a new pivot whose SCC has not been identified, until all SCCs are identified.
A number of improvements to this algorithm have been proposed [13,11,2].
Several papers have proposed algorithms for finding loops, in the particular
context of LTL model checking [8,4,9,3]. These algorithms are based on the
SWARM technique: multiple worker threads perform semi-independent searches
of the graph, performing a nested depth-first search to detect a loop containing
an accepting state; the workers share only information on whether a node has
been fully explored, and whether it has been considered within an inner depth-
first search.
References
1. Armstrong, P., Lowe, G., Ouaknine, J., Roscoe, A.W.: Model checking timed CSP.
In: Proceedings of HOWARD-60 (2012)
2. Barnat, J., Chaloupka, J., van de Pol, J.: Distributed algorithms for SCC decom-
position. Journal of Logic and Computation 21(1), 23–44 (2011)
3. Evangelista, S., Laarman, A., Petrucci, L., van de Pol, J.: Improved multi-core
nested depth-first search. In: Chakraborty, S., Mukund, M. (eds.) ATVA 2012.
LNCS, vol. 7561, pp. 269–283. Springer, Heidelberg (2012)
4. Evangelista, S., Petrucci, L., Youcef, S.: Parallel nested depth-first searches for LTL
model checking. In: Bultan, T., Hsiung, P.-A. (eds.) ATVA 2011. LNCS, vol. 6996,
pp. 381–396. Springer, Heidelberg (2011)
Basic Problems in Multi-View Modeling
Jan Reineke and Stavros Tripakis
1 Introduction
Real systems are usually complex objects, and grasping all the details of a system at the
same time is often difficult. In addition, each of the various stakeholders in the system
are concerned with different system aspects. For these reasons, modeling and design
teams usually deal only with partial and incomplete views of a system, which are easier
to manage separately. For example, when designing a digital circuit, architects may
be concerned with general (boolean) functionality issues, while ignoring performance.
Other stakeholders, however, may be concerned about timing aspects such as the delay
of the critical path, which ultimately affects the clock rate at which the circuit can
be run. Yet other stakeholders may be interested in a different aspect, namely, energy
consumption of the circuit which affects battery life.
Modeling and simulation are often used to support system design. In this paper,
when we talk about views, we refer concretely to the different models of a system
that designers build. Such models may be useful as models of an existing system: the
system exists, and a model is built in order to study the system. Then, the model is only
a partial or incomplete view of the system, since it focuses on certain aspects and omits
others. For example, an energy consumption model for an airplane ignores control,
air dynamics, and other aspects. Models may also be used for a system-to-be-built: an
energy consumption model as in the example above could be developed as part of the
design process, even before the airplane is built.
This research is partially supported by the National Science Foundation and the Academy of
Finland, via projects ExCAPE: Expeditions in Computer Augmented Program Engineering
and COSMOI: Compositional System Modeling with Interfaces, by the Deutsche Forschungs-
gemeinschaft as part of the Transregional Collaborative Research Centre SFB/TR 14 AVACS,
and by the centers TerraSwarm and iCyPhy (Industrial Cyber-Physical Systems) at UC
Berkeley.
For large systems, each aspect of the system is typically designed by a dedicated
design team. These teams often use different modeling languages and tools to capture
different views, which is generally referred to as multi-view modeling (MVM). MVM
presents a number of challenges, such as the crucial issue of consistency: if different
views of the system are captured by different models, and these models have some
degree of overlap, how can we guarantee that the models are consistent, i.e., that they
do not contradict each other? Understanding the precise meaning of such questions, and
developing techniques to answer them, ideally fully automatically, is the main goal of
this paper.
Toward this goal, we begin in Section 2 by introducing an example of simple 3-
dimensional structure modeling. Even though our focus is on dynamic behaviors, we
will use this static system as an illustrative running example to demonstrate the salient
concepts of our formal MVM framework. The latter is itself presented in Section 3. The
main concepts are as follows: (1) views can be derived from systems using abstraction
functions, which map system behaviors to view behaviors; (2) conformance formalizes
how “faithful” a view is to a system; (3) consistency of a set of views is defined as
existence of a witness system to which all views conform; (4) view reduction allows one to
“optimize” views by using the information contained in other views; (5) orthogonality
captures independence between views.
The framework proposed in Section 3 is abstract, in the sense that it does not refer to
specific notions of behaviors, nor to concrete representations of systems and views.
In the rest of the paper we instantiate this abstract framework for the case of discrete
systems. The latter, defined in Section 4, are finite-state symbolic transition systems
consisting of a set of state variables, a predicate over the state variables characterizing
the set of initial states, and a predicate characterizing the transition relation.
In Section 5 we study projections as abstraction functions for discrete systems. Fully-
observable systems, where all variables are observable, are not closed under projection,
therefore we also consider systems with internal (unobservable) variables. We show
how to effectively solve a number of verification and synthesis problems on discrete
systems and views, including view conformance and consistency checking.
Fig. 1. A 3D structure (left) and 3 views of it (right) – image produced using this tool:
http://www.fi.uu.nl/toepassingen/02015/toepassing_wisweb.en.html
to the object the missing boxes so that no box under the “staircase structure” hangs in
the air.
3 Views: A Formalization
Systems: We define a system semantically, as a set of behaviors. As in [15], there is
no restriction on the type of behaviors: they could be discrete traces, continuous tra-
jectories, hybrid traces, or something else. We only assume given a domain of possible
behaviors, U. Then, a system S over domain of behaviors U is a subset of U: S ⊆ U.
View Domains: A view is intuitively an “incomplete picture” of a system. It can be
incomplete in different ways:
– Some behaviors may be missing from the view, i.e., the view may contain only a
subset of system behaviors. (As we shall see when we discuss conformance, the
view may also be a superset.)
– Some parts of a behavior itself may be missing in the view. E.g., if the behavior
refers to a state vector with, say, 10 state variables, the view could refer only to 2
state variables. In this case the view can be seen as a projection.
– More generally, the view may be obtained by some other kind of transformation
(not necessarily a projection) to behaviors. E.g., the original system behaviors may
contain temperature as a state variable, but the view only contains temperature av-
erages over some period of time.
From the above discussion, it appears that: semantically, views can be formalized as sets
of behaviors, just like systems are. However, because of projections or other transfor-
mations, the domain of behaviors of a view is not necessarily the same as the domain of
system behaviors, U. Therefore, we let Di be the domain of behaviors of view i (there
can be more than one view, hence the subscript i). When we refer to a general view
domain, we drop the subscript and simply write D.
In the case of our running example, U = {1, 2, 3, 4}³, and Dtop = Dfront =
Dside = {1, 2, 3, 4}².
Views: A view is a set of behaviors over a given view domain. That is, a view V over
view domain D is defined to be a subset of D: V ⊆ D.
For this to work, however, we need |= to have the two following properties:
1. (monotonicity) V1 |= S ∧ V2 ⊒ V1 ⇒ V2 |= S.
2. (conformance preserved by ⊓) ∀W ⊆ 2^D : (∀V ∈ W : V |= S) ⇒ (⊓W) |= S.
Condition 1 says that if V1 conforms to S then any view greater than V1 also conforms
to S. Condition 2 says that if a set of views all conform to a system S, then their greatest
lower bound also conforms to S. Any relation ⊒a defined by an abstraction function a
and an order ⊒ forming a complete lattice has these two properties by construction.
Consistency: Consider a set of views, V1 , V2 , ..., Vn , over view domains D1 , D2 , ..., Dn .
For each view domain Di , consider given a conformance relation |=i (which could
be derived from a given abstraction function ai and partial order ⊒i, or defined as a
primitive notion as explained above). We say that V1, V2, ..., Vn are consistent w.r.t.
|=1, |=2, ..., |=n if there exists a system S over U such that ∀i = 1, ..., n : Vi |=i S. We
call such a system S a witness to the consistency of V1, V2, ..., Vn. Clearly, if no such
S exists, then one must conclude that the views are inconsistent, as there is no system
from which these views could be derived. When ⊒i is = for all i, i.e., when Vi = ai(S)
for all i, we say that V1, ..., Vn are strictly consistent. Note that if ⊒i is ⊇ for all i,
then consistency trivially holds as the empty system is a witness, since Vi ⊇ ∅ = ai(∅)
for all i. Also, if ⊒i is ⊆ for all i and every ai satisfies ai(U) = Di, then consistency
trivially holds as the system U is a witness, since Vi ⊆ Di = ai(U) for all i.
In our 3D objects example, if Vtop is non-empty but Vside is empty, then the two
views are inconsistent w.r.t. strict conformance V = a(S). A less trivial case is when
Vtop = {(1, 1)} and Vside = {(2, 2)}. Again the two views are inconsistent (w.r.t. =):
Vtop asserts that some box must be in the column with (x, y) coordinates (1, 1), but
Vside implies that there is no box whose y coordinate is 1.
The last example may mislead one to believe that consistency (w.r.t. =) is equivalent to
“intersection of inverse projection of views being non-empty.” This is not true. Even in
the case where abstraction functions are projections, non-empty intersection of inverse
projections is a necessary, but not a sufficient condition for consistency. To see this,
consider views Vtop = {(1, 1), (3, 3)} and Vside = {(2, 2), (1, 2)} in the context of our
running example. These two views are inconsistent w.r.t. =. Yet the intersection of their
inverse projections is non-empty, and equal to {(1, 1, 2)}.
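This counterexample can be checked mechanically. The sketch below assumes, consistently
with the examples above, that the top view records the (x, y) coordinates of a box and
the side view the (y, z) coordinates; it enumerates the intersection of the inverse
projections and prints only (1, 1, 2).

    #include <array>
    #include <iostream>
    #include <set>

    using Box    = std::array<int, 3>;  // (x, y, z), coordinates in 1..4
    using Square = std::array<int, 2>;

    int main() {
        std::set<Square> vTop  = {{1, 1}, {3, 3}};
        std::set<Square> vSide = {{2, 2}, {1, 2}};

        // Intersection of the inverse projections: all boxes whose top projection is
        // allowed by vTop and whose side projection is allowed by vSide.
        std::set<Box> candidates;
        for (int x = 1; x <= 4; ++x)
            for (int y = 1; y <= 4; ++y)
                for (int z = 1; z <= 4; ++z)
                    if (vTop.count({x, y}) && vSide.count({y, z}))
                        candidates.insert({x, y, z});

        for (const Box& b : candidates)   // prints only (1, 1, 2), as in the text
            std::cout << "(" << b[0] << ", " << b[1] << ", " << b[2] << ")\n";
    }

The views are nevertheless inconsistent w.r.t. =: the only candidate object, {(1, 1, 2)},
has top projection {(1, 1)}, which differs from Vtop.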
View Reduction: Given a set of views V1, ..., Vn of a system S, it may be possible to
"reduce" each view Vi based on the information contained in the other views, and as
a result obtain views V1′, ..., Vn′ that are "more accurate" views of S. We use the term
reduction inspired from similar work in abstract interpretation [5,10].
For example, if we assume that conformance is defined as V ⊇ a(S), then the views
Vtop = {(1, 1), (3, 3)} and Vside = {(2, 2), (1, 2)} can be reduced to V′top = {(1, 1)}
and V′side = {(1, 2)}. V′top is still a valid top view, in the sense that for every system S,
if both Vtop ⊇ atop(S) and Vside ⊇ aside(S), then V′top ⊇ atop(S). In addition, V′top is
more accurate than Vtop in the sense that V′top is a strict subset of Vtop. Indeed, V′top does
not contain the "bogus" square (3, 3) which cannot occur in S, as we learn from Vside.
Let us now define the notion of view reduction formally. First, given a conformance
relation between views and systems, |= ⊆ 2^D × 2^U, we define the concretization func-
tion c|= which, given a view V, returns the set of all systems which V conforms to:
c|=(V) := {S ⊆ U | V |= S}.
Lemma 1. The most accurate view that conforms to a set of systems 𝒮 can also be
determined from the individual systems' abstractions:
a|=(𝒮) = ⊔ {a|=(S) | S ∈ 𝒮}.
Missing proofs to lemmas and theorems can be found in the technical report [17].
Given the above, and assuming n view domains with corresponding conformance
relations, (D1 , |=1 ), ..., (Dn , |=n ), view reduction can be defined as follows:
reducei(V1, V2, ..., Vn) := a|=i(c|=1(V1) ∩ · · · ∩ c|=n(Vn)).
Lemma 2. Reduction is a reductive operation, i.e., Vi ⊒ reducei(V1, V2, ..., Vn) for
all i. The set of witnesses to the consistency of views V1, ..., Vn is invariant under re-
duction, i.e., c|=1(reduce1(V1, ..., Vn)) ∩ · · · ∩ c|=n(reducen(V1, ..., Vn)) = c|=1(V1) ∩ · · · ∩ c|=n(Vn).
The second part of the lemma implies that reduction is idempotent, i.e., for all i:
reducei(V1′, ..., Vn′) = reducei(V1, ..., Vn), where Vi′ = reducei(V1, V2, ..., Vn).
Orthogonality: In some fortunate cases different aspects of a system are indepen-
dent of each other. Intuitively, what this means is that each aspect can be defined
separately without the need for communication between development teams to avoid
inconsistencies.
Formally, we say that view domains D1 , ..., Dn are orthogonal if all sets of non-
empty views V1 , ..., Vn from these view domains are mutually irreducible, i.e., if
reducei (V1 , ..., Vn ) = Vi for all i = 1, ..., n. The view domains from our example
of 3D objects, capturing projections onto two dimensions, are not orthogonal, as the
reduction example involving the domains shows. On the other hand, view domains cor-
responding to the projection onto individual dimensions would indeed be orthogonal to
each other.
Alternatively, orthogonal view domains can be defined by requiring that all sets of
non-empty views V1 , ..., Vn from these domains are consistent w.r.t. =.
The following lemma shows that the two definitions of orthogonal domains are
equivalent, if we assume that conformance is defined based on abstraction functions
and the superset and equality relations as the partial orders on views.
¹ Note that when ⊒ is a set-theoretic relation such as ⊆ or ⊇, this obviously holds and ⊓ is ∩
or ∪. When ⊒ is = then (2^D, =) is not a lattice, and the definition of view reduction given
below does not apply. This is not a problem, as in that case we require views to be complete.
Lemma 3. Given non-empty views V1, ..., Vn, the following statements are equivalent:
1. V1, ..., Vn are mutually irreducible, i.e., reducei(V1, ..., Vn) = Vi for all i = 1, ..., n;
2. V1, ..., Vn are consistent w.r.t. =.
A system S ⊆ U is called view definable w.r.t. |=1 , ..., |=n if there exist views V1 ⊆
D1 , ..., Vn ⊆ Dn , such that c|=1 (V1 ) ∩ · · · ∩ c|=n (Vn ) = {S}. In the example of 3D
objects, with 2D projections, the empty object S = {} is view definable, as it is defined
by the empty views. Similarly, all objects Si,j,k = {(i, j, k)} are view definable. Note
that a general cube is not view definable, as there are other objects (e.g., a hollow cube)
which have the same 2D projections.
4 Discrete Systems
Our goal in the rest of this paper is to instantiate the view framework developed in
Section 3. We instantiate it for a class of discrete systems, and we also provide answers
to some of the corresponding algorithmic problems.
We will consider finite-state discrete systems. The state space of such a system can be
represented by a set of boolean variables, X, resulting in 2ⁿ potential states, where n =
|X| is the size of X. A state s over X is a valuation over X, i.e., a function s : X → B,
where B := {0, 1} is the set of booleans. For convenience, we sometimes consider other
finite domains with the understanding that they can be encoded as booleans. A behavior
over X is a finite or infinite sequence of states over X, σ = s0 s1 s2 · · · . U(X) denotes
the set of all possible behaviors over X.
Semantically, a discrete system S over X is a set of behaviors over X, i.e., S ⊆
U(X). For computation, we need a concrete representation of discrete systems. We
will start with a simple representation where all system variables are observable. We
will then discuss limitations of this representation and consider an extension where the
system can also have internal (unobservable) variables in addition to the observable
ones.
Non-closure Properties
Non-closure Under Projection: The projection hY (S) is defined semantically, as a
set of behaviors. It is natural to ask whether the syntactic representation of discrete
systems is closed under projection. That is, is it true that for any S = (X, θ, φ), and
Y ⊆ X, there exists S′ = (Y, θ′, φ′), such that S′ = hY (S)? This is not generally
true:
Lemma 4. There exists a FOS S = (X, θ, φ), and Y ⊆ X, such that there is no FOS
S′ = (Y, θ′, φ′), such that S′ = hY (S).
As it turns out, we can check whether closure under projection holds for a given
system: see Theorem 2 in Section 5.
Discrete Systems with Internal Variables: The above non-closure properties motivate
us to study, in addition to fully-observable discrete systems, a generalization which ex-
tends them with a set of internal, unobservable state variables. Most practical modeling
languages also allow the construction of models with both internal and observable state
variables.
Accordingly, we extend the definition of a discrete system to be in general a tuple
(X, Z, θ, φ), where X, Z are disjoint (finite) sets of variables. X models the observable
and Z the internal variables. θ is a boolean expression over X ∪ Z and φ is a boolean
expression over X ∪ Z ∪ X′ ∪ Z′. In such a system, we need to distinguish between
behaviors, and observable behaviors. A behavior of a system S = (X, Z, θ, φ) is a
finite or infinite sequence σ over X ∪ Z, defined as above. The observable behavior
corresponding to σ is hX (σ), which is a behavior over X. From now on, ⟦S⟧ denotes
the set of all behaviors (over X ∪ Z) of S, and ⟦S⟧o denotes the set of observable
behaviors (over X) of S.
Note that we allow Z to be empty. In that case, the system has no internal variables,
i.e., it is a FOS. We will continue to represent a FOS by a triple S = (X, θ, φ). A FOS S
satisfies ⟦S⟧ = ⟦S⟧o.
Closure Properties: We have already shown (Lemma 5) that FOS are not closed under
union. They are however closed under intersection:
Lemma 6. Given two FOS S1 = (X, θ1 , φ1 ) and S2 = (X, θ2 , φ2 ), a FOS S such that
S = S1 ∩ S2 is S1 ∧ S2 = (X, θ1 ∧ θ2 , φ1 ∧ φ2 ).
General discrete systems (with internal variables) are closed under intersection,
union, as well as projection. Given S1 = (X, Z1, θ1, φ1), S2 = (X, Z2, θ2, φ2) and
Y ⊆ X, define (with z a fresh variable):
S∩ = (X, Z1 ∪ Z2, θ1 ∧ θ2, φ1 ∧ φ2),
S∪ = (X, Z1 ∪ Z2 ∪ {z}, (θ1 ∧ z) ∨ (θ2 ∧ ¬z), (z → φ1 ∧ z′) ∧ (¬z → φ2 ∧ ¬z′)),
Sh = (Y, Z1 ∪ (X \ Y), θ1, φ1).
Then, ⟦S∩⟧o = ⟦S1⟧o ∩ ⟦S2⟧o, ⟦S∪⟧o = ⟦S1⟧o ∪ ⟦S2⟧o, and ⟦Sh⟧o = hY(⟦S1⟧o).
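To illustrate why the fresh selector variable z is needed, the union construction can be
sketched with predicates represented as functions (hypothetical types, not the paper's
machinery): a run of S∪ follows S1 when z is true and S2 when z is false, and z keeps
its value along the whole run.

    #include <functional>
    #include <map>
    #include <string>

    using Valuation = std::map<std::string, bool>;
    using Pred  = std::function<bool(const Valuation&)>;                    // theta
    using Trans = std::function<bool(const Valuation&, const Valuation&)>;  // phi(current, next)

    struct DiscreteSystem { Pred init; Trans step; };

    // Build S_union; the variable "z" is assumed to be fresh (unused by s1 and s2).
    DiscreteSystem unionSystem(const DiscreteSystem& s1, const DiscreteSystem& s2) {
        DiscreteSystem u;
        u.init = [=](const Valuation& v) {
            return (s1.init(v) && v.at("z")) || (s2.init(v) && !v.at("z"));
        };
        u.step = [=](const Valuation& v, const Valuation& next) {
            bool ifZ    = !v.at("z") || (s1.step(v, next) && next.at("z"));   // z -> phi1 and z'
            bool ifNotZ =  v.at("z") || (s2.step(v, next) && !next.at("z"));  // not z -> phi2 and not z'
            return ifZ && ifNotZ;
        };
        return u;
    }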
Discrete Views, View Domains, and Abstraction Functions: Discrete views are
finite-state discrete systems. They are represented in general by tuples of the form
(X, Z, θ, φ), and when Z = ∅, by triples of the form (X, θ, φ).
In this paper, we will study projection as the abstraction function for the discrete
view framework. That is, a system will be a discrete system S over a set of observable
variables X, and therefore the domain of system behaviors will be U = U(X). A view
will be a discrete system V over a subset of observable variables Y ⊆ X. Therefore,
the view domain of V is D = U(Y ). Note that both S and V may have (each their own)
internal variables.
Let S = (X, Z, θ, φ) be a discrete system, V = (Y, W, θ′, φ′) be a discrete view,
with Y ⊆ X, and ⊒ be one of the orders =, ⊆, or ⊇. To make notation lighter, we will
write V ⊒ hY (S) instead of ⟦V⟧o ⊒ hY (⟦S⟧). Note that hY (⟦S⟧) = hY (⟦S⟧o). More
generally, when comparing systems or views, we compare them w.r.t. their observable
behaviors. For instance, when writing V1 ⊒ V2, we mean ⟦V1⟧o ⊒ ⟦V2⟧o.
Least and Greatest Fully-Observable Views: Let S be a discrete system over set
of observable variables X. Given a set Y ⊆ X, one might ask whether there is a
“canonical” view V of S w.r.t. Y . Clearly, if we allow V to have internal variables, the
answer is yes: it suffices to turn all variables in X \ Y into internal variables in V . Then,
by Lemma 7, V represents precisely the projection of S to Y , i.e., it is a complete view,
it satisfies V = hY (S), and therefore trivially also V ⊇ hY (S) and V ⊆ hY (S). Note
that this is true independently of whether S has internal variables or not.
In this section we study the question for the case where we forbid V from hav-
ing internal variables, i.e., we restrict views to be fully-observable. As FOS are not
closed under projection, there are systems that have no complete fully-observable view.
On the other hand, there can be multiple views V over Y such that V ⊇ hY (S) or
V ⊆ hY (S). In particular, (Y, true, true) ⊇ hY (S) and (Y, false, false) ⊆ hY (S),
for any S and Y . Thus, the question arises, whether there is a least fully-observable
view lv(S, Y ) of S with lv(S, Y ) ⊇ hY (S), such that for any fully-observable view V
with V ⊇ hY (S), we have V ⊇ lv(S, Y ). Similarly, one may ask whether there is a
greatest fully-observable view gv(S, Y ) w.r.t. ⊆hY . These questions are closely related
to whether views are closed under intersection and union. In particular, we can use clo-
sure under intersection to show that a least view always exists. A greatest view, on the
other hand, does not necessarily exist.
is the unique fully-observable least view lv(S, Y ), that is, lv(S, Y ) ⊇ hY (S), and for
any fully-observable view V over Y with V ⊇ hY (S), we have V ⊇ lv(S, Y ).
Theorem 5. Problem 1 is in P for partial order ⊒ = ⊇ if the discrete view V is a FOS.
Proof. First, notice that if Y ⊆ X, then V = (Y, θV, φV) is a view of S = (X, Z, θ, φ)
if and only if it is a view of the fully-observable system S′ = (X ∪ Z, θ, φ). This is
because hY (S) = hY (S′). Thus, in the following, we will assume S to be a FOS with
S = (X, θ, φ).
Let ψS denote the reachable states of S. ψS can, e.g., be computed incrementally
using BDDs. Let Z := X \ Y and Z′ := X′ \ Y′. Then, V ⊇hY S, if and only if the
following two conditions hold, which can be effectively checked:
FOS, then Problem 3 is trivially decidable as there are only finitely many systems with
X = Y1 ∪ · · · ∪ Yn. Clearly, this is not very efficient. Theorems 7–9 (which also apply to
general discrete systems, not necessarily FOS) provide a non-brute-force method.
Theorem 7. For a set of views V1, . . . , Vn, with Vi = (Yi, Wi, θi, φi) for all i, there al-
ways exists a computable unique greatest witness system gw(V1, . . . , Vn) = (X, Z, θ, φ),
with X = Y1 ∪ · · · ∪ Yn, w.r.t. partial order ⊇.
Proof. First, observe that Si = (X, Wi , θi , φi ) is the unique greatest witness system for
Vi for systems with the set of variables X, i.e., Vi ⊇hYi Si and for all S = (X, W, θ, φ)
such that Vi ⊇hYi S, we have Si ⊇ S. In fact, Vi =hYi Si . Given two views
Vi , Vj , the unique greatest witness system for both views is Si,j = (X, Wi ∪ Wj , θi ∧
θj , φi ∧ φj ), whose behaviors are exactly the intersection of the behaviors of Si and
Sj (see Lemma 7). Adding any behavior to Si,j would violate either Vi ⊇hYi Si,j or
Vj ⊇hYj Si,j. Generalizing the above, S∧ = (X∧ = Y1 ∪ · · · ∪ Yn, Z∧ = W1 ∪ · · · ∪ Wn,
θ∧ = θ1 ∧ · · · ∧ θn, φ∧ = φ1 ∧ · · · ∧ φn) is the unique greatest witness system for the
set of views V1, . . . , Vn.
Theorem 8. Consistency with respect to = holds if and only if the greatest witness
system gw(V1 , . . . , Vn ) derived in Theorem 7 is a witness with respect to =.
Theorem 9. Problem 3 is PSPACE-complete for partial order =.
Theorem 10. There are discrete views V1, . . . , Vn, with Vi = (Yi, Wi, θi, φi) for all i,
for which there is no unique least witness system lw(V1, . . . , Vn) = (X, Z, θ, φ), with
X = Y1 ∪ · · · ∪ Yn, w.r.t. partial order ⊆.
Proof. Consider the following two views Vx = ({x}, θx = x, φx = true) and Vy =
({y}, θy = y, φy = true). We provide two witness systems S1 , S2 , both consistent
with Vx , Vy , such that their intersection is not consistent with Vx and Vy , which proves
that there is no unique least witness system for Vx , Vy w.r.t. ⊆:
In every behavior of S1 , x and y take the same value, whereas in S2 , x and y are never
both false. In their intersection S∩ = ({x, y}, θ1 ∧ θ2 , φ1 ∧ φ2 ), neither x nor y can
thus ever be false. So S∩ is neither consistent with Vx nor with Vy .
For partial order ⊆, Problem 4 is often trivial. Specifically, if the sets of observable
variables of all views are incomparable, then no information can be transferred from
one view to another:
Theorem 12. Let V1, ..., Vn be discrete views with Vi = (Yi, Wi, θi, φi). Assume Yi \
Yj ≠ ∅ for all i ≠ j. Then, assuming ⊒ is ⊆, the following holds for all i:
reducei(V1, . . . , Vn) = Vi.
6 Discussion
MVM is not a new topic, and terms such as “view” and “viewpoint” often appear in
system engineering literature, including standards such as ISO 42010 [12]. Despite this
fact, and the fact that MVM is a crucial concern in system design, an accepted mathe-
matical framework for reasoning about views has so far been lacking. This is especially
true for behavioral views, that is, views describing the dynamic behavior of the system,
as opposed to its static structure. Behavioral views are the main focus of our work.
Discrete behavioral views could also be captured in a temporal logic formalism such
as LTL. View consistency could then be defined as satisfiability of the conjunction
φ1 ∧ · · · ∧ φn , where each φi is a view (possibly over a different set of variables).
This definition is however weaker than our definition of strict consistency (w.r.t. =).
Satisfiability of φ1 ∧ · · · ∧ φn is equivalent to checking that the intersection of the in-
verse projections of views is non-empty, which, as we explained earlier, is a necessary
but not sufficient condition for strict consistency.
The same fundamental difference exists between our framework and view consis-
tency as formulated in the context of interface theories, where a special type of interface
conjunction is used [11] (called “fusion” in [2] and “shared refinement” in [7,18]).
Behavioral abstractions/views are also the topic of [15,16]. Their framework is close
to ours, in the sense that it also uses abstraction functions to map behaviors between dif-
ferent levels of abstraction (or between systems and views). The focus of both [15,16]
is to ease the verification task in a heterogeneous (e.g., both discrete and continuous)
setting. Our main focus is checking view consistency. The notion of “heterogeneous
consistency” [15] is different from our notion of view consistency. The notion of “con-
junctive implication” [15] is also different, as views which have an empty intersection
of their inverse projections trivially satisfy conjunctive implication, yet these views can
be inconsistent in our framework. Problems such as view consistency checking are not
considered in [15,16].
Consistency between architectural views, which capture structural but not behavioral
aspects of a system, is studied in [3]. Consistency problems are also studied in [8] using
a static, logic-based framework. Procedures such as join and normalization in relational
databases also relate to notions of static consistency.
An extensive survey of different approaches for multi-view modeling can be found
in [14]. [14] also gives a partial and preliminary formalization, but does not discuss
algorithmic problems. [4] discusses an informal methodology for selecting formalisms,
languages, and tools based on viewpoint considerations. A survey of trends in multi-
paradigm modeling can be found in [1]. Trends and visions in multi-view modeling are
also the topic of [19]. The latter paper also discusses pragmatics of MVM in the context
of the Ptolemy tool. However formal aspects of MVM and algorithmic problems such
as checking consistency are not discussed.
Implicitly, MVM is supported by multi-modeling languages such as UML, SysML,
and AADL. For instance, AADL defines separate “behavior and error annexes” and
having separate models in these annexes can result in inconsistencies. But capabilities
such as conformance or consistency checking are typically not provided by the tools
implementing these standards. Architectural consistency notions in a UML-like frame-
work are studied in [6].
This work is a first step toward a formal and algorithm-supported framework for
multi-view modeling. A natural direction for future work is to study algorithmic prob-
lems such as consistency checking in a heterogeneous setting. Although the framework
of Section 3 is general enough to capture heterogeneity, in this paper we restricted our
attention to algorithmic MVM problems for discrete systems, as we feel that we first
need a solid understanding of MVM in this simpler case.
Other directions for future work include investigating other types of abstraction
functions, generalizing the methods developed in Section 5, e.g., so that ⊆, =, ⊇ can be
arbitrarily combined, and studying algorithmic problems related to orthogonality.
References
1. Amaral, V., Hardebolle, C., Karsai, G., Lengyel, L., Levendovszky, T.: Recent advances in
multi-paradigm modeling. In: Ghosh, S. (ed.) MODELS 2009. LNCS, vol. 6002, pp. 220–
224. Springer, Heidelberg (2010)
2. Benveniste, A., Caillaud, B., Ferrari, A., Mangeruca, L., Passerone, R., Sofronis, C.: Multiple
viewpoint contract-based specification and design. In: de Boer, F.S., Bonsangue, M.M., Graf,
S., de Roever, W.-P. (eds.) FMCO 2007. LNCS, vol. 5382, pp. 200–225. Springer, Heidelberg
(2008)
3. Bhave, A., Krogh, B.H., Garlan, D., Schmerl, B.: View consistency in architectures for cyber-
physical systems. In: ICCPS 2011, pp. 151–160 (2011)
4. Broman, D., Lee, E.A., Tripakis, S., Törngren, M.: Viewpoints, Formalisms, Languages, and
Tools for Cyber-Physical Systems. In: MPM (2012)
5. Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: POPL,
pp. 269–282. ACM (1979)
6. Dijkman, R.M.: Consistency in Multi-Viewpoint Architectural Design. PhD thesis, Univer-
sity of Twente (2006)
7. Doyen, L., Henzinger, T., Jobstmann, B., Petrov, T.: Interface theories with component reuse.
In: EMSOFT, pp. 79–88 (2008)
8. Finkelstein, A., Gabbay, D., Hunter, A., Kramer, J., Nuseibeh, B.: Inconsistency handling in
multiperspective specifications. IEEE TSE 20(8), 569–578 (1994)
9. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-
Completeness. W. H. Freeman (1979)
10. Granger, P.: Improving the results of static analyses programs by local decreasing iteration.
In: Shyamasundar, R.K. (ed.) FSTTCS 1992. LNCS, vol. 652, pp. 68–79. Springer, Heidel-
berg (1992)
11. Henzinger, T.A., Ničković, D.: Independent implementability of viewpoints. In: Calinescu,
R., Garlan, D. (eds.) Monterey Workshop 2012. LNCS, vol. 7539, pp. 380–395. Springer,
Heidelberg (2012)
GPUexplore: Many-Core On-the-Fly State Space Exploration Using GPUs
Anton Wijs and Dragan Bošnački
1 Introduction
General Purpose Graphics Processing Units (GPUs) are being applied success-
fully in many areas of research to speed up computations. Model checking [1] is
an automatic technique to verify that a given specification of a complex, safety-
critical (usually embedded) system meets a particular functional property. It
involves very time and memory demanding computations. Many computations
rely on on-the-fly state space exploration. This incorporates interpreting the spec-
ification, resulting in building a graph, or state space, describing all its potential
behaviour. Hence, the state space is not explicitly given, but implicitly, through
the specification. The state space size is not known a priori.
GPUs have been successfully applied to perform computations for probabilis-
tic model checking, when the state space is given a priori [2–4]. However, no
attempts as of yet have been made to perform the exploration itself entirely
using GPUs, due to it not naturally fitting the data parallel approach of GPUs,
but in this paper, we propose a way to do so. Even though current GPUs have
a limited amount of memory, we believe it is relevant to investigate the possibil-
ities of GPU state space exploration, if only to be prepared for future hardware
This work was sponsored by the NWO Exacte Wetenschappen, EW (NWO Physical
Sciences Division) for the use of supercomputer facilities, with financial support
from the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands
Organisation for Scientific Research, NWO).
of a simple traffic light system specification, where process 0 represents the be-
haviour of a traffic light (the states representing the colours of the light) and pro-
cess 1 represents a pedestrian. We also have V = {(start, continue, crossing)},
meaning that there is only a single synchronisation rule, expressing that the start
event of process 0 can only be fired if event continue of process 1 is fired at the
same time, resulting in the event crossing being fired by the system as a whole.
In general, synchronisation rules are not required to involve all processes; in
order to express that a rule is not applicable on a process i ∈ 1..n, we use a
dummy value • indicating this, and define t[i] = •.
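One possible representation of such rules, purely for illustration (GPUexplore's actual
encoding is described later in the paper): t is stored as a vector with one optional entry
per process, an empty entry playing the role of the dummy •.

    #include <algorithm>
    #include <cstddef>
    #include <optional>
    #include <string>
    #include <vector>

    struct SyncRule {
        std::vector<std::optional<std::string>> t;  // t[i]: event required of process i, or • (empty)
        std::string result;                         // event fired by the network as a whole
    };

    // The single rule of the traffic-light example: process 0 fires "start" only
    // together with "continue" of process 1, yielding the system event "crossing".
    SyncRule trafficRule() { return {{"start", "continue"}, "crossing"}; }

    // A rule can fire in a system state iff every process it involves currently
    // enables the required event; uninvolved processes (t[i] = •) are ignored.
    bool enabled(const SyncRule& rule,
                 const std::vector<std::vector<std::string>>& enabledEvents) {
        for (std::size_t i = 0; i < rule.t.size(); ++i) {
            if (!rule.t[i]) continue;                              // dummy •
            const auto& evs = enabledEvents[i];
            if (std::find(evs.begin(), evs.end(), *rule.t[i]) == evs.end())
                return false;
        }
        return true;
    }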
State space exploration now commences as follows: first, the two initial states
of the processes (indicated by an incoming transition without a source state) are
combined into a system state vector s = ⟨R, 0⟩. In general, given a vector s, the
corresponding state of Π[i], with i ∈ 1..n, is s[i]. The set of outgoing transitions
(and their corresponding target states or successors of s) can now be determined
using two checks for each transition s[i] −a→ pi, with pi a state of process i:
The first check is applicable for all independent transitions, i.e. transitions on
which no rule is applicable, hence they can be fired individually, and therefore
directly ‘lifted’ to the system level. The second check involves applying synchro-
nisation rules. In Figure 1, part of the system state space obtained by applying
the defined checks on the traffic network is displayed on the right.
NVIDIA GPUs can be programmed using the CUDA interface, which extends the
C and Fortran programming languages. These GPUs contain tens of streaming
multiprocessors (SM) (see Figure 2, with N the number of SMs), each containing
a fixed number of streaming processors (SP), e.g. 192 for the Kepler K20 GPU,
and fast on-chip shared memory. Each SM employs single instruction, multiple
data (SIMD) techniques, allowing for data parallelisation. A single instruction
stream is performed by a fixed size group of threads called a warp. Threads in
a warp share a program counter, hence perform instructions in lock-step. Due
to this, branch divergence can occur within a warp, which should be avoided:
for instance, consider the if-then-else construct if (C) then A else B. If a
warp needs to execute this, and for at least one thread C holds, then all threads
must step through A. It is therefore possible that the threads must step together
through both A and B, thereby decreasing performance. The size of a warp is
fixed and depends on the GPU type; usually it is 32, and we refer to it as WarpSize.
A block of threads is a larger group assigned to a single SM. The threads in
a block can use the shared memory to communicate with each other. An SM,
however, can handle many blocks in parallel. Instructions to be performed by
GPU threads can be defined in a function called a kernel. When launching a
kernel, one can specify how many thread blocks should execute it, and how many
threads each block contains (usually a power of two). Each SM then schedules all
the threads of its assigned blocks up to the warp level. Data parallelisation can
be achieved by using the predefined keywords BlockId and ThreadId, referring
to ID of the block a thread resides in, and the ID of a thread within its block,
respectively. Besides that, we refer with WarpNr to the global ID of a warp, and
with WarpTId to the ID of a thread within its warp. These can be computed as
follows: WarpNr = ThreadId/WarpSize and WarpTId = ThreadId %WarpSize.
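For readers less familiar with CUDA, these identifiers map onto the standard built-in variables threadIdx, blockIdx, blockDim and warpSize; the minimal kernel below only illustrates that correspondence (the kernel itself and its parameter are hypothetical, not part of GPUexplore).

__global__ void id_demo(unsigned int *out) {
  unsigned int thread_id = threadIdx.x;                        // ThreadId: ID within the block
  unsigned int block_id  = blockIdx.x;                         // BlockId: ID of the block
  unsigned int warp_nr   = thread_id / warpSize;               // WarpNr  = ThreadId / WarpSize
  unsigned int warp_tid  = thread_id % warpSize;               // WarpTId = ThreadId % WarpSize
  unsigned int global_id = block_id * blockDim.x + thread_id;  // common global thread index
  out[global_id] = warp_nr * warpSize + warp_tid;              // placeholder use of the IDs
}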
Most of the data used by a GPU application resides in global memory or device memory. It embodies the interface between the host (CPU) and the kernel (GPU). Depending on the GPU type, its size is typically between 1 and 6 GB. It has a high bandwidth, but also a high latency, therefore memory caches are used. The cache line of most current NVIDIA GPU L1 and L2 caches is 128 bytes, which directly corresponds with each thread in a warp fetching a 32-bit integer. If memory accesses in a kernel can be coalesced within each warp, efficient fetching can be achieved, since then the threads in a warp perform a single fetch together, nicely filling one cache line, instead of different fetches, which would be serialised by the GPU, thereby losing many clock-cycles. This plays an important role in the hash table implementation we propose.

Fig. 2. Hardware model of CUDA GPUs: N streaming multiprocessors, each containing SPs and shared memory, connected via the L1 & L2 and texture caches (128 B cache lines) to the global memory
Finally, read-only data structures in global memory can be declared as tex-
tures, by which they are connected to a texture cache. This may be beneficial if
access to the data structure is expected to be random, since the cache may help
in avoiding some global memory accesses.
Fig. 3 (figure): the arrays ProcOffsets, StateOffsets and TransArray encoding the network, a state vector s[4] s[3] s[2] s[1], and a 32-bit transition entry with fields Ts, Ta and target states Ts0, Ts1, Ts2
3 GPU Parallelisation
Alg. 1 provides a high-level view of state space exploration. As in BFS, one can
clearly identify the two main operations, namely successor generation (line 4),
analogous to neighbour gathering, and duplicate detection (line 5), analogous to
status lookup. Finally, in lines 6-7, states are added to the work sets, Visited
being the set of visited states and Open being the set of states yet to be explored
(usually implemented as a queue). In the next subsections, we will discuss our
approach to implementing these operations.
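As a point of reference, the overall scheme (successor generation, duplicate detection, and maintenance of Open and Visited) can be sketched sequentially as follows. This is only an illustrative rendering of the description above, not the paper's Alg. 1; the State type, the bound MAX_STATES and the dummy successor function are placeholders.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STATES 1024
typedef uint32_t State;

/* Stand-in successor function; a real implementation would apply the enabled
   transitions of the network to s (independent ones and synchronisations). */
static size_t successors(State s, State *out) {
  size_t n = 0;
  if (s + 1 < MAX_STATES) out[n++] = s + 1;   /* dummy transition */
  return n;
}

static bool contains(const State *set, size_t len, State s) {
  for (size_t i = 0; i < len; i++) if (set[i] == s) return true;
  return false;
}

/* Schematic exploration loop: generate successors, detect duplicates,
   and maintain the Visited set and the Open queue. */
void explore(State initial) {
  State visited[MAX_STATES]; size_t n_visited = 0;      /* Visited */
  State open[MAX_STATES];    size_t head = 0, tail = 0; /* Open (queue) */
  visited[n_visited++] = initial;
  open[tail++] = initial;
  while (head < tail) {
    State s = open[head++];
    State succ[8];
    size_t k = successors(s, succ);                     /* successor generation */
    for (size_t i = 0; i < k; i++) {
      if (!contains(visited, n_visited, succ[i])) {     /* duplicate detection  */
        visited[n_visited++] = succ[i];                 /* add to Visited       */
        open[tail++] = succ[i];                         /* add to Open          */
      }
    }
  }
}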
One can imagine that these structures are practically going to be accessed
randomly when exploring. However, since this data is never updated, we can
store the arrays as textures, thereby using the texture caches to improve access.
Besides this, we must also encode the transition entries themselves. This is
shown on the right of Figure 3. Each entry fills a 32-bit integer as much as possi-
ble. It contains the following information: the lowest bit (Ts ) indicates whether
or not the transition depends on a synchronisation rule. The next log2 (ca ) num-
ber of bits, with ca the number of different labels in the entire network, encodes
the transition label (Ta ). We encode the labels, which are basically strings, by
integer values, sorting the labels occurring in a network alphabetically. After
that, each log2 (cs ) bits, with cs the number of states in the process LTS own-
ing this transition, encodes one of the target states. If there is non-determinism
w.r.t. label Ta from the source state, multiple target states will be listed, possibly
continuing in subsequent transition entries.
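As an illustration of this packing, the following sketch uses hypothetical field widths (1 bit for Ts, 5 bits for the label and 6 bits per target state); in GPUexplore the widths are instead derived from ca and cs of the concrete network.

#include <stdint.h>

#define TS_BITS    1   /* synchronisation flag Ts */
#define LABEL_BITS 5   /* label Ta, i.e. log2(ca) bits (illustrative value) */
#define TGT_BITS   6   /* one target state, i.e. log2(cs) bits (illustrative value) */

/* Pack the flag, the label ID and up to four target states into one 32-bit entry,
   starting from the lowest bit. */
static uint32_t pack_entry(uint32_t ts, uint32_t label,
                           const uint32_t *targets, int n_targets) {
  uint32_t e = (ts & 1u) | ((label & ((1u << LABEL_BITS) - 1u)) << TS_BITS);
  int shift = TS_BITS + LABEL_BITS;
  for (int i = 0; i < n_targets; i++, shift += TGT_BITS)
    e |= (targets[i] & ((1u << TGT_BITS) - 1u)) << shift;
  return e;
}

static uint32_t entry_ts(uint32_t e)    { return e & 1u; }
static uint32_t entry_label(uint32_t e) { return (e >> TS_BITS) & ((1u << LABEL_BITS) - 1u); }
static uint32_t entry_target(uint32_t e, int i) {
  return (e >> (TS_BITS + LABEL_BITS + i * TGT_BITS)) & ((1u << TGT_BITS) - 1u);
}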
In the middle of Figure 3, the encoding of state vectors is shown. These are
simply concatenations of encodings of process LTS states. Depending on the
number of bits needed per LTS state, which in turn depends on the number of
states in the LTSs, a fixed number of 32-bit integers is required per vector.
Finally, the synchronisation rules need to be encoded. To simplify this, we
rewrite networks such that we only have rules involving a single label, e.g.
(a, a, a). In practice, this can usually be done without changing the meaning.
For the traffic light system, we could rewrite start and continue to crossing. It
allows encoding the rules as bit sequences of size n, where for each process LTS,
1 indicates that the process should participate, and 0 that it should not partic-
ipate in synchronisation. Two integer arrays then suffice, one containing these
encodings, the other containing the offsets for all the labels.
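A minimal sketch of this encoding for a hypothetical three-process network with two (rewritten) labels; the array contents and helper names are purely illustrative.

#include <stdbool.h>
#include <stdint.h>

enum { N_PROCS = 3, N_LABELS = 2 };

/* One n-bit participation vector per rule (bit i set iff process i participates),
   and per-label offsets into the rule array. */
static const uint32_t RuleEnc[]      = { 0x5u /* 101: processes 0 and 2 */,
                                         0x3u /* 011: processes 0 and 1 */ };
static const uint32_t LabelOffsets[] = { 0, 1, 2 };  /* rules of label l: [LabelOffsets[l], LabelOffsets[l+1]) */

static bool participates(uint32_t rule, uint32_t proc) {
  return (RuleEnc[rule] >> proc) & 1u;
}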
the corresponding target state vectors are stored for duplicate detection (see
Section 3.3). For all transitions with Ts = 1, to achieve cooperation between
the threads while limiting the amount of used shared memory, the threads it-
erate over their transitions in order of label ID (LID). To facilitate this, the
entries in each segment of outgoing transitions belonging to a particular state in
TransArray are sorted on LID before exploration starts.
Fig. 4. Fetching transitions: a shared-memory buffer holding a counter cnt and the entries fetched by threads th 0, . . . , th 3 from TransArray

Successors reached through synchronisation are constructed in iterations. In each iteration, the threads assigned to s fetch the entries with lowest LID and Ts = 1 from their list of outgoing transitions, and store these in a designated buffer in the shared memory. The
size of this buffer can be determined before exploration as n times the maximum
number of entries with the same LID and Ts = 1 from any process state in the
network. Then, the thread with VGID 0, i.e. the vector group leader, determines
the lowest LID fetched within the vector group. Figure 4 illustrates this for a
vector with n = 4. Threads th 0 to th 3 have fetched transitions with the lowest
LIDs for their respective process states that have not yet been processed in the
successor generation, and thread th 0 has determined that the next lowest LID to
be processed by the vector group is 1. This value is written in the cnt location.
Since transitions in TransArray are sorted per state by LID, we know that all
possible transitions with LID = 1 have been placed in the vector group buffer.
Next, all threads that fetched entries with the lowest LID, in the example threads
th 0 and th 2 , start scanning the encodings of rules in V applicable on that LID.
We say that thread i owns rule r iff there is no j ∈ 1..n with j < i and r[j] ≠ •.
If a thread encounters a rule that it owns, then it checks the buffer contents to
determine whether the rule is applicable. If it is, it constructs the target state
vectors and stores them for duplicate detection. In the next iteration, all entries
with lowest LID are removed, the corresponding threads fetch new entries, and
the vector group leader determines the next lowest LID to be processed.
hash table. Thus, caches also allow threads to cooperatively perform global du-
plicate detection and insertion of new vectors.
Global Hash Table. For the global hash table, we initially used the Cuckoo hash
table of [15]. Cuckoo hashing has the nice property that lookups are done in
constant time, namely, it requires kc memory accesses, with kc the number of
hash functions used.
However, an important aspect of Cuckoo hashing is that elements are relo-
cated in case collisions occur. In [15], key-value pairs are stored in 64-bit integers,
hence insertions can be done atomically using CAS operations. Our state vectors,
though, can encompass more than 64 bits, ruling out completely atomic inser-
tions. After having created our own extension of the hash table of [15] that allows
for larger elements, we experienced in experiments that the number of explored
states far exceeded the actual number of reachable states, showing that in many
cases, threads falsely conclude that a vector was not present (a false negative).
We concluded that this is mainly due to vector relocation, involving non-atomic
removal and insertion, which cannot be avoided for large vectors; once a thread
starts removing a vector, it is not present anymore in the hash table until the
subsequent insertion has finished, and any other thread looking for the vector
will not be able to locate it during that time. It should be noted, however, that although the false negatives may negatively influence the performance, they do not affect the correctness of our method.
To decrease the number of false negatives, as an alternative, we choose to
implement a hash table using buckets, linear probing and bounded double hash-
ing. It is implemented using an array, each consecutive WarpSize 32-bit integers
forming a bucket. This plays to the strength of warps: when a block of threads
is performing duplicate detection, all the threads in a warp cooperate on check-
ing the presence of a particular s. The first hash function h1 , built as specified
in [15], is used to find the primary bucket. A warp can fetch a bucket with one
memory access, since the bucket size directly corresponds with one cache line.
Subsequently, the bucket contents can be checked in parallel by the warp. This is
similar to the walk-the-line principle of [20], except that here the walk is done in parallel, so we call it warp-the-line. Note that each bucket can contain up to
WarpSize/c vectors, with c the number of 32-bit integers required for a vector. If
the vector is not present and there is a free location, the vector is inserted. If the
bucket is full, h2 is used to jump to another bucket, and so on. This is similar to [21], except that we do not move elements between buckets.
The pseudo-code for scanning the local cache and looking up and inserting
new vectors (i.e. find-or-put) in the case that state vectors fit in a single 32-bit
integer is displayed in Alg. 2. The implementation contains the more general case.
The cache is declared extern, meaning that the size is given when launching the
kernel. Once a work tile has been explored and the successors are in the cache,
each thread participates in its warp to iterate over the cache contents (lines 6,
27). If a vector is new (line 8, note that empty slots are marked ‘old’), insertion
in the hash table will be tried up to H ∈ N times. In lines 11-13, warp-the-line is
performed, each thread in a warp investigating the appropriate bucket slot. If any
thread sets s as old in line 13, then all threads will detect this in line 15, since s is
read from shared memory. If the vector is not old, then an attempt is made to insert it into the bucket (lines 15-23). This is done by the warp leader (WarpTId = 0,
line 18), by performing a CAS. CAS takes three arguments, namely the address
where the new value must be written, the expected value at the address, and
the new value. It only writes the new value if the expected value is encountered,
and returns the encountered value, therefore a successful write has happened if
empty has been returned (line 20). Finally, in case of a full bucket, h2 is used
to jump to the next one (line 26).
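The following device-side sketch conveys the warp-cooperative lookup-and-insert scheme for 32-bit vectors. It is not the paper's Alg. 2 verbatim: the hash functions, the EMPTY marker and the retry policy are our own illustrative choices, and the shared-memory cache scan of Alg. 2 is omitted.

#define WARP_SIZE 32
#define EMPTY     0xFFFFFFFFu   /* assumed never to be a valid vector encoding */
#define H         8             /* double-hashing bound, cf. Section 4 */

__device__ unsigned int h1(unsigned int v) { return v * 2654435761u; }         /* illustrative hash */
__device__ unsigned int h2(unsigned int v) { return ((v ^ (v >> 16)) | 1u); }  /* odd step for double hashing */

/* All WARP_SIZE threads of a warp call this together with the same vector s.
   `table` consists of n_buckets buckets of WARP_SIZE consecutive integers, so
   one bucket matches one 128-byte cache line. Returns 1 if s was found, 0 if it
   was inserted (or given up on after H buckets). */
__device__ int find_or_put(unsigned int *table, unsigned int n_buckets,
                           unsigned int s, unsigned int warp_tid) {
  unsigned int bucket = h1(s) % n_buckets;
  for (int attempt = 0; attempt < H; attempt++) {
    unsigned int seen = table[bucket * WARP_SIZE + warp_tid];   /* warp-the-line:       */
                                                                /* one slot per thread  */
    if (__ballot_sync(0xFFFFFFFFu, seen == s)) return 1;        /* s already present    */
    unsigned int empty = __ballot_sync(0xFFFFFFFFu, seen == EMPTY);
    if (empty) {
      int lane = __ffs(empty) - 1;                 /* first empty slot in the bucket */
      unsigned int ok = 0;
      if (warp_tid == 0)                           /* warp leader inserts with a CAS */
        ok = (atomicCAS(&table[bucket * WARP_SIZE + lane], EMPTY, s) == EMPTY);
      ok = __shfl_sync(0xFFFFFFFFu, ok, 0);
      if (ok) return 0;
      attempt--; continue;                         /* slot was taken meanwhile: retry bucket */
    }
    bucket = (bucket + h2(s)) % n_buckets;         /* bucket full: double hashing */
  }
  return 0;   /* unresolved after H buckets (did not occur in the reported experiments) */
}

Scanning one bucket costs the warp a single coalesced memory access, which is why the bucket size is tied to WarpSize and to the 128-byte cache line.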
As discussed in Section 4, we experienced good speedups and no unresolved
collisions using a double hashing bound of 8, and, although still present, far fewer
false negatives compared to Cuckoo hashing. Finally, it should be noted that
chaining is not a suitable option on a GPU, since it requires memory allocation
at runtime, and the required sizes of the chains are not known a priori.
Recall that the two important data structures are Open and Visited. Given
the limited amount of global memory, and that the state space size is unknown
a priori, we prefer to initially allocate as much memory as possible for Visited.
However, the required size of Open is also not known in advance, so how much memory should be allocated for it without potentially wasting any? We choose to
combine the two in a single hash table by using the highest bit in each vector
encoding to indicate whether it should still be explored or not. The drawback is
that unexplored vectors are not physically close to each other in memory, but
the typically large number of threads can together scan the memory relatively
fast, and using one data structure drastically simplifies implementation. It has
the added benefit that load-balancing is handled by the hash functions, since distributing the vectors over the hash table also distributes the work over the workers. A consequence is that the search will not be strictly BFS, but this is
not a requirement. At the start of an iteration, each block gathers a tile of new
vectors by scanning predefined parts of the hash table, determined by the block
ID. In the next section, several possible improvements on scanning are discussed.
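A minimal illustration of this flag, assuming 32-bit vector encodings; the bit choice and helper names are ours, not taken from the tool.

#include <stdint.h>

#define UNEXPLORED_FLAG 0x80000000u   /* highest bit: vector still has to be explored */

static uint32_t mark_unexplored(uint32_t enc) { return enc | UNEXPLORED_FLAG; }
static uint32_t mark_explored(uint32_t enc)   { return enc & ~UNEXPLORED_FLAG; }
static int      is_unexplored(uint32_t enc)   { return (enc & UNEXPLORED_FLAG) != 0; }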
models to obtain larger state spaces), the next two have been created by us, the
seven after that originate from CADP, and the final five come from the BEEM
database. The latter ones have first been translated manually to mCRL2, since our input, a network of LTSs, uses an action-based representation of system behaviour, whereas BEEM models are state-based; this gap needs to be bridged.
An important question is how the exploration should be configured, i.e. how many blocks should be launched, and how many iterations should be done per kernel launch. We tested different configurations with 512 threads per block (other numbers of threads resulted in reduced performance) using double hashing with forwarding; Figure 5 shows our results when launching a varying number of blocks (note the logscale of the right graph), each performing 10 iterations per kernel launch. The ideal number of blocks for the K20 seems to be 240 per SM, i.e. 3120 blocks. By GPU standards this is small, but launching more blocks negatively affects performance, probably due to the heavy use of shared memory. Figure 6 shows some of our results on varying the number of iterations per kernel launch. Here, it is less clear which value leads to the best results; either 5 or 10 seems to be the best choice. With a lower number, the more frequent hash table scanning becomes noticeable, while with higher numbers, the less frequent passing along of work from SMs to each other leads to too much redundancy, i.e. re-exploration of states, causing the exploration to take more time.

Table 1. Benchmark characteristics

Model             #States       #Transitions
1394                  198,692        355,338
1394.1             36,855,184     96,553,318
acs                     4,764         14,760
acs.1                 200,317        895,004
wafer stepper.1     4,232,299     19,028,708
ABP               235,754,220    945,684,122
broadcast          60,466,176    705,438,720
transit             3,763,192     39,925,524
CFS.1             252,101,742  1,367,483,201
asyn3              15,688,570     86,458,183
asyn3.1           190,208,728    876,008,628
ODP                    91,394        641,226
ODP.1               7,699,456     31,091,554
DES                64,498,297    518,438,860
lamport.8          62,669,317    304,202,665
lann.6            144,151,629    648,779,852
lann.7            160,025,986    944,322,648
peterson.7        142,471,098    626,952,200
szymanski.5        79,518,740    922,428,824
average has a performance similar to using about 10 cores with LTSmin, based
on the fact that LTSmin demonstrates near-linear speedups when the number of
cores is increased. In case of the exceptions, such as the ABP case, about two orders
of magnitude speedup is achieved. This may seem disappointing, considering that
GPUs have an enormous computation potential. However, on-the-fly exploration
is not a straightforward task for a GPU, and a one order of magnitude speedup
seems reasonable. Still, we believe these results are very promising, and merit fur-
ther study. Existing multi-core exploration techniques, such as in [24], scale well
with the number of cores. Unfortunately, we cannot test whether this holds for our
GPU exploration, apart from varying the number of blocks; the number of SMs
cannot be varied, and any number beyond 15 on a GPU is not yet available.
Concluding, our choices regarding data encoding and successor generation seem
to be effective, and our findings regarding a new GPU hash table, local caches and
forwarding can be useful for anyone interested in GPU graph exploration.
5 Conclusions
References
1. Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press (2008)
2. Bošnački, D., Edelkamp, S., Sulewski, D., Wijs, A.: Parallel Probabilistic Model
Checking on General Purpose Graphics Processors. STTT 13(1), 21–35 (2011)
3. Bošnački, D., Edelkamp, S., Sulewski, D., Wijs, A.: GPU-PRISM: An Extension
of PRISM for General Purpose Graphics Processing Units. In: Joint HiBi/PDMC
Workshop (HiBi/PDMC 2010), pp. 17–19. IEEE (2010)
4. Wijs, A.J., Bošnački, D.: Improving GPU Sparse Matrix-Vector Multiplication for
Probabilistic Model Checking. In: Donaldson, A., Parker, D. (eds.) SPIN 2012.
LNCS, vol. 7385, pp. 98–116. Springer, Heidelberg (2012)
5. Lang, F.: Exp.Open 2.0: A Flexible Tool Integrating Partial Order, Compositional,
and On-The-Fly Verification Methods. In: Romijn, J.M.T., Smith, G.P., van de
Pol, J. (eds.) IFM 2005. LNCS, vol. 3771, pp. 70–88. Springer, Heidelberg (2005)
6. Garavel, H., Lang, F., Mateescu, R., Serwe, W.: CADP 2010: A Toolbox for
the Construction and Analysis of Distributed Processes. In: Abdulla, P.A., Leino,
K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 372–387. Springer, Heidelberg
(2011)
1 Introduction
Reachability analysis is a fundamental problem in the areas of formal meth-
ods and of systems theory. It is concerned with assessing whether a certain set
of states of a system is attainable from a given set of initial conditions. The
problem is particularly interesting and compelling over models with continuous
components – either in time or in the (state) space. For the first class of models,
reachability has been widely investigated over discrete-space systems, such as
timed automata [1,2], or (timed continuous) Petri nets [3], or hybrid automata
[4]. On the other hand, much research has been done to enhance and scale the
This work has been supported by the European Commission STREP project MoVeS
257005, by the European Commission Marie Curie grant MANTRAS 249295, by the
European Commission IAPP project AMBI 324432, by the European Commission
NoE Hycon2 257462, and by the NWO VENI grant 016.103.020.
does not imply that we can employ related techniques for reachability analysis
of MPL systems, since the two modeling frameworks are not equivalent.
While this new approach reduces reachability analysis of MPL models to a
computationally feasible task, the foundations of this contribution go beyond
mere manipulations of DBM: the technique is inspired by the recent work in
[27], which has developed an approach to the analysis of MPL models based
on finite-state abstractions. In particular, the procedure for forward reachabil-
ity computation on MPL models discussed in this work is implemented in the
VeriSiMPL (“very simple”) software toolbox [28], which is freely available. While
the general goals of VeriSiMPL go beyond the topics of this work and are thus
left to the interested reader, in this article we describe the details of the im-
plementation of the suite for reachability analysis within this toolbox over a
running example. With an additional numerical case study, we display the scala-
bility of the tool as a function of model dimension (the number of its continuous
variables): let us emphasize that related approaches for reachability analysis of
discrete-time dynamical systems based on finite abstractions do not reasonably
scale beyond models with a few variables [29], whereas our procedure comfort-
ably handles models with about twenty continuous variables. In this numerical
benchmark we have purposely generated the underlying dynamics randomly: this
allows deriving empirical outcomes that are general and not biased towards possi-
ble structural features of a particular model. Finally, we successfully benchmark
the computation of forward reachability sets against an alternative approach
based on the well-developed MPT software tool [25].
Furthermore, for practical reasons, the state space is taken to be IRn , which also
implies that the state matrix has to be row-finite (cf. Definition 1).
An autonomous (that is, deterministic) MPL model [15, Remark 2.75] is de-
fined as:
x(k) = A ⊗ x(k − 1) , (1)
where A ∈ IRn×n ε , x(k − 1) = [x1 (k − 1) . . . xn (k − 1)]T ∈ IRn for k ∈ IN. The
independent variable k denotes an increasing discrete-event counter, whereas
the state variable x defines the (continuous) timing of the discrete events. Au-
tonomous MPL models are characterized by deterministic dynamics. Related to
the state matrix A is the notion of regular (or row-finite) matrix and that of
irreducibility.
The length of the transient part specifically for x(0) = [3, 0]T can be computed
observing the trajectory
[3, 0]T , [5, 6]T , [11, 9]T , [14, 14]T , [19, 17]T , [22, 22]T , [27, 25]T , [30, 30]T , [35, 33]T , [38, 38]T , . . .
The periodic behavior occurs (as expected) after 2 event steps, i.e. k0 ([3, 0]T ) =
2, and shows a period equal to 2, namely x(4) = 4⊗2 ⊗ x(2) = 8 + x(2), and
similarly x(5) = 4⊗2 ⊗ x(3). Furthermore x(k + 2) = 4⊗2 ⊗ x(k) for k ≥ 2.
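As a concrete illustration of the max-plus product in (1), the sketch below iterates a 2-dimensional MPL model; the matrix is an illustrative choice that reproduces the trajectory above when started from x(0) = [3, 0]T (we do not claim it is exactly the matrix of example (2)).

#include <math.h>
#include <stdio.h>

#define N 2
#define EPS (-INFINITY)   /* the max-plus zero element epsilon */

/* One step of (1): x_i(k) = max_j ( A(i,j) + x_j(k-1) ). */
static void mpl_step(const double A[N][N], const double x[N], double y[N]) {
  for (int i = 0; i < N; i++) {
    y[i] = EPS;
    for (int j = 0; j < N; j++)
      if (A[i][j] + x[j] > y[i]) y[i] = A[i][j] + x[j];
  }
}

int main(void) {
  double A[N][N] = { { 2.0, 5.0 }, { 3.0, 3.0 } };   /* illustrative row-finite matrix */
  double x[N] = { 3.0, 0.0 }, y[N];
  for (int k = 1; k <= 5; k++) {
    mpl_step(A, x, y);
    printf("x(%d) = [%g, %g]\n", k, y[0], y[1]);      /* [5,6], [11,9], [14,14], ... */
    x[0] = y[0]; x[1] = y[1];
  }
  return 0;
}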
Rg = ⋂_{i=1..n} ⋂_{j=1..n} { x ∈ IRn : A(i, gi ) + xgi ≥ A(i, j) + xj } ;   (3)

xi (k) = xgi (k − 1) + A(i, gi ) ,   1 ≤ i ≤ n .   (4)

R(g1 ,...,gk ) = ⋂_{i=1..k} ⋂_{j=1..n} { x ∈ IRn : A(i, gi ) + xgi ≥ A(i, j) + xj } .
Notice that if the region associated with a partial coefficient (g1 , . . . , gk ) is empty,
then the regions associated with the coefficients (g1 , . . . , gn ) are also empty, for
all gk+1 , . . . , gn . The set of all coefficients can be represented as a potential
search tree. For a 2-dimensional MPL model, the potential search tree is given
in Fig. 1. The backtracking algorithm traverses the tree recursively, starting from
the root, in a depth-first order. At each node, the algorithm checks whether the
corresponding region is empty: if the region is empty, the whole sub-tree rooted
at the node is skipped (pruned).
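A compact sketch of this backtracking enumeration; the dimension, the callback types and all names are illustrative, and the emptiness test (in VeriSiMPL, a DBM emptiness check on R(g1,...,gk)) is abstracted behind a function pointer.

#define N_DIM 2   /* illustrative dimension */

typedef int  (*EmptyTest)(const int *g, int k);   /* nonzero iff R(g_1,...,g_k) is empty */
typedef void (*Emit)(const int *g);               /* called for every nonempty R_g       */

/* Depth-first traversal of the coefficient tree with pruning of empty regions. */
static void enumerate_regions(int *g, int k, EmptyTest region_empty, Emit emit) {
  if (k > 0 && region_empty(g, k)) return;   /* prune the whole sub-tree at this node */
  if (k == N_DIM) { emit(g); return; }       /* complete coefficient (g_1,...,g_n)    */
  for (int gi = 1; gi <= N_DIM; gi++) {      /* children: choices for g_{k+1}; in     */
    g[k] = gi;                               /* general only j with A(k+1,j) finite   */
    enumerate_regions(g, k + 1, region_empty, emit);
  }
}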
The function maxpl2pwa is used to construct a PWA system from an autonomous
MPL model. The autonomous MPL model is characterized by a row-finite state
matrix (Ampl), whereas the PWA system is characterized by a collection of regions
(D) and a set of affine dynamics (A,B). The affine dynamics that are active in the
j-th region are characterized by the j-th column of both A and B. Each column of
A and the corresponding column of B contain the coefficients [g1 , . . . , gn ]T and the
constants [A(1, g1 ), . . . , A(n, gn )]T , respectively. The data structure of D will be
discussed in Section 2.3.
Considering the autonomous MPL example in (2), the following script gener-
ates the PWA system:
It will become clear in Section 2.3 that the nonempty regions of the PWA system
produced by the script are: R(1,1) = {x ∈ IR2 : x1 − x2 ≥ 3}, R(2,1) = {x ∈ IR2 :
e ≤ x1 − x2 ≤ 3}, and R(2,2) = {x ∈ IR2 : x1 − x2 ≤ e}. The affine dynamics
corresponding to a region Rg are characterized by g, e.g. those for region R(2,1)
are given by x1 (k) = x2 (k − 1) + 5, x2 (k) = x1 (k − 1) + 3.
Fig. 1 (figure): the potential search tree of a 2-dimensional MPL model, with root IR2 and children R(1) and R(2)
represented as a 1×2 cell, where the corresponding matrices are stacked along
the third dimension.
Each DBM admits an equivalent and unique canonical-form representation,
which is a DBM with the tightest possible bounds [26, Sect. 4.1]. The Floyd-
Warshall algorithm can be used to obtain the canonical-form representation of
a DBM, with a complexity that is cubic w.r.t. its dimension. One advantage
of the canonical-form representation is that it is easy to compute orthogonal
projections w.r.t. a subset of its variables, which is simply performed by deleting
rows and columns corresponding to the complementary variables [26, Sect. 4.1].
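The canonicalisation itself is plain Floyd-Warshall on the matrix of upper bounds; the sketch below (with strictness flags omitted) reproduces the small example discussed in the Implementation paragraph that follows.

#include <math.h>
#include <stdio.h>

#define NV 4                 /* number of variables x1..x4 in the example */
#define UNB INFINITY         /* absence of a constraint */

/* Canonical form via Floyd-Warshall: D[i][j] is the upper bound on x_{i+1} - x_{j+1}. */
static void dbm_canonical(double D[NV][NV]) {
  for (int k = 0; k < NV; k++)
    for (int i = 0; i < NV; i++)
      for (int j = 0; j < NV; j++)
        if (D[i][k] + D[k][j] < D[i][j]) D[i][j] = D[i][k] + D[k][j];
}

int main(void) {
  double D[NV][NV];
  for (int i = 0; i < NV; i++)
    for (int j = 0; j < NV; j++) D[i][j] = (i == j) ? 0.0 : UNB;
  D[0][3] = -3;   /* x1 - x4 <= -3 */
  D[1][0] = -3;   /* x2 - x1 <= -3 */
  D[1][3] = -3;   /* x2 - x4 <= -3 */
  D[2][0] =  2;   /* x3 - x1 <=  2 */
  dbm_canonical(D);
  /* tightened bounds, as in Dcf: x2 - x4 <= -6 and x3 - x4 <= -1 */
  printf("x2 - x4 <= %g, x3 - x4 <= %g\n", D[1][3], D[2][3]);
  return 0;
}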
Implementation: The Floyd-Warshall algorithm has been implemented in the
function floyd warshall. Given a collection of DBM, this function generates
its canonical-form representation. The following MATLAB script computes the
canonical-form representation of D = {x ∈ IR4 : x1 − x4 ≤ −3, x2 − x1 ≤
−3, x2 − x4 ≤ −3, x3 − x1 ≤ 2}:
Let us discuss the steps in the construction of the DBM D. We first initial-
ize D with IR4 as D = cell(1,2), D{1} = Inf(5), D{1}(1:6:25) = 0, D{2} =
false(5), D{2}(1:6:25) = true. The variable ind contains the location, in lin-
ear index format, of each inequality in the matrix. We define the upper bounds
and the strictness in D{1}(ind) = [-3,-3,-3,2] and D{2}(ind) = true, re-
spectively. The output is Dcf = {x ∈ IR4 : x1 − x4 ≤ −3, x2 − x1 ≤ −3, x2 − x4 ≤
−6, x3 − x1 ≤ 2, x3 − x4 ≤ −1}. Notice that the bounds of x2 − x4 and x3 − x4
are tighter. Moreover, the orthogonal projection of D (or Dcf) w.r.t. {x1 , x2 } is
{x ∈ IR2 : x2 − x1 ≤ −3}.
The following result plays an important role in the computation of reachability
for MPL models.
Proposition 2 ([27, Th. 1]) The image of a DBM with respect to affine dy-
namics (in particular the PWA expression (4) generated by an MPL model) is
a DBM.
The procedure has been implemented in dbm image. It computes the image
of a collection of DBM w.r.t. the corresponding affine dynamics. The following
example computes the image of D = {x ∈ IR2 : e ≤ x1 − x2 ≤ 3} w.r.t. x1 (k) =
x2 (k − 1) + 5, x2 (k) = x1 (k − 1) + 3:
Definition 4 (Reach Tube) Given an MPL model and a nonempty set of ini-
tial positions X0 ⊆ IRn , the reach tube is defined by the set-valued function
k ↦ Xk for any given k > 0 where Xk is defined.
Unless otherwise stated, in this work we focus on finite-horizon reachability:
in other words, we compute the reach set for a finite index N (cf. Definition 3)
and the reach tube for k = 1, . . . , N , where N < ∞ (cf. Definition 4). While the
reach set can be obtained as a by-product of the (sequential) computations used
to obtain the reach tube, we will argue that it can be as well calculated by a
tailored procedure (one-shot).
In the computation of the quantities defined above, the set of initial conditions
X0 ⊆ IRn will be assumed to be a union of finitely many DBM. In the more
Xk = I(Xk−1 ) = {A ⊗ x : x ∈ Xk−1 } .
In the dynamical systems and automata literature the mapping I is also known
as Post [32, Definition 2.3]. Under the assumption that X0 is a union of finitely
many DBM, by Corollary 1 it can be shown by induction that the reach set Xk
is also a union of finitely many DBM, for each k ∈ IN.
Implementation: Given a state matrix A and a set of initial conditions X0 ,
the general procedure for obtaining the reach tube works as follows: first, we
construct the PWA system generated by A; then, for each k = 1, . . . , N , the
reach set Xk is obtained by computing I(Xk−1 ).
The worst-case complexity of the procedure (excluding that related to the
generation of PWA system) can be assessed as follows. The complexity of com-
puting I(Xk−1 ) is O(|Xk−1 | · |R(A)| · n3 ), for k = 1, . . . , N . This results in an overall complexity of O(|R(A)| · n3 · Σ_{k=0}^{N−1} |Xk |). Notice that quantifying explic-
itly the cardinality |Xk | of the DBM union at each step k is not possible in
general (cf. Benchmark in Section 4).
The procedure has been implemented in maxpl reachtube for. The inputs
are the PWA system (A, B, D), the initial states (D0), and the event horizon
(N). The set of initial states D0 is a collection of finitely many DBM and the
event horizon N is a natural number. The output is a 1×(N + 1) cell. For each
1 ≤ i ≤ N + 1, the i-th element contains the reach set Xi−1 , which is a collection
of finitely many DBM (cf. Section 2.3).
Let us consider the unit square as the set of initial conditions X0 = {x ∈ IR2 :
0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1}. The following MATLAB script computes the reach
tube for two steps:
Fig. 2. (Left plot) Reach tube for the autonomous MPL model over 2 event steps, in the (x1 , x2 ) plane, with the regions R(1,1) , R(2,1) and R(2,2) indicated. (Right plot) Time (logarithmic scale) needed to generate the reach tube of autonomous models of sizes n = 4, 7, 10, 13 for event horizons N up to 100, cf. Section 4.
Recall that, given a set of initial conditions X0 and a finite event horizon
N ∈ IN, in order to compute XN , we have to calculate X1 , . . . , XN −1 . If the
autonomous MPL system is irreducible, we can exploit the periodic behavior
(cf. Proposition 1) to simplify the computation.
Proposition 3 Let A ∈ IRn×n ε be an irreducible matrix with max-plus eigen-
value λ ∈ IR and cyclicity c ∈ IN. There exists a k0 (X0 ) = maxx∈X0 k0 (x), such
that Xk+c = λ⊗c ⊗ Xk , for all k ≥ k0 (X0 ).
Proof. Recall that for each x(0) ∈ IRn , there exists a k0 (x(0)) such that x(k +
c) = λ⊗c ⊗ x(k), for all k ≥ k0 (x(0)). Since k0 (X0 ) = maxx∈X0 k0 (x), for each
x(0) ∈ X0 , we have x(k+c) = λ⊗c ⊗x(k), for k ≥ k0 (X0 ). Recall from Definition 3
that Xk = {x(k) : x(0) ∈ X0 }, for all k ∈ IN.
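Spelling out the consequence exploited in the implementation (this restatement is ours): by iterating Proposition 3, only the reach sets up to index k0 (X0 ) + c − 1 ever need to be computed explicitly, since for every m ∈ IN and every k ≥ k0 (X0 ),

Xk+mc = λ⊗(mc) ⊗ Xk = { mcλ + x : x ∈ Xk } ,

i.e. later reach sets are obtained from earlier ones by adding the scalar mcλ to every component.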
it can be shown that the image of a stripe w.r.t. affine dynamics (generated by
an MPL model) is a stripe. Following the arguments after Theorem 2, it can be
shown that the image of a union of finitely many stripes w.r.t. the PWA system
generated by an MPL model is a union of finitely many stripes.
Since a stripe is a collection of equivalence classes [16, Sect. 1.4], we have X0 ⊗ α = X0 for each α ∈ IR. From Proposition 3 and the previous observations, Xk+c = Xk for all k ≥ k0 (X0 ).
Example: The set of initial conditions can also be described as a stripe, for
example X0 = {x ∈ IR2 : −1 ≤ x1 − x2 ≤ 1}. The reach sets are stripes given
by X1 = {x ∈ IR2 : 1 ≤ x1 − x2 ≤ 2} and X2 = {x ∈ IR2 : 0 ≤ x1 − x2 ≤ 1}.
Additionally, we obtain X1 = X2k−1 and X2 = X2k , for all k ∈ IN. It follows that the infinite-horizon reach tube is ⋃_{k=0}^{+∞} Xk = ⋃_{k=0}^{2} Xk = {x ∈ IR2 : −1 ≤
x1 − x2 ≤ 2}.
Using Corollary 1, it can be seen that the reach set XN is a union of finitely
many DBM.
Implementation: Given a state matrix A, a set of initial conditions X0 and a
finite index N , the general procedure for obtaining XN is: 1) computing A⊗N ;
then 2) constructing the PWA system generated by it; finally 3) computing the
image of X0 w.r.t. the obtained PWA system.
Let us quantify the total complexity of the first and third steps in the pro-
cedure. The complexity of computing N -th max-algebraic power of an n × n
matrix (cf. Section 2.1) is O(log2 (N ) · n3 ). Excluding the generation of the
PWA system – step 2), see above – the overall complexity of the procedure is
O(log2 (N ) · n3 + |X0 | · |R(A⊗N )| · n3 ).
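A direct rendering of the first step (max-plus matrix multiplication and exponentiation by repeated squaring), which yields the O(log2 (N ) · n3 ) bound mentioned above; the fixed dimension and the dense representation are illustrative simplifications.

#include <math.h>
#include <string.h>

#define ND 2
#define EPS (-INFINITY)   /* max-plus zero */

/* C = A ⊗ B in the max-plus semiring: C(i,j) = max_k ( A(i,k) + B(k,j) ). */
static void mp_mul(const double A[ND][ND], const double B[ND][ND], double C[ND][ND]) {
  for (int i = 0; i < ND; i++)
    for (int j = 0; j < ND; j++) {
      C[i][j] = EPS;
      for (int k = 0; k < ND; k++)
        if (A[i][k] + B[k][j] > C[i][j]) C[i][j] = A[i][k] + B[k][j];
    }
}

/* R = A^{⊗N} by repeated squaring (N = 0 yields the max-plus identity). */
static void mp_pow(const double A[ND][ND], unsigned N, double R[ND][ND]) {
  double base[ND][ND], tmp[ND][ND];
  for (int i = 0; i < ND; i++)
    for (int j = 0; j < ND; j++) R[i][j] = (i == j) ? 0.0 : EPS;  /* identity */
  memcpy(base, A, sizeof base);
  while (N > 0) {
    if (N & 1u) { mp_mul(R, base, tmp); memcpy(R, tmp, sizeof tmp); }
    mp_mul(base, base, tmp); memcpy(base, tmp, sizeof tmp);
    N >>= 1;
  }
}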
The procedure has been implemented in maxpl reachset for. The inputs are
the state matrix (Ampl), the initial states (D0), and the event horizon (N). The set
of initial states D0 is a collection of finitely many DBM (cf. Section 2.3) and the
event horizon N is a natural number. The output is a 1×2 cell: the first element
is the set of initial states and the second one is the reach set at event step N.
Recall that both the initial states and the reach set are a collection of finitely
many DBM.
Let us consider the unit square as the set of initial conditions X0 = {x ∈ IR2 :
0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1}. The following MATLAB script computes the reach set
for two steps:
4 Numerical Benchmark
4.1 Implementation and Setup
The technique for forward reachability computation on MPL models discussed
in this work is implemented in the VeriSiMPL (“very simple”) version 1.3, which
is freely available at [28]. VeriSiMPL is a software tool originally developed to
obtain finite abstractions of Max-Plus-Linear (MPL) models, which enables their
verification against temporal specifications via a model checker. The algorithms
have been implemented in MATLAB 7.13 (R2011b) and the experiments have
been run on a 12-core Intel Xeon 3.47 GHz PC with 24 GB of memory.
In order to test the practical efficiency of the proposed algorithms, we compute
the runtime needed to determine the reach tube of an autonomous MPL system,
for event horizon N = 10 and an increasing dimension n of the MPL model. We
also keep track of the number of regions of the PWA system generated from the
MPL model. For any given n, we generate matrices A with 2 finite elements (in
a max-plus sense) that are randomly placed in each row. The finite elements are
randomly generated integers between 1 and 100. The set of initial conditions is
selected as the unit hypercube, i.e. {x ∈ IRn : 0 ≤ x1 ≤ 1, . . . , 0 ≤ xn ≤ 1}.
Over 10 independent experiments, Table 1 reports the average time needed
to generate the PWA system and to compute the reach tube, as well as the
corresponding number of regions. As confirmed by Table 1, the time needed to
compute the reach tube is monotonically increasing w.r.t. the dimension of the
MPL model (as we commented previously this is not the case for the cardinality
of reach sets, which hinges on the structure of the MPL models). For a fixed
model size and dynamics, the growth of the computational time for forward
reachability is linear (in the plot, logarithmic over logarithmic time scale) with
the event horizon as shown in Fig. 2 (right). We have also performed reachability
computations for the case of the set of initial conditions described as a stripe,
which has yielded results that are analogue to those in Table 1.
Table 1. Numerical benchmark, autonomous MPL model: computation of the reach
tube (average over 10 experiments)
Table 2. Time for generation of the reach tube of 10-dimensional autonomous MPL
model for different event horizons (average over 10 experiments)
References
1. Alur, R., Dill, D.: A theory of timed automata. Theoretical Computer Sci-
ence 126(2), 183–235 (1994)
2. Behrmann, G., David, A., Larsen, K.G.: A tutorial on uppaal. In: Bernardo, M.,
Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Hei-
delberg (2004)
3. Kloetzer, M., Mahulea, C., Belta, C., Silva, M.: An automated framework for formal
verification of timed continuous Petri nets. IEEE Trans. Ind. Informat. 6(3), 460–
471 (2010)
4. Henzinger, T.A., Rusu, V.: Reachability verification for hybrid automata. In:
Henzinger, T.A., Sastry, S.S. (eds.) HSCC 1998. LNCS, vol. 1386, pp. 190–204.
Springer, Heidelberg (1998)
5. Dang, T., Maler, O.: Reachability analysis via face lifting. In: Henzinger, T.A.,
Sastry, S.S. (eds.) HSCC 1998. LNCS, vol. 1386, pp. 96–109. Springer, Heidelberg
(1998)
6. Chutinan, A., Krogh, B.: Computational techniques for hybrid system verification.
IEEE Trans. Autom. Control 48(1), 64–75 (2003)
7. CheckMate, http://users.ece.cmu.edu/~ krogh/checkmate/
8. Mitchell, I., Bayen, A., Tomlin, C.: A time-dependent Hamilton-Jacobi formula-
tion of reachable sets for continuous dynamic games. IEEE Trans. Autom. Con-
trol 50(7), 947–957 (2005)
9. Mitchell, I.M.: Comparing forward and backward reachability as tools for safety
analysis. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007. LNCS,
vol. 4416, pp. 428–443. Springer, Heidelberg (2007)
10. Kurzhanskiy, A., Varaiya, P.: Ellipsoidal techniques for reachability analysis of
discrete-time linear systems. IEEE Trans. Autom. Control 52(1), 26–38 (2007)
11. Kurzhanskiy, A., Varaiya, P.: Ellipsoidal toolbox. Technical report, EECS Depart-
ment, University of California, Berkeley (May 2006)
12. Asarin, E., Schneider, G., Yovine, S.: Algorithmic analysis of polygonal hybrid sys-
tems, part i: Reachability. Theoretical Computer Science 379(12), 231–265 (2007)
13. Le Guernic, C., Girard, A.: Reachability analysis of hybrid systems using support
functions. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp.
540–554. Springer, Heidelberg (2009)
14. Chen, X., Ábrahám, E., Sankaranarayanan, S.: Flow*: An Analyzer for Non-linear
Hybrid Systems. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044,
pp. 258–263. Springer, Heidelberg (2013)
15. Baccelli, F., Cohen, G., Olsder, G., Quadrat, J.P.: Synchronization and Linearity,
An Algebra for Discrete Event Systems. John Wiley and Sons (1992)
16. Heidergott, B., Olsder, G., van der Woude, J.: Max Plus at Work–Modeling and
Analysis of Synchronized Systems: A Course on Max-Plus Algebra and Its Appli-
cations. Princeton University Press (2006)
17. Roset, B., Nijmeijer, H., van Eekelen, J., Lefeber, E., Rooda, J.: Event driven
manufacturing systems as time domain control systems. In: Proc. 44th IEEE Conf.
Decision and Control and European Control Conf. (CDC-ECC 2005), pp. 446–451
(December 2005)
18. Merlin, P., Farber, D.J.: Recoverability of communication protocols–implications
of a theoretical study. IEEE Trans. Commun. 24(19), 1036–1043 (1976)
19. Plus, M.: Max-plus toolbox of Scilab (1998),
http://www.cmap.polytechnique.fr/~ gaubert/MaxplusToolbox.html
20. Gazarik, M., Kamen, E.: Reachability and observability of linear systems over
max-plus. Kybernetika 35(1), 2–12 (1999)
21. Gaubert, S., Katz, R.: Reachability and invariance problems in max-plus algebra.
In: Benvenuti, L., De Santis, A., Farina, L. (eds.) Positive Systems. LNCIS, vol. 294,
pp. 15–22. Springer, Heidelberg (2003)
22. Gaubert, S., Katz, R.: Reachability problems for products of matrices in semirings.
International Journal of Algebra and Computation 16(3), 603–627 (2006)
23. Gaubert, S., Katz, R.: The Minkowski theorem for max-plus convex sets. Linear
Algebra and its Applications 421(2-3), 356–369 (2007)
24. Zimmermann, K.: A general separation theorem in extremal algebras. Ekonom.-
Mat. Obzor 13(2), 179–201 (1977)
25. Kvasnica, M., Grieder, P., Baotić, M.: Multi-parametric toolbox, MPT (2004)
26. Dill, D.: Timing assumptions and verification of finite-state concurrent systems. In:
Sifakis, J. (ed.) CAV 1989. LNCS, vol. 407, pp. 197–212. Springer, Heidelberg (1990)
27. Adzkiya, D., De Schutter, B., Abate, A.: Finite abstractions of max-plus-linear
systems. IEEE Trans. Autom. Control 58(12), 3039–3053 (2013)
28. Adzkiya, D., Abate, A.: VeriSiMPL: Verification via biSimulations of MPL models.
In: Joshi, K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) QEST 2013. LNCS,
vol. 8054, pp. 274–277. Springer, Heidelberg (2013),
http://sourceforge.net/projects/verisimpl/
29. Yordanov, B., Belta, C.: Formal analysis of discrete-time piecewise affine systems.
IEEE Trans. Autom. Control 55(12), 2834–2840 (2010)
30. Charron-Bost, B., Függer, M., Nowak, T.: Transience bounds for distributed algo-
rithms. In: Braberman, V., Fribourg, L. (eds.) FORMATS 2013. LNCS, vol. 8053,
pp. 77–90. Springer, Heidelberg (2013)
31. Sontag, E.D.: Nonlinear regulation: The piecewise-linear approach. IEEE Trans.
Autom. Control 26(2), 346–358 (1981)
32. Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press (2008)
Compositional Invariant Generation
for Timed Systems
1 Introduction
Compositional methods in verification have been developed to cope with state
space explosion. Generally based on divide et impera principles, these methods
attempt to break monolithic verification problems into smaller sub-problems by
exploiting either the structure of the system or the property or both. Composi-
tional reasoning can be used in different manners e.g., for deductive verification,
assume-guarantee, contract-based verification, compositional generation, etc.
The development of compositional verification for timed systems remains how-
ever challenging. State-of-the-art tools [7,13,25,18] for the verification of such
systems are mostly based on symbolic state space exploration, using efficient
data structures and particularly involved exploration techniques. In the timed
context, the use of compositional reasoning is inherently difficult due to the
synchronous model of time. Time progress is an action that synchronises contin-
uously all the components of the system. Getting rid of the time synchronisation
is necessary for analysing independently different parts of the system (or of the
property) but becomes problematic when attempting to re-compose the partial
verification results. Nonetheless, compositional verification is actively investi-
gated and several approaches have been recently developed and employed in
timed interfaces [2] and contract-based assume-guarantee reasoning [15,22].
In this paper, we propose a different approach for exploiting compositionality
for analysis of timed systems using invariants. In contrast to exact reachability
analysis, invariants are symbolic approximations of the set of reachable states of
the system. We show that rather precise invariants can be computed composi-
tionally, from the separate analysis of the components in the system and from
Work partially supported by the European Integrated Projects 257414 ASCENS,
288175 CERTAINTY, and STREP 318772 D-MILS.
their composition glue. This method is proved to be sound for the verification
of safety properties. However, it is not complete.
The starting point is the verification method of [9], summarised in Figure 1.
The method exploits compositionality as explained next. Consider a system con-
sisting of components Bi interacting by means of a set γ of multi-party inter-
actions, and let Ψ be a system property of interest. Assume that all Bi as well
as the composition through γ can be independently characterised by means
of component invariants CI (Bi ), respectively interaction invariants II (γ). The
connection between the invariants and the system property Ψ can be intuitively
understood as follows: if Ψ can be proved to be a logical consequence of the con-
junction of components and interaction invariants, then Ψ holds for the system.
In the rule (VR) the symbol "⊢" is used to underline that the logical implication can be effectively proved (for instance with an SMT solver) and the notation "B |= Ψ" is to be read as "Ψ holds in every reachable state of B".

    ⊢ ⋀i CI (Bi ) ∧ II (γ) → Ψ
    ─────────────────────────── (VR)
          ‖γ Bi |= Ψ

Fig. 1. Compositional Verification
The verification rule (VR) in [9] has been developed for untimed systems. Its
direct application to timed systems may be weak as interaction invariants do not
capture global timings of interactions between components. The key contribution
of this paper is to improve the invariant generation method so to better track
such global timings by means of auxiliary history clocks for actions and inter-
actions. At component level, history clocks expose the local timing constraints
relevant to the interactions of the participating components. At composition
level, extra constraints on history clocks are enforced due to the simultaneity of
interactions and to the synchrony of time progress.
As an illustration, let us consider as running example the timed system in
Figure 2 which depicts a “controller” component serving n “worker” components,
one at a time. The interactions between the controller and the workers are defined
by the set of synchronisations {(a | bi ), (c | di ) | i ≤ n}. Periodically, after every
4 units of time, the controller synchronises its action a with the action bi of any
worker i whose clock shows at least 2 units of time. Initially, such a worker exists because the controller waits for 4n units of time before interacting with workers. The cycle repeats forever because there is always a worker "willing" to do b, that is, the system is deadlock-free. Proving deadlock-freedom of the system requires establishing that when the controller is at location lc1 there is at least one worker such that yi − x ≥ 4n − 4.

Fig. 2. A Timed System: a controller with locations lc0 , lc1 , lc2 and clock x (guards involving 4 and 4n), interacting with n workers, worker i having two locations and clock yi
Unfortunately, this property cannot be shown if we use (VR) as it is in [9]. In-
tuitively, this is because the proposed invariants are too weak to infer such cross
constraints relating the clocks of the controller and the clocks of the workers:
Organisation of the paper. Section 2 recalls the needed definitions for modelling
timed systems and their properties. Section 3 presents our method for composi-
tional generation of invariants. Section 4 describes the prototype implementing
the method and some case studies we experimented with. Section 5 concludes.
In the framework of the present paper, components are timed automata and
systems are compositions of timed automata with respect to multi-party inter-
actions. The timed automata we use are essentially the ones from [3], however slightly adapted to embrace a uniform notation throughout the paper.
Because the semantics defined above is in general infinite, we work with the
so called zone graph [19] as a finite symbolic representation. The symbolic states
in a zone graph are pairs (l, ζ) where l is a location of B and ζ is a zone, a set
of clock valuations defined by clock constraints. Given a symbolic state (l, ζ),
its successor with respect to a transition t of B is denoted as succ(t, (l, ζ)) and
defined by means of its timed and its discrete successor:
where ↗, [r], norm are usual operations on zones: ↗ζ is the forward diagonal
projection of ζ, i.e., it contains any valuation v for which there exists a real δ
such that v − δ is in ζ; ζ[r] is the set of all valuations in ζ after applying the
resets in r; norm(ζ) corresponds to normalising ζ such that computation of the
set of all successors terminates. Since we are seeking component invariants which
are over-approximations of the reachable states, a more thorough discussion on
normalisation is not relevant for the present paper. The interested reader may
refer to [12] for more precise definitions.
A symbolic execution of a component starting from a symbolic state s0 is a
sequence of symbolic states s0 , s1 , . . . , sn , . . . such that for any i > 0 there exists
a transition t for which si is succ(t, si−1 ).
Given a component B with initial symbolic state s0 and transitions T , the
set of reachable symbolic states Reach(B) is Reach(s0 ) where Reach is defined
recursively for an arbitrary s as:
Reach(s) = {s} ∪ ⋃_{t∈T} Reach(succ(t, s)).
state predicate “at (l)” which holds in any symbolic state with location l, that is,
the semantics of at (l) is given by (l, ζ) |= at (l). As an example, the component
invariants for the scenario in Figure 2 with one worker are:
where γ ⇓ α = {β \ α | β ∈ γ ∧ β ⊈ α}.

Remark 1. We can use the interpreted function "min" as syntactic sugar to have a more compact expression for E(γ):

E(γ) = ⋁_{α∈γ} ( ⋀_{ai ,aj ∈α} ( hai = haj ≤ min_{ak ∉α} hak ) ∧ E(γ ⇓ α) ) .
E(γ) characterises the relations between history clocks during any possible
execution of a system. It can be shown, by induction, that this characterisation
is, in fact, an inductive invariant of the extended system.
By Proposition 2, and using the fact that component and interaction invari-
ants are inductive, we have that also their conjunction is an inductive invariant
of the system with history clocks. As a consequence of Proposition 1, we can
eliminate the history clocks from ∧i CI (Bih ) ∧ II (γ) ∧ E(γ) and obtain an invari-
ant of the original system. This invariant is usually stronger than CI (Bi ) ∧ II (γ)
and yields more successful applications of the rule (VR).
Example 2. We reconsider the sub-system of a controller and a worker from
Figure 2. We illustrate how the safety property ψSafe introduced in the beginning
of the section can be shown to hold by using the newly generated invariant. The
invariants for the components with history clocks are:
CI (Controller h ) = (lc0 ∧ h0 = x) ∨
(lc1 ∧ x ≤ 4 ∧ ha ≤ h0 ∧ (ha = hc ≥ 4 + x ∨ x = hc ≤ ha )) ∨
(lc2 ∧ x = ha ∧ hc ≤ h0 ∧ (hc ≥ ha + 8 ∨ hc = ha + 4))
CI (Worker h1 ) = (l11 ∧ (y1 = h0 ∨ y1 = hd1 ≤ hb1 ≤ h0 )) ∨
(l21 ∧ h0 ≥ y1 = hd1 ≥ 4 + hb1 )
By using the interaction invariant described in Section 2 and the equality con-
straints
E(γ) from Example 1, after the elimination of the existential quantifiers
in ∃h0 .∃ha .∃hb1 .∃hc .∃hd1 CI (Controller h ) ∧ CI (Worker h1 ) ∧ II (γ) ∧ E(γ) we
obtain the following invariant Φ :
Φ = (l11 ∧ lc0 ∧ (y1 ≤ x)) ∨ (l11 ∧ lc1 ∧ (y1 = x ∨ y1 ≥ x + 4)) ∨ (l21 ∧ lc2 ∧ (y1 = x + 4 ∨ y1 ≥ x + 8)) .
It can be easily checked that Φ ∧ ¬ΨSafe has no satisfying model and this proves
that ΨSafe holds for the system. We used bold fonts in Φ to highlight relations
between x and y1 which are not in CI (Controller ) ∧ CI (Worker 1 ) ∧ II (γ).
To sum up, the basic steps described so far are: (1) extend the input com-
ponents Bi to components with history clocks Bih ; (2) compute component in-
variants CI (Bih ) and (3) equality constraints E(γ) from the interactions γ; (4)
finally, eliminate the history clocks in ∧i CI (Bih ) ∧ E(γ) ∧ II (γ), and obtain a
stronger invariant by means of which the application of (VR) is more successful.
We conclude the section with a remark on the size of E(γ). Due to the com-
bination of recursion and disjunction, E(γ) can be large. Much more compact
formulae can be obtained by exploiting non-conflicting interactions, i.e., inter-
actions that do not share actions.
Proposition 3 For γ = γ1 ∪γ2 with Act(γ1 )∩Act (γ2 ) = ∅, E(γ) ≡ E(γ1 )∧E(γ2 ).
Corollary 4 If the interaction model γ has only disjoint interactions, i.e., for
any α1 , α2 ∈ γ, α1 ∩ α2 = ∅, then E(γ) ≡ ⋀_{α∈γ} ⋀_{ai ,aj ∈α} hai = haj .
where | | stands for absolute values and ka denotes the minimum between the
first occurrence time of a and the minimal time elapse between two consecutive
occurrences of a. It is computed4 locally on the component executing a.
Example 4. In our running example the only shared actions are a and c within
the controller, and both ka and kc are equal to 4, thus the expression of the
separation constraints reduces to:
S(γ) ≡ ⋀_{i≠j} |hc|di − hc|dj | ≥ 4 ∧ ⋀_{i≠j} |ha|bi − ha|bj | ≥ 4.
Proof. By induction on the length of computations. For the base case, we assume
that the initial values of the history clocks for interactions in Γ ∗ are such that
they satisfy S(γ). Obviously, such a satisfying initial model always exists: it
suffices to take all hα with a minimal distance between them greater than the
maximum ka , in an arbitrary order.
For the inductive step, let s be the state reached after i steps, s′ a successor, α an interaction such that s −α→ s′, a an arbitrary action and β ∈ γ such that a ∈ β. For any β′ ≠ α, | hβ − hβ′ | ≥ ka is unchanged from s to s′ (α is the only interaction for which hα is reset from s to s′) and thus holds by induction. We now turn to | hβ − hα |, which at s′ evaluates to hβ . Let sa be the most recent
state reached by an interaction containing a. If no such interaction exists, that
is, if a has no appearance in the i steps to s, let sa be the initial state. On the
path from sa to s′, hβ could not have been reset (otherwise, sa would not be the
most recent one). Thus hβ ≥ ka by the definition of ka .
The invariant S(γ) is defined over the history clocks for interactions. Previ-
ously, the invariant E(γ) has been expressed using history clocks for actions. In
order to “glue” them together in a meaningful way, we need some connection
between history and interaction clocks. This connection is formally addressed by
the constraints E ∗ defined below.
4 For instance, by reduction to a shortest path problem in weighted graphs [14].
From Proposition 7, together with Propositions 5 and 6, it follows that
∃HA ∃Hγ .(∧i CI (Bih ) ∧ II (γ) ∧ E ∗ (γ) ∧ S(γ)) is an invariant of γ Bi . This new in-
variant is in general stronger than ∧i CI (Bih )∧II (γ)∧E(γ) and it provides better
state space approximations for timed systems with conflicting interactions.
Example 5. To get some intuition about the invariant generated using separation
constraints, let us reconsider the running example with two workers. The subfor-
mula which we emphasise here is the conjunction of E ∗ and S. The interaction
inequalities for history clocks are:
E ∗ (γ) ≡ hb1 = ha|b1 ∧ hb2 = ha|b2 ∧ ha = min_{i=1,2} (ha|bi ) ∧
Fischer Protocol: This is a well-studied protocol for mutual exclusion [20]. The
protocol specifies how processes can share a resource one at a time by means
of a shared variable to which each process assigns its own identifier number.
After θ time units, the process with the id stored in the variable enters the
critical state and uses the resource. We use an auxiliary component Id Variable
to mimic the role of the shared variable. To keep the size of the generated
invariants manageable, we restrict to the acyclic version. The system with two
concurrent processes is represented in Figure 4. The property of interest is mutual
exclusion: (csi ∧ csj ) → i = j. The component Id Variable has combinatorial
behavior and a large number of actions (2n + 1), thus the generated invariant
is huge except for very small values of n. To overcome this issue, we extracted
from the structure of the generated invariant a weaker inductive one which we
verified for validity locally with Uppaal. Basically, it encodes information like
heqi < hseti → heqi < heq0 ∧ hseti < heq0 for any index i. This invariant,
together with the component invariants for the processes and together with E(γ)
is sufficient to show that mutual exclusion holds.
Fig. 4 (figure): the Fischer protocol with two concurrent processes, modelled by the components Process1, Id Variable and Process2; locations include wi and csi , with transitions set i (resetting xi ), eq i, and enter i guarded by xi > θ
The experiments were run on a Dell machine with Ubuntu 12.04, an Intel(R)
Core(TM)i5-2430M processor of frequency 2.4GHz×4, and 5.7GiB memory. The
results, synthesised in Table 1, show the potential of our method in terms of
accuracy (no false positives) and scalability. For larger numbers of components,
the size of the resulting invariants was not problematic for Z3. However, it may
be the case that history clocks considerably increase the size of the generated
formulae. It can also be observed that Uppaal being highly optimised, it has
better scores on the first example in particular and on smaller systems in general.
The timings for our prototype are obtained with the Unix command time while
the results for Uppaal come from the command verifyta which comes with the
Uppaal 4.1.14 distribution.
Table 1. Results from Experiments. The marking “∗ ” highlights the cases when E
alone was enough to prove the property. The expressions of form “x + y” are to be read
as “the formula ∧i CI (Bi ) ∧ II (γ) ∧ E (γ), resp. E ∗ (γ) ∧ S(γ), has length x, resp. y”.
Characterizing Algebraic Invariants
by Differential Radical Invariants
Abstract. We prove that any invariant algebraic set of a given polynomial vector
field can be algebraically represented by one polynomial and a finite set of its
successive Lie derivatives. This so-called differential radical characterization re-
lies on a sound abstraction of the reachable set of solutions by the smallest variety
that contains it. The characterization leads to a differential radical invariant proof
rule that is sound and complete, which implies that invariance of algebraic equa-
tions over real-closed fields is decidable. Furthermore, the problem of generating
invariant varieties is shown to be as hard as minimizing the rank of a symbolic
matrix, and is therefore NP-hard. We investigate symbolic linear algebra tools
based on Gaussian elimination to efficiently automate the generation. The ap-
proach can, e.g., generate nontrivial algebraic invariant equations capturing the
airplane behavior during take-off or landing in longitudinal motion.
Keywords: invariant algebraic sets, polynomial vector fields, real algebraic ge-
ometry, Zariski topology, higher-order Lie derivation, automated generation and
checking, symbolic linear algebra, rank minimization, formal verification
1 Introduction
Reasoning about the solutions of differential equations by means of their conserved
functions and expressions is ubiquitous throughout the sciences that study dynamical processes.
It is even crucial in many scientific fields (e.g., control theory or experimental physics),
where a guarantee is required that the behavior of the system remains within a certain pre-
dictable region. In computer science, interest in the automated gener-
ation of these conserved expressions, so-called invariants, has essentially been driven and
motivated by the formal verification of different aspects of hybrid systems, i.e. systems
combining discrete dynamics with differential equations for the continuous dynamics.
The verification of hybrid systems requires ways of handling both the discrete and
continuous dynamics, e.g., by proofs [15], abstraction [21,27], or approximation [10].
Fundamentally, however, the study of the safety of hybrid systems can be shown to
reduce constructively to the problem of generating invariants for their differential equa-
tions [18]. We focus on this core problem in this paper. We study the case of algebraic
This material is based upon work supported by the National Science Foundation under NSF CA-
REER Award CNS-1054246, NSF EXPEDITION CNS-0926181, and grant no. CNS-0931985.
This research is also partially supported by the Defense Advanced Research Projects Agency under
contract no. DARPA FA8750-12-2-0291.
is far from restrictive and many analytic nonalgebraic functions, such as the square
root, the inverse, the exponential or trigonometric functions, can be exactly modeled as
solutions of ordinary differential equations with a polynomial vector field (a concrete
example will be given in Section 6.2).
While algebraic invariant equations are not the only invariants of interest for hybrid
systems [19,17], they are still intimately related to all other algebraic invariants, such
as semialgebraic invariants. We thus believe the characterization we achieve in
this paper to be an important step forward in understanding the invariance problem of
polynomial vector fields, and hence of hybrid systems with polynomial vector fields.
Our results indicate that algebraic geometry is well suited to reason about and effec-
tively compute algebraic invariant equations. Relevant concepts and results from alge-
braic geometry will be introduced and discussed as needed. The proofs of all presented
results are available in [5].
$$\frac{dx_i}{dt} = \dot{x}_i = p_i(x), \qquad 1 \le i \le n, \qquad x(t_\iota) = x_\iota. \tag{1}$$
Since polynomial functions are smooth ($C^\infty$, i.e. they have derivatives of any order),
they are locally Lipschitz-continuous. By the Cauchy-Lipschitz theorem (a.k.a. the Picard-
Lindelöf theorem), there exists a unique maximal solution to the initial value prob-
lem (1), defined on some nonempty open set $U_t \subseteq \mathbb{R}$. A global solution defined for
all $t \in \mathbb{R}$ may not exist in general. For instance, the maximal solution x(t) of the
one-dimensional system $\dot{x} = x^2$, $x(t_\iota) = x_\iota \ne 0$, is defined on $\mathbb{R} \setminus \{t_\iota + x_\iota^{-1}\}$.
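As a quick symbolic check of this example (purely illustrative, not taken from the paper; t_i and x_i stand for $t_\iota$ and $x_\iota$), the closed-form maximal solution and its finite escape time can be verified with sympy:

```python
# Check the claimed maximal solution of xdot = x**2 with x(t_i) = x_i (x_i != 0)
# and locate its finite escape time.
import sympy as sp

t, t_i, x_i = sp.symbols('t t_i x_i')
x_sol = x_i / (1 - x_i*(t - t_i))                  # candidate maximal solution

print(sp.simplify(sp.diff(x_sol, t) - x_sol**2))   # 0: satisfies xdot = x**2
print(sp.simplify(x_sol.subs(t, t_i) - x_i))       # 0: satisfies x(t_i) = x_i
print(sp.solve(sp.denom(sp.together(x_sol)), t))   # pole at t = t_i + 1/x_i (equivalent form)
```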
Algebraic invariant equations for initial value problems are defined as follows.
In Def. 2, the function h(x(t)), and hence the polynomial h(x), depend on the fixed
but unknown initial value $x_\iota$. We implicitly assume this dependency for clearer nota-
tion and will emphasize it whenever needed. Also, observe that h(x(t)), seen as a real-
valued function of time t, is only defined over the open set $U_t \subseteq \mathbb{R}$ since the solution
x(t) is itself only defined over $U_t$. The polynomial function $h : \mathbb{R}^n \to \mathbb{R};\ x \mapsto h(x)$
is, however, defined on all of $\mathbb{R}^n$.
Definition 3 (Orbit). The reachable set, or orbit, of the solution x(t) of Eq. (1) is
defined as $O(x_\iota) \stackrel{\text{def}}{=} \{x(t) \mid t \in U_t\} \subseteq \mathbb{R}^n$.
The complete geometrical characterization of the orbit requires the exact solution
of Eq. (1). Very few initial value problems admit an analytic solution, although a local
approximation can always be given using Taylor series approximations (such an approxi-
mation is for instance used in [10] for the verification of hybrid systems). In this work,
we introduce a sound abstraction of the orbit, O(xι ), using (affine) varieties2 . The idea
is to embed the orbit (which is not a variety in general) in a variety to be defined. The
embedding we will be using is a well-known topological closure operation in algebraic
² In the literature, some authors use the terminology algebraic sets, so that varieties is reserved
for irreducible algebraic sets. Here we will use both terms interchangeably.
geometry called the Zariski closure ([6, Chapter 1]). Varieties, which are sets of points,
can be represented and computed efficiently using their algebraic counterpart: ideals
of polynomials. Therefore, we first recall three useful definitions: an ideal of the ring
R[x], the variety of a subset of R[x], and finally the vanishing ideal of a subset of Rn .
Definition 4 (Ideal). An ideal I is a subset of $\mathbb{R}[x]$ that contains the zero polynomial
(0) and is stable under addition and under external multiplication. That is, for all $h_1, h_2 \in I$, the
sum $h_1 + h_2 \in I$; and if $h \in I$, then $qh \in I$ for all $q \in \mathbb{R}[x]$.
For a finite natural number r, we denote by $\langle h_1, \dots, h_r\rangle$ the subset of $\mathbb{R}[x]$ generated
by the polynomials $\{h_1, \dots, h_r\}$, i.e. the set of linear combinations of the polynomials
$h_i$ (where the coefficients are themselves polynomials):
$$\langle h_1, \dots, h_r\rangle \stackrel{\text{def}}{=} \Big\{ \sum_{i=1}^{r} g_i h_i \;\Big|\; g_1, \dots, g_r \in \mathbb{R}[x] \Big\}.$$
By Def. 4, the set $\langle h_1, \dots, h_r\rangle$ is an ideal. More interestingly, by Hilbert's Basis The-
orem [7], any ideal I of the Noetherian ring $\mathbb{R}[x]$ can be finitely generated by, say,
$\{h_1, \dots, h_r\}$, so that $I = \langle h_1, \dots, h_r\rangle$.
Given $Y \subseteq \mathbb{R}[x]$, the variety (over the reals), V(Y), is the subset of $\mathbb{R}^n$ defined by the
common roots of all polynomials in Y. That is,
$$V(Y) \stackrel{\text{def}}{=} \{ x \in \mathbb{R}^n \mid \forall h \in Y,\ h(x) = 0 \}.$$
The vanishing ideal (over the reals), I(S), of $S \subseteq \mathbb{R}^n$ is the set of all polynomials
that evaluate to zero for all $x \in S$:
$$I(S) \stackrel{\text{def}}{=} \{ h \in \mathbb{R}[x] \mid \forall x \in S,\ h(x) = 0 \}.$$
The Zariski closure $\bar{O}(x_\iota)$ of the orbit $O(x_\iota)$ is defined as the variety of the vanish-
ing ideal of $O(x_\iota)$:
$$\bar{O}(x_\iota) \stackrel{\text{def}}{=} V(I(O(x_\iota))). \tag{3}$$
That is, $\bar{O}(x_\iota)$ is defined as the set of all points that are common roots of all polynomials
that are zero everywhere on the orbit $O(x_\iota)$. The variety $\bar{O}(x_\iota)$ soundly overapprox-
imates all reachable states x(t) in the orbit $O(x_\iota)$, including the initial value $x_\iota$:
Observe that if the set of generators of $I(O(x_\iota))$ contains only the zero polynomial,
i.e. $I(O(x_\iota)) = \langle 0\rangle$, then $\bar{O}(x_\iota) = \mathbb{R}^n$ (the whole space) and the Zariski closure fails
to be informative. For instance, for (non-degenerate) one-dimensional vector fields
(n = 1) that evolve over time, the only univariate polynomial that has infinitely many
roots is the zero polynomial. This points out a limitation of the closure operation used
in this work and raises interesting questions about how to deal with such cases (this will
be left as future work).
The closure operation abstracts time. This means that $\bar{O}(x_\iota)$ defines a subset of
$\mathbb{R}^n$ within which the solution always evolves, without saying anything about where the
system will be at what time (which is what a solution would describe and which is
exactly what the abstraction we are defining here gets rid of). In particular, $\bar{O}(x_\iota)$ is
independent of whether the system evolves forward or backward in time.
Although we know that $I(O(x_\iota))$ is finitely generated, computing all its genera-
tors may be intractable. By the real Nullstellensatz, vanishing ideals over $\mathbb{R}$ are in fact
exactly the real radical ideals [1, Section 4.1]. In real algebraic geometry, real radical
ideals are notoriously hard⁴ to compute. However, we shall see in the sequel that Lie
derivation gives us a powerful computational handle that permits us to tightly approx-
imate (and even compute in some cases) $I(O(x_\iota))$. The Lie derivative of a polynomial
along a vector field is defined as follows.
Definition 5 (Lie Derivative). The Lie derivative of $h \in \mathbb{R}[x]$ along the vector field
$p = (p_1, \dots, p_n)$ is defined by:
$$\mathcal{L}_p(h) \stackrel{\text{def}}{=} \sum_{i=1}^{n} \frac{\partial h}{\partial x_i}\, p_i. \tag{4}$$
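The following small sketch (illustrative only, not the authors' implementation) computes Eq. (4) symbolically with sympy; the vector field used here is the one from the case study in Section 6.

```python
# Lie derivative L_p(h) = sum_i (dh/dx_i) * p_i along a polynomial vector field.
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
X = [x1, x2, x3, x4]
p = [-x2, x1, x4**2, x3*x4]        # p1 = -x2, p2 = x1, p3 = x4^2, p4 = x3*x4

def lie_derivative(h, p, X):
    """Implements Eq. (4)."""
    return sp.expand(sum(sp.diff(h, xi) * pi for xi, pi in zip(X, p)))

h = x1**2 + x2**2 - 1              # candidate invariant polynomial
print(lie_derivative(h, p, X))     # -> 0, so h = 0 is preserved along the flow
```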
Since the ring $\mathbb{R}[x]$ is Noetherian, the chain above necessarily has finite length: the
maximal ideal (in the sense of inclusion), the so-called differential radical ideal⁵ of h,
will be denoted by $\sqrt[\mathcal{L}_p]{h}$. Its order is the smallest N such that:
$$\mathcal{L}_p^{(N)}(h) \in \big\langle \mathcal{L}_p^{(0)}(h), \dots, \mathcal{L}_p^{(N-1)}(h) \big\rangle. \tag{5}$$
Theorem 1 (Differential Radical Characterization). Let $h \in \mathbb{R}[x]$, and let N denote
the order of $\sqrt[\mathcal{L}_p]{h}$. Then, $h \in I(O(x_\iota))$ if and only if
$$\bigwedge_{0 \le i \le N-1} \mathcal{L}_p^{(i)}(h)(x_\iota) = 0. \tag{6}$$
Definition 6 (Invariant Variety). The variety S is an invariant variety for the vector
field p if and only if ∀xι ∈ S, O(xι ) ⊆ S.
Dual to the geometrical point of view in Def. 6, the algebraic point of view is given
by extending the definition of algebraic invariant equation for initial value problems
(Def. 2), to algebraic invariant equation for polynomial vector fields.
Definition 7 (Algebraic Invariant Equation (Vector Field)). The expression h = 0
is an algebraic invariant equation for the vector field p if and only if V (h) is an
invariant variety for p.
Unlike Def. 2, Def. 7 and its geometrical counterpart, Def. 6, correspond to the typ-
ical objects of study in hybrid system verification, as they permit the abstraction of
the continuous part by means of algebraic equations. In the two following sections, we
show how the differential radical characterization (Theorem 1) can be used to address two
particular questions: checking the invariance of a variety candidate (Section 3.1) and
characterizing invariant varieties (Section 3.2).
We will say that the polynomial h is a differential radical invariant (for p) if and
only if $V\big(\sqrt[\mathcal{L}_p]{h}\big)$ is an invariant variety for p.
⁵ The construction of $\sqrt[\mathcal{L}_p]{h}$ is very similar to the construction of the radical of an ideal, except
with higher-order Lie derivatives in place of higher powers of polynomials.
The problem we solve in this section is as follows: given a polynomial vector field p,
can we decide whether the equation h = 0 is an algebraic invariant equation for the
vector field p? Dually, we want to check whether the variety V(h) is invariant for p.
Theorem 2 solves the problem.
Theorem 2. Let $h \in \mathbb{R}[x]$, and let N denote the order of $\sqrt[\mathcal{L}_p]{h}$. Then, V(h) is an
invariant variety for the vector field p (or, equivalently, h = 0 is an algebraic invariant
equation for p) if and only if
$$h = 0 \rightarrow \bigwedge_{1 \le i \le N-1} \mathcal{L}_p^{(i)}(h) = 0. \tag{7}$$
The sound and complete proof rule related to Theorem 2 can be written as follows
(N denotes the order of $\sqrt[\mathcal{L}_p]{h}$):
$$(\text{DRI})\quad \frac{h = 0 \rightarrow \bigwedge_{1 \le i \le N-1} \mathcal{L}_p^{(i)}(h) = 0}{(h = 0) \rightarrow [\dot{x} = p]\,(h = 0)}. \tag{8}$$
Using the naive trick of Eq. (2), the proof rule can, in theory, easily be extended to
check for the invariance of any finite disjunction of conjunctions of algebraic invariant
equations for p. This means that we can check for the invariance of any variety for
p, given its algebraic representation. In practice, however, other techniques, outside
the scope of this paper, should be considered to try to keep the degree of the involved
polynomials as low as possible.
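As a hedged sketch of how rule (8) can be mechanised (this is not the authors' implementation), the check below uses Gröbner-basis reduction in sympy: the order N is approximated by iterating Lie derivatives until the ideal membership of Eq. (5) holds, and each premise conjunct $\mathcal{L}_p^{(i)}(h) = 0$ is discharged by the sufficient condition $\mathcal{L}_p^{(i)}(h) \in \langle h\rangle$; the exact test over the reals would instead use real arithmetic.

```python
# Sufficient syntactic instance of rule (DRI), Eq. (8), via polynomial reduction.
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
X = [x1, x2, x3, x4]
p = [-x2, x1, x4**2, x3*x4]                      # example vector field (Section 6)

def lie(h):
    return sp.expand(sum(sp.diff(h, xi) * pi for xi, pi in zip(X, p)))

def in_ideal(g, gens):
    """Ideal membership g in <gens>, by reduction against a Groebner basis."""
    return sp.groebner(gens, *X, order='lex').reduce(sp.expand(g))[1] == 0

def dri_check(h, max_order=5):
    """Premise of (DRI): h = 0 -> L_p^(i)(h) = 0 for 1 <= i <= N-1."""
    derivs = [h]
    # grow the chain of Lie derivatives until Eq. (5) holds (or max_order is hit)
    while len(derivs) < max_order and not in_ideal(lie(derivs[-1]), derivs):
        derivs.append(lie(derivs[-1]))
    # conservative check of each conjunct: membership in the ideal <h>
    return all(in_ideal(d, [h]) for d in derivs[1:])

print(dri_check(-1 - x3**2 + x4**2))   # True: an invariant equation (cf. Eq. (17))
print(dri_check(-1 + x1*x4))           # False: the falsified candidate of Section 6
```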
Observe how Theorem 3 proves, from the differential radical characterization point of
view, the well-known fact about invariant polynomial functions [17, Theorem 3]: if
$\mathcal{L}_p(h(x)) = 0$, then, for any $c \in \mathbb{R}$, $\sqrt[\mathcal{L}_p]{h(x) - c} = \langle h(x) - c\rangle$, and so $S = V(h(x) - c)$
is an invariant variety for p.
An algebraic invariant equation for p is defined semantically (Def. 7) as a polyno-
mial that evaluates to zero along the solution whenever it is zero initially (i.e. admits $x_\iota$ as a root). Differential radi-
cal invariants are, on the other hand, defined as a structured, syntactically computable
conjunction of polynomial equations involving one polynomial and its successive Lie
derivatives. By Theorem 3, both notions coincide.
The explicit formulation of Eq. (5), namely
$$\mathcal{L}_p^{(N)}(h) = \sum_{i=0}^{N-1} g_i\, \mathcal{L}_p^{(i)}(h), \tag{10}$$
⁶ The degree of the zero polynomial (0) is undefined. We assume in this work that all finite
degrees are acceptable for the zero polynomial.
If h denotes a form of degree d, and d the maximum degree among the degrees of the
$p_i$, then the degree of the polynomial $\mathcal{L}_p^{(k)}(h)$ is given by:
where α and β are decoupled. The matrix $M_{d,N}(\beta)$ is called the matrix representation
of the ideal membership problem $\mathcal{L}_p^{(N)}(h_\alpha) \overset{?}{\in} \big\langle h_\alpha, \dots, \mathcal{L}_p^{(N-1)}(h_\alpha)\big\rangle$.
Recall that the kernel (or null-space) of a matrix $M \in \mathbb{R}^{r \times c}$, with r rows and c
columns, is the subspace of $\mathbb{R}^c$ defined as the preimage of the zero vector $0 \in \mathbb{R}^r$:
$$\ker(M) \stackrel{\text{def}}{=} \{ x \in \mathbb{R}^c \mid Mx = 0 \}.$$
Let $s = \dim(\ker(M_{d,N}(\beta))) \le m_d$. If, for all β, s = 0, then the kernel is {0}.
Hence, α = 0 and, for the chosen N, we have $h_\alpha = 0$: the only differential radi-
cal ideal generated by a form of degree d is the trivial ideal $\langle 0\rangle$. If, however, s ≥ 1,
then, by Theorem 3, we generate an invariant (projective) variety for p. In this case, de-
homogenizing is not always possible. In fact, the constraint on the initial value could
involve $x_0$, which prevents the de-homogenization (see Example 2). Otherwise, we re-
cover an invariant (affine) variety for the original vector field. This is formally stated in
the following theorem.
The remainder of this section discusses our approach to maximize the dimension of
the kernel of Md,N (h), as well as the complexity of the underlying computation.
where the elements of the vector β are in $\mathbb{R}$. If the vector field p has no parameters,
then the entries of the matrix $M_{d,N}(\beta)$ are either elements of β or real numbers. Under
these assumptions, the problem (16) is in PSPACE [2, Corollary 20] over the field of
real numbers⁷, and is at least NP-hard (see [2, Corollary 12] and [8, Theorem 8.2]),
independently of the underlying field. In fact, deciding whether the rank of $M_{d,N}(\beta)$
is less than or equal to a given fixed bound is no harder than deciding the corresponding
existential first-order theory.
On the other hand, there is an NP-hard lower bound for the feasibility of the original
set of (biaffine) equations in β and α given in Eq. (13). In the simpler bilinear case and,
assuming, as above, that the vector field has no parameters, finding a nontrivial solution
(α = 0 is trivial) is also NP-hard [8, Theorems 3.7 and 3.8].
Sound and Precise Algebraic Abstraction of Reachable Sets (Section 2). Unlike pre-
vious work [28,23,12,11], we start by studying algebraic initial value problems. We
propose a sound abstraction (Proposition 1) to embed (overapproximate) the reachable
set. Our abstraction relies on the Zariski closure operator over affine varieties (closed
sets of the Zariski topology), which allows a clean and sound geometrical abstraction.
From there, we define the vanishing ideal of the closure, and give a necessary and suf-
ficient condition (Theorem 1) for a polynomial equation to be an invariant for algebraic
initial value problems.
Checking Invariant Varieties by Differential Radical Invariants (Section 3.1). The
differential radical characterization allows us to check for, and falsify, the invariance of a
variety candidate, unlike already existing proof rules [28,12,17], which are sound but
can only prove a restrictive class of invariants. From Theorem 2, we derive a sound and
complete proof rule (Eq. (8)) and prove that the problem is decidable (Corollary 1) over
the real-closed algebraic fields.
Differential Radical Characterization of Invariant Varieties (Section 3.2). The dif-
ferential radical criterion completely characterizes all invariant varieties of polynomial
vector fields. This new characterization (Theorem 3) makes it possible to relate invariant varieties
to a purely algebraic, well-behaved conjunction of polynomial equations involving one
polynomial and its successive Lie derivatives (Eq. (9)). It naturally generalizes [9,26],
where linear vector fields are handled, and [24,12], where only a restrictive class of in-
variant varieties is considered.
Effective Generation of Invariant Varieties (Section 4). Unlike [28,23,11,22], we do
not use quantifier elimination procedures or Gröbner bases algorithms for the genera-
tion of invariant varieties. We have developed and generalized the use of symbolic linear
algebra tools to effectively generate families of invariant varieties (Theorem 4) and to
soundly overapproximate reachable sets (Proposition 3). In both cases, the problem re-
quires maximizing the dimension of the kernel of a symbolic matrix. The complexity is
shown to be NP-hard, but in PSPACE, for polynomial vector fields without parameters.
We also generalize the previous related work on polynomial-consecution. In particular,
Theorems 2 and 4 in [12] are special cases of, respectively, Theorem 4 and Proposi-
tion 3, when the order of differential radical ideals is exactly 1.
6 Case Studies
The following challenging example comes up as a subsystem we encountered when
studying aircraft dynamics: $p_1 = -x_2$, $p_2 = x_1$, $p_3 = x_4^2$, $p_4 = x_3 x_4$.
It appears frequently whenever Euler angles and the three-dimensional rotation
matrix are used to describe the dynamics of rigid body motions. For some chosen initial
value, such as $x_\iota = (1, 0, 0, 1)$, it is an exact algebraic encoding of the trigonometric
functions: $x_1(t) = \cos(t)$, $x_2(t) = \sin(t)$, $x_3(t) = \tan(t)$, $x_4(t) = \sec(t)$. When
d = 2 and N = 1, the matrix $M_{2,1}(\beta)$ is 35 × 15, with 90 (out of 525) nonzero
elements, and |β| = 5. The maximum dimension of $\ker(M_{2,1}(\beta))$ is 3, attained for
β = 0. The condition of Proposition 3 is satisfied and, for any $x_\iota$, we find the following
algebraic invariant equations for the corresponding initial value problem:
In particular, for the initial value $x_\iota = (1, 0, 0, 1)$, one recovers two trigonometric
identities, namely $\cos(t)^2 + \sin(t)^2 - 1 = 0$ for $h_1$ and $-\tan(t)^2 + \sec(t)^2 - 1 = 0$
for $h_2$.
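For illustration only (this is not the paper's implementation, and it uses the affine, non-homogenized encoding rather than the matrix $M_{2,1}(\beta)$ itself), the $N = 1$, $\beta = 0$ instance of the generation problem can be reproduced by imposing $\mathcal{L}_p(h_\alpha) = 0$ coefficient-wise on a generic polynomial of degree at most 2 and reading off the null space of the resulting linear map:

```python
# Generate degree-<=2 invariant functions of p1=-x2, p2=x1, p3=x4^2, p4=x3*x4
# by solving L_p(h) = 0 for the coefficients of a generic polynomial h.
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
X = [x1, x2, x3, x4]
p = [-x2, x1, x4**2, x3*x4]

# all 15 monomials of total degree <= 2 in x1..x4
monoms = [x1**i * x2**j * x3**k * x4**l
          for i in range(3) for j in range(3) for k in range(3) for l in range(3)
          if i + j + k + l <= 2]
alphas = sp.symbols(f'a0:{len(monoms)}')
h = sum(a * m for a, m in zip(alphas, monoms))

lie_h = sp.expand(sum(sp.diff(h, xi) * pi for xi, pi in zip(X, p)))
# L_p(h) vanishes identically iff every coefficient of lie_h (as a poly in x) is 0
eqs = sp.Poly(lie_h, *X).coeffs()
A, _ = sp.linear_eq_to_matrix(eqs, list(alphas))
for v in A.nullspace():            # 3-dimensional, matching dim ker M_{2,1}(0) = 3
    print(sp.expand(sum(vi * m for vi, m in zip(v, monoms))))
# the printed basis spans {1, x1**2 + x2**2, x4**2 - x3**2} up to linear combination
```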
For N = 3, the matrix $M_{2,3}(\beta)$ is 126 × 15, with 693 (out of 1890) nonzero elements,
and |β| = 55. We found a β for which the dimension of $\ker(M_{2,3}(\beta))$ is 5. By Theo-
rem 4, we have a family of invariant varieties for p encoded by the following differential
radical invariant: $h = \gamma_1 - x_3^2\gamma_2 + x_4^2\gamma_2 + x_2 x_4\gamma_3 + x_1^2\gamma_4 + x_2^2\gamma_4 + x_1 x_4\gamma_5$, where
$\gamma_i$, $1 \le i \le 5$, are real numbers. In particular, when $(\gamma_1, \gamma_2, \gamma_3, \gamma_4, \gamma_5) = (1, 0, 0, 0, 1)$,
we have the following algebraic invariant equation for p:
We stress the fact that Eq. (17) is one algebraic invariant equation for p. In fact, any
conjunct alone, apart from $-1 - x_3^2 + x_4^2 = 0$, of Eq. (17) is not an algebraic invariant
equation for p. Indeed, we can falsify the candidate $-1 + x_1 x_4 = 0$ using Theorem 2:
the implication $-1 + x_1 x_4 = 0 \rightarrow -x_2 x_4 + x_3 = 0$ is obviously false in general.
Notice that $h_1$ and $h_2$ can be found separately by splitting the original vector field
into two separate vector fields, since the pairs $(p_1, p_2)$ and $(p_3, p_4)$ can be decoupled.
However, by decoupling, algebraic invariant equations such as Eq. (17) cannot be found.
This clearly shows that, in practice, splitting the vector field into independent ones
should be done carefully when it comes to generating invariant varieties. This is some-
what counter-intuitive, as decoupling for the purpose of solving is always desirable. In
fact, any decoupling breaks an essential link between all involved variables: time.
We proceed to discuss collision avoidance of two airplanes (Section 6.1) and then the
use of invariant varieties to tightly capture the vertical motion of an airplane
(Section 6.2).
The angular velocities $\omega_1$ and $\omega_2$ can be either zero (straight-line flight) or equal to a
constant ω which denotes the standard rate turn (typically 180° per 2 minutes for usual com-
mercial airplanes). When the two airplanes are manoeuvring with the same standard
rate turn ω, apart from the already known invariants, we discovered the following dif-
ferential radical invariant (which corresponds to a family of invariant varieties):
$$h_1 = \gamma_1 d_1 + \gamma_2 d_2 + \gamma_3 e_1 + \gamma_4 e_2 = 0 \;\wedge\; h_2 = \gamma_2 d_1 - \gamma_1 d_2 + \gamma_4 e_1 - \gamma_3 e_2 = 0,$$
for arbitrary $(\gamma_1, \dots, \gamma_4) \in \mathbb{R}^4$. We have $\sqrt[\mathcal{L}_p]{h_1} = \sqrt[\mathcal{L}_p]{h_2} = \langle h_1, h_2\rangle$. Observe
also that $V(h_1)$ and $V(h_2)$ are not invariant varieties for p.
The full dynamics of an aircraft are often separated (decoupled) into different modes
where the differential equations take a simpler form by either fixing or neglecting the
rate of change of some configuration variables [25]. The first standard separation used in
stability analysis gives two main modes: longitudinal and lateral-directional. We study
the 6th-order longitudinal equations of motion, as they capture the vertical motion (climb-
ing, descending) of an airplane. We believe that a better understanding of the envelope
that soundly contains the trajectories of the aircraft will help tighten the surrounding
safety envelope and hence help trajectory management systems to safely allow denser
traffic around airports. The current safety envelope is essentially a rough cylinder
that does not account for the real capabilities allowed by the dynamics of the airplane.
We use our automated invariant generation techniques to characterize such an envelope.
The theoretical improvement and the effective underlying computation techniques de-
scribed earlier in this work allow us to push further the limits of automated invariant
generation. We first describe the differential equations (vector field) then show the non-
trivial energy functions (invariant functions for the considered vector field) we were
able to generate. Let g denote the gravitational acceleration, m the total mass of the airplane,
M the aerodynamic and thrust moment w.r.t. the y axis, (X, Z) the aerodynamic and
thrust forces w.r.t. the x and z axes, and $I_{yy}$ the second diagonal element of its inertia ma-
trix. The restriction of the nominal flight path of an aircraft to the vertical plane reduces
the full dynamics to the following 6 differential equations [25, Chapter 5] (u: axial
velocity, w: vertical velocity, x: range, z: altitude, q: pitch rate, θ: pitch angle):
$$\dot{u} = \frac{X}{m} - g\sin(\theta) - qw \qquad \dot{x} = \cos(\theta)\,u + \sin(\theta)\,w \qquad \dot{\theta} = q$$
$$\dot{w} = \frac{Z}{m} + g\cos(\theta) + qu \qquad \dot{z} = -\sin(\theta)\,u + \cos(\theta)\,w \qquad \dot{q} = \frac{M}{I_{yy}}.$$
We encode the trigonometric functions using two additional variables for cos(θ) and
sin(θ), making the total number of variables equal to 8. The parameters are considered
unconstrained. Unlike [23], we do not consider them as new time-independent variables,
so that the total number of state variables (n), and hence the degree of the vector field, are
unchanged. Instead, they are carried along the symbolic row-reduction computation as
symbols in $M_{d,N}(\beta)$. For the algebraic encoding of the above vector field (n = 8), the
matrix $M_{3,1}(\beta)$ is 495 × 165, with 2115 (out of 81675) nonzero elements, and |β| = 9.
We were able to automatically generate the following three invariant functions:
$$\frac{Mz}{I_{yy}} + g\theta + \Big(\frac{X}{m} - qw\Big)\cos(\theta) + \Big(\frac{Z}{m} + qu\Big)\sin(\theta),$$
$$\frac{Mx}{I_{yy}} - \Big(\frac{Z}{m} + qu\Big)\cos(\theta) + \Big(\frac{X}{m} - qw\Big)\sin(\theta), \qquad -q^2 + \frac{2M\theta}{I_{yy}}.$$
We substituted the intermediate variables that encode sin and cos back in order to emphasize the
fact that algebraic invariants and algebraic differential systems are suitable for encoding
many complex real-world dynamical systems. Using our Mathematica implementation, the
computation took 1 hour on a recent laptop with 4 GB of memory and a 1.7 GHz Intel Core i5.
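As an independent sanity check (an assumed sketch, not the authors' Mathematica code), the generated functions can be re-verified symbolically: encode cos(θ), sin(θ) by variables c, s with the induced polynomial dynamics and check that each Lie derivative vanishes modulo the relation c² + s² = 1, which the encoding itself preserves.

```python
# Verify the three generated invariant functions along the longitudinal dynamics,
# using the polynomial encoding c = cos(theta), s = sin(theta).
import sympy as sp

u, w, x, z, q, th, c, s = sp.symbols('u w x z q theta c s')
g, m, M, Iyy, X_, Z_ = sp.symbols('g m M I_yy X Z')      # unconstrained parameters

state = [u, w, x, z, q, th, c, s]
field = [X_/m - g*s - q*w,      # u'
         Z_/m + g*c + q*u,      # w'
         c*u + s*w,             # x'
         -s*u + c*w,            # z'
         M/Iyy,                 # q'
         q,                     # theta'
         -s*q,                  # c' = (cos theta)'
         c*q]                   # s' = (sin theta)'

def lie(h):
    return sp.expand(sum(sp.diff(h, v) * f for v, f in zip(state, field)))

invariants = [M*z/Iyy + g*th + (X_/m - q*w)*c + (Z_/m + q*u)*s,
              M*x/Iyy - (Z_/m + q*u)*c + (X_/m - q*w)*s,
              -q**2 + 2*M*th/Iyy]

# reduce each Lie derivative modulo the invariant relation c**2 + s**2 - 1 = 0
print([sp.expand(lie(h).subs(s**2, 1 - c**2)) == 0 for h in invariants])
# expected: [True, True, True]
```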
Acknowledgments. We thank the anonymous reviewers for their careful reading and
detailed comments. We would also like to very much thank Jean-Baptiste Jeannin
and Andrew Sogokon for the multiple questions, various comments and fruitful
objections they both had on an early version of this work. We are finally grateful to
Eric Goubault and Sylvie Putot for the relevant references they pointed out to us
on the integrability theory of nonlinear systems.
7 Conclusion
References
1. Bochnak, J., Coste, M., Roy, M.F.: Real Algebraic Geometry. A series of modern surveys in
mathematics. Springer (2010)
2. Buss, J.F., Frandsen, G.S., Shallit, J.: The computational complexity of some problems of
linear algebra. J. Comput. Syst. Sci. 58(3), 572–596 (1999)
3. Cox, D.A., Little, J., O’Shea, D.: Ideals, Varieties, and Algorithms: An Introduction to Com-
putational Algebraic Geometry and Commutative Algebra. Springer (2007)
4. Dubins, L.E.: On curves of minimal length with a constraint on average curvature, and
with prescribed initial and terminal positions and tangents. American Journal of Mathemat-
ics 79(3), 497–516 (1957)
5. Ghorbal, K., Platzer, A.: Characterizing algebraic invariants by differential radical invariants.
Tech. Rep. CMU-CS-13-129, School of Computer Science, Carnegie Mellon University,
Pittsburgh, PA, 15213 (November 2013),
http://reports-archive.adm.cs.cmu.edu/anon/2013/abstracts/13-129.html
6. Hartshorne, R.: Algebraic Geometry. Graduate Texts in Mathematics. Springer (1977)
7. Hilbert, D.: Über die Theorie der algebraischen Formen. Mathematische Annalen 36(4), 473–
534 (1890)
8. Hillar, C.J., Lim, L.H.: Most tensor problems are NP-hard. J. ACM 60(6), 45 (2013)
9. Lafferriere, G., Pappas, G.J., Yovine, S.: Symbolic reachability computation for families of
linear vector fields. J. Symb. Comput. 32(3), 231–253 (2001)
10. Lanotte, R., Tini, S.: Taylor approximation for hybrid systems. In: Morari, Thiele (eds.) [13],
pp. 402–416
11. Liu, J., Zhan, N., Zhao, H.: Computing semi-algebraic invariants for polynomial dynamical
systems. In: Chakraborty, S., Jerraya, A., Baruah, S.K., Fischmeister, S. (eds.) EMSOFT, pp.
97–106. ACM (2011)
12. Matringe, N., Moura, A.V., Rebiha, R.: Generating invariants for non-linear hybrid systems
by linear algebraic methods. In: Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337,
pp. 373–389. Springer, Heidelberg (2010)
13. Morari, M., Thiele, L. (eds.): HSCC 2005. LNCS, vol. 3414. Springer, Heidelberg (2005)
14. Neuhaus, R.: Computation of real radicals of polynomial ideals II. Journal of Pure and Ap-
plied Algebra 124(13), 261–280 (1998)
15. Platzer, A.: Differential dynamic logic for hybrid systems. J. Autom. Reasoning 41(2), 143–
189 (2008)
16. Platzer, A.: Logical Analysis of Hybrid Systems: Proving Theorems for Complex Dynamics.
Springer, Heidelberg (2010)
17. Platzer, A.: A differential operator approach to equational differential invariants - (invited
paper). In: Beringer, L., Felty, A.P. (eds.) ITP. LNCS, vol. 7406, pp. 28–48. Springer (2012)
18. Platzer, A.: Logics of dynamical systems. In: LICS, pp. 13–24. IEEE (2012)
19. Platzer, A.: The structure of differential invariants and differential cut elimination. Logical
Methods in Computer Science 8(4), 1–38 (2012)
20. Platzer, A., Clarke, E.M.: Computing differential invariants of hybrid systems as fixedpoints.
In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 176–189. Springer, Heidel-
berg (2008)
21. Rodrı́guez-Carbonell, E., Kapur, D.: An abstract interpretation approach for automatic gen-
eration of polynomial invariants. In: Giacobazzi, R. (ed.) SAS 2004. LNCS, vol. 3148, pp.
280–295. Springer, Heidelberg (2004)
22. Rodrı́guez-Carbonell, E., Tiwari, A.: Generating polynomial invariants for hybrid systems.
In: Morari, Thiele (eds.) [13], pp. 590–605
23. Sankaranarayanan, S.: Automatic invariant generation for hybrid systems using ideal fixed
points. In: Johansson, K.H., Yi, W. (eds.) HSCC, pp. 221–230. ACM (2010)
24. Sankaranarayanan, S., Sipma, H.B., Manna, Z.: Constructing invariants for hybrid systems.
Formal Methods in System Design 32(1), 25–55 (2008)
25. Stengel, R.F.: Flight Dynamics. Princeton University Press (2004)
26. Tiwari, A.: Approximate reachability for linear systems. In: Maler, O., Pnueli, A. (eds.)
HSCC 2003. LNCS, vol. 2623, pp. 514–525. Springer, Heidelberg (2003)
27. Tiwari, A.: Abstractions for hybrid systems. Formal Methods in System Design 32(1), 57–83
(2008)
28. Tiwari, A., Khanna, G.: Nonlinear systems: Approximating reach sets. In: Alur, R., Pappas,
G.J. (eds.) HSCC 2004. LNCS, vol. 2993, pp. 600–614. Springer, Heidelberg (2004)
Quasi-Equal Clock Reduction:
More Networks, More Queries
1 Introduction
Real-time systems often use distributed architectures and communication pro-
tocols to exchange data in real-time. Examples of such protocols are the classes
of TDMA-based protocols [1] and EPL-based protocols [2].
Real-time systems can be modelled and verified by using networks of timed
automata [3]. In [4] a technique that reduces the number of clocks that model the
local timing behaviour and synchronisation activity of distributed components is
presented in order to reduce the verification runtime of properties in networks of
timed automata that fulfill a set of syntactical criteria called well-formedness. In
systems implementing, e.g., TDMA or EPL protocols this technique eliminates
the unnecessary verification overhead caused by the interleaving semantics of
timed automata, where the automata reset their clocks one by one at the end
of each communication phase. This interleaving induces sets of reachable inter-
mediate configurations which grow exponentially in the number of components
in the system. Model checking tools like Uppaal [5] explore these configurations
even when they are irrelevant for the property being verified. This exploration
unnecessarily increases the overall memory consumption and the runtime of verifying
the property.
The notion of quasi-equal clocks was presented in [4] to characterise clocks
that evolve at the same rate and whose valuation only differs in unstable phases,
¹ CONACYT (Mexico) and DAAD (Germany) sponsor the work of the first author.
i.e., points in time where these clocks are reset one by one. Sets of quasi-equal
clocks induce equivalence classes in networks of timed automata.
Although the technique introduced in [4] shows promising results for trans-
formed networks, the technique has two severe drawbacks. Namely, it loses all the
information from intermediate configurations and it supports only local queries,
i.e., properties defined over a single timed automaton of well-formed networks. A
concrete consequence of these drawbacks can be observed in the system with
quasi-equal clocks presented in [6] which implements an EPL protocol. In the
transformed model of this system it is not possible to perform the sanity check
that a given automaton receives configuration data from other system compo-
nents right before this automaton resets its quasi-equal clock. The check involves
querying information of several automata from intermediate configurations. Sys-
tem properties are quite often expressed in terms of several automata.
To overcome these limitations, in this work we revisit the reduction of quasi-
equal clocks in networks of timed automata, and we present an approach based
on the following new idea. For each set of quasi-equal clocks we summarise
unstable configurations using dedicated locations of automata introduced during
network transformation. Queries which explicitly refer to unstable configurations
are rewritten to refer to the newly introduced summary location instead. The
dedicated summary locations also allow us to support complex resetting edges
in the original model, i.e. edges with synchronisation or assignments other than
clock resets. This allows us to extend the set of queries that we support. Our new
approach is also a source-to-source transformation, i.e. it can
be used with a wide range of model-checking tools.
Our approach aims to provide the modelling engineer with a system optimi-
sation technique which allows them to model systems naturally, without having
to optimise them for verification. Our contributions are: (1) We now support
properties referring to multiple timed automata, in particular properties which
query (possibly overlapping) unstable configurations. (2) We enlarge the appli-
cability of our new approach by relaxing the well-formedness criteria presented
in [4]. Our approach allows us to prove in a much simpler and more elegant way
(without a need for the reordering lemma from [4]) that transformed networks
are weakly bisimilar to their original counterparts. We show that properties wrt.
an original network are fully preserved in the transformed network, i.e., the
transformed network satisfies a transformed property if and only if the original
network satisfies the original property. We evaluate our approach on six real
world examples, three of them new, where we observe significant improvements
in the verification cost of non-local queries compared to the cost of verifying
them in the original networks.
The paper is organized as follows. In Section 2, we provide basic definitions.
Section 3 introduces the formal definition of well-formed networks and presents
the algorithm that implements our approach. In Section 4, we formalise the
relation of a well-formed network and its transformed network and prove the
correctness of our approach. In Section 5, we compare the verification time of six
real world examples before and after applying our approach. Section 6 concludes.
Related Work. The methods in [7–9] eliminate clocks by using static analysis
over single timed automaton, networks of timed automata and parametric timed
automata, respectively. The approaches in [7, 8] reduce the number of clocks in
timed automata by detecting equal and active clocks. Two clocks are equal in a
location if both are reset by the same incoming edge, so just one clock for each
set of equal clocks is necessary to determine the future behavior of the system.
A clock is active at a certain location if this clock appears in the invariant of
that location, or in the guard of an outgoing edge of such a location, or another
active clock takes its value when taking an outgoing edge. Non-active clocks play
no role in the future evolution of the system and therefore can be eliminated.
In [9] the same principle of active clocks is used in parametric timed automata.
Our benchmarks use at most one clock per component, which is always active;
hence the equal- and active-clock approach is not applicable to them.
The work in [10, 11] uses observers, i.e., single components encoding properties
of a system, to reduce clocks in systems. For each location of the observer, the
technique can deactivate clocks if they do not play a role in the future evolution
of this observer. Processing our benchmarks in order to encode properties as per
the observers approach may be more expensive than our method (one observer
per property), and may not guarantee the preservation of information from in-
termediate configurations which in the case of our EPL benchmark is needed. In
general using observers to characterise non-local queries is not straightforward.
In sequential timed automata [12], one set of quasi-equal clocks is syntacti-
cally declared. Those quasi-equal clocks are implicitly reduced by applying the
sequential composition operator. The work in [13] avoids the use of shared clocks
in networks of timed automata by replacing shared clocks with fresh ones if the
evolution of the automata does not depend on these clocks. This approach
increases the number of clocks (in contrast to ours). Our benchmarks do not
use shared clocks. The approach in [14] detects quasi-equal clocks in networks of
timed automata. Interestingly, the authors demonstrate the feasibility of their
approach in benchmarks that we also use in this paper.
2 Preliminaries
where $\vec{\ell}_{ini} = \langle \ell_{ini,1}, \dots, \ell_{ini,N}\rangle$ and $\nu_{ini}(x) = 0$ for each $x \in \mathcal{X}(A_i)$, $1 \le i \le N$.
A finite or infinite sequence $\sigma = s_0 \xrightarrow{\lambda_1} s_1 \xrightarrow{\lambda_2} s_2 \dots$ is called a transition sequence
(starting in $s_0 \in C_{ini}$) of $\mathcal{N}$. The sequence σ is called a computation of $\mathcal{N}$ if and only
if it is infinite and $s_0 \in C_{ini}$. We denote the set of all computations of $\mathcal{N}$ by
$\Pi(\mathcal{N})$. A configuration s is called reachable (in $T(\mathcal{N})$) if and only if there exists
a computation $\sigma \in \Pi(\mathcal{N})$ such that s occurs in σ.
The set of basic formulae over $\mathcal{N}$ is given by the grammar $\beta ::= \ell \mid \neg\ell \mid \varphi$,
where $\ell \in L(A_i)$, $1 \le i \le n$, and $\varphi \in \Phi(\mathcal{X}(\mathcal{N}), \mathcal{V}(\mathcal{N}))$. A basic formula β is satisfied
by a configuration $s \in Conf(\mathcal{N})$ if and only if $\ell_{s,i} = \ell$, $\ell_{s,i} \ne \ell$, or $\nu_s \models \varphi$, resp. A
reachability query over $\mathcal{N}$ is $\exists\Diamond\, CF$ where CF is a configuration formula
over N , i.e., any logical connection of basic formulae. We use β(CF ) to denote
the set of basic formulae in CF . N satisfies ∃♦ CF , denoted by N |= ∃♦ CF , if
and only if there is a configuration s reachable in T (N ) s.t. s |= CF .
We recall from [4] the following definitions. Given a network $\mathcal{N}$ with clocks
$\mathcal{X}$, two clocks $x, y \in \mathcal{X}$ are called quasi-equal, denoted by $x \simeq y$, if and only
if, for all computation paths of $\mathcal{N}$, the valuations of x and y are equal, or the
valuation of one of them is equal to 0, i.e., if $\forall\, s_0 \xrightarrow{\lambda_1} s_1 \xrightarrow{\lambda_2} s_2 \cdots \in \Pi(\mathcal{N})\ \forall\, i \in
\mathbb{N}_0 \bullet \nu_{s_i} \models (x = 0 \vee y = 0 \vee x = y)$. In the following, we use $EC_\mathcal{N}$ to denote
the set $\{Y \in \mathcal{X}/{\simeq} \mid 1 < |Y|\}$ of equivalence classes of quasi-equal clocks of
$\mathcal{N}$ with at least two elements. For each $Y \in \mathcal{X}/{\simeq}$, we assume a designated
representative denoted by rep(Y). For $x \in Y$, we use rep(x) to denote rep(Y).
Given a constraint $\varphi \in \Phi(\mathcal{X}, \mathcal{V})$, we write $\Gamma(\varphi)$ to denote the constraint that is
obtained by syntactically replacing each occurrence of a clock $x \in \mathcal{X}$ in φ by the
representative rep(x). Given an automaton $A \in \mathcal{N}$, a set of clocks $X \subseteq \mathcal{X}(A)$,
and a set of variables $V \subseteq \mathcal{V}(A)$, we use $SE_X(A)$ to denote the set of simple
resetting edges of A which reset clocks from X, have action τ, no variables occur
in their guards, and do not update any variables, i.e., $SE_X(A) = \{(\ell, \alpha, \varphi, \vec{r}, \ell') \in
E(A) \mid clocks(\vec{r}) \cap X \ne \emptyset \wedge \alpha = \tau \wedge vars(\varphi) = \emptyset \wedge vars(\vec{r}) = \emptyset\}$. We use $CE_X(A)$
to denote the set of complex resetting edges of A which reset clocks from X
and have an action different from τ or update some variables, i.e., $CE_X(A) =
\{(\ell, \alpha, \varphi, \vec{r}, \ell') \in E(A) \mid clocks(\vec{r}) \cap X \ne \emptyset \wedge (vars(\vec{r}) \cap V \ne \emptyset \vee \alpha \ne \tau)\}$.
We use $LS_X(A)$ and $LC_X(A)$ to respectively denote the set of locations (source
and destination) of simple and complex resetting edges wrt. X of A. We use
$E_X(A) = SE_X(A) \cup CE_X(A)$ to denote the set of resetting edges of A which
reset clocks from X, and $RES_X(\mathcal{N})$ to denote the set of automata in $\mathcal{N}$ which
have a resetting edge, i.e., $RES_X(\mathcal{N}) = \{A \in \mathcal{N} \mid E_X(A) \ne \emptyset\}$. A location $\ell$ ($\ell'$)
is called a reset (successor) location wrt. $Y \in EC_\mathcal{N}$ in $\mathcal{N}$ if and only if
there is a resetting edge in $SE_Y(A) \cup CE_Y(A)$ from (to) $\ell$ ($\ell'$). We use $RL_Y(\mathcal{N})$
($RL^+_Y(\mathcal{N})$) to denote the set of reset (successor) locations wrt. Y in $\mathcal{N}$ and we
set $RL_{EC_\mathcal{N}}(\mathcal{N}) := \bigcup_{Y \in EC_\mathcal{N}} RL_Y(\mathcal{N})$ and similarly $RL^+_{EC_\mathcal{N}}(\mathcal{N})$.
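As a small illustration (not part of the paper or its tool; the names and data layout are assumptions), the equivalence classes $EC_\mathcal{N}$, the representatives rep(Y), and the substitution Γ can be computed as follows once quasi-equal pairs are known:

```python
# Build EC_N from detected quasi-equal clock pairs, pick representatives,
# and apply the substitution Gamma to a guard (token-wise, purely syntactic).
from collections import defaultdict

def equivalence_classes(clocks, quasi_equal_pairs):
    parent = {c: c for c in clocks}
    def find(c):                          # union-find with path compression
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c
    for a, b in quasi_equal_pairs:
        parent[find(a)] = find(b)
    classes = defaultdict(set)
    for c in clocks:
        classes[find(c)].add(c)
    return [Y for Y in classes.values() if len(Y) > 1]   # EC_N: classes with |Y| > 1

def gamma(constraint, rep_of):
    """Replace every clock token by its representative rep(x)."""
    return ' '.join(rep_of.get(tok, tok) for tok in constraint.split())

clocks = ['x', 'y', 'z']
EC = equivalence_classes(clocks, [('x', 'y')])
rep_of = {c: min(Y) for Y in EC for c in Y}              # e.g. rep({x, y}) = 'x'
print(EC, gamma('y >= 60 && z < 5', rep_of))             # e.g. [{'x','y'}]  x >= 60 && z < 5
```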
A configuration $s \in Conf(\mathcal{N})$ is called stable wrt. $Y \in EC_\mathcal{N}$ if and only if
all clocks in Y have the same value in s, i.e., if $\forall\, x \in Y \bullet \nu_s(x) = \nu_s(rep(x))$.
We use $SC^Y_\mathcal{N}$ to denote the set of all configurations that are stable wrt. Y and
$SC_\mathcal{N}$ to denote the set $\bigcap_{Y \in EC_\mathcal{N}} SC^Y_\mathcal{N}$ of globally stable configurations of $\mathcal{N}$. Con-
figurations not in $SC_\mathcal{N}$ are called unstable. An edge e of a timed automaton
A in network $\mathcal{N}$ is called delayed if and only if time must pass before e can
be taken, i.e., if $\forall\, s_0 \xrightarrow{\lambda_1}_{E_1} s_1 \dots s_{n-1} \xrightarrow{\lambda_n}_{E_n} s_n \in \Pi(\mathcal{N}) \bullet e \in E_n \implies
\exists\, 0 \le j < n \bullet \lambda_j \in Time \setminus \{0\} \wedge \forall\, j \le i < n \bullet E(A) \cap E_i = \emptyset$, where
we write $s_i \xrightarrow{\lambda_i}_{E_i} s_{i+1}$, $i \in \mathbb{N}_{>0}$, to denote that the transition $s_i \xrightarrow{\lambda_i} s_{i+1}$
is justified by the set of edges $E_i$; $E_i$ is empty for delay transitions, i.e., if
$\lambda_i \in Time$. We say $EC_\mathcal{N}$-reset edges are pre/post delayed in network $\mathcal{N}$ if and
only if all edges originating in reset or reset successor locations are delayed, i.e.,
if $\forall\, e = (\ell, \alpha, \varphi, \vec{r}, \ell') \in E(\mathcal{N}) \bullet \ell \in RL_{EC_\mathcal{N}}(\mathcal{N}) \cup RL^+_{EC_\mathcal{N}}(\mathcal{N}) \implies e$ is delayed.
[Fig. 1: network $\mathcal{N}_1$ with automata $A_1$ (clock x; locations idle, wait, fill; guards $x \ge 50$, $x \ge 60$; updates closed := 0 and x := 0, closed := 1) and $A_2$ (clock y; locations wait, fill; guards $y \ge 40$, $y \ge 60$; reset y := 0); location invariants bound the clocks by 59 and 60. The clocks x and y are quasi-equal.]
(R1) An edge resets at most one clock $x \in Y$, the constraint (guard) of this
edge contains a clause of the form $x \ge C_Y$, and the source location of that
edge has an invariant $x \le C_Y$ for some constant $C_Y > 0$, i.e.,
(R3) For pairs of edges that synchronise on some channel $a \in B(\mathcal{N})$, either all
edges reset a clock from Y, or none of these edges resets a clock from Y, or
the output a! is in one edge resetting a clock from Y, and the inputs a? are
in the edges of automata which do not reset clocks from Y, i.e.,
[Fig. 2: transformed network $\mathcal{N}_1' = \mathcal{K}(\mathcal{N}_1)$. Only the representative clock x remains; $A_1$ and $A_2$ increment/decrement the counters $rst^Y_I$ and $rst^Y_O$ and synchronise with the resetter $R_Y$ over the channel $reset_Y$; $R_Y$ has locations $\ell_{ini,R_Y}$ and $\ell_{nst,Y}$ (invariant $x \le 0$), guards the reset on $x \ge 60 \wedge rst^Y_I = 2$, resets x := 0 and reinitialises the counters.]
otherwise,
– $r_2 = rst^Y_I := rst^Y_I - 1$ if e is from a reset location in $RL_Y(\mathcal{N})$ and $e \notin E_Y(A)$,
and $r_2$ is empty otherwise, and
– $r_3 = rst^Y_O := rst^Y_O - 1$ if $e \in E_Y(A)$, and $r_3$ is empty otherwise.
It initializes the variable $rst^Y_I$ to $i_{L_Y} := |\{A \in \mathcal{N} \mid \ell_{ini,A} \in RL_Y(\mathcal{N})\}|$, i.e. the
number of automata whose initial location is a reset location of Y, and $rst^Y_O$
to $n_Y := |RES_Y(\mathcal{N})|$, i.e. the number of automata that reset the clocks of Y.
There are two locations, with the invariants $I(\ell_{ini,R_Y}) = true$ and
$I(\ell_{nst,Y}) = rep(Y) \le 0$. The set of edges E consists of
Example 1. Applying K to $\mathcal{N}_1$ from Figure 1 yields the network $\mathcal{N}_1'$ (cf. Figure 2).
Similar to the algorithm in [4], only the representative clock of each equivalence
class remains. All guards and invariants with quasi-equal clocks are re-written
to refer to the representative clock, and the reset operation is delegated to the
resetter. The variable rst IY together with well-formedness enforces a blocking
multicast synchronisation between resetter and the automata in RES Y (N ).
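For illustration only (the network description below is a made-up toy, not the paper's data structures), the initial counter values used by the resetter, $i_{L_Y}$ and $n_Y$, are simple counts over the network:

```python
# i_LY = number of automata whose initial location is a reset location of Y;
# n_Y  = number of automata that have a resetting edge for Y (|RES_Y(N)|).
network = {
    'A1': {'init': 'idle', 'reset_locations_Y': {'fill'}, 'has_reset_edge_Y': True},
    'A2': {'init': 'fill', 'reset_locations_Y': {'fill'}, 'has_reset_edge_Y': True},
}
i_LY = sum(A['init'] in A['reset_locations_Y'] for A in network.values())
n_Y  = sum(A['has_reset_edge_Y'] for A in network.values())
print(i_LY, n_Y)   # for this toy network: 1 and 2
```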
In order to support non-local queries, and in particular queries for possibly
overlapping unstable configurations, the approach presented here introduces one
resetter per equivalence class, with two locations each. The location $\ell_{nst,Y}$ rep-
resents all unstable configurations wrt. Y. To support complex edges, and thus
non-trivial behaviour during unstable phases, complex edges are basically split
into two. The first one synchronises with the resetter and the second one carries
out the actions of the original complex edge. As long as the second edge has not
been taken, the system is unstable. The variable rst OY is introduced to indicate
to automaton RY when this unstability finishes. Its value gives the number of
automata which still need to take their reset edge in the current unstable phase.
In $\mathcal{N}_1'$, we have thereby eliminated the interleaving induced by resetting the
clocks x and y in $\mathcal{N}_1$, but the interleaving wrt. variable updates during the reset of
quasi-equal clocks is preserved by splitting the complex edge into two. Note that
in transformed networks, configurations with the locations nst,Y1 , . . . , nst,Yn ,
where 1 < n, reflect overlapping unstable phases, i.e. instability wrt. multiple
equivalence classes at one point in time.
$\cdots \wedge \bigwedge_{\substack{1 \le i \le k,\ 1 \le j \le m,\\ x_j \in X_p \cap Y,\ 1 \le p \le n}} (\tilde{\ell}_i \Rightarrow \tilde{x}_j = C_Y) \;\wedge\; \bigwedge_{\substack{1 \le i \le k,\ 1 \le j \le m,\\ x_j \in X_p \cap Y,\ 1 \le p \le n}} (\tilde{\ell}_i \Rightarrow \tilde{x}_j = 0) \;\wedge\; \bigwedge_{(\ell,\alpha,\varphi,\vec{r},\ell') \in SE_Y(A)} (\tilde{\ell}_i \Rightarrow \cdots)$
For example, for Ω(φ) we obtain, after some simplifications given that $A_2$ has
only simple resetting edges, the following transformed formula:
$$\exists\, \tilde{x} \in \{0, C_Y\} \bullet closed = 1 \wedge \big((x \le 60 \wedge \neg \ell_{nst,Y}) \vee (\ell_{nst,Y} \wedge \tilde{x} \ge 60)\big).$$
Fig. 3. Some involved weak bisimulation cases between the transition system (TS) of a
well-formed network $\mathcal{N}$ and the TS of the network $\mathcal{K}(\mathcal{N}, EC_\mathcal{N})$, with panels "get unstable
by SE-edge", "get unstable by CE-edge", "reset some SE-edges", and "reset all CE-edges".
Dots labelled s and r represent stable configurations, and s̄ and r̄ unstable configurations,
of the two networks, respectively. Arrows represent transitions between configurations of
the same TS. Configurations s and r are in transition simulation if they are linked by a
dotted line.
Table 1. Column XX-N (K) gives the figures for case study XX with N sensors (and
K applied). 'C' gives the number of clocks in the model, 'kStates' the number of visited
states times 10³, 'M' the memory usage in MB, and 't(s)' the verification time in seconds.
FraTTA transformed each of our benchmarks in at most 5 seconds.
(Env.: Intel i3, 2.3 GHz, 3 GB, Ubuntu 11.04, verifyta 4.1.3.4577 with default options.)
Proof. Use Lemma 1 and induction over the length of paths to show that CF
holds in $\mathcal{N}$ if and only if Ω(CF) holds in $\mathcal{K}(\mathcal{N}, EC_\mathcal{N})$.
5 Experimental Results
We applied our approach to six industrial case studies using FraTTA [16], our
implementation of K. The three case studies FS [17], CR [18], CD [19] are from
the class of TDMA protocols and appear in [4]. The relaxed well-formedness cri-
teria (compared to [4]) allowed us to include the three new case studies EP [6],
TT [20], LS [21]. We verified non-local queries as proposed by the respective
authors of these case studies. None of these queries could be verified with the
approach presented in [4]. Our motivating case study is inspired by the network
from [6], which uses the Ethernet PowerLink protocol in Alstom Power Control
Systems. The network consists of N sensors and one master. The sensors ex-
change information with the master in two phases, the first is isochronous and the
second asynchronous. An error occurs if a sensor fails to update the configuration
data as sent by the master in the beginning of the isochronous phase. Specif-
ically, each sensor should update its internal data before the master has reset
its clock. The query configData := ∀ A.configData = 1 ∧ A.x = 0 ∧ M.y > 0,
where A is a sensor and M is the master, x and y are quasi-equal clocks from
the same equivalence class, and configData is a boolean variable set to true by
the edge that resets x when A has successfully updated its configuration data,
checks whether this network is free from errors as explained before. Note that
query configData is non-local and in addition refers to an unstable configuration.
We refer the reader to [17–21] for more information on the other case studies.
Table 1 gives figures for the verification of the non-local queries in instances of the
original and the transformed model. The rows without results indicate the smallest
instances for which we did not obtain results within 24 hours. For all examples ex-
cept for TT, we achieved significant reductions in verification time. The quasi-equal
clocks in the TT model are reset by a broadcast transition so there is no interleaving
of resets in the original model. Still, verification of the transformed TT instances
including transformation time is faster than verification of the original ones. Re-
garding memory consumption, note that verification of the K -models of EP and LS
takes slightly more memory than verification of the original counterparts. We argue
that this is due to all resetting edges being complex in these two networks. Thus,
our transformation preserves the full interleaving of clock resets and the whole set
of unstable locations whose size is exponential in the number of participating au-
tomata, and it adds the transitions to and from location nst . The shown reduction
of the verification time is due to a smaller size of the DBMs that Uppaal uses to
represent zones [22] and whose size grows quadratically in the number of clocks. If
the resetting edges are simple (as in FS, CD, and CR), our transformation removes
all those unstable configurations.
6 Conclusion
Our new technique reduces the verification time of networks of timed automata
with quasi-equal clocks. It represents all clocks from an equivalence class by one
representative, and it eliminates those configurations induced by automata that
reset quasi-equal clocks one by one. All interleaving transitions which are induced
by simple resetting edges are replaced by just two transitions in the transformed
networks. We use nst-locations to summarise unstable configurations. This allows
us to also reduce the runtime of non-local properties or properties explicitly query-
ing unstable phases. With the variables $rst_I$, $rst_O$ we unfold information summarised
in the nst-locations and, together with a careful syntactical transformation of proper-
ties, we reflect all properties of original networks in transformed ones. Our new ap-
proach fixes the two severe drawbacks of [4], which only supports local queries and
whose strong well-formedness conditions rule out many industrial case studies.
Our experiments show the feasibility and potential of the new approach, even if
some interleavings are preserved and only the number of clocks is reduced.
References
1. Rappaport, T.S.: Wireless communications, vol. 2. Prentice Hall (2002)
2. Cena, G., Seno, L., et al.: Performance analysis of ethernet powerlink networks for
distributed control and automation systems. CSI 31(3), 566–572 (2009)
3. Alur, R., Dill, D.: A theory of timed automata. TCS 126(2), 183–235 (1994)
4. Herrera, C., Westphal, B., Feo-Arenis, S., Muñiz, M., Podelski, A.: Reducing quasi-
equal clocks in networks of timed automata. In: Jurdziński, M., Ničković, D. (eds.)
FORMATS 2012. LNCS, vol. 7595, pp. 155–170. Springer, Heidelberg (2012)
5. Behrmann, G., David, A., Larsen, K.G.: A tutorial on uppaal. In: Bernardo, M.,
Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Hei-
delberg (2004)
6. Limal, S., Potier, S., Denis, B., Lesage, J.: Formal verification of redundant media
extension of ethernet powerlink. In: ETFA, pp. 1045–1052. IEEE (2007)
7. Daws, C., Yovine, S.: Reducing the number of clock variables of timed automata.
In: RTSS, pp. 73–81. IEEE (1996)
8. Daws, C., Tripakis, S.: Model checking of real-time reachability properties using
abstractions. In: Steffen, B. (ed.) TACAS 1998. LNCS, vol. 1384, pp. 313–329.
Springer, Heidelberg (1998)
9. André, É.: Dynamic clock elimination in parametric timed automata. In: FSFMA,
OASICS, pp. 18–31, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2013)
10. Braberman, V., Garbervestky, D., Kicillof, N., Monteverde, D., Olivero, A.: Speed-
ing up model checking of timed-models by combining scenario specialization and
live component analysis. In: Ouaknine, J., Vaandrager, F.W. (eds.) FORMATS
2009. LNCS, vol. 5813, pp. 58–72. Springer, Heidelberg (2009)
11. Braberman, V.A., Garbervetsky, D., Olivero, A.: Improving the verification of
timed systems using influence information. In: Katoen, J.-P., Stevens, P. (eds.)
TACAS 2002. LNCS, vol. 2280, pp. 21–36. Springer, Heidelberg (2002)
12. Muñiz, M., Westphal, B., Podelski, A.: Timed automata with disjoint activity. In:
Jurdziński, M., Ničković, D. (eds.) FORMATS 2012. LNCS, vol. 7595, pp. 188–203.
Springer, Heidelberg (2012)
13. Balaguer, S., Chatain, T.: Avoiding shared clocks in networks of timed automata.
In: Koutny, M., Ulidowski, I. (eds.) CONCUR 2012. LNCS, vol. 7454, pp. 100–114.
Springer, Heidelberg (2012)
14. Muñiz, M., Westphal, B., Podelski, A.: Detecting quasi-equal clocks in timed au-
tomata. In: Braberman, V., Fribourg, L. (eds.) FORMATS 2013. LNCS, vol. 8053,
pp. 198–212. Springer, Heidelberg (2013)
15. Olderog, E.-R., Dierks, H.: Real-time systems - formal specification and automatic
verification. Cambridge University Press (2008)
16. Fitriani, K.: FraTTA: Framework for transformation of timed automata, Master
Team Project, Albert-Ludwigs-Universität Freiburg (2013)
17. Dietsch, D., Feo-Arenis, S., et al.: Disambiguation of industrial standards through
formalization and graphical languages. In: RE, pp. 265–270. IEEE (2011)
18. Gobriel, S., Khattab, S., Mossé, D., et al.: RideSharing: Fault tolerant aggregation
in sensor networks using corrective actions. In: SECON, pp. 595–604. IEEE (2006)
19. Jensen, H., Larsen, K., Skou, A.: Modelling and analysis of a collision avoidance
protocol using SPIN and Uppaal. In: 2nd SPIN Workshop (1996)
20. Steiner, W., Elmenreich, W.: Automatic recovery of the TTP/A sensor/actuator
network. In: WISES, pp. 25–37, Vienna University of Technology (2003)
21. Kordy, P., Langerak, R., et al.: Re-verification of a lip synchronization protocol
using robust reachability. In: FMA. EPTCS, vol. 20, pp. 49–62 (2009)
22. Bengtsson, J., Yi, W.: Timed automata: Semantics, algorithms and tools. In: Desel,
J., Reisig, W., Rozenberg, G. (eds.) ACPN 2003. LNCS, vol. 3098, pp. 87–124.
Springer, Heidelberg (2004)
Are Timed Automata Bad for a Specification Language?
Language Inclusion Checking for Timed Automata
Ting Wang¹, Jun Sun², Yang Liu³, Xinyu Wang¹, and Shanping Li¹
¹ College of Computer Science and Technology, Zhejiang University, China
² ISTD, Singapore University of Technology and Design, Singapore
³ School of Computer Engineering, Nanyang Technological University, Singapore
1 Introduction
Timed automata, introduced by Alur and Dill in [2], have emerged as one of the most
popular models to specify and analyze real-time systems. It has been shown that the
reachability problem for timed automata is decidable using the construction of region
graphs [2]. Efficient zone-based methods for checking both safety and liveness proper-
ties have later been developed [14,21]. In [2], it has also been shown that timed automata
in general cannot be determinized, and the language inclusion problem is undecidable,
which “is an obstacle in using timed automata as a specification language”.
In order to avoid undecidability, a number of subclasses of timed automata which are
determinizable (and perhaps serve as a good specification language) have been identi-
fied, e.g., event-clock timed automata [3,17], timed automata restricted to at most one
clock [16] and timed automata with integer resets [18]. Recently, Baier et al. [4] described a
method for determinizing an arbitrary timed automaton which, under a boundedness con-
dition, yields an equivalent deterministic timed automaton in finite time. Furthermore,
they show that the boundedness condition is satisfied by several subclasses of timed au-
tomata which are known to be determinizable. However, the method is based on region
graphs and it is well-known that region graphs are inefficient and lead to state space
explosion. Compared to region graphs, zone graphs are often used in existing tools
for real-time system verification, such as UPPAAL [14] and KRONOS [23]. Zone-based
approaches have also been used to solve problems which are related to the language
inclusion problem, like the universality problem (which asks whether a timed automa-
ton accepts all timed words) for timed automata with one-clock only [1]. However, to
the best of our knowledge, there has not been any zone-based method proposed for
language inclusion checking for arbitrary timed automata.
In this work we develop a zone-based method to solve the language inclusion prob-
lem. Formally, given an implementation timed automaton P and a specification timed
automaton S, the language inclusion problem is to decide whether the language of P
is a subset of that of S. It is known that the problem can be converted to a reachabil-
ity problem on the synchronous product of P and determinization of S [16]. Inspired
by [1,4], the main contribution of this work is that we present a semi-algorithm with a
transformation that determinizes S and constructs the product on-the-fly, where zones
are used as a symbolic representation. Furthermore, simulation relations between the
product states are used, which can be obtained through LU-simulation [5] and Anti-
Chain [22]. With the simulation relations, many product states may be skipped, which
often contributes to the termination of our semi-algorithm.
Our semi-algorithm can be applied to arbitrary timed automata, though it may not always
terminate. To argue that timed automata can nonetheless serve as a specification
language, we investigate when our approach is terminating, both theoretically and em-
pirically. Firstly, we prove that, with the clock boundedness condition [4], we are able to
construct a suitable well-quasi-order on the product state space to ensure termination.
It thus implies that our semi-algorithm is always terminating for subclasses of timed
automata which are known to be determinizable. Furthermore, we prove that for some
classes of timed automata which may violate the boundedness condition, our semi-
algorithm is always terminating as long as there is a well-quasi-order on the abstract
state space explored. Secondly, using randomly generated timed automata, we show
that our approach terminates for many timed automata which are not determinizable
(and which violate the boundedness condition), thanks to the simulation reduction. Thirdly,
we collect a set of commonly used patterns for specifying timed properties [8,12] and
show that our approach is always terminating for all of those properties. Lastly, our
semi-algorithm has been implemented in the PAT [19] framework, and applied to a
number of benchmark systems to demonstrate its effectiveness and scalability.
The remainder of the paper is organized as follows. Section 2 reviews the notions of
timed automata. Section 3 shows how to reduce language inclusion checking to a reach-
ability problem, which is then solved using a zone-based approach. Section 4 reports
the experimental results. Section 5 reviews related work. Section 6 concludes.
2 Background
In this section, we review the relevant background and define the language inclusion
problem. We start with defining labeled transition systems (LTS). An LTS is a tuple
L = (S, Init, Σ, T ), where S is a set of states; Init ⊆ S is a set of initial states;
Σ is an alphabet; and T ⊆ S × Σ × S is a labeled transition relation. A run of L is
a finite sequence of alternating states and events s0, e1, s1, e2, · · ·, en, sn such that
(si, ei+1, si+1) ∈ T for all 0 ≤ i ≤ n − 1. We say the run starts with s0 and ends with sn.
A state s′ is reachable from s iff there is a run starting with s and ending with s′. A state
is always reachable from itself. A run is rooted if it starts with a state in Init. A state
is reachable if there is a rooted run which ends at the state. Given a state s ∈ S and an
event e ∈ Σ, we write post(s, e, L) to denote {s′ | (s, e, s′) ∈ T }. We write post(s, L)
to denote {s′ | ∃e ∈ Σ · (s, e, s′) ∈ T }, i.e., the set of successors of s.
Let F ⊆ S be a set of target states. Given two states s0 and s1 in S, we say that s0
is simulated by s1 with respect to F if s0 ∈ F implies that s1 ∈ F ; and for any e ∈ Σ,
(s0, e, s′0) ∈ T implies there exists (s1, e, s′1) ∈ T such that s′0 is simulated by s′1. In
order to check whether a state in F is reachable, if we know that s is simulated by s′,
then s can be skipped during system exploration if s′ has been explored already. This is
known as simulation reduction [7].
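As a concrete illustration of this idea (a sketch of ours, not taken from the paper), the following Python fragment performs a reachability check over an explicitly given LTS and skips every state that is simulated by an already-explored state; the functions post, in_target and simulated_by are assumed to be supplied by the caller.

```python
from collections import deque

def reachable_target(init_states, post, in_target, simulated_by):
    """Breadth-first reachability check with simulation reduction.

    post(s)            -- iterable of successors of state s
    in_target(s)       -- True iff s belongs to the target set F
    simulated_by(s, t) -- True iff s is simulated by t w.r.t. F
    A state s is skipped when some explored state t simulates it: if a
    target state were reachable from s, one would also be reachable from t.
    """
    explored = []                 # states whose successors were generated
    frontier = deque(init_states)
    while frontier:
        s = frontier.popleft()
        if in_target(s):
            return True
        if any(simulated_by(s, t) for t in explored):
            continue              # simulation reduction: prune s
        explored.append(s)
        frontier.extend(post(s))
    return False
```

On an infinite state space the loop may of course not terminate, which mirrors the semi-algorithm discussed later.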
The original definition of timed automata is that of finite-state Büchi automata [2]
equipped with real-valued clock variables and a Büchi acceptance condition (to enforce
progress). Later, timed safety automata were introduced in [11] which adopt an intu-
itive notion of progress. That is, instead of having accepting states, each state in timed
safety automata is associated with a local timing constraint called a state invariant. An
automaton can stay at a state as long as the valuation of the clocks satisfies the state
invariant. The reader can refer to [9] for the expressiveness of timed safety automata.
In the following, we focus on timed safety automata as they are supported in the state-
of-the-art model checker UPPAAL [14] and are often used in practice. Hereafter, they are
simply referred to as timed automata.
Let R+ be the set of non-negative real numbers. Given a set of clocks C, we define
Φ(C) as the set of clock constraints. Each clock constraint is inductively defined by:
δ := true | x ∼ n | δ1 ∧ δ2 | ¬δ1, where ∼ ∈ {=, ≤, ≥, <, >}, x is a clock in C, and n ∈ R+
is a constant. Without loss of generality, we assume that n is an integer constant. The
set of downward constraints obtained with ∼ ∈ {≤, <} is denoted as Φ≤,<(C). A clock
valuation v for a set of clocks C is a function which assigns a real value to each clock. A
clock constraint can be viewed as the set of clock valuations which satisfy the constraint.
A clock valuation v satisfies a clock constraint δ, written as v ∈ δ, iff δ evaluates to be
true using the clock values given by v. For d ∈ R+ , let v + d denote the clock valuation
v′ s.t. v′(c) = v(c) + d for all c ∈ C. For X ⊆ C, let the clock resetting notation [X ↦ 0]v
denote the valuation v′ such that v′(c) = v(c) for all c ∈ C with c ∉ X, and v′(x) = 0 for
all x ∈ X. We write C = 0 for the clock valuation in which each clock c ∈ C reads 0.
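For concreteness, a minimal sketch (ours, with a hypothetical encoding: a valuation is a Python dictionary from clock names to non-negative reals) of the delay and reset operations just defined:

```python
def delay(v, d):
    """v + d: let d time units elapse, advancing every clock uniformly."""
    return {c: val + d for c, val in v.items()}

def reset(v, resets):
    """[X -> 0]v: set the clocks in X to zero and keep the others unchanged."""
    return {c: (0.0 if c in resets else val) for c, val in v.items()}

def zero_valuation(clocks):
    """C = 0: the valuation in which every clock reads 0."""
    return {c: 0.0 for c in clocks}

# Example: starting from C = 0, after a delay of 3.5 the guard x > 3 holds.
v = zero_valuation({"x", "y"})
v = delay(v, 3.5)
assert v["x"] > 3
v = reset(v, {"y"})
assert v["y"] == 0.0 and v["x"] == 3.5
```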
Formally, a timed automaton is a tuple A = (S, Init, Σ, C, L, T ) where S is a finite
set of states; Init ⊆ S is a set of initial states; Σ is an alphabet; C is a finite set of
clocks; L : S → Φ≤,< (C) labels each state with an invariant; T ⊆ S × Σ × Φ(C) ×
2^C × S is a labeled transition relation. Intuitively, a transition (s, e, δ, X, s′) ∈ T can
be fired if δ is satisfied. After event e occurs, clocks in X are set to zero. The (concrete)
semantics of A is an infinite-state LTS, denoted as C(A) = (Sc , Initc , R+ × Σ, Tc)
such that Sc is a set of configurations of A, each of which is a pair (s, v) where s ∈ S
is a state and v is a clock valuation; Initc = {(s, C = 0)|s ∈ Init} is a set of initial
configurations; and Tc is a set of concrete transitions of the form ((s, v), (d, e), (s′, v′))
such that there exists a transition (s, e, δ, X, s′) ∈ T ; v + d ∈ δ; v + d ∈ L(s);
[X ↦ 0](v + d) = v′; and v′ ∈ L(s′). Intuitively, the system idles for d time units at
state s and then takes the transition (generating event e) to reach state s′. An example
timed automaton is shown in Fig. 1(a). The initial state is p0. The automaton has a state
Fig. 2. Unfolding a timed automaton into an infinite timed tree
n = (s, A), we define A to be {fn (c)|c ∈ Cs }. The initial states Init∞ and transition
relation T∞ are unfolded as follows.
– For any s ∈ Inits, there is a level-0 node n = (s, {z0}) in St∞ with level(n) = 0 and
fn(c) = z0 for all c ∈ Cs.
– For each node n = (s, A) at level i and for each transition (s, e, δ, X, s′) ∈
Ts, we add a node n′ = (s′, A′) at level i + 1 such that fn′(c) = fn(c) if
c ∈ Cs \ X and fn′(c) = zi+1 if c ∈ X; level(n′) = i + 1. We add a transition
(n, e, fn(δ), {zi+1}, n′) to T∞.
Note that transitions at the same level have the same set of resetting clocks, which
contains one clock. Given a node n = (s, A) in the tree, observe that not every clock x
in A is active as the clock may never be used to guard any transition from s. Hereafter,
we assume that inactive clocks are always removed.
– All states in Xs are at the same level and thus all transitions in T∞(e, Xs) have the
same resetting clock. Let Y be that clock and δ′ = ([{Y} ∪ Xp ↦ 0](δ ∧ g ∧ gp))↑.
– The transition from (sp, Xs, δ) to (s′p, X′s, δ′) is labeled with the tuple (e, gp ∧
g, Xp ∪ {Y}).
We illustrate the above using the example in Fig. 3(a). Consider the abstract configu-
ration at level 4, which is (p2, {(s2, {z4}), (s0, {z4, z2})}, 0 ≤ x = z4 < z2 ∧
z2 − x ≤ 3 ∧ z2 − z4 ≤ 3). As shown in Fig. 2, there are two transitions from
state (s2 , {z4 }) which are labeled with event a and one from state (s0 , {z4 , z2 }), which
makes up Ts (a, {(s2 , {z4 }), (s0 , {z4 , z2 })}). The two transitions from (s2 , {z4 }) have
the same guard z4 > 0 and the one from (s0 , {z4 , z2 }) has the guard z4 > 3. The set
Cons(e, Xs ) contains the following constraints: z4 > 0 ∧ z4 > 3, z4 ≤ 0 ∧ z4 > 3,
z4 > 0 ∧ z4 ≤ 3, and z4 ≤ 0 ∧ z4 ≤ 3. Taking the transition from p2 to p1 as an
example, we generate four potential successors, one for each of the constraints in Cons(e, Xs),
as shown above. Two of them are infeasible as the resultant constraints are false. The
remaining two are shown in Fig. 3(a) (the first two from left at level 5). Since z2 is no longer
active for the second successor, the clock constraint of the second successor is modified
to 0 ≤ x = z5 < z4 ∧ z4 − x ≤ 3 ∧ z4 − z5 ≤ 3 so as to remove constraints on z2 .
Similarly, we can generate other configurations in Fig. 3(a).
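As the example suggests, Cons(e, Xs) contains one constraint for every way of choosing, for each guard of a transition in T∞(e, Xs), either the guard itself or its negation. A small sketch of this enumeration (ours; the guards and the formula constructors are opaque placeholders, not the paper's data structures):

```python
from itertools import product

def cons(guards, conjoin, negate):
    """Enumerate the constraints of Cons(e, Xs).

    guards  -- guards of the transitions in T_inf(e, Xs)
    conjoin -- builds the conjunction of a list of formulae
    negate  -- builds the negation of a formula
    For every guard we pick either the guard or its negation and conjoin
    the choices, so 2^|guards| candidate constraints are produced; those
    that simplify to false yield no successor.
    """
    result = []
    for choice in product([True, False], repeat=len(guards)):
        literals = [g if keep else negate(g) for g, keep in zip(guards, choice)]
        result.append(conjoin(literals))
    return result

# With the two guards of the example, z4 > 0 and z4 > 3 (strings here, purely
# for illustration), the four constraints listed above are obtained.
print(cons(["z4 > 0", "z4 > 3"],
           conjoin=lambda ls: " & ".join(ls),
           negate=lambda g: "!(" + g + ")"))
```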
In the following, we reduce the language inclusion checking problem to a reach-
ability problem in Z∞. Notice that one of the constraints in Cons(e, Xs) is the conjunction of
the negations of all guards of transitions in T∞(e, Xs). Let us denote this constraint as neg.
For instance, given the same abstract state in the middle of level 4 in Fig. 3(a), the
constraint neg in Cons(e, Xs ) is: z4 ≤ 0 ∧ z4 ≤ 3, which is equivalent to z4 ≤ 0.
Conjoined with the guard condition x > 0 and the initial constraint 0 ≤ x = z4 <
z2 ∧ z2 − x ≤ 3 ∧ z2 − z4 ≤ 3, it becomes false and hence no successor is generated
for neg. Given neg, assume the corresponding successor is (s′p, X′s, δ′). It is easy to see
that X′s is empty. If δ′ is not false, intuitively there exists a time point such that P can
perform e whereas S cannot, which implies that language inclusion does not hold. Thus, we
have the following theorem.
We have so far reduced the language inclusion checking problem to a reachability prob-
lem in the potentially infinite-state LTS Zr. Next, we reduce the size of Zr by exploring
a simulation relation between states in Zr. We first extend the lower-upper bounds (here-
after LU-bounds) simulation relation defined in [5] to language inclusion checking.
We define two functions L and U . Given a state s in Zr and a clock x ∈ Cp ∪
Z, we perform a depth-first-search to collect all transitions reachable from s without
going through a transition which resets x. Next, we set L(s, x) (resp. U (s, x)) to be the
maximal constant k such that there exists a constraint x > k or x ≥ k (resp. x < k or
x ≤ k) in a guard of those transitions. If such a constant does not exist, we set L(s, x)
(resp. U (s, x)) to −∞. We remark that L(s, x) is always the same as U (s, x) for a clock
in Z because both guard conditions and their negations are used in constructing Zr . For
instance, if we denote the state at level 0 in Fig. 3(b) as s0 , which can be seen as the
initial state in Zr , the function L is then defined such that L(s0 , x) = 3, L(s0 , z0 ) = 3.
Next, we define a relation between two zones using the LU-bounds and show that
the relation constitutes a simulation relation. Given two clock valuations v and v′ at a
state s and the two functions L and U, we write v ⪯LU v′ if for each clock c, either
v′(c) = v(c) or L(s, c) < v′(c) < v(c) or U(s, c) < v(c) < v′(c). Next, given two
zones δ1 and δ2, we write δ1 ⪯LU δ2 to denote that for all v1 ∈ δ1, there is a v2 ∈ δ2
such that v1 ⪯LU v2. The following shows that ⪯LU constitutes a simulation relation.
Lemma 1. Let (s, X, δi) where i ∈ {0, 1} be two states of Zr and F be the set of states
{(s′, ∅, δ′)} in Zr. (s, X, δ1) simulates (s, X, δ0) w.r.t. F if δ0 ⪯LU δ1.
With the above lemma, given an abstract state (s, X, δ) of Zr , we can enlarge the time
constraint δ so as to include all clock valuations which are simulated by some valuations
in δ without changing the result of reachability analysis. In the following, we write
LU(δ) to denote the set {v | ∃v′ ∈ δ · v ⪯LU v′}². We construct an LTS, denoted
as ZrLU, which replaces each state (s, X, δ) in Zr with (s, X, LU(δ)). We denote the
successors of a state ps in ZrLU as post(ps, ZrLU). By a simple argument, we can show
that there is a reachable state (s, ∅, δ) in Zr iff there is a reachable state (s′, ∅, δ′) in
ZrLU . For instance, given the Zr after renaming Z∞ shown in Fig. 3(a), Fig. 3(b) shows
the corresponding ZrLU .
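To make the LU comparison concrete, a small sketch of ours (not from the paper) of the valuation-level check v ⪯LU v′, with valuations and the bounds L, U of a fixed state represented as dictionaries; a missing bound is taken to be −∞ as in the definition:

```python
NEG_INF = float("-inf")

def lu_simulated(v, v_prime, L, U):
    """Check v <=_LU v': for every clock c, either v'(c) = v(c),
    or L(c) < v'(c) < v(c), or U(c) < v(c) < v'(c)."""
    for c in v:
        lo, hi = L.get(c, NEG_INF), U.get(c, NEG_INF)
        if not (v_prime[c] == v[c]
                or lo < v_prime[c] < v[c]
                or hi < v[c] < v_prime[c]):
            return False
    return True

# With L(x) = U(x) = 3, the exact value of a clock above 3 is irrelevant.
assert lu_simulated({"x": 10.0}, {"x": 4.0}, {"x": 3}, {"x": 3})
assert not lu_simulated({"x": 2.0}, {"x": 4.0}, {"x": 3}, {"x": 3})
```

The zone-level relation δ1 ⪯LU δ2 then quantifies this check over the valuations of the two zones.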
Next, we incorporate another simulation relation into our work, which is inspired by
the Anti-Chain algorithm [22]. The idea is that given two abstract states (s, X, δ) and
(s′, X′, δ′) of ZrLU, we can infer a simulation relation by comparing X and X′. One
problem is that states in X and X′ may have different sets of active clocks. The exact
names of the clocks, however, do not matter semantically. In order to compare X and
X′ (and compare δ and δ′), we define clock mappings. A mapping from Act(X′) to
Act(X) is an injective function f : Act(X′) → Act(X) which maps every clock in
Act(X′) to one in Act(X). We write X′ ⊆f X if there exists a mapping f such that
for all (ss′, A′) ∈ X′, there exists (ss, A) ∈ X such that ss = ss′ and for all x ∈ A′,
f(x) ∈ A. Notice that there might be clocks in Act(X) which are not mapped to. We
write range(f) to denote the set of clocks in Act(X) which are mapped to. With an
abuse of notation, given a constraint δ′ constituted by clocks in Act(X′), we write
f(δ′) to denote the constraint obtained by renaming the clocks according to f. We
write δ ⊆f δ′ if δ[range(f)] ⊆ f(δ′), i.e., the clock valuations which satisfy the
constraint δ[range(f)] (obtained by projecting δ onto the clocks in range(f)) satisfy δ′
after clock renaming. Next, we define a relation between two abstract configurations.
We write (s, X, δ) ⊑ (s′, X′, δ′) iff the following are satisfied: s = s′ and there exists
a mapping f such that X′ ⊆f X and δ ⊆f δ′. The next lemma establishes that ⊑ is a
simulation relation.
Lemma 2. Let (s, X, δ) and (s′, X′, δ′) be states in ZrLU. Let F = {(s, ∅, δ0)} be the
set of target states. (s′, X′, δ′) simulates (s, X, δ) w.r.t. F if (s, X, δ) ⊑ (s′, X′, δ′).
For example, let ps0 denote the state at level 1 in Fig. 3(a). Let ps1 denote the bold-
lined state at level 1 and ps2 denote the one at level 5 in Fig. 3(b). With the LU simu-
lation relation, ps0 can be replaced by ps1 . A renaming function f can be defined from
clocks in ps1 to clocks in ps2 , i.e., f (z0 ) = z1 and f (z1 ) = z2 . After renaming, ps1
becomes (p1, {(s1, {z2, z1})}, 0 ≤ x = z2 < z1). Therefore, ps2 ⊑ ps1 and hence we
do not need to explore from ps2 . Similarly, we do not need to explore from the bold-
lined state at level 3 in Fig. 3(b), namely ps3 . Notice that without the LU simulation
reduction ps3 ⊑ ps1 cannot hold, and the successors of ps3 must be explored.
3.4 Algorithm
In the following, we present our semi-algorithm. Let ZrLU be the tuple (S, Init, Σ, T )
where Init is the set {(initp, Inits, LU((Cp = 0 ∧ z0 = 0)↑))}. Algorithm 1 constructs
² Notice that we may not be able to represent this set as a convex time constraint [5].
Next, we establish sufficient conditions for the termination of the semi-algorithm with the
theory of well-quasi-orders (WQO [15]). A quasi-order (QO) on a set A is a pair
(A, ⪯) where ⪯ is a reflexive and transitive binary relation in A × A. A QO is a
WQO if for each infinite sequence a0, a1, a2, . . . composed of elements of A,
there exist i < j such that aj ⪯ ai. Therefore, if a WQO can be found among states
in ZrLU with the simulation relation ⊑, our semi-algorithm terminates, as stated in the
following theorem.
The above theorem implies that our semi-algorithm always terminates given the subclass
of timed automata satisfying the clock boundedness condition [4], including strongly
non-Zeno timed automata, event-clock timed automata and timed automata with integer
resets. That is, if the boundedness condition is satisfied, ZrLU has a bounded number
of clocks, and if the number of clocks is bounded, the set S is obviously finite (with
maximum ceiling zone normalization). Since (S, =) is a WQO if S is finite, by a property
of WQOs, and ‘=’ implies ‘⊑’, (S, ⊑) is also a WQO in this special case. Furthermore,
the theorem also shows that the semi-algorithm is terminating for all single-clock timed
automata, which may not satisfy the boundedness condition, as a WQO for them has been
shown in [1].
4 Evaluation
Our method has been implemented with 46K lines of C# code and integrated into the
PAT model checker [19]³. We remark that in our setting, a zone may not be convex
(for instance, due to negation used in constructing Zr ) and thus cannot be represented
as a single difference bound matrix (DBM). Rather it can be represented either as a
difference bound logic formula, as shown in [3], or as a set of DBMs. In this work, the
latter approach is adopted for efficiency reasons. In the following, we evaluate our
approach in order to answer three research questions. All experiment data are obtained
using a PC with Intel(R) Core(TM) i7-2600 CPU at 3.40 GHz and 8.0 GB RAM.
The first question is: are timed automata good for specifying commonly used timed
properties? That is, if timed automata are used to model the properties, will our semi-
algorithm terminate? In [8,12], the authors summarized a set of commonly used patterns
for real-time properties. Some of the patterns are shown below where a, b, c are events;
x is a clock and h denotes all the other events. Most of the patterns are self-explanatory
and therefore we refer the reader to [8,12] for details. We remark that although the
patterns below are all single-clock timed automata, a specification may be the parallel
composition of multiple patterns and hence have multiple clocks. Observe that all timed
automata below are deterministic except (g). A simple investigation shows that (g) sat-
isfies the clock boundedness condition and hence our semi-algorithm terminates for all
the properties below.
[Timed property patterns: (a) absence, (b) universality, (c) existence, (d) response, (e) precedence-1, (f) precedence-2, (g) chains, (h) occurrence times]
The second question is: is the semi-algorithm useful in practice? That is, given a
real-world system, is it scalable? In the following, we model and verify benchmark
³ PAT and the experiment details can be found at http://www.comp.nus.edu.sg/~pat/refine ta
timed systems using our semi-algorithm and evaluate its performance. The benchmark
systems include Fischer’s mutual exclusion protocol (Fischer for short, similarly here-
inafter), Lynch-Shavit’s mutual exclusion protocol (Lynch), railway control system
(Railway), fiber distributed data interface (FDDI), and CSMA/CD protocol (CSMA).
The results are shown in Table 1. The systems are all built as networks of timed au-
tomata, and the number of processes is shown in column ‘System’. The verified prop-
erties are requirements on the systems specified using the timed patterns. Some of the
properties contain one timed automaton with one clock, while the rest are networks of
timed automata with more than one clock (one clock for each timed automaton). In
the table, column ‘|Cs |’ is the number of clocks (processes) in the specification. The
systems in the same group, e.g., Fischer*6 and Fischer*7 both with |Cs | = 2, have
the same specification. Notice that the number of processes in a system and the one
in the specification can be different because we can ‘hide’ events in the systems and
use h in the specifications as shown in the patterns. Column ‘Det’ shows whether the
specification is deterministic or not. The results of our semi-algorithm are shown in
column ‘⊑ +LU’. In order to show the effectiveness of simulation reduction, we show
the results without ⊑-reduction in column LU and the results without LU-reduction
in column ⊑. For each algorithm, column ‘stored’ denotes the number of stored states;
column ‘total’ denotes the total number of generated states; column ‘time’ denotes the
verification time in seconds. Symbol ‘-’ means either the verification time is more than
2 hours or out-of-memory exception happens. Notice that our semi-algorithm termi-
nates in all cases and all verification results are true. Comparing stored and total, we
can see that many states are skipped due to simulation reduction. From the verification
time we can see that both simulation relations are helpful in reducing the state space.
To the best of our knowledge, there is no existing tool supporting language inclusion
checking of these models.
The last question is: how good are timed automata as a specification language? We
consider a timed automaton specification ‘good’ if, given an implementation model,
our semi-algorithm answers the language inclusion problem conclusively. To answer
this question, we extend the approach on generating non-deterministic finite automata
in [20] to automatically generate random timed automata, and then apply our semi-
algorithm for language inclusion checking. Without loss of generality, a generated timed
automaton always has one initial state and the alphabet is {0, 1}. In addition, the follow-
ing parameters are used to control the random generation process: the number of states
|S|, the number of clocks |C|, a parameter Dt for transition density and a clock ceiling.
For each event in the alphabet, we generate k transitions (and hence the transition den-
sity for the event is Dt = k/|S|) and distribute the transitions randomly among all |S|
states. For each transition, the clock constraint and the resetting clocks are generated
randomly according to the clock ceiling. We remark that if both implementation and
specification models are generated randomly, language inclusion almost always fails.
Thus, in order to have cases where language inclusion does hold, we generate a group
of implementation-specification pairs by generating an implementation first, and then
adding transitions to the implementation to get the specification.
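A sketch of this generation procedure as we read it (the concrete encodings of guards and resets below are hypothetical; the text only fixes the parameters |S|, |C|, Dt and the clock ceiling):

```python
import random

def random_timed_automaton(num_states, num_clocks, density, ceiling,
                           alphabet=("0", "1")):
    """Generate a random timed automaton with the described parameters.

    For each event, k = round(density * num_states) transitions are created
    and distributed randomly over the states; every transition receives a
    random clock constraint below the ceiling and a random set of resets.
    The encoding (integer states, tuple transitions) is illustrative only.
    """
    states = list(range(num_states))
    clocks = ["x%d" % i for i in range(num_clocks)]
    transitions = []
    for event in alphabet:
        k = max(1, round(density * num_states))
        for _ in range(k):
            src, tgt = random.choice(states), random.choice(states)
            guard = (random.choice(clocks),
                     random.choice(["<", "<=", ">", ">=", "=="]),
                     random.randint(0, ceiling))
            resets = frozenset(c for c in clocks if random.random() < 0.5)
            transitions.append((src, event, guard, resets, tgt))
    return {"states": states, "init": 0, "clocks": clocks,
            "alphabet": list(alphabet), "transitions": transitions}

implementation = random_timed_automaton(num_states=4, num_clocks=2,
                                        density=1.0, ceiling=4)
```

A specification could then be derived, as described above, by adding further transitions to such an implementation.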
The experimental results are shown in Table 2⁴. For each combination of
|S|, |C| and Dt, we compute three numbers shown in the form of a \ b \ c. a is the per-
centage of cases in which our semi-algorithm terminates; c is the percentage of the cases
satisfying the boundedness condition (and therefore being determinizable [4]). The gap
between a and c thus shows the effectiveness of our approach on timed automata which
may be non-determinizable. In order to show the effectiveness of simulation reduction,
b is the percentage of cases in which our semi-algorithm terminates without simulation
reduction (and with maximum ceiling zone normalization). We generate 1000 random
pairs to calculate each number. In all cases a > b and b > c, e.g., a is much larger than
⁴ Notice that there are cases where there is only one clock in the specification and yet our semi-
algorithm is not terminating. This is because of using a set of DBMs to represent zones. That
is, because there is no efficient procedure to check whether a zone z is a subset of another
(which is represented as the union of multiple DBMs), the LU-simulation that we discover is
partial and we may unnecessarily explore more states, infinitely more in some cases.
b and c when Dt ≥ 1.0. This result implies that our semi-algorithm terminates even if
the specification may not be ‘determinizable’, which we credit to simulation reduction
and the fact that the semi-algorithm is on-the-fly (so that language inclusion checking
can be done without complete determinization). When transition density increases, the
gap between a and b increases (e.g., when Dt ≥ 1.0, b is always much smaller than
a), which evidences the effectiveness of our simulation reduction. In general, the lower
the density is, the more likely it is that the semi-algorithm terminates. We calculate the
transition density of the timed property patterns and the benchmark systems. We find
that all the events have transition densities less than or equal to 1.0 except the absence
pattern. Based on the results presented in Table 2, we conclude that in practice, our
semi-algorithm has a high probability of terminating. This perhaps supports the view
that timed automata could serve as a good specification language.
5 Related Work
The work in [2] is the first study on the language inclusion checking problem for
timed automata. The work shows that timed automata are not closed under complement,
which is an obstacle in automatically comparing the languages of two timed automata.
Naturally, this conclusion leads to work on identifying determinizable subclasses of
timed automata, with reduced expressiveness. Several subclasses of timed automata
have been identified, i.e., event-clock timed automata [3,17], timed automata with inte-
ger resets [18] or with one clock [16] and strongly non-Zeno timed automata [4].
Our work is inspired by the work in [4] which presents an approach for deciding
when a timed automaton is determinizable. The idea is to check whether the timed
automaton satisfies a clock boundedness condition. The authors show that the condi-
tion is satisfied by event-clock timed automata, timed automata with integer resets and
strongly non-Zeno timed automata. Using region construction, it is shown in [4] that an
equivalent deterministic timed automaton can be constructed if the given timed automa-
ton satisfies the boundedness condition. The work is closely related to [1], in which the
authors proposed a zone-based approach for determinizing timed automata with one
clock. Our work combines [1,4] and extends them with simulation reduction so as to
provide an approach which could be useful for arbitrary timed automata in practice.
In addition, a game-based approach for determinizing timed automata has been pro-
posed in [6,13]. This approach produces an equivalent deterministic timed automaton
or a deterministic over-approximation, which allows one to enlarge the set of timed
automata that can be automatically determinized compared to the one in [4]. In com-
parison, our approach could determinize timed automata which fail the boundedness
condition in [4], and can cover the examples shown in [6]. The work is remotely re-
lated to work in [10]. In particular, it has been shown that under digitization with the
definition of weakly monotonic timed words, whether the language of a closed timed
automaton is included in the language of an open timed automaton is decidable [10].
6 Conclusion
In summary, the contributions of this work are threefold. First, we develop a zone-based
approach for language inclusion checking of timed automata, which is further combined
with simulation reduction for better performance. Second, we investigate, both theoret-
ically and empirically, when the semi-algorithm is terminating. Lastly, we implement the
semi-algorithm in the PAT framework and apply it to benchmark systems. As far as
the authors know, our implementation is the first tool which supports using arbitrary
timed automata as a specification language. More importantly, with the proposed semi-
algorithm and the empirical results, we would like to argue that timed automata do serve
as a specification language in practice. As for future work, we would like to investigate
the language inclusion checking problem with the assumption of non-Zenoness.
References
1. Abdulla, P.A., Ouaknine, J., Quaas, K., Worrell, J.B.: Zone-Based Universality Analysis
for Single-Clock Timed Automata. In: Arbab, F., Sirjani, M. (eds.) FSEN 2007. LNCS,
vol. 4767, pp. 98–112. Springer, Heidelberg (2007)
2. Alur, R., Dill, D.L.: A Theory of Timed Automata. Theoretical Computer Science 126(2),
183–235 (1994)
3. Alur, R., Fix, L., Henzinger, T.A.: Event-clock Automata: A Determinizable Class of Timed
Automata. Theoretical Computer Science 211, 253–273 (1999)
4. Baier, C., Bertrand, N., Bouyer, P., Brihaye, T.: When Are Timed Automata Determiniz-
able? In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.)
ICALP 2009, Part II. LNCS, vol. 5556, pp. 43–54. Springer, Heidelberg (2009)
5. Behrmann, G., Bouyer, P., Larsen, K.G., Pelánek, R.: Lower and Upper Bounds in Zone-
based Abstractions of Timed Automata. International Journal on Software Tools for Tech-
nology Transfer 8(3), 204–215 (2004)
6. Bertrand, N., Stainer, A., Jéron, T., Krichen, M.: A Game Approach to Determinize Timed
Automata. In: Hofmann, M. (ed.) FOSSACS 2011. LNCS, vol. 6604, pp. 245–259. Springer,
Heidelberg (2011)
7. Dill, D.L., Hu, A.J., Wong-Toi, H.: Checking for Language Inclusion Using Simulation Pre-
orders. In: Larsen, K.G., Skou, A. (eds.) CAV 1991. LNCS, vol. 575, pp. 255–265. Springer,
Heidelberg (1992)
8. Gruhn, V., Laue, R.: Patterns for Timed Property Specifications. Electronic Notes in Theo-
retical Computer Science 153(2), 117–133 (2006)
9. Henzinger, T.A., Kopke, P.W., Wong-Toi, H.: The Expressive Power of Clocks. In: Fülöp, Z.
(ed.) ICALP 1995. LNCS, vol. 944, pp. 417–428. Springer, Heidelberg (1995)
10. Henzinger, T.A., Manna, Z., Pnueli, A.: What Good are Digital Clocks? In: Kuich, W. (ed.)
ICALP 1992. LNCS, vol. 623, pp. 545–558. Springer, Heidelberg (1992)
11. Henzinger, T.A., Nicollin, X., Sifakis, J., Yovine, S.: Symbolic Model Checking for Real-
time Systems. Journal of Information and Computation 111(2), 193–244 (1994)
12. Konrad, S., Cheng, B.H.C.: Real-time Specification Patterns. In: ICSE, pp. 372–381 (2005)
13. Krichen, M., Tripakis, S.: Conformance Testing for Real-Time Systems. Formal Methods in
System Design 34(3), 238–304 (2009)
14. Larsen, K.G., Petterson, P., Wang, Y.: UPPAAL in a Nutshell. Journal on Software Tools for
Technology Transfer 1(1-2), 134–152 (1997)
15. Marcone, A.: Foundations of BQO Theory. Transactions of the American Mathematical So-
ciety 345(2), 641–660 (1994)
16. Ouaknine, J., Worrell, J.: On The Language Inclusion Problem for Timed Automata: Closing
a Decidability Gap. In: LICS, pp. 54–63 (2004)
17. Raskin, J., Schobbens, P.: The Logic of Event Clocks - Decidability, Complexity and Expres-
siveness. Journal of Automata, Languages and Combinatorics 4(3), 247–286 (1999)
18. Suman, P.V., Pandya, P.K., Krishna, S.N., Manasa, L.: Timed Automata with Integer Resets:
Language Inclusion and Expressiveness. In: Cassez, F., Jard, C. (eds.) FORMATS 2008.
LNCS, vol. 5215, pp. 78–92. Springer, Heidelberg (2008)
19. Sun, J., Liu, Y., Dong, J.S., Pang, J.: PAT: Towards flexible verification under fairness. In:
Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 709–714. Springer, Heidel-
berg (2009)
20. Tabakov, D., Vardi, M.Y.: Experimental Evaluation of Classical Automata Constructions.
In: Sutcliffe, G., Voronkov, A. (eds.) LPAR 2005. LNCS (LNAI), vol. 3835, pp. 396–411.
Springer, Heidelberg (2005)
21. Tripakis, S.: Verifying progress in timed systems. In: Katoen, J.-P. (ed.) ARTS 1999. LNCS,
vol. 1601, pp. 299–314. Springer, Heidelberg (1999)
22. De Wulf, M., Doyen, L., Henzinger, T.A., Raskin, J.-F.: Antichains: A New Algorithm for
Checking Universality of Finite Automata. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS,
vol. 4144, pp. 17–30. Springer, Heidelberg (2006)
23. Yovine, S.: Kronos: a Verification Tool for Real-time Systems. Journal on Software Tools for
Technology Transfer 1(1-2), 123–133 (1997)
Formal Design of Fault Detection
and Identification Components
Using Temporal Epistemic Logic
1 Introduction
The correct operation of complex critical systems (e.g., trains, satellites, cars)
increasingly relies on the ability to detect when and which faults occur during op-
eration. This function, called Fault Detection and Identification (FDI), provides
information that is vital to drive the containment of faults and their recovery.
This is especially true for fail-operational systems, where the occurrence of faults
should not compromise the ability to carry on critical functions, as opposed to
fail-safe systems, where faults are typically handled by going to a safe state. FDI
is typically carried out by dedicated modules, called FDI components, running
in parallel with the system. An FDI component processes sequences of observa-
tions, made available by predefined sensors, and is required to trigger a set of
predefined alarms in a timely and accurate manner. The alarms are then used
by recovery modules to autonomously guarantee the survival of the system.
Faults are often not directly observable, and their occurrence can only be
inferred by observing the effects that they have on the observable parts of the
system. Moreover, faults may have complex dynamics, and may interact with
each other in complex ways. For these reasons, the design of FDI components
which accounts for the issues of delay in raising the alarms, trace diagnosability,
and maximality. Furthermore, a consistency-based approach [3] is not applica-
ble to the design of FDI: in order to formally verify the effectiveness of an FDI
component as part of an overall fault-management strategy, a formal model of
the FDI component (e.g., as an automaton) is required.
This paper is structured as follows. Section 2 provides some introductory
background. Section 3 formalizes the notion of FDI. Section 4 presents the spec-
ification language. In Section 5 we discuss how to validate the requirements,
and how to verify an FDI component with respect to the requirements. In Sec-
tion 6 we present an algorithm for the synthesis of correct-by-construction FDI
components. The results of evaluating our approach in an industrial setting are
presented in Section 7. Section 8 compares our work with previous related works.
Section 9 concludes the paper with a hint on future work.
2 Background
Plants and FDIs are represented as transition systems. A transition system is a
tuple S = ⟨V, Vo, W, Wo, I, T⟩, where V is the set of state variables, Vo ⊆ V is
the set of observable state variables; W is the set of input variables, Wo ⊆ W is
the set of observable input variables; I is a formula over V defining the initial
states; T is a formula over V, W, V′ (with V′ being the next-state version of the state
variables) defining the transition relation.
A state s is an assignment to the state variables V. We denote with s′ the
corresponding assignment to V′. An input i is an assignment to the input vari-
ables W . The observable part obs(s) of a state s is the projection of s on the
subset Vo of observable state variables. The observable part obs(i) of an input i
is the projection of i on the subset Wo of observable input variables. Given an
assignment a to a set of variables X and X1 ⊆ X, we denote the projection of a
over X1 with a|X1. Thus, obs(s) = s|Vo and obs(i) = i|Wo.
A trace of S is a sequence π = s0 , i1 , s1 , i2 , s2 , . . . of states and in-
puts such that s0 satisfies I and, for each k ≥ 0, ⟨sk, ik+1, sk+1⟩ satis-
fies T. W.l.o.g. we consider infinite traces only. The observable part of π is
obs(π) = obs(s0), obs(i1), obs(s1), obs(i2), obs(s2), . . .. Given a sequence π =
s0, i1, s1, i2, s2, . . . and an integer k ≥ 0, we denote with π^k the finite prefix
s0, i1, . . . , sk of π containing the first k + 1 states. We denote with π[k] the k + 1-
th state sk . We say that s is reachable in S iff there exists a trace π of S such
that s = π[k] for some k ≥ 0. We say that S is deterministic if i) there are no
two distinct initial states s0 and s′0 s.t. obs(s0) = obs(s′0); ii) there are no two transi-
tions ⟨s, i1, s1⟩ and ⟨s, i2, s2⟩ from a reachable state s s.t. obs(i1) = obs(i2) and
obs(s1) ≠ obs(s2).
Let S1 = ⟨V1, Vo1, W1, Wo1, I1, T1⟩ and S2 = ⟨V2, Vo2, W2, Wo2, I2, T2⟩ be two
transition systems with ∅ = (V1 \ Vo1) ∩ V2 = V1 ∩ (V2 \ Vo2) = (W1 \ Wo1) ∩ W2 =
W1 ∩ (W2 \ Wo2). We define the synchronous product S1 × S2 as the transition
system ⟨V1 ∪ V2, Vo1 ∪ Vo2, W1 ∪ W2, Wo1 ∪ Wo2, I1 ∧ I2, T1 ∧ T2⟩. Every state s
of S1 × S2 can be considered as the product s1 × s2 such that s1 = s|V1 is a
state of S1 and s2 = s|V2 is a state of S2.
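A minimal sketch (ours) of the observation projection and of this synchronous product, with each transition system encoded as a dictionary of variable sets and opaque formula objects:

```python
def obs(assignment, observable_vars):
    """obs(s) = s|_Vo: project an assignment (dict from variables to values)
    onto the observable variables."""
    return {x: v for x, v in assignment.items() if x in observable_vars}

def sync_product(s1, s2, conj):
    """Synchronous product S1 x S2 of two symbolic transition systems.

    Each system is a dict with variable sets V, Vo, W, Wo and formula
    objects I, T; conj builds the conjunction of two formulas.  The side
    condition that unobservable variables are not shared is assumed to
    hold and is not re-checked here.
    """
    return {"V": s1["V"] | s2["V"], "Vo": s1["Vo"] | s2["Vo"],
            "W": s1["W"] | s2["W"], "Wo": s1["Wo"] | s2["Wo"],
            "I": conj(s1["I"], s2["I"]), "T": conj(s1["T"], s2["T"])}
```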
3 Formal Framework
3.1 Diagnoser
The first element for the specification of the FDI requirements is given by the
conditions that must be monitored. Here, we distinguish between detection and
identification, which are the two extreme cases of the diagnosis problem; the first
deals with knowing whether a fault occurred in the system, while the second tries
to identify the characteristics of the fault. Between these two cases there can be
intermediate ones: we might want to restrict the detection to a particular sub-
system, or identification among two similar faults might not be of interest.
For example, a data acquisition system composed of a sensor and a filter
might have several possible faults: the sensor might fail in a single way (s_die)
while the filter might fail in two ways (f_die_high or f_die_low). The detection task
is the problem of understanding when (at least) one of the two components has
failed. The identification task tries to understand exactly which fault occurred.
Similarly, e.g., if we can replace the filter whenever it fails, it might suffice to
know that one of f_die_high or f_die_low occurred (this is sometimes called isolation).
FDI components are generally used to recognize faults. However, there is
no reason to restrict our interest to faults. Recovery procedures might differ
depending on the current state of the plant, therefore, it might be important to
consider other unobservable information of the system.
We call the condition of the plant to be monitored diagnosis condition, de-
noted with β. We assume that for any point in time along a trace execution of
the plant (and therefore also of the system), β is either true or false based on
what happened before that time point. Therefore, β can be an atomic condi-
tion (including faults), a sequence of atomic conditions, or Boolean combination
thereof. If β is a fault, the fault must be identified; if β is a disjunction of faults,
instead, it suffices to perform the detection, without identifying the exact fault.
3.4 Diagnosability
Given an alarm condition, we need to know whether it is possible to build a
diagnoser for it. In fact, there is no point in having a specification that cannot
be realized. This property is called diagnosability and was introduced in [5].
In this section, we define the concept of diagnosability for the different types
of alarm conditions. We proceed by first giving the definition of diagnosability in
the traditional way (à la Sampath) in terms of observationally equivalent traces
w.r.t. the diagnosis condition. Then, we prove that a plant P is diagnosable iff
there exists a diagnoser that satisfies the specification. In the following, we will
not provide definitions for finite-delay since they can be obtained by generalizing
the ones for bounded-delay.
A specification that is trace diagnosable in a plant along all points of all traces
is diagnosable in the classical sense, and we say it is system diagnosable.
3.5 Maximality
As shown in Figure 1, bounded- and finite-delay alarms are correct if they are
raised within the valid bound. However, there are several possible variations of
the same alarm in which the alarm is active in different instants or for different
periods. We address this problem by introducing the concept of maximality.
Intuitively, a maximal diagnoser is required to raise the alarms as soon as possible
and as long as possible (without violating the correctness condition).
4 Formal Specification
In this section, we present the Alarm Specification Language with Epistemic
operators (ASLK ). This language allows designers to define requirements on the
FDI alarms including aspects such as delays, diagnosability and maximality.
Diagnosis conditions and alarm conditions are formalized using LTL with past
operators [6] (from here on, simply LTL). The definitions of trace diagnosability
Fig. 4. Formalization of the ASLK patterns. The maximal variant of each pattern extends the non-maximal one with the knowledge conjunct given after "maximal adds":

diag = system:
  ExactDel(A, β, n):  G(A → Y^n β) ∧ G(β → X^n A);  maximal adds G(K Y^n β → A)
  BoundDel(A, β, n):  G(A → O^{≤n} β) ∧ G(β → F^{≤n} A);  maximal adds G(K O^{≤n} β → A)
  FiniteDel(A, β):    G(A → O β) ∧ G(β → F A);  maximal adds G(K O β → A)

diag = trace:
  ExactDel(A, β, n):  G(A → Y^n β) ∧ G((β → X^n K Y^n β) → (β → X^n A));  maximal adds G(K Y^n β → A)
  BoundDel(A, β, n):  G(A → O^{≤n} β) ∧ G((β → F^{≤n} K O^{≤n} β) → (β → F^{≤n} A));  maximal adds G(K O^{≤n} β → A)
  FiniteDel(A, β):    G(A → O β) ∧ G((β → F K O β) → (β → F A));  maximal adds G(K O β → A)
When diag = trace instead, we precondition the completeness to the trace diag-
nosability (as defined in Figure 3.a); this means that the diagnoser will raise an
alarm whenever the diagnosis condition is satisfied and the diagnoser is able to
know it. The formalizations presented in the table can be simplified, but are left
as-is to ease their comprehension. For example, in the case diag = trace, we
do not need to verify the completeness due to the following result:
Theorem 4. Given a diagnoser D for a plant P and a trace diagnosable alarm
condition ϕ, if D is maximal for ϕ, then D is complete.
A similar result holds for ExactDel in the non-maximal case, that becomes:
G(A → Y n β) ∧ G(KY n β → A). Finally, the implications for the completeness in
the trace diagnosability case can be rewritten as, e.g., G((β ∧ F KOβ) → (F A)).
Another interesting result is the following:
Theorem 5. Given a diagnoser D for a plant P and a system diagnosable con-
dition ϕ, if D is maximal for ϕ and ϕ is diagnosable in P then D is complete.
An ASLK specification is built by instantiating the patterns defined in Fig-
ure 4. For example, we would write ExactDelK(A, β, n, trace, True) for an
exact-delay alarm A for β with delay n, that satisfies the trace diagnosability
property and is maximal. An introductory example on the usage of ASLK for
the specification of a diagnoser is provided in [7].
To perform these verification steps, we need in general a model checker for KL1
with synchronous perfect recall such as MCK [10]. However, if the specification
falls in the pure LTL fragment (ASL) we can verify it with an LTL model-checker
such as NuSMV [11] thus benefiting from the efficiency of the tools in this area.
Moreover, a diagnoser is required to be compatible with the plant. Therefore,
we need to take care that the synchronous composition of the plant with the
diagnoser does not reduce the behaviors of the plant. This would imply that
there is a state and an observation that are possible for the plant, but not taken
into account by the diagnoser. Compatibility can be checked with dedicated tools
such as Ticc [12] based on game theory. However, here we require compatibility in
all environments and therefore, compatibility can be checked by model checking
by adding a sink state to the diagnoser, so that if we are in a state and we receive
an observation that was not covered by the original diagnoser, we go to the sink
state. Once we have modified the diagnoser, we verify that D × P |= G¬SinkState.
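A small sketch (ours) of the sink-state completion, under the assumption that the diagnoser is available explicitly as a partial transition function from (state, observation) pairs to states:

```python
SINK = "SinkState"

def complete_with_sink(delta, states, observations):
    """Make the diagnoser's transition function total by redirecting every
    missing (state, observation) pair to a fresh sink state; reaching the
    sink in the product with the plant then witnesses an observation that
    the original diagnoser did not cover."""
    total = dict(delta)
    for s in list(states) + [SINK]:
        for o in observations:
            total.setdefault((s, o), SINK)
    return total
```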
We define B0 = {b | there exists u ∈ 2^{Vo^P} s.t. for all s ∈ b, s |= I^P and
obs(s) = u}: we assume that the diagnoser can be initialized by observing the
plant, and each initial belief state must, therefore, be compatible with one of the
possible initial observations on the plant. The transition function R is defined as
follows: R(b, e) = {s′ | ∃s ∈ b s.t. ⟨s, i, s′⟩ |= T^P, obs(s′) = e|Vo^P, obs(i) = e|Wo^P}:
the belief state b′ = R(b, e) is a successor of b iff all the states in b′ are compatible
with the observations from a state in b.
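The successor computation R(b, e) is essentially a subset construction filtered by the observation. A hypothetical explicit-state sketch, with the plant's transition relation given as a set of (state, input, successor) triples and obs_s, obs_i projecting states and inputs onto their observable parts:

```python
def belief_successor(b, e_input_obs, e_state_obs, transitions, obs_s, obs_i):
    """R(b, e): the states reachable from some state in the belief state b
    by a plant transition whose input and successor agree with the
    observable part of e."""
    return frozenset(
        t for (s, i, t) in transitions
        if s in b and obs_i(i) == e_input_obs and obs_s(t) == e_state_obs)
```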
The diagnoser is obtained by annotating each state of the belief automaton
with the corresponding alarms. To do this we explore the belief automaton,
and annotate with Aϕ all the states b that satisfy the temporal property τ (ϕ):
b |= Aϕ iff ∀s ∈ b . s |= τ(ϕ). It might occur that neither Kτ(ϕ) nor K¬τ(ϕ)
holds in a state. In this case there is at least one state in the belief state in which
τ(ϕ) holds and one in which it does not hold. Such a pair of states represents
uncertainty, and is caused by non-diagnosable traces.
We define Dϕ as the diagnoser for ϕ. For the propositional case τ(ϕ) = p,
Dϕ = ⟨V^Dϕ, Vo^Dϕ, W^Dϕ, Wo^Dϕ, I^Dϕ, T^Dϕ⟩ is a symbolic representation of B(P)
with Vo^Dϕ = Vo^P ∪ {Aϕ}, Wo^Dϕ = Wo^P and such that every state b of Dϕ represents
a state in B (with abuse of notation we do not distinguish between the two) and,
for all v ∈ Vo^Dϕ, v ∈ obs(b) iff for all s ∈ b, v ∈ s, and such that every observation
e of Dϕ represents an observation in E and obs(e) = e|Wo^Dϕ. The following holds:
All other alarm conditions can be reduced to the propositional case. We build
a new plant P′ by adding a monitor variable τ to P s.t. P′ = P × (G(τ(ϕ) ↔ τ)),
where we abuse notation to indicate the automaton that encodes the monitor
variable. By rewriting the alarm condition as ϕ′ = ExactDel(Aϕ, τ, 0), we
obtain that D × P |= ϕ iff D × P′ |= ϕ′.
7 Industrial Experience
The framework described in this paper has been motivated by, and used in, the
AUTOGEF project [1], funded by the European Space Agency. The main goal
of the project was the definition of a set of requirements for an on-board Fault
Detection, Identification and Recovery (FDIR) component and its synthesis. The
problem was tackled by synthesizing the Fault Detection (FDI) and Fault Recov-
ery (FR) components separately, with the idea that the FDI provides sufficient
diagnosis information for the FR to act on.
The AUTOGEF framework was evaluated using scalable benchmark exam-
ples. Moreover, Thales Alenia Space evaluated AUTOGEF on a case study based
on the EXOMARS Trace Gas Orbiter. This case-study is a significant exemplifi-
cation of the framework described in this paper, since it covers all the phases of
the FDIR development process. The system behavior (including faulty behavior)
was modeled using a formal language and table- and pattern-based description
of the mission phases/modes and observability characteristics of the system. The
specification of FDIR requirements by means of patterns greatly simplified the
accessibility of the tool to engineers that were not experts in formal methods.
Specification of alarms was carried out in the case of finite delay, under the
assumption of trace diagnosability and maximality of the diagnoser. Moreover,
different faults and alarms were associated with specific mission phase/mode and
8 Related Work
This paper presents a formal framework for the design of FDI components
that covers many practically-relevant issues such as delays, non-diagnosability
and maximality. The framework is based on a formal semantics provided by
temporal epistemic logic. We covered the specification, validation, verification
and synthesis steps of the FDI design, and evaluated the applicability of each
step on a case-study from aerospace. To the best of our knowledge, this is the
first work that provides a formal and unified view to all the phases of FDI design.
In the future, we plan to explore the following research directions. First, we
will extend FDI to deal with asynchronous and infinite-state systems. In this
work we addressed the development of FDI for finite state synchronous systems
References
1. European Space Agency: ITT AO/1-6570/10/NL/LvH “Dependability Design Ap-
proach for Critical Flight Software”. Technical report (2010)
2. Halpern, J.Y., Vardi, M.Y.: The complexity of reasoning about knowledge and time.
Lower bounds. Journal of Computer and System Sciences 38(1), 195–237 (1989)
3. Grastien, A., Anbulagan, A., Rintanen, J., Kelareva, E.: Diagnosis of discrete-event
systems using satisfiability algorithms. In: AAAI, vol. 1, pp. 305–310 (2007)
4. Rintanen, J., Grastien, A.: Diagnosability testing with satisfiability algorithms. In:
Veloso, M.M. (ed.) IJCAI, pp. 532–537 (2007)
5. Sampath, M., Sengupta, R., Lafortune, S., Sinnamohideen, K., Teneketzis, D.: Failure diag-
nosis using discrete-event models. IEEE Transactions on Control Systems Technology 4, 105–124 (1996)
6. Lichtenstein, O., Pnueli, A., Zuck, L.: The glory of the past. In: Parikh, R. (ed.)
Logics of Programs, vol. 193, pp. 196–218. Springer, Heidelberg (1985)
7. Bozzano, M., Cimatti, A., Gario, M., Tonetta, S.: Formal Specification and Syn-
thesis of FDI through an Example. In: Workshop on Principles of Diagnosis, DX
2013 (2013), https://es.fbk.eu/people/gario/dx2013.pdf
8. Cimatti, A., Roveri, M., Susi, A., Tonetta, S.: Validation of requirements for hy-
brid systems: A formal approach. ACM Transactions on Software Engineering and
Methodology 21(4), 22 (2012)
9. Cimatti, A., Pecheur, C., Cavada, R.: Formal Verification of Diagnosability via
Symbolic Model Checking. In: IJCAI, pp. 363–369 (2003)
10. Gammie, P., van der Meyden, R.: MCK: Model checking the logic of knowledge.
In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 479–483. Springer,
Heidelberg (2004)
11. Cimatti, A., Clarke, E., Giunchiglia, E., Giunchiglia, F., Pistore, M., Roveri, M.,
Sebastiani, R., Tacchella, A.: NuSMV 2: An OpenSource Tool for Symbolic Model
Checking. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp.
359–364. Springer, Heidelberg (2002)
12. Adler, B.T., de Alfaro, L., da Silva, L.D., Faella, M., Legay, A., Raman, V., Roy,
P.: Ticc: A Tool for Interface Compatibility and Composition. In: Ball, T., Jones,
R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 59–62. Springer, Heidelberg (2006)
13. Schumann, A.: Diagnosis of discrete-event systems using binary decision diagrams.
In: Workshop on Principles of Diagnosis (DX 2004), pp. 197–202 (2004)
14. Jiang, S., Kumar, R.: Failure diagnosis of discrete event systems with linear-time
temporal logic fault specifications. IEEE Transactions on Automatic Control, pp.
128–133 (2001)
15. Ezekiel, J., Lomuscio, A., Molnar, L., Veres, S.: Verifying Fault Tolerance and Self-
Diagnosability of an Autonomous Underwater Vehicle. In: IJCAI, pp. 1659–1664
(2011)
16. Huang, X.: Diagnosability in concurrent probabilistic systems. In: Proceedings of
the 2013 International Conference on Autonomous Agents and Multi-agent Systems
(2013)
Monitoring Modulo Theories
1 Introduction
In this paper we consider runtime verification of multi-threaded, object-oriented
systems, representing a major class of today’s practical software. As opposed
to other verification techniques such as model checking or theorem proving,
runtime verification (RV) does not aim at the analysis of the whole system but
on evaluating a correctness property on a particular run, based on log-files or
on-the-fly. To this end, typically a monitor is synthesized from some high-level
specification that is monitoring the run at hand.
In recent years, a variety of synthesis algorithms has been developed, dif-
fering in the underlying expressiveness of the specification formalism and the
resulting monitoring approach. Typically, a variant of linear-time temporal logic
(LTL) is employed as specification language and monitoring is automata-based
or rewriting-based.
Within the setting of multiple, in general arbitrarily many instances of pro-
gram parts, for example in terms of threads or objects, a software engineer is
naturally interested in verifying that the interaction of individual instances fol-
lows general rules. The ability of taking the dynamics of data structures and
values into account is a desirable feature for specification and verification ap-
proaches. As such, the expressiveness of plain propositional temporal logics such
as LTL does not suffice, as they do not allow for specifying complex properties
on data.
In this paper, we enhance traditional runtime verification techniques for propo-
sitional temporal logics by first-order theories for reasoning about data, based
on SMT solvers. In result, we obtain a powerful tool for verifying complex prop-
erties at runtime which exceeds the expressiveness of previous approaches. The
implementation in our tool jUnitRV [1] also shows that the framework is suitable
for practical applications.
Today’s SMT solvers are highly optimized tools that can check the satisfiabil-
ity of formulae over a variety of first-order theories such as arithmetics, arrays,
lists and uninterpreted functions. They allow for reasoning on a large class of data
structures used in modern software systems. We hence aim at integrating their
capabilities with the efficient monitoring approaches for temporal properties. We
formulate example properties showing the specific strength of our framework in
terms of expressiveness. Our benchmarks for monitoring Java programs show
that such specifications can be monitored efficiently.
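To illustrate the kind of data reasoning that is delegated to the SMT solver (Z3 is used here purely as an example; the text does not prescribe a particular solver), consider checking the satisfiability of a small arithmetic constraint of the sort that could appear as an atomic data formula:

```python
# Requires the z3-solver package (pip install z3-solver); Z3 is only one
# possible backend and is not mandated by the approach described here.
from z3 import Int, Solver, sat

x, y = Int("x"), Int("y")
s = Solver()
# "The new value strictly exceeds the previously observed one and stays
# below a threshold" -- a typical constraint over observed data values.
s.add(y > x, y < 100, x == 42)

if s.check() == sat:
    print("satisfiable, e.g.", s.model())
else:
    print("unsatisfiable")
```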
Temporal data logic. In the example, the data logic replaces the propositional
part that LTL is based on. We generically refer to the logic expressing the temporal
aspect as the temporal logic. Our assumptions on the temporal logic are
that it is linear (defined on words) and that it only uses atomic propositions
to “access” the word. For example, the semantics of some temporal operator
must not depend on the current letter directly but only on the semantics of
some proposition. We formally define that requirement in Section 3 but for now
only remark that typical temporal logics like LTL, the linear μ-calculus or the
temporal logic of calls and returns (CaRet) [2] fit into that schema.
Given a suitable temporal logic and data logic, we can define the formalism we aim at. Taking the temporal logic and replacing its atomic propositions by data formulae, we obtain what we call a temporal data logic. The theory and the universe are fixed by the data logic, and the semantics of temporal data logic formulae can thus be defined over a sequence of observations. The free variables are bound universally, so the formula is evaluated over the observation sequence for all possible valuations. The semantics of the formula is the conjunction (more generally, the infimum) of these results.
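As a schematic illustration (our own, not taken from the paper; the predicates spawn and join and the variable x are placeholder names), one such formula and the shape of its universally quantified semantics over an observation sequence γ look as follows, matching Proposition 1 below.

```latex
% Illustration only (ours, not from the paper): a TDL formula with one free
% variable x, and the shape of its semantics over an observation sequence.
\[
  \varphi \;=\; \mathbf{G}\bigl(\mathit{spawn}(x) \rightarrow \mathbf{F}\,\mathit{join}(x)\bigr),
  \qquad
  [\![\varphi]\!]_{\mathrm{TDL}}(\gamma)
    \;=\; \bigwedge_{\theta \in D^{V}} [\![\varphi]\!]_{\mathrm{TL}}\bigl(\pi_{s,\theta}(\gamma)\bigr).
\]
```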
Expanding traces. In [6], a runtime verification approach for the temporal evaluation of integer-modulo-constraints was presented. The underlying logic has a decidable satisfiability problem and the overall approach is anticipatory. However, only limited computations can be followed. To reason about the temporal evolution of data values along some computation, some form of bounded unrolling, as in bounded model checking [7], can be used. For runtime verification, however, such an approach is not suitable, as the observed trace cannot be bounded.
Closely related to our work is that of Chen and Roşu [8]. It considers the setting of sequences of actions which are parameterized by identifiers (IDs). The main idea is to divide the observed sequence of a program into sub-sequences, called slices, each containing only a single ID, and to monitor each slice independently. Hence, in contrast to our approach, no interdependencies between the different slices can be checked. Moreover, our monitoring approach is not limited to plain IDs but allows the user to reason more generally over data in terms of arbitrary (decidable) first-order theories. That work considers a dedicated temporal logic (LTL) together with a dedicated notion of parameters, whereas in our framework an arbitrary linear temporal logic is extended by a first-order theory.
Recently, Bauer et al. presented an approach combining LTL with a variant
of first-order logic for runtime verification [9]. However, their approach restricts
quantification to finite sets always determined in advance by the system observa-
tion. This allows for finitely instantiating quantifiers during monitor execution,
but also profoundly limits the expressiveness of first-order logic. Basically, it is
only possible to evaluate first-order formulae over finite system observations, and
not to express properties in a declarative manner.
2 Preliminaries
First-Order Logic. A signature S = (P, F, ar) consists of finite sets P , F
of predicate and function symbols, respectively, each of some arity defined by
ar : P ∪ F → N. An extension of S is a signature T = (P′, F′, ar′) such that P ⊆ P′, F ⊆ F′, and ar ⊆ ar′.
The syntax of first-order formulae over the signature S is defined in the usual
way using operators ∨ (or), ∧ (and), ¬ (negation), variables x0 , x1 , . . . , predicate
and function symbols p ∈ P , f ∈ F , quantifiers ∀ (universal), ∃ (existential).
Free variables are those not in the scope of any quantifier and are assumed to come from some set V. The set of all first-order formulae over a signature S is denoted
FO[S]. We consider constants as function symbols f with ar(f ) = 0. A sentence
is a formula without free variables.
An S-structure is a tuple s = (U, s) comprising a non-empty universe U and a function s mapping each predicate symbol p ∈ P to a relation ps ⊆ U^n of arity n = ar(p) and each function symbol f ∈ F to a function fs : U^m → U of arity m = ar(f). A T-structure t = (U′, t) is an extension of s if T is an extension of S, U′ = U, and s(r) = t(r) for all symbols r ∈ P ∪ F.
A valuation is a mapping θ : V → U of free variables to values. The set of all such mappings is denoted U^V. The semantics of first-order formulae is defined as usual. We write (s, θ) |= χ if a formula χ is satisfied for a structure s and valuation θ. For sentences, where the valuation can be omitted, we refer to a satisfying structure as a model. The theory T of an S-structure s is the set of all sentences χ such that s is a model of χ.
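To make these definitions concrete, the following sketch (ours; the signature, universe, and formulae are ad-hoc examples) evaluates first-order formulae over a small finite structure by brute force, once for a sentence and once for a formula with a free variable under a valuation.

```python
# Sketch (our illustration, ad-hoc names): a finite S-structure and the
# brute-force evaluation of formulae over it.
U = {0, 1, 2, 3}                                    # universe

structure = {                                       # interpretation of symbols
    "le":   lambda a, b: a <= b,                    # le^s  subset of U x U
    "succ": lambda a: min(a + 1, 3),                # succ^s : U -> U
}

def sentence_holds(s, universe):
    """The sentence:  forall x exists y. le(x, y) and not le(y, x)."""
    le = s["le"]
    return all(any(le(x, y) and not le(y, x) for y in universe)
               for x in universe)

def phi(s, universe, theta):
    """exists y. le(succ(z), y) and not le(y, succ(z)), with free variable z,
    evaluated under a valuation theta : V -> U."""
    le, succ = s["le"], s["succ"]
    z = theta["z"]
    return any(le(succ(z), y) and not le(y, succ(z)) for y in universe)

print(sentence_holds(structure, U))   # False: 3 has no strictly larger element
print(phi(structure, U, {"z": 0}))    # True under theta(z) = 0, e.g. with y = 2
```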
Proposition 1. Let ϕ be a TDL formula, D^V the valuation space for the free variables in ϕ, χ1, . . . , χn the data logic formulae used in ϕ, and AP = {χ1, . . . , χn}. For γ ∈ Γ* we have ⟦ϕ⟧TDL(γ) = ⋀θ∈D^V ⟦ϕ⟧TL(πs,θ(γ)), where the projection πs,θ maps each letter of γ to the set of data formulae from AP that it satisfies under the structure s and the valuation θ.
RLTL and CaRet: Regular and nesting properties. Regular LTL [12] is an extension of LTL based on regular expressions. CaRet [2] is a temporal logic with calls and returns expressing non-regular properties. In addition to the LTL operators, CaRet allows for abstract temporal operators such as Xa and Ga, which move forward by jumping on a word from a calling position to the matching return position, reflecting the intuition of procedure calls. For RLTL and CaRet, monitor constructions have been proposed [6,13]. Although both are more complex, the same arguments as for LTL apply. Example properties are listed in Table 1 and express matching call and return values and nesting-depth bounds.
4 Monitoring
In this section we present our monitoring procedure for TDL formulae. It relies
on the observation made in Proposition 1, namely that the TDL semantics for
an input word γ ∈ Γ ∗ is characterized by the TL semantics for projections of γ.
Any TDL formula can be interpreted as a TL formula by considering all occurring data logic formulae as individual symbols. With this interpretation we can employ the monitor construction for TL to obtain a monitor over a finite alphabet constructed from these symbols.
Definition 5 (Symbolic monitor). Let ϕ be a TDL formula, χ1, . . . , χn the data logic formulae used in ϕ, and AP = {χ1, . . . , χn}. The symbolic alphabet for ϕ is the finite set Σ := 2^AP. The symbolic monitor for ϕ is the monitor MΣ constructed for ϕ interpreted as a TL formula over AP.
The symbolic monitor MΣ for a TDL formula ϕ computes the TL semantics ⟦ϕ⟧TL : Σ* → S over the symbolic alphabet. Following Proposition 1, what remains is to maintain a monitor for each valuation θ ∈ D^V and to individually compute the corresponding projection πθ on the input.
Within this section we present an algorithm for efficiently maintaining these,
in general infinitely many, monitor instances. It uses a data structure, called
constraint tree, that represents finitely many equivalence classes of symbolic
monitors. The constraint tree also allows for easy computation of the infimum
of the outputs of all monitor instances, which is the semantics of the property
on the input trace read so far.
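The sketch below (ours) conveys the underlying idea in a deliberately simplified form: instead of the constraint tree, it lazily creates one propositional monitor instance per data value that has actually been observed and reports the infimum of all verdicts over the ordering false < maybe < true. The toy property and the monitor automaton are our own examples, not the construction used in jUnitRV.

```python
# Sketch (our simplification): one monitor instance per observed data value;
# the overall verdict is the infimum over all instances (false < maybe < true).
FALSE, MAYBE, TRUE = 0, 1, 2
NAME = {0: "false", 1: "maybe", 2: "true"}

class SpawnJoinMonitor:
    """Toy propositional monitor for: every spawn is eventually followed by join."""
    def __init__(self):
        self.pending = False                 # a spawn without a matching join?
    def step(self, event):                   # event in {"spawn", "join", "other"}
        if event == "spawn":
            self.pending = True
        elif event == "join":
            self.pending = False
        # On a finite prefix the property is never violated; it is only
        # settled to 'true so far' when nothing is pending.
        return MAYBE if self.pending else TRUE

class DataMonitor:
    def __init__(self):
        self.instances = {}                  # data value -> monitor instance
    def step(self, event, value):
        self.instances.setdefault(value, SpawnJoinMonitor())
        verdicts = [m.step(event if v == value else "other")
                    for v, m in self.instances.items()]
        return min(verdicts)                 # infimum over all observed valuations

dm = DataMonitor()
for ev, val in [("spawn", 1), ("spawn", 2), ("join", 1)]:
    print(ev, val, "->", NAME[dm.step(ev, val)])
# spawn 1 -> maybe ; spawn 2 -> maybe ; join 1 -> maybe (value 2 still pending)
```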
4.4 Correctness
Proposition 3 (Termination). On a constraint tree T , the function step in
Algorithm 1 terminates and has a running time in O(|T| · |Σ|), where |T| is the number of nodes in T and |Σ| = 2^|AP| is the number of abstract symbols.
The monitoring procedure presented above is correct in that the data monitor
MΓ for a TDL formula ϕ computes the correct semantics for all input words.
Theorem 1 (Correctness). Let ϕ be a TDL formula and MΓ the data mon-
itor for ϕ. Then, for all γ ∈ Γ ∗ , MΓ (γ) = ϕTDL (γ).
In order to prove correctness, we first make some observations. Recall that the semantics ⟦ϕ⟧TDL(γ) can be represented as the conjunction ⋀θ∈D^V ⟦ϕ⟧TL(πs,θ(γ)) over the projections πs,θ(γ) (Proposition 1). We fix the data logic DL for this section and write πθ for πs,θ in the following.
We can now prove that the data monitor computes the correct semantics.
5 Experimental Results
(Plots: runtime in ms over the number of monitoring steps (up to 1·10⁴) for the benchmarks Counter, Mutex, Server2, and Iterator, each with the EQ and Z3 back-ends, as well as Velocity and Server with the Z3 back-end.)
example the maximal size of the constraint tree was six. All experiments were
carried out on an Intel i5 (750) CPU.
6 Conclusion
With the combination of propositional temporal logics and first-order theories,
the framework we propose in this paper allows for a precise, yet high-level and
universal formulation of behavioural properties. This helps the user to avoid
modeling errors by formulating specifications describing a system on a higher
level of abstraction than required for an actual implementation.
The clear separation of the aspects of time and data allows for efficient run-
time verification as the different aspects are handled separately in terms of a
symbolic monitor construction and solving satisfiability for first-order theories.
The independent application of techniques from monitoring and SMT solving
benefits from improvements in both fields.
Our implementation and the experimental evaluation show that the approach
is applicable in the setting of object-oriented systems and that the runtime
overhead is reasonably small. Note that this holds even though the properties expressible in our framework are hard to analyze: the satisfiability problem, for example, is already undecidable for the combination of LTL and the very basic theory of identities.
References
1. Decker, N., Leucker, M., Thoma, D.: jUnitRV – Adding Runtime Verification to jUnit. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871, pp. 459–464. Springer, Heidelberg (2013)
2. Alur, R., Etessami, K., Madhusudan, P.: A temporal logic of nested calls and
returns. In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp.
467–481. Springer, Heidelberg (2004)
3. Stolz, V., Bodden, E.: Temporal assertions using AspectJ. Electr. Notes Theor.
Comput. Sci. (2006)
4. Goldberg, A., Havelund, K.: Automated runtime verification with Eagle. In:
MSVVEIS. INSTICC Press (2005)
5. Barringer, H., Rydeheard, D.E., Havelund, K.: Rule systems for run-time monitor-
ing: From Eagle to RuleR. In: Sokolsky, O., Taşıran, S. (eds.) RV 2007. LNCS,
vol. 4839, pp. 111–125. Springer, Heidelberg (2007)
6. Dong, W., Leucker, M., Schallhart, C.: Impartial anticipation in runtime-
verification. In: Cha, S(S.), Choi, J.-Y., Kim, M., Lee, I., Viswanathan, M. (eds.)
ATVA 2008. LNCS, vol. 5311, pp. 386–396. Springer, Heidelberg (2008)
7. Biere, A., Clarke, E., Raimi, R., Zhu, Y.: Verifying safety properties of a PowerPC™ microprocessor using symbolic model checking without BDDs. In: Halbwachs, N., Peled, D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 60–71. Springer, Heidelberg (1999)
8. Chen, F., Roşu, G.: Parametric trace slicing and monitoring. In: Kowalewski, S.,
Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 246–261. Springer, Hei-
delberg (2009)
9. Bauer, A., Küster, J.-C., Vegliach, G.: From propositional to first-order monitoring.
In: Legay, A., Bensalem, S. (eds.) RV 2013. LNCS, vol. 8174, pp. 59–75. Springer,
Heidelberg (2013)
10. Bauer, A., Leucker, M., Schallhart, C.: Monitoring of real-time properties. In: Arun-
Kumar, S., Garg, N. (eds.) FSTTCS 2006. LNCS, vol. 4337, pp. 260–272. Springer,
Heidelberg (2006)
11. Bauer, A., Leucker, M., Schallhart, C.: Runtime verification for LTL and TLTL.
ACM Trans. Softw. Eng. Methodol. (2011)
12. Leucker, M., Sánchez, C.: Regular linear temporal logic. In: Jones, C.B., Liu, Z.,
Woodcock, J. (eds.) ICTAC 2007. LNCS, vol. 4711, pp. 291–305. Springer, Heidel-
berg (2007)
13. Decker, N., Leucker, M., Thoma, D.: Impartiality and anticipation for monitoring of
visibly context-free properties. In: Legay, A., Bensalem, S. (eds.) RV 2013. LNCS,
vol. 8174, pp. 183–200. Springer, Heidelberg (2013)
14. Bauer, A., Leucker, M., Schallhart, C.: The good, the bad, and the ugly, but how
ugly is ugly? In: Sokolsky, O., Taşıran, S. (eds.) RV 2007. LNCS, vol. 4839, pp.
126–138. Springer, Heidelberg (2007)
15. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
Temporal-Logic Based Runtime Observer Pairs
for System Health Management
of Real-Time Systems⋆,⋆⋆
T. Reinbacher, K.Y. Rozier, and J. Schumann
⋆ A full version with appendices containing full proofs of correctness for all observer algorithms is available at http://research.kristinrozier.com/TACAS14.html. This work was supported in part by the Austrian Research Agency FFG, grant 825891, and NASA grant NNX08AY50A.
⋆⋆ The rights of this work are transferred to the extent transferable according to title 17 U.S.C. 105.
1 Introduction
Autonomous and automated systems, including Unmanned Aerial Systems (UAS), rovers, and satellites, have a large number of components, e.g., sensors, actuators, and
software, that must function together reliably at mission time. System Health Manage-
ment (SHM) [17] can detect, isolate, and diagnose faults and possibly initiate recovery
activities on such real-time systems. Effective SHM requires assessing the status of the
system with respect to its specifications and estimating system health during mission
time. Johnson et al. [17, Ch.1] recently highlighted the need for new, formal-methods
based capabilities for modeling complex relationships among different sensor data and
reasoning about timing-related requirements; computational expense prevents the cur-
rent best methods for SHM from meeting operational needs.
We need a new SHM framework for real-time systems like the Swift [16] electric
UAS (see Fig. 1), developed at NASA Ames. SHM for such systems requires:
RESPONSIVENESS: the SHM framework must continuously monitor the system. Devi-
ations from the monitored specifications must be detected within a tight and a priori
known time bound, enabling mitigation or rescue measures, e.g., a controlled emer-
gency landing to avoid damage on the ground. Reporting intermediate status and satis-
faction of timed requirements as early as possible is required for enabling responsive
decision-making.
UNOBTRUSIVENESS: the SHM framework must not alter crucial properties of the sys-
tem including functionality (not change behavior), certifiability (avoid re-certification
of flight software/hardware), timing (not interfere with timing guarantees), and toler-
ances (not violate size, weight, power, or telemetry bandwidth constraints). Utilizing commercial-off-the-shelf (COTS) and previously proven system components is absolutely required to meet today's tight time and budget constraints; adding the SHM framework to the system must not alter these components, as changes that require them to be re-certified would cancel out the benefits of their use. Our goal is to create the most effective SHM capability with the limitation of read-only access to the data from COTS components.
REALIZABILITY: the SHM framework must be usable in a plug-and-play manner by providing a generic interface to connect to a wide variety of systems. The specification language must be easily understood and expressive enough to encode, e.g., temporal relationships and flight rules. The framework must adapt to new specifications without lengthy re-compilation. We must be able to efficiently monitor different requirements during different mission stages, like takeoff, approach, measurement, and return.
results [6,11], restrictions of LTL to its past-time fragment have most often been used for
RV. Though specifications including past time operators may be natural for some other
domains [19], flight rules require future-time reasoning. To enable more intuitive spec-
ifications, others have studied monitoring of future-time claims; see [22] for a survey
and [5, 11, 14, 21, 27, 28] for algorithms and frameworks. Most of these observer algo-
rithms, however, were designed with a software implementation in mind and require a
powerful computer. There are many hardware alternatives, e.g., [12]; however, all of them either resynthesize monitors from scratch or exclude checking real-time properties [2]. Our
unique approach runs the logic synthesis tool once to synthesize as many real-time ob-
server blocks as we can fit on our platform, e.g., FPGA or ASIC; our Sec. 4.1 only inter-
connects these blocks. Others have proposed using Bayesian inference techniques [10]
to estimate the health of a system. However, modeling timing-related behavior with dy-
namic Bayesian networks is very complex and quickly renders practical implementa-
tions infeasible.
We propose a new paired-observer SHM framework allowing systems like the Swift
UAS to assess their status against a temporal logic specification while enabling advanced
health estimation, e.g., via discrete Bayesian networks (BN) [10] based reasoning. This
novel combination of two approaches, often seen as orthogonal to each other, enables
us to check timing-related aspects with our paired observers while keeping BN health
models free of timing information, and thus computationally attractive. Essentially, we
can enable better real-time SHM by utilizing paired temporal observers to optimize BN-
based decision making. Following our requirements, we call our new SHM framework for real-time systems rt-R2U2 (real-time, Realizable, Responsive, Unobtrusive Unit).
Our rt-R2U2 synthesizes a pair of observers for a real-time specification ϕ given in
Metric Temporal Logic (MTL) [1] or a specialization of LTL for mission-time bounded
characteristics, which we define in Sec. 2. To ensure RESPONSIVENESS of our rt-R2U2,
we design two kinds of observer algorithms in Sec. 3 that verify whether ϕ holds at a
discrete time and run them in parallel. Synchronous observers have small hardware footprints (max. eleven two-input gates per operator; see Theorem 3 in Sec. 4) and return an instant, three-valued abstraction ({true, false, maybe}) of the satisfaction check of ϕ with every new tick of the Real Time Clock (RTC), while their asynchronous counter-
parts concretize this abstraction at a later, a priori known time. This unique approach al-
lows us to signal early failure and acceptance of every specification whenever possible
via the asynchronous observer. Note that previous approaches to runtime monitoring signal only specification failures; signaling acceptance, and particularly early acceptance, is unique to our approach and is required for supporting other system components
such as prognostics engines or decision making units. Meanwhile, our synchronous ob-
server’s three-valued output gives intermediate information that a specification has not
yet passed/failed, enabling probabilistic decision making via a Bayesian Network as
described in [26].
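A minimal sketch (ours; it abstracts away all hardware detail) of how the paired outputs could be combined: the synchronous observer contributes an instant three-valued verdict for every tick of the RTC, and the asynchronous observer later overwrites 'maybe' entries with its exact, delayed verdict.

```python
# Sketch (ours): combining the synchronous three-valued verdict stream with
# the delayed, exact verdicts of the asynchronous observer.
MAYBE = "maybe"

class PairedVerdicts:
    def __init__(self):
        self.by_time = {}                    # time stamp -> current verdict
    def sync(self, n, verdict):              # verdict in {True, False, MAYBE}
        self.by_time[n] = verdict            # instant, possibly 'maybe'
    def resolve(self, n, verdict):           # exact result, known later
        self.by_time[n] = verdict            # concretizes the abstraction

pv = PairedVerdicts()
pv.sync(11, MAYBE)      # synchronous observer cannot decide yet at n = 11
pv.sync(12, False)      # but it can sometimes decide (fail) early
pv.resolve(11, True)    # asynchronous observer resolves n = 11 later
print(pv.by_time)       # {11: True, 12: False}
```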
We implement the rt-R2U2 in hardware as a self-contained unit, which runs
externally to the system, to support U NOBTRUSIVENESS; see Sec. 4. Safety-critical
embedded systems often use industrial, vehicle bus systems, such as CAN and PCI,
interconnecting hardware and software components, see Fig 1. Our rt-R2U2 provides
(Fig. 1, top: block diagram of the Swift UAS subsystems (flight computer, laser altimeter, barometric altimeter, radio link, IMU & GPS) connected via the common bus interface (ϕ) to the rt-R2U2 with its event capture & RTC, runtime observers, specification, health model, and higher-level reasoning (BN), producing health estimation and system status. Bottom: timeline of the predicates en ⊧ (alt ≥ 600 ft), en ⊧ (pitch ≥ 5°), and en ⊧ (cmd == takeoff) over time stamps n = 0, . . . , 30.)
Fig. 1. rt-R2U2: An instance of our SHM framework rt-R2U2 for the NASA Swift UAS. Swift
subsystems (top): The laser altimeter maps terrain and determines elevation above ground by
measuring the time for a laser pulse to echo back to the UAS. The barometric altimeter deter-
mines altitude above sea level via atmospheric pressure. The inertial measurement unit (IMU)
reports velocity, orientation (yaw, pitch, and roll), and gravitational forces using accelerometers,
gyroscopes, and magnetometers. Running example (bottom): predicates over Swift UAS sensor
data on execution e; ranging over the readings of the barometric altimeter, the pitch sensor, and
the takeoff command received from the ground station; n is the time stamp as issued by the
Real-Time-Clock.
access v of such an element. We say Tϕ holds if Tϕ .v is true and Tϕ does not hold if
Tϕ .v is false. For a given execution sequence ⟨Tϕ ⟩ = ⟨Tϕ0 ⟩, ⟨Tϕ1 ⟩, ⟨Tϕ2 ⟩, ⟨Tϕ3 ⟩, . . . , the
tuple accessed by ⟨Tϕi ⟩ corresponds to a section of an execution e as follows: for all
times n ∈ [⟨Tϕi−1 ⟩.τe + 1, ⟨Tϕi ⟩.τe ], en ⊧ ϕ in case ⟨Tϕi ⟩.v is true and en ⊭ ϕ in case
⟨Tϕi ⟩.v is false. In case ⟨Tϕi ⟩.v is maybe, neither en ⊧ ϕ nor en ⊭ ϕ is defined.
In the remainder of this paper, we will frequently refer to execution sequences col-
lected from the Swift UAS as shown in Fig. 1. The predicates shown are atomic propo-
sitions over sensor data in our specifications and are sampled with every new time
stamp n issued by the RTC. For example, ⟨Tpitch≥5°⟩ = ((false, 0), (false, 1), (false, 2), (true, 3), . . . , (true, 17), (true, 18)) describes en ⊧ (pitch ≥ 5°) sampled over n ∈ [0, 18], and ⟨Tpitch≥5°⟩ contains 19 elements.
Negation (¬ ϕ). The observer for ¬ ϕ, as stated in Alg. 1, is straightforward: for every
input Tϕ we negate the truth value of Tϕ .v. The observer generates (. . . , (true, 2),
(false, 3), . . . ).
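For illustration (our sketch), an element of an execution sequence can be modelled as a pair of a verdict and a time stamp τe, here two-valued for brevity; the negation observer then simply flips the verdict.

```python
# Sketch (ours): execution-sequence tuples (v, tau_e) and the negation observer.
from collections import namedtuple

T = namedtuple("T", ["v", "tau_e"])      # verdict and time stamp

def negate(t):
    """Observer for "not phi": flip the verdict, keep the time stamp."""
    return T(not t.v, t.tau_e)

inputs = [T(False, 2), T(True, 3)]       # a fragment of <T_phi> from Fig. 1
print([negate(t) for t in inputs])       # [T(v=True, tau_e=2), T(v=False, tau_e=3)]
```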
Invariant within the Next τ Time Stamps (□τ ϕ). An observer for □τ ϕ requires registers m↑ϕ and mτs with domain N0: m↑ϕ holds the time stamp of the latest ↑ transition of ⟨Tϕ⟩, whereas mτs holds the start time of the next tuple in ⟨Tϕ⟩. For the observer in Alg. 2, the check m↑ϕ ≤ (Tϕ.τe − τ) in line 8 tests whether ϕ held for at least the previous τ time stamps. To illustrate the algorithm, consider an observer for □5 (pitch ≥ 5°) and the execution in Fig. 1. At time n = 0, we have m↑ϕ = 0 and since ⟨T⁰pitch≥5°⟩ does not hold the output is (false, 0). Similarly, the outputs for n ∈ [1, 2] are (false, 1) and (false, 2). At time n = 3, a ↑ transition of ⟨Tpitch≥5°⟩ occurs, thus m↑ϕ = 3. Since the check in line 8 does not hold, the algorithm does not generate a new output, i.e., returns ( , ) designating output is delayed until a later time, which repeats at times n ∈ [4, 7]. At n = 8, the check in line 8 holds and the algorithm returns (true, 3). Likewise, the outputs for n ∈ [9, 10] are (true, 4) and (true, 5). At n = 11, ⟨T¹¹pitch≥5°⟩ does not hold and the algorithm outputs (false, 11). We note the ability of the observer to re-synchronize its output with respect to its inputs and the RTC. For n ∈ [8, 10], outputs are given for a time prior to n, however, at n = 11 the observer re-synchronizes: the output (false, 11) signifies that en ⊭ □5 (pitch ≥ 5°) for n ∈ [6, 11]. By the equivalence ♢τ ϕ ≡ ¬□τ ¬ϕ, we immediately arrive at an observer for ♢τ ϕ from Alg. 2 by negating both the input and the output tuple.
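The behaviour just described can be condensed into the following sketch (ours, not Alg. 2 itself; it consumes one verdict per time stamp and omits the register mτs used for aggregated input tuples). It reproduces the outputs of the worked example for □5 (pitch ≥ 5°).

```python
# Sketch (ours): asynchronous observer for "invariant within the next tau
# time stamps", fed one verdict per time stamp n of the RTC.
class BoxTau:
    def __init__(self, tau):
        self.tau = tau
        self.m_up = 0              # time stamp of the latest rising transition
        self.prev = False          # previous input verdict

    def step(self, v, n):
        if v and not self.prev:    # rising transition of <T_phi>
            self.m_up = n
        self.prev = v
        if not v:
            return (False, n)                  # re-synchronize immediately
        if self.m_up <= n - self.tau:          # phi held for tau time stamps
            return (True, n - self.tau)
        return None                            # output delayed

obs = BoxTau(5)
pitch_ge_5 = [False, False, False] + [True] * 8 + [False]   # n = 0..11
for n, v in enumerate(pitch_ge_5):
    out = obs.step(v, n)
    if out is not None:
        print(n, out)
# 0 (False, 0)  1 (False, 1)  2 (False, 2)  8 (True, 3)
# 9 (True, 4)  10 (True, 5)  11 (False, 11)
```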
² Proofs of correctness for every observer algorithm appear in the Appendix.
Invariant within Future Interval (◻J ϕ). The observer for ◻J ϕ, as stated in Alg. 4, builds on an observer for □τ ϕ and makes use of the equivalence □τ ϕ ≡ ◻[0,τ] ϕ. Intuitively, the observer for □τ ϕ returns true iff ϕ holds for at least the next τ time units. We can thus construct an observer for ◻J ϕ by reusing the algorithm for □τ ϕ, assigning τ = dur(J) and shifting the obtained output by min(J) time stamps into the past. From the equivalence ◇J ϕ ≡ ¬◻J ¬ϕ, we can immediately derive an observer for ◇J ϕ from the observer for ◻J ϕ. To illustrate the algorithm, consider an observer for ◻[5,10] (alt ≥ 600 ft) over the execution in Fig. 1. For n ∈ [0, 4] the algorithm returns ( , ), since (⟨T⁰…⁴alt≥600ft⟩.τe − 5) ≥ 0 (line 3 of Alg. 4) does not hold. At n = 5 the underlying observer for □5 (alt ≥ 600 ft) returns (false, 5), which is transformed (by line 4) into the output (false, 0). For similar arguments, the outputs for n ∈ [6, 9] are (false, 1), (false, 2), (false, 3), and (false, 4). At n ∈ [10, 14], the observer for □5 (alt ≥ 600 ft) returns ( , ). At n = 15, □5 (alt ≥ 600 ft) yields (true, 10), which is transformed (by line 4) into the output (true, 5). Note also that X ϕ ≡ ◻[1,1] ϕ.
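Continuing the sketch above (still ours, not Alg. 4 itself), the ◻J observer reuses the same logic with τ = dur(J) and shifts each output by min(J) time stamps into the past, suppressing outputs whose shifted time stamp would be negative, which plays the role of line 3 of Alg. 4.

```python
# Sketch (ours): observer for []_J phi with J = [lo, hi], built on the same
# registers as the []_tau sketch, with tau = dur(J) and a shift by lo.
class BoxInterval:
    def __init__(self, lo, hi):
        self.lo, self.tau = lo, hi - lo
        self.m_up, self.prev = 0, False

    def step(self, v, n):
        if v and not self.prev:
            self.m_up = n                    # latest rising transition
        self.prev = v
        if not v:
            out = (False, n)
        elif self.m_up <= n - self.tau:
            out = (True, n - self.tau)
        else:
            return None                      # inner output delayed
        if out[1] - self.lo < 0:
            return None                      # shifted stamp would be negative
        return (out[0], out[1] - self.lo)

obs = BoxInterval(5, 10)                     # []_{[5,10]} (alt >= 600 ft)
alt_ge_600 = [False] * 10 + [True] * 6       # n = 0..15
for n, v in enumerate(alt_ge_600):
    out = obs.step(v, n)
    if out is not None:
        print(n, out)
# 5 (False, 0)  6 (False, 1) ... 9 (False, 4)  15 (True, 5)
```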
The remaining observers for the binary operators ϕ ∧ ψ and ϕ UJ ψ take tuples
(Tϕ , Tψ ) as inputs, where Tϕ is from ⟨Tϕ ⟩ and Tψ is from ⟨Tψ ⟩. Since ⟨Tϕ ⟩ and ⟨Tψ ⟩
are execution sequences produced by two different observers, the two elements of the
input tuple (Tϕ , Tψ ) are not necessarily generated at the same time. Our observers for
binary MTL operators thus use two FIFO-organized synchronization queues to buffer
parts of ⟨Tϕ⟩ and ⟨Tψ⟩, respectively. For a synchronization queue q, we denote by q = () its emptiness and by |q| its size.
Conjunction (ϕ ∧ ψ). The observer for ϕ ∧ ψ, as stated in Alg. 3, reads inputs (Tϕ, Tψ) from two synchronization queues, qϕ and qψ. Intuitively, the algorithm follows the rules for conjunction in Boolean logic with additional emptiness checks on qϕ and qψ. The procedure dequeue(qϕ, qψ, Tξ.τe) drops all entries Tϕ in qϕ for which the following holds: Tϕ.τe ≤ Tξ.τe (analogous for qψ). To illustrate the algorithm, consider an observer for □5 (alt ≥ 600 ft) ∧ (pitch ≥ 5°) and the execution in Fig. 1. For n ∈ [0, 9] the two observers for the involved subformulas immediately output (false, n). For n ∈ [10, 14], the observer for □5 (alt ≥ 600 ft) returns ( , ), while in the meantime, the atomic proposition (pitch ≥ 5°) toggles its truth value several times, i.e., (true, 10), (false, 11), (false, 12), (true, 13), (false, 14). These tuples need to be buffered in queue qpitch≥5° until the observer for □5 (alt ≥ 600 ft) generates its next output, i.e., (true, 10) at n = 15. We apply the function aggregate(⟨Tϕ⟩), which repeatedly replaces two consecutive elements ⟨Tϕi⟩, ⟨Tϕi+1⟩ in ⟨Tϕ⟩ by ⟨Tϕi+1⟩ iff ⟨Tϕi⟩.v = ⟨Tϕi+1⟩.v, to the content of qpitch≥5° once every time an element is added to qpitch≥5°. Therefore, at n = 15: qpitch≥5° = ((true, 10), (false, 12), (true, 13), (false, 14), (true, 15)) and q□5(alt≥600ft) = ((true, 10)). The observer returns (true, 10) (line 3) and dequeue(qϕ, qψ, 10) yields: qpitch≥5° = ((false, 12), (true, 13), (false, 14), (true, 15)) and q□5(alt≥600ft) = ().
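The following sketch (ours) reconstructs the queue handling described above for the conjunction observer; the case analysis is our reading of Alg. 3, not a quotation of it. In particular, when both queue heads are false we emit the later of the two time stamps, which is sound because each operand remains false up to its own time stamp.

```python
# Sketch (ours): conjunction observer over two synchronization queues.
from collections import deque

def aggregate(q):
    """Merge consecutive queue entries that carry the same verdict."""
    merged = deque()
    for v, t in q:
        if merged and merged[-1][0] == v:
            merged[-1] = (v, t)              # keep only the later time stamp
        else:
            merged.append((v, t))
    return merged

def dequeue(q, t):
    """Drop all entries whose time stamp is <= t."""
    while q and q[0][1] <= t:
        q.popleft()

def step_and(q_phi, q_psi):
    """Produce at most one output tuple for phi AND psi, or None."""
    out = None
    if q_phi and q_psi:
        (v1, t1), (v2, t2) = q_phi[0], q_psi[0]
        if v1 and v2:
            out = (True, min(t1, t2))
        elif not v1 and not v2:
            out = (False, max(t1, t2))       # false over both operands' ranges
        elif not v1:
            out = (False, t1)                # false regardless of psi
        else:
            out = (False, t2)                # false regardless of phi
    elif q_phi and not q_phi[0][0]:
        out = (False, q_phi[0][1])
    elif q_psi and not q_psi[0][0]:
        out = (False, q_psi[0][1])
    if out is not None:
        dequeue(q_phi, out[1])
        dequeue(q_psi, out[1])
    return out

# Situation at n = 15 from the running example:
q_pitch = aggregate(deque([(True, 10), (False, 11), (False, 12),
                           (True, 13), (False, 14), (True, 15)]))
q_box   = deque([(True, 10)])
print(step_and(q_box, q_pitch))              # (True, 10)
print(list(q_pitch))                         # [(False, 12), (True, 13), (False, 14), (True, 15)]
```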
Until within Future Interval (ϕ UJ ψ). The observer for ϕ UJ ψ, as stated in Alg. 5, reads inputs (Tϕ, Tψ) from two synchronization queues and makes use of a Boolean flag p and three registers m↑ϕ, m↓ϕ, and mpre with domain N0 ∪ {−∞}: m↑ϕ (m↓ϕ) holds the time stamp of the latest ↑ transition (↓ transition) of ⟨Tϕ⟩ and mpre holds the latest time stamp where the observer detected ϕ UJ ψ to hold. Input tuples (Tϕ, Tψ) for the observer are read from the synchronization queues in a lockstep mode: (Tϕ, Tψ) is split into (Tϕ′, Tψ′), where Tϕ′.τe = Tψ′.τe and the time stamp Tϕ′′.τe of the next tuple (Tϕ′′, Tψ′′) is Tϕ′.τe + 1. This ensures that the observer outputs only a single tuple at each run and avoids output buffers, which would account for additional hardware resources (see the correctness proof in the Appendix for a discussion). Intuitively, if Tϕ does not hold (lines 22–26) the observer is synchronous to its input and immediately outputs (false, Tϕ.τe). If Tϕ holds (lines 11–20) the time stamp n′ of the output tuple is not necessarily synchronous to the time stamp Tϕ.τe of the input anymore; it is, however, bounded by (Tϕ.τe − max(J)) ≤ n′ ≤ Tϕ.τe (see Lemma “unrolling” in the Appendix). To illustrate the algorithm, consider an observer for (pitch ≥ 5°) U[5,10] (alt ≥ 600 ft) over
and since ⟨Tpitch≥5
0
○ ⟩ does not hold, the observer outputs (false, 0) in line 26. The out-
puts for n ∈ [1, 2] are (false, 1) and (false, 2). At time n = 3, a transition of
⟨Tpitch≥5○ ⟩ occurs, thus we assign m↑ϕ = 2 and mpre = −∞ (lines 3 and 4). Since
366 T. Reinbacher, K.Y. Rozier, and J. Schumann
⟨Tpitch≥5
3
○ ⟩ holds and ⟨Talt≥600f t ⟩ does not hold, the predicate in line 18 is evaluated,
3
which holds and the algorithm returns ⟨false, max(2, 3 − 10)⟩ = (false, 2). Thus, the
observer does not yield a new output in this case, which repeats for times n ∈ [4, 9].
At time n = 10, a transition of ⟨Talt≥600f t ⟩ occurs and the predicate in line 12
is evaluated. Since (2 + 5) < 10 holds, the algorithm returns (true, 5), revealing that
en ⊧ (pitch ≥ 5○ ) U[5,10] (alt ≥ 600f t) for n ∈ [3, 5]. At time n = 11, a transition of
⟨Tpitch≥5○ ⟩ occurs and since ⟨Talt≥600f
11
t ⟩ holds, p and the truth value of the current input
⟨Tpitch≥5
11
○ ⟩.v are set true and m↓ϕ = 11. Again, line 12 is evaluated and the algorithm re-
the algorithm returns (false, 12) in line 26, i.e., en ⊭ (pitch ≥ 5○ ) U[5,10] (alt ≥ 600f t)
for n ∈ [7, 12]. At time n = 13, a transition of ⟨Tpitch≥5○ ⟩ occurs, thus m↑ϕ = 12
and mpre = −∞. The predicates in line 12 and 15 do not hold, the algorithm returns
no new output in line 28. At time n = 14, a transition of ⟨Tpitch≥5○ ⟩ occurs, thus
p and ⟨Tpitch≥5○ ⟩.v are set true and m↓ϕ = 14. The predicate in line 15 holds, and the
14
algorithm outputs (false, 14), revealing that en ⊭ (pitch ≥ 5○ ) U[5,10] (alt ≥ 600f t) for
n ∈ [13, 14].
At times n ∈ {11, 12, 14} the synchronous observer completes early evaluation of ξ, producing output that would, without the abstraction, be guaranteed by the exact asynchronous observer with a delay of 5 time units, i.e., at times n ∈ {16, 17, 19}.
Algorithm 6. Assigning synchronization queue sizes for AST(ξ′). Let S be a set of nodes; initially: w = 0, add all Σ nodes of AST(ξ′) to S. The function wcd, which maps nodes of AST(ξ′) to N0, calculates the worst-case delay an asynchronous observer may introduce: wcd(¬ϕ) = wcd(ϕ ∧ ψ) = 0, wcd(□τ ϕ) = τ, wcd(◻J ϕ) = wcd(ϕ UJ ψ) = max(J).
1: while S is not empty do
2: s, w ← get next node from S, 0
3: if s is type ϕ UJ ψ or ϕ ∧ ψ then
4: w ← max(∣qϕ ∣, ∣qψ ∣) + wcd(s)
5: end if
6: while s is not a synchronization queue do
7: s, w ← get predecessor of s in AST(ξ ′ ), w + wcd(s)
8: end while
9: Set |q| = w (q is the opposite synchronization queue of s)
10: Add all ϕ UJ ψ and ϕ ∧ ψ nodes that have unassigned synchronization queue sizes to S
11: end while
– Interconnect and dimensioning. Connect observers and queues according to AST(ξ ′).
Execute Alg. 6 (MA2).
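A small sketch (ours; the node encoding is ad hoc) of the worst-case-delay function wcd and of the delay accumulated along a path of the AST, which is the quantity that Alg. 6 adds up while walking from a node towards a synchronization queue.

```python
# Sketch (ours): worst-case delay of asynchronous observers per operator, and
# the delay accumulated along a path of AST nodes.
def wcd(node):
    """node is ('not',), ('and',), ('box_tau', tau), ('box_J', lo, hi),
    or ('until_J', lo, hi); returns the worst-case delay in time stamps."""
    kind = node[0]
    if kind in ("not", "and"):
        return 0
    if kind == "box_tau":
        return node[1]                      # wcd([]_tau phi) = tau
    if kind in ("box_J", "until_J"):
        return node[2]                      # wcd = max(J)
    raise ValueError(kind)

def path_delay(path):
    """Sum of worst-case delays along a path of AST nodes."""
    return sum(wcd(n) for n in path)

# E.g. a queue feeding phi AND psi where phi still passes through []_{[5,10]}:
print(path_delay([("and",), ("box_J", 5, 10)]))     # 10
```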
(Fig. 2, diagram: left, the hardware blocks of the asynchronous and synchronous observer implementations; right, the AST of ξ, of depth d = 5, with one observer and synchronization queue per subformula, e.g., □10 ξ0, □100 ξ1, ¬σ2, ¬σ3, ξ2 ∧ ξ3, σ1 ∧ ξ4, and ¬ξ5, over the inputs σ1, σ2, σ3.)
Fig. 2. Left: hardware implementations for □τ ϕ (top) and êval(□τ ϕ) (bottom). Right: subformulas of AST(ξ), observers, and queues synthesized for ξ. Mapping the observers to hardware yields two levels of parallelism: (i) the asynchronous (left) and the synchronous observers (right) run in parallel and (ii) observers for subformulas run in parallel, e.g., □10 ξ0 and □100 ξ1.
For our synchronous observers, we prove upper bounds, in terms of two-input gates, on the size of the resulting circuits. Actual implementations may yield significantly better results on circuit size, depending on the performance of the logic synthesis tool.
(Fig. 3: Swift UAS flight data, plotting barometric altitude (altB) and laser altitude (altL) in ft together with the signals en ⊧ altB ≥ 600 ft, en ⊧ (cmd == takeoff), en ⊧ êval(ϕ1), and en ⊧ ϕ1 over time stamps τe, the last of which is resolved by the asynchronous observer. Right: BN health model with nodes HL (LaserAlt, prior healthy 0.7 / bad 0.3), HB (BaroAlt, prior healthy 0.9 / bad 0.1), and UA (Altimeter, inc 0.5 / dec 0.5). Inputs to our rt-R2U2 are flight data, sampled in real time; a health model as BN; and an MTL specification ϕ. Outputs: health estimation (posterior marginals of HL and HB, quantifying the health of the laser and barometric altimeter) and the status of the UAS.)
Our real-time SHM analysis matched post-flight analysis by test engineers, including successfully pinpointing a laser altimeter failure, see Fig. 3: the barometric altimeter, pitch, and velocity readings indicated an increase in altitude (σSB↑ and ϕSS↑
held) while the laser altimeter indicated a decrease (σSL↓ held). The posterior marginal
Pr(HL = healthy ∣ en ⊧ {σSL , σSB , ϕSS }) of the node HL , inferred from the BN,
dropped from 70% to 8%, indicating a low degree of trust in the laser altimeter reading
during the outage; engineers attribute the failure to the UAS exceeding its operational
altitude.
6 Conclusion
We presented a novel SHM technique that enables both real-time assessment of the
system status of an embedded system with respect to temporal-logic-based specifica-
tions and also supports statistical reasoning to estimate its health at runtime. To ensure
REALIZABILITY, we observe specifications given in two real-time projections of LTL
that naturally encode future-time requirements such as flight rules. Real-time health modeling, e.g., using Bayesian networks, allows mitigative reactions to be inferred from complex relationships between observations. To ensure RESPONSIVENESS, we run both an
over-approximative, but synchronous to the real-time clock (RTC), and an exact, but
asynchronous to the RTC, observer in parallel for every specification. To ensure UNOBTRUSIVENESS to flight-certified systems, we designed our observer algorithms with a
light-weight, FPGA-based implementation in mind and showed how to map them into
efficient, but reconfigurable circuits. Following on our success using rt-R2U2 to analyze
real flight data recorded by NASA’s Swift UAS, we plan to analyze future missions of
the Swift or small satellites with the goal of deploying rt-R2U2 onboard.
References
1. Alur, R., Henzinger, T.A.: Real-time Logics: Complexity and Expressiveness. In: LICS, pp.
390–401. IEEE (1990)
2. Backasch, R., Hochberger, C., Weiss, A., Leucker, M., Lasslop, R.: Runtime verification for
multicore SoC with high-quality trace data. ACM Trans. Des. Autom. Electron. Syst. 18(2),
18:1–18:26 (2013)
3. Barre, B., Klein, M., Soucy-Boivin, M., Ollivier, P.-A., Hallé, S.: MapReduce for paral-
lel trace validation of LTL properties. In: Qadeer, S., Tasiran, S. (eds.) RV 2012. LNCS,
vol. 7687, pp. 184–198. Springer, Heidelberg (2013)
4. Barringer, H., et al.: RV 2010. LNCS, vol. 6418. Springer, Heidelberg (2010)
5. Basin, D., Klaedtke, F., Müller, S., Pfitzmann, B.: Runtime monitoring of metric first-order
temporal properties. In: FSTTCS, pp. 49–60 (2008)
6. Basin, D., Klaedtke, F., Zălinescu, E.: Algorithms for monitoring real-time properties. In:
Khurshid, S., Sen, K. (eds.) RV 2011. LNCS, vol. 7186, pp. 260–275. Springer, Heidelberg
(2012)
7. Bauer, A., Leucker, M., Schallhart, C.: Comparing LTL semantics for runtime verification. J.
Log. and Comp. 20, 651–674 (2010)
8. Bauer, A., Leucker, M., Schallhart, C.: Runtime verification for LTL and TLTL. ACM Trans. Softw. Eng. Methodol. 20, 14:1–14:64 (2011)
9. Colombo, C., Pace, G., Abela, P.: Safer asynchronous runtime monitoring using compensa-
tions. FMSD 41, 269–294 (2012)
10. Darwiche, A.: Modeling and Reasoning with Bayesian Networks, 1st edn. Cambridge Uni-
versity Press, New York (2009)
11. Divakaran, S., D’Souza, D., Mohan, M.R.: Conflict-tolerant real-time specifications in metric
temporal logic. In: TIME, pp. 35–42 (2010)
12. Finkbeiner, B., Kuhtz, L.: Monitor circuits for LTL with bounded and unbounded future. In:
Bensalem, S., Peled, D.A. (eds.) RV 2009. LNCS, vol. 5779, pp. 60–75. Springer, Heidelberg
(2009)
13. Fischmeister, S., Lam, P.: Time-aware instrumentation of embedded software. IEEE Trans.
Ind. Informatics 6(4), 652–663 (2010)
14. Geilen, M.: An improved on-the-fly tableau construction for a real-time temporal logic. In:
Hunt Jr., W.A., Somenzi, F. (eds.) CAV 2003. LNCS, vol. 2725, pp. 394–406. Springer, Hei-
delberg (2003)
15. Havelund, K.: Runtime verification of C programs. In: Suzuki, K., Higashino, T., Ulrich, A.,
Hasegawa, T. (eds.) TestCom/FATES 2008. LNCS, vol. 5047, pp. 7–22. Springer, Heidelberg
(2008)
16. Ippolito, C., Espinosa, P., Weston, A.: Swift UAS: An electric UAS research platform for
green aviation at NASA Ames Research Center. In: CAFE EAS IV (April 2010)
17. Johnson, S., Gormley, T., Kessler, S., Mott, C., Patterson-Hine, A., Reichard, K., Scandura Jr., P.: System Health Management: with Aerospace Applications. Wiley & Sons (2011)
18. Kleene, S.C.: Introduction to Metamathematics. North Holland (1996)
19. Lichtenstein, O., Pnueli, A., Zuck, L.: The glory of the past. In: Parikh, R. (ed.) Logic of
Programs 1985. LNCS, vol. 193, pp. 196–218. Springer, Heidelberg (1985)
20. Lu, H., Forin, A.: The design and implementation of P2V, an architecture for zero-overhead
online verification of software programs. Tech. Rep. MSR-TR-2007-99 (2007)
21. Maler, O., Nickovic, D., Pnueli, A.: On synthesizing controllers from bounded-response
properties. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 95–107.
Springer, Heidelberg (2007)
22. Maler, O., Nickovic, D., Pnueli, A.: Checking temporal properties of discrete, timed and con-
tinuous behaviors. In: Avron, A., Dershowitz, N., Rabinovich, A. (eds.) Pillars of Computer
Science. LNCS, vol. 4800, pp. 475–505. Springer, Heidelberg (2008)
23. Pike, L., Niller, S., Wegmann, N.: Runtime verification for ultra-critical systems. In: Khur-
shid, S., Sen, K. (eds.) RV 2011. LNCS, vol. 7186, pp. 310–324. Springer, Heidelberg (2012)
24. Reinbacher, T., Függer, M., Brauer, J.: Real-time runtime verification on chip. In: Qadeer, S.,
Tasiran, S. (eds.) RV 2012. LNCS, vol. 7687, pp. 110–125. Springer, Heidelberg (2013)
25. Schumann, J., Mbaya, T., Mengshoel, O., Pipatsrisawat, K., Srivastava, A., Choi, A., Dar-
wiche, A.: Software health management with Bayesian Networks. Innovations in Systems
and SW Engineering 9(4), 271–292 (2013)
26. Schumann, J., Rozier, K.Y., Reinbacher, T., Mengshoel, O.J., Mbaya, T., Ippolito, C.: To-
wards real-time, on-board, hardware-supported sensor and software health management for
unmanned aerial systems. In: PHM (2013)
27. Tabakov, D., Rozier, K.Y., Vardi, M.Y.: Optimized temporal monitors for SystemC. Formal
Methods in System Design 41(3), 236–268 (2012)
28. Thati, P., Roşu, G.: Monitoring Algorithms for Metric Temporal Logic specifications.
ENTCS 113, 145–162 (2005)
Status Report on Software Verification
(Competition Summary SV-COMP 2014)
Dirk Beyer
1 Introduction
Software verification is an important part of software engineering, which is re-
sponsible for guaranteeing safe and reliable performance of the software systems
that our economy and society rely on. The latest research results need to be
implemented in verification tools, in order to transfer the theoretical knowledge
to engineering practice. The Competition on Software Verification (SV-COMP) 1
is a systematic comparative evaluation of the effectiveness and efficiency of the
state of the art in software verification. The benchmark repository of SV-COMP 2
is a collection of verification tasks that represent the current interest and abil-
ities of tools for software verification. For the purpose of this competition, the
verification tasks are arranged in nine categories, according to the characteristics
of the programs and the properties to verify. Besides the verification tasks that
are used in this competition and written in the programming language C, the
SV-COMP repository also contains tasks written in Java 3 and as Horn clauses 4 .
The main objectives of the Competition on Software Verification are to:
1. provide an overview of the state of the art in software-verification technology,
2. establish a repository of software-verification tasks that is widely used,
3. increase visibility of the most recent software verifiers, and
4. accelerate the transfer of new verification technology to industrial practice.
1 http://sv-comp.sosy-lab.org
2 https://svn.sosy-lab.org/software/sv-benchmarks/trunk
3 https://svn.sosy-lab.org/software/sv-benchmarks/trunk/java
4 https://svn.sosy-lab.org/software/sv-benchmarks/trunk/clauses
2 Procedure
The procedure for the competition was not changed in comparison to the previ-
ous editions [1,2], and consisted of the phases (1) benchmark submission (collect
and classify new verification tasks), (2) training (teams inspect verification tasks
and train their verifiers), and (3) evaluation (verification runs with all competi-
tion candidates and review of the system descriptions by the competition jury).
All systems and their descriptions were again archived and stamped for identifi-
cation with SHA hash values. Also, before public announcement of the results,
all teams received the preliminary results of their verifier for approval. After the
competition experiments for the ‘official’ categories were finished, some teams
participated in demonstration categories, in order to experiment with new cate-
gories and new rules for future editions of the competition.
TRUE: The property is satisfied (i.e., no path that violates the property exists).
FALSE: The property is violated (i.e., there exists a path that violates the
property) and a counterexample path is produced and reported as witness.
UNKNOWN: The tool cannot decide the problem, or terminates by a tool crash, or exhausts the computing resources (time or memory), i.e., the competition candidate does not succeed in computing an answer TRUE or FALSE.
For the counterexample path that must be produced as witness for the result
FALSE, we did not require a particular fixed format. (Future editions of SV-
COMP will support machine-readable error witnesses, such that error witnesses
can be automatically validated by a verifier.) The time is measured as consumed
CPU time until the verifier terminates, including the consumed CPU time of all
processes that the verifier started. If time is equal to or larger than the time
limit, then the verifier is terminated and the answer is set to ‘timeout’ (and
interpreted as UNKNOWN). The verification tasks are partitioned into nine
separate categories and one category Overall that contains all verification tasks.
The categories, their defining category-set files, and the contained programs are
explained under Verification Tasks on the competition web site.
Properties. The specification to be verified is stored in a file that is given
as parameter to the verifier. In the repository, the specifications are available
in .prp files in the main directory.
The definition init(main()) gives the initial states of the program by a call of
function main (with no parameters). The definition LTL(f) specifies that formula
f holds at every initial state of the program. The LTL (linear-time temporal logic)
operator G f means that f globally holds (i.e., everywhere during the program
execution), and the operator F f means that f eventually holds (i.e., at some
point during the program execution). The proposition label(ERROR) is true if
the C label ERROR is reached, and the proposition end is true if the program
execution terminates (e.g., return of function main, program exit, abort).
Label Unreachability. The reachability property perror is encoded in the program
source code using a C label and expressed using the following specification (the
interpretation of the LTL formula is given in Table 1):
CHECK( init(main()), LTL(G ! label(ERROR)) )
The new syntax (in comparison to previous SV-COMP editions) allows a more
general specification of the reachability property, by decoupling the specification
from the program source code, and thus, not requiring the label to be named
ERROR.
Memory Safety. The memory-safety property pmemsafety (only used in one cat-
egory) consists of three partial properties and is expressed using the following
specification (interpretation of formulas given in Table 1):
CHECK( init(main()), LTL(G valid-free) )
CHECK( init(main()), LTL(G valid-deref) )
CHECK( init(main()), LTL(G valid-memtrack) )
Table 2. Scoring schema for SV-COMP 2013 and 2014 (taken from [2])
Reported result Points Description
UNKNOWN 0 Failure to compute verification result
FALSE correct +1 Violation of property in program was correctly found
FALSE incorrect −4 Violation reported but property holds (false alarm)
TRUE correct +2 Correct program reported to satisfy property
TRUE incorrect −8 Incorrect program reported as correct (missed bug)
The verification result FALSE for the property pmemsafety is required to include
the violated partial property: FALSE(p), with p ∈ {pvalid−free, pvalid−deref ,
pvalid−memtrack}, means that the (partial) property p is violated. According to the
requirements for verification tasks, all programs in category MemorySafety violate
at most one (partial) property p ∈ {pvalid−free , pvalid−deref , pvalid−memtrack}. Per
convention, function malloc is assumed to always return a valid pointer, i.e., the
memory allocation never fails, and function free always deallocates the memory
and makes the pointer invalid for further dereferences.
Program Termination. The termination property ptermination (only used in a
demonstration category) is based on the proposition end and expressed using the
following specification (interpretation in Table 1):
CHECK( init(main()), LTL(F end) )
Evaluation by Scores and Run Time. The scoring schema was not changed from SV-COMP 2013 to 2014 and is given in Table 2. The ranking is decided based on the sum of points and, for an equal sum of points, according to the success run time, which is the total CPU time over all verification tasks for which the verifier reported a correct verification result. Sanity tests on obfuscated versions of verification tasks (renaming of variable and function names; renaming of the file) did not reveal any discrepancy in the results. Opting-out from Categories and Computation of Score for Meta Categories were defined as in SV-COMP 2013 [2]. The Competition Jury consists again of the chair and one member of each participating team. Team representatives are indicated in Table 3.
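For illustration (our sketch, directly transcribing Table 2 and the ranking rule above; the data and verifier names are made up), the score and the success run time of a candidate, and the resulting ranking, can be computed as follows.

```python
# Sketch (ours): scoring per Table 2 and ranking by (score, success run time).
POINTS = {
    ("UNKNOWN", None):  0,     # failure to compute a verification result
    ("FALSE",   True): +1,     # property violation correctly found
    ("FALSE",  False): -4,     # false alarm
    ("TRUE",    True): +2,     # correct program reported as correct
    ("TRUE",   False): -8,     # missed bug
}

def evaluate(results):
    """results: list of (answer, is_correct_or_None, cpu_time_seconds)."""
    score = sum(POINTS[(answer, correct)] for answer, correct, _ in results)
    success_time = sum(t for answer, correct, t in results
                       if correct and answer in ("TRUE", "FALSE"))
    return score, success_time

def rank(candidates):
    """Higher score first; on ties, lower success run time wins."""
    scored = {name: evaluate(res) for name, res in candidates.items()}
    return sorted(scored.items(), key=lambda kv: (-kv[1][0], kv[1][1]))

demo = {
    "verifierA": [("TRUE", True, 12.0), ("FALSE", True, 3.0), ("UNKNOWN", None, 900.0)],
    "verifierB": [("TRUE", True, 2.0), ("TRUE", False, 1.0), ("FALSE", True, 4.0)],
}
print(rank(demo))   # verifierA (3 points, 15.0 s) ranks above verifierB (-5 points, 6.0 s)
```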
4 Participating Teams
Table 3 provides an overview of the participating competition candidates. The
detailed summary of the achievements for each verifier is presented in Sect. 5.
A total of 15 competition candidates participated in SV-COMP 2014: Blast
2.7.2 14 , Cbmc 15 , CPAchecker 16 , CPAlien 17 , CSeq-Lazy 18 , CSeq-Mu,
Esbmc 1.22 19 , FrankenBit 20 , Llbmc 21 , Predator 22 , Symbiotic 2 23 ,
Threader 24 , Ufo 25 , Ultimate Automizer 26 , and Ultimate Kojak 27 .
14 http://forge.ispras.ru/projects/blast
15 http://www.cprover.org/cbmc
16 http://cpachecker.sosy-lab.org
17 http://www.fit.vutbr.cz/~imuller/cpalien
18 http://users.ecs.soton.ac.uk/gp4/cseq/cseq.html
19 http://www.esbmc.org
20 http://bitbucket.org/arieg/fbit
21 http://llbmc.org
22 http://www.fit.vutbr.cz/research/groups/verifit/tools/predator
23 https://sf.net/projects/symbiotic
24 http://www7.in.tum.de/tools/threader
25 http://bitbucket.org/arieg/ufo
26 http://ultimate.informatik.uni-freiburg.de/automizer
27 http://ultimate.informatik.uni-freiburg.de/kojak
Table 4. Technologies and features that the verification tools offer (incl. demo track). Columns: Concurrency Support, ARG-based Analysis, Bit-precise Analysis, Symbolic Execution, Ranking Functions, Lazy Abstraction, Interval Analysis, Shape Analysis, Interpolation, CEGAR. Rows (number of marked features per tool): AProVE ✓✓; Blast 2.7.2 ✓✓✓✓✓; Cbmc ✓✓✓; CPAlien ✓✓; CPAchecker ✓✓✓✓✓✓✓✓✓✓; CSeq-Lazy ✓✓; CSeq-Mu ✓✓; Esbmc 1.22 ✓✓✓; FuncTion ✓✓; FrankenBit ✓✓✓; Llbmc ✓; Predator ✓; Symbiotic 2 ✓; T2 ✓✓✓✓✓✓✓; Tan ✓✓✓✓✓✓✓✓; Threader ✓✓✓✓✓; Ufo ✓✓✓✓✓✓✓; Ultimate Automizer ✓✓✓✓; Ultimate Kojak ✓✓✓✓; Ultimate Büchi ✓✓✓✓✓
Table 4 lists the features and technologies that are used in the verification
tools. Counterexample-guided abstraction refinement (CEGAR) [8], predicate
abstraction [13], bounded model checking [6], lazy abstraction [19], and inter-
polation for predicate refinement [18] are implemented in many verifiers. Other
features that were implemented include symbolic execution [22], the construction
of an abstract reachability graph (ARG) as proof of correctness [3], and shape
analysis [21]. Only a few tools support the verification of concurrent programs.
Computing ranking functions [28] for proving termination is a feature that is
implemented in tools that participated in the demo category on termination.
Table 5. Quantitative overview over all results — Part 1 (score / CPU time). Columns: the categories of this part, among them BitVectors, Concurrency, ControlFlow, and HeapManip. (with 49, 78, and 80 verification tasks among them); rows: each competition candidate with its representing jury member.
CSeq-Mu (G. Parlato, Southampton, UK): scores — / 136 / — / — / —; times 1 200 s
Esbmc 1.22 (L. Cordeiro, Manaus, Brazil): scores 77 / 32 / 949 / 2 358 / 97; times 1 500 s / 30 000 s / 35 000 s / 140 000 s / 970 s
FrankenBit (A. Gurfinkel, Pittsburgh, USA): scores — / — / 986 / 2 639 / —; times 6 300 s / 3 000 s
Llbmc (S. Falke, Karlsruhe, Germany): scores 86 / 0 / 961 / 0 / 107; times 39 s / 0.0 s / 13 000 s / 0.0 s / 130 s
Predator (T. Vojnar, Brno, Czech Republic): scores −92 / 0 / 511 / 50 / 111; times 28 s / 0.0 s / 3 400 s / 9.9 s / 9.5 s
Table 6. Quantitative overview over all results — Part 2 (score / CPU time). Columns: the categories of this part, among them Recursive, Sequentialized, Simple, and Overall (with 23, 45, and 61 verification tasks and maxima of 39 and 67 points among them); rows: each competition candidate with its representing jury member.
Blast 2.7.2 (V. Mutilin, Moscow, Russia): scores — / — / — / 30 / —; times 5 400 s
Cbmc (M. Tautschnig, London, UK): scores 4 / 30 / 237 / 66 / 3 501; times 11 000 s / 11 000 s / 47 000 s / 15 000 s / 560 000 s
CPAchecker (S. Löwe, Passau, Germany): scores 95 / 0 / 97 / 67 / 2 987; times 460 s / 0.0 s / 9 200 s / 430 s / 48 000 s
CPAlien (P. Muller, Brno, Czech Republic): scores 9 / — / — / — / —; times 690 s
CSeq-Lazy (B. Fischer, Stellenbosch, ZA): —
CSeq-Mu (G. Parlato, Southampton, UK): —
Esbmc 1.22 (L. Cordeiro, Manaus, Brazil): scores −136 / −53 / 244 / 31 / 975; times 1 500 s / 4 900 s / 38 000 s / 27 000 s / 280 000 s
FrankenBit (A. Gurfinkel, Pittsburgh, USA): scores — / — / — / 37 / —; times 830 s
Threader (C. Popeea, Munich, Germany): —
Ufo (A. Albarghouthi, Toronto, Canada): scores — / — / 83 / 67 / —; times 4 800 s / 480 s
Table 7. Overview of the top-three verifiers for each category (CPU time in s)
Fig. 1. Quantile functions: For each competition candidate, we plot all data points (x, y)
such that the maximum run time of the n fastest correct verification runs is y and x is
the accumulated score of all incorrect results and those n correct results. A logarithmic
scale is used for the time range from 1 s to 1000 s, and a linear scale is used for the
time range between 0 s and 1 s. The graphs are decorated with symbols at every 15-th
data point.
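The data points of such a quantile function can be computed as in the following sketch (ours; the input encoding is an assumption): sort the correct runs by run time and pair the run time of the n-th fastest correct run with the score accumulated from all incorrect results plus those n correct results.

```python
# Sketch (ours): data points of the quantile plot described in Fig. 1.
def quantile_points(results):
    """results: list of (points, cpu_time, is_correct). Returns [(x, y), ...]."""
    penalty = sum(p for p, _, ok in results if not ok)        # all incorrect results
    correct = sorted((t, p) for p, t, ok in results if ok)    # correct runs by time
    points, accumulated = [], penalty
    for t, p in correct:
        accumulated += p
        points.append((accumulated, t))   # x = score so far, y = max time so far
    return points

demo = [(+2, 10.0, True), (+1, 1.0, True), (-4, 5.0, False), (+2, 100.0, True)]
print(quantile_points(demo))              # [(-3, 1.0), (-1, 10.0), (1, 100.0)]
```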
Table 8 shows the results, which are promising: five teams participated, namely AProVE 29, FuncTion 30, T2 31, Tan 32, and Ultimate Büchi 33.
Also, the quality of the termination checkers was extremely good: almost all tools had no false positives (‘false alarms’, where the verifier reported that the program would not terminate although it does) and no false negatives (‘missed bugs’, where the verifier reported termination although infinite looping is possible).
Device-Driver Challenge. Competitions are always looking for hard problems.
We received some unsolved problems from the LDV project 34 . Three teams
participated and could compute answers to 6 of the 15 problems: Cbmc found 3,
CPAchecker found 4, and Esbmc found 2 solutions to the problems.
Error-Witnesses. One of the objectives of program verification is to provide
a witness for the verification result. This is an open problem of verification
technology: there is no commonly supported witness format yet, and the verifiers
are not producing accurate witnesses that can be automatically assessed for
validity 35 . The goal of this demonstration category is to change this (restricted
to error witnesses for now): in cooperation with interested groups we defined a
format for error witnesses and the verifiers were asked to produce error paths in
that format, in order to validate their error paths with another verification tool.
Three tools participated in this category: Cbmc, CPAchecker, and Esbmc.
The demo revealed many interesting insights on practical issues of using a com-
mon witness format, serving as a test before introducing it as a requirement to
29 http://aprove.informatik.rwth-aachen.de
30 http://www.di.ens.fr/~urban/FuncTion.html
31 http://research.microsoft.com/en-us/projects/t2
32 http://www.cprover.org/termination/cta/index.shtml
33 http://ultimate.informatik.uni-freiburg.de/BuchiAutomizer
34 http://linuxtesting.org/project/ldv
35 There was already research on reusing previously computed error paths, but by the same tool and, in particular, using tool-specific formats: for example, Esbmc was extended to reproduce errors via instantiated code [30], and CPAchecker was used to re-check previously computed error paths by interpreting them as automata that control the state-space search [5].
the next edition of the competition. We will report here only a few cases to show
how this technique can help. We selected a group of verification tasks (with ex-
pected verification result ‘false’) that Cbmc could solve, but CPAchecker was
not able to compute a verification result. We started CPAchecker again on the
verification task, now together with Cbmc’s error witness. Table 9 reports the
details of eight such runs: CPAchecker can prove the error witnesses of Cbmc
valid, although it could not find the bug in the program without the hints from
the witness. In some cases this is efficient (first and last row) and sometimes it
is quite inefficient: the matching algorithm needs improvement. The matching
is based purely on syntactical hints (sequence of tokens of the source program).
This technique of re-verifying a program with a different verification tool significantly increases the confidence in the verification result (and helps to rule out false alarms).
6 Conclusion
The third edition of the Competition on Software Verification had more partic-
ipants than before: the participation in the ‘official’ categories increased from
eleven to fifteen teams, and five teams took part in the demonstration on ter-
mination checking. The number of benchmark problems increased to a total of
2 868 verification tasks (excluding demonstration categories). The organizer and
the jury made sure that the competition follows the high quality standards of the
TACAS conference, in particular to respect the important principles of fairness,
community support, transparency, and technical accuracy.
The results showcase the progress in developing new algorithms and data
structures for software verification, and in implementing efficient tools for fully-
automatic program verification. The best verifiers have shown good quality in the
categories that they focus on, in terms of robustness, soundness, and complete-
ness. The participants represent a variety of general approaches: SMT-based model checking, bounded model checking, symbolic execution, and program analysis showed their different, complementary strengths. Also, the SV-COMP repos-
itory of verification tasks has grown considerably: it now contains termination
problems and problems for regression verification [4], but also Horn clauses and
some Java programs in addition to C programs.
Acknowledgement. We thank K. Friedberger for his support during the evalua-
tion phase and for his work on the benchmarking infrastructure, the competition
jury for making sure that the competition is well-grounded in the community,
and the teams for making SV-COMP possible through their participation.
References
1. Beyer, D.: Competition on software verification (SV-COMP). In: Flanagan, C.,
König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 504–524. Springer, Heidelberg
(2012)
2. Beyer, D.: Second competition on software verification. In: Piterman, N., Smolka,
S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 594–609. Springer, Heidelberg (2013)
3. Beyer, D., Henzinger, T.A., Jhala, R., Majumdar, R.: The software model checker
Blast. Int. J. Softw. Tools Technol. Transfer 9(5-6), 505–525 (2007)
4. Beyer, D., Löwe, S., Novikov, E., Stahlbauer, A., Wendler, P.: Precision reuse for
efficient regression verification. In: Proc. ESEC/FSE, pp. 389–399. ACM (2013)
5. Beyer, D., Wendler, P.: Reuse of verification results - conditional model checking,
precision reuse, and verification witnesses. In: Bartocci, E., Ramakrishnan, C.R.
(eds.) SPIN 2013. LNCS, vol. 7976, pp. 1–17. Springer, Heidelberg (2013)
6. Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic model checking without BDDs.
In: Cleaveland, W.R. (ed.) TACAS 1999. LNCS, vol. 1579, pp. 193–207. Springer,
Heidelberg (1999)
7. Brockschmidt, M., Cook, B., Fuhs, C.: Better termination proving through cooper-
ation. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 413–429.
Springer, Heidelberg (2013)
8. Clarke, E.M., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided
abstraction refinement for symbolic model checking. J. ACM 50(5), 752–794 (2003)
9. Dudka, K., Peringer, P., Vojnar, T.: Predator: A shape analyzer based on symbolic
memory graphs (Competition contribution). In: Ábrahám, E., Havelund, K. (eds.)
TACAS 2014. LNCS, vol. 8413, pp. 412–414. Springer, Heidelberg (2014)
10. Ermis, E., Nutz, A., Dietsch, D., Hoenicke, J., Podelski, A.: Ultimate Kojak (Com-
petition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS,
vol. 8413, pp. 421–423. Springer, Heidelberg (2014)
11. Falke, S., Merz, F., Sinz, C.: LLBMC: Improved bounded model checking of C
programs using LLVM (Competition contribution). In: Piterman, N., Smolka, S.A.
(eds.) TACAS 2013. LNCS, vol. 7795, pp. 623–626. Springer, Heidelberg (2013)
12. Giesl, J., Schneider-Kamp, P., Thiemann, R.: AProVE 1.2: Automatic termina-
tion proofs in the dependency pair framework. In: Furbach, U., Shankar, N. (eds.)
IJCAR 2006. LNCS (LNAI), vol. 4130, pp. 281–286. Springer, Heidelberg (2006)
13. Graf, S., Saïdi, H.: Construction of abstract state graphs with Pvs. In: Grumberg,
O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 72–83. Springer, Heidelberg (1997)
14. Albarghouthi, A., Gurfinkel, A., Li, Y., Chaki, S., Chechik, M.: UFO: Verification
with interpolants and abstract interpretation. In: Piterman, N., Smolka, S.A. (eds.)
TACAS 2013. LNCS, vol. 7795, pp. 637–640. Springer, Heidelberg (2013)
15. Gurfinkel, A., Belov, A.: FrankenBit: Bit-precise verification with many bits (Com-
petition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS,
vol. 8413, pp. 408–411. Springer, Heidelberg (2014)
16. Heizmann, M., Christ, J., Dietsch, D., Hoenicke, J., Lindenmann, M., Musa, B.,
Schilling, C., Wissert, S., Podelski, A.: Ultimate automizer with unsatisfiable cores
(Competition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014.
LNCS, vol. 8413, pp. 418–420. Springer, Heidelberg (2014)
17. Heizmann, M., Hoenicke, J., Leike, J., Podelski, A.: Linear ranking for linear lasso
programs. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp.
365–380. Springer, Heidelberg (2013)
18. Henzinger, T.A., Jhala, R., Majumdar, R., McMillan, K.L.: Abstractions from
proofs. In: Proc. POPL, pp. 232–244. ACM (2004)
19. Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Lazy abstraction. In: Proc.
POPL, pp. 58–70. ACM (2002)
20. Inverso, O., Tomasco, E., Fischer, B., La Torre, S., Parlato, G.: Lazy-CSeq: A
lazy sequentialization tool for C (Competition contribution). In: Ábrahám, E.,
Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 398–401. Springer, Hei-
delberg (2014)
21. Jones, N.D., Muchnick, S.S.: A flexible approach to interprocedural data-flow anal-
ysis and programs with recursive data structures. In: POPL, pp. 66–74 (1982)
22. King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–
394 (1976)
23. Kröning, D., Sharygina, N., Tsitovich, A., Wintersteiger, C.M.: Termination anal-
ysis with compositional transition invariants. In: Touili, T., Cook, B., Jackson, P.
(eds.) CAV 2010. LNCS, vol. 6174, pp. 89–103. Springer, Heidelberg (2010)
24. Kröning, D., Tautschnig, M.: CBMC – C bounded model checker (Competition
contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413,
pp. 389–391. Springer, Heidelberg (2014)
25. Löwe, S., Mandrykin, M., Wendler, P.: CPAchecker with sequential combination
of explicit-value analyses and predicate analyses (Competition contribution). In:
Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 392–394.
Springer, Heidelberg (2014)
26. Morse, J., Ramalho, M., Cordeiro, L., Nicole, D., Fischer, B.: ESBMC 1.22 (Com-
petition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS,
vol. 8413, pp. 405–407. Springer, Heidelberg (2014)
27. Muller, P., Vojnar, T.: CPAlien: Shape analyzer for CPAChecker (Competition
contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413,
pp. 395–397. Springer, Heidelberg (2014)
28. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear rank-
ing functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp.
239–251. Springer, Heidelberg (2004)
29. Popeea, C., Rybalchenko, A.: Threader: A verifier for multi-threaded programs
(Competition contribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013.
LNCS, vol. 7795, pp. 633–636. Springer, Heidelberg (2013)
30. Rocha, H., Barreto, R., Cordeiro, L., Neto, A.D.: Understanding programming bugs
in ANSI-C software using bounded model checking counter-examples. In: Derrick,
J., Gnesi, S., Latella, D., Treharne, H. (eds.) IFM 2012. LNCS, vol. 7321, pp. 128–
142. Springer, Heidelberg (2012)
31. Shved, P., Mandrykin, M., Mutilin, V.: Predicate analysis with BLAST 2.7. In:
Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 525–527.
Springer, Heidelberg (2012)
32. Slaby, J., Strejček, J.: Symbiotic 2: More precise slicing (Competition contribution).
In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 415–417.
Springer, Heidelberg (2014)
33. Tomasco, E., Inverso, O., Fischer, B., La Torre, S., Parlato, G.: MU-CSeq: Sequen-
tialization of C programs by shared memory unwindings (Competition contribu-
tion). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp.
402–404. Springer, Heidelberg (2014)
34. Urban, C., Miné, A.: An abstract domain to infer ordinal-valued ranking functions.
In: Shao, Z. (ed.) ESOP 2014. LNCS, vol. 8410, pp. 412–431. Springer, Heidelberg
(2014)
CBMC – C Bounded Model Checker
(Competition Contribution)
1 Overview
The C Bounded Model Checker (CBMC) [2] demonstrates the violation of as-
sertions in C programs, or proves safety of the assertions under a given bound.
CBMC implements a bit-precise translation of an input C program, annotated
with assertions and with loops unrolled to a given depth, into a formula. If the
formula is satisfiable, then an execution leading to a violated assertion exists.
For SV-COMP, satisfiability of the formula is decided using MiniSat 2.2.0 [4].
2 Architecture
Bounded model checkers such as CBMC reduce questions about program paths
to constraints that can be solved by off-the-shelf SAT or SMT solvers. With the
SAT back end, and given a program annotated with assertions, CBMC outputs a
CNF formula the solutions of which describe program paths leading to assertion
violations. In order to do so, CBMC performs the following main steps, which
are outlined in Figure 1, and are explained below.
Front end. The command-line front end first configures CBMC according to
user-supplied parameters, such as the bit-width. The C parser utilises an off-
the-shelf C preprocessor (such as gcc -E) and builds a parse tree from the pre-
processed source. Source file- and line information is maintained in annotations.
Type checking populates a symbol table with type names and symbol identifiers
by traversing the parse tree. Each symbol is assigned bit-level type information.
CBMC aborts if any inconsistencies are detected at this stage.
Fig. 1. Main components of CBMC: command line, front end, symbolic execution, CNF conversion, SAT solver, and counterexample analysis.
Back end. While CBMC also supports SMT solvers as back ends, we use Min-
iSat 2.2.0 in this competition. Consequently, the resulting equation is translated
into a CNF formula by bit-precise modelling of all expressions plus the Boolean
guards [3]. A model computed by the SAT solver corresponds to a path violat-
ing at least one of the assertions in the program under scrutiny, and the model
is translated back to a sequence of assignments to provide a human-readable
counterexample. Conversely, if the formula is unsatisfiable, no assertion can be
violated within the given unwinding bounds.
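To make this concrete, the following small C program is an illustrative sketch (it is not taken from the paper or from the benchmark set, and it assumes the common convention that a declared but undefined function such as nondet_uint() yields a nondeterministic value). With the loop unrolled to a depth of at least 10, the generated formula is satisfiable, and the satisfying assignment corresponds to a trace that violates the assertion.

#include <assert.h>

unsigned int nondet_uint(void);   /* treated as a nondeterministic input */

int main(void) {
  unsigned int n = nondet_uint();
  unsigned int i = 0, sum = 0;
  while (i < n && i < 10) {       /* unrolled up to the given bound */
    sum += i;
    i++;
  }
  /* violated whenever n >= 10, since 0 + 1 + ... + 9 == 45 */
  assert(sum != 45);
  return 0;
}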
4 Tool Setup
The competition submission is based on CBMC version 4.5. The full source code
of the competing version is available at
http://svn.cprover.org/svn/cbmc/releases/cbmc-4.5-sv-comp-2014/.
To process a benchmark FOO.c (with properties in FOO.prp), the script
cbmc-wrapper.sh should be invoked as follows:
cbmc-wrapper.sh --propertyfile FOO.prp --32 FOO.c
for all categories with a 32-bit memory model; for those with a 64-bit memory
model, --32 should be replaced by --64.
5 Software Project
CBMC is maintained by Daniel Kroening with patches supplied by the com-
munity. It is made publicly available under a BSD-style license. The source code
and binaries for popular platforms are available at http://www.cprover.org/cbmc.
References
1. Alglave, J., Kroening, D., Tautschnig, M.: Partial orders for efficient bounded model
checking of concurrent software. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS,
vol. 8044, pp. 141–157. Springer, Heidelberg (2013)
2. Clarke, E., Kroening, D., Lerda, F.: A tool for checking ANSI-C programs. In:
Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 168–176. Springer,
Heidelberg (2004)
3. Clarke, E.M., Kroening, D., Yorav, K.: Behavioral consistency of C and Verilog
programs using Bounded Model Checking. In: DAC, pp. 368–371 (2003)
4. Eén, N., Sörensson, N.: An extensible SAT-solver. In: Giunchiglia, E., Tacchella, A.
(eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)
CPAchecker with Sequential Combination
of Explicit-Value Analyses
and Predicate Analyses
(Competition Contribution)
1 Software Architecture
CPAchecker, which is built on the foundations of Configurable Program
Analysis (CPA), strives for high extensibility and reuse. As such, auxiliary anal-
yses, such as tracking the program counter, modeling the call stack, and keeping
track of function pointers, all of which are required for virtually any verification
tool, are implemented as independent CPAs. The same is true for the main anal-
yses, such as the explicit-value analysis and the analysis based on predicate
abstraction, which are also available as decoupled CPAs within CPAchecker.
All these CPAs can be enabled and flexibly recombined on demand without
the need to change adjacent CPAs. Other algorithms, like CEGAR,
counterexample checks, and parallel or sequential combinations of analyses, such as the
one submitted to this year's SV-COMP'14, can be plugged together simply by
passing the corresponding configuration options to the CPAchecker framework.
CPAchecker, which is written in Java, uses the C parser of the Eclipse CDT
project1 , and MathSAT52 for solving SMT formulae and interpolation queries.
1 http://www.eclipse.org/cdt/
2 http://mathsat.fbk.eu/
2 Verification Approach
analysis for an even higher confidence in the result. For checking memory safety
properties, we use a bounded analysis consisting of concrete memory graphs in
combination with an instance of the explicit-value analysis mentioned above.
Please add the parameter -64 for C programs assuming a 64-bit environment.
For machines with less RAM, the amount of memory given to the Java VM
needs to be adjusted with the parameter -heap. CPAchecker will print the
verification result and the name of the output directory to the console. Additional
information (such as the error path) will be written to files in this directory.
References
1. Beyer, D., Henzinger, T.A., Keremoglu, M.E., Wendler, P.: Conditional model check-
ing: A technique to pass information between verifiers. In: Proc. FSE. ACM (2012)
2. Beyer, D., Keremoglu, M.E., Wendler, P.: Predicate abstraction with adjustable-
block encoding. In: Proc. FMCAD, pp. 189–197, FMCAD (2010)
3. Beyer, D., Löwe, S.: Explicit-state software model checking based on CEGAR and
interpolation. In: Cortellessa, V., Varró, D. (eds.) FASE 2013. LNCS, vol. 7793, pp.
146–162. Springer, Heidelberg (2013)
CPAlien: Shape Analyzer for CPAChecker
(Competition Contribution)
1 Verification Approach
pointer analysis based on SMGs works well for enough test cases from the SV-COMP
benchmark to get a positive score in categories where we participate.
The CPA framework allows one to merge the encountered states to reduce the gen-
erated state space. This feature is, however, not used in CPAlien. To compute the covering relation, which is used by the high-level CPA reachability algorithm to determine the end of the state-space search, CPAlien uses the SMG join operation. CPAlien also uses several specialized helper analyses provided by the CPAChecker framework to deal with certain specific tasks. These helper analyses are the Location, CallStack, and FunctionPointer CPAs.
2 Software Architecture
CPAlien builds upon the CPAChecker framework for implementation, execution, and combination of instances of the CPA formalism. CPAChecker implements a reachability analysis algorithm over a generic CPA and also provides several other algorithms. CPAlien is an implementation of a CPA instance, consisting of the abstract domain definition and the transfer relation between the states. Symbolic execution is driven by CPAChecker. CPAChecker also provides a C language parsing capability, wrapping a C parser present in the Eclipse CDT. Both CPAChecker and CPAlien are written in Java.
and its integration with other specialized analyses, providing heap analysis capabilities
still missing in the CPAChecker ecosystem.
4 Tool Setup and Configuration
CPAlien is available online at the project page:
http://www.fit.vutbr.cz/research/groups/verifit/tools/cpalien/
It is a modified version of the upstream CPAChecker, containing code not yet present in the upstream repository. For the participation in the competition, we have prepared a tarball. The only dependency needed to run CPAlien is Java version 7.
For running the verifier, we have prepared a wrapper script to provide the output
required by the competition rules. The script is run in the following way:
$ ./cpalien.sh target_program.c
Upon completion, a single line with the answer is provided. More information about
the verification result, such as the error path, is provided in the output directory. The
tool does not adhere to competition requirements with respect to property files: it does
not allow a property file to be passed as a parameter. This was caused by our incorrect
reading of the requirements. The property file is expected to be present in the same
directory as the verification task.
CPAlien participates in the Heap Manipulation, Memory Safety, and Control Flow and Integer Variable categories. We opt out of the remaining ones.
5 Software Project and Contributors
CPAlien is an extension of the CPAChecker project, building heavily on CPAChecker. CPAlien is developed by the VeriFIT1 group at the Brno University of Technology. A significant part of the SMG code was contributed by Alexander Driemeyer from the University of Passau, whom we would like to thank. CPAChecker is a project developed mainly by the Software Systems Lab2 at the University of Passau. Both CPAlien and CPAChecker are distributed under the Apache 2.0 license.
References
1. Dudka, K., Peringer, P., Vojnar, T.: Predator: A Practical Tool for Checking Manipulation of
Dynamic Data Structures Using Separation Logic. In: Gopalakrishnan, G., Qadeer, S. (eds.)
CAV 2011. LNCS, vol. 6806, pp. 372–378. Springer, Heidelberg (2011)
2. Dudka, K., Peringer, P., Vojnar, T.: Byte-Precise Verification of Low-Level List Manipulation.
In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 215–237. Springer,
Heidelberg (2013)
3. Beyer, D., Henzinger, T.A., Théoduloz, G.: Configurable Software Verification: Concretizing
the Convergence of Model Checking and Program Analysis. In: Damm, W., Hermanns, H.
(eds.) CAV 2007. LNCS, vol. 4590, pp. 504–518. Springer, Heidelberg (2007)
4. Beyer, D., Keremoglu, M.E.: CPACHECKER: A Tool for Configurable Software Verification.
In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 184–190. Springer,
Heidelberg (2011)
1 http://www.fit.vutbr.cz/research/groups/verifit/
2 http://www.sosy-lab.org/
Lazy-CSeq: A Lazy Sequentialization Tool for C
(Competition Contribution)
1 Introduction
Sequentialization translates concurrent programs into (under certain assumptions) equiv-
alent non-deterministic sequential programs and so reduces concurrent verification to
its sequential counterpart. The widely used (e.g., in CSeq [2,3] or Rek [1]) sequential-
ization schema by Lal and Reps (LR) [6] considers only round-robin schedules with
K rounds, which bounds the number of context switches between the different threads.
LR first replaces the shared global memory by K indexed copies. It then executes the
individual threads to completion, simulating context switches by non-deterministically
incrementing the index. The first thread works with the initial memory guesses, while
the remaining threads work with the values left by their predecessors. The initial guesses
are also stored in a second set of copies; after all threads have terminated these are used
to ensure consistency (i.e., the last thread has ended its execution in each round with
initial guesses for the next round).
LR explores a large number of configurations unreachable by the concurrent pro-
gram, due to the completely non-deterministic choice of the global memory copies and
the late consistency check. The lazy sequentialization schema by La Torre, Madhu-
sudan, and Parlato (LMP) [4,5] avoids this non-determinism, but at each context switch
it re-computes from scratch the local state of each process. This can lead to verifica-
tion conditions of exponential size when constructing the formula in a bounded model
checking approach (due to function inlining). However, for bounded programs this re-
computation can be avoided and the sequentialized program can instead jump to the
context switch points. Lazy-CSeq implements this improved bounded LMP schema
(bLMP) for sequentially consistent C programs that use POSIX threads.
This work was partially funded by the MIUR grant FARB 2011-2012, Università degli Studi
di Salerno (Italy).
2 Verification Approach
Overview. bLMP considers only round-robin schedules with K rounds. It further as-
sumes that the concurrent program (and thus in particular the number of possible threads)
is bounded and that all jumps are forward jumps, which are both enforced in Lazy-CSeq
by unrolling. Unlike LR, however, bLMP does not run the individual threads to comple-
tion in one fell swoop; instead, it repeatedly calls the sequentialized thread functions in
a round-robin fashion. For each thread it maintains the program locations at which the
previous round’s context switch has happened and thus the computation must resume
in the next round. The sequentialized thread functions then jump (in multiple hops)
back to these stored locations. bLMP also keeps the thread-local variables persistent (as
static) and thus, unlike the original LMP, does not need to re-compute their values
from saved copies of previous global memory states before it resumes the computation.
Data Structures. bLMP only stores and maintains, for each thread, a flag denoting
whether the thread is active, the thread’s original arguments, and an integer denoting the
program location at which the previous context switch has happened. Since it does not
need any copy of the shared global memory, heap allocation needs no special treatment
during the sequentialization and can be delegated entirely to the backend model checker.
Main Driver. The sequentialized program’s main function orchestrates the analysis.
It consists of a sequence of small code snippets, one for each thread and each round,
that check the thread’s active flag (maintained by Lazy-CSeq’s implementation of the
pthread_create and pthread_join functions), and, if this is set, non-deterministically increment the next context switch point pc_cs (which must be smaller than the thread's size), call the sequentialized thread function with the original arguments, and store the context switch point for the next round:

if (active_tr[thr_idx] == 1) {
  pc_cs = pc[thr_idx] + nondet_uint();
  assume(pc_cs <= SIZE_<thr_idx>);
  thread_<thr_idx>(thr_args[thr_idx]);
  pc[thr_idx] = pc_cs;
}

Lazy-CSeq obtains from the unrolling phase the set of thread instances that the original concurrent program can pos-
sibly create within the given bounds. This allows the static construction of the main
driver. Note that the choice of the context switch points in the driver is the only addi-
tional non-determinism introduced by the sequentialization.
Thread Translation. The sequentialized program also contains a function for each
thread instance (including the original main) identified during the unrolling phase.
Within the function each statement is guarded by a check whether its location is be-
fore the stored location or after the next context switch non-deterministically chosen
by the driver. In the former case, the statement has already been executed in a previ-
ous round, and the simulation jumps ahead one hop; in the latter case, the statement
will be executed in a future round, and the simulation jumps to the thread’s exit. Each
jump target (corresponding either directly to a goto label or indirectly to a branch of
an if statement) is also guarded by an additional check to ensure that the jump does
not jump over the context switch. Since bLMP only explores states reachable in the
original concurrent program, assert statements need no special treatment during the
sequentialization and can be delegated entirely to the backend model checker.
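The following C fragment is a minimal sketch of this guarding scheme; the GUARD macro, the labels, and thread_0 are invented for illustration and are not Lazy-CSeq's actual generated code, while pc and pc_cs follow the driver snippet above. Each statement is bracketed by a guard that either skips it (if it was executed in an earlier round) or jumps to the thread's exit (if it lies beyond the chosen context switch).

unsigned pc[2];       /* resume location per thread (previous round's switch) */
unsigned pc_cs;       /* context-switch point chosen by the driver            */

#define GUARD(t, loc)                                                     \
  if ((loc) < pc[t])  goto resume_##loc;  /* executed in an earlier round */ \
  if ((loc) >= pc_cs) goto exit_##t;      /* postponed to a later round   */

void thread_0(void) {
  static int x;                /* thread-locals are kept persistent */
  GUARD(0, 1); x = 1;      resume_1:;
  GUARD(0, 2); x = x + 1;  resume_2:;
exit_0:
  return;
}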
References
1. Chaki, S., Gurfinkel, A., Strichman, O.: Time-bounded analysis of real-time systems. In: FM-
CAD, pp. 72–80 (2011)
2. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Sequentialization Tool for C (Competition Con-
tribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 616–618.
Springer, Heidelberg (2013)
3. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Concurrency Pre-Processor for Sequential C
Verification Tools. In: ASE, pp. 710–713 (2013)
4. La Torre, S., Madhusudan, P., Parlato, G.: Reducing context-bounded concurrent reachability
to sequential reachability. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp.
477–492. Springer, Heidelberg (2009)
5. La Torre, S., Madhusudan, P., Parlato, G.: Sequentializing parameterized programs. In: FIT,
EPTCS 87, pp. 34–47 (2012)
6. Lal, A., Reps, T.W.: Reducing concurrent analysis under a context bound to sequential analy-
sis. Formal Methods in System Design 35(1), 73–97 (2009)
7. Tomasco, E., Inverso, O., Fischer, B., La Torre, S., Parlato, G.: MU-CSeq: Sequentialization
of C Programs by Shared Memory Unwindings (Competition Contribution). In: Ábrahám,
E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 402–404. Springer, Heidelberg
(2014)
MU-CSeq: Sequentialization of C Programs
by Shared Memory Unwindings
(Competition Contribution)
1 Introduction
Sequentialization translates a concurrent program into a corresponding sequential one
while preserving a given verification property (e.g., reachability). The idea is to reuse
in the domain of concurrent programs the technology developed for the analysis of se-
quential programs. This simplifies and speeds up the development of robust tools for
concurrent programs. It also allows the designers to focus only on the concurrency
aspects and provides them with a framework in which they can quickly check the effec-
tiveness of their solutions. A sequentialization tool can be designed as a front-end for a
number of analysis tools that share the same input language, and thus many alternatives
are immediately available.
We design a new sequentialization algorithm for multi-threaded C programs with
dynamic thread creation. Its main novelty is the idea of memory unwinding (MU). We
fix (by a nondeterministic guess) the sequence of write operations in the shared mem-
ory and then simulate the behavior of the program according to any scheduling that
respects this choice. We can then use the number of writes in the shared memory
as a parameter of the bounded analysis, which is orthogonal to the number
of context switches underlying previous research on sequentializations based on
the notion of bounded context-switching (e.g., [10,6,7,2,1,8,9]). Moreover, MU-CSeq
naturally accommodates the simulation of dynamic thread creation by function calls.
We implement MU-CSeq as a new module of the tool CSeq [3,4]. Other modules of
CSeq implement the Lal/Reps algorithm [6] and a lazy-sequentialization scheme aimed
to exploit bounded model checking [5].
This work was partially funded by the MIUR grant FARB 2011-2012, Università degli Studi
di Salerno (Italy).
2 Verification Approach
Overview. MU-CSeq translates a multi-threaded C program P into a standard C program P′. The source-to-source translation is parameterized over the number of writes
Nw in the shared memory and the maximum number of threads Nt . The overall scheme
consists of guessing a sequence σ of Nw writes and then simulating any execution of P
that matches σ. The simulation is done thread-by-thread, starting from the original main
function; when a new thread is created the simulation of the current thread is suspended
until the simulation of the new thread has ended. When the number of threads passes
the bound Nt , each new thread creation operation is just ignored.
Modules of P’. The main function of P is in charge of guessing a consistent sequence
of writes σ and starting the simulation of P . P has a function for each function (includ-
ing the main) and each thread of P . The translation of P modules into the corresponding
modules of P consists of: 1) adding a few lines of control code to handle creation and
execution of threads, and 2) replacing the reads and writes in the shared memory with
calls to read and write functions, respectively.
Guessing the Sequence of Writes. We use a global two-dimensional array mem that
corresponds to the temporal unwinding of the shared memory according to the memory
updates. Here, each column corresponds to an updating event (i.e., a write) in σ and
each row corresponds to a variable. The entry mem[i,j] contains the value of the i-th
shared variable after the j-th write in σ. We use a second global array sigma to store
for each write the involved variable and the thread that has executed the write. To guess
the writes, we assign non-deterministic values to these arrays. The main function of P′ then uses assume statements to check the consistency of the values stored in the guessed arrays before starting the simulation of P.
Accessing Global Memory. On executing each thread t, we store in a variable thr_pos the index of the last executed write in σ. This variable is updated by read and write. On calling write for the assignment x=e, thr_pos is updated to the corresponding index and then mem[x,thr_pos]=e is checked. On calling read for reading variable x, thr_pos is first nondeterministically updated to any index between its current value and the next write in σ by t, and then mem[x,thr_pos] is returned.
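The following C fragment is only a rough sketch of the read and write functions just described; the array sizes, the sigma encoding, and the helpers nondet_uint and assume are assumptions made for illustration, and the actual MU-CSeq code differs.

#define NVARS 4                  /* number of shared variables (assumed) */
#define NW    16                 /* bound on the number of writes        */

int mem[NVARS][NW + 1];          /* mem[x][j]: value of variable x after write j */
int sigma_var[NW + 1];           /* variable updated by the j-th write           */
int sigma_thr[NW + 1];           /* thread performing the j-th write             */
unsigned thr_pos;                /* index of the last write seen by the thread
                                    currently being simulated                    */

unsigned nondet_uint(void);
void assume(int cond);

/* write x = e by thread t: move to this thread's next write in sigma and
   check that the guessed memory content agrees with the value written */
void write_var(int x, int e, int t) {
  thr_pos += 1 + nondet_uint();
  assume(thr_pos <= NW);
  assume(sigma_var[thr_pos] == x && sigma_thr[thr_pos] == t);
  assume(mem[x][thr_pos] == e);
}

/* read x by thread t: the value may be taken at any position between the
   current one and the thread's next write in sigma (that side condition
   is omitted here for brevity) */
int read_var(int x, int t) {
  unsigned jump = nondet_uint();
  (void)t;                       /* t would be needed for the omitted condition */
  assume(thr_pos + jump <= NW);
  thr_pos += jump;
  return mem[x][thr_pos];
}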
Thread Creation and Execution. Thread creation and execution are implemented as
function calls in P . Thus, if a thread t2 is created from a thread t1 , the simulation of t1
stops until the call to t2 has terminated. Before the simulation of t2 starts, the current
value of thr_pos is stored in a local variable such that when t2 has terminated, the simulation of t1 restarts from this index. Accordingly, the simulation of t2 starts from the current value of thr_pos. When either the last statement of thread t2 is reached, or
a write after the last guessed write for t2 is executed, or an index greater than Nw is
guessed for a read, then all the calls of thread t2 are returned, including the call that has
started the thread simulation. After the return we check that all write operations that t2
has to execute actually happened.
References
1. Bouajjani, A., Emmi, M., Parlato, G.: On sequentializing concurrent programs. In: Yahav, E.
(ed.) SAS 2011. LNCS, vol. 6887, pp. 129–145. Springer, Heidelberg (2011)
2. Emmi, M., Qadeer, S., Rakamaric, Z.: Delay-bounded scheduling. In: POPL, pp. 411–422
(2011)
3. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Sequentialization Tool for C (Competition
Contribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp.
616–618. Springer, Heidelberg (2013)
4. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Concurrency Pre-Processor for Sequential C
Verification Tools. In: ASE, pp. 710–713 (2013)
5. Inverso, O., Tomasco, E., Fischer, B., La Torre, S., Parlato, G.: Lazy-CSeq: A Lazy Se-
quentialization tool for C (Competition Contribution). In: Ábrahám, E., Havelund, K. (eds.)
TACAS 2014. LNCS, vol. 8413, pp. 398–401. Springer, Heidelberg (2014)
6. Lal, A., Reps, T.W.: Reducing concurrent analysis under a context bound to sequential anal-
ysis. Formal Methods in System Design 35(1), 73–97 (2009)
7. La Torre, S., Madhusudan, P., Parlato, G.: Reducing context-bounded concurrent reachability
to sequential reachability. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643,
pp. 477–492. Springer, Heidelberg (2009)
8. La Torre, S., Madhusudan, P., Parlato, G.: Sequentializing parameterized programs. In: FIT,
EPTCS 87, pp. 34–47 (2012)
9. La Torre, S., Parlato, G.: Scope-bounded Multistack Pushdown Systems: Fixed-Point, Se-
quentialization, and Tree-Width. In: FSTTCS. LIPIcs, vol. 18, pp. 173–184 (2012)
10. Qadeer, S., Wu, D.: KISS: keep it simple and sequential. In: PLDI, pp. 14–24 (2004)
ESBMC 1.22
(Competition Contribution)
1 Overview
ESBMC is a context-bounded symbolic model checker that allows the verification of
single- and multi-threaded C code with shared variables and locks. ESBMC was origi-
nally branched off CBMC (v2.9) [4] and has inherited its object-based memory model.
With the increasingly large SV-COMP benchmarks this is now reaching its limits. We
have thus implemented an improved memory model for ESBMC; however, we opted for
an incremental change and have kept the underlying object-based model in place, rather
than adopting a fully byte-precise memory model as, for example, used by LLBMC [7].
We believe this strikes the right balance between precision and scalability.
In this paper we focus on the differences from the ESBMC version used in last year’s
competition (1.20) and, in particular, on the memory model; an overview of ESBMC’s
architecture and more details are given in our previous work [1–3, 5].
3 Memory Model
The correct implementation of operations involving pointers is a significant challenge
in model checking C programs. As a bounded model checker, ESBMC reduces the
bounded program traces to first order logic, which requires us to eliminate pointers in
the model checker. We follow CBMC’s approach and use a static analysis to approx-
imate for each pointer variable the set of data objects (i.e., memory chunks) at which
it might point at some stage in the program execution. The data objects are numbered,
and a pointer target is represented by a pair of integers identifying the data object and
the offset within the object. The value of a pointer variable is then the set of (object,
offset)-pairs to which the pointer may point at the current execution step. The result
of a dereference is the union of the sets of values associated with each of the (object,
offset)-pairs.
The performance of this approach suffers if pointer offsets cannot be statically de-
termined, e.g., if a program reads a byte from an arbitrary offset into a structure. The
resulting SMT formula is large and unwieldy, and its construction is error-prone. To
avoid this, we extended the static pointer analysis to determine the weakest alignment
guarantee that a particular pointer variable provides, and inserted padding in structures
to make all fields align to word boundaries, as prescribed by C’s semantics.
These guarantees, in combination with enforcing memory access alignment rules,
allow us to significantly reduce the number of valid dereference behaviours and thus
the size of the resulting formula, and to detect alignment errors which we have previ-
ously ignored. In circumstances where the underlying type of a memory allocation is
unclear (e.g., dynamically allocated memory with nondeterministic size), we fall back
to allocating a byte array and piecing together higher level types from the bytes.
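As a hypothetical illustration (not taken from the benchmark suite) of the problematic pattern mentioned above, the byte read at a statically unknown offset below forces the verifier to consider every (object, offset) pair the pointer expression may denote; nondet_uint() stands for a nondeterministic input.

unsigned int nondet_uint(void);

struct packet { int id; char payload[8]; };

int main(void) {
  struct packet p = { 42, "abcdefg" };
  unsigned int off = nondet_uint();
  if (off < sizeof p) {
    char b = *((char *)&p + off);   /* offset not statically determined      */
    (void)b;                        /* even without further use, the         */
                                    /* dereference must be shown to be valid */
  }
  return 0;
}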
Other model checkers (in particular LLBMC [7]) treat all memory as a single byte
array, upon which all pointer accesses are decomposed into byte operations. This can
lead to performance problems due to the repeated updates to the memory array that
need to be reflected in the SMT formula.
4 Competition Approach
In bounded model checking, the choice of the unwinding bounds can make a huge
difference. In contrast to previous years, where we only used a single experimentally
determined unwinding bound, we now operate an explicit iterative deepening schema
(n = 8, 12, 16). This replaces the iterative deepening that is implicit in the k-induction
that we used last year [5]. In addition, we no longer use the partial loops option [3].
For categories other than MemorySafety we only check for the reachability of the
error label and ignore all other built-in properties. We use a small script that implements
iterative deepening and calls ESBMC with the main parameters set as follows:
esbmc --timeout 15m --memlimit 15g --64 --unwind <n>
--no-unwinding-assertions --no-assertions --error-label ERROR
--no-bounds-check --no-div-by-zero-check --no-pointer-check <f>
5 Results
With the approach described above, ESBMC correctly classifies 1837 benchmarks as correct and finds the existing errors in 557. However, it also reports unexpected errors for 38
benchmarks and fails to find the expected errors in another 52. The failures are con-
centrated in the MemorySafety and Recursive categories, where we produce 36
and 15 unexpected results, respectively. In MemorySafety, these are caused by differences between the memory model assumed by the competition and the one implemented in ESBMC; in particular, in 22 cases ESBMC detects an unchecked dereference of a
pointer to a freshly allocated memory chunk, which can lead to a null pointer violation
and so mask the result expected by the benchmark. In Recursive, all unexpected
results are false alarms, which are caused by bounding the programs. Additionally,
ESBMC produces 259 time-outs, which mostly stem from the larger benchmarks in
ldv-consumption, ldv-linux-3.4-simple, seq-mthreaded, and eca.
The remaining programs fail due to parsing errors (16), conversion error (1), or dif-
ferent internal (mostly out-of-memory) errors during the symbolic execution (108). ES-
BMC produces good results for all categories but MemorySafety and Recursive;
however, since we did not opt out of these, our overall result suffered substantially.
ESBMC’s performance has improved greatly over last year’s version (v1.20). The
number of errors detected has gone up from 448 to 557, while the number of un-
expected and missed errors has gone down, from 53 to 38 and from 209 to 52, re-
spectively. The biggest improvements are in the categories Sequentialized and
ControlFlowInteger.
Demonstration Section. We took part in the stateful verification, error-witness check-
ing, and device-driver challenges tracks. In particular, we use EZProofC [6] to collect
and manipulate the counterexample produced by ESBMC in order to reproduce the
identified error for the first round (B1) of the error-witness checking.
Acknowledgements. The third author thanks Samsung for financial support.
References
1. Cordeiro, L., Fischer, B.: Verifying Multi-Threaded Software using SMT-based Context-
Bounded Model Checking. In: ICSE, pp. 331–340 (2011)
2. Cordeiro, L., Fischer, B., Marques-Silva, J.: SMT-based bounded model checking for embed-
ded ANSI-C software. IEEE Trans. Software Eng. 38(4), 957–974 (2012)
3. Cordeiro, L., Morse, J., Nicole, D., Fischer, B.: Context-Bounded Model Checking with
ESBMC 1.17 (Competition Contribution). In: Flanagan, C., König, B. (eds.) TACAS 2012.
LNCS, vol. 7214, pp. 534–537. Springer, Heidelberg (2012)
4. Kroening, D., Clarke, E., Yorav, K.: Behavioral Consistency of C and Verilog Programs Using
Bounded Model Checking. In: DAC, pp. 368–371. IEEE (2003)
5. Morse, J., Cordeiro, L., Nicole, D., Fischer, B.: Handling Unbounded Loops with ESBMC
1.20 (Competition Contribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS,
vol. 7795, pp. 619–622. Springer, Heidelberg (2013)
6. Rocha, H., Barreto, R., Cordeiro, L., Neto, A.D.: Understanding Programming Bugs in ANSI-
C Software Using Bounded Model Checking Counter-Examples. In: Derrick, J., Gnesi, S.,
Latella, D., Treharne, H. (eds.) IFM 2012. LNCS, vol. 7321, pp. 128–142. Springer, Heidelberg
(2012)
7. Sinz, C., Falke, S., Merz, F.: A Precise Memory Model for Low-Level Bounded Model Check-
ing. In: SSV, USENIX (2010)
FrankenBit: Bit-Precise Verification
with Many Bits
(Competition Contribution)
1 Verification Approach
FrankenBit combines two orthogonal techniques: one searches for bit-precise
counterexamples, and the other synthesizes bit-precise inductive invariants. The
counterexample search is done using Bounded Model Checking, and is delegated
completely to LLBMC [11]. Invariant synthesis is implemented by first unsoundly
approximating programs using Linear Arithmetic (LA), then computing induc-
tive invariants for the approximation, and using those to guide the search for
bit-precise invariants. The details of this approach are described in [7].
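To give a flavour of why a linear-arithmetic invariant may fail over bit-vectors, consider the following hypothetical example (not taken from [7]; nondet_int is an assumed nondeterministic input): over mathematical integers the candidate invariant x >= 0 is inductive for the loop, but over 32-bit machine integers the increment eventually overflows, so the candidate must be weakened, or the overflow case handled, before it becomes a bit-precise inductive invariant.

extern int nondet_int(void);

int main(void) {
  int x = 0;                 /* candidate invariant: x >= 0                 */
  while (nondet_int()) {
    x = x + 1;               /* LA: preserves x >= 0                        */
                             /* BV: wraps to a negative value after INT_MAX */
  }
  return 0;
}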
2 Software Architecture
The architecture of FrankenBit is shown in Fig. 1. First, the input C source
is processed and compiled into LLVM [10] bitcode using the UFO front-end
(UFO-FE) [1]. This involves normalizing with a custom CIL [12] pass, compiling
with llvm-gcc, and simplifying using customized optimizations from LLVM ver-
sion 2.6. The front-end is often sufficient to decide simple verification tasks. Sec-
ond, two threads are started, one used to synthesize an inductive invariant (left
part of Fig. 1), and the other to search for a counterexample (right part of Fig. 1).
This material is based upon work funded and supported by the Department of De-
fense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for
the operation of the Software Engineering Institute, a federally funded research and
development center. This material has been approved for public release and unlim-
ited distribution. DM-0000870. The second author is financially supported by SFI
PI grant BEACON (09/IN.1/I2618).
Fig. 1. FrankenBit architecture: UFO-FE compiles the C source to LLVM bitcode; an invariant-synthesis thread runs UFO-MUZ, an LA → BV translation, Z3, Misper (with Boolector, AIGER, and MUSer2), and Z3/PDR, while a counterexample-search thread runs LLBMC; the combined outcome is TRUE, FALSE, or UNKNOWN.
Invariants. Invariants are synthesized using our new algorithm Misper [7]. First,
the Z3/PDR engine [8] of UFO (UFO-MUZ) abstracts the input over Linear Arith-
metic (LA) and synthesizes an LA invariant. If this fails, synthesis is aborted. Sec-
ond, the LA invariant and abstraction are converted to bit-vectors (LA → BV).
Third, the candidate bit-vector (BV) invariant is checked using Z3 [4]. If the
candidate is not inductive, it is weakened until it becomes inductive using Mis-
per, which, in turn, uses Boolector [3] for bit-blasting, AIGER for CNF conversion,
and MUSer2 [2] for the extraction of Minimal Unsatisfiable Subformulas (MUSes).
Finally, the safety of the weakened invariant is checked again with Z3 (Z3 safety)
and, if necessary, the invariant is strengthened using the bit-precise version of Z3/PDR.
References
1. Albarghouthi, A., Gurfinkel, A., Li, Y., Chaki, S., Chechik, M.: UFO: Verifica-
tion with Interpolants and Abstract Interpretation (Competition Contribution).
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 637–640.
Springer, Heidelberg (2013)
2. Belov, A., Marques-Silva, J.: MUSer2: An Efficient MUS Extractor. JSAT 8(1/2)
(2012)
3. Brummayer, R., Biere, A.: Boolector: An Efficient SMT Solver for Bit-Vectors and
Arrays. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505,
pp. 174–177. Springer, Heidelberg (2009)
4. de Moura, L., Bjørner, N.: Z3: An Efficient SMT Solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
5. Eén, N., Sörensson, N.: An Extensible SAT-solver. In: Giunchiglia, E., Tacchella,
A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)
6. Ganesh, V., Dill, D.L.: A Decision Procedure for Bit-Vectors and Arrays. In:
Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 519–531.
Springer, Heidelberg (2007)
7. Gurfinkel, A., Belov, A., Marques-Silva, J.: Synthesizing Safe Bit-Precise Invari-
ants. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp.
93–108. Springer, Heidelberg (2014)
8. Hoder, K., Bjørner, N.: Generalized Property Directed Reachability. In: Cimatti,
A., Sebastiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 157–171. Springer, Hei-
delberg (2012)
9. Komuravelli, A., Gurfinkel, A., Chaki, S., Clarke, E.M.: Automatic Abstraction
in SMT-Based Unbounded Software Model Checking. In: Sharygina, N., Veith, H.
(eds.) CAV 2013. LNCS, vol. 8044, pp. 846–862. Springer, Heidelberg (2013)
10. Lattner, C., Adve, V.S.: LLVM: A Compilation Framework for Lifelong Program
Analysis & Transformation. In: CGO, pp. 75–88. IEEE Computer Society (2004)
11. Merz, F., Falke, S., Sinz, C.: LLBMC: Bounded Model Checking of C and C++
Programs Using a Compiler IR. In: Joshi, R., Müller, P., Podelski, A. (eds.) VSTTE
2012. LNCS, vol. 7152, pp. 146–161. Springer, Heidelberg (2012)
12. Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: Intermediate Language
and Tools for Analysis and Transformation of C Programs. In: Horspool, R.N. (ed.)
CC 2002. LNCS, vol. 2304, pp. 213–228. Springer, Heidelberg (2002)
Predator: A Shape Analyzer
Based on Symbolic Memory Graphs
(Competition Contribution)
Abstract. Predator is a shape analyzer that uses the abstract domain of symbolic
memory graphs in order to support various forms of low-level memory manipu-
lation commonly used in optimized C code. This paper briefly describes the ver-
ification approach taken by Predator and its strengths and weaknesses revealed
during its participation in the Software Verification Competition (SV-COMP’14).
1 Verification Approach
Predator is a shape analyzer that uses the abstract domain of symbolic memory graphs
(SMGs) in order to support various forms of low-level memory manipulation commonly
used in optimized C code. Compared to the separation-logic-based works [1] by which our work is inspired, SMGs allow one to easily apply various graph-based algorithms to efficiently manipulate the low-level memory representation.
The formal definition of SMGs can be found in [2] together with algorithms of all
the operations needed for use of SMGs in a fully automatic shape analysis. This is in
particular the case of a specialised unary abstraction operator and a binary join oper-
ator that aid termination of the SMG-based shape analysis. The join operator is based
on an algorithm that simultaneously traverses a pair of input SMGs and merges their
corresponding nodes. The core of the join algorithm is also used by the algorithm imple-
menting the abstraction operator to merge pairs of neighbouring nodes, together with
their sub-SMGs (describing the data structures nested below them), into a single list
segment. For checking entailment of SMGs, Predator again reuses the join algorithm
(extended to compare generality of the SMGs being joined).
Predator requires all external functions to be properly modelled wrt. memory safety
in order to exclude any side effects that could possibly break soundness of the analy-
sis. Our distribution of Predator includes models of memory allocation functions (like
malloc or free) and selected memory manipulating functions (memset, memcpy,
memmove, etc.).
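The following small program is an illustrative example (not from the competition benchmarks; nondet_int() stands for a nondeterministic input) of the kind of low-level list manipulation Predator targets: the SMG-based abstraction summarises the unbounded list into a list segment, which lets the analysis prove the absence of leaks and invalid dereferences.

#include <stdlib.h>

struct node { struct node *next; int data; };

extern int nondet_int(void);

int main(void) {
  struct node *head = NULL;
  while (nondet_int()) {              /* build a list of arbitrary length */
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
      break;                          /* fall through and free what exists */
    n->next = head;
    n->data = 0;
    head = n;
  }
  while (head != NULL) {              /* destroy the list again */
    struct node *tmp = head->next;
    free(head);
    head = tmp;
  }
  return 0;
}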
Since SV-COMP’13, the core algorithms of the shape analysis have been reimplemented
in order to match their description presented in [2]. Consequently, the current imple-
mentation is much easier to follow, but at the same time also faster and more precise (as
witnessed by the results of SV-COMP’14).
This work was supported by the Czech Science Foundation project 14-11384S and the
EU/Czech IT4Innovations Centre of Excellence project CZ.1.05/1.1.00/02.0070.
2 Software Architecture
The source code of the Predator release1 used in the competition can be downloaded
from the project web page. The file README-SVCOMP-2014 included in the archive
1 http://www.fit.vutbr.cz/research/groups/verifit/tools/predator/download/predator-2013-10-30-d1bd405.tar.gz
describes how to build Predator from source code and how to apply the tool on the com-
petition benchmarks. After successfully building the tool from sources, the sl build
directory contains a script named check-property.sh, which needs to be invoked
once for each input program. Besides the name of the input program, the script requires
a mandatory option --propertyfile specifying the property to be verified. Com-
piler flags needed to compile the input program with GCC must be specified after the
file name of the input program. For programs relying on a particular target architecture
(such as preprocessed C sources), it is important to use the -m32 or -m64 compiler
flags to specify the architecture. The script also provides an optional --trace flag
that allows one to write the error trace to a file. The verification result is printed to the
standard output on success. Otherwise, the verification outcome should be treated as
UNKNOWN. The script does not check for exceeding any resource limits on its own.
Although we use a global configuration of Predator for all categories, the tool pro-
vides many useful compile-time options via the sl/config.h configuration file. The
default configuration is tweaked to obtain good overall results in both the competi-
tion benchmark and Predator’s regression test-suite. The configuration can be further
tweaked to improve the results in a particular category, however, at the cost of losing
some points in other categories.
References
1. Berdine, J., Calcagno, C., Cook, B., Distefano, D., O’Hearn, P.W., Wies, T., Yang, H.: Shape
Analysis for Composite Data Structures. In: Damm, W., Hermanns, H. (eds.) CAV 2007.
LNCS, vol. 4590, pp. 178–192. Springer, Heidelberg (2007)
2. Dudka, K., Peringer, P., Vojnar, T.: Byte-Precise Verification of Low-Level List Manipulation.
In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 215–237. Springer,
Heidelberg (2013)
3. Dudka, K., Peringer, P., Vojnar, T.: An Easy to Use Infrastructure for Building Static Analy-
sis Tools. In: Moreno-Dı́az, R., Pichler, F., Quesada-Arencibia, A. (eds.) EUROCAST 2011,
Part I. LNCS, vol. 6927, pp. 527–534. Springer, Heidelberg (2012)
4. Habermehl, P., Holı́k, L., Rogalewicz, A., Šimáček, J., Vojnar, T.: Forest Automata for Veri-
fication of Heap Manipulation. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 424–440. Springer, Heidelberg (2011)
Symbiotic 2: More Precise Slicing
(Competition Contribution)
Abstract. Symbiotic 2 keeps the concept and the structure of the orig-
inal bug-finding tool Symbiotic, but it uses a more precise slicing based
on a field-sensitive pointer analysis instead of field-insensitive analysis of
the original tool. The paper discusses this improvement and its conse-
quences. We also briefly recall basic principles of the tool, its strong and
weak points, installation, and running instructions. Finally, we comment on
the results achieved by Symbiotic 2 in the competition.
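The following hypothetical example (not from the paper) shows why field-sensitivity matters for slicing: a field-insensitive pointer analysis treats the store through p as possibly affecting the whole structure s, so the loop cannot be sliced away, whereas a field-sensitive analysis sees that only s.flag is relevant to the assertion and removes the counter updates.

#include <assert.h>

struct state { int flag; int counter; };

int main(void) {
  struct state s = { 0, 0 };
  int *p = &s.counter;
  for (int i = 0; i < 1000; i++)
    *p += 1;                 /* irrelevant to the property below */
  assert(s.flag == 0);       /* depends only on s.flag */
  return 0;
}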
References
1. Andersen, L.O.: Program Analysis and Specialization for the C Programming Lan-
guage. PhD thesis, DIKU, University of Copenhagen (1994)
2. Cadar, C., Dunbar, D., Engler, D.: KLEE: Unassisted and automatic generation
of high-coverage tests for complex systems programs. In: Proceedings of OSDI, pp.
209–224. USENIX Association (2008)
3. King, J.C.: Symbolic execution and program testing. Communications of
ACM 19(7), 385–394 (1976)
4. Slabý, J., Strejček, J., Trtı́k, M.: Checking properties described by state machines:
On synergy of instrumentation, slicing, and symbolic execution. In: Stoelinga, M.,
Pinger, R. (eds.) FMICS 2012. LNCS, vol. 7437, pp. 207–221. Springer, Heidelberg
(2012)
5. Slaby, J., Strejček, J., Trtı́k, M.: Compact symbolic execution. In: Van Hung, D.,
Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp. 193–207. Springer, Heidelberg
(2013)
6. Slaby, J., Strejček, J., Trtı́k, M.: Symbiotic: Synergy of instrumentation, slicing,
and symbolic execution (competition contribution). In: Piterman, N., Smolka, S.A.
(eds.) TACAS 2013. LNCS, vol. 7795, pp. 630–632. Springer, Heidelberg (2013)
7. Weiser, M.: Program slicing. In: Proceedings of ICSE, pp. 439–449. IEEE (1981)
Ultimate Automizer with Unsatisfiable Cores
(Competition Contribution)
1 Verification Approach
Ultimate Automizer verifies a C program by first executing several program
transformations and then performing an interpolation-based variant of trace
abstraction [4]. As a first step, we translate the C program into a Boogie [6]
program. The heap of the system is modeled via arrays in this Boogie pro-
gram [7]. Next, the Boogie program is translated into an interprocedural control
flow graph [9]. As an optimization, we do not label the edges with single program
statements but with loop free code blocks of the program [11]. Our verification
algorithm then performs the following steps iteratively:
1. We take a sequence of statements π that leads from the start of the main
procedure to an error location and analyze its correctness (resp. feasibility).
In this analysis an SMT solver is used.
2. We consider this sequence of statements as a standalone program Pπ and
compute a correctness proof for Pπ in the form of a Hoare annotation (a
minimal illustration follows this list).
3. We find a larger program P′π that has the same correctness proof [4].
4. We consider the preceding step as a semantic decomposition of the original
program P into one part P′π whose correctness is already proven and one re-
maining part Prest := P \ P′π, on which we continue. The programs P, P′π, Prest
are represented by automata. This allows us to compute and represent the
remaining part of the program Prest (the part for which correctness was not
yet proven). Furthermore, this automata-theoretic representation allows us
to apply minimization [10] to represent the programs P, P′π, Prest efficiently.
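As a minimal illustration of steps 1–3, consider the hypothetical program below (it is not from the paper). The single error trace x = 0; x++; assume(x == 0) is infeasible, and a proof of its infeasibility, for example the interpolant x != 0 holding after x++, also covers every other trace that reaches the error location, so the whole program is proven safe. Step 4 then subtracts the traces covered by this proof from the program automaton and continues with the remainder.

int main(void) {
  int x = 0;
  x++;
  if (x == 0) {
ERROR:
    return 1;                /* error location */
  }
  return 0;
}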
This work is supported by the German Research Council (DFG) as part of the
Transregional Collaborative Research Center “Automatic Verification and Analysis
of Complex Systems” (SFB/TR14 AVACS)
2 Software Architecture
3 Discussion of Approach
The zip archive in which Ultimate Automizer is shipped contains the Python
script automizerSV-COMP.py which wraps input and output for the SV-COMP.
1 https://www.eclipse.org/cdt/
2 https://z3.codeplex.com/
Using the following command, the C program fnord.c is verified with respect to
the property file prop.prp and an error path is written to the file errPath.txt.
python AutomizerSvcomp.py prop.prp fnord.c errPath.txt
References
1. Dietsch, D.: STALIN: A plugin-based modular framework for program analysis.
Bachelor Thesis, Albert-Ludwigs-Universität, Freiburg, Germany (2008)
2. Heizmann, M., et al.: Ultimate automizer with SMTInterpol. In: Piterman, N.,
Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 641–643. Springer, Heidel-
berg (2013)
3. Heizmann, M., Hoenicke, J., Leike, J., Podelski, A.: Linear ranking for linear lasso
programs. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp.
365–380. Springer, Heidelberg (2013)
4. Heizmann, M., Hoenicke, J., Podelski, A.: Software model checking for people who
love automata. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp.
36–52. Springer, Heidelberg (2013)
5. Leike, J.: Ranking function synthesis for linear lasso programs. Master’s thesis,
University of Freiburg, Germany (2013)
6. Leino, K.R.M.: This is Boogie 2. Manuscript working draft, Microsoft Research,
Redmond, WA, USA (June 2008),
http://research.microsoft.com/en-us/um/people/leino/papers/krml178.pdf
7. Lindenmann, M.: A simple but sufficient memory model for ultimate. Master’s
thesis, University of Freiburg, Germany (2012)
8. Musa, B.: Trace abstraction with unsatisfiable cores. Bachelor’s thesis, University
of Freiburg, Germany (2013)
9. Reps, T.W., Horwitz, S., Sagiv, S.: Precise interprocedural dataflow analysis via
graph reachability. In: POPL 1995, pp. 49–61. ACM (1995)
10. Schilling, C.: Minimization of nested word automata. Master’s thesis, University
of Freiburg, Germany (2013)
11. Wissert, S.: Adaptive block encoding for recursive control flow graphs. Master’s
thesis, University of Freiburg, Germany (2013)
Ultimate Kojak
(Competition Contribution)
1 Verification Approach
Fig. 1. We split a node that represents a program location and that has earlier been
annotated with the invariant formula Inv, with the interpolant I. The node and its
incoming and outgoing edges are duplicated and one copy of the node is labeled with
I and the other with ¬I.
interpolants for these error paths, we use tree (nested) interpolation [2,3]. Ul-
timate Kojak utilizes block encoding [4] to summarize loop-free segments of
the program, such that the focus is put on loops.
2 Software Architecture
Ultimate Kojak is a toolchain in the Ultimate1 Verification Framework,
which is implemented in Java. Ultimate manages different representations of
a program and passes them between its plug-ins which may analyse, transform,
or visualize the representation. Ultimate also provides an interface for com-
munication with SMT-LIBv2 compatible SMT solvers. For parsing C programs,
we use the C parser provided by the CDT2 project. We use Z33 for feasibility
checks of error paths and transition formulas. Interpolation is done by our own
algorithm, which is not yet published [5].
In principle, Ultimate Kojak can handle any program that can be formal-
ized in a logic that the attached SMT solver supports. Currently, we do not
support bit-precise treatment of integers or concurrent programs.
References
1. Ermis, E., Hoenicke, J., Podelski, A.: Splitting via interpolants. In: Kuncak, V.,
Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 186–201. Springer, Hei-
delberg (2012)
2. Heizmann, M., Hoenicke, J., Podelski, A.: Nested interpolants. In: Hermenegildo,
M.V., Palsberg, J. (eds.) POPL, pp. 471–482. ACM (2010)
3. Christ, J., Hoenicke, J.: Extending proof tree preserving interpolation to sequences
and trees (work in progress). In: SMT Workshop, pp. 72–86 (2013)
4. Beyer, D., Cimatti, A., Griggio, A., Erkan Keremoglu, M., Sebastiani, R.: Software
model checking via large-block encoding. In: FMCAD, pp. 25–32. IEEE (2009)
5. Musa, B.: Trace abstraction with unsatisfiable cores. Bachelor’s thesis, University
of Freiburg, Germany (2013)
4 We use version 4.3.2 for Windows; any recent version should work.
5 http://ultimate.informatik.uni-freiburg.de/automizer/
Discounting in LTL
Abstract. In recent years, there has been a growing need and interest in formalizing and
reasoning about the quality of software and hardware systems. As opposed to
traditional verification, where one handles the question of whether a system satis-
fies, or not, a given specification, reasoning about quality addresses the question
of how well the system satisfies the specification. One direction in this effort is to
refine the “eventually” operators of temporal logic to discounting operators: the
satisfaction value of a specification is a value in [0, 1], where the longer it takes
to fulfill eventuality requirements, the smaller the satisfaction value is.
In this paper we introduce an augmentation by discounting of Linear Tem-
poral Logic (LTL), and study it, as well as its combination with propositional
quality operators. We show that one can augment LTL with an arbitrary set of
discounting functions, while preserving the decidability of the model-checking
problem. Further augmenting the logic with unary propositional quality opera-
tors preserves decidability, whereas adding an average-operator makes the model-
checking problem undecidable. We also discuss the complexity of the problem,
as well as various extensions.
1 Introduction
One of the main obstacles to the development of complex hardware and software sys-
tems lies in ensuring their correctness. A successful paradigm addressing this obstacle
is temporal-logic model checking – given a mathematical model of the system and a
temporal-logic formula that specifies a desired behavior of it, decide whether the model
satisfies the formula [5]. Correctness is Boolean: a system can either satisfy its specifi-
cation or not satisfy it. The richness of today’s systems, however, justifies specification
formalisms that are multi-valued. The multi-valued setting arises directly in systems
with quantitative aspects (multi-valued / probabilistic / fuzzy) [9–11, 16, 23], but is ap-
plied also with respect to Boolean systems, where it originates from the semantics of the
specification formalism itself [1, 7].
When considering the quality of a system, satisfying a specification should no longer
be a yes/no matter. Different ways of satisfying a specification should induce differ-
ent levels of quality, which should be reflected in the output of the verification pro-
cedure. Consider for example the specification G(request → F(response_grant ∨
response_deny)) (“every request is eventually responded to, with either a grant or a de-
nial”). There should be a difference between a computation that satisfies it with re-
sponses generated soon after requests and one that satisfies it with long waits.
Moreover, there may be a difference between grant and deny responses, or cases in
which no request is issued. The issue of generating high-quality hardware and software
E. Ábrahám and K. Havelund (Eds.): TACAS 2014, LNCS 8413, pp. 424–439, 2014.
c Springer-Verlag Berlin Heidelberg 2014
Discounting in LTL 425
systems attracts a lot of attention [13, 26]. Quality, however, is traditionally viewed as
an art, or as an amorphic ideal. In [1], we introduced an approach for formalizing qual-
ity. Using it, a user can specify quality formally, according to the importance he gives to
components such as security, maintainability, runtime, and more, and then can formally
reason about the quality of software.
As the example above demonstrates, we can distinguish between two aspects of the
quality of satisfaction. The first, to which we refer as “temporal quality”, concerns the wait-
ing time to satisfaction of eventualities. The second, to which we refer as “propositional
quality”, concerns prioritizing related components of the specification. Propositional qual-
ity was studied in [1]. In this paper we study temporal quality as well as the combinations
of both aspects. One may try to reduce temporal quality to propositional quality by a re-
peated use of the X (“next”) operator or by a use of bounded (prompt) eventualities [2, 3].
Both approaches, however, partition the future into finitely many zones and are limited:
correctness of LTL is Boolean, and thus has an inherent dichotomy between satisfaction
and dissatisfaction. On the other hand, the distinction between “near” and “far” is not
dichotomous.
This suggests that in order to formalize temporal quality, one must extend LTL to
an unbounded setting. Realizing this, researchers have suggested to augment temporal
logics with future discounting [8]. In the discounted setting, the satisfaction value of spec-
ifications is a numerical value, and it depends, according to some discounting function,
on the time waited for eventualities to get satisfied.
In this paper we add discounting to Linear Temporal Logic (LTL), and study it, as
well as its combination with propositional quality operators. We introduce LTLdisc [D]
– an augmentation by discounting of LTL. The logic LTLdisc [D] is actually a family of
logics, each parameterized by a set D of discounting functions – strictly decreasing func-
tions from ℕ to [0, 1] that tend to 0 (e.g., linear decaying, exponential decaying, etc.).
LTLdisc [D] includes a discounting-“until” (Uη ) operator, parameterized by a function
η ∈ D. We solve the model-checking threshold problem for LTLdisc [D]: given a Kripke
structure K, an LTLdisc [D] formula ϕ and a threshold t ∈ [0, 1], the algorithm decides
whether the satisfaction value of ϕ in K is at least t.
In the Boolean setting, the automata-theoretic approach has proven to be very use-
ful in reasoning about LTL specifications. The approach is based on translating LTL
formulas to nondeterministic Büchi automata on infinite words [28]. Applying this ap-
proach to the discounted setting, which gives rise to infinitely many satisfaction values,
poses a big algorithmic challenge: model-checking algorithms, and in particular those
that follow the automata-theoretic approach, are based on an exhaustive search, which
cannot be simply applied when the domain becomes infinite. A natural relevant exten-
sion to the automata-theoretic approach is to translate formulas to weighted automata
[22]. Unfortunately, these extensively-studied models are complicated and many prob-
lems become undecidable for them [15]. We show that for threshold problems, we can
translate LTLdisc [D] formulas into (Boolean) nondeterministic Büchi automata, with the
property that the automaton accepts a lasso computation iff the formula attains a value
above the threshold on that computation. Our algorithm relies on the fact that the lan-
guage of an automaton is non-empty iff there is a lasso witness for the non-emptiness.
We cope with the infinitely many possible satisfaction values by using the discounting be-
havior of the eventualities and the given threshold in order to partition the state space into
a finite number of classes. The complexity of our algorithm depends on the discounting
functions used in the formula. We show that for standard discounting functions, such as
exponential decaying, the problem is PSPACE-complete – not more complex than stan-
dard LTL. The fact that our algorithm uses Boolean automata also enables us to suggest a
solution for threshold satisfiability, and to give a partial solution to threshold synthesis.
In addition, it allows us to adapt the heuristics and tools that exist for Boolean automata.
Before we continue to describe our contribution, let us review existing work on dis-
counting. The notion of discounting has been studied in several fields, such as economy,
game-theory, and Markov decision processes [25]. In the area of formal verification, it
was suggested in [8] to augment the μ-calculus with discounting operators. The discount-
ing suggested there is exponential; that is, with each iteration, the satisfaction value of the
formula decreases by a multiplicative factor in (0, 1]. Algorithmically, [8] shows how to
evaluate discounted μ-calculus formulas with arbitrary precision. Formulas of LTL can
be translated to the μ-calculus, thus [8] can be used in order to approximately model-
check discounted-LTL formulas. However, the translation from LTL to the μ-calculus
involves an exponential blowup [6] (and is complicated), making this approach ineffi-
cient. Moreover, our approach allows for arbitrary discounting functions, and the algo-
rithm returns an exact solution to the threshold model-checking problem, which is more
difficult than the approximation problem.
Closer to our work is [7], where CTL is augmented with discounting and weighted-
average operators. The motivation in [7] is to introduce a logic whose semantics is not
too sensitive to small perturbations in the model. Accordingly, formulas are evaluated
on weighted-systems or on Markov-chains. Adding discounting and weighted-average
operators to CTL preserves its appealing complexity, and the model-checking problem
for the augmented logic can be solved in polynomial time. As is the case in the Boolean
semantics, the expressive power of discounted CTL is limited. The fact that the same com-
bination of discounting and weighted-average operators leads to undecidability in the
context of LTL witnesses the technical challenges of the LTLdisc [D] setting.
Perhaps closest to our approach is [19], where a version of discounted-LTL was in-
troduced. Semantically, there are two main differences between the logics. The first is
that [19] uses discounted sum, while we interpret discounting without accumulation,
and the second is that the discounting there replaces the standard temporal operators, so
all eventualities are discounted. As discounting functions tend to 0, this strictly restricts
the expressive power of the logic, and one cannot specify traditional eventualities in it.
On the positive side, it enables a clean algebraic characterization of the semantics, and
indeed the contribution in [19] is a comprehensive study of the mathematical properties
of the logic. Yet, [19] does not study algorithmic questions about the logic. We, on
the other hand, focus on the algorithmic properties of the logic, and specifically on the
model-checking problem.
Let us now return to our contribution. After introducing LTLdisc [D] and studying its
model-checking problem, we augment LTLdisc [D] with propositional quality operators.
Beyond the operators min, max, and ¬, which are already present, two basic proposi-
tional quality operators are the multiplication of an LTLdisc [D] formula by a constant
in [0, 1], and the averaging between the satisfaction values of two LTLdisc [D] formulas
[1]. We show that while the first extension does not increase the expressive power of
LTLdisc [D] or its complexity, the latter causes the model-checking problem to become
undecidable. In fact, model checking becomes undecidable even if we allow averaging
in combination with a single discounting function. Recall that this is in contrast with
the extension of discounted CTL with an average operator, where the complexity of the
model-checking problem stays polynomial [7].
We consider additional extensions of LTLdisc [D]. First, we study a variant of the
discounting-eventually operators in which we allow the discounting to tend to arbitrary
values in [0, 1] (rather than to 0). This captures the intuition that we are not always pes-
simistic about the future, but can be, for example, ambivalent about it, by tending to 1/2.
We show that all our results hold under this extension. Second, we add to LTLdisc [D] past
operators and their discounting versions (specifically, we allow a discounting-“since” op-
erator, and its dual). In the traditional semantics, past operators enable clean specifica-
tions of many interesting properties, make the logic exponentially more succinct, and
can still be handled within the same complexity bounds [17, 18]. We show that the same
holds for the discounted setting. Finally, we show how LTLdisc [D] and algorithms for it
can be used also for reasoning about weighted systems.
Due to lack of space, most proofs are omitted; they can be found in the full version, available on
the authors’ home pages.
is a value in [0, 1], denoted [[π, ϕ]]. The value is defined by induction on the structure of
ϕ as follows, where π^i = π_i , π_{i+1} , . . ..
The intuition is that events that happen in the future have a lower influence, and the
rate by which this influence decreases depends on the function η. For example, the sat-
isfaction value of a formula ϕUη ψ in a computation π depends on the best (supremum)
value that ψ can get along the entire computation, while considering the discounted sat-
isfaction of ψ at a position i, as a result of multiplying it by η(i), and the same for the
value of ϕ in the prefix leading to the i-th position.
We add the standard abbreviations Fϕ ≡ TrueUϕ, and Gϕ = ¬F¬ϕ, as well as their
quantitative counterparts: Fη ϕ ≡ TrueUη ϕ, and Gη ϕ = ¬Fη ¬ϕ. We denote by |ϕ| the
number of subformulas of ϕ.
A computation of the form π = u · v^ω, for u, v ∈ (2^AP)^*, with v ≠ ε, is called a
lasso computation. We observe that since a specific lasso computation has only finitely
many distinct suffixes, the inf and sup in the semantics of LTLdisc [D] can be replaced
with min and max, respectively, when applied to lasso computations.
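To make the min/max observation concrete, the following small Python sketch (not from the paper; the function names and the restriction to the derived operator F_η over a single atomic proposition are illustrative assumptions) evaluates [[π, F_η p]] on a lasso computation π = u · v^ω. Since η is strictly decreasing and the suffixes of a lasso repeat with period |v|, the supremum is attained within the first |u| + |v| positions.

```python
from fractions import Fraction

def eval_discounted_eventually(u, v, eta, p):
    """[[pi, F_eta p]] for the lasso pi = u . v^omega, where u and v are lists of
    sets of atomic propositions and eta is a strictly decreasing discounting
    function.  Only the first |u| + |v| positions need to be inspected: later
    positions repeat a letter already seen, while eta is strictly smaller there."""
    best = Fraction(0)
    for i, letter in enumerate(u + v):
        value_of_p = Fraction(1) if p in letter else Fraction(0)   # Boolean value of p at position i
        best = max(best, eta(i) * value_of_p)
    return best

# Example: exponential discounting with ratio 1/2; p first holds at position 2.
eta = lambda i: Fraction(1, 2) ** i
print(eval_discounted_eventually([set(), set()], [{"p"}], eta, "p"))   # 1/4
```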
The semantics is extended to Kripke structures by taking the path that admits the low-
est satisfaction value. Formally, for a Kripke structure K and an LTLdisc [D] formula ϕ
we have that [[K, ϕ]] = inf {[[π, ϕ]] : π is a computation of K}.
Example 1. Consider a lossy-disk: every moment in time there is a chance that some
bit would flip its value. Fixing flips is done by a global error-correcting procedure. This
procedure manipulates the entire content of the disk, such that initially it causes more
errors in the disk, but the longer it runs, the more bits it fixes.
Let init and terminate be atomic propositions indicating when the error-correcting
procedure is initiated and terminated, respectively. The quality of the disk (that is, a mea-
sure of the amount of correct bits) can be specified by the formula ϕ = GFη (init ∧
¬Fμ terminate) for some appropriate discounting functions η and μ. Intuitively, ϕ gets
a higher satisfaction value the shorter the waiting time is between initiations of the error-
correcting procedure, and the longer the procedure runs (that is, not terminated) in be-
tween these initiations. Note that the “worst case” nature of LTLdisc [D] fits here. For
instance, running the procedure for a very short time, even once, will cause many errors.
model-checking problem is to compute [[K, ϕ]], where ϕ is now an LTLdisc [D] formula.
A simpler version of this problem is the threshold model-checking problem: given ϕ, K,
and a threshold v ∈ [0, 1], decide whether [[K, ϕ]] ≥ v. In this section we show how we
can solve the latter.
Our solution uses the automata-theoretic approach, and consists of the following steps.
We start by translating ϕ and v to an alternating weak automaton Aϕ,v such that L(Aϕ,v ) ≠
∅ iff there exists a computation π such that [[π, ϕ]] > v. The challenge here is that ϕ has in-
finitely many satisfaction values, naively implying an infinite-state automaton. We show
that using the threshold and the discounting behavior of the eventualities, we can restrict
attention to a finite resolution of satisfaction values, enabling the construction of a finite
automaton. Complexity-wise, the size of Aϕ,v depends on the functions in D. In Sec-
tion 3.3, we analyze the complexity for the case of exponential-discounting functions.
The second step is to construct a nondeterministic Büchi automaton B that is equiva-
lent to Aϕ,v . In general, alternation removal might involve an exponential blowup in the
state space [21]. We show, by a careful analysis of Aϕ,v , that we can remove its alterna-
tion while only having a polynomial state blowup.
We complete the model-checking procedure by composing the nondeterministic Büchi
automaton B with the Kripke structure K, as done in the traditional, automata-based,
model-checking procedure.
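The last step above is a standard automata-theoretic emptiness check and, as noted earlier, non-emptiness of a Büchi automaton is always witnessed by a lasso. The following self-contained Python sketch illustrates such a check on an explicit, state-based automaton; the explicit representation and state-based acceptance are simplifying assumptions for illustration only, not the construction used in the paper.

```python
def buchi_nonempty(states, init, edges, accepting):
    """Non-emptiness of an explicit state-based Buechi automaton: the language is
    non-empty iff some accepting state is reachable from an initial state and lies
    on a cycle, i.e. iff there is a lasso witness."""
    succ = {q: set() for q in states}
    for (q, r) in edges:
        succ[q].add(r)

    def reachable(sources):
        seen, stack = set(sources), list(sources)
        while stack:
            for r in succ[stack.pop()]:
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    for a in accepting & reachable(init):
        if a in reachable(succ[a]):       # a lies on a cycle
            return True
    return False

# Hypothetical two-state example: 0 -> 1 -> 1 with state 1 accepting.
print(buchi_nonempty({0, 1}, {0}, {(0, 1), (1, 1)}, {1}))   # True
```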
The complexity of model-checking an LTLdisc [D] formula depends on the discounting
functions in D. Intuitively, the faster the discounting tends to 0, the fewer states there will be.
For exponential-discounting, we show that the complexity is NLOGSPACE in the system
(the Kripke structure) and PSPACE in the specification (the LTLdisc [D] formula and the
threshold), staying in the same complexity classes as standard LTL model-checking.
We conclude the section by showing how to use the generated nondeterministic Büchi
automaton for addressing threshold satisfiability and synthesis.
For a given set X, let B⁺(X) be the set of positive Boolean formulas over X (i.e., Boolean
formulas built from elements in X using ∧ and ∨), where we also allow the formulas
True and False. For Y ⊆ X, we say that Y satisfies a formula θ ∈ B⁺(X) iff the
truth assignment that assigns true to the members of Y and assigns false to the members
of X \ Y satisfies θ. An alternating Büchi automaton on infinite words is a tuple A =
⟨Σ, Q, qin , δ, α⟩, where Σ is the input alphabet, Q is a finite set of states, qin ∈ Q is an
initial state, δ : Q × Σ → B⁺(Q) is a transition function, and α ⊆ Q is a set of accepting
states. We define runs of A by means of (possibly) infinite DAGs (directed acyclic graphs).
A run of A on a word w = σ0 · σ1 · · · ∈ Σ ω is a (possibly) infinite DAG G = V, E
satisfying the following (note that there may be several runs of A on w).
– V ⊆ Q × ℕ is as follows. Let Ql ⊆ Q denote all states in level l. Thus, Ql = {q :
⟨q, l⟩ ∈ V }. Then, Q0 = {qin }, and Ql+1 satisfies ⋀_{q∈Ql} δ(q, σl ).
– For every l ∈ ℕ, Ql is minimal with respect to containment.
– E ⊆ ⋃_{l≥0} (Ql × {l}) × (Ql+1 × {l + 1}) is such that for every state q ∈ Ql , the
set {q′ ∈ Ql+1 : E(⟨q, l⟩, ⟨q′, l + 1⟩)} satisfies δ(q, σl ).
Thus, the root of the DAG contains the initial state of the automaton, and the states asso-
ciated with nodes in level l + 1 satisfy the transitions from states corresponding to nodes
in level l. The run G accepts the word w if all its infinite paths satisfy the acceptance con-
dition α. Thus, in the case of Büchi automata, all the infinite paths have infinitely many
nodes ⟨q, l⟩ such that q ∈ α (it is not hard to prove that every infinite path in G is part
of an infinite path starting in level 0). A word w is accepted by A if there is a run that
accepts it. The language of A, denoted L(A), is the set of infinite words that A accepts.
When the formulas in the transition function of A contain only disjunctions, then A
is nondeterministic, and its runs are DAGs of width 1, where at each level there is a single
node.
The alternating automaton A is weak, denoted AWA, if its state space Q can be par-
titioned into sets Q1 , . . . , Qk , such that the following hold: First, for every 1 ≤ i ≤ k
either Qi ⊆ α, in which case we say that Qi is an accepting set, or Qi ∩ α = ∅, in
which case we say that Qi is rejecting. Second, there is a partial-order ≤ over the sets,
and for every 1 ≤ i, j ≤ k, if q ∈ Qi , s ∈ Qj , and s ∈ δ(q, σ) for some σ ∈ Σ, then
Qj ≤ Qi . Thus, transitions can lead only to states that are smaller in the partial order.
Consequently, each run of an AWA eventually gets trapped in a set Qi and is accepting
iff this set is accepting.
Lemma 1. Given an LTLdisc [D] formula ϕ, there exist LTL formulas ϕ+ and ϕ<1 such
that |ϕ+ | and |ϕ<1 | are both O(|ϕ|) and the following hold for every computation π.
1. If [[π, ϕ]] > 0 then π |= ϕ+ , and if [[π, ϕ]] < 1 then π |= ϕ<1 .
2. If π is a lasso, then if π |= ϕ+ then [[π, ϕ]] > 0 and if π |= ϕ<1 then [[π, ϕ]] < 1.
Remark 1. The curious reader may wonder why we do not prove that [[π, ϕ]] > 0 iff
π |= ϕ+ for every computation π. As it turns out, a translation that is valid also for
computations with no period is not always possible. For example, as is the case with
the prompt-eventuality operator of [14], the formula ϕ = G(Fη p) is such that the set of
computations π with [[π, ϕ]] > 0 is not ω-regular, thus one cannot hope to define an LTL
formula ϕ+ .
Observe that xcl(ϕ) may be infinite, and that it has both LTLdisc [D] formulas (from
Classes 1 and 3) and LTL formulas (from Class 2).
Theorem 1. Given an LTLdisc [D] formula ϕ and a threshold v ∈ [0, 1], there exists an
AWA Aϕ,v such that for every computation π the following hold.
1. If [[π, ϕ]] > v, then Aϕ,v accepts π.
2. If Aϕ,v accepts π and π is a lasso computation, then [[π, ϕ]] > v.
from the initial state (ϕ > v), and we can compute these states in advance. Intuitively,
it follows from the fact that once the proportion between t and η(i) goes above 1, for
Type-1 states associated with threshold t and subformulas with a discounting function
η, we do not have to generate new states.
A detailed proof of A’s finiteness and correctness is given in the full version.
Since Aϕ,v is a Boolean automaton, L(Aϕ,v ) ≠ ∅ iff it accepts a lasso computation.
Combining this observation with Theorem 1, we conclude with the following.
Corollary 2. For an LTLdisc [D] formula ϕ and a threshold v ∈ [0, 1], it holds that
L(Aϕ,v ) ≠ ∅ iff there exists a computation π such that [[π, ϕ]] > v.
Furthermore, the number of states of Aϕ,v is singly exponential in |ϕ| and in the de-
scription of v.
The proof follows from the following observation. Let λ ∈ (0, 1) and v ∈ (0, 1). When
discounting by expλ , the number of states in the AWA constructed as per Theorem 1 is
proportional to the maximal number i such that λ^i > v, which is at most log_λ v = log v / log λ,
which is polynomial in the description length of v and λ. A similar (yet more complicated)
consideration is applied for the setting of multiple discounting functions and negations.
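For concreteness, the bound for a single exponential-discounting function can be computed directly; a small Python sketch follows (not from the paper; floating-point arithmetic is used only for illustration).

```python
import math

def max_index_above(lam, v):
    """Largest i with lam**i > v, for lam, v in (0, 1).  This quantity bounds how
    many distinct discounted thresholds the AWA of Theorem 1 needs for exp_lam;
    it is at most log_lam(v) = log(v) / log(lam)."""
    assert 0 < lam < 1 and 0 < v < 1
    i = 0
    while lam ** (i + 1) > v:
        i += 1
    return i

print(max_index_above(0.5, 0.1), math.log(0.1) / math.log(0.5))   # 3  3.3219...
```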
The idea behind our complexity analysis is as follows. Translating an AWA to an NBA
involves alternation removal, which proceeds by keeping track of entire levels in a run-
DAG. Thus, a run of the NBA corresponds to a sequence of subsets of Q. The key to the
reduced state space is that the number of such subsets is only |Q|O(|ϕ|) and not 2|Q| . To
see why, consider a subset S of the states of A. We say that S is minimal if it does not
include two states of the form ϕ < t1 and ϕ < t2 , for t1 < t2 , nor two states of the form
ϕUη+i ψ < t and ϕUη+j ψ < t, for i < j, and similarly for “>”. Intuitively, sets that are
not minimal hold redundant assertions, and can be ignored. Accordingly, we restrict the
state space of the NBA to have only minimal sets.
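A possible way to realize the restriction to minimal sets is sketched below in Python; the tuple encoding of assertion-states is an illustrative assumption, not the paper's notation.

```python
def is_minimal(level_set):
    """A level-set is minimal if it never asserts two different thresholds for the
    same (possibly shifted) subformula with the same comparison, nor two different
    shifts of the same until-subformula with the same threshold.  States are
    encoded here as tuples (formula, comparison, shift, threshold)."""
    for (f1, c1, sh1, t1) in level_set:
        for (f2, c2, sh2, t2) in level_set:
            if f1 == f2 and c1 == c2:
                if sh1 == sh2 and t1 != t2:
                    return False   # redundant: two thresholds for one assertion
                if t1 == t2 and sh1 != sh2:
                    return False   # redundant: two shifts for one threshold
    return True

def minimal_level_sets(candidate_sets):
    """Keeping only minimal sets is what yields the |Q|^O(|phi|) state bound."""
    return [s for s in candidate_sets if is_minimal(s)]
```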
Lemma 2. For an LTLdisc [D] formula ϕ and v ∈ [0, 1], the AWA Aϕ,v constructed in
Theorem 1 with state space Q can be translated to an NBA with |Q|O(|ϕ|) states.
Note that the complexity in Theorem 3 is only NLOGSPACE in the system, since
our solution does not analyze the Kripke structure, but only takes its product with the
specification’s automaton. This is in contrast to the approach of model checking temporal
logic with (non-discounting) accumulative values, which, when decidable, involves a
doubly-exponential dependency on the size of the system [4].
Finally, observe that the NBA obtained in Lemma 2 can be used to solve the threshold-
satisfiability problem: given an LTLdisc [D] formula ϕ and a threshold v ∈ [0, 1], we can
decide whether there is a computation π such that [[π, ϕ]] ∼ v, for ∼∈ {<, >}, and return
such a computation when the answer is positive. This is done by simply deciding whether
there exists a word that is accepted by the NBA.
order to address the synthesis problem, as stated in the following theorem (see the full
version for the proof).
Theorem 4. Consider an LTLdisc [D] formula ϕ. If there exists a transducer T all of
whose computations π satisfy [[π, ϕ]] > v, then we can generate a transducer T′ all of
whose computations τ satisfy [[τ, ϕ]] ≥ v.
5 Extensions
LTLdisc [D] with Past Operators. A useful augmentation of LTL is the addition of past
operators [18]. These operators enable the specification of clearer and more succinct for-
mulas while preserving the PSPACE complexity of model checking. In the full version,
we add discounting-past operators to LTLdisc [D] and show how to perform model check-
ing on the obtained logic. The solution goes via 2-way weak alternating automata and
preserves the complexity of LTLdisc [D].
Weighted Systems. In LTLdisc [D], the verified system need not be weighted in order to
get a quantitative satisfaction – it stems from taking into account the delays in satisfying
the requirements. Nevertheless, LTLdisc [D] also naturally fits weighted systems, where
the atomic propositions have values in [0, 1]. In the full version we extend the semantics of
LTLdisc [D] to weighted Kripke structures, whose computations assign weights in [0, 1]
to every atomic proposition. We solve the corresponding model-checking problem by
properly extending the construction of the automaton Aϕ,v .
Changing the Tendency of Discounting. One may observe that in our discounting
scheme, the value of future formulas is discounted toward 0. This, in a way, reflects an
intuition that we are pessimistic about the future. While in some cases this fits the needs
of the specifier, it may well be the case that we are ambivalent to the future. To capture
this notion, one may want the discounting to tend to 1/2. Other values are also possible. For
example, it may be that we are optimistic about the future, say when a system improves
its performance while running and we know that components are likely to function better
in the future. We may then want the discounting to tend, say, to 3/4.
To capture this notion, we define the operator Oη,z , parameterized by η ∈ D and
z ∈ [0, 1], with the semantics [[π, ϕ Oη,z ψ]] = sup_{i≥0} { min{ η(i)·[[π^i , ψ]] + (1 − η(i))·z,
min_{0≤j<i} η(j)·[[π^j , ϕ]] + (1 − η(j))·z } }. The discounting function η determines the rate
of convergence, and z determines the limit of the discounting. In the full version, we
show how to augment the construction of Aϕ,v with the operator O in order to solve the
model-checking problem.
6 Discussion
An ability to specify and to reason about quality would take formal methods a signifi-
cant step forward. Quality has many aspects, some of which are propositional, such as
prioritizing one satisfaction scheme on top of another, and some are temporal, for ex-
ample having higher quality for implementations with shorter delays. In this work we
provided a solution for specifying and reasoning about temporal quality, augmenting the
commonly used linear temporal logic (LTL). A satisfaction scheme, such as ours, that
is based on elapsed times introduces a big challenge, as it implies infinitely many sat-
isfaction values. Nonetheless, we showed the decidability of the model-checking prob-
lem, and for the natural exponential-decaying satisfactions, the complexity remains the
same as for standard LTL, suggesting the interesting potential of the new scheme. As for
combining propositional and temporal quality operators, we showed that the problem is,
in general, undecidable, while certain combinations, such as adding priorities, preserve
the decidability and the complexity.
References
1. Almagor, S., Boker, U., Kupferman, O.: Formalizing and reasoning about quality. In: Fomin,
F.V., Freivalds, R., Kwiatkowska, M., Peleg, D. (eds.) ICALP 2013, Part II. LNCS, vol. 7966,
pp. 15–27. Springer, Heidelberg (2013)
2. Almagor, S., Hirshfeld, Y., Kupferman, O.: Promptness in ω-regular automata. In: Bouajjani,
A., Chin, W.-N. (eds.) ATVA 2010. LNCS, vol. 6252, pp. 22–36. Springer, Heidelberg (2010)
3. Bojańczyk, M., Colcombet, T.: Bounds in ω-regularity. In: 21st LICS, pp. 285–296 (2006)
4. Boker, U., Chatterjee, K., Henzinger, T.A., Kupferman, O.: Temporal Specifications with
Accumulative Values. In: 26th LICS, pp. 43–52 (2011)
5. Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press (1999)
6. Dam, M.: CTL and ECTL as fragments of the modal μ-calculus. TCS 126, 77–96 (1994)
7. de Alfaro, L., Faella, M., Henzinger, T., Majumdar, R., Stoelinga, M.: Model checking dis-
counted temporal properties. TCS 345(1), 139–170 (2005)
8. de Alfaro, L., Henzinger, T., Majumdar, R.: Discounting the future in systems theory.
In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS,
vol. 2719, pp. 1022–1037. Springer, Heidelberg (2003)
9. Droste, M., Rahonis, G.: Weighted automata and weighted logics with discounting.
TCS 410(37), 3481–3494 (2009)
10. Droste, M., Vogler, H.: Weighted automata and multi-valued logics over arbitrary bounded
lattices. TCS 418, 14–36 (2012)
11. Faella, M., Legay, A., Stoelinga, M.: Model checking quantitative linear time logic. Electr.
Notes Theor. Comput. Sci. 220(3), 61–77 (2008)
12. Gastin, P., Oddoux, D.: Fast LTL to Büchi automata translation. In: Berry, G., Comon, H.,
Finkel, A. (eds.) CAV 2001. LNCS, vol. 2102, pp. 53–65. Springer, Heidelberg (2001)
13. Kan, S.H.: Metrics and Models in Software Quality Engineering. Addison-Wesley Longman
Publishing Co. (2002)
14. Kupferman, O., Piterman, N., Vardi, M.Y.: From Liveness to Promptness. In: Damm, W.,
Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 406–419. Springer, Heidelberg (2007)
15. Krob, D.: The equality problem for rational series with multiplicities in the tropical semiring
is undecidable. International Journal of Algebra and Computation 4(3), 405–425 (1994)
16. Kwiatkowska, M.: Quantitative verification: models, techniques and tools. In:
ESEC/SIGSOFT FSE, pp. 449–458 (2007)
17. Laroussinie, F., Schnoebelen, P.: A hierarchy of temporal logics with past. In: Enjalbert, P.,
Mayr, E.W., Wagner, K.W. (eds.) STACS 1994. LNCS, vol. 775, pp. 47–58. Springer, Hei-
delberg (1994)
18. Lichtenstein, O., Pnueli, A., Zuck, L.: The glory of the past. In: Parikh, R. (ed.) Logic of
Programs 1985. LNCS, vol. 193, pp. 196–218. Springer, Heidelberg (1985)
19. Mandrali, E.: Weighted LTL with discounting. In: Moreira, N., Reis, R. (eds.) CIAA 2012.
LNCS, vol. 7381, pp. 353–360. Springer, Heidelberg (2012)
20. Minsky, M.: Computation: Finite and Infinite Machines, 1st edn. Prentice Hall (1967)
21. Miyano, S., Hayashi, T.: Alternating finite automata on ω-words. TCS 32, 321–330 (1984)
22. Mohri, M.: Finite-state transducers in language and speech processing. Computational Lin-
guistics 23(2), 269–311 (1997)
23. Moon, S., Lee, K., Lee, D.: Fuzzy branching temporal logic. IEEE Transactions on Systems,
Man, and Cybernetics, Part B 34(2), 1045–1055 (2004)
24. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proc. 16th POPL, pp. 179–
190 (1989)
25. Shapley, L.: Stochastic games. Proc. of the National Academy of Sciences 39 (1953)
26. Spinellis, D.: Code Quality: The Open Source Perspective. Addison-Wesley Professional
(2006)
27. Vardi, M.Y.: An automata-theoretic approach to linear temporal logic. In: Moller, F.,
Birtwistle, G. (eds.) Logics for Concurrency. LNCS, vol. 1043, pp. 238–266. Springer, Hei-
delberg (1996)
28. Vardi, M., Wolper, P.: An automata-theoretic approach to automatic program verification. In:
1st LICS, pp. 332–344 (1986)
Symbolic Model Checking of Stutter-Invariant
Properties Using Generalized Testing Automata
1 Introduction
Model checking for Linear-time Temporal Logic (LTL) is usually based on converting
the negation of the property to check into an ω-automaton B , composing that automa-
ton with a model M given as a Kripke structure, and finally checking the language
emptiness of the resulting product B ⊗ M [21].
One way to implement this procedure is the explicit approach where B and M are
represented as explicit graphs. B is usually a Büchi automaton or a generalization us-
ing multiple acceptance sets. We use Transition-based Generalized Büchi Automata
(TGBA) for their conciseness. When the property to verify is stutter-invariant [8], test-
ing automata [13] should be preferred to Büchi automata. Instead of observing the
values of state propositions in the system, testing automata observe the changes of
these values, making them suitable to represent stutter-invariant properties. In previ-
ous work [1], we showed how to generalize testing automata using several acceptance
sets, and allowing a more efficient emptiness check. Our comparison showed these
Transition-based Generalized Testing Automata (TGTA) to be superior to TGBA for
model-checking of stutter-invariant properties.
Another implementation of this procedure is the symbolic approach where the au-
tomata and their products are represented by means of decision diagrams (a concise
way to represent large sets or relations) [3]. Symbolic encodings for generalized Büchi
automata are pretty common [17]. With such encodings, we can compute, in one step, the
sets of all direct successors (PostImage) or predecessors (PreImage) of any set of states.
Using this technique, there have been a lot of propositions for symbolic emptiness-check
algorithms [9, 19, 14]. These symbolic algorithms manipulate fixpoints on the transition
relation which can be optimized using saturation techniques [4, 20].
However these approaches do not offer any reduction when verifying stutter-invariant
properties. So far, and to the best of our knowledge, testing automata have never been
used in symbolic model checking. Our goal is therefore to propose a symbolic approach
for model checking using TGTA, and compare it to the symbolic approach using TGBA.
In particular, we show that the computation of fixpoints on the transition relation of the
product can be sped up with a dedicated evaluation of stuttering transitions. We exploit
a separation of the transition relation into two terms, one of which greatly benefits from
saturation techniques.
This paper is organized as follows. Section 2 presents the symbolic model-checking
approach for TGBA. For generality we define our symbolic structures using predicates
over state variables in order to remain independent of the decision diagrams used to
actually implement the approach. Section 3 focuses on the encoding of TGTA in the
same framework. We first show how a TGTA can be encoded, then we show how to im-
prove the encoding of the Kripke structure and the product to benefit from saturation in
the encoding of stuttering transitions in the TGTA. Finally, Section 4 compares the two
approaches experimentally with an implementation that uses hierarchical Set Decision
Diagrams (SDD) [20] (a particular type of Decision Diagrams on integer variables, on
which we can apply user-defined operations). On our large, BEEM-based benchmark,
our symbolic encoding of TGTA appears to be superior to TGBA.
can adjust to our definitions by “pushing” the acceptance of states to their outgoing
transitions [7].
Any LTL formula ϕ can be converted into a TGBA whose language is the set of ex-
ecutions that satisfy ϕ [7]. Figure 1(a) shows a TGBA derived from the LTL formula
F Ga. The Boolean expression over AP = {a} that labels each transition represents the
valuation of atomic propositions that hold in this transition (in this example, Σ = {a, ā}).
Any infinite path in this example is accepted if it visits infinitely often the only accep-
tance set containing transition (1, a, 1).
Like Kripke structures, TGBAs can be encoded by predicates [18] on state variables.
We now show how to build a synchronous product by composing the symbolic repre-
sentations of a TGBA with that of a Kripke structure, inspired from Sebastian et al. [18].
[Fig. 1: automata for FGa — panel (a) TGBA, panel (b) TGTA.]
Fig. 1. TGBA and TGTA for the LTL property ϕ = F Ga. Acceptance transitions are
indicated by a dot.
The labels are used to ensure that a transition (q, ℓ, q′) of A is synchronized with a
state s of K such that L(s, ℓ). This way, we ensure that the product recognizes only the
executions of K that are also recognized by A. However we do not need to remember
how product transitions are labeled to check K ⊗ A for emptiness. A product can be seen
as a TGBA without labels on transitions.
In symbolic model checkers, the exploration of the product is based on the following
PostImage operation [18]. For any set of states encoded by a predicate P, PostImage(P)
(s′, q′) = ∃(s, q) [ P(s, q) ∧ T ((s, q), (s′, q′)) ] returns a predicate representing the set of
states reachable in one step from a state in P.
Because in TGBA the acceptance conditions are based on transitions, we also define
PostImage(P, f ) to compute the successors of P reached using only transitions from an
acceptance set f ∈ F: PostImage(P, f )(s′, q′) = ∃(s, q) [ P(s, q) ∧ T_f ((s, q), (s′, q′)) ].
These two operations are at the heart of the symbolic emptiness check presented in
the next section.
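As an illustration of these two operations (not an account of the SDD implementation used later), here is an explicit-set stand-in in Python, where the predicates P, T and T_f are replaced by plain sets of product states and product transitions.

```python
def post_image(P, T):
    """All product states reachable in one step from a state in P, mirroring
    PostImage(P)(s', q') = exists (s, q). P(s, q) and T((s, q), (s', q'))."""
    return {dst for (src, dst) in T if src in P}

def post_image_acc(P, T_f):
    """Successors of P reached using only transitions of the acceptance set f,
    here given directly as the subset T_f of the transition relation."""
    return {dst for (src, dst) in T_f if src in P}
```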
Figure 1(b) shows a TGTA recognizing the LTL formula F Ga. Acceptance sets are
represented using dots as in TGBAs. Transitions are labeled by changesets: e.g., the
transition (0, {a}, 1) means that the value of a changes between states 0 and 1. Initial
valuations are shown above initial arrows: U(0) = {a}, U(1) = {ā} and U(2) = {a}. As
an illustration, the execution ā; a; a; a; . . . is accepted by the run 1 →{a} 2 →∅ 2 →∅ 2 . . .
because the value of a only changes between the first two steps.
Theorem 2. Any stutter-invariant property can be translated into an equivalent
TGTA [1].
Note that Def. 7 differs from our previous work [1] because we now enforce a par-
tition of δ such that stuttering transitions can only be self-loops. However, the TGTA
resulting from the LTL translation we presented previously [1] already have this prop-
erty. We will use it to optimize symbolic computation in section 3.3.
Finally, a TGTA’s symbolic encoding is similar to that of a TGBA.
Definition 8 (Symbolic TGTA)
A TGTA T = ⟨Q, Q0 , U, δ, F⟩ is symbolically encoded by a triplet of predicates
⟨U0 , Δ⊕ , {Δ⊕_f } f ∈F ⟩ where:
– U0 (q, ℓ) is true iff (q ∈ Q0 ) ∧ (U(q) = ℓ)
– Δ⊕ (q, c, q′) is true iff (q, c, q′) ∈ δ
– ∀ f ∈ F, Δ⊕_f (q, c, q′) is true iff (q, c, q′) ∈ f
where R⊕_∗ and Δ⊕_∗ encode respectively the non-stuttering transitions of the model and
of the TGTA:
– Δ⊕_∗ (q, c, q′) is true iff (q, c, q′) ∈ δ∗ (see Def. 7)
– R⊕_∗ (s, c, s′) is true iff R⊕ (s, c, s′) ∧ (c ≠ ∅)
According to the definition of δ_∅ in Def. 7, the predicate Δ⊕ (q, ∅, q′) encodes the set
of the TGTA’s self-loops and can be replaced by the predicate equal(q, q′), simplifying T :

T ((s, q), (s′, q′)) = [ R⊕ (s, ∅, s′) ∧ equal(q, q′) ] ∨ ∃c [ R⊕_∗ (s, c, s′) ∧ Δ⊕_∗ (q, c, q′) ]   (1)

where the first disjunct is denoted T_∅ ((s, q), (s′, q′)) and the second T_∗ ((s, q), (s′, q′)).
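To illustrate why the split (1) helps, the sketch below (Python over explicit sets standing in for the symbolic predicates; the data layout is an assumption) computes one PostImage step on the product in two parts. The stuttering part never consults the TGTA state q, which is what lets saturation act on it independently of the property automaton.

```python
def post_image_split(P, R_stutter, R_star, Delta_star):
    """One PostImage step over the product using the split of equation (1):
      - R_stutter: model transitions (s, s2) with an empty changeset,
      - R_star:    model transitions (s, c, s2) with a nonempty changeset c,
      - Delta_star: non-stuttering TGTA transitions (q, c, q2).
    P is a set of product states (s, q)."""
    successors = set()
    for (s, q) in P:
        # T_empty: the model stutters and the TGTA stays on its self-loop.
        for (s1, s2) in R_stutter:
            if s1 == s:
                successors.add((s2, q))
        # T_star: model and TGTA synchronize on the same nonempty changeset c.
        for (s1, c, s2) in R_star:
            if s1 == s:
                for (q1, c2, q2) in Delta_star:
                    if q1 == q and c2 == c:
                        successors.add((s2, q2))
    return successors
```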
4 Experimentation
We now compare the approaches presented in this paper. The symbolic model-checking
approach using TGBA, presented in Section 2, serves as our baseline. We first describe
our implementation and selected benchmarks, prior to discussing the results.
4.1 Implementation
All approaches are implemented on top of three libraries: Spot, SDD/ITS, and LTSmin.
Spot is a model-checking library providing several bricks that can be combined to
build model checkers [7]. In our implementation, we reused the modules providing a
translation from an LTL formula into a TGBA and into a TGTA [1].
SDD/ITS is a library for symbolic representation of state spaces in the form of
Instantiable Transition Systems (ITS): an abstract interface for symbolic Labeled Tran-
sition Systems (LTS). The symbolic encoding of ITS is based on Hierarchical Set De-
cision Diagrams (SDD) [20]. SDDs allow a compact symbolic representation of states
and transition relation.
1 Respectively http://spot.lip6.fr, http://ddd.lip6.fr, and
http://fmt.cs.utwente.nl/tools/ltsmin.
The algorithms presented in this paper can be implemented using any kind of de-
cision diagram (such as OBDD), but use of the SDD software library makes it easy to
benefit from the automatic saturation mechanism described in [12].
LTSmin [2] can generate state spaces from various input formalisms (µCRL, DVE,
GNA, MAPLE, PROMELA, ...) and store the obtained LTS in a concise symbolic for-
mat, called Extended Table Format (ETF). We used LTSmin to convert DVE models
into ETF for our experiments. This approach offers good generality for our tool, since
it can process any formalism supported by the LTSmin tool.
Our symbolic model checker inputs an ETF file and an LTL formula. The LTL
formula is converted into a TGBA or a TGTA, which is then encoded using an ITS. The
ETF model is also symbolically encoded using an ITS (see Sec. 4.2). The two obtained
ITSs are then composed to build a symbolic product, which is also an ITS. Finally, the
OWCTY emptiness check is applied to this product.
4.3 Benchmark
We evaluated the TGBA and TGTA approaches on the following models and formulae:
2 http://fmt.cs.utwente.nl/tools/ltsmin/doc/etf.html
– Our models come from the BEEM benchmark [15], a suite of models for explicit
model checking, which contains some models that are considered difficult for sym-
bolic model checkers [2]. Table 1 summarizes the 16 models we selected as repre-
sentatives of the overall benchmark.
– BEEM provides a few LTL formulae, but they mostly represent safety properties
and can thus be checked without building a product. Therefore, for each model,
we randomly generated 200 stutter-invariant LTL formulae: 100 verified formulae
(empty product) and 100 violated formulae (non-empty product). We consequently
have a total of 3200 pairs of (model, formula).
All tests were run on a 64bit Linux system running on an Intel Xeon E5645 at 2.40GHz.
Executions that exceeded 30 minutes or 4GB of RAM were aborted and are reported
with time and memory above these thresholds in our graphics.
In all approaches evaluated, symbolic products are encoded using the same variable
ordering: we used the symbolic encoding named “log-encode with top-order” by Se-
bastiani et al. [18].
4.4 Results
The results of our experimental comparisons are presented by the two scatter plot ma-
trices of Fig. 3 and Fig. 4. The scatter plot highlighted at the bottom of Fig. 3 compares
the time-performance of the TGTA approach against the reference TGBA approach.
Each point of the scatter plot represents a measurement for a pair (model, formula). For
the highlighted plot, the x-axis represents the TGBA approach and the y-axis represents
the TGTA approach, so 3060 points below the diagonal correspond to cases where the
TGTA approach is better, and the 131 points above the diagonal correspond to cases
where the TGBA approach is better (in scatter plot matrices, each point below the di-
agonal is in favor of the approach displayed on the right, while each point above the
3 The results, models, formulae and tools used in these tests can be downloaded from
http://www.lrde.epita.fr/~ala/TACAS-2014/Benchmark.html
4 We recommend viewing these plots online.
[Fig. 3: scatter plot matrix comparing run times (seconds, logarithmic axes from 0.1 to 1000) of TGBA (sat), TGTA (nosat), and TGTA (sat).]
Fig. 3. Time-comparison of the TGBA and TGTA approaches, with saturation enabled
“(sat)” or disabled “(nosat)”, on a set of 3199 pairs (model, formula). Timeouts and
out-of-memory errors are plotted on separate lines on the top or right edges of the
scatter plots. Each plot also displays the number of cases that are above or below
the main diagonal (including timeouts and out-of-memory errors), i.e., the number of
(model, formula) pairs for which one approach was better than the other. Additional diago-
nals show the location of ×10 and /10 ratios. Points are plotted with transparency to
better highlight dense areas and lessen the importance of outliers.
diagonal is in favor of the approach displayed at the top). Axes use a logarithmic scale.
The colors distinguish violated formulae (non-empty product) from verified formulae
(empty products). In order to show the influence of the saturation technique, we also ran
the TGBA and TGTA approaches with saturation disabled. In our comparison matrix,
the labels “(sat)” and “(nosat)” indicate whether saturation was enabled or not. Fig. 4
gives the memory view of this experiment.
[Fig. 4: scatter plot matrix for memory consumption (MB), the memory view of the same experiment.]
As shown by the highlighted scatter plots in Fig. 3 and 4, the TGTA approach clearly
outperforms the traditional TGBA-based scenario by one order of magnitude. This is
due to the combination of two factors: saturation and exploration of stuttering.
The saturation technique does not significantly improve the model checking using
TGBA (compare “TGBA (sat)” against “TGBA (nosat)” at the top of Fig. 3 and 4). In
fact, the saturation technique is limited on the TGBA approach, because in the transition
relation of Def. 5 each conjunction must consult the variable q representing the state of
the TGBA, therefore q impacts the supports and the reordering of clusters evaluated by
the saturation. This situation is different in the case of the TGTA approach, where the T_∅
term of the transition relation of the product (equation (1)) does not involve the state q
of the TGTA: here, saturation strongly improves performance (compare “TGTA (sat)”
against “TGTA (nosat)”).
Overall the improvement to this symbolic technique was only made possible because
the TGTA representation makes it easy to process the stuttering behaviors separately
from the rest. These stuttering transitions represent a large part of the models’ transi-
tions, as shown by the stuttering-ratios of Table 1. Using these stuttering-ratios, we can
estimate, in our benchmark, the importance of the term T_∅ compared to T_∗ in equation (1).
5 Conclusion
Testing automata [10] are a way to improve the explicit model checking approach when
verifying stutter-invariant properties, but they had not been used for symbolic model
checking. In this paper, we gave the first symbolic approach using testing automata, with
generalized acceptance (TGTA), and compared it to a more classical symbolic approach
(using TGBA).
On our benchmark, using TGTA, we were able to gain one order of magnitude over
the TGBA-based approach.
We have shown that fixpoints over the transition relation of a product between a
Kripke structure and a TGTA can benefit from the saturation technique, especially be-
cause part of their expression is only dependent on the model, and can be evaluated
without consulting the transition relation of the property automaton. The improvement
was possible only because TGTA makes it possible to process stuttering behaviors
specifically, in a way that helps the saturation technique.
In future work, we plan to evaluate the use of TGTA in the context of hybrid ap-
proaches, mixing both explicit and symbolic approaches [18, 6].
References
1. Ben Salem, A.-E., Duret-Lutz, A., Kordon, F.: Model checking using generalized testing
automata. In: Jensen, K., van der Aalst, W.M., Ajmone Marsan, M., Franceschinis, G., Kleijn,
J., Kristensen, L.M. (eds.) ToPNoC VI. LNCS, vol. 7400, pp. 94–122. Springer, Heidelberg
(2012)
2. Blom, S., van de Pol, J., Weber, M.: LTSmin: Distributed and symbolic reachability. In:
Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 354–359. Springer,
Heidelberg (2010)
3. Burch, J.R., Clarke, E.M., McMillan, K.L., Dill, D.L., Hwang, L.: Symbolic model checking:
10^20 states and beyond. In: Proc. of the Fifth Annual IEEE Symposium on Logic in Computer
Science, pp. 1–33. IEEE Computer Society Press (1990)
4. Ciardo, G., Marmorstein, R., Siminiceanu, R.: Saturation unbound. In: Garavel, H., Hatcliff,
J. (eds.) TACAS 2003. LNCS, vol. 2619, pp. 379–393. Springer, Heidelberg (2003)
5. Ciardo, G., Yu, A.J.: Saturation-based symbolic reachability analysis using conjunctive and
disjunctive partitioning. In: Borrione, D., Paul, W. (eds.) CHARME 2005. LNCS, vol. 3725,
pp. 146–161. Springer, Heidelberg (2005)
6. Duret-Lutz, A., Klai, K., Poitrenaud, D., Thierry-Mieg, Y.: Self-loop aggregation product
— A new hybrid approach to on-the-fly LTL model checking. In: Bultan, T., Hsiung, P.-A.
(eds.) ATVA 2011. LNCS, vol. 6996, pp. 336–350. Springer, Heidelberg (2011)
7. Duret-Lutz, A., Poitrenaud, D.: SPOT: an extensible model checking library using transition-
based generalized Büchi automata. In: Proc. of MASCOTS 2004, pp. 76–83. IEEE Computer
Society Press (2004)
8. Etessami, K.: Stutter-invariant languages, ω-automata, and temporal logic. In: Halbwachs,
N., Peled, D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 236–248. Springer, Heidelberg
(1999)
9. Fisler, K., Fraer, R., Kanhi, G., Vardi, M.Y., Yang, Z.: Is there a best symbolic cycle-detection
algorithm? In: Margaria, T., Yi, W. (eds.) TACAS 2001. LNCS, vol. 2031, pp. 420–434.
Springer, Heidelberg (2001)
10. Geldenhuys, J., Hansen, H.: Larger automata and less work for LTL model checking. In:
Valmari, A. (ed.) SPIN 2006. LNCS, vol. 3925, pp. 53–70. Springer, Heidelberg (2006)
11. Giannakopoulou, D., Lerda, F.: From states to transitions: Improving translation of LTL for-
mulæ to Büchi automata. In: Peled, D.A., Vardi, M.Y. (eds.) FORTE 2002. LNCS, vol. 2529,
pp. 308–326. Springer, Heidelberg (2002)
12. Hamez, A., Thierry-Mieg, Y., Kordon, F.: Hierarchical set decision diagrams and automatic
saturation. In: van Hee, K.M., Valk, R. (eds.) PETRI NETS 2008. LNCS, vol. 5062, pp.
211–230. Springer, Heidelberg (2008)
13. Hansen, H., Penczek, W., Valmari, A.: Stuttering-insensitive automata for on-the-fly detec-
tion of livelock properties. In: Proc. of FMICS 2002, ENTCS, vol. 66(2) (2002)
14. Kesten, Y., Pnueli, A., Raviv, L.-O.: Algorithmic verification of linear temporal logic spec-
ifications. In: Larsen, K.G., Skyum, S., Winskel, G. (eds.) ICALP 1998. LNCS, vol. 1443,
pp. 1–16. Springer, Heidelberg (1998)
15. Pelánek, R.: BEEM: Benchmarks for explicit model checkers. In: Bošnački, D., Edelkamp,
S. (eds.) SPIN 2007. LNCS, vol. 4595, pp. 263–267. Springer, Heidelberg (2007)
16. Peled, D., Wilke, T.: Stutter-invariant temporal properties are expressible without the next-
time operator. Information Processing Letters 63(5), 243–246 (1995)
17. Rozier, K.Y., Vardi, M.Y.: A multi-encoding approach for LTL symbolic satisfiability check-
ing. In: Butler, M., Schulte, W. (eds.) FM 2011. LNCS, vol. 6664, pp. 417–431. Springer,
Heidelberg (2011)
18. Sebastiani, R., Tonetta, S., Vardi, M.Y.: Symbolic systems, explicit properties: On hybrid
approaches for LTL symbolic model checking. In: Etessami, K., Rajamani, S.K. (eds.) CAV
2005. LNCS, vol. 3576, pp. 350–363. Springer, Heidelberg (2005)
19. Somenzi, F., Ravi, K., Bloem, R.: Analysis of symbolic SCC hull algorithms. In: Aagaard,
M.D., O’Leary, J.W. (eds.) FMCAD 2002. LNCS, vol. 2517, pp. 88–105. Springer, Heidel-
berg (2002)
20. Thierry-Mieg, Y., Poitrenaud, D., Hamez, A., Kordon, F.: Hierarchical set decision diagrams
and regular models. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505,
pp. 1–15. Springer, Heidelberg (2009)
21. Vardi, M.Y.: An automata-theoretic approach to linear temporal logic. In: Moller, F.,
Birtwistle, G. (eds.) Logics for Concurrency. LNCS, vol. 1043, pp. 238–266. Springer, Hei-
delberg (1996)
Symbolic Synthesis for Epistemic Specifications
with Observational Semantics
Abstract. The paper describes a framework for the synthesis of protocols for
distributed and multi-agent systems from specifications that give a program struc-
ture that may include variables in place of conditional expressions, together with
specifications in a temporal epistemic logic that constrain the values of these vari-
ables. The epistemic operators are interpreted with respect to an observational
semantics. The framework generalizes the notion of knowledge-based program
proposed by Fagin et al (Dist. Comp. 1997). An algorithmic approach to the syn-
thesis problem is developed that computes all solutions, using a reduction to epis-
temic model checking, that has been implemented using symbolic techniques. An
application of the approach to synthesize mutual exclusion protocols is presented.
1 Introduction
In concurrent, distributed or multi-agent systems it is typical that agents must act on the
basis of local data to coordinate to ensure global properties of the system. This leads
naturally to the consideration of the notion of what an agent knows about the global
state, given the state of its local data structures. Epistemic logic, or the logic of knowl-
edge [9] has been developed as a formal language within which to express reasoning
about this aspect of concurrent systems. In particular, knowledge-based programs [10],
a generalization of standard programs in which agents condition their actions on for-
mulas expressed in a temporal-epistemic logic, have been proposed as a framework for
expressing designs of distributed protocols at the knowledge level. Many of the inter-
esting analyses of problems in distributed computing based on notions of knowledge
(e.g. [13]) can be cast in the form of knowledge-based programs.
Knowledge-based programs have the advantage of abstracting from the details of
how information is encoded in an agent’s local state, enabling a focus on what an agent
needs to know in order to decide between its possible actions. On the other hand, this
abstraction means that knowledge-based programs do not have an operational seman-
tics. They are more like specifications than like programs in this regard: obtaining an
implementation of a knowledge-based program requires that concrete properties of the
agent’s local state be found that are equivalent to the conditions on the agent’s knowl-
edge used in the program.
This gap has meant that knowledge-based analyses have been largely conducted as
pencil and paper exercises to date, and only limited automated support for knowledge-
based programming has been available. One approach to automation that has emerged
in the last ten years is the development of epistemic model checking tools [11,16].
These give a partial solution to the gap, in that they allow a putative implementation
of a knowledge-based program to be verified for correctness (for examples, see [2,3]).
However, this leaves open the question of how such an implementation is to be obtained,
which still requires human insight.
Our contribution in this paper is to develop and implement an approach that auto-
mates the construction of implementations for knowledge-based programs for the case
of the observational semantics for knowledge-based programs. (In earlier work [14]
we dealt with stronger semantics for a more limited program syntax, see Section 7 for
discussion). Our approach is to reduce the problem to model checking, enabling the
investment in epistemic model checking to be leveraged to automatically synthesize
implementations of knowledge-based programs. In particular, we build on symbolic
techniques for epistemic model checking.
We in fact generalize the notion of knowledge-based program to a more liberal notion
that we call epistemic protocol specification, based on a protocol template together with
a set of temporal-epistemic formulas that constrain how the template is to be instanti-
ated. This enables our techniques to be applied also to cover ideas such as the sound
local proposition generalization of knowledge-based programs [8]. We illustrate the
approach through an application of the knowledge-based programming methodology
to the development of protocols for mutual exclusion. We give an abstract knowledge-
based specification of a protocol for mutual exclusion, and show how our approach can
automatically extract different protocols solving this problem.
where v ∈ V and i ∈ Ags. This is CTL plus the construct Ki φ, which says that agent i
knows that φ holds. We freely use standard operators that are definable in terms of the
above, e.g., AFφ = ¬EG¬φ and AGφ = ¬E(trueU¬φ).
A (finite) model is a tuple M = (S , I, −→, {∼i }i∈Ags , F , π) where S is a (finite) set of
states, I ⊆ S is a set of initial states, →⊆ S × S is a transition relation, ∼i : S × S →
{0, 1} is an indistinguishability relation of agent i, component F is a fairness condition
(explained below), and π : S → P(V) is a truth assignment (here P(V) denotes the
powerset of V.) A path in M from a state s ∈ S is a finite or infinite sequence s = s0 −→
s1 −→ s2 −→ . . . We assume that −→ is serial, i.e. for each s ∈ S there exists t ∈ S such
that s −→ t. We model fairness using the condition F by taking this to be a generalized Büchi condition, i.e., a set of subsets of S, with a path considered fair if it visits each of these sets infinitely often.
We assume that the sets Vare and LVari , for i ∈ Ags, are mutually disjoint.1
Note that the transition relation −→i indicates how an agent’s local variables are
updated when performing an action, which may depend on the current values of the pa-
rameter variables in the environment. This transition relation does not specify a change
in the value of the parameter variables: changes to these are determined in the envi-
ronment on the basis of the actions that this agent, and others, perform in the given
step.
Given an environment E and a collection {Proti }i∈Ags of concrete protocols for the
agents, we may construct a model M(E, {Proti }i∈Ags ) = (S , I, −→, {∼i }i∈Ags , F , π) as fol-
lows. The set of states is S = P(Vare ∪ ⋃_{i∈Ags} LVari), i.e., the set of all assignments to the environment and local variables. We represent such states in the form s = se ∪ ⋃_{i∈Ags} li,
where se ⊆ Vare and each li ⊆ LVari . Such a state s is taken to be an initial state in I if
se ∈ Ie and li ∈ Ii for all agents i. That is, I is the set of states where the environment
and each of the agents is in an initial state. The epistemic indistinguishability relations
for agent i over the states S is defined by s ∼i t iff s ∩ OVari = t ∩ OVari , i.e., the states
s and t have the same values for all of agent i’s observable variables. The transition
relation −→ is given by se ∪ ⋃_{i∈Ags} li −→ s′e ∪ ⋃_{i∈Ags} l′i if there exists a joint action a such that se −→e^a s′e and (se ∩ PVari) ∪ li −→i^{ai} l′i for each agent i. We take the fairness
condition F to contain the set

{se ∪ ⋃_{i∈Ags} li | se ∈ α, l1 ∈ P(LVar1), . . . , ln ∈ P(LVarn)}

for each α ∈ Fe. That is, we impose the environment's fairness constraints on the envi-
ronment portion of the state. The assignment π is given by π(s) = s.
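As a rough illustration of this construction, the following Python sketch builds the state space and the indistinguishability relation explicitly for a toy two-agent example. This is purely illustrative: the paper's tool (MCK) represents states and relations symbolically with BDDs, and the variable names below are placeholders of my own.

# Explicit-state sketch of the construction of M(E, {Prot_i}); illustrative only.
from itertools import combinations

def powerset(vs):
    """All subsets of vs, each as a frozenset (i.e., a truth assignment)."""
    vs = list(vs)
    return [frozenset(c) for r in range(len(vs) + 1) for c in combinations(vs, r)]

# Hypothetical variable sets for a two-agent example (names are placeholders).
Var_e = {"token"}                          # environment variables
LVar  = {0: {"l0"}, 1: {"l1"}}             # local variables per agent
OVar  = {0: {"token", "l0"}, 1: {"l1"}}    # observable variables per agent

# S = P(Var_e ∪ ⋃_i LVar_i): every assignment to environment and local variables.
S = powerset(set(Var_e) | set().union(*LVar.values()))

def indistinguishable(i, s, t):
    """s ~_i t iff s and t agree on the values of agent i's observable variables."""
    return s & frozenset(OVar[i]) == t & frozenset(OVar[i])

s = frozenset({"token"})
t = frozenset({"token", "l1"})
print(indistinguishable(0, s, t))   # True: the states differ only on l1, invisible to agent 0
print(indistinguishable(1, s, t))   # False: agent 1 observes l1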
To illustrate our approach we use a running example concerned with mutual exclusion.
Mutual exclusion protocols [7] are intended for settings where it is required that only
one of a set of agents has access to a resource (e.g. a printer, or a write access to a
file) at a given time. There exists a large literature on this topic, with many different
approaches to its solution [17].
To model the structure of a mutual exclusion protocol, we suppose that each agent
has three states: waiting, trying, and critical. Intuitively, while in the waiting
state, the agent does not require the resource, and it idles for some period of time until
it decides that it needs access to the resource. It then enters the trying state, where
it waits for permission to use the resource. Once this permission has been obtained, it
enters the critical state, within which it may use the resource. Once done, it exits
the critical state and returns to the waiting state. The overall structure of the protocol
is therefore a cycle waiting → trying → critical → waiting. To ensure fair
sharing of the resource, we assume that no agent remains in its critical state forever.
To avoid the situation where two agents are using the resource at the same time, the
specification requires that no two agents are in the critical state simultaneously. In
order for a solution to the mutual exclusion problem to satisfy this specification, the
agents need to share some information about their state and to place an appropriate
guard on the transition from the trying state to the critical state. Mutual exclusion
protocols differ in their approach to these requirements by providing different ways for
agents to use shared variables to distribute and exploit information about their state.
Our application of the synthesis methodology assumes that the designer has some intuitions concerning what information needs to be distributed, and writes the protocol and environment so as to reflect these ideas concerning information distribution.
However, given a pattern of communication, it may still be a subtle matter to determine
what information an agent can deduce from some particular values of its observable
variables. We use the epistemic specification to relate the information distributed and
the conditions used by the agent to make state transitions.
A general structure for a mutual exclusion protocol is given as a protocol template
in Figure 1. The code uses a simple programming language, containing a Dijkstra style
nondeterministic-if construct if e1 → P1 [] . . . [] ek → Pk fi, which nondeterministically
executes one of the statements Pi for which the corresponding guard ei evaluates to
true. The final ek may be the keyword otherwise which represents the negation of the
disjunction of the preceding ei . If there is no otherwise clause and none of the guards in
a conditional are true then the program defaults to a skip action. Evaluation of guards in
if and while statements is assumed to take zero time, and a transition occurs only once
an action is encountered in the execution. This applies also to an exit from a while loop.
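The following Python sketch mimics the evaluation rule just described (my own illustration, not the paper's semantics in full): among the branches whose guards hold one is chosen nondeterministically, an otherwise branch catches the case where no other guard holds, and a conditional with no true guard and no otherwise defaults to skip.

import random

OTHERWISE = object()   # sentinel standing in for the 'otherwise' keyword
SKIP = ("skip",)       # default action when no guard holds

def eval_nondet_if(branches, state):
    """branches: list of (guard, statement); each guard is a predicate on the state
    or the OTHERWISE sentinel. Returns the statement to execute."""
    enabled = [stmt for guard, stmt in branches
               if guard is not OTHERWISE and guard(state)]
    if enabled:
        return random.choice(enabled)          # nondeterministic choice among true guards
    for guard, stmt in branches:
        if guard is OTHERWISE:                 # negation of the disjunction of the other guards
            return stmt
    return SKIP                                # no guard true and no otherwise: default to skip

# Example: if x>0 -> ("dec",) [] x<0 -> ("inc",) [] otherwise -> ("noop",) fi
branches = [(lambda s: s["x"] > 0, ("dec",)),
            (lambda s: s["x"] < 0, ("inc",)),
            (OTHERWISE, ("noop",))]
print(eval_nondet_if(branches, {"x": 0}))      # -> ('noop',)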
Variables in the programming notation are allowed to be of finite types (these are boolean encoded in the translation to the semantic level). We assume that a vector of vari-
ables state indexed by agent names records the state in {waiting, trying, critical}
of each agent. Thus, mutual exclusion can be specified by the formula

AG ⋀_{i, j∈Ags, i≠j} ¬(state[i] = critical ∧ state[j] = critical) .    (1)
The protocol template also uses three actions for the agent: EnterTry, EnterCrit and
ExitCrit, which correspond to entering the trying, critical and waiting states respec-
tively. We take the variables state[i] to be included in the set of environment variables
Vare . When there are n agents, with Ags = {0 . . . n − 1}, we assume the code for the
environment transition always includes the following:
for i = 0 . . . n − 1 do
if i.EnterTry → state[i] := trying
[] i.EnterCrit → state[i] := critical
[] i.ExitCrit → state[i] := waiting
fi
(Here i.a is a proposition that holds during the computation of any transition in which
agent i performs the action a.) Additional code describing the effect of these actions
may be included, which represents the way that the agents distribute information to
each other concerning their state. A number of different instantiations of this additional
code for these actions are discussed below.
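A direct transliteration of this environment fragment into Python might look as follows. This is only a sketch; the action names are from the text, but the joint-action representation is my own placeholder.

# One environment step: apply each agent's chosen action to state[i].
# joint_action maps agent index -> one of the three action names (or None).
def env_step(state, joint_action):
    state = dict(state)        # state: {i: "waiting" | "trying" | "critical"}
    for i, act in joint_action.items():
        if act == "EnterTry":
            state[i] = "trying"
        elif act == "EnterCrit":
            state[i] = "critical"
        elif act == "ExitCrit":
            state[i] = "waiting"
        # no action (None): state[i] is left unchanged
    return state

print(env_step({0: "waiting", 1: "trying"}, {0: "EnterTry", 1: "EnterCrit"}))
# -> {0: 'trying', 1: 'critical'}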
In our epistemic specifications, we include in Φ, for each agent i, the following
constraint on the template variable xi that guards entry to the critical section:
AG(xi ⇔ Ki(AX(⋀_{j∈Ags} (j ≠ i ⇒ state[j] ≠ critical))))    (2)
Intuitively, this states that agent i enters its critical section when it knows that, after the next transition, no other agent will be in its critical section. Note that this formula falls
which requires that the protocol synthesized ensures that whenever an agent starts try-
ing, it is eventually able to enter its critical section.
One of the benefits of knowledge-based programs is that they enable the essential
reasons for correctness of a protocol to be abstracted in a way that separates the infor-
mation on the basis of which an agent acts from the way that this information is encoded
in the state of the system. This, it is argued, allows for simpler correctness proofs that
display the commonalities between different protocols solving the same problem.
This can be seen in the present specification: if the agents follow this specification,
then they will not violate mutual exclusion. The proof of this is straightforward; we
sketch it informally. Suppose that there is a violation of mutual exclusion, and let t
be the earliest time that we have state[i] = critical ∧ state[j] = critical for some pair of agents i ≠ j. Then either i or j performs EnterCrit to enter its critical section at time t − 1. Assuming, without loss of generality, that it is agent i, we have xi at time t − 1, so by (2), we must have Ki(AX(⋀_{k∈Ags} (k ≠ i ⇒ state[k] ≠ critical))) at time t − 1. But then (since validity of Ki φ ⇒ φ is immediate from the semantics of the knowledge operator), it follows that AX(state[j] ≠ critical) at time t − 1,
contradicting the fact that the protocol makes a transition, in the next step, to a state
where state[ j] = critical.
We note that only the implication from left to right in (2) is used in this argument,
and it would also be valid if we removed the knowledge operator. This is an example of
a general point that led to the “sound local proposition” generalization of knowledge-
based programs proposed in [8]. However, weakening (2) to only the left to right part
allows the trivial implementation θ(xi ) = False, where no agent ever enters its critical
section. The implication from right to left in (2) amounts to saying that rather than this
very weak implementation, we want the strongest possible implementation where an
agent enters its critical section whenever it has sufficient information. Here the knowl-
edge operator is essential since, in general, the non-local condition inside the knowledge
operator will not be equivalent to a local proposition implementing xi .
The description above is not yet a complete solution to the mutual exclusion problem:
it remains to describe how agents distribute information about their state, and how the
data structures encoding this information are related to a local condition of the agent’s
state that can be substituted for the template variable so as to satisfy the epistemic
specification. We consider here two distinct patterns of information passing, based on
two overall systems architectures. In both cases KVari = {xi } for all agents i.
Ring Architecture: In the ring architecture we consider n agents Ags = {0, . . . , n − 1}
in a ring, with agent i able to communicate with agent i + 1 mod n. This communication
pattern is essentially that of token ring protocols. In this case we assume that communi-
cation is by means of a single bit for each agent i, represented by a variable bit[i]. We
take Vare = {bit[i], state[i] | i = 0 . . . n − 1} and let PVari = {bit[i], state[i]} and
LVari = ∅ and OVari = {bit[i]}. Agent i is able to affect its own bit as well as the bit of
agent i + 1 mod n through its actions. More precisely, we add to the above code for the
environment state transitions the following semantics for the ExitCrit actions:
for i = 0 . . . n − 1 do
if i.ExitCrit then begin bit[i] := ¬bit[i]; bit[i + 1 mod n] := ¬bit[i + 1 mod n] end
That is, on exiting the critical section, the agent flips the value of its own bit, as well
as the value of its successor’s bit. To ensure fairness, we also add to the environment,
for each agent i, the Büchi fairness constraint state[i] ≠ waiting, which says that the agent does not remain forever in the waiting state, but eventually tries to go critical. This ensures that this agent takes its turn and does not forever block other agents who may be trying to enter their critical section. We also add the fairness constraints state[i] ≠ critical to ensure that no agent stays in its critical section forever. (However, we do not include state[i] ≠ trying as a fairness constraint: it is up to the protocol to ensure
that an agent is eventually able to enter its critical section once it starts trying!)
Broadcast Architecture: In the broadcast architecture, we assume that the n agents
broadcast their state to all other agents. In this case, no additional variables are needed
and we take Vare = {state[ j] | j = 0 . . . n − 1}. Also, for each agent i, we take PVari =
OVari = Vare and LVari = ∅. The only code required for the actions EnterTry, EnterCrit
and ExitCrit is that given above for updating the variables state[i]. We do not
need to assume eventual progression from waiting to trying in this case (we allow an
agent to wait forever, in this case) so the only fairness constraints are state[i] ≠ critical, to ensure that no agent is forever critical.
Implementation Example: We describe an example of an implementation in the case
of the ring architecture for mutual exclusion described above. We assume that initially,
bit[i] = 0 for all agents i. Consider the substitution defined by θ(xi) = ¬bit[i] if i = 0 and θ(xi) = bit[i] if i ≠ 0. (Note that these are boolean expressions in the observable
variables OVari = {bit[i]}.) It can be shown that this yields an implementation of the
epistemic protocol specification for the ring architecture (we discuss our automated
synthesis of this implementation below.) Intuitively, in this implementation, agent 0
initially holds the token, represented by bit[0] = 0. After using the token to enter its
critical section, it sets bit[0] = 1 to relinquish the token, and bit[1] = 1 in order
to pass the token to agent 1. Thus, for agent 1, holding the token is represented by
bit[1] being true. The same holds for the remaining agents. (Obviously, there is an
asymmetry in these conditions for the agents, but any solution needs to somehow break
the symmetry in the initial state.) Intuitively, specification formula (2) holds because the
implementation maintains the invariant that at most one of the conditions θ(xi ) guarding
entry to the agents’ critical sections holds at any time, and when it is false, the agent is
not in its critical section. Thus, the agent i for which θ(xi ) is true knows that no other
agent is in, or is able to enter, its critical section. Consequently, it knows that no other
agent will be in its critical section at the next moment of time.
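To make the behaviour of this implementation concrete, here is a small Python simulation (my own sketch, not the synthesized MCK model) that runs the ring protocol with θ(x0) = ¬bit[0] and θ(xi) = bit[i] for i ≠ 0 and checks the mutual exclusion invariant on randomly sampled executions.

import random

def simulate_ring(n=4, steps=2000, seed=0):
    rng = random.Random(seed)
    state = ["waiting"] * n
    bit = [0] * n                        # all bits initially 0

    def guard(i):                        # theta(x_i): the guard on entering critical
        return (not bit[0]) if i == 0 else bool(bit[i])

    for _ in range(steps):
        # choose every agent's action from the current state (simultaneous moves)
        actions = []
        for i in range(n):
            if state[i] == "waiting":
                actions.append("EnterTry" if rng.random() < 0.5 else None)
            elif state[i] == "trying":
                actions.append("EnterCrit" if guard(i) else None)
            else:  # critical
                actions.append("ExitCrit" if rng.random() < 0.5 else None)
        # apply all chosen actions
        for i, act in enumerate(actions):
            if act == "EnterTry":
                state[i] = "trying"
            elif act == "EnterCrit":
                state[i] = "critical"
            elif act == "ExitCrit":
                state[i] = "waiting"
                bit[i] ^= 1                    # flip own bit ...
                bit[(i + 1) % n] ^= 1          # ... and the successor's bit
        # mutual exclusion invariant
        assert state.count("critical") <= 1, (state, bit)

simulate_ring()
print("mutual exclusion held on all sampled runs")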
essentially constructs a model that encodes all possible guesses of the environment, and
then uses model checking to determine which guesses actually yield an implementation.
The consideration of all guesses is done in bulk, using symbolic techniques.
For each agent i, let Oi be the set of boolean assignments to OVari ; this represents
the set of possible observations that agent i can make. We may associate to each o ∈ Oi
a conjunction ψo of literals over variables v in OVari , containing literal v if o(v) = 1 and
¬v otherwise.
Since an implementation θ(v) for a template variable v is a boolean condition over
observable variables, we may equivalently view this as corresponding to the set of ob-
servations on which it holds. This set can in turn be represented by its characteristic
mapping from Oi to boolean values. To represent the entire implementation θ, we intro-
duce for each agent i ∈ Ags a set of new boolean variables Xi , containing the variables
xi,o,v, where o ∈ Oi and v ∈ KVari. Let X = ⋃_{i∈Ags} Xi. We call X the implementation
variables of the epistemic protocol specification S.
A candidate assignment θ for an implementation of the epistemic protocol specifica-
tion, can be represented by a state χθ over the variables X, such that for an observation
o ∈ Oi and variable v ∈ KVari , we have xi,o,v ∈ χθ iff θ(v) holds with respect to assign-
ment o. Conversely, given a state χ over the variables X, we can construct an assignment
θχ mapping, for each agent i, the variables KVari to boolean conditions over OVari, by

θχ(v) = ⋁_{o∈Oi, xi,o,v∈χ} ψo .
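The correspondence between a substitution θ and a state χ over the implementation variables is easy to spell out concretely. The sketch below is an illustration under assumed data representations (not MCK code): an observation o is the frozenset of true variables in OVar_i, and θ(v) is treated as a predicate on observations rather than as a syntactic disjunction of the ψo.

from itertools import combinations

def observations(ovar):
    """O_i: all boolean assignments to OVar_i, each as the frozenset of true variables."""
    vs = list(ovar)
    return [frozenset(c) for r in range(len(vs) + 1) for c in combinations(vs, r)]

def encode(theta, ovar, kvar, agent):
    """chi_theta: the implementation variables x_{i,o,v} made true by theta."""
    return {(agent, o, v) for o in observations(ovar) for v in kvar if theta[v](o)}

def decode(chi, kvar, agent):
    """theta_chi(v): holds on observation o exactly when x_{i,o,v} is in chi."""
    return {v: (lambda o, v=v: (agent, o, v) in chi) for v in kvar}

# Ring example for agent 1: OVar_1 = {bit[1]}, KVar_1 = {x1}, theta(x1) = bit[1].
theta = {"x1": lambda o: "bit1" in o}
chi = encode(theta, {"bit1"}, {"x1"}, agent=1)
theta_back = decode(chi, {"x1"}, agent=1)
for o in observations({"bit1"}):
    assert theta["x1"](o) == theta_back["x1"](o)   # round trip agrees on every observation
print(chi)   # {(1, frozenset({'bit1'}), 'x1')}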
For l′i ∈ P(LVari) and a ∈ Actsi, we then let s ∪ χ −→i^{X,a} l′i iff s ∪ κ(s, χ) −→i^a l′i.
Intuitively, since the assignment χ to the variables X encodes an implementation θ,
we make these variables an input to the transformed protocol, which uses them to make
decisions that depend on the protocol template variables when executing the protocol
template. In particular, when an observation o = s ∩ OVari ∈ Oi (equivalently, s |= ψo )
satisfies xi,o,v ∈ χ, this corresponds to the template variable v taking the value true on
state s according to the implementation θ(v). We therefore execute a transition of the
protocol template in which v is taken to be true.
Note that the definition of the sets OVariX makes the variables X observable to all the
agents: this effectively makes the particular implementation being run common knowl-
edge to the agents, as it is in the system that we obtain from each concrete imple-
mentation. However, the combined transformed environment and transformed protocol
templates represent not just one implementation, but all possible implementations. This
is stated formally in the following result.
Theorem 2. Let S = ⟨Ags, E, {Pi}i∈Ags, Φ⟩ be an epistemic protocol specification, and
let X be the set of implementation variables of S. For each implementation θ of S, we
have M(E X , {PiX }i∈Ags ), s |= Φθ for all initial states s of M(E X , {PiX }i∈Ags ) with s∩X = χθ .
Conversely, suppose that χ ∈ P(X) is such that M(E X , {PiX }i∈Ags ), s |= Φθχ for all initial
states s of M(E X , {PiX }i∈Ags ) with s ∩ X = χ. Then θχ is an implementation of S.
This result gives a reduction from the synthesis problem to the well understood prob-
lem of model checking. Any algorithm for model checking specifications expressible in
the framework can now be applied. In particular, symbolic model checking techniques
apply. We have implemented the above approach as an extension of binary-decision di-
agram (BDD) based epistemic model checking algorithms already implemented in the
epistemic model checker MCK [11], which handles formulas in CTL∗ Kn with fairness
constraints using BDD based representations. The model checking techniques involved
are largely standard, as in [6], with a trivial extension to handle the epistemic operators
(these just require BDD’s representing the set of reachable states and an equivalence
on observable variables.) We make one optimization, based on the observation that the
variables X encoded in the state do not actually change on any given run. We can there-
fore reduce the number of BDD variables required to represent the transition relation
by retaining only one copy of these variables. Also, we first compute the observations
o ∈ Oi that can occur at reachable states in any putative implementation, to reduce the
set X to variables xi,o,v where o is in fact a possible observation.
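Conceptually, Theorem 2 licenses a "synthesis by model checking" loop of the following shape. The sketch below is a deliberately naive, explicit enumeration in Python; the actual tool considers all candidates at once symbolically, inside BDD-based model checking, and transformed_model and model_check are stand-in hooks of my own, not MCK functions.

from itertools import combinations

def all_subsets(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def synthesize(impl_vars, transformed_model, model_check, spec):
    """Collect every assignment chi over the implementation variables X for which the
    transformed model satisfies the substituted specification from all initial states
    agreeing with chi, as licensed by Theorem 2. `transformed_model` and `model_check`
    are assumed, user-supplied hooks standing in for the symbolic machinery."""
    solutions = []
    for chi in all_subsets(impl_vars):     # exponential enumeration: illustration only
        if model_check(transformed_model, spec, initial_restriction=chi):
            solutions.append(chi)          # each such chi encodes an implementation theta_chi
    return solutions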
We note that the reduction does entail a blowup in the number of variables. Suppose
we have n agents, with the number of observable variables of agent i being ki . Then the
size of the set Xi could be as large as 2^ki · |KVari|, so that |X| = Σ_{i=1...n} 2^ki · |KVari| is the number of new variables that need to be included in the BDD computation. With BDD-
based symbolic model checking currently typically viable for numbers of BDD variables on the order of hundreds, this places an inherent limit on the size of example that
we can expect to handle using our technique. Evidently, the technique favours examples
in which the number of observable variables per agent is kept small. This is reflected in
the results obtained for our running example, which we now discuss.
Table 1. Performance of the synthesis implementation: running time (seconds) and total number of BDD variables, as the number of agents grows.

No. of agents           2     3      4     5     6      7      8
Ring: time (s)          0.3   1.7    5.5   17.2  157.7  509.1  597
Ring: BDD vars          22    33     44    55    66     77     88
Broadcast: time (s)     0.2   194.2
Broadcast: BDD vars     34    105    356
set of all possible implementations. We now describe the implementations obtained for
the two versions of this specification.
We note that, as defined above, two implementations, corresponding to substitu-
tions θ1 and θ2 for the template variables, may be behaviorally equivalent, yet for-
mally distinct. Define the equivalence relation ∼ on such substitutions by θ1 ∼ θ2 if
M(E, {Pi θ1 }i∈Ags ) and M(E, {Pi θ2 }i∈Ags ) have the same set of reachable states, and for all
such reachable states s, and all template variables v, we have M(E, {Pi θ1 }i∈Ags ), s |= θ1 (v)
iff M(E, {Pi θ2 }i∈Ags ), s |= θ2 (v). Intuitively, this means that θ1 and θ2 are equivalent, ex-
cept on unreachable states. We treat such implementations as identical and return only
one element of each equivalence class.
Ring Architecture: We have already discussed one of the possible implementations
of the epistemic protocol specification for the ring architecture as the example in Sec-
tion 4, viz., that in which θ(x0) = ¬bit[0] and θ(xi) = bit[i] for i ≠ 0. Our synthesis
system returns this as one of the implementations synthesized. As discussed above, this
implementation essentially corresponds to a token ring protocol in which agent 0 ini-
tially holds the token. By symmetry, it is easily seen that we can take any agent k to
be the one initially holding the token, and each such choice yields an implementation,
with θ(xk) = ¬bit[k] and θ(xi) = bit[i] for i ≠ k. Our synthesis system returns all these
solutions, but also confirms that there are no others. Thus, up to symmetry, there is
essentially just one implementation for this specification.
We note that, whatever the total number of agents n, the number of variables observ-
able to agent i is just one, so we have |Xi | = 2 and we add |X| = 2n variables to the
underlying BDD for model checking in order to perform synthesis. This gives a slow
growth rate in the number of BDD variables as we scale the number of agents, and
enables us to deal with moderate size instances. Table 1 gives the performance results
for our implementation as we scale the number of agents.2 The total number of BDD
variables per state (i.e., the environment variables, local protocol and program counter
variables and X) is also indicated.
Broadcast Architecture: In case of the broadcast architecture, the number of variables
that need to be added for synthesis increases much more rapidly. In case of n agents,
we have |Xi| = 2^{2n} (since we need two bits to represent each agent's state variable state[j]), and |X| = n · 2^{2n}. Accordingly, the approach works only on modest scale
examples. We describe the solutions obtained in the case of 3 agents. Our synthesis
procedure computes that there exist 6 distinct solutions, which amount essentially to
2 Our experiments were conducted on a Debian Linux system with a 3.3GHz Intel i5-2500 CPU, with each process allocated up to 500MB of memory.
[Figure 2: behaviour of the synthesized broadcast solution on states where no agent is critical; nodes TTT:0, WTT:1, TTW:0, TWT:2, TWW:0, WWT:2, WTW:1.]
one solution under permutation of the roles of the agents. To understand this solution,
note first that if any agent is in its critical section, all others know this, but cannot know
whether the agent will exit its critical section in the next step. It follows that no agent
is able to enter its critical section in the next step. It therefore suffices to consider the
behavior of the solution on states where no agent is in its critical section, but at least one
agent is in state trying. We describe this by means of the graph in Figure 2. Vertices
in this graph indicate the protocol state inhabited by each of the agents, as well as the
agent that the protocol selects for entry to the critical state, e.g., WTT:1 indicates that
agent 0 is in state waiting, and agents 1 and 2 are in state trying, and that agent
1 enters its critical state in the next step. The edges point to possible successor states
reached at the time the selected agent next exits its critical state. (Note that, at this time,
no other agent has had the opportunity to enter its critical state, but another agent may
have moved from waiting to trying, so there is some nondeterminism in the graph.)
It can be verified by inspection (a focus on the upper triangle suffices, since only one
agent is trying in states in the lower triangle) that the solution is fair: there is no cycle
where an agent is constantly trying but never selected for entry to the critical section.
7 Related Work
Most closely related to our work in this paper are results on the complexity of verifying
and deciding the existence of knowledge-based programs [20,10], with respect to what
is essentially the observational semantics. The key idea of these complexity results is
similar to the one we have used in our construction: guess a knowledge assignment that
indicates at which observations (local states, in their terminology) a knowledge formula
holds, and verify that this corresponds to an implementation. However, our epistemic
specifications are syntactically more expressive than knowledge-based programs, and
some of the details of their work are more complex, in that a labelling of runs by sub-
formulas of knowledge formulas is also required. In part this is because of the focus on
linear time temporal logic in that work, compared to our use of branching time temporal logic. That work also does not consider any concrete implementation of the theoretical
results using symbolic techniques. The complexity bounds for determining the exis-
tence of an implementation of a knowledge-based program in [20,10] (NP-complete
8 Conclusion
Our focus in this paper has been to develop an approach that enables the space of all
solutions to an epistemic protocol specification to be explored. Our implementation
gives the first tool with this capability with respect to the observational semantics for
knowledge, opening up the ability to more effectively explore the overall methodology
of the knowledge-based approach to concurrent systems design through experimenta-
tion with examples beyond the simple mutual exclusion protocol we have considered.
Applications of the tool to the synthesis of fault-tolerant protocols, where the flow of
knowledge is considerably more subtle than in the reliable setting we have considered,
is one area that we intend to explore in future work. Use of alternative model checking
approaches to the BDD-based algorithm we have used (e.g., SAT-based algorithms) are
also worth exploring.
References
1. Bar-David, Y., Taubenfeld, G.: Automatic discovery of mutual exclusion algorithms. In: Fich,
F.E. (ed.) DISC 2003. LNCS, vol. 2848, pp. 136–150. Springer, Heidelberg (2003)
2. Bataineh, O.A., van der Meyden, R.: Abstraction for epistemic model checking of dining-
cryptographers based protocols. In: Proc. TARK, pp. 247–256 (2011)
3. Baukus, K., van der Meyden, R.: A knowledge based analysis of cache coherence. In: Davies,
J., Schulte, W., Barnett, M. (eds.) ICFEM 2004. LNCS, vol. 3308, pp. 99–114. Springer,
Heidelberg (2004)
4. Bensalem, S., Peled, D., Sifakis, J.: Knowledge based scheduling of distributed systems. In:
Manna, Z., Peled, D.A. (eds.) Time for Verification. LNCS, vol. 6200, pp. 26–41. Springer,
Heidelberg (2010)
5. Bonollo, U., van der Meyden, R., Sonenberg, E.: Knowledge-based specification: Investigat-
ing distributed mutual exclusion. In: Bar Ilan Symposium on Foundations of AI (2001)
6. Clarke, E., Grumberg, O., Peled, D.: Model Checking. The MIT Press (1999)
7. Dijkstra, E.W.: Solution of a problem in concurrent programming control. Commun.
ACM 8(9), 569 (1965)
8. Engelhardt, K., van der Meyden, R., Moses, Y.: Knowledge and the logic of local proposi-
tions. In: Proc. Conf. Theoretical Aspects of Knowledge and Rationality, pp. 29–41 (1998)
9. Fagin, R., Halpern, J., Moses, Y., Vardi, M.: Reasoning About Knowledge. MIT Press (1995)
10. Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y.: Knowledge-based programs. Distributed
Computing 10(4), 199–225 (1997)
11. Gammie, P., van der Meyden, R.: MCK: Model checking the logic of knowledge. In: Alur, R.,
Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 479–483. Springer, Heidelberg (2004)
12. Graf, S., Peled, D., Quinton, S.: Achieving distributed control through model checking. For-
mal Methods in System Design 40(2), 263–281 (2012)
13. Halpern, J.Y., Zuck, L.D.: A little knowledge goes a long way: Knowledge-based derivations
and correctness proofs for a family of protocols. J. ACM 39(3), 449–478 (1992)
14. Huang, X., van der Meyden, R.: Symbolic synthesis of knowledge-based program imple-
mentations with synchronous semantics. In: Proc. TARK, pp. 121–130 (2013)
15. Katz, G., Peled, D., Schewe, S.: Synthesis of distributed control through knowledge accumu-
lation. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 510–525.
Springer, Heidelberg (2011)
16. Lomuscio, A., Qu, H., Raimondi, F.: MCMAS: A model checker for the verification of multi-
agent systems. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 682–688.
Springer, Heidelberg (2009)
17. Srimani, P., Das, S.R. (eds.): Distributed Mutual Exclusion Algorithms. IEEE (1992)
18. van der Meyden, R., Vardi, M.Y.: Synthesis from knowledge-based specifications (Extended
abstract). In: Sangiorgi, D., de Simone, R. (eds.) CONCUR 1998. LNCS, vol. 1466, pp.
34–49. Springer, Heidelberg (1998)
19. van der Meyden, R., Wilke, T.: Synthesis of distributed systems from knowledge-based spec-
ifications. In: Abadi, M., de Alfaro, L. (eds.) CONCUR 2005. LNCS, vol. 3653, pp. 562–576.
Springer, Heidelberg (2005)
20. Vardi, M.Y.: Implementing knowledge-based programs. In: Proc. Conf. on Theoretical As-
pects of Rationality and Knowledge, pp. 15–30 (1996)
Synthesis for Human-in-the-Loop Control Systems
1 Introduction
Many safety-critical systems are interactive, i.e., they interact with a human being, and
the human operator’s role is central to the correct working of the system. Examples
of such systems include fly-by-wire aircraft control systems (interacting with a pilot),
automobiles with driver assistance systems (interacting with a driver), and medical de-
vices (interacting with a doctor, nurse, or patient). We refer to such interactive control
systems as human-in-the-loop control systems. The costs of incorrect operation in the
application domains served by these systems can be very severe. Human factors are
often the reason for failures or “near failures”, as noted by several studies (e.g., [1,7]).
One alternative to human-in-the-loop systems is to synthesize a fully autonomous
controller from a high-level mathematical specification. The specification typically cap-
tures both assumptions about the environment and correctness guarantees that the con-
troller must provide, and can be specified in a formal language such as linear temporal
logic (LTL) [15]. While this correct-by-construction approach looks very attractive, the
existence of a fully autonomous controller that can satisfy the specification is not al-
ways guaranteed. For example, in the absence of adequate assumptions constraining
its behavior, the environment can be modeled as being overly adversarial, causing the
synthesis algorithm to conclude that no controller exists. Additionally, the high-level
specification might abstract away from inherent physical limitations of the system, such
as insufficient range of sensors, which must be taken into account in any real implemen-
tation. Thus, while full manual control puts too high a burden on the human operator,
This work was performed when the first author was at UC Berkeley.
In this paper, we study the construction of such a controller in the context of reactive
synthesis from LTL specifications. Reactive synthesis is the process of automatically
synthesizing a discrete system (e.g., a finite-state Mealy transducer) that reacts to en-
vironment changes in such a way that the given specification (e.g., an LTL formula) is
satisfied. There has been growing interest recently in the control and robotics commu-
nities (e.g., [20,9]) to apply this approach to automatically generate embedded control
software. In summary, the main contributions of this paper are:
• A formalization of human-in-the-loop control systems and the problem of synthesiz-
ing such controllers from high-level specifications, including four key criteria these
controllers must satisfy.
• An algorithm for synthesizing human-in-the-loop controllers that satisfy the afore-
mentioned criteria.
• An application of the proposed technique to examples motivated by driver-assistance
systems for automobiles.
The paper is organized as follows. Section 2 describes a motivating example based on car following. Section 3 provides a formalism and characterization
of the human-in-the-loop controller synthesis problem. Section 4 reviews material on
reactive controller synthesis from temporal logic. Section 5 describes our algorithm for
the problem. We then present case studies of safety critical driving scenarios in Sec-
tion 6. Finally, we discuss related work in Section 7 and conclude in Section 8.
1 In this paper, we do not consider explicit dynamics of the plant. Therefore, the plant can also be considered part of the environment.
2 Motivating Example
Consider the example in Figure 2. Car A is the autonomous vehicle; cars B and C are
two other cars on the road. We assume that the road has been divided into discretized
regions that encode all the legal transitions for the vehicles on the map, similar to the
discretization setup used in receding horizon temporal logic planning [21]. The objec-
tive of car A is to follow car B. Note that cars B and C are part of the environment and cannot be controlled. The notion of following can be stated as follows. We assume that car A is equipped with sensors that allow it to see two squares ahead of itself if its view is not obstructed, as indicated by the region enclosed by the blue dashed lines in Figure 2a.
In this case, car B is blocking the view of car A, and thus car A can only see regions 3,
4, 5 and 6. Car A is said to be able to follow car B if it can always move to a position
where it can see car B. Furthermore, we assume that cars A and C can move at most 2 squares forward, but car B can move at most 1 square ahead, since otherwise car B could out-run or out-maneuver car A.
[Figure 2: the car-following scenario on a road discretized into regions 1–10; (a) car B obstructs car A's view, (b) the failure scenario in which car C blocks car A's view and path.]
Given this objective, and additional safety rules such as cars not crashing into one
another, our goal is to automatically synthesize a controller for car A such that:
• car A follows car B whenever possible;
• and in situations where the objective may not be achievable, switches control to the
human driver while allowing sufficient time for the driver to respond and take control.
In general, it is not always possible to come up with a fully automatic controller that
satisfies all requirements. Figure 2b illustrates such a scenario where car C blocks the
view as well as the movement path of car A after two time steps. The brown arrows
indicate the movements of the three cars in the first time step, and the purple arrows
indicate the movements of cars B and C in the second time step. The position of a car X at time t is indicated by Xt. In this failure scenario, the autonomous vehicle needs to
notify the human driver since it has lost track of car B.
Hence, human-in-the-loop synthesis is tasked with producing an autonomous con-
troller along with advisories for the human driver in situations where her attention is required. The challenge, however, is to identify the conditions that we need to monitor, and to notify the driver when they may fail. In the next section, we discuss how human
constraints such as response time can be simultaneously considered in the solution, and
mechanisms for switching control between the auto-controller and the human driver.
1. Monitoring. An advisory auto is issued to the human operator under specific con-
ditions. These conditions in turn need to be determined unambiguously at runtime,
potentially based on history information but not predictions. In a reactive setting,
this means we can use trace information only up to the point when the environment
provides a next input from the current state.
2. Minimally intervening. Our mode of interaction requires only selective human inter-
vention. An intervention occurs when HC transitions from the “non-active” state to
the “active” state (we discuss mechanisms for suggesting a transition from “active”
to “non-active” in Section 5.3, after being prompted by the advisory signal auto being set to false). However, frequent transfer of control would mean constant attention is
required from the human operator, thus nullifying the benefits of having the auto-
controller. In order to reduce the overhead of human participation, we want to mini-
mize a joint objective function C that combines two elements: (i) the probability that
when auto is set to false, the environment will eventually force AC into a failure
scenario, and (ii) the cost of having the human operator take control. We formalize
this objective function in Sec. 5.1.
3. Prescient. It may be too late to seek the human operator’s attention when failure is
imminent. We also need to allow extra time for the human to respond and study the
situation. Thus, we require an advisory to be issued ahead of any failure scenario. In
the discrete setting, we assume we are given a positive integer T representing human
response time (which can be driver-specific), and require that auto is set to false
at least T number of transitions ahead of a state (in AC) that is unsafe.
4. Conditionally-Correct. The auto-controller is responsible for correct operation as
long as auto is set to true. Formally, if auto = true when AC is at a state q,
then F (q) = false. Additionally, when auto is set to false, the auto-controller
should still maintain correct operation in the next T − 1 time steps, during or after
which we assume the human operator takes over control. Formally, if auto changes from true to false when AC is at a state q, let RT(q) be the set of states reachable from q within T − 1 transitions; then F(q′) = false for all q′ ∈ RT(q).
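For instance, checking the conditionally-correct criterion on a given controller graph amounts to a bounded reachability computation. The minimal sketch below assumes the controller is given as an explicit successor map and F as a set of failure states, which is not how the paper's symbolic algorithm represents them; the example graph is hypothetical.

def reachable_within(succ, q, bound):
    """R_T(q): states reachable from q in at most `bound` transitions.
    succ maps a state to the list of its successor states."""
    frontier, seen = {q}, {q}
    for _ in range(bound):
        frontier = {t for s in frontier for t in succ[s]} - seen
        seen |= frontier
    return seen

def conditionally_correct(succ, failure_states, q, T):
    """Check F(q') = false for every q' in R_T(q), i.e. reachable within T-1 steps."""
    return not (reachable_within(succ, q, T - 1) & set(failure_states))

# Tiny hypothetical controller graph: failure state 'f' is 3 steps from 'a'.
succ = {"a": ["b"], "b": ["c"], "c": ["f"], "f": ["f"]}
print(conditionally_correct(succ, {"f"}, "a", T=3))  # True: 'f' is not within 2 steps
print(conditionally_correct(succ, {"f"}, "a", T=4))  # False: 'f' is within 3 steps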
Now we are ready to state the HuIL Controller Synthesis Problem: Given a model
of the system and its specification expressed in a formal language, synthesize a HuIL
controller HuIL that is, by construction, monitoring, minimally intervening, prescient,
and conditionally correct.
In this paper, we study the synthesis of a HuIL controller in the setting of synthesis of
reactive systems from linear temporal logic (LTL) specifications. We give background
on this setting in Section 4, and propose an algorithm for solving the HuIL controller
synthesis problem in Section 5.
LTL formulas are usually interpreted over infinite words (traces) w ∈ Σ ω , where
Σ = 2AP . The language of an LTL formula ψ is the set of infinite words that satisfy
ψ, given by L(ψ) = {w ∈ Σ ω | w |= ψ}. One classic example is the LTL formula
G (p → F q), which means every occurrence of p in a trace must be followed by some
q in the future.
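As a concrete illustration of this semantics (my own example), the following Python snippet decides whether an ultimately-periodic word u·v^ω, a convenient finite representation of an infinite trace, satisfies G(p → F q).

def satisfies_G_p_implies_F_q(prefix, loop, p="p", q="q"):
    """Does the infinite word prefix . loop^omega satisfy G(p -> F q)?
    Each letter is a set of atomic propositions."""
    if any(q in letter for letter in loop):
        return True          # the loop repeats forever, so q recurs after any p
    if any(p in letter for letter in loop):
        return False         # a p inside the loop is never followed by q
    # Otherwise only prefix positions matter: each p needs a q at the same or a later position.
    for i, letter in enumerate(prefix):
        if p in letter and not any(q in later for later in prefix[i:]):
            return False
    return True

# ({p}{}) . ({q}{})^omega satisfies G(p -> F q)
print(satisfies_G_p_implies_F_q([{"p"}, set()], [{"q"}, set()]))   # True
# ({p}) . ({})^omega does not: the p is never answered by a q
print(satisfies_G_p_implies_F_q([{"p"}], [set()]))                 # False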
An LTL formula ψ is satisfiable if there exists an infinite word that satisfies ψ, i.e.,
∃w ∈ (2AP )ω such that w |= ψ. A transducer M satisfies an LTL formula ψ if L(M ) ⊆
L(ψ). We write this as M |= ψ. Realizability is the problem of determining whether
there exists a transducer M with input alphabet X = 2X and output alphabet Y = 2Y
such that M |= ψ.
π = q0 q1 . . . of states such that q0 |= θg and (qi , qi+1 ) ∈ ρenv ∧ ρsys for all i ≥ 0. A
play π is winning for the system iff it is infinite and π |= W in. Otherwise, π is winning
for the environment. The set of states from which there exists a winning strategy for the
environment is called the winning region for env.
A finite-memory strategy for env in G is a tuple Senv = (Γenv, γenv0, ηenv), where Γenv is a finite set representing the memory, γenv0 ∈ Γenv is the initial memory content, and ηenv ⊆ Qg × Γenv × X × Γenv is a relation mapping a state in G and some memory content γenv ∈ Γenv to the possible next inputs the environment can pick and
an updated memory content. A strategy S env is winning for env from a state q if all
plays starting in q and conforming to S env are won by env. Following the terminology
used in [8], if a strategy S env is winning from an initial state q satisfying θg , then it is
called a counterstrategy for env. The existence of a counterstrategy is equivalent to the
specification being unrealizable. We refer the readers to [8] for details on how a coun-
terstrategy can be extracted from intermediate results of the fix-point computation for
the winning region for env. On the other hand, a winning strategy for the system can
be turned into an implementation, e.g., a sequential circuit with |X| inputs, |X| + |Y |
state-holding elements (flip-flops), and |Y | outputs that satisfies the given GR(1) speci-
fication. In this paper, the synthesized implementation is effectively the auto-controller
in the proposed HuIL framework, and can be viewed as a Mealy machine with state
space Q ⊆ 2X∪Y . We refer the readers to [14] for details of this synthesis process.
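In code, such a finite-memory strategy is just a small transition structure. A sketch of one possible data representation follows; it is illustrative only, not RATSY's internal form, and the example counterstrategy is hypothetical.

from dataclasses import dataclass
from typing import Dict, FrozenSet, Set, Tuple

State = str                 # a state of the game structure G
Memory = str                # a memory content in Gamma_env
Inputs = FrozenSet[str]     # a next input: the environment variables set to true

@dataclass
class EnvStrategy:
    memory: Set[Memory]                        # Gamma_env
    initial_memory: Memory                     # gamma_env0
    # eta_env: (state, memory) -> possible (next input, updated memory) pairs
    moves: Dict[Tuple[State, Memory], Set[Tuple[Inputs, Memory]]]

    def next_moves(self, state: State, mem: Memory):
        return self.moves.get((state, mem), set())

# A two-memory counterstrategy that alternates between offering x and not offering x.
cs = EnvStrategy(
    memory={"m0", "m1"},
    initial_memory="m0",
    moves={("q0", "m0"): {(frozenset({"x"}), "m1")},
           ("q0", "m1"): {(frozenset(), "m0")}},
)
print(cs.next_moves("q0", "m0"))   # {(frozenset({'x'}), 'm1')}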
node, and there is an edge from node qic to node qjc if given the current state at qic , there
exists a next input picked from the counterstrategy for which the system can produce a
legal next output so that the game proceeds to a new state at qjc .
assumptions is violated, as flagged by the advisory controller, then the control is safely
switched to the human operator in a way that she can have sufficient time to respond.
The challenge, however, is to decide when an advisory should be sent to the human
operator, in a way that it is also minimally intervening to the human operator. We use
the following example to illustrate our algorithm.
Example 1. Consider X = {x}, Y = {y} and the following GR(1) sub-formulas which
together form ψ = ψ env → ψ sys .
1. ψf^env = G (F ¬x)
2. ψt^sys = G (¬x → ¬y)
3. ψf^sys = G (F y)
This specification is unrealizable: the environment can keep x false forever (satisfying G F ¬x), which by the guarantee G (¬x → ¬y) forces the system to keep y false forever, violating G F y.
[Figure 3: (a) counterstrategy graph Gc for the unrealizable specification ψ; (b) condensed graph Ĝc for Gc after contracting SCCs.]
Now we make the connection of the labeling function F for a controller M to the
counterstrategy graph Gc which describes behaviors that M should not exhibit. Con-
sider an auto-controller M and a state q (represented by the assignment xy) in M .
F (q) = true if and only if there exist some q c ∈ Qc such that θc (q c ) = xy and q c is
either failure-imminent or failure-doomed. In practice, it is not always the case that the
environment will behave in the most adversarial way. For example, a car in front may
yield if it is blocking our path. Hence, even though the specification is not realizable, it
is still important to assess, at any given state, whether it will actually lead to a violation.
For simplicity, we assume that the environment will adhere to the counterstrategy once
it enters a failure-doomed state.
We can convert Gc to its directed acyclic graph (DAG) embedding Ĝc = (Q̂c , Q̂c0 , ρ̂c )
by contracting each SCC in Gc to a single node. Figure 3b shows the condensed graph
Ĝc of Gc shown in Figure 3a. We use a surjective function fˆ : Qc → Q̂c to describe
the mapping of nodes from Gc to Ĝc . We say a node q̂ ∈ Q̂c is failure-prone if a node
q c ∈ Qc is either failure-imminent or failure-doomed and fˆ(q c ) = q̂.
Recall from Section 3.3 that the notion of minimally-intervening requires the mini-
mization of a cost function C, which involves the probability that auto is set to false,
Thus far, we have not associated any probabilities with transitions taken by the environ-
ment or the system. While our approach can be adapted to work with any assignment
of probabilities, for ease of presentation, we make a particular choice in this paper.
Specifically, we assume that at each step, the environment picks a next-input uniformly
at random from the set of possible legal actions (next-inputs) obtained from η env given
the current state. In Example 1 and correspondingly Figure 3a, this means that it is
equally likely for env to choose x̄ or x from any of the states. We use c(q) to denote the
total number of legal actions that the environment can take from a state q.
In addition, we take into account the cost of having the human operator perform the maneuver instead of the auto-controller. In general, this cost increases with longer human engagement. Based on these two notions, we define a weight function ℓ, which assigns a weight to an edge e ∈ Q̂c × Q̂c in Ĝc, recursively as follows. For an edge between q̂i and q̂j,

ℓ(q̂i, q̂j) = 1                                  if q̂j is failure-prone
ℓ(q̂i, q̂j) = pen(q̂i) × len(q̂i) / c(q̂i)         otherwise
where pen : Q̂c → Q+ is a user-defined penalty parameter3, and len : Q̂c → Z+ is the
length (number of edges) of the shortest path from a node q̂i to any failure-prone node
in Ĝc . Intuitively, a state far away from any failure-prone state is less likely to cause a
failure since the environment would need to make multiple consecutive moves all in an
adversarial way. However, if we transfer control at this state, the human operator will
have to spend more time in control, which is not desirable for a HuIL controller. Next,
we describe how to use this edge-weighted DAG representation of a counterstrategy
graph to derive a HuIL controller that satisfies the criteria established earlier.
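A small sketch of this weight assignment on the condensed graph follows. It assumes Ĝc is already given as an explicit edge list, with pen and c supplied as plain dictionaries; the SCC contraction itself is standard and omitted, and the example graph is hypothetical.

from collections import deque

def distances_to_failure(edges, failure_prone):
    """len(q): length of the shortest path from q to any failure-prone node,
    computed by BFS over the reversed edges of the condensed DAG."""
    rev = {}
    for u, v in edges:
        rev.setdefault(v, []).append(u)
    dist = {q: 0 for q in failure_prone}
    queue = deque(failure_prone)
    while queue:
        v = queue.popleft()
        for u in rev.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def edge_weights(edges, failure_prone, pen, c):
    """The weight function: 1 on edges into failure-prone nodes, and
    pen(q_i) * len(q_i) / c(q_i) on the remaining edges."""
    length = distances_to_failure(edges, failure_prone)
    w = {}
    for u, v in edges:
        w[(u, v)] = 1.0 if v in failure_prone else pen[u] * length[u] / c[u]
    return w

# Hypothetical condensed counterstrategy graph with one failure-prone node 'f'.
edges = [("a", "b"), ("b", "f"), ("a", "c"), ("c", "b")]
print(edge_weights(edges, {"f"}, pen={"a": 1, "b": 1, "c": 1},
                   c={"a": 2, "b": 1, "c": 1}))
# {('a', 'b'): 1.0, ('b', 'f'): 1.0, ('a', 'c'): 1.0, ('c', 'b'): 2.0}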
T, minimally intervening4 with respect to the cost function fC = Σ_{e∈Eφ} ℓ(e), and conditionally correct5.
Proof. (Sketch) When ψ is realizable, a fully autonomous controller is synthesized and
unconditionally satisfies ψ. Now consider that case when ψ is not realizable.
The HuIL controller is monitoring, as φ comprises only a set of environment transitions up to the next environment input.
It is prescient by construction. The auto flag advising the human operator to take
over control is set to false precisely when φ is violated. When φ is violated, it cor-
responds to the environment making a next-move from the current state q according to
some edge e = (q̂i , q̂j ) ∈ E φ . Consider any q c ∈ Qc such that fˆ(q c ) = q̂i , θc (q c ) = q.
Since q̂i ∈ Q̂cT by the construction of ĜcT , q̂i is at least T transitions away from any
failure-prone state in Ĝc . This means q c must also be at least T transitions away from
any failure-imminent state or failure-doomed state in Qc . Hence, by the definition of F
with respect to a failure-imminent or failure-doomed state in Section 5.1, q is (and auto
is set) at least T transitions ahead of any state that is unsafe.
The HuIL controller is also conditionally correct. By the same reasoning as above,
for any state q′ ∈ RT(q), F(q′) = false, i.e., q′ is safe.
Finally, since auto is set to false precisely when φ is violated, and φ in turn is constructed based on the set of edges Eφ, which minimizes the cost function fC = Σ_{e∈Eφ} ℓ(e), the HuIL controller is minimally-intervening with respect to the cost
function fC .
4 We assume the counterstrategy we use to mine the assumptions is an optimal one: it forces a violation of the system guarantees as quickly as possible.
5 We assume that all failure-prone nodes are at least T steps away from any initial node.
6 Experimental Results
Our algorithm is implemented as an extension to the GR(1) synthesis tool RATSY [4].
Due to space constraints, we discuss only the car-following example (as shown in Section 2) here and refer the readers to http://verifun.eecs.berkeley.edu/tacas14/ for other examples.
Recall the car-following example shown in Section 2. We describe some of the more
interesting specifications below and their corresponding LTL formulas. pA , pB , pC are
used to denote the positions of car A, B and C respectively.
• Any position can be occupied by at most one car at a time (no crashing):
G ((pA = x) → ((pB ≠ x) ∧ (pC ≠ x)))
where x denotes a position on the discretized space. The cases for B and C are
similar, but they are part of ψenv .
• Car A is required to follow car B:
G ((vAB = true ∧ pA = x) → X (vAB = true))
where vAB = true iff car A can see car B.
• Two cars cannot cross each other if they are right next to each other. For example, when pC = 5, pA = 6 and pC = 8 (in the next cycle), then pA ≠ 7 in the next cycle. In LTL,
G (((pC = 5) ∧ (pA = 6) ∧ (X pC = 8)) → (X (pA ≠ 7)))
The other specifications can be found in the link described at the beginning of this
section. Observe that car C can in fact force a violation of the system guarantees in one
step under two situations – when pC = 5, pB = 8 and pA = 4, or pC = 5, pB = 8 and
pA = 6. Both are situations where car C is blocking the view of car A, causing it to
lose track of car B. The second failure scenario is illustrated in Figure 2b.
Applying our algorithm to this (unrealizable) specification with T = 1, we obtain
the following assumption φ.
φ = G (((pA = 4) ∧ (pB = 6) ∧ (pC = 1)) → ¬X ((pB = 8) ∧ (pC = 5)))
    ∧ G (((pA = 4) ∧ (pB = 6) ∧ (pC = 1)) → ¬X ((pB = 6) ∧ (pC = 3)))
    ∧ G (((pA = 4) ∧ (pB = 6) ∧ (pC = 1)) → ¬X ((pB = 6) ∧ (pC = 5)))
In fact, φ corresponds to three possible evolutions of the environment from the initial
state. In general, φ can be a conjunction of conditions at different time steps as env and
sys progress. The advantage of our approach is that it can produce φ such that we can
synthesize an auto-controller that is guaranteed to satisfy the specification if φ is not
violated, together with an advisory controller that prompts the driver (at least) T (T =
1 in this case) time steps ahead of a potential failure when φ is violated.
7 Related Work
Similar to [9], we synthesize a discrete controller from temporal logic specifications.
Wongpiromsarn et al. [21] consider a receding horizon framework to reduce the synthe-
sis problem to a set of simpler problems for a short horizon. Livingston et al. [11,12]
exploit the notion of locality that allows “patching” a nominal solution. They update the
local parts of the strategy as new data accumulates allowing incremental synthesis. The
key innovation in this paper is that we consider synthesizing interventions to combine
an autonomous controller with a human operator.
Our work is inspired by the recent works on assumption mining. Chatterjee et al. [5]
construct a minimal environment assumption by removing edges from the game graph
to ensure safety assumptions, then compute liveness assumptions to put additional fair-
ness constraints on the remaining edges. Li et al. [10] and later Alur et al. [2] use a
counterstrategy-guided approach to mine environment assumptions for GR(1) specifi-
cations. We adapt this approach to the synthesis of human-in-the-loop control systems.
In recent years, there has been an increasing interest in human-in-the-loop systems
in the control systems community. Anderson et al. [3] study obstacle avoidance and
lane keeping for semiautonomous cars. They use model predictive control for their autonomous controller. Our approach, unlike theirs, seeks to provide correctness guar-
antees in the form of temporal logic properties. Vasudevan et al. [19] focus on learning
and predicting a human model based on prior observations. Based on the measured level
of threat, the controller intervenes and overwrites the driver’s input. However, we believe
that allowing an auto-controller to override the human inputs is unsafe especially since
it is hard to fully model the environment. We propose a different paradigm where we
allow the human to take control if the autonomous system predicts failure. Finally, human reaction time while driving is an important consideration in this paper. The value
of reaction time can range from 1 to 2.5 seconds for different tasks and drivers [18].
8 Conclusions
In this paper, we propose a synthesis approach for designing human-in-the-loop con-
trollers. We consider a mode of interaction where the controller is mostly autonomous
but requires occasional intervention by a human operator, and study important criteria
for devising such controllers. Further, we study the problem in the context of controller
synthesis from (unrealizable) temporal-logic specifications. We propose an algorithm
based on mining monitorable conditions from the counterstrategy of the unrealizable
specifications. Preliminary results on applying this approach to driver assistance in au-
tomobiles are encouraging. One limitation of the current approach is the use of an ex-
plicit counterstrategy graph (due to weight assignment). We plan to explore symbolic
algorithms in the future.
Acknowledgment. This work was supported in part by TerraSwarm, one of six centers
of STARnet, a Semiconductor Research Corporation program sponsored by MARCO
and DARPA. This work was also supported by the NSF grants CCF-1116993 and
CCF-1139138.
References
1. Federal Aviation Administration. The interfaces between flight crews and modern flight sys-
tems (1995)
2. Alur, R., et al.: Counter-strategy guided refinement of gr(1) temporal logic specifications. In:
The Conference on Formal Methods in Computer-Aided Design, pp. 26–33 (2013)
3. Anderson, S.J., et al.: An optimal-control-based framework for trajectory planning, threat as-
sessment, and semi-autonomous control of passenger vehicles in hazard avoidance scenarios.
International Journal of Vehicle Autonomous Systems 8(2), 190–216 (2010)
4. Bloem, R., Cimatti, A., Greimel, K., Hofferek, G., Könighofer, R., Roveri, M., Schuppan, V.,
Seeber, R.: RATSY – A new requirements analysis tool with synthesis. In: Touili, T., Cook,
B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 425–429. Springer, Heidelberg (2010)
5. Chatterjee, K., Henzinger, T.A., Jobstmann, B.: Environment assumptions for synthesis.
In: van Breugel, F., Chechik, M. (eds.) CONCUR 2008. LNCS, vol. 5201, pp. 147–161.
Springer, Heidelberg (2008)
6. Costa, M.-C., et al.: Minimal multicut and maximal integer multiflow: A survey. European
Journal of Operational Research 162(1), 55–69 (2005)
7. Kohn, L.T., et al.: To err is human: Building a safer health system. Technical report, A report
of the Committee on Quality of Health Care in America, Institute of Medicine (2000)
8. Könighofer, R., et al.: Debugging formal specifications using simple counterstrategies. In:
Conference on Formal Methods in Computer-Aided Design, pp. 152–159 (2009)
9. Kress-Gazit, H., et al.: Temporal-logic-based reactive mission and motion planning. IEEE
Transactions on Robotics 25(6), 1370–1381 (2009)
10. Li, W., et al.: Mining assumptions for synthesis. In: Conference on Formal Methods and
Models for Codesign, pp. 43–50 (2011)
11. Livingston, S.C., et al.: Backtracking temporal logic synthesis for uncertain environments.
In: Conference on Robotics and Automation, pp. 5163–5170 (2012)
12. Livingston, S.C., et al.: Patching task-level robot controllers based on a local μ-calculus
formula. In: Conference on Robotics and Automation, pp. 4588–4595 (2013)
13. National Highway Traffic Safety Administration. Preliminary statement of policy concerning
automated vehicles (May 2013)
14. Piterman, N., Pnueli, A., Sa’ar, Y.: Synthesis of reactive(1) designs. In: Emerson, E.A.,
Namjoshi, K.S. (eds.) VMCAI 2006. LNCS, vol. 3855, pp. 364–380. Springer, Heidelberg
(2006)
15. Pnueli, A.: The temporal logic of programs. In: Annual Symposium on Foundations of Com-
puter Science, pp. 46–57 (1977)
16. Rosner, R.: Modular synthesis of reactive systems. Ph.D. dissertation, Weizmann Institute of
Science (1992)
17. Sadigh, D., et al.: Data-driven probabilistic modeling and verification of human driver be-
havior. In: Formal Verification and Modeling in Human-Machine Systems (2014)
18. Triggs, T.J., et al.: Reaction time of drivers to road stimuli (1982)
19. Vasudevan, R., et al.: Safe semi-autonomous control with enhanced driver modeling. In:
American Control Conference, pp. 2896–2903 (2012)
20. Wongpiromsarn, T., et al.: Receding horizon temporal logic planning for dynamical systems.
In: Conference on Decision and Control, pp. 5997–6004 (2009)
21. Wongpiromsarn, T., et al.: Receding horizon temporal logic planning. IEEE Transactions on
Automatic Control 57(11), 2817–2830 (2012)
Learning Regular Languages over Large Alphabets
CNRS-VERIMAG
University of Grenoble
France
Abstract. This work is concerned with regular languages defined over large al-
phabets, either infinite or just too large to be expressed enumeratively. We define
a generic model where transitions are labeled by elements of a finite partition of
the alphabet. We then extend Angluin’s L∗ algorithm for learning regular lan-
guages from examples for such automata. We have implemented this algorithm
and we demonstrate its behavior where the alphabet is the set of natural numbers.
1 Introduction
The main contribution of this paper is a generic algorithm for learning regular languages
defined over a large alphabet Σ. Such an alphabet can be infinite, like N or R or just
so large, like Bn for very large n, that it is impossible or impractical to treat it in an
enumerative way, that is, to write down δ(q, a) for every a ∈ Σ. The obvious solution
is to use a symbolic representation where transitions are labeled by predicates which are
applicable to the alphabet in question. Learning algorithms infer an automaton from a
finite set of words (the sample) for which membership is known. Over small alphabets,
the sample should include the set S of all the shortest words that lead to each state and,
in addition, the set S · Σ of all their Σ-continuations. Over large alphabets this is not
a practical option and as an alternative we develop a symbolic learning algorithm over
symbolic words which are only partially backed up by the sample. In a sense, our algo-
rithm is a combination of automaton learning and learning of non-temporal functions.
Before getting technical, let us discuss briefly some motivation.
Finite automata are among the cornerstones of Computer Science. From a practical
point of view they are used daily in various domains ranging from syntactic analy-
sis, design of user interfaces or administrative procedures to implementation of digital
hardware and verification of software and hardware protocols. Regular languages ad-
mit a very nice, clean and comprehensive theory where different formalisms such as
automata, logic, regular expressions, semigroups and grammars are shown to be equiv-
alent. As for learning from examples, a problem introduced by Moore [Moo56], the
Nerode right-congruence relation [Ner58] which declares two input histories as equiv-
alent if they lead to the same future continuations, provides a crisp characterization of
what a state in a dynamical system is in terms of observable input-output behavior.
All algorithms for learning automata from examples, starting with the seminal work of
Gold [Gol72] and culminating in the well-known L∗ algorithm of Angluin [Ang87] are
based on this concept [DlH10].
One weakness, however, of the classical theory of regular languages is that it is rather
“thin” and “flat”. In other words, the alphabet is often considered as a small set devoid of
any additional structure. On such alphabets, classical automata are good for expressing
and exploring the temporal (sequential, monoidal) dimension embodied by the concate-
nation operations, but less good in expressing “horizontal” relationships. To make this
statement more concrete, consider the verification of a system consisting of n automata
running in parallel, making independent as well as synchronized transitions. To express
the set of joint behaviors of this product of automata as a formal language, classical
theory will force you to use the exponential alphabet of global states and indeed, a large
part of verification is concerned with fighting this explosion using constructs such as
BDDs and other logical forms that exploit the sparse interaction among components.
This is done, however, without a real interaction with classical formal language theory
(one exception is the theory of traces [DR95] which attempts to treat this issue but in a
very restricted context).1
These and other considerations led us to use symbolic automata as a generic frame-
work for recognizing languages over large alphabets where transitions outgoing from a
state are labeled, semantically speaking, by subsets of the alphabet. These subsets are
expressed syntactically according to the specific alphabet used: Boolean formulae when
Σ = Bn or by some classes of inequalities when Σ = N. Determinism and complete-
ness of the transition relation, which are crucial for learning and minimization, can be
enforced by requiring that the subsets of Σ that label the transitions outgoing from a
given state form a partition of the alphabet.
Readers working on program verification or hybrid automata are, of course, aware
of automata with symbolic transition guards but it should be noted that in our model no
auxiliary variables are added to the automaton. Let us stress this point by looking at a
popular extension of automata to infinite alphabets, initiated by Kaminski and Francez
[KF94] using register automata to accept data languages (see [BLP10] for theoretical
properties and [HSJC12] for learning algorithms). In that framework, the automaton
is augmented with additional registers that can store some input letters. The registers
can then be compared with newly-read letters and influence transitions. With register
automata one can express, for example, the requirement that your password at login is
the same as the password at sign-up. This very restricted use of memory makes register
automata much simpler than more notorious automata with variables whose emptiness
problem is typically undecidable. The downside is that beyond equality they do not
really exploit the potential richness of the alphabets/theories.
Our approach is different: we do allow the values of the input symbols to influence
transitions via predicates, possibly of a restricted complexity. These predicates involve
domain constants and they partition the alphabet into finitely many classes. For exam-
ple, over the integers a state may have transitions labeled by conditions of the form
c1 ≤ x ≤ c2 which give real (but of limited resolution) access to the input domain. On
the other hand, we insist on a finite (and small) memory so that the exact value of x
cannot be registered and has no future influence beyond the transition it has triggered.
The symbolic transducers, recently introduced by [VHL+ 12], are based on the same
1 This might also be the reason that Temporal Logic is more popular in verification than regular expressions, because the nature of until is less global and less synchronous than concatenation.
We briefly survey Angluin’s L∗ algorithm [Ang87] for learning regular sets from mem-
bership queries and counter-examples, with slightly modified definitions to accommo-
date for its symbolic extension. Let Σ be a finite alphabet and let Σ ∗ be the set of
sequences (words) over Σ. Any order relation < over Σ can be naturally lifted to a
lexicographic order over Σ ∗ . With a language L ⊆ Σ ∗ we associate a characteristic
function f : Σ ∗ → {0, 1}.
A deterministic finite automaton over Σ is a tuple A = (Σ, Q, δ, q0 , F ), where Q is a
non-empty finite set of states, q0 ∈ Q is the initial state, δ : Q×Σ → Q is the transition
function, and F ⊆ Q is the set of final or accepting states. The transition function δ can
be extended to δ : Q × Σ∗ → Q, where δ(q, ε) = q and δ(q, u · a) = δ(δ(q, u), a)
for q ∈ Q, a ∈ Σ and u ∈ Σ ∗ . A word w ∈ Σ ∗ is accepted by A if δ(q0 , w) ∈ F ,
otherwise w is rejected. The language recognized by A is the set of all accepted words
and is denoted by L(A).
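As an illustration (a sketch with a hypothetical encoding, not taken from the paper), the following Python fragment implements these definitions directly: the transition function is extended to words and acceptance is tested by running the word from the initial state. The example automaton is the three-state minimal DFA for the language aΣ∗ that also appears as a running example below.

# Minimal DFA sketch: delta is extended to words as in the definition,
# delta(q, eps) = q and delta(q, u.a) = delta(delta(q, u), a).
class DFA:
    def __init__(self, delta, q0, final):
        self.delta, self.q0, self.final = delta, q0, final

    def run(self, word):
        q = self.q0
        for a in word:
            q = self.delta[(q, a)]
        return q

    def accepts(self, word):
        return self.run(word) in self.final

# Hypothetical 3-state automaton recognizing a.Sigma* over Sigma = {a, b}:
# state 1 is an accepting sink, state 2 a rejecting sink.
A = DFA(
    delta={(0, 'a'): 1, (0, 'b'): 2,
           (1, 'a'): 1, (1, 'b'): 1,
           (2, 'a'): 2, (2, 'b'): 2},
    q0=0,
    final={1},
)
assert A.accepts('ab') and not A.accepts('ba')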
Learning algorithms, represented by the learner, are designed to infer an unknown
regular language L (the target language). The learner aims to construct a finite automa-
ton that recognizes the target language by gathering information from the teacher. The
teacher knows the target language and can provide information about it. It can answer
two types of queries: membership queries, i.e., whether a word belongs to the target
language, and equivalence queries, i.e., whether a conjectured automaton suggested by
the learner is the right one. If this automaton fails to accept L the teacher responds to
the equivalence query by a counter-example, a word misclassified by the conjectured
automaton.
In the L∗ algorithm, the learner starts by asking membership queries. All information
provided is suitably gathered in a table structure, the observation table. Then, when the
information is sufficient, the learner constructs a hypothesis automaton and poses an
equivalence query to the teacher. If the answer is positive then the algorithm terminates
and returns the conjectured automaton. Otherwise the learner accommodates the in-
formation provided by the counter-example into the table, asks additional membership
queries until it can suggest a new hypothesis and so on, until termination.
A prefix-closed set S ∪ R ⊂ Σ∗ is a balanced Σ-tree if ∀a ∈ Σ: 1) for every s ∈ S, s · a ∈ S ∪ R, and 2) for every r ∈ R, r · a ∉ S ∪ R. Elements of R are called boundary elements or leaves.
The set (S ∪ R) · E is the sample associated with the table, that is, the set of words
whose membership is known. The elements of S admit a tree structure isomorphic to a
spanning tree of the transition graph rooted in the initial state. Each s ∈ S corresponds
to a state q of the automaton for which s is an access sequence, one of the shortest words
that lead from the initial state to q. The elements of R should tell us about the back- and
cross-edges in the automaton and the elements of E are “experiments” that should be
sufficient to distinguish between states. This works by associating with every s ∈ S ∪ R
a specialized classification function fs : E → {0, 1}, defined as fs (e) = f (s·e), which
characterizes the row of the observation table labeled by s. To build an automaton from
a table, it should satisfy certain conditions: it should be closed, consistent and reduced.
Note that a reduced table is trivially consistent and that for a closed and reduced table
we can define a function g : R → S mapping every r ∈ R to the unique s ∈ S such
that fs = fr. From such an observation table T = (Σ, S, R, E, f) one can construct an automaton AT = (Σ, Q, q0, δ, F) where Q = S, q0 = ε, F = {s ∈ S : fs(ε) = 1} and

δ(s, a) = s · a      when s · a ∈ S
δ(s, a) = g(s · a)   when s · a ∈ R
The learner attempts to keep the table closed at all times. The table is not closed
when there is some r ∈ R such that fr is different from fs for all s ∈ S. To close
the table, the learner moves r from R to S and adds the Σ-successors of r to R. The
extended table is then filled up by asking membership queries until it becomes closed.
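A minimal Python sketch of this closing step is given below (illustrative only; the table layout and the toy teacher for L = aΣ∗ are assumptions, not the authors' implementation): rows are tuples of membership answers, and an R-element whose row matches no S-row is promoted to S together with its one-letter continuations.

def membership_query(word):          # hypothetical teacher for L = a.Sigma*
    return word.startswith('a')

def row(word, E):
    # The row of the observation table labeled by word: f(word . e) for e in E.
    return tuple(membership_query(word + e) for e in E)

def close_table(S, R, E, alphabet):
    while True:
        s_rows = {row(s, E) for s in S}
        orphan = next((r for r in R if row(r, E) not in s_rows), None)
        if orphan is None:            # every R-row already appears among the S-rows
            return S, R
        R.remove(orphan)              # promote the orphan to S ...
        S.append(orphan)
        R.extend(orphan + a for a in alphabet)   # ... and add its continuations to R

S, R = close_table(S=[''], R=['a', 'b'], E=[''], alphabet='ab')
print(S, R)   # ['', 'a'] ['b', 'aa', 'ab'], i.e. the closed table T1 of the example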
Variants of the L∗ algorithm differ in the way they treat counter-examples, as de-
scribed in more detail in [BR04]. The original algorithm [Ang87] adds all the prefixes
of the counter-example to S, thus possibly creating an inconsistency that has to be
fixed. The version proposed in [MP95] for learning ω-regular languages adds all the
suffixes of the counter-example to E. The advantage of this approach is that the table
always remains consistent and reduced with S corresponding exactly to the set of states.
A disadvantage is the possible introduction of redundant columns that do not contribute
to further discrimination between states. The symbolic algorithm that we develop in this
paper is based on an intermediate variant, referred to in [BR04] as the reduced obser-
vation algorithm, where some prefixes of the counter-example are added to S and some
suffixes are added to E.
Example: We illustrate the behavior of the L∗ algorithm while learning L = aΣ ∗
over Σ = {a, b}. We use +w to indicate a counter-example w ∈ L rejected by the
conjectured automaton, and −w for the opposite case. Initially, the observation table
is T0 = (Σ, S, R, E, f) with S = E = {ε} and R = Σ, and we ask membership queries for all words in (S ∪ R) · E = {ε, a, b} to obtain table T0, shown in Fig. 1. The
table is not closed so we move a to S, add its continuations, aa and ab to R and ask
membership queries to obtain the closed table T1 , from which the hypothesis automaton
A1 of Fig. 2 is derived. In response to the equivalence query for A1 , a counter-example
−ba is presented, its prefixes b and ba are added to S and their successors are added
to R, resulting in table T2 of Fig. 1. This table is not consistent: two elements ε and b in S are equivalent but their a-successors a and ba are not. Adding a to E and asking
membership queries yields a consistent table T3 whose automaton A3 is the minimal
automaton recognizing L.
[Fig. 1. The observation tables T0, T1, T2 and T3 of the example (table contents not reproduced)]
3 Symbolic Automata
Symbolic automata are automata over large alphabets where from each state there is a
small number of outgoing transitions labelled by subsets of Σ that form a partition of
the alphabet. Let Σ be a large and possibly infinite alphabet, that we call the concrete
alphabet. Let ψ be a total surjective function from Σ to a finite (symbolic) alphabet Σ.
For each symbolic letter a ∈ Σ we assign a Σ-semantics [a]ψ = {b ∈ Σ : ψ(b) = a}. Since ψ is total and surjective, the set {[a]ψ : a ∈ Σ} forms a partition of Σ. We will
[Fig. 2. The hypothesis automaton A1 and the minimal automaton A3 (diagrams not reproduced)]
often omit ψ from the notation and use [a] where ψ, which is always present, is clear
from the context. The Σ-semantics can be extended to symbolic words of the form
w = a1 · a2 · · · ak ∈ Σ ∗ as the concatenation of the concrete one-letter languages
associated with the respective symbolic letters or, recursively speaking, [ε] = {ε} and [w · a] = [w] · [a] for w ∈ Σ∗, a ∈ Σ.
Definition 3 (Symbolic Automaton). A deterministic symbolic automaton is a tuple
A = (Σ, Σ, ψ, Q, δ, q0 , F ), where
– Σ is the input alphabet,
– Σ is a finite symbolic alphabet, decomposable into Σ = ∪q∈Q Σq,
– ψ = {ψq : q ∈ Q} is a family of total surjective functions ψq : Σ → Σq ,
– Q is a finite set of states,
– δ : Q × Σ → Q is a partial transition function decomposable into a family of total
functions δq : {q} × Σq → Q,
– q0 is the initial state and F is the set of accepting states.
Automaton A can be viewed as representing a concrete deterministic automaton A′ over Σ whose transition function is defined as δ′(q, a) = δ(q, ψq(a)), and the concrete language accepted by A is L(A) = L(A′).
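As an illustration of Definition 3 (a sketch added here, with made-up states and cut points), the Python fragment below represents a symbolic automaton over Σ = N in which each state owns a partition of N into intervals; ψq maps a concrete letter to the index of its interval, and a concrete run applies δ(q, ψq(a)).

import bisect

class SymbolicAutomaton:
    def __init__(self, cuts, delta, q0, final):
        # cuts[q]: sorted cut points; interval 0 is [0, c1), interval i is
        # [c_i, c_{i+1}), and the last interval is unbounded.
        self.cuts, self.delta, self.q0, self.final = cuts, delta, q0, final

    def psi(self, q, a):
        return bisect.bisect_right(self.cuts[q], a)   # index of the interval of a

    def accepts(self, word):
        q = self.q0
        for a in word:
            q = self.delta[(q, self.psi(q, a))]       # delta(q, psi_q(a))
        return q in self.final

# Hypothetical automaton: at q0, letters below 20 loop on q0 and letters >= 20
# move to the accepting sink q1.
A = SymbolicAutomaton(
    cuts={0: [20], 1: []},
    delta={(0, 0): 0, (0, 1): 1, (1, 0): 1},
    q0=0,
    final={1},
)
assert A.accepts([3, 25]) and not A.accepts([3, 7])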
Remark: The association of a symbolic language with a symbolic automaton is more
subtle because we allow different partitions of Σ and hence different input alphabets
at different states, rendering the transition function partial with respect to Σ. When in a state q and reading a symbolic letter a (possibly taken from the alphabet Σq′ of another state q′), the transition to be taken is well defined only when [a] ⊆ [a′] for some a′ ∈ Σq. The model can, nevertheless, be made deterministic
and complete over a refinement of the symbolic alphabet. Let
Σ = ∏q∈Q Σq, with the Σ-semantics [(a1, . . . , an)] = [a1] ∩ . . . ∩ [an].
– Q = Q1 × Q2 , q0 = (q01 , q02 ), F = F1 × F2
– For every (q1 , q2 ) ∈ Q
• Σ(q1 ,q2 ) = {(a1 , a2 ) ∈ Σ1 × Σ2 | [a1 ] ∩ [a2 ] = ∅}
• ψ(q1 ,q2 ) (a) = (ψ1,q1 (a), ψ2,q2 (a)) ∀a ∈ Σ
• δ((q1 , q2 ), (a1 , a2 )) = (δ1 (q1 , a1 ), δ2 (q2 , a2 )) ∀(a1 , a2 ) ∈ Σ(q1 ,q2 )
We will use observation tables whose rows are symbolic words and hence an entry
in the table will constitute a statement about the inclusion or exclusion of a large set
of concrete words in the language. We will not ask membership queries concerning all
those words, but only for a small representative sample that we call evidence.
– Σ is an alphabet,
– (Σ, S, R, ψ) is a finite balanced symbolic Σ-tree (with R being its boundary),
– E is a subset of Σ ∗ ,
– f : (S ∪ R) · E → {0, 1} is the symbolic classification function
– μ : (S ∪ R) · E → 2^{Σ∗} − {∅} is an evidence function satisfying μ(w) ⊆ [w]. The image of the evidence function is prefix-closed: w · a ∈ μ(w · a) ⇒ w ∈ μ(w).
We use, as for the concrete case, fs : E → {0, 1} to denote the partial evaluation
of f to some symbolic word s ∈ S ∪ R, such that, fs (e) = f (s · e). Note that the
set E consists of concrete words but this poses no problem because elements of E are
used only to distinguish between states and do not participate in the derivation of the
symbolic automaton from the table. The notions of closed, consistent and reduced table
are similar to the concrete case.
The set MT = (S ∪ R) · E is called the symbolic sample associated with T. We require that for each symbolic word w ∈ MT there is at least one concrete word u ∈ μ(w) whose membership in L, denoted by f(u), is known. The set of such words, the concrete sample, is defined as {u · e : u ∈ μ(s), s ∈ S ∪ R, e ∈ E}. A table
where all evidences of the same symbolic word admit the same classification is called
evidence-compatible.
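A small illustrative check of evidence compatibility (a sketch added here, with hypothetical names): all concrete evidences of the same symbolic word must receive the same answer from the teacher.

def evidence_compatible(symbolic_sample, mu, membership_query):
    # symbolic_sample: iterable of symbolic words; mu[w]: set of concrete evidences of w.
    for w in symbolic_sample:
        answers = {membership_query(u) for u in mu[w]}
        if len(answers) > 1:          # two evidences of w are classified differently
            return False
    return True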
Theorem 1 (Automaton from Table). From a closed, reduced and evidence compat-
ible table T = (Σ, Σ, S, R, ψ, E, f , μ) one can construct a deterministic symbolic
automaton compatible with the concrete sample.
4 The Algorithm
In this section we present a symbolic learning algorithm starting with an intuitive verbal
description. From now on we assume that the alphabet is ordered and use a0 to denote
its minimal element. We assume that the teacher always provides the smallest counter-example
with respect to length and lexicographic order on Σ ∗ . Also, when we choose an evi-
dence for a new symbolic word w in a membership query we always take the smallest
possible element of [w].
The algorithmic scheme is similar to the concrete L∗ algorithm but differs in the
treatment of counter-examples and the new concept of evidence compatibility. When the
table is not closed, S ∪ R is extended until closure. Then a conjectured automaton AT
is constructed and an equivalence query is posed. If the answer is positive we are done.
Otherwise the teacher provides a counter-example leading possibly to the extension of
E and/or S ∪ R. Whenever such an extension occurs, additional membership queries
are posed to fill the table. The table is always kept evidence compatible and reduced
except temporarily during the processing of counter-examples.
The learner starts with the symbolic table T = (Σ, Σ, S, R, ψ, E, f , μ), where
Σ = {a0}, S = {ε}, R = {a0}, E = {ε}, and μ(a0) = {a0}. Whenever T is
not closed, there is some r ∈ R such that fr ≠ fs for every s ∈ S. To make the table closed we move r from R to S and add to R the word r′ = r · a, where a is a new symbolic letter with [a] = Σ, and extend the evidence function by letting μ(r′) = μ(r) · a0.
When a counter-example w is presented, it is of course not part of the concrete sample. It admits a factorization w = u · a · v, where u is the largest prefix of w such that u ∈ μ(u) for some symbolic word u ∈ S ∪ R. There are two cases, the second of which is particular to our symbolic algorithm.
1. u ∈ R: Assume that g(u) = s ∈ S and, since the table is reduced, fu ≠ fs′ for any other s′ ∈ S. Because w is the shortest counter-example, the classification of s · a · v in the automaton is correct (otherwise s′ · a · v, for some s′ ∈ [s], would constitute a shorter counter-example) and different from that of u · a · v. Thus we
conclude that u deserves to be a state and should be added to S. To distinguish
between u and s we add a · v to E, possibly with some of its suffixes (see [BR04]
for a more detailed discussion of counter-example treatment). As u is a new state
we need to add its continuations to R. We distinguish two cases depending on a:
(a) If a = a0 is the smallest element of Σ then a new symbolic letter a is added
to Σ, with [a] = Σ and μ(u · a) = μ(u) · a0 , and the symbolic word u · a is
added to R.
(b) If a ≠ a0 then two new symbolic letters, a and a′, are added to Σ with [a] = {b : b < a}, [a′] = {b : b ≥ a}, and μ(u · a) = μ(u) · a0, μ(u · a′) = μ(u) · a. The words u · a and u · a′ are added to R.
2. u ∈ S: In this case the counter-example indicates that u · a was wrongly assumed
to be part of [u · a] for some a ∈ Σu , and a was wrongly assumed to be part of
[a]. There are two cases:
(a) There is some a′ ≠ a such that the classification of u · a′ · v by the symbolic automaton agrees with the classification of u · a · v. In this case we just move a and all letters greater than a from [a] to [a′] and no new state is added.
(b) If there is no such symbolic letter, we create a new letter a′ with [a′] = {b ∈ [a] : b ≥ a} and update [a] to [a] − [a′]. We let μ(u · a′) = μ(u) · a and add u · a′ to R (this interval-splitting step is sketched below).
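A sketch of the interval-splitting step of case 2(b) for the numerical alphabet (the names and values are illustrative; partitions are kept as sorted cut points and each interval is evidenced by its smallest known concrete letter):

import bisect

def split_interval(cuts, evidence, a):
    """Introduce a new boundary at the counter-example letter a and record a as
    the evidence of the freshly created interval [a, ...)."""
    i = bisect.bisect_right(cuts, a)
    cuts.insert(i, a)                 # the old interval now ends just below a
    evidence.insert(i + 1, a)         # the new interval is witnessed by a itself
    return cuts, evidence

cuts, evidence = [20], [0, 20]        # intervals [0,20) and [20,inf), evidenced by 0 and 20
split_interval(cuts, evidence, 50)    # a counter-example with letter 50 splits [20,inf)
print(cuts, evidence)                 # [20, 50] and [0, 20, 50]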
[Figure: observation tables T0–T5 and hypothesis automata A0 and A2 of the running example, with [a0] = Σ for A0 and [a0] = {a | a < b}, [a1] = {a | a ≥ b}, [a2] = Σ for A2; tables and diagrams not reproduced]
add experiment b to E and fill the gaps using membership queries, resulting in table T5
which is closed, reduced and evidence compatible. The derived automaton A5 is the
right one and the algorithm terminates.
It is easy to see that for large alphabets our algorithm is much more efficient than
L∗ . For example, when Σ = {1..100}, b = 20 and c = 50, the L∗ algorithm will
need around 400 queries while ours will ask less than 10. The symbolic algorithm is
influenced not by the size of the alphabet but by the resolution (partition size) with
which we observe it. Fig. 5 shows a larger automaton over the same alphabet learned
by our procedure.
Fig. 5. An automaton learned by our procedure using 418 membership queries and 27 equivalence queries
5 Discussion
We have defined a generic algorithmic scheme for automaton learning, targeting lan-
guages over large alphabets that can be recognized by finite symbolic automata having
a modest number of states and transitions. Some ideas similar to ours have been pro-
posed for the particular case of parametric languages [BJR06] and recently in a more
general setting [HSM11, IHS13] including partial evidential support and alphabet re-
finement during the learning process.2
The genericity of the algorithm comes from the semantic approach (alphabet parti-
tions) but of course, each and every domain will have its own semantic and syntactic
specialization in terms of the size and shape of the alphabet partitions. In this work we
have implemented an instantiation of this scheme for the alphabet Σ = (N, ≤) and
the adaptation to real numbers is immediate. When dealing with numbers, the partition
into a finite number of intervals (and convex sets in higher dimensions) is very natural
and used in many application domains ranging from quantization of sensor readings
to income tax regulations. It will be interesting to compare the expressive power and
succinctness of symbolic automata with other approaches for representing numerical
time series and to compare our algorithm with other inductive inference techniques for
sequences of numbers.
As a first excursion into the domain, we have made quite strong assumptions on
the nature of the equivalence oracle, which, already for small alphabets, is a bit too
strong and pedagogical to be realistic. We assumed that it provides the shortest counter-
example and also that it chooses always the minimal available concrete symbol. We
can relax the latter (or both) and replace the oracle by random sampling, as already
proposed in [Ang87] for concrete learning. Over large alphabets, it might be even more
appropriate to employ probabilistic convergence criteria à la PAC learning [Val84] and
be content with a correct classification of a large fraction of the words, thus tolerating
imprecise tracing of boundaries in the alphabet partitions. This topic, as well as the
challenging adaptation of our framework to languages over Boolean vectors are left for
future work.
Acknowledgement. This work was supported by the French project EQINOCS (ANR-
11-BS02-004). We thank Peter Habermehl, Eugene Asarin and anonymous referees for
useful comments and pointers to the literature.
References
[Ang87] Angluin, D.: Learning regular sets from queries and counterexamples. Information
and Computation 75(2), 87–106 (1987)
[BJR06] Berg, T., Jonsson, B., Raffelt, H.: Regular inference for state machines with param-
eters. In: Baresi, L., Heckel, R. (eds.) FASE 2006. LNCS, vol. 3922, pp. 107–121.
Springer, Heidelberg (2006)
[BLP10] Benedikt, M., Ley, C., Puppis, G.: What you must remember when processing data
words. In: AMW (2010)
2 Let us remark that the modification of partition boundaries is not always a refinement in the precise mathematical sense of the term.
[BR04] Berg, T., Raffelt, H.: Model Checking. In: Broy, M., Jonsson, B., Katoen, J.-P.,
Leucker, M., Pretschner, A. (eds.) Model-Based Testing of Reactive Systems. LNCS,
vol. 3472, pp. 557–603. Springer, Heidelberg (2005)
[DlH10] De la Higuera, C.: Grammatical inference: learning automata and grammars. Cam-
bridge University Press (2010)
[DR95] Diekert, V., Rozenberg, G.: The Book of Traces. World Scientific (1995)
[Gol72] Gold, E.M.: System identification via state characterization. Automatica 8(5), 621–
636 (1972)
[HSJC12] Howar, F., Steffen, B., Jonsson, B., Cassel, S.: Inferring canonical register automata.
In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 251–
266. Springer, Heidelberg (2012)
[HSM11] Howar, F., Steffen, B., Merten, M.: Automata learning with automated alphabet
abstraction refinement. In: Jhala, R., Schmidt, D. (eds.) VMCAI 2011. LNCS,
vol. 6538, pp. 263–277. Springer, Heidelberg (2011)
[IHS13] Isberner, M., Howar, F., Steffen, B.: Inferring automata with state-local alphabet ab-
stractions. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871,
pp. 124–138. Springer, Heidelberg (2013)
[KF94] Kaminski, M., Francez, N.: Finite-memory automata. Theoretical Computer Sci-
ence 134(2), 329–363 (1994)
[Moo56] Moore, E.F.: Gedanken-experiments on sequential machines. In: Automata Studies.
Annals of Mathematical Studies, vol. 34, pp. 129–153. Princeton (1956)
[MP95] Maler, O., Pnueli, A.: On the learnability of infinitary regular sets. Information and
Computation 118(2), 316–326 (1995)
[Ner58] Nerode, A.: Linear automaton transformations. Proceedings of the American Mathe-
matical Society 9(4), 541–544 (1958)
[Val84] Valiant, L.G.: A theory of the learnable. Communications of the ACM 27(11), 1134–
1142 (1984)
[VHL+ 12] Veanes, M., Hooimeijer, P., Livshits, B., Molnar, D., Björner, N.: Symbolic finite
state transducers: algorithms and applications. In: POPL, pp. 137–150 (2012)
Verification of Concurrent Quantum Protocols
by Equivalence Checking
E. Ardeshir-Larijani, S.J. Gay, and R. Nagarajan
1 Introduction
There have been significant advances in quantum information science over the
last few decades and technologies based on these developments are at a stage well
suited for deployment in a range of industrial applications. The construction of
practical, general purpose quantum computers has been challenging. The only
large scale quantum computer available today is manufactured by the Cana-
dian company D-Wave. However, it does not appear to be general purpose and
not everyone is convinced that it is truly quantum. On the other hand, quantum
Supported by the Centre for Discrete Mathematics and its Applications (DIMAP),
University of Warwick, EPSRC award EP/D063191/1.
Partially supported by “Process Algebra Approach to Distributed Quantum Com-
putation and Secure Quantum Communication”, Australian Research Council Dis-
covery Project DP110103473.
communication and cryptography have made large strides and are now well estab-
lished. Physical restrictions of quantum communication, like preserving photon
states over long distances, are gradually being resolved, for example, by quantum
repeaters [11] and using quantum teleportation. Various Quantum Key Distri-
bution networks have been built, including the DARPA Quantum Network in
Boston, the SeCoQC network around Vienna and the Tokyo QKD Network.
There is no doubt that quantum communication and quantum cryptographic
protocols will become an integral part of our society’s infrastructure.
On the theoretical side, quantum key distribution protocols such as BB84
have been proved to be unconditionally secure [20]. It is important to understand
that this is an information-theoretic proof, which does not necessarily guarantee
that implemented systems are unconditionally secure. That is why alternative
approaches, such as those based on formal methods, could be useful in analysing
behaviour of implemented systems.
The area of formal verification, despite being a relatively young field, has
found numerous applications in hardware and software technologies. Today, ver-
ification techniques span a wide spectrum from model checking and theorem
proving to process calculus, all of them have helped us to grasp better under-
standing of interactive and complicated distributed systems. This work repre-
sents another milestone in our ongoing programme of applying formal methods to
quantum systems. In this paper, we present a concurrent language for describing
quantum systems, and perform verification by equivalence checking. The goal in
equivalence checking is to show that the implementation of a program is identical
to its specification, on all possible executions of the program. This is different
from property based model checking, where an intended property is checked over
all possible execution paths of a program. The key idea of this paper is to check
equivalence of quantum protocols by using their superoperator semantics (Sec-
tion 4). Superoperators are linear, so they are completely defined by their effect
on a basis of the appropriate space. To show that two protocols are equivalent,
we show that their associated superoperators are equivalent by simulating the
protocols for every state in a basis of the input vector space. By choosing a basis
that consists of stabilizer states, we can do this simulation efficiently.
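The following Python sketch (added for illustration) shows the linearity argument in miniature: two protocols, modelled as functions on density matrices, define the same superoperator iff they agree on a basis of the space of density matrices. It uses dense numpy matrices on a single qubit and a basis of pure-state projectors (cf. [16]), not the polynomial-time stabilizer simulation that makes the method scale; the two protocol functions are made-up examples.

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def protocol_impl(rho):                     # hypothetical implementation: apply H twice
    return H @ (H @ rho @ H.conj().T) @ H.conj().T

def protocol_spec(rho):                     # hypothetical specification: the identity channel
    return rho

def basis_density_matrices():
    # Projectors onto |0>, |1>, |+>, |+i>: a basis of the (real) space of
    # 2x2 Hermitian matrices, so they determine a superoperator by linearity.
    kets = [np.array([1, 0]), np.array([0, 1]),
            np.array([1, 1]) / np.sqrt(2), np.array([1, 1j]) / np.sqrt(2)]
    return [np.outer(k, k.conj()) for k in kets]

def equivalent(p1, p2, tol=1e-9):
    return all(np.allclose(p1(rho), p2(rho), atol=tol)
               for rho in basis_density_matrices())

print(equivalent(protocol_impl, protocol_spec))   # True, since HH = I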
One of the main challenges here is the explosion of states arising from branch-
ing and concurrency of programs. This is in addition to the need to deal with the
explosion of space needed for specifying quantum states. For a quantum state
with n qubits (quantum bits), we need to consider 2^n complex coefficients. To avoid this problem we restrict ourselves to the stabilizer formalism [1]. In this for-
malism, quantum states can be described in polynomial space and also for certain
quantum operations, the evolution of stabilizer states can be done in polynomial
time. Although one cannot do universal quantum computation within the stabi-
lizer formalism, many important protocols such as Teleportation [7], Quantum
Error Correction [8] as well as quantum entanglement can be analysed within
it. Crucially, quantum error correction is a prerequisite for fault tolerant quan-
tum computing [24]. The latter is necessary for building a scalable quantum
computer, capable of doing universal quantum computing.
Contributions. This paper extends our previous work [4] substantially in two
ways. First, we now use a concurrent modelling language, which means that
we can explicitly represent concurrency and communication in order to model
protocols more realistically. Second, we have analysed a much wider range of
examples, including several standard quantum protocols.
The paper is organised as follows. In Section 2 we give preliminaries from
quantum computing and the stabilizer formalism. In Sections 3 and 4, we give
the syntax and semantics of our concurrent modelling language. Section 5 de-
scribes our equivalence checking technique and Section 6 presents example pro-
tocols. In Section 7 we give details of our equivalence checking tool, and present
experimental results. Section 8 reviews related work and Section 9 concludes.
2 Background
In this section, we give a very concise introduction to quantum computing. For
more details, we refer to [22]. The basic element of quantum information is a
qubit (quantum bit). Qubits are vectors in an inner product vector space which is
called Hilbert space.1 Quantum states are description of qubits with the general
form: |Ψ = α1 |00 . . . 0 + . . . + αn |11 . . . 1, where αi ∈ C are called amplitudes
satisfying |α1 |2 + . . . + |αn |2 = 1. The so-called Ket notation or Dirac’s nota-
tion is used to distinguish unit vectors |0 and |1 from classical bits 0 and 1.
Also |00 . . . 0 corresponds to tensor product of unit vectors (i.e |0 ⊗ 0 . . . ⊗ |0).
There are two kinds of operations on quantum states, unitary transformations
and measurement. The side effect of the measurement operation is classical infor-
mation, for example, the outcome of measuring the above state |Ψ is a classical
bit string (00 . . . 0) with probability |α1 |2 , to (11 . . . 1) with probability |αn |2 .
Note that measurement is a destructive operation and it changes the state of a
qubit permanently. Qubits can be entangled. For example, a two qubit entangled
state |00 + |11, which is called a Bell state, cannot be decomposed into two
single qubit states. Measuring one of the qubits will fix the state of the other
qubit, even if they are physically separated.
Some basic quantum operations and their matrix representations are shown in
Figure 1. A model for describing a quantum system is the quantum circuit
Fig. 1. The Pauli matrices: X = [[0, 1], [1, 0]], Z = [[1, 0], [0, −1]], Y = [[0, −i], [i, 0]], I = [[1, 0], [0, 1]]
model, analogous to the classical circuit model. Each quantum circuit consists of
unitary and measurement gates. Unitary gates can be applied to one or more
1 Normally Hilbert space is defined with additional conditions which we are not concerned with in this paper.
qubits. In a certain kind of multiple-qubit gate, called a controlled gate, there are one or more control qubits and one or more target qubits. Depending
on the value of the control qubit, a unitary gate is applied to the target qubit.
Controlled-X (or CNot ) and Toffoli [22, p. 29] gates are examples of controlled
gates. Quantum circuits are normally described in the following way: single wires
represent qubits, double wires represent classical bits. Single gates and measure-
ment gates are depicted with squares, whereas controlled gates are shown with a
point representing the control qubit and a circle depicting the target qubit with
a vertical wire connecting them.
The stabilizer formalism is a useful scheme which characterises a small but
important part of quantum mechanics. The core idea of the stabilizer formalism
is to represent certain quantum states, which are called stabilizer states, with
their stabilizer group, instead of an exponential number of complex amplitudes.
For an n-qubit quantum stabilizer state |ϕ⟩, the stabilizer group is defined by Stab(|ϕ⟩) = {S : S|ϕ⟩ = +1|ϕ⟩}. This group can be represented elegantly by its n generators in the Pauli group (i.e., P^n for P in Figure 1). Several algorithms have
been developed for specifying and manipulating stabilizer states using group
representation of quantum states, see [1]. Importantly, the effect of Clifford Op-
erators (members of the normaliser of the Pauli group, known as the Clifford group) on
stabilizer states can be simulated by a polynomial time algorithm. Consequently,
we have the following important theorem which guarantees stabilizer states can
be specified in polynomial space and certain operations and measurement can
be done in polynomial time:
Theorem 1. (Gottesman-Knill, [22, p. 464]) Any quantum computation which
consists of only the following components:
1. State preparation, Hadamard gates, Phase gates, Controlled-Not gates and
Pauli gates.
2. Measurement gates.
3. Classical control conditions on the outcomes of measurements.
can be efficiently simulated on a classical computer.
The density operator is an alternative way of describing a quantum state where
we need to deal with uncertainty. For instance, an ensemble of quantum states {(|φi⟩, pi)}, where the pi are probabilities, can be represented by the following density operator:

ρ := Σi pi |φi⟩⟨φi|
where |φi⟩⟨φi| denotes the outer product. Density operators are positive and Hermitian, meaning they satisfy ⟨ϕ| ρ |ϕ⟩ ≥ 0 for all |ϕ⟩, and ρ† = ρ († denotes the transpose of the complex conjugate), respectively. Also a composite quantum system can be
elegantly described in the language of density operators. In particular, one can
obtain a reduced density operator by applying a partial trace operation on the
density operator of a composite system, see [22, p. 105].
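For illustration (a sketch added here, using dense numpy matrices), the fragment below builds ρ = Σi pi |φi⟩⟨φi| from an ensemble and computes a reduced density operator with a partial trace; tracing out one half of a Bell pair yields the maximally mixed single-qubit state.

import numpy as np

def density(ensemble):
    """ensemble: list of (ket, probability) pairs over the same space."""
    return sum(p * np.outer(k, k.conj()) for k, p in ensemble)

def partial_trace_second(rho, dim_a=2, dim_b=2):
    """Trace out the second subsystem of a state on a (dim_a * dim_b)-dimensional space."""
    rho = rho.reshape(dim_a, dim_b, dim_a, dim_b)
    return np.einsum('ijkj->ik', rho)        # sum over the second subsystem's indices

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)
rho = density([(bell, 1.0)])
print(partial_trace_second(rho))             # 0.5 * I: the maximally mixed state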
Superoperators are linear transforms on the space of density operators. Note
that for an n-qubit system, the space of density operators has dimension 2^{2n}.
p ::= t | t || t        val ::= 0 | 1

(R-Act)   α.P -α-> P
(R-Cong)  P ≡ P′,  P′ -α-> Q′,  Q′ ≡ Q   implies   P -α-> Q
(R-Com)   c!v.P || c?x.Q -τ-> P || Q[v/x]
(R-Par)   P -α-> P′   implies   P || Q -α-> P′ || Q

[Figure: fragment of the syntax and of the reduction rules of the concurrent language; the remainder of the figure is not reproduced]
Proof. The QPL programs corresponding to all of the interleavings of P map in-
puts to outputs in the same way, and therefore all define the same superoperator,
which we take to be [[P ]].
for all v ∈ B do
  for all i ∈ {1, 2} do
    |φ_{v,i}⟩ := StabSim*(Pi, v, 1)
    for all j ∈ I(Pi, v) − {1} do
      if ¬EQ_S(StabSim*(Pi, v, j), |φ_{v,i}⟩) then
        return Pi non-functional
      end if
    end for
  end for
  if ¬EQ_S(|φ_{v,1}⟩, |φ_{v,2}⟩) then
    return P1 ≇ P2
  end if
end for
return P1 ≅ P2
Fig. 4. Algorithm for checking equivalence of concurrent protocols
6 Examples
We have analysed a range of quantum protocols covering quantum communica-
tion, quantum fault-tolerance and quantum cryptography using our equivalence
checking tool. In this section, we present two of the protocols we have verified.
The remaining protocols that we have analysed, in particular, fault tolerant pro-
tocols such as one bit teleportation and remote CNOT as well as error correction
protocols can be found at http://go.warwick.ac.uk/eardeshir/qec.
Teleportation [7]: The goal in this protocol is to teleport a quantum state
from Alice to Bob without physically transferring qubits, using quantum en-
tanglement. Before starting the communication between the two parties, Alice
and Bob, an entangled pair is established and shared between them. Alice then
entangles the input qubit with her half of the entangled qubit by applying a
[Fig. 5. Quantum circuit implementing teleportation (circuit diagram not reproduced)]
controlled-not gate followed by a Hadamard gate. She then measures her qubits
and sends the classical outcome to Bob. Depending on the four classical outcomes of Alice's measurements, Bob applies certain X and Z operations and recovers the input state on his entangled qubit. The circuit which implements quantum teleportation is shown in Figure 5.
However the circuit model does not provide a high level interface and does
not capture the notion of physical separation between Alice and Bob. Through
our concurrent language, we provide a programming interface and we can also
describe the implementation of teleportation reflecting physical separation. The
specification and implementation programs for teleportation in our concurrent
language are shown in Figure 6.
//Implementation:
//Preparing EPR pair and sending to Alice and Bob:
newqubit y . newqubit z . H(y) . CNOT(y,z) . c!y . d!z . nil
|
//Alice’s process:
(input x . c?y . CNOT(x,y) . H(x) . m := measure x . n := measure y.
//Bob’s process :
d?w . b?m . b?n . if n then X(w) . if m then Z(w) . output w . nil)
//Specification:
input x.output x.nil
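As an independent cross-check (a dense state-vector simulation in numpy added here for illustration, not the stabilizer-based equivalence checker described in the paper), the sketch below runs the teleportation circuit for every one of the four measurement outcomes and verifies that, after Bob's corrections, his qubit carries the input state.

import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])

def teleport_branch(psi, m, n):
    ket = lambda b: np.eye(2)[b]
    epr = (np.kron(ket(0), ket(0)) + np.kron(ket(1), ket(1))) / np.sqrt(2)
    state = np.kron(psi, epr)                       # qubit 0: input, qubits 1-2: EPR pair
    state = np.kron(CNOT, I2) @ state               # CNOT with control 0, target 1
    state = np.kron(H, np.eye(4)) @ state           # Hadamard on qubit 0
    proj = np.kron(np.kron(np.outer(ket(m), ket(m)),
                           np.outer(ket(n), ket(n))), I2)
    branch = proj @ state                           # unnormalised post-measurement state
    if n: branch = np.kron(np.eye(4), X) @ branch   # Bob's X correction
    if m: branch = np.kron(np.eye(4), Z) @ branch   # Bob's Z correction
    branch = branch / np.linalg.norm(branch)
    return branch.reshape(2, 2, 2)[m, n, :]         # Bob's qubit in this branch

psi = np.array([0.6, 0.8])                          # an arbitrary input state
for m in (0, 1):
    for n in (0, 1):
        assert np.allclose(teleport_branch(psi, m, n), psi)
print("all four measurement branches return the input state")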
Quantum Secret Sharing: This protocol was first introduced by Hillery et al. [19]. The original problem of secret sharing involves an agent Alice sending a
message to two agents Bob and Charlie, one of whom is dishonest. Alice doesn’t
know which one of the agents is dishonest, so she must encode the message so that
Bob and Charlie must collaborate to retrieve it. For the quantum version of this
protocol the three agents need to share a maximally entangled three-qubit state,
called the GHZ state, prior to the execution of the protocol: |000⟩ + |111⟩. In
Figure 7, we assume that Charlie will end up with the original qubit (a variation
of the protocol will allow Bob to end up with it). First, Alice entangles the input
qubit with her entangled qubit from the GHZ state. Then Alice measures her
qubits and sends the outcome to Charlie. Bob also measures his qubit and sends
[Fig. 7. Concurrent program for quantum secret sharing (only fragments of the listing are recoverable, so it is not reproduced)]
the outcome to Charlie. Finally, Charlie is able to retrieve the original qubit once
he has access to the bits from Alice and Bob. The specification of secret sharing
is similar to teleportation, expressed in Figure 6. The security of this protocol is
a consequence of the no-cloning theorem and is discussed in [19]. The specification
for this protocol is the same as for teleportation.
We conclude with some final remarks. First, we can easily add more inputs
to each of our protocols, which means that we are checking e.g. teleportation of
one qubit in the presence of entanglement with other qubits. This follows from
linearity, but it is never explicitly stated in standard presentations of telepor-
tation. Second, we can model different implementations of a protocol, e.g. by
changing the amount of concurrency. These differences are invisible at the level
of circuit diagrams.
run on a 2.5GHz Intel Core i3 machine with 4GB RAM. We would like to com-
pare our results with those produced by the model checker QMC, but we have
not been successful in running all the examples. This is partly because QMC is based on a different approach to verification, i.e., temporal logic model checking, rather than equivalence checking. The tool Quantomatic is not a fully automatic tool, so we were not able to provide a comparison with its case studies either. The experimental results show how concurrency affects quan-
tum systems. Not surprisingly, with more sharing of entanglement and increased
classical and quantum communication, we have to deal with a larger number of
interleavings. We have verified (Figure 8) sequential models of protocols in our
current and previous tools [4]. Because of the complex structure of measurements
in the five qubit code, we were not able to model this protocol in the sequen-
tial equivalence checker. The scheduler in our previous tool is slower than the
one in our current work. This is because we were building program graphs for
extracting schedules, whereas in this work schedules are directly obtained from
abstract syntax tree. Comparing the results in Figure 8 shows that sequential
models are analysed more quickly because they do not deal with concurrency.
However, error correction protocols are inherently sequential, so their sequential
and concurrent models are very similar and produce similar results.
8 Related Work
In recent years there have been several approaches to the formal analysis of
quantum information systems. In this section we review some of the work that
is most relevant to this paper.
We have already mentioned the QMC system. QMC checks properties in
Quantum Computation Tree Logic (QCTL) [6] on models which lie within the
stabilizer formalism. It can be used to check some protocols in a process-oriented
style similar to that of the present paper; however, it simulates the protocols on
all stabilizer states as inputs, not just the smaller set of stabilizer states that
form a basis for the space of density matrices, and is therefore less efficient.
Our previous work [4] uses a similar approach to the present paper, but lim-
ited to sequential protocols. It therefore lacks the ability to explore, for a given
protocol, different models with different degrees of concurrency.
Process calculus can also be used to analyse quantum systems. Gay and Na-
garajan introduced CQP [17] based on the π-calculus; bisimulation for CQP has
been developed and applied by Davidson et al. [10,9]. Ying et al. have developed
qCCS [27] based on classical CCS, and studied its theory of bisimulation [14].
These are theoretical investigations which have not yet produced tools.
Wille et al. [26] consider two reversible circuits and then check their equiv-
alence with respect to a target functionality (specification). To this end, tech-
niques based on Boolean SAT and Quantum Binary Decision Diagrams [25] have
been used. However, these methods are only applicable to quantum circuits with
classical inputs/outputs.
Abramsky and Coecke [2] have developed diagrammatic reasoning techniques
for quantum systems, based on a category-theoretic formulation of quantum me-
chanics. Quantomatic [12] is a tool based on this formalism, which uses graph
rewriting in order to reason about quantum systems. The interface of Quan-
tomatic is graphical, in contrast to our tool which uses a programming lan-
guage syntax. Also, our tool verifies quantum protocols in a fully automatic
way, whereas Quantomatic is a semi-automatic tool which needs a considerable
amount of user intervention (see [13] for an example and discussion).
References
1. Aaronson, S., Gottesman, D.: Improved simulation of stabilizer circuits. Phys. Rev.
A 70, 052328 (2004)
2. Abramsky, S., Coecke, B.: A categorical semantics of quantum protocols. In: Pro-
ceedings of the 19th Annual IEEE Symposium on Logic in Computer Science, pp.
415–425 (2004)
3. Ardeshir-Larijani, E.: Quantum equivalence checker (2013),
http://go.warwick.ac.uk/eardeshir/qec
4. Ardeshir-Larijani, E., Gay, S.J., Nagarajan, R.: Equivalence checking of quantum
protocols. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795,
pp. 478–492. Springer, Heidelberg (2013)
5. Audenaert, K.M.R., Plenio, M.B.: Entanglement on mixed stabilizer states: normal
forms and reduction procedures. New Journal of Physics 7(1), 170 (2005)
6. Baltazar, P., Chadha, R., Mateus, P.: Quantum computation tree logic—model
checking and complete calculus. International Journal of Quantum Informa-
tion 6(2), 219–236 (2008)
7. Bennett, C.H., Brassard, G., Crépeau, C., Jozsa, R., Peres, A., Wootters, W.K.:
Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-
Rosen channels. Phys. Rev. Lett. 70, 1895–1899 (1993)
8. Calderbank, A.R., Shor, P.W.: Good quantum error-correcting codes exist. Phys.
Rev. A 54, 1098–1105 (1996)
9. Davidson, T.A.S., Gay, S.J., Nagarajan, R., Puthoor, I.V.: Analysis of a quantum
error correcting code using quantum process calculus. EPTCS 95, 67–80 (2012)
10. Davidson, T.A.S.: Formal Verification Techniques Using Quantum Process Calcu-
lus. PhD thesis, University of Warwick (2011)
11. de Riedmatten, H., Marcikic, I., Tittel, W., Zbinden, H., Collins, D., Gisin, N.:
Long distance quantum teleportation in a quantum relay configuration. Physical
Review Letters 92(4), 047904 (2004)
12. Dixon, L., Duncan, R.: Graphical reasoning in compact closed categories for quan-
tum computation. Annals of Mathematics and Artificial Intelligence 56(1), 23–42
(2009)
13. Duncan, R., Lucas, M.: Verifying the Steane code with quantomatic.
arXiv:1306.4532 (2013)
14. Feng, Y., Duan, R., Ying, M.: Bisimulation for quantum processes. In: Proceed-
ings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages, pp. 523–534. ACM (2011)
15. Gagnon, E.: SableCC, an object-oriented compiler framework. Master’s thesis,
School of Computer Science, McGill University (1998)
16. Gay, S.J.: Stabilizer states as a basis for density matrices. arXiv:1112.2156 (2011)
17. Gay, S.J., Nagarajan, R.: Communicating Quantum Processes. In: Proceedings
of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, pp. 145–157. ACM (2005)
18. Harel, D., Kupferman, O., Vardi, M.Y.: On the complexity of verifying concurrent
transition systems. Information and Computation 173(2), 143–161 (2002)
19. Hillery, M., Bužek, V., Berthiaume, A.: Quantum secret sharing. Phys. Rev. A 59,
1829–1834 (1999)
20. Mayers, D.: Unconditional Security in Quantum Cryptography. Journal of the
ACM 48(3), 351–406 (2001)
21. Milner, R.: Communication and concurrency. Prentice Hall (1989)
22. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information.
Cambridge University Press (2000)
23. Selinger, P.: Towards a quantum programming language. Mathematical Structures
in Computer Science 14(4), 527–586 (2004)
24. Shor, P.W.: Fault-tolerant quantum computation. In: Proceedings of the 37th An-
nual Symposium on Foundations of Computer Science, FOCS 1996. IEEE Com-
puter Society, Washington, DC (1996)
25. Viamontes, G.F., Markov, I.L., Hayes, J.P.: Quantum Circuit Simulation. Springer
(2009)
26. Wille, R., Grosse, D., Miller, D., Drechsler, R.: Equivalence checking of reversible
circuits. In: 39th International Symposium on Multiple-Valued Logic, pp. 324–330
(2009)
27. Ying, M., Feng, Y., Duan, R., Ji, Z.: An algebra of quantum processes. ACM Trans.
Comput. Logic 10(3), 19:1–19:36 (2009)
Computing Conditional Probabilities
in Markovian Models Efficiently
1 Introduction
Probabilistic model checking has become a prominent technique for the quanti-
tative analysis of systems with stochastic phenomena. Tools like PRISM [20] or
MRMC [18] provide powerful probabilistic model checking engines for Markovian
models and temporal logics such as probabilistic computation tree logic (PCTL)
for discrete models and its continuous-time counterpart CSL (continuous stochas-
tic logic) or linear temporal logic (LTL) as formalism to specify complex path
properties. The core task for the quantitative analysis is to compute the prob-
ability of some temporal path property or the expected value of some random
variable. For finite-state Markovian models with discrete probabilities, this task
is solvable by a combination of graph algorithms, matrix-vector operations and
methods for solving linear equation systems or linear programming techniques
[25,9,15,7]. Although probabilistic model checking is a very active research topic
and many researchers have suggested sophisticated methods e.g. to tackle the
state explosion problem or to provide algorithms for the analysis of infinite-state
This work was in part funded by the DFG through the CRC 912 HAEC, the cluster
of excellence cfAED, the project QuaOS, the DFG/NWO-project ROCKS, and by
the ESF young researcher group IMData 100098198, and the EU-FP-7 grant 295261
(MEALS).
Pr^M_s(ϕ | ψ) = Pr^M_s(ϕ ∧ ψ) / Pr^M_s(ψ),
where s is a state in M with Pr^M_s(ψ) > 0. If both the objective ϕ and the
condition ψ are ω-regular path properties, e.g. specified by LTL formulas or
some ω-automaton, then ϕ ∧ ψ is again ω-regular, and the above quotient is
computable with standard techniques. This approach has been taken by Andrés
and van Rossum [1,2] for the case of discrete Markov chains and PCTL path
formulas, where ϕ ∧ ψ is not a PCTL formula, but an ω-regular property of some
simple type if nested state formulas are viewed as atoms. Recently, an automata-
based approach has been developed for continuous-time Markov chains and CSL
path formulas built by cascades of the until-operator with time- and cost-bounds
[13]. This approach has been adapted in [17] for discrete-time Markov chains and
PCTL-like path formulas with multiple bounded until-operators.
For models that support both the representation of nondeterministic and prob-
abilistic behaviors, such as Markov decision processes (MDPs), reasoning about
2 Preliminaries
We briefly summarize our notations used for Markov chains and Markov decision
processes. Further details can be found in textbooks on probability theory and
Markovian models, see e.g. [24,19,16].
as the probability for π. The cylinder set Cyl (π) is the set of all infinite paths ς
where π is a prefix of ς. We write FPaths(s) for the set of all finite paths π with
first(π) = s. Similarly, Paths(s) stands for the set of infinite paths starting in s.
Given a state s, the probability space induced by M and s is defined using
classical measure-theoretic concepts. The underlying sigma-algebra is generated
by the cylinder sets of finite paths. This sigma-algebra does not depend on
s. We refer to the elements of this sigma-algebra as (measurable) path events.
The probability measure Pr^M_s is defined on the basis of standard measure extension theorems that yield the existence of a probability measure Pr^M_s with Pr^M_s(Cyl(π)) = Pr(π) for all π ∈ FPaths(s), while the cylinder sets of paths π with first(π) ≠ s have measure 0 under Pr^M_s.
We write Act(s) for the set of actions that are enabled in s, i.e., P(s, α, s′) > 0 for some s′ ∈ S. For technical reasons, we require that Act(s) ≠ ∅ for all states s. State s is said to be probabilistic if Act(s) = {α} is a singleton, in which case we also write P(s, s′) rather than P(s, α, s′). A trap state is a probabilistic state s with P(s, s) = 1. Paths are finite or infinite sequences s0 s1 s2 . . . of states such
that for all i ≥ 1 there exists an action αi with P (si−1 , αi , si ) > 0. (For our
purposes, the actions are irrelevant in paths.) Several notations that have been
introduced for Markov chains can now be adapted for Markov decision processes,
such as first(π), FPaths(s), Paths(s).
Reasoning about probabilities for path properties in MDPs requires the selec-
tion of an initial state and the resolution of the nondeterministic choices between
the possible transitions. The latter is formalized via schedulers, often also called
policies or adversaries, which take as input a finite path and select an action to be
executed. For the purposes of this paper, it suffices to consider deterministic, pos-
sibly history-dependent schedulers, i.e., partial functions S : FPaths → Act such
that S(π) ∈ Act(last(π)) for all finite paths π. Given a scheduler S, an S-path is any path that might arise when the nondeterministic choices in M are resolved using S. Thus, π = s0 s1 . . . sn is an S-path iff P(sk−1, S(s0 s1 . . . sk−1), sk) > 0 for all 1 ≤ k ≤ n. In this case, S[π] denotes the scheduler “S after π” given by S[π](t0 t1 . . . tk) = S(s0 s1 . . . sn t1 . . . tk) if sn = t0. The behavior of S[π] for paths not starting in sn is irrelevant. The probability of π under S is the
product of the probabilities of its transitions:

Pr^S(π) = ∏_{k=1}^{n} P(sk−1, S(s0 s1 . . . sk−1), sk)
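A direct Python rendering of this definition (added for illustration; the MDP encoding and the tiny example are assumptions, not taken from the paper):

def path_probability(P, scheduler, path):
    """P: dict (state, action, successor) -> probability;
    scheduler: maps a history (tuple of states) to an action;
    path: tuple of states s0 s1 ... sn."""
    prob = 1.0
    for k in range(1, len(path)):
        history = path[:k]
        prob *= P.get((path[k - 1], scheduler(history), path[k]), 0.0)
    return prob

# Hypothetical MDP: in s, action 'a' reaches t with probability 0.3 and stays
# in s with probability 0.7; in t the only enabled action 'b' is a self-loop.
P = {('s', 'a', 't'): 0.3, ('s', 'a', 's'): 0.7, ('t', 'b', 't'): 1.0}
scheduler = lambda history: 'a' if history[-1] == 's' else 'b'
print(path_probability(P, scheduler, ('s', 's', 't')))   # 0.7 * 0.3 = 0.21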
ω-regular path properties are measurable [25]. We abuse notations and identify
measurable path properties and the induced path event. Thus,
Pr^S_{M,s}(ϕ) = Pr^S_{M,s}{π ∈ Paths(s) : π |= ϕ}
denotes the probability for ϕ under scheduler S and starting state s.
Pψ(s^bef, v^bef) = P(s, v) · Pr^M_v(♦G) / Pr^M_s(♦G)

For s ∈ G, we define Pψ(s^bef, v^nor) = P(s, v), modeling the switch from before to normal mode. For the states in normal mode, the transition probabilities are given by Pψ(s^nor, v^nor) = P(s, v). In all other cases, Pψ(·) = 0. For the labeling with atomic propositions, we suppose that each state s in M and its copies s^bef and s^nor in Mψ satisfy the same atomic propositions.
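The rescaling of the before-mode transition probabilities can be sketched as follows (an illustration added here; the reachability probabilities are obtained by a simple fixed-point iteration and the example chain is made up):

def reach_probabilities(P, states, G, iterations=1000):
    # Least fixed point of x(s) = 1 for s in G, x(s) = sum_v P(s,v) * x(v) otherwise.
    x = {s: (1.0 if s in G else 0.0) for s in states}
    for _ in range(iterations):
        for s in states:
            if s not in G:
                x[s] = sum(p * x[v] for (u, v), p in P.items() if u == s)
    return x

def before_mode(P, states, G):
    # P_psi(s_bef, v_bef) = P(s, v) * Pr_v(<>G) / Pr_s(<>G) for s not in G.
    x = reach_probabilities(P, states, G)
    return {(s, v): p * x[v] / x[s]
            for (s, v), p in P.items()
            if s not in G and x[s] > 0 and x[v] > 0}

# Example: from s, reach the goal g with probability 0.2, fall into a failure
# trap f with probability 0.3, or stay in s with probability 0.5.
P = {('s', 'g'): 0.2, ('s', 'f'): 0.3, ('s', 's'): 0.5,
     ('g', 'g'): 1.0, ('f', 'f'): 1.0}
print(before_mode(P, states={'s', 'g', 'f'}, G={'g'}))
# approximately {('s','g'): 0.5, ('s','s'): 0.5}: the failure branch vanishes and the
# remaining probabilities are renormalised so that G is reached almost surely.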
By applying standard arguments for finite Markov chains we obtain that Pr^{Mψ}_{s^bef}(♦G^bef) = 1 for all states s in M with s |= ∃♦G. (This is a simple consequence of the fact that all states in S^bef can reach G^bef.) Thus, up to the switch from G to G^bef, the condition ♦G (which we impose for M) holds almost surely for Mψ. For each path property ϕ, there is a one-to-one correspondence between
Let (M, sinit) be a pointed MDP where M = (S, Act, P) and let F, G ⊆ S such that sinit |= ∃♦G, in which case Pr^max_{M,sinit}(♦G) > 0. The task is to compute

max_S Pr^S_{M,sinit}(♦F | ♦G) = max_S ( Pr^S_{M,sinit}(♦F ∧ ♦G) / Pr^S_{M,sinit}(♦G) )

where S ranges over all schedulers for M such that Pr^S_{M,sinit}(♦G) > 0. By the
results of [1,2], there exists a scheduler S maximizing the conditional probability
for ♦F , given ♦G. (This justifies the use of max rather than sup.)
Only for simplicity, we assume that F ∩ G = ∅. Thus, there are just two cases
for the event ♦F ∧ ♦G: “either F before G, or G before F ”. We also suppose
sinit ∉ F ∪ G and that all states s ∈ S are accessible from sinit.
Recall that S[π] denotes the scheduler “S after π”. The idea is that T behaves
as S as long as neither F nor G has been reached. As soon as a G-state (resp. F -
state) has been entered, T mimics some scheduler that maximizes the probability
to reach F (resp. G). This scheduler satisfies (2) and (3) by construction. Item
(1) follows after some calculations (see [6]).
The three fresh states goal , fail and stop are trap states. Then, by Lemma 1:
the paths satisfying ϕ ∧ ψ and the paths satisfying ψ is not affected and almost
surely a path satisfying ψ will be generated. Thus, the conditional probability
for ϕ ∧ ψ given ψ under some scheduler of the original MDP agrees with the
(unconditional) probability for ϕ under the corresponding scheduler of the new
MDP Mϕ|ψ .
The restart policy is obvious for finite paths that enter the trap state fail .
Instead of staying in fail , we simply restart the computation by returning to the
initial state sinit. The second possibility to violate ψ is given by paths that never enter
any of the three trap states in T. To treat such paths we rely on well-known
results for finite-state MDPs stating that for all schedulers S almost all S-paths
eventually enter an end component (i.e., a strongly connected sub-MDP), stay
there forever and visit all its states infinitely often [11,12]. The idea is that we
equip all states s that belong to some end component without any T -state with
the restart-option, i.e., we add the nondeterministic alternative to return to the
initial state sinit . To enforce that such end components will be left eventually
by taking the restart-transition, one might impose strong fairness conditions for
the schedulers in Mϕ|ψ . Such fairness assumptions are, however, irrelevant for
maximal reachability conditions [3,4].
Let B be the set of (bad) states v such that there exists a scheduler S that
never visits one of the three trap states goal, stop or fail when starting in v:
    v ∈ B   iff   there exists a scheduler S such that Pr^S_{M′,v}(♦T) = 0
The MDP Mϕ|ψ = (S′, Act ∪ {τ}, Pϕ|ψ) has the same state space as the normal
form MDP M′. Its action set extends the action set of M′ by a fresh action
symbol τ for the restart-transitions. For the states s ∈ S′ \ B with s ≠ fail,
the new MDP Mϕ|ψ behaves as M′, i.e., Pϕ|ψ(s, α, s′) = P(s, α, s′) for all
s ∈ S′ \ (B ∪ {fail}), α ∈ Act and s′ ∈ S′. The fresh action τ is not enabled in
the states s ∈ S′ \ (B ∪ {fail}). For the fail-state, Mϕ|ψ returns to the initial state,
i.e., Pϕ|ψ(fail, τ, s_init) = 1 and Pϕ|ψ(fail, τ, s′) = 0 for all states s′ ∈ S′ \ {s_init}.
No other action than τ is enabled in fail. For the states v ∈ B, Mϕ|ψ decides
nondeterministically to behave as M′ or to return to the initial state s_init. That is,
if v ∈ B, α ∈ Act, s′ ∈ S′ then Pϕ|ψ(v, α, s′) = P(v, α, s′) and Pϕ|ψ(v, τ, s_init) = 1.
In all remaining cases, we have Pϕ|ψ(v, τ, ·) = 0.
Paths in M′ that stay in B or that end up in fail do not “contribute” to the
conditional probability for ♦goal, given ♦(goal ∨ stop). Instead, the probability
of the infinite paths π of M′ with π |= □B or π |= ♦fail is “distributed” over
the probabilities for ♦goal and ♦stop when switching to conditional probabilities.
This is mimicked by the restart-transitions to s_init in Mϕ|ψ.
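As a rough illustration of this restart construction (a sketch following the description above, not the implementation used for the paper), the transformation can be phrased as a rewrite of the transition function; the toy normal-form MDP, the set B and all identifiers below are hypothetical.

```python
# Sketch of the restart transformation M' -> M_{phi|psi}: same state space,
# a fresh action 'tau', a deterministic restart from 'fail', and an optional
# (nondeterministic) restart in the "bad" states B.
def restart_mdp(P_prime, states, B, s_init):
    """P_prime maps (state, action) -> {successor: probability}."""
    P_cond = {}
    for s in states:
        if s == "fail":
            P_cond[("fail", "tau")] = {s_init: 1.0}   # only tau enabled in fail
            continue
        for (state, action), dist in P_prime.items():
            if state == s:                            # keep the behaviour of M'
                P_cond[(s, action)] = dict(dist)
        if s in B:                                    # additional restart option
            P_cond[(s, "tau")] = {s_init: 1.0}
    return P_cond

# hypothetical normal-form MDP with trap states goal, stop, fail
P_prime = {
    ("s0", "alpha"): {"goal": 0.4, "fail": 0.3, "b1": 0.3},
    ("b1", "beta"):  {"b1": 1.0},            # end component avoiding the traps
    ("goal", "loop"): {"goal": 1.0},
    ("stop", "loop"): {"stop": 1.0},
    ("fail", "loop"): {"fail": 1.0},
}
states = ["s0", "b1", "goal", "stop", "fail"]
print(restart_mdp(P_prime, states, B={"b1"}, s_init="s0"))
```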
Theorem 2 (Soundness of step 2). For the initial state s = s_init, we have:
    Pr^max_{M′,s}( ♦goal | ♦(goal ∨ stop) )  =  Pr^max_{Mϕ|ψ,s}( ♦goal )
We can rely on very similar ideas as for reachability conditions (see Section
4.1). The construction of a normal form MDP M′ (step 1) is roughly the same,
except that we deal only with two fresh trap states: goal and fail. The restart
mechanism in step 2 can be realized by switching from M′ to a new MDP Mϕ|ψ
that is defined in the same way as in Section 4.1, except that restart-transitions
are only added to those states v where v ∈ R_i for some i ∈ {1, . . . , k}, and v
is contained in some end component that does not contain goal and does not
contain any G_i-state. For further details we refer to the extended version [6].
All calculations for this paper were carried out on a computer with two Intel E5-
2680 8-core CPUs at 2.70 GHz and 384 GB of RAM. Table 1 lists results for the
calculation of the conditional probabilities (B1)–(B3), with N = 128 fragments
and M = 10 retries. We report the number of states and the time for building the
Table 1. Statistics for the computation of (B1), (B2), (B3) for N = 128, M = 10
model and statistics for the calculation of Pr^M_s(ϕ | ψ) with the method presented
in Section 3 and via the quotient of Pr^M_s(ϕ ∧ ψ) and Pr^M_s(ψ). In addition to
the total time for the calculation, for our method we list as well the size of
the transformed model Mψ , the time spent in the transformation phase and
the time spent to calculate the probabilities of ϕ in Mψ . In these experiments,
our transformation method outperforms the quotient approach by separating
the treatment of ψ and ϕ. As expected, the particular condition significantly
influences the size of Mψ and the time spent for the calculation in Mψ . We
plan to allow caching of Mψ if the task is to treat multiple objectives under the
same condition ψ. We have carried out experiments for conditional rewards with
similar scalability results as well, see [6].
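For intuition about the quotient approach used as baseline here, the following sketch estimates Pr_s(♦F | ♦G) in a small hypothetical DTMC by simulation, dividing the frequency of “F and G both reached” by that of “G reached”; the paper's tool instead computes both quantities exactly by model checking, so this is only an illustration of the quotient idea.

```python
# Naive simulation-based illustration of the quotient Pr(F and G)/Pr(G)
# in a small absorbing DTMC (all data hypothetical).
import random

P = {  # P[state] = list of (successor, probability)
    "s0": [("f", 0.3), ("g", 0.3), ("dead", 0.4)],
    "f": [("g", 0.5), ("dead", 0.5)],
    "g": [("g", 1.0)],
    "dead": [("dead", 1.0)],
}
F, G, ABSORBING = {"f"}, {"g"}, {"g", "dead"}

def visited_states(start="s0", horizon=50):
    s, seen = start, {start}
    for _ in range(horizon):
        if s in ABSORBING:
            break
        r, acc = random.random(), 0.0
        for succ, prob in P[s]:
            acc += prob
            if r < acc:
                s = succ
                break
        seen.add(s)
    return seen

runs, hits_g, hits_fg = 50_000, 0, 0
for _ in range(runs):
    seen = visited_states()
    if seen & G:
        hits_g += 1
        if seen & F:
            hits_fg += 1

# exact value for this chain is 0.15 / 0.45 = 1/3
print(hits_fg / hits_g if hits_g else float("nan"))
```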
Experiments with MDPs. We report on experimental studies with our implementation
of the calculation of Pr^max_{M,s}(♦F | ♦G) for the initial state s =
s_init of the parameterized MDP presented in [23]; see also [22],
http://www.prismmodelchecker.org/casestudies/wlan.php. It models a two-way hand-
shake mechanism of the IEEE 802.11 (WLAN) medium access control scheme
with two senders S1 and S2 that compete for the medium. As messages get
corrupted when both senders send at the same time (called a collision), a prob-
abilistic back-off mechanism is employed. The model deals with the case where
a single message from S1 and S2 should be successfully sent. We consider here:
(W1) Pr^max_{M,s}( ♦ “c2 collisions” | ♦ “c1 collisions” )
(W2) Pr^max_{M,s}( ♦ “deadline t expired without success of S1” | ♦ “c collisions” )
The parameter N specifies the maximal number of back-offs that each sender per-
forms. The atomic propositions “c collisions” are supported by a global counter
variable in the model that counts the collisions (up to the maximal interesting
value for the property). For (W2), the deadline t is encoded in the model by a
global variable counting down until the deadline is expired.
Calculating (W1). Table 2 lists results for the calculation of (W1) with c2 = 4
and c1 = 2. We report the number of states and the time for building the
model. The states in the transformed MDP Mϕ|ψ consist of the states in the
original MDP M plus the three trap states introduced in the transformation.
We list the time for the transformation M Mϕ|ψ and for the computation
in Mϕ|ψ separately. For comparison, we list as well the time for calculating
the unconditional probabilities Pr^max_M(ϕ) and Pr^max_M(ψ) for all states in the
model, which account for a large part of the transformation. As can be seen, our
approach scales reasonably well.

  t    c   states      build M   M → Mϕ|ψ   calc in Mϕ|ψ   total time   Pr^max_M(ϕ)   Pr^max_M(ψ)
  50   1   539,888     10.0 s    6.4 s       0.4 s          6.8 s        6.0 s          0.1 s
  50   2   539,900     9.5 s     7.1 s       4.6 s          11.7 s       6.0 s          0.6 s
  100  1   4,769,199   95.1 s    194.6 s     2.4 s          197.1 s      192.0 s        0.5 s
  100  2   4,769,235   93.3 s    199.8 s     85.5 s         285.5 s      184.4 s        10.4 s
Calculating (W2). Table 3 lists selected results and statistics for (W2) with
N = 3, deadline t ∈ {50, 100} and number of collisions in the condition c ∈ {1, 2}.
Again, the time for the transformation is dominated by the computations of
Pr^max_M(ϕ) and Pr^max_M(ψ). However, in contrast to (W1), the time for the
computation in Mϕ|ψ is significantly lower. The complexity in practice thus varies
significantly with the particularities of the model and the condition.
6 Conclusion
We presented new methods for the computation of (maximal) conditional prob-
abilities via reductions to the computation of ordinary (maximal) probabilities
in discrete Markov chains and MDPs. These methods rely on transformations of
the model to encode the effect of conditional probabilities. For MDPs we concen-
trated on the computation of maximal conditional probabilities. Our techniques
are, however, also applicable for reasoning about minimal conditional probabilities
as Pr^min_{M,s}(ϕ | ψ) = 1 − Pr^max_{M,s}(¬ϕ | ψ). By our results, the complexity of
the problem that asks whether the (maximal) conditional probability meets a
given probability bound is not harder than the corresponding question for uncon-
ditional probabilities. This is reflected in our experiments: for Markov chains,
our new method outperforms the naïve approach. In future work, we will extend
our implementation for MDPs, which currently supports only reachability objectives
and conditions, and study methods for the computation of maximal or minimal
expected conditional accumulated rewards.
References
1. Andrés, M.E., van Rossum, P.: Conditional probabilities over probabilistic and
nondeterministic systems. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008.
LNCS, vol. 4963, pp. 157–172. Springer, Heidelberg (2008)
2. Andrés, M.E.: Quantitative Analysis of Information Leakage in Probabilistic and
Nondeterministic Systems. PhD thesis, Radboud University Nijmegen (2011)
3. Baier, C.: On the algorithmic verification of probabilistic systems. Universität
Mannheim, Habilitation Thesis (1998)
4. Baier, C., Groesser, M., Ciesinski, F.: Quantitative analysis under fairness con-
straints. In: Liu, Z., Ravn, A.P. (eds.) ATVA 2009. LNCS, vol. 5799, pp. 135–150.
Springer, Heidelberg (2009)
5. Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press (2008)
6. Baier, C., Klein, J., Klüppelholz, S., Märcker, S.: Computing conditional prob-
abilities in Markovian models efficiently. Technical report, TU Dresden (2014),
http://wwwtcs.inf.tu-dresden.de/ALGI/PUB/TACAS14/
7. Bianco, A., de Alfaro, L.: Model checking of probabilistic and non-deterministic
systems. In: Thiagarajan, P.S. (ed.) FSTTCS 1995. LNCS, vol. 1026, pp. 499–513.
Springer, Heidelberg (1995)
8. Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press (2000)
9. Courcoubetis, C., Yannakakis, M.: The complexity of probabilistic verification.
Journal of the ACM 42(4), 857–907 (1995)
10. D’Argenio, P.R., Jeannet, B., Jensen, H.E., Larsen, K.G.: Reachability analysis of
probabilistic systems by successive refinements. In: de Alfaro, L., Gilmore, S. (eds.)
PAPM-PROBMIV 2001. LNCS, vol. 2165, pp. 39–56. Springer, Heidelberg (2001)
11. de Alfaro, L.: Formal Verification of Probabilistic Systems. PhD thesis, Stanford
University, Department of Computer Science (1997)
12. de Alfaro, L.: Computing minimum and maximum reachability times in probabilis-
tic systems. In: Baeten, J.C.M., Mauw, S. (eds.) CONCUR 1999. LNCS, vol. 1664,
pp. 66–81. Springer, Heidelberg (1999)
13. Gao, Y., Xu, M., Zhan, N., Zhang, L.: Model checking conditional CSL for
continuous-time Markov chains. Information Processing Letters 113(1-2), 44–50
(2013)
14. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games.
LNCS, vol. 2500. Springer, Heidelberg (2002)
15. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects of Computing 6, 512–535 (1994)
16. Haverkort, B.: Performance of Computer Communication Systems: A Model-Based
Approach. Wiley (1998)
17. Ji, M., Wu, D., Chen, Z.: Verification method of conditional probability based on
automaton. Journal of Networks 8(6), 1329–1335 (2013)
18. Katoen, J.-P., Zapreev, I.S., Hahn, E.M., Hermanns, H., Jansen, D.N.: The ins and
outs of the probabilistic model checker MRMC. Performance Evaluation 68(2), 90–
104 (2011)
19. Kulkarni, V.: Modeling and Analysis of Stochastic Systems. Chapman & Hall
(1995)
20. Kwiatkowska, M., Norman, G., Parker, D.: Probabilistic symbolic model checking
with PRISM: A hybrid approach. STTT 6(2), 128–142 (2004)
21. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic
real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
22. Kwiatkowska, M., Norman, G., Parker, D.: The PRISM benchmark suite. In: QEST
2012. IEEE (2012)
23. Kwiatkowska, M., Norman, G., Sproston, J.: Probabilistic model checking of the
IEEE 802.11 wireless local area network protocol. In: Hermanns, H., Segala, R.
(eds.) PAPM-PROBMIV 2002. LNCS, vol. 2399, pp. 169–187. Springer, Heidelberg
(2002)
24. Puterman, M.: Markov Decision Processes: Discrete Stochastic Dynamic Program-
ming. John Wiley & Sons, Inc., New York (1994)
25. Vardi, M.: Automatic verification of probabilistic concurrent finite-state programs.
In: FOCS 1985, pp. 327–338. IEEE (1985)
26. Vardi, M.Y.: Probabilistic linear-time model checking: An overview of the
automata-theoretic approach. In: Katoen, J.-P. (ed.) ARTS 1999. LNCS, vol. 1601,
pp. 265–276. Springer, Heidelberg (1999)
Permissive Controller Synthesis
for Probabilistic Systems
1 Introduction
Probabilistic model checking is used to automatically verify systems with stochas-
tic behaviour. Systems are modelled as, for example, Markov chains, Markov
decision processes, or stochastic games, and analysed algorithmically to verify
quantitative properties specified in temporal logic. Applications include checking
the safe operation of fault-prone systems (“the brakes fail to deploy with probability
at most 10^−6”) and establishing guarantees on the performance of, for
example, randomised communication protocols (“the expected time to establish
connectivity between two devices never exceeds 1.5 seconds”).
A closely related problem is that of controller synthesis. This entails construct-
ing a model of some entity that can be controlled (e.g., a robot, a vehicle or a
machine) and its environment, formally specifying the desired behaviour of the
system, and then generating, through an analysis of the model, a controller that
will guarantee the required behaviour. In many applications of controller syn-
thesis, a model of the system is inherently probabilistic. For example, a robot’s
2 Preliminaries
We denote by Dist(X) the set of discrete probability distributions over a set X.
A Dirac distribution is one that assigns probability 1 to some s ∈ X. The support
of a distribution d ∈ Dist(X) is defined as supp(d) := {x ∈ X | d(x) > 0}.
unique player ◦ such that s ∈ S◦ picks the action a ∈ A(s) to be taken in state
The classical controller synthesis problem asks whether there is a sound strategy.
We can determine whether this is the case by computing the optimal strategy
for player ♦ in game G [12,15]. This problem is known to be in NP ∩ co-NP, but,
in practice, methods such as value or policy iteration can be used efficiently.
Example 1. Fig. 1 shows a stochastic game G, with controller and environment
player states drawn as diamonds and squares, respectively. It models the control
of a robot moving between 4 locations (s0, s2, s3, s5). When moving east (s0→s2
or s3→s5),
[Fig. 1: the stochastic game G over states s0–s5 with actions east, south, north, pass, block and done, and branching probabilities 0.75/0.25, 0.7/0.3 and 0.6/0.4]
3.1 Multi-strategies
Multi-strategies generalise the notion of strategies, as defined in Section 2.
Definition 2 (Multi-strategy). A (memoryless) multi-strategy for a game
G = ⟨S♦, S□, s, A, δ⟩ is a function θ : S♦ → Dist(2^A) with θ(s)(∅) = 0 for all s ∈ S♦.
As for strategies, a multi-strategy θ is deterministic if θ always returns a Dirac
distribution, and randomised otherwise. We write Θ^det_G and Θ^rand_G for the sets of
all deterministic and randomised multi-strategies in G, respectively.
A deterministic multi-strategy θ chooses a set of allowed actions in each state
s ∈ S♦ , i.e., those in the unique set B ⊆ A for which θ(s)(B) = 1. The re-
maining actions A(s) \ B are said to be blocked in s. In contrast to classical
controller synthesis, where a strategy σ can be seen as providing instructions
about precisely which action to take in each state, in permissive controller syn-
thesis a multi-strategy provides multiple actions, any of which can be taken. A
randomised multi-strategy generalises this by selecting a set of allowed actions
in state s randomly, according to distribution θ(s).
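To make these notions concrete, here is a small illustrative sketch (not part of the paper's formal development) of deterministic and randomised memoryless multi-strategies as maps from controller states to distributions over non-empty action sets, reusing the action names of Example 1; the "complies" check is only a loose approximation of the formal notion of compliance.

```python
# Illustrative sketch: memoryless multi-strategies as maps from controller
# states to distributions over non-empty sets of allowed actions.
import random

# deterministic multi-strategy: a Dirac distribution over one allowed set
theta_det = {
    "s0": {frozenset({"south", "east"}): 1.0},
    "s2": {frozenset({"south"}): 1.0},
    "s3": {frozenset({"east"}): 1.0},
}

# randomised multi-strategy: the allowed set in s3 is chosen at random
theta_rand = dict(theta_det)
theta_rand["s3"] = {frozenset({"north"}): 0.7,
                    frozenset({"north", "east"}): 0.3}

def allowed_actions(theta, state):
    """Sample a non-empty set of allowed actions in a controller state."""
    sets, weights = zip(*theta[state].items())
    return random.choices(sets, weights=weights)[0]

def complies(action, theta, state):
    """Rough check: the action appears in at least one allowed set."""
    return any(action in allowed for allowed in theta[state])

print(allowed_actions(theta_rand, "s3"))
print(complies("east", theta_rand, "s3"))  # True: east is allowed with prob. 0.3
```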
Example 2. We return to the stochastic game from Ex. 1 (see Fig. 1) and re-use
the property φ = R^moves_{≤5}[C]. The strategy that picks south in s0 and east in s3
results in an expected reward of 3.5 (i.e., 3.5 moves on average to reach s5). The
strategy that picks east in s0 and south in s2 yields expected reward 5. Thus a
(deterministic) multi-strategy θ that picks {south, east} in s0, {south} in s2 and
{east} in s3 is sound for φ since the expected reward is always at most 5.
ψ′ given by the local penalties: ψ′(s, a) = pen_loc(ψ, θ, s) for all a ∈ A(s). Then:
    pen_dyn(ψ, θ) = sup{ E^{σ,π}_{G,s}(ψ′) | σ ∈ Σ^♦_G, π ∈ Σ^□_G and σ complies with θ }.
Example 3. We return to Ex. 2 and consider a static penalty scheme (ψ, sta)
assigning 1 to the actions north, east, south (in any state). The deterministic
multi-strategy θ from Ex. 2 is optimally permissive for φ = R^moves_{≤5}[C], with
penalty 1 (just north in s3 is blocked). If we instead use φ′ = R^moves_{≤16}[C], the
multi-strategy θ′ that extends θ by also allowing north is now sound and optimally
permissive, with penalty 0. Alternatively, the randomised multi-strategy θ′′ that
picks 0.7:{north} + 0.3:{north, east} in s3 is sound for φ with penalty just 0.7.
Next, we establish several fundamental results about the permissive controller
synthesis problem. Proofs can be found in [13].
Optimality. Recall that two key parameters of the problem are the type of
multi-strategy sought (deterministic or randomised) and the type of penalty
scheme used (static or dynamic). We first note that randomised multi-strategies
are strictly more powerful than deterministic ones, i.e. they can be more permis-
sive (yield a lower penalty) for the same property φ.
Theorem 1. The answer to a permissive controller synthesis problem (for ei-
ther a static or dynamic penalty scheme) can be “no” for deterministic multi-
strategies, but “yes” for randomised ones.
This is why we explicitly distinguish between classes of multi-strategies when
defining permissive controller synthesis. This situation contrasts with classi-
cal controller synthesis, where deterministic strategies are optimal for the same
classes of properties φ. Intuitively, randomisation is more powerful in this case
because of the trade-off between rewards and penalties: similar results exist in,
for example, multi-objective controller synthesis on MDPs [14].
Second, we observe that, for the case of static penalties, the optimal penalty
value for a given property (the infimum of achievable values) may not actually
be achievable by any randomised multi-strategy.
An important feature of the MILP solvers we use is that they work incre-
mentally, producing a sequence of increasingly good solutions. Here, that means
generating a series of sound multi-strategies that are increasingly permissive. In
practice, when resources are constrained, it may be acceptable to stop early and
accept a multi-strategy that is sound but not necessarily optimally permissive.
4.1 Deterministic Multi-strategies
We first consider synthesis of deterministic multi-strategies. Here, and in the
rest of this section, we assume that the property φ is of the form R^r_{≥b}[C]. Upper
bounds on expected rewards (φ = R^r_{≤b}[C]) can be handled by negating rewards
and converting to a lower bound. For the purposes of encoding into MILP, we
rescale r and b such that sup_{σ,π} E^{σ,π}_{G,s}(r) < 1 for all s, and rescale every (non-zero)
penalty such that ψ(s, a) ≥ 1 for all s and a ∈ A(s).
Static Penalties. Fig. 2 shows an encoding into MILP of the problem of finding
an optimally permissive deterministic multi-strategy for property φ = R^r_{≥b}[C]
and a static penalty scheme (ψ, sta). The encoding uses 5 types of variables:
ys,a ∈ {0, 1}, xs ∈ R≥0, αs ∈ {0, 1}, βs,a,t ∈ {0, 1} and γt ∈ [0, 1], where s, t ∈ S
and a ∈ A. So the worst-case size of the MILP problem is O(|A|·|S|²·κ), where
κ stands for the longest encoding of a number used.
Variables ys,a encode a multi-strategy θ: ys,a =1 iff θ allows action a in s
(constraint (2) enforces at least one action per state). Variables xs represent
the worst-case expected total reward (for r) from state s, under any controller
strategy complying with θ and under any environment strategy. This is captured
by constraints (3)–(4) (which amounts to minimising the reward in an MDP).
Constraint (1) imposes the required bound of b on the reward from s.
The objective function minimises the static penalty (the sum of all local
penalties) minus the expected reward in the initial state. The latter acts as
a tie-breaker between solutions with equal penalties (but, thanks to rescaling, is
always dominated by the penalties and therefore does not affect optimality).
As an additional technicality, we need to ensure that the values of xs are the
least solution of the defining inequalities, to deal with the possibility of zero
reward loops [24]. To achieve this, we use an approach similar to the one taken
in [28]. It is sufficient to ensure that xs = 0 whenever the minimum expected
reward from s achievable under θ is 0, which is the case if and only if, starting
from s, it is possible to avoid ever taking an action with positive reward.
In our encoding, αs = 1 if xs is positive (constraint (5)). The binary variables
βs,a,t represent, for each such s and each action a allowed in s, a choice of
successor t ∈ supp(δ(s, a)) (constraint (6)). The variables γs then represent a
ranking function: if r(s, a) = 0, then γs > γt(s,a) (constraint (8)). If a positive
reward could be avoided starting from s, there would in particular be an infinite
sequence s0 , a1 , s1 , . . . with s0 = s and, for all i, si+1 = t(si , ai ) and r(si , ai ) = 0,
and therefore γsi > γsi+1 . Since S is finite, this sequence would have to enter a
loop, leading to a contradiction.
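To make the shape of the encoding concrete, the following sketch instantiates the objective and constraints (1)–(4) of Fig. 2 (reproduced below) for a hypothetical three-state game, using the open-source PuLP modeller rather than the solvers used for the paper. To keep the sketch short, the zero-reward-loop constraints (5)–(8) are replaced by pinning x to 0 at the absorbing sink, which happens to suffice for this particular toy game; all data is invented and already assumed rescaled as described above (rewards below 1, penalties at least 1).

```python
# Sketch of the static-penalty MILP, constraints (1)-(4), for a toy game.
import pulp

S_ctrl, S_env = ["s0"], ["s1", "s2"]
A = {"s0": ["a", "b"], "s1": ["c"], "s2": ["d"]}
delta = {("s0", "a"): {"s1": 1.0}, ("s0", "b"): {"s2": 1.0},
         ("s1", "c"): {"s0": 0.5, "s2": 0.5}, ("s2", "d"): {"s2": 1.0}}
r = {("s0", "a"): 0.4, ("s0", "b"): 0.1, ("s1", "c"): 0.0, ("s2", "d"): 0.0}
psi = {("s0", "a"): 2.0, ("s0", "b"): 1.0}   # static penalties
s_init, b = "s0", 0.3                        # reward bound

m = pulp.LpProblem("permissive_static", pulp.LpMinimize)
y = {(s, a): pulp.LpVariable(f"y_{s}_{a}", cat="Binary")
     for s in S_ctrl for a in A[s]}
x = {s: pulp.LpVariable(f"x_{s}", lowBound=0) for s in S_ctrl + S_env}

# objective: penalties of blocked actions, with -x(s_init) as tie-breaker
m += -x[s_init] + pulp.lpSum((1 - y[s, a]) * psi[s, a]
                             for s in S_ctrl for a in A[s])
m += x["s2"] <= 0         # stand-in for constraints (5)-(8) in this toy game
m += x[s_init] >= b                                            # (1)
for s in S_ctrl:
    m += pulp.lpSum(y[s, a] for a in A[s]) >= 1                # (2)
    for a in A[s]:                                             # (3)
        m += x[s] <= pulp.lpSum(p * x[t] for t, p in delta[s, a].items()) \
                     + r[s, a] + (1 - y[s, a])
for s in S_env:
    for a in A[s]:                                             # (4)
        m += x[s] <= pulp.lpSum(p * x[t] for t, p in delta[s, a].items())

m.solve(pulp.PULP_CBC_CMD(msg=False))
# expected outcome: allow 'a', block 'b' (penalty 1), worst-case reward 0.8
print({k: int(v.value()) for k, v in y.items()}, x[s_init].value())
```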
Minimise:  −x_s + Σ_{s∈S♦} Σ_{a∈A(s)} (1 − y_{s,a})·ψ(s, a)   subject to:

    x_s ≥ b                                                                 (1)
    Σ_{a∈A(s)} y_{s,a} ≥ 1                               for all s ∈ S♦      (2)
    x_s ≤ Σ_{t∈S} δ(s, a)(t)·x_t + r(s, a) + (1 − y_{s,a})   for all s ∈ S♦, a ∈ A(s)   (3)
    x_s ≤ Σ_{t∈S} δ(s, a)(t)·x_t                         for all s ∈ S□, a ∈ A(s)   (4)

Fig. 2. MILP encoding for deterministic multi-strategies and static penalties (objective and constraints (1)–(4))

Dynamic Penalties. Next, we show how to compute an optimally permissive
sound multi-strategy for a dynamic penalty scheme (ψ, dyn). This case is more
subtle since the optimal penalty can be infinite. Hence, our solution proceeds
in two steps as follows. Initially, we determine if there is some sound multi-
strategy. For this, we just need to check for the existence of a sound strategy,
using standard algorithms for solution of stochastic games [12,15].
If there is no sound multi-strategy, we are done. If there is, we use the MILP
problem in Fig. 3 to determine the penalty for an optimally permissive sound
multi-strategy. This MILP encoding extends the one in Fig. 2 for static penalties,
adding, for each state s, a variable for the local penalty and a variable zs for the
expected penalty, and three extra sets of constraints. Equations (9) and (10) define
the expected penalty in controller states, which is the sum of penalties for all
disabled actions and those in the successor states, multiplied by their transition
probability. The behaviour of environment states is captured by Equation (11),
where we only maximise the penalty, without incurring any penalty locally.
The constant c in (10) is chosen to be no lower than any finite penalty achievable
by a deterministic multi-strategy, a possible value being
Σ_{i=0}^{∞} (1 − p^{|S|})^i · p^{|S|} · i · |S| · pen_max, where p is the smallest non-zero probability assigned by δ,
and pen_max is the maximal local penalty over all states. If the MILP problem has
a solution, this is the optimal dynamic penalty over all sound multi-strategies.
If not, no deterministic sound multi-strategy has finite penalty and the optimal
penalty is ∞ (recall that we established there is some sound multi-strategy).
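Assuming the series for c is read as written above, it admits a simple closed form, |S| · pen_max · (1 − p^{|S|}) / p^{|S|}; the following quick check with hypothetical values of p, |S| and pen_max compares a truncated series with this closed form.

```python
# Closed form of the series bounding c (all numeric values hypothetical):
# sum_{i>=0} (1 - p^|S|)^i * p^|S| * i * |S| * pen_max
#   = |S| * pen_max * (1 - p^|S|) / p^|S|
p, n_states, pen_max = 0.5, 4, 3.0
q = p ** n_states
c_closed = n_states * pen_max * (1 - q) / q
c_series = sum((1 - q) ** i * q * i * n_states * pen_max for i in range(10_000))
print(c_closed, c_series)   # the truncated series approaches the closed form
```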
Fig. 4. Transformed game for approximating randomised multi-strategies (Section 4.2)
In practice, we might choose a lower value of c than the one above, resulting in
a multi-strategy that is sound, but possibly not optimally permissive.
The following result states that, by varying the granularity M , we can get
arbitrarily close to the optimal penalty for a randomised multi-strategy and, for
the case of static penalties, defines a suitable choice of M .
Theorem 7. Let θ be a sound multi-strategy. For any ε > 0, there is an M and
a sound multi-strategy θ′ of granularity M satisfying pen_t(ψ, θ′) − pen_t(ψ, θ) ≤ ε.
Moreover, for static penalties it suffices to take M = ⌈(Σ_{s∈S,a∈A(s)} ψ(s, a)) / ε⌉.
5 Experimental Results
We have implemented our techniques within PRISM-games [9], an extension of
the PRISM model checker for performing model checking and strategy synthe-
sis on stochastic games. PRISM-games can thus already be used for (classical)
controller synthesis problems on stochastic games. To this, we add the ability
to synthesise multi-strategies using the MILP-based method described in Sec-
tion 4. Our implementation currently uses CPLEX to solve MILP problems. It
also supports SCIP and lp_solve, but in our experiments (run on a PC with a
1.7GHz i7 Core processor and 4GB RAM) these were slower in all cases.
We investigated the applicability and performance of our approach on a va-
riety of case studies, some of which are existing benchmark examples and some
of which were developed for this work. These are described in detail below and
the files used can be found online [29].
Deterministic Multi-strategy Synthesis. We first discuss the generation
of optimal deterministic multi-strategies, the results of which are summarised
in Table 1. In each row, we first give details of the model: the case study, any
parameters used, the number of states (|S|) and of controller states (|S♦ |). Then,
we show the property φ used, the penalty value of the optimal multi-strategy
and the time to generate it. Below, we give further details for each case study,
illustrating the variety of ways that permissive controller synthesis can be used.
† See Table 1 for parameter names.
∗ Sound but possibly non-optimal multi-strategy obtained after 5-minute MILP time-out.
6 Conclusions
We have presented a framework for permissive controller synthesis on stochastic
two-player games, based on generation of multi-strategies that guarantee a spec-
ified objective and are optimally permissive with respect to a penalty function.
We proved several key properties, developed MILP-based synthesis methods and
evaluated them on a set of case studies. Topics for future work include synthesis
for more expressive temporal logics and using history-dependent multi-strategies.
Acknowledgements. The authors are in part supported by ERC Advanced Grant
VERIWARE and EPSRC projects EP/K038575/1 and EP/F001096/1.
References
1. Behrmann, G., Cougnard, A., David, A., Fleury, E., Larsen, K.G., Lime, D.:
UPPAAL-tiga: Time for playing games! In: Damm, W., Hermanns, H. (eds.) CAV
2007. LNCS, vol. 4590, pp. 121–125. Springer, Heidelberg (2007)
2. Bernet, J., Janin, D., Walukiewicz, I.: Permissive strategies: from parity games to
safety games. ITA 36(3), 261–275 (2002)
3. Bouyer, P., Duflot, M., Markey, N., Renault, G.: Measuring permissivity in finite
games. In: Bravetti, M., Zavattaro, G. (eds.) CONCUR 2009. LNCS, vol. 5710, pp.
196–210. Springer, Heidelberg (2009)
4. Bouyer, P., Markey, N., Olschewski, J., Ummels, M.: Measuring permissiveness in
parity games: Mean-payoff parity games revisited. In: Bultan, T., Hsiung, P.-A.
(eds.) ATVA 2011. LNCS, vol. 6996, pp. 135–149. Springer, Heidelberg (2011)
5. Calinescu, R., Ghezzi, C., Kwiatkowska, M., Mirandola, R.: Self-adaptive software
needs quantitative verification at runtime. CACM 55(9), 69–77 (2012)
6. Calinescu, R., Johnson, K., Kikuchi, S.: Compositional reverification of probabilis-
tic safety properties for large-scale complex IT systems. In: LSCITS (2012)
7. Canny, J.: Some algebraic and geometric computations in PSPACE. In: Proc.
STOC 1988, pp. 460–467. ACM, New York (1988)
8. Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: Automatic verifi-
cation of competitive stochastic systems. In: Flanagan, C., König, B. (eds.) TACAS
2012. LNCS, vol. 7214, pp. 315–330. Springer, Heidelberg (2012)
9. Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: PRISM-games: A
model checker for stochastic multi-player games. In: Piterman, N., Smolka, S.A.
(eds.) TACAS 2013. LNCS, vol. 7795, pp. 185–191. Springer, Heidelberg (2013)
10. Chen, T., Kwiatkowska, M., Parker, D., Simaitis, A.: Verifying team formation
protocols with probabilistic model checking. In: Leite, J., Torroni, P., Ågotnes,
T., Boella, G., van der Torre, L. (eds.) CLIMA XII 2011. LNCS, vol. 6814, pp.
190–207. Springer, Heidelberg (2011)
11. Chen, T., Kwiatkowska, M., Simaitis, A., Wiltsche, C.: Synthesis for multi-
objective stochastic games: An application to autonomous urban driving. In: Joshi,
K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) QEST 2013. LNCS, vol. 8054,
pp. 322–337. Springer, Heidelberg (2013)
12. Condon, A.: On algorithms for simple stochastic games. In: Advances in Compu-
tational Complexity Theory. DIMACS Series, vol. 13, pp. 51–73 (1993)
13. Draeger, K., Forejt, V., Kwiatkowska, M., Parker, D., Ujma, M.: Permissive con-
troller synthesis for probabilistic systems. Technical Report CS-RR-14-01, Depart-
ment of Computer Science, University of Oxford (2014)
14. Etessami, K., Kwiatkowska, M., Vardi, M., Yannakakis, M.: Multi-objective model
checking of Markov decision processes. LMCS 4(4), 1–21 (2008)
15. Filar, J., Vrieze, K.: Competitive Markov Decision Processes. Springer (1997)
16. Forejt, V., Kwiatkowska, M., Norman, G., Parker, D., Qu, H.: Quantitative multi-
objective verification for probabilistic systems. In: Abdulla, P.A., Leino, K.R.M.
(eds.) TACAS 2011. LNCS, vol. 6605, pp. 112–127. Springer, Heidelberg (2011)
17. Garey, M.R., Graham, R.L., Johnson, D.S.: Some NP-complete geometric problems.
In: STOC 1976, pp. 10–22. ACM, New York (1976)
18. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects of Computing 6(5), 512–535 (1994)
19. Kemeny, J., Snell, J., Knapp, A.: Denumerable Markov Chains. Springer (1976)
20. Kumar, R., Garg, V.: Control of stochastic discrete event systems modeled by
probabilistic languages. IEEE Trans. Automatic Control 46(4), 593–606 (2001)
21. Lahijanian, M., Wasniewski, J., Andersson, S., Belta, C.: Motion planning and
control from temporal logic specifications with probabilistic satisfaction guarantees.
In: Proc. ICRA 2010, pp. 3227–3232 (2010)
22. McIver, A., Morgan, C.: Results on the quantitative mu-calculus qMu. ACM Trans-
actions on Computational Logic 8(1) (2007)
23. Ozay, N., Topcu, U., Murray, R., Wongpiromsarn, T.: Distributed synthesis of
control protocols for smart camera networks. In: Proc. ICCPS 2011 (2011)
24. Puterman, M.: Markov Decision Processes: Discrete Stochastic Dynamic Program-
ming. John Wiley and Sons (1994)
25. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley & Sons
(1998)
26. Shankar, N.: A tool bus for anytime verification. In: Usable Verification (2010)
27. Steel, G.: Formal analysis of PIN block attacks. TCS 367(1-2), 257–270 (2006)
28. Wimmer, R., Jansen, N., Ábrahám, E., Becker, B., Katoen, J.-P.: Minimal critical
subsystems for discrete-time Markov models. In: Flanagan, C., König, B. (eds.)
TACAS 2012. LNCS, vol. 7214, pp. 299–314. Springer, Heidelberg (2012); Extended
version available as technical report SFB/TR 14 AVACS 88
29. http://www.prismmodelchecker.org/files/tacas14pcs/
Precise Approximations of the Probability
Distribution of a Markov Process in Time:
An Application to Probabilistic Invariance
1 Introduction
Verification techniques and tools for deterministic, discrete time, finite-state sys-
tems have been available for many years [9]. Formal methods in the stochastic
context are typically limited to discrete-state structures, either in continuous or
in discrete time [3, 12]. Stochastic processes evolving over continuous (uncount-
able) spaces are often related to undecidable problems (the exception being
when they admit analytical solutions). It is thus of interest to resort to formal
approximation techniques that allow solving corresponding problems over finite
discretizations of the original models. In order to relate the approximate solu-
tions to the original problems, it is of interest to come up with precise bounds on
the error introduced by the approximations. The use of formal approximation
techniques for such complex models can be looked at from the perspective of the
research on abstraction techniques, which are of wide use in formal verification.
Successful numerical schemes based on Markov chain approximations of sto-
chastic systems in continuous time have been introduced in the literature, e.g.
[10]. However, the finite abstractions are only related to the original models
asymptotically (at the limit), with no explicit error bounds. This approach has
Suppose that the initial state of the Markov process M is random and dis-
tributed according to the density function π0 : S → R≥0. The state distribution
of M at time t ∈ N := {1, 2, 3, . . .} is characterized by a density function
πt : S → R≥0, which fully describes the statistics of the process at t and is in
particular such that, for all A ∈ B(S),
    P(s(t) ∈ A) = ∫_A πt(s) ds,
where the symbol P is loosely used to indicate the probability associated to events
over the product space S^{t+1} with elements s = [s(0), s(1), . . . , s(t)], whereas the
bold typeset is used consistently in the sequel to indicate vectors.
The state density functions πt (·) can be computed recursively, as follows:
    πt+1(s̄) = ∫_S ts(s̄|s) πt(s) ds   ∀s̄ ∈ S.   (1)
In practice the forward recursion in (1) rarely yields a closed form for the density
function πt+1 (·). A special instance where this is the case is represented by a
linear dynamical system perturbed by Gaussian process noise: due to the closure
property of the Gaussian distribution with respect to addition and multiplication
by a constant, it is possible to explicitly write recursive formulas for the mean and
the variance of the distribution, and thus express in a closed form the distribution
in time of the solution of the model. In more general cases, it is necessary to
numerically (hence, approximately) compute the density function of the model
in time.
This article provides a numerical approximation of the density function of
M as the probability mass function (pmf) of a finite-state Markov chain Mf in
time. The Markov chain Mf is obtained as an abstraction of the concrete Markov
process M. The abstraction is associated with a guaranteed and tunable error
bound, and algorithmically it leverages a state-space partitioning procedure. The
procedure is comprised of two steps:
where the parameters a, σ > 0, whereas b ∈ R, and such that w(·) is a process
comprised of independent, identically distributed random variables with a stan-
dard normal distribution. The initial state of the process is selected uniformly in
the bounded interval [β0 , γ0 ] ⊂ R. The solution of the model is a Markov process,
evolving over the state space S = R, and fully characterized by the conditional
density function
    ts(s̄|s) = φσ(s̄ − a s − b),   where   φσ(u) = e^{−u²/(2σ²)} / (σ√(2π)).
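As a numerical illustration of the forward recursion (1) for this example, the density can be propagated on a discretisation grid; this is a sketch with hypothetical values of a, b, σ, β0, γ0, and it assumes the linear dynamics s(t+1) = a·s(t) + b + σ·w(t) that are consistent with the stated kernel.

```python
# Discretised forward recursion (1) for the Gaussian kernel of Example 1.
import numpy as np

a, b, sigma = 0.8, 0.0, 0.1
beta0, gamma0 = -1.0, 1.0            # support of the uniform initial density

grid = np.linspace(-3, 3, 1201)      # truncated state space for the quadrature
h = grid[1] - grid[0]

def kernel(s_bar, s):
    """t_s(s_bar | s) = phi_sigma(s_bar - a*s - b)."""
    u = s_bar - a * s - b
    return np.exp(-u**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

K = kernel(grid[:, None], grid[None, :])   # K[i, j] = t_s(grid[i] | grid[j])

pi = np.where((grid >= beta0) & (grid <= gamma0), 1.0 / (gamma0 - beta0), 0.0)
for t in range(5):
    pi = K @ pi * h    # pi_{t+1}(s_bar) = integral of t_s(s_bar|s) * pi_t(s) ds
print(np.trapz(pi, grid))            # total probability mass stays close to 1
```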
We raise the following assumptions in order to be able to later relate the state
density function of M to the probability mass function of Mf .
Assumption 1. For given sets Γ ⊂ S² and Λ0 ⊂ S, there exist positive constants
ε and ε0, such that ts(s̄|s) and π0(s) satisfy the following conditions:
    ts(s̄|s) ≤ ε   ∀(s, s̄) ∈ S²\Γ,    and    π0(s) ≤ ε0   ∀s ∈ S\Λ0.   (2)
Assumption 2. The density functions π0 (s) and ts (s̄|s) are (globally) Lipschitz
continuous, namely there exist finite constants λ0 , λf , such that the following
Lipschitz continuity conditions hold:
    |π0(s) − π0(s′)| ≤ λ0 ‖s − s′‖   ∀s, s′ ∈ Λ0,   (3)
    |ts(s̄|s) − ts(s̄′|s)| ≤ λf ‖s̄ − s̄′‖   ∀s, s̄, s̄′ ∈ S.   (4)
Moreover, there exists a finite constant Mf such that
    Mf = sup{ ∫_S ts(s̄|s) ds : s̄ ∈ S }.   (5)
In the sequel the function IA (·) denotes the indicator function of a set A ⊆ S,
namely IA (s) = 1, if s ∈ A; else IA (s) = 0.
Example 1 (Continued). Select the interval Λ0 = [β0, γ0] and define the set Γ
by the linear inequality
    Γ = {(s, s̄) ∈ R² : |s̄ − as − b| ≤ ασ}.
The initial density function π0 of the process can be represented by the function
ψ0(s) = I_{[β0,γ0]}(s)/(γ0 − β0).
Then Assumption 1 holds with ε = φ1(α)/σ and ε0 = 0. The constant Mf in
Assumption 2 is equal to 1/a. Lipschitz continuity, as per (3) and (4), holds for
constants λ0 = 0 and λf = 1/(σ²√(2πe)).
Theorem 2. Suppose that the state space of the process M has been truncated
to the set Υ = ∪_{t=0}^{N} Λt. Let us introduce the following recursion to compute
functions μt : S → R≥0 as an approximation of the density functions πt:
    μt+1(s̄) = I_Υ(s̄) ∫_S ts(s̄|s) μt(s) ds,    μ0(s) = I_{Λ0}(s) π0(s),   ∀s̄ ∈ S.   (9)
[Figure: the set Γ in the (s, s̄) plane, bounded by b − ασ and b + ασ, together with the sets Λt (on the s-axis) and Λt+1 (on the s̄-axis)]
The initial distribution of Mf is the pmf p0 = [p0(1), p0(2), . . . , p0(n+1)], and it
is obtained from π0 as p0(i) = ∫_{Ai} π0(s) ds, ∀i ∈ N_{n+1}. Then the pmf associated
to the state distribution of Mf at time t can be computed as pt = p0 P^t.
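The following sketch illustrates this pmf propagation for Example 1. The transition matrix here is built from representative points of the partition cells, which is an assumption of this sketch (the precise construction of Mf is given in the paper but not reproduced in this excerpt); all numeric choices are hypothetical.

```python
# Sketch of a finite abstraction for Example 1 and the propagation p_t = p_0 P^t.
import numpy as np
from math import erf, sqrt

a, b, sigma = 0.8, 0.0, 0.1
lo, hi, n = -2.0, 2.0, 400          # truncated set Upsilon = [lo, hi], n cells
edges = np.linspace(lo, hi, n + 1)
reps = 0.5 * (edges[:-1] + edges[1:])     # cell centres as representatives

def gauss_cdf(x, mean):
    return 0.5 * (1 + erf((x - mean) / (sigma * sqrt(2))))

# n+1 states: the n cells plus one absorbing sink for S \ Upsilon
P = np.zeros((n + 1, n + 1))
for i, z in enumerate(reps):
    mean = a * z + b
    probs = np.array([gauss_cdf(edges[j + 1], mean) - gauss_cdf(edges[j], mean)
                      for j in range(n)])
    P[i, :n] = probs
    P[i, n] = 1.0 - probs.sum()     # mass leaving the truncated set
P[n, n] = 1.0                       # sink is absorbing

# p_0 from a uniform initial density on [beta0, gamma0] = [-1, 1]
beta0, gamma0 = -1.0, 1.0
p0 = np.array([max(0.0, min(edges[i + 1], gamma0) - max(edges[i], beta0))
               for i in range(n)] + [0.0]) / (gamma0 - beta0)

p_t = p0 @ np.linalg.matrix_power(P, 5)    # pmf of the abstraction at t = 5
print(p_t[:n].sum(), p_t[n])               # mass retained vs. lost to the sink
```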
It is intuitive that the discrete pmf pt of the Markov chain Mf approximates
the continuous density function πt of the Markov process M. In the rest of the
section we show how to formalize this relationship: pt is used to construct an
Lemma 1. Suppose that the inequality in (4) holds. Then the state density func-
tions πt (·) are globally Lipschitz continuous with constant λf for all t ∈ N:
    ψt(s) = Σ_{i=1}^{n} ( pt(i) / L(Ai) ) · I_{Ai}(s)   ∀t ∈ N,   (11)
    ‖πt − ψt‖∞ ≤ εt + Et   ∀t ∈ N,   (12)
    Et+1 = Mf Et + λf δ,    E0 = λ0 δ,   (13)
and δ is an upper bound on the diameters of the partition sets {Ai}_{i=1}^{n}, namely
δ = sup{ ‖s − s′‖ : s, s′ ∈ Ai, i ∈ N_n }.
Note that the functions ψt are defined over the whole state space S, but (11)
implies that they are equal to zero outside the set Υ .
Corollary 1. The recursion in (13) admits the explicit solution
    Et = ( κ(t, Mf)·λf + Mf^t·λ0 )·δ,
This is clearly due to the fact that we are operating on the dynamics of M
truncated over the set Υ . It is thus intuitive that the approximation procedure
and the derived error bounds are also valid for the case of sub-stochastic density
functions, namely
? ?
ts (s̄|s)ds̄ ≤ 1 ∀s ∈ S, π0 (s)ds ≤ 1,
S S
the only difference being that the obtained Markov chain Mf is as well sub-
stochastic.
Further, whenever the Lipschitz continuity requirement on the initial density
function, as per (3) in Assumption 2, does not hold, (for instance, this is the
case when the initial state of the process is deterministic) we can relax this
continuity assumption on the initial distribution of the process by starting the
discrete computation from the time step t = 1. In this case we define the pmf
p1 = [p1(1), p1(2), . . . , p1(n + 1)], where
    p1(i) = ∫_{Ai} ∫_S ts(s̄|s) π0(s) ds ds̄   ∀i ∈ N_{n+1},
and derive pt = p1 P^{t−1} for all t ∈ N. Theorem 3 follows along similar lines,
except for eqn. (13), where the initial error is set to E0 = 0 and the time-
dependent terms Et can be derived as Et = κ(t, Mf)·λf·δ.
It is important to emphasize the computability of the derived errors and
the fact that they can be tuned. Further, in order to attain abstractions that
are practically useful, it is imperative to seek improvements on the derived error
bounds: in particular, the approximation errors can be computed locally (under
corresponding local Lipschitz continuity assumptions), following the procedures
discussed in [7].
Example 1 (Continued). The error of the proposed Markov chain abstraction can
be expressed as
    ‖πt − ψt‖∞ ≤ κ(t, Mf)·( δ/(σ²√(2πe)) + φ1(α)/σ ),    where Mf = 1/a.
Fig. 2. Piece-wise constant approximation of the state density function ψt(·), compared
to the actual function πt(·) (derived analytically), for t = 1, . . . , 5, with a = 1.2 (left) and a = 0.8 (right)
    |ts(s̄|s) − ts(s̄|s′)| ≤ λb ‖s − s′‖   ∀s, s′, s̄ ∈ A.
A finite constant Mb is introduced as Mb = sup_{s∈A} ∫_A ts(s̄|s) ds̄ ≤ 1.
The procedure introduces a partition of the safe set A = ∪_{i=1}^{n} Ai and extends
it to S = ∪_{i=1}^{n+1} Ai, with An+1 = S\A. Then it selects arbitrary representative
where δ is the maximal diameter of the partition sets and L(A) is the Lebesgue measure of the set A.
For the numerical simulation we select a safety set A = [0, 1], a noise level
σ = 0.1, and a time horizon N = 10. The solution of the safety problem for the
two cases a = 1.2 and a = 0.8 is plotted in Figure 3. We have computed constants
λf = 24.20, Mb = 1 in both cases, while λb = 29.03, Mf = 0.83 for the first case,
and λb = 19.36, Mf = 1.25 for the second case. We have selected the center
of the partition sets (distributed uniformly over the set A) as representative
points for Markov chain Mb . In order to compare the two approaches, we have
assumed the same computational effort (related to the same partition size of
δ = 0.7 × 10−4 ), and have obtained an error Ef = 0.008, Eb = 0.020 for a =
1.2 and Ef = 0.056, Eb = 0.014 for a = 0.8. The simulations show that the
forward approach works better for a = 1.2, while the backward approach is
better suited for a = 0.8. Note that the approximate solutions provided by
the two approaches are very close: the difference of the transition probabilities
computed via the Markov chains Mf , Mb are in the order of 10−8 , and the
difference in the approximate solutions (black curve in Figure 3) is in the order
of 10−6 . This has been due to the selection of very fine partition sets that have
resulted in small abstraction errors.
Fig. 3. Approximate solution of the probabilistic invariance problem ps(A) as a function of the initial state s (black line), together with error intervals of the forward (red band) and backward (blue band) approaches, for a = 1.2 (left) and a = 0.8 (right)
Theorem 4. Assume that the initial density function π0 (s) is bounded and that
the constant Mf is finite and Mf < 1. If the state space is unbounded, the
sequence of density functions {πt (s)|t ≥ 0} uniformly exponentially converges to
zero. The sequence of probabilities P{s(t) ∈ A} and the corresponding solution
of the safety problem for any compact safe set A exponentially converge to zero.
Theorem 4 indicates that under the invoked assumptions the probability “spreads
out” over the unbounded state space as time progresses. Moreover, the theorem en-
sures the absence of absorbing sets [16, 17], which are indeed known to characterize
the solution of infinite-horizon properties. Example 2 studies the relationship be-
tween constant Mf and the stability of linear stochastic difference equations.
is invertible. Then the implicit function theorem guarantees the existence and
uniqueness of a function g : Rn × Rn → Rn such that w(t) = g(s(t + 1), s(t)).
The conditional density function of the system in this case is [14]:
    ts(s̄|s) = | det( ∂g/∂s̄ (s̄, s) ) | · tw(g(s̄, s)).
The Lipschitz constants λf, λb are determined by the dependence of the function g(s̄, s)
on the variables s̄ and s, respectively. As a special case, the invertibility of ∂f/∂w is
guaranteed for systems with additive process noise, namely f(s, w) = fa(s) + w.
Then g(s̄, s) = s̄ − fa(s), λf is the Lipschitz constant of tw(·), while λb is the
product of the Lipschitz constants of tw(·) and of fa(·).
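For the additive-noise reading of Example 1, λf is thus the Lipschitz constant of the scaled noise density φσ, i.e. max|φσ′(u)| = 1/(σ²√(2πe)); the following quick numerical check (my own sketch, not from the paper) recovers the value λf = 24.20 reported in Section 5 for σ = 0.1.

```python
# Numerical check that the Lipschitz constant of the Gaussian density phi_sigma
# equals 1 / (sigma^2 * sqrt(2*pi*e)).
import numpy as np

sigma = 0.1
u = np.linspace(-1, 1, 2_000_001)
phi = np.exp(-u**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
slope = np.max(np.abs(np.diff(phi) / np.diff(u)))   # steepest finite-difference slope
closed_form = 1 / (sigma**2 * np.sqrt(2 * np.pi * np.e))
print(slope, closed_form)    # both are about 24.20 for sigma = 0.1
```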
References
1. Abate, A., Katoen, J.-P., Lygeros, J., Prandini, M.: Approximate model checking
of stochastic hybrid systems. European Journal of Control 6, 624–641 (2010)
2. Abate, A., Prandini, M., Lygeros, J., Sastry, S.: Probabilistic reachability and
safety for controlled discrete time stochastic hybrid systems. Automatica 44(11),
2724–2734 (2008)
3. Baier, C., Katoen, J.-P., Hermanns, H.: Approximate symbolic model checking of
continuous-time Markov chains (Extended abstract). In: Baeten, J.C.M., Mauw, S.
(eds.) CONCUR 1999. LNCS, vol. 1664, pp. 146–162. Springer, Heidelberg (1999)
4. Esmaeil Zadeh Soudjani, S., Abate, A.: Adaptive gridding for abstraction and veri-
fication of stochastic hybrid systems. In: Proceedings of the 8th International Con-
ference on Quantitative Evaluation of Systems, Aachen, DE, pp. 59–69 (September
2011)
5. Esmaeil Zadeh Soudjani, S., Abate, A.: Higher-Order Approximations for Verifica-
tion of Stochastic Hybrid Systems. In: Chakraborty, S., Mukund, M. (eds.) ATVA
2012. LNCS, vol. 7561, pp. 416–434. Springer, Heidelberg (2012)
6. Esmaeil Zadeh Soudjani, S., Abate, A.: Probabilistic invariance of mixed
deterministic-stochastic dynamical systems. In: ACM Proceedings of the 15th In-
ternational Conference on Hybrid Systems: Computation and Control, Beijing,
PRC, pp. 207–216 (April 2012)
7. Esmaeil Zadeh Soudjani, S., Abate, A.: Adaptive and sequential gridding proce-
dures for the abstraction and verification of stochastic processes. SIAM Journal on
Applied Dynamical Systems 12(2), 921–956 (2013)
8. Koutsoukos, X., Riley, D.: Computational methods for reachability analysis of sto-
chastic hybrid systems. In: Hespanha, J.P., Tiwari, A. (eds.) HSCC 2006. LNCS,
vol. 3927, pp. 377–391. Springer, Heidelberg (2006)
9. Kurshan, R.P.: Computer-Aided Verification of Coordinating Processes: The
Automata-Theoretic Approach. Princeton Series in Computer Science. Princeton
University Press (1994)
10. Kushner, H.J., Dupuis, P.G.: Numerical Methods for Stochastic Control Problems
in Continuous Time. Springer, New York (2001)
11. Kvasnica, M., Grieder, P., Baotić, M.: Multi-parametric toolbox, MPT (2004)
12. Kwiatkowska, M., Norman, G., Segala, R., Sproston, J.: Verifying quantitative
properties of continuous probabilistic timed automata. In: Palamidessi, C. (ed.)
CONCUR 2000. LNCS, vol. 1877, pp. 123–137. Springer, Heidelberg (2000)
13. Mitchell, I.M.: Comparing forward and backward reachability as tools for safety
analysis. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007. LNCS,
vol. 4416, pp. 428–443. Springer, Heidelberg (2007)
14. Papoulis, A.: Probability, Random Variables, and Stochastic Processes, 3rd edn.
Mcgraw-hill (1991)
15. Prandini, M., Hu, J.: Stochastic reachability: Theory and numerical approximation.
In: Cassandras, C.G., Lygeros, J. (eds.) Stochastic Hybrid Systems. Automation
and Control Engineering Series, vol. 24, pp. 107–138. Taylor & Francis Group/CRC
Press (2006)
16. Tkachev, I., Abate, A.: On infinite-horizon probabilistic properties and stochastic
bisimulation functions. In: Proceedings of the 50th IEEE Conference on Decision
and Control and European Control Conference, Orlando, FL, pp. 526–531 (De-
cember 2011)
17. Tkachev, I., Abate, A.: Characterization and computation of infinite-horizon specifications over Markov processes. Theoretical Computer Science 515, 1–18 (2014)
SACO: Static Analyzer for Concurrent Objects
1 Introduction
With the trend towards parallel systems and the emergence of multi-core computing,
the construction of tools that help in analyzing and verifying the behaviour
of concurrent programs has become fundamental. Concurrent programs contain
several processes that work together to perform a task and communicate with
each other. Communication can be programmed using shared variables or mes-
sage passing. When shared variables are used, one process writes into a variable
that is read by another; when message passing is used, one process sends a mes-
sage that is received by another. Shared-memory communication is typically
implemented using low-level concurrency and synchronization primitives. Such
programs are in general more difficult to write, debug and analyze, while their main
advantage is efficiency. The message passing model uses higher-level concurrency
constructs that help in producing concurrent applications in a less error-prone
way and also more modularly. Message passing is the essence of actors [1], the
concurrency model used in concurrent objects [9], in Erlang, and in Scala.
This paper presents the SACO system, a S tatic Analyzer for C oncurrent
O bjects. Essentially, each concurrent object is a monitor and allows at most
one active task to execute within the object. Scheduling among the tasks of
an object is cooperative, or non-preemptive, such that the active task has to
release the object lock explicitly (using the await instruction). Each object has
an unbounded set of pending tasks. When the lock of an object is free, any task
in the set of pending tasks can grab the lock and start executing. When the
result of a call is required by the caller to continue executing, the caller and the
on v. However, the await at L38 synchronizes the execution of This with the com-
pletion of the task retrieveCoins in v by means of the future variable f. Namely, at
the await, if the task spawned at L37 has not finished, the processor is released and
any available task on the This object could take it. The result of the execution of
retrieveCoins is obtained by means of the blocking get instruction which blocks the
execution of This until the future variable f is ready. In general, the use of get can
introduce deadlocks. In this case, the await at L38 ensures that retrieveCoins has
finished and thus the execution will not block.
Points-to Analysis. Inferring the set of memory locations to which a reference
variable may point to is a classical analysis in object-oriented languages. In
SACO we follow Milanova et al. [11] and abstract objects by the sequence of
allocation sites of all objects that lead to its creation. E.g., if we create an
object o1 at program point pp1 , and afterwards call a method of o1 that creates
an object o2 at program point pp2 , then the abstract representation of o2 is
pp1 .pp2 . In order to ensure termination of the inference process, the analysis is
parametrized by k, the maximal length of these sequences. In the example, for
any k ≥ 2, assuming that the allocation site of the This object is ψ, the points-to
analysis abstracts v and out to ψ.35 and ψ.34, respectively. For k = 1, they would
be abstracted to 35 and 34. As variables can be reused, the information that
the analysis gives is specified at the program point level. Basically, the analysis
results are defined by a function P(op , pp, v) which for a given (abstract) object
op , a program point pp and a variable v, it returns the set of abstract objects
to which v may point to. For instance, P(ψ, 36, v) = 35 should be read as: when
executing This and instruction L36 is reached, variable v points to an object
whose allocation site is 35. Besides, we can trivially use the analysis results to find
out to which task a future variable f is pointing to. I.e., P(op , pp, f ) = o.m where
o is an abstract object and m a method name, e.g., P(ψ, 37, f ) = 35.retrieveCoins .
Points-to analysis allows making any analysis object-sensitive [11]. In addition, in
SACO we use it: (1) in the resource analysis in order to know to which object the
cost must be attributed, and (2) in the deadlock analysis, where the abstraction
of future variables above is used to spot dependencies among tasks.
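As a toy illustration of this abstraction (not SACO's implementation), the k-limiting of allocation-site sequences can be sketched as follows, reusing the sites ψ, 34 and 35 from the example discussed above.

```python
# Toy sketch of the k-limited allocation-site abstraction: an object is
# abstracted by the sequence of allocation sites leading to its creation,
# truncated to the last k sites.
def abstract_object(creation_sites, k):
    """Keep at most the last k allocation sites of the creation chain."""
    return tuple(creation_sites[-k:])

# the This object was allocated at site 'psi'; it creates v at line 35 and
# out at line 34 (sites taken from the example in the text)
this_obj = ("psi",)
v_obj = this_obj + ("35",)
out_obj = this_obj + ("34",)

for k in (1, 2):
    print(k, abstract_object(v_obj, k), abstract_object(out_obj, k))
# k=2 yields ('psi', '35') / ('psi', '34'); k=1 collapses them to ('35',) / ('34',)
```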
May-Happen-in-Parallel. An MHP analysis [10,3] provides a safe approximation
of the set of pairs of statements that can execute in parallel across several objects,
or in an interleaved way within an object. MHP allows ensuring the absence of data
races, i.e., of situations in which several objects access the same data in parallel and at
least one of them modifies such data. Also, it is crucial for improving the accuracy of deadlock,
termination and resource analysis. The MHP analysis implemented in SACO [3]
can be understood as a function MHP(op , pp) which returns the set of program
points that may happen in parallel with pp when executing in the abstract object
op . A remarkable feature of our analysis is that it performs a local analysis of meth-
ods followed by a composition of the local results, and it has a polynomial complex-
ity. In our example, SACO infers that the execution of showIncome (L2) cannot hap-
pen in parallel with any instruction in retrieveCoins (L18–L27), since retrieveCoins
must be finished in the await at L38. Similarly, it also reveals that showCoin (L3)
cannot happen in parallel with showIncome. On the other hand, SACO detects that
the await (L24) and the assignment (L16) may happen in parallel. This could be a
problem for the termination of retrieveCoins, as the shared variable coins that con-
trols the loop may be modified in parallel, but our termination analysis can over-
come this difficulty. Since the result of the MHP analysis refines the control-flow,
we could also consider applying the MHP and points-to analyses continuously to
refine the results of each other. In SACO we apply them only once.
3 Advanced Analyses
Termination Analysis. The main challenge is in handling shared-memory con-
current programs. When execution interleaves from one task to another, the
shared-memory may be modified by the interleaved task. The modifications can
affect the behavior of the program and change its termination behavior and its
resource consumption. Inspired by the rely-guarantee principle used for com-
positional verification and analysis [12,5] of thread-based concurrent programs,
SACO incorporates a novel termination analysis for concurrent objects [4] which
assumes a property on the global state in order to prove termination of a loop
and, then, proves that this property holds. The property to prove is the finiteness
of the shared data involved in the termination proof, i.e., proving that such
shared memory is updated a finite number of times. Our analysis is based on a
circular style of reasoning since the finiteness assumptions are proved by proving
termination of the loops in which that shared-memory is modified. Crucial for
accuracy is the use of the information inferred by the MHP analysis which allows
us to restrict the set of program points on which the property has to be proved
to those that may actually interleave its execution with the considered loop.
Consider the function retrieveCoins from Sec. 2. At the await (L24) the value
of the shared variable coins may change, since other tasks may take the object’s
lock and modify coins. In order to prove termination, the analysis first assumes
that coins is updated a finite number of times. Under this assumption the loop is
terminating because eventually the value of coins will stop being updated by other
tasks, and then it will decrease at each iteration of the loop. The second step is
to prove that the assumption holds, i.e., that the instructions updating coins are
executed a finite number of times. The only update instruction that may happen
in parallel with the await is in insertCoin (L16), which is called from insertCoins,
which in turn is called from main. Since these three functions are terminating (their termination
can be proved without any assumption), the assumption holds and therefore
retrieveCoins terminates. Similarly, the analysis can prove the termination of the
other functions, thus proving the whole program terminating.
Resource Analysis. SACO can measure different types of costs (e.g., number
of execution steps, memory created, etc.) [2]. In the output, it returns upper
bounds on the worst-case cost of executing the concurrent program. The results
of our termination analysis provide useful information for cost: if the program
is terminating then the size of all data is bounded (we use x+ to refer to the
maximal value for x). Thus, we can give cost bounds in terms of the maximum
and/or minimum values that the involved data can reach. Still, we need novel
References
1. Agha, G.A.: Actors: A Model of Concurrent Computation in Distributed Systems.
MIT Press, Cambridge (1986)
2. Albert, E., Arenas, P., Genaim, S., Gómez-Zamalloa, M., Puebla, G.: Cost Analysis
of Concurrent OO Programs. In: Yang, H. (ed.) APLAS 2011. LNCS, vol. 7078,
pp. 238–254. Springer, Heidelberg (2011)
3. Albert, E., Flores-Montoya, A.E., Genaim, S.: Analysis of May-Happen-in-Parallel
in Concurrent Objects. In: Giese, H., Rosu, G. (eds.) FMOODS/FORTE 2012.
LNCS, vol. 7273, pp. 35–51. Springer, Heidelberg (2012)
4. Albert, E., Flores-Montoya, A., Genaim, S., Martin-Martin, E.: Termination and
Cost Analysis of Loops with Concurrent Interleavings. In: Van Hung, D., Ogawa,
M. (eds.) ATVA 2013. LNCS, vol. 8172, pp. 349–364. Springer, Heidelberg (2013)
5. Cook, B., Podelski, A., Rybalchenko, A.: Proving Thread Termination. In: PLDI
2007, pp. 320–330. ACM (2007)
6. Flores-Montoya, A.E., Albert, E., Genaim, S.: May-Happen-in-Parallel Based Dead-
lock Analysis for Concurrent Objects. In: Beyer, D., Boreale, M. (eds.) FMOODS/FORTE 2013. LNCS, vol. 7892, pp. 273–288. Springer, Heidelberg (2013)
7. Giachino, E., Laneve, C.: Analysis of Deadlocks in Object Groups. In: Bruni, R.,
Dingel, J. (eds.) FMOODS/FORTE 2011. LNCS, vol. 6722, pp. 168–182. Springer,
Heidelberg (2011)
8. http://research.microsoft.com/en-us/um/cambridge/projects/terminator/
9. Johnsen, E.B., Hähnle, R., Schäfer, J., Schlatte, R., Steffen, M.: ABS: A Core
Language for Abstract Behavioral Specification. In: Aichernig, B.K., de Boer, F.S.,
Bonsangue, M.M. (eds.) FMCO 2011. LNCS, vol. 6957, pp. 142–164. Springer,
Heidelberg (2011)
10. Lee, J.K., Palsberg, J.: Featherweight X10: A Core Calculus for Async-Finish Par-
allelism. In: PPoPP 2010, pp. 25–36. ACM (2010)
11. Milanova, A., Rountev, A., Ryder, B.G.: Parameterized Object Sensitivity for
Points-to Analysis for Java. ACM Trans. Softw. Eng. Methodol. 14, 1–41 (2005)
12. Popeea, C., Rybalchenko, A.: Compositional Termination Proofs for Multi-
threaded Programs. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS,
vol. 7214, pp. 237–251. Springer, Heidelberg (2012)
VeriMAP: A Tool for Verifying Programs
through Transformations
the task of guaranteeing that VeriMAP computes sound results, as the soundness
of the transformation rules can be proved once and for all, before performing
any verification using VeriMAP.
[Fig. 1 (architecture of VeriMAP, figure omitted): the Transformation Strategies are built from Unfolding Operators, Generalization Operators, Constraint Solvers, and Replacement Rules, and are parameterized by the Constraint Domain and the Data Theory.]
with respect to the CLP representation of the program and safety property gen-
erated by C2CLP (that is, the clauses defining at, phiInit, and phiError).
The output of the specialization process is the CLP representation of the VC’s.
This specialization process is said to ‘remove the interpreter’ in the sense that
it removes every reference to the predicates used in the CLP definition of the
interpreter in favour of new predicates corresponding to (a subset of) the ‘pro-
gram points’ of the original C program. Indeed, the structure of the call-graph
of the CLP program generated by the VCG module corresponds to that of the
control-flow graph of the C program.
The IV module consists of two submodules: (i) the Unfold/Fold Transformer,
and (ii) the Analyzer. The Unfold/Fold Transformer propagates the constraints
occurring in the definition of phiInit and phiError through the input VC’s
thereby deriving a new, equisatisfiable set of VC’s. The Analyzer checks the
satisfiability of the VC’s by performing a lightweight analysis. The output of this
analysis is either (i) true, if the VC’s are satisfiable, and hence the program is
safe, or (ii) false, if the VC’s are unsatisfiable, and hence the program is unsafe
(and a counterexample may be extracted), or (iii) unknown, if the lightweight
analysis is unable to determine whether or not the VC's are satisfiable. In this last case the verification continues by iterating the propagation of constraints, that is, by invoking the Unfold/Fold Transformer submodule again. At each iteration,
the IV module can also apply a Reversal transformation [4], with the effect of
reversing the direction of the constraint propagation (either from phiInit to
phiError or vice versa, from phiError to phiInit).
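This iteration follows a simple control loop. The Python sketch below is hypothetical pseudocode, not VeriMAP's actual interfaces: propagate and analyze merely stand in for the Unfold/Fold Transformer and the Analyzer. It makes the structure explicit: propagate constraints, run the lightweight analysis, and on an unknown verdict reverse the propagation direction and iterate.

# Illustrative sketch of the iterated verification loop (not VeriMAP code).

def iterated_verification(vcs, propagate, analyze, max_iterations=10):
    direction = "forward"                      # from phiInit towards phiError
    for _ in range(max_iterations):
        vcs = propagate(vcs, direction)        # derive an equisatisfiable set of VC's
        verdict = analyze(vcs)                 # 'safe', 'unsafe', or 'unknown'
        if verdict in ("safe", "unsafe"):
            return verdict
        # Reversal transformation: flip the direction of constraint propagation.
        direction = "backward" if direction == "forward" else "forward"
    return "unknown"

# Example with trivial stand-ins: a propagation that leaves the VC's unchanged
# and an analysis that always gives up.
print(iterated_verification(["vc"], lambda v, d: v, lambda v: "unknown"))  # unknown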
The VCG and IV modules are realized by using MAP [14], a transformation
engine for CLP programs (written in SICStus Prolog), with suitable concrete
versions of Transformation Strategies. There are various versions of the transfor-
mation strategies which, as indicated in [4], are defined in terms of: (i) Unfold-
ing Operators, which guide the symbolic evaluation of the VC’s, by controlling
the expansion of the symbolic execution trees, (ii) Generalization Operators [6],
which guarantee termination of the Unfold/Fold Transformer and are used (together with widening and convex-hull operations) for the automatic discovery of loop invariants, (iii) Constraint Solvers, which check satisfiability and entailment
within the Constraint Domain at hand (for example, the integers or the ratio-
nals), and (iv) Replacement Rules, which guide the application of the axioms
and the properties of the Data Theory under consideration (like, for example,
the theory of arrays), and their interaction with the Constraint Domain.
Usage. VeriMAP can be downloaded from http://map.uniroma2.it/VeriMAP
and can be run by executing the following command: ./VeriMAP program.c,
where program.c is the C program annotated with the property to be verified.
VeriMAP has options for applying custom transformation strategies and for
exiting after the execution of the C2CLP or VCG modules, or after the execution
of a given number of iterations of the IV module.
3 Experimental Evaluation
Table 1. Verification results using VeriMAP, ARMC, HSF(C), and TRACER. Time
is in seconds. The time limit for timeout is five minutes. (∗) These errors are due to
incorrect parsing, or excessive memory requirements, or similar other causes.
                     VeriMAP      ARMC    HSF(C)   TRACER (SPost)  TRACER (WPre)
correct answers          185       138       160        91             103
  safe problems          154       112       138        74              85
  unsafe problems         31        26        22        17              18
incorrect answers          0         9         4        13              14
  missed bugs              0         1         1         0               0
  false alarms             0         8         3        13              14
errors (*)                 0        18         0        20              22
timeout                   31        51        52        92              77
total time          10717.34  15788.21  15770.33  27757.46        23259.19
average time           57.93    114.41     98.56    305.03          225.82
The results of the experiments show that our approach is competitive with
state-of-the-art verifiers. Besides the above benchmark set, we have used Ver-
iMAP on a small benchmark set of verification problems of C programs acting
on integers and arrays. These problems include programs for computing the
maximum elements of arrays and programs for performing array initialization,
array copy, and array search. Also for this benchmark, the results we have ob-
tained show that our transformational approach is effective and quite efficient
in practice.
All experiments have been performed on an Intel Core Duo E7300 2.66 GHz processor with 4 GB of memory running GNU/Linux, using a time limit of five
minutes. The source code of all the verification problems we have considered is
available at http://map.uniroma2.it/VeriMAP.
4 Future Work
The current version of VeriMAP deals with safety properties of a subset of the
C language where, in particular, pointers and recursive procedures do not occur.
Moreover, the user is only allowed to configure the transformation strategies by
choosing among some available submodules for unfolding, generalization, con-
straint solving, and replacement rules (see Figure 1). Future work will be devoted to making VeriMAP a more flexible tool so that the user may configure other parameters, such as: (i) the programming language and its semantics, (ii) the class
of properties and their proof rules (thus generalizing an idea proposed in [8]),
and (iii) the theory of the data types in use, including those for dynamic data
structures, such as lists and heaps.
References
1. Beyer, D.: Second Competition on Software Verification (SV-COMP 2013). In:
Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 594–609.
Springer, Heidelberg (2013)
2. Bjørner, N., McMillan, K., Rybalchenko, A.: On solving universally quantified Horn
clauses. In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 105–
125. Springer, Heidelberg (2013)
3. De Angelis, E., Fioravanti, F., Pettorossi, A., Proietti, M.: Verification of impera-
tive programs by constraint logic program transformation. In: SAIRP 2013, Elec-
tronic Proceedings in Theoretical Computer Science, vol. 129, pp. 186–210 (2013)
4. De Angelis, E., Fioravanti, F., Pettorossi, A., Proietti, M.: Verifying Programs via
Iterated Specialization. In: PEPM 2013, pp. 43–52. ACM (2013)
5. Fioravanti, F., Pettorossi, A., Proietti, M.: Transformation rules for locally strat-
ified constraint logic programs. In: Bruynooghe, M., Lau, K.-K. (eds.) Program
Development in Computational Logic. LNCS, vol. 3049, pp. 291–339. Springer,
Heidelberg (2004)
6. Fioravanti, F., Pettorossi, A., Proietti, M., Senni, V.: Generalization strategies for
the verification of infinite state systems. Theory and Practice of Logic Program-
ming 13(2), 175–199 (2013)
7. Grebenshchikov, S., Gupta, A., Lopes, N.P., Popeea, C., Rybalchenko, A.: HSF(C):
A software verifier based on Horn clauses. In: Flanagan, C., König, B. (eds.) TACAS
2012. LNCS, vol. 7214, pp. 549–551. Springer, Heidelberg (2012)
8. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing soft-
ware verifiers from proof rules. In: PLDI 2012, pp. 405–416. ACM (2012)
9. Gulavani, B.S., Chakraborty, S., Nori, A.V., Rajamani, S.K.: Automatically re-
fining abstract interpretations. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS
2008. LNCS, vol. 4963, pp. 443–458. Springer, Heidelberg (2008)
10. Gupta, A., Rybalchenko, A.: InvGen: An efficient invariant generator. In: Boua-
jjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 634–640. Springer,
Heidelberg (2009)
11. Hoder, K., Bjørner, N., de Moura, L.: µZ– An efficient engine for fixed points with
constraints. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806,
pp. 457–462. Springer, Heidelberg (2011)
574 E. De Angelis et al.
12. Hojjat, H., Konečný, F., Garnier, F., Iosif, R., Kuncak, V., Rümmer, P.: A verifi-
cation toolkit for numerical transition systems. In: Giannakopoulou, D., Méry, D.
(eds.) FM 2012. LNCS, vol. 7436, pp. 247–251. Springer, Heidelberg (2012)
13. Jaffar, J., Murali, V., Navas, J.A., Santosa, A.E.: TRACER: A symbolic execution
tool for verification. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS,
vol. 7358, pp. 758–766. Springer, Heidelberg (2012)
14. The MAP system, http://www.iasi.cnr.it/~proietti/system.html
15. McMillan, K.L., Rybalchenko, A.: Solving constrained Horn clauses using interpolation. MSR Technical Report 2013-6, Microsoft Research (2013)
16. Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: Intermediate language
and tools for analysis and transformation of C programs. In: Horspool, R.N. (ed.)
CC 2002. LNCS, vol. 2304, pp. 209–265. Springer, Heidelberg (2002)
17. Peralta, J.C., Gallagher, J.P., Saglam, H.: Analysis of imperative programs through
analysis of Constraint Logic Programs. In: Levi, G. (ed.) SAS 1998. LNCS,
vol. 1503, pp. 246–261. Springer, Heidelberg (1998)
18. Podelski, A., Rybalchenko, A.: ARMC: The logical choice for software model check-
ing with abstraction refinement. In: Hanus, M. (ed.) PADL 2007. LNCS, vol. 4354,
pp. 245–259. Springer, Heidelberg (2007)
19. De Angelis, E., Fioravanti, F., Pettorossi, A., Proietti, M.: Verifying Array Pro-
grams by Transforming Verification Conditions. In: McMillan, K.L., Rival, X. (eds.)
VMCAI 2014. LNCS, vol. 8318, pp. 182–202. Springer, Heidelberg (2014)
CIF 3: Model-Based Engineering
of Supervisory Controllers
1 Introduction
A supervisory controller coordinates the behavior of a (cyber-physical) system
from discrete-event observations of its state. Based on such observations the
supervisory controller decides on the activities that the uncontrolled system can
safely perform or on the activities that (are more likely to) lead to acceptable
system behavior. Engineering of supervisory controllers is a challenging task in practice, among other reasons because of the high complexity of the uncontrolled system.
In model-based engineering, models are used in the design process, instead of
directly implementing a solution. The Compositional Interchange Format (CIF)
is an automata-based modeling language that supports the entire model-based
engineering development process of supervisory controllers, including modeling, supervisory controller synthesis (deriving a controller from its requirements), simulation-based validation, verification, visualization, real-time testing, and code generation. CIF 3 is a substantially enhanced new version of CIF, after CIF
1 [BRRS08] and CIF 2 [NBR12]. It has been improved based on feedback from
industry, as well as new theoretical advances. The various versions of CIF have
been developed in European projects HYCON, HYCON2, Multiform, and C4C.
CIF is actively being developed by the Manufacturing Networks Group (until recently named the Systems Engineering Group) of the [...]
Discrete events, from the state observed at the hybrid plant. They can be interpreted as virtual sensors by the controller, abstracting away timed behavior. Examples are a timeout, or an event that signals that a certain combination of values of physical quantities has occurred.
[Figure omitted: the control architecture relating the controller, sensor and actuator events, the hybrid observer, the timed/hybrid plant, and sensor and actuator variables; the depicted workflow includes verification, the hybrid observer, and code generation.]
Fig. 2 depicts the workflow of the simplified framework for model-based engineering [...] plant model and the control requirements. The supervisory controller is com-
bined with the hybrid observer, resulting in the actual controller (observer-based
supervisor ). This model is used for model-based validation, by means of real-
time interactive simulation and visualization, based on a user-supplied image of the system. This increases confidence that the models fulfill the expected
properties. The mentioned simulation-based visualization can also be used for
validating the other models, such as the discrete-event and hybrid plant models.
As a final step, actual real-time control code is generated for the implementation of the controller.
4 Applications
CIF has been used in an industrial context for a number of years now. We
mention some of the more prominent applications.
– Development of a coordinator for maintenance procedures for a high-tech
Océ printer [MJB+ 10]
– Improving evolvability of a patient communication control system using
state-based supervisory control synthesis [TBR12]
– Application of supervisory control theory to theme park vehicles [FMSR12]
– Supervisory control of MRI subsystems [Geu12]
– Design of a supervisory controller for a Philips MRI-scanner [Dij13]
5 Future Developments
CIF is constantly being improved and extended. A planned extension of CIF is
the addition of point-to-point communication by means of channels. Our experi-
ence with industrial cases has shown that these are well suited to model physical
movements of objects. The channels will be fully integrated into the language.
For instance, supervisors will be able to prohibit channel communications.
For verification, we intend to support a larger class of CIF models for the
transformation to Uppaal. We will develop model transformations to other model
checking tools as experimented with in [MR12a]. For performance analysis we are
considering model transformations to MRMC and/or PRISM [MR12b, MER13].
The Manufacturing Networks Group also works on extensions of supervisory
control theory, such as the domain of plant models to which synthesis may be
applied, and the expressivity of the logic for requirements. See [HFR13] for a
first publication of this line of research. As soon as such extensions reach an acceptable level of maturity, they will be incorporated into the CIF tooling.
References
[ACH+ 95] Alur, R., Courcoubetis, C., Halbwachs, N., Henzinger, T.A., Ho, P.-H.,
Nicollin, X., Olivero, A., Sifakis, J., Yovine, S.: The algorithmic analysis
of hybrid systems. Theoretical Computer Science 138(1), 3–34 (1995)
[AGH+ 00] Alur, R., Grosu, R., Hur, Y., Kumar, V., Lee, I.: Modular specification of
hybrid systems in CHARON. In: Lynch, N.A., Krogh, B.H. (eds.) HSCC
2000. LNCS, vol. 1790, pp. 6–19. Springer, Heidelberg (2000)
[BHSR13] van Beek, D.A., Hendriks, D., Swartjes, L., Reniers, M.A.: Report on the
extensions of the CIF and transformation algorithms. Technical Report HY-
CON Deliverable D6.2.4 (2013)
[BRRS08] van Beek, D.A., Reniers, M.A., Rooda, J.E., Schiffelers, R.R.H.: Concrete
syntax and semantics of the Compositional Interchange Format for hybrid
systems. In: IFAC World Congress 2008, pp. 7979–7986, IFAC (2008)
[CL07] Cassandras, C.G., Lafortune, S.: Introduction to Discrete Event Systems,
2nd edn. Springer (2007)
[Dij13] van Dijk, D.: Supervisory control of a Philips MRI-scanner. Master’s thesis,
Eindhoven University of Technology (2013)
[FMSR12] Forschelen, S.T.J., van de Mortel-Fronczak, J.M., Su, R., Rooda, J.E.: Ap-
plication of supervisory control theory to theme park vehicles. Discrete
Event Dynamic Systems 22(4), 511–540 (2012)
[Geu12] Geurts, J.W.P.: Supervisory control of MRI subsystems. Master’s thesis,
Eindhoven University of Technology (2012)
[Hen00] Henzinger, T.A.: The theory of hybrid automata. In: Verification of Digital
and Hybrid Systems. NATO ASI Series F: Computer and Systems Science,
vol. 170, pp. 265–292. Springer (2000)
[HFR13] van Hulst, A., Fokkink, W.J., Reniers, M.A.: Maximal synthesis for
Hennessy-Milner Logic. In: ACSD 2013, pp. 1–10. IEEE (2013)
[JT10] John, K.H., Tiegelkamp, M.: IEC 61131-3: Programming Industrial Au-
tomation Systems, 2nd edn. Springer (2010)
[Kam13] Kamphuis, R.H.J.: Design and real-time implementation of a supervisory
controller for baggage handling at Veghel Airport. Master’s thesis, Eind-
hoven University of Technology (2013)
[LSV01] Lynch, N., Segala, R., Vaandrager, F.: Hybrid I/O automata revisited.
In: Di Benedetto, M.D., Sangiovanni-Vincentelli, A.L. (eds.) HSCC 2001.
LNCS, vol. 2034, pp. 403–417. Springer, Heidelberg (2001)
[MER13] Markovski, J., Estens Musa, E.S., Reniers, M.A.: Extending a synthesis-
centric model-based systems engineering framework with stochastic model
checking. ENTCS 296, 163–181 (2013)
[MJB+ 10] Markovski, J., Jacobs, K.G.M., van Beek, D.A., Somers, L.J.A.M., Rooda,
J.E.: Coordination of resources using generalized state-based requirements.
In: WODES 2010, pp. 300–305. IFAC (2010)
[MR12a] Markovski, J., Reniers, M.A.: An integrated state- and event-based frame-
work for verifying liveness in supervised systems. In: ICARCV 2012, pp.
246–251. IEEE (2012)
[MR12b] Markovski, J., Reniers, M.A.: Verifying performance of supervised plants.
In: ACSD 2012, pp. 52–61. IEEE (2012)
[NBR12] Nadales Agut, D.E., van Beek, D.A., Rooda, J.E.: Syntax and semantics of
the compositional interchange format for hybrid systems. Journal of Logic
and Algebraic Programming 82(1), 1–52 (2012)
[NRS+ 11] Nadales Agut, D.E., Reniers, M.A., Schiffelers, R.R.H., Jørgensen, K.Y.,
van Beek, D.A.: A semantic-preserving transformation from the Composi-
tional Interchange Format to UPPAAL. In: IFAC World Congress 2011, pp.
12496–12502, IFAC (2011)
[TBR12] Theunissen, R.J.M., van Beek, D.A., Rooda, J.E.: Improving evolvability
of a patient communication control system using state-based supervisory
control synthesis. Advanced Engineering Informatics 26(3), 502–515 (2012)
[The05] The MathWorks, Inc. Writing S-functions, version 6 (2005),
http://www.mathworks.com
[WR87] Wonham, W.M., Ramadge, P.J.: On the supremal controllable sublanguage
of a given language. SIAM Journal on Control and Optimization 25(3),
637–659 (1987)
EDD: A Declarative Debugger
for Sequential Erlang Programs
1 Introduction
Declarative debugging, also known as algorithmic debugging, is a well-known technique that requires from the user only knowledge about the intended behavior of the program, that is, the expected results of the program computations, abstracting away the execution details and hence presenting a declarative approach. It
has been successfully applied in logic [5], functional [6], and object-oriented [4]
programming languages. In [3,2] we presented a declarative debugger for se-
quential Erlang. These works gave rise to EDD, the Erlang Declarative Debugger
presented in this paper. EDD has been developed in Erlang. EDD, its documenta-
tion, and several examples are available at https://github.com/tamarit/edd
(check the README.md file for installing the tool).
As usual in declarative debugging the tool is started by the user when an
unexpected result, called the error symptom, is found. The debugger then builds
internally the so-called debugging tree, whose nodes correspond to the auxiliary
computations needed to obtain the error symptom. Then the user is questioned
(Footnote: Research supported by EU project FP7-ICT-610582 ENVISAGE, Spanish projects StrongSoft (TIN2012-39391-C04-04), DOVES (TIN2008-05624), and VIVAC (TIN2012-38137), and Comunidad de Madrid PROMETIDOS (S2009/TIC-1465). Salvador Tamarit was partially supported by research project POLCA, Programming Large Scale Heterogeneous Infrastructures (610686), funded by the European Union, STREP FP7.)
about the validity of some tree nodes until the error is found. In our proposal, the debugger first concentrates on the function calls that occurred during the computation. The goal is to find a function call that returned an invalid result, but such that all the function calls occurring in the function body returned valid results. The associated node is called a buggy node. We prove in [3] that such a function is a wrong function, and that every program producing an error symptom1 contains at least one wrong function. An important novelty of our
debugger w.r.t. similar tools is that it allows using zoom debugging to detect an
erroneous fragment of code inside the wrong function. At this stage the user is
required to answer questions about the validity of certain variable matchings,
or about the branch that should be selected in a case/if statement for a given
context. The theoretical results in [2] ensure that this code is indeed erroneous,
and that a wrong function always contains an erroneous statement.
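The buggy-node condition can be phrased operationally. The following Python sketch is an illustration only, not EDD's implementation: the is_valid oracle stands in for the user's answers, and the toy tree mirrors the buggy square function used later in the paper.

# Illustrative sketch of locating a buggy node in a debugging tree (not EDD code).
# A node records a computation, e.g. a function call together with its result.

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def find_buggy_node(root, is_valid):
    """Assuming root's result is invalid, return an invalid node
    all of whose children are valid (a buggy node)."""
    for child in root.children:
        if not is_valid(child.label):          # user answers 'no'
            return find_buggy_node(child, is_valid)
    return root                                 # all children valid: root is buggy

tree = Node("square(3) = 15",
            [Node("square(2) = 8",
                  [Node("square(1) = 3",
                        [Node("square(0) = 0")])])])
wrong = {"square(3) = 15", "square(2) = 8", "square(1) = 3"}
print(find_buggy_node(tree, lambda label: label not in wrong).label)   # square(1) = 3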
The rest of the paper is organized as follows: Section 2 introduces Erlang and
EDD. Section 3 describes the questions that can be asked by the tool and the
errors that can be detected. Section 4 concludes and presents the future work.
In this section we introduce some pieces of Erlang [1] which are relevant for
our presentation. At the same time we introduce the basic features of our tool.
Erlang is a concurrent language with a sequential subset that is a functional
language with dynamic typing and strict evaluation. Programs are structured
using modules, which contain functions defined by collections of clauses.
Example 1. Obtain the square of a number X without using products. This is possible by defining Y = X − 1 and considering X² = (Y + 1)² = Y² + Y + Y + 1.
-module(mathops).
-export([square/1]).
square(0) -> 0;
square(X) when X>0 -> Y=X-1, DoubleY=X+X, square(Y)+DoubleY+1.
Observe that variables start with an uppercase letter or underscore. In order
to evaluate a function call Erlang scans sequentially the function clauses until a
match is found. Then, the variables occurring in the head are bound. In our ex-
ample the second clause of the function square is erroneous: the underlined sub-
term X+X should be Y+Y. Using this program we check that mathops:square(3)
is unexpectedly evaluated to 15. Then, we can start EDD, obtaining the debug-
ging session in Fig. 1, where the user answers are boxed. Section 3.1 explains all
the possible answers to the debugger questions. Here we only use ‘n’ (standing
for ‘no’), indicating that the result is invalid, and ‘y’ (standing for ‘yes’), indicat-
ing that it is valid. After two questions the debugger detects that the tree node
containing the call mathops:square(1) is buggy (it produces and invalid result
while its only child mathops:square(0) returns a valid result). Consequently,
1 Note that, if the module has multiple errors that compensate each other, there is no error symptom and hence declarative debugging cannot be applied.
> edd:dd("mathops:square(3)").
mathops:square(1) = 3? n
mathops:square(0) = 0? y
Call to a function that contains an error:
mathops:square(1) = 3
Please, revise the second clause:
square(X) when X > 0 -> Y=X-1, DoubleY=X+X, square(Y)+DoubleY+1.
Continue the debugging session inside this function? y
In the function square(1) matching with second clause succeed.
Is this correct? y
Given the context: X = 1
the following variable is assigned: DoubleY = 2? n
This is the reason for the error:
Variable DoubleY is badly assigned 2 in the expression:
DoubleY = X + X (line 4).
the tool points out the second clause of square as wrong. Next, the user is asked
if zoom debugging must be used. The user agrees with inspecting the code as-
sociated to the buggy node function call. The debugger proceeds asking about
the validity of the chosen function clause, which is right (the second one), and
about the validity of the value for DoubleY, which is incorrect (it should be 0
since X=1 implies Y=0). The session finishes pointing to this incorrect match-
ing as the source of the error. Observe that an incorrect matching is not always associated with wrong code, because it could depend on a previous value that contains an incorrect result. However, the correctness results in [3] ensure that only
matchings with real errors are displayed as errors by our tool. Note in this ses-
sion the improvement with respect to the trace, the standard debugging facility
for Erlang programs. While the trace shows every computation step, our tool
focuses first on function calls, simplifying and shortening the debugging process.
The next example shows that Erlang allows more sophisticated expressions in
the function bodies, including case or if expressions.
Example 2. Select the appropriate food taking into account different preferences.
-module(meal).
-export([food/1]).
food(Preferences) ->
case Preferences of
{vegetarian,ovo_vegetarian} -> omelette;
{vegetarian,_lacto_vegetarian} -> yogurt;
{vegetarian,vegan} -> salad;
_Else -> fish
end.
Now we can evaluate the expression meal:food({vegetarian,vegan}) and
we obtain the unexpected result yogurt. This time the first phase of the debugger
is not helpful: it points readily to the only clause of food. In order to obtain
more precise information we use zoom debugging, and the debugger asks:
– inadmissible (i): the question does not apply because the arguments should
not take these values. The statement is marked as valid.
– don’t know (d): the answer is unknown. The statement is marked as unknown,
and might be asked again if it is required for finding the buggy node.
– switch strategy (s): changes the navigation strategy. The navigation strate-
gies provided by the tool are explained below.
– undo (u): reverts to the previous question.
– abort (a): finishes the debugging session.
In the case of zoom debugging, the answer trusted does not make sense and is never available, while the answers yes, no, and inadmissible cannot be used in some situations, for instance in compound questions about case/if expressions. The rest of the answers are always available.
The tool includes a memoization feature that stores the answers yes, no,
trusted, and inadmissible, preventing the system from asking the same question
twice. It is worth noting that don’t know is used to ease the interaction with the
debugger but it may introduce incompleteness; if the debugger reaches a deadlock
due to these answers it presents two alternatives to the user: either answering
some of the discarded questions to find the buggy node or showing the possible
buggy code, depending on the answers to the nodes marked as unknown.
3.2 Strategies
As indicated in the introduction, the statements are represented in suitable debugging trees, which represent the structure of the wrong computation. The system can internally utilize two different navigation strategies [7,8], Divide & Query and Top Down Heaviest First, in order to choose the next node and therefore the next question presented to the user. Top Down selects as next node the largest child of the current node, while Divide & Query selects the node whose subtree is closest to half the size of the whole tree. In this way, Top Down sessions usually present more questions to the user, but they are presented in a logical order, while Divide & Query leads to shorter sessions of unrelated questions.
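Both selection rules can be stated in a few lines. The Python sketch below illustrates the strategies as just described; it is not EDD's actual code, and the debugging tree is represented simply as nested dictionaries.

# Illustrative sketch of the two navigation strategies (not EDD code).
# A debugging-tree node is a dict: {"label": str, "children": [nodes]}.

def size(node):
    return 1 + sum(size(c) for c in node["children"])

def top_down_heaviest(node):
    """Top Down Heaviest First: the next question is the largest child."""
    return max(node["children"], key=size, default=None)

def divide_and_query(root):
    """Divide & Query: the next question is the node whose subtree size
    is closest to half the size of the whole tree."""
    total, best, stack = size(root), root, [root]
    while stack:
        n = stack.pop()
        if abs(size(n) - total / 2) < abs(size(best) - total / 2):
            best = n
        stack.extend(n["children"])
    return best

tree = {"label": "f(3)", "children": [
    {"label": "f(2)", "children": []},
    {"label": "g(3)", "children": [{"label": "g(2)", "children": []}]}]}
print(top_down_heaviest(tree)["label"], divide_and_query(tree)["label"])  # g(3) g(3)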
– Wrong case argument, which indicates that the argument of a specific case statement has not been coded as the user expected.
– Wrong pattern, which indicates that a pattern in the function arguments or in a case/if branch is wrong.
– Wrong guard, which indicates that a guard in either a function clause or in a case/if branch is wrong.
– Wrong binding, which indicates that a variable binding is incorrect.
EDD is a declarative debugger for sequential Erlang. Program errors are found
by asking questions about the intended behavior of some parts of the program
being debugged, until the bug is found. Regarding usability, EDD provides sev-
eral features that make it a useful tool for debugging real programs, such as sup-
port for built-in functions and external libraries, anonymous (lambda) functions,
higher-order values, don’t know and undo answers, memoization, and trusting
mechanisms, among others. See [2,3] for details.
We have used this tool to debug several programs developed by others. This
gives us confidence in its robustness, but also illustrates an important point of
declarative debugging: it does not require the person in charge of debugging to know the details of the implementation; it only requires knowing the intended behavior of the functions, which is much easier and more intuitive, hence allowing a simpler form of debugging than other approaches, like tracing or breakpoints.
As future work we plan to extend this proposal to include the concurrent
features of Erlang. This extension first requires extending our calculus with these features. Then, we must identify the errors that can be detected in this
new framework, define the debugging tree, and adapt the tool to work with these
modifications.
References
1. Armstrong, J., Williams, M., Wikstrom, C., Virding, R.: Concurrent Programming
in Erlang, 2nd edn. Prentice-Hall (1996)
2. Caballero, R., Martin-Martin, E., Riesco, A., Tamarit, S.: A zoom-declarative de-
bugger for sequential Erlang programs. Submitted to the JLAP
3. Caballero, R., Martin-Martin, E., Riesco, A., Tamarit, S.: A declarative debugger
for sequential Erlang programs. In: Veanes, M., Viganò, L. (eds.) TAP 2013. LNCS,
vol. 7942, pp. 96–114. Springer, Heidelberg (2013)
4. Insa, D., Silva, J.: An algorithmic debugger for Java. In: Lanza, M., Marcus, A.
(eds.) Proc. of ICSM 2010, pp. 1–6. IEEE Computer Society (2010)
5. Naish, L.: Declarative diagnosis of missing answers. New Generation Comput-
ing 10(3), 255–286 (1992)
6. Nilsson, H.: How to look busy while being as lazy as ever: the implementation of a
lazy functional debugger. Journal of Functional Programming 11(6), 629–671 (2001)
7. Silva, J.: A comparative study of algorithmic debugging strategies. In: Puebla, G.
(ed.) LOPSTR 2006. LNCS, vol. 4407, pp. 143–159. Springer, Heidelberg (2007)
8. Silva, J.: A survey on algorithmic debugging strategies. Advances in Engineering
Software 42(11), 976–991 (2011)
APTE: An Algorithm
for Proving Trace Equivalence
Vincent Cheval
1 Introduction
Existing Tools. To our knowledge, there are only three tools that can handle
equivalence properties: ProVerif [3], SPEC [12] and AKiSs [9]. The tool
ProVerif was originally designed to prove trace properties but it can also check some equivalence properties (so-called diff-equivalence) [4] that are usually too strong to model a real intruder, since they consider that the intruder has complete knowledge of the internal states of all the honest protocol executions.
Note that this is the only tool that can handle an unbounded number of sessions
of a protocol with a large class of cryptographic primitives in practice. However,
the downside of ProVerif is that it may not terminate and it may also return a
false-negative. More recently, the tool AKiSs [9] was developed in order to decide
the trace equivalence of bounded processes that do not contain non-trivial else
branching. This tool was proved to be sound and complete and accepts a large
class of primitives but the algorithm was only conjectured to terminate. At last,
the tool SPEC [12] is based on a decision procedure for open-bisimulation for bounded processes. The scope is however limited: open-bisimulation coincides with trace equivalence only for determinate processes, and the procedure also assumes a fixed set of primitives (symmetric encryption and pairing) and pattern-based message passing, hence, in particular, no non-trivial else branching. Hence, some interesting protocols cannot be handled by these tools. This is particularly the case for the Private Authentication protocol [2] and the protocols of the electronic passport [1], since they rely on a conditional with a non-trivial else branch to be properly modelled. Moreover, even though recent work [6] led to a new release of ProVerif that can deal with the Private Authentication protocol, ProVerif is still not able to handle the protocols of the electronic passport and yields a false positive due to the overly strong equivalence that it proves.
At last, none of the existing tools takes into account the fact that an attacker can always observe the length of a message, even though it can leak information on private data. For example, in most existing encryption schemes, the length of a ciphertext depends on the length of its plaintext. Thus the ciphertext {m}k corresponding to the encryption of a message m by the key k can always be distinguished from the ciphertext {m, m}k corresponding to the encryption of the message m repeated twice by the key k. This is simply due to the fact that {m, m}k is longer than {m}k. However, these two messages would be considered as indistinguishable in all previously mentioned tools.
2 Trace Equivalence
Our tool is based on a symbolic model where the messages exchanged over the
network are represented by terms. They are built from a set of variables and
names by applying function symbols modelling the cryptographic primitives.
For example, the function symbol senc (resp. sdec) represents the symmetric encryption (resp. decryption) primitive and the term senc(m, k) models the encryption of a message m by a key k. The behaviour of each primitive is modelled by a rewriting system. As such, a term sdec(senc(m, k), k) will be rewritten to m to model the fact that decrypting a ciphertext by the key that was used to encrypt indeed yields the plain text. Moreover, to take into account the length [...]
where P, Q are processes, u, v are terms and x is a variable. The nil process is
denoted 0. The process P + Q represents the non-deterministic choice between P
and Q. The process new k is the creation of a fresh name. The process out(u, v)
represents the emission of the message v into the channel u. Similarly, in(u, x) is
the process that receives a message on the channel u and binds it to x. Typically,
an attacker can interact with the process by emitting or receiving messages from
honest participants through public channels. Hence we represent possible inter-
actions of the attacker with P by the notion of trace, that is a pair (s, σ) where
s is the sequence of actions that the attacker performs and σ is the sequence of
messages that the attacker receives from the honest participants. A process is
said to be determinate when the execution of the process is deterministic. For
example, a process containing the choice operator is not determinate.
Moreover, we say that P and Q are in length trace equivalence when the length static equivalence ∼ω is used to compare the sequences of messages σ and σ′, i.e., σ ∼ω σ′.
Intuitively, this definition indicates that whatever the actions the attacker
performs on P , the same actions can be performed on Q and the sequences of
messages obtained in both cases are indistinguishable, and conversely.
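To illustrate why observing lengths matters, and why {m}k and {m, m}k become distinguishable, here is a small Python sketch of a symbolic term algebra with the sdec(senc(m, k), k) → m rewrite rule and a simple length function. It is a toy model only, not APTE's internal representation, and the particular length convention (a ciphertext is one unit longer than its plaintext) is an assumption made for the illustration.

# Toy symbolic terms: ("senc", m, k), ("pair", a, b), ("sdec", c, k), or atoms (str).
# Not APTE's data structures; the names and the length model are assumptions.

def rewrite(term):
    """Apply the rule sdec(senc(m, k), k) -> m at the top of the term."""
    if isinstance(term, tuple) and term[0] == "sdec":
        c, k = term[1], term[2]
        if isinstance(c, tuple) and c[0] == "senc" and c[2] == k:
            return c[1]
    return term

def length(term):
    """Atoms have length 1, pairing adds lengths, and encryption adds a
    constant overhead to the length of the plaintext."""
    if isinstance(term, str):
        return 1
    if term[0] == "pair":
        return length(term[1]) + length(term[2])
    if term[0] == "senc":
        return length(term[1]) + 1
    return 1 + sum(length(t) for t in term[1:])

m, k = "m", "k"
c1 = ("senc", m, k)                        # {m}k
c2 = ("senc", ("pair", m, m), k)           # {m, m}k
print(rewrite(("sdec", c1, k)))            # 'm': decryption with the right key
print(length(c1), length(c2))              # 2 3: the lengths differ, so the two
                                           # ciphertexts are distinguishable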
Implementation details. The tool is implemented in OCaml1 and the source code has about 12K lines of code. The source code is highly modular: each mathematical notion used in the algorithm is implemented in a separate module. To facilitate any new extension and optimisation of the tool, the data structures are always hidden in the modules, i.e. we only use abstract types (sometimes called opaque types) in the interface files. The format of the comments in the interface files is that of OCamldoc, which generates a LaTeX file with the documented interfaces.
These results were obtained by using APTE on a 2.9 GHz Intel Core i7 with 8 GB of DDR3 memory.
4 Experimental Results
We have used APTE on several case studies found in the literature. Figure 1 summarises the results. In particular, we focused on the Private Authentication (PrA) protocol [2] and the protocols of the electronic passport (a description of the protocols can be found in [5]). The two key results that we obtained using APTE are a new attack on the anonymity of the Private Authentication protocol and a new attack on the unlinkability of the Passive Authentication protocol (PaA) of the electronic passport. Both attacks rely on the attacker being able to observe the length of messages. In both cases, we propose possible fixes and show their security with APTE for a few sessions of the protocols. Observe that the execution times for trace equivalence and length trace equivalence are very similar. However, depending on how many sessions are considered, the execution time varies greatly. For example, in the case of the Private Authentication protocol, one session is verified in less than a second whereas two sessions take
more than two days. Using APTE, we also rediscovered an existing attack on
the unlinkability of the Basic Access Control protocol (BAC) used in the French
electronic passport. Note that proving the unlinkability of the BAC protocol for
the UK passport took too much time and so we stopped the execution after two
days. We applied APTE to prove the anonymity of the Passive Authentication
protocol. At last, since all reachability properties can be expressed by an equivalence, APTE is also able to find the classical attack on the secrecy of the Needham-Schroeder protocol and to prove the secrecy of the Needham-Schroeder-Lowe protocol.
References
1. Machine readable travel document. Technical Report 9303, International Civil Avi-
ation Organization (2008)
2. Abadi, M., Fournet, C.: Private authentication. Theoretical Computer Sci-
ence 322(3), 427–476 (2004)
3. Blanchet, B.: An Efficient Cryptographic Protocol Verifier Based on Prolog Rules.
In: 14th Computer Security Foundations Workshop, CSFW 2001 (2001)
4. Blanchet, B., Abadi, M., Fournet, C.: Automated verification of selected equiva-
lences for security protocols. Journal of Logic and Algebraic Programming 75(1),
3–51 (2008)
5. Cheval, V.: Automatic verification of cryptographic protocols: privacy-type prop-
erties. Phd thesis. ENS Cachan, France (2012)
6. Cheval, V., Blanchet, B.: Proving more observational equivalences with ProVerif.
In: Basin, D., Mitchell, J.C. (eds.) POST 2013. LNCS, vol. 7796, pp. 226–246.
Springer, Heidelberg (2013)
7. Cheval, V., Comon-Lundh, H., Delaune, S.: Trace equivalence decision: Negative
tests and non-determinism. In: 18th ACM Conference on Computer and Commu-
nications Security, CCS 2011 (2011)
8. Cheval, V., Cortier, V., Plet, A.: Lengths may break privacy – or how to check
for equivalences with length. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS,
vol. 8044, pp. 708–723. Springer, Heidelberg (2013)
9. Ciobâcă, Ş.: Automated Verification of Security Protocols with Applications to
Electronic Voting. Thèse de doctorat, Laboratoire Spécification et Vérification.
ENS Cachan, France (December 2011)
10. Cremers, C.J.F.: Unbounded verification, falsification, and characterization of se-
curity protocols by pattern refinement. In: CCS 2008: Proceedings of the 15th ACM
Conference on Computer and Communications Security, pp. 119–128. ACM, New
York (2008)
11. Grewal, G., Ryan, M., Bursuc, S., Ryan, P.: Caveat coercitor: Coercion-evidence
in electronic voting. In: IEEE Symposium on Security and Privacy, pp. 367–381.
IEEE Computer Society (2013)
12. Tiu, A., Dawson, J.: Automating open bisimulation checking for the spi calculus.
In: Proc. 23rd IEEE Computer Security Foundations Symposium (CSF 2010), pp.
307–321. IEEE Computer Society Press (2010)
13. Viganò, L.: Automated security protocol analysis with the avispa tool. In: Pro-
ceedings of the XXI Mathematical Foundations of Programming Semantics (MFPS
2005). ENTCS, vol. 155, pp. 61–86. Elsevier (2006)
The Modest Toolset: An Integrated Environment
for Quantitative Modelling and Verification
1 Introduction
Our reliance on complex safety-critical or economically vital systems such as fly-
by-wire controllers, networked industrial automation systems or “smart” power
grids increases at an ever-accelerating pace. The necessity to study the reliab-
ility and performance of these systems is evident. Over the last two decades,
significant progress has been made in the area of formal methods to allow the
construction of mathematically precise models of such systems and automatic-
ally evaluate properties of interest on the models. Classically, model checking has
been used to study functional correctness properties such as safety or liveness.
However, since a correct system implementation may still be prohibitively slow
or energy-consuming, performance requirements need to be considered as well.
The desire to evaluate both qualitative as well as quantitative properties fostered
the development of integrative approaches that combine probabilities, real-time
aspects or costs with formal verification techniques [1].
The Modest Toolset is an integrated collection of tools for the creation
and analysis of formally specified behavioural models with quantitative aspects.
It constitutes the second generation [8] of tools revolving around the Modest
modelling language [7]. By now, it has become a versatile and extensible toolset
based on the rich semantic foundation of networks of stochastic hybrid automata
(SHA), supporting multiple input languages and multiple analysis backends.
This work is supported by the Transregional Collaborative Research Centre SFB/TR
14 AVACS, the NWO-DFG bilateral project ROCKS, and the 7th EU Framework
Programme under grant agreements 295261 (MEALS) and 318490 (SENSATION).
[Fig. 1 (figure omitted): the hierarchy of automata models subsumed by stochastic hybrid automata, obtained by adding nondeterminism, discrete probabilities, real time, continuous dynamics, and continuous probability. Key: SHA stochastic hybrid automata; PHA probabilistic hybrid automata; STA stochastic timed automata; HA hybrid automata; PTA probabilistic timed automata; TA timed automata; MDP Markov decision processes; LTS labelled transition systems; DTMC discrete-time Markov chains.]
The Modest Toolset’s aim is to incorporate the state of the art in research
on the analysis of stochastic hybrid systems and special cases thereof, such as
probabilistic real-time systems. In particular, it goes beyond the usual “research
prototype” by providing a single, stable, easy-to-install and easy-to-use package.
In this paper, we illustrate how SHA provide a unified formalism for quant-
itative modelling that subsumes a wide variety of well-known automata-based
models (Section 2); we highlight the Modest Toolset’s approach to model-
ling and model reuse through its support of three very different input languages
(Section 3); we give an overview of the available analysis backends for different
specialisations of SHA (Section 4); and we provide some background on technical
aspects of the toolset and its cross-platform user interface (Section 5).
Related work. Two tools have substantially inspired the design of the Modest
Toolset: Möbius [9] is a prominent multiple-formalism, multiple-solution
tool. Focussing on performance and dependability evaluation, its input form-
alisms include Petri nets, Markov chains and stochastic process algebras. Cadp
[11], in contrast, is a tool suite for explicit-state system verification, comprising
about fifty interoperable components, supporting various input languages and
analysis approaches. The Modest Toolset has so far focused on reusing ex-
isting tools on the analysis side whereas Möbius and Cadp rely on their own
implementations.
Fig. 2. Modelling a channel with loss probability 0.01 and transmission delay 2 (listings omitted)
[Figure omitted: architecture of the toolset: the input languages Modest, Guarded Commands, and Uppaal TA are translated into networks of stochastic hybrid automata, which are analysed by the backends prohver (using PHAVer), mcpta (using Prism), mctau (using Uppaal), and modes, producing the results.]
of the Prism [17] model checker, so its support within the Modest Toolset
allows the reuse of many existing Prism models.
Uppaal TA Uppaal is built upon a graphical interface to model (probabil-
istic) timed automata [3]. A textual language is used for expressions and to
specify the composition of components. The Modest Toolset can import
and export Uppaal TA models. It supports a useful subset of the language’s
advanced features such as parameterised templates and C-style functions.
Fig. 2 shows a comparison of the three languages for a small example PTA
model. Through the use of the intermediate networks-of-SHA representation,
models can be freely converted between the input languages.
5 An Integrated Toolset
As presented in the previous sections, the Modest Toolset consists of several
components and concepts. Several of its analysis backends have been developed
independently and presented separately before. However, it is their combination
Fig. 4. The mime graphical user interface for modelling (left) and analysis (right)
and integration that give rise to the advance in utility that the toolset presents.
This integration is visible in the main interfaces of the toolset:
mime is the toolset’s graphical user interface. It provides a modern editor
for the supported textual input languages and gives full access to the analysis
backends and their configuration. mime is cross-platform, based on web techno-
logies such as HTML5, Javascript and the WebSocket protocol. Fig. 4 shows two
screenshots of the mime interface. For scripting and automation scenarios, all
backends are also available as standalone command-line tools.
The toolset itself is built around a small set of object-oriented program-
ming interfaces for input components, SHA-to-SHA model conversions, model
restrictions (to enforce certain subsets of SHA) and analysis backends. Adding
a new input language, for example, can be accomplished by implementing the
IInputFormalism interface and providing a semantics in terms of networks of
SHA; for mime support, syntax highlighting information can be included.
The Modest Toolset is implemented in C#. This allows the same binary
distribution to run on 32- and 64-bit Windows, Mac OS and Linux machines.
Libraries with a C interface are easy to use from C#. modes uses the runtime
bytecode generation facilities in the standard Reflection.Emit namespace to
generate fast simulation code for the specific model at hand.
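As a rough picture of this extension mechanism, the sketch below is written in Python for brevity even though the toolset itself is C#. Only the interface name IInputFormalism comes from the text; the method names, the NetworkOfSHA placeholder and the toy language are assumptions made purely for illustration.

# Conceptual sketch of an input-language plug-in (illustrative; the real
# toolset is C# and its interfaces differ).
from abc import ABC, abstractmethod

class NetworkOfSHA:
    """Placeholder for the semantic object: a network of stochastic hybrid automata."""
    def __init__(self, automata):
        self.automata = automata

class IInputFormalism(ABC):
    @abstractmethod
    def file_extensions(self):
        """Extensions handled by this input language, e.g. ['.modest']."""

    @abstractmethod
    def parse(self, source):
        """Give the model a semantics in terms of networks of SHA."""

class ToyLanguage(IInputFormalism):
    def file_extensions(self):
        return [".toy"]

    def parse(self, source):
        # A real plug-in would build automata from the source text.
        return NetworkOfSHA(automata=[source.strip()])

print(ToyLanguage().parse("automaton A").automata)  # ['automaton A']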
6 Conclusion
We have presented the Modest Toolset, version 2.0, highlighting how it fa-
cilitates reuse of modelling expertise via Modest, a high-level compositional
modelling language, while allowing reuse of existing models by providing import
and export facilities for existing languages; and how it permits reuse of existing
tools by integrating them in a unified modelling and analysis environment.
The toolset and the Modest language have been used on several case studies,
most notably to analyse safety properties of a wireless bicycle brake [2] and
to evaluate stability, availability and fairness characteristics of power micro-
generation control algorithms [15]. For a more extensive list of case studies, we
refer the interested reader to [13].
The Modest Toolset, including example models, is available for download
on its website, which also provides documentation, a list of relevant publications
and the description of several case studies, at www.modestchecker.net.
References
1. Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.P.: Performance evaluation
and model checking join forces. Commun. ACM 53(9), 76–85 (2010)
2. Baró Graf, H., Hermanns, H., Kulshrestha, J., Peter, J., Vahldiek, A., Vasudevan, A.:
A verified wireless safety critical hard real-time design. In: WoWMoM. IEEE (2011)
3. Behrmann, G., David, A., Larsen, K.G.: A tutorial on uppaal. In: Bernardo, M.,
Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Heidel-
berg (2004)
4. Bogdoll, J., David, A., Hartmanns, A., Hermanns, H.: mctau: Bridging the gap
between modest and UPPAAL. In: Donaldson, A., Parker, D. (eds.) SPIN 2012.
LNCS, vol. 7385, pp. 227–233. Springer, Heidelberg (2012)
5. Bogdoll, J., Ferrer Fioriti, L.M., Hartmanns, A., Hermanns, H.: Partial order meth-
ods for statistical model checking and simulation. In: Bruni, R., Dingel, J. (eds.)
FMOODS/FORTE 2011. LNCS, vol. 6722, pp. 59–74. Springer, Heidelberg (2011)
6. Bogdoll, J., Hartmanns, A., Hermanns, H.: Simulation and statistical model check-
ing for Modestly nondeterministic models. In: Schmitt, J.B. (ed.) MMB/DFT 2012.
LNCS, vol. 7201, pp. 249–252. Springer, Heidelberg (2012)
7. Bohnenkamp, H.C., D’Argenio, P.R., Hermanns, H., Katoen, J.P.: MoDeST: A
compositional modeling formalism for hard and softly timed systems. IEEE Trans.
Software Eng. 32(10), 812–830 (2006)
8. Bohnenkamp, H.C., Hermanns, H., Katoen, J.-P.: motor: The modest Tool En-
vironment. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp.
500–504. Springer, Heidelberg (2007)
9. Courtney, T., Gaonkar, S., Keefe, K., Rozier, E., Sanders, W.H.: Möbius 2.3: An
extensible tool for dependability, security, and performance evaluation of large and
complex system models. In: DSN, pp. 353–358. IEEE (2009)
10. Frehse, G.: PHAVer: Algorithmic verification of hybrid systems past HyTech. In:
Morari, M., Thiele, L. (eds.) HSCC 2005. LNCS, vol. 3414, pp. 258–273. Springer,
Heidelberg (2005)
11. Garavel, H., Lang, F., Mateescu, R., Serwe, W.: Cadp 2011: a toolbox for the
construction and analysis of distributed processes. STTT 15(2), 89–107 (2013)
12. Hahn, E.M., Hartmanns, A., Hermanns, H., Katoen, J.P.: A compositional mod-
elling and analysis framework for stochastic hybrid systems. Formal Methods in
System Design 43(2), 191–232 (2013)
13. Hartmanns, A.: Modest - a unified language for quantitative models. In: FDL, pp.
44–51. IEEE (2012)
14. Hartmanns, A., Hermanns, H.: A Modest approach to checking probabilistic timed
automata. In: QEST, pp. 187–196. IEEE Computer Society (2009)
15. Hartmanns, A., Hermanns, H., Berrang, P.: A comparative analysis of decentralized
power grid stabilization strategies. In: Winter Simulation Conference (2012)
16. Hartmanns, A., Timmer, M.: On-the-fly confluence detection for statistical model
checking. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871,
pp. 337–351. Springer, Heidelberg (2013)
17. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic
real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
Bounds2: A Tool for Compositional
Multi-parametrised Verification
Antti Siirtola
1 Introduction
Modern software systems are not only multithreaded but also object-oriented
and component-based. Such systems have several natural parameters, such as
the number of processes and the number of data objects. Moreover, some com-
ponents, like external libraries and subsystems concurrently under construction,
may only be available in the form of interface specifications. That is why there is an evident need for verification techniques that can handle multi-parametrised systems in a compositional way.
Bounds2 is a tool that enables parametrised verification by establishing upper
bounds, i.e., cut-offs, for the values of parameters such that if there is a bug in an
implementation instance with a parameter value greater than the cut-off, then
there is an analogous bug in an implementation instance where the values of
the parameters are within the cut-offs. When using Bounds2, implementations and specifications are composed of labelled transition systems (LTSs) with explicit input and output events by using parallel composition and hiding. We can
also use several kinds of parameters: types represent the sets of the identifiers
of replicated components or the sets of data values of an arbitrary size, typed
variables refer to the identities of individual components or data values and rela-
tion symbols represent binary relations over replicated processes. Correctness is
understood either as a refinement, which can be trace inclusion [1] or alternating
simulation [2], or the compatibility of the components of the implementation [2].
Hence, Bounds2 enables compositional reasoning, too.
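The cut-off idea reduces parametrised verification to a finite enumeration: once the cut-offs are known, it suffices to verify every instance whose parameter values stay within them. The Python sketch below is purely illustrative; Bounds2's instance generator and the FDR2 and MIO Workbench back ends are of course far more involved, and check_instance is an assumed stand-in for a finite-state refinement or compatibility check.

# Illustrative sketch of cut-off based parametrised verification (not Bounds2 code).
from itertools import product

def verify_up_to_cutoffs(cutoffs, check_instance):
    """cutoffs: dict mapping each type parameter to its cut-off, e.g. {'P': 2, 'V': 1}.
    check_instance: assumed stand-in returning True iff the instance with the
    given parameter values is correct."""
    names = sorted(cutoffs)
    ranges = [range(1, cutoffs[n] + 1) for n in names]
    for values in product(*ranges):
        instance = dict(zip(names, values))
        if not check_instance(instance):
            return False, instance       # a bug within the cut-offs
    return True, None                    # correct for all parameter values

# Toy usage: a 'check' that fails once there are two processes and two variables.
ok, witness = verify_up_to_cutoffs({'P': 2, 'V': 2},
                                   lambda i: not (i['P'] >= 2 and i['V'] >= 2))
print(ok, witness)   # False {'P': 2, 'V': 2}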
The tool consists of two parts. The instance generator determines cut-offs
for the types, computes the allowed parameter values up to the cut-offs and
outputs the corresponding finite state verification tasks. It can also apply a
limited form of abstraction if the cut-offs cannot be determined otherwise. After that, the generated instances are verified by an instance checker specific to the notion of correctness. The trace refinement and compatibility checkers exploit the refinement checker FDR2 [1] to verify the instances, while the alternating simulation checker makes use of the MIO Workbench [3] refinement checker. Bounds2
is publicly available at [4].
Types P and V represent the set of the identifiers of processes and variables,
respectively, and D denotes the domain of the shared variables. Variables p, v
and d are used to refer to an individual process, a shared variable and a value of
a shared variable, respectively. The event writebeg(p,v,d) (writeend(p,v,d))
denotes that the process p starts (is finished with) writing the value d to the
shared variable v. The input events are marked by ?, the other events are outputs.
A parametrised LTS (PLTS) VarIF captures the interface of a shared variable v: only one process can access v at a time. As we let v range over all identifiers of shared variables and compose the instances of VarIF in parallel, we obtain the PLTS VarsIF, which captures the joint interface of all the shared variables.
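As a concrete picture of this interface, the small Python sketch below encodes one shared variable's transition system: from an idle state, a writebeg(p, v, d) event leads to a busy state from which only the matching writeend(p, v, d) event is enabled, so at most one process accesses v at a time. This is an illustration only, not Bounds2's input syntax, and the state representation is an assumption.

# Illustrative LTS for a single shared-variable interface (not Bounds2 syntax).
# States: 'idle' and ('busy', p, d); events are tuples ('writebeg'|'writeend', p, v, d).

def var_if_transitions(state, event):
    """Return the successor state, or None if the event is not enabled."""
    kind, p, v, d = event                  # v is fixed: this is one variable's interface
    if state == "idle" and kind == "writebeg":
        return ("busy", p, d)              # process p starts writing d to v
    if state != "idle" and kind == "writeend" and state[1] == p and state[2] == d:
        return "idle"                      # only the writing process may finish
    return None                            # any other process is blocked meanwhile

s = var_if_transitions("idle", ("writebeg", "p1", "v1", 0))
print(s)                                                    # ('busy', 'p1', 0)
print(var_if_transitions(s, ("writebeg", "p2", "v1", 1)))   # None: v is in use
print(var_if_transitions(s, ("writeend", "p1", "v1", 0)))   # 'idle'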
Suppose that we also have an implementation VarImpl of the variable inter-
face, the interface PrIF of a process p and the alphabet PrAlph of PrIF without
the write events. In order to check that (a) all the variable and process inter-
faces are compatible, i.e., they can co-operate in some environment such that
whenever an output is sent, it is matched by an input, (b) the implementation
of the variable refines its interface and (c) no two processes access the variable simultaneously, we specify the following parametrised verification tasks:
We can also define binary relations over parametric types. For example, in
order to specify a total order TO in which the processes access the shared variables
we could write as follows:
In this case, we need also a PLTS Pr2IF which describes the behaviour of the
process interface from the viewpoint of two processes p1 and p2 such that p1
comes before p2 in the total order.
Once we have proved that a system implementation refines its interface spec-
ification, we can use the specification, which is usually much smaller, in place of
the system implementation in further verification efforts. This is possible since
the input formalism of Bounds2 is compositional.
3 Novel Features
The first version of Bounds was introduced in [5] and it featured the support
for process types, relation symbols and trace refinement (tref). The novelties of
Bounds2 introduced here are fourfold.
More expressive input language. The main novelty of Bounds2 is a more ex-
pressive input language with a larger decidable fragment. This is enabled by the
introduction of a replicated choice (data types), the classification of events into
inputs and outputs, and two new notions of correctness: compatibility (comp)
and alternating simulation (altsim) [2]. A replicated choice adds to expressive-
ness, since it allows us to express components with a parametrised state space.
Earlier, it was only possible to parametrise the number of concurrent compo-
nents. Distinguishing between input and output events allows us to consider the
compatibility of PLTSs representing software interfaces and gives rise to another
refinement, the alternating simulation, which is a natural notion of the correct-
ness of software interfaces. However, the support for alternating simulation
is currently not as good as for the two other notions of correctness, since the
back-end alternating simulation checker, MIO Workbench, can only handle relatively
small models. The theoretical background of the extensions is described in [7, 8].
Improved reduction. The cut-offs computed earlier are rough structural ones.
They are fast and easy to compute but they are often far from optimal, espe-
cially in the case of data types. Therefore, Bounds2 tries to improve the cut-offs
further by analysing the instances up to the structural cut-offs. Basically, the
tool discards instances that can be obtained as a composition of smaller ones as
described in [7, 8]. This additional reduction is also sound and complete, and it
is an important enhancement over Bounds1, because the discarded instances are
always the biggest ones, which are the most expensive to verify.
4 Experimental Results
We have made several case studies with Bounds2 and compared its performance
against Bounds1 [5]. We have not compared Bounds2 with other parametrised
verification tools, since most of them are targeted to low level software with fi-
nite data [10–14] whereas our focus is on higher level applications which are not
only multithreaded but also object-oriented and component-based. The compar-
ison with other tools would be difficult anyway, since each tool solves a different
decidable fragment, and we are not aware of any other tools for parametrised
refinement checking.
For each system, the table below lists the number of types (typ), relation
symbols (rel) and variables (var) used in the model, the notion of correctness
(corr) and the structural cut-offs for types. For both versions of Bounds, the
number of instances outputted and the running times of the instance generator
(tG ) and the instance checker (tC ) are reported. For Bounds2, the former running
times are given with a single core (tG1 ) and six cores (tG6 ) being used. We can
see that the cut-offs provided by the tool are often very small and that, compared
with Bounds1, the new version not only has a broader application domain but
also operates faster and produces fewer instances that need to be checked. We can
also see that the bottleneck in the verification chain is typically not Bounds2
but the back-end finite-state verification tool. The experiments were run on a
six-core AMD Phenom II with 8 GB of memory running Ubuntu 12.04 LTS.
5 Conclusions
tool is sound and complete verification with the support for compositional rea-
soning and the possibility to parametrise both the number of processes and the
size of data types as well as the structure of the system. We believe that the
tool will be useful in the analysis of multithreaded, component-based, object-
oriented software systems, which involve both process and data parameters and
where some components may only be available in the form of interface specifications.
Hence, Bounds2 nicely complements other parametrised verification tools most
of which are targeted for low level software acting on finite data.
References
1. Roscoe, A.W.: Understanding Concurrent Systems. Springer (2010)
2. De Alfaro, L., Henzinger, T.: Interface automata. ACM SIGSOFT Software Engi-
neering Notes 26(5), 109–120 (2001)
3. Bauer, S.S., Mayer, P., Schroeder, A., Hennicker, R.: On weak modal compatibility,
refinement, and the MIO workbench. In: Esparza, J., Majumdar, R. (eds.) TACAS
2010. LNCS, vol. 6015, pp. 175–189. Springer, Heidelberg (2010)
4. Siirtola, A.: Bounds website, http://www.cs.hut.fi/u/siirtoa1/bounds
5. Siirtola, A.: Bounds: from parameterised to finite-state verification. In: Caillaud,
B., Carmona, J., Hiraishi, K. (eds.) ACSD 2011, pp. 31–35. IEEE (2011)
6. Siirtola, A.: Algorithmic Multiparameterised Verification of Safety Properties. Pro-
cess Algebraic Approach. PhD thesis, University of Oulu (2010)
7. Siirtola, A., Heljanko, K.: Parametrised compositional verification with multiple
process and data types. In: Carmona, J., Lazarescu, M.T., Pietkiewicz-Koutny, M.
(eds.) ACSD 2013, pp. 67–76. IEEE (2013)
8. Siirtola, A.: Parametrised interface automata (unpublished draft) (2013),
http://www.cs.hut.fi/u/siirtoa1/papers/pia_paper.pdf
9. Grama, A., Gupta, A., Karypis, G., Kumar, V.: Introduction to Parallel Comput-
ing. Addison Wesley (2003)
10. Delzanno, G., Raskin, J.-F., Van Begin, L.: Towards the automated verification of
multithreaded Java programs. In: Katoen, J.-P., Stevens, P. (eds.) TACAS 2002.
LNCS, vol. 2280, pp. 173–187. Springer, Heidelberg (2002)
11. Ghilardi, S., Ranise, S.: Backward reachability of array-based systems by SMT
solving: termination and invariant synthesis. Log. Meth. Comput. Sci. 6(4) (2010)
12. Kaiser, A., Kroening, D., Wahl, T.: Dynamic cutoff detection in parameterized
concurrent programs. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS,
vol. 6174, pp. 645–659. Springer, Heidelberg (2010)
13. La Torre, S., Madhusudan, P., Parlato, G.: Model-checking parameterized concur-
rent programs using linear interfaces. In: Touili, T., Cook, B., Jackson, P. (eds.)
CAV 2010. LNCS, vol. 6174, pp. 629–644. Springer, Heidelberg (2010)
14. Yang, Q., Li, M.: A cut-off approach for bounded verification of parameterized
systems. In: Kramer, J., Bishop, J., Devanbu, P.T., Uchitel, S. (eds.) ICSE 2010,
pp. 345–354. ACM (2010)
On the Correctness
of a Branch Displacement Algorithm
1 Introduction
Formulating the final statement of correctness and finding the loop invariants
have been non-trivial tasks and are, indeed, the main contribution of this paper.
It has required considerable care and fine-tuning to formulate not only the min-
imal statement required for the ulterior proof of correctness of the assembler,
but also the minimal set of invariants needed for the proof of correctness of the
algorithm.
The research presented in this paper has been executed within the CerCo
project which aims at formally verifying a C compiler with cost annotations.
The target architecture for this project is the MCS-51, whose instruction set
contains span-dependent instructions. Furthermore, its maximum addressable
memory size is very small (64 Kb), which makes it important to generate pro-
grams that are as small as possible. With this optimisation, however, comes
increased complexity and hence increased possibility for error. We must make
sure that the branch instructions are encoded correctly, otherwise the assembled
program will behave unpredictably.
All Matita files related to this development can be found on the CerCo web-
site, http://cerco.cs.unibo.it. The specific part that contains the branch
displacement algorithm is in the ASM subdirectory, in the files PolicyFront.ma,
PolicyStep.ma and Policy.ma.
The chosen target architecture of the CerCo project is the Intel MCS-51,
which features three types of branch instructions (or jump instructions; the two
terms are used interchangeably), as shown in Figure 2.
Conditional branch instructions are only available in short form, which means
that a conditional branch outside the short address range has to be encoded using
three branch instructions (for instructions whose logical negation is available, it
can be done with two branch instructions, but for some instructions this is not
the case). The call instruction is only available in absolute and long forms.
Note that even though the MCS-51 architecture is much less advanced and
much simpler than the x86-64 architecture, the basic types of branch instruction
remain the same: a short jump with a limited range, an intra-segment jump and
a jump that can reach the entire available memory.
Generally, in code fed to the assembler as input, the only difference between
branch instructions is semantics, not span. This means that a distinction is made
between an unconditional branch and the several kinds of conditional branch,
but not between their short, absolute or long variants.
The algorithm used by the assembler to encode these branch instructions into
the different machine instructions is known as the branch displacement algorithm.
The optimisation problem consists of finding as small an encoding as possible,
thus minimising program length and execution time.
Similar problems, e.g. the branch displacement optimisation problem for other
architectures, are known to be NP-complete [7,9], which could make finding an
optimal solution very time-consuming.
The canonical solution, as shown by Szymanski [9] or more recently by Dick-
son [2] for the x86 instruction set, is to use a fixed point algorithm that starts
with the shortest possible encoding (all branch instructions encoded as short
jumps, which is likely not a correct solution) and then iterates over the source
to re-encode those branch instructions whose target is outside their range.
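As a rough illustration of this scheme, the sketch below (Python, with purely illustrative encoding sizes and reaches that do not correspond to the MCS-51 or x86) starts from all-short encodings and repeatedly grows any branch whose target is out of range until a fixed point is reached.

    # Encoding table: (name, size in bytes, maximum reachable distance).
    # The numbers are placeholders chosen only for the example.
    ENCODINGS = [("short", 2, 127), ("absolute", 2, 2047), ("long", 3, 65535)]

    def branch_displacement(program):
        """program: list of dicts; plain instructions are {"size": n} and
        branches are {"branch": True, "target": i}, where i indexes the program.
        Returns a dict mapping each branch index to its encoding index."""
        enc = {i: 0 for i, ins in enumerate(program) if ins.get("branch")}
        changed = True
        while changed:                                    # least fixed point loop
            changed = False
            addresses, pc = [], 0
            for i, ins in enumerate(program):             # recompute the layout
                addresses.append(pc)
                pc += ENCODINGS[enc[i]][1] if ins.get("branch") else ins["size"]
            for i in enc:                                 # grow out-of-range branches
                distance = abs(addresses[program[i]["target"]] - addresses[i])
                if distance > ENCODINGS[enc[i]][2] and enc[i] < len(ENCODINGS) - 1:
                    enc[i] += 1
                    changed = True
        return enc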
(a) Example of a program where a long jump becomes absolute. (b) Example of a program where the fixed-point algorithm is not optimal.
3 Our Algorithm
3.1 Design Decisions
Given the NP-completeness of the problem, finding optimal solutions (using, for
example, a constraint solver) can potentially be very costly.
The SDCC compiler [8], which has a backend targeting the MCS-51 instruc-
tion set, simply encodes every branch instruction as a long jump without taking
the distance into account. While certainly correct (the long jump can reach any
destination in memory) and a very fast solution to compute, it results in a less
than optimal solution in terms of output size and execution time.
On the other hand, the gcc compiler suite, while compiling C on the x86
architecture, uses a greatest fixed point algorithm. In other words, it starts with
all branch instructions encoded as the largest jumps available, and then tries to
reduce the size of branch instructions as much as possible.
Such an algorithm has the advantage that any intermediate result it returns is
correct: the solution where every branch instruction is encoded as a large jump is
always possible, and the algorithm only reduces those branch instructions whose
destination address is in range for a shorter jump. The algorithm can thus be
stopped after a determined number of steps without sacrificing correctness.
The result, however, is not necessarily optimal. Even if the algorithm is run
until it terminates naturally, the fixed point reached is the greatest fixed point,
not the least fixed point. Furthermore, gcc (at least for the x86 architecture)
only uses short and long jumps. This makes the algorithm more efficient, as
shown in the previous section, but also results in a less optimal solution.
In the CerCo assembler, we opted at first for a least fixed point algorithm,
taking absolute jumps into account.
The first two are parameters that remain the same through one iteration, the
final three are standard parameters for a fold function (including ppc, which is
simply the number of instructions of the program already processed).
The δ functions used by f are not of the same type as the final δ func-
tion: they are of type δ : N → N × {short jump, absolute jump, long jump}; a
function that associates a pseudo-address with a memory address and a jump
length. We do this to ease the comparison of jump lengths between iterations.
In the algorithm, we use the notation sigma1 (x) to denote the memory address
corresponding to x, and sigma2 (x) for the jump length corresponding to x.
Note that the δ function used for label lookup varies depending on whether
the label is behind our current position or ahead of it. For backward branches,
where the label is behind our current position, we can use sigma for lookup,
since its memory address is already known. However, for forward branches, the
memory address of the label is not yet known, so we must use
old sigma.
We cannot use old sigma without change: it might be the case that we have
already increased the size of some branch instructions before, making the pro-
gram longer and moving every instruction forward. We must compensate for this
by adding the size increase of the program to the label’s memory address ac-
cording to old sigma, so that branch instruction spans do not get compromised.
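The lookup rule can be summarised by the following small Python sketch; the names and the data representation are ours, chosen only to mirror the roles of sigma, old sigma and the accumulated size increase described above.

    def label_address(label_ppc, current_ppc, sigma, old_sigma, added_bytes):
        """Memory address to use for a branch target during one iteration.

        sigma       : addresses already computed in this iteration (backward labels)
        old_sigma   : addresses from the previous iteration (forward labels)
        added_bytes : total growth of the program so far in this iteration."""
        if label_ppc < current_ppc:
            return sigma[label_ppc]                  # backward branch: already placed
        return old_sigma[label_ppc] + added_bytes    # forward branch: compensate growth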
4 The Proof
In this section, we present the correctness proof for the algorithm in more de-
tail. The main correctness statement is shown, slightly simplified, in Figure 5.
Informally, this means that when fetching a pseudo-instruction at ppc, the trans-
lation by δ of ppc + 1 is the same as δ(ppc) plus the size of the instruction at
ppc. That is, an instruction is placed consecutively after the previous one, and
there are no overlaps. The rest of the statement deals with memory size: either
the next instruction fits within memory (next pc < 2^16) or it ends exactly at
the memory limit, in which case it must be the last translated instruction in the
program (enforced by specifying that the size of all subsequent instructions is 0:
there may be comments or cost annotations that are not translated).
Finally, we enforce that the program starts at address 0, i.e. δ(0) = 0. It may
seem strange that we do not explicitly include a safety property stating that
every jump instruction is of the right type with respect to its target (akin to
the lemma from Figure 7), but this is not necessary. The distance is recalculated
according to the instruction addresses from δ, which implicitly expresses safety.
Since our computation is a least fixed point computation, we must prove ter-
mination in order to prove correctness: if the algorithm is halted after a number
of steps without reaching a fixed point, the solution is not guaranteed to be
correct. More specifically, branch instructions might be encoded which do not
coincide with the span between their location and their destination.
Proof of termination rests on the fact that the encoding of branch instructions
can only grow larger, which means that we must reach a fixed point after at most
2n iterations, with n the number of branch instructions in the program. This
worst case is reached if at every iteration, we change the encoding of exactly one
branch instruction; since the encoding of any branch instruction can change first
from short to absolute, and then to long, there can be at most 2n changes.
The first invariant states that any pseudo-address not yet examined is not
present in the lookup trie.
not jump default ≡ λprefix.λstrie.∀i. i < |prefix| →
¬is jump (nth i prefix) → lookup i (snd strie) = short jump
This invariant states that when we try to look up the jump length of a pseudo-
address where there is no branch instruction, we will get the default value, a
short jump.
jump increase ≡ λpc.λop.λp.∀i. i < |prefix| →
let oj ≡ lookup i (snd op) in
let j ≡ lookup i (snd p) in jmpleq oj j
This invariant states that between iterations (with op being the previous iter-
ation, and p the current one), jump lengths either remain equal or increase.
It is needed for proving termination. We now proceed with the safety lem-
mas. The lemma in Figure 6 is a temporary formulation of the main property
sigma policy specification. Its main difference from the final version is that
it uses instruction size jmplen to compute the instruction size. This func-
tion uses j to compute the span of branch instructions (i.e. it uses the δ under
construction), instead of looking at the distance between source and destination.
This is because δ is still under construction; we will prove below that after the
final iteration, sigma compact unsafe is equivalent to the main property in Fig-
ure 7 which holds at the end of the computation. We compute the distance using
the memory address of the instruction plus its size. This follows the behaviour
of the MCS-51 microprocessor, which increases the program counter directly
after fetching, and only then executes the branch instruction (by changing the
program counter again).
There are also some simple properties to make sure that our policy remains
consistent, and to keep track of whether the fixed point has been reached. We
do not include them here in detail. Two of these properties give the values of δ
for the start and end of the program; δ(0) = 0 and δ(n), where n is the number
of instructions up until now, is equal to the maximum memory address so far.
There are also two properties that deal with what happens when the previous
iteration does not change with respect to the current one. added is a variable
that keeps track of the number of bytes we have added to the program size by
changing the encoding of branch instructions. If added is 0, the program has not
changed and vice versa.
We need to use two different formulations, because the fact that added is 0
does not guarantee that no branch instructions have changed. For instance, it
is possible that we have replaced a short jump with an absolute jump, which
does not change the size of the branch instruction. Therefore policy pc equal
states that old sigma1 (x) = sigma1 (x), whereas policy jump equal states that
old sigma2 (x) = sigma2 (x). This formulation is sufficient to prove termination
and compactness.
Proving these invariants is simple, usually by induction on the prefix length.
This is almost the same invariant as sigma compact unsafe, but differs in that
it computes the sizes of branch instructions by looking at the distance between
position and destination using δ. In actual use, the invariant is qualified: δ is
compact if there have been no changes (i.e. the boolean passed along is true).
This is to reflect the fact that we are doing a least fixed point computation: the
result is only correct when we have reached the fixed point.
There is another, trivial, invariant in case the iteration returns Some δ: it must
hold that fst sigma < 2^16. We need this invariant to make sure that addresses
do not overflow.
The proof of nec plus ultra goes as follows: if we return None, then the
program size must be greater than 64 Kb. However, since the previous iteration
did not return None (because otherwise we would terminate immediately), the
program size in the previous iteration must have been smaller than 64 Kb.
Suppose that all the branch instructions in the previous iteration are encoded
as long jumps. This means that all branch instructions in this iteration are long
jumps as well, and therefore that both iterations are equal in the encoding of
their branch instructions. Per the invariant, this means that added = 0, and
therefore that all addresses in both iterations are equal. But if all addresses
are equal, the program sizes must be equal too, which means that the program
size in the current iteration must be smaller than 64 Kb. This contradicts the
earlier hypothesis, hence not all branch instructions in the previous iteration are
encoded as long jumps.
The proof of sigma compact follows from sigma compact unsafe and the fact
that we have reached a fixed point, i.e. the previous iteration and the current
iteration are the same. This means that the results of instruction size jmplen
and instruction size are the same.
These are the invariants that hold after 2n iterations, where n is the pro-
gram size (we use the program size for convenience; we could also use the
number of branch instructions, but this is more complex). Here, we only need
out of program none, sigma compact and the fact that δ(0) = 0.
Termination can now be proved using the fact that there is a k ≤ 2n, with n
the length of the program, such that iteration k is equal to iteration k + 1. There
are two possibilities: either there is a k < 2n such that this property holds, or
every iteration up to 2n is different. In the latter case, since the only changes
between the iterations can be from shorter jumps to longer jumps, in iteration 2n
every branch instruction must be encoded as a long jump. In this case, iteration
2n is equal to iteration 2n + 1 and the fixed point is reached.
5 Conclusion
pass it builds a map from instruction labels to addresses in the assembly code.
In the second pass it iterates over the code, translating every pseudo jump at
address src to a label l associated to the assembly instruction at address dst to
a jump of the size dictated by (δ src) to (δ dst). In case of conditional jumps,
the translated jump may be implemented with a series of instructions.
The proof of correctness abstracts over the algorithm used and only relies
on sigma policy specification (page 5). It is a variation of a standard 1-
to-many forward simulation proof [5]. The relation R between states just maps
every code address ppc stored in registers or memory to (δ ppc). To identify the
code addresses, an additional data structure is always kept together with the
source state and is updated by the semantics. The semantics is preserved only for
those programs whose source code operations (f ppc1 . . . ppcn ) applied to code
addresses ppc1 . . . ppcn are such that (f (δ ppc1) . . . (δ ppcn) = f ppc1 . . . ppcn).
For example, an injective δ preserves a binary equality test f for code addresses,
but not pointer subtraction.
The main lemma (fetching simulation), which relies on sigma policy
specification and is established by structural induction over the source code,
says that fetching an assembly instruction at position ppc is equal to fetching
the translation of the instruction at position (δ ppc), and that the new incremen-
ted program counter is at the beginning of the next instruction (compactness).
The only exception is when the instruction fetched is placed at the end of code
memory and is followed only by dead code. Execution simulation is trivial be-
cause of the restriction over well behaved programs w.r.t. sigma. The condition
δ 0 = 0 is necessary because the hardware model prescribes that the first in-
struction to be executed will be at address 0. For the details see [6].
Instead of verifying the algorithm directly, another solution to the problem
would be to run an optimisation algorithm, and then verify the safety of the
result using a verified validator. Such a validator would be easier to verify than
the algorithm itself and it would also be efficient, requiring only a linear pass over
the source code to test the specification. However, it is surely also interesting
to formally prove that the assembler never rejects programs that should be
accepted, i.e. that the algorithm itself is correct. This is the topic of the current
paper.
References
1. Asperti, A., Sacerdoti Coen, C., Tassi, E., Zacchiroli, S.: User interaction with the
Matita proof assistant. Automated Reasoning 39, 109–139 (2007)
2. Dickson, N.G.: A simple, linear-time algorithm for x86 jump encoding. CoRR
abs/0812.4973 (2008)
3. Hyde, R.: Branch displacement optimisation (2006),
http://groups.google.com/group/alt.lang.asm/msg/d31192d442accad3
4. Intel: Intel 64 and IA-32 Architectures Developer’s Manual,
http://www.intel.com/content/www/us/en/processors/architectures-
software-developer-manuals.html
5. Leroy, X.: A formally verified compiler back-end. Journal of Automated Reas-
oning 43, 363–446 (2009), http://dx.doi.org/10.1007/s10817-009-9155-4, doi:
10.1007/s10817-009-9155-4
6. Mulligan, D.P., Sacerdoti Coen, C.: On the correctness of an optimising as-
sembler for the intel MCS-51 microprocessor. In: Hawblitzel, C., Miller, D.
(eds.) CPP 2012. LNCS, vol. 7679, pp. 43–59. Springer, Heidelberg (2012),
http://dx.doi.org/10.1007/978-3-642-35308-6_7
7. Robertson, E.L.: Code generation and storage allocation for machines with span-
dependent instructions. ACM Trans. Program. Lang. Syst. 1(1), 71–83 (1979),
http://doi.acm.org/10.1145/357062.357067
8. Small device C compiler 3.1.0 (2011), http://sdcc.sourceforge.net/
9. Szymanski, T.G.: Assembling code for machines with span-dependent instructions.
Commun. ACM 21(4), 300–308 (1978),
http://doi.acm.org/10.1145/359460.359474
Analyzing the Next Generation Airborne
Collision Avoidance System
1 Introduction
The current onboard collision avoidance standard, TCAS [7], has been successful
in preventing mid-air collisions. However, its deterministic logic limits robust-
ness in the presence of unanticipated pilot responses, as exposed by the collision
of two aircraft in 2002 over Überlingen, Germany [4]. To increase robustness,
Lincoln Laboratory has been developing a new system, ACAS X, which uses
probabilistic models to represent uncertainty. Simulation studies with recorded
radar data have confirmed that this novel approach leads to a significant im-
provement in safety and operational performance. The Federal Aviation Admin-
istration (FAA) has formed a team of organizations to mature the system, aiming
to make ACAS X the next international standard for collision avoidance.
The adoption of a completely new algorithmic approach to a safety-critical
system naturally poses a significant challenge for verification and certification.
The first author performed this work while employed by SGT Inc. as an intern at the
NASA Ames Research Center. This work was funded under the System-wide Safety
Analysis Technologies Project of the Aviation Safety Program, NASA ARMD.
Our goal in this work is to study the applicability of formal probabilistic verifi-
cation and synthesis techniques, which go beyond simulation studies [8,5]. Our
study was driven by tasks defined in collaboration with the ACAS X team to be
complementary to their verification efforts. During the course of our work, we
identified shortcomings of existing tools, which led us to develop a framework
customized for ACAS X (or similar systems). In our framework, models are ex-
pressed in a traditional programming language for increased expressiveness, and
verification and synthesis algorithms are designed for scalability and efficiency.
The contributions of this work can be summarized as follows: 1) Develop-
ment of a faithful model for synthesis of the ACAS X controller, based on the
Lincoln Laboratory publications [6]; 2) Development of customized verification
and synthesis algorithms for efficient handling of ACAS X (and like) systems;
3) Identification of design and verification challenges for ACAS X as related to
probabilistic verification and synthesis; 4) Results obtained from the application
of our framework to ACAS X and recommendations for the ACAS X effort.
The results of our work will serve as input for the certification of ACAS X.
Due to access restrictions, we analyze a previous version of the system [6], but
are currently working with the ACAS X team to extend our work to the current
version. We believe that ACAS X presents researchers in probabilistic verification
and synthesis with a unique opportunity to focus on a relevant, safety-critical
case study. For this reason, we are preparing a public release of our models and
framework, to encourage other members of the community to build on our work.
The remainder of this paper is organized as follows. Section 2 describes the
ACAS X system as designed and deployed by the ACAS X team. In addition
to these techniques, our work implements and applies formal verification and
synthesis approaches, described in Sections 3 and 4. We discuss implementation
details in Section 5, with Section 6 concluding the paper.
ago (4) ps the pilot state. Pilot state and advisories can take the following values
— note that the pilot can either follow the advisory (i.e., ps = adv) or perform
random maneuvers (i.e., ps = COC), since studies have shown that pilots may
not react immediately or at all to an advisory:
– COC stands for “clear of conflict” — the pilot is free to choose how to control
the plane.
– CLI1500 / DES1500 stand for “climb / descend with 1500 ft / min”, respec-
tively; they advise the pilot to change the climbing rate with 1/4 g until reach-
ing a climbing rate of 1500 ft / min / −1500 ft / min, respectively.
– Advisories SCLI1500 / SDES1500 and SCLI2500 / SDES2500 are similar but
employ an acceleration of 1/3 g. Moreover, SCLI2500 / SDES2500 target a final
climbing rate of 2500 ft / min / −2500 ft / min, respectively.
(a) Resolution (10, 10, 10) (b) Resolution (20, 20, 20)
Fig. 1. Two controllers generated with the same weight in different resolutions. x-
axis shows time until LHS, y-axis height difference. Parameters dh0 and dh1 are zero
throughout, and adv = ps = COC. Color indicates selected advisory: black (0) for COC,
red (1) for CLI1500, yellow (2) for DES1500.
remain on the same height for a long time (due to their random movement),
and it is therefore better to wait until the intruder starts either climbing or
descending and go in the opposite direction. Secondly, notice the “mouth”
shape close to time 0 and around height difference 0. In this collision situation,
ACAS X is not giving any advisory, although one would intuitively expect that
some advisory would be more informative to the pilot than COC, which may be
misleading. This is an artifact of the costs used for synthesis, and we describe a
technique that identifies situations like these in Section 3.
3 Verification
To complement the ACAS X work that primarily uses simulation, we apply for-
mal analysis techniques to evaluate the ACAS X controller. Simulation-based
techniques are studied and discussed in Section 4, where we explore the design-
space of controllers and compare different generated controllers among them-
selves. In this section, we evaluate the ACAS X controller 1) in terms of the
quality criteria used for its generation, and 2) through model checking of PCTL
[3] properties, which are ideal for probabilistic models such as ACAS X’s. For
evaluation, we use models discretized at different resolutions, and could even use
different model characteristics and parameters (although we do not do the latter
in the experiments presented here).
The type of analysis that we perform provides a value v(s) for each state of
the discretized model. To easily compare results of analyses with each other and
with simulations, we define a probability distribution I(s) over the states of the
model as follows (similarly to [6]). The only states we consider are those at 40
seconds from LHS, and in which ps = adv = COC. Over those states, we first
define a continuous distribution over (dh0 , dh1 , h) ∈ R3 by sampling dh0 and
dh1 uniformly from [−1000, 1000] ft / min, denoted as dh0 ∼ U (−1000, 1000) and
dh1 ∼ U (−1000, 1000). To make a collision likely, and therefore to provoke the
controller into action, h is sampled from 40((dh1 − dh0 )/60) + N (0, 25).
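A small Python sketch of this sampling scheme (the continuous distribution, before discretisation); we read N(0, 25) as a normal distribution with standard deviation 25, which is an assumption on our part.

    import random

    def sample_initial_state():
        """Sample (dh0, dh1, h) for an initial state at 40 s to LHS with
        ps = adv = COC, following the distribution described above."""
        dh0 = random.uniform(-1000.0, 1000.0)    # own climbing rate, ft/min
        dh1 = random.uniform(-1000.0, 1000.0)    # intruder climbing rate, ft/min
        # Centre the height difference where a collision is likely after 40 s,
        # plus Gaussian noise (std. dev. 25 ft assumed).
        h = 40.0 * (dh1 - dh0) / 60.0 + random.gauss(0.0, 25.0)
        return dh0, dh1, h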
(Figure: relative performance for increasing resolutions. Line “Height” represents resolutions (10, 10, n), while line “Climbing Rate” represents resolutions (n, n, 10).)
Fig. 3. Trace plots for properties 1 and 3. x-axis displays time to LHS, y-axis displays
values of (dh0 , dh1 , h). The color of line h depicts the advisory, tagged above the line.
mid-air collision, formally P=? [F NMAC]. During analysis, we observed that the
most likely cases of this undesirable scenario stem from late reactions from
the pilot. We therefore decided to instead concentrate on NMACs that occur
despite immediate reactions to advisories by the pilot. We formulate this as
P=? (F NMAC | G adv = ps), i.e., what is the probability of reaching an NMAC
state although the pilot always reacts immediately.
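For reference, and independently of how any particular tool evaluates it, such a conditional reachability query is by definition the quotient of two unconditional probabilities:

\[
P_{=?}(\mathrm{F\ NMAC} \mid \mathrm{G}\ adv = ps)
  = \frac{P(\mathrm{F\ NMAC} \wedge \mathrm{G}\ adv = ps)}{P(\mathrm{G}\ adv = ps)} .
\]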
The highest probability over all initial states that we encounter with the
conditional probability formula is 2.30 · 10^−8, as opposed to 6.92 · 10^−4 with the
original formula. This confirms that the vast majority of NMACs happen because
the pilot does not react fast enough or at all. To understand the NMACs that
occur despite the fact that the pilot reacts to advisories, we analyzed some traces
that are most likely to fulfill P=? (F NMAC | G adv = ps). Figure 3 depicts such a
scenario: initially, our airplane is 1000 ft below the intruder and we are climbing
with 2500 ft / min. The intruder, on the other hand, starts out with a climbing
rate of −250 ft / min. Until 22 seconds to LHS, the two airplanes maintain their
course, and therefore the height difference shrinks. If both planes were to continue
to maintain their course, then our plane would be well above the intruder at time
0 to LHS, so ACAS X does not alert.
At this point, the climbing rate of the intruder starts increasing, and the verti-
cal distance becomes −150 ft. The height difference levels off as a result of the
intruder’s increase in climbing rate from now on. ACAS X signals the DES1500
advisory seven seconds later, and SDES2500 one second after that. As a result,
our airplane starts descending steeply until it reaches −2500 ft / min. At the point
of the first alarm, the vertical distance is 50 ft, i.e., our plane is slightly above
the intruder. Unfortunately, the climbing rate of the intruder starts decreasing
at exactly the same point and from that point on, the two climbing rates are
not different enough to carry our plane outside of the danger zone and we end
up with a vertical distance of 100 ft, and hence an NMAC.
Traces like these capture exactly the type of unforeseen behaviour that led to
the Überlingen accident [4], and probabilistic model checking can detect cases
like these easily. We consider it encouraging that the most likely case of colli-
sion requires relatively complex behaviour of the intruder (first increasing the
climbing rate, then decreasing it, at exactly the right point in time).
an NMAC is imminent. Figure 4
shows the probability of the formula
0 ft / min and adv = ps = COC. This
(a) Simplified Pareto curve (b) Points generated for two objectives
Fig. 5. Fictional and actual Pareto fronts
difference in climbing rates was 1000 ft / min, and the height difference was -30
ft. Since we were about 15 seconds away from LHS, this amounted to a decreased
vertical distance of about 260 ft. ACAS X decided to increase the vertical distance
by increasing the rate of descent.
It would be interesting to study whether the cost function of ACAS X may
encourage such cases of split advisories. Given that (Alert + COC < Reversal),
it is possible that ACAS X decided to gain a small reward for selecting COC after
the first advisory, and additionally avoid the cost of a reversal that would be
incurred if the advisory was switched directly from a climb to a descend.
infinity in all dimensions, defines the section of the Pareto front in which we are
interested. To find this section, we modified an algorithm presented in [9].
While the details of the approach are beyond the scope of this paper, the idea
can be summarized as follows. Initially, the optimal controller for each dimension
is generated, i.e., the controller with the lowest P(NMAC), the controller with the
lowest expected number of Alerts (i.e., zero), etc. We add the performance of
these controllers to the approximation of the Pareto front. These points, illus-
trated as the two green dots on the axes in Figure 5(a), reflect the performance
of the corresponding controller in terms of the selected quality attributes.
We then keep adding points to the Pareto front in the following way. We
calculate the convex hull of the points generated so far. This hull defines a set of
n-dimensional faces (lines, in our picture), that connect the points. Further, the
hull defines a lower bound for new points (the Pareto front is convex, so missing
points must lie on or above the hull). In the picture, the lines connecting the
green dots form the hull. The generated points also define an upper bound on
the space of controller performances, illustrated by the dashed lines in the figure.
The direction (normal) of the dashed line (separating hyperplane) is given by
the weights we used to generate the point. If there are any more points we can
generate, then these points exist between the hull and the upper bound.
Since we want to find new points in the box defined by the target, we pick
new weights so as to refine the face (by lowering the upper bound or breaking
up the face) above which there is a point that 1) lies inside the upper bound
2) lies above the target 3) is maximally far away from the face (as defined by
the Euclidean distance). We continue until we either prove that the target lies
outside the upper bound (which means that no controller fulfilling the minimal
requirement exists) or until we have found enough points above the target.
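The following Python sketch gives the flavour of such a sandwich-style loop for the two-objective case only. It is a simplification of the algorithm described above (the face to refine is chosen by a cruder criterion), and optimize(w) stands in for generating the controller that is optimal for the weight vector w and returning its objective values.

    def approximate_pareto_2d(optimize, target, max_points=100):
        """Simplified two-objective sandwich loop (illustrative only).

        Both objectives are assumed to be maximised (e.g. -P(NMAC), -E[Alerts])
        and the Pareto front is assumed convex.  Returns a point dominating
        `target`, or None if no such controller is found."""
        points = [optimize((1.0, 0.0)), optimize((0.0, 1.0))]   # per-objective optima
        while len(points) < max_points:
            for p in points:
                if p[0] >= target[0] and p[1] >= target[1]:
                    return p                    # a controller meeting the requirement
            points.sort()                       # faces = consecutive points on the front
            faces = list(zip(points, points[1:]))
            # Crude choice: refine the geometrically longest face.
            (x1, y1), (x2, y2) = max(
                faces, key=lambda f: (f[1][0] - f[0][0]) ** 2 + (f[0][1] - f[1][1]) ** 2)
            weights = (y1 - y2, x2 - x1)        # outward normal of the face
            new_point = optimize(weights)
            if new_point in points:             # nothing new above this face; give up
                return None                     # (a simplification of the real test)
            points.append(new_point)
        return None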
Figure 5(b) presents a subset of the points generated by this approach on
Alert and NMAC exclusively. The target point and the box it defines are plotted in
black, and the points generated are plotted in red. The algorithm first generated
8 points outside the box. The first point generated within the target box (the
9th overall) is depicted in blue. We generated 10 more points after we found it.
We note that all subsequent 10 points that are generated also lie within the box.
The same effect has been observed for three dimensions. We conclude that this
algorithm is good at approximating the interesting part of the Pareto front (that
inside the box) once it finds the first point that meets the target specifications.
We have checked this algorithm against various targets, and it always either
finds a controller meeting the requirement, or proves that no such controller
exists. Note that finding a controller in the box is an NP-complete problem (easy
adaptation of proof from [1]). In the worst case, the algorithm has to generate all
points of the Pareto front of the model, of which there are exponentially many.
However, as the next section shows, little more than 100 points suffice to find a
controller meeting the requirement for various resolutions.
We believe that this technique can be very helpful as the ACAS X controller
model evolves. Each evolution (be it a change in discretization or a
change in parameters) necessitates tuning the weights anew (as witnessed by the
(n, n, 10) (Climbing Rate), (10, 10, n) (Height) and (n, n, n) (All) respectively. It
can be seen that we were almost unable to decrease P(NMAC) using the climbing
rate alone. The relative performance of these controllers is consistently around
99.5%. When we increase the resolution of the height, then we get a relative per-
formance of about 85%. Finally, when increasing the resolution of both we see a
relative performance of about 83%. As witnessed in Section 3, the discretization
of height seems to have the biggest influence on controller quality. Interestingly,
the relative performance does not improve as we increase the resolution.
5 Implementation
We originally used existing probabilistic model checking tools for ACAS X but
encountered several limitations. First, we could not express the linear interpola-
tion needed in the controller evaluation. Second, we not only require capabilities
for the specification of a model, but also for loading generated controllers for
subsequent verification. Last but not least, for our multiple experiments involv-
ing increasing resolution, the state spaces we generate grow prohibitively large,
and there is a considerable slow-down that could benefit from parallelization,
which is unavailable in current releases of existing tools.
More specifically, the controller has 40 · (2r_dh0 + 1) · (2r_dh1 + 1) · (2r_h + 1) · 13
states in resolution (r_dh0, r_dh1, r_h). So, for example, the model
from [6] has 4,815,720 states overall, and a controller with resolution (50, 50, 50)
has 535,756,520 states. We wrote a simplified version of the model in [6] for
PRISM [8] (without linear interpolation, but with sigma point sampling). While
PRISM succeeded in loading the model as a BDD model, analyzing it was not
possible (we aborted conversion to the hybrid representation after 10 min).
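As a quick arithmetic check of the state counts quoted above (the factor 13 is, as far as we can tell, the number of admissible (adv, ps) combinations):

    def controller_states(r_dh0, r_dh1, r_h):
        # 40 time steps x discretised (dh0, dh1, h) grid x 13 (adv, ps) combinations
        return 40 * (2 * r_dh0 + 1) * (2 * r_dh1 + 1) * (2 * r_h + 1) * 13

    assert controller_states(10, 10, 10) == 4_815_720     # matches the model from [6]
    assert controller_states(50, 50, 50) == 535_756_520   # resolution (50, 50, 50)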
These problems motivated us to create our own framework that takes advan-
tage of two key insights into the ACAS X model. Firstly, if we want to calculate
the values of any property in this model at time t, then we only need to keep
the value of time t − 1 in memory. This alone leads to a reduction of memory
consumption to 2.5%. Secondly, since we need to calculate value iteration steps
only a relatively small number of times for each state, it is possible to avoid
storing the transition matrix in memory and generate the values on-demand.
In addition, we parallelized value iteration, and the speed-up obtained in ex-
periments using up to 12 cores was almost linear (1.94 for 2 cores, 3.37 for 4
cores, 4.67 for 6 cores, 6.47 for 8 cores, 7.54 for 10 cores, 8.93 for 12 cores). Paral-
lelization proved essential for our experiments involving increasing discretization
resolution; generating the Pareto fronts for all cases took about 2 days, as op-
posed to more than a month.
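Schematically, the two insights above combine into a loop like the following (Python, with successors and terminal_value as stand-ins for the on-demand transition generation; this is not the framework's actual code):

    def finite_horizon_values(states, successors, terminal_value, horizon=40):
        """Backward value iteration that keeps only two time slices in memory and
        never stores the transition matrix: successors(s) regenerates the
        (probability, next_state) pairs on demand at every step."""
        previous = {s: terminal_value(s) for s in states}   # values at the horizon
        for _ in range(horizon):
            current = {}
            for s in states:                                 # embarrassingly parallel loop
                current[s] = sum(p * previous[t] for p, t in successors(s))
            previous = current                               # older slice is discarded
        return previous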
This cautions us, in exploring the space of controllers, to ultimately evaluate their
relative performance in simulation. However, the Pareto-front-based techniques
for controller generation provide a systematic way of generating and comparing
controllers that can complement designer intuition.
PCTL model checking also proves valuable in studying properties of generated
controllers. However, more useful than the model checking itself is the capability
to visualize its results and to generate traces that help with understanding them.
We therefore found this latter aspect of our tools most helpful, together with a
simulator that we built, which allows us to interactively explore generated
controllers. In the future, we plan to connect the simulator to
the model checker, to allow replay of the generated traces.
The techniques and tools that we developed are general, and the customization
for memory savings is applicable to problems that have a similar nature; for
example, it could be used in the domain of car collision avoidance systems,
which is important as we move towards self-driving cars. Our work on analysis
of ACAS X will continue beyond this paper. Our plans for future work include the
modeling of a reasonably adversarial pilot for the intruder plane, and alternative
representations of the look-up table for verification and deployment. Moreover,
we plan to study a version of ACAS X that is targeted to unmanned vehicles, as
well as experiment with the evaluation of generated controllers in the context of
hybrid verification tools, which the ACAS X team has expertise in.
References
1. Chatterjee, K.: Markov decision processes with multiple long-run average objec-
tives. In: Arvind, V., Prasad, S. (eds.) FSTTCS 2007. LNCS, vol. 4855, pp. 473–484.
Springer, Heidelberg (2007)
2. Forejt, V., Kwiatkowska, M., Parker, D.: Pareto curves for probabilistic model
checking. In: Chakraborty, S., Mukund, M. (eds.) ATVA 2012. LNCS, vol. 7561,
pp. 317–332. Springer, Heidelberg (2012)
3. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects of Computing 6, 102–111 (1994)
4. Johnson, C.: Final report: review of the BFU Überlingen accident report. Con-
tract C/1.369/HQ/SS/04 to Eurocontrol (2004), http://www.dcs.gla.ac.uk/~
johnson/Eurocontrol/Ueberlingen/Ueberlingen Final Report.PDF
5. Katoen, J.-P., Zapreev, I.S., Hahn, E.M., Hermanns, H., Jansen, D.N.: The ins and
outs of the probabilistic model checker MRMC. Perform. Eval. 68(2) (2011)
6. Kochenderfer, M.J., Chryssanthacopoulos, J.P.: Robust airborne collision avoid-
ance through dynamic programming. Project Report ATC-371, Massachusetts In-
stitute of Technology, Lincoln Laboratory (2011)
7. Kuchar, J., Drumm, A.C.: The traffic alert and collision avoidance system. Lincoln
Laboratory Journal 16(2), 277 (2007)
8. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic
real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
9. Rennen, G., van Dam, E.R., den Hertog, D.: Enhancement of sandwich algorithms
for approximating higher-dimensional convex Pareto sets. INFORMS Journal on
Computing 23(4), 493–517 (2011)
10. Zuliani, P., Platzer, A., Clarke, E.M.: Bayesian statistical model checking with
application to Stateflow/Simulink verification. Formal Methods in System De-
sign 43(2), 338–367 (2013)
Environment-Model Based Testing
of Control Systems: Case Studies *
1 Introduction
Lurette is a black-box testing tool for reactive systems that automates the test decision
and the stimulation of the System Under Test (SUT). Lurette is based on two syn-
chronous languages: Lustre [1], to specify test oracles, and Lutin [2], to model reactive
environments. Lurette does not require analyzing the code; it can thus deal with any
kind of reactive system, as the experiments reported below illustrate.
The COMON project∗ gathered three industrial companies that design control-
command systems for nuclear plants. Corys Tess designs plant simulators used in par-
ticular for training operators. Atos Worldgrid designs the software and hardware of com-
puterized control rooms. Rolls-Royce designs the software and hardware of classified
automatisms in charge of plant safety. The goal of this consortium was to take
advantage of the partners' complementarity to set up a development framework based
on early simulations, model refinements, continuous integration, and automatic testing.
During the project, the consortium crafted a case study representative of each
partner's activity [3]. The partners also wanted to experiment with the Lurette
languages and methodology on their own designs. This article presents those experiments.
We first recall Lurette principles in Section 2, and briefly present enough of the
Lustre and the Lutin languages to be able to understand the examples. Section 3 presents
* This work was supported by the COMON Minalogic project [2009-2012] funded by the
French government (DGCIS/FUI), la Metro, and the city of Grenoble – http://comon.
minalogic.net/
the Corys case study and demonstrates the use of Lurette on a library object used to
simulate the behavior of the temperature and the pressure of a fluid in a pipe. Section 4
presents the Atos case study that illustrates how Lurette can be used to automate the run
of existing test plans designed for a Supervisory Control and Data Acquisition (SCADA)
library object. We discuss related work and conclude in Sections 5 and 6.
Lustre. Lustre allows defining reactive programs via sets of data-flow equations that
are virtually executed in parallel. Equations are structured into nodes. Nodes transform
input sequences into output sequences. The Lustre node r_edge below processes one
Boolean input sequence, and computes one Boolean output sequence.
This node defines its output with one equation and four operators (i.e., predefined
nodes). The memory operator “pre” gives access to the previous value in a sequence:
if x holds the sequences (x1 ,x2 ,. . . ), then pre(x) holds (⊥,x1,x2 ,. . . ), where ⊥ denotes
an undefined value. The arrow operator “->” modifies the value of the first element of a
sequence: if x holds (x1 ,x2 ,x3 ,. . . ), then init->x holds (init,x2 ,x3 ,. . . ). This operator is
useful for sequences that are undefined at their first instant, such as pre(x). The “and”
and “not” operators are the logical conjunction and negation lifted over sequences.
Hence, r_edge(x) is equal to x at the first instant, and then is true if and only if x is
true at the current instant and false at the previous one. This node detects rising edges.
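As a rough illustration of the semantics just described, the following Python sketch computes the same output over a whole input sequence; in Lustre this would presumably be the single equation y = x -> (x and not pre(x)), which is an assumption on our part since the listing is not reproduced here.

    def r_edge(xs):
        """Rising-edge detector over a Boolean sequence: the output equals the
        input at the first instant, and afterwards is true exactly when the
        input is true now and was false at the previous instant."""
        out, prev = [], None
        for i, x in enumerate(xs):
            out.append(x if i == 0 else (x and not prev))
            prev = x
        return out

    # r_edge([False, True, True, False, True]) == [False, True, False, False, True]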
Lutin. Lutin is a probabilistic extension of Lustre with an explicit control structure
based on regular operators: sequence (fby, for “followed by”), Kleene star (loop), and
choice (|). At each step, the Lutin interpreter (1) computes the set of reachable con-
straints, which depends on the current control-state; (2) removes from it unsatisfiable
constraints, which depends on the current data-state (input and memories); (3) draws a
constraint among the satisfiable ones (control-level non-determinism); (4) draws a point
in the solution set of the constraint (data-level non-determinism). This chosen point de-
fines the output for the current reaction. The solver of the current Lutin interpreter uses
Binary Decision Diagrams (BDD) and convex polyhedron libraries [6]. It is thus able
to deal with any combination of logical operators and linear constraints. Let us first
illustrate the Lutin syntax and semantics with a program using equality constraints.
node sn_gen () returns (sn : int) =
  loop [10,20] sn=1 fby
  loop [20,30] sn=2
This node generates a finite integer sequence, without using any input. It first uses
an atomic constraint that binds sn to 1, for a number of reaction steps drawn uniformly
between 10 and 20. Then it uses sn=2 for between 20 and 30 steps, and then stops. A constraint
can actually have any number of solutions, as in the x_gen node below.
node x_gen ( i : real ) returns ( x : real ) = loop { 0 < x and x < i }
The first macro defines the absolute value of any real. The next ones define two zones
where a pair of real values (x,y) evolves. We present below a last example (used later)
that illustrates how to use Lutin to guide the random exploration of the environment.
node x_y_gen () returns (x , y : real ) =
loop { {|3: zone1 (x , y ) |1: zone4 (x , y )} fby loop ~50:5 x = pre x and y = pre y }
For the first reaction, a point is drawn in zone1 with a probability of 3/(3+1)=0.75 or
in zone4 with a probability of 1/(3+1)=0.25. Then x and y keep their previous values
for 50 steps on average, with a standard deviation of 5. This process then starts again
thanks to the outer loop. Preventing the environment outputs from changing at each reac-
tion produces better coverage for requirements guarded by stability conditions (which
is common in control-command applications). More generally, an overly chaotic environ-
ment might set the SUT into degraded modes, which would prevent the test of nominal
modes. Lutin also has constructs to execute in parallel nodes (run) or constraints (&>),
as well as exceptions 2 .
\[
\frac{dM}{dt} = \sum_i Q_{m_i}, \qquad
\frac{dh}{dt} = \frac{\sum_i Q_{e_i} - h \sum_i Q_{m_i}}{M}
\]
where ∑i Qmi and ∑i Qei are respectively the sum of the mass flow (kg/s) and the sum
of the powers arriving in the node; M and h are the mass (kg) and the mass enthalpy
(J/kg) of the system; t is the time. The SUT is made of this object connected to two
pipes, themselves connected to two objects (load loss) that models the fluid mass flow
and transported power. The resulting equations are discretized and solved using the
Newton-Raphson method. Table 1 describes the SUT input/output variables. We have
shortened some variable names for the sake of readability.
2 cf http://www-verimag.imag.fr/Lutin.html for more information.
The input variables to stimulate this node are the limit conditions for the pressure (Pin
and Pout), the temperature (Tin and Tout), and the ambient temperature (T_amb).
The admissible values for those inputs are part of the object documentation, which
states that pressure values vary within [10000.0, 190.0e5], and temperature values vary
within [5.0, 365.0]. Moreover, Corys wanted to test this node in average conditions, and
therefore required that the stimuli generator satisfies the following constraints:
– temperature and pressure cannot vary more than 10% between two instants;
– orders change only when mass and temperature values are stable (i.e., they do not
change by more than 1% between two steps).
To stimulate the SUT, we therefore designed a Lutin program that is a direct for-
malization of the preceding constraints. We use the limit_der macro, which can be
used either to test whether an input varies by more than a given percentage
(limit_der(1.0,M) to test whether M varies by less than 1%), or to constrain the
derivative of some output (limit_der(10.0,Pin) to constrain Pin to vary by less than 10%).
let limit_der(pc: real; x: real ref): bool = abs(x - pre x) < abs(pc/100.0 * pre x)

node liquid_spl_env(M, T: real) returns (
  Pin, Pout: real [10000.0; 190.0e5]; Tin, Tout, Tamb: real [5.0; 365.0];
) =
  -- a few aliases to make it more readable
  let inputs_are_stable = limit_der(1.0, M) and limit_der(1.0, T) in
  let dont_change = -- outputs keep their previous values
    Pin = pre Pin and Tin = pre Tin and
    Pout = pre Pout and Tout = pre Tout and Tamb = pre Tamb in
  let change = -- outputs do not vary more than 10%
    limit_der(10.0, Pin) and limit_der(10.0, Pout) and
    limit_der(10.0, Tin) and limit_der(10.0, Tout) and limit_der(10.0, Tamb)
  in -- a simple scenario
  true -- the first instant
  fby loop { if inputs_are_stable then change else dont_change }
The main node liquid_spl_env has two real inputs (produced by the SUT), and
five real outputs. At the first instant, the only constraints on output variables are the
ones mentioned in their declarations; a random value is drawn in their respective inter-
val domains. For example, Tamb is drawn between 5 and 365. Then, for the remaining
instants, variables keep their previous values if one of the environment inputs (M or T)
varies by more than 1%; otherwise they vary at random, but without exceeding 10%. One
could of course imagine more complex scenarios. However, this has not been necessary
to cover the expected properties we present in the following.
Note the feedback loop: the SUT reacts to its environment, which itself reacts to
the SUT by testing the stability of M and T. This is typical of what offline test vector
generators cannot do when they ignore the reactive nature of the SUT.
The expected properties, which we formalized as Lustre oracles, are the following:
1. if the sum of powers (coming from the Qe1, Qe2, and Qe_amb sensors) and the sum
of incoming mass flows (coming from the Qm1 and Qm2 sensors) are positive, then the
mass and the temperature of the node increase;
2. if the sum of powers Qe and the sum of mass flows Qm are negative, then the mass
and the temperature of the node decrease;
3. if the sum of powers Qe is zero and the sum of mass flows Qm is positive, then the
mass increases;
4. if the sum of powers Qe is zero and the sum of mass flows Qm is negative, then the
mass decreases;
5. if the sum of mass flows Qm is zero and the sum of powers Qe is negative, then the
temperature decreases;
6. if the sum of mass flows Qm is zero and the sum of powers Qe is positive, then the
temperature increases.
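As an illustration, the second property could be turned into a Lustre oracle along the following lines (a minimal sketch: the node signature is ours, and it uses the naive sums that are corrected below):

-- Sketch of an oracle for property 2 (naive first formalization).
node prop2(Qe1, Qe2, Qe_amb, Qm1, Qm2, M, T: real) returns (ok: bool);
var Qe, Qm: real;
let
  Qe = Qe1 + Qe2 + Qe_amb; -- sum of powers (first interpretation)
  Qm = Qm1 + Qm2;          -- sum of mass flows (first interpretation)
  -- if both sums are negative, the mass and the temperature must decrease
  ok = true -> ((Qe < 0.0 and Qm < 0.0) => (M < pre M and T < pre T));
tel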
When we run Lurette with the SUT, the environment, and the oracles we described,
all oracles are violated after a few steps. After several discussions with the person who
wrote down the requirements, we ended up with Lurette runs that worked fine for hours.
We now sum up the fixes we needed to perform.
First Problem. We had formalized the phrases "the sum of powers (coming from Qe1,
Qe2, and Qe_amb sensors)" and "the sum of mass flows (coming from Qm1 and Qm2 sensors)"
as Qe=Qe1+Qe2+Qe_amb and Qm=Qm1+Qm2. However, the node connectors are oriented:
the first pipe flows in, whereas the second pipe flows out. Hence the correct interpreta-
tion leads to the following definitions: Qe=Qe1-Qe2+Qe_amb and Qm=Qm1-Qm2.
Second Problem. We had misinterpreted "are positive/negative" in the requirements.
Indeed, when one compares to 0 a sum of values that are computed up to a certain
precision (0.1 for mass flows, and 100 for powers), one has to specify some tolerance
levels. Hence, for example, the second property should be rewritten as: "if
Qe<=-Tol_Qe and Qm<=-Tol_Qm then the mass and temperature of the node decrease",
where Tol_Qe=300 (three times the precision of the power sensors) and Tol_Qm=0.2 (two
times the precision of the mass flow sensors).
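Concretely, inside an oracle node such as the sketch above, the fixes of the first two problems amount to the following equations (again a sketch, not the original oracle code):

const Tol_Qe: real = 300.0; -- three times the precision of the power sensors
const Tol_Qm: real = 0.2;   -- two times the precision of the mass flow sensors

node prop2_v2(Qe1, Qe2, Qe_amb, Qm1, Qm2, M, T: real) returns (ok: bool);
var Qe, Qm: real;
let
  Qe = Qe1 - Qe2 + Qe_amb; -- oriented connectors: the second pipe flows out
  Qm = Qm1 - Qm2;
  -- tolerant version of property 2
  ok = true -> ((Qe <= -Tol_Qe and Qm <= -Tol_Qm) => (M < pre M and T < pre T));
tel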
Third Problem. In properties 5 and 6, the statements about the sign of the sum of powers Qe
should take into account the mass enthalpy h of the node (using Qe − h·Qm instead of just Qe).
Fourth Problem. At this stage, the requirement fixes we have performed allow run-
ning simulations that last several minutes without violating oracles. After more steps
(around 1000 steps on average), property 5 is violated. This time, the problem was more
subtle and required a deeper investigation by the Corys engineer. His conclusion was
that the convergence criteria (thresholds parametrizing the differential equation solver)
in this simulation were too small. By setting a convergence criterion of 1 (versus 0.1)
for the mass flow, and of 1000 (versus 100) for the power, no oracle is violated, even when
we run the simulation for hours. Since the convergence criteria determine the precision
of the sensor computations, we needed to adjust the values of Tol_Qm and Tol_Qe accordingly.
These new convergence criteria are actually the ones typically used in Alices
for modeling pipes in power plants, which explains why this problem was (probably)
never triggered before by Alices users.
Table 2. Summary of requirement fixes. Version 2 arises from fixing the first three prob-
lems; version 3 arises from fixing the fourth problem.
This experiment gave new insights to Corys engineers and revealed a real feature of this very
frequently used object, which behaves unexpectedly when used with an unusual convergence criterion.
The principal lesson of this experiment is that writing executable requirements
is not that difficult and can be very effective. Indeed, the experiment was conducted
by an engineer who had no prior knowledge of Lustre, Lutin, Alices, or dynamic systems
modeling. Still, he was able to pinpoint four issues in less than one week of work, with
only a few interactions with the Alices libraries supervisor.
We performed a similar study during the COMON project on voters designed in
Scade by Rolls-Royce. Their voters were much simpler, with no internal state. Hence
their formalization into Lustre oracles ended up as something equivalent to the Scade
implementation. We believe that using oracles in this context is still useful, as it amounts
to having two teams implement the same specification, which is a classical strategy to
gain confidence in software implementations. In such cases, Lutin stimulators can still
be useful to thoroughly compare the two implementations. In the particular case of the Rolls-
Royce voters, this was not necessary, as we were able to prove their equivalence by state
exploration (using the Lesar model-checker [7]). This illustrates the synergy we can
have between formal-methods-based testing and formal verification.
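For the record, such an equivalence proof with Lesar typically goes through a synchronous observer; a sketch could look as follows (the voter interface and the node names voter_a and voter_b are hypothetical):

-- Observer comparing two voter implementations on the same inputs;
-- Lesar is then asked to prove that ok is invariantly true.
node equiv(i1, i2, i3: bool) returns (ok: bool);
let
  ok = (voter_a(i1, i2, i3) = voter_b(i1, i2, i3));
tel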
The first step in implementing an automated version of this test plan with Lurette was to
connect our languages' APIs to the Atos SCADA. To do that, we re-used the infrastructure
that was set up for the I/O stimulator. We also added a layer in charge of interfacing an
event-triggered workbench (SCADA) with time-triggered programs (Lutin/Lustre). From
Lurette to SCADA, we generate an event each time a variable value changes (by more than a
given threshold). From SCADA to Lurette, we perform a periodic sampling of the variable
values. This sampling is done at 1 Hz, to avoid data races and to remain
deterministic and reproducible: indeed, 1 second is enough for the SUT to handle all the
events resulting from a change of all the interface variables. Note that it would have been
easy and interesting to test what happens at higher rates.
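For instance, the Lurette-to-SCADA direction can be seen as a small synchronous observer per variable; the following Lustre node is our own illustrative sketch of this event-generation logic, not the actual glue code:

-- Emit an event whenever the monitored value changes by more than a threshold.
node change_event(v, threshold: real) returns (evt: bool);
var d: real;
let
  d = 0.0 -> (v - pre v);                    -- variation since the previous sample
  evt = (d > threshold) or (d < -threshold); -- no event at the first instant
tel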
The « Expected result » column of Table 3 in Lustre. In order to detect bad behaviors,
we formalize the observation column of the CRT_019_S04 test plan with a Lustre
oracle that monitors the following inputs: the step number (sn ∈ [1,7]); the current
zone (czone ∈ [1,5]); the alarm of zone 2 (A2); the elected domain (d_elec); the cur-
rent timestamp (ts_c); and the timestamp of alarm A2 (ts_a2). Here again, we have
shortened variable names for the sake of readability.
node crt019_s04(sn: int; czone: int; A2: bool; d_elec, ts_c, ts_a2: int)
returns (ok: bool);
var ok1, ok2, ok3, ok4, ok5, ok6, ok7: bool; lts_a2: int;
let
  lts_a2 = 0 -> if r_edge(A2) then ts_c else pre(lts_a2);
  ok1 = (sn=1 => (czone=1));
  ok2 = (sn=2 => (czone=2 and A2));
  ok3 = (sn=3 => (czone=2 and A2 and ts_a2 = lts_a2));
  ok4 = (sn=4 => (czone=2 and A2 and ts_a2 = lts_a2));
  ok5 = (sn=5 => (czone=4 and not A2));
  ok6 = (sn=6 => (czone=2 and d_elec=2 and ts_a2 = ts_c));
  ok7 = (sn=7 => (czone=4 and not A2));
  ok = ok1 and ok2 and ok3 and ok4 and ok5 and ok6 and ok7;
tel
The local variables ok1 to ok7 encode the seven steps of the third column. In order
to « write down the timestamp » at step 2, we define a local variable lts_a2 as follows:
initially set to 0, it takes the value of the current timestamp ts_c when A2 is raised
(r_edge(A2)), and keeps its previous value otherwise (pre(lts_a2)). To encode the
expected result of steps 3 and 4, we compare the timestamp of alarm A2 provided as input
(ts_a2) with its locally computed counterpart (lts_a2).
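The r_edge operator used in the oracle is a rising-edge detector; its definition is not shown here, but a plausible Lustre version is:

-- Rising edge: true when x is true and was false at the previous instant
-- (an initially true x is counted as an edge at the first instant).
node r_edge(x: bool) returns (e: bool);
let
  e = x -> (x and not pre x);
tel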
The « action » column of Table 3 in Lutin. We first present a completely deterministic
Lutin program that mimics the behavior of an operator executing this test plan.
Then we show how slight modifications of this program lead to a stimuli generator
that covers many more cases. Let us first define a few Boolean macros to enhance the
programs' readability. The tfff macro below binds its first parameter to true, and all
the other ones to false.
let tfff (x, y, z, t: bool): bool = x and not y and not z and not t
Similarly, we define ftff, which binds its second parameter to true; f7 and f8 bind
all their parameters to false. The integer input sn is used to choose the instant at which
we change the step. It can be controlled by a physical operator or by another Lutin node
that sequentially assigns values from 1 to 7 (similar to the sn_gen node of Section 2).
The fourteen outputs of this node control the domain to display (display domain i if
ddi is true), to force (force domain i if fdi is true), or to un-force (un-force domain i
if udi is true).
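For completeness, ftff, f7, and f8 could be written as follows, in the same Lutin macro style as tfff (the original definitions are not reproduced here):

let ftff (x, y, z, t: bool): bool = not x and y and not z and not t
let f7 (a, b, c, d, e, f, g: bool): bool =
    not a and not b and not c and not d and not e and not f and not g
let f8 (a, b, c, d, e, f, g, h: bool): bool = f7(a, b, c, d, e, f, g) and not h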
node crt019_s04(sn: int) returns
  (X, Y: real; dd1, dd2, dd3, dd4, fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4: bool) =
loop {
  sn=1 and X=25.0 and Y=40.0 and tfff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
As long as the sn input is equal to 1, the outputs of the crt019_s04 node satisfy
the constraint above, which states that only the first domain should be displayed, and that no
domain is forced or un-forced. X and Y are set inside the authorized zone 1. When sn becomes
equal to 2, control passes to the constraint below, which is the same as the previous
one except that the point is set somewhere in zone 2.
} fby loop {
  sn=2 and X=40.0 and Y=28.0 and tfff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
} fby loop {
  sn=3 and X=40.0 and Y=28.0 and ftff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
} fby loop {
  sn=4 and X=40.0 and Y=28.0 and ftff(dd1, dd2, dd3, dd4) and
  fd3 and f7(fd1, fd2, fd4, ud1, ud2, ud3, ud4)
} fby loop {
  sn=5 and X=40.0 and Y=28.0 and ftff(dd1, dd2, dd3, dd4) and
  fd4 and f7(fd1, fd2, fd3, ud1, ud2, ud3, ud4)
} fby loop {
  sn=6 and X=40.0 and Y=28.0 and ftff(dd1, dd2, dd3, dd4) and
  ud4 and f7(fd1, fd2, fd3, fd4, ud1, ud2, ud3)
} fby loop {
  sn=7 and X=-9.0 and Y=25.0 and ftff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
}
This Lutin program, once run with the oracle of Section 4.3, allows test automation.
However, it suffers from the same flaw as its original non-automated counterpart: it can
be tedious to maintain. Indeed, if, for some reason, the shape of zone 1 is changed, the
chosen point (25,40) might no longer be part of zone 1. Choosing pseudo-randomly any
point in zone 1 using the Lutin constraint solver makes the plan more robust to software
evolution. Moreover, with the same effort, it covers many more cases. In the same spirit,
we can further loosen this plan by replacing "choose a point in the authorized zone 1"
with "choose a point in any authorized zone" (cf. the x_y_gen node of Section 2). In
steps 3, 4, and 5, we could also randomize the choice of the domain to be forced. Actually, by
loosening this plan in this way, we obtain a plan that covers more cases than the twenty
other plans of the test campaign!
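For instance, the first step of the plan could be loosened as follows (a sketch; the numeric bounds standing for zone 1 are hypothetical):

loop {
  -- let the Lutin constraint solver draw any point inside a box contained in zone 1,
  -- instead of hard-coding the point (25, 40)
  sn=1 and 20.0 < X and X < 30.0 and 35.0 < Y and Y < 45.0 and
  tfff(dd1, dd2, dd3, dd4) and f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
}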
The creative part of the work lies in driving the SUT into configurations that exhibit interesting cases. This
is the work of SUT experts. Relieving testers of the tedious and systematic part, and
letting them focus on the interesting parts using high-level languages, could restore in-
terest in testing, which often has a poor reputation. Writing Lutin programs is a creative
activity, and generalizing its use could ease the relocation of test teams.
5 Related Work
Automating the test decision with executable oracles is a simple and helpful idea used
by many others. The real distinction between Lurette and other tools lies in the way
the SUT inputs are generated. In the following, we group approaches according to their in-
put generation technique: source-based, model-based, or environment-model based.
We found no work dealing with the automated testing of SCADA systems. For dynamical
systems workbenches (such as Alices), the literature is quite abundant, and mostly con-
cerns Simulink [9]. Hence we focus here on works targeting Simulink, and refer to the
related work section of [3] for a broader and complementary positioning of Lurette in
the testing of reactive systems.
Source Code Based Testing (White-Box). The white-box testing approach consists
in trying to increase structural coverage by analysing the SUT source using techniques
coming from formal verification, such as model-checking [10], constraint solving, or
search-based exploration [11,12]. Such approaches are completely automated, but can
face the same limitations as formal verification with respect to state-space
explosion. Several industrial tools use white-box techniques to test Simulink designs,
e.g., Safety Test Builder [13] or Design Verifier [14].
Model-Based Testing (Grey-Box). A very popular approach in the literature [15,16]
consists in viewing the SUT as a black box and designing a more or less detailed model
of it. This model should be faithful enough to provide valuable insights, and small
enough to be analyzable. The model structure is sometimes used to define coverage
criteria. The model is used both for the test decision and for the stimuli generation. T-VEC
[17,18] and Reactis Tester [19] are industrial tools using this approach to generate
tests offline. With Lurette, we also use a model of the SUT, but this model is only used
for oracles. The input generation relies on the exploration of environment models.
A way to combine this approach with Lurette would be to use such models of the SUT
to generate Lutin scenarios that guide the SUT to specific states and increase coverage.
Environment Model Based Testing (Black-Box). While the white-box approach in-
tends to increase structural coverage, the main goal of black-box testing is to increase
(functional) requirements coverage [20]. Time Partition Testing (TPT) is an industrial
black-box tool distributed by Piketec [21]. Like Lurette, TPT has its own formalism to
model the environment and automate the SUT stimulation [22,23]. It is a graphical for-
malism based on hierarchical hybrid automata that is able to react online to the SUT
outputs. The major difference with Lutin is that those automata are deterministic. TPT
uses Python oracles to automate the test decision, although Lustre is arguably better suited for
specifying high-level timed properties.
Another way to explore the environment state space, which has been experimented with
on Simulink programs [24,25], is to perform heuristic search (evolutionary algorithms,
simulated annealing [26]). The idea is to associate with each SUT input a set of possible
parametrized generators (ramp, sine, impulse, spline). The search algorithms generate
input sequences by varying several parameters, such as the number of steps during which each
generator is used, their order, or the amplitude of the signal. A fitness function estimates
the distance of the trace to the requirements. Then another trace is generated with
other parameters, until an optimal solution is found. A limitation of these generators is
that they are not able to react to the SUT outputs. More generally, for systems that have
a complex internal state, it can be very difficult to drive the SUT into a specific operating
mode; to do that, expert knowledge is mandatory (as is the ability to react
to the SUT outputs). Instead of guiding a random exploration via heuristics, the Lurette
proposal consists in asking experts to write programs that perform a guided random
exploration of the SUT input state space. A way to combine both approaches could be
to let evolutionary algorithms choose some parameters of the Lutin programs, such
as choice-point weights or variable bounds.
6 Conclusion
The main lesson of the first experiment is that writing executable requirements is
not that difficult and leads to precise and consistent requirements. This study
gave new insights to Corys engineers on one of their most frequently used objects.
The second experiment demonstrates a way to automate the execution of timed
test plans. Test plans are commonly used in industry, and automating their execution
aroused great interest among our industrial partners. Lutin and Lustre improve
their use by permitting the design of more abstract test plans that are more robust to
temporal and data changes. One noteworthy outcome of this study is that the resulting
randomized and automated test plan actually covers more cases than the 21 test plans of the
original test suite.
There is a synergy between automated oracles and automated stimulus generation.
Indeed, generating thousands of simulation traces would be useless without an automatic
test decision. Conversely, designing executable requirements to automate the verdict
on a few manually generated scenarios might not be worth the effort.
This work also demonstrates that synchronous languages are not only useful for de-
signing critical systems (as the success of Scade attests), but can also be
used to validate dynamic systems models (Alices) or event-based asynchronous systems
(SCADA). The language-based approach of Lurette allows performing several kinds of
tests (unit, integration, system, non-regression) in various domains [3,5].
From an industrial use perspective, a general-purpose library and specialized
domain-specific ones remain to be developed. That situation may progress in the near fu-
ture, as the interest expressed in Lurette by the three industrial partners of the COMON
project is one of the reasons that convinced people to found the Argosim company
in 2013. Argosim is developing the Stimulus tool based on the Lurette principles [27].
References
1. Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous dataflow programming
language Lustre. Proceedings of the IEEE 79(9), 1305–1320 (1991)
2. Raymond, P., Roux, Y., Jahier, E.: Lutin: a language for specifying and executing reactive
scenarios. EURASIP Journal on Embedded Systems (2008)
3. Jahier, E., Halbwachs, N., Raymond, P.: Engineering functional requirements of reactive
systems using synchronous languages. In: International Symposium on Industrial Embedded
Systems, SIES 2013, Porto, Portugal (2013)
4. Halbwachs, N., Fernandez, J.-C., Bouajjani, A.: An executable temporal logic to express
safety properties and its connection with the language Lustre. In: ISLIP 1993, Quebec (1993)
5. Jahier, E., Raymond, P., Baufreton, P.: Case studies with Lurette V2. Software Tools for
Technology Transfer 8(6), 517–530 (2006)
6. Jahier, E., Raymond, P.: Generating random values using binary decision diagrams and con-
vex polyhedra. In: CSTVA, Nantes, France (2006)
7. Raymond, P.: Synchronous program verification with Lustre/Lesar. In: Modeling and Verifica-
tion of Real-Time Systems. ISTE/Wiley (2008)
8. Bailey, D., Wright, E.: Practical SCADA for industry. Elsevier (2003)
9. The Mathworks: Simulink/Stateflow, http://www.mathworks.com
10. Hamon, G., de Moura, L., Rushby, J.: Generating efficient test sets with a model checker. In:
Software Engineering and Formal Methods, pp. 261–270 (2004)
11. Satpathy, M., Yeolekar, A., Ramesh, S.: Randomized directed testing (REDIRECT) for
Simulink/Stateflow models. In: Proceedings of the 8th ACM International Conference on Em-
bedded Software, EMSOFT 2008, pp. 217–226. ACM, New York (2008)
12. Zhan, Y., Clark, J.A.: A search-based framework for automatic testing of MATLAB/Simulink
models. Journal of Systems and Software 81(2), 262–285 (2008)
13. TNI Software: Safety Test Builder,
http://www.geensoft.com/fr/article/safetytestbuilder/
14. The Mathworks: Design Verifier, http://www.mathworks.com/products
15. Broy, M., Jonsson, B., Katoen, J.-P., Leucker, M., Pretschner, A. (eds.): Model-Based Testing
of Reactive Systems. LNCS, vol. 3472. Springer, Heidelberg (2005)
16. Zander, J., Schieferdecker, I., Mosterman, P.J.: 1. In: A Taxonomy of Model-based Testing
for Embedded Systems from Multiple Industry Domains, pp. 3–22. CRC Press (2011)
17. T-VEC: T-VEC Tester, http://www.t-vec.com
18. Blackburn, M., Busser, R., Nauman, A., Knickerbocker, R., Kasuda, R.: Mars polar lander
fault identification using model-based testing. In: 8th IEEE International Conference on En-
gineering of Complex Computer Systems, pp. 163–169 (2002)
19. Reactive Systems: Testing and validation of Simulink models with Reactis. White paper
20. Cu, C., Jeppu, Y., Hariram, S., Murthy, N., Apte, P.: A new input-output based model cover-
age paradigm for control blocks. In: 2011 IEEE Aerospace Conference, pp. 1–12 (2011)
21. Piketec: TPT, http://www.piketec.com
22. Lehmann, E.: Time partition testing: A method for testing dynamic functional behaviour. In:
Proceedings of TEST 2000, London, Great Britain (2000)
23. Bringmann, E., Kramer, A.: Model-based testing of automotive systems. In: 2008 1st Inter-
national Conference on Software Testing, Verification, and Validation, pp. 485–493 (2008)
24. Vos, T.E., Lindlar, F.F., Wilmes, B., Windisch, A., Baars, A.I., Kruse, P.M., Gross, H., We-
gener, J.: Evolutionary functional black-box testing in an industrial setting. Software Quality
Control 21(2), 259–288 (2013)
25. Baresel, A., Pohlheim, H., Sadeghipour, S.: Structural and functional sequence test of dy-
namic and state-based software with evolutionary algorithms. In: Cantú-Paz, E., et al. (eds.)
GECCO 2003. LNCS, vol. 2724, pp. 2428–2441. Springer, Heidelberg (2003)
26. McMinn, P.: Search-based software test data generation: a survey: Research articles. Softw.
Test. Verif. Reliab. 14(2), 105–156 (2004)
27. Argosim: Stimulus, http://www.argosim.com