
LNCS 8413 · ARCoSS

Erika Ábrahám
Klaus Havelund (Eds.)

Tools and Algorithms
for the Construction
and Analysis of Systems

20th International Conference, TACAS 2014
Held as Part of the European Joint Conferences
on Theory and Practice of Software, ETAPS 2014
Grenoble, France, April 5–13, 2014, Proceedings
Lecture Notes in Computer Science 8413
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, UK
Takeo Kanade, USA
Josef Kittler, UK
Jon M. Kleinberg, USA
Alfred Kobsa, USA
Friedemann Mattern, Switzerland
John C. Mitchell, USA
Moni Naor, Israel
Oscar Nierstrasz, Switzerland
C. Pandu Rangan, India
Bernhard Steffen, Germany
Doug Tygar, USA
Demetri Terzopoulos, USA
Gerhard Weikum, Germany

Advanced Research in Computing and Software Science
Subline of Lecture Notes in Computer Science

Subline Series Editors
Giorgio Ausiello, University of Rome ‘La Sapienza’, Italy
Vladimiro Sassone, University of Southampton, UK

Subline Advisory Board
Susanne Albers, University of Freiburg, Germany
Benjamin C. Pierce, University of Pennsylvania, USA
Bernhard Steffen, University of Dortmund, Germany
Deng Xiaotie, City University of Hong Kong
Jeannette M. Wing, Microsoft Research, Redmond, WA, USA
Erika Ábrahám
Klaus Havelund (Eds.)

Tools and Algorithms
for the Construction
and Analysis of Systems

20th International Conference, TACAS 2014
Held as Part of the European Joint Conferences
on Theory and Practice of Software, ETAPS 2014
Grenoble, France, April 5-13, 2014
Proceedings
Volume Editors
Erika Ábrahám
RWTH Aachen University
Aachen, Germany
E-mail: [email protected]
Klaus Havelund
Jet Propulsion Laboratory
California Institute of Technology
Pasadena, CA, USA
E-mail: [email protected]

ISSN 0302-9743 e-ISSN 1611-3349


ISBN 978-3-642-54861-1 e-ISBN 978-3-642-54862-8
DOI 10.1007/978-3-642-54862-8
Springer Heidelberg New York Dordrecht London

Library of Congress Control Number: 2014934147

LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues


© Springer-Verlag Berlin Heidelberg 2014
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and
executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication
or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location,
in its current version, and permission for use must always be obtained from Springer. Permissions for use
may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution
under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication,
neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or
omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Foreword

ETAPS 2014 was the 17th instance of the European Joint Conferences on The-
ory and Practice of Software. ETAPS is an annual federated conference that
was established in 1998. This year it consisted of six constituent conferences
(CC, ESOP, FASE, FoSSaCS, TACAS, and POST), which together featured
eight invited speakers and two tutorial speakers. Before and after the main
conference, numerous satellite workshops took place and attracted many
researchers from all over the globe.
ETAPS is a confederation of several conferences, each with its own Program
Committee (PC) and its own Steering Committee (if any). The conferences cover
various aspects of software systems, ranging from theoretical foundations to pro-
gramming language developments, compiler advancements, analysis tools, formal
approaches to software engineering, and security. Organizing these conferences
in a coherent, highly synchronized program makes ETAPS an exciting event at
which participants can meet many researchers working in different directions in
the field and easily attend talks across the constituent conferences.
The six main conferences together received 606 submissions this year, 155 of
which were accepted (including 12 tool demonstration papers), yielding an over-
all acceptance rate of 25.6%. I thank all authors for their interest in ETAPS, all
reviewers for the peer reviewing process, the PC members for their involvement,
and in particular the PC co-chairs for running this entire intensive process. Last
but not least, my congratulations to all authors of the accepted papers!
ETAPS 2014 was greatly enriched by the invited talks of Geoffrey Smith
(Florida International University, USA) and John Launchbury (Galois, USA),
both unifying speakers, and the conference-specific invited speakers (CC) Benoît
Dupont de Dinechin (Kalray, France), (ESOP) Maurice Herlihy (Brown
University, USA), (FASE) Christel Baier (Technical University of Dresden, Ger-
many), (FoSSaCS) Petr Jančar (Technical University of Ostrava, Czech Repub-
lic), (POST) David Mazières (Stanford University, USA), and finally (TACAS)
Orna Kupferman (Hebrew University Jerusalem, Israel). Invited tutorials were
provided by Bernd Finkbeiner (Saarland University, Germany) and Andy Gor-
don (Microsoft Research, Cambridge, UK). My sincere thanks to all these speak-
ers for their great contributions.
For the first time in its history, ETAPS returned to a city where it had been
organized before: Grenoble, France. ETAPS 2014 was organized by the Univer-
sité Joseph Fourier in cooperation with the following associations and societies:
ETAPS e.V., EATCS (European Association for Theoretical Computer Science),
EAPLS (European Association for Programming Languages and Systems), and
EASST (European Association of Software Science and Technology). It had
support from the following sponsors: CNRS, Inria, Grenoble INP, PERSYVAL-
Lab, Université Joseph Fourier, and Springer-Verlag.

The organization team comprised:

General Chair: Saddek Bensalem
Conferences Chair: Alain Girault and Yassine Lakhnech
Workshops Chair: Axel Legay
Publicity Chair: Yliès Falcone
Treasurer: Nicolas Halbwachs
Webmaster: Marius Bozga

The overall planning for ETAPS is the responsibility of the Steering Commit-
tee (SC). The ETAPS SC consists of an executive board (EB) and representa-
tives of the individual ETAPS conferences, as well as representatives of EATCS,
EAPLS, and EASST. The Executive Board comprises Gilles Barthe (satellite
events, Madrid), Holger Hermanns (Saarbrücken), Joost-Pieter Katoen (chair,
Aachen and Twente), Gerald Lüttgen (treasurer, Bamberg), and Tarmo Uustalu
(publicity, Tallinn). Other current SC members are: Martín Abadi (Santa Cruz
and Mountain View), Erika Ábrahám (Aachen), Roberto Amadio (Paris), Chris-
tel Baier (Dresden), Saddek Bensalem (Grenoble), Giuseppe Castagna (Paris),
Albert Cohen (Paris), Alexander Egyed (Linz), Riccardo Focardi (Venice), Björn
Franke (Edinburgh), Stefania Gnesi (Pisa), Klaus Havelund (Pasadena), Reiko
Heckel (Leicester), Paul Klint (Amsterdam), Jens Knoop (Vienna), Steve Kre-
mer (Nancy), Pasquale Malacaria (London), Tiziana Margaria (Potsdam), Fabio
Martinelli (Pisa), Andrew Myers (Boston), Anca Muscholl (Bordeaux), Catuscia
Palamidessi (Palaiseau), Andrew Pitts (Cambridge), Arend Rensink (Twente),
Don Sannella (Edinburgh), Vladimiro Sassone (Southampton), Ina Schäfer (Braun-
schweig), Zhong Shao (New Haven), Gabriele Taentzer (Marburg), Cesare Tinelli
(Iowa), Jan Vitek (West Lafayette), and Lenore Zuck (Chicago).
I sincerely thank all ETAPS SC members for all their hard work in making the
17th ETAPS a success. Moreover, thanks to all speakers, attendants, organizers
of the satellite workshops, and Springer for their support. Finally, many thanks
to Saddek Bensalem and his local organization team for all their efforts enabling
ETAPS to return to the French Alps in Grenoble!

January 2014 Joost-Pieter Katoen


Preface

This volume contains the proceedings of TACAS 2014: the 20th International
Conference on Tools and Algorithms for the Construction and Analysis of Sys-
tems. TACAS 2014 took place during April 7–11, 2014, in Grenoble, France. It
was part of ETAPS 2014: the 17th European Joint Conferences on Theory and
Practice of Software.
TACAS is a forum for researchers, developers, and users interested in rigor-
ously based tools and algorithms for the construction and analysis of systems.
The research areas covered by TACAS include, but are not limited to, formal
methods, software and hardware specification and verification, static analysis,
dynamic analysis, model checking, theorem proving, decision procedures, real-
time, hybrid and stochastic systems, communication protocols, programming
languages, and software engineering. TACAS provides a venue where common
problems, heuristics, algorithms, data structures, and methodologies in these
areas can be discussed and explored.
TACAS 2014 solicited four kinds of papers: three types of full-length papers
(15 pages) and short tool demonstration papers (6 pages):

– Research papers – papers describing novel research.
– Case study papers – papers reporting on case studies (preferably in a “real
life” setting), describing methodologies and approaches used.
– Regular tool papers – papers describing a tool, and focusing on engineering
aspects of the tool, including, e.g., software architecture, data structures,
and algorithms.
– Tool demonstration papers – papers focusing on the usage aspects of tools.

This year TACAS attracted a total of 161 paper submissions, divided into
117 research papers, 11 case study papers, 18 regular tool papers, and 15 tool
demonstration papers. Each submission was refereed by at least three reviewers.
Papers by PC members were refereed by four reviewers. In total, 42 papers were accepted
for presentation at the conference: 26 research papers, 3 case study papers, 6
regular tool papers, and 7 tool demonstration papers. This yields an overall
acceptance rate of 26.1 %. The acceptance rate for full papers (research + case
study + regular tool) was 24.0 %.
TACAS 2014 again hosted the Competition on Software Verification, now in
its third edition. This volume includes an overview of the competition results,
and short papers describing 11 of the 15 tools that participated in the competi-
tion. These papers were reviewed by a separate Program Committee, and each
included paper was refereed by at least four reviewers. The competition was
organized by Dirk Beyer, the Competition Chair. A session in the TACAS pro-
gram was reserved for presenting the results (by the Chair) and the participating
verifiers (by the developer teams).

In addition to refereed contributions, the program included an invited talk by
Orna Kupferman. TACAS took place in an exciting and vibrant scientific atmo-
sphere, jointly with five other sister conferences (CC, ESOP, FASE, FoSSaCS,
and POST), with related scientific fields of interest, their invited speakers, and
the ETAPS unifying speakers Geoffrey Smith and John Launchbury.
We would like to thank all of the authors who submitted papers to TACAS
2014, the Program Committee members, and additional reviewers, without whom
TACAS would not be a success. Nikolaj Bjørner provided invaluable help as
TACAS Tool Chair, and Dirk Beyer as the Chair of the Competition on Software
Verification. We thank the competition teams for participating and showcasing
their tools to the TACAS community. We benefited greatly from the EasyChair
conference management system, which we used to handle the submission, re-
view, discussion, and proceedings preparation processes. Finally, we would like
to thank the TACAS Steering Committee, the ETAPS Steering Committee, and
the ETAPS Organizing Committee chaired by Saddek Bensalem.

January 2014 Erika Ábrahám
Klaus Havelund
Organization

Steering Committee
Rance Cleaveland University of Maryland, USA
Holger Hermanns Saarland University, Germany
Kim Guldstrand Larsen Aalborg University, Denmark
Bernhard Steffen TU Dortmund, Germany
Lenore Zuck University of Illinois at Chicago, USA

Program Committee
Erika Ábrahám RWTH Aachen University, Germany (Co-chair)
Christel Baier Technische Universität Dresden, Germany
Saddek Bensalem Verimag/UJF, France
Nathalie Bertrand Inria/IRISA, France
Armin Biere Johannes Kepler University, Austria
Nikolaj Bjørner Microsoft Research, USA (Tool Chair)
Alessandro Cimatti Fondazione Bruno Kessler, Italy
Rance Cleaveland University of Maryland, USA
Cindy Eisner IBM Research - Haifa, Israel
Martin Fränzle Carl von Ossietzky University Oldenburg,
Germany
Patrice Godefroid Microsoft Research, Redmond, USA
Susanne Graf Verimag, France
Orna Grumberg Technion, Israel
Klaus Havelund NASA/JPL, USA (Co-chair)
Boudewijn Haverkort University of Twente, The Netherlands
Gerard Holzmann NASA/JPL, USA
Barbara Jobstmann Verimag/CNRS, France and EPFL,
Switzerland
Joost-Pieter Katoen RWTH Aachen University, Germany and
University of Twente, The Netherlands
Kim Guldstrand Larsen Aalborg University, Denmark
Roland Meyer TU Kaiserslautern, Germany
Corina Pasareanu NASA Ames Research Center, USA
Doron Peled Bar Ilan University, Israel
Paul Pettersson Mälardalen University, Sweden
Nir Piterman University of Leicester, UK
Jaco van de Pol University of Twente, The Netherlands
Sriram Sankaranarayanan University of Colorado Boulder, USA
Natasha Sharygina Università della Svizzera Italiana, Switzerland
Scott Smolka Stony Brook University, USA
Bernhard Steffen TU Dortmund, Germany
Mariëlle Stoelinga University of Twente, The Netherlands
Cesare Tinelli The University of Iowa, USA
Frits Vaandrager Radboud University Nijmegen,
The Netherlands
Willem Visser University of Stellenbosch, South Africa
Ralf Wimmer University of Freiburg, Germany
Lenore Zuck University of Illinois at Chicago, USA

Program Committee for SV-COMP 2014
Aws Albarghouthi University of Toronto, Canada
Dirk Beyer University of Passau, Germany
(Competition Chair)
Lucas Cordeiro Federal University of Amazonas, Brazil
Stephan Falke Karlsruhe Institute of Technology, Germany
Bernd Fischer Stellenbosch University, South Africa
Arie Gurfinkel SEI, USA
Matthias Heizmann University of Freiburg, Germany
Stefan Löwe University of Passau, Germany
Petr Muller Brno University of Technology, Czech Republic
Vadim Mutilin Russian Academy of Sciences, Russia
Alexander Nutz University of Freiburg, Germany
Gennaro Parlato University of Southampton, UK
Corneliu Popeea TU Munich, Germany
Jiri Slaby Masaryk University at Brno, Czech Republic
Michael Tautschnig Queen Mary University of London, UK
Tomáš Vojnar Brno University of Technology, Czech Republic

Additional Reviewers
Aarts, Fides
Abd Elkader, Karam
Afzal, Wasif
Alberti, Francesco
Aleksandrowicz, Gadi
Alt, Leonardo
Andres, Miguel
Arbel, Eli
Arenas, Puri
Aştefănoaei, Lacramioara
Aucher, Guillaume
Bacci, Giorgio
Bacci, Giovanni
Badouel, Eric
Balasubramanian, Daniel
Bernardo, Marco
Bollig, Benedikt
Bortolussi, Luca
Bouajjani, Ahmed
Bozga, Marius
Bozzano, Marco
Bradley, Aaron
Bruintjes, Harold
Chakraborty, Souymodip
Chen, Xin
Corbineau, Pierre
Corzilius, Florian
Csallner, Christoph
D’Ippolito, Nicolas
Dalsgaard, Andreas Engelbredt
Dang, Thao
David, Alexandre
de Paula, Flavio M.
de Ruiter, Joeri
Defrancisco, Richard
Dehnert, Christian
Derevenetc, Egor
Doganay, Kivanc
Dubslaff, Clemens
Dutertre, Bruno
Eggers, Andreas
Ellen, Christian
Enea, Constantin
Enoiu, Eduard Paul
Estievenart, Morgane
Fabre, Eric
Fedyukovich, Grigory
Feiten, Linus
Ferrere, Thomas
Filieri, Antonio
Fournier, Paulin
Frehse, Goran
Frias, Marcelo
Fu, Hongfei
Gao, Yang
Gario, Marco
Giesl, Jürgen
Girard, Antoine
Goessler, Gregor
Gopalakrishnan, Ganesh
Gretz, Friedrich
Griggio, Alberto
Groote, Jan Friso
Guck, Dennis
Haddad, Serge
Hahn, Ernst Moritz
Hatvani, Leo
Helouet, Loic
Herbreteau, Frederic
Hommersom, Arjen
Howar, Falk
Hyvärinen, Antti
Höfner, Peter
Hölzenspies, Philip
Isberner, Malte
Ivrii, Alexander
Jacobs, Bart
Jacobs, Swen
Jansen, Christina
Jansen, David
Jansen, David N.
Jansen, Nils
Johnsen, Andreas
Jéron, Thierry
Kant, Gijs
Klüppelholz, Sascha
Kneuss, Etienne
Komuravelli, Anvesh
Kordy, Barbara
Kuncak, Viktor
Kupferschmid, Stefan
Laarman, Alfons
Lafourcade, Pascal
Lamprecht, Anna-Lena
Leucker, Martin
Löding, Christof
Luckow, Kasper
Mahdi, Ahmed
Majumdar, Rupak
Maler, Oded
Marin, Paolo
Marinescu, Raluca
McMillan, Kenneth
Meller, Yael
Menet, Quentin
Micheli, Andrea
Monniaux, David
Mooij, Arjan
Mostowski, Wojciech
Mousavi, Mohammad Reza
Mover, Sergio
Naujokat, Stefan
Nellen, Johanna
Neubauer, Johannes
Nevo, Ziv
Nguyen, Viet Yen
Noll, Thomas
Olesen, Mads Chr.
Parker, David
Payet, Etienne
Pidan, Dmitry
Poplavko, Peter
Poulsen, Danny Bøgsted
Prochnow, Steffen
Puch, Stefan
Quilbeuf, Jean
Ranise, Silvio
Reimer, Sven
Remke, Anne
Rojas, Jose
Rollini, Simone Fulvio
Roveri, Marco
Rozier, Kristin Yvonne
Ruah, Sitvanit
Rungta, Neha
Rydhof Hansen, Rene
Rüthing, Oliver
Sadre, Ramin
Sanchez, Cesar
Sangnier, Arnaud
Sankur, Ocan
Sauer, Matthias
Scheibler, Karsten
Schivo, Stefano
Schupp, Stefan
Schwabe, Peter
Seidl, Martina
Shacham, Ohad
Sharma, Arpit
She, Zhikun
Sheinvald, Sarai
Shoham, Sharon
Sosnovich, Adi
Sproston, Jeremy
Srba, Jiri
Srivathsan, Balaguru
Steffen, Martin
Sticksel, Christoph
Suryadevara, Jagadish
Swaminathan, Mani
Sznajder, Nathalie
Te Brinke, Steven
Timmer, Judith
Timmer, Mark
Tkachuk, Oksana
Tonetta, Stefano
Trivedi, Ashutosh
Tzoref-Brill, Rachel
van den Broek, Pim
van der Pol, Kevin
Verriet, Jacques
Vizel, Yakir
Volpato, Michele
von Essen, Christian
von Styp, Sabrina
Weissenbacher, Georg
Widmer, Gerhard
Windmüller, Stephan
Xue, Bingtian
Yan, Rongjie
Yang, Junxing
Yorav, Karen
Zalinescu, Eugen
Zarzani, Niko
Zimmermann, Martin
Zuliani, Paolo

Additional Reviewers for SV-COMP 2014

Andrianov, Pavel
Dudka, Kamil
Inverso, Omar
Mandrykin, Mikhail
Peringer, Petr
Tomasco, Ermenegildo
Table of Contents

Invited Contribution

Variations on Safety . . . . . . 1
Orna Kupferman

Decision Procedures and Their Application in Analysis

Decision Procedures for Flat Array Properties . . . . . . 15
Francesco Alberti, Silvio Ghilardi, and Natasha Sharygina

SATMC: A SAT-Based Model Checker for Security-Critical Systems . . . . . . 31
Alessandro Armando, Roberto Carbone, and Luca Compagna

IC3 Modulo Theories via Implicit Predicate Abstraction . . . . . . 46
Alessandro Cimatti, Alberto Griggio, Sergio Mover, and Stefano Tonetta

SMT-Based Verification of Software Countermeasures against Side-Channel Attacks . . . . . . 62
Hassan Eldib, Chao Wang, and Patrick Schaumont

Detecting Unrealizable Specifications of Distributed Systems . . . . . . 78
Bernd Finkbeiner and Leander Tentrup

Synthesizing Safe Bit-Precise Invariants . . . . . . 93
Arie Gurfinkel, Anton Belov, and Joao Marques-Silva

PEALT: An Automated Reasoning Tool for Numerical Aggregation of Trust Evidence . . . . . . 109
Michael Huth and Jim Huan-Pu Kuo

GRASShopper: Complete Heap Verification with Mixed Specifications . . . . . . 124
Ruzica Piskac, Thomas Wies, and Damien Zufferey

Complexity and Termination Analysis

Alternating Runtime and Size Complexity Analysis of Integer Programs . . . . . . 140
Marc Brockschmidt, Fabian Emmes, Stephan Falke, Carsten Fuhs, and Jürgen Giesl

Proving Nontermination via Safety . . . . . . 156
Hong-Yi Chen, Byron Cook, Carsten Fuhs, Kaustubh Nimkar, and Peter O’Hearn

Ranking Templates for Linear Loops . . . . . . 172
Jan Leike and Matthias Heizmann

Modeling and Model Checking Discrete Systems

FDR3 — A Modern Refinement Checker for CSP . . . . . . 187
Thomas Gibson-Robinson, Philip Armstrong, Alexandre Boulgakov, and Andrew W. Roscoe

Concurrent Depth-First Search Algorithms . . . . . . 202
Gavin Lowe

Basic Problems in Multi-View Modeling . . . . . . 217
Jan Reineke and Stavros Tripakis

GPUexplore: Many-Core On-the-Fly State Space Exploration Using GPUs . . . . . . 233
Anton Wijs and Dragan Bošnački

Timed and Hybrid Systems

Forward Reachability Computation for Autonomous Max-Plus-Linear Systems . . . . . . 248
Dieky Adzkiya, Bart De Schutter, and Alessandro Abate

Compositional Invariant Generation for Timed Systems . . . . . . 263
Lacramioara Aştefănoaei, Souha Ben Rayana, Saddek Bensalem, Marius Bozga, and Jacques Combaz

Characterizing Algebraic Invariants by Differential Radical Invariants . . . . . . 279
Khalil Ghorbal and André Platzer

Quasi-Equal Clock Reduction: More Networks, More Queries . . . . . . 295
Christian Herrera, Bernd Westphal, and Andreas Podelski

Are Timed Automata Bad for a Specification Language? Language Inclusion Checking for Timed Automata . . . . . . 310
Ting Wang, Jun Sun, Yang Liu, Xinyu Wang, and Shanping Li

Monitoring, Fault Detection and Identification

Formal Design of Fault Detection and Identification Components Using Temporal Epistemic Logic . . . . . . 326
Marco Bozzano, Alessandro Cimatti, Marco Gario, and Stefano Tonetta

Monitoring Modulo Theories . . . . . . 341
Normann Decker, Martin Leucker, and Daniel Thoma

Temporal-Logic Based Runtime Observer Pairs for System Health Management of Real-Time Systems . . . . . . 357
Thomas Reinbacher, Kristin Yvonne Rozier, and Johann Schumann

Competition on Software Verification

Status Report on Software Verification (Competition Summary SV-COMP 2014) . . . . . . 373
Dirk Beyer

CBMC – C Bounded Model Checker (Competition Contribution) . . . . . . 389
Daniel Kroening and Michael Tautschnig

CPAchecker with Sequential Combination of Explicit-Value Analyses and Predicate Analyses (Competition Contribution) . . . . . . 392
Stefan Löwe, Mikhail Mandrykin, and Philipp Wendler

CPALIEN: Shape Analyzer for CPAChecker (Competition Contribution) . . . . . . 395
Petr Muller and Tomáš Vojnar

Lazy-CSeq: A Lazy Sequentialization Tool for C (Competition Contribution) . . . . . . 398
Omar Inverso, Ermenegildo Tomasco, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato

MU-CSeq: Sequentialization of C Programs by Shared Memory Unwindings (Competition Contribution) . . . . . . 402
Ermenegildo Tomasco, Omar Inverso, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato

ESBMC 1.22 (Competition Contribution) . . . . . . 405
Jeremy Morse, Mikhail Ramalho, Lucas Cordeiro, Denis Nicole, and Bernd Fischer

FrankenBit: Bit-Precise Verification with Many Bits (Competition Contribution) . . . . . . 408
Arie Gurfinkel and Anton Belov

Predator: A Shape Analyzer Based on Symbolic Memory Graphs (Competition Contribution) . . . . . . 412
Kamil Dudka, Petr Peringer, and Tomáš Vojnar

Symbiotic 2: More Precise Slicing (Competition Contribution) . . . . . . 415
Jiri Slaby and Jan Strejček

Ultimate Automizer with Unsatisfiable Cores (Competition Contribution) . . . . . . 418
Matthias Heizmann, Jürgen Christ, Daniel Dietsch, Jochen Hoenicke, Markus Lindenmann, Betim Musa, Christian Schilling, Stefan Wissert, and Andreas Podelski

Ultimate Kojak (Competition Contribution) . . . . . . 421
Evren Ermis, Alexander Nutz, Daniel Dietsch, Jochen Hoenicke, and Andreas Podelski

Specifying and Checking Linear Time Properties

Discounting in LTL . . . . . . 424
Shaull Almagor, Udi Boker, and Orna Kupferman

Symbolic Model Checking of Stutter-Invariant Properties Using Generalized Testing Automata . . . . . . 440
Ala Eddine Ben Salem, Alexandre Duret-Lutz, Fabrice Kordon, and Yann Thierry-Mieg

Synthesis and Learning

Symbolic Synthesis for Epistemic Specifications with Observational Semantics . . . . . . 455
Xiaowei Huang and Ron van der Meyden

Synthesis for Human-in-the-Loop Control Systems . . . . . . 470
Wenchao Li, Dorsa Sadigh, S. Shankar Sastry, and Sanjit A. Seshia

Learning Regular Languages over Large Alphabets . . . . . . 485
Oded Maler and Irini-Eleftheria Mens

Quantum and Probabilistic Systems

Verification of Concurrent Quantum Protocols by Equivalence Checking . . . . . . 500
Ebrahim Ardeshir-Larijani, Simon J. Gay, and Rajagopal Nagarajan

Computing Conditional Probabilities in Markovian Models Efficiently . . . . . . 515
Christel Baier, Joachim Klein, Sascha Klüppelholz, and Steffen Märcker

Permissive Controller Synthesis for Probabilistic Systems . . . . . . 531
Klaus Dräger, Vojtěch Forejt, Marta Kwiatkowska, David Parker, and Mateusz Ujma

Precise Approximations of the Probability Distribution of a Markov Process in Time: An Application to Probabilistic Invariance . . . . . . 547
Sadegh Esmaeil Zadeh Soudjani and Alessandro Abate

Tool Demonstrations

SACO: Static Analyzer for Concurrent Objects . . . . . . 562
Elvira Albert, Puri Arenas, Antonio Flores-Montoya, Samir Genaim, Miguel Gómez-Zamalloa, Enrique Martin-Martin, German Puebla, and Guillermo Román-Díez

VeriMAP: A Tool for Verifying Programs through Transformations . . . . . . 568
Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi, and Maurizio Proietti

CIF 3: Model-Based Engineering of Supervisory Controllers . . . . . . 575
Dirk A. van Beek, Wan J. Fokkink, Dennis Hendriks, Albert Hofkamp, Jasen Markovski, Joanna M. van de Mortel-Fronczak, and Michel A. Reniers

EDD: A Declarative Debugger for Sequential Erlang Programs . . . . . . 581
Rafael Caballero, Enrique Martin-Martin, Adrian Riesco, and Salvador Tamarit

APTE: An Algorithm for Proving Trace Equivalence . . . . . . 587
Vincent Cheval

The Modest Toolset: An Integrated Environment for Quantitative Modelling and Verification . . . . . . 593
Arnd Hartmanns and Holger Hermanns

Bounds2: A Tool for Compositional Multi-parametrised Verification . . . . . . 599
Antti Siirtola

Case Studies

On the Correctness of a Branch Displacement Algorithm . . . . . . 605
Jaap Boender and Claudio Sacerdoti Coen

Analyzing the Next Generation Airborne Collision Avoidance System . . . . . . 620
Christian von Essen and Dimitra Giannakopoulou

Environment-Model Based Testing of Control Systems: Case Studies . . . . . . 636
Erwan Jahier, Simplice Djoko-Djoko, Chaouki Maiza, and Eric Lafont

Author Index . . . . . . 651


Variations on Safety

Orna Kupferman

Hebrew University, School of Engineering and Computer Science, Jerusalem 91904, Israel
[email protected]

Abstract. Of special interest in formal verification are safety properties, which
assert that the system always stays within some allowed region, in which nothing
“bad” happens. Equivalently, a property is a safety property if every violation of
it occurs after a finite execution of the system. Thus, a computation violates the
property if it has a “bad prefix”, all of whose extensions violate the property. The
theoretical properties of safety properties as well as their practical advantages
with respect to general properties have been widely studied. The paper surveys
several extensions and variations of safety. We start with bounded and checkable
properties – fragments of safety properties that enable an even simpler reasoning.
We proceed to a reactive setting, where safety properties require the system to
stay in a region of states that is both allowed and from which the environment
cannot force it out. Finally, we describe a probability-based approach for defining
different levels of safety.

1 Introduction
Today’s rapid development of complex and safety-critical systems requires reliable veri-
fication methods. In formal verification, we verify that a system meets a desired property
by checking that a mathematical model of the system meets a formal specification that
describes the property. Of special interest are properties asserting that the observed be-
havior of the system always stays within some allowed region, in which nothing “bad”
happens. For example, we may want to assert that every message sent is acknowledged
in the next cycle. Such properties of systems are called safety properties. Intuitively, a
property ψ is a safety property if every violation of ψ occurs after a finite execution of
the system. In our example, if in a computation of the system a message is sent with-
out being acknowledged in the next cycle, this occurs after some finite execution of the
system. Also, once this violation occurs, there is no way to “fix” the computation.
In order to formally define what safety properties are, we refer to computations of a
nonterminating system as infinite words over an alphabet Σ. Consider a language L of
infinite words over Σ. A finite word x over Σ is a bad prefix for L iff for all infinite
words y over Σ, the concatenation x·y of x and y is not in L. Thus, a bad prefix for L is
a finite word that cannot be extended to an infinite word in L. A language L is a safety
language if every word not in L has a finite bad prefix. For example, if Σ = {0, 1},
then L = {0ω , 1ω } is a safety language. Indeed, every word not in L contains either the
sequence 01 or the sequence 10, and a prefix that ends in one of these sequences cannot
be extended to a word in L.¹

¹ The definition of safety we consider here is given in [1,2]; it coincides with the definition of
limit closure defined in [12], and is different from the definition in [26], which also refers to
the property being closed under stuttering.
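
As a small, self-contained illustration of these definitions (this sketch and its function name are ours, not part of the paper), the following Python fragment decides bad prefixes for the toy language L = {0^ω, 1^ω} discussed above: a finite prefix is bad exactly when it mixes the two letters.

# Illustrative sketch: bad prefixes of the safety language L = {0^omega, 1^omega}
# over the alphabet {0, 1}.  A prefix is bad iff it contains both letters,
# i.e., it contains "01" or "10" as a subword.

def is_bad_prefix(prefix: str) -> bool:
    """Return True iff no infinite extension of `prefix` belongs to L."""
    return "0" in prefix and "1" in prefix

assert not is_bad_prefix("0000")   # can still be extended to 0^omega
assert is_bad_prefix("0001")       # every infinite extension now leaves L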

The interest in safety started with the quest for natural classes of specifications. The
theoretical aspects of safety have been extensively studied [2,28,29,33]. With the grow-
ing success and use of formal verification, safety has turned out to be interesting also
from a practical point of view [14,20,23]. Indeed, the ability to reason about finite pre-
fixes significantly simplifies both enumerative and symbolic algorithms. In the first,
safety circumvents the need to reason about complex ω-regular acceptance conditions.
For example, methods for temporal synthesis, program repair, or parametric reasoning
are much simpler for safety properties [18,32]. In the second, it circumvents the need
to reason about cycles, which is significant in both BDD-based and SAT-based meth-
ods [5,6]. In addition to a rich literature on safety, researchers have studied additional
classes, such as liveness and co-safety properties [2,28].
The paper surveys several extensions and variations of safety. We start with bounded
and checkable properties – fragments of safety properties that enable an even simpler
reasoning. We proceed to a reactive setting, where safety properties require the system
to stay in a region of states that is both allowed and from which the environment cannot
force it out. Finally, we describe a probability-based approach for defining different
levels of safety. The survey is based on the papers [24], with Moshe Y. Vardi, [21],
with Yoad Lustig and Moshe Y. Vardi, [25], with Sigal Weiner, and [10], with Shoham
Ben-David.

2 Preliminaries

Safety and Co-Safety Languages. Given an alphabet Σ, a word over Σ is a (possibly
infinite) sequence w = σ0 · σ1 · · · of letters in Σ. Consider a language L ⊆ Σ ω of
infinite words. A finite word x ∈ Σ ∗ is a bad prefix for L iff for all y ∈ Σ ω , we have
x · y ∉ L. Thus, a bad prefix is a finite word that cannot be extended to an infinite
word in L. Note that if x is a bad prefix, then all the finite extensions of x are also bad
prefixes. A language L is a safety language iff every infinite word w ∉ L has a finite bad
prefix. For a safety language L, we denote by bad-pref (L) the set of all bad prefixes for
L.
For a language L ⊆ Σ ω , we use comp(L) to denote the complement of L; i.e.,
comp(L) = Σ ω \ L. A language L ⊆ Σ ω is a co-safety language iff comp(L) is a safety
language. (The term used in [28] is guarantee language.) Equivalently, L is co-safety iff
every infinite word w ∈ L has a good prefix x ∈ Σ ∗ : for all y ∈ Σ ω , we have x · y ∈ L.
For a co-safety language L, we denote by good-pref (L) the set of good prefixes for L.
Note that for a safety language L, we have that good-pref (comp(L)) = bad-pref (L).

Word Automata. A nondeterministic Büchi word automaton (NBW, for short) is A =
⟨Σ, Q, δ, Q0 , F ⟩, where Σ is the input alphabet, Q is a finite set of states, δ : Q × Σ →
2Q is a transition function, Q0 ⊆ Q is a set of initial states, and F ⊆ Q is a set of
accepting states. If |Q0 | = 1 and δ is such that for every q ∈ Q and σ ∈ Σ, we have
that |δ(q, σ)| ≤ 1, then A is a deterministic Büchi word automaton (DBW, for short).
Given an input word w = σ0 · σ1 · · · in Σ ω , a run of A on w is a sequence r0 , r1 , . . .
of states in Q such that r0 ∈ Q0 and for every i ≥ 0, we have ri+1 ∈ δ(ri , σi ). For
a run r, let inf (r) denote the set of states that r visits infinitely often. That is,
inf (r) = {q ∈ Q : ri = q for infinitely many i ≥ 0}. As Q is finite, it is guaranteed
that inf (r) ≠ ∅. The run r is accepting iff inf (r) ∩ F ≠ ∅. That is, iff there exists
a state in F that r visits infinitely often. A run that is not accepting is rejecting. When
F = Q, we say that A is a looping automaton. We use NLW and DLW to denote non-
deterministic and deterministic looping automata. An NBW A accepts an input word w
iff there exists an accepting run of A on w. The language of an NBW A, denoted L(A),
is the set of words that A accepts. We assume that a given NBW A has no empty states,
except maybe the initial state (that is, at least one word is accepted from each state –
otherwise we can remove the state).
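
For deterministic automata and ultimately periodic inputs, the acceptance condition can be checked effectively. The following Python sketch is our own illustration (the dictionary-based encoding of δ and the function name are assumptions, not notation from the paper): it runs a DBW on a word of the form u · v^ω and tests whether the cycle that the run eventually enters visits an accepting state, i.e., whether inf(r) ∩ F ≠ ∅.

# Sketch: does a DBW accept the ultimately periodic word u * v^omega?
# delta[(state, letter)] -> state; q0 is the initial state; F the accepting set.
# Since the automaton is deterministic, the run becomes periodic; we detect the
# period by recording the state reached after each full copy of v.

def dbw_accepts_lasso(delta, q0, F, u, v):
    q = q0
    for a in u:                       # read the finite prefix u
        q = delta[(q, a)]
    seen = {}                         # state after i copies of v -> i
    history = [q]
    while q not in seen:
        seen[q] = len(seen)
        for a in v:                   # read one more copy of v
            q = delta[(q, a)]
        history.append(q)
    # States visited infinitely often: those traversed between the first and the
    # second occurrence of the repeating checkpoint state.
    start = seen[q]
    inf_states = set()
    p = history[start]
    for _ in range(len(history) - 1 - start):
        for a in v:
            inf_states.add(p)
            p = delta[(p, a)]
    inf_states.add(p)
    return any(s in F for s in inf_states)

# Tiny usage example: a DBW for "infinitely many a's" over {a, b}.
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 1, (1, 'b'): 0}
assert dbw_accepts_lasso(delta, 0, {1}, "", "ab")     # (ab)^omega is accepted
assert not dbw_accepts_lasso(delta, 0, {1}, "a", "b") # a * b^omega is rejected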

Linear Temporal Logic. The logic LTL is a linear temporal logic. Formulas of LTL are
constructed from a set AP of atomic propositions using the usual Boolean operators
and the temporal operators G (“always”), F (“eventually”), X (“next time”), and U
(“until”). Formulas of LTL describe computations of systems over AP . For example,
the LTL formula G(req → F ack ) describes computations in which every position in
which req holds is eventually followed by a position in which ack holds. Thus, each
LTL formula ψ corresponds to a language, denoted ||ψ||, of words in (2AP )ω that satisfy
it. For the detailed syntax and semantics of LTL, see [30]. The model-checking problem
for LTL is to determine, given an LTL formula ψ and a system M , whether all the
computations of M satisfy ψ.
General methods for LTL model checking are based on translation of LTL formulas
to nondeterministic Büchi word automata. By [36], given an LTL formula ψ, one can
construct an NBW Aψ over the alphabet 2AP that accepts exactly all the computations
that satisfy ψ. The size of Aψ is, in the worst case, exponential in the length of ψ.
Given a system M and an LTL formula ψ, model checking of M with respect to ψ is
reduced to checking the emptiness of the product of M and A¬ψ [36]. This check can
be performed on-the-fly and symbolically [7,35], and the complexity of model checking
that follows is PSPACE, with a matching lower bound [34].
It is shown in [2,33,22] that when ψ is a safety formula, we can assume that all the
states in Aψ are accepting. Indeed, Aψ accepts exactly all words all of whose prefixes
have at least one extension accepted by Aψ , which is what we get if we define all
the states of Aψ to be accepting. Thus, safety properties can be recognized by NLWs.
Since every NLW can be determinized to an equivalent DLW by applying the subset
construction, all safety formulas can be translated to DLWs.
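
To make the practical advantage of safety concrete, here is a minimal sketch (ours, not the construction of [36] or [7,35]) of model checking a safety property by plain reachability: we assume a deterministic finite automaton for the bad prefixes is already available, and the system is given as a labeled graph. The names of the data structures are our own.

# Sketch: safety model checking as reachability, with no cycle analysis.
# System: initial states, successors[s] (a set), and label[s] in Sigma = 2^AP.
# Property: a DFA for bad-pref(psi), with transition function dfa_delta,
# initial state dfa_q0, and a set dfa_bad of states that flag a bad prefix.
# The property is violated iff a product state with a "bad" DFA component is
# reachable, i.e., iff some reachable finite path of the system spells a bad prefix.

from collections import deque

def violates_safety(init_states, successors, label, dfa_delta, dfa_q0, dfa_bad):
    frontier = deque()
    visited = set()
    for s in init_states:
        q = dfa_delta[(dfa_q0, label[s])]
        frontier.append((s, q))
        visited.add((s, q))
    while frontier:
        s, q = frontier.popleft()
        if q in dfa_bad:
            return True               # a bad prefix has been generated
        for t in successors[s]:
            q2 = dfa_delta[(q, label[t])]
            if (t, q2) not in visited:
                visited.add((t, q2))
                frontier.append((t, q2))
    return False                      # no reachable bad prefix: M satisfies psi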

3 Interesting Fragments

In this section we discuss two interesting fragments of safety properties: clopen (a.k.a.
bounded) properties, which are useful in bounded model checking, and checkable prop-
erties, which are useful in real-time monitoring.

3.1 Clopen Properties

Bounded model checking methodologies check the correctness of a system with respect
to a given specification by examining computations of a bounded length. Results from
set-theoretic topology imply that sets in Σ ω that are both open and closed (clopen sets)
are bounded: membership in a clopen set can be determined by examining a bounded
number of letters in Σ.
In [24] we studied safety properties from a topological point of view. We showed
that clopen sets correspond to properties that are both safety and co-safety, and showed
that when clopen specifications are given by automata or LTL formulas, we can point
to a bound and translate the specification to bounded formalisms such as bounded LTL
and cycle-free automata.

Topology. Consider a set X and a distance function d : X × X → IR between the
elements of X. For an element x ∈ X and γ ≥ 0, let K(x, γ) be the set of elements
x′ such that d(x, x′ ) ≤ γ. Consider a set S ⊆ X. An element x ∈ S is called an
interior element of S if there is γ > 0 such that K(x, γ) ⊆ S. The set S is open if all
the elements in S are interior. A set S is closed if X \ S is open. So, a set S is open
if every element in S has a nonempty “neighborhood” contained in S, and a set S is
closed if every element not in S has a nonempty neighborhood whose intersection with
S is empty. A set that is both open and closed is called a clopen set.
A Cantor space consists of X = Dω , for some finite set D, and d defined by
d(w, w′ ) = 1/2^n , where n is the first position where w and w′ differ. Thus, elements
of X can be viewed as infinite words over D and two words are close to each other if
they have a long common prefix. If w = w′ , then d(w, w′ ) = 0. It is known that clopen
sets in Cantor space are bounded, where a set S is bounded if it is of the form W · Dω
for some finite set W ⊆ D∗ . Hence, clopen sets in our Cantor space correspond exactly
to bounded properties: each clopen language L ⊆ Σ ω has a bound k ≥ 0 such that
membership in L can be determined by the prefixes of length k of words in Σ ω .
It is not hard to see that a language L ⊆ Σ ω is co-safety iff L is an open set in our
Cantor space [27,17]. To see this, consider a word w in a co-safety language L, and let
x be a good prefix of w. All the words w′ with d(w, w′ ) ≤ 1/2^|x| have x as their prefix,
so they all belong to L. For the second direction, consider a word w in an open set L,
and let γ > 0 be such that K(w, γ) ⊆ L. The prefix of w of length log(1/γ) is a good
prefix for L. It follows that the clopen sets in Σ ω are exactly those properties that are
both safety and co-safety!

Bounding Clopen Properties. Our goal in this section is to identify a bound for a
clopen property given by an automaton. Consider a clopen language L ⊆ Σ ω . For
a finite word x ∈ Σ ∗ , we say that x is undetermined with respect to L if there are
y ∈ Σ ω and z ∈ Σ ω such that x · y ∈ L and x · z ∉ L. As shown in [24], every word in
Σ ω has only finitely many prefixes that are undetermined with respect to L. It follows
that L is bounded: there are only finitely many words in Σ ∗ that are undetermined with
respect to L. For an integer k, we say that L is bounded by k if all the words x ∈ Σ ∗
such that |x| ≥ k are determined with respect to L. Moreover, since L is bounded, a
minimal DLW that recognizes L must be cycle free. Indeed, otherwise we can pump
a cycle to infinitely many undetermined prefixes. Let diameter (L) be the diameter of
the minimal DLW for L.
Lemma 1. A clopen ω-regular language L ⊆ Σ ω is bounded by diameter (L).

Proof: Let A be the minimal deterministic looping automaton for L. Consider a word
x ∈ Σ ∗ with |x| ≥ diameter (L). Since A is cycle free, its run on x either reaches
an accepting sink, in which case x is a good prefix, or it does not reach an accepting
sink, in which case, by the definition of diameter (A), we cannot extend x to a word
accepted by A, thus x is a bad prefix.
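
The bound of Lemma 1 is easy to compute once the cycle-free DLW is at hand. The following Python sketch is our own illustration (the representation succ[q], mapping a state to its set of successor states over all letters, is an assumption): since the reachable part of the automaton is a DAG, the diameter is the length of the longest path from the initial state, and membership in the clopen language is then determined by prefixes of that length.

# Sketch: diameter of a cycle-free deterministic looping automaton,
# computed as the longest path (in edges) from the initial state q0.
# succ[q] must be defined (possibly empty) for every reachable state q.

from functools import lru_cache

def diameter(succ, q0):
    @lru_cache(maxsize=None)
    def longest_from(q):
        # 0 for a state with no successors, otherwise 1 + longest successor path
        return 1 + max((longest_from(t) for t in succ[q]), default=-1)
    return longest_from(q0)

# Example: a chain q0 -> q1 -> q2 has diameter 2.
assert diameter({0: {1}, 1: {2}, 2: set()}, 0) == 2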

For a language L, the in index of L, denoted inindex (L), is the minimal num-
ber of states that an NBW recognizing L has. Similarly, the out index of L, denoted
outindex (L), is the minimal number of states that an NBW recognizing comp(L) has.

Lemma 2. A clopen ω-regular language L ⊆ Σ ω is bounded by inindex (L) ·
outindex (L).

Proof: Assume by way of contradiction that there is a word x ∈ Σ ∗ such that |x| ≥
inindex (L) · outindex (L) and x is undetermined with respect to L. Thus, there are
suffixes y and z such that x · y ∈ L and x · z ∉ L. Let A1 and A2 be nondeterminis-
tic looping automata such that L(A1 ) = L, L(A2 ) = comp(L), and A1 and A2 have
inindex (L) and outindex (L) states, respectively. Consider two accepting runs r1 and
r2 of A1 and A2 on x · y and x · z, respectively. Since |x| ≥ inindex (L) · outindex (L),
there are two prefixes x[1, . . . , i] and x[1, . . . , j] of x such that i < j and both runs re-
peat their state after these two prefixes; i.e., r1 (i) = r1 (j) and r2 (i) = r2 (j). Consider
the word x′ = x[1, . . . , i] · x[i + 1, . . . , j]ω . Since A1 is a looping automaton, the run r1
induces an accepting run r1′ of A1 on x′ . Formally, for all l ≤ i we have r1′ (l) = r1 (l)
and for all l > i, we have r1′ (l) = r1 (i + ((l − i) mod (j − i))). Similarly, the run r2
induces an accepting run of A2 on x′ . It follows that x′ is accepted by both A1 and A2 ,
contradicting the fact that L(A2 ) = comp(L(A1 )).

3.2 Checkable Properties


For an integer k ≥ 1, a language L ⊆ Σ ω is k-checkable if there is a language R ⊆
Σ k (of “allowed subwords”) such that a word w belongs to L iff all the subwords
of w of length k belong to R. A property is locally checkable if its language is k-
checkable for some k. Locally checkable properties, which are a special case of safety
properties, are common in the specification of systems. In particular, one can often
bound an eventuality constraint in a property by a fixed time frame, which results in a
checkable property.
The practical importance of locally checkable properties lies in the low memory
demand for their run-time verification. Indeed, k-checkable properties can be verified
with a bounded memory – one that has access only to the last k-computation cycles.
Run-time verification of a property amounts to executing a monitor together with the
system allowing the detection of errors in run time [20,3,9]. Run-time monitors for
checkable specifications have low memory demand. Furthermore, in the case of general
ω-regular properties, when several properties are checked, we need a monitor for each
property, and since the properties are independent of each other, so are the state spaces
of the monitors. Thus, the memory demand (as well as the resources needed to maintain
the memory) grow linearly with the number of properties monitored. Such a memory
demand is a real problem in practice. In contrast, as shown in [21], a monitor for a k-
checkable property needs only a record of the last k computation cycles. Furthermore,
even if a large number of k-checkable properties are monitored, the monitors can share
their memory, resulting in memory demand of |Σ|k , which is independent of the number
of properties monitored.
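
To make the shared-memory claim concrete, here is a small Python sketch (our own illustration, not the monitor construction of [21]): all monitored k-checkable properties share a single window holding the last k letters of the computation, and each property only keeps its set R_i of allowed k-long subwords.

# Sketch: run-time monitoring of several k-checkable properties with one
# shared window of the last k letters.

from collections import deque

class SharedWindowMonitor:
    def __init__(self, k, allowed_sets):
        self.k = k
        self.allowed_sets = allowed_sets      # one set R_i per property
        self.window = deque(maxlen=k)         # shared memory: last k letters
        self.violated = [False] * len(allowed_sets)

    def step(self, letter):
        """Feed one computation cycle; return indices of violated properties."""
        self.window.append(letter)
        if len(self.window) == self.k:
            w = tuple(self.window)
            for i, allowed in enumerate(self.allowed_sets):
                if w not in allowed:
                    self.violated[i] = True   # a bad prefix has been observed
        return [i for i, v in enumerate(self.violated) if v]

# Example: k = 2, property 0 forbids the subword ('err', 'err').
mon = SharedWindowMonitor(2, [{('ok', 'ok'), ('ok', 'err'), ('err', 'ok')}])
for letter in ['ok', 'err', 'err']:
    bad = mon.step(letter)
print(bad)   # [0]: the window ('err', 'err') is not allowed
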
As in the case of clopen properties, our goal is to identify a bound for a checkable
property given by an automaton. We first need some notations. For a word w ∈ Σ ω and
k ≥ 0, we denote by sub(w, k) the set of finite subwords of w of length k, formally,
sub(w, k) = {y ∈ Σ ∗ : |y| = k and there exist x ∈ Σ ∗ and z ∈ Σ ω such that w =
xyz}. A language L ⊆ Σ ω is k-checkable if there exists a finite language R ⊆ Σ k
such that w ∈ L iff all the k-long subwords of w are in R. That is, L = {w ∈
Σ ω : sub(w, k) ⊆ R}. A language L ⊆ Σ ω is k-co-checkable if there exists a fi-
nite language R ⊆ Σ k such that w ∈ L iff there exists a k-long subword of w that is
in R. That is, L = {w ∈ Σ ω : sub(w, k) ∩ R = ∅}. A language is checkable (co-
checkable) if it is k-checkable (k-co-checkable, respectively) for some k. We refer to k
as the width of L. It is easy to see that all checkable languages are safety, and sim-
ilarly for co-checkable and co-safety. In particular, L is a checkable language induced
by R iff comp(L) is co-checkable, co-induced by Σ k \ R.
In order to demonstrate the subtlety of the width question, consider the following
example.
Example 1. Let Σ = {0, 1, 2}. The DBW A below recognizes the language L of all the
words that contain 10, 120 or 220 as subwords. Note that L is the 3-co-checkable lan-
guage L co-induced by R = {010, 110, 210, 100, 101, 102, 120, 220}. Indeed, a word
w is in L iff sub(w, 3) ∩ R ≠ ∅.

[Figure: the DBW A over Σ = {0, 1, 2}, with states q0 (initial), q1 , q2 , and an accepting
sink qac . As far as can be reconstructed from the transition labels: q0 loops on 0 and moves
to q1 on 1 and to q2 on 2; q1 loops on 1, 2 and moves to qac on 0; q2 moves back to q0 on 0
and to q1 on 1, 2; qac loops on 0, 1, 2.]
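
The correspondence claimed in Example 1 can be sanity-checked on ultimately periodic words. The Python sketch below is ours and uses the transition function as reconstructed in the figure placeholder above (an assumption on our part); it exploits the fact that qac is an absorbing accepting sink, so acceptance of u · v^ω reduces to reaching qac while reading u followed by sufficiently many copies of v.

# Sketch: cross-checking the DBW of Example 1 (transitions as reconstructed
# above) against the co-checkability criterion sub(w, 3) ∩ R ≠ ∅ on lasso words.

R = {'010', '110', '210', '100', '101', '102', '120', '220'}
delta = {('q0', '0'): 'q0', ('q0', '1'): 'q1', ('q0', '2'): 'q2',
         ('q1', '0'): 'qac', ('q1', '1'): 'q1', ('q1', '2'): 'q1',
         ('q2', '0'): 'q0', ('q2', '1'): 'q1', ('q2', '2'): 'q1',
         ('qac', '0'): 'qac', ('qac', '1'): 'qac', ('qac', '2'): 'qac'}

def dbw_accepts(u, v):
    # qac is an absorbing accepting sink, so u * v^omega is accepted iff qac is
    # reached while reading u and then at most |Q| = 4 copies of v.
    q = 'q0'
    for a in u + v * 4:
        q = delta[(q, a)]
        if q == 'qac':
            return True
    return False

def window_accepts(u, v, k=3):
    # Every k-long subword of u * v^omega already occurs in u followed by
    # k + 1 copies of v.
    w = u + v * (k + 1)
    return any(w[i:i + k] in R for i in range(len(w) - k + 1))

for u, v in [('', '10'), ('0', '20'), ('02', '2'), ('1', '1'), ('', '012')]:
    assert dbw_accepts(u, v) == window_accepts(u, v)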

At first sight, it seems that the same considerations applied in Lemma 1 can be used
in order to prove that the width of a checkable language is bounded by the diameter
of the smallest DBW recognizing the language. Indeed, it appears that in an accepting
run, the traversal through the minimal good prefix should not contain a cycle. This
impression, however, is misleading, as demonstrated in the DBW A from Example 1,
where a traversal through the subword 120 contains a cycle, and similarly for 010. The
diameter of the DBW A is 3, so it does not constitute a counterexample to the conjecture
that the diameter bounds the width, but the problem remains open in [21], and the
tightest bound proven there depends on the size of A and not only on its diameter, and
is not even linear. Intuitively, it follows from an upper bound on the size of a DBW that
recognizes minimal bad prefixes of L. Formally, we have the following.

Theorem 1. If a checkable (or co-checkable) language L is recognized by a DBW with
n states, then the width of L is bounded by O(n2 ).
As noted above, the bound in Theorem 1 is not tight and the best known lower bound
is only the diameter of a DBW for L. For the nondeterministic setting, the bound is
tight:
Theorem 2. If a checkable language L is recognized by an NBW with n states, then
the width of L is bounded by 2^O(n) . Also, there exists an NBW A with O(n) states such
that L(A) is k-checkable but not (k − 1)-checkable, for k = (n + 1) · 2^n + 2.

4 Safety in a Reactive Setting

Recall that safety is defined with respect to languages over an alphabet Σ. Typically,
Σ = 2AP , where AP is the set of the system’s atomic propositions. Thus, the definition
and studies of safety treat all the atomic propositions as equal and do not distinguish
between input and output signals. As such, they are suited for closed systems – ones
that do not maintain an interaction with their environment. In open (also called reactive)
systems [19,31], the system interacts with the environment, and a correct system should
satisfy the specification with respect to all environments. A good way to think about
the open setting is to consider the situation as a game between the system and the
environment. The interaction between the players in this game generates a computation,
and the goal of the system is that only computations that satisfy the specification will
be generated.
Technically, one has to partition the set AP of atomic propositions to a set I of input
signals, which the environment controls, and a set O of output signals, which the system
controls. An open system is then an I/O-transducer – a deterministic automaton over
the alphabet 2I in which each state is labeled by an output in 2O . Given a sequence
of assignments to the input signals (each assignment is a letter in 2I ), the run of the
transducer on it induces a sequence of assignments to the output signals (that is, letters
in 2O ). Together these sequences form a computation, and the transducer realizes a
specification ψ if all its computations satisfy ψ [31].
The transition from the closed to the open setting modifies the questions we typically
ask about systems. Most notably, the synthesis challenge, of generating a system that
satisfies the specification, corresponds to the satisfiability problem in the closed setting
and to the realizability problem in the open setting. As another example, the equiva-
lence problem between LTL specifications is different in the closed and open settings
[16]. That is, two specifications may not be equivalent when compared with respect
to arbitrary systems on I ∪ O, but be open equivalent; that is, equivalent when com-
pared with respect to I/O-transducers. To see this, note for example that a satisfiable
yet non-realizable specification is equivalent to false in the open but not in the closed
setting.
As mentioned above, the classical definition of safety does not distinguish between
input and output signals. The definition can still be applied to open systems, as a special
case of closed systems with Σ = 2I∪O . In [11], Ehlers and Finkbeiner introduced reac-
tive safety – a definition of safety for the setting of open systems. Essentially, reactive
safety properties require the system to stay in a region of states that is both allowed and
from which the environment cannot force it out. The definition in [11] is by means of
sets of trees with directions in 2I and labels in 2O . The use of trees naturally locates
reactive safety between linear and branching safety. In [25], we suggested an equivalent
yet differently presented definition, which explicitly uses realizability, and studied the the-
oretical aspects of reactive safety and other reactive fragments of specifications. In this
section, we review the definition and results from [25].
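
The intuition of "a region of allowed states from which the environment cannot force the system out" can be computed as a greatest fixpoint over a finite game graph. The following Python sketch is a generic safety-game solver of our own, not the tree-based construction of [11] or the realizability-based one of [25]; the game representation (which states belong to the system, the move relation, the allowed set) is an assumption made for illustration.

# Sketch: largest W ⊆ allowed from which the system can keep the play inside
# "allowed" forever.  System states: the system picks one successor; at all
# other states the environment may pick any successor.  States without
# successors are treated as safe for the environment player (a design choice
# of this sketch).

def winning_region(states, system_states, moves, allowed):
    W = set(allowed)
    changed = True
    while changed:
        changed = False
        for s in list(W):
            succs = moves[s]
            if s in system_states:
                ok = any(t in W for t in succs)   # the system chooses a move
            else:
                ok = all(t in W for t in succs)   # every environment move must stay in W
            if not ok:
                W.discard(s)
                changed = True
    return W

In this game view, a prefix whose play has left the computed region can no longer be "rescued" by the system, which matches the idea of a system bad prefix defined below.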

4.1 Definitions
We model open systems by transducers. Let I and O be finite sets of input and output
signals, respectively. Given x = i0 · i1 · i2 · · · ∈ (2I )ω and y = o0 · o1 · o2 · · · ∈ (2O )ω ,
we denote their composition by x ⊕ y = (i0 , o0 ) · (i1 , o1 ) · (i2 , o2 ) · · · ∈ (2I∪O )ω . An
I/O-transducer is a tuple T = I, O, S, s0 , η, L, where S is a set of states, s0 ∈ S is
an initial state, η : S × 2I → S is a transition function, and L : S → 2O is a labeling
function. The run of T on a (finite or infinite) input sequence x = i0 · i1 · i2 · · · , with
ij ∈ 2I , is the sequence s0 , s1 , s2 , . . . of states such that sj+1 = η(sj , ij+1 ) for all
j ≥ 0. The computation of T on x is then x ⊕ y, for y = L(s0 ) · L(s1 ) · L(s2 ) · · · Note
that T is responsive and deterministic (that is, it suggests exactly one successor state for
each input letter), and thus T has a single run, generating a single computation, on each
input sequence. We extend η to finite words over 2I in the expected way. In particular,
η(s0 , x), for x ∈ (2I )∗ is the |x|-th state in the run on x. A transducer T induces a
strategy f : (2I )∗ → 2O such that for all x ∈ (2I )∗ , we have that f (x) = L(η(s0 , x)).
Given an LTL formula ψ over I ∪ O, we say that ψ is I/O-realizable if there is a finite-
state I/O-transducer T such that all the computations of T satisfy ψ [31]. We then say
that T realizes ψ. When it is clear from the context, we refer to I/O-realizability as
realizability, or talk about realizability of languages over the alphabet 2I∪O .
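
As a concrete (and deliberately trivial) illustration of these definitions, the sketch below implements a one-state I/O-transducer realizing the strategy "never err" used in the examples that follow; the class and the encoding of letters as sets of signals are our own, not notation from the paper.

# Sketch: a finite-state I/O-transducer and the computation it generates.
# Letters are subsets of the signals; here I = {'fix'} and O = {'err'}.

class Transducer:
    def __init__(self, s0, eta, lab):
        self.s0, self.eta, self.lab = s0, eta, lab

    def computation(self, inputs):
        """Given i0, i1, ... in 2^I, return the letters (i0 ∪ o0), (i1 ∪ o1), ..."""
        s, result = self.s0, []
        for i in inputs:
            result.append(frozenset(i) | self.lab[s])   # pair the input with L(s)
            s = self.eta[(s, frozenset(i))]             # then move on the input
        return result

# The one-state transducer implementing the strategy "never err":
never_err = Transducer(
    s0='s',
    eta={('s', frozenset()): 's', ('s', frozenset({'fix'})): 's'},
    lab={'s': frozenset()},                             # the output err is never set
)

print(never_err.computation([set(), {'fix'}, set()]))
# [frozenset(), frozenset({'fix'}), frozenset()] – err never holds, so, e.g.,
# G(err -> F fix) is satisfied on every input sequence.
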
Since the realizability problem corresponds to deciding a game between the system
and the environment, and the game is determined [15], realizability is determined too,
in the sense that either there is an I/O-transducer that realizes ψ (that is, the system
wins) or there is an O/I-transducer that realizes ¬ψ (that is, the environment wins).
Note that in an O/I-transducer the system and the environment “switch roles” and the
system is the one that provides the inputs to the transducer. A technical detail is that
in order for the setting of O/I-realizability to be dual to the one in I/O-realizability
we need, in addition to switching the roles and negating the specification, to switch
the player that moves first and consider transducers in which the environment initiates
the interaction and moves first. Since we are not going to delve into constructions, we
ignore this point, which is easy to handle.
Let I and O be sets of input and output signals, respectively. Consider a language
L ⊆ (2I∪O )ω . For a finite word u ∈ (2I∪O )∗ , let Lu = {s : u · s ∈ L} be the set of all
infinite words s such that u · s ∈ L. Thus, if L describes a set of allowed computations,
then Lu describes the set of allowed suffixes of computations starting with u.
We say that a finite word u ∈ (2I∪O )∗ is a system bad prefix for L iff Lu is not
realizable. Thus, a system bad prefix is a finite word u such that after traversing u,
the system does not have a strategy to ensure that the interaction with the environment
would generate a computation in L. We use sbp(L) to denote the set of system bad
prefixes for L. Note that by determinacy of games, whenever Lu is not realizable by the
system, then its complement is realizable by the environment. Thus, once a bad prefix
has been generated, the environment has a strategy to ensure that the entire generated
behavior is not in L.
ω
A language L ⊆ (2I∪O ) is a reactive safety language if every word not in L has
a system bad prefix. Below are two examples, demonstrating that a reactive safety lan-
guage need not be safe. Note that the other direction does hold: Let L be a safe language.
Consider a word w ∉ L and a bad prefix u ∈ (2I∪O )∗ of w. Since u is a bad prefix, the set Lu is empty, and is therefore unrealizable, so u is also a system bad prefix. Thus,
every word not in L has a system bad prefix, implying that L is reactively safe.

Example 2. Let I = {fix }, O = {err}, ψ = G(err → F fix ), and L = ψ. Note


that ψ is realizable using the system strategy “never err”. Also, L is clearly not safe, as every prefix can be extended to a computation that satisfies ψ. On the other hand, L is reactively
safe. Indeed, every word not in L must have a prefix u that ends with {err }. Since
Lu = F fix , which is not realizable, we have that u is a system bad prefix and L is
reactively safe.

Example 3. Let I = {fix }, O = {err }, ψ = G¬err ∨ F Gfix , and L = ψ. Note


that ψ is realizable using the system strategy “never err”. Also, L is clearly not safe.
We show L is reactively safe. Consider a word w ∈ / L. Since w does not satisfy G¬err ,
there must be a prefix u of w such that u contains a position satisfying err . Since
words with prefix u do not satisfy G¬err , we have that Lu = F Gfix , which is not
realizable. Thus, u is a system bad prefix and L is reactively safe.

4.2 Properties of Reactive Safety

In the closed setting, the set bad-pref (L) is closed under finite extensions for all lan-
guages L ⊆ Σ ω . That is, for every finite word u ∈ bad-pref (L) and finite extension
v ∈ Σ ∗ , we have that u · v ∈ bad-pref (L). This is not the case in the reactive setting:

Theorem 3. System bad prefixes are not closed under finite extension.

Proof: Let I = {fix }, O = {err}, and ψ = G(err → X fix ) ∧ FG¬err . Thus, ψ


states that every error the system makes is fixed by the environment in the following
step, and that there is a finite number of errors. Let L = ψ. Clearly, ψ is realizable, as
the strategy “never err” is a winning strategy for the system. Also, L is reactively safe,
as a word w ∈ / L must have a prefix u that ends in a position satisfying err , and u is
a system bad prefix. We show that sbp(L) is not closed under finite extensions. To see
this, consider the word w = ({err , fix } · {fix})ω . That is, the system makes an error on
every odd position, and the environment always fixes errors. Since there are infinitely
many errors in w, it does not satisfy ψ. The prefix u = {err , fix } of w is a system bad
prefix. Indeed, an environment strategy that starts with ¬fix is a winning strategy. On
the other hand, u’s extension v = {err , fix } · {fix } is not a system bad prefix. Indeed,
Lv is realizable using the winning system strategy “never err”.
Recall that reasoning about safety properties is easier than reasoning about general
properties. In particular, rather than working with automata on infinite words, one can
model check safety properties using automata (on finite words) for bad prefixes. The
question is whether and how we can take advantage of reactive safety when the specifi-
cation is not safe (but is reactively safe). In [11], the authors answered this question to
the positive and described a transition from reactively safe to safe formulas. The trans-
lation is by means of nodes in the tree in which a violation starts. The translation from
[25] we are going to describe here uses realizability explicitly, which we find simpler.
For a language L ⊆ (2I∪O )ω , we define close(L) = L ∩ {w : w has no system bad
prefix for L}. Equivalently, close(L) = L \ {w : w has a system bad prefix for L}.
Intuitively, we obtain close(L) by defining all the finite extensions of sbp(L) as bad
prefixes. It is thus easy to see that sbp(L) ⊆ bad-pref (close(L)).
As an example, consider again the specification ψ = G(err → X fix ) ∧ FG¬err ,
with I = {fix }, O = {err }. An infinite word contains a system bad prefix for ψ iff it
has a position that satisfies err . Accordingly, close(ψ) = G¬err . As another example,
let us add to O the signal ack , and let ψ = G(err → X (fix ∧ F ack )), with I = {fix },
O = {err , ack }. Again, ψ is reactively safe and an infinite word contains a system bad
prefix for ψ iff it has a position that satisfies err . Accordingly, close(ψ) = G¬err .
Our definition of close(L) is sound, in the following sense:
Theorem 4. A language L ⊆ (2I∪O )ω is reactively safe iff close(L) is safe.
While L and close(L) are not equivalent, they are open equivalent [16]. Formally,
we have the following.
Theorem 5. For every language L ⊆ (2I∪O )ω and I/O-transducer T , we have that
T realizes L iff T realizes close(L).
It is shown in [11] that given an LTL formula ψ, it is possible to construct a determin-
istic looping word automaton for close(ψ) with doubly-exponential number of states.
In fact, as suggested in [23], it is then possible to generate also a deterministic automa-
ton for the bad prefixes of close(ψ). Note that when L is not realizable, we have that ε ∈ sbp(L), implying that close(L) = ∅. It follows that we cannot expect to construct small automata for close(L), even nondeterministic ones, as the realizability problem for LTL can be reduced to easy questions about them.
Theorem 5 implies that a reactive safety language L is open equivalent to a safe
language, namely close(L). Conversely, open equivalence to a safe language implies
reactive safety. This follows from the fact that if L and L′ are open-equivalent languages, then a prefix x is a minimal system bad prefix in L iff x is a minimal system bad prefix in L′. We can thus conclude with the following.
Theorem 6. A language L is reactively safe iff L is open equivalent to a safe language.
In the setting of open systems, dualization of specifications is more involved, as one has not only to complement the language but also to dualize the roles of the system and the environment. Accordingly, we actually have four fragments of languages that
are induced by dualization of the reactive safety definition. We define them by means
of bad and good prefixes.
Consider a language L ⊆ (2I∪O )ω and a prefix u ∈ (2I∪O )∗ . We say that:
– u is a system bad prefix if Lu is not I/O-realizable.


– u is a system good prefix if Lu is I/O-realizable.
– u is an environment bad prefix if Lu is not O/I-realizable.
– u is an environment good prefix if Lu is O/I-realizable.

Now, a language L ⊆ (2I∪O )ω is a system (environment) safety language if every


word not in L has a system (environment, respectively) bad prefix. The language L is
a system (environment) co-safety language if every word in L has a system (environ-
ment, respectively) good prefix. System safety and environment co-safety dualize each
other: For every language L ⊆ (2I∪O )ω , we have that L is system safe iff comp(L) is
environment co-safe.
Since each language Lu is either I/O-realizable or not I/O-realizable, and the same
for O/I-realizability, all finite words are determined, in the following sense.
Theorem 7. Consider a language L ⊆ (2I∪O )ω . All finite words in (2I∪O )∗ are deter-
mined with respect to L. That is, every prefix is either system good or system bad, and
either environment good or environment bad, with respect to L.
Note that while every prefix is determined, a word may have both system bad and
system good prefixes, and similarly for the environment, which is not the case in the
setting of closed systems. For example, recall the language L = G(err → X fix ) ∧ FG¬err , for I = {fix } and O = {err }. As noted above, the word ({err , fix } · {fix })ω has both a system bad prefix {err , fix }, and a system good prefix {err , fix } · {fix }.
In Section 3.1 we showed that in the closed setting, the intersection of safe and co-
safe properties induces the fragment of bounded properties. It is shown in [25] that
boundedness in the open setting is more involved, as a computation may have both
infinitely many good and infinitely many bad prefixes. It is still possible, however, to
define reactive bounded properties and use their appealing practical advantages.

5 A Spectrum between Safety and Co-safety

Safety is a binary notion. A property may or may not satisfy the definition of safety.
In this section we describe a probability-based approach for defining different levels
of safety. The origin of the definition is a study of vacuity in model checking [4,23].
Vacuity detection is a method for finding errors in the model-checking process when
the specification is found to hold in the model. Most vacuity algorithms are based on
checking the effect of applying mutations on the specification. It has been recognized
that vacuity results differ in their significance. While in many cases vacuity results
are valued as highly informative, there are also cases in which the results are viewed as
meaningless by users. In [10], we suggested a method for an automatic ranking of vacu-
ity results according to their level of importance. Our method is based on the probability
of the mutated specification to hold in a random computation. For example, two natural
mutations of the specification G(req → F ready) are G(¬req), obtained by mutating
the subformula ready to false, and GF ready , obtained by mutating the subformula
req to true. It is agreed that vacuity information about satisfying the first mutation is
more alarming than information about satisfying the second. The framework in [10] for-
mally explains this, as the probability of G(¬req) to hold in a random computation is 0,
whereas the probability of GF ready is 1. In this section we suggest to use probability
also for defining levels of safety.

5.1 The Probabilistic Setting


Given a set S of elements, a probability distribution on S is a function μ : S → [0, 1] such that Σs∈S μ(s) = 1. Consider an alphabet Σ. A random word over Σ is a word in which for all indices i, the i-th letter is drawn uniformly at random. In particular, when Σ = 2AP , then a random computation π is such that for each atomic proposition q and for each position in π, the probability of q to hold in the position is 1/2. An equivalent definition of this probabilistic model is by means of the probabilistic labeled structure UΣ , which generates computations in a uniform distribution. Formally, UΣ is a clique with |Σ| states in which a state σ ∈ Σ is labeled σ, is initial with probability 1/|Σ|, and the probability to move from a state σ to a state σ′ is 1/|Σ|.
We define the probability of a language L ⊆ Σ ω , denoted P r(L), as the probability
of the event {π : π is a path in UΣ that is labeled by a word in L}. Accordingly, for
an LTL formula ϕ, we define P r(ϕ) as the probability of the event {π : π is a path in
U2AP that satisfies ϕ}. For example, the probabilities of Xp, Gp, and F p are 1/2, 0, and
1, respectively. Using UΣ we can reduce the problem of finding P r(ϕ) to ϕ’s model
checking. Results on probabilistic LTL model checking [8] then imply that the problem
of finding the probability of LTL formulas is PSPACE-complete.
First-order logic respects a 0/1-law: the probability of a formula to be satisfied in
a random model is either 0 or 1 [13]. It is easy to see that a 0/1-law does not hold
for LTL. For example, for an atomic proposition p, we have that P r(p) = 1/2. Back to
our safety story, it is not hard to see that P r(Gξ), for a formula ξ with P r(ξ) = 1,
is 0. Dually, P r(F ξ), for a formula ξ with P r(ξ) = 0 is 1. Can we relate this to the
fact that Gp is a safety property whereas F p is a co-safety property? Or perhaps it
has to do with F p being a liveness property?2 This is not clear, as, for example, the
probability of clopen formulas depends on finitely many events and can vary between 0 and 1. As another example, consider the two possible semantics of the Until temporal operator. For the standard, strong, Until, which is not safe, we have P r(pU q) = 2/3. By changing the semantics of the Until to a weak one, we get the safety formula pW q, with pW q = pU q ∨ Gp. Still, P r(pW q) = P r(pU q). Thus, the standard probabilistic setting
does not suggest a clear relation between probability and different levels of safety.
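For the record, the value P r(pU q) = 2/3 quoted above follows from a one-line recurrence; the calculation below is ours, spelled out for completeness.

% Conditioning on the first position of a random computation: q holds there
% with probability 1/2, and p-but-not-q with probability 1/4, independently
% of the (identically distributed) suffix.
\begin{align*}
x := \Pr(p\,U\,q) &= \Pr(q) + \Pr(p \wedge \neg q)\cdot x
    = \tfrac{1}{2} + \tfrac{1}{4}\,x
    \;\Longrightarrow\; x = \tfrac{2}{3},\\
\Pr(p\,W\,q) &= \Pr(p\,U\,q \vee Gp) = \Pr(p\,U\,q) = \tfrac{2}{3}
    \quad\text{(since } \Pr(Gp) = 0\text{)}.
\end{align*}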
We argue that we can still use the probabilistic approach in order to measure safety.
The definition of P r(ϕ) in [10] assumes that the probability of an atomic proposition to
hold in each position is 1/2. This corresponds to computations in an infinite-state system
and is the standard approach taken in studies of 0/1-laws. Alternatively, one can also
study the probability of formulas to hold in computations of finite-state systems. For-
mally, for an integer l ≥ 1, let P rl (ϕ) denote the probability that ϕ holds in a random cycle of length l. Here too, the probability of each atomic proposition to hold in a state is 1/2, yet we have only l states to fix an assignment to. So, for example, while P r(Gp) = 0,
2 A language L ⊆ Σ ω is a liveness language if every finite word u ∈ Σ ∗ can be extended to an infinite word u · w ∈ L [1].
we have that P r1 (Gp) = 1/2, P r2 (Gp) = 1/4, and in general P rj (Gp) = 1/2^j. Indeed, an l-cycle satisfies Gp iff all its states satisfy p.
There are several interesting issues in the finite-state approach. First, it may seem
obvious that the bigger l is, the closer P rl (ϕ) gets to P r(ϕ). This is, however, not so
simple. For example, issues like cycles in ϕ can cause P rl (ϕ) to be non-monotonic. For example, when ϕ requires p to hold in exactly all even positions, then P r1 (ϕ) = 0, P r2 (ϕ) = 1/4, P r3 (ϕ) = 0, P r4 (ϕ) = 1/16, and so on.
Assume now that we have cleaned the cycle-based issue (for example by restricting
attention to formulas without Xs, or by restricting attention to cycles of “the right”
length). Can we characterize safety properties by means of the asymptotic behavior of
P rl (ϕ)? Can we define different levels of safety according to the rate the probability
decreases or increases? For example, clearly P rl (Gp) tends to 0 as l increases, whereas
P rl (F p) tends to 1. Also, now, for a given l, we have that P rl (pW q) > P rl (pU q). In
addition, for a clopen property ϕ, we have that P rl (ϕ) stablizes once l is bigger than
the bound of ϕ. Still, the picture is not clean. For example, F Gp is a liveness formula,
but P rl (F Gp) decreases as l increases. Finding a characterization of properties that is
based on the analysis of P rl is an interesting question, and our initial research suggests
a connection between the level of safety of ϕ and the behavior of P rl (ϕ).

References
1. Alpern, B., Schneider, F.B.: Defining liveness. IPL 21, 181–185 (1985)
2. Alpern, B., Schneider, F.B.: Recognizing safety and liveness. Distributed Computing 2,
117–126 (1987)
3. Barringer, H., Goldberg, A., Havelund, K., Sen, K.: Rule-based runtime verification. In: Stef-
fen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 44–57. Springer, Heidelberg
(2004)
4. Beer, I., Ben-David, S., Eisner, C., Rodeh, Y.: Efficient detection of vacuity in ACTL formu-
las. In: Grumberg, O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 279–290. Springer, Heidelberg
(1997)
5. Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic model checking without BDDs. In:
Cleaveland, W.R. (ed.) TACAS 1999. LNCS, vol. 1579, pp. 193–207. Springer, Heidelberg
(1999)
6. Bloem, R., Gabow, H.N., Somenzi, F.: An algorithm for strongly connected component anal-
ysis in n log n symbolic steps. In: Johnson, S.D., Hunt Jr., W.A. (eds.) FMCAD 2000. LNCS,
vol. 1954, pp. 37–54. Springer, Heidelberg (2000)
7. Courcoubetis, C., Vardi, M.Y., Wolper, P., Yannakakis, M.: Memory efficient algorithms for
the verification of temporal properties. FMSD 1, 275–288 (1992)
8. Courcoubetis, C., Yannakakis, M.: The complexity of probabilistic verification. J. ACM 42,
857–907 (1995)
9. d’Amorim, M., Roşu, G.: Efficient monitoring of omega-languages. In: Etessami, K., Raja-
mani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 364–378. Springer, Heidelberg (2005)
10. Ben-David, S., Kupferman, O.: A framework for ranking vacuity results. In: Van Hung, D.,
Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp. 148–162. Springer, Heidelberg (2013)
11. Ehlers, R., Finkbeiner, B.: Reactive safety. In: Proc. 2nd GANDALF. Electronic Proceedings
in TCS, vol. 54, pp. 178–191 (2011)
12. Emerson, E.A.: Alternative semantics for temporal logics. TCS 26, 121–130 (1983)
13. Fagin, R.: Probabilities in finite models. Journal of Symb. Logic 41(1), 50–58 (1976)
14. Filiot, E., Jin, N., Raskin, J.-F.: An antichain algorithm for LTL realizability. In: Bouajjani,
A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 263–277. Springer, Heidelberg (2009)
15. Gale, D., Stewart, F.M.: Infinite games of perfect information. Ann. Math. Studies 28,
245–266 (1953)
16. Greimel, K., Bloem, R., Jobstmann, B., Vardi, M.: Open implication. In: Aceto, L., Damgård,
I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008,
Part II. LNCS, vol. 5126, pp. 361–372. Springer, Heidelberg (2008)
17. Gumm, H.P.: Another glance at the Alpern-Schneider characterization of safety and liveness
in concurrent executions. IPL 47, 291–294 (1993)
18. Harel, D., Katz, G., Marron, A., Weiss, G.: Non-intrusive repair of reactive programs. In:
ICECCS, pp. 3–12 (2012)
19. Harel, D., Pnueli, A.: On the development of reactive systems. In: Logics and Models of
Concurrent Systems, NATO ASI, vol. F-13, pp. 477–498. Springer (1985)
20. Havelund, K., Roşu, G.: Synthesizing monitors for safety properties. In: Katoen, J.-P.,
Stevens, P. (eds.) TACAS 2002. LNCS, vol. 2280, pp. 342–356. Springer, Heidelberg (2002)
21. Kupferman, O., Lustig, Y., Vardi, M.Y.: On locally checkable properties. In: Hermann, M.,
Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 302–316. Springer, Heidel-
berg (2006)
22. Kupferman, O., Vardi, M.Y.: Model checking of safety properties. In: Halbwachs, N., Peled,
D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 172–183. Springer, Heidelberg (1999)
23. Kupferman, O., Vardi, M.Y.: Model checking of safety properties. FMSD 19(3), 291–314
(2001)
24. Kupferman, O., Vardi, M.Y.: On bounded specifications. In: Nieuwenhuis, R., Voronkov, A.
(eds.) LPAR 2001. LNCS (LNAI), vol. 2250, pp. 24–38. Springer, Heidelberg (2001)
25. Kupferman, O., Weiner, S.: Environment-friendly safety. In: Biere, A., Nahir, A., Vos, T.
(eds.) HVC 2012. LNCS, vol. 7857, pp. 227–242. Springer, Heidelberg (2013)
26. Lamport, L.: Logical foundation. In: Alford, M.W., Hommel, G., Schneider, F.B., Ansart,
J.P., Lamport, L., Mullery, G.P., Zhou, T.H. (eds.) Distributed Systems. LNCS, vol. 190,
pp. 19–30. Springer, Heidelberg (1985)
27. Manna, Z., Pnueli, A.: The anchored version of the temporal framework. In: de Bakker, J.W.,
de Roever, W.-P., Rozenberg, G. (eds.) Linear Time, Branching Time and Partial Order in
Logics and Models for Concurrency. LNCS, vol. 354, pp. 201–284. Springer, Heidelberg
(1989)
28. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems: Specifica-
tion. Springer (1992)
29. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems: Safety.
Springer (1995)
30. Pnueli, A.: The temporal semantics of concurrent programs. TCS 13, 45–60 (1981)
31. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proc. 16th POPL, pp. 179–
190 (1989)
32. Pnueli, A., Shahar, E.: Liveness and acceleration in parameterized verification. In: Emerson,
E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 328–343. Springer, Heidelberg
(2000)
33. Sistla, A.P.: Safety, liveness and fairness in temporal logic. Formal Aspects of Computing 6,
495–511 (1994)
34. Sistla, A.P., Clarke, E.M.: The complexity of propositional linear temporal logic. Journal of
the ACM 32, 733–749 (1985)
35. Touati, H.J., Brayton, R.K., Kurshan, R.: Testing language containment for ω-automata using
BDD’s. I & C 118(1), 101–109 (1995)
36. Vardi, M.Y., Wolper, P.: Reasoning about infinite computations. I & C 115(1), 1–37 (1994)
Decision Procedures for Flat Array Properties

Francesco Alberti1 , Silvio Ghilardi2 , and Natasha Sharygina1


1 University of Lugano, Lugano, Switzerland
2 Università degli Studi di Milano, Milan, Italy

Abstract. We present new decidability results for quantified fragments


of theories of arrays. Our decision procedures are fully declarative, para-
metric in the theories of indexes and elements and orthogonal with re-
spect to known results. We also discuss applications to the analysis of
programs handling arrays.

1 Introduction

Decision procedures constitute, nowadays, one of the fundamental components of


tools and algorithms developed for the formal analysis of systems. Results about
the decidability of fragments of (first-order) theories representing the semantics
of real system operations deeply influenced, in the last decade, many research
areas, from verification to synthesis. In particular, the demand for procedures dealing with quantified fragments of such theories has increased rapidly. Quantified
formulas arise from several static analysis and verification tasks, like modeling
properties of the heap, asserting frame axioms, checking user-defined assertions
in the code and reasoning about parameterized systems.
In this paper we are interested in studying the decidability of quantified frag-
ments of theories of arrays. Quantification is required over the indexes of the
arrays in order to express significant properties like “the array has been ini-
tialized to 0” or “there exist two different positions of the array containing an
element c”, for example. From a logical point of view, array variables are inter-
preted as functions. However, adding free function symbols to a theory T (with
the goal of modeling array variables) may yield undecidable extensions of
widely used theories like Presburger arithmetic [17]. It is, therefore, mandatory
to identify fragments of the quantified theory of arrays which are on one side still
decidable and on the other side sufficiently expressive. In this paper, we show
that by combining restrictions on quantifier prefixes with ‘flatness’ limitations on
dereferencing (only positions named by variables are allowed in dereferencing),
one can restore decidability. We call the fragments so obtained Flat Array Prop-
erties; such fragments are orthogonal to the fragments already proven decidable
in the literature [8, 15, 16] (we shall defer the technical comparison with these
contributions to Section 5). Here we explain the modularity character of our

The work of the first author was supported by Swiss National Science Foundation
under grant no. P1TIP2 152261.


results and their applications to concrete decision problems for array programs
annotated with assertions or postconditions.
We examine Flat Array Properties in two different settings. In one case, we
consider Flat Array Properties over the theory of arrays generated by adding
free function symbols to a given theory T modeling both indexes and elements
of the arrays. In the other one, we take into account Flat Array Properties over
a theory of arrays built by connecting two theories TI and TE describing the
structure of indexes and elements. Our decidability results are fully declarative
and parametric in the theories T, TI , TE . For both settings, we provide suffi-
cient conditions on T and TI , TE for achieving the decidability of Flat Array
Properties. Such hypotheses are widely met by theories of interest in practice,
like Presburger arithmetic. We also provide suitable decision procedures for Flat
Array Properties of both settings. Such procedures reduce the decidability of
Flat Array Properties to the decidability of T -formulæ in one case and TI - and
TE -formulæ in the other case.
We further show, as an application of our decidability results, that the safety
of an interesting class of programs handling arrays or strings of unknown length is
decidable. We call this class of programs simple0A -programs : this class covers non-
recursive programs implementing for instance searching, copying, comparing,
initializing, replacing and testing functions. The method we use for showing
these safety results is similar to a classical method adopted in the model-checking
literature for programs manipulating integer variables (see for instance [7,9,12]):
we first assume flatness conditions on the control flow graph of the program and
then we assume that transitions labeling cycles are “acceleratable”. However,
since we are dealing with array manipulating programs, acceleration requires
specific results that we borrow from [3]. The key point is that the shape of
most accelerated transitions from [3] matches the definition of our Flat Array
Properties (in fact, Flat Array Properties were designed precisely in order to
encompass such accelerated transitions for arrays).
From the practical point of view, we tested the effectiveness of state of the
art SMT-solvers in checking the satisfiability of some Flat Array Properties aris-
ing from the verification of simple0A -programs. Results show that such tools fail
or timeout on some Flat Array Properties. The implementation of our decision
procedures, once instantiated with the theories of interests for practical applica-
tions, will likely lead, therefore, to further improvements in the areas of practical
solutions for the rigorous analysis of software and hardware systems.

Plan of the Paper. The paper starts by recalling in Section 2 required back-
ground notions. Section 3 is dedicated to the definition of Flat Array Properties.
Section 3.1 introduces a decision procedure for Flat Array Properties in the case
of a mono-sorted theory ARR1 (T ) generated by adding free function symbols to
a theory T . Section 3.2 discusses a decision procedure for Flat Array Properties
in the case of the multi-sorted array theory ARR2 (TI , TE ) built over two theories
TI and TE for the indexes and elements (we supply also full lower and upper
complexity bounds for the case in which TI and TE are both Presburger arith-
metic). In Section 4 we recall and adapt required notions from [3], define the
class of flat0 -programs and establish the requirements for achieving the decid-
ability of reachability analysis on some flat0 -programs. Such requirements are
instantiated in Section 4.1 in the case of simple0A -programs, array programs with
flat control-flow graph admitting definable accelerations for every loop. In Sec-
tion 4.2 we position the fragment of Flat Array Properties with respect to the
actual practical capabilities of state-of-the-art SMT-solvers. Section 5 compares
our results with the state of the art, in particular with the approaches of [8,15].

2 Background
We use lower-case latin letters x, i, c, d, e, . . . for variables; for tuples of vari-
ables we use bold face letters like x, i, c, d, e . . . . The n-th component of a tuple
c is indicated with cn and | − | may indicate tuples length (so that we have
c = c1 , . . . , c|c| ). Occasionally, we may use free variables and free constants in-
terchangeably. For terms, we use letters t, u, . . . , with the same conventions as
above; t, u are used for tuples of terms (however, tuples of variables are assumed
to be distinct, whereas the same is not assumed for tuples of terms - this is useful
for substitutions notation, see below). When we use u = v, we assume that the two tuples have equal length, say n (i.e. n := |u| = |v|), and that u = v abbreviates the formula u1 = v1 ∧ · · · ∧ un = vn .
With E(x) we denote that the syntactic expression (term, formula, tuple
of terms or of formulæ) E contains at most the free variables taken from the
tuple x. We use lower-case Greek letters φ, ϕ, ψ, . . . for quantifier-free formulæ
and α, β, . . . for arbitrary formulæ. The notation φ(t) identifies a quantifier-free
formula φ obtained from φ(x) by substituting the tuple of variables x with the
tuple of terms t.
A prenex formula is a formula of the form Q1 x1 . . . Qn xn ϕ(x1 , . . . , xn ), where
Qi ∈ {∃, ∀} and x1 , . . . , xn are pairwise different variables. Q1 x1 · · · Qn xn is the
prefix of the formula. Let R be a regular expression over the alphabet {∃, ∀}.
The R-class of formulæ comprises all and only those prenex formulæ whose prefix
generates a string Q1 · · · Qn matched by R.
According to the SMT-LIB standard [22], a theory T is a pair (Σ, C), where
Σ is a signature and C is a class of Σ-structures; the structures in C are called
the models of T . Given a Σ-structure M, we denote by S M , f M , P M , . . . the
interpretation in M of the sort S, the function symbol f , the predicate symbol P ,
etc. A Σ-formula α is T -satisfiable if there exists a Σ-structure M in C such that
α is true in M under a suitable assignment to the free variables of α (in symbols,
M |= α); it is T -valid (in symbols, T |= α) if its negation is T -unsatisfiable. Two
formulæ α1 and α2 are T -equivalent if α1 ↔ α2 is T -valid; α1 T -entails α2 (in
symbols, α1 |=T α2 ) iff α1 → α2 is T -valid. The satisfiability modulo the theory
T (SM T (T )) problem amounts to establishing the T -satisfiability of quantifier-
free Σ-formulæ. All theories T we consider in this paper have decidable
SM T (T )-problem (we recall that this property is preserved when adding free
function symbols, see [13, 26]).
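As a toy instance (ours, for illustration) of an SMT(T) problem for T = Presburger arithmetic extended with one free unary function symbol—the kind of combination used for arrays below—consider the following check, here discharged with z3:

from z3 import Ints, Function, IntSort, Solver

# Quantifier-free formula over the integers plus a free unary function a:
#   a(x) = x + 1  and  a(y) = y - 1  and  x = y
# x = y forces a(x) = a(y), hence x + 1 = y - 1 = x - 1: unsatisfiable.
x, y = Ints('x y')
a = Function('a', IntSort(), IntSort())

s = Solver()
s.add(a(x) == x + 1, a(y) == y - 1, x == y)
print(s.check())   # unsat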
A theory T = (Σ, C) admits quantifier elimination iff for any arbitrary Σ-
formula α(x) it is always possible to compute a quantifier-free formula ϕ(x)
such that T |= ∀x.(α(x) ↔ ϕ(x)). Thus, in view of the above assumption on


decidability of SM T (T )-problem, a theory having quantifier elimination is de-
cidable (i.e. T -satisfiability of every formula is decidable). Our favorite example
of a theory with quantifier elimination is Presburger Arithmetic, hereafter de-
noted with P; this is the theory in the signature {0, 1, +, −, =, <} augmented
with infinitely many unary predicates Dk (for each integer k greater than 1). Se-
mantically, the intended class of models for P contains just the structure whose
support is the set of the natural numbers, where {0, 1, +, −, =, <} have the nat-
ural interpretation and Dk is interpreted as the sets of natural numbers divisible
by k (these extra predicates are needed to get quantifier elimination [21]).
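Two textbook elimination instances over the natural numbers (ours, for illustration) make the role of the predicates Dk visible: the second equivalence has no quantifier-free counterpart in the signature {0, 1, +, −, =, <} alone.

\begin{align*}
\exists x\,(y < x \wedge x < z) &\;\leftrightarrow\; y + 1 < z,\\
\exists x\,(x + x = y)          &\;\leftrightarrow\; D_2(y).
\end{align*}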

3 Monic-Flat Array Property Fragments

Although P represents the fragment of arithmetic mostly used in formal ap-


proaches for the static analysis of systems, we underline that there are many
other fragments that have quantifier elimination and can be quite useful; these
fragments can be both weaker (like Integer Difference Logic [20]) and stronger
(like the exponentiation extension of Semënov theorem [24]) than P. Thus, the
modular approach proposed in this Section to model arrays is not motivated just
by generalization purposes, but can have practical impact.
There exist two ways of introducing arrays in a declarative setting, the mono-
sorted and the multi-sorted ways. The former is more expressive because (roughly
speaking) it allows to consider indexes also as elements1 , but might be computa-
tionally more difficult to handle. We discuss decidability results for both cases,
starting from the mono-sorted case.

3.1 The Mono-sorted Case

Let T = (Σ, C) be a theory; the theory ARR1 (T ) of arrays over T is obtained from
T by adding to it infinitely many (fresh) free unary function symbols. This means
that the signature of ARR1 (T ) is obtained from Σ by adding to it unary function
symbols (we use the letters a, a1 , a2 , . . . for them) and that a structure M is a
model of ARR1 (T ) iff (once the interpretations of the extra function symbols are
disregarded) it is a structure belonging to the original class C.
For array theories it is useful to introduce the following notation. We use a for
a tuple a = a1 , . . . , a|a| of distinct ‘array constants’ (i.e. free function symbols);
if t = t1 , . . . , t|t| is a tuple of terms, the notation a(t) represents the tuple (of
length |a| · |t|) of terms a1 (t1 ), . . . , a1 (t|t| ), . . . , a|a| (t1 ), . . . , a|a| (t|t| ).
ARR1 (T ) may be highly undecidable, even when T itself is decidable (see [17]),
thus it is mandatory to limit the shape of the formulæ we want to try to decide.
A prenex formula or a term in the signature of ARR1 (T ) is said to be flat iff for
every term of the kind a(t) occurring in them (here a is any array constant), the
1 This is useful in the analysis of programs, when pointers to the memory (modeled as an array) are stored into array variables.
sub-term t is always a variable. Notice that every formula is logically equivalent


to a flat one; however the flattening transformations are based on rewriting as

φ(a(t), ...) ⇝ ∃x(x = t ∧ φ(a(x), ...)) or φ(a(t), ...) ⇝ ∀x(x = t → φ(a(x), ...))

and consequently they may alter the quantifiers prefix of a formula. Thus it must
be kept in mind (when understanding the results below), that flattening trans-
formation cannot be operated on any occurrence of a term without exiting from
the class that is claimed to be decidable. When we indicate a flat quantifier-free
formula with the notation ψ(x, a(x)), we mean that such a formula is obtained
from a Σ-formula of the kind ψ(x, z) (i.e. from a quantifier-free Σ-formula where
at most the free variables x, z can occur) by replacing z by a(x).

Theorem 1. If the T -satisfiability of ∃∗ ∀∃∗ sentences is decidable, then the


ARR1 (T )-satisfiability of ∃∗ ∀-flat sentences is decidable.

Proof. We present an algorithm, SATMONO , for deciding the satisfiability of the


∃∗ ∀-flat fragment of ARR1 (T ) (we let T be (Σ, C)). Subsequently, we show that
SATMONO is sound and complete. From the complexity viewpoint, notice that
SATMONO produces a quadratic instance of a ∃∗ ∀∃∗ -satisfiability problem.

The Decision Procedure SATMONO

Step I. Let
F := ∃c ∀i.ψ(i, a(i), c, a(c))
be a ∃∗ ∀-flat ARR1 (T )-sentence, where ψ is a quantifier-free Σ-formula. Sup-
pose that s is the length of a and t is the length of c (that is, a = a1 , . . . , as
and c = c1 , . . . , ct ). Let e = ⟨el,m ⟩ (1 ≤ l ≤ t, 1 ≤ m ≤ s) be a tuple of length s · t of fresh variables and consider the ARR1 (T )-formula:

F1 := ∃c ∃e ∀i. ( ψ(i, a(i), c, e) ∧ ⋀1≤l≤t ⋀1≤m≤s am (cl ) = el,m )

Step II. From F1 build the formula

F2 := ∃c ∃e ∀i. ( ψ(i, a(i), c, e) ∧ ⋀1≤l≤t ⋀1≤m≤s (i = cl → am (i) = el,m ) )

Step III. Let d be a fresh tuple of variables of length s; check the T -satisfiability of

F3 := ∃c ∃e ∀i ∃d. ( ψ(i, d, c, e) ∧ ⋀1≤l≤t ⋀1≤m≤s (i = cl → dm = el,m ) )
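To make the three steps concrete, here is a small sketch (ours, not the authors' implementation) that instantiates SATMONO on the toy input F := ∃c ∀i. a(i) ≥ a(c), with s = t = 1, and discharges the resulting F3 with z3 playing the role of the backend ∃∗ ∀∃∗ solver for T = P.

from z3 import Ints, ForAll, Exists, Implies, And, Solver

# Step I/II: the read a(c) at the existential position becomes a fresh e,
# guarded by (i = c -> a(i) = e); Step III: the remaining read a(i), whose
# index is the universal variable, becomes a fresh innermost existential d:
#   F3 := EXISTS c, e. FORALL i. EXISTS d. ( d >= e  AND  (i = c -> d = e) )
c, e, i, d = Ints('c e i d')
F3 = ForAll([i], Exists([d], And(d >= e, Implies(i == c, d == e))))

s = Solver()
s.add(F3)          # c and e are left free: free constants model the outer EXISTS
print(s.check())   # expected sat: F holds, e.g., in any constant array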
Correctness and Completeness of SATMONO . SATMONO transforms an


ARR1 (T )-formula F into an equisatisfiable T -formula F3 belonging to the ∃∗ ∀∃∗
fragment. More precisely, it holds that F, F1 and F2 are equivalent formulæ,
because

∀i. ⋀1≤l≤t ⋀1≤m≤s (i = cl → am (i) = el,m ) ≡ ⋀1≤l≤t ⋀1≤m≤s am (cl ) = el,m

From F2 to F3 and back, satisfiability is preserved because F2 is the Skolem-


ization of F3 , where the existentially quantified variables d = d1 , . . . , ds are
substituted with the free unary function symbols a = a1 , . . . as . 

Since Presburger Arithmetic is decidable (via quantifier elimination), we get


in particular that
Corollary 1. The ARR1 (P)-satisfiability of ∃∗ ∀-flat sentences is decidable.

As another example matching the hypothesis of Theorem 1 (i.e. as an example


of a T such that T -satisfiability of ∃∗ ∀∃∗ -sentences is decidable) consider pure
first order logic with equality in a signature with predicate symbols of any arity
but with only unary function symbols [6].

3.2 The Multi-sorted Case


We are now considering a theory of arrays parametric in the theories specifying
constraints over indexes and elements of the arrays. Formally, we need two in-
gredient theories, TI = (ΣI , CI ) and TE = (ΣE , CE ). We can freely assume that
ΣI and ΣE are disjoint (otherwise we can rename some symbols); for simplicity,
we let both signatures be mono-sorted (but extending our results to many-sorted
TE is quite straightforward): let us call INDEX the unique sort of TI and ELEM
the unique sort of TE .
The theory ARR2 (TI , TE ) of arrays over TI and TE is obtained from the union
of ΣI ∪ ΣE by adding to it infinitely many (fresh) free unary function sym-
bols (these new function symbols will have domain sort INDEX and codomain
sort ELEM). The models of ARR2 (TI , TE ) are the structures whose reducts to the
symbols of sorts INDEX and ELEM are models of TI and TE , respectively.
Consider now an atomic formula P (t1 , . . . , tn ) in the language of ARR2 (TI , TE )
(in the typical situation, P is the equality predicate). Since the predicate symbols
of ARR2 (TI , TE ) are from ΣI ∪ ΣE and ΣI ∩ ΣE = ∅, P belongs either to ΣI or
to ΣE ; in the latter case, all terms ti have sort ELEM and in the former case all
terms ti are ΣI -terms. We say that P (t1 , . . . , tn ) is an INDEX-atom in the former
case and that it is an ELEM-atom in the latter case.
When dealing with ARR2 (TI , TE ), we shall limit ourselves to quantified vari-
ables of sort INDEX : this limitation is justified by the benchmarks arising in
applications (see Section 4).2 A sentence in the language of ARR2 (TI , TE ) is said
2 Topmost existentially quantified variables of sort ELEM can be modeled by enriching TE with free constants.
to be monic iff it is in prenex form and every INDEX atom occurring in it contains


at most one variable falling within the scope of a universal quantifier.

Example 1. Consider the following sentences:

(I) ∀i. a(i) = i; (II) ∀i1 ∀i2 . (i1 ≤ i2 → a(i1 ) ≤ a(i2 ));
(III) ∃i1 ∃i2 . (i1 ≤ i2 ∧ a(i1 ) ≤ a(i2 )); (IV ) ∀i1 ∀i2 . a(i1 ) = a(i2 );
(V ) ∀i. (D2 (i) → a(i) = 0); (V I) ∃i ∀j. (a1 (j) < a2 (3i)).

The flat formula (I) is not well-typed, hence it is not allowed in ARR2 (P, P); however,
it is allowed in ARR1 (P). Formula (II) expresses the fact that the array a is sorted: it is
flat but not monic (because of the atom i1 ≤ i2 ). On the contrary, its negation (III) is
flat and monic (because i1 , i2 are now existentially quantified). Formula (IV) expresses
that the array a is constant; it is flat and monic (notice that the universally quantified
variables i1 , i2 both occur in a(i1 ) = a(i2 ) but the latter is an ELEM atom). Formula
(V) expresses that a is initialized so to have all even positions equal to 0: it is monic
and flat. Formula (VI) is monic but not flat because of the term a2 (3i) occurring in it;
however, in 3i no universally quantified variable occurs, so it is possible to produce by
flattening the following sentence

∃i ∃i′ ∀j (i′ = 3i ∧ a1 (j) < a2 (i′ ))

which is logically equivalent to (VI), it is flat and still lies in the ∃∗ ∀-class. Finally, as
a more complicated example, notice that the following sentence

∃k ∀i. (D2 (k) ∧ a(k) = ‘\0‘ ∧ (D2 (i) ∧ i < k → a(i) = ‘b‘) ∧ (¬D2 (i)∧i < k → a(i) = ‘c‘))

is monic and flat: it says that a represents a string of the kind (bc)∗ .

Theorem 2. If TI -satisfiability of ∃∗ ∀-sentences is decidable, then ARR2 (TI , TE )-


satisfiability of ∃∗ ∀∗ -monic-flat sentences is decidable.

Proof. As we did for SATMONO , we give a decision procedure, SATMULTI , for


the ∃∗ ∀∗ -monic-flat fragment of ARR2 (TI , TE ); for space reasons, we give here
just some informal justifications, the reader is referred to [2] for proofs. First
(Step I), the procedure guesses the sets (called ‘types’) of relevant INDEX atoms
satisfied in a model to be built. Subsequently (Step II) it introduces a repre-
sentative variable for each type together with the constraint that guessed types
are exhaustive. Finally (Step III, IV and V) the procedure applies combination
techniques for purification. 

The Decision Procedure SATMULTI . The algorithm is non-deterministic: the


input formula is satisfiable iff we can guess suitable data T , B so that the formulæ
FI , FE below are satisfiable.

Step I. Let F be a ∃∗ ∀∗ -monic-flat formula; let it be

F := ∃c ∀i.ψ(i, a(i), c, a(c)),


(where as usual ψ is a TI ∪TE -quantifier-free formula). Suppose a = a1 , . . . , as ,


i = i1 , . . . , in and c = c1 , . . . , ct . Consider the set (notice that all atoms in
K are ΣI -atoms and have just one free variable because F is monic)
K = {A(x, c) | A(ik , c) is an INDEX atom of F, 1 ≤ k ≤ n} ∪ {x = cl | 1 ≤ l ≤ t}
Let us call type a set of literals M such that: (i) each literal of M is an
atom in K or its negation; (ii) for all A(x, c) ∈ K, either A(x, c) ∈ M or
¬A(x, c) ∈ M . Guess a set T = {M1 , . . . , Mq } of types.
Step II. Let b = b1 , . . . , bq be a tuple of new variables of sort INDEX and let

F1 := ∃b ∃c. ( ∀x. ⋁1≤j≤q ⋀L∈Mj L(x, c) ∧ ⋀1≤j≤q ⋀L∈Mj L(bj , c) ∧ ⋀σ:i→b ψ(iσ, a(iσ), c, a(c)) )

where iσ is the tuple of terms σ(i1 ), . . . , σ(in ).


Step III. Let e = ⟨el,m ⟩ (1 ≤ l ≤ s, 1 ≤ m ≤ t + q) be a tuple of length s · (t + q) of free constants of sort ELEM. Consider the formula

F2 := ∃b ∃c. ( ∀x. ⋁1≤j≤q ⋀L∈Mj L(x, c) ∧ ⋀1≤j≤q ⋀L∈Mj L(bj , c) ∧ ψ̄(b, c, e) ∧ ⋀dm ,dn ∈b∗c ⋀1≤l≤s (dm = dn → el,m = el,n ) )

where b ∗ c := d1 , . . . , dq+t is the concatenation of the tuples b and c and ψ̄(b, c, e) is obtained from ⋀σ:i→b ψ(iσ, a(iσ), c, a(c)) by substituting each term in the tuple a(b) ∗ a(c) with the constant occupying the corresponding position in the tuple e.
Step IV. Let B be a full Boolean satisfying assignment for the atoms of the formula

F3 := ψ̄(b, c, e) ∧ ⋀dm ,dn ∈b∗c ⋀1≤l≤s (dm = dn → el,m = el,n )

and let ψ̄I (b, c), ψ̄E (e) be the (conjunction of the) sets of literals of sort
INDEX and ELEM, respectively, induced by B.
Step V. Check the TI -satisfiability of


FI := ∃b ∃c. ( ∀x. ⋁1≤j≤q ⋀L∈Mj L(x, c) ∧ ⋀1≤j≤q ⋀L∈Mj L(bj , c) ∧ ψ̄I (b, c) )

and the TE -satisfiability of


FE := ψ̄E (e)

Notice that FI is an ∃∗ ∀-sentence; FE is ground and the TE -satisfiability of FE


(considering the e as variables instead of as free constants) is decidable because
we assumed that all the theories we consider (hence our TE too) have quantifier-
free fragments decidable for satisfiability.
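Step I is the only genuinely non-deterministic part of SATMULTI. The sketch below (ours) merely enumerates the candidate types over a small, hypothetical atom set K; the guess T then ranges over subsets of this enumeration.

from itertools import product

def all_types(K):
    """Every set of literals choosing, for each atom in K, the atom or its negation."""
    return [frozenset(zip(K, signs)) for signs in product([True, False], repeat=len(K))]

K = ["x = c1", "x < c1", "x = c2"]     # hypothetical INDEX atoms A(x, c)
types = all_types(K)
print(len(types))                      # 2^|K| = 8 candidate types
# A guess T = {M1, ..., Mq} is a subset of `types`; Step II then introduces
# one representative variable b_j of sort INDEX for each guessed M_j.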
Theorem 2 applies to ARR2 (P, P) because P admits quantifier elimination. For
this theory, we can determine complexity upper and lower bounds:

Theorem 3. ARR2 (P, P)-satisfiability of ∃∗ ∀∗ -monic-flat sentences is NExpTime-


complete.

Proof. We use exponentially bounded domino systems for reduction [6,19], see [2]
for details. 

4 A Decidability Result for the Reachability Analysis of


Flat Array Programs

Based on the decidability results described in the previous section, we can now
achieve important decidability results in the context of reachability analysis for
programs handling arrays of unbounded length. As a reference theory, we shall
use ARR1 (P+ ) or ARR2 (P+ , P+ ), where P+ is P enriched with free constant sym-
bols and with definable predicate and function symbols. We do not enter into
more details concerning what a definable symbol is (see, e.g., [25]), we just un-
derline that definable symbols are nothing but useful macros that can be used to
formalize case-defined functions and SMT-LIB commands like if-then-else. The
addition of definable symbols does not compromise quantifier elimination, hence
decidability of P+ . Below, we let T be ARR1 (P+ ) or ARR2 (P+ , P+ ).
Henceforth, v will denote the variables of the programs we analyze. Formally, v = a, c where, according to our conventions, a is a tuple of
array variables (modeled as free unary function symbols of T in our framework)
and c a tuple of scalar variables; the latter can be modeled as variables in the
logical sense - in ARR2 (P+ , P+ ) we can model them either as variables of sort
INDEX or as free constants of sort ELEM.
A state-formula is a formula α(v) of T representing a (possibly infinite) set of
configurations of the program under analysis. A transition formula is a formula
of T of the kind τ (v, v′ ) where v′ is obtained from copying the variables in v
and adding a prime to each of them. For the purpose of this work, programs will
be represented by their control-flow automaton.
[Fig. 1(a) shows the procedure:
    procedure initEven ( a[N ] , v ) :
    l1   for (i = 0; i < N ; i = i + 2) a[i] = v;
    l2   for (i = 0; i < N ; i = i + 2) assert(a[i] = v);
Fig. 1(b) shows its control-flow graph, with locations linit , l1 , l2 , l3 , lerror and edges labeled τ1 , . . . , τ5 , τE (see Example 2).]

Fig. 1. The initEven procedure (a) and its control-flow graph (b)

Definition 1 (Programs). Given a set of variables v, a program is a triple


P = (L, Λ, E), where (i) L = {l1 , . . . , ln } is a set of program locations among
which we distinguish an initial location linit and an error location lerror ; (ii) Λ is a
finite set of transition formulæ {τ1 (v, v ), . . . , τr (v, v )} and (iii) E ⊆ L × Λ × L
is a set of actions.

We indicate by src, L, trg the three projection functions on E; that is, for
e = (li , τj , lk ) ∈ E, we have src(e) = li (this is called the ‘source’ location of
e), L(e) = τj (this is called the ‘label’ of e) and trg(e) = lk (this is called the
‘target’ location of e).

Example 2. Consider the procedure initEven in Fig. 1. For this procedure, a = a,


c = i, v. N is a constant of the background theory. Λ is the set of formulæ (we omit
identical updates):

τ1 := i′ = 0
τ2 := i < N ∧ a′ = λj.if (j = i) then v else a(j) ∧ i′ = i + 2
τ3 := i ≥ N ∧ i′ = 0
τ4 := i < N ∧ a(i) = v ∧ i′ = i + 2
τ5 := i ≥ N
τE := i < N ∧ a(i) ≠ v

The procedure initEven can be formalized as the control-flow graph depicted in Fig. 1(b),
where L = {linit , l1 , l2 , l3 , lerror }.
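For concreteness, the control-flow automaton of initEven can be written down as plain data in the sense of Definition 1; the sketch below is ours, with the edge set read off Fig. 1(b) and the labels kept as the strings of Example 2.

# Locations, labels Lambda (as strings) and edges E of initEven.
locations = ["l_init", "l1", "l2", "l3", "l_error"]

labels = {
    "tau1": "i' = 0",
    "tau2": "i < N & a' = lambda j. if j = i then v else a(j) & i' = i + 2",
    "tau3": "i >= N & i' = 0",
    "tau4": "i < N & a(i) = v & i' = i + 2",
    "tau5": "i >= N",
    "tauE": "i < N & a(i) != v",
}

edges = [                      # (src, L, trg)
    ("l_init", "tau1", "l1"),
    ("l1",     "tau2", "l1"),  # self-loop: the initialization loop
    ("l1",     "tau3", "l2"),
    ("l2",     "tau4", "l2"),  # self-loop: the assertion-checking loop
    ("l2",     "tau5", "l3"),
    ("l2",     "tauE", "l_error"),
]

src, lab, trg = (lambda e: e[0]), (lambda e: e[1]), (lambda e: e[2])
print([e for e in edges if trg(e) == "l_error"])   # the single error edge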

Definition 2 (Program paths). A program path (in short, path) of P =


(L, Λ, E) is a sequence ρ ∈ E n , i.e., ρ = e1 , e2 , . . . , en , such that for every
ei , ei+1 , trg(ei ) = src(ei+1 ). We denote with |ρ| the length of the path. An error
path is a path ρ with src(e1 ) = linit and trg(e|ρ| ) = lerror . A path ρ is a feasible path if ⋀1≤j≤|ρ| L(ej )(j) is T -satisfiable, where L(ej )(j) represents τij (v(j−1) , v(j) ), with L(ej ) = τij .

The (unbounded) reachability problem for a program P is to detect if P ad-


mits a feasible error path. Proving the safety of P, therefore, means solving the
reachability problem for P. This problem, given well known limiting results, is
not decidable for an arbitrary program P. The consequence is that, in general,
reachability analysis is sound, but not complete, and its incompleteness mani-
fests itself in (possible) divergence of the verification algorithm (see, e.g., [1]).
To gain decidability, we must first impose restrictions on the shape of the
transition formulæ, for instance we can constrain the analysis to formulæ falling
within decidable classes like those we analyzed in the previous section. This is
not sufficient however, due to the presence of loops in the control flow. Hence
we assume flatness conditions on such control flow and “accelerability” of the
transitions labeling self-loops. This is similar to what is done in [7, 9, 12] for
integer variable programs, but since we handle array variables we need specific
restrictions for acceleration. Our result for the decidability of the safety of an-
notated array programs builds upon the results presented in Section 3 and the
acceleration procedure presented in [3].
We first give the definition of flat0 -program, i.e., programs with only self-
loops for which each location belongs to at most one loop. Subsequently we will
identify sufficient conditions for achieving the full decidability of the reachability
problem for flat0 -programs.

Definition 3 (flat0 -program). A program P is a flat0 -program if for every


path ρ = e1 , . . . , en of P it holds that for every j < k (j, k ∈ {1, . . . , n}), if
src(ej ) = trg(ek ) then ej = ej+1 = · · · = ek .

We now turn our attention to transition formulæ. Acceleration is a well-known


formalism in the area of model-checking. It has been integrated in several frame-
works and constitutes a fundamental technology for the scalability and efficiency
of modern model checkers (e.g., [5]). Given a loop, represented as a transition re-
lation τ , the accelerated transition τ + allows to compute in one shot the precise
set of states reachable after n unwindings of that loop, for any n. This prevents
divergence of the reachability analysis along τ , caused by its unwinding. What
prevents the applicability of acceleration in the domain we are targeting is that
accelerations are not always definable. By definition, the acceleration of a transition τ (v, v′ ) is the union of the n-th compositions of τ with itself, i.e. it is τ + := ⋁n>0 τ n , where

τ 1 (v, v′ ) := τ (v, v′ ),    τ n+1 (v, v′ ) := ∃v′′ .(τ (v, v′′ ) ∧ τ n (v′′ , v′ )) .

τ + can be practically exploited only if there exists a formula ϕ(v, v′ ) equivalent, modulo the considered background theory, to ⋁n>0 τ n . Based on this observa-
tion on definability of accelerations, we are now ready to state a general result
about the decidability of the reachability problem for programs with arrays. The
theorem we give is, as we did for results in Section 3, modular and general. We
will show an instance of this result in the following section. Notationally, let us
extend the projection function L by denoting L+ (e) := L(e)+ if src(e) = trg(e)
and L+ (e) := L(e) otherwise, where L(e)+ denotes the acceleration of the tran-
sition labeling the edge e.
Theorem 4. Let F be a class of formulæ decidable for T -satisfiability. The


unbounded reachability problem for a flat0 -program P is decidable if (i) F is
closed under conjunctions and (ii) for each e ∈ E one can compute α(v, v′ ) ∈ F such that T |= L+ (e) ↔ α(v, v′ ).
Proof. Let ρ = e1 , . . . , en be an error path of P; when testing its feasibility,
according to Definition 3, we can limit ourselves to the case in which e1 , . . . , en
are all distinct, provided we replace the labels L(ek )(k) with L+ (ek )(k) in the formula ⋀1≤j≤n L(ej )(j) from Definition 2.3 Thus P is unsafe iff, for some path e1 , . . . , en whose edges are all distinct, the formula

L+ (e1 )(1) ∧ · · · ∧ L+ (en )(n)        (1)
is T -satisfiable. Since the involved paths are finitely many and T -satisfiability
of formulæ like (1) is decidable, the safety of P can be decided. 
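The proof is effective. A minimal sketch of the resulting procedure (ours, not the authors' tool), instantiated on the initEven graph and with the T -satisfiability test left as a stub callback, looks as follows:

# Enumerate the error paths with pairwise distinct edges, accelerate the
# labels of self-loops, and hand each conjunction (1) to a T-solver (stubbed).
EDGES = [("l_init", "tau1", "l1"), ("l1", "tau2", "l1"), ("l1", "tau3", "l2"),
         ("l2", "tau4", "l2"), ("l2", "tau5", "l3"), ("l2", "tauE", "l_error")]

def error_paths(edges, here="l_init", used=frozenset()):
    """All paths from l_init to l_error that never repeat an edge."""
    if here == "l_error":
        yield []
        return
    for e in edges:
        if e[0] == here and e not in used:
            for rest in error_paths(edges, e[2], used | {e}):
                yield [e] + rest

def accelerated(e):
    """L+(e): accelerate the label of a self-loop, keep other labels as they are."""
    return e[1] + "^+" if e[0] == e[2] else e[1]

def is_unsafe(edges, t_satisfiable):
    """P is unsafe iff some candidate path yields a T-satisfiable conjunction."""
    return any(t_satisfiable([accelerated(e) for e in rho])
               for rho in error_paths(edges))

print(list(error_paths(EDGES)))              # four candidate error paths
print(is_unsafe(EDGES, lambda fs: False))    # with a stub solver: False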

4.1 A Class of Array Programs with Decidable Reachability


Problem
We now produce a class of programs with arrays – we call it simple0A -programs–
for which requirements of Theorem 4 are met. The class of simple0A -programs
contains non recursive programs implementing searching, copying, comparing,
initializing, replacing and testing procedures. As an example, the initEven pro-
gram reported in Fig. 1 is a simple0A -program. Formally, a simple0A -program
P = (L, Λ, E) is a flat0 -program such that (i) every τ ∈ Λ is a formula be-
longing to one of the decidable classes covered by Corollary 1 or Theorem 3; (ii)
if e ∈ E is a self-loop, then L(e) is a simplek -assignment.
Simplek -assignments are transitions (defined below) for which the acceleration
is first-order definable and is a Flat Array Property. For a natural number k,
we denote by k̄ the term 1 + · · · + 1 (k-times) and by k̄ · t the term t + · · · + t
(k-times).
Definition 4 (simplek -assignment). Let k ≥ 0; a simplek -assignment is a
transition τ (v, v′ ) of the kind

φL (c, a[d]) ∧ d′ = d + k̄ ∧ d̃′ = d̃ ∧ a′ = λj.if (j = d) then t(c, a(d)) else a(j)

where (i) c = d, d̃ and (ii) the formula φL (c, a[d]) and the terms t(c, a[d]) are
flat.
The following Lemma (which is an instance of a more general result from [3])
gives the template for the accelerated counterpart of a simplek -assignment.
Lemma 1. Let τ (v, v′ ) be a simplek -assignment. Then τ + (v, v′ ) is T -equivalent to the formula

∃y > 0. ( ∀z. (d ≤ z < d + k̄ · y ∧ Dk̄ (z − d) → φL (z, d̃, a(z))) ∧ a′ = λj.U(j, y, v) ∧ d′ = d + k̄ · y ∧ d̃′ = d̃ )
3 Notice that by these replacements we can represent in one shot infinitely many paths, namely those executing self-loops any given number of times.
where the definable functions Uh (j, y, v), 1 ≤ h ≤ s of the tuple U are

if (d ≤ j < d + k̄ · y ∧ Dk̄ (j − d)) then th (j, d̃, a(j)) else ah (j) .


Example 3. Consider transition τ2 from the formalization of our running exam-
ple of Fig. 1. The acceleration τ2+ of such formula is (we omit identical updates)
 
∃y > 0. ( ∀z.(i ≤ z < i + 2y ∧ D2 (z − i) → z < N ) ∧ i′ = i + 2y ∧ a′ = λj. (if (i ≤ j < 2y + i ∧ D2 (j − i)) then v else a[j]) )
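To illustrate how such accelerated formulæ can be discharged in practice, the sketch below (ours) encodes τ2+ for z3 (the solver also used in Section 4.2), expressing the λ-update pointwise with a universal quantifier and D2 via the mod operator; i1 stands for i′ and a1 for a′.

from z3 import Array, Ints, IntSort, ForAll, Implies, And, If, Solver

i, i1, y, N, v, j, z = Ints('i i1 y N v j z')
a  = Array('a',  IntSort(), IntSort())
a1 = Array('a1', IntSort(), IntSort())    # the primed array a'

guard  = ForAll([z], Implies(And(i <= z, z < i + 2*y, (z - i) % 2 == 0), z < N))
update = ForAll([j], a1[j] == If(And(i <= j, j < i + 2*y, (j - i) % 2 == 0),
                                 v, a[j]))
tau2_plus = And(y > 0, guard, update, i1 == i + 2*y)

s = Solver()
s.add(tau2_plus, i == 0, N == 4)    # start the loop at i = 0 with N = 4
print(s.check())                    # expected sat (e.g., y = 2), modulo solver support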

We can now formally show that the reachability problem for simple0A -programs
is decidable, by instantiating Theorem 4 with the results obtained so far.
Theorem 5. The unbounded reachability problem for simple0A -programs is de-
cidable.
Proof. By prenex transformations, distributions of universal quantifiers over con-
junctions, etc., it is easy to see that the decidable classes covered by Corollary 1
or Theorem 3 are closed under conjunctions. Since the acceleration of a simplek -
assignment fits inside these classes (just eliminate definitions via λ-abstractions
by using universal quantifiers), Theorem 4 applies. 

4.2 Experimental Observations


We evaluated the capabilities of available SMT-Solvers on checking the satisfia-
bility of Flat Array Properties and for that we selected some simple0A -programs,
both safe and unsafe. Following the procedure identified in the proof of Theo-
rem 4 we generated 200 SMT-LIB2-compliant files with Flat Array Properties4 .
The simple0A -programs we selected perform some simple manipulations on ar-
rays of unknown length, like searching for a given element, initializing the array,
swapping the arrays, copying one array into another, etc. We tested cvc4 [4]
(version 1.2) and Z3 [10] (version 4.3.1) on the generated SMT-LIB2 files. Exper-
imentation has been performed on a machine equipped with a 2.66 GHz CPU
and 4GB of RAM running Mac OSX 10.8.5. In our evaluation, both tools time out on some proof obligations5. These results suggest that Flat Array Properties identify fragments of theories which are decidable, but whose satisfiability is still not entirely covered by modern and highly engineered tools.

5 Conclusions and Related Work


In this paper we identified a class of Flat Array Properties, a quantified fragment
of theories of arrays, admitting decision procedures. Our results are parameter-
ized in the theories used to model indexes and elements of the array; in this sense,
4 Such files have been generated automatically with our prototype tool, which we make available at www.inf.usi.ch/phd/alberti/prj/booster.
5 See the discussion in [2] for more information on the experiments.
there is some similarity with [18], although (contrary to [18]) we consider purely
syntactically specified classes of formulæ. We provided a complexity analysis
of our decision procedures. We also showed that the decidability of Flat Array
Properties, combined with acceleration results, allows to depict a sound and
complete procedure for checking the safety of a class of programs with arrays.
The modular nature of our solution makes our contributions orthogonal with
respect to the state of the art: we can enrich P with various definable or even
not definable symbols [24] and get from our Theorems 1,2 decidable classes
which are far from the scope of existing results. Still, it is interesting to notice
that also the special cases of the decidable classes covered by Corollary 1 and
Theorem 3 are orthogonal to the results from the literature. To this aim, we
make a closer comparison with [8,15]. The two fragments considered in [8,15] are
characterized by rather restrictive syntactic constraints. In [15], a subclass of the ∃∗ ∀-fragment of ARR1 (T ) called SIL (Single Index Logic) is considered. In
this class, formulæ are built according to a grammar allowing (i) as atoms only
difference logic constraints and some equations modulo a fixed integer and (ii) as
universally quantified subformulæ only formulæ of the kind ∀i.φ(i) → ψ(i, a(i +
k̄)) (here k is a tuple of integers) where φ, ψ are conjunctions of atoms (in
particular, no disjunction is allowed in ψ). On the other side, SIL includes some
non-flat formulæ, due to the presence of constant increment terms i + k̄ in the
consequents of the above universally quantified implications. Similar restrictions
are in [16]. The Array Property Fragment described in [8] is basically a subclass
of the ∃∗ ∀∗ -fragment of ARR2 (P, P); however universally quantified subformulæ
are constrained to be of the kind ∀i.φ(i) → ψ(a(i)), where in addition the INDEX
part φ(i) must be a conjunction of atoms of the kind i ≤ j, i ≤ t, t ≤ i (with
i, j ∈ i and where t does not contain occurrences of the universally quantified
variables i). These formulæ are flat but not monic because of the atoms i ≤ j.
From a computational point of view, a complexity bound for SATMONO has
been shown in the proof of Theorem 1, while the complexity of the decision pro-
cedure proposed in [15] is unknown. On the other hand, both SATMULTI and the
decision procedure described in [8] run in NExpTime (the decision procedure
in [8] is in NP only if the number of universally quantified index variables is
bounded by a constant N ). Our decision procedures for quantified formulæ are
also partially different, in spirit, from those presented so far in the SMT commu-
nity. While the vast majority of SMT-Solvers address the problem of checking
the satisfiability of quantified formulæ via instantiation (see, e.g., [8, 11, 14, 23]),
our procedures – in particular SATMULTI – are still based on instantiation, but the
instantiation refers to a set of terms enlarged with the free constants witnessing
the guessed set of realized types.
From the point of view of the applications, providing a full decidability result for the unbounded reachability analysis of a class of array programs is what differentiates our work from other contributions such as [1,3].

References
1. Alberti, F., Bruttomesso, R., Ghilardi, S., Ranise, S., Sharygina, N.: Lazy abstrac-
tion with interpolants for arrays. In: Bjørner, N., Voronkov, A. (eds.) LPAR-18.
LNCS, vol. 7180, pp. 46–61. Springer, Heidelberg (2012)
2. Alberti, F., Ghilardi, S., Sharygina, N.: Decision procedures for flat array
properties. Technical Report 2013/04, University of Lugano (October 2013),
http://www.inf.usi.ch/research_publication.htm?id=77
3. Alberti, F., Ghilardi, S., Sharygina, N.: Definability of accelerated relations in a
theory of arrays and its applications. In: Fontaine, P., Ringeissen, C., Schmidt,
R.A. (eds.) FroCoS 2013. LNCS, vol. 8152, pp. 23–39. Springer, Heidelberg (2013)
4. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T.,
Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV
2011. LNCS, vol. 6806, pp. 171–177. Springer, Heidelberg (2011)
5. Behrmann, G., Bengtsson, J., David, A., Larsen, K.G., Pettersson, P., Yi, W.:
UPPAAL implementation secrets. In: Damm, W., Olderog, E.-R. (eds.) FTRTFT
2002. LNCS, vol. 2469, pp. 3–22. Springer, Heidelberg (2002)
6. Börger, E., Grädel, E., Gurevich, Y.: The classical decision problem. Perspectives
in Mathematical Logic. Springer, Berlin (1997)
7. Bozga, M., Iosif, R., Lakhnech, Y.: Flat parametric counter automata. Fundamenta
Informaticae (91), 275–303 (2009)
8. Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: Emer-
son, E.A., Namjoshi, K.S. (eds.) VMCAI 2006. LNCS, vol. 3855, pp. 427–442.
Springer, Heidelberg (2006)
9. Comon, H., Jurski, Y.: Multiple counters automata, safety analysis and pres-
burger arithmetic. In: Vardi, M.Y. (ed.) CAV 1998. LNCS, vol. 1427, pp. 268–279.
Springer, Heidelberg (1998)
10. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
11. Detlefs, D.L., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program check-
ing. Technical Report HPL-2003-148, HP Labs (2003)
12. Finkel, A., Leroux, J.: How to compose Presburger-accelerations: Applications to
broadcast protocols. In: Agrawal, M., Seth, A.K. (eds.) FSTTCS 2002. LNCS,
vol. 2556, pp. 145–156. Springer, Heidelberg (2002)
13. Ganzinger, H.: Shostak light. In: Voronkov, A. (ed.) CADE 2002. LNCS (LNAI),
vol. 2392, pp. 332–346. Springer, Heidelberg (2002)
14. Ge, Y., de Moura, L.: Complete instantiation for quantified formulas in satisfi-
abiliby modulo theories. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS,
vol. 5643, pp. 306–320. Springer, Heidelberg (2009)
15. Habermehl, P., Iosif, R., Vojnar, T.: A logic of singly indexed arrays. In: Cervesato,
I., Veith, H., Voronkov, A. (eds.) LPAR 2008. LNCS (LNAI), vol. 5330, pp. 558–
573. Springer, Heidelberg (2008)
16. Habermehl, P., Iosif, R., Vojnar, T.: What else is decidable about integer arrays?
In: Amadio, R.M. (ed.) FOSSACS 2008. LNCS, vol. 4962, pp. 474–489. Springer,
Heidelberg (2008)
17. Halpern, J.Y.: Presburger arithmetic with unary predicates is Π11 complete. J.
Symbolic Logic 56(2), 637–642 (1991), doi:10.2307/2274706
18. Ihlemann, C., Jacobs, S., Sofronie-Stokkermans, V.: On local reasoning in verifica-
tion. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp.
265–281. Springer, Heidelberg (2008)

19. Lewis, H.B.: Complexity of solvable cases of the decision problem for the predicate
calculus. In: 19th Ann. Symp. on Found. of Comp. Sci. pp. 35–47. IEEE (1978)
20. Nieuwenhuis, R., Oliveras, A.: DPLL(T) with Exhaustive Theory Propagation and
Its Application to Difference Logic. In: Etessami, K., Rajamani, S.K. (eds.) CAV
2005. LNCS, vol. 3576, pp. 321–334. Springer, Heidelberg (2005)
21. Oppen, D.C.: A superexponential upper bound on the complexity of Presburger
arithmetic. J. Comput. System Sci. 16(3), 323–332 (1978)
22. Ranise, S., Tinelli, C.: The Satisfiability Modulo Theories Library, SMT-LIB
(2006), http://www.smt-lib.org
23. Reynolds, A., Tinelli, C., Goel, A., Krstić, S., Deters, M., Barrett, C.: Quantifier
instantiation techniques for finite model finding in SMT. In: Bonacina, M.P. (ed.)
CADE 2013. LNCS, vol. 7898, pp. 377–391. Springer, Heidelberg (2013)
24. Semënov, A.L.: Logical theories of one-place functions on the set of natural num-
bers. Izvestiya: Mathematics 22, 587–618 (1984)
25. Shoenfield, J.R.: Mathematical logic. Association for Symbolic Logic, Urbana
(2001) (reprint of the 1973 second printing )
26. Tinelli, C., Zarba, C.G.: Combining nonstably infinite theories. J. Automat. Rea-
son. 34(3), 209–238 (2005)
SATMC: A SAT-Based Model Checker
for Security-Critical Systems

Alessandro Armando1,2, Roberto Carbone2, and Luca Compagna3

1 DIBRIS, University of Genova, Genova, Italy
2 Security & Trust, FBK, Trento, Italy
3 Product Security Research, SAP AG, Sophia Antipolis, France
{armando,carbone}@fbk.eu, [email protected]

Abstract. We present SATMC 3.0, a SAT-based bounded model checker


for security-critical systems that stems from a successful combination of
encoding techniques originally developed for planning with techniques de-
veloped for the analysis of reactive systems. SATMC has been successfully
applied in a variety of application domains (security protocols, security-
sensitive business processes, and cryptographic APIs) and for different
purposes (design-time security analysis and security testing). SATMC
strikes a balance between general purpose model checkers and security
protocol analyzers as witnessed by a number of important success sto-
ries including the discovery of a serious man-in-the-middle attack on the
SAML-based Single Sign-On (SSO) for Google Apps, an authentication
flaw in the SAML 2.0 Web Browser SSO Profile, and a number of attacks
on PKCS#11 Security Tokens. SATMC is integrated and used as back-end
in a number of research prototypes (e.g., the AVISPA Tool, Tookan, the
SPaCIoS Tool) and industrial-strength tools (e.g., the Security Validator
plugin for SAP NetWeaver BPM).

1 Introduction
With the convergence of the social, cloud, and mobile paradigms, information
and communication technologies are affecting our everyday personal and working
lives to an unprecedented depth and scale. We routinely use online services that stem
from the fruitful combination of mobile applications, web applications, cloud
services, and/or social networks. Sensitive data handled by these services often
flows across organizational boundaries and both the privacy of the users and the
assets of organizations are often at risk.
Solutions (e.g., security protocols and services) that aim to securely combine
the ever-growing ecosystem of online services are already available. But they
are notoriously difficult to get right. Many security-critical protocols and ser-
vices have been designed and developed only to be found flawed years after their
deployment. These flaws are usually due to the complex and unexpected inter-
actions of the protocols and services as well as to the possible interference of
malicious agents. Since these weaknesses are very difficult to spot by traditional
verification techniques (e.g., manual inspection and testing), security-critical sys-
tems are a natural target for formal methods techniques.


SATMC is a SAT-based bounded model checker for security-critical systems


that combines encoding techniques developed for planning [26] with techniques
developed for the analysis of reactive systems [15]. The approach reduces the
problem of determining whether the system violates a security goal in k > 0
steps to the problem of checking the satisfiability of a propositional formula
(the SAT problem). Modern SAT solvers can tackle SAT problems of practical
relevance in milliseconds. Since its first release in 2004 [8], SATMC has been
enhanced to support Horn clauses and first-order LTL formulae. This makes
SATMC 3.0 able to support the security analysis of distributed systems that
exchange messages over a wide range of secure channels, are subject to sophis-
ticated security policies, and/or aim at achieving a variety of security goals.
Since (both honest and malicious) agents can build and exchange messages of
finite, but arbitrary complexity (through concatenation and a variety of cryp-
tographic primitives), most security-critical, distributed systems are inherently
infinite state. For this reason general purpose model checkers (e.g., SPIN [24],
NuSMV [19])—which assume the input system to be finite state—are not suited
for the analysis of a large and important set of security-critical systems (e.g.,
cryptographic protocols and APIs). Special purpose tools (most notably, secu-
rity protocol analyzers, e.g., CL-AtSe [27], OFMC [13], Proverif [16]) are capable
of very good performance and support reasoning about the algebraic properties
of cryptographic operators. SATMC complements security protocol analyzers
by supporting a powerful specification language for communication channels,
intruder capabilities, and security goals based on first-order LTL.
SATMC strikes a balance between general purpose model checkers and se-
curity protocol analyzers. SATMC has been successfully applied in a variety of
application domains (namely, security protocols, security-sensitive business pro-
cesses, and cryptographic APIs) and for different purposes (e.g., design-time
security analysis and security testing). SATMC is integrated and used as a back-
end in a number of research prototypes (the AVISPA Tool [2], Tookan [18], the
AVANTSSAR Platform [1], and the SPaCIoS Tool [28]) and industrial-strength
tools (the Security Validator plugin for SAP NetWeaver BPM1 ). The effective-
ness of SATMC is witnessed by the key role it played in the discovery of:

– a flaw in a “patched” version of the protocol for online contract signing


proposed by Asokan, Shoup, and Waidner (ASW) [3],
– a serious man-in-the-middle attack on the SAML-based SSO for Google
Apps [6] and, more recently, an authentication flaw in the SAML 2.0 Web
Browser SSO Profile and related vulnerabilities on actual products [5],
– a number of attacks on the PKCS#11 Security Tokens [18], and
– a flaw in a two-factor and two-channel authentication protocol [7].
As shown in Figure 1, the applicability of SATMC in different domains is en-
abled by domain-specific connectors that translate the system and the property
specifications into ASLan [12], a specification language based on set-rewriting,
Horn clauses, and first-order LTL which is amenable to formal analysis. As shown
1 http://scn.sap.com/docs/DOC-32838

Fig. 1. High-level Overview (domain-specific connectors for security APIs, BPMN, and security protocols translate the user's specification into the ASLan format, which SATMC analyzes using NuSMV and MiniSAT as back-ends)

in the same figure, SATMC leverages NuSMV to generate the SAT encoding for
the LTL formulae and MiniSAT [22] to solve the SAT problems.

Structure of the Paper. In the next section we present some success stories
related to the application domains wherein SATMC has been so far employed.
In Section 3 we provide the formal framework and in Section 4 we illustrate
how the ASLan specification language can be used to specify security-critical
systems, the abilities of the attackers, and the security goals. In Section 5 we
present the bounded model checking procedure implemented in SATMC and the
architecture of the tool, and we conclude in Section 6 with some final remarks.

2 Success Stories
SATMC has been successfully used to support the security analysis and testing
in a variety of industry relevant application domains: security protocols, business
processes, and security APIs.
Security Protocols. Security protocols are communication protocols aiming
to achieve security assurances of various kinds through the usage of crypto-
graphic primitives. They are key to securing distributed information infrastruc-
tures, including—and most notably—the Web. The SAML 2.0 Single Sign-On
protocol [21] (SAML SSO, for short) is the established standard for cross-domain
browser-based SSO for enterprises. Figure 2 shows the prototypical use case for
the SAML SSO that enables a Service Provider (SP) to authenticate a Client (C)
via an Identity Provider (IdP): C asks SP to provide the resource at URI (step
1). SP then redirects C to IdP with the authentication request AReq(ID, SP),
where ID uniquely identifies the request (steps 2 and 3). IdP then challenges C
to provide valid credentials (gray dashed arrow). If the authentication succeeds,
IdP builds and sends C a digitally signed authentication assertion ({AA}K−1_IdP) embedded into an HTTP form (step 4). This form also includes some script
that automatically posts the message to SP (step 5). SP checks the assertion

C  -> SP:  1. GET URI
SP -> C:   2. HTTP302 IdP?SAMLRequest=AReq(ID, SP)
C  -> IdP: 3. GET IdP?SAMLRequest=AReq(ID, SP)
           (IdP builds an authentication assertion AA = AAssert(ID, C, IdP, SP))
IdP -> C:  4. HTTP200 Form(. . .)
C  -> SP:  5. POST SP?SAMLResponse=AResp(ID, SP, IdP, {AA}K−1_IdP)
SP -> C:   6. HTTP200 Resource(URI)

Fig. 2. SAML SSO Protocol

and then delivers the requested resource to C (step 6). Upon successful execu-
tion of the protocol, C and SP are mutually authenticated and the resource
has been confidentially delivered to C. To achieve this, SAML SSO—as most
of the application-level protocols—assumes that the communication between C
and SP as well as that between C and IdP is carried over unilateral SSL/TLS
communication channels. It must be noted that even with secure communication
channels in place, designing and developing application-level security protocols
such as the SAML SSO remains a challenge. These protocols are highly con-
figurable, are described in bulky natural language specifications, and deviations
from the standard may be dictated by application-specific requirements of the
organization.
The SAML-based SSO for Google Apps in operation until June 2008 deviated
from the standard in a few, seemingly minor ways. By using SATMC, we discov-
ered an authentication flaw in the service that allowed a malicious SP to mount
a severe man-in-the-middle attack [6]. We readily informed Google and the US-
CERT of the problem. In response to our findings Google developed a patch and
asked their customers to update their applications accordingly. A vulnerability
report was then released by the US-CERT.2 The severity of the vulnerability has
been rated High by NIST.3 By using SATMC we also discovered an authen-
tication flaw in the prototypical SAML SSO use case [5]. This flaw paves the way
to launching Cross-Site Scripting (XSS) and Cross-Site Request Forgery (XSRF)
attacks, as witnessed by a new XSS attack that we identified in the SAML-based
SSO for Google Apps. We reported the problem to OASIS which subsequently
released an errata addressing the issue.4
We also used SATMC at SAP as a back-end for security protocol analysis
and testing (AVANTSSAR [1] and SPaCIoS [28]) to assist development teams
in the design and development of the SAP NetWeaver SAML Single Sign-On
(SAP NGSSO) and SAP OAuth 2.0 solutions. Overall, more than one hundred
different protocol configurations and corresponding formal models have been
analyzed, showing that both SAP NGSSO and SAP OAuth2 services are indeed
well designed.
2 http://www.kb.cert.org/vuls/id/612636
3 http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2008-3891
4 http://tools.oasis-open.org/issues/browse/SECURITY-12

Fig. 3. A simple travel approval process with annotated security requirements (tasks: Request Travel performed by Staff, Approve Travel and Approve Budget performed by Managers, Notify Requestor; annotated with the constraints SoD1, SoD2, and SoD3)

All in all, SATMC has been key to the analysis of various security protocols
of industrial complexity, leading to the discovery of a number of serious flaws,
including a vulnerability in a “patched” version of the optimistic fair-exchange
contract signing protocol developed by Asokan, Shoup, and Waidner [3] and in a protocol for strong authentication based on the GSM infrastructure [7].
Business Processes. A Business Process (BP) is a workflow of activities
whose execution aims to accomplish a specific business goal. BPs must be care-
fully designed and executed so to comply with security and regulatory com-
pliance requirements. Figure 3 illustrates a simple example of a BP for travel
request approval: a staff member may issue a travel request. Both the reason
and the budget of the travel must be approved by managers. Afterwards, the
requesting user is notified whether the request is granted or not. The BP shall
ensure a number of authorization requirements: only managers shall be able to
approve the reason and budget of travel requests. Moreover, a manager should
not be allowed to approve her own travel request (Separation of Duty). Finally,
the manager that approves the travel reason should not get access to the details
of the travel budget (and vice versa) to perform her job (Need-to-Know Princi-
ple). Checking whether a given BP of real-world complexity complies with these kinds of requirements is difficult.
SATMC has been used to model check BPs against high-level authorization
requirements [10]. Moreover, SATMC lies at the core of a Security Validation
prototype for BPs developed by the Product Security Research unit at SAP.
This prototype can integrate off-the-shelf Business Process Management (BPM)
systems (e.g., SAP Netweaver BPM and Activity) to support BP analysts in the
evaluation of BP compliance. It enables a BP analyst to easily specify the security
goals and triggers SATMC via a translation of the BP workflow, data, security
policy and goal into ASLan. As soon as a flaw is discovered, it is graphically
rendered to the analyst [20,11].
Security APIs. A Security API is an Application Program Interface that allows
untrusted code to access sensitive resources in a secure way. Figure 4 shows a few
methods of the Java interface5 for the security API defined by the PKCS#11 stan-
dard [25]. The Java interface and its implementation allow to access the PKCS#11
modules of smart cards or other hardware security modules (HSM) where
5 http://javadoc.iaik.tugraz.at/pkcs11_wrapper/1.2.15/iaik/pkcs/pkcs11/wrapper/PKCS11.html

// modifies the value of one or more object attributes


void C_SetAttributeValue(long hSes, long hObj, CK_ATTRIBUTE[] pAtt)
// initializes a decryption operation
void C_DecryptInit(long hSes, CK_MECHANISM pMechanism, long hKey)
// decrypt encrypted data
byte[] C_Decrypt(long hSes, byte[] pEncryptedData)
// wraps (i.e., encrypts) a key
byte[] C_WrapKey(long hSes, CK_MECHANISM pMechanism, long hWrappingKey, long hKey)

Fig. 4. PKCS#11: Java interface

sensitive resources (e.g., cryptographic keys, pin numbers) can be stored. These
resources can be associated with attributes (cf. C_SetAttributeValue) stat-
ing, e.g., whether they can be extracted from the device or not, whether a certain
key can be used to wrap (encrypt) another key, etc. For instance, if an object is
set to be non-extractable, then it cannot be reset and become extractable again.
More generally, changes to the attribute values and accesses to sensitive resources must comply with the policy that the security API is designed to enforce, and this must hold for any possible sequence of invocations of the methods offered by the API (C_Decrypt, C_WrapKey, etc.).
SATMC lies at the core of Tookan [18], a tool capable of automatically detecting and reproducing policy violations in commercially available cryptographic security tokens, exploiting vulnerabilities in their RSA PKCS#11-based APIs. Tookan
can automatically reverse-engineer real PKCS#11 tokens, deduce their function-
alities, construct formal models of the API for the SATMC model checker, and
then execute the attack traces found by SATMC directly against the actual to-
ken. Tookan has been able to detect a variety of severe attacks on a number of
commercial tokens (e.g., SecurID800 by RSA, CardOS V4.3 B by Siemens) [23].
Cryptosense6 is a spin-off recently established on top of the success of Tookan.
6 http://cryptosense.com/

3 Formal Framework

A fact is an atomic formula of a first-order language. We consider a language L


defined as the smallest set of formulae containing facts and equalities between
(first-order) terms as atomic propositions as well as the formulae built with the
usual propositional connectives (¬, ∨, . . .), first-order quantifiers (∀ and ∃), and
temporal operators (F for “some time in the future”, G for “globally”, O for
“some time in the past”, . . . ). A formula is closed if and only if all variables in
it are bound by some quantifier.
Let V be the set of variables in L; let F and T be the (possibly infinite) sets
of ground (i.e., variable-free) facts and terms of L, respectively. A model is a 4-tuple M = ⟨I, R, H, C⟩, where

– I ⊆ F is the initial state,


– R is a set of rewrite rules, i.e., expressions of the form (L −−r(v1,...,vn)−→ R),
where L and R are finite sets of facts, r is a rule name (i.e., an n-ary function
symbol uniquely associated with the rule) for n ≥ 0, and v1, . . . , vn are the
symbol uniquely associated with the rule) for n ≥ 0, and v1 , . . . , vn are the
variables in L; it is required that the variables occurring in R also occur in
L.
c(v1 ,...,vn )
– H is a set of Horn clauses, i.e., expressions of the form (h ←−−−−−−− B),
where h is a fact, B is a finite set of facts, c is a Horn clause name (i.e., an
n-ary function symbol uniquely associated with the clause) for n ≥ 0, and
v1 , . . . , vn are the variables occurring in the clause.
– C ⊆ L is a set of closed formulae called constraints.

An assignment is a total function from V into T , i.e., σ : V → T . Assignments


are extended to the facts and terms of L in the obvious way. Let S ⊆ F; the closure of S under H, in symbols S_H, is the smallest set of facts containing S and such that for all (h ←−c(...)−− B) ∈ H and σ : V → T, if Bσ ⊆ S_H then hσ ∈ S_H. A set of facts S is closed under H iff S_H = S. We interpret the facts in S_H as the propositions holding in the state represented by S, all other facts being false (closed-world assumption). Let (L −ρ→ R) ∈ R and σ : V → T. We say that rule (instance) ρσ is applicable in state S if and only if Lσ ⊆ S_H, and when this is the case S′ = app_ρσ(S) = (S \ Lσ) ∪ Rσ is the state resulting
from the execution of ρσ in S. A path π is an alternating sequence of states
and rule instances S0 ρ1 S1 ρ2 . . . such that Si = app_ρi(Si−1), for i = 1, 2, . . .. If, additionally, S0 ⊆ I, then we say that the path is initialized. Let π = S0 ρ1 S1 . . . be a path; we define π(i) = Si and πi = Si ρi+1 Si+1 . . .; π(i) and πi are the i-th state of the path and the suffix of the path starting with the i-th state, respectively. We assume that paths have infinite length. (This can always be obtained by adding stuttering transitions to the system.) Let π be an initialized
path of M . An LTL formula φ is valid on π under σ, in symbols π |=σ φ, if and
only if π0 |=σ φ, where πi |=σ φ is inductively defined as follows.

πi |=σ f         iff  fσ ∈ π(i)_H   (f is a fact)
πi |=σ t1 = t2   iff  t1σ and t2σ are the same term
πi |=σ ¬φ        iff  πi ⊭σ φ
πi |=σ φ ∨ ψ     iff  πi |=σ φ or πi |=σ ψ
πi |=σ F(φ)      iff  there exists j ≥ i such that πj |=σ φ
πi |=σ Gφ        iff  for all j ≥ i, πj |=σ φ
πi |=σ Oφ        iff  there exists 0 ≤ j ≤ i such that πj |=σ φ
πi |=σ ∃x.φ      iff  there exists t ∈ T such that πi |=σ[t/x] φ

where σ[t/x] is the assignment that associates x with t and all other vari-
ables y with σ(y). The semantics of the remaining connectives and temporal
operators, as well as of the universal quantifier, is defined analogously. Let
M1 = ⟨I1, R1, H1, C1⟩ and M2 = ⟨I2, R2, H2, C2⟩. The parallel composition of M1 and M2 is the model M1 ∥ M2 = ⟨I1 ∪ I2, R1 ∪ R2, H1 ∪ H2, C1 ∪ C2⟩. Let M = ⟨I, R, H, C⟩ be a model and φ ∈ L. We say that φ is valid in M, in symbols
M |= φ, if and only if π |=σ φ for all initialized paths π of M and all assignments
σ such that π |=σ ψ for all ψ ∈ C.
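To make the set-rewriting semantics concrete, here is a minimal Python sketch (our own illustration, not part of SATMC) of Horn-clause closure and of the application of a ground rule instance to a state represented as a set of ground facts; the facts and the rule below are invented:

# States are sets of ground facts, encoded as tuples ('name', arg1, ...).
# Horn clauses are ground here for simplicity: (head, [body facts]).
def closure(state, horn_clauses):
    """Smallest superset of `state` closed under the given ground Horn clauses."""
    closed = set(state)
    changed = True
    while changed:
        changed = False
        for head, body in horn_clauses:
            if head not in closed and all(b in closed for b in body):
                closed.add(head)
                changed = True
    return closed

def apply_rule(state, lhs, rhs, horn_clauses):
    """Ground rule instance L -> R: applicable iff L is contained in the closure
    of the state; the successor state is (state \\ L) U R."""
    if not set(lhs) <= closure(state, horn_clauses):
        return None          # not applicable
    return (set(state) - set(lhs)) | set(rhs)

# Invented toy example: a message sent on a channel is received and overheard.
s0 = {('sent', 'a', 'a', 'b', 'm', 'c')}
s1 = apply_rule(s0,
                lhs=[('sent', 'a', 'a', 'b', 'm', 'c')],
                rhs=[('rcvd', 'b', 'a', 'm', 'c'), ('ik', 'm')],
                horn_clauses=[])
print(s1)   # {('rcvd', 'b', 'a', 'm', 'c'), ('ik', 'm')}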

Table 1. Facts and their informal meaning

Domain independent:
  sent(s, b, a, m, c)  -  s sent m on c to a pretending to be b
  rcvd(a, b, m, c)     -  m (supposedly sent by b) has been received on c by a
  contains(d, ds)      -  d is member of ds
  ik(m)                -  the intruder knows m
Protocols:
  stater(j, a, ts)     -  a plays r, has internal state ts, and can execute step j
Business processes:
  pa(r, t)             -  r has the permission to perform t
  ua(a, r)             -  a is assigned to r
  executed(a, t)       -  a executed t
  granted(a, t)        -  a is granted to execute t
APIs:
  attrs(as)            -  security token has attributes as

Legenda: s, a, b: agents; m: message; c: channel; r: role; j: protocol step; ts: list of terms; t: task; d: data; ds: set of data; o: resource object; as: set of attributes

4 Modeling Security-Critical Systems


We are interested in model checking problems of the form:

MS ∥ MI |= G    (1)

where MS = ⟨IS, RS, HS, CS⟩ and MI = ⟨II, RI, HI, CI⟩ are the model of the


security-sensitive system and of the intruder respectively and G is an LTL for-
mula expressing the security properties that the combined model must enjoy.
Table 1 presents an excerpt of the facts (2nd column) used in the different
application domains (1st column). Their informal meaning is explained in the
rightmost column. Some facts have a fixed meaning: ik models the intruder
knowledge, sent and rcvd are used to model communication, and contains ex-
presses set membership. Other facts are domain specific: stater (j, a, ts) models
the state of honest agents in security protocols; pa(r, t), ua(a, r), executed(a, t)
and granted(a, t) are used to represent the security policy and task execution
in business processes; finally attrs(as) is used to model attribute-value assign-
ments to resource objects in security APIs. Here and in the sequel we use type-
writer font to write facts and rules with the additional convention that variables
are capitalized (e.g., C, AReq), whereas constants and function symbols begin
with a lower-case letter (e.g., hReq).
Formal Modeling of Security-Critical Systems. In the case of security
protocols, the initial state IS contains a state-fact stater (1, a, ts) for each agent
a. In case of business processes, the initial state specifies which tasks are ready
for execution as well as the access control policy (e.g., the user-role and the
role-permission assignment relations). In case of security APIs, the initial state
specifies some attribute-value assignments.
Let us now consider an example of rewriting rules in RS . The reception of
message 2 by the client and the forwarding of message 3 in Figure 2 are modeled
by the following rewriting rule:

rcvd(C, SP, hRsp(c30x, IdP, AReq), CSP2C) · statec(2, C, [SP, . . . , CC2IdP])
  −−send2(C,...,CC2IdP)−→
statec(3, C, [AReq, SP, . . . , CC2IdP]) · sent(C, C, IdP, hReq(get, IdP, AReq), CC2IdP)
The clauses in HS support the specification, e.g., of access control policies. For
instance, in case of business processes, the Role-based Access Control (RBAC)
model can be naturally and succinctly specified as follows:
granted(A, T) ←−grant(A,R,T)−− ua(A, R), pa(R, T)
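As a purely illustrative Python sketch (ours, not ASLan or SATMC) of the effect of this clause, the granted facts can be derived from the ua and pa facts present in a state; the agent, role, and task names are invented:

def rbac_granted(facts):
    """Derive granted(A, T) from ua(A, R) and pa(R, T), mimicking the
    grant(A, R, T) Horn clause above (illustrative sketch only)."""
    ua = {(a, r) for (name, a, r) in facts if name == 'ua'}
    pa = {(r, t) for (name, r, t) in facts if name == 'pa'}
    return {('granted', a, t) for (a, r1) in ua for (r2, t) in pa if r1 == r2}

state = {('ua', 'alice', 'manager'), ('ua', 'bob', 'staff'),
         ('pa', 'manager', 'approve_travel'), ('pa', 'staff', 'request_travel')}
print(rbac_granted(state))
# {('granted', 'alice', 'approve_travel'), ('granted', 'bob', 'request_travel')}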
Moreover, security-critical systems often rely on assumptions on the behavior
of the principals involved (e.g., progress, availability). These assumptions can
be specified by adding suitably defined LTL formulae to CS . Some examples are
provided in the last two rows of Table 2.
Formal Modeling of the Intruder. A model that corresponds to the Dolev-
Yao intruder is given by MDY = ⟨∅, RDY, HDY, ∅⟩, where the rewrite rules
in RDY model the ability to overhear, divert, and intercept messages and the
clauses in HDY model the inferential capabilities, e.g., the ability to decrypt
messages when the key used for encryption is known to the intruder as well as
that to forge new messages. The model of the Dolev-Yao intruder MDY can
be complemented by a model MI′ = ⟨II′, RI′, HI′, CI′⟩, where II′ contains the facts representing the initial knowledge in the scenario considered, RI′ and HI′ model additional, domain-specific behaviors of the intruder, and CI′ may instead constrain the otherwise allowed behaviors. Thus the model of the intruder stems from the parallel combination of MDY and MI′, i.e., MI = (MDY ∥ MI′).
For instance, the behavior of the intruder for security APIs can be extended
using the following rule that models the Java method C_Decrypt of Figure 4:

ik(crypt(K, R)) · ik(hand(N, inv(K))) · attrs(KeyAttrs) · contains(attr(decrypt, true, N), KeyAttrs)
  −−decrypt_key_asym(KeyAttrs,K,R,N)−→ ik(R) · LHS
where LHS abbreviates the left hand side of the rule. This rule states that if
(i) the intruder knows a cipher-text encrypted with key K and the handler N
of the key inv(K), and (ii) N has the attribute decryption set to true (i.e., the
key associated with that handler can be used for decryption), then the decrypt
method can be applied and the intruder can retrieve the plain-text R. (Notice
that knowing the handler does not imply knowing the key associated with the handler.)
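The following Python sketch (our own illustration, not SATMC's internal representation) mimics one application of this rule on the intruder knowledge; the terms, key names, and attribute encoding are invented:

# Terms as nested tuples: ('crypt', k, r) is r encrypted with k,
# ('hand', n, k) is a handle n for key k, ('inv', k) is the inverse of k.
def decrypt_key_asym(ik, token_attrs):
    """One application of the C_Decrypt-style rule: for every known ciphertext
    crypt(K, R) and known handle hand(N, inv(K)) whose handle N has the
    attribute decrypt set to true, add the plaintext R to the knowledge."""
    new = set()
    for t in ik:
        if t[0] != 'crypt':
            continue
        _, k, r = t
        for h in ik:
            if (h[0] == 'hand' and h[2] == ('inv', k)
                    and ('decrypt', True, h[1]) in token_attrs):
                new.add(r)
    return ik | new

# Invented scenario: the intruder knows a ciphertext and a decryption handle.
ik0 = {('crypt', 'k1', 'secret'), ('hand', 'n1', ('inv', 'k1'))}
attrs = {('decrypt', True, 'n1')}
print(decrypt_key_asym(ik0, attrs))   # 'secret' is now known to the intruder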
As a further instance, a transport layer protocol such as SSL/TLS can be
abstractly characterized by including suitable formulae in CI′. To illustrate con-
sider the first four rows of Table 2, where, here and in the sequel, ∀(ψ) stands for
the universal closure of the formula ψ. The formula in the 3rd row formalizes the
property of a weakly confidential channel, i.e., a channel whose output is exclu-
sively accessible to a single, yet unknown, receiver. In our model this amounts

Table 2. LTL constraints

Property                   LTL Formula
confidential_to(c, p)      G ∀(rcvd(A, B, M, c) ⇒ A = p)
authentic_on(c, p)         G ∀(sent(RS, A, B, M, c) ⇒ (A = p ∧ RS = p))
weakly_confidential(c)     G ∀((rcvd(A, B, M, c) ∧ F rcvd(A′, B′, M′, c)) ⇒ A = A′)
resilient(c)               G ∀(sent(RS, A, B, M, c) ⇒ F rcvd(B, A, M, c))
progress(a, r, j)          G ∀(stater(j, a, ES) ⇒ F ¬stater(j, a, ES))
availability(a, c)         G ∀(rcvd(a, P, M, c) ⇒ F ¬rcvd(a, P, M, c))

to requiring that, for every state S, if a fact rcvd(a, b, m, c) ∈ S, then in all


the successor states the rcvd facts with channel c must have a as recipient (see
corresponding LTL formula in the 3rd row). More details can be found in [6,4].
Security Goals. For security protocols, besides the usual secrecy and authen-
tication goals, our property specification language L allows for the specification
of sophisticated properties involving temporal operators and first-order quanti-
fiers. For instance, a fair exchange goal for the ASW protocol discussed in [4]
is:

G ∀nO .∀nR .(hasvc(r, txt, nO , nR ) ⇒ F ∃nR .hasvc(o, txt, nO , nR ))

stating that if an agent r has a valid contract, then we ask o to possess a valid
contract relative to the same contractual text txt and secret commitment nO .
Finally, the separation of duty property SoD3 exemplified in Figure 3 can be
expressed as the following LTL formula:

G ∀(executed(A, approve travel) ⇒ G ¬executed(A, approve budget))

This goal states that if an agent A has executed the task approve travel then
he should not execute the task approve budget.
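As an operational illustration (our own sketch, not how SATMC evaluates goals), the SoD3 formula can be checked on a finite trace of execution events in Python; the trace below is invented:

def violates_sod(trace, task1, task2):
    """Return True iff some agent executes task1 and, at the same or a later
    step, also executes task2, i.e., the trace violates
    G forall A. (executed(A, task1) => G not executed(A, task2))."""
    seen_task1 = set()                 # agents that executed task1 so far
    for step in trace:                 # each step: set of (agent, task) events
        seen_task1 |= {a for (a, t) in step if t == task1}
        if any(a in seen_task1 for (a, t) in step if t == task2):
            return True
    return False

# Invented trace: alice approves the travel and later also the budget.
trace = [{('alice', 'approve_travel')},
         {('bob', 'notify_requestor')},
         {('alice', 'approve_budget')}]
print(violates_sod(trace, 'approve_travel', 'approve_budget'))   # True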

5 SAT-Based Model Checking of Security-Critical


Systems

A high-level overview of the architecture of SATMC 3.0 is depicted in Fig-


ure 5. SATMC takes as input the specification of the system MS, (optionally) a specification of the custom intruder behavior MI′, a security goal G ∈ L, and checks whether MS ∥ MI |= G via a reduction to a SAT problem. Since MI = (MDY ∥ MI′), this boils down to checking whether MS ∥ MDY ∥ MI′ |= G. The DY Attacker module computes and carries out a number of optimizing transformations on MS ∥ MDY. These transformations specialize the (otherwise very prolific) rules and clauses of MDY and produce a model MS-DY which is easier to analyze than (yet equivalent to) MS ∥ MDY [9]. The module finally computes and yields the model MS-DY ∥ MI′ which is equivalent to MS ∥ MI with

Fig. 5. SATMC Internals (the inputs G, MI′, MS, kmax and the output format are fed to SATMC; the DY Attacker module combines MS with MDY; the Model Encoding module produces [[M′]]k, while the Planning Graph Generator, Goal Grounding, and PLTL2SAT (NuSMV) modules produce [[C ⇒ G]]k; the resulting formula Φk is passed to the SAT solver MiniSAT)

MI = (MDY ∥ MI′). Thus the problem of checking whether MS ∥ MDY ∥ MI′ |= G is reduced to checking whether MS-DY ∥ MI′ |= G.
Let M = (MS-DY ∥ MI′) = ⟨I, R, H, C⟩. SATMC now builds a propositional formula Φk such that every truth-value assignment satisfying Φk corresponds to a counterexample of M |= G of length k and vice versa, with k ≤ kmax (where the upper bound kmax is an additional input to SATMC). The formula Φk is given by the conjunction of the propositional formulae (i) [[M′]]k, encoding the unfolding (up to k times) of the transition relation associated with M′ = ⟨I, R, H, ∅⟩, and (ii) [[C ⇒ G]]k, where C is the conjunction of the formulae in C, encoding the set of attack traces of length k satisfying the constraints C. The module Model Encoding takes as input M′ and generates [[M′]]k, while [[C ⇒ G]]k is generated by the cascade of the Goal Grounding and PLTL2SAT modules.
The PLTL2SAT module leverages existing Bounded Model Checking (BMC)
techniques [14]. However, the techniques available in the literature assume that
(i) a propositional encoding of the transition relation of M is available and that
(ii) the goal formula belongs to propositional LTL. Both assumptions are violated
by the model checking problem we consider, since (i) the models we consider are
defined over a set of states which is not bounded a priori and (ii) first-order LTL
formulae are allowed. In [9] we showed that the first problem can be tackled by
computing a planning graph of the problem of depth k. A planning graph [17]
(module Planning Graph Generator) is a succinct representation of an over-
approximation of the states reachable in k steps. The planning graph is also
key to reducing the first-order LTL formula (C ⇒ G) to an equivalent (in a sense
that will be defined later) propositional LTL formula (via the Goal Grounding
module) which is then reduced to SAT using techniques developed for bounded
model checking of reactive systems [14] (via the PLTL2SAT module).
A SAT solver is finally used to check the satisfiability of the formula Φk
(module SAT Solving). Although not shown in Figure 5, SATMC carries out an
iterative deepening strategy on k. Initially k is set to 0, and then it is incremented
until an attack is found (if any) or kmax is reached. In the latter case, no attack
traces of length up to kmax exist. Notice that it is possible to set kmax to ∞, but
then the procedure may not terminate (i.e., it is a semi-decision procedure). It
is worth noticing that even though the planning graph may represent spurious
execution paths, the encoding in SAT is precise and thus false positives are not
returned by SATMC.
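A schematic Python sketch of this iterative deepening loop (our own abstraction; encode_model, ground_and_encode_goal, and sat_solve are hypothetical helpers standing in for the Model Encoding, Goal Grounding/PLTL2SAT, and MiniSAT components):

def satmc_bmc_loop(model, goal, kmax,
                   encode_model, ground_and_encode_goal, sat_solve):
    """Iterative deepening on the bound k (a sketch of the strategy described
    above): at each k, Phi_k is the conjunction of [[M']]_k and [[C => G]]_k;
    a satisfying assignment corresponds to an attack trace of length k."""
    k = 0
    while k <= kmax:
        phi_k = encode_model(model, k) + ground_and_encode_goal(model, goal, k)
        assignment = sat_solve(phi_k)      # None means unsatisfiable
        if assignment is not None:
            return k, assignment           # attack found at depth k
        k += 1
    return None                            # no attack of length up to kmax

# e.g., with dummy helpers that never find an attack:
print(satmc_bmc_loop(None, None, 3,
                     lambda m, k: [], lambda m, g, k: [], lambda phi: None))  # None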
The Planning Graph. A planning graph is a sequence of layers Γi for i = 0, . . . , k, where each layer Γi is a set of facts concisely representing the set of states ⟨Γi⟩ = {S : S ⊆ Γi}. The construction of a planning graph for M′ goes beyond the scope of this paper and the interested reader is referred to [9] for more details. For the purpose of this paper it suffices to know that (i) Γ0 is set to the initial state of M′, (ii) if S is reachable from the initial state of M′ in i steps, then S ∈ ⟨Γi⟩ (or equivalently S ⊆ Γi) for i = 0, . . . , k, and (iii) Γi ⊆ Γi+1 for i = 0, . . . , k − 1, i.e., the layers in the planning graph grow monotonically.
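A simplified Python sketch of how such monotonically growing layers can be computed (our own illustration; it ignores Horn-clause closure and the rule specialization described in [9], and the rules below are invented):

def planning_graph(init_facts, rules, k):
    """Layers Gamma_0..Gamma_k of a (simplified) planning graph: Gamma_0 is the
    initial state; each next layer adds the right-hand side facts of every
    ground rule whose left-hand side is already contained in the layer.
    Facts are never removed, so layers grow monotonically and each layer
    over-approximates the facts of the states reachable in that many steps."""
    layers = [frozenset(init_facts)]
    for _ in range(k):
        nxt = set(layers[-1])
        for lhs, rhs in rules:
            if set(lhs) <= layers[-1]:
                nxt |= set(rhs)
        layers.append(frozenset(nxt))
    return layers

# Invented toy example with two rules p -> q and q -> r.
rules = [([('p',)], [('q',)]), ([('q',)], [('r',)])]
print(planning_graph([('p',)], rules, 3))
# layers: {p}, {p, q}, {p, q, r}, {p, q, r}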
Encoding the Model. The first step is to add a time-index to the rules and
facts to indicate the state at which the rules apply or the facts hold. Facts and
rules are thus indexed by 0 through k. If p is a fact, a rule, or a Horn clause and
i is an index, then p^i is the corresponding time-indexed propositional variable. If p = p1, . . . , pn is a tuple of facts, rules, or Horn clauses and i is an index, then p^i = p1^i, . . . , pn^i is the corresponding time-indexed tuple of propositional variables. The propositional formula [[M′]]0 is I(f^0, hc^0), while [[M′]]k, for k > 0, is of the form:

    I(f^0, hc^0) ∧ ⋀_{i=0}^{k−1} Ti(f^i, ρ^i, hc^i, f^{i+1}, hc^{i+1})        (2)

where f, ρ, and hc are tuples of facts, rules, and Horn clauses, respectively. The formula I(f^0, hc^0) encodes the initial state whereas the formula Ti(f^i, ρ^i, hc^i, f^{i+1}, hc^{i+1}) encodes all the possible evolutions of the system from step i to step
i + 1. The encoding of the system follows the approach proposed in [9], adapted
from an encoding technique originally introduced for AI planning, extended to
support Horn clauses.
Grounding First-order LTL Formulae. Planning graphs are also key to
turning any first-order LTL formula ψ into a propositional LTL formula ψ0 such that if π is an execution path of M′ with k or fewer states that violates ψ0, then π also violates ψ, and vice versa. This allows us to reduce the BMC problem
for any first-order LTL formula ψ to the BMC for a propositional LTL formula
ψ0 (module Goal Grounding) which can in turn be reduced to SAT by us-
ing the techniques available in the literature. The functionalities of the module
PLTL2SAT are currently given by the NuSMV model checker, used as a plugin
by SATMC. From the key properties of the planning graph described in Sec-
tion 5 it is easy to see that if a fact does not occur in Γk , then it is false in all
states reachable from the initial state in k steps and this leads to the following
fact.

Fact 1. Let ψ ∈ L, let Γi for i = 0, . . . , k be a planning graph for M′, and let p be a fact such that p ∉ Γk. Then, if π is an execution path of M′ with k or fewer states that violates ψ, then π also violates ψ[⊥/p] (and vice versa), where ψ[⊥/p] is the formula obtained from ψ by replacing all occurrences of p with ⊥.
To illustrate consider the problem of generating a propositional version of:
∃A. F(¬ O s(A, b) ∧ r(b, A)) (3)
when Γk = {s(a1, b), r(b, a1), r(b, a2)}. If the variable A ranges over the (finite)
set of constants DA = {a1, . . . , an}, then we can replace the existential quantifier
with a disjunction of instances of the formula in the scope of the quantifier, where
each instance is obtained by replacing the quantified variable, namely A, with
the constants in DA : F(¬ O s(a1, b) ∧ r(b, a1)) ∨ . . . ∨ F(¬ O s(an, b) ∧ r(b, an))
By repeatedly using Fact 1, this formula can be rewritten into: F(¬ O s(a1, b) ∧
r(b, a1)) ∨ . . . ∨ F(¬ O ⊥ ∧ ⊥) and finally be simplified to
F(¬ O s(a1, b) ∧ r(b, a1)) ∨ F(r(b, a2)) (4)
Even if the resulting formula is compact thanks to the simplification induced by
the planning graph, the instantiation step, namely the replacement of the exis-
tential quantifier with a disjunction of instances, can be very expensive or even
unfeasible (if the domain of the existentially quantified variable is not bounded).
A better approach is to let the instantiation activity be driven by the informa-
tion available in the planning graph. This can be done by generating instances
of the formula by recursively traversing the formula itself in a top down fashion.
As soon as an atomic formula is met, it is matched against the facts in Γk and
if a matching fact is found then the formula is replaced with the ground coun-
terpart found in Γk and the corresponding matching substitution is carried over
as a constraint. The approach is iterated on backtracking in order to generate
all possible instances and when no (other) matching fact in Γk for the atomic
formula at hand is found, then we replace it with ⊥.
To illustrate this, let us apply the approach to (3). By traversing the formula
we find the atomic formula s(A, b) which matches with s(a1, b) in Γk with match-
ing substitution {A = a1}. The atomic formula s(A, b) is instantiated to s(a1, b)
and the constraint A = a1 is carried over. We are then left with the problem of
finding a matching fact in Γk for the formula r(b, A) with A = a1, i.e., for r(b, a1).
The matching fact is easily found in Γk and we are therefore left with the formula
(i1) F(¬ O s(a1, b) ∧ r(b, a1)) as our first instance. On backtracking we find that
there is no other matching fact for r(b, a1) which is then replaced by ⊥, thereby
leading to the instance (i2) F(¬ O s(a1, b) ∧ ⊥). By further backtracking we find
that there is no other matching fact for s(A, b) in Γk and therefore we gener-
ate another instance by replacing s(A, b) with ⊥ and carry over the constraint
A = a1. The only fact in Γk matching r(b, A) while satisfying the constraint
A = a1 is r(b, a2). We are then left with the formula (i3) F(¬ O ⊥ ∧ r(b, a2)) as
our third instance. No further matching fact for r(b, a2) exists. The procedure
therefore turns the first-order LTL formula (3) into the disjunction of (i1), (i2),
and (i3), which can be readily simplified to (4).
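The following Python sketch (our own reading of the procedure just described, not SATMC's implementation) reproduces this planning-graph-driven instantiation on example (3); formulas are nested tuples, BOT stands for ⊥, and the dis-equality constraints are approximated by recording, for each ⊥ branch, the variable bindings it rules out:

BOT = ('bot',)                       # stands for the propositional constant ⊥

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()     # variables start uppercase

def match_atom(atom, fact, subst):
    """Extend subst so that the atom equals the ground fact, or return None."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    s = dict(subst)
    for a, f in zip(atom[1:], fact[1:]):
        if is_var(a):
            if s.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return s

def blocked(subst, excluded):
    """True if subst agrees with a binding ruled out by an earlier ⊥ branch."""
    return any(all(subst.get(k) == v for k, v in e.items()) for e in excluded)

def ground(phi, gamma, subst, excluded):
    """Yield (instance, substitution, exclusions) triples for phi, matching its
    atoms against the planning-graph layer gamma; only the connectives needed
    for example (3) are handled."""
    op = phi[0]
    if op == 'atom':
        exts = [s for f in sorted(gamma)
                for s in [match_atom(phi[1], f, subst)] if s is not None]
        for s in exts:
            if not blocked(s, excluded):
                yield ('atom', tuple(s.get(a, a) for a in phi[1])), s, excluded
        yield BOT, subst, excluded + exts    # ⊥ branch: rule out the matches above
    elif op == 'exists':                     # the quantifier is dropped; matching
        yield from ground(phi[2], gamma, subst, excluded)   # binds the variable
    elif op in ('F', 'O', 'not'):
        for g, s, e in ground(phi[1], gamma, subst, excluded):
            yield (op, g), s, e
    elif op == 'and':
        for g1, s1, e1 in ground(phi[1], gamma, subst, excluded):
            for g2, s2, e2 in ground(phi[2], gamma, s1, e1):
                yield ('and', g1, g2), s2, e2

gamma_k = {('s', 'a1', 'b'), ('r', 'b', 'a1'), ('r', 'b', 'a2')}
phi3 = ('exists', 'A',
        ('F', ('and', ('not', ('O', ('atom', ('s', 'A', 'b')))),
                      ('atom', ('r', 'b', 'A')))))
for instance, _, _ in ground(phi3, gamma_k, {}, []):
    print(instance)   # (i1), (i2), (i3) plus a trivially false instance;
                      # their disjunction simplifies to formula (4)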

6 Conclusions

We presented SATMC 3.0, a SAT-based Model Checker for security-critical sys-


tems. SATMC successfully combines techniques from AI planning and for the
analysis of reactive systems to reduce the problem of determining the existence
of an attack of bounded length violating a given security goal to SAT. SATMC
supports the specification of security policies as Horn clauses and of security as-
sumptions and goals as first-order LTL formulae. Its flexibility and effectiveness
are demonstrated by its successful usage within three industrially relevant application domains (security protocols, business processes, and security APIs) and
its integration within a number of research prototypes and industrial-strength
tools. SATMC 3.0 can be downloaded at http://www.ai-lab.it/satmc.
Acknowledgments. We are grateful to Luca Zanetti for his contribution in
the design and implementation of the Goal Grounding and PLTL2SAT mod-
ules. This work has partially been supported by the FP7-ICT Project SPaCIoS
(no. 257876), by the PRIN project “Security Horizons” (no. 2010XSEMLC)
funded by MIUR, and by the Activity “STIATE” (no. 14231) funded by the
EIT ICT-Labs.

References

1. Armando, A., et al.: The AVANTSSAR Platform for the Automated Validation of
Trust and Security of Service-Oriented Architectures. In: Flanagan, C., König, B.
(eds.) TACAS 2012. LNCS, vol. 7214, pp. 267–282. Springer, Heidelberg (2012)
2. Armando, A., et al.: The AVISPA Tool for the Automated Validation of Internet
Security Protocols and Applications. In: Etessami, K., Rajamani, S.K. (eds.) CAV
2005. LNCS, vol. 3576, pp. 281–285. Springer, Heidelberg (2005)
3. Armando, A., Carbone, R., Compagna, L.: LTL Model Checking for Security Proto-
cols. In: 20th IEEE Computer Security Foundations Symposium (CSF), pp. 385–396.
IEEE Computer Society (2007)
4. Armando, A., Carbone, R., Compagna, L.: LTL Model Checking for Security Pro-
tocols. In: JANCL, pp. 403–429. Hermes Lavoisier (2009)
5. Armando, A., Carbone, R., Compagna, L., Cuéllar, J., Pellegrino, G., Sorniotti,
A.: An Authentication Flaw in Browser-based Single Sign-On Protocols: Impact
and Remediations. Computers & Security 33, 41–58 (2013)
6. Armando, A., Carbone, R., Compagna, L., Cuéllar, J., Tobarra, L.: Formal Analysis
of SAML 2.0 Web Browser Single Sign-On: Breaking the SAML-based Single Sign-
On for Google Apps. In: Shmatikov, V. (ed.) Proc. ACM Workshop on Formal
Methods in Security Engineering, pp. 1–10. ACM Press (2008)
7. Armando, A., Carbone, R., Zanetti, L.: Formal Modeling and Automatic Security
Analysis of Two-Factor and Two-Channel Authentication Protocols. In: Lopez, J.,
Huang, X., Sandhu, R. (eds.) NSS 2013. LNCS, vol. 7873, pp. 728–734. Springer,
Heidelberg (2013)
8. Armando, A., Compagna, L.: SATMC: A SAT-Based Model Checker for Security
Protocols. In: Alferes, J.J., Leite, J. (eds.) JELIA 2004. LNCS (LNAI), vol. 3229,
pp. 730–733. Springer, Heidelberg (2004)

9. Armando, A., Compagna, L.: SAT-based Model-Checking for Security Protocols


Analysis. International Journal of Information Security 7(1), 3–32 (2008)
10. Armando, A., Ponta, S.E.: Model Checking of Security-Sensitive Business Pro-
cesses. In: Degano, P., Guttman, J.D. (eds.) FAST 2009. LNCS, vol. 5983, pp.
66–80. Springer, Heidelberg (2010)
11. Arsac, W., Compagna, L., Pellegrino, G., Ponta, S.E.: Security Validation of Busi-
ness Processes via Model-Checking. In: Erlingsson, Ú., Wieringa, R., Zannone, N.
(eds.) ESSoS 2011. LNCS, vol. 6542, pp. 29–42. Springer, Heidelberg (2011)
12. AVANTSSAR. Deliverable 2.1: Requirements for modelling and ASLan v.1 (2008),
http://www.avantssar.eu
13. Basin, D., Mödersheim, S., Viganò, L.: OFMC: A Symbolic Model-Checker for
Security Protocols. International Journal of Information Security (2004)
14. Biere, A.: Bounded Model Checking. In: Biere, A., Heule, M., van Maaren, H.,
Walsh, T. (eds.) Handbook of Satisfiability. Frontiers in Artificial Intelligence and
Applications, vol. 185, pp. 457–481. IOS Press (2009)
15. Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic Model Checking without
BDDs. In: Cleaveland, W.R. (ed.) TACAS 1999. LNCS, vol. 1579, pp. 193–207.
Springer, Heidelberg (1999)
16. Blanchet, B.: An Efficient Cryptographic Protocol Verifier Based on Prolog Rules.
In: Computer Security Foundations Workshop (CSFW), pp. 82–96 (2001)
17. Blum, A., Furst, M.: Fast Planning Through Planning Graph Analysis. In: Proc.
International Joint Conference on Artificial Intelligence, IJCAI 1995 (1995)
18. Bortolozzo, M., Centenaro, M., Focardi, R., Steel, G.: Attacking and Fixing
PKCS#11 Security Tokens. In: Proc. ACM Conference on Computer and Com-
munications Security (CCS 2010), Chicago, USA, pp. 260–269. ACM Press (2010)
19. Cimatti, A., Clarke, E., Giunchiglia, E., Giunchiglia, F., Pistore, M., Roveri, M.,
Sebastiani, R., Tacchella, A.: NuSMV 2: An OpenSource Tool for Symbolic Model
Checking. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404,
pp. 359–364. Springer, Heidelberg (2002)
20. Compagna, L., Guilleminot, P., Brucker, A.D.: Business Process Compliance via
Security Validation as a Service. In: ICST 2013, pp. 455–462 (2013)
21. OASIS Consortium. SAML V2.0 Technical Overview (March 2008),
http://wiki.oasis-open.org/security/Saml2TechOverview
22. Eén, N., Sörensson, N.: An Extensible SAT-solver. In: Giunchiglia, E., Tacchella,
A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)
23. Focardi, R., Luccio, F.L., Steel, G.: An Introduction to Security API Analysis. In:
Aldini, A., Gorrieri, R. (eds.) FOSAD 2011. LNCS, vol. 6858, pp. 35–65. Springer,
Heidelberg (2011)
24. Holzmann, G.: The Spin model checker: primer and reference manual, 1st edn.
Addison-Wesley Professional (2003)
25. RSA Security Inc.: PKCS#11: Cryptographic Token Interface Standard v2.20 (2004)
26. Kautz, H., McAllester, H., Selman, B.: Encoding Plans in Propositional Logic.
In: Aiello, L.C., Doyle, J., Shapiro, S. (eds.) KR 1996: Principles of Knowledge
Representation and Reasoning, pp. 374–384. Morgan Kaufmann (1996)
27. Turuani, M.: The CL-Atse Protocol Analyser. In: Pfenning, F. (ed.) RTA 2006.
LNCS, vol. 4098, pp. 277–286. Springer, Heidelberg (2006)
28. Viganò, L.: The SPaCIoS Project: Secure Provision and Consumption in the In-
ternet of Services. In: ICST 2013, pp. 497–498 (2013)
IC3 Modulo Theories
via Implicit Predicate Abstraction

Alessandro Cimatti, Alberto Griggio, Sergio Mover, and Stefano Tonetta

Fondazione Bruno Kessler, Trento, Italy


{cimatti,griggio,mover,tonettas}@fbk.eu

Abstract. We present a novel approach for generalizing the IC3 algorithm for
invariant checking from finite-state to infinite-state transition systems, expressed
over some background theories. The procedure is based on a tight integration of
IC3 with Implicit (predicate) Abstraction, a technique that expresses abstract tran-
sitions without computing explicitly the abstract system and is incremental with
respect to the addition of predicates. In this scenario, IC3 operates only at the
Boolean level of the abstract state space, discovering inductive clauses over the
abstraction predicates. Theory reasoning is confined within the underlying SMT
solver, and applied transparently when performing satisfiability checks. When the
current abstraction allows for a spurious counterexample, it is refined by discov-
ering and adding a sufficient set of new predicates. Importantly, this can be done
in a completely incremental manner, without discarding the clauses found in the
previous search.
The proposed approach has two key advantages. First, unlike current SMT
generalizations of IC3, it allows one to handle a wide range of background theories
without relying on ad-hoc extensions, such as quantifier elimination or theory-
specific clause generalization procedures, which might not always be available,
and can moreover be inefficient. Second, compared to a direct exploration of the
concrete transition system, the use of abstraction gives a significant performance
improvement, as our experiments demonstrate.

1 Introduction

IC3 [5] is an algorithm for the verification of invariant properties of transition systems.
It builds an over-approximation of the reachable state space, using clauses obtained by
generalization while disproving candidate counterexamples. In the case of finite-state
systems, the algorithm is implemented on top of Boolean SAT solvers, fully leveraging
their features. IC3 has proved to be extremely effective, and it is a fundamental core of all the engines in hardware verification.
There have been several attempts to lift IC3 to the case of infinite-state systems,
for its potential applications to software, RTL models, timed and hybrid systems [9],
although the problem is in general undecidable. These approaches are set in the frame-
work of Satisfiability Modulo Theory (SMT) [1] and hereafter are referred to as IC3
Modulo Theories [7,18,16,25]: the infinite-state transition system is symbolically de-
scribed by means of SMT formulas, and an SMT solver plays the same role of the
SAT solver in the discrete case. The key difference is the need in IC3 Modulo Theories


for specific theory reasoning to deal with candidate counterexamples. This led to the
development of various techniques, based on quantifier elimination or theory-specific
clause generalization procedures. Unfortunately, such extensions are typically ad-hoc,
and might not always be applicable in all theories of interest. Furthermore, being based
on the fully detailed SMT representation of the transition systems, some of these solu-
tions (e.g. based on quantifier elimination) can be highly inefficient.
We present a novel approach to IC3 Modulo Theories, which is able to deal with
infinite-state systems by means of a tight integration with predicate abstraction (PA)
[12], a standard abstraction technique that partitions the state space according to the
equivalence relation induced by a set of predicates. In this work, we leverage Implicit
Abstraction (IA) [23], which allows to express abstract transitions without computing
explicitly the abstract system, and is fully incremental with respect to the addition of
new predicates. In the resulting algorithm, called IC3+IA, the search proceeds as if
carried out in an abstract system induced by the set of current predicates P – in fact,
IC3+IA only generates clauses over P. The key insight is to exploit IA to obtain an
abstract version of the relative induction check. When an abstract counterexample is
found, as in Counter-Example Guided Abstraction-Refinement (CEGAR), it is simu-
lated in the concrete space and, if spurious, the current abstraction is refined by adding
a set of predicates sufficient to rule it out.
The proposed approach has several advantages. First, unlike current SMT general-
izations of IC3, IC3+IA allows one to handle a wide range of background theories without
relying on ad-hoc extensions, such as quantifier elimination or theory-specific clause
generalization procedures. The only requirement is the availability of an effective tech-
nique for abstraction refinement, for which various solutions exist for many important
theories (e.g. interpolation [15], unsat core extraction, or weakest precondition). Sec-
ond, the analysis of the infinite-state transition system is now carried out in the abstract
space, which is often as effective as an exact analysis, but also much faster. Finally, the
approach is completely incremental, without having to discard or reconstruct clauses
found in the previous iterations.
We experimentally evaluated IC3+IA on a set of benchmarks from heterogeneous
sources [2,14,18], with very positive results. First, our implementation of IC3+IA is
significantly more expressive than the SMT-based IC3 of [7], being able to handle
not only the theory of Linear Rational Arithmetic (LRA) like [7], but also those of
Linear Integer Arithmetic (LIA) and fixed-size bit-vectors (BV). Second, in terms of
performance IC3+IA proved to be uniformly superior to a wide range of alternative
techniques and tools, including state-of-the-art implementations of the bit-level IC3
algorithm ([11,22,3]), other approaches for IC3 Modulo Theories ([7,16,18]), and tech-
niques based on k-induction and invariant discovery ([14,17]). A remarkable property
of IC3+IA is that it can deal with a large number of predicates: in several benchmarks,
hundreds of predicates were introduced during the search. Considering that an explicit
computation of the abstract transition relation (e.g. based on All-SMT [19]) often be-
comes impractical with a few dozen predicates, we conclude that IA is fundamental to
scalability, allowing for efficient reasoning in a fine-grained abstract space.
The rest of the paper is structured as follows. In Section 2 we present some back-
ground on IC3 and Implicit Abstraction. In Section 3 we describe IC3+IA and prove

its formal properties. In Section 4 we discuss the related work. In Section 5 we ex-
perimentally evaluate our method. In Section 6 we draw some conclusions and present
directions for future work.

2 Background
2.1 Transition Systems
Our setting is standard first order logic. We use the standard notions of theory, satisfi-
ability, validity, logical consequence. We denote formulas with ϕ, ψ, I, T, P , variables
with x, y, and sets of variables with X, Y, X′, X̄. Unless otherwise specified, we work
on quantifier-free formulas, and we refer to 0-arity predicates as Boolean variables, and
to 0-arity uninterpreted functions as (theory) variables. A literal is an atom or its nega-
tion. A clause is a disjunction of literals, whereas a cube is a conjunction of literals.
If s is a cube l1 ∧ . . . ∧ ln , with ¬s we denote the clause ¬l1 ∨ . . . ∨ ¬ln , and vice
versa. A formula is in conjunctive normal form (CNF) if it is a conjunction of clauses,
and in disjunctive normal form (DNF) if it is a disjunction of cubes. With a little abuse
of notation, we might sometimes denote formulas in CNF C1 ∧ . . . ∧ Cn as sets of
clauses {C1, . . . , Cn}, and vice versa. If X1, . . . , Xn are sets of variables and ϕ is a
formula, we might write ϕ(X1, . . . , Xn) to indicate that all the variables occurring in ϕ
are elements of ∪_i Xi. For each variable x, we assume that there exists a corresponding
variable x' (the primed version of x). If X is a set of variables, X' is the set obtained
by replacing each element x with its primed version (X' = {x' | x ∈ X}), X̄ is the set
obtained by replacing each x with x̄ (X̄ = {x̄ | x ∈ X}), and X^n is the set obtained by
adding n primes to each variable (X^n = {x^n | x ∈ X}).
Given a formula ϕ, ϕ' is the formula obtained by adding a prime to each variable
occurring in ϕ. Given a theory T, we write ϕ |=T ψ (or simply ϕ |= ψ) to denote that
the formula ψ is a logical consequence of ϕ in the theory T.
A transition system (TS) S is a tuple S = ⟨X, I, T⟩ where X is a set of (state)
variables, I(X) is a formula representing the initial states, and T(X, X') is a formula
representing the transitions. A state of S is an assignment to the variables X. A path of
S is a finite sequence s0 , s1 , . . . , sk of states such that s0 |= I and for all i, 0 ≤ i < k,
(si , si+1 ) |= T .
Given a formula P (X), the verification problem denoted with S |= P is the problem
to check if for all paths s0 , s1 , . . . , sk of S, for all i, 0 ≤ i ≤ k, si |= P . Its dual is the
reachability problem, which is the problem to find a path s0 , s1 , . . . , sk of S such that
sk |= ¬P . P (X) represents the “good” states, while ¬P represents the “bad” states.
Inductive invariants are central to solving the verification problem. P is an inductive
invariant iff (i) I(X) |= P(X); and (ii) P(X) ∧ T(X, X') |= P(X'). A weaker notion
is given by relative inductive invariants: given a formula φ(X), P is inductive relative
to φ iff (i) I(X) |= P(X); and (ii) φ(X) ∧ P(X) ∧ T(X, X') |= P(X').

2.2 IC3 with SMT


IC3 [5] is an efficient algorithm for the verification of finite-state systems, with Boolean
state variables and propositional logic formulas. IC3 was subsequently extended to the
SMT case in [7,16]. In the following, we present its main ideas, following the descrip-
tion of [7]. For brevity, we have to omit several important details, for which we refer to
[5,7,16].
Let S and P be a transition system and a set of good states as in §2.1. The IC3
algorithm tries to prove that S |= P by finding a formula F(X) such that: (i) I(X) |=
F(X); (ii) F(X) ∧ T(X, X') |= F(X'); and (iii) F(X) |= P(X).
In order to construct an inductive invariant F , IC3 maintains a sequence of formulas
(called trace) F0(X), . . . , Fk(X) such that: (i) F0 = I; (ii) Fi |= Fi+1; (iii) Fi(X) ∧
T(X, X') |= Fi+1(X'); (iv) for all i < k, Fi |= P. Therefore, each element of the
trace Fi+1 , called frame, is inductive relative to the previous one, Fi . IC3 strengthens
the frames by finding new relative inductive clauses by checking the unsatisfiability of
the formula:
RelInd(F, T, c) := F ∧ c ∧ T ∧ ¬c'  (1)
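As an illustration of how the relative induction check (1) is posed to an SMT solver, the short sketch below (ours, not part of the paper: a toy one-variable system written against the z3 Python API, whereas the implementations discussed later use MathSAT or Yices) encodes RelInd(F, T, c) and reports whether c is inductive relative to F.

    from z3 import Ints, Solver, Or, Not, substitute, sat

    x, xp = Ints('x xp')                      # current- and next-state variable (x' written as xp)
    T = Or(xp == x + 1, xp == 0)              # toy transition relation T(x, x')
    F = x >= 0                                # a frame F(x)
    c = x >= 0                                # candidate clause c(x)

    def prime(f):
        # replace the current-state variable by its primed copy
        return substitute(f, (x, xp))

    s = Solver()
    s.add(F, c, T, Not(prime(c)))             # RelInd(F, T, c) = F ∧ c ∧ T ∧ ¬c'
    print('inductive relative to F' if s.check() != sat else 'not relatively inductive')

Unsatisfiability of the query means the clause can be used to strengthen the next frame.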
More specifically, the algorithm proceeds incrementally, by alternating two phases:
a blocking phase, and a propagation phase. In the blocking phase, the trace is analyzed
to prove that no intersection between Fk and ¬P (X) is possible. If such intersection
cannot be disproved on the current trace, the property is violated and a counterexample
can be reconstructed. During the blocking phase, the trace is enriched with additional
formulas, which can be seen as strengthening the approximation of the reachable state
space. At the end of the blocking phase, if no violation is found, Fk |= P .
The propagation phase tries to extend the trace with a new formula Fk+1 , moving
forward the clauses from preceding Fi ’s. If, during this process, two consecutive frames
become identical (i.e. Fi = Fi+1 ), then a fixpoint is reached, and IC3 terminates with
Fi being an inductive invariant proving the property.
In the blocking phase IC3 maintains a set of pairs (s, i), where s is a set of states that
can lead to a bad state, and i > 0 is a position in the current trace. New formulas (in the
form of clauses) to be added to the current trace are derived by (recursively) proving
that a set s of a pair (s, i) is unreachable starting from the formula Fi−1 . This is done
by checking the satisfiability of the formula RelInd(Fi−1 , T, ¬s). If the formula is un-
satisfiable, then ¬s is inductive relative to Fi−1 , and IC3 strengthens Fi by adding ¬s
to it (in practice, ¬s is generalized before being added to Fi; although this generalization is fundamental for the effectiveness of IC3, we do not discuss it here for simplicity), thus blocking the bad state s at i. If, instead, (1) is satisfiable, then the overapproximation Fi−1 is not strong enough to show that s is unreachable. In this case, let p
be a subset of the states in Fi−1 ∧ ¬s such that all the states in p lead to a state in s in
one transition step. Then, IC3 continues by trying to show that p is not reachable in one
step from Fi−2 (that is, it tries to block the pair (p, i − 1)). This procedure continues
recursively, possibly generating other pairs to block at earlier points in the trace, until
either IC3 generates a pair (q, 0), meaning that the system does not satisfy the property,
or the trace is eventually strengthened so that the original pair (s, i) can be blocked.
A key difference between the original Boolean IC3 and its SMT extensions in [7,16]
is in the way sets of states to be blocked or generalized are constructed. In the blocking
phase, when trying to block a pair (s, i), if the formula (1) is satisfiable, then a new
pair (p, i − 1) has to be generated such that p is a cube in the preimage of s wrt. T .
In the propositional case, p can be obtained from the model μ of (1) generated by the SAT solver, by simply dropping the primed variables occurring in μ. This cannot
be done in general in the first-order case, where the relationship between the current
state variables X and their primed version X  is encoded in the theory atoms, which in
general cannot be partitioned into a primed and an unprimed set. The solution proposed
in [7] is to compute p by existentially quantifying (1) and then applying an under-
approximated existential elimination algorithm for linear rational arithmetic formulas.
Similarly, in [16] a theory-aware generalization algorithm for linear rational arithmetic
(based on interpolation) was proposed, in order to strengthen ¬s before adding it to Fi
after having successfully blocked it.
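For illustration (a sketch of ours with hypothetical variable names, not taken from the paper, and z3 standing in for the solver), this is how a predecessor cube is read off a propositional model by keeping only the literals over the unprimed variables:

    from z3 import Bools, Solver, And, Implies, Not, is_true, sat

    a, b, ap, bp = Bools('a b ap bp')          # ap, bp are the primed copies of a, b
    current_vars = [a, b]

    s = Solver()
    s.add(And(a, Not(b), Implies(a, bp)))      # toy stand-in for F[i-1] ∧ ¬s ∧ T ∧ s' in (1)
    assert s.check() == sat
    mu = s.model()

    # predecessor cube p: literals of the unprimed variables only
    p = [v if is_true(mu.eval(v, model_completion=True)) else Not(v) for v in current_vars]
    print(p)                                   # e.g. [a, Not(b)]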

2.3 Implicit Abstraction


Predicate Abstraction. Abstraction [10] is used to reduce the search space while preserving the satisfaction of some properties such as invariants. If Ŝ is an abstraction of S and a condition is reachable in S, then its abstract version is also reachable in Ŝ. Thus, if we prove that a set of states is not reachable in Ŝ, the same can be concluded for the concrete transition system S.
In Predicate Abstraction [12], the abstract state-space is described with a set of pred-
icates. Given a TS S, we select a set P of predicates, such that each predicate p ∈ P
is a formula over the variables X that characterizes relevant facts of the system. For
every p ∈ P, we introduce a new abstract variable x_p and define X_P as {x_p}_{p∈P}. The
abstraction relation H_P is defined as H_P(X, X_P) := ⋀_{p∈P} x_p ↔ p(X). Given a formula φ(X), the abstract version φ̂_P is obtained by existentially quantifying the variables X, i.e., φ̂_P = ∃X.(φ(X) ∧ H_P(X, X_P)). Similarly, for a formula over X and
X', φ̂_P = ∃X, X'.(φ(X, X') ∧ H_P(X, X_P) ∧ H_P(X', X'_P)). The abstract system
Ŝ_P = ⟨X_P, Î_P, T̂_P⟩ is obtained by abstracting the initial and the transition conditions. In
the following, when clear from the context, we write just φ̂ instead of φ̂_P.
Since most model checkers deal only with quantifier-free formulas, the computation of Ŝ_P requires the elimination of the existential quantifiers. This may become a bottleneck, and some techniques therefore compute weaker/more abstract systems (cf., e.g., [21]).
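To make the cost of this step concrete, the toy sketch below (ours, not from the paper; it uses z3's quantifier-elimination tactic and a single assumed predicate x ≥ 0) computes an explicit abstract transition relation by eliminating the quantifiers in the definition above. This is precisely the step that implicit abstraction avoids.

    from z3 import Int, Bool, And, Exists, Tactic

    x, xp = Int('x'), Int('xp')                     # concrete variables X, X'
    xP, xPp = Bool('xP'), Bool('xPp')               # abstract variables for predicate x >= 0

    T  = xp == x - 1                                # concrete transition relation (toy)
    H  = xP == (x >= 0)                             # H_P(X, X_P)
    Hp = xPp == (xp >= 0)                           # H_P(X', X'_P)

    abs_T = Exists([x, xp], And(T, H, Hp))          # quantified definition of the abstract transition
    print(Tactic('qe')(abs_T))                      # quantifier elimination; can blow up in general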

Implicit Predicate Abstraction. Implicit predicate abstraction [23] embeds the definition of the predicate abstraction in the encoding of the path. This is based on the following formula:

EQ_P(X, X̄) := ⋀_{p∈P} p(X) ↔ p(X̄)  (2)

which relates two concrete states corresponding to the same abstract state. The formula
P̂ath_{k,P} := ⋀_{1≤h<k} (T(X̄^{h−1}, X^h) ∧ EQ_P(X^h, X̄^h)) ∧ T(X̄^{k−1}, X^k) is satisfiable iff
there exists a path of k steps in the abstract state space. Intuitively, instead of having a
contiguous sequence of transitions, the encoding represents a sequence of disconnected
transitions where every gap between two transitions is forced to lie in the same abstract
state (see Fig. 1). BMC^P_k encodes the abstract bounded model checking problem and
is obtained from P̂ath_{k,P} by adding the abstract initial and target conditions: BMC^P_k =
I(X^0) ∧ EQ_P(X^0, X̄^0) ∧ P̂ath_{k,P} ∧ EQ_P(X^k, X̄^k) ∧ ¬P(X̄^k).

Fig. 1. Abstract path (a chain of concrete transitions T whose gaps are closed by EQ constraints)
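By contrast with the explicit computation shown earlier, the abstract BMC formula can be built without any quantifier elimination. The sketch below (ours, on a hypothetical one-variable system, with z3 standing in for the SMT backends used later in the paper) constructs EQ_P and BMC^P_k directly:

    from z3 import Int, And, Not, Solver

    PREDS = [lambda s: s['x'] >= 0]          # predicate set P (assumed, toy)

    def state(name, h):                      # one copy of the state variables
        return {'x': Int(f'{name}{h}_x')}

    def EQ(s, t):                            # EQ_P(s, t): same abstract state
        return And([p(s) == p(t) for p in PREDS])

    def T(s, t):  return t['x'] == s['x'] - 1        # toy transition relation
    def I(s):     return s['x'] == 5                 # toy initial condition
    def P(s):     return s['x'] >= 0                 # toy property

    def abstract_bmc(k):
        # BMC^P_k: k disconnected concrete transitions glued at the predicate level by EQ_P
        X  = [state('x',  h) for h in range(k + 1)]
        Xb = [state('xb', h) for h in range(k + 1)]
        steps = [And(T(Xb[h - 1], X[h]), EQ(X[h], Xb[h])) for h in range(1, k + 1)]
        return And(I(X[0]), EQ(X[0], Xb[0]), *steps, Not(P(Xb[k])))

    s = Solver(); s.add(abstract_bmc(6)); print(s.check())   # sat: an abstract counterexample exists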

3 IC3 with Implicit Abstraction

3.1 Main Idea

The main idea of IC3+IA is to mimic how IC3 would work on the abstract state space
defined by a set of predicates P, but using IA to avoid quantifier elimination to compute
the abstract transition relation. Therefore, clauses, frames and cubes are restricted to
have predicates in P as atoms. We call these clauses, frames and cubes respectively
P-clauses, P-formulas, and P-cubes. Note that for any P-formula φ (and thus also for

P-cubes and P-clauses), φ̂ = φ[X_P/P] ∧ ∃X.(⋀_{p∈P} x_p ↔ p(X)).
The key point of IC3+IA is to use an abstract version of the check (1) to prove that
an abstract clause ĉ is inductive relative to the abstract frame F̂:

AbsRelInd(F, T, c, P) := F(X) ∧ c(X) ∧ EQ_P(X, X̄) ∧ T(X̄, X̄') ∧ EQ_P(X̄', X') ∧ ¬c(X')  (3)
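To illustrate how (3) is discharged as an ordinary quantifier-free query, the following toy sketch (ours; a single integer state variable, one assumed predicate, and z3 in place of the MathSAT backend) glues the concrete transition over the bar copy of the variables to the predicate level with EQ_P on both sides:

    from z3 import Int, And, Not, Solver, sat

    x, xp   = Int('x'),  Int('xp')        # current/next state variables X, X'
    xb, xbp = Int('xb'), Int('xbp')       # their "bar" copies X̄, X̄'
    preds = [lambda v: v >= 0]            # predicate set P (assumed)

    def EQ(u, v):                         # EQ_P(u, v)
        return And([p(u) == p(v) for p in preds])

    F, c = x >= 0, x >= 0                 # P-formula frame and P-clause over X
    cp   = xp >= 0                        # c(X')
    T    = xbp == xb - 1                  # concrete transition over X̄, X̄'

    s = Solver()
    s.add(F, c, EQ(x, xb), T, EQ(xbp, xp), Not(cp))   # formula (3)
    print('abstractly inductive' if s.check() != sat else 'not inductive')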

Theorem 1. Consider a set P of predicates, a P-formula F and a P-clause c.
RelInd(F̂, T̂, ĉ) is satisfiable iff AbsRelInd(F, T, c, P) is satisfiable. In particular,
if s |= AbsRelInd(F, T, c, P), then ŝ |= RelInd(F̂, T̂, ĉ).

Proof. Suppose s |= AbsRelInd(F, T, c, P). Let us denote with t and t̄ the projections
of s over X ∪ X' and over X̄ ∪ X̄', respectively. Then t̄ |= T, and therefore the abstraction
of t̄ satisfies T̂. Since s |= EQ_P(X, X̄) ∧ EQ_P(X̄', X'), t and t̄ correspond to the same
abstract transition, and therefore t̂ |= T̂. Since t |= F ∧ c, then t̂ |= F̂ ∧ ĉ. Since t |= ¬c',
and c is a Boolean combination of P, then t̂ |= ¬ĉ'. Thus, ŝ |= RelInd(F̂, T̂, ĉ).
For the other direction, suppose t̂ |= RelInd(F̂, T̂, ĉ). Then there exists an assignment
t to X ∪ X' such that t |= T and whose abstraction is t̂. Therefore, t |= F(X) ∧ c(X) ∧
EQ_P(X, X) ∧ T(X, X') ∧ EQ_P(X', X') ∧ ¬c(X'), which concludes the proof.

3.2 The Algorithm

The IC3+IA algorithm is shown in Figure 2. IC3+IA has the same structure as IC3, as described in [11]. Additionally, it keeps a set of predicates P, which are used
to compute new clauses. The only points where IC3+IA differs from IC3 (shown in
red in Fig. 2) are in picking P-cubes instead of concrete states, the use of AbsRelInd
instead of RelInd, and in the fact that a spurious counterexample may be found and, in
that case, new predicates must be added.
More specifically, the algorithm consists of a loop, in which each iteration is divided
into the blocking and the propagation phase. The blocking phase starts by picking a cube
c of predicates representing an abstract state in the last frame violating the property.
This is recursively blocked along the trace by checking if AbsRelInd(Fi−1 , T, ¬c, P)
bool IC3+IA (I, T, P, P):
 1. P = P ∪ {p | p is a predicate in I or in P}
 2. trace = [I]                 # first elem of trace is init formula
 3. trace.push()                # add a new frame to the trace
 4. while True:
      # blocking phase
 5.   while there exists a P-cube c s.t. c |= trace.last() ∧ ¬P:
 6.     if not recBlock(c, trace.size() − 1):
          # a pair (s0, 0) is generated
 7.       if the simulation of π = (s0, 0); . . . ; (sk, k) fails:
 8.         P := P ∪ refine(I, T, P, P, π)
 9.       else return False     # counterexample found
      # propagation phase
10.   trace.push()
11.   for i = 1 to trace.size() − 1:
12.     for each clause c ∈ trace[i]:
13.       if AbsRelInd(trace[i], T, c, P) is unsatisfiable:
14.         add c to trace[i+1]
15.     if trace[i] == trace[i+1]: return True    # property proved

# simplified recursive description; in practice based on a priority queue [5,11]
bool recBlock(s, i):
 1. if i == 0: return False     # reached initial states
 2. while AbsRelInd(trace[i-1], T, ¬s, P) is satisfiable:
 3.   extract a P-cube c from the Boolean model of AbsRelInd(trace[i-1], T, ¬s, P)
      # c is an (abstract) predecessor of s
 4.   if not recBlock(c, i − 1): return False
 5. g = generalize(¬s, i)       # standard IC3 generalization [5,11] (using AbsRelInd)
 6. add g to trace[i]
 7. return True

Fig. 2. High-level description of IC3+IA (with changes wrt. the Boolean IC3 in red)

is satisfiable. If the relative induction check succeeds, Fi is strengthened with a generalization of ¬c. If the check fails, the recursive blocking continues with an abstract
predecessor of c, that is, a P-cube in Fi ∧ ¬c that leads to c in one step. This recursive
blocking results in either strengthening of the trace or in the generation of an abstract
counterexample. If the counterexample can be simulated on the concrete transition sys-
tem, then the algorithm terminates with a violation of the property. Otherwise, it refines
the abstraction, adding new predicates to P so that the abstract counterexample is no
more a path of the abstract system. In the propagation phase, P-clauses of a frame Fi
that are inductive relative to Fi using T are propagated to the following frame Fi+1 .
As for IC3, if two consecutive frames are equal, we can conclude that the property is
satisfied by the abstract transition system, and therefore also by the concrete one.

3.3 Simulation and Refinement


During the search the procedure may find a counterexample in the abstract space. As
usual in the CEGAR framework, we simulate the counterexample in the concrete system
to either find a real counterexample or to refine the abstraction, adding new predicates
to P. Technically, IC3+IA finds a set of counterexamples π = (s0 , 0); . . . ; (sk , k) in-
stead of a single counterexample, as described in [7] (i.e. this behaviour depends on the
generalization of a cube performed by ternary simulation or don’t care detection). We
simulate π as usual via bounded model checking. Formally, we encode all the paths of S
up to k steps restricted to π with: I(X^0) ∧ ⋀_{i<k} T(X^i, X^{i+1}) ∧ ¬P(X^k) ∧ ⋀_{i≤k} s_i(X^i).
If the formula is satisfiable, then there exists a concrete counterexample that witnesses
S ⊭ P; otherwise π is spurious and we refine the abstraction by adding new predicates.
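A minimal sketch of this simulation step (ours, on a toy system with hypothetical names; z3 stands in for the SMT backend) unrolls T for k steps, constrains step i to the cube s_i, and checks satisfiability:

    from z3 import Int, And, Not, Solver, sat

    def X(i):    return Int(f'x{i}')              # concrete state at step i
    def I(s):    return s == 5
    def T(s, t): return t == s - 1
    def P(s):    return s >= 0

    def cube(i):                                  # stand-in for the abstract cube s_i
        return X(i) >= 0 if i < 6 else Not(X(i) >= 0)

    k = 6
    sim = And(I(X(0)),
              And([T(X(i), X(i + 1)) for i in range(k)]),
              Not(P(X(k))),
              And([cube(i) for i in range(k + 1)]))

    s = Solver(); s.add(sim)
    print('concrete counterexample' if s.check() == sat else 'spurious -> refine')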
The refine(I, T, P, π) procedure is orthogonal to IC3+IA, and can be carried out with
several techniques, like interpolation, unsat core extraction or weakest precondition, for
which there is a wide literature. The only requirement of the refinement is to remove the
spurious counterexamples π. In our implementation we used interpolation to discover
predicates, similarly to [15].
Also, note that in our approach the set of predicates increases monotonically after a
refinement (i.e. we always add new predicates to the existing set of predicates). Thus,
the transition relation is monotonically strengthened (i.e., since P ⊆ P', T_P' → T_P). This
allows us to keep all the clauses in the IC3+IA frames after a refinement, enabling a
fully incremental approach.

3.4 Correctness
Lemma 1 (Invariants). The following conditions are invariants of IC3+IA:
1. F0 = I;
2. for all i < k, Fi |= Fi+1;
3. for all i < k, Fi(X) ∧ EQ_P(X, X̄) ∧ T(X̄, X̄') ∧ EQ_P(X̄', X') |= Fi+1(X');
4. for all i < k, Fi |= P .
Proof. Condition 1 holds, since initially F0 = I, and F0 is never changed. We prove
that the conditions (2-4) are loop invariants for the main IC3+IA loop (line 4). The
invariant conditions trivially hold when entering the loop.
Then, the invariants are preserved by the inner loop at line 5. The loop may change the
content of a frame Fi+1 adding a new clause c while recursively blocking a cube (p, i +
1). c is added to Fi+1 if the abstract relative inductive check AbsRelInd(Fi , T, c, P)
holds. Clearly, this preserves the conditions 2-3. In the loop the set of predicates P may
change at line 8. Note that the invariant conditions still hold in this case. In particular,
3 holds because if P ⊆ P', then EQ_P' |= EQ_P. When the inner loop ends, we are guaranteed that Fk |= P holds. Thus, condition 4 is preserved when a new frame is added
to the abstraction in line 10. Finally, the propagation phase clearly maintains all the in-
variants (2-4), by the definition of abstract relative induction AbsRelInd(Fi , T, c, P ).
Lemma 2. If IC3+IA(I, T, P, P) returns true, then Ŝ_P |= P̂_P.

Proof. The invariant conditions of the IC3 algorithm hold for the abstract frames: 1)
F̂0 = Î; 2) for all i < k, F̂i |= F̂i+1; 3) F̂i ∧ T̂ |= F̂i+1'; and 4) F̂i |= P̂.
Conditions 1), 2), and 4) follow from Lemma 1, since I, P, and the Fi are P-formulas.
Condition 3) follows from Lemma 1, since T̂ = ∃X̄, X̄'.(EQ_P(X, X̄) ∧ T(X̄, X̄') ∧
EQ_P(X̄', X')) by definition.
By assumption IC3+IA returns true and thus F̂k−1 = F̂k. Since the conditions (1–4)
hold, F̂k−1 is an inductive invariant that proves Ŝ |= P̂.

Theorem 2 (Soundness). Let S = ⟨X, I, T⟩ be a transition system, P a safety property, and P a set of predicates over X. The result of IC3+IA(I, T, P, P) is correct.

Proof. If IC3+IA(I, T, P, P) returns true, then Ŝ_P |= P̂_P by Lemma 2, and thus
S |= P. If IC3+IA(I, T, P, P) returns false, then the simulation of the abstract counterexample in the concrete system succeeded, and thus S ⊭ P.


Lemma 3 (Abstract counterexample). If IC3+IA finds a counterexample π, then π
is a path of Ŝ violating P̂.

Proof. For all i, 0 ≤ i ≤ trace.size, if π[i] = (si, i) then si is a P-cube satisfying
Fi. Moreover, sk |= ¬P. By Lemma 1, F0 = I and therefore s0 |= I. Since s0,
sk, I, and P are P-formulas, ŝ0 |= Î and ŝk |= ¬P̂. Again by Lemma 1, for all i,
Fi(X) ∧ EQ_P(X, X̄) ∧ T(X̄, X̄') ∧ EQ_P(X̄', X') |= Fi+1(X'), and thus si ∧ si+1' |=
∃X̄, X̄'.(EQ_P(X, X̄) ∧ T(X̄, X̄') ∧ EQ_P(X̄', X')). Therefore, ŝi ∧ ŝi+1' |= T̂.

Theorem 3 (Relative completeness). Suppose that for some set P of predicates, Ŝ_P |=
P̂_P. If, at a certain iteration of the main loop, IC3+IA has P as its set of predicates, then
IC3+IA returns true.

Proof. Let us consider the case in which, at a certain iteration of the main loop, P is as
defined in the premises of the theorem. At every following iteration of the loop, IC3+IA
either finds an abstract counterexample π or strengthens a frame Fi with a new P-clause.
The first case is not possible, since, by Lemma 3, π would be a path of Ŝ violating the
property. Therefore, at every iteration, IC3+IA strengthens some frame with a new P-
clause. Since the number of P-clauses is finite and, by Lemma 1, for all i, Fi |= Fi+1 ,
IC3+IA will eventually find that Fi = Fi+1 for some i and return true.

4 Related Work
This work combines two lines of research in verification, abstraction and IC3.
Among the existing abstraction techniques, predicate abstraction [12] has been suc-
cessfully applied to the verification of infinite-state transition systems, such as soft-
ware [20]. Implicit abstraction [23] was first used with k-induction to avoid the explicit
computation of the abstract system. In our work, we exploit implicit abstraction in IC3
to avoid theory-specific generalization techniques, widening the applicability of IC3 to
transition systems expressed over some background theories. Moreover, we provided
the first integration of implicit abstraction in a CEGAR loop.
The IC3 [5] algorithm has been widely applied to the hardware domain [11,6] to
prove safety and also as a backend to prove liveness [4]. In [24], IC3 is combined with
a lazy abstraction technique in the context of hardware verification. The approach has
some similarities with our work, but it is limited to Boolean systems, it uses a “visible
variables” abstraction rather than PA, and applies a modified concrete version of IC3
for refinement.
Several approaches adapted the original IC3 algorithm to deal with infinite-state
systems [7,16,18,25]. The techniques presented in [7,16] extend IC3 to verify systems
described in the linear real arithmetic theory. In contrast to both approaches, we do
not rely on theory specific generalization procedures, which may be expensive, such as
quantifier elimination [7] or may hinder some of the IC3 features, like generalization
(e.g. the interpolant-based generalization of [16] does not exploit relative induction).
Moreover, IC3+IA searches for a proof in the abstract space. The approach presented
in [18] is restricted to timed automata since it exploits the finite partitioning of the
region graph. While we could restrict the set of predicates that we use to regions, our
technique is applicable to a much broader class of systems, and it also allows us to apply
conservative abstractions. IC3 was also extended to the bit-vector theory in [25] with
an ad-hoc extension, that may not handle efficiently some bit-vector operators. Instead,
our approach is not specific for bit-vectors.

5 Experimental Evaluation

We have implemented the algorithm described in the previous section in the SMT exten-
sion of IC3 presented in [7]. The tool uses MathSAT [8] as backend SMT solver, and
takes as input either a symbolic transition system or a system with an explicit control-
flow graph (CFG), in the latter case invoking a specialized “CFG-aware” variant of
IC3 (TreeIC3, also described in [7]). The discovery of new predicates for abstraction
refinement is performed using the interpolation procedures implemented in MathSAT, following [15]. In this section, we experimentally evaluate the effectiveness of
our new technique. We refer to our implementations of the various algorithms as follows: IC3(LRA) is the “concrete” IC3 extension for Linear Rational Arithmetic (LRA)
as presented in [7]; TreeIC3+ITP(LRA) is the CFG-based variant of [7], also working only over LRA and exploiting interpolants whenever possible (see [7] for more details); IC3+IA(T) is IC3
with Implicit Abstraction for an arbitrary theory T; TreeIC3+IA(T) is the CFG-based
IC3 with Implicit Abstraction for an arbitrary theory T.
All the experiments have been performed on a cluster of 64-bit Linux machines with
a 2.7 GHz Intel Xeon X5650 CPU, with a memory limit set to 3 GB and a time limit of
1200 seconds (unless otherwise specified). The tools and benchmarks used in the ex-
periments are available at https://es.fbk.eu/people/griggio/papers/
tacas14-ic3ia.tar.bz2.

5.1 Performance Benefits of Implicit Abstraction

In the first part of our experiments, we evaluate the impact of Implicit Abstraction for
the performance of IC3 modulo theories. In order to do so, we compare IC3+IA(LRA)
and TreeIC3+IA(LRA) against IC3(LRA) and TreeIC3+ITP(LRA) on the same
set of benchmarks used in [7], expressed in the LRA theory. We also compare both
[Fig. 3 comprises two scatter plots (IC3+IA(LRA) vs. IC3(LRA), and TreeIC3+IA(LRA) vs. TreeIC3+ITP(LRA), with safe instances as squares and unsafe ones as circles) and a plot of the number of solved instances against total time for IC3+IA(LRA), TreeIC3+IA(LRA), TreeIC3+ITP(LRA), Z3, and IC3(LRA).]

Algorithm/Tool        # solved   Tot time
IC3+IA(LRA)               82       5836
TreeIC3+IA(LRA)           75       8825
TreeIC3+ITP(LRA)          70      10478
Z3                        66       2923
IC3(LRA)                  62       9637

Fig. 3. Experimental results on LRA benchmarks from [7]

variants against the SMT extension of IC3 for LRA presented in [16] and implemented
in the Z3 SMT solver (we used Git revision 3d910028bf of Z3).
The results are reported in Figure 3. In the scatter plots at the top, safe instances
are shown as blue squares, and unsafe ones as red circles. The plot at the bottom re-
ports the number of solved instances and the total accumulated execution time for
each tool. From the results, we can clearly see that using abstraction has a very sig-
nificant positive impact on performance. This is true for both the fully symbolic and
the CFG-based IC3, but it is particularly important in the fully symbolic case: not only
does IC3+IA(LRA) solve 20 more instances than IC3(LRA), but it is also more than one
order of magnitude faster in many cases, and there is no instance that IC3(LRA) can
solve but IC3+IA(LRA) cannot. In fact, Implicit Abstraction is so effective for these
benchmarks that IC3+IA(LRA) also outperforms TreeIC3+IA(LRA), even though
IC3(LRA) is significantly less efficient than TreeIC3+ITP(LRA). One of the rea-
sons for the smaller performance gain obtained in the CFG-based algorithm might be
that TreeIC3+ITP(LRA) already tries to avoid expensive quantifier elimination operations whenever possible, by populating the frames with clauses extracted from interpolants, and falling back to quantifier elimination only when this fails (see [7] for
details). Therefore, in many cases TreeIC3+ITP(LRA) and TreeIC3+IA(LRA)
end up computing very similar sets of clauses. However, implicit abstraction still
helps significantly in many instances, and there is only one problem that is solved by
TreeIC3+ITP(LRA) but not by TreeIC3+IA(LRA). Moreover, both abstraction-
based algorithms outperform all the other ones, including Z3.
We also tried a traditional CEGAR approach based on explicit predicate abstraction,
using a bit-level IC3 as model checking algorithm and the same interpolation procedure
of IC3+IA(LRA) for refinement. As we expected, this configuration ran out of time or
memory on most of the instances, and was able to solve only 10 of them.
Finally, we did a preliminary comparison with a variant of IC3 specific for timed
automata, ATMOC [18]. We randomly selected a subset of the properties provided with
ATMOC, ignoring the trivial ones (i.e. properties that are 1-step inductive or with a coun-
terexample of length < 3). IC3+IA(LRA) performs very well also in this case, solving
100 instances in 772 seconds, while ATMOC solved 41 instances in 3953 seconds (Z3
and IC3(LRA) solved 100 instances in 1535 seconds and 46 instances in 3347 seconds
respectively). For lack of space we do not report the plots.
Impact of Number of Predicates. The refinement step may introduce more predicates
than those actually needed to rule out a spurious counterexample (e.g. the interpolation-
based refinement adds all the predicates found in the interpolant). In principle, such re-
dundant predicates might significantly hurt performance. Using the implicit abstraction
framework, however, we can easily implement a procedure that identifies and removes
(a subset of) redundant predicates after each successful refinement step. Suppose that
IC3+IA finds a spurious counterexample trace π = (s0 , 0); . . . ; (sk , k) with the set of
predicates P, and that refine(I, T, P, π) finds a set Pn of new predicates. The reduction
procedure exploits the encoding of the set of paths of the abstract system Ŝ_{P∪Pn} up to
k steps, BMC^{P∪Pn}_k. If P ∪ Pn are sufficient to rule out the spurious counterexample,
BMC^{P∪Pn}_k is unsatisfiable. We ask the SMT solver to compute the unsatisfiable core of
BMC^{P∪Pn}_k, and we keep only the predicates of Pn that appear in the unsatisfiable core.
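The following sketch (ours; z3's assumption-based unsat cores and toy constraints stand in for the actual BMC^{P∪Pn}_k encoding) illustrates the mechanism: each candidate predicate is guarded by a selector literal, and only the predicates whose selectors occur in the unsat core are kept.

    from z3 import Int, Bool, Implies, Solver, unsat

    x0 = Int('x0')
    path = x0 == 5                                   # stand-in for the BMC path encoding
    new_preds = {'p_ge0': x0 >= 0, 'p_lt3': x0 < 3}  # candidate new predicates Pn (toy)

    s = Solver()
    s.add(path)
    selectors = {}
    for name, p in new_preds.items():
        sel = Bool('sel_' + name)
        s.add(Implies(sel, p))                       # selector guards the predicate constraint
        selectors[name] = sel

    assert s.check(*selectors.values()) == unsat     # Pn rules out the (toy) spurious path
    core = set(map(str, s.unsat_core()))
    kept = [name for name, sel in selectors.items() if str(sel) in core]
    print(kept)                                      # e.g. ['p_lt3']: p_ge0 was redundant

The core returned by the solver need not be minimal, but in practice it prunes most redundant predicates.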
In order to evaluate the effectiveness of this simple approach, we compare two ver-
sions of IC3+IA(LRA) with and without the reduction procedure. Perhaps surprisingly,
although the reduction procedure is almost always effective in reducing the total number
of predicates, the effects on the execution time are not very big. Although redundancy
removal seems to improve performance for the more difficult instances, overall the two
versions of IC3+IA(LRA) solve the same number of problems. However, this shows
that the algorithm is much less sensitive to the number of predicates added than ap-
proaches based on an explicit computation of the abstract transition relation e.g. via
All-SMT, which often exhibit, in practice and not just in theory, an exponential
increase in run time with the addition of new predicates. IC3+IA(LRA) manages to
solve problems for which it discovers several hundreds of predicates, reaching the peak
of 800 predicates and solving most of the safe instances with more than a hundred predi-
cates. These numbers are typically way out of reach for explicit abstraction techniques,
which blow up with a few dozen predicates.
[Fig. 4 plots the number of solved instances against total time for TreeIC3+IA(BV), IC3+IA(BV), ABC-DPROVE, Tip, IC3ref, and ABC-PDR.]

Algorithm/Tool      # solved   Tot time
TreeIC3+IA(BV)         150       7056
IC3+IA(BV)             150      12753
ABC-DPROVE             120       4298
Tip                    119       6361
IC3ref                 110       9041
ABC-PDR                 75       6447

Fig. 4. Experimental results on BV benchmarks from software verification

5.2 Expressiveness Benefits of Implicit Abstraction

In the second part of our experimental analysis, we evaluate the effectiveness of Implicit
Abstraction as a way of applying IC3 to systems that are not supported by the methods
of [7], by instantiating IC3+IA(T) (and TreeIC3+IA(T)) over the theories of Linear
Integer Arithmetic (LIA) and of fixed-size bit-vectors (BV).

IC3 for BV. For evaluating the performance of IC3+IA(BV) and TreeIC3+IA(BV),
we have collected over 200 benchmark instances from the domain of software verifica-
tion. More specifically, the benchmark set consists of: all the benchmarks used in §5.1,
but using BV instead of LRA as background theory; the instances of the bitvector
set of the Software Verification Competition SV-COMP [2]; the instances from the test
suite of InvGen [13], a subset of which was used also in [25].
We have compared the performance of our tools with various implementations of
the Boolean IC3 algorithm, run on the translations of the benchmarks to the bit-level
Aiger format: the PDR implementation in the ABC model checker (ABC-PDR) [11],
Tip [22], and IC3ref [3], the new implementation of the original IC3 algorithm as
described in [5]. Finally, we have also compared with the DPROVE algorithm of ABC
(ABC-DPROVE), which combines various different techniques for bit-level verification,
including IC3 (we used ABC version 374286e9c7bc, Tip 4ef103d81e, and IC3ref 8670762eaf).
We also tried Z3, but it ran out of memory on most instances. It seems
that Z3 uses a Datalog-based engine for BV, rather than PDR.
The results of the evaluation on BV are reported in Figure 4. As we can see, both
IC3+IA(BV) and TreeIC3+IA(BV) outperform the bit-level IC3 implementations.
In this case, the CFG-based algorithm performs slightly better than the fully-symbolic
one, although they both solve the same number of instances.

IC3 for LIA. For our experiments on the LIA theory, we have generated benchmarks
using the Lustre programs available from the webpage of the Kind model checker for
Lustre [14]. Since such programs do not have an explicit CFG, we have only evaluated
[Fig. 5 plots the number of solved instances against total time for IC3+IA(LIA), Z3, PKind, and Kind.]

Algorithm/Tool     # solved   Tot time
IC3+IA(LIA)           933       2064
Z3                    875       1654
PKind                 859        720
Kind                  746       8493

Fig. 5. Experimental results on LIA benchmarks from Lustre programs [14]

IC3+IA(LIA), by comparing it with Z3 and with the latest versions of Kind as well as
its parallel version PKind [17] (we used version 1.8.6c of Kind and PKind; PKind differs from Kind in that it runs k-induction in parallel with an automatic invariant generation procedure; we ran Kind with options “-compression -n 100000” and PKind with options “-compression -with-inv-gen -n 100000”). The results are summarized in Figure 5. Also in this
case, IC3+IA(LIA) outperforms the other systems.

6 Conclusion
In this paper we have presented IC3+IA, a new approach to the verification of infinite
state transition systems, based on an extension of IC3 with implicit predicate abstrac-
tion. The distinguishing feature of our technique is that IC3 works in an abstract state
space, since the counterexamples to induction and the relative inductive clauses are ex-
pressed with the abstraction predicates. This is enabled by the use of implicit abstraction
to check (abstract) relative induction. Moreover, the refinement in our procedure is fully
incremental, allowing all the clauses found in the previous iterations to be kept.
The approach has two key advantages. First, it is very general: the implementations
for the theories of LRA, BV, and LIA have been obtained with relatively little effort.
Second, it is extremely effective, being able to efficiently deal with large numbers of
predicates. Both advantages are confirmed by the experimental results, obtained on a
wide set of benchmarks, also in comparison against dedicated verification engines.
In the future, we plan to apply the approach to other theories (e.g. arrays, non-linear
arithmetic) investigating other forms of predicate discovery, and to extend the technique
to liveness properties.

Acknowledgments. This work was carried out within the D-MILS project, which is
partially funded under the European Commission’s Seventh Framework Programme
(FP7).
References
1. Barrett, C.W., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In:
Handbook of Satisfiability, vol. 185, pp. 825–885. IOS Press (2009)
2. Beyer, D.: Second Competition on Software Verification - (Summary of SV-COMP 2013).
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 594–609. Springer,
Heidelberg (2013)
3. Bradley, A.: IC3ref, https://github.com/arbrad/IC3ref
4. Bradley, A., Somenzi, F., Hassan, Z., Zhang, Y.: An incremental approach to model checking
progress properties. In: Proc. of FMCAD (2011)
5. Bradley, A.R.: SAT-Based Model Checking without Unrolling. In: Jhala, R., Schmidt, D.
(eds.) VMCAI 2011. LNCS, vol. 6538, pp. 70–87. Springer, Heidelberg (2011)
6. Chockler, H., Ivrii, A., Matsliah, A., Moran, S., Nevo, Z.: Incremental formal verification of
hardware. In: Proc. of FMCAD (2011)
7. Cimatti, A., Griggio, A.: Software Model Checking via IC3. In: Madhusudan, P., Seshia,
S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 277–293. Springer, Heidelberg (2012)
8. Cimatti, A., Griggio, A., Schaafsma, B.J., Sebastiani, R.: The MathSAT5 SMT Solver. In:
Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 93–107. Springer,
Heidelberg (2013)
9. Cimatti, A., Mover, S., Tonetta, S.: SMT-based scenario verification for hybrid systems. Formal Methods in System Design 42(1), 46–66 (2013)
10. Clarke, E., Grumberg, O., Long, D.: Model Checking and Abstraction. ACM Trans. Program.
Lang. Syst. 16(5), 1512–1542 (1994)
11. Een, N., Mishchenko, A., Brayton, R.: Efficient implementation of property-directed reach-
ability. In: Proc. of FMCAD (2011)
12. Graf, S., Saïdi, H.: Construction of Abstract State Graphs with PVS. In: Grumberg, O. (ed.)
CAV 1997. LNCS, vol. 1254, pp. 72–83. Springer, Heidelberg (1997)
13. Gupta, A., Rybalchenko, A.: InvGen: An efficient invariant generator. In: Bouajjani, A.,
Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 634–640. Springer, Heidelberg (2009)
14. Hagen, G., Tinelli, C.: Scaling Up the Formal Verification of Lustre Programs with SMT-
Based Techniques. In: Cimatti, A., Jones, R.B. (eds.) FMCAD, pp. 1–9. IEEE (2008)
15. Henzinger, T., Jhala, R., Majumdar, R., McMillan, K.: Abstractions from proofs. In: POPL,
pp. 232–244 (2004)
16. Hoder, K., Bjørner, N.: Generalized property directed reachability. In: Cimatti, A., Sebas-
tiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 157–171. Springer, Heidelberg (2012)
17. Kahsai, T., Tinelli, C.: Pkind: A parallel k-induction based model checker. In: Barnat, J.,
Heljanko, K. (eds.) PDMC. EPTCS, vol. 72, pp. 55–62 (2011)
18. Kindermann, R., Junttila, T., Niemelä, I.: SMT-based induction methods for timed systems.
In: Jurdziński, M., Ničković, D. (eds.) FORMATS 2012. LNCS, vol. 7595, pp. 171–187.
Springer, Heidelberg (2012)
19. Lahiri, S.K., Nieuwenhuis, R., Oliveras, A.: SMT techniques for fast predicate abstraction.
In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 424–437. Springer, Heidel-
berg (2006)
20. McMillan, K.L.: Lazy Abstraction with Interpolants. In: Ball, T., Jones, R.B. (eds.) CAV
2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006)
21. Sharygina, N., Tonetta, S., Tsitovich, A.: The synergy of precise and fast abstractions for
program verification. In: SAC, pp. 566–573 (2009)
22. Sorensson, N., Claessen, K.: Tip, https://github.com/niklasso/tip
23. Tonetta, S.: Abstract Model Checking without Computing the Abstraction. In: Cavalcanti,
A., Dams, D.R. (eds.) FM 2009. LNCS, vol. 5850, pp. 89–105. Springer, Heidelberg (2009)
24. Vizel, Y., Grumberg, O., Shoham, S.: Lazy abstraction and SAT-based reachability in hard-
ware model checking. In: Cabodi, G., Singh, S. (eds.) FMCAD, pp. 173–181. IEEE (2012)
25. Welp, T., Kuehlmann, A.: QF BV model checking with property directed reachability. In:
Macii, E. (ed.) DATE, pp. 791–796 (2013)
SMT-Based Verification of Software Countermeasures
against Side-Channel Attacks

Hassan Eldib, Chao Wang, and Patrick Schaumont

Department of ECE, Virginia Tech, Blacksburg, VA 24061, USA


{heldib,chaowang,schaum}@vt.edu

Abstract. A common strategy for designing countermeasures against side chan-


nel attacks is using randomization techniques to remove the statistical depen-
dency between sensitive data and side-channel emissions. However, this process
is both labor intensive and error prone, and currently, there is a lack of automated
tools to formally assess how secure a countermeasure really is. We propose the
first SMT solver based method for formally verifying the security of countermeasures against such attacks. In addition to checking whether the sensitive data
are masked, we also check whether they are perfectly masked, i.e., whether the
joint distribution of any d intermediate computation results is independent of the
secret key. We encode this verification problem into a series of quantifier-free
first-order logic formulas, whose satisfiability can be decided by an off-the-shelf
SMT solver. We have implemented the new method in a tool based on the LLVM
compiler and the Yices SMT solver. Our experiments on recently proposed coun-
termeasures show that the method is both effective and efficient for practical use.

1 Introduction

Security analysis of the hardware and software systems implemented in embedded de-
vices is becoming increasingly important, since an adversary may have physical access
to such devices and therefore can launch a whole new class of side-channel attacks,
which utilize secondary information resulting from the execution of sensitive algo-
rithms on these devices. For example, the power consumption of a typical embedded
device executing the instruction tmp=text⊕key depends on the value of the secret
key [12]. This value can be reliably deduced using a statistical method known as differ-
ential power analysis (DPA [10,19]). In recent years, many commercial systems in the
embedded space have shown weaknesses against such attacks [16,14,1].
A common mitigation strategy against such attacks is using randomization techniques
to remove the statistical dependency between the sensitive data and the side-channel in-
formation. This can be done in multiple ways. Boolean masking, for example, uses an
XOR operation of a random number r with a sensitive variable a to obtain a masked (ran-
domized) variable: am = a ⊕ r [1,17]. Later, the sensitive variable can be restored by a
second XOR operation with the same random number: am ⊕r = a. Other randomization
based countermeasures have used additive masking (am = a + r mod n), multiplicative
masking (am = a ∗ r mod n), and application-specific code transformations such as
RSA blinding (am = a · r^e mod N).
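As a tiny illustration (ours, not from the paper) of Boolean masking and demasking of an 8-bit sensitive value with a uniformly random mask:

    import secrets

    def mask(a):
        r = secrets.randbits(8)       # fresh uniform 8-bit mask
        return a ^ r, r               # masked share a_m and the mask r

    def demask(am, r):
        return am ^ r                 # a second XOR with r restores a

    a = 0x3c
    am, r = mask(a)
    assert demask(am, r) == a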

However, designing and implementing such side-channel countermeasures are labor


intensive and error prone, and currently, there is a lack of formal verification tools to
evaluate how secure a countermeasure really is. Software countermeasures are particu-
larly challenging to design, since the source of the information leakage is not the cryp-
tographic software but the microprocessor hardware that executes the software. From
the perspective of average software developers – who may not know all the architec-
tural details of the device – it is difficult to predict the myriad possible ways in which
side-channel information may be leaked. Furthermore, bugs in implementation can also
break an otherwise secure countermeasure.
In this paper, we propose a new method for formally verifying the security of mask-
ing countermeasures. Our method uses an SMT solver to check if any intermediate
computation result of the software statistically depends on the sensitive data. Since this is
a statistical property, it cannot be directly checked by conventional formal verification
methods [5,20,21,11]. Although in the literature, there exists some work on tackling
the problem using type-based information flow analysis techniques [18], these methods
are often overly conservative, leading to the classification of countermeasures as secure
when they are not. In contrast, our method always returns the precise result. Although
Bayrak et al. [2] also used a constraint solver in their method, the analysis is signifi-
cantly less precise than ours. They check whether a variable is masked by some random
variable, but not whether it is perfectly masked, i.e., whether the probability distribution
is dependent on the sensitive data. To the best of our knowledge, our method is the first
automated verification method that checks for perfect masking. This is important be-
cause with order-d perfect masking, an implementation is provably secure against any
type of order-d (and lower-order) power analysis attack [9].
Fig. 1 (left) illustrates the difference between naive and perfect masking. Here, k is
the sensitive data, r1 and r2 are the random variables, and o1, o2, o3, and o4 are the
results of four different masking schemes. Assume that all variables are Boolean, we
can construct the truth table in Fig. 1 (right). Although o1,o2,o3 all seem to depend
on the values of the random variables r1 and r2, they are vulnerable to side-channel
attacks. To see why, consider the case when o1 is logical 1. In this case, we know for
sure that k is 1, regardless of the values of the random variables. Similarly, when o2 is
logical 0, we know for sure that k is 0. Although o3 does not directly leak the sensitive
information about k as in o1 and o2, the masking is still not perfect. When o3 is logical
1 (or 0), there is a 75% chance that k is logical 1 (or 0). Therefore, an adversary may
launch a power analysis attack to deduce the value of k.

o1 = k ∧ (r1 ∧ r2)    o2 = k ∨ (r1 ∧ r2)    o3 = k ⊕ (r1 ∧ r2)    o4 = k ⊕ (r1 ⊕ r2)

k r1 r2 | o1 o2 o3 o4
0  0  0 |  0  0  0  0
0  0  1 |  0  0  0  1
0  1  0 |  0  0  0  1
0  1  1 |  0  1  1  0
1  0  0 |  0  1  1  1
1  0  1 |  0  1  1  0
1  1  0 |  0  1  1  0
1  1  1 |  1  1  0  1

Fig. 1. Masking examples: o1,o2,o3 are not perfectly masked, but o4 is perfectly masked
In contrast, o4 is perfectly masked in that the output is statistically independent of
the sensitive data. When k is logical 1 (or 0), there is a 50% chance that o4 is logical
1 (or 0). Therefore, it is provably secure against any first-order power analysis attack,
where the adversary can observe one intermediate computation result. The example in
Fig. 1 also demonstrates a weakness of the method in [2]: Since it only checks whether
a variable is masked, but not whether its probability distribution depends on the key, it
would (falsely) classify all of o1,o2,o3,o4 as secure. In contrast, our new method can
differentiate o4 from the other three, since only o4 is perfectly masked.
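The percentages above can be reproduced with a small brute-force enumeration (our own script, not the authors' SMT-based tool): for each masking scheme, count how often the output is 1 over the uniformly random bits r1, r2, separately for k = 0 and k = 1.

    from itertools import product

    schemes = {
        'o1': lambda k, r1, r2: k & (r1 & r2),
        'o2': lambda k, r1, r2: k | (r1 & r2),
        'o3': lambda k, r1, r2: k ^ (r1 & r2),
        'o4': lambda k, r1, r2: k ^ (r1 ^ r2),
    }

    for name, f in schemes.items():
        counts = [sum(f(k, r1, r2) for r1, r2 in product((0, 1), repeat=2)) for k in (0, 1)]
        verdict = 'perfectly masked' if counts[0] == counts[1] else 'leaks'
        print(f'{name}: #1-outputs for (k=0, k=1) = {tuple(counts)} -> {verdict}')

Only o4 yields the same count (2 of 4) for both key values; o1, o2, and o3 have key-dependent distributions.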
We have implemented our new method in a verification tool based on the LLVM
compiler and the Yices SMT solver [6]. We encode the verification problem into a se-
ries of quantifier-free first-order logic formulas, whose satisfiability can be decided by
Yices. Our SMT encoding scheme is significantly different from the ones used by stan-
dard verification methods, because the perfect masking property checked by our tool is
statistical in nature. For comparison, we also implemented the method in [2] in our tool.
We have conducted experiments on a large set of recently proposed countermeasures,
including the ones applied to AES and the MAC-Keccak reference code submitted to
Round 3 of NIST’s SHA-3 competition. Our results show that the new method is ef-
fective in detecting flaws in the masking implementation. Furthermore, the method is
scalable enough to handle programs of practical size and complexity.
The remainder of this paper is organized as follows. We establish notation in Sec-
tion 2, before presenting our SMT based verification algorithm in Section 3. Then, we
illustrate the entire verification process using an example in Section 4. We present our
incremental verification method in Section 5, which further improves the scalability of
our SMT-based method. We present our experimental results in Section 6, and finally
give our conclusions in Section 7.

2 Preliminaries
In this section, we define the type of side-channel attacks considered in this paper and
review the notion of perfect masking.

Side-Channel Attacks. Following the notation used by Blömer et al. [4], we assume
that the program to be verified implements a function c ← enc(x, k), where x is the
plaintext, k is the secret key, and c is the ciphertext. Let I1 (x, k, r), I2 (x, k, r), . . .,
It (x, k, r) be the sequence of intermediate computation results inside the function,
where r is an s-bit random number in the domain {0, 1}s. The purpose of using r is
to make all intermediate results statistically independent of the secret key (k).
When enc(x, k) is a linear function in the Boolean domain, masking and de-masking
are straightforward. However, when enc(x, k) is a non-linear function, masking and de-
masking often require a complete redesign of the implementation. This manual
design process is both labor intensive and error prone, and currently, there is a lack of
automated tools to assess how secure a countermeasure really is.
We assume that an adversary knows the pair (x, c) of plaintext and ciphertext in
c ← enc(x, k). For each pair (x, c), the adversary also knows the joint distribution of at
most d intermediate computation results I1 (x, k, r), . . . , Id (x, k, r), through access to
some aggregated quantity such as the power dissipation. However, the adversary does
not have access to r, which is produced by a true random number generator. The goal
of the adversary is to compute the secret key (k). In embedded computing, for instance,
these are realistic assumptions. In their seminal work, Kocher et al. [10] demonstrated
that for d = 1 and 2, the sensitive data can be reliably deduced using a statistical method
known as differential power analysis (DPA).

Perfect Masking. Given a pair (x, k) of plaintext and secret key for the function
enc(x, k), and d intermediate results I1 (x, k, r), . . . , Id (x, k, r), we use Dx,k (R) to
denote the joint distribution of I1 , . . . , Id – while assuming that the s-bit random num-
ber r is uniformly distributed in the domain {0, 1}s . Following Blömer et al. [4], we
do not put restrictions on the technical capability of an adversary. As long as there is
information leak, we consider the implementation to be vulnerable.
Definition 1. Given an implementation of function enc(x, k) and a set of intermediate
results {Ii (x, k, r)}, we say that the implementation is order-d perfectly masked if, for
all d-tuples I1 , . . . , Id , we have

Dx,k(R) = Dx',k'(R) for any two pairs (x, k) and (x', k').

The notion of perfect masking used here is more precise than the notion of sensitivity used in [2].
There, an intermediate result is considered to be sensitive if (1) it depends on at least
one secret input and (2) it is independent of any random input. We have demonstrated
the difference between them using the example in Fig. 1, where o1,o2,o3,o4 are all
insensitive, but only o4 is perfectly masked. In general, if an intermediate result is per-
fectly masked, it is guaranteed to be insensitive. However, an insensitive intermediate
result may not be perfectly masked.
To check for violations of perfect masking, we need to decide whether there exists a
d-tuple ⟨I1, . . . , Id⟩ such that Dx,k(R) ≠ Dx',k'(R) for some (x, k) and (x', k'). Here,
the main challenge is to compute Dx,k (R). We will present our solution in Section 3.
In this work, we focus on verifying security-critical programs, e.g. those that im-
plement cryptographic algorithms, as opposed to arbitrary software programs. (Our
method would be too expensive for verifying general-purpose software.) In general,
the class of programs that we consider here do not have input-dependent control flow,
meaning that we can easily remove all the loops and function calls from the code using
standard loop unrolling and function inlining techniques. Furthermore, the program can
be transformed into a branch-free representation, where the if-else branches are merged.
Finally, since all variables are bounded integers, we can convert the program to a purely
Boolean program through bit-blasting. Therefore, in this paper, we shall present our
new verification method on the bit-level representation of a branch-free program. Our
goal is to verify that all intermediate bits of the program are perfectly masked.

3 SMT Based Verification of Perfect Masking


We first illustrate the overall flow of our verification method using the program in Fig. 2.
The program is a masked version of c ← (k1 ∧ k2), where k1 and k2 are two secret
1 : compute(bool k1, bool k2, bool r1, bool r2){
2 :   bool n1, n2, n3, n4, n5, n6, n7, n8, c;
3 :   n1 = k1 ⊕ r1;
4 :   n2 = k2 ⊕ r2;
5 :   n3 = n1 ∧ n2;
6 :   n4 = k2 ⊕ r2;
7 :   n5 = r1 ∧ n4;
8 :   n6 = k1 ⊕ r1;
9 :   n7 = r2 ∧ n6;
10:   n8 = n5 ⊕ n7;
11:   c = n3 ⊕ n8;
12:   return c;
13: }

[The right-hand side of the figure shows the corresponding data-flow graph, with the output c at the root, the intermediate results n1–n8 as internal nodes, and the inputs k1, k2, r1, r2 at the leaves.]

Fig. 2. Example: a program and its graphic representation (⊕ denotes XOR; ∧ denotes AND)

keys, r1 and r2 are random variables with independent and uniform distribution in
{0, 1}, and c is the computation result. The objective of masking is to make the power
consumption of the device executing this code independent from the values of the secret
keys. This masking scheme originated from Blömer et al. [4]. The return value c is
logically equivalent to (k1 ∧ k2) ⊕ (r1 ∧ r2). The corresponding demasking function,
which is not shown in the figure, is c ⊕ (r1 ∧ r2). Therefore, demasking would produce
a result that is logically equivalent to the desired value (k1 ∧ k2).
Our method will determine if all the intermediate variables of the program are per-
fectly masked. We use the Clang/LLVM compiler to parse the input Boolean program
and construct the data-flow graph, where the root represents the output and the leaf
nodes represent the input bits. Each internal node represents the result of a Boolean
operation of one of the following types: AND, OR, NOT, and XOR. For the example
in Fig. 2, our method starts by parsing the program and creating a graph representation.
This is followed by traversing the graph in a topological order, from the program inputs
(leaf nodes) to the return value (root node). For each internal node, which represents
an intermediate result, we check whether it is perfectly masked. The order in which we
check the internal nodes is as follows: n1, n2, n3, n4, n5, n6, n7, n8, and finally, c.

The Theory. As the starting point, we mark all the plaintext bits in x as public, the
key bits in k as secret, and the mask bits in r as random. Then, for each intermediate
computation result I(x, k, r) of the program, we check whether it is perfectly masked.
Following Definition 1, we formulate this check as a satisfiability problem as follows:
∃x. ∃k, k'. ( Σ_{r∈{0,1}^s} I(x, k, r) ≠ Σ_{r∈{0,1}^s} I(x, k', r) )
Here, x represents the plaintext bits, k and k' represent two different valuations of the
key bits, and r is the random number uniformly distributed in the domain {0, 1}^s, where
s is the number of random bits. For any fixed (x, k, k'),
– Σ_{r∈{0,1}^s} I(x, k, r) is the number of satisfying assignments for I(x, k, r), and
– Σ_{r∈{0,1}^s} I(x, k', r) is the number of satisfying assignments for I(x, k', r).
Assuming that r is uniformly distributed in the domain {0, 1}^s, the above summations
indicate the probabilities of I being logical 1 under the two different key values k and k'.
If the above formula is satisfiable, there exists a plaintext x and two different keys
(k, k') such that the distribution of I(x, k, r) differs from the distribution of I(x, k', r).
In other words, some information of the secret key is leaked through I, and therefore
we say that I is not perfectly masked. If the above formula is unsatisfiable, then such
information leakage is not possible, and therefore we say that I is perfectly masked.
Another way to understand the above satisfiability problem is to look at the negation.
Instead of checking the satisfiability of the formula above, we check the validity of the
formula below:
∀x. ∀k, k'. ( Σ_{r∈{0,1}^s} I(x, k, r) = Σ_{r∈{0,1}^s} I(x, k', r) )
If this formula is valid – meaning that it holds for all valuations of x, k, and k' – then
we say that I is perfectly masked.

The Encoding. Let Φ denote the SMT formula to be created for checking intermediate
result I(x, k, r). Let s be the number of random bits in r. Our encoding method ensures
that Φ is satisfiable if and only if I is not perfectly masked. We define Φ as follows:
Φ := ( ⋀_{r=0}^{2^s−1} Ψ_k^r ) ∧ ( ⋀_{r=0}^{2^s−1} Ψ_{k'}^r ) ∧ Ψ_b2i ∧ Ψ_sum ∧ Ψ_diff ,
where the subformulas are defined as follows:
– Program logic (Ψ_k^r): Each subformula Ψ_k^r encodes a copy of the functionality of
I(x, k, r), with the random variable r set to a concrete value in {0, . . . , 2^s − 1} and
the key set to value k or k'. All copies share the same plaintext variable x.
– Boolean-to-int (Ψ_b2i): It encodes the conversion of the Boolean valued output of
I(x, k, r) to an integer (true becomes 1 and false becomes 0), so that the integer
values can be summed up later to compute Σ_{r∈{0,1}^s} I(x, k, r).
– Sum-up-the-1s (Ψ_sum): It encodes the two summations of the logical 1s in the outputs of the 2^s program logic copies, one for I(x, k, r) and the other for I(x, k', r).
– Different sums (Ψ_diff): It asserts that the two summations should have different
results.

[Fig. 3 depicts the encoding as eight copies of the checked code: four copies with key bits k1 k2 and the random bits r1 r2 fixed to 00, 01, 10, 11, four copies with key bits k1' k2' and the same four random valuations, and a final SAT check comparing the two sums of their outputs.]

Fig. 3. SMT encoding for checking the statistical dependence of an output on secret data (k1, k2)
Fig. 3 is a pictorial illustration of our SMT encoding for an intermediate result
I(k1, k2, r1, r2), where k1, k2 are the secret key bits and r1, r2 are two random bits.
Here, the first four boxes, encoding Ψ_k^0, . . . , Ψ_k^3, are the four copies of the program
logic for key bits (k1 k2), with the random bits set to 00, 01, 10, and 11, respectively.
The other four boxes, encoding Ψ_{k'}^0, . . . , Ψ_{k'}^3, are the four copies of the program logic
for key bits (k1' k2'), with the random bits set to 00, 01, 10, and 11, respectively. The
formula checks for security against first-order DPA attacks – whether there exist two
sets of keys (k1 k2 and k1’ k2’) under which the distributions of I are different.

The Running Example. Consider node n8 in Fig. 2 as the node under verification. The
function is defined as n8 = (r1 & (k2 xor r2)) xor (r2 & (k1 xor r1)).
The SMT formula that our method generates – by instantiating r1r2 to 00, 01, 10,
and 11 – is the conjunction of all of the formulas listed below:
n8_1 = (0 & (k2 xor 0)) xor (0 & (k1 xor 0)) // four copies of I(k, r)
n8_2 = (0 & (k2 xor 1)) xor (1 & (k1 xor 0))
n8_3 = (1 & (k2 xor 0)) xor (0 & (k1 xor 1))
n8_4 = (1 & (k2 xor 1)) xor (1 & (k1 xor 1))
n8_1’ = (0 & (k2’ xor 0)) xor (0 & (k1’ xor 0)) // four copies of I(k’,r)
n8_2’ = (0 & (k2’ xor 1)) xor (1 & (k1’ xor 0))
n8_3’ = (1 & (k2’ xor 0)) xor (0 & (k1’ xor 1))
n8_4’ = (1 & (k2’ xor 1)) xor (1 & (k1’ xor 1))
(( num1 = 1 ) & n8_1 ) | ((num1=0) & not n8_1 ) // convert bool to integer
(( num2 = 1 ) & n8_2 ) | ((num2=0) & not n8_2 )
(( num3 = 1 ) & n8_3 ) | ((num3=0) & not n8_3 )
(( num4 = 1 ) & n8_4 ) | ((num4=0) & not n8_4 )
(( num1’ = 1 ) & n8_1’) | ((num1’=0) & not n8_1’) // convert bool to integer
(( num2’ = 1 ) & n8_2’) | ((num2’=0) & not n8_2’)
(( num3’ = 1 ) & n8_3’) | ((num3’=0) & not n8_3’)
(( num4’ = 1 ) & n8_4’) | ((num4’=0) & not n8_4’)
(num1 + num2 + num3 + num4) != (num1’ + num2’ + num3’ + num4’) // the check

We solve the conjunction of the above formulas using an off-the-shelf SMT solver
called Yices [6]. In this particular example, the formula is satisfiable. For example, one
of the satisfying assignments is k1k2=00 and k1’k2’=01. We shall show in the next
section that, when the key bits are 00, the probability for n8 to be logical 1 is 0%; but
when the key bits are 01, the probability is 50%. This makes it vulnerable to first-order
DPA attacks. Therefore, n8 is not perfectly masked.
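The same conjunction of constraints can be handed directly to an SMT solver. The sketch below reproduces the n8 check with the Z3 Python API purely for illustration – the paper's tool uses Yices – and the helpers n8 and count are local to this sketch.

    from z3 import Bools, BoolVal, Solver, Xor, And, If, Sum

    k1, k2, k1p, k2p = Bools('k1 k2 k1p k2p')   # key bits (k1,k2) and (k1',k2')

    def n8(key1, key2, r1, r2):
        # n8 = (r1 & (k2 xor r2)) xor (r2 & (k1 xor r1)), with r1, r2 fixed constants
        return Xor(And(r1, Xor(key2, r2)), And(r2, Xor(key1, r1)))

    def count(key1, key2):
        # bool-to-int conversion and summation over the four valuations of (r1, r2)
        return Sum([If(n8(key1, key2, BoolVal(bool(a)), BoolVal(bool(b))), 1, 0)
                    for a in (0, 1) for b in (0, 1)])

    s = Solver()
    s.add(count(k1, k2) != count(k1p, k2p))   # the two distributions differ
    print(s.check())    # sat: n8 is not perfectly masked
    print(s.model())    # one witness; the paper reports k1k2 = 00, k1'k2' = 01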

High-Order Attacks. For a masked code to be resistant to first-order differential power


analysis (DPA) attacks, all the intermediate results must be perfectly masked. However,
even if each intermediate result is perfectly masked, it is still not sufficient to resist high-
order DPA attacks, where an adversary can simultaneously observe leakage from more
than one intermediate computation result. For a masking scheme to be resistant to order-
d DPA attacks, we need to ensure that the joint distribution of any d intermediate results
(where d = 2, 3, . . . ) is independent of the secret key. That is, for any d intermediate
results I1 , . . . , Id , we check the satisfiability of the following formula:
 
∃x. ∃k, k′. ( Σ_{r∈{0,1}^s} Σ_{i=1}^{d} I_i(x, k, r) ≠ Σ_{r∈{0,1}^s} Σ_{i=1}^{d} I_i(x, k′, r) )

Our encoding can be easily extended to implement this new check. In practice, most
countermeasures assume that the adversary has access to the side-channel leakage of
either one or two intermediate results, which corresponds to first-order and second-order
attacks. Our implementation handles both cases, and our experiments evaluate the new
method on countermeasures against both first-order and second-order attacks (d = 1 or 2).
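For illustration, the brute-force sketch shown earlier extends to this order-d check with a one-line change: the d intermediate results are summed before the counts are compared (again plain Python, written only for this presentation).

    from itertools import product

    def joint_count(I_list, x, k, s):
        """Sum over r of the joint value (sum of the d intermediate results)."""
        return sum(sum(I(x, k, r) for I in I_list)
                   for r in product((0, 1), repeat=s))

    def perfectly_masked_order_d(I_list, nx, nk, s):
        for x in product((0, 1), repeat=nx):
            counts = {joint_count(I_list, x, k, s) for k in product((0, 1), repeat=nk)}
            if len(counts) > 1:
                return False
        return True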

4 The Working Example

Consider the automated verification of our running example in Fig. 2. For each internal
node I, we first identify all the transitive fan-in nodes of I in the program to form a code
region for the subsequent SMT solver based analysis. In the worst case, the extracted
code region should start from the instruction (node) to be verified, and cover all the
transitive fan-in nodes on which it depends. Then, the extracted code region is given
to our SMT based verification procedure, whose goal is to prove (or disprove) that the
node is statistically independent of the secret key.
Following a topological order, our method starts with node n1, which is defined in
Line 3 of the program in Fig. 2. The extracted code region consists of n1 = k1 ⊕ r1
itself. Since it involves only one key and one random variable in the XOR operation,
a simple static analysis can prove that it is perfectly masked. Therefore, although we
could have verified it using SMT, we skip it for efficiency reasons. Such simple static
analysis is able to prove that n2, n4 and n6 are also perfectly masked.
Next, we check if n3 is perfectly masked. The truth table of n3 is shown in Fig. 4
(left). In all four valuations of k1 and k2, the probability of n3 being logical 1 is 25%.
Therefore, n3 is perfectly masked. When we apply our SMT based method, the solver
is not able to find any satisfying assignment for k1 and k2 under which the probability
distributions of n3 are different. Note that our method does not check the probability of
the output being logical 0, since having an equal probability distribution of logical 1 is
equivalent to having an equal probability distribution for logical 0.

k1 k2 r1 r2 n3 k1 k2 r1 r2 n8 k1 k2 r1 r2 c
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
0 0 1 1 1 0 0 1 1 0 0 0 1 1 1
0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
0 1 0 1 0 0 1 0 1 0 0 1 0 1 0
0 1 1 0 1 0 1 1 0 1 0 1 1 0 0
0 1 1 1 0 0 1 1 1 1 0 1 1 1 1
1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
1 0 0 1 1 1 0 0 1 1 1 0 0 1 0
1 0 1 0 0 1 0 1 0 0 1 0 1 0 0
1 0 1 1 0 1 0 1 1 1 1 0 1 1 1
1 1 0 0 1 1 1 0 0 0 1 1 0 0 1
1 1 0 1 0 1 1 0 1 1 1 1 0 1 1
1 1 1 0 0 1 1 1 0 1 1 1 1 0 1
1 1 1 1 0 1 1 1 1 0 1 1 1 1 0

Fig. 4. The truth-tables for internal nodes n3, n8, and c of the example program in Fig. 2

The verification steps for nodes n5 and n7 are similar to that of n3 – all of them are
perfectly masked.

Next, we check if n8 is perfectly masked. The proof would fail because, as shown in
the truth table in Fig. 4 (middle), the probability for n8 to be logical 1 is not the same
under different valuations of the keys. For example, if the keys are 00, then n8 would
be 0 regardless of the values of the random variables. Recall that we have shown the
detailed SMT encoding for n8 in Section 3. Using our method, the solver can quickly
find two configurations of the key bits (for example, 00 and 11) under which the prob-
abilities of n8 being logical 1 are different. Therefore, n8 is not perfectly masked.
The remaining node is c, whose truth table is shown in Fig. 4 (right). Similar to n8,
our SMT based method will be able to show that it is not perfectly masked.
It is worth pointing out that the result of applying the Sleuth method [2] would have
been different. Although n8 and c are clearly vulnerable to first-order DPA attacks, the
Sleuth method, based on the notion of sensitivity, would have classified them as “se-
curely masked.” This demonstrates a major advantage of our new method over Sleuth.

5 The Incremental Verification Algorithm


Note that the size of the formula created by our SMT encoding is linear in the size of
the program and exponential in the number of random variables – for s random bits, we
need to make 2^{s+1} copies of the program logic. This is the main bottleneck for applying
our method to large programs. In this section, we propose an incremental verification
algorithm, which applies SMT solver based analysis only to small code regions – one
at a time – as opposed to the entire fan-in cone of the node under verification. This is
crucial for scaling the method up to programs of practical size.

I2 := I1 ⊕ de-mask(x, k, r)
mask2 := rnew ⊕ mask(x, k, r) ⊕ de-mask(x, k, r)
I2 := rnew ⊕ (. . .)
Before verifying mask2, if we have already proved that I2 is perfectly masked, and rnew
is a new random variable not used elsewhere, then for the purpose of checking mask2
only, we can substitute I2 with rnew while verifying mask2.

Fig. 5. Incremental verification: applying the SMT based analysis to a small fan-in region only

Extracting the Verification Region. In practice, a common strategy in implementing


randomization based countermeasures is to have a chain of modules, where the inputs
of each module are masked before executing its logic, and are demasked afterward.
To avoid having an unmasked intermediate value, the inputs to the successor module
are masked with fresh random variables, before they are demasked from the random
variables of the previous module. This can be illustrated by the example in Fig. 5,
where the output of mask(x,k,r) is masked with the new random variable rnew before it
is demasked from the old random variable r.

Due to associativity of the ⊕ operator, reordering the masking and demasking oper-
ations would not change the logical result. For example, in Fig. 5, the instruction being
verified is in mask2(). Since the newly added random variable rnew is not used inside
mask() or de-mask(), or in the support of I3 , we can replace the entire fan-in cone of
I2 by a new random variable rdummy (or even rnew itself) while verifying mask2().
We shall see in the experimental results section that such opportunities are abundant in
real-world applications. Therefore, in this subsection, we present a sound algorithm for
extracting a small code region from the fan-in cone of the node under verification.
Our algorithm relies on some auxiliary data structures associated with the current
node i under verification: supportV[i], uniqueM[i] and perfectM[i].
– supportV[i] is the set of inputs in the support of the function of node i.
– uniqueM[i] is the set of random inputs that each reaches i along only one path.
– perfectM[i] is a subset of uniqueM[i] where each random variable, by itself, guar-
antees that node i is perfectly masked.
These tables can be computed by a traversal of the program nodes as described in Algo-
rithm 1. For example, for node I1 in Fig. 5, supportV[I1 ]= {x, k, r, rnew }, uniqueM[I1 ]
= {r, rnew }, and perfectM[I1 ]= {rnew }, assuming r is not repeated in the mask block.
For node I2 , we have supportV[I2 ]= {x, k, r, rnew }, uniqueM[I2 ]= {rnew }, since r
reaches I2 twice and so may have been de-masked, and perfectM[I2 ]= {rnew }.

Algorithm 1. Computing the auxiliary tables for all internal nodes of the program.
1. supportV[i] ← { v } for each input node i with variable v
2. uniqueM[i] ← { v } for each input node i with random mask variable v
3. perfectM[i] ← { v } for each input node i with random mask variable v
4. for each (internal node i in a leaf-to-root topological order) {
5. L ← LeftChild(i)
6. R ← RightChild(i)
7. supportV[i] ← supportV[L] ∪ supportV[R]
8. uniqueM[i] ← (uniqueM[L] ∪ uniqueM[R]) \ (supportV[L] ∩ supportV[R])
9. if (i is an XOR node)
10. perfectM[i] ← uniqueM[i] ∩ (perfectM[L]∪perfectM[R])
11. else
12. perfectM[i] ← { }
13. }
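Algorithm 1 translates almost line by line into code. The sketch below is a Python rendering under the assumption – made only for this sketch – that each node object exposes is_input(), is_random_mask(), var, left, right, and op.

    def compute_tables(nodes_in_topological_order):
        """Leaf-to-root computation of supportV, uniqueM and perfectM (Algorithm 1)."""
        supportV, uniqueM, perfectM = {}, {}, {}
        for i in nodes_in_topological_order:
            if i.is_input():
                supportV[i] = {i.var}
                is_mask = i.is_random_mask()
                uniqueM[i] = {i.var} if is_mask else set()
                perfectM[i] = {i.var} if is_mask else set()
                continue
            L, R = i.left, i.right
            supportV[i] = supportV[L] | supportV[R]
            # a random variable stays unique only if it does not reach i via both operands
            uniqueM[i] = (uniqueM[L] | uniqueM[R]) - (supportV[L] & supportV[R])
            # only an XOR node propagates a masking guarantee
            perfectM[i] = uniqueM[i] & (perfectM[L] | perfectM[R]) if i.op == 'xor' else set()
        return supportV, uniqueM, perfectM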

Our idea of extracting a small code region for the SMT based analysis is formalized in
Algorithm 2. Given the node i under verification, and uniqueM[i] as the set of random
variables that each reaches i along only one path, we call GetRegion(i, uniqueM[i])
to compute the region. Inside GetRegion, uniqueM[i] is referred to as uniqueMATi.
More specifically, we start by checking each transitive fan-in node n of the current node
i. If n is a leaf node (Line 2), then we add n and the input variable v to the region. If
n is not a leaf node, we check whether there is a random variable r ∈ uniqueMATi that, by
itself, can perfectly mask node n (Line 4). In Fig. 5, for example, rnew, by itself, can
uniformly mask node I2. If such a random variable r exists, then we add the pair (n, r) to
the region and return – skipping the entire fan-in cone of n. Otherwise, we recursively
invoke GetRegion to traverse the two child nodes of n.

Algorithm 2. Extracting a code region for node i for the subsequent SMT based analysis.
1. GetRegion(n, uniqueMATi) {
2. if (n is an input node with variable v)
3. region.add ← (n, v)
4. else if (∃ random variable r ∈ perfectM[n] ∩ uniqueMATi)
5. region.add ← (n, r)
6. else
7. region.add ← (n, {})
8. region.add ← GetRegion(n.Left, uniqueMATi)
9. region.add ← GetRegion(n.Right, uniqueMATi)
10. return region
11. }
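Under the same assumed node interface as the previous sketch, Algorithm 2 becomes a short recursive function; pairs (n, r) mark subtrees summarized by a single random variable, and (n, None) marks nodes whose children are traversed further.

    def get_region(n, unique_m_at_i, perfectM):
        """Recursive region extraction (Algorithm 2); returns (node, tag) pairs."""
        if n.is_input():
            return [(n, n.var)]
        masks = perfectM[n] & unique_m_at_i
        if masks:
            # some r in uniqueM[i] perfectly masks n by itself:
            # summarize the entire fan-in cone of n by that random variable
            return [(n, next(iter(masks)))]
        return ([(n, None)]
                + get_region(n.left, unique_m_at_i, perfectM)
                + get_region(n.right, unique_m_at_i, perfectM))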

The Overall Algorithm. Algorithm 3 shows the overall flow of our incremental verifi-
cation method. Given the program and the lists of secret, random and plaintext variables,
our method systematically scans through all the internal nodes from the inputs to the
return value. For each node i, our method first extracts a small code region (Line 4).
Then, we invoke the SMT based analysis. If the node is not perfectly masked, we add it
to the list of bad nodes.

Algorithm 3. Incremental verification of perfect masking.


1. VerifyPerfectMasking(Prog, keys, rands, plains) {
2. badNodes ← { }
3. for each (internal node i ∈ Prog in a topological order ) {
4. region ← G ET R EGION(i, uniqueM[i])
5. notPerfect ← CheckMaskingBySMT(i, region, keys, rands, plains)
6. if (notPerfect)
7. badNodes.add( i )
8. }
9. return badNodes
10. }

To optimize the performance of Algorithm 3, we conduct two simple static analyses
between Line 4 and Line 5 to quickly check whether it is fruitful to invoke the SMT
solver. The first analysis checks whether the region contains any secret key; if not, the
solver is not invoked and the instruction is reported as perfectly masked. The second
analysis checks some syntactic conditions – if all of these conditions are satisfied, the
current node i is guaranteed to be perfectly masked. In such cases, we also avoid
invoking the SMT solver.
The implemented syntactic conditions are listed as follows:
– The instruction has no secret input as its child. This guarantees that whenever a
secret variable is introduced, its masking operation is verified.
– No random variable appears in the supportV tables of both operands. This guarantees
that the perfect masking of a secret variable in either operand cannot be affected.
– Both operands are perfectly masked. This guarantees that all instructions that become
imperfectly masked due to an initially imperfectly masked instruction are found.
To further optimize the performance of Algorithm 3, we implement a method for
identifying random variables that are don’t cares for the node i under verification, and
use the information to reduce the cost of the SMT based analysis. Prior to the SMT
encoding, for each random variable r ∈supportV[i], we check if the value of r can
ever affect the output of i. If the answer is no, then r is a don’t care. During our SMT
encoding, we then set r to logical 0 rather than treating it as a random variable, to reduce
the size of the SMT formula. This can lead to a significant performance improvement
since the formula size is exponential in the number of relevant random variables.
We check whether r ∈ supportV[i] is a don’t care for node i by constructing a SAT
formula and solving it using the SMT solver. The SAT formula is defined as follows:

Ψ_region^{r=0} ∧ Ψ_region^{r=1} ∧ Ψ_diffO ,

where Ψ_region^{r=0} encodes the program logic of the region with the random bit r set to
0, Ψ_region^{r=1} encodes the program logic of the region with the random bit r set to 1,
and Ψ_diffO asserts that the outputs of these two copies differ. If the above formula is
unsatisfiable, then r is a don’t care for node i.
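The don’t-care test is again a small solver query. The sketch below phrases it with Z3 rather than Yices, and assumes a hypothetical helper build_region that returns the Boolean output expression of the extracted region for a given binding of the random bit r.

    from z3 import Solver, BoolVal, Xor, unsat

    def is_dont_care(build_region, r_name):
        """True iff fixing r to 0 or to 1 can never change the region's output."""
        out0 = build_region({r_name: BoolVal(False)})   # copy with r = 0
        out1 = build_region({r_name: BoolVal(True)})    # copy with r = 1
        s = Solver()
        s.add(Xor(out0, out1))        # the two copies produce different outputs
        return s.check() == unsat     # no such valuation exists => r is a don't care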

6 Experiments
We have implemented our method in a verification tool called SC Sniffer, based on the
LLVM compiler and the Yices SMT solver [6]. It runs in two modes: monolithic and
incremental. The monolithic mode applies our SMT based encoding to the entire fan-in
cone of each node in the program, whereas the incremental method tries to restrict the
SMT encoding to a localized region. In addition, we implemented the Sleuth method [2]
for experimental comparison. The main difference is that our method not only checks
whether a node is masked (as in Sleuth), but also checks whether it is perfectly masked,
i.e. it is statistically independent of the secret key.
We have evaluated our tool on some recently proposed countermeasures. Our exper-
iments were designed to answer the following research questions:
– How effective is our new method? We know that in theory, the new method is more
accurate than the Sleuth method. But does it have a significant advantage over the
Sleuth method in practice?
– How scalable is our new method, especially in verifying applications of realistic
code size and complexity? We have extended our SMT based method with incre-
mental verification. Is it effective in practice?
Table 1 shows the statistics of the benchmarks. Column 1 shows the name of each
benchmark example. Column 2 shows a short description of the implemented algo-
rithm. Column 3 shows the number of lines of code – here, each instruction is a bit
level operation. Column 4 shows the number of nodes that represent the intermediate
computation results. Columns 5-7 show the number of input bits that are the secret key,
the plaintext, and the random variable, respectively.

Table 1. The benchmark statistics: in addition to the program name and a short description, we
show the total lines of code, the numbers of intermediate nodes and the various inputs

Name Description Code Size Nodes Keys Plains Rands


P1 CHES13 Masked Key Whitening 79 47 16 16 16
P2 CHES13 De-mask and then Mask 67 31 8 0 16
P3 CHES13 AES Shift Rows [2nd-order] 21 21 2 0 2
P4 CHES13 Messerges Boolean to Arithmetic (bit0) [2-order] 23 24 1 0 2
P5 CHES13 Goubin Boolean to Arithmetic (bit0) [2-order] 27 60 1 0 2
P6 Logic Design for AES S-Box (1st implementation) 32 9 2 0 2
P7 Logic Design for AES S-Box (2nd implementation) 40 6 2 0 3
P8 Masked Chi function MAC-Keccak (1st implementation) 59 19 3 0 4
P9 Masked Chi function MAC-Keccak (2nd implementation) 60 19 3 0 4
P10 Syn. Masked Chi func MAC-Keccak (1st implementation) 66 22 3 0 4
P11 Syn. Masked Chi func MAC-Keccak (2nd implementation) 66 22 3 0 4
P12 MAC-Keccak 512b Perfect masked 285k 128k 288 288 805
P13 MAC-Keccak 512b De-mask and then mask – compiler error 285k 128k 288 288 805
P14 MAC-Keccak 512b Not-perfect Masking of Chi function (v1) 285k 128k 288 288 805
P15 MAC-Keccak 512b Not-perfect Masking of Chi function (v2) 285k 152k 288 288 805
P16 MAC-Keccak 512b Not-perfect Masking of Chi function (v3) 285k 128k 288 288 805
P17 MAC-Keccak 512b Unmasking of Pi function 285k 131k 288 288 805

Table 2. Experimental results: comparing our SC Sniffer method with the Sleuth method [2]

Name Sleuth [2] SC Sniffer (monolithic) SC Sniffer (incremental)


           perf. masked  nodes failed  nodes checked  time    perf. masked  nodes failed  nodes checked  time    perf. masked  nodes failed  nodes checked  SMT mask checks  time
P1 No 16 47 0.16s No 16 47 0.22s No 16 47 16 0.09s
P2 No 8 31 0.21s No 8 31 0.20s No 8 31 8 0.09s
P3 No 9 21 1.17s No 9 21 1.27s No 9 21 18 0.46s
P4 No 2 24 0.58s No 2 24 0.65s No 2 24 8 0.57s
P5 No 2 60 1.19s No 2 60 1.40s No 2 60 20 1.12s
P6 Yes 0 9 0.06s No 2 9 0.10s No 2 9 2 0.08s
P7 Yes 0 6 0.04s No 1 6 0.07s No 1 6 1 0.03s
P8 No 1 19 0.15s No 3 19 0.26s No 3 19 3 0.11s
P9 Yes 0 19 0.13s No 2 19 0.27s No 2 19 2 0.10s
P10 Yes 0 22 0.18s No 1 22 0.32s No 1 22 2 0.14s
P11 Yes 0 22 0.20s No 1 22 0.37s No 1 22 3 0.18s
P12 Yes 0 128k 91m53s - 0 34 mem-out Yes 0 128K 0 10m48s
P13 No 2560 128k 92m59s No 1 46 mem-out No 2560 128K 2560 14m10s
P14 Yes 0 128k 97m38s - 0 31 mem-out No 1024 128K 1024 18m20s
P15 Yes 0 152k 132m10s - 0 32 mem-out No 512 152K 1024 37m37s
P16 No 512 128k 113m12s - 0 40 mem-out No 1536 128K 1536 17m24s
P17 No 4096 131k 103m56s - 0 34 mem-out No 4096 131K 4096 17m35s

The benchmarks are classified into three groups. The first group of test cases (P1 to
P5) are taken from the Sleuth benchmark [2], all of which contain intermediate variables
that are not masked at all. More specifically, P1 is the masking key whitening code on
Page 12 of the Sleuth paper. P2 is the AES8 example, a smart card implementation of
AES resistant to power analysis, originated from Herbst et al. [8]. P3 is the code on
Page 13 of the Sleuth paper, also originated from Herbst et al. [8]. P4 is the code on
Page 18 of the Sleuth paper, originated from Messerges [13]. P5 is the code on Page 18
of the Sleuth paper, originated from Goubin [7].
The second group of test cases (P6 to P11) are examples where most of the interme-
diate variables are masked, but none of the masking schemes is perfect. P6 and P7 are

the two examples used by Blömer et al. [4] (on Page 7). P8 and P9 are the SHA3 MAC-
Keccak computation reordered examples, originated from Bertoni et al. [3] (Eq. 5.2 on
Page 46). P10 and P11 are two experimental masking schemes for the Chi function in
SHA3, none of which is perfectly masked.
The third group of test cases (P12 to P17) comes from the regeneration of MAC-
Keccak reference code submission to NIST in the SHA-3 competition [15]. There are a
total of 285k lines of Boolean operation code. The difference among these test cases is
that they are protected by various countermeasures, some of which are perfectly masked
(e.g. P12) whereas others are not.
Table 2 shows the experimental results run on a machine with a 3.4 GHz Intel i7-
2600 CPU, 4 GB RAM, and a 32-bit Linux OS. We have compared the performance
of three methods: Sleuth, New (monolithic), and New (incremental). Here, Sleuth is
the method proposed by Bayrak et al. [2], while the other two are our own method.
In this table, Column 1 shows the name of each test program. Columns 2-5 show the
results of running Sleuth, including whether the program passed the check, the number
of nodes that failed the check, and the total number of nodes checked. Columns 6-9 show the
results of running our new monolithic method. Here, mem-out means that the method
requires more than 4 GB of RAM. Columns 10-14 show the results of running our new
incremental method. Here, we also show the number of SMT based masking checks
made, which is often much smaller than the number of nodes checked, because many
of them are resolved by our static analysis.
First, the results show that our new algorithm is more accurate than Sleuth in deciding
whether a node is securely masked. Every node that failed the security check of Sleuth
would also fail the security check of our new method. However, there are many nodes
that passed the check of Sleuth, but failed the check of our new method. These are
the nodes that are masked, but their probability distributions are still dependent on the
sensitive inputs – in other words, they are not perfectly masked.
Second, the results show that our incremental method is significantly more scalable
than the monolithic method. On the first two groups of test cases, where the programs
are small, both methods can complete, and the difference in run time is small. However,
on large programs such as the Keccak reference code, the monolithic method could not
finish since it quickly ran out of the 4 GB RAM, whereas the incremental method can
finish in a reasonable amount of time. Moreover, although the Sleuth method imple-
ments a significantly simpler (and hence weaker) check, it is also based on a monolithic
verification approach. Our results in Table 2 show that, on large examples, our incre-
mental method is significantly faster than Sleuth.
As a measurement of the scalability of the
algorithms, we have conducted experiments
on a 1-bit version of test program P1 for 1
to 10 encryption rounds. In each parameter-
ized version, the input for each round is the
output from the previous round. We ran the
experiment twice, once with an unmasked in-
struction in each round, and once with all
instructions perfectly masked. The results of

the two experiments are almost identical, and therefore, we only plot the result for the
perfectly masked version. In the right figure, the x-axis shows the program size, and the
y-axis shows the verification time in seconds. Among the three methods, our incremen-
tal method is the most scalable.

7 Conclusions

We have presented the first fully automated method for formally verifying whether a
software implementation is perfectly masked by uniformly random inputs, and there-
fore is secure against power analysis based side-channel attacks. Our new method re-
lies on translating the verification problem into a set of constraint solving problems,
which can be decided by off-the-shelf solvers such as Yices. We have also presented
an incremental checking procedure to drastically improve the scalability of the SMT
based algorithm. We have conducted experiments on a large set of recently proposed
countermeasures. Our results show that the new method is not only more precise than
existing methods, but also scalable for practical use.

Acknowledgments. This work is supported in part by the NSF grant CNS-1128903


and the ONR grant N00014-13-1-0527.

References
1. Balasch, J., Gierlichs, B., Verdult, R., Batina, L., Verbauwhede, I.: Power analysis of Atmel
CryptoMemory – recovering keys from secure EEPROMs. In: Dunkelman, O. (ed.) CT-RSA
2012. LNCS, vol. 7178, pp. 19–34. Springer, Heidelberg (2012)
2. Bayrak, A.G., Regazzoni, F., Novo, D., Ienne, P.: Sleuth: Automated verification of software
power analysis countermeasures. In: Bertoni, G., Coron, J.-S. (eds.) CHES 2013. LNCS,
vol. 8086, pp. 293–310. Springer, Heidelberg (2013)
3. Bertoni, G., Daemen, J., Peeters, M., Assche, G.V., Keer, R.V.: Keccak implementation
overview, http://keccak.neokeon.org/Keccak-implementation-3.2.pdf
4. Blömer, J., Guajardo, J., Krummel, V.: Provably secure masking of AES. In: Handschuh, H.,
Hasan, M.A. (eds.) SAC 2004. LNCS, vol. 3357, pp. 69–83. Springer, Heidelberg (2004)
5. Clarke, E.M., Grumberg, O., Peled, D.A.: Model Checking. MIT Press, Cambridge (1999)
6. Dutertre, B., de Moura, L.: A fast linear-arithmetic solver for DPLL(T). In: Ball, T., Jones,
R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 81–94. Springer, Heidelberg (2006)
7. Goubin, L.: A sound method for switching between boolean and arithmetic masking. In:
Koç, Ç.K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 3–15. Springer,
Heidelberg (2001)
8. Herbst, C., Oswald, E., Mangard, S.: An AES smart card implementation resistant to power
analysis attacks. In: Zhou, J., Yung, M., Bao, F. (eds.) ACNS 2006. LNCS, vol. 3989, pp.
239–252. Springer, Heidelberg (2006)
9. Joye, M., Paillier, P., Schoenmakers, B.: On second-order differential power analysis. In:
Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 293–308. Springer, Heidelberg
(2005)
10. Kocher, P.C., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.) CRYPTO
1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999)

11. Li, B., Wang, C., Somenzi, F.: A satisfiability-based approach to abstraction refinement in
model checking. Electronic Notes in Theoretical Computer Science 89(4) (2003)
12. Mangard, S., Oswald, E., Popp, T.: Power Analysis Attacks - Revealing the Secrets of Smart
Cards. Springer (2007)
13. Messerges, T.S.: Securing the AES finalists against power analysis attacks. In: Schneier, B.
(ed.) FSE 2000. LNCS, vol. 1978, pp. 150–164. Springer, Heidelberg (2001)
14. Moradi, A., Barenghi, A., Kasper, T., Paar, C.: On the vulnerability of FPGA bitstream en-
cryption against power analysis attacks: Extracting keys from Xilinx Virtex-II FPGAs. In:
ACM Conference on Computer and Communications Security, pp. 111–124 (2011)
15. NIST. Keccak reference code submission to NIST’s SHA-3 competition (Round 3),
http://csrc.nist.gov/groups/ST/hash/sha-3/Round3/
documents/Keccak FinalRnd.zip
16. Paar, C., Eisenbarth, T., Kasper, M., Kasper, T., Moradi, A.: Keeloq and side-channel
analysis-evolution of an attack. In: FDTC, pp. 65–69 (2009)
17. Prouff, E., Rivain, M.: Masking against side-channel attacks: A formal security proof. In:
Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 142–159.
Springer, Heidelberg (2013)
18. Sabelfeld, A., Myers, A.C.: Language-based information-flow security. IEEE Journal on Se-
lected Areas in Communications 21(1), 5–19 (2003)
19. Taha, M., Schaumont, P.: Differential power analysis of MAC-Keccak at any key-length.
In: Sakiyama, K., Terada, M. (eds.) IWSEC 2013. LNCS, vol. 8231, pp. 68–82. Springer,
Heidelberg (2013)
20. Wang, C., Hachhtel, G.D., Somenzi, F.: Abstraction Refinement for Large Scale Model
Checking. Springer (2006)
21. Yang, Z., Wang, C., Ivančić, F., Gupta, A.: Mixed symbolic representations for model check-
ing software programs. In: Formal Methods and Models for Codesign, pp. 17–24 (July 2006)
Detecting Unrealizable Specifications
of Distributed Systems

Bernd Finkbeiner and Leander Tentrup

Saarland University, Germany

Abstract. Writing formal specifications for distributed systems is dif-


ficult. Even simple consistency requirements often turn out to be unre-
alizable because of the complicated information flow in the distributed
system: not every information is available in every component, and infor-
mation transmitted from other components may arrive with a delay or
not at all, especially in the presence of faults. The problem of checking
the distributed realizability of a temporal specification is, in general, un-
decidable. Semi-algorithms for synthesis, such as bounded synthesis, are
only useful in the positive case, where they construct an implementation
for a realizable specification, but not in the negative case: if the specifica-
tion is unrealizable, the search for the implementation never terminates.
In this paper, we introduce counterexamples to distributed realizability
and present a method for the detection of such counterexamples for spec-
ifications given in linear-time temporal logic (LTL). A counterexample
consists of a set of paths, each representing a different sequence of inputs
from the environment, such that, no matter how the components are im-
plemented, the specification is violated on at least one of these paths.
We present a method for finding such counterexamples both for the clas-
sic distributed realizability problem and for the distributed realizability
problem with faulty nodes. Our method considers, incrementally, larger
and larger sets of paths until a counterexample is found. While coun-
terexamples for full LTL may consist of infinitely many paths, we give a
semantic characterization such that the required number of paths can be
bounded. For this fragment, we thus obtain a decision procedure. Exper-
imental results, obtained with a QBF-based prototype implementation,
show that our method finds simple errors very quickly, and even prob-
lems with high combinatorial complexity, like the Byzantine Generals’
Problem, are tractable.

1 Introduction
The goal of program synthesis, and systems engineering in general, is to build sys-
tems that satisfy a given specification. Sometimes, however, this goal is unattain-
able, because the conditions of the specification are impossible to satisfy in an

This work was partially supported by the German Research Foundation (DFG) as
part of SFB/TR 14 AVACS and by the Saarbrücken Graduate School of Computer
Science, which receives funding from the DFG as part of the Excellence Initiative of
the German Federal and State Governments.


implementation. A textbook example for such a case is the Byzantine Generals’


Problem, introduced in the early 1980s by Lamport et al. [1]. Three generals
of the Byzantine army, consisting of one commander and two lieutenants, need
to agree on whether they should “attack” or “retreat.” For this purpose, the
commander sends an order to the lieutenants, and all generals then exchange
messages with each other, reporting, for example, to one general which messages
they have received from the other general. The problem is that one of the gener-
als is a traitor and can therefore not be assumed to tell the truth: the tale of the
Byzantine generals is, after all, just an illustration for the problem of achieving
fault tolerance in distributed operating systems, where we would like to achieve
consensus even if a certain subset of the nodes is faulty. Of course, we cannot
expect the traitor to agree with the loyal generals, but we might still expect a
loyal lieutenant to agree with the order issued by a loyal commander, and two
loyal lieutenants to reach a consensus in case the commander is the traitor. This
specification is, however, unrealizable in the setting of the three generals (and,
more generally, in all settings where at least a third of the nodes are faulty).
Detecting unrealizable specifications is of great value because it avoids spend-
ing implementation effort on specifications that are impossible to satisfy. If the
system consists of a single process, then unrealizable specifications can be de-
tected with synthesis algorithms, which detect unrealizability as a byproduct of
attempting to construct an implementation. For distributed systems, the prob-
lem is more complicated: in order to show that there is no way for the three
generals to achieve consensus, we need to argue about the knowledge of each
general. The key observation in the Byzantine Generals’ Problem is that the
loyal generals have no way of knowing who, among the other two generals, is
the traitor and who is the second loyal general. For example, the situation where
the commander is the traitor and orders one lieutenant to “attack” and the other
to “retreat” is indistinguishable, from the point of view of the loyal lieutenant
who is ordered to attack, from the situation where the commander is loyal and
orders both lieutenants to attack, while the traitor claims to have received a
“retreat” order. Since the specification requires the lieutenant to act differently
(agree with the other lieutenant vs. agree with the commander) in the two in-
distinguishable situations, we reach a contradiction.
Since realizability for distributed systems is in general an undecidable prob-
lem [2], the only available decision procedures are limited to special cases, such as
pipeline and ring architectures [3, 4]. There are semi-algorithms for distributed
synthesis, such as bounded synthesis [5], but the focus is on the search for imple-
mentations rather than on the search for inconsistencies: if an implementation
exists, the semi-algorithm terminates with such an implementation, otherwise it
runs forever. In this paper, we take the opposite approach and study counterex-
amples to realizability. Intuitively, a counterexample collects a sufficient number
of scenarios such that, no matter what the implementation does, an error will
occur in at least one of the chosen scenarios. As specifications, we consider for-
mulas of linear-time temporal logic (LTL). It is straightforward to encode the
Byzantine Generals’ Problem in LTL. Another interesting example is the famous

CAP Theorem, a fundamental result in the theory of distributed computation


conjectured by Brewer [6]. The CAP Theorem states that it is impossible to
design a distributed system that provides Consistency, Availability, and Parti-
tion tolerance (CAP) simultaneously. We assume there is a fixed number n of
nodes, that every node implements the same service, and that there are direct
communication links between all nodes. We use the variables reqi and outi to
denote the input and output of node i, respectively. The consistency and availability
requirements can then be encoded as the LTL formulas ⋀_{1≤i<n} (out_i ↔ out_{i+1})
and ( ⋀_{1≤i≤n} req_i ) ↔ ( ⋀_{1≤i≤n} out_i ). The partition tolerance is modeled in
a way that there is always at most one node partitioned from the rest of the
system.
In both examples, a finite set of input sequences suffices to force the system
into violating the specification on at least one of the input sequences. In this
paper, we present an efficient method for finding such counterexamples. It turns
out that searching for counterexamples is much easier than the classic synthesis
approach of establishing unrealizability by the non-existence of strategy trees [2,
3, 4]. The difficulty in synthesis is to enforce the consistency condition that
the strategy of a process must act the same way in all situations the process
cannot distinguish. On the strategy trees, this consistency condition is not an
ω-regular (or even decidable) property. When analyzing a counterexample, on
the other hand, we only check consistency on a specific set of sequences, not on a
full tree. This restricted consistency condition is an ω-regular property and can,
in fact, simply be expressed in LTL as part of the temporal specification. Our
QBF-based prototype implementation finds counterexamples for the Byzantine
Generals’ Problem and the CAP Theorem within just a few seconds.

Related Work. To the best of the authors’ knowledge, there has been no at-
tempt in the literature to characterize unrealizable specifications for distributed
systems beyond the restricted class of architectures with decidable synthesis
problems, such as pipelines and rings [3, 4]. By contrast, there is a rich litera-
ture concerning unrealizability for open systems, that is, single-process systems
interacting with the environment [7, 8, 9]. In robotics, there have been recent at-
tempts to analyze unrealizable specifications [10]. The results are also focused on
the reason for unsatisfiability, while our approach tries to determine if a specifi-
cation is unrealizable. Moreover, they only consider the simpler non-distributed
synthesis of GR(1) specifications, which is a subset of LTL. There are other
approaches concerning unrealizable specifications in the non-distributed setting
that also use counterexamples [11, 12]. There, the system specifications are as-
sumed to be correct and the information from the counterexamples are used
to modify environment assumptions in order to make the specifications realiz-
able. The Byzantine Generals’ Problem is often used as an illustration for the
knowledge-based reasoning in epistemic logics, see [13] for an early formaliza-
tion. Concerning the synthesis of fault-tolerant distributed systems, there is an
approach to synthesize fault-tolerant systems in the special case of strongly con-
nected system architectures [14].


Fig. 1. Distributed architectures

2 Distributed Realizability
A specification is realizable if there exists an implementation that satisfies the
specification. For distributed systems, the realizability problem is typically stated
with respect to a specific system architecture. Figure 1 shows some typical exam-
ple architectures: an architecture consisting of independent processes, a pipeline
architecture, and a join architecture. The architecture describes the communi-
cation topology of the distributed system. For example, an edge from x to y
labeled with b indicates that b is a shared variable between processes x and y,
where x writes to b and y reads b. The classic distributed realizability problem is
to decide whether there exists an implementation (or strategy) for each process
in the architecture, such that the joint behavior satisfies the specification. In
this paper, we are furthermore interested in the synthesis of fault-tolerant dis-
tributed systems, where the processes and the communication between processes
may become faulty.
In order to have a uniform and precise definition for the various realizabil-
ity problems of interest, we use a logical representation. Extended coordination
logic (ECL) [15] is a game-based extension of linear-time temporal logic (LTL).
ECL uses the strategy quantifier ∃C  s to express the existence of an implemen-
tation for a process output s based on input variables C.

ECL Syntax. ECL formulas contain two types of variables: the set C of input
(or coordination) variables, and the set S of output (or strategy) variables. In
addition to the usual LTL operators Next ◯, Until U, and Release R, ECL has
the strategy quantifier ∃C s, which introduces an output variable s whose values
must be chosen based on the inputs in C. The syntax is given by the grammar

ϕ ::= x | ¬x | ϕ ∨ ϕ | ϕ ∧ ϕ | ◯ϕ | ϕ U ϕ | ϕ R ϕ | ∃C s. ϕ | ∀C s. ϕ ,

where x ∈ C ∪̇ S, C ⊆ C, and s ∈ S. Besides the standard abbreviations true ≡
x ∨ ¬x, false ≡ x ∧ ¬x, ♦ϕ ≡ true U ϕ, and □ϕ ≡ false R ϕ, we use ◯ⁿϕ as an
abbreviation for n consecutive Next operators.
We denote by Q the (possibly empty) quantification prefix of a formula and
call the remainder the body. For Q ∈ {∃, ∀}, we use QQ if the prefix contains only
Q-quantifiers. For the purposes of this paper, it suffices to consider the fragment
ECL∃ that only contains existential quantifiers. We furthermore assume that the
body is quantifier-free, i.e., that the formulas are in prenex normal form (PNF).

Examples. We demonstrate how to express distributed realizability problems


in ECL∃ with the example architectures from Fig. 1. The realizability of an LTL
formula ψ1 in the architecture from Fig. 1(a) is expressed by the ECL∃ formula

∃{a} x. ∃{b} y. ψ1 .        (1)

Interprocess communication via a shared variable b, as in the pipeline architec-


ture from Fig. 1(b), is expressed by separating the information read from b from
the output written to b. In the following ECL∃ formula we use output variable
x to denote the output written to b:

∃{b} y. ∃{a, b} x. □(b = x) → ψ2        (2)

The LTL specification ψ2 is qualified by the input-output relation □(b = x),


which expresses that ψ2 is required to hold under the assumption that the infor-
mation written to b by process x is also the information read from b by process y.
This separation between sent and received information is useful to model faults
that disturb the transmission. Failing processes can be specified by omitting the
input-output relations that refer to the failing processes. As an example, consider
the architecture in Fig. 1(c). The ECL∃ formula
   
∃{a} x, y. ∃{b, c} z. ( □(c = y) → ψ3 ) ∧ ( □(b = x) → ψ3 )        (3)

specifies that there exists an implementation such that ψ3 is guaranteed to hold


even if process x or y (but not both) fails.
For a formula Φ, we differentiate two types of coordination variables, external
and internal. A coordination variable c ∈ C is external iff it is a true input
from the environment, i.e., not contained in any input-output relation of Φ. For
example, the input a in (3) is external while b and c are internal.

ECL Semantics. We give a quick definition of the ECL∃ semantics for for-
mulas in PNF and refer the reader to [15] for details and for the semantics
of full ECL. The semantics is based on trees as a representation for strate-
gies and computations. Given a finite set of directions Υ and a finite set of
labels Σ, a (full) Σ-labeled Υ-tree T is a pair ⟨Υ*, l⟩, where l : Υ* → Σ assigns
each node υ ∈ Υ* a label l(υ). For two trees T and T′, we define the
joint valuation T ⊕ T′ to be the widened tree with the union of both la-
bels. We refer to [15] for a formal definition. A path σ in a Σ-labeled Υ-tree
T is an ω-word σ0σ1σ2 . . . ∈ Υ^ω, and the corresponding labeled path σT is
(l(ε), σ0)(l(σ0), σ1)(l(σ0σ1), σ2)(l(σ0σ1σ2), σ3) . . . ∈ (Υ × Σ)^ω.
For a strategy variable s that is bound by some quantifier QC s. ϕ, we refer to
C as the scope of s, denoted by Scope(s). The meaning of a strategy variable s is
a strategy or implementation fs : (2^Scope(s))* → 2^{{s}}, i.e., a function that maps a
history of valuations of input variables to a valuation of the output variable s. We
represent the computation of a strategy fs as the tree ⟨(2^Scope(s))*, fs⟩, where fs
serves as the labeling function (cf. Fig. 2(a)–(b)). ECL∃ formulas are interpreted
over computation trees, that are the joint valuations of the computations for

(a) Strategy for y          (b) Strategy for x          (c) Computation tree

Fig. 2. In (a) and (b) we sketch example strategies for y and x satisfying the ECL∃
formula ∃∅ y. ∃{a} x. □(◯x ↔ a) ∧ □(y ↔ ◯¬y). In (c) we visualize the resulting
computation tree on which the body (LTL) formula is evaluated.
strategies belonging to the strategy variables in S, i.e., ⊕_{s∈S} ⟨(2^Scope(s))*, fs⟩
(cf. Fig. 2(c)). Given an ECL∃ formula Q∃. ϕ in prenex normal form over strategy
variables S and coordination variables C, the formula is satisfied if there exists
a computation tree T (over S) such that all paths in T satisfy the LTL formula
ϕ, i.e., ∀σ ∈ (2^C)^ω. σT, 0 ⊨ ϕ, where the satisfaction of an LTL formula on a
labeled path σT at position i ≥ 0 is defined as usual.

3 Counterexamples to Distributed Realizability


We now introduce counterexamples to realizability, which correspond to coun-
terexamples to satisfiability for the ECL∃ formula that represents the realizabil-
ity problem. The satisfiability problem for an ECL∃ formula in prenex form asks
for an implementation for all strategy variables in the quantification prefix of
the formula such that the temporal specification in the body is satisfied.
Let Φ = Q∃ . ϕ be an ECL∃ formula in prenex form over coordination variables
C and strategy variables S, where the body of the formula is the LTL formula ϕ.
A counterexample to satisfiability for Φ is a (possibly infinite) set of paths P ⊆
(2C )ω , such that, no matter what strategies are chosen for the strategy variables
in S, there exists a path σ ∈ P that violates the body ϕ. Formally, P ⊆ (2C )ω is
a counterexample to satisfiability iff, for all strategies fs : (2Scope(s) )∗ → 2{s} for
each s ∈ S, it holds that there exists a path σ ∈ P such that σ T , 0  ¬ϕ where
T = s∈S (2Scope(s) )∗ , fs .
Proposition 1. An ECL∃ formula Φ over coordination variables C and strat-
egy variables S is unsatisfiable if and only if there exists a counterexample to
satisfiability P ⊆ (2C )ω .

Proof. By the semantics of ECL∃ and P = (2C )ω . 

In the remainder of the paper, we focus on counterexamples to realizability prob-


lems. The distributed realizability problem without faults corresponds to ECL∃
formulas of the form Φ = Q∃. ϕpath → ϕ, where ϕpath defines the system
architecture AΦ: there is an edge from one strategy variable to another if the

input-output relation occurs in ϕpath . A finite counterexample to satisfiability


of Φ is a finite set of paths P ⊆ (2Cext )ω corresponding to external coordination
variables, such that for any implementation T there exists a path σ ∈ P such
that an extension σ  ∈ (2C )ω of σ violates ϕ. Note that the extension of σ by
the valuation of the internal coordination variables is uniquely specified by the
input path σ and the system implementation T .
Corollary 2. If there exists a finite counterexample to satisfiability P ⊆ (2Cext )ω
for an ECL∃ formula Φ = Q∃ . ϕpath → ϕ over coordination variables C and
strategy variables S, then Φ is unsatisfiable.
As an example, consider again the ECL∃ formula (1), ∃{a} x. ∃{b} y. ψ1,
corresponding to the architecture from Fig. 1(a) in the previous section. Let
ψ1 := □(◯y ↔ a), i.e., y must output the input a with a one-step delay. A
simple counterexample for this formula consists of two paths P1 := { ∅^ω, {a}^ω }
that differ in the values of a, but not in the values of b. Since process y cannot
distinguish the two paths, but must produce different outputs, we arrive at a
contradiction. Consider the same formula for the pipeline architecture specified
by (2), ∃{b} y. ∃{a, b} x. □(b = x) → ψ2. Due to the delay when forwarding
the input a over the shared variable b, the formula becomes unsatisfiable. P1 is a
finite counterexample in this case, too: given an implementation of x and y, we
extend both paths such that the input-output specification □(b = x) is satisfied.
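For a bounded horizon this argument can even be checked mechanically. The sketch below (plain Python, written only for this illustration) enumerates every strategy for y over the first three steps – y's output at step t may depend only on the b-values observed at steps 0, . . . , t−1 – and confirms that each of them violates □(◯y ↔ a) on at least one of the two paths of P1.

    from itertools import product

    K = 3                                             # bounded horizon for the illustration
    paths = {'empty': {'a': [0] * K, 'b': [0] * K},   # the path labelled by the empty set
             'a_set': {'a': [1] * K, 'b': [0] * K}}   # the path labelled by {a}

    # all observation histories of b that y can have seen before steps 0 .. K-1
    histories = [tuple(h) for t in range(K) for h in product((0, 1), repeat=t)]

    def satisfies(f_y, path):
        """Bounded check of the spec: y at step t+1 must equal a at step t."""
        y = [f_y[tuple(path['b'][:t])] for t in range(K)]
        return all(y[t + 1] == path['a'][t] for t in range(K - 1))

    every_strategy_fails = all(
        not all(satisfies(dict(zip(histories, table)), p) for p in paths.values())
        for table in product((0, 1), repeat=len(histories))   # all strategies for y
    )
    print(every_strategy_fails)    # True: P1 is indeed a counterexample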
The distributed realizability problem with faults corresponds to ECL∃ formulas
of the form Φ = Q∃. ⋀_{1≤i≤n} (ϕpathi → ϕi). If ϕi = ϕ for all i, the formula
states that there exists an implementation such that the specification ϕ
holds in all architectures induced by the path specifications ϕpathi. Omitted chan-
nel specifications in one of these formulas represent an arbitrary error at this
channel. In this case, a counterexample identifies for each implementation one
of these architectures where a contradiction occurs. A finite counterexample to
satisfiability of Φ consists of n finite sets of paths Pi ⊆ (2^{Cext^i})^ω, each over the
external coordination variables Cext^i of the respective architecture i, such that
for any implementation T there exists an architecture j and a path σ ∈ Pj such
that an extension σ′ ∈ (2^C)^ω of σ violates ϕj.
  
Corollary 3. An ECL∃ formula Φ = Q∃. ⋀_{1≤i≤n} (ϕpathi → ϕi) over coordina-
tion variables C and strategy variables S is unsatisfiable if there exists a finite
counterexample to satisfiability of Φ.
A counterexample for the ECL specification (3) introduces paths for inputs as
well as for every faulty node by introducing paths that model the exact channel
specification and additional paths that model the arbitrary node failures. The
target node that reads from a shared variable can, in contrast to incomplete infor-
mation, react differently on the given paths, but the reaction must be consistent
regarding its observations on all paths. Consider for example the specification
ψ3 := □(◯²z ↔ a) for the ECL formula in (3), that is, process z should output
the input a of nodes x and y. In both architectures we introduce additional
paths for the coordination variable that is omitted in the channel specification,
i.e., b and c for the first and second conjunct, respectively. Process z cannot tell

which of its inputs come from a faulty node. Since z must produce the same
output on two paths it cannot distinguish, the implementation of z contradicts
the specification in either architecture.

4 From ECL∃ to QPTL

We encode the existence of finite counterexample to realizability as a formula of


quantified propositional temporal logic (QPTL). QPTL extends LTL with a path
quantifier ∃p, where a path σ ∈ (2^AP)^ω satisfies ∃p. ϕ at position i ≥ 0, denoted by
σ, i ⊨ ∃p. ϕ, if there exists a path σ′ ∈ (2^{AP∪{p}})^ω which coincides with σ except for
the newly introduced atomic proposition p, such that σ′, i ⊨ ϕ. In the encoding,
we use the path quantifier to explicitly name the paths in the counterexample.

Realizability without Faults. We consider first the distributed realizability


problems without faults, represented by ECL∃ formula Φ = Q∃ . ϕpath → ϕ. We
assume, without loss of generality, that the architecture AΦ is acyclic. Finkbeiner
and Schewe [4] gave a realizability-preserving transformation to acyclic architec-
tures that removes feedback edges.
Lemma 4 ([4]). Any ECL∃ formula Φ = Q∃. ϕpath → ϕ can be transformed into
an equisatisfiable formula Φ′ = Q∃. ϕ′path → ϕ′ such that the system architecture
AΦ′ is acyclic.
We search for a finite counterexample of Φ by bounding the number of paths
regarding the external coordination variables. The bound on the number of paths
is given as a function K : C → IN that maps each coordination variable to the
number of branchings that should be considered for this variable. For example,
for coordination variables a and b, and K(a) = K(b) = 1, we encode 4 different
paths, one per possible combination for the two paths for each variable. We fix
an arbitrary strict order ≺ ⊆ C × C between the coordination variables. For a
set C ⊆ C, we identify K(C) with a vector in IN^{|C|}, where the position of the value
K(c) for a coordination variable c ∈ C is determined by ≺. For our encoding in
QPTL, we use the following helper functions:

– deps(v) returns the set of coordination variables that influence variable v. A


coordination variable c influences variable v if c belongs to a directed path
that leads to v in AΦ . For example in the architecture of Fig. 1(c), b and
x are influenced by a while z is influenced by a, b, and c. A coordination
variable is influenced by itself.
– branches(C, K) returns the set of branches belonging to the coordination vari-
ables C. A branch is referenced by a tuple in IN^{|C|}, and the set of branches is
{(n_{c1}, . . . , n_{ck}) | {c1 ≺ · · · ≺ ck} = C and 1 ≤ n_c ≤ 2^{K(c)} for all c ∈ C}
(see the sketch after this list).
– paths(C, K) and strategies(S, K) create the (path) variables in the QPTL
formula that belong to the variables of the ECL∃ formula. For a variable
v ∈ C ∪ S they introduce, for each branch π ∈ branches(deps(v), K), a separate
variable p_π^v that represents the variable v belonging to this branch π.

– header(S, K) creates the alternating introductions of strategies and paths


according to the acyclic architecture AΦ . For every strategy variable s ∈ S
we introduce all paths belonging to coordination variables c ∈ Scope(s) prior
to s and avoid duplicate path introductions:
∃ paths(Scope(s1 ), K) ∀ strategies({s1 }, K)
∃ paths(Scope(s2 ) \ Scope(s1 ), K) ∀ strategies({s2 }, K)
. . .
∃ paths(Scope(sn) \ ⋃_{i=1,...,n−1} Scope(si), K) ∀ strategies({sn}, K) ,

where s1 , . . . , sn are sorted in ascending order according to their informed-


ness, i.e., the subset relation on their scopes.
– consistent(S, K) specifies the consistency condition for the variables be-
longing to the strategy variables on the different branches. The variables
p_{π1}^s, . . . , p_{πk}^s belonging to a strategy variable s ∈ S must be equal as long as
the coordination variables in the scope of s on the branches π1 , . . . , πk are
equal. This can be specified in LTL as there are only finitely many branches.
The QPTL encoding for ECL∃ formula Φ and function K : C → IN is
unsatdist(Φ, K) := header(S, K). consistent(S, K) →
        ( ⋀_{π∈branches(C,K)} ϕpath(π)  ∧  ⋁_{π∈branches(C,K)} ¬ϕ(π) ) ,          (4)

where ϕ(π) is the initialization of the LTL formula ϕ on the branch π, that is, we
exchange v by p_{π′}^v for v ∈ C ∪ S, where π′ is the subvector of π that contains the
values for the coordination variables in deps(v).
Theorem 5 (Correctness). Given an ECL∃ formula Φ = Q∃ . ϕpath → ϕ over
coordination variables C and strategy variables S with an acyclic system archi-
tecture AΦ . Φ is unsatisfiable if there exists a function K : C → IN such that the
QPTL formula unsatdist (Φ, K) is satisfiable.
Realizability with Node Failures. In the case of possible failures, the ECL∃
formula Φ has the more general form Q∃. ⋀_{1≤i≤n} (ϕpathi → ϕi). In this spe-
cific setting we cannot assume acyclic architectures in general. The architecture
belonging to Φ is acyclic if the architecture belonging to the conjunction of all
path specifications ⋀_{1≤i≤n} ϕpathi is acyclic. An edge is a common feedback edge
if and only if it is a feedback edge in all architectures. As before, we can elim-
inate common feedback edges but this does not give us acyclic architectures in
general as depicted in Fig. 3. In the following, we assume acyclic architectures
after removing common feedback edges.
The QPTL encoding of ECL∃ formula Φ and functions K1, . . . , Kn : C → IN is
unsatfault(Φ, K1, . . . , Kn) := header(S, K). consistent(S, K) →
        ⋁_{1≤i≤n} ( ⋀_{π∈branches(C,Ki)} ϕpathi(π)  ∧  ⋁_{π∈branches(C,Ki)} ¬ϕi(π) ) ,          (5)

where K : C → IN is defined as K(c) := max_{1≤i≤n} Ki(c) for every c ∈ C.




Fig. 3. Example illustrating common feedback edges: Edge c is a feedback edge in


architecture (a), but not in architecture (b), thus it is also not a common feedback
edge when considering both architectures

 
Theorem 6 (Correctness). Given an ECL∃ formula Φ = Q∃. ⋀_{1≤i≤n} (ϕpathi → ϕi)
over coordination variables C and strategy variables S with an acyclic
system architecture AΦ after removing common feedback edges. Φ is unsatisfi-
able if there exist functions K1 . . . Kn : C → IN such that the QPTL formula
unsatfault (Φ, K1 , . . . , Kn ) is satisfiable.

Example. We consider again the Byzantine Generals’ Problem with three nodes
g1 , g2 , and g3 . The first general is the commander who forwards the input v that
states whether to attack the enemy or not. The encoding as ECL∃ formula is
    Φbgp := ∃{v} g12, g13. ∃{c12} g23. ∃{c13} g32. ∃{c12, c32} g2. ∃{c13, c23} g3.
            (operational2,3 → consensus2,3) ∧ ⋀_{i∈{2,3}} (operational1,i → correctvali) ,

where the quantification prefix introduces the strategies for the generals g2 and
g3 , as well as the communication between the three generals as depicted in the
architecture in Fig. 4(a). Note that we omit the vote of the commander g1 as it is
not used in the specification. In the temporal part, we specify which failures can
occur. The first conjunct, corresponding to Fig. 4(b), states that the commander
is faulty (operational2,3 ) which implies that the other two generals have to reach a
consensus whether to attack or not (consensus2,3 ). The other two cases, depicted
in Fig. 4(c)–(d), are symmetric and state that whenever one general is faulty the
other one should agree on the decision made by the commander. The QPTL
encoding unsatfault(Φbgp, K1, K2, K3) is given as

    ∃ paths({v}, K). ∀ strategies({g12, g13}, K). ∃ paths({c12, c13}, K).
    ∀ strategies({g23, g32}, K). ∃ paths({c23, c32}, K). ∀ strategies({g2, g3}, K).
      consistent({g12, g13, g23, g32, g2, g3}, K) →
        ( ⋀_{π∈branches(C,K1)} operational2,3(π)  ∧  ⋁_{π∈branches(C,K1)} ¬consensus2,3(π) ) ∨
        ( ⋀_{π∈branches(C,K2)} operational1,3(π)  ∧  ⋁_{π∈branches(C,K2)} ¬correctval3(π) ) ∨
        ( ⋀_{π∈branches(C,K3)} operational1,2(π)  ∧  ⋁_{π∈branches(C,K3)} ¬correctval2(π) ) .
Fig. 4. The Byzantine Generals’ architecture. Figure (a) shows the architecture in
case all generals are loyal. Figures (b)–(d) show the possible failures, indicated by the
dashed communication links.

5 From QPTL to QBF

Presently available QPTL solvers were unable to handle even small instances
of our problem. We therefore simplify the problem using the following steps.
Instead of checking the QPTL formula directly, we encode the formula as an
equivalent monadic second order logic of one successor (S1S) formula using a
straightforward translation. We then interpret the S1S formula with a WS1S
formula, which can be checked using the WS1S solver Mona [16]. Some of our
smaller instances were solved by Mona, but the Byzantine Generals’ Problem
failed due to memory constraints in the BDD library.
Taking the simplifications even further, we not only bound the number of
paths but also the length of the paths by translating the problem to the satisfia-
bility problem of quantified Boolean formulas (QBF). The encoding translates a
QPTL variable x to Boolean variables x0 , . . . , xk−1 , each representing one step
in the system where k is the length of the paths. We build the QBF formula
by unrolling the QPTL formula for k-steps: Each variable in the quantification
prefix of the QPTL formula is transformed into k Boolean variables in the QBF
prefix, e.g., the 3-unrolling of ∃x. ∀y. ϕ is ∃x0 , x1 , x2 . ∀y0 , y1 , y2 . ϕunroll . The un-
rolling of the remaining LTL formula is given by the expansion law for Until,
ϕ U ψ ≡ ψ ∨ (ϕ ∧ ◯(ϕ U ψ)). After the unrolling, the QBF formula is transformed
into Conjunctive Normal Form (CNF) and encoded in the QDIMACS file format,
which is the standard format for QBF solvers. Already with this encoding
we could solve more examples than using the WS1S approach.
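To make the unrolling concrete, the following sketch (our illustration, not the authors' implementation) unrolls a small tuple-encoded LTL formula up to a bound k, using Z3's Python API for the resulting propositional formula; the finite-prefix treatment of Next and Until at the bound is an assumption of the sketch.

    from z3 import And, Bool, BoolVal, Not, Or

    def step_var(name, i):
        # QPTL/LTL variable `name` at time step i becomes one Boolean variable
        return Bool(f"{name}_{i}")

    def unroll(phi, i, k):
        """Finite-prefix (k-step) unrolling of a tuple-encoded LTL formula at step i."""
        op = phi[0]
        if op == "var":
            return step_var(phi[1], i)
        if op == "not":
            return Not(unroll(phi[1], i, k))
        if op == "and":
            return And(unroll(phi[1], i, k), unroll(phi[2], i, k))
        if op == "or":
            return Or(unroll(phi[1], i, k), unroll(phi[2], i, k))
        if op == "next":                       # X phi: false beyond the bound
            return unroll(phi[1], i + 1, k) if i + 1 < k else BoolVal(False)
        if op == "until":                      # phi U psi, via the expansion law
            psi_now = unroll(phi[2], i, k)
            if i + 1 >= k:
                return psi_now                 # psi must hold within the prefix
            return Or(psi_now, And(unroll(phi[1], i, k), unroll(phi, i + 1, k)))
        raise ValueError(f"unknown operator {op}")

    # 3-step unrolling of (a U b); the quantifier prefix of the QPTL formula
    # would be replicated over a_0..a_2 and b_0..b_2 in the same way.
    print(unroll(("until", ("var", "a"), ("var", "b")), 0, 3))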
In this simple translation, one cause of high complexity is the consistency
conditions between the strategy variables across different paths. However,
most of these variables are not used for the counterexample itself but appear
only in the consistency condition. One optimization removes these unnecessary
variables from the encoding. To this end, we collect all strategy variables and
(when possible) their temporal occurrences from the LTL specification. For every
used strategy variable we build the dependency graph that contains all variables
which can influence the outcome of the strategy. In the last step, we remove all
variables that are not contained in any dependency graph.
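The sketch below illustrates this pruning step; the dependency map, which in our setting would be derived from the quantifier scopes, and all variable names are hypothetical.

    def prune_unused(used_in_spec, deps):
        # Keep every variable that can (transitively) influence a strategy
        # variable used in the specification; everything else is dropped.
        keep, stack = set(), list(used_in_spec)
        while stack:
            v = stack.pop()
            if v in keep:
                continue
            keep.add(v)
            stack.extend(deps.get(v, ()))      # variables that directly influence v
        return keep

    # Hypothetical dependency map in the style of the Byzantine Generals' example
    deps = {"g2": ["c12", "c32"], "c12": ["g12"], "c32": ["g32"],
            "g12": ["v"], "g32": ["c13"], "c13": ["g13"], "g13": ["v"]}
    print(prune_unused({"g2"}, deps))          # only the variables that can affect g2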
6 Completeness
Proposition 1 states that the characterization of unsatisfiable formulas with
counterexamples is complete. Our method, however, searches for counterex-
amples involving only a bounded number of external paths, and the following
example shows that this leads to incompleteness. Consider the ECL∃ formula
Φinf := ∃∅ y. ϕinf with temporal specification ϕinf := ◇(y ≠ x), where x
is a free coordination variable. Φinf is unsatisfiable because for every strategy
fy : ∅∗ → 2{y} there exists a path σ ∈ (2{x} )ω that simulates exactly the output
of the strategy, as the formula is evaluated over the full binary x-tree. Assume
for contradiction that a finite set of paths P ⊆ (2{x} )ω suffices to satisfy ¬ϕinf
against any strategy fy . Interpreting the outcome of the strategy as a path and
considering all possible strategies gives us a full binary tree T . Let ρ be a path
from T that is not contained in P (after renaming y in ρ to x). Such a path must
exists because there are infinite many different paths in T . Choose the strategy
fyρ that belongs to ρ. For all paths in P it holds that  (y = x) and thus no
path satisfies ¬ϕinf .
However, in practice finite external counterexamples are sufficient to detect
many errors in specifications. In this section we give a semantic characterization
of the finite path satisfiability based only on the LTL specification.
Given an ECL∃ formula Φ = Q∃ . ϕpath → ϕ. We assume w.l.o.g. that ϕ only
contains coordination variables Ce ⊆ C that are not used as a channel as otherwise
one could replace a variable c ∈ C \ Ce by the strategy variable corresponding to
the channel. The semantics of the LTL formula ¬ϕ, denoted by ⟦¬ϕ⟧, gives us
a language L ⊆ (2S × 2Ce )ω . From L we obtain the relation R ⊆ (2S )ω × (2Ce )ω
between paths of strategy variables and paths of coordination variables. We say
that an LTL formula ψ over variables S × Ce admits finite external paths if there
exists a function r : (2S )ω → (2Ce )ω such that (1) for all σ ∈ (2S )ω it holds that
r(σ) = ρ ⇔ σ R ρ, and (2) {r(σ) | σ ∈ (2S )ω } is finite.
Let RAψ be the deterministic Rabin word automaton for LTL formula ψ. RAψ
contains a path split if there exists a state q in the automaton where (1) there are
two outgoing edges labeled with (s, p) and (s′, p′) such that s ≠ s′ and p ≠ p′, and
(2) from q we can build accepting runs visiting q infinitely often and containing
exclusively the (s, p)-edge or the (s′, p′)-edge.
Theorem 7. An LTL formula ψ over variables S × Ce admits finite external
paths if and only if the automaton RAψ has no path split.

7 Experimental Results
We have carried out our experiments on a 2.6 GHz Opteron system. For solving
the QBF instances, we used a combination of the QBF preprocessor Bloqqer [17]
in version 031 and the QBF solver DepQBF [18] in version 1.0. For solving the
WS1S instances, we used Mona [16] in version 1.4-15.
Table 1 demonstrates that the Byzantine Generals’ Problem remains, despite
the optimizations described above, a nontrivial combinatorial problem: we need
to find a suitable set of paths for every possible combination of the strategies
of the generals. The bound given in the first column reads as follows: The first
component is the number of branchings for the input variable v in all three
architectures. The last three components state the number of branchings for the
outputs of the faulty nodes in their respective architectures. For example, bound
(1, 1, 0, 0) means that we have two branches for v, c12 , and c13 , while we have only
one branch for c23 and c32 . More precisely, starting from always zero functions
K1 , K2 , K3 , the bound (1, 1, 0, 0) sets K1 (v) = K2 (v) = K3 (v) = K1 (c12 ) =
K1 (c13 ) = 1 and K2 (c23 ) = K3 (c32 ) = 0. To prove the unrealizability, we need
one branching for the input v and one branching for every coordination variable
that serves as a shared variable for a faulty node, i.e., the bound (1, 1, 1, 1). The
number of branches and thereby the formula size grows exponentially with the
number of branchings for the input variables.

Table 1. Result of the Byzantine Generals’ Problem example

Bound Result # Clauses # Variables Memory (MB) Time (s)


(0, 0, 0, 0) Unsatisfiable 57 44 5.06 0.00
(1, 0, 0, 0) Unsatisfiable 228 143 5.71 0.05
(1, 1, 0, 0) Unsatisfiable 2286 1095 17.83 2.16
(1, 1, 1, 0) Unsatisfiable 2904 1375 18.41 2.42
(1, 1, 1, 1) Satisfiable 3522 1655 28.88 11.95
The table shows the time and memory consumption of Bloqqer 031 and De-
pQBF 1.0 when solving the encoding of the Byzantine Generals’ Problem in
QBF with a fixed length of 3 unrollings.

The CAP Theorem for two nodes is encoded as the ECL∃ formula
∃{req1 } com1 . ∃{req1 , chan2 } out1 . ∃{req2 } com2 . ∃{req2 , chan1 } out2 .
    (□(chan1 = com1) → □((out1 = out2) ∧ ((req1 ∨ req2) ↔ ◇(out1 ∨ out2)))) ∧
    (□(chan2 = com2) → □((out1 = out2) ∧ ((req1 ∨ req2) ↔ ◇(out1 ∨ out2)))) .

The architecture is similar to Fig. 1(a) with the difference that there is a direct
communication channel between the two processes (chan1 , chan2 ). The formula
states that the system should be available and consistent despite a failure of
one process. Table 2 shows that our method is able to find conflicts in a
specification with an architecture of up to 50 nodes within reasonable time. When we
drop either Consistency, Availability, or Partition tolerance, the corresponding
instances (AP, CP, and CA) become satisfiable. Hence, our tool does not find
counterexamples in these cases.

Discussion. We evaluate the different encodings that we have used in the fol-
lowing. There does not exist an algorithm that decides whether a given ECL∃
formula is unsatisfiable. We used a sound approach where we bound the number
of paths and encoded the problem in QPTL. The reason for incompleteness was
Table 2. Result of the CAP Theorem example

Instance Result # Clauses # Variables Memory (MB) Time (s)


ap 2 Unsatisfiable 1232 619 9.22 0.29
ca 2 Unsatisfiable 1408 763 12.47 0.87
cp 2 Unsatisfiable 48 42 5.05 0.00
cap 2 Satisfiable 110 84 5.05 0.00
cap 5 Satisfiable 665 426 5.06 0.05
cap 10 Satisfiable 2590 1556 6.49 0.35
cap 25 Satisfiable 15865 9146 35.47 2.83
cap 50 Satisfiable 62990 35796 87.84 44.03
The table shows the time and memory consumption of Bloqqer 031 and
DepQBF 1.0 when solving the encoding of the CAP Theorem in QBF
with a fixed length of 2 unrollings.

shown in Sec. 6; in some cases one may need infinitely many paths to show
unsatisfiability. Our encoding in WS1S (Mona) loses the ability to find counterexample
paths of infinite length; e.g., the ECL∃ formula ∃∅ y. ◇□(y ↔ x) with free
coordination variable x is unsatisfiable, and two paths that are infinitely often
different are sufficient to prove it. The QPTL encoding is capable of finding
these paths while the WS1S encoding is not. However, Mona could not solve any
satisfiable instance given in Tables 1 and 2. Lastly, for the translation to QBF we
not only restrict ourselves to paths of finite length (as in WS1S), but also bound
the paths to length k, where k is an additional parameter. With this encoding
we approximate the reactive behavior of our system by a finite prefix. It turned
out that despite this restriction we could prove unsatisfiability for many in-
teresting specifications. In practice, one would first use the QBF abstraction in
order to find “cheap” counterexamples. Once the number of paths grows beyond
what the QBF solver can handle within reasonable time, one proceeds with
more costly abstractions such as the WS1S encoding.

8 Conclusion
We introduced counterexamples for distributed realizability and showed how to
automatically derive counterexamples from given specifications in ECL∃ . We
used encodings in QPTL, WS1S, and QBF. Our experiments showed that the
QBF encoding was the most efficient. Even problems with high combinatorial
complexity, such as the Byzantine Generals’ Problem, are handled automati-
cally. Given that QBF solvers are likely to improve in the future, even larger
instances should become tractable. Possible future directions include building a
set of benchmarks, evaluating more solvers, and using the information about a
counterexample given by QBF certification [19] to build counterexamples for the
specification. As the bound for the encoding is not uniform, i.e., there is a separate
bound for each coordination variable, and the performance depends on the chosen
bounds, it is crucial to find suitable heuristics that rank the importance of the
coordination variables. Also, more types of failures could be incorporated into our
model, e.g., variations of the failure duration such as transient or intermittent
failures. Lastly, it would also be conceivable to use similar methods to derive a
larger class of infinite counterexamples.

References
1. Lamport, L., Shostak, R.E., Pease, M.C.: The byzantine generals problem. ACM
Trans. Program. Lang. Syst. 4(3), 382–401 (1982)
2. Pnueli, A., Rosner, R.: Distributed reactive systems are hard to synthesize. In:
Proc. FOCS 1990, pp. 746–757 (1990)
3. Kupferman, O., Vardi, M.Y.: Synthesizing distributed systems. In: LICS, pp. 389–
398. IEEE Computer Society (2001)
4. Finkbeiner, B., Schewe, S.: Uniform distributed synthesis. In: LICS, pp. 321–330.
IEEE Computer Society (2005)
5. Finkbeiner, B., Schewe, S.: Bounded synthesis. International Journal on Software
Tools for Technology Transfer 15(5-6), 519–539 (2013)
6. Brewer, E.A.: Towards robust distributed systems (abstract). In: PODC, p. 7.
ACM (2000)
7. Church, A.: Logic, arithmetic and automata. In: Proc. 1962 Intl. Congr. Math.,
Upsala, pp. 23–25 (1963)
8. Abadi, M., Lamport, L., Wolper, P.: Realizable and unrealizable specifications of
reactive systems. In: Ronchi Della Rocca, S., Ausiello, G., Dezani-Ciancaglini, M.
(eds.) ICALP 1989. LNCS, vol. 372, pp. 1–17. Springer, Heidelberg (1989)
9. Kupferman, O., Vardi, M.Y.: Synthesis with incomplete information. In: Proc. of
ICTL (1997)
10. Raman, V., Kress-Gazit, H.: Analyzing unsynthesizable specifications for high-
level robot behavior using LTLMoP. In: Gopalakrishnan, G., Qadeer, S. (eds.)
CAV 2011. LNCS, vol. 6806, pp. 663–668. Springer, Heidelberg (2011)
11. Li, W., Dworkin, L., Seshia, S.A.: Mining assumptions for synthesis. In: MEM-
OCODE, pp. 43–50. IEEE (2011)
12. Chatterjee, K., Henzinger, T.A., Jobstmann, B.: Environment assumptions for syn-
thesis. In: van Breugel, F., Chechik, M. (eds.) CONCUR 2008. LNCS, vol. 5201,
pp. 147–161. Springer, Heidelberg (2008)
13. Halpern, J.Y., Moses, Y.: Knowledge and common knowledge in a distributed
environment. In: PODC, pp. 50–61. ACM (1984)
14. Dimitrova, R., Finkbeiner, B.: Synthesis of fault-tolerant distributed systems. In:
Liu, Z., Ravn, A.P. (eds.) ATVA 2009. LNCS, vol. 5799, pp. 321–336. Springer,
Heidelberg (2009)
15. Finkbeiner, B., Schewe, S.: Coordination logic. In: Dawar, A., Veith, H. (eds.) CSL
2010. LNCS, vol. 6247, pp. 305–319. Springer, Heidelberg (2010)
16. Henriksen, J.G., Jensen, J.L., Jørgensen, M.E., Klarlund, N., Paige, R., Rauhe,
T., Sandholm, A.: Mona: Monadic second-order logic in practice. In: Brinksma,
E., Steffen, B., Cleaveland, W.R., Larsen, K.G., Margaria, T. (eds.) TACAS 1995.
LNCS, vol. 1019, pp. 89–110. Springer, Heidelberg (1995)
17. Biere, A., Lonsing, F., Seidl, M.: Blocked clause elimination for QBF. In: Bjørner,
N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS, vol. 6803, pp. 101–115.
Springer, Heidelberg (2011)
18. Lonsing, F., Biere, A.: DepQBF: A dependency-aware QBF solver. JSAT 7(2-3),
71–76 (2010)
19. Balabanov, V., Jiang, J.H.R.: Unified QBF certification and its applications. For-
mal Methods in System Design 41(1), 45–65 (2012)
Synthesizing Safe Bit-Precise Invariants

Arie Gurfinkel1 , Anton Belov2 , and Joao Marques-Silva2


1 Carnegie Mellon University, USA
2 University College Dublin, Ireland

Abstract. Bit-precise software verification is an important and difficult


problem. While there has been an amazing progress in SAT solving, Sat-
isfiability Modulo Theory of Bit Vectors, and bit-precise Bounded Model
Checking, proving bit-precise safety, i.e. synthesizing a safe inductive in-
variant, remains a challenge. Although the problem is decidable and is
reducible to propositional safety by bit-blasting, the approach does not
scale in practice. The alternative approach of lifting propositional algo-
rithms to bit-vectors is difficult. In this paper, we propose a novel tech-
nique that uses unsound approximations (i.e., neither over- nor under-)
for synthesizing sound bit-precise invariants. We prototyped the tech-
nique using Z3/PDR engine and applied it to bit-precise verification of
benchmarks from SVCOMP’13. Even with our preliminary implemen-
tation we were able to demonstrate significant (orders of magnitude)
performance improvements with respect to bit-precise verification using
Z3/PDR directly.

1 Introduction
The problem of program safety (or reachability) verification is to decide whether
a given program can violate an assertion (i.e., can reach a bad state). The prob-
lem is reducible to finding either a finite counter-example, or a safe inductive
invariant that certifies unreachability of a bad state. The problem of bit-precise
program safety, Safety(BV), further requires that the program operations are
represented soundly relative to low-level bit representation of data. Arguably,
verification techniques that are not bit-precise are unsound, and do not reflect
the actual behavior of a program. Unlike many other problems in software veri-
fication, bit-precise verification (without memory allocation and concurrency) is
decidable. However, in practice it appears to be more challenging than verification
of programs relative to integers or rationals (both undecidable).
The recent decade has seen an amazing progress in SAT solvers, in Satisfiabil-
ity Modulo Theory of Bit-Vectors, SMT(BV), and in Bounded Model Checkers

This material is based upon work funded and supported by the Department of De-
fense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the
operation of the Software Engineering Institute, a federally funded research and de-
velopment center. This material has been approved for public release and unlimited
distribution. DM-0000869. The second and third authors are financially supported
by SFI PI grant BEACON (09/IN.1/I2618), and by FCT grants ATTEST (CMU-
PT/ELE/0009/2009) and POLARIS (PTDC/EIA-CCO/123051/2010).


(BMC) based on these techniques. A SAT solver decides whether a given propo-
sitional formula is satisfiable. Current solvers can handle very large problems
and are routinely used in many industrial applications (including Hardware and
Software verification). SMT(BV) extends SAT-solver techniques to the theory of
bit-vectors – that is propositional formulas whose atoms are predicates about bit-
vectors. Most successful SMT(BV) solvers (e.g., Boolector [6], STP [17], Z3 [12],
MathSAT [9]) are based on reducing the problem to SAT via pre-processing
and bit-blasting. The bit-blasting step takes a BV formula ϕ and constructs
an equivalent propositional formula ψ, where each propositional variable of ψ
corresponds to a bit of some bit-vector variable of ϕ. The more important
pre-processing step typically consists of equisatisfiable reductions that reduce
the size of the input formula. While the pre-processor is not as powerful as the
SAT solver (typically, the pre-processor is required to run in polynomial time), it
does not maintain equivalence, only equisatisfiability. The pre-processing phase of
SMT(BV) solvers is crucial for their performance. For example, in our experiments
with Boolector, the difference between straightforward bit-blasting and
pre-processing is several orders of magnitude.
There has also been a tremendous progress in applying those techniques to
program verification. In particular, there are several mature Bounded Model
Checkers, including CBMC [10], LLBMC [32], and ESBMC [11], that decide
existence of a bounded bit-precise counterexamples of C programs. These tools
are based on ultimate reduction of BMC to SAT, either via their own custom
bit-blasting and pre-processing steps (e.g., CBMC) or by leveraging SMT(BV)
solvers described above (e.g., LLBMC). While BMC tools are great at finding
counterexamples (even in industrial applications), proving bit-precise safety, i.e.,
synthesizing a bit-precise invariant, remains a challenge. For example, none of the
tools submitted to the Software Verification Competition in 2013 (SVCOMP’13)
are both bit-precise and effective at invariant synthesis.
As we described above, Safety(BV) is decidable. In fact, it is reducible to safety
problem over propositional logic, Safety(Prop), via the simple bit-blasting men-
tioned above. Thus, the naive solution is to reduce Safety(BV) to Safety(Prop)
and decide it using tools for propositional verification. This, however, does not
scale. Our experiments with Z3/PDR (the Model Checker of Z3), show that the
approach is ineffective for almost all benchmarks in SVCOMP’13. The main is-
sue is that the reduction of Safety(BV) to Safety(Prop) is incompatible with the
pre-processing techniques that make bit-blasting for SMT(BV) so effective.
An alternative approach of lifting effective Model Checking technique from
propositional level to BV appears to be difficult, with only a few somewhat suc-
cessful attempts (e.g., [26,19]). For example, techniques based on interpolation
(e.g., [31,27,1]) require word-level interpolation for BV [25,19] that satisfies
additional properties (e.g., sequence and tree properties) [21], while techniques
based on PDR [22] require novel word-level inductive generalization strategies.
Both are difficult problems in themselves.
Thus, instead of lifting existing techniques, we are interested in finding a
way to use existing verification engines to improve scalability of the naive
bit-blasting-based solution. Our key insight is based on the observation that


most program verifiers abstract program arithmetic by integer (or rational) arith-
metic. This is unsound in the presence of overflows (see [19] for an example),
but the results are often “almost” correct. More importantly, they are useful to
the users. Thus, we are interested in how to reuse such unsound invariants in a
sound way.
Our procedure is based on an iterative guess-and-check loop. Given a Safety(BV)
problem P , we begin by trying to solve P using a Safety(BV) solver. If this takes
too long, we abort it, and construct an approximation (neither over- nor under-)
PT of P in another theory T (e.g., Linear Rational Arithmetic), decide the safety
of PT using a solver for Safety(T ), and obtain an inductive safe invariant Inv T . We
then port Inv T in a sound way to P , strengthen P with it, and repeat bit-blasting-
based verification. In the best case, the ported version of Inv T is a safe and in-
ductive invariant for P and the process terminates immediately. In the worst case,
Inv T contributes facts that might help the next verification effort.
We make the following contributions. First, we formally define a framework
that allows to use unsound invariants soundly in a verification loop. Second,
we instantiate the framework for the theories of Bit Vectors and Linear Arith-
metic. In particular, we describe an algorithm for computing Maximal Inductive
Subformula for SMT(BV) and show how it interacts with the pre-processing
step. Third, we have implemented the proposed framework using Z3/PDR for
Safety(Prop) and Boolector for SMT(BV) and have evaluated it on the bench-
marks from SVCOMP’13. Even with our preliminary implementation, we are
able to synthesize safe invariants for most programs.
Related Work. The use of over- and under-approximation and relaxation of a
problem from one theory into another is common in both SMT-solving and Model
Checking. For example, Bryant et al. [7], use over- and under-approximation to
decide formulas in SMT(BV). Komuravelli et al. [24] similarly use over- and
under-approximations for Software Model Checking. While we do not require
our approximations to be sound, the techniques we employ to lift proof
certificates (inductive invariants in our case) are in principle similar.
Computing Maximal Inductive Subformula (MIS) is similar to mining an in-
ductive invariant from a set of possible annotations, as for example in [16,23].
The key novelty in our approach is in the reduction from MIS problem to a Min-
imal Unsatisfiable Subformula (MUS) problem that allows the use of efficient
MUS extractors for SAT.
The works conceptually closest to ours are in the area of Upgrade Check-
ing [15], Multi-Property Verification [8], and Regression Verification [18,28,4].
A common theme in the above approaches is that they attempt to lift a safety
invariant from one given program P1 to another, related but not equivalent,
program P2 . The key difference is that we do not assume existence of a proven
program P1 , but, instead, synthesize P1 and its safety proof automatically.

2 Preliminaries
We assume some familiarity with program verification, logic, SMT and SAT.
Safety Verification. A transition system P is a tuple (V, Init , Tr , Bad ), where


V is a set of variables, Init , Bad , and Tr are formulas (with free variables in V)
denoting the initial and the bad states, and the transition relation, respectively.
A transition system P is UNSAFE iff there exists a natural number N such
that the following formula is satisfiable:
    Init(v0) ∧ ⋀_{i=0}^{N−1} Tr(vi, vi+1) ∧ Bad(vN)        (1)

When P is UNSAFE and s ∈ Bad is a reachable state, the path from s0 ∈ Init
to s ∈ Bad is called a counterexample (CEX).
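As an illustration of the bounded check (1), the sketch below unrolls a toy transition system (an 8-bit counter that is bad once it reaches 200) in Z3's Python API; the system itself is invented for the example.

    from z3 import And, BitVec, BitVecVal, Solver, sat

    def init(v):
        return v == BitVecVal(0, 8)

    def tr(v, v_next):
        return v_next == v + 1

    def bad(v):
        return v == BitVecVal(200, 8)

    def bmc(n):
        # Formula (1): Init(v0) /\ Tr(v0,v1) /\ ... /\ Tr(v_{n-1},v_n) /\ Bad(v_n)
        vs = [BitVec(f"v_{i}", 8) for i in range(n + 1)]
        s = Solver()
        s.add(init(vs[0]))
        s.add(And([tr(vs[i], vs[i + 1]) for i in range(n)]))
        s.add(bad(vs[n]))
        return s.check() == sat                # sat: a length-n counterexample exists

    print(bmc(10), bmc(200))                   # False, True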
A transition system P is SAFE if and only if there exists a formula Inv, called
a safe invariant, that satisfies the following conditions:
    Init(v) → Inv(v)        Inv(v) ∧ Tr(v, u) → Inv(u)        Inv(v) → ¬Bad(v)        (2)
A formula Inv that satisfies the first two conditions is called an invariant of
P , while a formula Inv that satisfies the third condition is called safe. A safety
verification problem is to decide whether a transition system P is SAFE or
UNSAFE. Thus, a safety verification problem is equivalent to the problem of
establishing an existence of a safe invariant. In SAT-based Model Checking, the
verification problem is decided by iteratively synthesizing an invariant Inv or
finding a CEX.
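The three conditions in (2) amount to validity checks that an SMT solver can discharge directly. The following sketch does this in Z3's Python API for a toy even-counter system and an illustrative candidate invariant; none of it is taken from the paper.

    from z3 import And, BitVec, Implies, Not, Solver, substitute, unsat

    v, u = BitVec("v", 8), BitVec("u", 8)      # current and next state
    init = v == 0
    tr   = u == v + 2                          # the counter increases by two
    bad  = v == 1
    inv  = (v & 1) == 0                        # candidate: the counter stays even

    def valid(formula):
        s = Solver()
        s.add(Not(formula))                    # valid iff the negation is unsat
        return s.check() == unsat

    print(valid(Implies(init, inv)))                              # initiation
    print(valid(Implies(And(inv, tr), substitute(inv, (v, u)))))  # consecution
    print(valid(Implies(inv, Not(bad))))                          # safety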

Minimal Unsatisfiability. A CNF formula F , viewed as a set of clauses,


is minimal unsatisfiable (MU) if (i) F is unsatisfiable, and (ii) for any clause
C ∈ F , F \ {C} is satisfiable. A CNF formula F  is a minimal unsatisfiable
subformula (MUS) of a formula F if F  ⊆ F and F  is MU. Motivated by several
applications, minimal unsatisfiability and related concepts have been extended
to CNF formulas where clauses are partitioned into disjoint sets called groups.
Definition 1. [33] Given an explicitly partitioned unsatisfiable CNF formula
G = G0 ∪ G1 ∪ · · · ∪ Gk (a group-MUS instance or a group-CNF formula), where
Gi's are pair-wise disjoint sets of clauses called groups, a group-MUS of G is
a subset G′ of {G1, . . . , Gk} such that (i) G0 ∪ ⋃G′ is unsatisfiable, and (ii) for
any group G ∈ G′, G0 ∪ ⋃(G′ \ {G}) is satisfiable.
Notice that group-0, G0 , plays the special role of a “background” subformula,
with respect to which the set of groups {G1 , . . . , Gk } is minimized. In particular,
if G0 is unsatisfiable, the group-MUS of G is ∅.
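To make the definition concrete, here is a small deletion-based group-MUS computation on top of Z3, with one selector literal per group; real extractors such as MUSer2 [3] are considerably more sophisticated, so this is only an illustrative sketch on an invented instance.

    from z3 import Bool, Bools, Implies, Not, Or, Solver, unsat

    a, b, c = Bools("a b c")
    group0 = [Or(a, b)]                        # background clauses (always on)
    groups = {1: [Not(a)], 2: [Not(b)], 3: [Or(b, c)]}

    s = Solver()
    for cl in group0:
        s.add(cl)
    sel = {}
    for i, cls in groups.items():
        sel[i] = Bool(f"sel_{i}")              # group i is active iff sel_i is assumed
        for cl in cls:
            s.add(Implies(sel[i], cl))

    assert s.check([sel[i] for i in groups]) == unsat   # the instance is unsatisfiable

    mus = set(groups)                          # deletion loop: try to drop each group
    for i in sorted(groups):
        trial = mus - {i}
        if s.check([sel[j] for j in trial]) == unsat:
            mus = trial                        # group i is not needed for unsatisfiability
    print(mus)                                 # {1, 2} for this instance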

3 Synthesizing Safe Bit-Precise Invariants


3.1 High-Level Description of the Approach
Given a transition system P = (V, Init, Tr , Bad ), let the target theory TT be
the theory1 , or a combination of theories, that define the formulas in P . Let
1
The term “theory” is used as in the context of Satisfiability Modulo Theories.
TW be another theory, referred to as a working theory, with the intention that


reasoning in TW is easier in practice than reasoning in TT . Our approach relies
on a mapping MT →W that translates formulas over TT to formulas over TW .
Although the correctness of the approach is not affected by the choice of MT →W ,
its effectiveness is. We would like to map between formulas that are somewhat
close to each other semantically. Thus, we assume that MT →W maps the terms
and the atomic formulas of TT to those of TW and is an identity mapping for
the symbols shared between the two theories. The mapping is extended to all
formulas of TT by structural induction, i.e., given a formula F (v) over TT , the
corresponding formula FW (v) over TW is constructed by inductively applying
MT →W on the structure of F (v). Similarly, to translate formulas from TW to
TT , we work with a mapping MW →T from the terms and the atomic formulas of
the working theory TW to those of TT , extended to all formulas of TW .
Example 1. Let TT = BV∗ (32) — a sub-theory of the quantifier-free fragment of
the first-order theory of 32 bit bit-vector arithmetic (cf., [7]) obtained by remov-
ing all the non-arithmetic functions and predicates, as well as the multiplication
and the division on bit-vectors. Let TW = LA — the quantifier-free fragment of
the first order-theory of linear arithmetic, together with the propositional logic.
The mapping MT →W is defined as follows: (i) the propositional fragment of TT
maps to the propositional fragment of TW as is; (ii) bit-vector variables map
to LA variables; (iii) the arithmetic functions and predicates of BV(32) map to
their natural counterparts in LA, e.g., +[32] to +, <[32] to <, etc. Then, if

Init(x[32] , y [32] , z) = (x[32] +[32] y [32] >[32] 0[32] ) ∧ z,

where x[32] and y [32] are bit-vector and z propositional variables, the correspond-
ing LA formula Init W (x, y, z) is

Init W (x, y, z) = (x + y > 0) ∧ z.

The inverse mapping MW →T from LA to BV(32) is constructed in a similar


manner, with the slight complication related to LA constants, which might be
non-integer, too large to fit into the required bit-width, or negative. One possibil-
ity to deal with non-integer constants is to truncate the fractional digits, i.e., map
0.5 to 0[32] . Other options include rounding up the constants when possible, e.g.,
by translating (x > 0.5) to (x[32] ≥[32] 1[32] ), but (x < 0.5) to (x[32] ≤[32] 0[32] ).
For this paper, we adopt the former, simpler, approach, and leave the investiga-
tion of more sophisticated translations to future work. To convert an integer LA
constant to BV(32) we take the lower 32 bits of its 2's-complement representation.
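For concreteness, the sketch below builds the BV∗(32) formula of this example and its LA image side by side in Z3's Python API. The paper's mapping is a structural translation over entire transition systems, so this hand-written pair is purely illustrative; note that the two formulas already disagree on overflowing inputs such as x = 2^31 - 1, y = 1.

    from z3 import And, BitVec, BitVecVal, Bool, Int, IntVal

    # Target theory BV*(32):  Init(x[32], y[32], z) = (x + y > 0) /\ z
    x_bv, y_bv, z = BitVec("x", 32), BitVec("y", 32), Bool("z")
    init_bv = And(x_bv + y_bv > BitVecVal(0, 32), z)   # wrap-around, signed '>'

    # Working theory LA:      Init_W(x, y, z) = (x + y > 0) /\ z
    x_la, y_la = Int("x"), Int("y")
    init_la = And(x_la + y_la > IntVal(0), z)

    print(init_bv)
    print(init_la)

    # Converting LA constants back to BV(32): truncate fractions, then take the
    # lower 32 bits of the 2's-complement representation (negative values wrap).
    print(BitVecVal(int(0.5), 32), BitVecVal(-5, 32))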
Remark 1. Clearly, our sub-theory BV∗ (32) of the full theory BV(32) was chosen
to simplify the construction of the mapping to and from LA. Generally, such
restriction of the original target theory might not be necessary if the working
theory TW supports uninterpreted functions.
The pseudocode in Algorithm 1 provides the high-level description of our ver-
ification framework (MISper). Given a transition system P = (V, Init , Tr , Bad )
Algorithm 1: MISper — safety verification framework


Input : P = (V, Init, Tr , Bad ) — a transition system over theory TT
Output: st ∈ {SAFE, UNSAFE, UNKNOWN}
1 forever do
2 under resource limits do
3 (st, Inv , Cex ) ← Safety(TT )(P ) // solve in the target theory
4 if st = UNKNOWN then return st
5 (TW , MT →W , MW →T ) ← pick a working theory and mappings
6 PW ← MT →W (P ) // translate P to the working theory
7 (st, Inv W , Cex W ) ← Safety(TW )(PW )
8 if st = SAFE then
9 return UNKNOWN // options: deal with CEX; try another TW
10 Cand ← MW →T (Inv W ) // get the candidate invariant for P
11 if Cand is safe invariant for P then
12 return SAFE
13 Cand I ← ComputeMIS(Cand )
14 Tr (u, v) ← Cand I (u) ∧ Tr (u, v) ∧ Cand I (v) // strengthen tr. rel.

over the target theory TT (e.g., BV∗ (32) from Example 1), we first attempt to
solve P with a solver for Safety(TT ) under heuristically chosen resource limits2 .
If the solver fails to prove or disprove the safety of P , we pick a working theory
TW , and a pair of corresponding mappings MT →W and MW →T (e.g., TW = LA
and the mappings are as in Example 1). Then, we attempt to verify the safety of
PW = MT →W (P ) = (U, MT →W (Init ), MT →W (Tr ), MT →W (Bad )), where U are
the fresh variables introduced by MT →W , using a solver for Safety(TW ). Since
PW is in general neither under- nor over- approximation of P , the (un)safety of
the former does not imply the (un)safety of the latter. Since the focus of this
paper is on synthesis of invariants for verification, we omit the detailed discus-
sion of how to handle the UNSAFE status of PW . One option is to simply return
UNKNOWN, as in Algorithm 1. Alternatively, the CEX for PW can be mapped
to TT via MW →T and checked on P — if the mapped CEX is also a CEX for
P , return UNSAFE. Otherwise, the mapping can be refined to eliminate the
CEX, and the safety verification of PW under the new mapping repeated. If,
on the other hand, PW is safe, we take the safe invariant Inv W of PW , and
translate it back to the target theory TT to obtain a candidate-invariant for-
mula Cand = MW →T (Inv W ). If Cand is a safe invariant of P , then the safety
of P is established, and the algorithm returns SAFE. Otherwise, we attempt
to compute a subformula Cand I of Cand that is an invariant of P — this is
done in the function ComputeMIS on line 13 of Algorithm 1, which we describe
in detail in Section 3.2. Once an invariant of P is obtained, we restrict the tran-
sition relation of P by replacing the formula Tr (u, v) in P with the formula
Cand I (u) ∧ Tr (u, v) ∧ Cand I (v), and attempt to verify the safety of the new
2
This step is optional on the first iteration of the main loop of Algorithm 1.
transition system (the next iteration of the main loop). Since Cand I is the ac-
tual invariant of P , the (un)safety of strengthened transition system implies the
(un)safety of the input system P .
This verification framework can be instantiated in numerous ways and leaves a
number of open heuristic choices. We postpone the description of an instantiation
of the framework used in our experiments to Section 4.

3.2 Computing Invariants


Given a candidate invariant Cand for a transition system P = (V, Init , Tr , Bad ),
obtained as described in Section 3.1, we are interested in computing a subformula
Cand I of Cand that is an invariant with respect to P , that is, Cand I (u) ∧
Tr (u, v) |= Cand I (v). Similarly to the previous work on invariant extraction
(e.g., [8,24]), we proceed under the assumption that the candidate invariant
Cand (u) is given as a conjunction of formulas Cand (u) = L1 (u) ∧ · · · ∧ Ln (u).
We refer to the conjuncts Li of Cand as lemmas. Then, the invariant Cand I
can always be constructed as a (possibly empty) conjunction of some of the
lemmas in Cand . In our setting, this assumption is justified by the fact that many
verification tools, particularly those based on PDR [5,13] and its extensions
(e.g., Z3/PDR [22]) do indeed produce invariants in this form. In the worst
case, Cand itself can be treated as the (only) conjunct, which, while affecting
the effectiveness of our approach, does not affect its correctness. We note that
the ideas discussed in this section can be extended to candidate invariants of
arbitrary structure, though such extension is outside of the scope of this paper.
For notational convenience we treat Cand as a set of lemmas {L1 , . . . , Ln },
and formalize the invariant computation problem as follows:
Definition 2. Given a set of lemmas L = {L1, . . . , Ln} and a transition relation
Tr(u, v), a subset L′ ⊆ L is inductive if (⋀_{L∈L′} L(u)) ∧ Tr(u, v) |= ⋀_{L∈L′} L(v).
An inductive subset L′ ⊆ L is maximal if no strict superset of L′ is inductive.
Finally, an inductive subset L′ ⊆ L is maximum if the cardinality of L′ is
maximum among all inductive subsets of L.
It is not difficult to see that a union of two inductive subsets is inductive, and so
any set of lemmas L has a unique maximal, and hence a unique maximum, inductive
subset L′. We refer to L′ as the MIS (maximal/maximum inductive subset)
of L. Thus, in our framework, given a candidate invariant Cand of transition
system P, the actual invariant Cand I of P is obtained by computing the MIS of
Cand — this is motivated by the fact that we aim to strengthen the transition
relation as much as possible prior to the next iteration of the algorithm.

Approaches to MIS Computation. The existing approaches to computa-


tion of MISes can be categorized into eager and lazy. Given a set of lemmas
L = {L1 , . . . , Ln } and the transition relation Tr , the eager approach (taken, for
example, in [8]) starts by checking whether L(u) ∧ Tr (u, v) |= L(v). This is typ-
ically done by testing the unsatisfiability of the formula L(u) ∧ Tr (u, v) ∧ ¬L(v)
with an SMT (or a SAT) solver. If the formula is satisfiable, i.e., L is not inductive,
the model returned by the solver must falsify one or more lemmas in L(v). These
lemmas are then removed both from L(u) and from L(v), and the test is repeated.
The process continues until for some subset L′ ⊆ L, L′(u) ∧ Tr(u, v) |= L′(v).
The final subset L′ is obviously inductive. Furthermore, for any set of lemmas
L″ ⊆ L \ L′ there must have been a point in the execution of the algorithm
where it obtained a model for a formula L′(u) ∧ L″(u) ∧ Tr(u, v) that falsifies
at least one lemma in L″(v), as otherwise this lemma would be included in L′.
Hence, L′ is maximal, and therefore is a MIS of L.
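The following sketch implements the eager loop just described in Z3's Python API, on a hypothetical 8-bit counter with an invented lemma set; it is meant only to make the procedure concrete and is not the implementation used in the paper.

    from z3 import And, BitVec, Not, Or, Solver, ULT, is_true, sat, substitute

    u, v = BitVec("u", 8), BitVec("v", 8)
    tr = v == u + 2                            # toy transition relation
    lemmas = {
        "even":  (u & 1) == 0,                 # inductive
        "small": ULT(u, 100),                  # not inductive: the counter wraps around
    }

    def eager_mis(lemmas, tr, u, v):
        L = dict(lemmas)
        while L:
            s = Solver()
            s.add(And(list(L.values())))                                 # L(u)
            s.add(tr)
            s.add(Or([Not(substitute(f, (u, v))) for f in L.values()]))  # not L(v)
            if s.check() != sat:
                return L                                                 # L is inductive
            m = s.model()
            # drop every lemma whose post-state copy is falsified by the model
            L = {n: f for n, f in L.items()
                 if is_true(m.evaluate(substitute(f, (u, v)), model_completion=True))}
        return L

    print(sorted(eager_mis(lemmas, tr, u, v)))                           # ['even']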
In the lazy approach to MIS computation (e.g., [16,24]), when the set L is
not inductive, the lemmas in the consequent L(v) that are falsified by the model
of L(u) ∧ Tr(u, v) are initially removed only from L(v). The process continues
until for some L′ ⊆ L, L(u) ∧ Tr(u, v) |= L′(v) — notice that the premise still
contains all of the lemmas of L. We refer to such sets L′ as semi-inductive with
respect to L and Tr. Observe that the semi-inductive subset L′ obtained in
this manner is maximal and also maximum, by an argument analogous to that
used to establish the uniqueness of MISes. Once the maximum semi-inductive
subset L′ of L is computed, the lemmas excluded from L′ are removed from
L(u), and the algorithm checks whether L′(u) ∧ Tr(u, v) |= L′(v), i.e., whether
L′ is inductive. If not, the algorithm repeats the process, by first computing a
maximum semi-inductive subset of L′, then checking its inductiveness, and so
on. The eventually obtained inductive subset of L is the MIS of L — this can
be justified in essentially the same way as for the eager approach.
One potential advantage of the lazy approach is that, since, compared to the
eager approach, there are often more lemmas in the premises, the SMT/SAT
solver is likely to work with stronger formulas. Furthermore, if a solver retains
information between invocations — for example, derived facts and history-based
heuristic parameters, as in incremental SAT solvers — more information can be
reused between iterations, thus speeding-up the MIS computation.
One additional feature of the lazy approach, pointed out and used in [24], is
that the computation of semi-inductive subsets can be reduced to the computa-
tion of Minimal Unsatisfiable Subformulas (MUSes), or, more precisely, to the
computation of group-MUSes (recall Definition 1). This observation is particu-
larly useful in cases when satisfiability problem in the theory that defines the
invariants can be soundly reduced to propositional satisfiability, as it allows to
leverage the large body of recent work and tools for the computation of MUSes
(e.g., [2,30,34]). We take advantage of this observation in the implementation of
our framework since, in our case, the invariants are quantifier-free formula over
(a sub-theory of) the theory of bit-vectors, and the satisfiability of such formulas
can be soundly reduced to SAT via bit-blasting. The reduction to group-MUS
computation and the overall MIS extraction flow are presented below.
Computing MISes with Group-MUSes. For a set of lemmas L = {L1 , . . . ,
Ln } and a transition relation formula Tr , we first rewrite the formula L(u) ∧
Tr (u, v)∧¬L(v), used to check the inductiveness of L, as a formula AL,Tr defined
in the following way:
    AL,Tr = ( ⋀_{Li ∈ L} (prei → Li(u)) ) ∧ Tr(u, v) ∧ ( ⋁_{Li ∈ L} (posti ∧ ¬Li(v)) ) ,        (3)

where prei and posti for i ∈ [1, n] are fresh propositional variables, one for each
lemma Li ∈ L. One of the purposes of these variables is similar to that of the
indicator variables used in assumption-based incremental SAT solving (cf. [14])
— the variables can be used to emulate the removal of lemmas from formulas
L(u) and L(v). Setting prei to true (resp. false) causes the lemma Li to be
included (resp. excluded) from L(u), while setting posti to true (resp. false) has
the same effect on the lemma Li in L(v). The names of the variables reflect the
fact that they control either the “precondition” or the “postcondition” lemmas.
With this in mind, a computation of the MIS of L with respect to Tr can
be implemented on top of an incremental SMT solver by loading the formula
AL,Tr into the solver, and checking the satisfiability of the formula under a set
of assumptions. For example, the set L is inductive if and only if the formula
is unsatisfiable under the assumptions ⋃_{i∈[1,n]} {prei, posti}. When a lemma Li ∈ L
needs to be removed from L(u) and/or L(v), we simply assert the formula (¬prei )
and/or (¬posti ) to the solver.
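The sketch below shows the indicator-variable encoding (3) and the assumption-based checks on the same kind of toy counter, with Z3 playing the role of the incremental solver; the lemma set and all names are illustrative.

    from z3 import (And, BitVec, Bool, Implies, Not, Or, Solver, ULT,
                    substitute, unsat)

    u, v = BitVec("u", 8), BitVec("v", 8)
    tr = v == u + 2
    lemmas = [(u & 1) == 0, ULT(u, 100)]       # hypothetical lemma set L

    pre  = [Bool(f"pre_{i}")  for i in range(len(lemmas))]
    post = [Bool(f"post_{i}") for i in range(len(lemmas))]

    s = Solver()
    s.add(And([Implies(pre[i], L) for i, L in enumerate(lemmas)]))        # guarded L(u)
    s.add(tr)
    s.add(Or([And(post[i], Not(substitute(L, (u, v))))                    # guarded not L(v)
              for i, L in enumerate(lemmas)]))

    # L is inductive iff the formula is unsatisfiable with all indicators assumed.
    print(s.check(pre + post) == unsat)        # False: the 'small' lemma is not inductive

    # Emulate removing lemma 1 from both L(u) and L(v) by asserting the negated indicators.
    s.add(Not(pre[1]), Not(post[1]))
    print(s.check([pre[0], post[0]]) == unsat) # True: the remaining lemma is inductive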
However, as explained above, our intention is to take advantage of proposi-
tional MUS extractors, using the fact that quantifier-free bit-vector formulas can
be soundly converted to propositional logic. The pre and post variables serve a
purpose in this context as well. Assume that we have a polytime computable
function B2P , which given a quantifier-free formula FBV over the theory BV,
and a set of propositional variables X = {x1, . . . , xk} that occur in FBV, returns a
propositional formula FProp = B2P(FBV, X), in CNF, with the following property:
for any assignment τ to the variables in X, the formula FBV[τ] is satisfiable
if and only if so is the formula FProp[τ]. Following [29], we say that the formulas
FBV and FProp are var-equivalent on X in this case. Note that var-equivalence
of FBV and FProp on X does not imply that FProp contains all variables of X — for
example, FProp = ⊤ is var-equivalent to FBV if FBV[τ] is satisfiable for every
assignment τ for X.
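As a rough stand-in for B2P, the sketch below pushes a quantifier-free BV goal through Z3's bit-blasting and Tseitin-CNF tactics. An off-the-shelf pipeline like this is not guaranteed to preserve var-equivalence on the indicator variables (simplification may eliminate them), which is why the implementation in Section 4 uses Boolector's front-end and validates the computed invariants independently; the formula below is a toy.

    from z3 import BitVec, Bool, Goal, Implies, Then

    x = BitVec("x", 32)
    pre_0 = Bool("pre_0")
    g = Goal()
    g.add(Implies(pre_0, x + 1 > x))           # a toy guarded lemma (can overflow)

    to_cnf = Then("simplify", "bit-blast", "tseitin-cnf")
    cnf = to_cnf(g)[0]                         # the single resulting subgoal, in CNF
    print(len(cnf), "clauses")
    for i in range(min(5, len(cnf))):          # show a few of the clauses
        print(cnf[i])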
Now, for a set of lemmas L = {L1 , . . . , Ln } and a transition relation Tr
over BV, let AL,Tr be the formula defined in (3), let Pre = {prei | i ∈ [1, n]},
Post = {posti | i ∈ [1, n]}. Consider the group-CNF formula GL,Tr constructed
in the following way:

GL,Tr = G0 ∪ G1 ∪ · · · ∪ Gn , where:
G0 = CL,Tr ∪ {(prei ) | i ∈ [1, n]}, with CL,Tr = B2P (AL,Tr , Pre ∪ Post)
Gi = {(¬posti )} for i ∈ [1, n]

That is, the group G0 of GL,Tr is the formula CL,Tr — a CNF formula var-
equivalent to AL,Tr on the set Pre ∪ Post — together with the positive unit
clauses for pre variables. Each group Gi in GL,Tr consists of a single negative
unit clause for the variable posti .
Proposition 1. Let G′ be a group-MUS of the group-CNF formula GL,Tr. Then,
the set of lemmas L′ = {Lk | k ∈ [1, n] and Gk ∉ G′} is the maximum semi-inductive
subset of L with respect to Tr. Furthermore, G′ = ∅ iff L is inductive.
Intuitively, Proposition 1 follows from the fact that the function B2P preserves
var-equivalence. The formulas AL,Tr and CL,Tr are var-equivalent on the variables
Pre ∪ Post. Thus, any group-MUS G′ of the group-CNF formula GL,Tr
is exactly a group-MUS of the “group-BV” formula obtained by taking AL,Tr
together with the appropriate unit clauses as group-0 and the rest of the groups as
in GL,Tr. Furthermore, whenever a group Gi is included in G′, the corresponding
variable posti is forced to 0, and so the lemma Li(v) is disabled in AL,Tr.
Since G0 ∪ ⋃G′ is unsatisfiable, so is the formula AL,Tr with the rest of the
post-lemmas (i.e., the set L′) enabled, thus implying the semi-inductiveness of
L′. The maximality of the latter is implied by the minimality of G′.
Proof. First, observe that the formula GL,Tr is unsatisfiable. This is because
GL,Tr ≡ CL,Tr[τ], where τ = {prei → 1, posti → 0 | i ∈ [1, n]} is the assignment
entailed by the unit clauses in GL,Tr. Since B2P preserves var-equivalence on
Pre ∪ Post, the formula CL,Tr[τ] is equisatisfiable with the formula AL,Tr[τ]
(cf. (3)), which, in turn, is unsatisfiable since τ sets all post variables to 0.
Let now G′ be a group-MUS of GL,Tr. Since G0 ∪ ⋃G′ is unsatisfiable (recall
Definition 1), so is the formula CL,Tr[τG′], where τG′ = {prei → 1 | i ∈
[1, n]} ∪ {postj → 1 | Gj ∉ G′} ∪ {postk → 0 | Gk ∈ G′}, and, therefore, so is the
formula AL,Tr[τG′]. Note, however, that the latter is equivalent to L(u) ∧ Tr(u, v) ∧
¬L′(v), where L′ is as defined in the statement of the proposition. Hence, L′ is
semi-inductive.
Finally, w.l.o.g. take any G″ ⊂ G′. Since G′ is a group-MUS of GL,Tr, the formula
G0 ∪ ⋃G″ is satisfiable. Following the previous argument with the assignment
τG″, we have that the formula AL,Tr[τG″] is satisfiable, and so is the formula
L(u) ∧ Tr(u, v) ∧ ¬L″(v), where L″ = L′ ∪ {Lk | Gk ∈ G′ \ G″}. We conclude that
any L″ ⊃ L′ is not semi-inductive, and so L′ is maximal.
The “only-if” part of the second claim of the proposition follows immediately
from the first claim. For the “if” part, assume that L is inductive, and let τ
be the assignment that enables all lemmas of L, i.e., τ = {prei → 1, posti →
1 | i ∈ [1, n]}. Then, the formula AL,Tr[τ] is unsatisfiable. Since the post variables
appear in AL,Tr only in positive polarity, changing the value of any of the post
variables to 0 cannot make the formula satisfiable. Thus, for τ′ = {prei → 1 | i ∈
[1, n]} the formula AL,Tr[τ′] is also unsatisfiable, and since B2P preserves var-equivalence,
so is the CNF formula CL,Tr[τ′]. But CL,Tr[τ′] ≡ G0, and so the
group-MUS of GL,Tr is ∅. □

The MIS Computation Algorithm. Based on Proposition 1, we can compute


the maximum semi-inductive subset of the set of lemmas L by invoking any
off-the-shelf group-MUS extractor (e.g., MUSer2 [3]). The post variables are
essential for this reduction, as the translation function B2P can, and in practice
does, significantly modify the structure of the input BV formula through the
application of various BV-specific preprocessing techniques.

Algorithm 2: ComputeMIS for invariants in BV
  Input : (L, Tr) — a set of lemmas and a transition relation, in BV
  Output: L′ ⊆ L — the MIS of L with respect to Tr
  1  construct AL,Tr                        // the BV formula defined in eq. (3)
  2  CL,Tr ← B2P(AL,Tr, Pre ∪ Post)         // compute a var-equivalent CNF
  3  L′ ← L
  4  forever do
  5      construct GL,L′,Tr                 // the group-CNF defined in eq. (4)
  6      G′ ← ComputeGMUS(GL,L′,Tr)         // compute a group-MUS
  7      if G′ = ∅ then                     // L′ is inductive, cf. Prop. 1
  8          return L′
  9      L′ ← {Lk | k ∈ [1, n] and Gk ∉ G′} // remove lemmas included in G′

The purpose of pre variables is slightly more technical. Assume that in the first iteration of the lazy
MIS computation algorithm a maximal semi-inductive set L′ of L is computed,
and that L′ ⊂ L. At this point, some of the lemmas L(u) (i.e., the precondition
lemmas) have to be removed from L. One possibility is to build a new formula
AL′,Tr analogously to that in equation (3), apply the function B2P to it, and
proceed with the computation of the maximum semi-inductive subset of L′. An
alternative is to re-use the CNF formula CL,Tr, obtained by translating the
original formula AL,Tr via B2P, and simply add negative unit clauses (¬prei)
and (¬posti) for each of the lemmas removed from L. This way we avoid re-invoking
B2P, and open up the possibility of reusing more information between
the invocations of the group-MUS extractor. As the group-CNF formula GL,Tr
does need to be modified between iterations by taking into account the removal
of some of the lemmas, for a set L′ ⊆ L of remaining lemmas we define the
group-CNF formula GL,L′,Tr as follows:

    GL,L′,Tr = G0 ∪ {Gi | Li ∈ L′}, where:
    G0 = CL,Tr ∪ {(prei) | Li ∈ L′} ∪ {(¬prej), (¬postj) | Lj ∈ L \ L′}        (4)
    Gi = {(¬posti)} for Li ∈ L′.

Fig. 1. Performance of Z3/PDR and MISper for the target theories BV∗(32) (left) and
BV∗(16) (right) in terms of CPU runtime. Timeout of 1800 seconds is represented by
the dashed (green) lines; orders of magnitude are represented by diagonals.

The pseudocode of the MIS computation algorithm based on the ideas presented
above is given in Algorithm 2. Given a set of BV lemmas L and a
transition relation formula Tr, the algorithm constructs the formula AL,Tr, defined
in (3), and converts the formula to CNF using a var-equivalence preserving
function B2P. The set L′ that will eventually represent the resulting MIS is
initialized to L. The main loop of the algorithm reflects the outer loop of the lazy
MIS computation approach. On every iteration, the maximum semi-inductive
subset of L′ is computed via the reduction to group-MUS computation, as justified
by Proposition 1. If the group-MUS is empty, then, according to Proposition
1, the set L′ itself is inductive and, therefore, based on the correctness of
the lazy MIS computation algorithm, is the MIS of L. Otherwise, L′ is updated
to the computed maximum semi-inductive subset represented by the extracted
group-MUS (line 9). Note that the removal of the lemmas from the premise formula
L(u) performed at this stage during the lazy MIS computation is implicit
in the construction of the group-CNF formula GL,L′,Tr in the next iteration of
the main loop (cf. (4)). The termination of the algorithm is guaranteed by the
fact that on every iteration at least one lemma is removed from L′, and so, in
the worst case, there will be an iteration of the main loop with L′ = ∅. Since, in
this case, L′ is inductive, by Proposition 1 the computed group-MUS will be ∅,
and the algorithm terminates.
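To connect the pieces, here is a sketch of the overall lazy loop that Algorithm 2 implements, on the toy counter from the earlier sketches. The group-MUS call is replaced by a model-guided computation of the maximum semi-inductive subset, so this is a functional stand-in for, not a reimplementation of, the MUSer2-based flow; the lemma set is invented.

    from z3 import (And, BitVec, BoolVal, Not, Or, Solver, ULT,
                    is_true, sat, substitute)

    u, v = BitVec("u", 8), BitVec("v", 8)
    tr = v == u + 2
    lemmas = [(u & 1) == 0, ULT(u, 100), u != 1]

    def lazy_mis(lemmas, tr, u, v):
        L = list(lemmas)
        while True:
            # inner loop: maximum semi-inductive subset of L (weaken post-lemmas only)
            P = list(L)
            while True:
                s = Solver()
                s.add(And(L) if L else BoolVal(True), tr)
                s.add(Or([Not(substitute(f, (u, v))) for f in P]) if P else BoolVal(False))
                if s.check() != sat:
                    break
                m = s.model()
                P = [f for f in P
                     if is_true(m.evaluate(substitute(f, (u, v)), model_completion=True))]
            if len(P) == len(L):
                return L                       # L is inductive: it is the MIS
            L = P                              # drop the excluded lemmas from the pre-state too

    print(lazy_mis(lemmas, tr, u, v))          # keeps the two inductive lemmas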

4 Implementation and Empirical Evaluation


Our prototype implementation of the MISper framework was instantiated with the
restriction BV∗ of the theory of bit-vectors, described in Example 1, as the target
theory TT , and the theory of linear arithmetic LA as working theory TW . The
mappings MT →W and MW →T between the theories are as described in Exam-
ple 1. We used Z3/PDR engine for the implementation of the Safety(BV) and
Safety(LA) procedures. The Horn SMT systems used as an input to Z3/PDR
were obtained from the UFO framework. To check the safety and the inductive-
ness of the candidate invariants Cand in BV we used the bit-vector engine of
Z3. To perform var-equivalent translation of BV formulas to CNF during in-
variant extraction (function B2P in Algorithm 2) we used the front-end of the
SMT(BV) solver Boolector [6]. Though we were unable to formally establish
the var-equivalence of the translation, we validated the inductiveness of com-
puted invariants independently. Finally, we used the MUS extractor MUSer2 to
compute group-MUSes in Algorithm 2.

Experimental Setup and Results. To evaluate the performance of the pro-


posed framework empirically we selected 214 bit-precise verification benchmarks
Table 1. Performance of Z3/PDR and MISper for the target theories BV∗ (32) and
BV∗ (16). Within each horizontal section, the first row (all) presents the data for all
214 instances, while the second row (unsol.) presents the data for those instances that
were not solved by Z3/PDR. “Solved” means that the tool returned SAFE within the
timeout/memout of 1800 sec/4 GB. Column Z3/PDR shows the data for Z3/PDR —
each cell contains the number of solved instances (#sol), followed by the average and the
median of the CPU times on the solved instances (avg/med). Column MISper displays
the same data for MISper. Column MISper:Cand displays the data for instances solved
by MISper by proving the safety of the candidate invariant Cand (Alg. 1, line 12).
Column MISper:MIS displays the data for instances solved by MISper by computing
MIS of Cand , and invoking Z3/PDR on strengthened system (Alg. 1, lines 13-14). For
example, the first row in the table shows that out of 214 instances, Z3/PDR solved 116,
while MISper solved 174, out of which 165 were solved immediately after the conversion
of LA invariant to BV∗ (32), and 9 were solved after extracting invariants.
Bit width | Inst.  | Count | Z3/PDR #sol(avg/med) | MISper #sol(avg/med) | MISper:Cand #sol(avg/med) | MISper:MIS #sol(avg/med)
32        | all    | 214   | 116(127.54/8.27)     | 174(28.34/0.43)      | 165(8.50/0.42)            | 9(391.95/133.94)
32        | unsol. | 98    | —                    | 58(75.90/1.03)       | 52(21.89/0.70)            | 6(544.05/366.18)
16        | all    | 214   | 165(176.69/8.20)     | 182(69.32/0.38)      | 165(8.37/0.36)            | 17(660.91/399.32)
16        | unsol. | 49    | —                    | 18(624.79/376.24)    | 6(50.80/21.45)            | 12(911.78/1094.58)

from the set of SAFE benchmarks used in 2013 Competition on Software Verifica-
tion, SVCOMP’133. We translated the benchmarks to Horn SMT formulas over
the theories BV∗ (32) and BV∗ (16) (recall Example 1), after replacing the unsup-
ported bit-vector operations by fresh variables — hence, the resulting systems
are an over-approximation of the original programs4. We compared the perfor-
mance of Z3/PDR engine with that of MISper, instantiated with the theory of
linear arithmetic (LA) as a working theory TW . All experiments were performed
on Intel Xeon X3470, 32 GB, running Linux 2.6. For each experiment, we set a
CPU time limit of 1800 seconds, and a memory limit of 4 GB.
The scatter plots in Figure 1, complemented by Table 1, summarize the re-
sults of our experiments. In the 32-bit experiments, MISper solved all 116 instances
solved by Z3/PDR, and an additional 58 on which Z3/PDR exceeded the allot-
ted resources (174 in total). Furthermore, judging from the scatter plot (left),
on the vast majority of instances MISper was at least one order of magnitude
faster than Z3/PDR, and, in some cases, the performance improvement exceeded
three orders of magnitude. The 16-bit benchmarks were, not surprisingly, easier
for Z3/PDR than the 32-bit ones, and so it succeeded in solving significantly more
problems (165). Nevertheless, MISper significantly outperforms Z3/PDR in this
setting as well, solving 17 more benchmarks, and still demonstrating multiple
orders of magnitude performance improvements. We found only one instance
solved by Z3/PDR, but unsolved by MISper (exceeded time limit). To summa-
rize, the results clearly demonstrate the effectiveness of the proposed framework.

3
http://sv-comp.sosy-lab.org/2013.
4
The benchmarks are available at http://bitbucket.org/arieg/misp.
A number of interesting additional observations can be made by analyzing


the data in Table 1. Consider the 58 instances unsolved by Z3/PDR and solved
by MISper in the 32-bit experiments (second row of Table 1). In 52 of these the
safe invariants obtained in LA were transferred directly to BV. Thus, in many
practical cases, while the safety of the program can be easily established without
taking into account its bit-precise semantics, the BV-based engine seems to get
bogged down by discovering information that, in the end, is mostly irrelevant.
In these situations, our approach allows to “find needles in the haystack”, and
quickly. In the remaining 6 cases, the bit-precise semantics do come into play.
However, the MIS-based invariant synthesis allows to transfer information that
is useful for bit-precise reasoning — this is evidenced by at least 3x average
speed-up of bit-precise reasoning on the strengthened system, with close to 6x
speed-up on 3 instances out of 6. The 16-bit experiments confirm the usefulness
of the partially transferred invariants further: out of 18 instances unsolved by
Z3/PDR, only on 6 the LA invariant could be transferred directly to BV, while
on the remaining 12 the partial information allowed us to speed up the verification
by at least 2x on average.

5 Conclusion
In this paper, we introduced a bit-precise program verification framework MISper.
The key idea behind the framework is to transfer, at least partially, information
obtained during the verification of an unsound approximation of the original
program in the form of bit-precise invariants. We described a novel approach to
computing such invariants that takes advantage of state-of-the-art
propositional MUS extractors. The results of the experiments with our proto-
type implementation of the framework suggest that the proposed approach is
promising. Furthermore, the verification tool FrankenBit [20] that integrates our
prototype implementation of MISper with LLBMC [32], has won two awards at
the 2014 Competition on Software Verification (SVCOMP’14).

References
1. Albarghouthi, A., Gurfinkel, A., Chechik, M.: From Under-Approximations to
Over-Approximations and Back. In: Flanagan, C., König, B. (eds.) TACAS 2012.
LNCS, vol. 7214, pp. 157–172. Springer, Heidelberg (2012)
2. Belov, A., Lynce, I., Marques-Silva, J.: Towards efficient MUS extraction. AI Com-
mun. 25(2) (2012)
3. Belov, A., Marques-Silva, J.: MUSer2: An Efficient MUS Extractor. JSAT 8(1/2)
(2012)
4. Beyer, D., Löwe, S., Novikov, E., Stahlbauer, A., Wendler, P.: Precision reuse for
efficient regression verification. In: ESEC/SIGSOFT FSE (2013)
5. Bradley, A.R.: SAT-Based Model Checking without Unrolling. In: Jhala, R.,
Schmidt, D. (eds.) VMCAI 2011. LNCS, vol. 6538, pp. 70–87. Springer, Heidel-
berg (2011)
6. Brummayer, R., Biere, A.: Boolector: An Efficient SMT Solver for Bit-Vectors and
Arrays. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505,
pp. 174–177. Springer, Heidelberg (2009)
7. Bryant, R.E., Kroening, D., Ouaknine, J., Seshia, S.A., Strichman, O., Brady,
B.A.: Deciding Bit-Vector Arithmetic with Abstraction. In: Grumberg, O., Huth,
M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 358–372. Springer, Heidelberg (2007)
8. Chockler, H., Ivrii, A., Matsliah, A., Moran, S., Nevo, Z.: Incremental formal ver-
ification of hardware. In: FMCAD (2011)
9. Cimatti, A., Griggio, A., Schaafsma, B.J., Sebastiani, R.: The MathSAT5 SMT
Solver. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp.
93–107. Springer, Heidelberg (2013)
10. Clarke, E., Kroening, D., Lerda, F.: A Tool for Checking ANSI-C Programs.
In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 168–176.
Springer, Heidelberg (2004)
11. Cordeiro, L., Fischer, B., Marques-Silva, J.: SMT-Based Bounded Model Checking
for Embedded ANSI-C Software. IEEE Trans. Software Eng. 38(4) (2012)
12. de Moura, L., Bjørner, N.: Z3: An Efficient SMT Solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
13. Eén, N., Mishchenko, A., Brayton, R.K.: Efficient implementation of property di-
rected reachability. In: FMCAD (2011)
14. Eén, N., Sörensson, N.: Temporal induction by incremental SAT solving. Electr.
Notes Theor. Comput. Sci. 89(4) (2003)
15. Fedyukovich, G., Sery, O., Sharygina, N.: Function Summaries in Software Up-
grade Checking. In: Eder, K., Lourenço, J., Shehory, O. (eds.) HVC 2011. LNCS,
vol. 7261, pp. 257–258. Springer, Heidelberg (2012)
16. Flanagan, C., Leino, K.R.M.: Houdini, an Annotation Assistant for ESC/Java. In:
Oliveira, J.N., Zave, P. (eds.) FME 2001. LNCS, vol. 2021, pp. 500–517. Springer,
Heidelberg (2001)
17. Ganesh, V., Dill, D.L.: A Decision Procedure for Bit-Vectors and Arrays. In:
Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 519–531.
Springer, Heidelberg (2007)
18. Godlin, B., Strichman, O.: Regression verification. In: DAC (2009)
19. Griggio, A.: Effective word-level interpolation for software verification. In: FMCAD
(2011)
20. Gurfinkel, A., Belov, A.: FrankenBit: Bit-Precise Verification with Many Bits
(Competition Contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014.
LNCS, vol. 8413, pp. 408–411. Springer, Heidelberg (2014)
21. Gurfinkel, A., Rollini, S.F., Sharygina, N.: Interpolation properties and SAT-based
model checking. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172,
pp. 255–271. Springer, Heidelberg (2013)
22. Hoder, K., Bjørner, N.: Generalized Property Directed Reachability. In: Cimatti,
A., Sebastiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 157–171. Springer, Hei-
delberg (2012)
23. Kahsai, T., Ge, Y., Tinelli, C.: Instantiation-Based Invariant Discovery. In: Bobaru,
M., Havelund, K., Holzmann, G.J., Joshi, R. (eds.) NFM 2011. LNCS, vol. 6617,
pp. 192–206. Springer, Heidelberg (2011)
24. Komuravelli, A., Gurfinkel, A., Chaki, S., Clarke, E.M.: Automatic Abstraction
in SMT-Based Unbounded Software Model Checking. In: Sharygina, N., Veith, H.
(eds.) CAV 2013. LNCS, vol. 8044, pp. 846–862. Springer, Heidelberg (2013)
25. Kroening, D., Weissenbacher, G.: Lifting Propositional Interpolants to the Word-
Level. In: FMCAD (2007)
26. Kroening, D., Weissenbacher, G.: Interpolation-Based Software Verification with
Wolverine. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806,
pp. 573–578. Springer, Heidelberg (2011)
27. Kuncak, V., Rybalchenko, A. (eds.): VMCAI 2012. LNCS, vol. 7148. Springer,
Heidelberg (2012)
28. Lahiri, S.K., Hawblitzel, C., Kawaguchi, M., Rebêlo, H.: SYMDIFF: A Language-
Agnostic Semantic Diff Tool for Imperative Programs. In: Madhusudan, P., Seshia,
S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 712–717. Springer, Heidelberg (2012)
29. Lang, J., Liberatore, P., Marquis, P.: Propositional Independence: Formula-
Variable Independence and Forgetting. J. Artif. Intell. Res. (JAIR) 18 (2003)
30. Marques-Silva, J., Janota, M., Belov, A.: Minimal Sets over Monotone Predicates in
Boolean Formulae. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044,
pp. 592–607. Springer, Heidelberg (2013)
31. McMillan, K.L.: Lazy Abstraction with Interpolants. In: Ball, T., Jones, R.B. (eds.)
CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006)
32. Merz, F., Falke, S., Sinz, C.: LLBMC: Bounded Model Checking of C and C++
Programs Using a Compiler IR. In: Joshi, R., Müller, P., Podelski, A. (eds.) VSTTE
2012. LNCS, vol. 7152, pp. 146–161. Springer, Heidelberg (2012)
33. Nadel, A.: Boosting minimal unsatisfiable core extraction. In: FMCAD (2010)
34. Nadel, A., Ryvchin, V., Strichman, O.: Efficient MUS Extraction with Resolution.
In: FMCAD (2013)
PEALT: An Automated Reasoning Tool
for Numerical Aggregation of Trust Evidence

Michael Huth and Jim Huan-Pu Kuo

Department of Computing, Imperial College London


London, SW7 2AZ, United Kingdom
{m.huth,jimhkuo}@imperial.ac.uk

Abstract. We present a tool PEALT that supports the understanding
and validation of mechanisms that numerically aggregate trust evidence
of potentially heterogenous sources. Such mechanisms are expressed in
the policy composition language Peal and subjected to vacuity check-
ing, sensitivity analysis of thresholds, and policy refinement. Verification
code is generated by either compiling away numerical references prior to
constraint solving or by delegating numerical reasoning to Z3, the com-
mon back-end constraint solver of PEALT. The former gives compact
diagnostics but restricts value ranges and may be space intensive. The
latter generates compact verification code, but gives verbose diagnostics,
and may struggle with multiplicative reasoning. We experimentally com-
pare code generation and verification running times of these methods on
randomly generated analyses and on a non-random benchmark modeling
majority voting. Our findings suggest both methods have complementary
value and may scale up well for the analysis of most realistic case studies.

1 Introduction
Trust is a fundamental factor that influences decisions pertaining to human inter-
actions, be they social or economic in nature. Mayer et al. [11] offer a definition of
trust as “... the willingness to be vulnerable, based on positive expectation about
the behavior of others.” These expectations of the trustor would be informed
by trust signals exchanged with the trustee of a planned interaction. Trust has
an economic incentive: it avoids the use of costly measures that guarantee as-
surance in the absence of trust-enabled interaction. We note that assurance is
the established means of realizing “IT security”. Traditionally, trust signals (e.g.
body language) could be observed both in spatial and temporal proximity to
a planned interaction. Modern IT infrastructures, however, disembed agents in
space and in time from such signals and interaction resources, making it hard to
use existing trust mechanics such as those proposed in [17] in this setting [10].
This identifies a need for a calculus in which trust and distrust signals can
be expressed and aggregated to support decision making in a variety of applica-
tions (e.g. financial transactions, software installations, and run-time monitoring
of hardware). In our proposed methodology, signals of trust or distrust have no
effect in their absence but evaluate to a score in their presence. These scores

may be determined by techniques suitable for the types of signals, e.g. machine
learning if signals are features, metrics if signals indicate trustworthiness of IT
infrastructures, etc. This then makes it challenging to devise a calculus for com-
bining scores of different types in a manner that articulates the expectations in
trust-mediated interactions. Let us give some examples of this.
Trust of an individual in an online transaction will depend, amongst other
things, on the monetary value of that transaction, the reputation of the seller,
and contextual information such as recommendations from friends. IT infras-
tructures in highly dynamic and volatile environments such as military operating
theatres can no longer be secured in a binary “secure or insecure” manner. They
have to react to risks in agile manners [1], suggesting the use of compositional
metrics for run-time trust management. Similarly, run-time systems may want
to monitor executing code by measuring signals from execution characteristics –
such as the threat level of parsed input (e.g. input such as meta-data may serve
as an attack surface [19]), the domain of a remote procedure call, etc. – and ag-
gregate such evidence to control execution paths. We refer to [8] for a case study
of such execution control in the Scala programming language, where methods are
annotated with expectation blocks – a precursor of the language Peal [4] – whose
aggregation computes what corresponds to the score of a policy set in Peal.
These examples suggest a trust calculus needs to express evidence that is not
only rooted in trust (e.g. an asset value), needs to be extensible for domain-
specific expressions of signals (e.g. those of a social network), and requires a
means of calculating trust from observed signals (e.g. compositional metrics). In
[4], such a language Peal was proposed in which signals are abstract predicates
whose truth triggers a score, and where score aggregation captures reasoning
about levels of trust. In [4], several analyses were also defined that assess if trust
calculations perform as expected by specifiers. Verification of trust calculations
is thus a key ingredient of such an approach, and the focus of this paper.
We here express the analysis of Peal expressions as constraints that can be an-
alyzed with the SMT solver Z3, and so capture logical dependencies of (dis)trust
signals. Specifically, we refine and extend the language Peal of [4] to support a
richer calculus, we implement analyses proposed in [4] in the SMT solver Z3 on
this richer language via two different methods of automated Z3 code generation
in PEALT, and we experimentally explore the trade-offs of both methods.

Outline of paper. Section 2 contains background on Peal and the SMT solver Z3.
Design and implementation of PEALT are outlined in Section 3. In Section 4, we
describe two methods for converting conditions used in analyses into Z3 input.
The validation of PEALT via experiments and other activities is reported in
Section 5. Section 6 contains related work, and Section 7 concludes the paper.

2 Background

Peal: a Pluggable Evidence Aggregation Language. The syntax of language Peal
is shown in Figure 1. In Peal, a rule rule consists of a predicate or signal qj and
its declared score sj , has no effect if predicate qj is false (no signal), and has
score sj as effect otherwise (signal present). Policies pol have form as in

pi = op ((q1 s1 ) . . . (qn sn )) default s or pi = op () default s (1)

contain zero or more rules, a default score s, and an aggregation operator op.
Policy pi returns default score s if all its rules have false predicates; otherwise
it returns the result of applying op to all scores sj of true predicates qj . The

op ::= min | max | + | ∗


rule ::= if (q) score
pol ::= op (rule∗ ) default score
pSet ::= pol | max (pSet, pSet) | min(pSet, pSet)
cond ::= th < pSet | pSet ≤ th

Fig. 1. Syntax of Peal where q ranges over some language of predicates, and th and
score range over real numbers (potentially restricted by domains or analysis methods)

design of Peal is layered as in [4]. Supported aggregation operators are min (e.g.
for distrust signals), max (e.g. for trust signals), + (e.g. for accumulative sig-
nals), and ∗ (e.g. for aggregating independent probabilistic evidence). Policies
are composed into policy sets (pSet) using max and min. Finally, policy sets are
compared to thresholds th using inequalities in conditions cond. The intuition
is that scores and thresholds are real numbers but that some analysis methods
may constrain the ranges of said values. The latter is one reason why the PEALT
input language under-specifies such design choices. The meaning of policy com-
position is context-dependent. For example, if a condition th < min(pS1, pS2) is
used in support of recommending an action, e.g., then min acts as a pessimistic
composition since the score of any of its arguments may falsify this condition.

SMT solver Z3. Satisfiability modulo theories [5] is supported with robust and
powerful tools that combine the state of the art of deductive theorem proving
with that of SAT solving for propositional logic. The SMT solver Z3 has a
declarative input language for defining constants, functions, and assertions about
them [12]. Figure 2 shows Z3 input code to illustrate that language and its
principal analysis directives. On the left, constants of Z3 type Bool and Real are
declared. Then an assertion defines that the Boolean constant q1 means that x
is less than y + 1, and the next assertion insists that q1 be true. The directives
check-sat and get-model instruct Z3 to find a witness of the satisfiability of
the conjunction of all visible assertions, and to report such a witness (called a
model). On the right, we see what Z3 reports for the input on the left: sat states
that there is a model; other possible replies are unsat (there cannot be a model),
and unknown (Z3 does not know whether or not a model exists).
Left (Z3 input):
(declare-const q1 Bool)
(declare-const x,y Real)
(assert (= q1 (< x (+ y 1))))
(assert q1)
(check-sat)
(get-model)

Right (Z3 output):
sat
(model
  (define-fun y () Real 0.0)
  (define-fun q1 () Bool true)
  (define-fun x () Real 0.0)
)

Fig. 2. Left: Z3 input with directives to find and generate a model. Right: Z3 output
for this input, a model that makes all input assertions true. (Both edited to save space.)

3 Workflow and Input Language of PEALT


The tool is rendered as a web application which accepts analysis declarations.
The declared analyses can be converted to Z3 input code, followed by calling
Z3 and getting feedback on running such code. The tool also allows generation
of random declarations or creation of majority-voting condition instances – the
latter stress test the explicit method for Z3 code generation described below. A
typical workflow of using PEALT would be to generate/write/edit Peal condi-
tions and their analyses, to run these analyses on the Z3 code the tool compiles,
and to study the Z3 output to decide whether further such actions are needed.
Analyses such as different? c1 c2 have keywords ending in ? and list condi-
tions as arguments. Users may specify any number of analyses. Generated Z3
input code will execute each declared analysis in turn using a visibility stack
discipline for assertions, as detailed in Section 4.
Example 1. Figure 3 shows an example of PEALT input that may model trust
perceptions when downloading a software installation and where a non-matching
hash of the download, e.g., is mitigated by the fact that the download was done
in a browser X that may non-maliciously change file signatures in that process.
In the example, both analyses have negative outcome.
Keywords POLICIES etc. divide declarations into sorts: policies, policy sets,
conditions, domain-specific declarations, and analyses. Keyword if is omitted
from rules in PEALT input for sake of succinctness. A simple naming construct
name = expr is used to uniformly bind expressions from the syntactic categories
for policies, policy sets, conditions, and analyses to names that can be referenced
without any scope restrictions. The syntax for policies, policy sets, and condi-
tions is hoped to be intuitive enough given the definition of Peal. Domain-specific
declarations are written in zone DOMAIN SPECIFICS, are expressed directly in Z3
code, and assume that all predicates within rules of declared policies are declared
in Z3 input as Z3 type Bool already.
We implemented two different ways of generating Z3 input code for declara-
tions entered into PEALT: an explicit and a symbolic one, whose details we will
provide below. Intuitively, explicit code generation compiles away any references
to numerical values to capture logically – without loss of arithmetic precision –
the declared analyses; whereas symbolic code generation statically encodes the
POLICIES
b1 = min ((companyDevice 0.1) (uncertifiedOrigin 0.2) (nonMatchingHash 0.2)) default 1
b2 = + ((downloadWithBrowserX 0.1) (useIOS 0.2) (useLinux 0.1) (recentPatch 0.1)) default 0
POLICY_SETS
pSet = min(b1, b2)
CONDITIONS
cond1 = 0.2 < pSet
cond2 = 0.1 < pSet
DOMAIN_SPECIFICS
(declare-const numberOfDaysSinceLastPatch Real)
(assert (= recentPatch (< numberOfDaysSinceLastPatch 7)))
ANALYSES
ana1 = always_true? cond1
ana2 = equivalent? cond1 cond2

Fig. 3. Trust perceptions of software download in PEALT, with two analyses

operational semantics of Peal through use of numerical declarations in order for

Z3 to be able to reason about all possible dynamic settings. Z3 code genera-
tion may produce an exponential blow-up in the explicit method whereas the
symbolic one typically finds it harder to reason about multiplication.
Users can specify which code generation method (explicit or symbolic) to use,
whether to just compile Z3 input code, and whether to also run it and display
results. Users also have the option of downloading the generated Z3 code (as
it may be large). For the explicit method, one may just generate results of all
analyses in pretty-printed, minimal form. We don’t offer this for the symbolic
method as its code generation prevents the creation of minimal output models.
PEALT is written in Scala 2.10.2 using the Lift web framework. After converting
Peal declarations into Z3 input code, PEALT interfaces with the SMT solver Z3
(version 4.3.1) by launching it as an external process via Scala’s ProcessBuilder.
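Such an interface can be kept very thin. The sketch below is our own illustration rather than PEALT’s code; it assumes a z3 executable on the PATH and that the installed Z3 accepts the -T option for a hard timeout in seconds:

import java.nio.file.Files
import scala.sys.process._

object Z3Runner {
  // Writes the generated Z3 input to a temporary file, runs Z3 on it, and
  // returns the solver's textual output (sat/unsat/unknown plus any models).
  def run(z3Input: String, timeoutSecs: Int = 300): String = {
    val tmp = Files.createTempFile("pealt", ".smt2")
    Files.write(tmp, z3Input.getBytes("UTF-8"))
    val out = new StringBuilder
    val logger = ProcessLogger(line => out.append(line).append('\n'),
                               line => out.append(line).append('\n'))
    // Exit code is not needed here; the output is captured by the logger.
    val exitCode = Seq("z3", s"-T:$timeoutSecs", tmp.toString) ! logger
    Files.deleteIfExists(tmp)
    out.toString
  }
}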

4 Z3 Code Generation

Our tool only generates code for conditions that are used: i.e., that are declared
in the input panel and occur in at least one declared analysis as argument.
Let c1 be the declared name of such a condition for declaration c1 = cond. We
generate Z3 code that declares c1 as Z3 type Bool and adds an assert statement
that binds the name c1 to φ[cond] via (assert (= c1 φ[cond])) where φ[cond] is
Z3 code for the logical formula generated for condition cond.
The code generated for φ[cond] explicitly or implicitly lists all signal scenarios
that may occur if we ignore any logical dependencies between signals. This means
that we delegate to our analysis backend, the Z3 SMT solver, the task of only
generating scenarios in analyses that are also logically feasible. We now describe
two methods for generating Z3 code for φ[cond], starting with the explicit one.

Explicit code generation. For sake of succinctness, we state φ[cond] here as a
formula of propositional logic over predicates and not as Z3 input. The definition
of φ[cond] is given by structural induction over the policy set argument in cond,
as shown in Figure 4. In the first four equations, min and max compositions
of policy sets create disjunctions or conjunctions of simpler code generation
problems, depending on the type of inequality in cond. The first equation, for
example, expresses that the minimum of (the score of) two policy sets is less
than or equal to a threshold iff that is the case for one of these two policy sets.
The next four equations define auxiliary predicates Q1 to Q4 that we can
use to specify the remaining cases of conditions that involve only a sole policy.
All such conditions first generate the code context for the non-default case:
in (6), the default score of the sole policy in the condition is compatible with the
inequality. Therefore, we generate a disjunction whose first disjunct captures
the default case when all predicates of all rules are false, and whose second
disjunct captures the non-default case. In (7), the default score of the sole policy
is incompatible with the inequality of its condition cond and so only the non-
default case may apply. Therefore, we generate a conjunction that forces at least
one predicate and the formula generated for the non-default case to be true.
It remains to describe the code generation for the non-default case φ^ndf_op[cond]:
in (8), code generation of φ^ndf_op[cond] adds a top-level negation and reverts the
condition type when Q3 holds – where dual(pol ≤ th) equals th < pol and
dual(th < pol) equals pol ≤ th. This means that we only have to deal with
the same inequality type in the remaining cases, which enumerate scenarios. The
enumeration process for max and min in (9) is clear. For example, φ^ndf_max[th < pol]
is a disjunction of all predicates in pol whose scores are strictly larger than th.
The code generation in (10) applies to conditions pol ≤ th for ∗ policies
pol, and conditions th < pol for + policies pol. In these cases, we enumerate
all minimal scenarios of present signals that make the condition true. These
scenarios are minimal in that any smaller subset of present signals won’t make
the condition true. The code therefore generates a disjunction of monomials
where each monomial describes such a minimal scenario. Concretely, as + is
monotone and the inequality is th < pol, we only need to generate minimal
index sets X such that the sum of all si with i in X is above th. These X are
the elements of set M+ which is computed by enum+ in Figure 5. The Boolean
guard in the while-loop of enum+ makes use of the partial sums ti to ensure that
recursive calls to enum+ are only made when they will still enumerate at least
one new element of M+ . The correctness proof for enum+ is straightforward:
all such minimal index sets X are generated in some recursive execution path
(completeness), and all enumerated index sets are indeed minimal (soundness,
which requires the scores to be sorted in ascending order). Algorithm enum∗
enumerates all minimal scenarios in the case of a ∗ policy in pol ≤ th and
is dual to enum+ : it reverts all inequalities for th, lists scores in descending
order, and therefore retains the requirement to compute minimal index sets.
The correctness proof for enum∗ is that for enum+ modulo that duality.
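For concreteness, here is our own Scala transcription of enum+ (the pseudo-code appears in Figure 5); it reads the partial sum tj as s0 + · · · + sj over scores sorted in ascending order. Called on the scores of policy b2 and the threshold of cond1 from Figure 3, it yields the four index sets corresponding to the four monomials in the cond1 assertion of Figure 6:

object EnumPlus {
  // Scores s(0), ..., s(n-1) sorted in ascending order; prefix(j) = s(0)+...+s(j).
  // Emits every minimal index set X whose scores sum to strictly more than th.
  def enumPlus(s: IndexedSeq[Double], th: Double): List[Set[Int]] = {
    val prefix = s.scanLeft(0.0)(_ + _).tail
    val out = scala.collection.mutable.ListBuffer[Set[Int]]()

    def go(x: Set[Int], acc: Double, index: Int): Unit =
      if (th < acc) out += x               // crossed the threshold: X is minimal
      else {
        var j = index - 1
        // Recurse only while indices 0..j could still push acc above th.
        while (j >= 0 && th < acc + prefix(j)) {
          go(x + j, acc + s(j), j)
          j -= 1
        }
      }

    go(Set.empty, 0.0, s.length)
    out.toList
  }

  def main(args: Array[String]): Unit =
    // Scores of b2 (sorted ascending) and threshold 0.2 of cond1 from Figure 3.
    println(enumPlus(Vector(0.1, 0.1, 0.1, 0.2), 0.2))
}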
Let us discuss what restrictions use of this explicit code generation imposes on
the PEALT input language. It requires that all scores within ∗ policies be within
[0, 1] so that ∗ is anti-tone; that all scores within + policies be non-negative to
get a correct interpretation of minimal index sets in enum+ ; whereas scores
within max and min policies may be any real numbers, since the inequalities
φ[min(pS1, pS2) ≤ th]  ≝  φ[pS1 ≤ th] ∨ φ[pS2 ≤ th]                                      (2)
φ[max(pS1, pS2) ≤ th]  ≝  φ[pS1 ≤ th] ∧ φ[pS2 ≤ th]                                      (3)
φ[th < min(pS1, pS2)]  ≝  φ[th < pS1] ∧ φ[th < pS2]                                      (4)
φ[th < max(pS1, pS2)]  ≝  φ[th < pS1] ∨ φ[th < pS2]                                      (5)

Q1(pol, cond)  ≝  (s ≤ th ∧ cond = pol ≤ th) ∨ (th < s ∧ cond = th < pol)
Q2(pol, cond)  ≝  (th < s ∧ cond = pol ≤ th) ∨ (s ≤ th ∧ cond = th < pol)
Q3(op, cond)   ≝  (op ∈ {+, max} ∧ cond = pol ≤ th) ∨ (op ∈ {∗, min} ∧ cond = th < pol)
Q4(op, cond)   ≝  (op = ∗ ∧ cond = pol ≤ th) ∨ (op = + ∧ cond = th < pol)

φ[cond]  ≝  (¬q1 ∧ · · · ∧ ¬qn) ∨ φ^ndf_op[cond]            (when Q1(pol, cond) is true)   (6)
φ[cond]  ≝  (q1 ∨ · · · ∨ qn) ∧ φ^ndf_op[cond]              (when Q2(pol, cond) is true)   (7)

φ^ndf_op[cond]  ≝  ¬φ^ndf_op[dual(cond)]                    (when Q3(op, cond) is true)    (8)
φ^ndf_max[th < pol]  ≝  ⋁_{i | th < si} qi          φ^ndf_min[pol ≤ th]  ≝  ⋁_{i | si ≤ th} qi    (9)
φ^ndf_op[cond]  ≝  ⋁_{X ∈ Mop} ⋀_{i ∈ X} qi                 (when Q4(op, cond) is true)    (10)

Fig. 4. Explicit code generation (recursively): pol has form as in (1); predicates Q1 to
Q4 drive the compilation logic; the computation of sets Mop is detailed in Figure 5

enum+(X, acc, index, op) {
  if (th < acc) { output X; }
  else {
    j = index − 1;
    while ((0 ≤ j) ∧ (th < op(acc, tj))) {
      enum+(X ∪ {j}, op(acc, sj), j, op);
      j = j − 1; } } }

enum∗(X, acc, index, op) {
  if (acc ≤ th) { output X; }
  else {
    j = index − 1;
    while ((0 ≤ j) ∧ (op(acc, tj) ≤ th)) {
      enum∗(X ∪ {j}, op(acc, sj), j, op);
      j = j − 1; } } }

Fig. 5. Algorithm enum+ (top) computes M+ where scores si are sorted in ascending
order; algorithm enum∗ (bottom) computes M∗ where si are sorted in descending order.
Initial call context is ({}, 0, n, +) for enum+ and ({}, 1, n, ∗) for enum∗.

in (9) have the intended meaning for all sign combinations. Z3 code generated
for the PEALT input in Figure 3 is shown in Figure 6. PEALT uses the push
and pop directives of Z3 in order to add constraints specific to an analysis onto
the top of the assertion visibility stack that Z3 maintains, and to discharge these
assertions before turning to the next analysis. The Z3 code generated for analyses
is verbatim the same for the symbolic code generation to which we turn next.
(declare-const recentPatch Bool)


(declare-const useLinux Bool)
(declare-const uncertifiedOrigin Bool)
(declare-const companyDevice Bool)
(declare-const downloadWithBrowserX Bool)
(declare-const useIOS Bool)
(declare-const nonMatchingHash Bool)
(declare-const cond2 Bool)
(declare-const cond1 Bool)
(assert (= cond1 (and (or (and (not companyDevice) (not uncertifiedOrigin) (not nonMatchingHash))
(not (or companyDevice uncertifiedOrigin nonMatchingHash)))
(and (or downloadWithBrowserX useIOS useLinux recentPatch)
(or (and useIOS recentPatch) (and useIOS useLinux) (and useIOS downloadWithBrowserX)
(and recentPatch useLinux downloadWithBrowserX))))))
(assert (= cond2 (and (or (and (not companyDevice) (not uncertifiedOrigin) (not nonMatchingHash))
(not companyDevice)) (and (or downloadWithBrowserX useIOS useLinux recentPatch)
(or useIOS (and recentPatch useLinux) (and recentPatch downloadWithBrowserX)
(and useLinux downloadWithBrowserX))))))

(echo "Result of analysis [ana1 = always_true? cond1]:")


(push)
(declare-const always_true_ana1 Bool)
(assert (= always_true_ana1 cond1))
(assert (not always_true_ana1))
(check-sat)
(get-model)
(pop)

(echo "Result of analysis [ana2 = equivalent? cond1 cond2]:")


(push)
(declare-const equivalent_ana2 Bool)
(assert (= equivalent_ana2 (or (and cond1 (not cond2)) (and (not cond1) cond2))))
(assert equivalent_ana2)
(check-sat)
(get-model)
(pop)

Fig. 6. Explicitly generated code for input from Figure 3 (hand edited to save space)

Symbolic code generation. This method also binds the name c1 of declaration
c1 = cond to its condition via (assert (= c1 φ[cond])). But for each policy pi
occurring in cond, it also declares a constant cond_pi of Z3 type Bool and then
generates φ[cond] as a positive Boolean formula over the constants cond_pi. This
process follows the same logic as for explicit code generation in (2) to (5). For
each declared constant cond_pi of Z3 type Bool, it then adds an assert state-
ment (assert (= cond_pi φ[cond_pi])) that defines the meaning of cond_pi.
For policies pi of form as in (1), the code generated is similar to the one of the
explicit method when op equals min or max – we refer to [7] for further details.
Let op equal ∗ or + and policy pi occur in at least one condition within some
declared analysis. Then the code generation for φ[cond_pi] in Figure 7 trades off
the space complexity of enumerating elements in M+ and M∗ with the time
complexity of solving real-valued inequalities in the Z3 SMT solver. For each
predicate qj within pi, we declare a constant p_i_score_q_j of Z3 type Real, and
add two assertions that, combined, model that the value of p_i_score_q_j is sj iff
qj is true, and that this value equals the unit of + (respectively, ∗) iff qj is
false. This means that we can precisely model the effect of the non-default case
(when at least one qj is true) by aggregating all values p_i_score_q_j with op, and
by comparing that aggregated result to the threshold in the specified manner
(< or ≥). Crucially, the values of p_i_score_q_j for predicates that happen to be
false won’t contaminate this aggregated value as they are units for operator op.
The encoding for symbolic code generation is therefore linear in the size of
cond. Using this encoding, we can now express φ[cond_pi] in Z3 by directly
encoding the “operational” semantics of cond_pi: either the default score satisfies
the inequality and all policy predicates are false, or at least one policy predicate is
true and the aggregation of all values p_i_score_q_j with op satisfies the inequality.
These Z3 declarations and expressions are stated in Figure 7.

(declare-const p_i_score_q_j Real)
(assert (implies q_j (= s_j p_i_score_q_j)))
(assert (implies (not (= <unit> p_i_score_q_j)) q_j))

(or (and (cop th s) (not (or q_1 ... q_n)))
    (and (or q_1 ... q_n)
         (cop th (op p_i_score_q_1 ... p_i_score_q_n))))

Fig. 7. Top: declarations for p_i_score_q_j, where s_j denotes sj and <unit> is 0.0 for +
policies pi and 1.0 for ∗ policies pi. Bottom: Z3 code for φ[cond_pi] for the first case
in (1); the comparison operator cop is < for th < pi and ≥ for pi ≤ th, and th denotes the
threshold th.

The symbolic code generation described above imposes no restrictions on the
ranges of scores si. PEALT allows us to replace si with an arithmetic expression
such as any real numbers c, real variables x, or products thereof (c · x).
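To make the shape of the symbolically generated code concrete, the following Scala sketch (our own illustration, not PEALT’s generator) emits the Figure 7 declarations and assertion for a condition th < p over a single + policy p. It assumes that the rule predicates and the Bool constant cond_<name> are declared elsewhere, and that p has at least two rules:

object SymbolicGen {
  case class Rule(pred: String, score: Double)
  case class PlusPolicy(name: String, rules: List[Rule], default: Double)

  def genCond(p: PlusPolicy, th: Double): String = {
    val sb = new StringBuilder
    val scoreVars = p.rules.map(r => s"${p.name}_score_${r.pred}")
    for ((r, v) <- p.rules.zip(scoreVars)) {
      // v carries the rule's score iff its predicate holds, and the unit of +
      // (0.0) otherwise, so false predicates cannot contaminate the sum.
      sb ++= s"(declare-const $v Real)\n"
      sb ++= s"(assert (implies ${r.pred} (= ${r.score} $v)))\n"
      sb ++= s"(assert (implies (not (= 0.0 $v)) ${r.pred}))\n"
    }
    val preds = p.rules.map(_.pred).mkString(" ")
    val sum   = scoreVars.mkString("(+ ", " ", ")")
    // Default case: no predicate holds and the default score is above th.
    // Non-default case: some predicate holds and the aggregated score is above th.
    sb ++= s"(assert (= cond_${p.name} (or (and (< $th ${p.default}) (not (or $preds))) (and (or $preds) (< $th $sum)))))\n"
    sb.toString
  }
}

For the policy b2 of Figure 3 this would produce four score declarations, eight implications, and one assertion binding cond_b2, which illustrates the linear size of the encoding.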

Analyses. Analysis implies? checks whether the first condition logically implies
the second one, which is a form of policy refinement. Analyses always false?
and satisfiable? are “equivalent” but capture different intent of the user, ditto
for analysis equivalent? versus analysis different?. A typical use of analysis
different? is to check whether conditions differ for 0.5 < pSet and 0.6 < pSet,
i.e. whether pSet is sensitive to the increase of threshold value from 0.5 to 0.6.

Specification of domain specifics. Users may add domain-specific constraints or
knowledge as Z3 code within zone DOMAIN SPECIFICS: e.g. to declare variables
with which one can then define the exact meaning of predicates used in rules
(e.g. as a means of adding parameters to signals), to encode required properties
of the modeling domain, and to perhaps add assertions that guide the search of a
model of some analysis. The use of raw Z3 code means that any code generation
method will simply copy and paste this code into the generated Z3 input code.
We realize that our decision to automatically generate Z3 declarations of all
variables occurring in rules might confuse novice users, though, when they try
to declare these as Z3 types within zone DOMAIN SPECIFICS explicitly.

Witness generation. For each declared analysis, Z3 will try to decide it when
running PEALT. If the Z3 output is unsat, then we know that there is no witness
to the query – e.g. for always true? this would mean that Z3 decides that the
condition cannot be false, and so the answer is “yes, always true”. If the Z3
output is sat, then we report the correct answer (e.g. for always true? we say
“no, not always true”) and generate supporting evidence for this answer. For
explicit code generation, the generated models tend to be very short (few crucial
truth values of predicates qi and supporting values of variables used to define
these qi if applicable). PEALT can post-process this raw Z3 output to extract
this information in pretty-printed form, an example thereof is seen in Figure 8.
For symbolic code generation, models list truth values for almost all declared
predicates qi that occur in at least one ∗ or + policy. This seems to stem
from the assertions we declare for variables p_i_score_q_j in Figure 7.
We mean to investigate how to shorten such evidence in future work.
Result of analysis [ana1 = always_true? cond1]
cond1 is NOT always true
For example, when useLinux is true, recentPatch is true,
nonMatchingHash is true, companyDevice is false

Fig. 8. Sample of pretty printed evidence for satisfiability witness computed from ex-
plicitly generated code for always true? from Figure 3 (hand edited to save space)
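The post-processing itself amounts to filtering the Boolean define-fun entries out of the raw (get-model) output. A rough Scala sketch of such a pass (our own illustration), assuming the one-definition-per-line output format shown in Figure 2, might be:

object ModelPrinter {
  // Matches lines such as "(define-fun q1 () Bool true)" in Z3's model output.
  private val BoolDef = """\(define-fun\s+(\S+)\s+\(\)\s+Bool\s+(true|false)\s*\)""".r

  def booleanAssignments(z3Output: String): Map[String, Boolean] =
    BoolDef.findAllMatchIn(z3Output)
      .map(m => m.group(1) -> (m.group(2) == "true"))
      .toMap

  // Renders a witness in the style of Figure 8.
  def pretty(z3Output: String): String = {
    val assigns = booleanAssignments(z3Output)
    if (assigns.isEmpty) "no Boolean assignments found"
    else "For example, when " + assigns.map { case (p, v) => s"$p is $v" }.mkString(", ")
  }
}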

Execution constraints. To summarize, explicit code generation of policies within
analyzed conditions requires that no ∗ policy has scores outside [0, 1] and that no
other policy has negative or non-constant scores. For symbolic code generation,
we only have to ensure that min and max policies have constant scores (negative
ones are allowed), and we mean to lift the latter restriction in future work.

5 Validation
We report experimental results for code generation methods and execution of
generated code on random and non-random analyses. We also discuss other tool
validation activities we conducted. All experiments were run on a test server
with two 6-core Intel E5 CPUs running at 2.5 GHz and 48 GB of RAM.

Non-random benchmark. We use condition 0.5 < pmv(n) with + policy pmv(n) ,
default score 0, and n many rules each with score 1/n. The condition is true
when more than half of the predicates are true (“majority voting”). There are
no logical dependencies of predicates in pmv(n) and the size of M+ is exponential
in n. We can explicitly generate Z3 input code for values of n up to 27 (at which
point the code takes up half a gigabyte), and code generation already takes more than
five minutes for n = 23. By comparison, we could symbolically generate such code and
verify that this condition is true, within five minutes each, for n up to 49408.
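The blow-up of the explicit method on this benchmark is easy to quantify: with every score equal to 1/n and threshold 0.5, the minimal index sets are exactly the sets of ⌊n/2⌋+1 predicates, so |M+| equals the binomial coefficient C(n, ⌊n/2⌋+1). The short Scala check below (our own illustration) computes this count; at n = 27 it already exceeds twenty million monomials, consistent with the half gigabyte of explicitly generated code reported above.

object MajorityBlowup {
  // |M+| for the majority-voting condition 0.5 < pmv(n) with scores 1/n:
  // every set of floor(n/2)+1 predicates is a minimal index set.
  def monomials(n: Int): BigInt = {
    val k = n / 2 + 1
    (1 to k).foldLeft(BigInt(1))((acc, i) => acc * (n - k + i) / i)
  }

  def main(args: Array[String]): Unit =
    for (n <- List(10, 20, 23, 27)) println(s"n = $n: ${monomials(n)} monomials")
}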

Randomly generated analyses. We also implemented a feature

randPeal n, mmin, mmax, m+, m∗, p, th, δ

that randomly generates a policy set pSet, two conditions th < pSet and th +
δ < pSet and analyses the first one with always true?, the second one with
always false?, and then applies different? to both conditions. Predicates are
randomly selected from a pool of p many predicates (with n ≤ p). Scores are
chosen from [0, 1] uniformly at random. In pSet, there are n policies for each
operator op of Peal (i.e. 4n policies in total) and each op policy has mop many
rules. For the maximal k with 2k ≤ 4n, we combine 2k policies using alternating
max and min compositions on their full binary parse tree; the result is further
composed with the remaining 4n−2k policies (if applicable) by grouping these in
min pairs, and by adding these pairs in alternating min and max compositions
to the binary policy tree. This stress tests policy composition above and beyond
what one would expect in practical specifications.
We then conducted three experiments that share an execution and termination
logic: experimental input to randPeal has only one degree of freedom and we use
unbounded binary search to see (within granularity of 10 and for five randomly
generated condition pairs) whether both code generation methods can generate
Z3 code within five minutes, and whether Z3 can perform each analysis within
that same time frame. If this fails for one of these condition pairs, we stop binary
expansion and go to a bisection mode to find the boundary.
Experiment 1 picks for operator min input headers 1, x, 1, 1, 1, 3x, 0.5, 0.1 so
it explores how many (x) rules a sole min policy can handle within five minutes.
The same evaluation is done for the other three operators. We also investigated
a variant of this experiment – Exp 1 (DS) – for which we also add as many
assertions as there are declared predicates in the conditions, as described in
[7]. This uses a function calledBy that models method call graphs with at
most one incoming edge (using a forall axiom in Z3 code) and declares a
third of these predicates to mean that a specific method called. The other two
thirds define predicates as linear inequalities between real, respectively integer,
variables (which may stem from method input headers) – please see [7] for details.
Experiment 2 picks for operator min the input headers n, c, 1, 1, 1, 3c, 0.5, 0.1
where c equals x/10 for the boundary value of x found in Experiment 1. We here
explore how many min policies we can handle for a sizeable number of rules.
The same evaluation is done for the other three operators. Experiment 3 picks
for operator min input headers n, n, 1, 1, 1, 3n, 0.5, 0.1 so that we explore how
many (the n) min policies with the same number of rules we can handle within
five minutes. The same evaluation is done for the other three operators.
Results of these experiments are displayed in Figure 9. In their discussion we
need to recognize that random analyses can have very different analysis times
for the same configuration type. So a termination “boundary” does not mean
that we cannot verify larger instances within five minutes; it just means that we
encountered an instance at the reported boundary that took longer than that.
In the first experiment, Z3 code generation seems faster than execution of
that Z3 code. We also see that up to two million rules can be handled for min
and max for both code generation methods within two minutes. For ∗, explicit
code generation seems to be one order of magnitude better than symbolic code
generation, although the Z3 execution in the latter case appears to be faster.
For +, on the other hand, symbolic code generation now seems to be an order
Exp 1 ex min sy min ex max sy max ex * sy * ex + sy +
rules 1867904 1802240 2101248 2162688 120 16 144 5784
code 26s 20s 32s 22s 5s 0.1s 14s 0.6s
Z3 110s 181s 74s 132s 48s 3s 72s 133s

Exp 1 (DS) ex min sy min ex max sy max ex * sy * ex + sy +
rules 8064 6280 6544 7240 136 16 128 1848
code 0.9 0.8 0.8s 0.8 8s 0.1s 1s 1s
Z3 133s 88s 136s 150s 60s 14s 40s 91s

Exp 2 ex min sy min ex max sy max ex * sy * ex + sy +
pol,rul 48,186790 56,180224 40,210124 56,216268 65888,12 4192,2 17488,14 24,578
code 264s 76s 169s 87s 279s 84s 277s 0.8s
Z3 time 438s 205s 44s 249s 4s 108s 2s 160s

Exp 3 ex min sy min ex max sy max ex * sy * ex + sy +
pol=rul 2128 2552 2136 2936 88 16 96 160
code 271s 71s 293s 99s 85s 0.2s 160s 1s
Z3 8s 63s 8s 120s 17s 144s 26s 23s

Fig. 9. Experimental results: columns show code generation method (“ex”plicit or
“sy”mbolic) and operator; rows show number of rules for policies of chosen operator in
analyses, time (rounded to seconds) to generate Z3 code, and time to execute Z3 code

of magnitude better than the explicit one – handling thousands of rules in just
over two minutes. When we add the domain-specific constraints in Exp 1 (DS),
we notice that min and max can only handle about seven-thousand rules in a
similar amount of time (compared to two million beforehand). The results for ∗
for both methods and for + for explicit code generation seem about the same
as without domain-specific constraints. But + can now only handle fewer than
two thousand rules for symbolic code generation. In the second experiment, the
number of rules used for max and min is about two-hundred thousand. We can
deal with about fifty policies with that many rules within five minutes, noting
that code generation now takes more time. It is noteworthy that explicit code
generation can handle over sixty-thousand ∗ policies with 12 rules each, but that
this drops to less than twenty-thousand + policies; the symbolic approach does
not scale that well in comparison. In the third experiment, both methods can
handle between two to three thousand policies with that many rules for max
and min. For operators ∗ and +, the explicit method spends most of its time
in code generation whereas the symbolic one spends the bulk of its time in Z3
execution. For operator ∗, explicit code generation is still about an order of
magnitude better whereas for + it is not significantly better.
Ideally, we would like to extend these experiments to larger data points. But
such an attempt quickly reaches the memory boundary of our powerful server in
explicit code generation. We also believe that practical case studies would not
use more than a few dozen or hundreds of rules for each + and ∗ policy declared,
and so both approaches may actually work well then.

Software validation and future work. We have not yet encountered a Z3 output
unknown for PEALT analyses, although this is easy to achieve by adding complex
constraints as domain specifics. We validated both code generation methods
by running them side by side on randomly generated analyses and checking
whether they would produce conflicting answers (unsat and sat). During the
development of PEALT, we encountered a few of these conflicts which helped to
identify implementation bugs. Of course, this does not mean that we proved the
correctness of our Z3 code generator (written in Scala), and doing so would be
unwise as this generator will evolve with the tool language. Therefore, we want
to independently verify the evidence computed by Z3, in future work. This will
also verify that no double rounding errors in Z3 corrupted analysis outcomes.
In future work, we also want to understand whether we can construct proofs for
outputs unsat such that these proofs are meaningful for the analyses in question.

6 Related Work

The language in Figure 1 extends that in [4]: it supports policies without rules,
∗ policies, negative and non-constant scores for symbolic code generation, and
logical dependencies of predicates qi within PEALT. The symbolic code genera-
tion in PEALT uses the same enumeration process for + and ∗ on minimal index
sets (and not maximal ones as in [4]). PEALT implements most analyses of [4]
with logical dependencies, leaving more complex ones of [4] for future work.
The determination of scores is a fundamental concern in our approach, and
where PEALT is meant to provide confidence in such scorings and their implica-
tions. The process of arriving at scores depends on the application domain, we
offer two examples thereof from the literature. TrustBAC [3] extends role-based
access control with levels of trust, scores in [−1, 1], that are bound to roles in
RBAC sessions. These levels are derived from a trust vector that reflects user be-
havior, user recommendations, and other sources. No analysis of these levels and
their implications is offered. In [16], we see an example of how a sole score may
reflect the integrity of an information infrastructure, as a formula that accounts
for known vulnerabilities, threats that can exploit such vulnerabilities, and the
likelihood for each vulnerability to exist in the given infrastructure. We should
keep in mind that any such metrics are heuristics, and so it is important to an-
alyze their impact on decision making, especially if other factors also influence
such decisions. PEALT allows us, in principle, to conduct such analyses. Extant
work enriches security elements with quantities, e.g. credential chains [18], secu-
rity levels [15], trust-management languages [2], reputation [9], and combinations
of reputation and trust [13,14]. But we are not aware of substantial tool support
for analyzing the effect of such enrichments when combined with other aspects
of evidence. Shinren [6] offers the ability to reason about both trust and distrust
explicitly and in a declarative manner, with the support of priority composition
operators for layers of trust and distrust. Although Peal is in principle expressive
enough to encode most of this functionality, doing so would not constitute good
engineering practice: this is a good example for when conditions of Peal would
be expressions to be composed in upstream languages such as Shinren.

7 Conclusions
We have created a tool PEALT in which one can study different mechanisms of
aggregating numerical trust evidence. We extended the policy-composition lan-
guage Peal of [4] and modified the generation of verification conditions reported
in [4] for Peal conditions to make them dischargeable with an SMT solver. We
proposed two different means of generating such verification conditions and dis-
cussed both conceptual and experimental advantages and disadvantages of such
methods. The explicit method compiles away any references to numerical values
and so arrives at a purely logical formulation. The price for this may be an explo-
sion in the length of the resulting formula and in the restriction of score ranges for
certain policy composition operators (e.g. multiplication). The symbolic method
creates formulas with only linear size in the conditions but shifts the compu-
tational burden to Z3 and its reasoning about linear arithmetic. Both methods
delegate to Z3 logical feasibility checks of trust scenarios discovered in analyses.
Our current PEALT prototype supports verification of policy refinement, vacu-
ity checking, sensitivity analysis of thresholds in conditions, and non-constant
scores (for symbolic code generation) to express metrics. We think PEALT is
a good example of the benefits that can be gained by connecting to a powerful
back-end such as the SMT solver Z3 for analyses. The version of the source code
used in this paper is available on https://bitbucket.org/jimhkuo/pealt.

Acknowledgments. We thank Jason Crampton and Charles Morisset for very
fruitful discussions on PEALT, anonymous reviewers for helpful comments, and
Intel Corporation for funding this work in its Trust Evidence research project.

References
1. Announcement of Cybersecurity Collaborative Research Alliance. Press Release,
US Army Research Laboratory (October 15, 2013)
2. Bistarelli, S., Martinelli, F., Santini, F.: A semantic foundation for trust man-
agement languages with weights: An application to the RT family. In: Rong, C.,
Jaatun, M.G., Sandnes, F.E., Yang, L.T., Ma, J. (eds.) ATC 2008. LNCS, vol. 5060,
pp. 481–495. Springer, Heidelberg (2008)
3. Chakraborty, S., Ray, I.: TrustBAC: integrating trust relationships into the RBAC
model for access control in open systems. In: Proceedings of the Eleventh ACM
Symposium on Access Control Models and Technologies, SACMAT 2006, pp. 49–
58. ACM, New York (2006)
4. Crampton, J., Huth, M., Morisset, C.: Policy-based access control from numerical
evidence. Tech. Rep. 2013/6, Imperial College London, Department of Computing
(October 2013) ISSN 1469-4166 (Print), ISSN 1469-4174 (Online)
5. De Moura, L., Bjørner, N.: Satisfiability modulo theories: introduction and appli-
cations. Commun. ACM 54(9), 69–77 (2011)
6. Dong, C., Dulay, N.: Shinren: Non-monotonic trust management for distributed
systems. In: Nishigaki, M., Jøsang, A., Murayama, Y., Marsh, S. (eds.) IFIPTM
2010. IFIP AICT, vol. 321, pp. 125–140. Springer, Heidelberg (2010)
7. Huth, M., Kuo, J.H.P.: PEALT: A reasoning tool for numerical aggregation of trust
evidence. Tech. Rep. 2013/7, Imperial College London, Department of Computing
(2013) ISSN 1469-4166 (Print)
8. Huth, M., Kuo, J.H.-P.: Towards verifiable trust management for software execu-
tion (extended abstract). In: Huth, M., Asokan, N., Čapkun, S., Flechais, I., Coles-
Kemp, L. (eds.) TRUST 2013. LNCS, vol. 7904, pp. 275–276. Springer, Heidelberg
(2013)
9. Jøsang, A., Ismail, R.: The beta reputation system. In: Proceedings of the 15th
Bled Conference on Electronic Commerce, Bled, Slovenia, June 17-19 (2002)
10. Kirlappos, I., Sasse, M.A., Harvey, N.: Why trust seals don’t work: A study of user
perceptions and behavior. In: Katzenbeisser, S., Weippl, E., Camp, L.J., Volkamer,
M., Reiter, M., Zhang, X. (eds.) TRUST 2012. LNCS, vol. 7344, pp. 308–324.
Springer, Heidelberg (2012)
11. Mayer, R., Davis, J., Schoorman, F.D.: An integrative model of organizational
trust. Academy of Management Review 20(3), 709–734 (1995)
12. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
13. Mui, L.: Computational Models of Trust and Reputation: Agents, Evolutionary
Games, and Social Networks. Ph.D. thesis, Massachusetts Institute of Technology
(2002)
14. Muller, T., Schweitzer, P.: On beta models with trust chains. In: Fernández-Gago,
C., Martinelli, F., Pearson, S., Agudo, I. (eds.) IFIPTM. IFIP AICT, vol. 401, pp.
49–65. Springer, Heidelberg (2013)
15. Ni, Q., Bertino, E., Lobo, J.: Risk-based access control systems built on fuzzy infer-
ences. In: Proceedings of the 5th ACM Symposium on Information, Computer and
Communications Security, ASIACCS 2010, pp. 250–260. ACM, New York (2010),
http://doi.acm.org/10.1145/1755688.1755719
16. Nurse, J.R.C., Creese, S., Goldsmith, M., Rahman, S.S.: Supporting human
decision-making online using information-trustworthiness metrics. In: Marinos, L.,
Askoxylakis, I. (eds.) HAS/HCII 2013. LNCS, vol. 8030, pp. 316–325. Springer,
Heidelberg (2013)
17. Riegelsberger, J., Sasse, M.A., McCarthy, J.D.: The mechanics of trust: A frame-
work for research and design. Int. J. Hum.-Comput. Stud. 62(3), 381–422 (2005)
18. Schwoon, S., Jha, S., Reps, T.W., Stubblebine, S.G.: On generalized authorization
problems. In: CSFW, pp. 202–218. IEEE Computer Society (2003)
19. Shapiro, R., Bratus, S., Smith, S.W.: “Weird Machines” in ELF: A Spotlight on
the Underappreciated Metadata. In: Proceedings of the 7th USENIX Workshop on
Offensive Technologies (WOOT 2013), 12 pages. USENIX (2013)
GRASShopper
Complete Heap Verification with Mixed Specifications

Ruzica Piskac1, Thomas Wies2,⋆, and Damien Zufferey3
1 Yale University, USA
2 New York University, USA
3 MIT CSAIL, USA

Abstract. We present GRASShopper, a tool for compositional verification of
heap-manipulating programs against user-provided specifications. What makes
our tool unique is its decidable specification language, which supports mixing of
assertions expressed in separation logic and first-order logic. The user of the tool
can thus take advantage of the succinctness of separation logic specifications and
the discipline of local reasoning. Yet, at the same time, she can revert to classi-
cal logic in the cases where decidable separation logic fragments are less suited,
such as reasoning about constraints on data and heap structures with complex
sharing. We achieve this combination of specification languages through a trans-
lation to programs whose specifications are expressed in a decidable fragment of
first-order logic called GRASS. This logic is well-suited for automation using sat-
isfiability modulo theory solvers. Unlike other tools that provide similar features,
our decidability guarantees enable GRASShopper to produce detailed counterex-
amples for incorrect or underspecified programs. We have found this feature to be
invaluable when debugging specifications. We present the underlying philosophy
of the tool, describe the major technical challenges, and discuss implementation
details. We conclude with an evaluation that considers challenging benchmarks
such as sorting algorithms and a union/find data structure.

1 Introduction

We present GRASShopper, a new tool for compositional verification of heap manipu-
lating programs against user-provided specifications. GRASShopper takes programs in
a C-like procedural language as input. The tool checks that procedures mutually satisfy
their contracts, that all memory accesses are safe, and that there are no memory leaks.
The unique feature of the input language is that it admits specifications that freely mix
assertions expressed in separation logic and first-order logic.
Separation logic (SL) [19] is an extension of Hoare logic for proving the correct-
ness of heap-manipulating programs. SL assertions specify regions in the heap rather
than the global state of the heap. This distinction from classical logic gives rise to a disci-
pline of local reasoning where the specification of a program fragment C only concerns
C’s footprint, i.e., the portion of memory on which C operates. This approach typically
yields succinct and natural specifications that closely resemble a programmer’s intuition
about program correctness. Separation logic has therefore spawned extensive research
⋆ Supported in part by NSF grant CCS-1320583.

into developing tool support for automated verification of programs against SL specifi-
cations [3,4,9,27]. The cores of such tools are specialized theorem provers for checking
entailments between SL assertions [2, 6, 7, 21]. Much of the work on such provers aims
at decidable fragments of separation logic to guarantee a robust user experience.
Despite the elegance of separation logic, there are certain situations where it is more
appropriate to express specifications in classical logic. This includes, for example, sit-
uations in which data structures exhibit complex sharing or involve constraints about
data, e.g., arithmetic constraints. Reasoning about such constraints is not directly sup-
ported by SL theorem provers. The question is then how to extend these provers without
giving up on decidability and completeness guarantees.
Typically, theory reasoning is realized by using a satisfiability modulo theories (SMT)
solver that is integrated with the SL entailment procedure [5]. However, the interplay
between SL reasoning and theory reasoning is intricate, e.g. equalities inferred by the
theory solvers must be propagated back to the SL solver. Guaranteeing completeness of
such a combined procedure is brittle and often involves the reimplementation of infras-
tructure that is already provided by the SMT solver.
In our previous work, we developed a new approach for checking SL entailments
that reduces to checking satisfiability of formulas expressed in a decidable fragment of
first-order logic [22]. We refer to this fragment as the logic of graph reachability and
stratified sets (GRASS). Formulas in this logic express properties of the structure of
graphs, such as whether nodes in the graph are inter-reachable, as well as properties of
sets of nodes. The combination of these two features enables a natural encoding of the
semantics of SL assertions. The advantage of this approach is that we can now delegate
all reasoning to the SMT solver, exploiting existing infrastructure for combinations [18]
and extensions [25] of first-order theories to handle reasoning about data robustly.
In this paper, we present GRASShopper, a tool which extends our previous work with
support for local reasoning. Inspired by implicit dynamic frames [20, 24], we present a
translation of programs with mixed separation logic and first-order logic specifications
to programs with GRASS specifications. The translation and verification of the resulting
program is fully automated. The key challenge in this approach is to ensure that the en-
coding of SL assertions and the support for local reasoning remains within a decidable
logic. To this end, we present a decidable extension of the GRASS logic that suffices to
express that reachability information concerning heap paths outside the footprint of a
code fragment is preserved by the execution of that code fragment.
We implemented the decision procedure for our extension of GRASS on top of
the SMT solver Z3 [8] and integrated this decision procedure into GRASShopper. We
used the tool to automatically verify list-manipulating programs such as sorting algo-
rithms whose specifications involve constraints on data. We further considered pro-
grams whose specifications are difficult to express in decidable SL fragments alone.
One example is the find operation of a union/find data structure. The postcondition of
this operation must describe a heap region that consists of an unbounded number of list
segments. With our approach we can easily express this postcondition using a quantified
constraint in classical logic, while using SL assertions to describe the precondition. The
seamless yet robust combination of separation logic and classical logic in a specification
language that supports local reasoning is the key contribution of this work.
 1 struct Node { var data: int; var next: Node; }
 2 predicate blseg(x: Node, y: Node, lb: int, ub: int) {
 3   x == y ∨ x != y ∗ acc(x) ∗ lb ≤ x.data ≤ ub ∗ blseg(x.next, y, lb, ub)
 4 }
 5 predicate bslseg(x: Node, y: Node, lb: int, ub: int) {
 6   x == y ∨ x != y ∗ acc(x) ∗ lb ≤ x.data ≤ ub ∗ bslseg(x.next, y, x.data, ub)
 7 }
 8 procedure quicksort(x: Node, y: Node, ghost lb: int, ghost ub: int) returns (rx: Node)
 9   requires blseg(x, y, lb, ub);
10   ensures bslseg(rx, y, lb, ub);
11 { if (x != y && x.next != y) {
12     var pivot: Node, z: Node;
13     rx, pivot := split(x, y, lb, ub);
14     rx := quicksort(rx, pivot, lb, pivot.data);
15     z := quicksort(pivot.next, y, pivot.data, ub);
16     pivot.next := z;
17   } else { rx := x; }
18 }
Fig. 1. A partial implementation of a quicksort algorithm on singly-linked lists

2 Overview and Running Example


We illustrate our approach through an example that implements a quicksort algorithm
for linked lists storing integer values. The implementation and specification is shown in
Figure 1. We use the syntax of GRASShopper’s input language (modulo mark-up).
The procedure quicksort takes two pointers x and y as input, marking the start and
end points of the list segment that is to be sorted. This property is expressed by the SL
assertion in the precondition of quicksort: the inductive predicate blseg(x, y, lb, ub). The
predicate states that x and y are indeed the start and end points of an acyclic list segment.
Furthermore, it states that the data values of this list segment are bounded from below
and above by the values lb and ub, respectively. These values are passed to quicksort
as additional ghost parameters. The atomic predicate acc(x) in the definition of blseg
represents a heap region that consists of the single heap cell x. That is, acc(x) means
that x is in the footprint of the predicate. Such SL assertions are combined to assertions
describing larger heap regions using spatial conjunction, denoted by ‘*’. Spatial con-
junction asserts that the composed heap regions are disjoint in memory. Hence, blseg
describes an acyclic list segment. Note that atomic assertions such as x ≠ y only express
constraints on values but describe empty heap regions. In particular, x ≠ y ∨ x = y is
not a tautology. Such constraints are called pure in SL jargon. Further note that spatial
conjunction binds stronger than classical conjunction and disjunction.
The footprint of blseg(x, y, lb, ub) is also the initial footprint of procedure quick-
sort which, by induction, consists of all heap cells between x and y, excluding y. The
quicksort procedure returns a pointer rx to the head of the sorted list segment, which
we specify in the postcondition using the predicate bslseg(rx, y, lb, ub). For exposition
purposes, we do not specify that the output list is a permutation of the input list.
In the recursive case, quicksort picks a pivot and splits the list into two segments,
one containing all values smaller than pivot.data, and one containing all other values.
To simplify the presentation, we have factored out the code for the actual splitting in
1 procedure split(x: Node, y: Node, ghost lb: int, ghost ub: int) returns (rx: Node, pivot: Node)
2   requires blseg(x, y, lb, ub) ∗ x != y;
3   ensures blseg(rx, pivot, lb, pivot.data) ∗ blseg(pivot, y, pivot.data, ub);
4   ensures Btwn(next, rx, pivot, y) ∗ pivot != y ∗ lb ≤ pivot.data ≤ ub;
Fig. 2. Specification of the procedure split used by quicksort

a separate procedure split. After splitting, quicksort recursively calls itself on the two
sublists and concatenates the two sorted list segments.
We provide the specification of split but not its implementation. It is shown in Fig. 2.
The specification is agnostic to implementation details such as whether only the data
values are reordered in the list or the entire nodes. Multiple ensures, respectively, re-
quires clauses in a procedure contract are implicitly connected by spatial conjunction.
The procedure split also demonstrates the convenience of a specification language
that allows mixing of separation logic and reachability logic. The conjunct Btwn(next,
rx, pivot, y) in the second ensures clause is a predicate in our logic GRASS. The pred-
icate states that the node pivot lies between rx and y on the direct next path connecting
the two nodes. That is, the two list segments described by the first ensures clause do
not form a panhandle list. A panhandle list can occur if y is a dangling pointer to an
unallocated node and split allocates that node and inserts it into the list segment from
rx to pivot, thereby creating a cycle. Without the additional reachability constraint, the
specification of split would be too weak to prove the correctness of quicksort because
the final sorted list segment returned by quicksort must be acyclic. If we used either only
separation logic or only reachability logic, the specification of procedure split would be
considerably more complicated (assuming we stayed inside decidable fragments).

3 Verifying Programs with GRASShopper

The verification of the input program provided to GRASShopper proceeds in three steps:
first we translate the program to an equivalent program whose specification is expressed
solely in our first-order logic fragment GRASS; in the second step we encode the trans-
lated program into verification conditions (also expressed in GRASS) using standard
verification condition generation; finally we decide the generated verification condi-
tions using our GRASS solver. All three steps are fully automated in GRASShopper.
We now explain these steps using the quicksort procedure as a running example.
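The pipeline structure of these three steps can be sketched as follows in OCaml (the language GRASShopper is implemented in). All type and function names here (translate_to_grass, generate_vcs, check_vc, ...) are invented for exposition and do not mirror the actual code base; the sketch only records that each phase consumes the previous phase's output.

    (* Hypothetical types standing in for the tool's internal representations. *)
    type spl_prog       (* input program with mixed SL/GRASS specifications *)
    type grass_prog     (* program whose specifications are pure GRASS formulas *)
    type grass_formula  (* a verification condition expressed in GRASS *)

    (* The three phases described above, composed into one pipeline. *)
    let verify
        ~(translate_to_grass : spl_prog -> grass_prog)
        ~(generate_vcs : grass_prog -> grass_formula list)
        ~(check_vc : grass_formula -> bool)
        (input : spl_prog) : bool =
      input
      |> translate_to_grass        (* step 1: encode SL specifications in GRASS *)
      |> generate_vcs              (* step 2: standard verification condition generation *)
      |> List.for_all check_vc     (* step 3: discharge each VC with the GRASS solver *)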

3.1 Translation to GRASS Programs

We first describe the translation of the input program to a GRASS program. The trans-
lation must capture the semantics of Hoare triples in separation logic and preserve the
ability to reason about correctness locally. For a Hoare triple {P} C {Q} to be valid in
separation logic, the precondition P must subsume the footprint of the program frag-
ment C. That is, P specifies the portion of memory that C is allowed to access. This
semantics enables local reasoning, which is distilled into the so-called frame rule. The
frame rule states that if {P} C {Q} is valid, then so is {P ∗ F} C {Q ∗ F} for any SL
assertion F . That is, C does not affect the state of memory regions disjoint from its
footprint. The assertion F is referred to as the frame of the rule application.
The frame rule enables compositional symbolic execution of program fragments.
For example in quicksort, the symbolic state after the call to split in line 13 is de-
scribed by the postcondition of split. The first subsequent recursive call to quicksort
then only operates on the first sublist blseg(rx,pivot,lb,ub) of that symbolic state, leav-
ing blseg(pivot,y,lb,ub) in the frame. The frame rule then implies that this second sublist
is not modified by the first recursive call. All such applications of the frame rule for
procedure calls are made explicit in the GRASS program.
The translation to a GRASS program proceeds one procedure at a time. Each result-
ing procedure is equivalent to its counterpart in the input program, modulo auxiliary
ghost state. This auxiliary ghost state makes the semantics of separation logic specifica-
tions explicit and encodes the applications of the frame rule. Figure 3 shows the result
of the translation for the quicksort procedure. The translation works as follows.
Alloc. First, we introduce a global ghost variable Alloc (line 2), which is used to model
allocation and deallocation instructions. That is, at any point of execution, Alloc denotes
the set of all Node objects that are currently allocated on the heap.
Footprints and Implicit Frame Inference. Each procedure maintains its own footprint
throughout its execution using the dedicated local ghost variable FP. That is, at any point
of a procedure’s execution, FP contains the set of all heap nodes that the procedure
has permission to access or modify at that point. Each heap access or modification
is therefore guarded by an assert statement that checks whether the modification is
permitted by the current footprint (see, e.g., lines 25 and 29). The translation maintains
the invariant that footprints contain only allocated nodes. That is, both allocation and
deallocation instructions affect FP.
For each procedure call, the footprint of the caller is passed to the callee and the
callee returns the new footprint of the caller. That is, it is the callee’s responsibility
to inform the caller about allocation and deallocation operations that affect the caller’s
footprint. For this purpose, each procedure is instrumented with an additional ghost
input parameter FP_Caller and an additional ghost return parameter FP_Caller’.
The contract of the translated procedure governs the transfer of permissions between
caller and callee via the exchanged footprints and ties the footprints to the translations
of the separation logic specifications in the original procedure contract. The initial value
of FP in the translated procedure is determined by the footprint of the separation logic
assertions in the precondition of the input procedure, which itself must be a subset of
the caller's footprint (line 16).
Note that the ghost variable FP is declared as an implicit ghost input parameter of
the procedure (line 13). The semantics of an implicit ghost parameter is that it is ex-
istentially quantified across the entire procedure contract¹. That is, during verification
condition generation, the precondition of the contract is asserted at the call site with all
implicit ghost parameters existentially quantified. When the solver checks the gener-
ated verification condition for this assertion, it needs to find a witness for FP, thereby
implicitly inferring the frame of the procedure call that is used in the application of
¹ We adhere to the usual semantics of procedure contracts where input parameters occurring in
ensures clauses refer to the initial values of these parameters.
 1 struct Node { var data: int; var next: Node; }
 2 ghost var Alloc: set<Node>;
 3 function blseg_fp(x: Node, y: Node) returns (Footprint: set<Node>) {
 4   Footprint == {z: Node :: Btwn(next, x, z, y) ∧ z != y}
 5 }
 6 predicate blseg_struct(x: Node, y: Node, lb: int, ub: int) {
 7   Btwn(next, x, y, y) ∧ (∀ z ∈ blseg_fp(x, y) :: lb ≤ z.data ≤ ub)
 8 }
 9 predicate bslseg_struct(x: Node, y: Node, lb: int, ub: int) {
10   blseg_struct(x, y, lb, ub) ∧ (∀ z, w ∈ blseg_fp(x, y) :: Btwn(next, z, w, y) ⇒ z.data ≤ w.data)
11 }
12 procedure quicksort(x: Node, y: Node, ghost lb: int, ghost ub: int,
13     ghost FP_Caller: set<Node>, implicit ghost FP: set<Node>)
14   returns (rx: Node, ghost FP_Caller': set<Node>)
15   requires blseg_struct(x, y, lb, ub);
16   requires FP == blseg_fp(x, y) ∧ FP ⊆ FP_Caller;
17   free requires FP_Caller ⊆ Alloc ∧ null ∉ Alloc;
18   modifies next, data, Alloc;
19   ensures bslseg_struct(rx, y, lb, ub);
20   ensures blseg_fp(rx, y) == (Alloc ∩ FP) ∪ (Alloc ∖ old(Alloc));
21   free ensures FP_Caller' == (FP_Caller ∖ FP) ∪ (Alloc ∩ FP) ∪ (Alloc ∖ old(Alloc));
22   free ensures FP_Caller' ⊆ Alloc ∧ null ∉ Alloc;
23   free ensures Frame(old(Alloc), FP, old(next), next) ∧ Frame(old(Alloc), FP, old(data), data);
24 { FP_Caller := FP_Caller ∖ FP;
25   assert x == y ∨ x ∈ FP;
26   if (x != y && x.next != y) {
27     var pivot: Node, z: Node;
28     rx, pivot, FP := split(x, y, lb, ub, FP);
29     assert pivot ∈ FP;
30     rx, FP := quicksort(rx, pivot, lb, pivot.data, FP);
31     z, FP := quicksort(pivot.next, y, pivot.data, ub, FP);
32     pivot.next := z;
33   } else { rx := x; }
34   FP_Caller' := FP_Caller ∪ FP; }
Fig. 3. Translation of quicksort program from Figure 1 to an equivalent GRASS program

the frame rule. After the precondition has been asserted, it is assumed with the implicit
ghost parameters replaced by fresh Skolem constants. These Skolem constants then also
occur in the assumed postcondition at the call site.
Encoding the Frame Rule. The free requires and ensures clauses in the contract con-
stitute the actual encoding of the frame rule. The free annotation means that the corre-
sponding clause does not need to be checked but can be freely assumed by the callee,
respectively, caller. These clauses follow from the soundness of the frame rule and the
invariants concerning Alloc and the footprints that are guaranteed by the translation. We
discuss the most important parts of the encoding in more detail:

– First, consider the ensures clause in line 20: blseg_fp(rx, y) == (Alloc ∩ FP) ∪
(Alloc ∖ old(Alloc)). This clause states that the footprint of the postcondition, de-
noted by blseg_fp(rx, y), accounts for all memory in the initial footprint that has not
been deallocated, and all memory that has been freshly allocated (but not
deallocated again) during execution of quicksort. This clause thus implies that the
procedure does not leak memory.
– Next, consider the ensures clause in line 21: FP_Caller' == (FP_Caller ∖ FP) ∪
(Alloc ∩ FP) ∪ (Alloc ∖ old(Alloc)). This clause states that the new footprint of the
caller, FP_Caller’, is the caller’s old footprint with the initial footprint of quicksort
replaced by quicksort’s final footprint (as defined in line 20).
– Finally, the clause in line 23 states that the fields next and data are not modified in
the frame of the call. We express this using the predicate Frame. The frame of the
call is given by the set old(Alloc) ∖ FP. We discuss the predicate Frame in more detail
in the next section, as the choice of its encoding is crucial for the completeness of
our translation.

Translation of SL Assertions. Finally, we describe the translation of the SL assertions


in the contract of the input procedure. This translation generalizes our previous work
on deciding entailment in separation logic of linked lists via reduction to GRASS [22].
First, each inductive SL predicate p(x̄) in the input program is translated to a GRASS
predicate p_struct(x̄) and a function p_fp(x̄). The predicate p_struct(x̄) collects all
constraints concerning the structure of the heap region that is described by the SL pred-
icate p(x̄), while the function p_fp(x̄) denotes the footprint of p(x̄). For example,
consider the predicate blseg(x,y,lb,ub) in the input program. As expected, its footprint
function blseg_fp(x,y) denotes the set of all nodes z on the next path between x and y,
excluding y. This is expressed in terms of a set comprehension. Such set comprehen-
sions are expanded to universally quantified constraints in the back-end solver. Note
that if y is not reachable from x in the heap, then blseg_fp(x,y) denotes the empty set.
For convenience, we reuse the same footprint function for the translation of the predi-
cate bslseg. The predicate blseg_struct(x,y,lb,ub) states that y is indeed reachable from
x (which is expressed by the predicate Btwn(next,x,y,y)) and that the nodes in the footprint
store data values in the interval [lb,ub]. Our tool uses a sound heuristic to generate the
translations of the user-defined inductive predicates. The heuristic cannot be complete
for arbitrary inductively defined predicates, as the problem of checking entailment for
such predicates becomes undecidable. However, our back-end solver is complete for
the translations of a large class of predicates describing linked list structures, including
the ones in the quicksort example.
With the translation of inductive predicates in place, the translation of an SL as-
sertion H to a GRASS formula is then given by a function tr(H, X), where X is a set
variable that denotes the footprint of the assertion. The function tr(H, X) is defined
recursively on the structure of H as follows (a code sketch of this recursion is given
after the list):

– if H = p(x̄), then tr(H, X) = p_struct(x̄) ∧ X = p_fp(x̄);
– if H = acc(x) where x is a node variable, then tr(H, X) = (X = {x});
– if H = acc(Y) where Y is a node set variable, then tr(H, X) = (X = Y);
– if H = F where F is a pure constraint, then tr(H, X) = F ∧ X = ∅;
– if H = H1 ∗ H2, then tr(H, X) = ∃ X1, X2 :: tr(H1, X1) ∧ tr(H2, X2) ∧ X =
  X1 ⊎ X2, where X1, X2 are fresh node set variables;
– if H = H1 ∪∗ H2, then tr(H, X) = ∃ X1, X2 :: tr(H1, X1) ∧ tr(H2, X2) ∧ X =
  X1 ∪ X2, where X1, X2 are fresh node set variables.
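To make the recursion concrete, the following OCaml sketch spells out tr on a small abstract syntax. The types and helper names (sl_assertion, grass, fresh_set_var, ...) are invented for illustration and simplified with respect to the actual implementation; in particular, the disjoint union X = X1 ⊎ X2 is encoded here as X = X1 ∪ X2 together with X1 ∩ X2 = ∅.

    (* Illustrative ASTs for SL assertions and GRASS formulas. *)
    type sl_assertion =
      | Pred of string * string list            (* inductive predicate p(xs) *)
      | Acc of string                           (* acc(x) for a node variable x *)
      | AccSet of string                        (* acc(Y) for a node set variable Y *)
      | Pure of grass                           (* pure constraint F *)
      | SepStar of sl_assertion * sl_assertion  (* disjoint spatial conjunction *)
      | SepUnion of sl_assertion * sl_assertion (* nondisjoint spatial composition *)

    and grass =
      | And of grass list
      | Exists of string list * grass
      | EqSets of set_term * set_term
      | Atom of string * string list            (* e.g. p_struct(xs) *)

    and set_term =
      | SetVar of string
      | Singleton of string
      | Empty
      | Union of set_term * set_term
      | Inter of set_term * set_term
      | FpTerm of string * string list          (* p_fp(xs) *)

    let counter = ref 0
    let fresh_set_var () = incr counter; Printf.sprintf "X_%d" !counter

    (* tr h x: translate SL assertion h, with x naming its footprint set. *)
    let rec tr (h : sl_assertion) (x : string) : grass =
      match h with
      | Pred (p, args) ->
          And [Atom (p ^ "_struct", args); EqSets (SetVar x, FpTerm (p ^ "_fp", args))]
      | Acc v -> EqSets (SetVar x, Singleton v)
      | AccSet y -> EqSets (SetVar x, SetVar y)
      | Pure f -> And [f; EqSets (SetVar x, Empty)]
      | SepStar (h1, h2) ->
          let x1 = fresh_set_var () and x2 = fresh_set_var () in
          Exists ([x1; x2],
            And [ tr h1 x1; tr h2 x2;
                  EqSets (SetVar x, Union (SetVar x1, SetVar x2));
                  EqSets (Inter (SetVar x1, SetVar x2), Empty) ])
      | SepUnion (h1, h2) ->
          let x1 = fresh_set_var () and x2 = fresh_set_var () in
          Exists ([x1; x2],
            And [ tr h1 x1; tr h2 x2;
                  EqSets (SetVar x, Union (SetVar x1, SetVar x2)) ])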
For convenience, we also include nondisjoint spatial composition in our SL assertion
language, which we denote by H1 ∪∗ H2. This operator is useful to specify overlayed data
structures concisely, respectively, specify alternative views of the same data structure.
Note that the points-to predicate x.next ↦ y that is commonly used in separation logic
fragments is simply a short-hand for the assertion acc(x) ∗ x.next = y.

Example 1. In Figure 3, the translation tr(blseg(x, y, lb, ub), FP) of the original precon-
dition of the quicksort procedure is the conjunction of the clause in line 15 and the first
set equality in the clause in line 16.

Apart from the treatment of inductive predicates, the translation of SL assertions is


surprisingly close to the way in which their semantics is traditionally defined. To the ex-
pert reader, this might seem problematic, at first. Namely, when checking the generated
verification conditions, the back-end solver for GRASS negates some of the resulting
constraints to reduce the problem to satisfiability queries. Thus, some of the auxiliary
existentially quantified set variables that are introduced in the translation of spatial op-
erators² become universally quantified. This might raise concerns about decidability.
However, the translation function is defined in such a way that all existentially quanti-
fied set variables are uniquely defined by set equalities. That is, the negated constraints
of the form ∀ X :: X = T ⇒ F can be transformed back into equivalent constraints of
the form ∃ X :: X = T ∧ F.

3.2 Frame Axioms and Completeness


We next discuss how we ensure both completeness of the translation to GRASS pro-
grams and decidability of checking the generated verification conditions (relative to
certain assumptions about the specifications in the input program).
To enable efficient verification condition generation where all case splitting is de-
ferred to the back-end SMT solver, we model fields such as next and data as arrays. This
allows us to encode field updates conveniently as store operations, which are supported
by the array theory in the SMT solver. However, we also need to model the effect of pro-
cedure calls on fields, and how modifications of fields affect reachability information
captured by the Btwn predicate.
Ultimately, both completeness and decidability hinge on the interpretation of the
frame axioms Frame(A, FP, f, f′), which we use to encode the application of the frame
rule. Here, A and FP are the values of Alloc and FP before a procedure call, and f and
f′ are arrays that encode the state of a field such as next before and after the call. In
principle, it is sufficient to consider the following interpretation of Frame, which states
that the field f is not modified in the frame of the call:

Frame(Alloc, FP, f, f′) ≡ ∀ x ∈ Alloc ∖ FP :: x.f = x.f′     (1)

The translation to GRASS programs that we outlined in the previous section would
then be complete if we considered an axiomatic semantics where GRASS formulas
are interpreted in a first-order logic with transitive closure. Transitive closure enables
² As well as the quantified implicit ghost parameter FP in call-site checks of preconditions.
[Figure: (a) y does not reach nodes in the footprint. (b) y reaches the footprint (panhandle list).]

Fig. 4. Two of the possible heaps at the call site on line 14. The footprint of the recursive call to
quicksort and the portion of the frame that belongs to the caller’s footprint are enclosed in dotted
boxes. Solid black edges denote next pointers, dashed black edges indicate next paths, and solid
red edges represent the ep function.

us to tie the interpretation of a predicate Btwn(next,x,y,z) on a semantic level to the


interpretation of next in a given program state. However, the problem of checking the
generated verification conditions would be undecidable [11].
An alternative approach is to tie the interpretation of Btwn(next,x,y,z) to the interpreta-
tion of next on an axiomatic level. In general, transitive closure cannot be axiomatized in
first-order logic. However, we are considering the special case of finite
structures, for which first-order axiomatizations of transitive closure exist. In fact, several
reachability logics for reasoning about heap structures have been proposed that can be
decided efficiently (see, e.g., [16, 26]). The problem now is to preserve precise reach-
ability information in the presence of field modifications, i.e., how do Btwn(next,x,y,z)
and Btwn(next’,x,y,z) relate if next’ is obtained from next by some (possibly unbounded)
sequence of updates. For single heap updates p.next := q, the effect on the reachability
predicate can be encoded using appropriate axioms [16]. However, to preserve reacha-
bility information for heap paths in the frame of a procedure call (which may execute an
unbounded number of heap updates) we need a more general mechanism.
To preserve reachability information in the frame, we need an interface between the
frame and the footprint of the callee that distinguishes the portions of a path belonging
to the frame from those portions belonging to the footprint. We define this interface
using the entry point function. The entry point for a heap node x with respect to a set X
and field f, denoted ep(X, f, x), is defined as the first node in X that is reachable from
x via f. If such a node does not exist, then ep(X, f, x) = x.
Example 2. Figure 4 illustrates two different heap states that may occur at the call site
of the recursive call to quicksort on line 14 in Figure 1. The evaluation of the entry point
function is depicted by red arrows.

We axiomatize ep in terms of the predicate Btwn as follows:

∀ x :: Btwn(f, x, ep(X, f, x), ep(X, f, x))
∀ x :: ep(X, f, x) ∈ X ∨ ep(X, f, x) = x
∀ x, y :: Btwn(f, x, y, y) ∧ y ∈ X ⇒ ep(X, f, x) ∈ X ∧ Btwn(f, x, ep(X, f, x), y)
Using the entry point function we can now correctly update the reachability information
for paths that cross the boundary into the footprint of the callee. The corresponding
frame axiom for pointer fields such as next is then as follows:
Frame(A, FP, f, f′) ≡ (∀ x ∈ A ∖ FP :: x.f = x.f′)
  ∧ (∀ x, y, z ∈ A ∖ FP :: ReachWO(f, x, y, ep(FP, f, x)) ⇒
        (Btwn(f, x, z, y) ⇔ Btwn(f′, x, z, y)))
  ∧ (∀ x, y, z ∈ A :: x ∉ FP ∧ x = ep(FP, f, x) ⇒
        (Btwn(f, x, y, z) ⇔ Btwn(f′, x, y, z)))
The two additional axioms specify that the order of nodes is preserved for the path
segments between any node x and its entry point into FP, respectively, the full path
starting in x if no node in FP is reachable from x. The predicate ReachWO(f, x, y, z)
means that x can reach y via f without going through z. We express this as follows:

ReachWO(f, x, y, z) ≡ Btwn(f, x, y, z) ∨ (Btwn(f, x, y, y) ∧ ¬Btwn(f, x, z, z))

For nonpointer fields such as data, equation 1 is already sufficient.

3.3 Deciding the Verification Conditions


The verification conditions that are generated from the GRASS programs are aug-
mented with theory axioms to encode the semantics of predicates such as Btwn as
well as operations on sets. The resulting formulas are in first-order logic, checked for
(un)satisfiability modulo first-order theories that are natively supported by SMT solvers,
e.g., linear arithmetic and free function symbols. The generated formulas contain both
existential and universal quantifiers, however, no ∀∃ quantifier alternations. To ensure
that we can use the SMT solver as an actual decision procedure for checking satisfi-
ability of the generated formulas, we preprocess these quantifiers before we pass the
formula to the SMT solver. Preprocessing depends on the kind of the quantifier:
– Existentially quantified subformulas are simply skolemized. We implemented opti-
mizations such as maximizing the scope of existential quantifiers and reusing exis-
tentially quantified variables as much as possible to minimize the number of gener-
ated Skolem constants.
– Universally quantified subformulas are first hoisted to the top level of the formula
(by introducing propositional variables as place holders) and then further processed
depending on their type. We distinguish three types that we further describe below.

Effective Propositional Fragment (EPR). The EPR fragment (aka the Bernays-Schön-
finkel-Ramsey class) consists of formulas in which universally quantified variables do
not occur below function symbols. This fragment can be decided quite efficiently us-
ing Z3’s model-based quantifier instantiation mechanism. Hence, all EPR formulas are
passed directly to Z3. For formulas that are not in EPR, we make a finer distinction.
Stratified Sort Fragment. If universally quantified variables appear below function sym-
bols, then instantiating these variables may create new ground terms, which in turn can
be used for instantiation, causing the SMT solver to diverge. One special case, though,
are axioms satisfying stratified sort restrictions [1]. Examples of such formulas are the
quantified constraints in the predicates blseg_struct and bslseg_struct of Figure 3. The
sort of the quantified variables z and w is Node, while the sort of the instantiated terms
z.data and w.data is int. Since we do not quantify over int variables, the generated ground
terms do not enable new quantifier instantiations. Formulas in the stratified sort frag-
ment are directly passed to Z3.
Local Theory Extensions. The remaining quantified constraints are more difficult. In
general, we provide no completeness guarantee for our handling of quantifiers because
we allow users to specify unrestricted quantified pure constraints in their specifications.
However, we can guarantee completeness for specifications written in separation logic
for linked lists mixed with quantifier-free pure GRASS constraints (as well as some
types of user-specified quantified constraints). We designed our translation carefully
so that the remaining quantified formulas are in decidable fragments (in particular, the
frame and theory axioms). To decide these fragments, we build on local theory exten-
sions [25]. Local theory extensions are described by axioms for which instantiation can
be restricted to ground terms appearing in the verification condition (or some finite set
of ground terms that can be computed from this formula). We preprocess such axioms
by partially instantiating all variables below function symbols with the relevant sets of
ground terms. The partially instantiated axioms are then in the EPR fragment and passed
to Z3. We discuss one example of a local theory extension in more detail below. To re-
duce the number of generated partial instances, we compute the congruence closure for
the ground part of the verification condition to group ground terms into equivalence
classes. We then only need to consider one representative term per equivalence class
during instantiation.
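The following OCaml fragment sketches this partial-instantiation step under simplifying assumptions: it instantiates all universally quantified Node variables of an axiom (rather than only those occurring below function symbols), and it takes the set of representative ground terms as given. All names (term, axiom, instantiate, ...) are hypothetical.

    (* Simplified term and axiom representations. *)
    type term = Var of string | App of string * term list

    type axiom = {
      node_vars : string list;   (* universally quantified variables of sort Node *)
      body      : term;          (* quantifier-free body of the axiom *)
    }

    (* Substitute ground terms for variables. *)
    let rec substitute (sigma : (string * term) list) (t : term) : term =
      match t with
      | Var v -> (try List.assoc v sigma with Not_found -> t)
      | App (f, args) -> App (f, List.map (substitute sigma) args)

    (* All assignments of representative terms to the quantified variables. *)
    let rec assignments vars reps =
      match vars with
      | [] -> [ [] ]
      | v :: rest ->
          List.concat_map
            (fun sigma -> List.map (fun r -> (v, r) :: sigma) reps)
            (assignments rest reps)

    (* Instantiate an axiom with one representative per congruence class; the
       resulting quantifier-free instances are in EPR and can be passed to Z3. *)
    let instantiate (representatives : term list) (ax : axiom) : term list =
      List.map (fun sigma -> substitute sigma ax.body)
        (assignments ax.node_vars representatives)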

Example 3. One example of a local theory extension is the theory extension defining the
entry point functions in Section 3.2 together with the generated frame axioms concern-
ing ep. Note that in all models of this extension, the entry point function is idempotent
for fixed X and f . Hence, we only need to instantiate these axioms once for each Node
ground term x. One potential problem may arise from the interactions between the ep
functions for different footprint sets and fields. That is, instantiating one ep term for
one X, f and ground term t may expose a new entry point e = ep(X′, f′, ep(X, f, t))
for another pair X′, f′ such that, in some model, e is different from all previously gen-
erated ground terms. However, such a situation cannot occur if all footprints are defined
by a union of a bounded number of list segments. This holds true for separation logic of
linked lists. Even in the general case, the counterexamples that witness incompleteness
are rather degenerate and we doubt they can occur in actual program executions.

4 Mixing Separation Logic and First-Order Logic Specifications


The key advantage of our approach is that it allows the user to seamlessly mix SL and
GRASS specifications. Some data structures are difficult to specify in separation logic
because they involve complex sharing, or their footprints are not easily definable using
simple inductive predicates. In Figure 5, we show the specifications of the find and union
procedures of a union-find data structure implemented as a forest of inverted trees. This
data structure exhibits both of the above problems.
Complex Sharing. A path that goes from a node to its representative in a union-find
structure can be expressed as a list segment. However, describing the entire structure is
 1 predicate lseg_set(x: Node, y: Node, X: set<Node>) {
 2   (x == y ∗ X == ∅) ∨ (x != y ∗ acc(x) ∗ x ∈ X ∗ lseg_set(x.next, y, X ∖ {x}))
 3 }
 4 procedure find(x: Node, ghost root_x: Node, implicit ghost X: set<Node>)
 5   returns (res: Node)
 6   requires lseg_set(x, root_x, X) ∗ root_x.next == null;
 7   ensures res == root_x ∗ acc(X) ∗ (∀ z ∈ X :: z.next == res) ∗ res.next == null;
 8
 9 procedure union(x: Node, y: Node, ghost root_x: Node, ghost root_y: Node,
10   implicit ghost X: set<Node>, implicit ghost Y: set<Node>)
11   requires lseg_set(x, root_x, X) ∗ lseg_set(y, root_y, Y);
12   requires root_x.next == null ∗ root_y.next == null;
13   ensures (acc(X) ∪∗ acc(Y)) ∗ (root_y.next == null ∗ acc(root_x));
14   ensures (∀ z ∈ X :: z.next == root_x) ∧ (∀ z ∈ Y :: z.next == root_y);
15   ensures root_x == root_y ∨ root_x.next == root_y;
Fig. 5. Operations on a union-find data structure with mixed specifications

more difficult. For instance, in the union procedure if x and y are in different equivalence
classes, then the two paths in the data structure are disjoint. However, if they are in the
same class, then their paths may be partially shared. It is difficult to express this in
traditional SL fragments without explicitly distinguishing the two cases. We can cover
both cases conveniently using the spatial connective ∪∗ for nondisjoint union.
Structural Constraints Expressed in First-Order Logic. When path compaction is
used in the find procedure, then the postcondition of find is not expressible in terms of
a bounded number of inductive predicates. The reason is that path compaction turns a
list segment of unbounded length into an unbounded number of points-to predicates.
Therefore, expressing the postcondition requires some form of universal quantification.
We can express this quite easily using the constraint F ≡ ∀ z ∈ X :: z.next = root_x,
where X is the initial footprint of the procedure described by an SL assertion. Note
that the additional predicate acc(X) in the postcondition specifies that X is also the final
footprint of the procedure. Hence, F only constrains the structure of the heap region that
is captured by the footprint. Note that this example also uses implicit ghost parameters
of procedures to existentially quantify over the explicit footprint X.
When mixing separation logic and classical logic, then additional well-formedness
checks are needed to guarantee that reachability predicates and other heap-dependent
pure formulas do not constrain heap regions outside of the footprint that is specified by
the nonpure SL assertions. Otherwise, the application of the frame rule would become
unsound. However, these additional checks can be automated in the same manner as the
checks of the actual verification conditions.

5 Implementation and Evaluation


We have implemented all the features described in this paper in GRASShopper. The tool
is implemented in OCaml and available under a BSD license. The source code distribu-
tion including all benchmarks can be downloaded from the project web page [10].
Benchmarks                          # LOC  # VCs  time in s
SLL (loop)                            156     56       1.9
SLL (rec.)                            142     70       3.1
sorted SLL                            171     55       6.6
DLL                                   195     59      11
sorting algorithms                    230     98      15
union-find                             35      8       4.8
SLL.filter (deref. null pointer)        –      7       0.4
DLL.insert (missing update)             –      8       3.1
quicksort (underspec. split)            –     12       0.9
union-find (bug in postcond.)           –      4      12.8

[Counterexample graph for the underspecified quicksort: heap locations Loc!0, Loc!1,
Loc!2, Loc!3, Loc!4, Loc!7, Loc!8 with their data values, next edges, and the variables
x, y, pivot, and null; the next edges form the panhandle list described in the text.]
Fig. 6. The left-hand side shows the summary of the experiments for the collections of correct
benchmarks as well as some benchmarks that contain bugs in the code or specification. The
right-hand side shows the generated counterexample for the underspecified quicksort program.

GRASShopper takes as input an annotated C-like program and generates verification


conditions, which are checked using a back-end SMT solver. The solver is integrated
via the standard interface defined by SMT-LIB 2. Currently, we use Z3 [8] as back-end
solver but we are working on incorporating CVC4 as well as other solvers.
Evaluation. We have collected 37 examples of correct heap-manipulating programs
working over singly and doubly-linked lists. This includes basic manipulations of the
data structures (traverse, dispose, copy, reverse, concat, filter, remove, insert) for the
singly-linked lists (SLL) and doubly-linked lists (DLL). The singly-linked list exam-
ples come in three flavors: an imperative style loop-based implementation, a recursive
implementation, and one based on sorted lists. Beyond these benchmarks, we imple-
ment four different sorting algorithms (insertion sort, merge sort, quicksort, strand sort)
and a union-find data structure. In addition, we applied the tool to programs that contain
bugs or have incorrect specifications. The table in Fig. 6 summarizes our results. The
table shows the number of lines of code for each set of examples, the total number of
verification conditions, and the total running time of GRASShopper on those examples.
All examples have been correctly verified, respectively, falsified. The number of lines
of code includes the specifications but excludes the definitions of the data structures.
Counterexample Generation. When a verification condition cannot be proved, i.e.,
the formula sent to the SMT solver is satisfiable, GRASShopper uses the model re-
turned by the solver to construct a counterexample. Due to the preprocessing of quan-
tifiers, the model returned by the SMT solver is actually a partial model of the GRASS
formula. This means that instead of having all pointer fields defined, some of them are
summarized by reachability constraints. These reachability constraints encode paths of
unbounded length in the heap. From this information we construct a graph in Graphviz
format that represents an entire family of counterexamples.
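The general shape of this step can be sketched in OCaml as follows. The model type and its fields are invented for illustration: they stand for the fragment of the partial model that the solver exposes, with explicit next edges separated from next paths of unbounded length, which the printer renders as dashed edges.

    (* A hypothetical, simplified view of a partial model returned by the solver. *)
    type loc = string                        (* e.g. "Loc!4" *)
    type model = {
      locs       : loc list;                 (* interpreted heap locations *)
      data       : (loc * int) list;         (* known data values *)
      next_edges : (loc * loc) list;         (* explicit next pointers *)
      next_paths : (loc * loc) list;         (* summarized next paths of unbounded length *)
    }

    let dot_of_model (m : model) : string =
      let buf = Buffer.create 256 in
      Buffer.add_string buf "digraph counterexample {\n";
      (* one node per location, labeled with its known data value, if any *)
      List.iter (fun l ->
          let label =
            try Printf.sprintf "%s\\ndata = %d" l (List.assoc l m.data)
            with Not_found -> l
          in
          Buffer.add_string buf (Printf.sprintf "  \"%s\" [label=\"%s\"];\n" l label))
        m.locs;
      (* solid edges for explicit next pointers, dashed edges for summarized paths *)
      List.iter (fun (src, dst) ->
          Buffer.add_string buf
            (Printf.sprintf "  \"%s\" -> \"%s\" [label=\"next\"];\n" src dst))
        m.next_edges;
      List.iter (fun (src, dst) ->
          Buffer.add_string buf
            (Printf.sprintf "  \"%s\" -> \"%s\" [style=dashed, label=\"next*\"];\n" src dst))
        m.next_paths;
      Buffer.add_string buf "}\n";
      Buffer.contents buf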
For example, when we were writing the quicksort example in Fig. 1, we had to iterate
a few times before we obtained a correct version. At some point, we had a postcondi-
tion for split that was missing the Btwn predicate, as described in Section 2. The cor-
responding counterexample produced by GRASShopper is shown in Fig. 6. The graph
clearly shows the panhandle list. The full counterexample also includes valuations for
the footprint sets of the caller and callee. The final footprint FP_Caller’ returned by split
is {Loc!0, Loc!1, Loc!2, Loc!3, Loc!4, Loc!8} and the footprint that was expected by the
postcondition of quicksort is {Loc!2, Loc!4}. The two sets should be equal.

6 Related Work and Conclusion


Since the pioneering work on the Smallfoot tool [2, 3], several efficient decision pro-
cedures for entailment checking in separation logic of linked lists have been devel-
oped [7, 21]. Other procedures target more expressive fragments, e.g., nested lists [6]
or structures with tree backbones [12]. Currently, GRASShopper only supports struc-
tures with a flat list backbone but we are working on extending the tool to handle more
complex data structures.
In our previous work [22], we proposed an approach to deciding entailment in sep-
aration logic via a reduction to first-order logic and presented a technique for frame
inference. However, this technique relied on model enumeration, which is very expen-
sive. We now propose an alternative where the frame rule is encoded in the SMT query.
Qiu et al. [23] introduced Dryad, a logic to specify heap shapes. To reason about
Dryad formulas, they use natural proofs, a heuristic to bound the proof search space.
For instance, the unfolding of recursive definitions is limited to the ground terms in
the formulas. This is similar to our approach of quantifier instantiation based on local
theory extensions, but without completeness guarantees.
Closely related to our approach is the work on using effectively propositional logic
(EPR) for reasoning about programs that manipulate linked lists [13, 14]. As in this
paper, the authors of [14] use idempotent entry point functions to express that heap
paths in the frame of a procedure call do not change. Their approach yields a sound
and complete procedure for modular checking of EPR specifications. We have devel-
oped the same idea independently, motivated by the goal of verifying programs with
specifications that mix separation logic with first-order theories. The union/find data
structure has also been considered in [14]. Beside the different motivation, the main
technical difference between our work and [14] is that we are not restricted to programs
with acyclic lists. Incidentally, the more general reachability predicate that we use for
reasoning about cycles yields a simpler encoding of the frame rule.
Our SL translation and the handling of the frame rule is in part inspired by work
on implicit dynamic frames [20, 24]. Per se, the implicit dynamic frames approach
provides no decidability guarantees for the first-order logic fragment used by the SL
encoding. In particular, tools such as VeriFast [15] and Chalice [17], which are based
on this approach, use pattern-based quantifier instantiation heuristics to check the re-
sulting verification conditions. These heuristics are in general incomplete and often fail
to produce models for satisfiable formulas. Instead, we designed the target fragment of
our SL encoding carefully so that decidability is preserved by the translation while still
admitting efficient implementations on top of SMT solvers. We find the ability of our
implementation to produce counterexamples invaluable when debugging specifications.
References

1. Abadi, A., Rabinovich, A., Sagiv, M.: Decidable fragments of many-sorted logic. In: Der-
showitz, N., Voronkov, A. (eds.) LPAR 2007. LNCS (LNAI), vol. 4790, pp. 17–31. Springer,
Heidelberg (2007)
2. Berdine, J., Calcagno, C., O’Hearn, P.W.: A decidable fragment of separation logic. In: Lo-
daya, K., Mahajan, M. (eds.) FSTTCS 2004. LNCS, vol. 3328, pp. 97–109. Springer, Hei-
delberg (2004)
3. Berdine, J., Calcagno, C., O’Hearn, P.W.: Smallfoot: Modular automatic assertion checking
with separation logic. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.-P. (eds.)
FMCO 2005. LNCS, vol. 4111, pp. 115–137. Springer, Heidelberg (2006)
4. Berdine, J., Cook, B., Ishtiaq, S.: SLAYER: Memory Safety for Systems-Level Code. In:
Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 178–183. Springer,
Heidelberg (2011)
5. Botincan, M., Parkinson, M.J., Schulte, W.: Separation logic verification of C programs with
an SMT solver. Electr. Notes Theor. Comput. Sci. 254, 5–23 (2009)
6. Bouajjani, A., Drăgoi, C., Enea, C., Sighireanu, M.: Accurate invariant checking for pro-
grams manipulating lists and arrays with infinite data. In: Chakraborty, S., Mukund, M. (eds.)
ATVA 2012. LNCS, vol. 7561, pp. 167–182. Springer, Heidelberg (2012)
7. Cook, B., Haase, C., Ouaknine, J., Parkinson, M., Worrell, J.: Tractable reasoning in a
fragment of separation logic. In: Katoen, J.-P., König, B. (eds.) CONCUR 2011. LNCS,
vol. 6901, pp. 235–249. Springer, Heidelberg (2011)
8. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J.
(eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)
9. Dudka, K., Peringer, P., Vojnar, T.: Predator: A practical tool for checking manipulation of
dynamic data structures using separation logic. In: Gopalakrishnan, G., Qadeer, S. (eds.)
CAV 2011. LNCS, vol. 6806, pp. 372–378. Springer, Heidelberg (2011)
10. GRASShopper tool web page, http://cs.nyu.edu/wies/software/grasshopper
(last accessed: October 2013)
11. Immerman, N., Rabinovich, A., Reps, T., Sagiv, M., Yorsh, G.: The boundary between de-
cidability and undecidability for transitive-closure logics. In: Marcinkowski, J., Tarlecki, A.
(eds.) CSL 2004. LNCS, vol. 3210, pp. 160–174. Springer, Heidelberg (2004)
12. Iosif, R., Rogalewicz, A., Simacek, J.: The tree width of separation logic with recursive
definitions. In: Bonacina, M.P. (ed.) CADE 2013. LNCS, vol. 7898, pp. 21–38. Springer,
Heidelberg (2013)
13. Itzhaky, S., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Effectively-propositional
reasoning about reachability in linked data structures. In: Sharygina, N., Veith, H. (eds.) CAV
2013. LNCS, vol. 8044, pp. 756–772. Springer, Heidelberg (2013)
14. Itzhaky, S., Lahav, O., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Modular rea-
soning on unique heap paths via effectively propositional formulas. In: POPL (2014)
15. Jacobs, B., Smans, J., Philippaerts, P., Vogels, F., Penninckx, W., Piessens, F.: VeriFast: A
powerful, sound, predictable, fast verifier for C and Java. In: Bobaru, M., Havelund, K., Holz-
mann, G.J., Joshi, R. (eds.) NFM 2011. LNCS, vol. 6617, pp. 41–55. Springer, Heidelberg
(2011)
16. Lahiri, S., Qadeer, S.: Back to the future: revisiting precise program verification using SMT
solvers. In: POPL (2008)
17. Leino, K.R.M., Müller, P., Smans, J.: Verification of concurrent programs with chalice. In:
Aldini, A., Barthe, G., Gorrieri, R. (eds.) FOSAD 2007/2008/2009. LNCS, vol. 5705, pp.
195–222. Springer, Heidelberg (2009)
18. Nelson, G., Oppen, D.C.: Simplification by cooperating decision procedures. ACM
TOPLAS 1(2), 245–257 (1979)
19. O’Hearn, P., Reynolds, J., Yang, H.: Local reasoning about programs that alter data struc-
tures. In: Fribourg, L. (ed.) CSL 2001 and EACSL 2001. LNCS, vol. 2142, pp. 1–19.
Springer, Heidelberg (2001)
20. Parkinson, M.J., Summers, A.J.: The relationship between separation logic and implicit dy-
namic frames. Logical Methods in Computer Science 8(3) (2012)
21. Pérez, J.A.N., Rybalchenko, A.: Separation logic + superposition calculus = heap theorem
prover. In: PLDI, pp. 556–566. ACM (2011)
22. Piskac, R., Wies, T., Zufferey, D.: Automating Separation Logic Using SMT. In: Sharygina,
N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 773–789. Springer, Heidelberg (2013)
23. Qiu, X., Garg, P., Stefanescu, A., Madhusudan, P.: Natural proofs for structure, data, and
separation. In: PLDI, pp. 231–242 (2013)
24. Smans, J., Jacobs, B., Piessens, F.: Implicit dynamic frames: Combining dynamic frames and
separation logic. In: Drossopoulou, S. (ed.) ECOOP 2009. LNCS, vol. 5653, pp. 148–172.
Springer, Heidelberg (2009)
25. Sofronie-Stokkermans, V.: Hierarchic reasoning in local theory extensions. In: Nieuwenhuis,
R. (ed.) CADE 2005. LNCS (LNAI), vol. 3632, pp. 219–234. Springer, Heidelberg (2005)
26. Totla, N., Wies, T.: Complete instantiation-based interpolation. In: POPL. ACM (2013)
27. Yang, H., Lee, O., Berdine, J., Calcagno, C., Cook, B., Distefano, D., O’Hearn, P.W.: Scalable
shape analysis for systems code. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123,
pp. 385–398. Springer, Heidelberg (2008)
Alternating Runtime and Size Complexity
Analysis of Integer Programs

Marc Brockschmidt¹, Fabian Emmes², Stephan Falke³, Carsten Fuhs⁴, and Jürgen Giesl²
¹ Microsoft Research, Cambridge, UK
² RWTH Aachen University, Germany
³ Karlsruhe Institute of Technology, Germany
⁴ University College London, UK

Abstract. We present a modular approach to automatic complexity


analysis. Based on a novel alternation between finding symbolic time
bounds for program parts and using these to infer size bounds on pro-
gram variables, we can restrict each analysis step to a small part of the
program while maintaining a high level of precision. Extensive experi-
ments with the implementation of our method demonstrate its perfor-
mance and power in comparison with other tools.

1 Introduction
There exist numerous methods to prove termination of imperative programs,
e.g., [2, 6, 8, 9, 12, 13, 15–17, 19, 25, 33–35]. In many cases, however, termination is
not sufficient, but the program should terminate in reasonable (e.g., (pseudo-)
polynomial) time. To prove this, it is often crucial to derive (possibly non-linear)
bounds on the values of variables that are modified repeatedly in loops.
We build upon the well-known observation that rank functions for termina-
tion proofs also provide a runtime complexity bound [3, 4, 6, 7, 32]. However, this
only holds for proofs using a single rank function. Larger programs are usually
handled by a disjunctive [16,28,35] or lexicographic [6,12,13,17,19,21,23,25] com-
bination of rank functions. Here, deriving a complexity bound is much harder.
To illustrate this, consider the program below and a variant where the instruction
“x = x + i” is removed. For both variants, the lexicographic rank function ⟨f1, f2⟩ proves
termination, where f1 measures states by the value of i and f2 is just the value of x.
However, the program without the instruction “x = x + i” has linear runtime, while the
program below has quadratic runtime. The crucial difference between the two programs
is in the size of x after the first loop.

    while i > 0 do
      i = i − 1
      x = x + i
    done
    while x > 0 do
      x = x − 1
    done
To handle such effects, we introduce a novel modular approach which alter-
nates between finding runtime bounds and finding size bounds. In contrast to
standard invariants, our size bounds express a relation to the size of the variables
at the program start, where we measure the size of integers by their absolute
values. Our method derives runtime bounds for isolated parts of the program

⋆ Supported by the DFG grant GI 274/6-1.


and uses these to deduce (often non-linear) size bounds for program variables at
certain locations. Further runtime bounds can then be inferred using size bounds
for variables that were modified in preceding parts of the program. By splitting
the analysis in this way, we only need to consider small program parts in each
step, and the process continues until all loops and variables have been handled.
For the example, our method proves that the first loop is executed linearly
often using the rank function i. Then, it deduces that i is bounded by the size of
its initial value |i0 | in all loop iterations. Combining these bounds, it infers that
x is incremented by a value bounded by |i0 | at most |i0 | times, i.e., x is bounded
by the sum of its initial size |x0 | and |i0 |2 . Finally, our method detects that the
second loop is executed x times, and combines this with our bound |x0 | + |i0 |2
on x’s value when entering the second loop. In this way, we can conclude1 that
the program’s runtime is bounded by |i0 | + |i0 |2 + |x0 |. This novel combination
of runtime and size bounds allows us to handle loops whose runtime depends on
variables like x that were modified in earlier loops. Thus, our approach succeeds
on many programs that are beyond the reach of previous techniques.
Sect. 2 introduces the basic notions for our approach. Then Sect. 3 and Sect. 4
present our techniques to compute runtime and size bounds, respectively. Sect. 5
discusses related work and provides an extensive experimental evaluation. Proofs
for all theorems as well as several extensions of our approach can be found in [14].
2 Preliminaries

Consider the program below. For an input list x, the loop at location ℓ1 creates a
list y by reversing the elements of x. The loop at location ℓ2 iterates over the list y
and increases each element by the sum of its successors. So if y was [5, 1, 3], it will
be [5 + 1 + 3, 1 + 3, 3] after the second loop. This example is a representative for
methods using several algorithms in sequence.

    Input: List x
    ℓ0 : List y = null
    ℓ1 : while x ≠ null do
           y = new List(x.val, y)
           x = x.next
         done
         List z = y
    ℓ2 : while z ≠ null do
           List u = z.next
    ℓ3 :   while u ≠ null do
             z.val += u.val
             u = u.next
           done
           z = z.next
         done

We regard sequential imperative integer programs with (potentially non-linear)
arithmetic and unbounded non-determinism. Our approach is compatible with
methods that abstract features like heap usage to integers [2, 4, 15, 19, 29, 34]. So
the above program could be abstracted automatically to the integer program below.
Here, list variables are replaced by integer variables that correspond to the lengths
of the lists.
We fix a (finite) set of program variables V = {v1 , . . . , vn } and represent inte-
ger programs as directed graphs. Nodes are program locations L and edges are
program transitions T . The set L contains a canonical start location ℓ0 . W.l.o.g.,
we assume that no transition leads back to ℓ0 and that all transitions T are reach-
able from ℓ0 . All transitions originating in ℓ0 are called initial transitions. The
¹ Since each step of our method over-approximates the runtime or size of a variable,
we actually obtain the bound 2 + |i0| + max{|i0|, |x0|} + |i0|², cf. Sect. 4.2.
transitions are labeled by formulas over the variables V and primed post-variables
V′ = {v′1 , . . . , v′n } which represent the values of the variables after the transition.
In the following graph, we represented these formulas by imperative commands.
For instance, t3 is labeled by the formula z > 0 ∧ u′ = z − 1 ∧ x′ = x ∧ y′ = y ∧ z′ = z.
We used standard invariant-generation techniques (based on the Octagon domain
[30]) to propagate simple integer invariants, adding the condition z > 0 to the
transitions t4 and t5 .

    [Program graph: locations ℓ0 , ℓ1 , ℓ2 , ℓ3 with transitions
      t0 : ℓ0 → ℓ1 , y = 0
      t1 : ℓ1 → ℓ1 , if(x > 0) y = y + 1; x = x − 1
      t2 : ℓ1 → ℓ2 , if(x ≤ 0) z = y
      t3 : ℓ2 → ℓ3 , if(z > 0) u = z − 1
      t4 : ℓ3 → ℓ3 , if(u > 0) if(z > 0) u = u − 1
      t5 : ℓ3 → ℓ2 , if(u ≤ 0) if(z > 0) z = z − 1 ]

Definition 1 (Programs). A transition is a tuple (ℓ, τ, ℓ′) where ℓ, ℓ′ ∈ L are
locations and τ is a formula relating the (pre-)variables V and the post-variables
V′. A program is a set of transitions T . A configuration (ℓ, v) consists of a
location ℓ ∈ L and a valuation v : V → Z. We write (ℓ, v) →t (ℓ′, v′) for an evaluation
step with a transition t = (ℓ, τ, ℓ′) iff the valuations v, v′ satisfy the formula τ
of t. We drop the index t if we do not care about the used transition and write
(ℓ, v) →ᵏ (ℓ′, v′) if k evaluation steps lead from configuration (ℓ, v) to (ℓ′, v′).
So for the program above, we have (ℓ1 , v1 ) →t2 (ℓ2 , v2 ) for any valuations
where v1 (x) = v2 (x) ≤ 0, v1 (y) = v2 (y) = v2 (z), and v1 (u) = v2 (u).
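A minimal OCaml sketch of this representation, in which locations and variables are strings and a transition's formula is abstracted as an executable relation between pre- and post-valuations (all names are illustrative, not part of the paper's formal development):

    module VarMap = Map.Make (String)

    type location = string
    type valuation = int VarMap.t                 (* v : V -> Z *)

    type transition = {
      src     : location;
      dst     : location;
      formula : valuation -> valuation -> bool;   (* relates pre- and post-variables *)
    }

    type program = transition list
    type configuration = location * valuation

    (* One evaluation step (l, v) ->_t (l', v'): the transition starts in l,
       ends in l', and its formula holds for the two valuations. *)
    let step (t : transition) ((l, v) : configuration) ((l', v') : configuration) : bool =
      t.src = l && t.dst = l' && t.formula v v'

    (* For instance, transition t2 of the example program above. *)
    let t2 : transition = {
      src = "l1";
      dst = "l2";
      formula = (fun v v' ->
        VarMap.find "x" v <= 0
        && VarMap.find "x" v' = VarMap.find "x" v
        && VarMap.find "y" v' = VarMap.find "y" v
        && VarMap.find "z" v' = VarMap.find "y" v
        && VarMap.find "u" v' = VarMap.find "u" v);
    }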
Let T always denote the analyzed program. Our goal is to find bounds on the
runtime and the sizes of program variables, where these bounds are expressed as
functions in the sizes of the input variables v1 , . . . , vn . For our example, our meth-
od will detect that its runtime is bounded by 3 + 4·|x| + |x|² (i.e., it is quadratic in
|x|). We measure the size of variable values v(vi ) by their absolute values |v(vi )|.
For a valuation v and a vector m = (m1 , ..., mn ) ∈ ℕⁿ , let v ≤ m abbreviate
|v(v1 )| ≤ m1 ∧ . . . ∧ |v(vn )| ≤ mn . We define runtime complexity by a function
rc that maps the sizes m of the program variables to the maximal number of
evaluation steps that are possible from a start configuration (ℓ0 , v) with v ≤ m.
To analyze complexity in a modular way, we construct a runtime approximation
R such that for any t ∈ T , R(t) over-approximates the number of times that t
can be used in an evaluation. In Def. 2, →∗ ◦ →t is the relation that allows to
perform arbitrary many evaluation steps followed by a step with transition t.
As we generate new bounds by composing previously found bounds, we only
use weakly monotonic functions R(t) (i.e., mi ≥ m′i implies (R(t))(m1 , . . . , mi ,
. . . , mn ) ≥ (R(t))(m1 , . . . , m′i , . . . , mn )). We define the set of upper bounds C as
the weakly monotonic functions from ℕⁿ → ℕ and ?, where ?(m) = ω for all
m ∈ ℕⁿ . We have ω > n for all n ∈ ℕ. In our implementation, we restrict R(t)
to functions constructed from max, min, ?, and polynomials from ℕ[v1 , . . . , vn ].
Definition 2 (Runtime Complexity and Approximation). The runtime
complexity rc : ℕⁿ → ℕ ∪ {ω} is defined as² rc(m) = sup{k ∈ ℕ | ∃ v0 , ℓ, v. v0 ≤
² Here, rc(m) = ω means non-termination or arbitrarily long runtime. Such programs
result from non-determinism, e.g., i = nondet(); while i > 0 do i = i − 1 done.
m ∧ (ℓ0 , v0 ) →ᵏ (ℓ, v)}. A function R : T → C is a runtime approximation iff
(R(t))(m) ≥ sup{k ∈ ℕ | ∃ v0 , ℓ, v. v0 ≤ m ∧ (ℓ0 , v0 ) (→∗ ◦ →t )ᵏ (ℓ, v)} holds
for all transitions t ∈ T and all m ∈ ℕⁿ . The initial runtime approximation R0
is defined as³ R0 (t) = 1 for all initial transitions t and R0 (t) = ? otherwise.
For size complexity, we analyze how large the value of a program variable
can become. Analogous to R, we use a size approximation S, where S(t, v′) is
a bound on the size of the variable v after a certain transition t was used in an
evaluation. For any transition t ∈ T and v ∈ V, we call |t, v′| a result variable.
Definition 3 (Result Variables and Size Approximation). Let RV =
{|t, v′| | t ∈ T , v ∈ V} be the set of result variables. A function S : RV → C
is a size approximation iff (S(t, v′))(m) ≥ sup{|v(v)| | ∃ v0 , ℓ, v. v0 ≤ m ∧
(ℓ0 , v0 ) (→∗ ◦ →t ) (ℓ, v)} holds for all |t, v′| ∈ RV and all m ∈ ℕⁿ . The initial
size approximation S0 is defined as S0 (t, v′) = ? for all |t, v′| ∈ RV. A pair (R, S)
is a complexity approximation if R is a runtime and S is a size approximation.
Our approach starts with the initial approximation (R0 , S0 ) and improves it
by iterative refinement. An approximation for the runtime complexity rc of the
whole program T can be obtained by adding the runtime bounds R(t) for its
transitions, i.e., (Σ_{t∈T} R(t)) ≥ rc. The overall bound Σ_{t∈T} R(t) = 3 + 4·|x| + |x|²
for our example was obtained in this way. Here for f, g ∈ C, the comparison,
addition, multiplication, maximum, and the minimum are defined point-wise.
So f ≥ g holds iff f(m) ≥ g(m) for all m ∈ ℕⁿ and f + g is the function with
(f + g)(m) = f(m) + g(m), where ω + n = ω for all n ∈ ℕ ∪ {ω}.
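For illustration, a small OCaml sketch of one possible representation of such bounds, as functions from variable sizes to ℕ extended with ω, together with the point-wise addition used to sum the R(t); the concrete representation in the implementation (polynomials combined with max and min) is richer, and all names here are illustrative:

    (* Natural numbers extended with omega. *)
    type ext_nat = Fin of int | Omega

    type bound = int array -> ext_nat      (* a weakly monotonic function N^n -> N u {omega} *)

    let add_ext (a : ext_nat) (b : ext_nat) : ext_nat =
      match a, b with
      | Fin x, Fin y -> Fin (x + y)
      | _ -> Omega                          (* omega absorbs addition *)

    (* Point-wise addition of bounds. *)
    let add_bounds (f : bound) (g : bound) : bound =
      fun m -> add_ext (f m) (g m)

    (* The unknown bound "?" and the constant bound 1 used for initial transitions. *)
    let unknown : bound = fun _ -> Omega
    let one : bound = fun _ -> Fin 1

    (* Over-approximation of rc: sum the runtime bounds of all transitions. *)
    let runtime_of_program (bounds : bound list) : bound =
      List.fold_left add_bounds (fun _ -> Fin 0) bounds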

3 Computing Runtime Bounds


To find runtime bounds automatically, we use (lexicographic combinations of)
polynomial rank functions (PRFs). Such rank functions are widely used in ter-
mination analysis and many techniques are available to generate PRFs auto-
matically [6, 8, 9, 12, 19–21, 33]. In Sect. 3.1 we recapitulate the basic approach
to use PRFs for the generation of time bounds. In Sect. 3.2, we improve it to
a novel modular approach which infers time bounds by combining PRFs with
information about variable sizes and runtime bounds found earlier.

3.1 Runtime Bounds from Polynomial Rank Functions


A PRF Pol : L → Z[v1 , . . . , vn ] assigns an integer polynomial Pol(ℓ) over the
program variables to each location ℓ. Then configurations (ℓ, v) are measured as
the value of the polynomial Pol(ℓ) for the numbers v(v1 ), . . . , v(vn ). To obtain
time bounds, we search for PRFs where no transition increases the measure of
configurations, and at least one transition decreases it. To rule out that this
decrease continues forever, we also require that the measure has a lower bound.
Definition 4 (PRF). We call Pol : L → Z[v1, . . . , vn] a polynomial rank func-
tion (PRF) for T iff there is a non-empty T_> ⊆ T such that the following holds:
• for all (ℓ, τ, ℓ′) ∈ T, we have τ ⇒ (Pol(ℓ))(v1, . . . , vn) ≥ (Pol(ℓ′))(v1, . . . , vn)
• for all (ℓ, τ, ℓ′) ∈ T_>, we have τ ⇒ (Pol(ℓ))(v1, . . . , vn) > (Pol(ℓ′))(v1, . . . , vn)
  and τ ⇒ (Pol(ℓ))(v1, . . . , vn) ≥ 1

³ Here, "1" denotes the constant function which maps all arguments m ∈ Nn to 1.

The constraints on a PRF Pol are the same constraints needed for termination
proofs, allowing to re-use existing PRF synthesis techniques and tools. They
imply that the transitions in T_> can only be used a limited number of times, as
each application of a transition from T_> decreases the measure, and no transition
increases it. Hence, if the program is called with input m1, . . . , mn, no transition
t ∈ T_> can be used more often than (Pol(ℓ0))(m1, . . . , mn) times. Consequently,
Pol(ℓ0) is a runtime bound for the transitions in T_>. Note that no such bound
is obtained for the remaining transitions in T.
In the program from Sect. 2, we could use Pol1 with Pol1(ℓ) = x for all ℓ ∈ L,
i.e., we measure configurations by the value of x. No transition increases this mea-
sure and t1 decreases it. The condition x > 0 ensures that the measure is positive
whenever t1 is used, i.e., T_> = {t1}. Hence Pol1(ℓ0) (i.e., the value x at the
beginning of the program) is a bound on the number of times t1 can be used.

Such PRFs lead to a basic technique for inferring time bounds. As mentioned
in Sect. 2, to obtain a modular approach afterwards, we only allow weakly mono-
tonic functions as complexity bounds. For any polynomial p ∈ Z[v1, . . . , vn], let
[p] result from p by replacing all coefficients and variables with their absolute
value (e.g., for Pol1(ℓ0) = x we have [Pol1(ℓ0)] = |x| and if p = 2 · v1 − 3 · v2
then [p] = 2 · |v1| + 3 · |v2|). As [p](m1, . . . , mn) ≥ p(m1, . . . , mn) holds for all
m1, . . . , mn ∈ Z, this is a sound approximation, and [p] is weakly monotonic. In
our example, the initial runtime approximation R0 can now be refined to R1,
with R1(t1) = [Pol1(ℓ0)] = |x| and R1(t) = R0(t) for all other transitions t.

Theorem 5 (Complexities from PRFs). Let R be a runtime approximation
and Pol be a PRF for T. Let⁴ R′(t) = [Pol(ℓ0)] for all t ∈ T_> and R′(t) = R(t)
for all other t ∈ T. Then, R′ is also a runtime approximation.
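As a small illustration of the step from Pol(ℓ0) to the weakly monotonic bound
[Pol(ℓ0)] and of the refinement in Thm. 5, here is a Python sketch (our own, not the
tool's code; abs_poly and refine_runtime are hypothetical helper names). A linear
polynomial is kept as a constant plus a map from variable names to integer
coefficients, and [·] takes absolute values of both coefficients and arguments.

def abs_poly(const, coeffs):
    """[p]: replace all coefficients and variables of p by their absolute values.

    coeffs maps variable names to integer coefficients; the result is a weakly
    monotonic function of the variable values that bounds p from above.
    """
    def bound(valuation):                 # valuation: variable name -> integer
        return abs(const) + sum(abs(c) * abs(valuation[v]) for v, c in coeffs.items())
    return bound

# p = 2*v1 - 3*v2  gives  [p] = 2*|v1| + 3*|v2|
p_bound = abs_poly(0, {"v1": 2, "v2": -3})
print(p_bound({"v1": -4, "v2": 5}))       # 2*4 + 3*5 = 23

def refine_runtime(R, strictly_decreasing, pol_l0_bound):
    """Thm. 5: transitions in T_> get the bound [Pol(l0)], all others keep R(t)."""
    return {t: (pol_l0_bound if t in strictly_decreasing else b) for t, b in R.items()}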
3.2 Modular Runtime Bounds from PRFs and Size Bounds
The basic method from Thm. 5 only succeeds in finding complexity bounds for
simple examples. In particular, it often fails for programs with non-linear run-
time. Although corresponding SAT- and SMT-encodings exist [20], generating
a suitable PRF Pol of a non-linear degree is a complex synthesis problem (and
undecidable in general). This is aggravated by the need to consider all of T at
once, which is required to check that no transition of T increases Pol ’s measure.
Therefore, we now present a new modular technique that only considers iso-
lated program parts T′ ⊆ T in each PRF synthesis step. The bounds obtained
from these "local" PRFs are then lifted to a bound expressed in the input values.
To this end, we combine them with bounds on the size of the variables when
entering the program part T′ and with a bound on the number of times that
T′ can be reached in evaluations of the full program T. This allows us to use
existing efficient procedures for the automated generation of (often linear) PRFs
for the analysis of programs with (possibly non-linear) runtime.

⁴ To ensure that R′(t) is at most as large as the previous bound R(t), one could also
define R′(t) = min{[Pol(ℓ0)], R(t)}. A similar improvement is possible for all other
techniques in the paper that refine the approximations R or S.
For instance, consider the subset T1 = {t1, . . . , t5} of the transitions in our
program. Using the constant PRF Pol2 with Pol2(ℓ1) = 1 and Pol2(ℓ2) =
Pol2(ℓ3) = 0, we see that t1, t3, t4, t5 do not increase the measure of configura-
tions and that t2 decreases it. Hence, in executions that are restricted to T1 and
that start in ℓ1, t2 is used at most [Pol2(ℓ1)] = 1 times. To obtain a global result,
we consider how often T1 is reached in a full program run. As T1 can only be
reached by the transition t0, we multiply its runtime approximation R1(t0) = 1
with the local bound [Pol2(ℓ1)] = 1 obtained for the sub-program T1. Thus, we
can refine the runtime approximation R1 to R2(t2) = R1(t0) · [Pol2(ℓ1)] = 1 · 1 =
1 and we set R2(t) = R1(t) for all other t.

In general, to estimate how often a sub-program T′ is reached in an evalu-
ation, we consider the transitions t̃ ∈ T that lead to an "entry location" ℓ in T′.
We multiply the runtime bound of such transitions t̃ with the bound [Pol(ℓ)]
for runs starting in ℓ. In our example, t0 is the only transition leading to T1 =
{t1, . . . , t5} and thus, the runtime bound R1(t0) = 1 is multiplied with [Pol2(ℓ1)].
Next, we consider the remaining transitions T2 = {t3, t4, t5} for which we have
no bound yet. We use Pol3(ℓ2) = Pol3(ℓ3) = z where (T2)_> = {t5}. So restricted
to the sub-program T2, t5 is used at most [Pol3(ℓ2)] = |z| times. Here, z refers to
the value when entering T2 (i.e., after transition t2). To translate this bound into
an expression in the input values, we substitute the variable z by its maximal
size after using the transition t2, i.e., by the size bound S(t2, z′). As the runtime
of the loop at ℓ2 depends on the size of z, our approach alternates between
computing runtime and size bounds. Our method to compute size bounds will
determine that the size of z after the transition t2 is at most |x|, cf. Sect. 4.
Hence, we replace the variable z in [Pol3(ℓ2)] = |z| by S(t2, z′) = |x|.
So in general, the polynomials [Pol(ℓ)] for the entry locations ℓ of T′ only
provide a bound in terms of the variable values at location ℓ. To find bounds ex-
pressed in the variable values at the start location ℓ0, we use our size approxima-
tion S and replace all variables in [Pol(ℓ)] by our approximation for their sizes
at location ℓ. For this, we define the application of polynomials to functions.
Let p ∈ N[v1, ..., vn] and f1, ..., fn ∈ C. Then p(f1, ..., fn) is the function with
(p(f1, ..., fn))(m) = p(f1(m), . . . , fn(m)) for all m ∈ Nn. Weak monotonicity of
p, f1, ..., fn also implies weak monotonicity of p(f1, ..., fn), i.e., p(f1, ..., fn) ∈ C.

For example, when analyzing how often t5 is used in the sub-program T2 =
{t3, t4, t5} above, we applied the polynomial [Pol3(ℓ2)] for the start location
ℓ2 of T2 to the size bounds S(t2, v′) for the variables x, y, z, u (i.e., to their
sizes before entering T2). As [Pol3(ℓ2)] = |z| and S(t2, z′) = |x|, we obtained
[Pol3(ℓ2)](S(t2, x′), S(t2, y′), S(t2, z′), S(t2, u′)) = |x|.
To compute a global bound, we also have to examine how often T2 can be
executed in a full program run. As T2 is only reached by t2, we obtain R3(t5) =
R2(t2) · |x| = 1 · |x| = |x|. For all other transitions t, we again have R3(t) = R2(t).

In Thm. 6, our technique is represented by the procedure TimeBounds. It
takes the current complexity approximation (R, S) and a sub-program T′, and
computes a PRF for T′. Based on this, R is refined to the approximation R′.
Theorem 6 (TimeBounds). Let (R, S) be a complexity approximation and
T′ ⊆ T such that T′ contains no initial transitions. Let L′ = {ℓ | (ℓ, τ, ℓ′) ∈
T′} contain all entry locations of T′ and let Pol be a PRF for T′. For any
ℓ ∈ L′, let T_ℓ contain all transitions (ℓ̃, τ̃, ℓ) ∈ T \ T′ leading to ℓ. Let R′(t) =
Σ_{ℓ∈L′, t̃∈T_ℓ} R(t̃) · [Pol(ℓ)](S(t̃, v1′), . . . , S(t̃, vn′)) for t ∈ T′_> and R′(t) = R(t) for
all t ∈ T \ T′_>. Then, TimeBounds(R, S, T′) = R′ is also a runtime approximation.
Here one can see why we require complexity bounds to be weakly monotonic.
The reason is that S(t̃, v′) over-approximates the size of v at some location ℓ.
Hence, to ensure that [Pol(ℓ)](S(t̃, v1′), . . . , S(t̃, vn′)) correctly over-approximates
how often transitions of T′_> can be applied in parts of evaluations that only use
transitions from T′, [Pol(ℓ)] must be weakly monotonic.
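The composition in Thm. 6 is easy to write down operationally. The following
Python sketch is our own illustration (time_bounds, the entry dictionary, and the
representation of bounds as functions of an input valuation are assumptions, not
the paper's code): it evaluates [Pol(ℓ)] on the size bounds of the transitions
entering ℓ and multiplies with their runtime bounds.

def time_bounds(R, S, entry, pol, strictly_decreasing, variables):
    """Sketch of TimeBounds (Thm. 6).

    R: transition -> bound, where a bound maps a valuation of the input sizes
       (a dict over the program variables) to a number.
    S: (transition, variable) -> bound.
    entry: entry location l -> set of transitions outside T' leading to l.
    pol: entry location l -> [Pol(l)], a function of a variable valuation.
    """
    def combined(m):
        total = 0
        for l, preds in entry.items():
            for t_pre in preds:
                sizes = {v: S[(t_pre, v)](m) for v in variables}   # sizes after t_pre
                total += R[t_pre](m) * pol[l](sizes)
        return total

    R_new = dict(R)
    for t in strictly_decreasing:         # the transitions in T'_>
        R_new[t] = combined
    return R_new

For the sub-program T2 of the running example, entry would map ℓ2 to {t2} and
pol[ℓ2] would be |z|, so the new bound evaluates to R(t2)(m) · S(t2, z′)(m) = |x|,
as in the text.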
By Thm. 6, we now obtain bounds for the remaining transitions in our exam-
ple. For T3 = {t3, t4}, we use Pol4(ℓ2) = 1, Pol4(ℓ3) = 0, and hence (T3)_> = {t3}.
The transitions t2 and t5 lead to T3, and thus, we obtain R4(t3) = R3(t2) · 1 +
R3(t5) · 1 = 1 + |x| and R4(t) = R3(t) for all other transitions t.

For T4 = {t4}, we use Pol5(ℓ3) = u with (T4)_> = T4. The part T4 is only en-
tered by the transition t3. So to get a global bound, we substitute u in [Pol5(ℓ3)]
= |u| by S(t3, u′) (in Sect. 4, we will determine S(t3, u′) = |x|). Thus, R5(t4) =
R4(t3) · S(t3, u′) = (1 + |x|) · |x| = |x| + |x|² and R5(t) = R4(t) for all other t ∈ T.
So while the runtime of T4 on its own is linear, the loop at location ℓ3 is reached
a linear number of times, i.e., its transition t4 is used quadratically often. Thus,
the overall program runtime is bounded by Σ_{t∈T} R5(t) = 3 + 4 · |x| + |x|².
4 Computing Size Bounds
The procedure TimeBounds improves the runtime approximation R, but up to
now the size approximation S was only used as an input. To infer bounds on the
sizes of variables, we proceed in three steps. First, we find local size bounds that
approximate the effect of a single transition on the sizes of variables. Then, we
construct a result variable graph that makes the flow of data between variables
explicit. Finally, we analyze each strongly connected component (SCC) of this
graph independently. Here, we combine the local size bounds with our runtime
approximation R to estimate how often transitions modify a variable value.
By a series of SMT queries, we find local size bounds Sl(t, v′) that describe
how the size of the post-variable v′ is related to the pre-variables of a transition
t. So while S(t, v′) is a bound on the size of v after using t in a full program run,
Sl(t, v′) is a bound on v after a single use of t.

Definition 7 (Local Size Approximation). We call Sl : RV → C a local size
approximation iff (Sl(t, v′))(m) ≥ sup{|v′(v)| | ∃ℓ, v, ℓ′, v′. v ≤ m ∧ (ℓ, v) →t
(ℓ′, v′)} for all |t, v′| ∈ RV and all m ∈ Nn.
In our example, we obtain Sl(t1, y′) = |y| + 1, as t1 increases y by 1. Similarly,
|t1, x′| is bounded by |x|. As t1 is only executed if x is positive, decreasing x by
1 does not increase its absolute value. The bound max{0, |x| − 1} would also be
allowed, but our approach does not compute better global size bounds from it.
To track how variables influence each other, we construct a result variable
graph (RVG) whose nodes are the result variables. An RVG for our example is
shown below. Here, we display local size bounds in the RVG to the left of the
result variables, separated by "≥" (e.g., "|x| ≥ |t1, x′|" means Sl(t1, x′) = |x|).
The RVG has an edge from a result variable |t̃, ṽ′| to |t, v′| if the transition t̃ can
be used directly before t and if ṽ occurs in the local size bound Sl(t, v′). Such an
edge means that the size of ṽ in the post-location of the transition t̃ may influence
the size of v in t's post-location.

[RVG of the example; each row lists the result variables of one transition together
with their local size bounds:
  |x| ≥ |t0, x′|   0 ≥ |t0, y′|       |z| ≥ |t0, z′|   |u| ≥ |t0, u′|
  |x| ≥ |t1, x′|   |y|+1 ≥ |t1, y′|   |z| ≥ |t1, z′|   |u| ≥ |t1, u′|
  |x| ≥ |t2, x′|   |y| ≥ |t2, y′|     |y| ≥ |t2, z′|   |u| ≥ |t2, u′|
  |x| ≥ |t3, x′|   |y| ≥ |t3, y′|     |z| ≥ |t3, z′|   |z| ≥ |t3, u′|
  |x| ≥ |t4, x′|   |y| ≥ |t4, y′|     |z| ≥ |t4, z′|   |u| ≥ |t4, u′|
  |x| ≥ |t5, x′|   |y| ≥ |t5, y′|     |z| ≥ |t5, z′|   |u| ≥ |t5, u′|]

To state which variables may influence a function f ∈ C, we define its active
variables as actV(f) = {vi ∈ V | ∃m1, . . . , mn, mi′ ∈ N. f(m1, . . . , mi, . . . , mn) ≠
f(m1, . . . , mi′, . . . , mn)}. Let pre(t) denote the transitions that may precede t in
evaluations, i.e., pre(t) = {t̃ ∈ T | ∃v0, ℓ, v. (ℓ0, v0) →* ∘ →t̃ ∘ →t (ℓ, v)}. While
pre(t) is undecidable in general, there exist several techniques to compute
over-approximations of pre(t), cf. [19, 21]. For example, one can disregard the
formulas of the transitions and approximate pre(t) by all transitions that end in
t's source location.
Definition 8 (RVG). Let Sl be a local size approximation. An RVG has T's re-
sult variables as nodes and the edges {(|t̃, ṽ′|, |t, v′|) | t̃ ∈ pre(t), ṽ ∈ actV(Sl(t, v′))}.

For the transition t2 which sets z = y, we obtain Sl(t2, z′) = |y|. Hence, we
have actV(Sl(t2, z′)) = {y}. The program graph implies pre(t2) = {t0, t1}, and
thus, our RVG contains edges from |t0, y′| to |t2, z′| and from |t1, y′| to |t2, z′|.
Each SCC of the RVG represents a set of result variables that may influence
each other. To lift the local approximation Sl to a global one, we consider each
SCC on its own. We treat the SCCs in topological order, reflecting the data flow.
As usual, an SCC is a maximal subgraph with a path from each node to every
other node. An SCC is trivial if it consists of a single node without an edge to
itself. In Sect. 4.1, we show how to deduce global bounds for trivial SCCs and in
Sect. 4.2, we handle non-trivial SCCs where transitions are applied repeatedly.
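To make the construction concrete, here is a small Python sketch (our own
illustration; build_rvg, sccs_in_topological_order, and the input dictionaries are
assumptions) that builds the RVG from over-approximations of pre(t) and actV,
and returns its SCCs in topological order via Tarjan's algorithm.

def build_rvg(transitions, variables, pre, active_vars):
    """Result variable graph: nodes (t, v'); edge from (t_pre, v_pre) to (t, v')
    if t_pre is in pre(t) and v_pre occurs in the local size bound of (t, v')."""
    nodes = [(t, v) for t in transitions for v in variables]
    edges = {n: [] for n in nodes}
    for t, v in nodes:
        for t_pre in pre[t]:
            for v_pre in active_vars[(t, v)]:
                edges[(t_pre, v_pre)].append((t, v))
    return nodes, edges

def sccs_in_topological_order(nodes, edges):
    """Tarjan's algorithm; it emits SCCs in reverse topological order, so we
    reverse the result to process predecessors before their successors."""
    index, low, on_stack, stack, sccs, counter = {}, {}, set(), [], [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in edges[v]:
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            scc = []
            while True:
                w = stack.pop(); on_stack.discard(w); scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in nodes:
        if v not in index:
            strongconnect(v)
    return list(reversed(sccs))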
4.1 Size Bounds for Trivial SCCs of the RVG
Sl(t, v′) approximates the size of v′ after the transition t w.r.t. t's pre-variables.
But our goal is to obtain a global bound S(t, v′) that approximates v′ w.r.t.
the initial values of the variables at the program start. For trivial SCCs that
consist of a result variable α = |t, v′| with an initial transition t, the local bound
Sl(α) is also the global bound S(α), as the start location ℓ0 has no incoming
transitions. For example, regard the trivial SCC with the result variable |t0, y′|.
As 0 ≥ |t0, y′| holds, its global size bound is also 0, and we set S(t0, y′) = 0.
Next, we consider trivial SCCs α = |t, v′| with incoming edges from other
SCCs. Now Sl(α)(m) is an upper bound on the size of v′ after using the tran-
sition t in a configuration where the sizes of the variables are at most m. To
obtain a global bound, we replace m by upper bounds on t's input variables.
The edges leading to α come from result variables |t̃, vi′| where t̃ ∈ pre(t) and
vi ∈ actV(Sl(α)). Thus, a bound for the result variable α = |t, v′| is obtained by
applying Sl(α) to S(t̃, v1′), . . . , S(t̃, vn′), for all t̃ ∈ pre(t).

As an example consider the result variable |t2, z′|. Its local size bound is
Sl(t2, z′) = |y|. To express this bound in terms of the input variables, we consider
the predecessors |t0, y′| and |t1, y′| of |t2, z′| in the RVG. So Sl(t2, z′) must be applied
to S(t0, y′) and S(t1, y′). If SCCs are handled in topological order, one already
knows that S(t0, y′) = 0 and S(t1, y′) = |x|. Thus, S(t2, z′) = max{0, |x|} = |x|.
Thm. 9 presents the resulting procedure SizeBounds. Based on the current
approximation (R, S), it improves the global size bound for the result variable
in a trivial SCC of the RVG. Non-trivial SCCs will be handled in Thm. 10.

Theorem 9 (SizeBounds for Trivial SCCs). Let (R, S) be a complexity ap-
proximation, let Sl be a local size approximation, and let {α} ⊆ RV be a trivial
SCC of the RVG. We define S′(α′) = S(α′) for α′ ≠ α and
• S′(α) = Sl(α), if α = |t, v′| for some initial transition t
• S′(α) = max{Sl(α)(S(t̃, v1′), . . . , S(t̃, vn′)) | t̃ ∈ pre(t)}, otherwise
Then SizeBounds(R, S, {α}) = S′ is also a size approximation.
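A direct reading of Thm. 9 in code (our own sketch; the representation of bounds
as functions of an input valuation is an assumption carried over from the earlier
sketches):

def size_bound_trivial_scc(S, S_local, alpha, pre, initial_transitions, variables):
    """Global size bound for a trivial SCC {alpha} with alpha = (t, v'), cf. Thm. 9."""
    t, _v = alpha
    if t in initial_transitions:
        return S_local[alpha]            # the start location has no incoming transitions
    def bound(m):                        # m: valuation of the program inputs
        results = []
        for t_pre in pre[t]:
            sizes = {u: S[(t_pre, u)](m) for u in variables}
            results.append(S_local[alpha](sizes))
        return max(results)
    return bound

For |t2, z′| this evaluates Sl(t2, z′) = |y| on the size bounds of the predecessors
|t0, y′| and |t1, y′|, i.e. it computes max{0, |x|} = |x|, as in the text.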
4.2 Size Bounds for Non-trivial SCCs of the RVG
Finally, we show how to improve the size bounds for result variables in non-trivial
SCCs of the RVG. Such an SCC corresponds to a loop and hence, each of its local
changes can be applied several times. By combining the time bounds R(t) for its
transitions t with the local size bounds Sl(t, v′), we approximate the overall effect
of these repeated changes. To simplify this approximation, we use the following
classification of result variables α depending on their local size bound Sl(α):

• α ∈ =̇ (α is an "equality") if the result variable is not larger than its pre-
  variables or a constant, i.e., iff there is a number eα ∈ N with max{eα, m1,
  . . . , mn} ≥ (Sl(α))(m1, . . . , mn) for all m1, . . . , mn ∈ N.
• α ∈ +̇ (α "adds a constant") if the result variable only increases over the
  pre-variables by a constant, i.e., iff there is a number eα ∈ N with eα +
  max{m1, . . . , mn} ≥ (Sl(α))(m1, . . . , mn) for all m1, . . . , mn ∈ N.
• α ∈ Σ̇ (α "adds variables") if the result variable is not larger than the sum
  of the pre-variables and a constant, i.e., iff there is a number eα ∈ N with
  eα + Σ_{i∈{1,...,n}} mi ≥ (Sl(α))(m1, . . . , mn) for all m1, . . . , mn ∈ N.

So for our example, we get {|t3, z′|, |t4, z′|, |t5, z′|} ⊆ =̇ since Sl(t3, z′) =
Sl(t4, z′) = Sl(t5, z′) = |z|. Similarly, we have |t1, y′| ∈ +̇ as Sl(t1, y′) = |y| + 1.
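For local bounds of the simple shape e + Σ c_v·|v| with c_v ∈ {0, 1}, the finest
matching class can be read off syntactically. The sketch below is our own
illustration (finest_class is a hypothetical helper); bounds outside this shape,
such as 2·|x|, are rejected, in line with the restriction discussed next.

def finest_class(e, coeffs):
    """Classify a local size bound  e + sum(c_v * |v|)  with c_v in {0, 1}.

    Returns "EQ" for the class "equality", "ADD_CONST" for "adds a constant",
    "ADD_VARS" for "adds variables", or None if the bound does not fit (e.g. 2*|x|).
    """
    if any(c not in (0, 1) for c in coeffs.values()):
        return None
    used = [v for v, c in coeffs.items() if c == 1]
    if not used or (e == 0 and len(used) == 1):
        return "EQ"                      # bounded by a constant or by one pre-variable
    if len(used) == 1:
        return "ADD_CONST"               # one pre-variable plus a constant
    return "ADD_VARS"                    # sum of several pre-variables plus a constant

print(finest_class(0, {"z": 1}))         # EQ        (e.g. Sl(t3, z') = |z|)
print(finest_class(1, {"y": 1}))         # ADD_CONST (e.g. Sl(t1, y') = |y| + 1)
print(finest_class(0, {"x": 1, "i": 1})) # ADD_VARS  (e.g. Sl(t1, x') = |x| + |i|)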
In the following, local size bounds like 2 · |x| are not handled because we
are currently interested only in bounds that can be expressed by polynomials
(and max and min). If a change bounded by 2 · |x| is applied |y| times, the
resulting value is bounded only by the exponential function 2^|y| · |x|. Of course,
our approach could be extended to infer such exponential size bounds as well.
In Sect. 5, we discuss the limitations and possible extensions of our approach.
Similar to pre(t) for transitions t, let pre(α) for a result variable α be those
α̃ ∈ RV with an edge from α̃ to α in the RVG. To deduce a bound on the size of
the result variables α in an SCC C, we first consider the size of values entering
the SCC C. Hence, we require that the resulting size bound S(α) for α ∈ C
should be at least as large as the sizes S(α̃) of the inputs α̃, i.e., of those result
variables α̃ outside the SCC C that have an edge to some α ∈ C. Moreover, if the
SCC C contains result variables α = |t, v′| ∈ =̇, then the transition t either does
not increase the size at all, or increases it to the constant eα. Hence, the bound
S(α) for the result variables α in C should also be at least max{eα | α ∈ =̇}.⁵

For example, when computing the global size bounds for the result variables
in the SCC C = {|t3, z′|, |t4, z′|, |t5, z′|} in our example, the only predecessor
of this SCC is |t2, z′| with S(t2, z′) = |x|. For each α ∈ C, the corresponding
constant eα is 0. Thus, for all α ∈ C we obtain S(α) = max{|x|, 0} = |x|.
To handle result variables α ∈ +̇ \ =̇ that add a constant eα, we consider how
often this addition is performed. Thus, while TimeBounds from Thm. 6 uses the
size approximation S to improve the runtime approximation R, SizeBounds uses
R to improve S. We define R(|t, v′|) = R(t) for all result variables |t, v′|. Then,
since R(α) is a bound on the number of times that eα is added, the repeated
traversal of α's transition increases the overall size by at most R(α) · eα.

For instance, consider the result variable α = |t1, y′| in our example. Its local
size bound is Sl(t1, y′) = |y| + 1, i.e., each traversal of t1 increases y by eα = 1.
As before, we use the size bounds on the predecessors of the SCC {α} as a
basis. So the input value when entering the SCC is S(t0, y′) = 0. Since t1 is
executed at most R(α) = R(t1) = |x| times, we obtain the global bound S(α) =
S(t0, y′) + R(α) · eα = 0 + |x| · 1 = |x|.
Finally, we discuss how to handle result variables α ∈ Σ̇ \ +̇. To this end,
consider the program from Sect. 1 again. Its program graph is depicted below.

[Program graph figure: transitions t0; t1: if(i > 0) then i := i − 1, x := x + i;
t2: if(i ≤ 0); t3: if(x > 0) then x := x − 1.]

Our method detects the runtime bounds R(t0) = 1, R(t1) = |i|, and R(t2) = 1.
To obtain size bounds, we first generate the RVG (shown below). Now we can
infer the global size bounds S(t, i′) = |i| for all t ∈ T and S(t0, x′) = |x|. Next
we regard the result variable α = |t1, x′| with the local bound Sl(α) = |x| + |i|.
Thus, we have α ∈ Σ̇ \ +̇.
For result variables α that sum up several program variables, we require that
only one comes from α's own SCC in the RVG. Otherwise, we would also consider
loops like while z > 0 do x = x + y; y = x; z = z − 1; done that increase
the size of x exponentially. To express our requirement formally, let Vα = {v |
|t, v′| ∈ pre(α) ∩ C} be those variables whose result variables in C have an edge to
α. We require |Vα| = 1, i.e., no two result variables |t, v′|, |t̃, ṽ′| in α's SCC C with
v ≠ ṽ may have edges to α. But we allow incoming edges from arbitrary result
variables outside the SCC. The requirement is satisfied in our RVG, as α = |t1, x′|
is a predecessor of itself and its SCC contains no other result variables. Thus,
Vα = {x}. Of course, α also has predecessors of the form |t, i′| outside the SCC.

[RVG of this program:
  |i| ≥ |t0, i′|   |x| ≥ |t0, x′|
  |i| ≥ |t1, i′|   |x|+|i| ≥ |t1, x′|
  |i| ≥ |t2, i′|   |x| ≥ |t2, x′|
  |i| ≥ |t3, i′|   |x| ≥ |t3, x′|]

⁵ Again, "eα" denotes the constant function mapping all values from Nn to eα.
For each variable v, let fvα be an upper bound on the size of those result
variables |t, v′| ∉ C that have edges to α, i.e., fvα = max{S(t, v′) | |t, v′| ∈
pre(α) \ C}. The execution of α's transition then means that the value of the
variable in Vα can be increased by adding fvα (for all v ∈ actV(Sl(α)) \ Vα) plus
the constant eα. Again, this can be repeated at most R(α) times. So the overall
size is bounded by adding R(α) · (eα + Σ_{v∈actV(Sl(α))\Vα} fvα).

In our example with α = |t1, x′|, we have Vα = {x}, actV(Sl(α)) = actV(|x| +
|i|) = {i, x}, and fiα = max{S(t0, i′), S(t1, i′)} = |i|. When entering α's SCC,
the input is bounded by the preceding transitions, i.e., by max{S(t0, i′), S(t1, i′),
S(t0, x′)} = max{|i|, |x|}. By traversing α's transition t1 repeatedly (at most
R(α) = R(t1) = |i| times), this value may be increased by adding R(α) · (eα +
fiα) = |i| · (0 + |i|) = |i|². Hence, we obtain S(α) = max{|i|, |x|} + |i|². Conse-
quently, we also get S(t2, x′) = S(t3, x′) = max{|i|, |x|} + |i|². Thm. 10 extends
the procedure SizeBounds from Thm. 9 to non-trivial SCCs.
Theorem 10 (SizeBounds for Non-Trivial SCCs). Let (R, S) be a complexi-
ty approximation, Sl a local size approximation, and C ⊆ RV a non-trivial SCC
of the RVG. If there is an α ∈ C with α ∉ Σ̇ or both α ∈ Σ̇ \ +̇ and |Vα| > 1, then
we set S′ = S. Otherwise, for all α ∉ C let S′(α) = S(α). For all α ∈ C, we set
  S′(α) = max( {S(α̃) | there is an α ∈ C with α̃ ∈ pre(α) \ C} ∪ {eα | α ∈ =̇} )
          + Σ_{α∈+̇\=̇} R(α) · eα
          + Σ_{α∈Σ̇\+̇} R(α) · (eα + Σ_{v∈actV(Sl(α))\Vα} fvα)
Then SizeBounds(R, S, C) = S′ is also a size approximation.
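A possible rendering of Thm. 10 in code, under the same assumptions as the
earlier sketches (bounds as functions of the input valuation; classes and constants
as computed by a helper like finest_class above; all names are ours):

def size_bound_nontrivial_scc(C, S, R, cls, e, pre, active_vars, V_alpha):
    """Size bound shared by all result variables of a non-trivial SCC C (Thm. 10).

    cls[alpha] in {"EQ", "ADD_CONST", "ADD_VARS", None}, e[alpha] is the constant
    of alpha's local bound, pre[alpha] are its RVG predecessors, V_alpha[alpha] the
    SCC-internal source variables, R[alpha] the runtime bound of alpha's transition.
    """
    for alpha in C:                       # the cases Thm. 10 cannot handle
        if cls[alpha] is None or (cls[alpha] == "ADD_VARS" and len(V_alpha[alpha]) > 1):
            return None                   # keep the old approximation S unchanged

    def bound(m):
        entering = [S[a](m) for alpha in C for a in pre[alpha] if a not in C]
        total = max(entering + [e[alpha] for alpha in C if cls[alpha] == "EQ"], default=0)
        for alpha in C:
            if cls[alpha] == "ADD_CONST":
                total += R[alpha](m) * e[alpha]
            elif cls[alpha] == "ADD_VARS":
                extra = e[alpha]
                for v in active_vars[alpha] - V_alpha[alpha]:
                    extra += max((S[a](m) for a in pre[alpha]
                                  if a not in C and a[1] == v), default=0)
                total += R[alpha](m) * extra
        return total
    return bound

On the second example with C = {|t1, x′|}, the entering sizes are |x| and |i|, and
the "adds variables" case contributes R(t1)·(0 + f_i) = |i|·|i|, giving
max{|i|, |x|} + |i|² as in the text.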
In our example, by the inferred size bounds we can derive a runtime bound for
the last transition t3. When calling TimeBounds on T′ = {t3}, it finds the PRF
Pol(ℓ2) = x, implying that T′'s runtime is linear. When reaching T′, the size of x
is bounded by S(t2, x′). So R(t3) = R(t2) · [Pol(ℓ2)](S(t2, i′), S(t2, x′)) = 1 ·
S(t2, x′) = max{|i|, |x|} + |i|². So a bound on the overall runtime is Σ_{t∈T} R(t)
= 2 + |i| + max{|i|, |x|} + |i|², i.e., it is linear in |x| and quadratic in |i|.
5 Implementation and Related Work
We presented a new alternating modular approach for runtime and size com-
plexity analysis of integer programs. Each step only considers a small part of the
program, and runtime bounds help to infer size bounds and vice versa.
Our overall procedure to compute the runtime and size approximations R and
S is displayed below.

  (R, S) := (R0, S0)
  while there are t, v with R(t) = ? or S(t, v′) = ? do
    T′ := {t ∈ T | R(t) = ?}
    R := TimeBounds(R, S, T′)
    for all SCCs C of the RVG in topological order do
      S := SizeBounds(R, S, C)
    done
  done

After starting with the initial approximations R0, S0, the procedure TimeBounds
(Thm. 6) is used to improve the runtime bounds for those transitions T′ for which
we have no bound yet.⁶ Afterwards, the procedure SizeBounds (Thm. 9 and 10) considers the SCCs of the
result variable graph in topological order to update the size approximation.
When all bounds have been determined, R and S are returned. Of course, we
do not always succeed in finding bounds for all transitions and variables. Thus,
while the procedure keeps on improving the bounds, at any point during its run,
R and S are over-approximations of the actual runtimes and sizes. Hence, the
procedure can be interrupted at any time and it always returns correct bounds.
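A driver for this alternation might look as follows (an illustrative Python sketch,
not KoAT's code; time_bounds and size_bounds stand for the procedures of Thm. 6
and Thm. 9/10, and the explicit round limit is our addition — the paper's procedure
simply runs until all bounds are known and can be interrupted at any time).

def analyse(R0, S0, rvg_sccs, time_bounds, size_bounds, max_rounds=25):
    """Alternate between runtime and size bound refinement until all bounds are known."""
    UNKNOWN = None                       # plays the role of the bound "?"
    R = dict(R0)                         # transition -> bound or UNKNOWN
    S = dict(S0)                         # (transition, variable) -> bound or UNKNOWN
    for _ in range(max_rounds):
        if all(b is not UNKNOWN for b in R.values()) and \
           all(b is not UNKNOWN for b in S.values()):
            break
        T_unknown = {t for t, b in R.items() if b is UNKNOWN}
        R = time_bounds(R, S, T_unknown)
        for scc in rvg_sccs:             # SCCs of the RVG in topological order
            S = size_bounds(R, S, scc)
    return R, S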
Several methods to determine symbolic complexity bounds for programs have
been developed in recent years. The approaches of [3, 4] (implemented in COSTA
and its backend PUBS) and [37] (implemented in Loopus) also use an iterative
procedure based on termination proving techniques to find runtime bounds for
isolated loops, which are then combined to an overall result. However, [3,4] han-
dles all loop transitions at once and [37] is restricted to termination proofs via the
size-change principle [28]. The approach of [6] (implemented in Rank) first proves
termination by a lexicographic combination of linear rank functions, similar to
our Thm. 6. However, while Thm. 6 combines these rank functions with size
bounds, [6] approximates the reachable state space using Ehrhart polynomials.
The tool SPEED [24] instruments programs by counters and employs an invariant
generation tool to obtain bounds on these counters. The ABC system [11] also
determines symbolic bounds for nested loops, but does not treat sequences of
loops. Finally, our technique in Sect. 4.2 to infer size bounds by estimating the
effect of repeated local changes has some similarities to the approach of [10]
which defines syntactic criteria for programs to have polynomial complexity.
The work on determining the worst-case execution time (WCET) for real-
time systems [36] is largely orthogonal to symbolic loop bounds. It distinguishes
processor instructions according to their complexity, but requires loop bounds
to be provided by the user. Recently, recurrence solving has been used as an
automatic pre-processing step for WCET analysis in the tool r-TuBound [27].
There is also a wealth of work on complexity for declarative paradigms. For
instance, resource aware ML [26] analyzes amortized complexity for recursive
functional programs with inductive data types, but it does not handle programs
whose complexity depends on integers. There are also numerous techniques for
complexity analysis of term rewriting and logic programming [7, 18, 22, 31, 32].
⁶ After generating a PRF Pol for T′, it is advantageous to extend T′ by all remaining
transitions (ℓ, τ, ℓ′) from T \ T′ where the measure Pol is also (weakly) decreasing,
i.e., where τ ⇒ (Pol(ℓ))(v1, . . . , vn) ≥ (Pol(ℓ′))(v1, . . . , vn). Calling the procedure
TimeBounds with this extended set T′ yields better results and may also improve pre-
viously found runtime bounds. We also used this strategy for the example in Sect. 3.
Our approach builds upon well-known basic concepts (like lexicographic rank
functions), but uses them in a novel way to obtain a more powerful technique
than previous approaches. In particular, in contrast to previous work, our
approach deals with non-linear information flow between different program parts.
To evaluate our approach, we implemented a prototype KoAT and compared
it with PUBS [3, 4] and Rank [6]. We also contacted the authors of SPEED [24]
and Loopus [37], but were not able to obtain these tools. We did not compare
KoAT to ABC [11], RAML [26], or r-TuBound [27], as their input or analysis goals
differ considerably from ours. As benchmarks, we collected 682 programs from
the literature on termination and complexity of integer programs. These include
all 36 examples from the evaluation of Rank, all but one of the 53 examples used
to evaluate PUBS,7 all 27 examples from the evaluations of SPEED, and the ex-
amples from the current paper (which can be handled by KoAT, but not by PUBS
or Rank). Where examples were available as C programs, we used the tool KITTeL
[19] to transform them into integer programs automatically. The collection con-
tains 48 recursive examples, which cannot be analyzed with Rank, and 20 exam-
ples with non-linear arithmetic, which can be handled by neither Rank nor PUBS.
The remaining examples are compatible with all tested tools. All examples, the
results of the three tools, and a binary of KoAT are available at [1].
The table below illustrates how often each tool could infer a specific runtime
bound for the example set. Here, 1, log n, n, n log n, n², n³, and n>3 represent
their corresponding asymptotic classes and EXP is the class of exponential
functions. In the column "Time", we give the average runtime on those examples
where the respective tool was successful.

           1   log n    n   n log n   n²   n³   n>3   EXP   Time
  KoAT   121       0  145         0   59    3     3     0   1.1 s
  PUBS   116       5  131         5   22    7     0     6   0.8 s
  Rank    56       0   19         0    8    1     0     0   0.5 s

The average runtime on those 65 examples where all tools succeeded was 0.5 s for
KoAT, 0.2 s for PUBS, and 0.6 s for Rank. The benchmarks were executed on a computer
with 6GB of RAM and an Intel i7 CPU clocked at 3.07 GHz, using a timeout of
60 seconds for each example. A longer timeout did not yield additional results.
On this collection, our approach was more powerful than the two other tools
and still efficient. In fact, KoAT is only a simple prototype whose efficiency could
still be improved considerably by fine-tuning its implementation. As shown in
[1], there are 77 examples where KoAT infers a bound of a lower asymptotic
class than PUBS, 548 examples where the bounds are in the same class, and 57
examples where the bound of PUBS is (asymptotically) more precise than KoAT’s.
Similarly, there are 259 examples where KoAT is asymptotically more precise than
Rank, 410 examples where they are equal, and 13 examples where Rank is more
precise. While KoAT is the only one of the three tools that can also handle non-linear
arithmetic, even when disregarding the 20 examples with non-linear arithmetic,
KoAT can detect runtime bounds for 325 examples, whereas PUBS succeeds only
for 292 programs and Rank only finds bounds for 84 examples.
A limitation of our implementation is that it only generates (possibly non-
linear) PRFs to detect polynomial bounds. In contrast, PUBS uses PRFs to find
logarithmic and exponential complexity bounds as well [3]. Such an extension
⁷ We removed one example with undefined semantics.
could also be directly integrated into our method. Moreover, we are restricted to
weakly monotonic bounds in order to allow their modular composition. Another
limitation is that our size analysis only handles certain forms of local size bounds
in non-trivial SCCs of the result variable graph. For that reason, it often over-
approximates the sizes of variables that are both incremented and decremented
in the same loop. Due to all these imprecisions, our approach sometimes infers
bounds that are asymptotically larger than the actual asymptotic costs.
Our method is easily extended. In [14], we provide an extension to handle
(possibly recursive) procedure calls in a modular fashion. Moreover, we show how
to treat other forms of bounds (e.g., on the number of sent network requests)
and how to compute bounds for separate program parts in advance or in parallel.
Future work will be concerned with refining the precision of the inferred run-
time and size approximations and with improving our implementation (e.g., by
extending it to infer also non-polynomial complexities). Moreover, instead of ab-
stracting heap operations to integers, we intend to investigate an extension of our
approach to apply it directly to programs operating on the heap. Finally, simi-
lar to the coupling of COSTA with the tool KeY in [5], we want to automatically
certify the complexity bounds found by our implementation KoAT.
Acknowledgments. We thank A. Ben-Amram, B. Cook, C. von Essen, C. Otto
for valuable discussions and C. Alias and S. Genaim for help with the experiments.
References
1. http://aprove.informatik.rwth-aachen.de/eval/IntegerComplexity/
2. Albert, E., Arenas, P., Codish, M., Genaim, S., Puebla, G., Zanardini, D.: Termi-
nation analysis of Java Bytecode. In: Barthe, G., de Boer, F.S. (eds.) FMOODS
2008. LNCS, vol. 5051, pp. 2–18. Springer, Heidelberg (2008)
3. Albert, E., Arenas, P., Genaim, S., Puebla, G.: Closed-form upper bounds in static
cost analysis. JAR 46(2), 161–203 (2011)
4. Albert, E., Arenas, P., Genaim, S., Puebla, G., Zanardini, D.: Cost analysis of
object-oriented bytecode programs. TCS 413(1), 142–159 (2012)
5. Albert, E., Bubel, R., Genaim, S., Hähnle, R., Puebla, G., Román-Dı́ez, G.: Verified
resource guarantees using COSTA and KeY. In: Khoo, S.-C., Siek, J.G. (eds.) PEPM
2011, pp. 73–76. ACM Press (2011)
6. Alias, C., Darte, A., Feautrier, P., Gonnord, L.: Multi-dimensional rankings, pro-
gram termination, and complexity bounds of flowchart programs. In: Cousot, R.,
Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 117–133. Springer, Heidelberg
(2010)
7. Avanzini, M., Moser, G.: A combination framework for complexity. In: van Raams-
donk, F. (ed.) RTA 2013. LIPIcs, vol. 21, pp. 55–70. Dagstuhl Publishing (2013)
8. Bagnara, R., Mesnard, F., Pescetti, A., Zaffanella, E.: A new look at the automatic
synthesis of linear ranking functions. IC 215, 47–67 (2012)
9. Ben-Amram, A.M., Genaim, S.: On the linear ranking problem for integer linear-
constraint loops. In: Giacobazzi, R., Cousot, R. (eds.) POPL 2013, pp. 51–62. ACM
Press (2013)
10. Ben-Amram, A.M., Jones, N.D., Kristiansen, L.: Linear, polynomial or exponential?
Complexity inference in polynomial time. In: Beckmann, A., Dimitracopoulos, C.,
Löwe, B. (eds.) CiE 2008. LNCS, vol. 5028, pp. 67–76. Springer, Heidelberg (2008)

11. Blanc, R., Henzinger, T.A., Hottelier, T., Kovács, L.: ABC: Algebraic bound compu-
tation for loops. In: Clarke, E.M., Voronkov, A. (eds.) LPAR 2010. LNCS (LNAI),
vol. 6355, pp. 103–118. Springer, Heidelberg (2010)
12. Bradley, A.R., Manna, Z., Sipma, H.B.: Linear ranking with reachability. In: Etes-
sami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 491–504. Springer,
Heidelberg (2005)
13. Brockschmidt, M., Cook, B., Fuhs, C.: Better termination proving through cooper-
ation. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 413–429.
Springer, Heidelberg (2013)
14. Brockschmidt, M., Emmes, F., Falke, S., Fuhs, C., Giesl, J.: Alternating runtime and
size complexity analysis of integer programs. Tech. Rep. AIB 2013-12, RWTH Aachen
(2013), available from [1] and from http://aib.informatik.rwth-aachen.de
15. Brockschmidt, M., Musiol, R., Otto, C., Giesl, J.: Automated termination proofs
for Java programs with cyclic data. In: Madhusudan, P., Seshia, S.A. (eds.) CAV
2012. LNCS, vol. 7358, pp. 105–122. Springer, Heidelberg (2012)
16. Cook, B., Podelski, A., Rybalchenko, A.: Termination proofs for systems code. In:
Schwartzbach, M., Ball, T. (eds.) PLDI 2006, pp. 415–426. ACM Press (2006)
17. Cook, B., See, A., Zuleger, F.: Ramsey vs. Lexicographic termination proving.
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 47–61.
Springer, Heidelberg (2013)
18. Debray, S., Lin, N.: Cost analysis of logic programs. TOPLAS 15, 826–875 (1993)
19. Falke, S., Kapur, D., Sinz, C.: Termination analysis of C programs using compiler
intermediate languages. In: Schmidt-Schauß, M. (ed.) RTA 2011. LIPIcs, vol. 10,
pp. 41–50. Dagstuhl Publishing (2011)
20. Fuhs, C., Giesl, J., Middeldorp, A., Schneider-Kamp, P., Thiemann, R., Zankl, H.:
SAT solving for termination analysis with polynomial interpretations. In: Marques-
Silva, J., Sakallah, K.A. (eds.) SAT 2007. LNCS, vol. 4501, pp. 340–354. Springer,
Heidelberg (2007)
21. Fuhs, C., Giesl, J., Plücker, M., Schneider-Kamp, P., Falke, S.: Proving termination
of integer term rewriting. In: Treinen, R. (ed.) RTA 2009. LNCS, vol. 5595, pp.
32–47. Springer, Heidelberg (2009)
22. Giesl, J., Ströder, T., Schneider-Kamp, P., Emmes, F., Fuhs, C.: Symbolic evalu-
ation graphs and term rewriting: A general methodology for analyzing logic pro-
grams. In: De Schreye, D., Janssens, G., King, A. (eds.) PPDP 2012, pp. 1–12.
ACM Press (2012)
23. Giesl, J., Thiemann, R., Schneider-Kamp, P., Falke, S.: Mechanizing and improving
dependency pairs. JAR 37(3), 155–203 (2006)
24. Gulwani, S., Mehra, K.K., Chilimbi, T.M.: SPEED: Precise and efficient static es-
timation of program computational complexity. In: Shao, Z., Pierce, B.C. (eds.)
POPL 2009, pp. 127–139. ACM Press (2009)
25. Harris, W.R., Lal, A., Nori, A.V., Rajamani, S.K.: Alternation for termination. In:
Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 304–319. Springer,
Heidelberg (2010)
26. Hoffmann, J., Aehlig, K., Hofmann, M.: Multivariate amortized resource analysis.
TOPLAS 34(3) (2012)
27. Knoop, J., Kovács, L., Zwirchmayr, J.: r-TuBound: Loop bounds for WCET analy-
sis (Tool paper). In: Bjørner, N., Voronkov, A. (eds.) LPAR 2012. LNCS, vol. 7180,
pp. 435–444. Springer, Heidelberg (2012)
28. Lee, C.S., Jones, N.D., Ben-Amram, A.M.: The size-change principle for program
termination. In: Hankin, C., Schmidt, D. (eds.) POPL 2001, pp. 81–92. ACM Press
(2001)

29. Magill, S., Tsai, M.H., Lee, P., Tsay, Y.K.: Automatic numeric abstractions for
heap-manipulating programs. In: Hermenegildo, M.V., Palsberg, J. (eds.), POPL
2010, pp. 211–222 (2010)
30. Miné, A.: The Octagon abstract domain. HOSC 19(1), 31–100 (2006)
31. Navas, J., Mera, E., López-Garcı́a, P., Hermenegildo, M.V.: User-definable resource
bounds analysis for logic programs. In: Dahl, V., Niemelä, I. (eds.) ICLP 2007.
LNCS, vol. 4670, pp. 348–363. Springer, Heidelberg (2007)
32. Noschinski, L., Emmes, F., Giesl, J.: Analyzing innermost runtime complexity of
term rewriting by dependency pairs. JAR 51(1), 27–56 (2013)
33. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear rank-
ing functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp.
239–251. Springer, Heidelberg (2004)
34. Spoto, F., Mesnard, F., Payet, É.: A termination analyser for Java Bytecode based
on path-length. TOPLAS 32(3) (2010)
35. Tsitovich, A., Sharygina, N., Wintersteiger, C.M., Kroening, D.: Loop summariza-
tion and termination analysis. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS
2011. LNCS, vol. 6605, pp. 81–95. Springer, Heidelberg (2011)
36. Wilhelm, R., Engblom, J., Ermedahl, A., Holsti, N., Thesing, S., Whalley, D.B.,
Bernat, G., Ferdinand, C., Heckmann, R., Mitra, T., Mueller, F., Puaut, I.,
Puschner, P.P., Staschulat, J., Stenström, P.: The worst-case execution-time prob-
lem: overview of methods and survey of tools. TECS 7(3), 36:1–36:53 (2008)
37. Zuleger, F., Gulwani, S., Sinn, M., Veith, H.: Bound analysis of imperative pro-
grams with the size-change abstraction. In: Yahav, E. (ed.) SAS 2011. LNCS,
vol. 6887, pp. 280–297. Springer, Heidelberg (2011)
Proving Nontermination via Safety

Hong-Yi Chen¹, Byron Cook²,¹, Carsten Fuhs¹,
Kaustubh Nimkar¹, and Peter O'Hearn¹

¹ University College London, UK
² Microsoft Research, UK
Abstract. We show how the problem of nontermination proving can be
reduced to a question of underapproximation search guided by a safety
prover. This reduction leads to new nontermination proving implementa-
tion strategies based on existing tools for safety proving. Our preliminary
implementation beats existing tools. Furthermore, our approach leads to
easy support for programs with unbounded nondeterminism.
1 Introduction
The problem of proving program nontermination represents an interesting com-
plement to termination as, unlike safety, termination’s falsification cannot be
witnessed by a finite trace. While the problem of proving termination has now
been extensively studied, the search for reliable and scalable methods for proving
nontermination remains open.
In this paper we develop a new method of proving nontermination based on a
reduction to safety proving that leverages the power of existing tools. An iterative
algorithm is developed which uses counterexamples to a fixed safety property
to refine an underapproximation of a program. With our approach, existing
safety provers can now be employed to prove nontermination of programs that
previous techniques could not handle. Not only does the new approach perform
better, it also leads to nontermination proving tools supporting programs with
nondeterminism, for which previous tools had only little support.
Limitations. Our proposed nontermination procedure can only prove nontermi-
nation. On terminating programs the procedure is likely to diverge (although
some heuristics are proposed which aim to avoid this). While our method could
be extended to further programming language features (e.g. heap, recursion),
in practice the supported features of an underlying safety prover determine
applicability. Our implementation uses a safety prover for non-recursive programs
with linear integer arithmetic commands.

Example. Before discussing our procedure in a formal setting, we begin with a
simple example, given below.

    if (k ≥ 0)
      skip;
    else
      i := −1;

    while (i ≥ 0) {
      i := nondet();
    }

    i := 2;

In this program the command i := nondet() represents nondeterministic value
introduction into the variable i. The loop in this program
is nonterminating when the program is invoked with appropriate inputs and
when appropriate choices for nondet assignment are made. We are interested in
automatically detecting this nontermination. The basis of our procedure is the
search for an underapproximation of the original program that never terminates.
As “never terminates” can be encoded as safety property (defined later as closed
recurrence in Sect. 2), we can then iterate a safety prover together with a method
of underapproximating based on counterexamples. We have to be careful, how-
ever, to find the right underapproximation in order to avoid unsoundness.
In order to find the desired underapproximation for our example, we introduce
an assume statement at the beginning with the initial precondition true. We
also place assume(true) statements after each use of nondet. We then put an
assert(false) statement at points where the loop under consideration exits
(thus encoding the “never terminates” property). See Fig. 1(a).
We can now use a safety checker to search for paths that violate this assertion.
Any error path clearly cannot contribute towards the nontermination of the loop.
After detecting such a path we calculate restrictions on the introduced assume
statements such that the path is no longer feasible when the restriction is applied.
Initially as a first counterexample to safety, we might get the path k < 0, i :=
−1, i < 0, from a safety prover. We now want to determine from which states
we can reach assert(false) and eliminate those states. Using a precondition
computation similar to Calcagno et al. [6] we find the condition k < 0. The
trick is to use the standard weakest precondition rule for assignments, but to use
pre(assume(Q), P) ≜ P ∧ Q instead of the standard wp(assume(Q), P) ≜ Q ⇒ P.
This way, we only consider executions that actually reach the error location. To
rule out the states k < 0 we can add the negation (e.g. k ≥ 0) to the precondition
assume statement. See Fig. 1(b).
In our procedure we try again to prove the assertion statement unreachable,
using the program in Fig. 1(b). In this instance we might get the path k ≥
0, skip, i < 0, which again violates the assertion. For this path we would
discover the precondition k ≥ 0 ∧ i < 0, and to rule out these states we refine
the precondition assume statement with “assume(k ≥ 0 ∧ i ≥ 0);”. See Fig. 1(c).
On this program our safety prover will again fail, perhaps resulting in the path
k ≥ 0, skip, i ≥ 0, i := nondet(), i < 0. Then our procedure would stop
computing the precondition at the command i := nondet() (for reasons discussed
later). Here we would learn that at the nondeterministic command the result
must be i < 0 to violate the assertion, thus we would refine the assume statement
just after the nondet with the negation of i < 0: “assume(i ≥ 0);” See Fig. 1(d).
The program in Fig. 1(d) cannot violate the assertion, and thus we have
hopefully computed the desired underapproximation to the transition relation
needed in order to prove nontermination. However, for soundness, it is essential
to ensure that the loop in Fig. 1(d) is still reachable, even after the successive
restrictions to the state-space. We encode this condition as a safety problem. See
Fig. 1(e). This time we add assert(false) before the loop and aim to prove
that the assertion is violated. The existence of a path violating the assertion
ensures that the loop in Fig. 1(d) is reachable. Here the assertion and thus the
(a)  assume(true);
     if (k ≥ 0)
       skip;
     else
       i := −1;
     while (i ≥ 0) {
       i := nondet();
       assume(true);
     }
     assert(false);
     i := 2;

(b)  assume(k ≥ 0);
     if (k ≥ 0)
       skip;
     else
       i := −1;
     while (i ≥ 0) {
       i := nondet();
       assume(true);
     }
     assert(false);
     i := 2;

(c)  assume(k ≥ 0 ∧ i ≥ 0);
     if (k ≥ 0)
       skip;
     else
       i := −1;
     while (i ≥ 0) {
       i := nondet();
       assume(true);
     }
     assert(false);
     i := 2;

(d)  assume(k ≥ 0 ∧ i ≥ 0);
     if (k ≥ 0)
       skip;
     else
       i := −1;
     while (i ≥ 0) {
       i := nondet();
       assume(i ≥ 0);
     }
     assert(false);
     i := 2;

(e)  assume(k ≥ 0 ∧ i ≥ 0);
     if (k ≥ 0)
       skip;
     else
       i := −1;
     assert(false);
     while (i ≥ 0) {
       i := nondet();
       assume(i ≥ 0);
     }

(f)  assume(k ≥ 0 ∧ i ≥ 0);
     assume(k ≥ 0);
     skip;
     while (i ≥ 0) {
       i := nondet();
       assume(i ≥ 0);
     }

Fig. 1. Original instrumented program (a) and its successive underapproximations
(b), (c), (d). Reachability check for the loop (e), and nondeterminism-assume that
must be checked for satisfiability (f).

loop are still reachable. The path violating the assertion is our desired path to
the loop which we refer to as stem. Fig. 1(f) shows the stem and the loop.
Finally we need to ensure that the assume statement in Fig. 1(f) can always
be satisfied with some i by any reachable state from the restricted pre-state.
This is necessary: our underapproximations may accidentally have eliminated
not only the paths to the loop’s exit location, but also all of the nonterminating
paths inside the loop. Once this check succeeds we have proved nontermination.
2 Closed Recurrence Sets
In this section we define a new concept which is at the heart of our procedure,
called closed recurrence. Closed recurrence extends the known concept of (open)
recurrence [16] in a way that facilitates automation, e.g. via a safety prover.
Preliminaries. Let S be the set of states. Given a transition relation R ⊆ S × S,
for a state s′ with R(s, s′), we say that s′ is a successor of s under R.
We will be considering programs P with finitely many program locations
L and a set of memory states M, so the program's state space S is given as
S = L × M. For instance, for a program on n integer variables, we have M = Zn,
and a memory state amounts to a valuation of the program variables.
A program P on locations L is represented via its control-flow graph (CFG)
(L, Δ, li, lf, le). The program locations are the CFG's nodes and Δ is a set of
edges between locations labeled with commands. We designate special locations
in L: li is the initial location, lf the final location, and le the error location. Each
(l, T, l′) ∈ Δ is a directed edge from l to l′ labeled with a command T. We write
RT ⊆ M × M for the relation on memory states induced by T in the usual way.
We say that a memory state s at node l has a successor s′ along the edge
(l, T, l′) iff RT(s, s′) holds. A path π in a CFG is a sequence of edges (l0, T0, l1)
(l1, T1, l2) . . . (ln−1, Tn−1, ln). The composite transition relation Rπ of a path π is
the composition RT0 ∘ RT1 ∘ ... ∘ RTn−1 of the individual relations. We also describe
a path π by the sequence of nodes it visits, e.g. l0 → l1 → . . . → ln−1 → ln.
Commands. We represent by V the set of all program variables. A deterministic
assignment statement is of the form i := exp where i ∈ V and exp is an expression
over program variables. A nondeterministic assignment statement is of the form
i := nondet(); assume(Q); where i ∈ V , nondet() is a nondeterministic choice and
Q is a boolean expression over V representing the restriction that the nondet()
choice must obey. Conditional statements are encoded using assume commands
(from Nelson [22]): assume(Q), where Q is a boolean expression over V . W.l.o.g.
li has 0 incoming and 1 outgoing edge, labeled with assume(Q), where initially
usually Q  true. For readability, in our example CFGs we often write Q for
assume(Q). Our algorithm will later strengthen Q in the assume-statements.
We define the indegree of a node l in a CFG to be the number of incoming
edges to l. Similarly, the outdegree of a node l in a CFG is the number of outgoing
edges from l. A node l ∈ L \ {li , lf , le } must be of one of the following types.
1. A deterministic assignment node: l has outdegree exactly 1 and the outgoing
edge is labeled with a deterministic assignment statement or skip. Any
memory state s at l has a unique successor s′ along the edge.
2. A deterministic conditional node: l has outdegree 2 with one edge labeled
assume(ϕ), the other edge labeled assume(¬ϕ). Any memory state s at l has
a unique successor s′ along one edge and no successor along the other edge.
3. A nondeterministic assignment node: l has outdegree exactly 1 and the out-
going edge is labeled with a nondeterministic assignment statement. A mem-
ory state s at l may have zero or more successors along the outgoing edge
depending on the condition present in the assume(Q) statement.
In our construction of a CFG, a lone statement assume(Q) from an input
program is modeled by a deterministic conditional node, with one edge labeled
assume(Q) and the other edge labeled assume(¬Q) leading to the end location lf .
Example. Consider the CFG for our initial example given in Fig. 2. Here we have
the initial location li = 0, the final location lf = 7, and the error location le = 8.
The nodes 2, 3 and 6 are deterministic assignment nodes, nodes 1 and 4 are
deterministic conditional nodes, and node 5 is a nondeterministic assignment node.

[Fig. 2. CFG for initial example — nodes 0–8 with edge labels true, k ≥ 0, k < 0,
skip, i := −1, i ≥ 0, i < 0, i := nondet(); assume(true), i := 2, and the Error node 8.]

CFG loops. Given a program with its CFG, a loop L in the CFG is a set of
nodes s.t.
– There exists a path from any node of L to any other node of L.
– (W.l.o.g.) we assume that there is only one node h of L s.t. there exists a node
  n ∉ L with an edge from n to h. The node h is called the header node of L.

The subgraph of CFG containing all nodes of L is called the loop body of L.
Header h of L is a deterministic conditional node with one edge that is part of
the loop body, the guard edge of L. The other edge of h goes to a node e ∉ L.
We call this edge the exit edge of L and e the exit location of L.¹
Example. In Fig. 2, the only loop is L = {4, 5}, and its header node h is 4. The
exit location of L is 6.
Given a loop L in program P , we define a loop path πL as any finite path
through L’s body of the form (l0 = lh ) → l1 → ... → ln−1 → (ln = lh ), where lh is
the header node of L and ∀p. (0 < p < n) → lp ≠ lh. We define the composite
transition relation RL of L as RL(s, s′) iff there exists a loop path π s.t. Rπ(s, s′).
Here RL can be an infinite (or empty) set. The initial states IL for RL are the set
of reachable states at the loop header before the loop is entered for the first time.
Preconditions. When computing preconditions of assume statements we borrow
from Calcagno et al. [6]: pre(assume(Q), P) ≜ P ∧ Q, called the "assume as
assert trick”. This lets us interpret assume statements (often from conditional
branches) in a way that allows us to determine in a precondition the states from
which an error location can be reached in a safety counterexample path. For
assignment statements we will use the standard weakest precondition [12].
Example. Note that the weakest precondition of an assignment with nondeter-
minism is a little subtle. Let i := nondet(); assume(true); be the nondet state-
ment under consideration. The weakest precondition for the postcondition (i < j)
is false (equivalent to ∀i. i < j). However the weakest precondition for the post-
condition (i < j ∨ k > 0) is (k > 0).
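The "assume as assert" precondition computation can be replayed mechanically
on the first counterexample path of the example (k < 0, i := −1, i < 0). The
following sketch is our own illustration using the z3 Python API (z3-solver);
pre_assume and wp_assign are hypothetical helper names, not part of the paper's
implementation.

from z3 import Ints, And, substitute, simplify, IntVal, BoolVal

i, k = Ints("i k")

def pre_assume(Q, post):
    # "assume as assert": keep only states that satisfy Q *and* reach the error
    return And(post, Q)

def wp_assign(var, expr, post):
    # standard weakest precondition of a deterministic assignment
    return substitute(post, (var, expr))

# counterexample path  k < 0,  i := -1,  i < 0  from Fig. 1(a), analysed backwards:
post = BoolVal(True)                 # reaching assert(false) counts as "success"
post = pre_assume(i < 0, post)       # loop exit condition
post = wp_assign(i, IntVal(-1), post)
post = pre_assume(k < 0, post)       # else-branch of the conditional
print(simplify(post))                # a formula equivalent to k < 0

Negating this condition, i.e. adding assume(k ≥ 0) at the start, rules the path out,
exactly as in the step from Fig. 1(a) to Fig. 1(b).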
¹ Languages like Java or C allow loops with additional exit edges also from locations
other than the loop header, which are implemented by commands like break or goto.
We also support such more general loops, via a program transformation.
Recurrence sets. A relation R with initial states I is nonterminating iff there
exists an infinite transition sequence s0 →R s1 →R s2 →R . . . with s0 ∈ I. Gupta
et al. [16] characterize nontermination of a relation R by the existence of a
recurrence set, viz. a nonempty set of states G such that for each s ∈ G there
exists a transition to some s′ ∈ G. In particular, an infinite transition sequence
s0 →R s1 →R s2 →R . . . itself gives rise to the recurrence set {s0, s1, s2, . . .}. Here
we extend the notion of a recurrence set to include initial states. A transition
relation R with initial states I has an (open) recurrence set of states G(s) iff
(1) and (2) hold:

    ∃s. G(s) ∧ I(s)                              (1)
    ∀s ∃s′. G(s) → R(s, s′) ∧ G(s′)              (2)

A transition relation R with initial states I is nonterminating iff it has a
recurrence set of states.

A set G is a closed recurrence set for a transition relation R with initial
states I iff the conditions (3)–(5) hold:

    ∃s. G(s) ∧ I(s)                              (3)
    ∀s ∃s′. G(s) → R(s, s′)                      (4)
    ∀s ∀s′. G(s) ∧ R(s, s′) → G(s′)              (5)

In contrast to open recurrence sets, we now require a purely universal property:
for each s ∈ G and for each of its successors s′, also s′ must be in the recurrence
set (Condition (5)). So instead of requiring that we can stay in the recurrence
set, we now demand that we must stay in the recurrence set. This now helps us
to incorporate nondeterministic transition systems too.

Now what if a state s in our recurrence set G has no successor s′ at all? Our
alleged infinite transition sequence would reach a sudden halt, yet our universal
formula would trivially hold. Thus, we impose that each s ∈ G has some successor
s′ (Condition (4)). But this existential statement need not mention that s′ must
be in G again—our previous universal statement already takes care of this.
Theorem 1 (Closed Recurrence Sets are Recurrence Sets). Let G be
a closed recurrence set for R with initial states I. Then G is also an (open)
recurrence set for R with initial states I.
The proofs for all theorems can be found in the technical report [7].

Underapproximation. We call a transition relation R′ with initial states I′ an
underapproximation of transition relation R with initial states I iff R′ ⊆ R, I′ ⊆
I. Then every nonterminating program contains a closed recurrence set as an
underapproximation (i.e., together with underapproximation, closed recurrence
sets characterize nontermination).

Theorem 2 (Open Recurrence Sets Always Contain Closed Recur-
rence Sets). There exists a recurrence set G for a transition relation R with
initial states I iff there exist an underapproximation R′ with initial states I′ and
G′ ⊆ G such that G′ is a closed recurrence set for R′ with initial states I′.
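Because conditions (3)–(5) are plain validity/satisfiability questions once a
candidate G is fixed, they can be discharged by an SMT solver. The sketch below
is our own illustration with the z3 Python API: G, I and R are written down by
hand for the loop of Fig. 1(f), whereas the paper obtains the underapproximation
via a safety prover rather than by guessing G.

from z3 import Ints, Solver, ForAll, Exists, Implies, And, Not, sat, unsat

i, ip = Ints("i ip")

G = lambda s: s >= 0                      # candidate closed recurrence set
I = lambda s: s >= 0                      # initial states at the loop header
R = lambda s, t: And(s >= 0, t >= 0)      # loop of Fig. 1(f): i := nondet(); assume(i >= 0)

def valid(formula):
    s = Solver(); s.add(Not(formula)); return s.check() == unsat

def satisfiable(formula):
    s = Solver(); s.add(formula); return s.check() == sat

print(satisfiable(And(G(i), I(i))))                                   # condition (3)
print(valid(ForAll([i], Implies(G(i), Exists([ip], R(i, ip))))))      # condition (4)
print(valid(ForAll([i, ip], Implies(And(G(i), R(i, ip)), G(ip)))))    # condition (5)

All three checks succeed, confirming that i ≥ 0 is a closed recurrence set for the
restricted loop.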
3 Algorithm
Our nontermination proving procedure Prover is detailed in Fig. 3. Its input is
a program P given by its CFG, and a loop to be considered for nontermination.
To prove nontermination of the entire program P we need to find only one
nonterminating loop L. This can be done in parallel. Alternatively, the procedure
Prover (CFG P, Loop L)
  h := header node of L
  e := exit node of L in P
  P′ := Underapproximate(P, e)
  L′ := refined loop L in P′
  if ¬ Reachable(P′, h) then
    return Unknown, ⊥
  fi
  Π := {π | π feasible path to h in P′}
  for all π ∈ Π do
    P′′ := π :: L′   // concatenation
    if Validate(P′′) then
      return NonTerminating, P′′
    fi
  done
  return Unknown, ⊥

Underapproximate (CFG P, Node e)
  κ := [ ]
  while Reachable(P, e) do
    π := feasible path to e in P
    κ := π :: κ
    P := Refine(P, π)
    if the n most recent paths in κ are repeating then
      P := Strengthen(P, First(κ))
    fi
  done
  return P

Refine (CFG P, Path π)
  (l0 T0 l1)(l1 T1 l2) . . . (ln−1 Tn−1 ln) := π
  Calculate WPs ψ1, ψ2 . . . ψn−1 along π
    so {ψ1}T1{ψ2}T2 . . . {ψn−1}Tn−1{true} are valid Hoare-triples.
  Find p s.t. ψp ≠ false ∧ ∀q < p. ψq = false
  P := P|(Tp−1, ¬ψp)
  return P

Strengthen (CFG P, Path π)
  (l0 T0 l1)(l1 T1 l2) . . . (ln−1 Tn−1 ln) := π
  Calculate WPs ψ1, ψ2 . . . ψn−1 along π
    so {ψ1}T1{ψ2}T2 . . . {ψn−1}Tn−1{true} are valid Hoare-triples.
  Find p s.t. ψp ≠ false ∧ ∀q < p. ψq = false
  W := {v | v gets updated in subpath (lp Tp lp+1) . . . (ln−1 Tn−1 ln)}
  ρp := QE(∃W. ψp)
  P := P|(Tp−1, ¬ρp)
  return P

Validate (CFG P′′)
  L′ := the outermost loop in P′′
  M := {l | l is nondet assignment node in L′}
  for all l ∈ M do
    Calculate invariant invl at node l in P′′
    let nondet statement at l be v := nondet(); assume(ϕ);
    if invl → ∃v. ϕ is not valid then
      return false
    fi
  done
  return true

Fig. 3. Algorithm Prover for underapproximation to synthesize a reachable nontermi-
nating loop. To prove nontermination of P, Prover should be run on all loops L.
can be implemented sequentially, but then timeouts are advisable in Prover,


as the procedure might diverge and cause another loop to not be considered.
The subprocedure Underapproximate performs the search for an underap-
proximation such that we can prove the loop is never exited. While the loop exit
is still Reachable (a.k.a. “unsafe”), we use the subprocedure Refine to exam-
ine paths returned from an off-the-shelf safety prover. Here the notation P |(Ti ,ϕ)
denotes P with an additional assume(ϕ) added to the transition Ti . From the
postcondition true (used to indicate success in reaching the loop exit), we use a
backwards precondition analysis to find out which program states will inevitably
end up in the loop exit. We continue this precondition calculation until either
we have reached the beginning of the path or until just before we have reached
a nondeterministic assignment that leads to the precondition false. We then
negate this condition as our underapproximating refinement.
In some cases our refinement is too weak, leading to divergence. The difficulty
is that in some cases the same loop path will be considered repeatedly, but at each
instance the loop will be unrolled for an additional iteration. To avoid this prob-
lem we impose a limit n for the number of paths that go along the same locations
(possibly with more and more repetitions). We call such paths repeating. If we
reach this limit, we use the subprocedure Strengthen to strengthen the precon-
dition, inspired by a heuristic by Cook and Koskinen [8]. Here we again calculate
a precondition, but when we have found ψp , we quantify out all the variables that
are written to after ψp and apply quantifier elimination (QE) to get ρp . We then
refine with ¬ρp . This leads to a more aggressive pruning of the transition rela-
tion. This heuristic can lead to additional incompleteness.
Example. Consider the instrumented program whose control-flow graph was shown as a figure here (locations 0–6, with the error location 6 reached from the loop header 1 under i < 0, and a loop body through locations 2–5 that branches on k and decrements i on the way back to 1). Suppose we have initially ϕ ≡ i ≥ 0. We might get cex1: 0 → 1 → 2 → 3 → 5 → 1 → 6 as a first counterexample. The Refine procedure finds the weakest precondition k ≥ 0 ∧ i = 0 at location 1. Adding its negation to ϕ and simplifying the formula gives us ϕ ≡ (i ≥ 0) ∧ (k < 0 ∨ i ≥ 1). Now we may get cex2: 0 → 1 → 2 → 3 → 5 → 1 → 2 → 3 → 5 → 1 → 6 as next counterexample, and Refine updates ϕ ≡ (i ≥ 0) ∧ (k < 0 ∨ i ≥ 2). Now we may get cex3: 0 → 1 → 2 → 3 → 5 → 1 → 2 → 3 → 5 → 1 → 2 → 3 → 5 → 1 → 6 as next counterexample. Note that cex1, cex2, cex3 are repeating counterexamples, and if we just use the Refine procedure, Underapproximate gets stuck in an infinite sequence of counterexamples. Now Strengthen identifies the repeating counterexamples, considers cex1, and calculates the weakest precondition ψ1 ≡ k ≥ 0 ∧ i = 0. It then existentially quantifies out variable i as it gets modified later along cex1. We get ∃i. k ≥ 0 ∧ i = 0, and quantifier elimination yields ρ1 ≡ k ≥ 0. Clearly ψ1 entails ρ1. Adding ¬ρ1 to ϕ and simplifying the formula we get ϕ ≡ i ≥ 0 ∧ k < 0. Now all repeating counterexamples are eliminated, the program is safe, and we have obtained a closed recurrence set witnessing nontermination of the original program.
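The weakest preconditions used by Refine and Strengthen can be computed by a simple backwards pass over the path. The following sketch is one plausible reading of that analysis (hypothetical helper code of ours, not the authors' implementation), using Z3 expressions for the conditions; nondeterministic assignments are treated with a universal quantifier, which is what makes the precondition collapse to false behind unconstrained nondeterminism.

    from z3 import Ints, BoolVal, And, ForAll, substitute, simplify

    i, k = Ints("i k")

    def pre(path, post=BoolVal(True)):
        # path is a list of statements, executed left to right; each statement is
        # ("assume", cond), ("assign", var, expr), or ("havoc", var) for v := nondet().
        wp = post
        for stmt in reversed(path):
            if stmt[0] == "assume":
                wp = And(stmt[1], wp)
            elif stmt[0] == "assign":
                wp = substitute(wp, (stmt[1], stmt[2]))
            elif stmt[0] == "havoc":
                wp = ForAll([stmt[1]], wp)
        return simplify(wp)

    # Path of cex1 from the loop header to the error location in the example above.
    path = [("assume", i >= 0), ("assume", k >= 0), ("assign", i, i - 1), ("assume", i < 0)]
    print(pre(path))   # logically equivalent to k >= 0 and i = 0 over the integers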
In the Underapproximate procedure, once there are no further counterexamples to safety of P, we know that in P the loop exit is not reachable. The procedure returns the final underapproximation (denoted by P') that is safe.
When Underapproximate returns to Prover, we check if in P' the original loop L after refinements has a closed recurrence set. We refer to the refined loop as L'. In order to check the existence of a closed recurrence set, we first need to ensure that L' is reachable in P' even after the refinements. We again pose this problem as a safety/reachability problem. This time we mark the header node of L' as an error location in P' and hope that P' is unsafe. If P' is safe then clearly we have failed to prove nontermination and we report the result as unknown. If P' is unsafe, then the counterexample to its safety is a path to the header of L'.
We enumerate all such paths to the header of L' in a set Π (generated lazily in our implementation). For each such path π ∈ Π we then create a simplified CFG P'' by concatenating π to L', thus eliminating other paths to L'.
At this point, we are sure that the header of L' is reachable and there is no path that can reach the exit location of L'. However, refinements in Underapproximate may have restricted the nondet statements inside L' by strengthening the assume statements associated with them. Thus a reachable state at the nondeterministic assignment node may not have a successor along its
outgoing edge. This would bring our alleged infinite execution to a halt. The safety checker cannot detect this, since then the path just gets blocked at this node, and the error location at the exit of L' cannot be reached.

Example. Consider the instrumented program in Fig. 4 (the figure shows a control-flow graph with locations 0–7, a loop guarded by i == 10, a nondeterministic assignment j := nondet(); assume(ϕ); at node 2, branches on j ≥ 4 and j < 4 that increment or decrement i, and the error location 7). Suppose initially ϕ ≡ true. The original program (without instrumentation) is clearly terminating. Our algorithm might give cex1: 0 → 1 → 2 → 3 → 4 → 6 → 1 → 7 as the first counterexample. The nondet statement at node 2 gets restricted by updating ϕ ≡ (j ≤ 3 ∨ i == 9). Now we might get cex2: 0 → 1 → 2 → 3 → 5 → 6 → 1 → 7 as the next counterexample. Our algorithm restricts the nondet statement at node 2 by updating ϕ ≡ (j ≤ 3 ∨ i == 9) ∧ (j ≥ 4 ∨ i == 11).
However, now there are no further counterexamples, and the safety checker returns safe. The state s ≡ i = 10 at node 2 has no successor along the outgoing edge, as there is no way to satisfy the condition ϕ and the execution is halted, so it would be unsound to report the result as Nonterminating.

Fig. 4. Program showing why we need the Validate procedure
Note that at first it may appear that adding another outgoing edge to node
2 with j := nondet(); assume(¬ϕ); and marking the next node as an error node
would help us catch the halted execution. However the problem is that this would
discover again all of the previously eliminated counterexamples as well. Thus we
need a special check by the Validate procedure, which we describe next.
Validate takes as input the final underapproximation P  . It first calculates a
location invariant at every nondet. assignment node inside the outermost loop L
in P  . Let l be a nondet. assignment node with: v := nondet(); assume(ϕ); Let
inv be a location invariant at l. Validate then checks if (6) is valid:

    inv → ∃v. ϕ                              (6)

This formula checks if for all reachable states at l, a choice can be made for the nondet assignment obeying ϕ (and thus Condition (4) holds). Validate returns true iff (6) holds for all nondet statements in L'.
Example. Consider the program in Fig. 1(f). Using a standard invariant generator we calculate the invariant i ≥ 0 before line 4. Substituting in (6) we get i ≥ 0 → ∃i'. i' ≥ 0. Clearly the formula is valid. Note that in most of the cases even the weakest invariant true can be sufficient to prove validity of (6). In this example as well, we can easily prove that true → ∃i'. i' ≥ 0 is valid.
Moreover, consider the program in Fig. 4. Suppose ϕ  (j ≤ 3 ∨ i == 9) ∧
(j ≥ 4 ∨ i == 11). Using an invariant generator, we obtain the location invariant
i = 10 at location 2. Then (6) becomes i = 10 → ∃j. (j ≤ 3∨i = 9)∧(j ≥ 4∨i = 11).
Clearly the formula is not valid. In this case Validate returns false.
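As a quick illustration, both validity checks from these two examples can be discharged with an off-the-shelf SMT solver. The snippet below is a minimal sketch using Z3's Python bindings (not the authors' tool) that checks formula (6) for the invariant/assume pairs just discussed.

    from z3 import Int, Not, And, Or, Implies, Exists, Solver, unsat

    def valid(f):
        s = Solver()
        s.add(Not(f))
        return s.check() == unsat

    i, ip, j = Int("i"), Int("ip"), Int("j")

    # Invariant i >= 0 with assume(i' >= 0): condition (6) is valid.
    print(valid(Implies(i >= 0, Exists([ip], ip >= 0))))          # True

    # Invariant i = 10 with the restricted phi from Fig. 4: condition (6) fails.
    phi = And(Or(j <= 3, i == 9), Or(j >= 4, i == 11))
    print(valid(Implies(i == 10, Exists([j], phi))))              # False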
If Validate returns true, we are sure that every reachable state at the non-
deterministic assignment node in L' has a successor along the edge. At this point, we report nontermination and return the final underapproximation P' of P as a proof of nontermination for P: P' is a closed recurrence set.
Note that as invariants are overapproximations, we may report unknown in
some cases even when the discovered underapproximation actually does have a
closed recurrence set. However, the check is essential to retain soundness.
Theorem 3 (Correctness of Prover for Nontermination). Let P be a
program and L a loop in P. Suppose Prover(P, L) = ⟨Nonterminating, P'⟩. Then
P is nonterminating.

4 Nested Loops

Our algorithm can handle nested loops easily. That is a part of the beauty of the reduction to safety, as existing safety provers (e.g. SLAM, Impact, etc.) handle nested loops with ease. Note that technically we only need to consider an outermost loop. Consider the following instrumented program with nested loops.

    if (i == 10) {
        while (i > 0) {
            i := i − 1;
            while (i == 0)
                skip;
        }
        assert(false);
    }

Here the outer loop decreases the value of i 10 times and then it is the inner loop
that is nonterminating. However, it suffices only to consider the outermost loop
for safety as the assert(false) at the end of the outer loop is not reachable, but
the head of the outer loop is reachable, so that we have proved nontermination.
Tricky example. We close with an example that partially explains the advantage seen for our tool over TNT (discussed in Sect. 5). Consider the following program (already shown with our algorithm's instrumentation, i.e. the added assume and assert, initially with ϕ ≡ true).

    assume(ϕ);
    while (k ≥ 0) {
        k := k + 1;
        j := k;
        while (j ≥ 1)
            j := j − 1;
    }
    assert(false);

This program clearly does not terminate, yet TNT will fail to prove it. In fact, our implementation of TNT, which follows the strategy discussed for enumerating lassos [16], diverges looking at larger and larger cyclic paths (i.e. straight-line code from a location back to that location). The difficulty here is that each cyclic path is well-founded.
Consider e.g. this cyclic path from the head h of the outer loop back to h:
k ≥ 0, k := k + 1, j := k, j ≥ 1, j := j − 1, j < 1, k ≥ 0
This path is well-founded. In fact the path cannot be repeated. The root of
the problem is the command sequence j ≥ 1, j := j − 1, j < 1, which tells us that
j goes from exactly 1 to 0. Because j = 1 before entering the inner loop, we know
that k = 1, thus by the k := k + 1 command we know that k = 0 at the start of the loop's execution. Thus the command sequence respects the following condition: k' > k ∧ k ≤ 1, and thus it is well-founded. Hence, the lasso-based tool TNT
will never be able to prove nontermination of this program. The tool AProVE
cannot prove such aperiodic nontermination for nested loops either [5].
Our approach, however, does not fall victim to this problem. It will find the
path: k < 0, resulting in ϕ  k ≥ 0. As assert(false) is unreachable with this
restriction (and the loop is still reachable), we have proved nontermination.

5 Related Work
Automatic tools for proving nontermination of term rewriting systems include
[14,23]. However, while nontermination analysis for term rewriting considers the
entire state space as legitimate initial states for a (possibly infinite) evaluation
sequence, our setting also factors in reachability from the initial states.
Static nontermination analysis has also been investigated for logic programs
(e.g. [24,31]). Most related to our setting are techniques for constraint-logic pro-
grams (CLPs) [25]. Termination tools for CLPs (e.g. [25]) can in some cases be used to
prove nontermination of imperative programs (e.g. Julia [26] can show nontermi-
nation for Java Bytecode programs if the abstraction to CLPs is exact, but gives
no witness like a recurrence set to the user). The main difficulty for imperative
programs is that typically overapproximating abstractions (in general unsound
for nontermination) are used for converting languages like Java and C to CLPs.
TNT [16] uses a characterization of nontermination by recurrence sets. We
build upon this notion and introduce closed recurrence sets in our formalization,
as an intermediate concept during our nontermination proof search. In contrast
to us, TNT is restricted to programs with periodic “lasso-shaped” counterexam-
ples to termination. We support unbounded nondeterminism in the program’s
transition relation, whereas TNT is restricted to deterministic commands.
The tool Invel [30] analyzes nontermination of Java programs using a com-
bination of theorem proving and invariant generation. However, Invel does not
provide a witness for nontermination. Like Brockschmidt et al. [5], we were un-
able to obtain a working version of Invel. Note that in the empirical evaluation
by Brockschmidt et al. [5], the AProVE tool (which we have compared against)
subsumed Invel on Invel’s data set. Finally, Invel is only applicable to deter-
ministic (integer) programs, yet our approach allows nondeterminism as well.
Atig et al. [1] describe a technique for proving nontermination of multi-
threaded programs, via a reduction to nontermination reasoning for sequential
programs. Our work complements Atig et al., as we provide improvements to
the underlying sequential tools that future multithreaded tools can make use of.
The tool TRex [19] combines existing nontermination proving techniques with
a Terminator-like [9] iterative procedure. Our new method should complement
TRex nicely, as ours is more powerful than the underlying nontermination prov-
ing approach previously used [16].
AProVE [13] uses SMT to prove nontermination of Java programs [5]. First
nontermination of a loop regardless of its context is shown, then reachability of
this loop with suitable values. Drawbacks are that they require recurrence sets
to be singletons (after program slicing) or the loop conditions to be invariants.
Gurfinkel et al. [18] present the CEGAR-based model checker Yasm which
supports arbitrary CTL properties, such as EG pc = END, denoting nontermi-
nation. Yasm implements a method of both under and over-approximating the
input program. Unfortunately, together with the author of Yasm we were not
able to get the tool working on our examples [17]. We suspect that our approach
will be faster, as it uses current safety proving techniques, i.e. Impact [20] rather
than Slam-style technology [2]. This is a feature of our approach: any off-the-
shelf software model checker can be turned into a nontermination prover.
Nontermination proving for finite-state systems is essentially a question of
safety [3]. Nontermination and/or related temporal logics are also supported for
more expressive systems, e.g. pushdown automata [28].
Recent work on CTL proving for programs uses an off-the-shelf nontermina-
tion prover [8]. We use a few steps when treating nondeterminism which look
similar to the approach from [8]. The key difference is that our work provides a
nontermination prover, whereas the previous work requires one off-the-shelf.
Gulwani et al. [15] make a claim (their Claim 3) that is similar to our own.
Their claim is false, however, as a nondeterministic program can be constructed which
represents a counterexample. Much of the subtlety in our approach comes from
our method of dealing with nondeterminism.

6 Experiments

We have built a preliminary implementation of our approach within the tool


T2 [10,4]2 and conducted an empirical evaluation with it against these tools:

– TNT [16], the original TNT tool was not available, and thus we have reim-
plemented its constraint-based algorithm with Z3 [11] as SMT backend.
– AProVE [13], via the Java Bytecode frontend, using the SMT-based non-
termination analysis by Brockschmidt et al. [5].
– Julia [29], which implements an approach via a reduction to constraint logic
programming described by Payet and Spoto [26].

As a benchmark set, we used a set of 492 benchmarks for termination analysis


from a variety of applications also used in prior tool evaluations (e.g. Windows
device drivers, the Apache web server, the PostgreSQL server, integer approx-
imations of numerical programs from a book on numerical recipes [27], integer
approximations of benchmarks from LLBMC [21] and other tool evaluations).3
Of these, 81 are known to be nonterminating and 254 terminating. For 157
examples, the termination status is unknown. These examples include a program
whose termination would imply the Collatz conjecture, and the remaining ex-
amples are too large to render a manual analysis feasible. On average a CFG in
² We will make our implementation available in the next source code release of T2.
³ Download: http://www0.cs.ucl.ac.uk/staff/C.Fuhs/safety-nontermination
              (a)                      (b)                      (c)
           Nonterm  TO  No Res     Nonterm  TO  No Res     Nonterm  TO  No Res
  Fig. 3      51     0     30          0    45    209         82     3     72
  AProVE       0    61     20          0   142    112          0   139     18
  Julia        3     8     70          0    40    214          0    91     66
  TNT         19     3     59          0    48    206         32    12    113

Fig. 5. Evaluation success overview, showing the number of problems solved for each
tool. Here (a) represents the results for known nonterminating examples, (b) is known
terminating examples, (c) is (previously) unknown examples.

our test suite has 18.4 nodes (max. 427 nodes) and 2.4 loops (max. 120 loops).
Unfortunately each tool requires a different machine configuration, and thus a
direct comparison is difficult. Experiments with our procedure were performed on
a dualcore Intel Core 2 Duo U9400 (1.4 GHz, 2 GB RAM, Windows 7). TNT was
run on Intel Core i5-2520M (2.5 GHz, 8 GB RAM, Ubuntu Linux 12.04). We ran
AProVE on Intel Core i7-950 (3.07 GHz, 6 GB RAM, Debian Linux 7.2). Note
that the TNT-/AProVE-machines are significantly faster than the machine our
new procedure was run on, thus we can make some adjusted comparison between
the tools. For Julia, an unknown cloud-based configuration was used. All tools
were run with a timeout of 60s. When a tool returned early with no definite
result, we display this in the plots using the special “NR” (no result) value.
We ran three sets of experiments: (a) all the examples previously known to be
nonterminating, (b) all the examples previously known to be terminating, and
(c) all the examples where no previous results are known. With (a) we assess
the efficiency of the algorithm, (b) is used to demonstrate its soundness, and (c)
checks if our algorithm scales well on relatively large and complicated examples.
The results of the three sets of experiments are given in Fig. 5, which shows for
each tool and for each set (a)–(c) the numbers of benchmarks with nontermina-
tion proofs (“Nonterm”), timeouts (“TO”), and no results (“No Res”). (Proofs
of termination, found by AProVE and Julia, are also listed as “No Res”.)
On the 89 deterministic instances of our benchmark set, our implementation proves nontermination of 33 examples, and TNT of 21 examples. We have also experimented with different values for the number of repeated paths before invoking Strengthen. The results are reported in Fig. 6 (runtimes are for successful nontermination proofs).

    # Paths   Nonterm   Time [s]
       2        133        272
       4        133        301
       6        129        264
       ∞        123        272

Fig. 6. Repeated paths before calling Strengthen
Fig. 7 charts the difference in power and performance between our imple-
mentation and TNT in a scatter plot, in log scale. Here we have included all
programs from (a)–(c). Each ‘x’-mark in the plot represents an example from the
benchmark. The value of the x-axis plots the runtime of TNT and the y-axis
value plots the runtime of our procedure on the same example. Points under the
diagonal are in favor of our procedure. Thus, the more ‘x’-marks there are in the
lower-righthand corner, the better our tool has performed.
Discussion. Figs. 5(a&c) demonstrate that our technique is overwhelmingly the most successful tool (Fig. 5(b) confirms simply that no tool has demonstrable soundness bugs). The poor precision of AProVE & Julia is mainly due to the nondeterministic updates originally present in many of the benchmarks and also introduced by the (automated) conversion of the benchmarks to Java (the two tools' input syntax). This shows the lack of reliable support of nondeterminism in today's nontermination tools. The TNT algorithm requires outright that nondeterminism must not occur in the input. Our implementation of TNT softens this requirement slightly: parts of the program with nondet-assignments are allowed as long as they are not used during the synthesis of recurrence sets.

Fig. 7. Evaluation results of our procedure vs. TNT. Scatter plot in log scale (axes: TNT runtime in seconds vs. runtime of our procedure in seconds). Timeout = 60s. NR = "No Result", indicating failure of the tool.

Finally, we observe in Fig. 6 that the Strengthen procedure provides addi-


tional precision for our approach without harming performance.

7 Conclusion
We have introduced a new method of proving nontermination. The idea is to split
the reasoning in two parts: a safety prover is used to prove that a loop in an
underapproximation of the original program never terminates; meanwhile failed
safety proofs are used to calculate the underapproximation. We have shown that
nondeterminism can be easily handled in our framework while previous tools
often fail. Furthermore, we have shown that our approach leads to performance
improvements against previous tools where they are applicable.
Our technique is not restricted to linear integer arithmetic: Given suitable
tools for safety proving and for precondition inference, in principle our approach
is applicable to any program setting (note that the Strengthen procedure is
just an optimization). For future work, e.g. heap programs are a highly promising
candidate for nontermination analysis via abduction tools for separation logic [6].

Acknowledgments. We thank Marc Brockschmidt, Fabian Emmes, Florian


Frohn and Fausto Spoto for help with the experiments and Tony Hoare, Jules
Villard and the anonymous reviewers for insightful comments.

References
1. Atig, M.F., Bouajjani, A., Emmi, M., Lal, A.: Detecting fair non-termination in
multithreaded programs. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS,
vol. 7358, pp. 210–226. Springer, Heidelberg (2012)
2. Ball, T., Rajamani, S.K.: The SLAM toolkit. In: Berry, G., Comon, H., Finkel, A.
(eds.) CAV 2001. LNCS, vol. 2102, pp. 260–264. Springer, Heidelberg (2001)
3. Biere, A., Artho, C., Schuppan, V.: Liveness checking as safety checking. In: Proc.
FMICS 2002 (2002)
4. Brockschmidt, M., Cook, B., Fuhs, C.: Better termination proving through cooper-
ation. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 413–429.
Springer, Heidelberg (2013)
5. Brockschmidt, M., Ströder, T., Otto, C., Giesl, J.: Automated detection of non-
termination and NullPointerExceptions for Java Bytecode. In: Beckert, B., Dami-
ani, F., Gurov, D. (eds.) FoVeOOS 2011. LNCS, vol. 7421, pp. 123–141. Springer,
Heidelberg (2012)
6. Calcagno, C., Distefano, D., O’Hearn, P.W., Yang, H.: Compositional shape anal-
ysis by means of bi-abduction. J. ACM 58(6), 26 (2011)
7. Chen, H.-Y., Cook, B., Fuhs, C., Nimkar, K., O’Hearn, P.: Proving nontermination
via safety. Technical Report RN/13/23, UCL (2014)
8. Cook, B., Koskinen, E.: Reasoning about nondeterminism in programs. In: Proc.
PLDI 2013 (2013)
9. Cook, B., Podelski, A., Rybalchenko, A.: Terminator: Beyond safety. In: Ball, T.,
Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 415–418. Springer, Heidelberg
(2006)
10. Cook, B., See, A., Zuleger, F.: Ramsey vs. Lexicographic termination proving.
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 47–61.
Springer, Heidelberg (2013)
11. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
12. Dijkstra, E.W.: A Discipline of Programming. Prentice-Hall (1976)
13. Giesl, J., Schneider-Kamp, P., Thiemann, R.: AProVE 1.2: Automatic termina-
tion proofs in the dependency pair framework. In: Furbach, U., Shankar, N. (eds.)
IJCAR 2006. LNCS (LNAI), vol. 4130, pp. 281–286. Springer, Heidelberg (2006)
14. Giesl, J., Thiemann, R., Schneider-Kamp, P.: Proving and disproving termina-
tion of higher-order functions. In: Gramlich, B. (ed.) FroCos 2005. LNCS (LNAI),
vol. 3717, pp. 216–231. Springer, Heidelberg (2005)
15. Gulwani, S., Srivastava, S., Venkatesan, R.: Program analysis as constraint solving.
In: Proc. PLDI 2008 (2008)
16. Gupta, A., Henzinger, T.A., Majumdar, R., Rybalchenko, A., Xu, R.-G.: Proving
non-termination. In: Proc. POPL 2008 (2008)
17. Gurfinkel, A.: Private communication (2012)
18. Gurfinkel, A., Wei, O., Chechik, M.: Yasm: A software model-checker for verifica-
tion and refutation. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144,
pp. 170–174. Springer, Heidelberg (2006)
19. Harris, W.R., Lal, A., Nori, A.V., Rajamani, S.K.: Alternation for termination. In:
Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 304–319. Springer,
Heidelberg (2010)
20. McMillan, K.L.: Lazy abstraction with interpolants. In: Ball, T., Jones, R.B. (eds.)
CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006)
21. Merz, F., Falke, S., Sinz, C.: LLBMC: Bounded model checking of C and C++ pro-
grams using a compiler IR. In: Joshi, R., Müller, P., Podelski, A. (eds.) VSTTE
2012. LNCS, vol. 7152, pp. 146–161. Springer, Heidelberg (2012)
22. Nelson, G.: A generalization of Dijkstra’s calculus. ACM TOPLAS 11(4) (1989)
23. Payet, É.: Loop detection in term rewriting using the eliminating unfoldings. Theor.
Comput. Sci. 403(2-3) (2008)
24. Payet, É., Mesnard, F.: Nontermination inference of logic programs. ACM
TOPLAS 28(2) (2006)
25. Payet, É., Mesnard, F.: A non-termination criterion for binary constraint logic
programs. TPLP 9(2) (2009)
26. Payet, É., Spoto, F.: Experiments with non-termination analysis for Java Bytecode.
In: Proc. BYTECODE 2009 (2009)
27. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes:
The Art of Scientific Computing. Cambridge Univ. Press (1989)
28. Song, F., Touili, T.: Pushdown model checking for malware detection. In: Flana-
gan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 110–125. Springer,
Heidelberg (2012)
29. Spoto, F., Mesnard, F., Payet, É.: A termination analyzer for Java bytecode based
on path-length. ACM TOPLAS 32(3) (2010)
30. Velroyen, H., Rümmer, P.: Non-termination checking for imperative programs. In:
Beckert, B., Hähnle, R. (eds.) TAP 2008. LNCS, vol. 4966, pp. 154–170. Springer,
Heidelberg (2008)
31. Voets, D., De Schreye, D.: A new approach to non-termination analysis of logic
programs. In: Hill, P.M., Warren, D.S. (eds.) ICLP 2009. LNCS, vol. 5649, pp.
220–234. Springer, Heidelberg (2009)
Ranking Templates for Linear Loops

Jan Leike¹,² and Matthias Heizmann¹,⋆

¹ University of Freiburg, Germany
² Max Planck Institute for Software Systems, Germany

Abstract. We present a new method for the constraint-based synthesis


of termination arguments for linear loop programs based on linear rank-
ing templates. Linear ranking templates are parametrized, well-founded
relations such that an assignment to the parameters gives rise to a rank-
ing function. This approach generalizes existing methods and enables us
to use templates for many different ranking functions with affine-linear
components. We discuss templates for multiphase, piecewise, and lexico-
graphic ranking functions. Because these ranking templates require both
strict and non-strict inequalities, we use Motzkin’s Transposition Theo-
rem instead of Farkas Lemma to transform the generated ∃∀-constraint
into an ∃-constraint.

1 Introduction

The scope of this work is the constraint-based synthesis of termination argu-


ments. In our setting, we consider linear loop programs, which are specified by
a boolean combination of affine-linear inequalities over the program variables.
This allows for both, deterministic and non-deterministic updates of the program
variables. An example of a linear loop program is given in Figure 1.
Usually, linear loop programs do not occur as stand-alone programs. Instead,
they are generated as a finite representation of an infinite path in a control flow
graph. For example, in (potentially spurious) counterexamples in termination
analysis [9,13,17,18,21,22], non-termination analysis [12], stability analysis [8,23],
or cost analysis [1,11].
We introduce the notion of linear ranking templates (Section 3). These are
parameterized relations specified by linear inequalities such that any assignment
to the parameters yields a well-founded relation. This notion is general enough
to encompass all existing methods for linear loop programs that use constraint-
based synthesis of ranking functions of various kinds (see Section 6 for an assess-
ment). Moreover, ours is the first method for synthesis of lexicographic ranking
functions that does not require a mapping between loop disjuncts and lexico-
graphic components.
In this paper we present the following linear ranking templates.

This work is supported by the German Research Council (DFG) as part of the
Transregional Collaborative Research Center “Automatic Verification and Analysis
of Complex Systems” (SFB/TR14 AVACS).

E. Ábrahám and K. Havelund (Eds.): TACAS 2014, LNCS 8413, pp. 172–186, 2014.

c Springer-Verlag Berlin Heidelberg 2014
    while ( q > 0 ):                   q > 0
        q := q − y;                 ∧  q' = q − y
        y := y + 1;                 ∧  y' = y + 1

Fig. 1. A linear loop program given as program code (left) and as a formula defining a binary relation (right)

– The multiphase ranking template specifies a ranking function that proceeds


through a fixed number of phases in the program execution. Each phase is
ranked by an affine-linear function; when this function becomes non-positive,
we move on to the next phase (Subsection 4.1).
– The piecewise ranking template specifies a ranking function that is a piece-
wise affine-linear function with affine-linear predicates to discriminate be-
tween the pieces (Subsection 4.2).
– The lexicographic ranking template specifies a lexicographic ranking func-
tion that corresponds to a tuple of affine-linear functions together with a
lexicographic ordering on the tuple (Subsection 4.3).

These linear ranking templates can be used as a ’construction kit’ for com-
posing linear ranking templates that enable more complex ranking functions
(Subsection 4.4). Moreover, variations on the linear ranking templates presented
here can be used and completely different templates could be conceived.
Our method is described in Section 5 and can be summarized as follows. The
input is a linear loop program as well as a linear ranking template. From these
we construct a constraint to the parameters of the template. With Motzkin’s
Theorem we can transform the constraint into a purely existentially quantified
constraint (Subsection 5.1). This ∃-constraint is then passed to an SMT solver
which checks it for satisfiability. A positive result implies that the program termi-
nates. Furthermore, a satisfying assignment will yield a ranking function, which
constitutes a termination argument for the given linear loop program.
Related approaches invoke Farkas’ Lemma for the transformation into ∃-
constraints [2,5,6,7,14,20,24,25]. The piecewise and the lexicographic ranking
template contain both strict and non-strict inequalities, yet only non-strict in-
equalities can be transformed using Farkas’ Lemma. We solve this problem by
introducing the use of Motzkin’s Transposition Theorem, a generalization of
Farkas’ Lemma. As a side effect, this also enables both, strict and non-strict
inequalities in the program syntax. To our knowledge, all of the aforementioned
methods can be improved by the application of Motzkin’s Theorem instead of
Farkas’ Lemma.
Our method is complete in the following sense. If there is a ranking function
of the form specified by the given linear ranking template, then our method will
discover this ranking function. In other words, the existence of a solution is never
lost in the process of transforming the constraint.
In contrast to some related methods [14,20] the constraint we generate is not
linear, but rather a nonlinear algebraic constraint. Theoretically, this constraint
can be decided in exponential time [10]. Much progress on nonlinear SMT solvers
has been made and present-day implementations routinely solve nonlinear con-
straints of various sizes [16].
A related setting to linear loop programs are linear lasso programs. These
consist of a linear loop program and a program stem, both of which are specified
by boolean combinations of affine-linear inequalities over the program variables.
Our method can be extended to linear lasso programs through the addition of
affine-linear inductive invariants, analogously to related approaches [5,7,14,25].

2 Preliminaries
In this paper we use K to denote a field that is either the rational numbers Q or
the real numbers R. We use ordinal numbers according to the definition in [15].
The first infinite ordinal is denoted by ω; the finite ordinals coincide with the
natural numbers, therefore we will use them interchangeably.

2.1 Motzkin’s Transposition Theorem


Intuitively, Motzkin’s Transposition Theorem [26, Corollary 7.1k] states that a
given system of linear inequalities has no solution if and only if a contradiction
can be derived via a positive linear combination of the inequalities.
Theorem 1 (Motzkin’s Transposition Theorem). For A ∈ Km×n , C ∈
K
×n , b ∈ Km , and d ∈ K
, the formulae (M1) and (M2) are equivalent.
∀x ∈ Kn . ¬(Ax ≤ b ∧ Cx < d) (M1)

∃λ ∈ Km ∃μ ∈ K
. λ ≥ 0 ∧ μ ≥ 0
∧ λT A + μT C = 0 ∧ λT b + μT d ≤ 0 (M2)
∧ (λ b < 0 ∨ μ = 0)
T

If ℓ is set to 1 in Theorem 1, we obtain the affine version of Farkas' Lemma [26,


Corollary 7.1h]. Therefore Motzkin’s Theorem is strictly superior to Farkas’
Lemma, as it allows for a combination of both strict and non-strict inequali-
ties. Moreover, it is logically optimal in the sense that it enables transformation
of any universally quantified formula from the theory of linear arithmetic.
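As a tiny concrete illustration (ours, not from the paper), the snippet below uses Z3 to search for the multipliers λ, μ of (M2) for the one-variable system x ≤ 0 ∧ −x < 0, i.e. A = [1], b = [0], C = [−1], d = [0]; the system is infeasible, and λ = μ = 1 is a certificate.

    from z3 import Real, Solver, Or, sat

    lam, mu = Real("lam"), Real("mu")
    s = Solver()
    s.add(lam >= 0, mu >= 0,
          lam * 1 + mu * (-1) == 0,        # lam^T A + mu^T C = 0
          lam * 0 + mu * 0 <= 0,           # lam^T b + mu^T d <= 0
          Or(lam * 0 < 0, mu != 0))        # lam^T b < 0  or  mu != 0
    print(s.check())                       # sat: a Motzkin certificate exists
    print(s.model())                       # e.g. lam = mu = 1 (any equal positive pair works)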

2.2 Linear Loop Programs


In this work, we consider programs that consist of a single loop. We use binary
relations over the program’s states to define its transition relation.
We denote by x the vector of n variables (x_1, . . . , x_n)^T ∈ K^n corresponding to program states and by x' = (x'_1, . . . , x'_n)^T ∈ K^n the variables of the next state.

Definition 1 (Linear loop program). A linear loop program LOOP(x, x') is a binary relation defined by a formula with the free variables x and x' of the form

    ⋁_{i∈I} ( A_i (x x') ≤ b_i ∧ C_i (x x') < d_i )

for some finite index set I, some matrices A_i ∈ K^{n×m_i}, C_i ∈ K^{n×k_i}, and some vectors b_i ∈ K^{m_i} and d_i ∈ K^{k_i}. The linear loop program LOOP(x, x') is called conjunctive iff there is only one disjunct, i.e., #I = 1.

Geometrically the relation LOOP corresponds to a union of convex polyhedra.



Definition 2 (Termination). We say that a linear loop program LOOP(x, x') terminates iff the relation LOOP(x, x') is well-founded.

Example 1. Consider the following program code.

    while ( q > 0 ):
        if ( y > 0 ):
            q := q − y − 1;
        else:
            q := q + y − 1;

We represent this as the following linear loop program:

    (q > 0 ∧ y > 0 ∧ y' = y ∧ q' = q − y − 1)
    ∨ (q > 0 ∧ y ≤ 0 ∧ y' = y ∧ q' = q + y − 1)

This linear loop program is not conjunctive. Furthermore, there is no infinite


sequence of states x0 , x1 , . . . such that for all i ≥ 0, the two successive states
(x_i, x_{i+1}) are contained in the relation LOOP. Hence the relation LOOP(x, x') is
well-founded and the linear loop program terminates.

3 Ranking Templates
A ranking template is a template for a well-founded relation. More specifically, it
is a parametrized formula defining a relation that is well-founded for all assign-
ments to the parameters. If we show that a given program’s transition relation
LOOP is a subset of an instance of this well-founded relation, it must be well-
founded itself and thus we have a proof for the program’s termination. Moreover,
an assignment to the parameters of the template gives rise to a ranking func-
tion. In this work, we consider ranking templates that can be encoded in linear
arithmetic.
We call a formula whose free variables contain x and x' a relation template. Each free variable other than x and x' in a relation template is called a parameter. Given an assignment ν to all parameter variables of a relation template T(x, x'), the evaluation ν(T) is called an instantiation of the relation template T. We note that each instantiation of a relation template T(x, x') defines a binary relation.
When specifying templates, we use parameter variables to define affine-linear
functions. For notational convenience, we will write f (x) instead of the term
sTf x + tf , where sf ∈ Kn and tf ∈ K are parameter variables. We call f an
affine-linear function symbol.
Definition 3 (Linear ranking template). Let T(x, x') be a template with parameters D and affine-linear function symbols F that can be written as a boolean combination of atoms of the form

    Σ_{f∈F} ( α_f · f(x) + β_f · f(x') ) + Σ_{d∈D} γ_d · d ▷ 0,

where α_f, β_f, γ_d ∈ K are constants and ▷ ∈ {≥, >}. We call T a linear ranking template over D and F iff every instantiation of T defines a well-founded relation.

Example 2. We call the following template with parameters D = {δ} and affine-linear function symbols F = {f} the Podelski-Rybalchenko ranking template [20].

    δ > 0 ∧ f(x) > 0 ∧ f(x') < f(x) − δ                              (1)

In the remainder of this section, we introduce a formalism that allows us to show that every instantiation of the Podelski-Rybalchenko ranking template defines a well-founded relation. Let us now check the additional syntactic requirements for (1) to be a linear ranking template:

    δ > 0              ≡   0 · f(x) + 0 · f(x') + 1 · δ > 0
    f(x) > 0           ≡   1 · f(x) + 0 · f(x') + 0 · δ > 0
    f(x') < f(x) − δ   ≡   1 · f(x) + (−1) · f(x') + (−1) · δ > 0

The next lemma states that we can prove termination of a given linear loop
program by checking that this program’s transition relation is included in an
instantiation of a linear ranking template.

Lemma 1. Let LOOP be a linear loop program and let T be a linear ranking
template with parameters D and affine-linear function symbols F . If there is an
assignment ν to D and F such that the formula
 
    ∀x, x'. ( LOOP(x, x') → ν(T)(x, x') )                            (2)

is valid, then the program LOOP terminates.

Proof. By definition, ν(T) is a well-founded relation and (2) is valid iff the rela-
tion LOOP is a subset of ν(T). Thus LOOP must be well-founded. 

In order to establish that a formula conforming to the syntactic requirements


is indeed a ranking template, we must show the well-foundedness of its instan-
tiations. According to the following lemma, we can do this by showing that an
assignment to D and F gives rise to a ranking function. A similar argument was
given in [3]; we provide significantly shortened proof by use of the Recursion
Theorem, along the lines of [15, Example 6.12].
Definition 4 (Ranking Function). Given a binary relation R over a set Σ,


we call a function ρ from Σ to some ordinal α a ranking function for R iff for
all x, x' ∈ Σ the following implication holds.

    (x, x') ∈ R  ⟹  ρ(x) > ρ(x')

Lemma 2. A binary relation R is well-founded if and only if there exists a


ranking function for R.

Proof. Let ρ be a ranking function for R. The image of a sequence decreasing


with respect to R under ρ is a strictly decreasing ordinal sequence. Because the
ordinals are well-ordered, this sequence cannot be infinite.
Conversely, the graph G = (Σ, R) with vertices Σ and edges R is acyclic by
assumption. Hence the function ρ that assigns to every element of Σ an ordinal
number such that ρ(x) = sup {ρ(x ) + 1 | (x, x ) ∈ R} is well-defined and exists
due to the Recursion Theorem [15, Theorem 6.11]. 

Example 3. Consider the terminating linear loop program LOOP from Example 1. A ranking function for LOOP is ρ : R² → ω, defined as follows.

    ρ(q, y) = ⌈q⌉   if q > 0, and
    ρ(q, y) = 0     otherwise.

Here ⌈·⌉ denotes the ceiling function that assigns to every real number r the smallest natural number that is larger or equal to r. Since we consider the natural numbers to be a subset of the ordinals, the ranking function ρ is well-defined.

We use assignments to a template’s parameters and affine-linear function sym-


bols to construct a ranking function. These functions are real-valued and we will
transform them into functions with codomain ω in the following way.

Definition 5. Given an affine-linear function f and a real number δ > 0 called the step size, we define the ordinal ranking equivalent of f as

    f̂(x) = ⌈f(x)/δ⌉   if f(x) > 0, and
    f̂(x) = 0           otherwise.

For better readability we use this notation, which does not explicitly refer to δ. In our presentation the step size δ is always clear from the context in which an ordinal ranking equivalent f̂ is used.

Example 4. Consider the linear loop program LOOP(x, x') from Example 1. For δ = 1/2 and f(q) = q + 1, the ordinal ranking equivalent of f with step size δ is

    f̂(q, y) = ⌈2(q + 1)⌉   if q + 1 > 0, and
    f̂(q, y) = 0            otherwise.
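For finite values, the ordinal ranking equivalent is just a ceiling of a scaled function value; the small helper below (illustrative only, the names are ours) makes Definition 5 and Example 4 executable.

    import math

    def ranking_equivalent(f, delta):
        # Ordinal ranking equivalent of an affine-linear function f with step
        # size delta (Definition 5), restricted to natural-number outputs.
        def f_hat(*state):
            v = f(*state)
            return math.ceil(v / delta) if v > 0 else 0
        return f_hat

    f_hat = ranking_equivalent(lambda q, y: q + 1, 0.5)   # Example 4
    print(f_hat(3, 0))   # ceil(2 * (3 + 1)) = 8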
The assignment from Example 4 to δ and f makes the implication (2) valid.
In order to invoke Lemma 1 to show that the linear loop program given in
Example 1 terminates, we need to prove that the Podelski-Rybalchenko ranking
template is a linear ranking template. We use the following technical lemma.

Lemma 3. Let f be an affine-linear function with step size δ > 0 and let x and x' be two states. If f(x) > 0 and f(x) − f(x') > δ, then f̂(x) > 0 and f̂(x) > f̂(x').

Proof. From f(x) > 0 follows that f̂(x) > 0. Therefore f̂(x) > f̂(x') in the case f̂(x') = 0. For f̂(x') > 0, we use the fact that f(x) − f(x') > δ to conclude that f(x)/δ − f(x')/δ > 1 and hence f̂(x) > f̂(x').

An immediate consequence of this lemma is that the Podelski-Rybalchenko


ranking template is a linear ranking template: any assignment ν to δ and f
satisfies the requirements of Lemma 3. Consequently, f̂ is a ranking function for
ν(T), and by Lemma 2 this implies that ν(T) is well-founded.

4 Examples of Ranking Templates

4.1 Multiphase Ranking Template

The multiphase ranking template is targeted at programs that go through a


finite number of phases in their execution. Each phase is ranked with an affine-
linear function and the phase is considered to be completed once this function
becomes non-positive.

Example 5. Consider the linear loop program from Figure 1. Every execution
can be partitioned into two phases: first y increases until it is positive and then
q decreases until the loop condition q > 0 is violated. Depending on the initial
values of y and q, either phase might be skipped altogether.

Definition 6 (Multiphase Ranking Template). We define the k-phase ranking template with parameters D = {δ_1, . . . , δ_k} and affine-linear function symbols F = {f_1, . . . , f_k} as follows.

    ⋀_{i=1}^{k} δ_i > 0                                              (3)
    ∧ ⋁_{i=1}^{k} f_i(x) > 0                                         (4)
    ∧ f_1(x') < f_1(x) − δ_1                                         (5)
    ∧ ⋀_{i=2}^{k} ( f_i(x') < f_i(x) − δ_i ∨ f_{i−1}(x) > 0 )        (6)
We say that the multiphase ranking function given by an assignment to


f1 , . . . , fk and δ1 , . . . , δk is in phase i, iff fi (x) > 0 and fj (x) ≤ 0 for all j < i.
The condition (4) states that there is always some i such that the multiphase
ranking function is in phase i. (5) and (6) state that if we are in a phase ≥ i,
then fi has to be decreasing by at least δi > 0. Note that the 1-phase ranking
template coincides with the Podelski-Rybalchenko ranking template.
Multiphase ranking functions are related to eventually negative expressions
introduced by Bradley, Manna, and Sipma [6]. However, in contrast to our ap-
proach, they require a template tree that specifies in detail how each loop tran-
sition interacts with each phase.
Lemma 4. The k-phase ranking template is a linear ranking template.
Proof. The k-phase ranking template conforms to the syntactic requirements to
be a linear ranking template. Let an assignment to the parameters D and the
affine-linear function symbols F of the k-phase template be given. Consider the
following ranking function with codomain ω · k.

    ρ(x) = ω · (k − i) + f̂_i(x)   if f_j(x) ≤ 0 for all j < i and f_i(x) > 0,     (7)
    ρ(x) = 0                        otherwise.

Let (x, x') ∈ T. By Lemma 2, we need to show that ρ(x') < ρ(x). From (4) follows that ρ(x) > 0. Moreover, there is an i such that f_i(x) > 0 and f_j(x) ≤ 0 for all j < i. By (5) and (6), f_j(x') ≤ 0 for all j < i because f_j(x') < f_j(x) − δ_j ≤ 0 − δ_j ≤ 0, since f_ℓ(x) ≤ 0 for all ℓ < j.
If f_i(x') ≤ 0, then ρ(x') ≤ ω · (k − i) < ω · (k − i) + f̂_i(x) = ρ(x). Otherwise, f_i(x') > 0 and from (6) follows f_i(x') < f_i(x) − δ_i. By Lemma 3, f̂_i(x) > f̂_i(x') for the ordinal ranking equivalent of f_i with step size δ_i. Hence

    ρ(x') = ω · (k − i) + f̂_i(x') < ω · (k − i) + f̂_i(x) = ρ(x).

Example 6. Consider the program from Figure 1. The assignment

    f_1(q, y) = 1 − y,    f_2(q, y) = q + 1,    δ_1 = δ_2 = 1/2

yields a 2-phase ranking function for this program.
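The inclusion required by Lemma 1 for this assignment can be checked mechanically. The sketch below (ours, not the authors' tool) asserts the negation of constraint (2) for the loop of Figure 1 and the instantiated 2-phase template in Z3 and expects unsat.

    from z3 import Reals, Q, Solver, And, Or, Not, Implies, unsat

    q, y, qp, yp = Reals("q y qp yp")
    loop = And(q > 0, qp == q - y, yp == y + 1)          # relation from Fig. 1

    delta = Q(1, 2)
    f1, f1p = 1 - y, 1 - yp                              # phase 1
    f2, f2p = q + 1, qp + 1                              # phase 2
    template = And(Or(f1 > 0, f2 > 0),                   # (4)
                   f1p < f1 - delta,                     # (5)
                   Or(f2p < f2 - delta, f1 > 0))         # (6)

    s = Solver()
    s.add(Not(Implies(loop, template)))                  # negated constraint (2)
    print("inclusion holds" if s.check() == unsat else s.model())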


Example 7. There are terminating conjunctive linear loop programs that do not
have a multiphase ranking function:

    q > 0 ∧ q' = q + z − 1 ∧ z' = −z

Here, the sign of z is alternated in each iteration. The function ρ(q, y, z) = q
is decreasing in every second iteration, but not decreasing in each iteration.
Example 8. Consider the following linear loop program.

    (q > 0 ∧ y > 0 ∧ y' = 0)
    ∨ (q > 0 ∧ y ≤ 0 ∧ y' = y − 1 ∧ q' = q − 1)
For a given input, we cannot give an upper bound on the execution time: starting
with y > 0, after the first loop execution, y is set to 0 and q is set to some
arbitrary value, as no restriction to q' applies in the first disjunct. In particular,
this value does not depend on the input. The remainder of the loop execution
then takes q iterations to terminate.
However we can prove the program’s termination with the 2-phase ranking
function constructed from f1 (q, y) = y and f2 (q, y) = q.

4.2 Piecewise Ranking Template


The piecewise ranking template formalizes a ranking function that is defined
piecewise using affine-linear predicates to discriminate the pieces.

Definition 7 (Piecewise Ranking Template). We define the k-piece rank-


ing template with parameters D = {δ} and affine-linear function symbols F =
{f1 , . . . , fk , g1 , . . . , gk } as follows.

    δ > 0                                                                          (8)
    ∧ ⋀_{i=1}^{k} ⋀_{j=1}^{k} ( g_i(x) < 0 ∨ g_j(x') < 0 ∨ f_j(x') < f_i(x) − δ )   (9)
    ∧ ⋀_{i=1}^{k} f_i(x) > 0                                                       (10)
    ∧ ⋁_{i=1}^{k} g_i(x) ≥ 0                                                       (11)

We call the affine-linear function symbols {gi | 1 ≤ i ≤ k} discriminators and


the affine-linear function symbols {fi | 1 ≤ i ≤ k} ranking pieces.
The disjunction (11) states that the discriminators cover all states; in other
words, the piecewise defined ranking function is a total function. Given the k
different pieces f1 , . . . , fk and a state x, we use fi as a ranking function only
if gi (x) ≥ 0 holds. This choice need not be unambiguous; the discriminators
may overlap. If they do, we can use any one of their ranking pieces. According
to (10), all ranking pieces are positive-valued and by (9) piece transitions are
well-defined: the rank of the new state is always less than the rank given by any of the ranking pieces assigned to the old state.

Lemma 5. The k-piece ranking template is a linear ranking template.

Proof. The k-piece ranking template conforms to the syntactic requirements to


be a linear ranking template. Let an assignment to the parameter δ and the
affine-linear function symbols F of the k-piece template be given. Consider the
following ranking function with codomain ω.

    ρ(x) = max { f̂_i(x) | g_i(x) ≥ 0 }                              (12)

The function ρ is well-defined, because according to (11), the set {f̂_i(x) | g_i(x) ≥ 0} is not empty. Let (x, x') ∈ T and let i and j be indices such that ρ(x) = f̂_i(x) and ρ(x') = f̂_j(x'). By definition of ρ, we have that g_i(x) ≥ 0 and g_j(x') ≥ 0, and (9) thus implies f_j(x') < f_i(x) − δ. According to Lemma 3 and (10), this entails f̂_j(x') < f̂_i(x) and therefore ρ(x') < ρ(x). Lemma 2 now implies that T is well-founded.
Example 9. Consider the following linear loop program.
    (q > 0 ∧ p > 0 ∧ q < p ∧ q' = q − 1)
    ∨ (q > 0 ∧ p > 0 ∧ p < q ∧ p' = p − 1)

In every loop iteration, the minimum of p and q is decreased by 1 until it becomes negative. Thus, this program is ranked by the 2-piece ranking function constructed from f_1(p, q) = p and f_2(p, q) = q with step size δ = 1/2 and discriminators g_1(p, q) = q − p and g_2(p, q) = p − q. Moreover, this program does not have
out bound during program execution due to non-determinism and the number
of switches between p and q being the minimum value is also unbounded.
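Again the template inclusion can be checked with an SMT solver. The following sketch (ours) instantiates conditions (9)–(11) with the pieces and discriminators above (δ = 1/2) and checks that the loop relation is contained in the resulting well-founded relation; the primed variable left unconstrained by a disjunct is handled by the universal validity check.

    from z3 import Reals, Q, Solver, And, Or, Not, Implies, unsat

    p, q, pp, qp = Reals("p q pp qp")
    loop = Or(And(q > 0, p > 0, q < p, qp == q - 1),
              And(q > 0, p > 0, p < q, pp == p - 1))

    delta = Q(1, 2)
    f, fp = [p, q], [pp, qp]                       # ranking pieces f1 = p, f2 = q
    g, gp = [q - p, p - q], [qp - pp, pp - qp]     # discriminators g1, g2

    template = And(
        And([Or(g[i] < 0, gp[j] < 0, fp[j] < f[i] - delta)    # (9)
             for i in range(2) for j in range(2)]),
        And(f[0] > 0, f[1] > 0),                              # (10)
        Or(g[0] >= 0, g[1] >= 0))                             # (11)

    s = Solver()
    s.add(Not(Implies(loop, template)))
    print("inclusion holds" if s.check() == unsat else s.model())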

4.3 Lexicographic Ranking Template


Lexicographic ranking functions consist of lexicographically ordered components
of affine-linear functions. A state is mapped to a tuple of values such that the
loop transition leads to a decrease with respect to the lexicographic ordering for
this tuple. Therefore no function may increase unless a function of a lower index
decreases. Additionally, at every step, there must be at least one function that
decreases.
Several different definitions for lexicographic ranking functions have been uti-
lized [2,4,5]; a comparison can be found in [4]. Each of these definitions for
lexicographic linear ranking functions can be formalized using linear ranking
templates; in this publication we are following the definition of [2].
Definition 8 (Lexicographic Ranking Template). We define the k-lexico-
graphic ranking template with parameters D = {δ1 , . . . , δk } and affine-linear
function symbols F = {f1 , . . . , fk } as follows.

    ⋀_{i=1}^{k} δ_i > 0                                                          (13)
    ∧ ⋀_{i=1}^{k} f_i(x) > 0                                                     (14)
    ∧ ⋀_{i=1}^{k−1} ( f_i(x') ≤ f_i(x) ∨ ⋁_{j=1}^{i−1} f_j(x') < f_j(x) − δ_j )   (15)
    ∧ ⋁_{i=1}^{k} f_i(x') < f_i(x) − δ_i                                          (16)
The conjunction (14) establishes that all lexicographic components f1 , . . . , fk


have positive values. In every step, at least one component must decrease ac-
cording to (16). From (15) follows that all functions corresponding to components
of smaller index than the decreasing function may increase.

Lemma 6. The k-lexicographic ranking template is a linear ranking template.

Proof. The k-lexicographic ranking template conforms to the syntactic require-


ments to be a linear ranking template. Let an assignment to the parameters
D and the affine-linear function symbols F of the k-lexicographic template be
given. Consider the following ranking function with codomain ω k .


k
ρ(x) = ω k−i · fi (x) (17)
i=1

Let (x, x ) ∈ T. From (14) follows fj (x) > 0 for all j, so ρ(x) > 0. By (16)
and Lemma 3, there is a minimal i such that fi (x ) < fi (x). According to (15),
f1 (x ) ≤ f1 (x) and hence inductively fj (x ) ≤ fj (x) for all j < i, since i was
minimal.

k 
i−1 
k
ρ(x ) = ω k−j · fj (x ) ≤ ω k−j · fj (x) + ω k−j · fj (x )
j=1 j=1 j=i


i−1
< ω k−j · fj (x) + ω k−i · fi (x) ≤ ρ(x)
j=1

Therefore Lemma 2 implies that T is well-founded. 

4.4 Composition of Templates


The multiphase ranking template, the piecewise ranking template, and the lex-
icographic ranking template defined in the previous sections can be used as a
’construction kit’ for more general linear ranking templates. Each of our tem-
plates contains lower bounds ((4), (10), (11), and (14)) and decreasing behavior
((5), (6), (9), (15), and (16)). We can compose templates by replacing the lower
bound conditions and decreasing behavior conditions to affine-linear function
symbols in our linear ranking templates with the corresponding conditions of
another template. This is possible because linear ranking templates allow ar-
bitrary boolean combination of inequalities and are closed under this kind of
substitution. For example, we can construct a template for a lexicographic rank-
ing function whose lexicographic components are multiphase functions instead
of affine-linear functions (see Figure 2). This encompasses the approach applied
by Bradley et al. [6].
    ⋀_{i=1}^{k} ⋀_{j=1}^{ℓ} δ_{i,j} > 0
    ∧ ⋀_{i=1}^{k} ⋁_{j=1}^{ℓ} f_{i,j}(x) > 0
    ∧ ⋀_{i=1}^{k−1} [ ( f_{i,1}(x') ≤ f_{i,1}(x) ∧ ⋀_{j=2}^{ℓ} ( f_{i,j}(x') ≤ f_{i,j}(x) ∨ f_{i,j−1}(x) > 0 ) )
          ∨ ⋁_{t=1}^{i−1} ( f_{t,1}(x') < f_{t,1}(x) − δ_{t,1} ∧ ⋀_{j=2}^{ℓ} ( f_{t,j}(x') < f_{t,j}(x) − δ_{t,j} ∨ f_{t,j−1}(x) > 0 ) ) ]
    ∧ ⋁_{i=1}^{k} ( f_{i,1}(x') < f_{i,1}(x) − δ_{i,1} ∧ ⋀_{j=2}^{ℓ} ( f_{i,j}(x') < f_{i,j}(x) − δ_{i,j} ∨ f_{i,j−1}(x) > 0 ) )

Fig. 2. A k-lexicographic ranking template with ℓ phases in each lexicographic component, with parameters D = {δ_{i,j}} and affine-linear function symbols F = {f_{i,j}}

5 Synthesizing Ranking Functions


Our method for ranking function synthesis can be stated as follows. We have
a finite pool of linear ranking templates. This pool will include the multiphase,
piecewise, and lexicographic ranking templates in various sizes and possibly com-
binations thereof. Given a linear loop program whose termination we want to
prove, we select a linear ranking template from the pool. With this template we
build the constraint (2) to the linear ranking template’s parameters. If this con-
straint is satisfiable, this gives rise to a ranking function according to Lemma 2.
Otherwise, we try again using the next linear ranking template from the pool
until the pool has been exhausted. In this case, the given linear loop program
does not have a ranking function of the form specified by any of the pool’s lin-
ear ranking templates and the proof of the program’s termination failed. See
Figure 3 for a specification of our method in pseudocode.
Following related approaches [2,5,6,7,14,20,24,25], we transform the ∃∀-con-
straint (2) into an ∃-constraint. This transformation makes the constraint more
easily solvable because it reduces the number of non-linear operations in the
constraint: every application of an affine-linear function symbol f corresponds
to a non-linear term sTf x + tf .

5.1 Constraint Transformation Using Motzkin’s Theorem


Fix a linear loop program LOOP and a linear ranking template T with parameters
D and affine-linear function symbols F . We write LOOP in disjunctive normal
form and T in conjunctive normal form:
Input:  linear loop program LOOP and a list of linear ranking templates T
Output: a ranking function for LOOP or null if none is found

    foreach T ∈ T do:
        let ϕ = ∀x, x'. LOOP(x, x') → T(x, x')
        let ψ = transformWithMotzkin(ϕ)
        if SMTsolver.checkSAT(ψ):
            let (D, F) = T.getParameters()
            let ν = getAssignment(ψ, D, F)
            return T.extractRankingFunction(ν)
    return null
Fig. 3. Our ranking function synthesis algorithm described in pseudocode. The func-
tion transformWithMotzkin transforms the ∃∀-constraint ϕ into an ∃-constraint ψ as
described in Subsection 5.1.



    LOOP(x, x') ≡ ⋁_{i∈I} A_i (x x') ≤ b_i

    T(x, x') ≡ ⋀_{j∈J} ⋁_{ℓ∈L_j} T_{j,ℓ}(x, x')

We prove the termination of LOOP by solving the constraint (2). This constraint
is implicitly existentially quantified over the parameters D and the parameters
corresponding to the affine-linear function symbols F .
   
    ∀x, x'. ( ⋁_{i∈I} A_i (x x') ≤ b_i ) → ⋀_{j∈J} ⋁_{ℓ∈L_j} T_{j,ℓ}(x, x')        (18)

First, we transform the constraint (18) into an equivalent constraint of the form
required by Motzkin’s Theorem.
    
    ∀x, x'. ⋀_{i∈I} ⋀_{j∈J} ¬( A_i (x x') ≤ b_i ∧ ⋀_{ℓ∈L_j} ¬T_{j,ℓ}(x, x') )      (19)

Now, Motzkin’s Transposition Theorem will transform the constraint (19) into
an equivalent existentially quantified constraint.
This ∃-constraint is then checked for satisfiability. If an assignment is found,
it gives rise to a ranking function. Conversely, if no assignment exists, then there
cannot be an instantiation of the linear ranking template and thus no ranking
function of the kind formalized by the linear ranking template. In this sense our
method is sound and complete.
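To illustrate the flavour of this transformation (using the affine form of Farkas’ lemma, the special case of Motzkin’s theorem without strict inequalities, rather than the exact construction used above), a single implication between linear constraints can be replaced by an existential constraint over fresh non-negative multipliers λ:

\[ \exists \lambda \ge 0.\;\; \lambda^{T} A = c^{T} \;\wedge\; \lambda^{T} b \le d \quad\Longrightarrow\quad \forall x.\; \big( A x \le b \;\rightarrow\; c^{T} x \le d \big), \]

and the two are equivalent whenever Ax ≤ b is satisfiable. After such a step the universally quantified variables x, x′ disappear; the remaining unknowns are the multipliers and the template parameters, and the non-linearity is typically confined to products between them.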

Theorem 2 (Soundness). If the transformed ∃-constraint is satisfiable, then


the linear loop program terminates.

Theorem 3 (Completeness). If the ∃∀-constraint (2) is satisfiable, then so


is the transformed ∃-constraint.

6 Related Work

The first complete method of ranking function synthesis for linear loop pro-
grams through constraint solving was due to Podelski and Rybalchenko [20]. Their
approach considers termination arguments in the form of affine-linear ranking func-
tions and requires only linear constraint solving. We explained the relation to
their method in Example 2.
Bradley, Manna, and Sipma propose a related approach for linear lasso pro-
grams [5]. They introduce affine-linear inductive supporting invariants to handle
the stem. Their termination argument is a lexicographic ranking function with
each component corresponding to one loop disjunct. This not only requires non-
linear constraint solving, but also an ordering on the loop disjuncts. The authors
extend this approach in [6] by the use of template trees. These trees allow each
lexicographic component to have a ranking function that decreases not neces-
sarily in every step, but eventually.
In [14] the method of Podelski and Rybalchenko is extended. Utilizing sup-
porting invariants analogously to Bradley et al., affine-linear ranking functions
are synthesized. Due to the restriction to non-decreasing invariants, the gener-
ated constraints are linear.
A collection of example-based explanations of constraint-based verification
techniques can be found in [24]. This includes the generation of ranking functions,
interpolants, invariants, resource bounds and recurrence sets.
In [4] Ben-Amram and Genaim discuss the synthesis of affine-linear and lex-
icographic ranking functions for linear loop programs over the integers. They
prove that this problem is generally co-NP-complete and show that several spe-
cial cases admit a polynomial time complexity.

References
1. Albert, E., Arenas, P., Genaim, S., Puebla, G.: Closed-form upper bounds in static
cost analysis. J. Autom. Reasoning 46(2), 161–203 (2011)
2. Alias, C., Darte, A., Feautrier, P., Gonnord, L.: Multi-dimensional rankings, pro-
gram termination, and complexity bounds of flowchart programs. In: Cousot, R.,
Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 117–133. Springer, Heidelberg
(2010)
3. Ben-Amram, A.M.: Size-change termination, monotonicity constraints and ranking
functions. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 109–
123. Springer, Heidelberg (2009)
4. Ben-Amram, A.M., Genaim, S.: Ranking functions for linear-constraint loops. In:
POPL (2013)
5. Bradley, A.R., Manna, Z., Sipma, H.B.: Linear ranking with reachability. In: Etes-
sami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 491–504. Springer,
Heidelberg (2005)
6. Bradley, A.R., Manna, Z., Sipma, H.B.: The polyranking principle. In: Caires, L.,
Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS,
vol. 3580, pp. 1349–1361. Springer, Heidelberg (2005)

7. Colón, M.A., Sankaranarayanan, S., Sipma, H.B.: Linear invariant generation using
non-linear constraint solving. In: Hunt Jr., W.A., Somenzi, F. (eds.) CAV 2003.
LNCS, vol. 2725, pp. 420–432. Springer, Heidelberg (2003)
8. Cook, B., Fisher, J., Krepska, E., Piterman, N.: Proving stabilization of biological
systems. In: Jhala, R., Schmidt, D. (eds.) VMCAI 2011. LNCS, vol. 6538, pp.
134–149. Springer, Heidelberg (2011)
9. Cook, B., Podelski, A., Rybalchenko, A.: Terminator: Beyond safety. In: Ball, T.,
Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 415–418. Springer, Heidelberg
(2006)
10. Grigor’ev, D.Y., Vorobjov Jr., N.N.: Solving systems of polynomial inequalities in
subexponential time. Journal of Symbolic Computation 5(1-2), 37–64 (1988)
11. Gulwani, S., Zuleger, F.: The reachability-bound problem. In: PLDI, pp. 292–304
(2010)
12. Gupta, A., Henzinger, T.A., Majumdar, R., Rybalchenko, A., Xu, R.G.: Proving
non-termination. In: POPL, pp. 147–158 (2008)
13. Harris, W.R., Lal, A., Nori, A.V., Rajamani, S.K.: Alternation for termination. In:
Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 304–319. Springer,
Heidelberg (2010)
14. Heizmann, M., Hoenicke, J., Leike, J., Podelski, A.: Linear ranking for linear lasso
programs. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp.
365–380. Springer, Heidelberg (2013)
15. Jech, T.: Set Theory, 3rd edn. Springer (2006)
16. Jovanović, D., de Moura, L.: Solving non-linear arithmetic. In: Gramlich, B., Miller,
D., Sattler, U. (eds.) IJCAR 2012. LNCS, vol. 7364, pp. 339–354. Springer, Hei-
delberg (2012)
17. Kroening, D., Sharygina, N., Tonetta, S., Tsitovich, A., Wintersteiger, C.M.: Loop
summarization using abstract transformers. In: Cha, S(S.), Choi, J.-Y., Kim,
M., Lee, I., Viswanathan, M. (eds.) ATVA 2008. LNCS, vol. 5311, pp. 111–125.
Springer, Heidelberg (2008)
18. Kroening, D., Sharygina, N., Tsitovich, A., Wintersteiger, C.M.: Termination anal-
ysis with compositional transition invariants. In: Touili, T., Cook, B., Jackson, P.
(eds.) CAV 2010. LNCS, vol. 6174, pp. 89–103. Springer, Heidelberg (2010)
19. Leike, J.: Ranking function synthesis for linear lasso programs. Master’s thesis,
University of Freiburg, Germany (2013)
20. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear rank-
ing functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp.
239–251. Springer, Heidelberg (2004)
21. Podelski, A., Rybalchenko, A.: Transition invariants. In: LICS, pp. 32–41 (2004)
22. Podelski, A., Rybalchenko, A.: Transition predicate abstraction and fair termina-
tion. In: POPL, pp. 132–144 (2005)
23. Podelski, A., Wagner, S.: A sound and complete proof rule for region stability of
hybrid systems. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007.
LNCS, vol. 4416, pp. 750–753. Springer, Heidelberg (2007)
24. Rybalchenko, A.: Constraint solving for program verification: Theory and practice
by example. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174,
pp. 57–71. Springer, Heidelberg (2010)
25. Sankaranarayanan, S., Sipma, H.B., Manna, Z.: Constraint-based linear-relations
analysis. In: Giacobazzi, R. (ed.) SAS 2004. LNCS, vol. 3148, pp. 53–68. Springer,
Heidelberg (2004)
26. Schrijver, A.: Theory of linear and integer programming. Wiley-Interscience series
in discrete mathematics and optimization. Wiley (1999)
FDR3 — A Modern Refinement Checker for CSP

Thomas Gibson-Robinson, Philip Armstrong, Alexandre Boulgakov,


and Andrew W. Roscoe

Department of Computer Science, University of Oxford


Wolfson Building, Parks Road, Oxford, OX1 3QD, UK
{thomas.gibson-robinson,philip.armstrong,
alexandre.boulgakov,bill.roscoe}@cs.ox.ac.uk

Abstract. FDR3 is a complete rewrite of the CSP refinement checker


FDR2, incorporating a significant number of enhancements. In this paper
we describe the operation of FDR3 at a high level and then give a detailed
description of several of its more important innovations. This includes
the new multi-core refinement-checking algorithm that is able to achieve
a near-linear speed-up as the number of cores increases. Further, we
describe the new algorithm that FDR3 uses to construct its internal
representation of CSP processes—this algorithm is more efficient than
FDR2’s, and is able to compile a large class of CSP processes to more
efficient internal representations. We also present experimental results
that compare FDR3 to related tools, which show it is unique (as far as
we know) in being able to scale beyond the bounds of main memory.

1 Introduction

FDR (Failures Divergence Refinement) is the most widespread refinement


checker for the process algebra CSP [1,2,3]. FDR takes a list of CSP processes,
written in machine-readable CSP (henceforth CSPM ) which is a lazy functional
language, and is able to check if the processes refine each other according to
the CSP denotational models (e.g. the traces, failures and failures-divergences
models). It is also able to check for more properties, including deadlock-freedom,
livelock-freedom and determinism, by constructing equivalent refinement checks.
FDR2 was released in 1996, and has been widely used both within industry
and in academia for verifying systems [4,5,6]. It is also used as a verification
backend for several other tools including: Casper [7] which verifies security pro-
tocols; SVA [8] which can verify simple shared-variable programs; in addition to
several industrial tools (e.g. ModelWorks and ASD).
FDR3 has been under development for the last few years as a complete rewrite
of FDR2. It represents a major advance over FDR2, not only in the size of system
that can be checked (we have verified systems with over ten billion states in a
few hours), but also in terms of its ease of use. FDR3 has also been designed
and engineered to be a stable platform for future development of CSP model-
checking tools, in addition to tools for CSP-like languages [2]. In this paper we
give an outline of FDR3, highlighting a selection of the advances made.


In Section 4 we describe the new multi-core refinement-checking algorithm


that achieves a near linear increase in performance as the number of cores in-
creases. Section 6 gives some experimental results that compare the performance
of the new algorithm to FDR2, Spin [9], DiVinE [10], and LTSmin [11].
In Section 5 we detail the new compilation algorithm, which constructs FDR’s
internal representation of CSP processes (i.e. labelled-transition systems) from
CSPM processes. This algorithm is an entirely new development and is able to
compile many CSP processes into more efficient labelled-transition systems. It
is also related to the operational semantics of CSP, unlike the FDR2 algorithm
which was based on heuristics.
In addition to the advances that we present in this paper, FDR3 incorporates
a number of other new features. Most notably, the graphical user interface has
been entirely rethought, and includes: a new CSPM type checker; a built-in ver-
sion of ProBE, the CSP process animator; and a new debugger that emphasises
interactions between processes. See the FDR3 manual [12] for further details.
Before describing the new advances, in Section 2 we briefly review CSP. In
Section 3 we then outline the high-level design and structure of FDR3.

2 CSP
CSP [1,2,3] is a process algebra in which programs or processes that communicate
events from a set Σ with an environment may be described. We sometimes
structure events by sending them along a channel. For example, c.3 denotes
the value 3 being sent along the channel c. Further, given a channel c the set
{|c|} ⊆ Σ contains those events of the form c.x .
The simplest CSP process is the process STOP that can perform no events.
The process a → P offers the environment the event a ∈ Σ and then behaves
like P. The process P □ Q offers the environment the choice of the events
offered by P and by Q and is not resolved by the internal action τ. P ⊓ Q
non-deterministically chooses which of P or Q to behave like. P ▷ Q initially
behaves like P, but can timeout (via τ) and then behaves as Q.
P A‖B Q allows P and Q to perform only events from A and B respectively
and forces P and Q to synchronise on events in A ∩ B. P ‖A Q allows P and Q to
run in parallel, forcing synchronisation on events in A and arbitrary interleaving
of events not in A. The interleaving of two processes, denoted P ||| Q, runs P
and Q in parallel but enforces no synchronisation. P \ A behaves as P but hides
any events from A by transforming them into the internal event τ. This event
does not synchronise with the environment and thus can always occur. P[[R]]
behaves as P but renames the events according to the relation R. Hence, if P can
perform a, then P[[R]] can perform each b such that (a, b) ∈ R, where the choice
(if more than one such b) is left to the environment (like □). P △ Q initially
behaves like P but allows Q to interrupt at any point and perform a visible
event, at which point P is discarded and the process behaves like Q. P ΘA Q
initially behaves like P, but if P ever performs an event from A, P is discarded
and P ΘA Q behaves like Q. Skip is the process that immediately terminates.

The sequential composition of P and Q , denoted P ; Q , runs P until it ter-


minates at which point Q is run. Termination is indicated using a ✓: Skip is
defined as ✓ → STOP and, if the left argument of P ; Q performs a ✓, P ; Q
performs a τ to the state Q (i.e. P is discarded and Q is started).
Recursive processes can be defined either equationally or using the notation
μ X · P . In the latter, every occurrence of X within P represents a recursive call.
An argument P of a CSP operator Op is on iff there is an operational rule that
allows it to perform an event; P is off iff no such rule exists. For example, the left argument of the exception
operator is on, whilst the right argument is off .
The simplest approach to giving meaning to a CSP expression is by defining
an operational semantics. The operational semantics of a CSP process naturally
creates a labelled transition system (LTS) where the edges are labelled by events
from Σ ∪ {τ} and the nodes are process states. Formally, an LTS is a 3-tuple
consisting of a set of nodes, an initial node, and a relation −a→ on the nodes: i.e.
it is a directed graph where each edge is labelled by an event. The usual way of
defining the operational semantics of CSP processes is by presenting Structured
Operational Semantics (SOS) style rules in order to define −a→. For instance,
the operational semantics of the exception operator are defined by:
\[
\frac{P \xrightarrow{\,a\,} P' \quad a \in A}{P \mathbin{\Theta_A} Q \xrightarrow{\,a\,} Q}
\qquad
\frac{P \xrightarrow{\,b\,} P' \quad b \notin A}{P \mathbin{\Theta_A} Q \xrightarrow{\,b\,} P' \mathbin{\Theta_A} Q}
\qquad
\frac{P \xrightarrow{\,\tau\,} P'}{P \mathbin{\Theta_A} Q \xrightarrow{\,\tau\,} P' \mathbin{\Theta_A} Q}
\]
The interesting rule is the first, which specifies that if P performs an event a ∈ A,
then P ΘA Q can perform the event a and behave like Q .
The SOS style of operational semantics is far more expressive than is required
to give an operational semantics to CSP, and indeed can define operators which,
for a variety of reasons, make no sense in CSP models. As pointed out in [3], it is
possible to re-formulate CSP’s semantics in the highly restricted combinator style
of operational semantics, which largely concentrates on the relationships between
events of argument processes and those of the constructed system. This style
says, inter alia, that only on arguments can influence events, that any τ action
of an on argument must be allowed to proceed freely, and that an argument
process has changed state in the result state if and only if it has participated
in the action. Cloning of on arguments is not permitted. Any language with a
combinator operational semantics can be translated to CSP with a high degree
of faithfulness [3] and is compositional over every CSP model. FDR3 is designed
so that it can readily be extended to such CSP-like languages.
CSP also has a number of denotational models, such as the traces, failures
and failures-divergences models. In these models, each process is represented by
a set of behaviours: the traces model represents a process by the set of sequences
of events it can perform; the failures model represents a process by the set of
events it can refuse after each trace; the failures-divergences model augments the
failures model with information about when a process can perform an unbounded
number of τ events. Two processes are equal in a denotational model iff they have
the same set of behaviours. If every behaviour of Impl is a behaviour of Spec in
the denotational model X, then Spec is refined by Impl, denoted Spec ⊑X Impl.
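As a toy illustration of refinement in the traces model only (the finite, explicitly enumerated trace sets and the Scala names below are ours, not part of FDR):

    // Traces refinement over explicitly enumerated, finite trace sets:
    // Spec is refined by Impl iff every trace of Impl is also a trace of Spec.
    def tracesRefines(specTraces: Set[List[String]], implTraces: Set[List[String]]): Boolean =
      implTraces.subsetOf(specTraces)

For example, tracesRefines(Set(List(), List("a")), Set(List())) is true, whereas adding the trace List("b") to the implementation makes it false. FDR, of course, never enumerates behaviour sets like this; it checks refinement on transition systems as described in the following sections.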

3 The Overall Structure of FDR3


As FDR3 is a refinement checker (deadlock freedom, determinism, etc. are con-
verted into refinement checks), we consider how FDR3 decides if P ⊑ Q.
Since P and Q will actually be CSPM expressions, FDR3 needs to evaluate
them to produce a tree of CSP operator applications. For example, if P was the
CSPM expression if true then c?x -> STOP else STOP, this would evaluate
to c.0 -> STOP [] c.1 -> STOP. Notice that the functional language has been
removed: all that remains is a tree of trivial operator applications, as follows.
Definition 1. A syntactic process P is generated according to the grammar:
P ::= Operator (P1 , . . . , PM ) | N where the Pi are also syntactic processes,
Operator is any CSP operator (e.g. external choice, prefix etc) and N is a process
name. A syntactic process environment Γ is a function from process name to
syntactic process such that Γ (N ) is never a process name.
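For illustration only (these are not FDR3’s actual data types; the Scala names are ours), Definition 1 corresponds to a small algebraic datatype:

    sealed trait SyntacticProcess
    // An application of a CSP operator (e.g. external choice, prefix) to its arguments.
    case class OperatorApp(operator: String, args: List[SyntacticProcess]) extends SyntacticProcess
    // A reference to a named process.
    case class ProcName(name: String) extends SyntacticProcess
    // A syntactic process environment is then a Map[String, SyntacticProcess]
    // whose values are never bare ProcNames.

Under this rendering, the evaluated process c.0 -> STOP [] c.1 -> STOP from the example above would be an OperatorApp for external choice whose two arguments are prefix applications.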
The evaluator converts CSPM expressions to syntactic processes. Since CSPM
is a lazy functional language, the complexity of evaluating CSPM depends on
how the CSPM code has been written. The evaluator is written in Haskell and is available
as part of the open-source Haskell library libcspm [13], which implements a
parser, type-checker and evaluator for CSPM .
Given a syntactic process, FDR3 then converts this to an LTS which is used
to represent CSP processes during refinement checks. In order to support various
features (most importantly, the compressions such as normalisation), FDR in-
ternally represents processes as generalised labelled transition systems (GLTSs),
rather than LTSs. These differ from LTSs in that the individual states can be
labelled with properties according to the semantic model in use. For example,
if the failures model is being used, a GLTS would allow states to be labelled
with refusals. The compiler is responsible for converting syntactic processes into
GLTSs. The primary challenge for the compiler is to decide which of FDR3’s in-
ternal representations of GLTSs (which have various trade-offs) should be used
to represent each syntactic process. This algorithm is detailed in Section 5.
After FDR3 has constructed GLTSs for the specification and implementation
processes, FDR3 checks for refinement. Firstly, as with FDR2, the specification
GLTS is normalised [3] to yield a deterministic GLTS with no τ ’s. Normalising
large specifications is expensive, however, generally specifications are relatively
small. FDR3 then checks if the implementation GLTS refines the normalised
specification GLTS according to the algorithm presented in Section 4.
Like FDR2, FDR3 supports a variety of compressions which can be used to
cut the state space of a system. FDR3 essentially supports the compressions
of [3], in some cases with significantly improved algorithms, which we will report
on separately. It also supports the chase operator of FDR2 which forces τ actions
and is a useful pruner of state spaces where it is semantically valid.
Like recent versions of FDR2, FDR3 supports the Timed CSP language
[14,15]. It uses the strategy outlined in [16,3] of translating the continuous Timed
CSP language to a variant of untimed CSP with prioritisation and relying on

function Refines(S, I, M)
    done ← {}                                    The set of states that have been visited
    current ← {(root(S), root(I))}               States to visit on the current ply
    next ← {}                                    States to visit on the next ply
    while current ≠ {} do
        for (s, i) ← current \ done do
            Check if i refines s according to M
            done ← done ∪ {(s, i)}
            for (e, i′) ∈ transitions(I, i) do
                if e = τ then next ← next ∪ {(s, i′)}
                else
                    t ← transitions(S, s, e)
                    if t = {} then Report trace error        S cannot perform the event
                    else
                        {s′} ← t
                        next ← next ∪ {(s′, i′)}
        current ← next
        next ← {}

Fig. 1. The single-threaded refinement-checking algorithm where: S is the normalised


specification GLTS; I is the implementation GLTS; M is the denotational model to
perform the check in; root(X ) returns the root of the GLTS X ; transitions(X , s) returns
the set of all (e, s′) such that there is a transition from s to s′ in the GLTS X labelled
by the event e; transitions(X , s, e) returns only successors under event e

theorems of digitisation [17]. In order to support this, FDR3 also supports the
prioritise operator [3,18], which has other interesting applications as shown
there.

4 Parallel Refinement Checking


We now describe the new multi-core refinement-checking algorithm that FDR3
uses to decide if a normalised GLTS P (recall that normalisation produces a
GLTS with no τ ’s and such that for each state and each event, there is a unique
successor state) is refined by another GLTS Q . We begin by outlining the refine-
ment checking algorithm of [2] and describing the FDR2 implementation [19].
We then define the parallel refinement-checking algorithm, before contrasting our
approach with the approaches taken by others to parallelise similar problems.
In this paper we concentrate on parallelising refinement checking on shared-
memory systems. We also concentrate on refinement checking in models that do
not consider divergence: we will report separately on parallelising this.

The Single-Threaded Algorithm Refinement checking proceeds by performing a


search over the implementation, checking that every reachable state is compatible
with every state of the specification after the same trace. A breadth-first search
is performed since this produces a minimal counterexample when the check fails.
The single threaded algorithm [2,19] is given in Figure 1.
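The following is a minimal executable Scala sketch of this breadth-first check, written over explicit transition maps rather than FDR’s GLTS types; the Lts type, the "tau" label and the compatible parameter (which abstracts the per-pair check of the denotational model, vacuous for traces and a refusal check for failures) are our own simplifications, not FDR’s API:

    import scala.collection.mutable

    case class Lts(root: Int, trans: Map[Int, List[(String, Int)]]) {
      def transitions(s: Int): List[(String, Int)] = trans.getOrElse(s, Nil)
    }

    // Returns true iff every reachable implementation state is compatible with the
    // corresponding state of the normalised specification after the same trace.
    def refines(spec: Lts, impl: Lts, compatible: (Int, Int) => Boolean): Boolean = {
      val done = mutable.Set[(Int, Int)]()
      var current = Set((spec.root, impl.root))
      var ok = true
      while (ok && current.nonEmpty) {
        val next = mutable.Set[(Int, Int)]()
        val it = current.iterator
        while (ok && it.hasNext) {
          val (s, i) = it.next()
          if (!done((s, i))) {
            if (!compatible(s, i)) ok = false              // i fails the check against s
            else {
              done += ((s, i))
              for ((e, i2) <- impl.transitions(i) if ok) {
                if (e == "tau") next += ((s, i2))          // spec state unchanged on a tau
                else spec.transitions(s).find(_._1 == e) match {
                  case None => ok = false                  // trace error: spec cannot perform e
                  case Some((_, s2)) => next += ((s2, i2)) // normalised spec: unique successor
                }
              }
            }
          }
        }
        current = next.toSet                               // move to the next ply
      }
      ok
    }

A plain Boolean result suffices for this sketch; FDR additionally reports a minimal counterexample, which is why the search is breadth-first.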
The interesting aspect of an implementation of the above algorithm is how
it stores the sets of states (i.e. current , next and done). FDR2 uses B-Trees for

function Worker(S, I, M, w)
    donew, currentw, nextw ← {}, {}, {}
    finishedw ← true
    if hash(root(S), root(I)) = w then
        currentw ← {(root(S), root(I))}
        finishedw ← false
    while ∨w∈Workers ¬finishedw do
        Wait for other workers to ensure the plys start together
        finishedw ← true
        for (s, i) ← currentw \ donew do
            finishedw ← false
            Check if i refines s according to M
            donew ← donew ∪ {(s, i)}
            for (i′, e) ∈ transitions(I, i) do
                if e = τ then
                    w′ ← hash(s, i′) mod #Workers
                    nextw′ ← nextw′ ∪ {(s, i′)}
                else
                    t ← transitions(S, s, e)
                    if t = {} then Report Trace Error
                    else
                        {s′} ← t
                        w′ ← hash(s′, i′) mod #Workers
                        nextw′ ← nextw′ ∪ {(s′, i′)}
        Wait for other workers to finish their ply
        currentw ← nextw
        nextw ← {}

Fig. 2. Each worker in a parallel refinement check executes the above function. The
set of all workers is given by Workers. Hash(s, i ) is an efficient hash function on the
state pair (s, i ). All other functions are as per Figure 1.

all of the above sets [19], primarily because this allowed checks to efficiently use
disk-based storage when RAM was exhausted (in contrast to, e.g. hash tables,
where performance often decays to the point of being unusable once RAM has
been exhausted). This brings the additional benefit that inserts into done (from
current ) can be performed in sorted order. Since B-Trees perform almost op-
timally under such workloads, this makes insertions into the done tree highly
efficient. To improve efficiency, inserts into the next tree are buffered, with the
buffer being sorted before insertion. The storage that the B-Tree uses is also
compressed, typically resulting in memory requirements being halved.

Parallelisation Parallelising FDR3’s refinement checking essentially reduces to


parallelising the breadth-first search of Figure 1. Our algorithm partitions the
state space based on a hash function on the node pairs. Each worker is assigned
a partition and has local current , next and done sets. When a worker visits
a transition, it computes the worker who is responsible for the destination by
hashing the new state pair. This algorithm is presented in Figure 2.
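For instance, the ownership function can be as simple as the following sketch (the hash itself is arbitrary here; FDR3’s actual hash function is not specified in this paper):

    // The worker responsible for a state pair, with workers numbered 0 .. numWorkers-1.
    def ownerOf(pair: (Int, Int), numWorkers: Int): Int =
      java.lang.Math.floorMod(pair.hashCode, numWorkers)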
Whilst the abstract algorithm is straightforward, the implementation has to be
carefully designed in order to obtain good performance. As before, our primary

consideration is minimising memory usage. In fact, this becomes even more critical
in the parallel setting since memory will be consumed at a far greater rate: with 16
cores, FDR3 can visit up to 7 billion states per hour consuming 70GB of storage.
Thus, we need to allow checks to exceed the size of the available RAM. Given the
above, B-Trees are a natural choice for storing the sets.
All access to the done and current B-Trees is restricted to the worker who
owns those B-Trees, meaning that there are no threading issues to consider. The
next B-Trees are more problematic: workers can generate node pairs for other
workers. Thus, we need to provide some way of accessing the next B-Trees of
other workers in a thread-safe manner. Given the volume of data that needs to
be put into next (which can be an order of magnitude greater than the volume
put into done), locking the tree is undesirable. One option would be to use fine-
grained locking on the B-Tree, however this is difficult to implement efficiently.
Instead of using complex locks, we have generalised the buffering that is used
to insert into next under the single-threaded algorithm. Each worker w has a set
of buffers, one for each other worker, and a list of buffers it has received from
other workers that require insertion into this worker’s next . When a buffer of
worker w for worker w′ ≠ w fills up, it immediately passes it to the target worker.
Workers periodically check the stack of pending buffers to be flushed, and when
a certain size is exceeded, they perform a bulk insert into next by performing an
n-way merge of all of the pending buffers to produce a single sorted buffer.
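A much-simplified sketch of this buffering scheme is given below; the types and names are ours and elide FDR3’s B-Trees and compression, using a sorted set to stand in for the next tree:

    import java.util.concurrent.ConcurrentLinkedQueue
    import scala.collection.mutable
    import scala.collection.mutable.ArrayBuffer

    // Outgoing side: worker w accumulates state pairs destined for each other worker
    // and hands over a buffer as soon as it fills up. pending(t) is the thread-safe
    // queue of buffers awaiting insertion by worker t.
    class OutBuffers(numWorkers: Int, capacity: Int,
                     pending: Array[ConcurrentLinkedQueue[ArrayBuffer[(Int, Int)]]]) {
      private val buffers = Array.fill(numWorkers)(new ArrayBuffer[(Int, Int)])

      def send(target: Int, pair: (Int, Int)): Unit = {
        val b = buffers(target)
        b += pair
        if (b.length >= capacity) { pending(target).add(b.clone()); b.clear() }
      }

      def flushAll(): Unit =
        for (t <- 0 until numWorkers if buffers(t).nonEmpty) {
          pending(t).add(buffers(t).clone()); buffers(t).clear()
        }
    }

    // Receiving side: periodically drain the pending buffers and bulk-insert them into
    // this worker's next set (a stand-in for the sorted n-way merge into a B-Tree).
    def drainInto(next: mutable.TreeSet[(Int, Int)],
                  pending: ConcurrentLinkedQueue[ArrayBuffer[(Int, Int)]]): Unit = {
      var b = pending.poll()
      while (b != null) { next ++= b; b = pending.poll() }
    }

Only the hand-over of full buffers crosses threads here (via the concurrent queue); each worker’s own buffers and next set remain single-threaded, mirroring the design described above.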
One potential issue this algorithm could suffer from is uneven distribution
amongst the workers. We have not observed this problem: the workers have
terminated at roughly the same time. If necessary this could be addressed by
increasing the number of partitions, with workers picking a partition to work on.
We give experimental results that show the algorithm is able to achieve a near
linear speed up in Section 6.

Related Work There have been many algorithms proposed for parallelising BFS,
e.g. [20,21,22,23]. In general, these solutions do not attempt to optimise memory
usage or performance once RAM has been exhausted to the same degree.
The authors of [20] parallelised the FDR2 refinement checker for cluster sys-
tems that used MPI. The algorithm they used was similar to our algorithm in
that nodes were partitioned amongst the workers and that B-Trees were used
for storage. The main difference comes from the communication of next : in their
approach this was deferred until the end of each round where a bulk exchange
was done, whereas in our model we use a complex buffer system.
The authors of [21] propose a solution that is optimised for performing a BFS
on sparse graphs. This uses a novel tree structure to efficiently (in terms of time)
store the bag of nodes that are to be visited on the next ply. This was not suitable
for FDR since it does not provide a general solution for eliminating duplicates
in next , which would cause FDR3 to use vastly more memory.
The author of [23] enhances the Spin Model Checker [9] to support parallel
BFS. In this solution, which is based on [24], done is a lock-free hash-table and is
shared between all of the workers, whilst new states are randomly assigned to a
number of subsets which are lock-free linked lists. This approach is not suitable

for FDR since hash-tables are known not to perform well once RAM has been
exhausted (due to their essentially random access pattern). Storing next in a
series of linked-lists is suitable for Spin since it can efficiently check if a node is
in done using the lock-free hash-table. This is not the case for FDR, since there
is no way of efficiently checking if a node is in the done B-Tree of a worker.

5 Compiler
As outlined in Section 3, the compiler is responsible for converting syntactic
processes into GLTSs. This is a difficult problem due to the generality of CSP
since operators can be combined in almost arbitrary ways. In order to allow the
processes to be represented efficiently, FDR3 has a number of different GLTS
types as described in Section 5.1, and a number of different ways of construct-
ing each GLTS, as described in Section 5.2. In Section 5.3 we detail the new
algorithm that the compiler uses to decide which of FDR3’s representations of
GLTSs to use. This is of critical importance: if FDR3 were to choose the wrong
representation this could cause the time to check a property and the memory
requirements to greatly increase.

5.1 GLTSs
FDR3 has two main representations of GLTSs: Explicit and Super-Combinator ma-
chines. Explicit machines require memory proportional to the number of states and
transitions during a refinement check. In contrast, Super-Combinator machines only
require storage proportional to the number of states, since the transitions can be
computed on-the-fly. Equally, it takes longer to calculate the transitions of a Super-
Combinator machine than the corresponding Explicit machine.
An Explicit GLTS is simply a standard graph data structure. Nodes in an
Explicit GLTS are process states whilst the transitions are stored in a sorted list.
A Super-Combinator machine represents the LTS by a series of component LTSs
along with a list of rules to combine the transitions of the components. Nodes
for a Super-Combinator machine are tuples, with one entry for each component
machine. For example, a Super-Combinator for P ||| Q consists of the components
⟨P, Q⟩ and the rules:
{({1 ↦ a}, a) | a ∈ αP ∪ {τ}} ∪ {({2 ↦ a}, a) | a ∈ αQ ∪ {τ}}
where αX is the alphabet of the process X (i.e. the set of events it can perform).
These rules describe how to combine the actions of P and Q into actions of the
whole machine. A single rule is of the form (f , e) where f is a partial function
from the index of a component machine (e.g. in the above example, 1 represents
P ) to the event that component must perform. e is the event the overall machine
performs if all components perform their required events.
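The rule format can be made concrete with the following sketch (our own, much-simplified rendering: components are deterministic here, and formats and restarts are omitted):

    // A component LTS with a deterministic transition function, and a rule (f, e)
    // given by the partial function `required` (component index -> event) and the
    // event `result` performed by the composite machine.
    case class Component(init: Int, delta: Map[(Int, String), Int])
    case class Rule(required: Map[Int, String], result: String)

    // The transitions of the composite machine from a given tuple of component states:
    // a rule fires when every component it mentions can perform its required event;
    // exactly those components change state, the rest stay put.
    def compositeSuccessors(comps: Vector[Component], rules: List[Rule],
                            state: Vector[Int]): List[(String, Vector[Int])] =
      for {
        rule <- rules
        if rule.required.forall { case (i, ev) => comps(i).delta.contains((state(i), ev)) }
      } yield {
        val next = state.zipWithIndex.map { case (s, i) =>
          rule.required.get(i) match {
            case Some(ev) => comps(i).delta((s, ev)) // participating component moves
            case None     => s                       // non-participating component stays put
          }
        }
        (rule.result, next)
      }

For P ||| Q this would be instantiated with two components and the rules listed above.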
Rules can also be split into formats, which are sets of rules. For example, a
Super-Combinator for P ; Q would start in format 1 , which has the rules:
{({1 ↦ a}, a, 1) | a ∈ αP ∪ {τ}, a ≠ ✓} ∪ {({1 ↦ ✓}, τ, 2)}.

The second format has the rules: {({2 ↦ a}, a, 2) | a ∈ αQ ∪ {τ}}. Thus, the
first format allows P to perform visible events and stay in format 1 (as indicated
by the third element of the tuple), but if P performs a ✓ and terminates, the
second format is started which allows Q to perform visible events.
Rules can also specify that component machines should be restarted. For ex-
ample, to represent P = X ; P as a Super-Combinator, there needs to be a way of
restarting the process X after a ✓. Thus, we add to the rules a list of components
whose states should be discarded and replaced by their root states:

{({1 ↦ a}, a, 1, ⟨⟩) | a ∈ αX ∪ {τ}, a ≠ ✓} ∪ {({1 ↦ ✓}, τ, 1, ⟨1⟩)}.

The first rule set allows X to perform non-✓ events as usual. However, if X ever
performs a ✓ this is converted into a τ and component 1 (i.e. X) is restarted.
FDR also recursively combines the rules for Super-Combinator machines. For
example, (P ||| Q ) ||| R is not represented as two different Super-Combinator
machines, but instead the rules for P ||| Q and · ||| R are combined. This
process is known as supercompilation. As you might expect from the name, super-
combinators are closely related to combinator operational semantics: the “super”
essentially coincides with the joining together using supercompilation.

5.2 Strategies

There are several different strategies that FDR3 can use to construct Explicit or
Super-Combinator machines from syntactic processes. These strategies differ in
the type of processes that they can support (e.g. some cannot support recursive
processes), the time they take to execute and the type of the resulting GLTS.
The low-level is the simplest strategy and supports any process. An Explicit
LTS is constructed simply by directly applying CSP’s operational semantics.
The high-level compiles a process to a Super-Combinator. This is not able to
compile recursive processes, such as P = a → P. The supercombinator rules are
directly constructed using the operational semantics of CSP.
The mixed-level is a hybrid of the low and high-level strategies where, in-
tuitively, non-recursive parts of processes are compiled as per the high-level
strategy whilst recursive parts are compiled as per the low-level strategy. For
example, consider P = a → P □ b → (X ||| Y): compiling X ||| Y at the
high-level is preferable since it does not require the cartesian product of X
and Y to be formed. If P is compiled at the mixed-level, X ||| Y is compiled
at the high-level, and a → P □ b → · is compiled into an Explicit machine.
These are wrapped in a Super-Combinator machine that starts X ||| Y when
the Explicit machine performs the b. The supercombinator has two formats,
the first with the rules: {({1 ↦ a}, a, 1), ({1 ↦ b}, b, 2)} and the second with:
{({2 ↦ a}, a, 2) | a ∈ α(X ||| Y) ∪ {τ}}. Thus, when the first process performs
b, the Super-Combinator moves to the second format in which X ||| Y is run.
The next section formalises the set of process that can be compiled in this way.
The recursive high-level strategy is new in FDR3. This compiles to a Super-
Combinator machine and allows some recursive processes (which we formalise

in the next section) to be compiled. This is used to compile processes such as


P = (X ||| Y ) ; P which are recursive, but are desirable to compile to Super-
Combinator machines for efficiency reasons (as above, constructing X ||| Y is
expensive). In this particular case, X ||| Y is compiled to a Super-Combinator
machine, and then a recursive supercombinator is constructed with the rules:
{({1 ↦ a}, a, 1, ⟨⟩) | a ∈ α(X ||| Y) ∪ {τ}, a ≠ ✓} ∪ {({1 ↦ ✓}, τ, 1, ⟨1⟩)}.
Recall that the last component in the above rules indicates that component 1
should be reset. Thus, the above rules indicate that X ||| Y can perform non-✓
events normally, but a ✓ will cause X ||| Y to be reset to its initial state.
The majority of processes can be compiled at the recursive high-level, with
the exception of those that recurse through an on argument of an operator (e.g.
P = a → P □ b → P). For example, consider the process P = X ; (P □ . . .):
since □ is not discarded by a τ, it follows that this recursion is safe only when
X always performs a visible event before a ✓ (otherwise there would be an
infinite series of □’s applied). This cannot be determined statically (i.e. without
accessing the transitions of X ), and thus it is not possible to determine if the
process can be compiled at the recursive high-level. Thankfully, such processes
are sufficiently rare in the context where recursive high-level is of use.

5.3 Picking a Strategy


We now describe the new algorithm that FDR3 uses to decide how to compile a
syntactic process. The input to the compilation algorithm is a syntactic process
environment (Definition 1) and the output is a list of strategies that specify
how each syntactic processes should be compiled. The algorithm guarantees to
produce a strategy such that executing the strategy yields a valid GLTS that
corresponds to the input process. The algorithm also uses heuristics to attempt
to reduce the time and memory usage during the subsequent refinement check.
All operators have a preferred level of compilation, either low (indicating Ex-
plicit is preferred) or high (indicating Super-Combinator is preferred). For exam-
ple, prefix prefers low whilst interleave prefers high. In general, FDR3 aims
to compile an operator at its preferred level. If this is high, this may require
using the mixed and recursive high-level strategies on surrounding processes (a
preference for high is more important). When this is not possible (because, e.g.,
the processes do not permit the mixed level), the low-level strategy is used.
The first step is to calculate the strongly connected components (SCCs) of re-
cursive processes. This is done by performing a DFS on the recursion graph that
is naturally formed from the syntactic process environment. Then, we compute
which SCCs can be compiled at the recursive high-level, and which SCCs would
prefer to be compiled at the recursive high-level (by incorporating preferences,
e.g. prefix prefers to recurse at low, but ; prefers high). The graph is also used to
check for invalid processes, such as P = P □ P: formally, for each process name
P we check that on each path back to P , at least one off argument is traversed.
Using the recursion graph, FDR3 computes which strategy to use to compile
a syntactic process P . This cannot be done in ignorance of the context of P ,

function Strategy(P, r)                            P is a syntactic process, r is an event type
    as ← ⟨⟩                                        The strategy for each argument of P
    for each argument Q of P do
        forceLow ← false                           Set to true if this must be compiled at low
        if Q is an on argument of P then
            r′ ← r ⊓ discards(P, Q)
            forceLow ← r′ = None
        else                                       Q is off
            if r ⊓ turnedOnBy(P, Q) = None then    This might get turned on by
                forceLow ← true                    an event that does not discard the context
            else r′ ← Any                          The context is discarded when Q is turned on
        if forceLow then as ← as ⌢ ⟨Low⟩
        else as ← as ⌢ ⟨Strategy(Q, r′)⟩
    allLow ← ∀ a ∈ as . a = Low
    if (P is recursive ∨ r ≠ Any) ∧ recursionType(P) ≠ High then
        if allLow then return Low
        else return Mixed
    else if P is recursive then return RecursiveHigh
    else if P prefers Low then
        if allLow then return Low
        else return Mixed
    else return High

Fig. 3. The algorithm FDR3 uses to decide how to compile syntactic processes

since this may dictate how a process is compiled. For example, P = a → P □ Q


requires Q to be compiled at the low-level, since P is a low-level recursion and Q
appears as an on argument of an operator that is on the recursion path. Thus,
when compiling a syntactic process, we need to be aware of the surrounding
context C[·] (e.g. C1[X] = X ||| STOP). When deciding on the strategy for P,
the relevant fact about the context is what events P can perform to cause the
context to be discarded. For example, nothing can discard the context C1, whilst
any visible event discards the context C2[X] = X □ STOP. As we are interested
in statically analysing processes, we approximate these sets as follows.
Definition 2. An event type is either None, Invisible, Visible or Any. The
relation < is defined as None < Invisible, None < Visible, Invisible < Any,
Visible < Any. Note < is a partial order on event types. The meet of e1 and e2
is denoted by e1 ⊓ e2.
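A direct Scala rendering of this lattice and its meet is straightforward (the ET prefix below is only to avoid clashing with Scala’s built-in None and Any):

    sealed trait EventType
    case object ETNone extends EventType
    case object ETInvisible extends EventType
    case object ETVisible extends EventType
    case object ETAny extends EventType

    // The meet e1 ⊓ e2 with respect to the partial order of Definition 2.
    def meet(a: EventType, b: EventType): EventType = (a, b) match {
      case (ETAny, x)       => x
      case (x, ETAny)       => x
      case (x, y) if x == y => x
      case _                => ETNone // e.g. ETInvisible ⊓ ETVisible = ETNone
    }

For example, meet(ETVisible, ETAny) is ETVisible, whilst meet(ETVisible, ETInvisible) is ETNone.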
Definition 3. Let Q be an argument of a syntactic process P . If Q is on,
then discards(P , Q ) returns the event type that Q performs to cause P to be
discarded and Q to be left running (e.g. discards(X □ Y, X) = Visible, whilst
discards(X ||| Y , X ) = None). If Q is off , then turnedOnBy(P , Q ) returns the
event type that P performs in order to turn on Q . For example, turnedOnBy(X ;
Y , Y ) = Invisible whilst turnedOnBy(X Θ· Y , Y ) = Visible.
Thus it is possible to use discards along with the meet on event types to
compute when a context will be discarded.

Figure 3 defines a function Strategy(P , r ) that returns the strategy that should
be used to compile the syntactic process P in a context that is discarded by
events of event type r . Informally, given a process P and an event type r this
firstly recursively visits each of its arguments, passing down an appropriate event
restriction (which is computed using discards for on arguments and turnedOnBy
for off arguments). It may also force some arguments to be low-level if the
restriction becomes None. Then, a compilation strategy for P is computed by
considering the preferences of the operator, whether the operator is recursive
and the deduced strategies for the arguments. The overriding observation behind
this choice is that compilation at high is only allowed when the process is non-
recursive, and when there is no surrounding context (i.e. r = Any).

5.4 Related Work

FDR2 has support for Explicit and Super-Combinator GLTSs, along with a GLTS
definition for each CSP operator (e.g. external choice etc). We believe that the
FDR3 representation is superior, since it requires fewer GLTS types to be main-
tained and because it makes the GLTSs independent of CSP, making other pro-
cess algebras easier to support. As mentioned in Section 5.2, FDR2 did not make
use of the recursive high-level, and was unable to automatically compile processes
such as P = (X ||| Y ) ; P at the high-level. We have found that the recursive
high-level has dramatically decreased compilation time on many examples.
The biggest difference is in the algorithm that each uses to compile syntac-
tic processes. FDR2 essentially used a series of heuristics to accomplish this and
would always start trying to compile the process at its preferred level, backtrack-
ing where necessary. This produced undesirable behaviour on certain processes.
We believe that since the new algorithm is based on the operational semantics of
CSP, it is simpler and can be easily applied to other CSP-like process algebras.

6 Experiments

We compare the performance of a pre-release version of FDR 3.1.0 to FDR 2.94,


Spin 6.25, DiVinE 3.1 beta 1, and LTSmin 2.0, on a complete traversal of a
graph. The experiments were performed on a Linux server with two 8 core 2GHz
Xeons with hyperthreading (i.e. 32 virtual cores), 128GB RAM, and five 100GB
SSDs. All input files are available from the first author’s webpage. — denotes a
check that took over 6 hours, * denotes a check that was not attempted, and †
denotes a check that could not be completed due to insufficient memory. Times
refer to the total time required to run each program whilst memory figures refer
to the maximum Resident Set Size plus any on-disk storage used.
Figure 4a compares the performance of FDR2 and FDR3. FDR3 with 1
worker substantially outperforms FDR2. This is because FDR3’s B-Tree has
been heavily optimised and because FDR3 makes fewer allocations during re-
finement checks. FDR3 with 1 worker also uses less memory than FDR2: this is
due to a new compaction algorithm used to compress B-Tree blocks that only

Input          States (10^6)   Transitions (10^6)   FDR2           FDR3-1         FDR3-32        FDR3-32 Speedup
(The FDR2, FDR3-1 and FDR3-32 columns give the time in seconds, with the storage used in GB in parentheses.)
bully.7 129 1354 2205 (4.8) 1023 (2.2) 85 (5.5) 12.0
cuberoll.0 7524 20065 — — 3546 (74.5) —
ddb.0 65 377 722 (1.4) 405 (0.5) 31 (2.36) 13.1
knightex.5.5 67 259 550 (1.4) 282 (0.6) 23 (2.4) 12.3
knightex.3.11 19835 67321 * * 26235 (298.5) —
phils.10 60 533 789 (1.3) 431 (0.5) 32 (2.0) 13.5
solitare.0 187 1487 2059 (4.4) 1249 (1.6) 84 (3.8) 14.9
solitare.1 1564 13971 19318 (35.1) 11357 (11.7) 944 (17.5) 12.0
solitaire.2 11622 113767 * * 9422 (113.3) —
tnonblock.7 322 635 2773 (6.7) 937 (2.6) 109 (6.8) 8.6
(a) Times comparing FDR2, FDR3 with 1 worker, and FDR3 with 32 workers.
[Plots (b) and (c) are omitted from this text rendering. Plot (b) shows speedup against the number of workers for bully.7, solitaire.0 and tnonblock.7; plot (c) shows millions of states visited per second against time for knightex.3.11, marking the point at which memory is exceeded.]
(b) FDR3’s scaling performance. (c) Disk storage performance on knightex.3.11.

Fig. 4. Experimental results demonstrating FDR3’s performance

stores the difference between keys. The extra memory used for the parallel ver-
sion is for extra buffers and the fact that the B-Tree blocks do not compress as
well.
The speed-ups that Figures 4a and 4b exhibit between 1 and 32 workers vary
according to the problem. solitaire is sped up by a factor of 15 which is almost
optimal given the 16 cores. Conversely, tnonblock.7 is only sped up by a factor
of 9 because it has many small plys, meaning that the time spent waiting for
other workers at the end of a ply is larger.
Figure 4c shows how the rate at which FDR3 visits states changes during
the course of verifying knightex.3.11, which required 300GB of storage (FDR3
used 110GB of memory as a cache and 190GB of on-disk storage). During a
refinement check, the rate at which states are explored will decrease because the
B-Trees increase in size. Observe that there is no change in the decrease of the
state visiting rate after memory is exceeded. This demonstrates that B-Trees are
effectively able to scale to use large amounts of on-disk storage.
Figure 5 compares the performance of FDR3, Spin, DiVinE and LTSmin. For
in-memory checks Spin, DiVinE and LTSmin complete the checks up to three

Input            Spin-32        FDR3-32        DiVinE-32      LTSmin-32
(Time in seconds, with the storage used in GB in parentheses.)
knightex.5.5 12 (5.8) 23 (2.4) 13 (4.6) 28 (33.1)
knightex.3.10 396 (115.0) 943 (22.7) † 395 (35.5)
knightex.3.11 † 26235 (298.5) † †
solitaire.0 89 (15.5) 85 (3.9) 85 (14.3) 73 (36.4)

Fig. 5. A comparison between FDR3, Spin, DiVinE and LTSmin. knightex.3.10 has
2035 × 10^6 states and 6786 × 10^6 transitions.

times faster than FDR3 but use up to four times more memory. We believe that
FDR3 is slower because supercombinators are expensive to execute in comparison
to the LTS representations that other tools use, and because B-Trees are slower
to insert into than hashtables. FDR3 was the only tool that was able to complete
knightex.3.11 which requires use of on-disk storage; Spin, DiVinE and LTSmin
were initially fast, but dramatically slowed once main memory was exhausted.

7 Conclusions
In this paper we have presented FDR3, a new refinement checker for CSP. We
have described the new compiler that is more efficient, more clearly defined
and produces better representations than the FDR2 compiler. Further, we have
detailed the new parallel refinement-checking algorithm that is able to achieve
a near-linear speed-up as the number of cores increases whilst ensuring efficient
memory usage. Further, we have demonstrated that FDR3 is able to scale to
enormous checks that far exceed the bounds of memory, unlike related tools.
This paper concentrates on parallelising refinement checks on shared-memory
systems. It would be interesting to extend this to support clusters instead: this
would allow even larger checks to be run. It would also be useful to consider
how to best parallelise checks in the failures-divergence model. This is a difficult
problem, in general, since this uses a depth-first search to find cycles.
FDR3 is available for 64-bit Linux and Mac OS X from https://www.
cs.ox.ac.uk/projects/fdr/. FDR3 is free for personal use or academic re-
search, whilst commercial use requires a licence.

Acknowledgements. This work has benefitted from many useful conversations


with Michael Goldsmith, Colin O’Halloran, Gavin Lowe, and Nick Moffat. We
would also like to thank the anonymous reviewers for their useful comments.

References
1. Hoare, C.A.R.: Communicating Sequential Processes. Prentice-Hall, Inc., Upper
Saddle River (1985)
2. Roscoe, A.W.: The Theory and Practice of Concurrency. Prentice Hall (1997)

3. Roscoe, A.W.: Understanding Concurrent Systems. Springer (2010)


4. Lawrence, J.: Practical Application of CSP and FDR to Software Design. In: Abdal-
lah, A.E., Jones, C.B., Sanders, J.W. (eds.) CSP25. LNCS, vol. 3525, pp. 151–174.
Springer, Heidelberg (2005)
5. Mota, A., Sampaio, A.: Model-checking CSP-Z: strategy, tool support and indus-
trial application. Science of Computer Programming 40(1) (2001)
6. Fischer, C., Wehrheim, H.: Model-Checking CSP-OZ Specifications with FDR. In:
IFM 1999. Springer (1999)
7. Lowe, G.: Casper: A Compiler for the Analysis of Security Protocols. Journal of
Computer Security 6(1-2) (1998)
8. Roscoe, A.W., Hopkins, D.: SVA, a Tool for Analysing Shared-Variable Programs.
In: Proceedings of AVoCS 2007 (2007)
9. Holzmann, G.: Spin Model Checker: The Primer and Reference Manual. Addison-
Wesley Professional (2003)
10. Barnat, J., Brim, L., Havel, V., Havlíček, J., Kriho, J., Lenčo, M., Ročkai, P., Štill,
V., Weiser, J.: DiVinE 3.0 – An Explicit-State Model Checker for Multithreaded C
& C++ Programs. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044,
pp. 863–868. Springer, Heidelberg (2013)
11. Laarman, A., van de Pol, J., Weber, M.: Multi-Core LTSmin: Marrying Modularity
and Scalability. In: Bobaru, M., Havelund, K., Holzmann, G.J., Joshi, R. (eds.)
NFM 2011. LNCS, vol. 6617, pp. 506–511. Springer, Heidelberg (2011)
12. University of Oxford, Failures-Divergence Refinement—FDR 3 User Manual
(2013), https://www.cs.ox.ac.uk/projects/fdr/manual/
13. University of Oxford, libcspm (2013), https://github.com/tomgr/libcspm
14. Reed, G.M., Roscoe, A.W.: A Timed Model for Communicating Sequential Pro-
cesses. Theoretical Computer Science 58 (1988)
15. Armstrong, P., Lowe, G., Ouaknine, J., Roscoe, A.W.: Model checking Timed CSP.
In: Proceedings of HOWARD (Festschrift for Howard Barringer) (2012)
16. Ouaknine, J.: Discrete Analysis of Continuous Behaviour in Real-Time Concurrent
Systems. DPhil Thesis (2001)
17. Barringer, H., Kuiper, R., Pnueli, A.: A really abstract concurrent model and its
temporal logic. In: Proceedings of the 13th ACM SIGACT-SIGPLAN Symposium
on Principles of Programming Languages. ACM (1986)
18. Roscoe, A.W., Hopcroft, P.J.: Slow abstraction via priority. In: Liu, Z., Woodcock,
J., Zhu, H. (eds.) Theories of Programming and Formal Methods. LNCS, vol. 8051,
pp. 326–345. Springer, Heidelberg (2013)
19. Roscoe, A.W.: Model-Checking CSP. In: A Classical Mind: Essays in Honour of
CAR Hoare (1994)
20. Goldsmith, M., Martin, J.: The parallelisation of FDR. In: Proceedings of the
Workshop on Parallel and Distributed Model Checking (2002)
21. Leiserson, C.E., Schardl, T.B.: A work-efficient parallel breadth-first search algo-
rithm (or how to cope with the nondeterminism of reducers). In: Proc. 22nd ACM
Symposium on Parallelism in Algorithms and Architectures, SPAA 2010 (2010)
22. Korf, R.E., Schultze, P.: Large-scale parallel breadth-first search. In: Proc. 20th
National Conference on Artificial Intelligence, vol. 3. AAAI (2005)
23. Holzmann, G.J.: Parallelizing the Spin Model Checker. In: Donaldson, A., Parker,
D. (eds.) SPIN 2012. LNCS, vol. 7385, pp. 155–171. Springer, Heidelberg (2012)
24. Laarman, A., van de Pol, J., Weber, M.: Boosting multi-core reachability perfor-
mance with shared hash tables. In: Formal Methods in Computer-Aided Design
(2010)
Concurrent Depth-First Search Algorithms

Gavin Lowe

Department of Computer Science, University of Oxford


Wolfson Building, Parks Road, Oxford, OX1 3QD, United Kingdom
[email protected]

Abstract. We present concurrent algorithms, based on depth-first search,


for three problems relevant to model checking: given a state graph, to
find its strongly connected components, which states are in loops, and
which states are in “lassos”. Our algorithms typically exhibit about a
four-fold speed-up over the corresponding sequential algorithms on an
eight-core machine.

1 Introduction

In this paper we present concurrent versions of algorithms based on depth-first


search, all variants of Tarjan’s Algorithm [17]. We consider algorithms for three
closely related problems:

1. To find the strongly connected components (SCCs) of a graph (i.e., the


maximal subsets S of the graph’s nodes such that for any pair of nodes
n, n′ ∈ S, there is a path from n to n′);
2. To find which nodes are part of a cycle in the graph (i.e., such that there is
a non-empty path from the node to itself);
3. To find which nodes are part of a “lasso” (i.e., such that there is a path from
the node to a node on a cycle).

Our main interest in these algorithms is as part of the development of the


FDR3 model checker [6,18] for CSP [16]. In order to carry out checks in the
failures-divergences model, it is necessary to detect which nodes are divergent,
i.e. can perform an unbounded number of internal τ events; this is equivalent to
detecting whether the node is part of a lasso in the transition graph restricted
to τ -transitions (Problem 3).
FDR’s main failures-divergences refinement checking algorithm performs a
concurrent breadth-first search of the product of the state graphs of the sys-
tem and specification processes, testing whether each system state is compatible
with the corresponding specification state. In particular, this involves testing
whether the system state is divergent; hence several divergences tests need to be
performed concurrently starting at different nodes.
Further, FDR can perform various compressions upon the transition graphs of
processes. One of these, tau_loop_factor, works by identifying all nodes within
an SCC in the transition graph restricted to τ -transitions (Problem 1).


Problem 2 has applications in other areas of model checking: the automata-


theoretic approach for LTL model checking [19] involves searching for a cycle
containing an accepting state in the graph formed as the product of the Büchi
property automaton and the system.
We present concurrent algorithms for each of the above three problems. Our
implementations typically exhibit about a four-fold speed-up over the corre-
sponding sequential algorithms on an eight-core machine; the speed-ups are
slightly better on graphs with a higher ratio of transitions to states.
These are challenging problems for the following reasons. In many graphs,
threads will encounter nodes that are currently being considered by other threads;
we need to ensure that the threads do not duplicate work, do not interfere with
one another, but do obtain information from one another: depth-first search
seems to be an area where it is difficult to achieve a high degree of indepen-
dence between threads. Further, many graphs contain a super-component that
contains a large proportion of the graph’s nodes; for Problems 1 and 2, it seems
impossible to avoid having the nodes of this super-component being considered
sequentially.
In [14], Reif showed that computation of depth-first search post-ordering of
vertices is P -complete. This is often used to claim that parallelising algorithms
based on depth-first search is difficult (assuming NC ≠ P): no algorithm can run
in poly-logarithmic time with a polynomial number of processors. Nevertheless,
it is possible to achieve significant speed-ups, at least for a fairly small number
of processors (as is common in current computers), for the types of graphs that
are typical of those encountered in model checking.
In Section 2 we review the sequential version of Tarjan’s Algorithm. In Sec-
tion 3 we present our concurrent algorithm. In Section 4 we describe some aspects
of our prototype implementation, and highlight a few tricky aspects. In Section 5
we report on some experiments, comparing our algorithm to the sequential ver-
sion. We sum up and discuss related work in Section 6.

2 Tarjan’s Algorithm

In this section we review the sequential Tarjan’s Algorithm [17]. We start by


describing the original version, for finding SCCs; we then discuss how to adapt
the algorithm to find loops or lassos.
Tarjan’s Algorithm performs a depth-first search of the graph. The algorithm
uses a stack, denoted tarjanStack, to store those nodes that have been encountered
in the search but not yet placed into an SCC. Each node n is given two variables:
index, which is a sequence counter, corresponding to the order in which nodes
were encountered; and lowlink which records the smallest index of a node n′ in
the stack that is reachable via the descendants of n fully considered so far. The
following function (presented in pseudo-Scala) to update a node’s low-link will
be useful.
def updateLowlink(update: Int ) = { lowlink = min(lowlink, update) }

1   var index = 0
2   // Set node's index and lowlink, and add it to the stacks
3   def addNode(node) = {
4     node.index = index; node.lowlink = index; index += 1
5     controlStack.push(node); tarjanStack.push(node)
6   }
7   addNode(startNode)
8   while (controlStack.nonEmpty){
9     val node = controlStack.top
10    if (node has an unexplored edge to child){
11      if (child previously unseen) addNode(child)
12      else if (child is in tarjanStack) node.updateLowlink(child.index)
13      // otherwise, child is complete, nothing to do
14    }
15    else { // backtrack from node
16      controlStack.pop
17      if (controlStack.nonEmpty) controlStack.top.updateLowlink(node.lowlink)
18      if (node.lowlink == node.index){
19        start new SCC
20        do {
21          w = tarjanStack.pop; add w to SCC; mark w as complete
22        } until (w == node)
23      }
24    }
25  }

Fig. 1. Sequential Tarjan’s Algorithm

Also, each node has a status: either complete (when it has been placed in an
SCC), in-progress (when it has been encountered but not yet been placed in an
SCC), or unseen (when it has not yet been encountered).
Tarjan’s Algorithm is normally described recursively; however, we consider
here an iterative version. We prefer an iterative version for two reasons: (1) as
is well known, iteration is normally more efficient than recursion; (2) when we
move to a concurrent version, we will want to suspend searches; this will be
easier with an iterative version. We use a second stack, denoted controlStack,
that corresponds to the control stack of the recursive version, and keeps track
of the nodes to backtrack to.
We present the sequential Tarjan’s Algorithm for finding SCCs (Problem 1)
in Figure 1. The search starts from the node startNode. When an edge is explored
to a node that is already in the stack, the low-link of the edge’s source is updated
(line 12). Similarly, when the search backtracks, the next node’s low-link is up-
dated (line 17). On backtracking from a node, if its low-link equals its index, all
the nodes above it on the Tarjan stack form an SCC, and so are removed from
that stack and collected (lines 18–23).
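To complement the pseudocode of Figure 1, the following is a minimal, self-contained Scala sketch of the same iterative algorithm over an explicit graph given by adjacency lists. It is only an illustration: the object and function names, the integer node identifiers, the per-node iterator of unexplored edges, and the outer loop that starts a search from every unseen node (unrooted mode) are our own choices, not the paper's implementation.

import scala.collection.mutable

object SequentialTarjanSketch {
  // Iterative sequential Tarjan, following Figure 1, for nodes 0 .. numNodes-1
  // with successor function succs. A new search starts from every unseen node.
  def sccs(numNodes: Int, succs: Int => List[Int]): List[List[Int]] = {
    val UNSEEN = -1
    val index = Array.fill(numNodes)(UNSEEN)
    val lowlink = Array.fill(numNodes)(0)
    val onTarjanStack = Array.fill(numNodes)(false)
    val tarjanStack = mutable.Stack[Int]()                 // nodes not yet placed in an SCC
    val controlStack = mutable.Stack[Int]()                // nodes to backtrack to
    val unexplored = Array.fill[Iterator[Int]](numNodes)(Iterator.empty)
    var nextIndex = 0
    val result = mutable.ListBuffer[List[Int]]()

    def addNode(n: Int): Unit = {                          // lines 3-6 of Figure 1
      index(n) = nextIndex; lowlink(n) = nextIndex; nextIndex += 1
      unexplored(n) = succs(n).iterator
      controlStack.push(n); tarjanStack.push(n); onTarjanStack(n) = true
    }

    for (start <- 0 until numNodes if index(start) == UNSEEN) {
      addNode(start)
      while (controlStack.nonEmpty) {
        val node = controlStack.top
        if (unexplored(node).hasNext) {                    // node has an unexplored edge
          val child = unexplored(node).next()
          if (index(child) == UNSEEN) addNode(child)       // line 11
          else if (onTarjanStack(child))                   // line 12
            lowlink(node) = math.min(lowlink(node), index(child))
          // otherwise child is complete: nothing to do    (line 13)
        } else {                                           // backtrack from node
          controlStack.pop()
          if (controlStack.nonEmpty)                       // line 17
            lowlink(controlStack.top) = math.min(lowlink(controlStack.top), lowlink(node))
          if (lowlink(node) == index(node)) {              // lines 18-23: collect an SCC
            val scc = mutable.ListBuffer[Int]()
            var w = tarjanStack.pop(); onTarjanStack(w) = false; scc += w
            while (w != node) { w = tarjanStack.pop(); onTarjanStack(w) = false; scc += w }
            result += scc.toList
          }
        }
      }
    }
    result.toList
  }

  def main(args: Array[String]): Unit = {
    // two SCCs: the cycle {0, 1, 2} and the trivial SCC {3}
    val succs = Map(0 -> List(1), 1 -> List(2), 2 -> List(0, 3), 3 -> List.empty[Int])
    println(sccs(4, succs))
  }
}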
The following observation will be useful later.

Observation 1. 1. For each node in the tarjanStack, there’s a path in the graph
to each subsequent node in the tarjanStack.
2. For any node n in the tarjanStack, if n is nearer the top of that stack than
controlStack.top, then there is a path from n to controlStack.top (and hence the
two nodes are in the same SCC).
3. If nodes n and l are such that n.lowlink = l.index, then all the nodes between n
and l in the tarjanStack are in the same SCC.
If, instead, we are interested in finding cycles (Problem 2) then: (1) at line 12,
if node == child then we mark the node as in a cycle; and (2) after line 22, if the
SCC has more than one node, we mark all its nodes as in a cycle.
If we are interested in finding lassos (Problem 3) then: (1) at line 12, we
immediately mark node and all the other nodes in the Tarjan stack as being in
a lasso; and (2) if we encounter a complete node (line 13), if it is in a lasso, we
mark all the nodes in the Tarjan stack as being in a lasso.

3 Concurrent Tarjan’s Algorithm


We now describe our concurrent version of Tarjan’s Algorithm. We again start
with an algorithm for finding SCCs, presented in Figure 2; we later consider how
to adapt this for the other problems.
Each search is independent, and has its own control stack and Tarjan stack.
A search is started at an arbitrary node startNode that has not yet been con-
sidered by any other search (we describe this aspect of our implementation in
Section 4.2). Each search proceeds much as in the standard Tarjan’s Algorithm,
as long as it does not encounter a node that is part of another current search.
However, if the search encounters a node child that is not complete but is not
in its own stack (line 13) —so it is necessarily in the stack of another search—
then the search suspends (detailed below). When child is completed, the search
can be resumed (line 24). This design means that each node is in the stacks of
at most one search; each node has a field search identifying that search (set at
line 5).
A difficulty occurs if suspending a search would create a cycle of searches, each
blocked on the next. Clearly we need to take some action to ensure progress. We
transfer the relevant nodes of those searches into a single search, and continue,
thereby removing the blocking-cycle. We explain our procedure in more detail
with an example, depicted in Figure 3; it should be clear how to generalise
this example. The bottom-left of the figure depicts the graph G being searched;
the top-left depicts the tarjanStacks of the searches (oriented downwards, so the
“tops” of the stacks are towards the bottom of the page).
Suppose search s1 is blocked at n1 waiting for node c2 of search s2 to complete,
because s1 encountered an edge from n1 to c2 (corresponding to node and child,
respectively, in Figure 2). Similarly, suppose search s2 is blocked at n2 waiting
for node c3 of search s3 to complete; and search s3 is blocked at n3 waiting
for node c1 of search s1 to complete. This creates a blocking cycle of suspended
searches (see Figure 3, top-left). Note that the nodes between c1 and n1 , between

1   var index = 0
2   // Set node's index, lowlink and search, and add it to the stacks
3   def addNode(node) = {
4     node.index = index; node.lowlink = index; index += 1
5     node.search = thisSearch; controlStack.push(node); tarjanStack.push(node)
6   }
7   addNode(startNode)
8   while (controlStack.nonEmpty){
9     val node = controlStack.top
10    if (node has an unexplored edge to child){
11      if (child previously unseen) addNode(child)
12      else if (child is in tarjanStack) node.updateLowlink(child.index)
13      else if (child is not complete) // child is in-progress in a different search
14        suspend waiting for child to complete
15      // otherwise, child is complete, nothing to do
16    }
17    else { // backtrack from node
18      controlStack.pop
19      if (controlStack.nonEmpty) controlStack.top.updateLowlink(node.lowlink)
20      if (node.lowlink == node.index){
21        start new SCC
22        do {
23          w = tarjanStack.pop; add w to SCC
24          mark w as complete and unblock any searches suspended on it
25        } until (w == node)
26      }
27    }
28  }

Fig. 2. Concurrent Tarjan’s Algorithm (changes from Fig. 1 underlined)

c2 and n2 , and between c3 and n3 are all in the same SCC, by Observation 1(1);
we denote this SCC by “C”.
Let t1 be the top of the Tarjan stack of s1 : t1 might equal n1 ; or s1 might
have backtracked from t1 to n1 . Note that all the nodes between n1 and t1 are
in the same SCC as n1 , by Observation 1(2), and hence in the SCC C. Similarly,
let t2 and t3 be the tops of the other Tarjan stacks; all the nodes between n2
and t2 , and between n3 and t3 are likewise in C.
Let l1 be the earliest node of s1 known (according to the low-links of s1 ) to
be in the same SCC as c1 : l1 is the earliest node reachable by following low-
links from the nodes between c1 to t1 (inclusive), and then (perhaps) following
subsequent low-links; equivalently, l1 is the last node in s1 that is no later than c1
and such that all low-links of nodes between l1 and t1 are at least l1 (a simple
traversal of the Tarjan stack can identify l1 ). Hence all the nodes from l1 to t1
are in the SCC C (by Observation 1(3)). Let l2 and l3 be similar.
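As a concrete illustration of this traversal, the following Scala sketch (with a simplified Node record of our own, not the prototype's data structure) scans one Tarjan stack from the top down, maintaining the minimum low-link seen so far, and returns the last node at or below c1 whose index is at most that minimum.

object FindEarliestSketch {
  final case class Node(id: String, index: Int, lowlink: Int)

  // `stack` holds one search's Tarjan stack from bottom (0) to top (t1);
  // `posC` is the position of c1 in it.
  def findL(stack: IndexedSeq[Node], posC: Int): Node = {
    // suffixMin(p) = smallest lowlink among stack(p), ..., stack(top)
    val suffixMin = stack.scanRight(Int.MaxValue)((n, m) => math.min(n.lowlink, m)).init
    // last position at or below c1 whose index is at most every lowlink above it
    val pos = (posC to 0 by -1).find(p => suffixMin(p) >= stack(p).index).getOrElse(0)
    stack(pos)
  }
}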
Consider the graph G′ formed by transforming the original graph by adding
edges from n1 to l2, and from n3 to l1, as illustrated in Figure 3 (middle top

[Figure 3: a diagram showing, for searches s1, s2 and s3, the Tarjan stacks and the graphs
G and G′, and the stacks after the transfer; key: blocking edge, Tarjan stack, lowest
low-link, added edge.]

Fig. 3. Illustration of the blocking cycle reduction



and middle bottom). It is clear that the transformed graph has precisely the
same SCCs as the original, since all the nodes below l1 , l2 and l3 in the figure
are in the same SCC C. Consider the following scenario for the transformed
graph: the search s3 explores via nodes l3 , c3 , n3 (backtracking from t3 ), l1 ,
c1 , n1 (backtracking from t1 ), l2 , c2 , n2 (backtracking from t2 ), and then back
to c3 ; meanwhile, the searches s1 and s2 reach l1 and l2 , respectively, and are
suspended.
We transform the stacks to be compatible with this scenario, as illustrated in
Figure 3 (right), by transferring the nodes from l1 to n1 , and from l2 to n2 onto
the stack of search s3 . Note, in particular, that the indexes and lowlinks have to
be updated appropriately (letting δ1 = s3 .index − l1 .index, we add δ1 onto the
index and lowlink of each node transferred from s1 and update s3 .index to be one
larger than the greatest of the new indexes; we then repeat with s2 ).
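The re-indexing itself is small; here is a hedged Scala sketch of transferring one blocked search's stack segment (the nodes from l up to n, in stack order) into the continuing search. The Search and Node records are illustrative simplifications, and pushing the nodes onto the receiving search's stacks is omitted.

object TransferSketch {
  final class Search(var index: Int)                      // next index the search will allocate
  final class Node(var index: Int, var lowlink: Int, var search: Search)

  // Shift the indexes and lowlinks of the transferred nodes by
  // delta = to.index - l.index, reassign their search field, and bump to.index to
  // one more than the greatest new index. The caller repeats this per blocked search.
  def transfer(segment: Seq[Node], to: Search): Unit =
    if (segment.nonEmpty) {
      val delta = to.index - segment.head.index
      segment.foreach { node =>
        node.index += delta; node.lowlink += delta; node.search = to
      }
      to.index = segment.map(_.index).max + 1
    }
}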
We then resume search s3 . We start by considering the edge from n2 to c3 ,
and so update the lowlink of n2 . Searches s1 and s2 remain suspended until l1
and l2 are completed.
We now consider the other two problems. If we are interested in finding cycles
(Problem 2) then we adapt the algorithm as for the sequential algorithm: (1) at
line 12, if node == child then we mark the node as in a cycle; and (2) after line 25,
if the SCC has more than one node, we mark all its nodes as in a cycle.
If we are interested in finding lassos (Problem 3) then we again adapt the
algorithm as for the sequential algorithm: (1) at line 12, we immediately mark
node and all the other nodes in the Tarjan stack as being in a lasso; and (2) if
we encounter a complete node (line 15), if it is in a lasso, we mark all the
nodes in the Tarjan stack as being in a lasso. Further, if a search encounters an
in-progress node (line 13), if that node is in a lasso, then there is no need to
suspend the search: instead all the nodes in the Tarjan stack can also be marked
as in a lasso. Similarly, when a node is marked as being in a lasso, any search
blocked on it can be unblocked; when such a search is unblocked, all the nodes
in its Tarjan stack can also be marked as in a lasso. Finally, the procedure for
reducing blocking cycles can be greatly simplified, using the observation that all
the nodes in the Tarjan stacks are in a lasso: the search that discovered the cycle
(s3 in the example) marks all its nodes as in a lasso, and so unblocks the search
blocked on it (s2 in the example); that search similarly marks its nodes as in a
lasso, and so on.

4 Implementation

In this section we give some details of our prototype implementation of the


algorithm, and highlight a few areas where care is required. Our implementation1
uses Scala [12].

1
Available from http://www.cs.ox.ac.uk/people/gavin.lowe/parallelDFS.html.

4.1 Suspending and Resuming Searches

Each node n includes a field blocked : List[Search], storing the searches that have
encountered this node and are blocked on it. When the node is completed, those
searches can be resumed (line 24 of Figure 2). Note that testing whether n is
complete (line 13 of Figure 2) and updating blocked has to be done atomically.
In addition, each suspended search has a field waitingFor, storing the node it is
waiting on.
We record which searches are blocked on which others in a map suspended
from Search to Search, encapsulated in a Suspended object. The Suspended object
has an operation suspend(s: Search, n: Node) to record that s is blocked on n.
When s suspends blocked by a node of s′, we detect if this would create a
blocking cycle by transitively following the suspended map to see if it includes a
blocking path from s′ to s. If so, nodes are transferred to s, and s is resumed
as outlined in the previous section. This is delicate. Below, let sb be one of the
searches from which nodes are transferred and that remains blocked.

1. Each node’s search, index and lowlink are updated, as described in the previous
section.
2. Each sb with remaining nodes has its waitingFor field updated to the appro-
priate node of s (the li nodes of Figure 3); and those nodes have their blocked
fields updated.
3. The suspended map is updated: each sb that has had all its nodes transferred
is removed; each other sb is now blocked by s; and any other search s′ that
was blocked on one of the nodes transferred to s is now also blocked on s.
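A hedged Scala sketch of the blocking-cycle test follows; the Search type, the map representation and the method names are placeholders, sOwner stands for the search owning the blocking node (the paper's suspend takes the node itself), and the waitingFor and blocked fields, fine-grained locking and the node transfer are all elided.

import scala.collection.mutable

object SuspendedSketch {
  type Search = String                                    // illustrative stand-in
  private val suspended = mutable.Map[Search, Search]()   // blocked search -> search it waits for

  // True iff recording "s is blocked on sOwner" would close a blocking cycle,
  // i.e. the suspended map already contains a blocking path from sOwner back to s.
  def wouldCreateCycle(s: Search, sOwner: Search): Boolean = {
    var current = sOwner
    while (current != s && suspended.contains(current)) current = suspended(current)
    current == s
  }

  def suspend(s: Search, sOwner: Search): Unit = synchronized {
    if (wouldCreateCycle(s, sOwner)) {
      // blocking cycle: transfer the relevant nodes to s and let s continue (not shown)
    } else suspended(s) = sOwner
  }

  def unblock(s: Search): Unit = synchronized { suspended.remove(s) }
}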

The Suspended object acts as a potential bottleneck. Perhaps surprisingly, it is


possible to allow several calls to suspend to proceed semi-concurrently. Considered
as a graph, the suspended map forms a forest of reverse arborescences, i.e. a forest
of trees, with all edges in a tree oriented towards a single sink search; further,
only the sink searches are active. Hence, concurrent reductions of blocking cycles
act on distinct reverse arborescences and so distinct searches.
We may not allow two concurrent attempts to detect a blocking cycle (consider
the case where each of two searches is blocked on the other: the cycle will not
be detected). Further, if no blocking cycle is found, the suspended map needs to
be updated before another attempt to find a blocking cycle; and the suspended
map must not be updated between reading the search field of the blocking node n
and completing the search for a blocking cycle (to prevent n being transferred
to a different search in the meantime)2 . Finally, the suspended map itself must
be thread-safe (we simply embed updates in synchronized blocks).
Other than as described in the previous paragraph, calls to suspend may act
concurrently. In particular, suppose a call suspend(s,n) detects a blocking cycle.
It updates search fields (item 1, above) before the suspended map (item 3). Sup-
pose, further, a second call, suspend(s’,n’), acts on the same reverse arborescence,
2
We believe that the amount of locking here can be reduced; however, this locking
does not seem to be a major bottleneck in practice.

and consider the case that n’ is one of the nodes transferred to s. We argue that
the resulting race is benign. The second call will not create a blocking cycle
(since only the sink search of the reverse arborescence, s, can create a block-
ing cycle); this will be correctly detected, even in the half-updated state. Fur-
ther, suspended(s’) gets set correctly: if suspend(s’,n’) sets suspended(s’) to n’.search
before suspend(s,n) updates n’.search, then the latter will subsequently update
suspended(s’) to s (in item 3); if suspend(s,n) sets n’.search to s before suspend(s’,n’)
reads it, then both will set suspended(s’) to s.

4.2 Scheduling
Our implementation uses a number of worker threads (typically one per processor
core), which execute searches. We use a Scheduler object to provide searches for
workers, thereby implementing a form of task-based parallelism.
The Scheduler keeps track of searches that have been unblocked as a result of
the blocking node becoming complete (line 24 of Figure 2). A dormant worker
can resume one of these. (Note that when a search is unblocked, the update
to the Scheduler is done after the updates to the search itself, so that it is not
resumed in an inconsistent state.)
The algorithm can proceed in one of two different modes: rooted, where the
search starts at a particular node, but the state space is not known in advance;
and unrooted, where the state space is known in advance, and new searches can
start at arbitrary nodes. In an unrooted search, the Scheduler keeps track of all
nodes from which no search has been started. A dormant worker can start a new
search at one of these (assuming it has not been reached by another search in
the meantime). Similarly, in a rooted search the Scheduler keeps track of nodes
encountered in the search but not yet expanded: when a search encounters a
new node n, it passes n’s previously unseen successors, except the one it will
consider next, to the Scheduler. Again, a dormant worker can start a new search
from such a node.
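A minimal sketch of the Scheduler's bookkeeping follows (coarse synchronisation only; the Search and Node types and method names are placeholders, not the prototype's API).

import scala.collection.mutable

object SchedulerSketch {
  type Search = String
  type Node = Int

  private val unblocked = mutable.Queue[Search]()   // searches whose blocking node completed
  private val unstarted = mutable.Queue[Node]()     // nodes not yet used to start a search
                                                    // (unexpanded successors, in rooted mode)

  def addUnblocked(s: Search): Unit = synchronized { unblocked.enqueue(s) }
  def addUnstarted(n: Node): Unit   = synchronized { unstarted.enqueue(n) }

  // A dormant worker prefers resuming an unblocked search; otherwise it starts a new
  // search from an unstarted node (the caller must still check that the node has not
  // been reached by another search in the meantime). None means nothing to do.
  def getTask(): Option[Either[Search, Node]] = synchronized {
    if (unblocked.nonEmpty) Some(Left(unblocked.dequeue()))
    else if (unstarted.nonEmpty) Some(Right(unstarted.dequeue()))
    else None
  }
}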

4.3 Enhancements
We now describe a few details of our implementation that have an effect upon
efficiency.
We use a map from node identifiers (Ints) to Node objects that store infor-
mation about nodes. We have experimented with many representations of this
map. Our normal implementation is based on the hash table described by Laar-
man et al. in [7]. However, this implementation uses a fixed-size table, rather
than resizing the table, thus going against the design of FDR (we have extended
the hash table to allow resizing, but this makes the implementation somewhat
slower). On some problems (including our experiments on random graphs in the
next section), the implementation works better with a sharded hash table3 with
3
A sharded hash table can be thought of as a collection of M individual hash tables,
each with its own lock; an entry with hash value h is stored in the table with
index h mod M .

open addressing. Even with these implementations, the algorithms spend about
40% of their time within this map. (Other implementations are worse; using a
Java ConcurrentHashMap increases the running time by a factor of two!)
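To illustrate the sharding idea from footnote 3, here is a hedged Scala sketch of a sharded map keyed by node identifier. It shows only the pattern (M independent maps, each with its own lock); it is not the paper's implementation, which uses the open-addressing hash table of [7].

import scala.collection.mutable

final class ShardedMap[V](numShards: Int = 16) {
  private val shards = Vector.fill(numShards)(mutable.Map[Int, V]())

  // an entry with hash value h is stored in the shard with index h mod numShards
  private def shardFor(key: Int) = shards(((key % numShards) + numShards) % numShards)

  def getOrElseUpdate(key: Int, value: => V): V = {
    val shard = shardFor(key)
    shard.synchronized { shard.getOrElseUpdate(key, value) }
  }

  def get(key: Int): Option[V] = {
    val shard = shardFor(key)
    shard.synchronized { shard.get(key) }
  }
}

object ShardedMapDemo {
  def main(args: Array[String]): Unit = {
    val m = new ShardedMap[String]()
    println(m.getOrElseUpdate(42, "node-42"))   // "node-42"
    println(m.get(42))                          // Some(node-42)
  }
}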
It is clearly advantageous to avoid suspending searches, if possible. Therefore,
the implementation tries to choose (at line 10 of Figure 2) a child node that is
not in-progress in a different search, if one exists.
Some nodes have no successors. It is advantageous, when starting a search
from such a node, to avoid creating a Search object with its associated stacks,
but instead to just mark the node as complete and to create a singleton SCC
containing it.

5 Experiments
In this section we report the results of timing experiments. The experiments were
carried out on an eight-core machine (an Intel® Xeon® E5620) with 12GB of
RAM. Each of the results is averaged over ten runs, after a warm-up round.
We have performed timing experiments on a suite of CSP files. We have
extracted the graphs of τ -transitions for all implementation processes in the
FDR3 test suite (including most of the CSP models from [15,16,1]) and the
CSP models from [10]. The top of Figure 4 gives statistics about a selection
of the graphs with between 200,000 and 5,000,000 states (we omit eleven such,
in the interests of space), plus a slightly smaller file tring2.1 which we discuss
below4 . For each graph we give the number of states (i.e. nodes), the number
of transitions (i.e. edges), the number of SCCs, the size of the largest SCC, the
number of trivial SCCs (with a single state), the number of states on a loop,
and the number of states on a lasso.
The bottom of Figure 4 gives corresponding timing results. For each of the
three problems, we give times (in ms) for each of the concurrent and sequential
algorithms, and the ratio between them (which represents the speed-up factor).
The penultimate row gives totals for these running times, and their ratios. The
final row gives data for tring2.1. Even on a single-threaded program, the JVM
uses a fair amount of concurrency. The sequential algorithm typically uses about
160% of a single core (as measured by top). Hence the maximum speed-up one
should expect is a factor of about five.
We have performed these experiments in unrooted mode, because it more-
closely simulates our main intended use within FDR, namely for detecting di-
vergences (i.e. τ -lassos) during failures-divergences checks. Such a check performs
a breadth-first search of the product of the system and specification processes;
for each pair of states encountered, if the specification state does not allow a
divergence, then FDR checks that the system state does not have a divergence.
The overall effect is normally that a lasso search is started at every reachable
system state.
The concurrent algorithms normally give significant speed-ups. Further, the
speed-up tends to be larger for larger graphs, particularly for graphs with more
4
The file matmult.6 contains no τ -transitions, only visible transitions.

Graph        States  Transitions  SCCs     Largest  Trivial  Loop    Lasso
                                           SCC      SCCs     states  states
cloudp.0 691692 1020880 691692 1 691692 0 0
cloudp.2 480984 643790 480984 1 480984 0 0
soldiers.0 714480 688110 714480 1 714480 0 0
comppuz.0 1235030 1558042 1235030 1 1235030 0 0
solitaire.0 494372 2271250 494372 1 494372 0 0
solitaire.1 4001297 5387623 4001297 1 4001297 0 0
matmul.6 2252800 0 2252800 1 2252800 0 0
virtroute.2 390625 1937500 390625 1 390625 0 0
tabp2.0 430254 310312 430254 1 430254 0 0
tabp2.1 427192 308978 427192 1 427192 0 0
tabp2.2 437908 316254 437908 1 437908 0 0
tringm.1 921403 925998 921403 1 921403 0 0
alt12.2.0 344221 1034608 344221 1 344221 0 0
alt12.2.1 344221 1114628 251927 2400 241821 102400 277229
alt12.3.0 575627 1283160 575627 1 575627 0 0
alt12.3.1 575627 1507604 447053 6560 439291 136336 440255
alt11.2.0 589149 1757856 589149 1 589149 0 0
alt11.2.1 589149 1883340 389713 1442 368629 220520 512131
alt11.3.0 990167 2227720 990167 1 990167 0 0
alt11.3.1 990167 2576732 652431 3168 628759 361408 886425
tring2.1 175363 355287 45822 129542 45821 131594 175363

                   SCCs                 Loops                Lassos
Graph          Conc   Seq  Ratio    Conc   Seq  Ratio    Conc   Seq  Ratio
cloudp.0 174 697 3.99 158 562 3.54 159 348 2.18
cloudp.2 127 414 3.24 120 317 2.63 118 167 1.42
soldiers.0 193 708 3.66 185 579 3.12 186 331 1.78
comppuz.0 334 1626 4.86 304 1399 4.59 305 990 3.24
solitaire.0 191 668 3.50 178 573 3.22 175 423 2.41
solitaire.1 1531 5058 3.30 1034 4396 4.25 1035 3220 3.11
matmul.6 543 2533 4.67 543 2161 3.97 543 1365 2.51
virtroute.2 138 457 3.31 129 381 2.95 129 261 2.03
tabp2.0 152 370 2.43 142 279 1.96 143 137 0.96
tabp2.1 153 370 2.41 141 278 1.97 144 137 0.95
tabp2.2 157 379 2.40 144 286 1.98 145 140 0.97
tringm.1 281 869 3.09 258 699 2.71 258 390 1.51
alt12.2.0 118 404 3.42 111 338 3.03 108 232 2.14
alt12.2.1 118 383 3.23 111 316 2.84 125 263 2.10
alt12.3.0 172 766 4.43 161 652 4.03 159 469 2.94
alt12.3.1 186 744 4.00 174 630 3.61 189 520 2.75
alt11.2.0 178 805 4.51 164 692 4.20 163 503 3.07
alt11.2.1 188 760 4.04 174 642 3.67 198 558 2.81
alt11.3.0 276 1226 4.44 256 1042 4.07 254 741 2.92
alt11.3.1 287 1178 4.10 274 983 3.58 315 845 2.68
Total 8505 34092 4.01 7554 28964 3.83 7816 21258 2.72
tring2.1 461 207 0.45 420 125 0.30 76 118 1.55

Fig. 4. Results for tests on CSP files: statistics about the graphs, and timing results

transitions. However, beyond a few million states, the speed-ups drop off again,
I believe because of issues of memory contention.
The results for tring2.1 deserve comment. This graph has a large SCC, ac-
counting for over 70% of the states. The first two concurrent algorithms consider
the nodes of this SCC sequentially and so (because the concurrent algorithms
are inevitably more complex) are slightly slower than the sequential algorithms.
However, the algorithm for lassos gives more scope for considering the nodes of
this SCC concurrently, and therefore gives a speed-up.
The above point is also illustrated in Figure 5. This figure considers a number
of random graphs, each with N = 200,000 states. For each pair of nodes n and n′,
an edge is included from n to n′ with probability p; this gives an expected number

p           SCCs     Largest  Loop    Lasso      p           SCCs     Largest  Loop    Lasso
                     SCC      states  states                          SCC      states  states
0.0000005 200000 1 0 0 0.0000075 131836 68165 68167 116882
0.0000010 200000 1 0 1 0.0000080 117613 82387 82390 128340
0.0000015 200000 1 0 0 0.0000085 104296 95704 95706 138443
0.0000020 200000 1 1 1 0.0000090 92685 107316 107318 146591
0.0000025 200000 1 1 2 0.0000095 82651 117350 117351 153129
0.0000030 200000 1 1 3 0.0000100 73020 126981 126982 159418
0.0000035 199999 2 2 7 0.0000105 64830 135171 135171 164467
0.0000040 199998 3 4 20 0.0000110 57715 142286 142287 168734
0.0000045 199994 6 8 120 0.0000115 51080 148921 148922 172610
0.0000050 199908 72 96 2573 0.0000120 45565 154436 154437 175740
0.0000055 194020 5973 5984 34743 0.0000125 40710 159291 159292 178494
0.0000060 179848 20149 20155 63368 0.0000130 36322 163679 163679 180849
0.0000065 164038 35959 35966 84903 0.0000135 32422 167579 167580 183109
0.0000070 147928 52071 52075 101907 0.0000140 28929 171072 171072 184999

Fig. 5. Experiments on random graphs



Fig. 6. Speed ups on CSP files as a function of the number of worker threads

of edges equal to N²p. (Note that such graphs do not share many characteristics
with the graphs one typically model checks!) The graph plots the speed-up for
the three algorithms for various values of p; the tables give statistical information
about the graphs considered (giving averages, rounded to the nearest integer in
each case). For p greater than about 0.000005, the graph has a large SCC, and
the algorithms for SCCs and loops become less efficient. However, the algorithm
for finding lassos becomes progressively comparatively more efficient as p, and
hence the number of edges, increases; indeed, for higher values of p, the speed-up
plateaus at about 5.
It is worth noting that graphs corresponding to the τ -transitions of CSP
processes rarely have very large SCCs. The graph tring2.1 corresponds to a CSP
process designed for checking in the traces model, as opposed to the failures-
divergences model, so the problems considered in this paper are not directly
relevant to it.
Figure 6 considers how the speed up varies as a function of the number of
worker threads. It suggests that the algorithm scales well.

6 Conclusions
In this paper we have presented three concurrent algorithms for related problems:
finding SCCs, loops and lassos in a graph. The algorithms give appreciable speed-
ups, typically by a factor of about four on an eight-core machine.
It is not surprising that we fall short of a speed-up equal to the number of
cores. As noted above, the JVM uses a fair amount of concurrency even on
single-threaded programs. Also, the concurrent algorithms are inevitably more
complex than the sequential ones. Further, I believe that they are slowed down
by contention for the memory bus, because the algorithms frequently need to
read data from RAM.

I believe there is some scope for reducing the memory contention, in particular
by reducing the size of Node objects: many of the attributes of Nodes are neces-
sary only for in-progress nodes, so could be stored in the relevant Search object.
Further, I intend to investigate whether it’s possible to reduce the amount of
locking of objects done by the prototype implementation.
We intend to incorporate the lasso and SCC algorithms into the FDR3 model
checker. In particular, it will be interesting to see whether the low-level nature
of C++ (in which FDR3 is implemented) permits optimisations that give better
memory behaviour.
As noted earlier, a large proportion of the algorithms’ time is spent within the
map storing information about nodes. I would like to experiment with different
implementations.

Related Work. We briefly discuss here some other concurrent algorithms address-
ing one or more of our three problems. We leave an experimental comparison
with these algorithms for future work.
Gazit and Miller [5] describe an algorithm based upon the following idea.
The basic step is to choose an arbitrary pivot node, and calculate its SCC as the
intersection of its descendents and ancestors; these descendents and ancestors can
be calculated using standard concurrent algorithms. This basic step is repeated
with a new pivot whose SCC has not been identified, until all SCCs are identified.
A number of improvements to this algorithm have been proposed [13,11,2].
Several papers have proposed algorithms for finding loops, in the particular
context of LTL model checking [8,4,9,3]. These algorithms are based on the
SWARM technique: multiple worker threads perform semi-independent searches
of the graph, performing a nested depth-first search to detect a loop containing
an accepting state; the workers share only information on whether a node has
been fully explored, and whether it has been considered within an inner depth-
first search.

Acknowledgements. I would like to thank Tom Gibson-Robinson for many


interesting and useful discussions that contributed to this paper. I would also
like to thank the anonymous referees for their useful comments.

References
1. Armstrong, P., Lowe, G., Ouaknine, J., Roscoe, A.W.: Model checking timed CSP.
In: Proceedings of HOWARD-60 (2012)
2. Barnat, J., Chaloupka, J., van de Pol, J.: Distributed algorithms for SCC decom-
position. Journal of Logic and Computation 21(1), 23–44 (2011)
3. Evangelista, S., Laarman, A., Petrucci, L., van de Pol, J.: Improved multi-core
nested depth-first search. In: Chakraborty, S., Mukund, M. (eds.) ATVA 2012.
LNCS, vol. 7561, pp. 269–283. Springer, Heidelberg (2012)
4. Evangelista, S., Petrucci, L., Youcef, S.: Parallel nested depth-first searches for LTL
model checking. In: Bultan, T., Hsiung, P.-A. (eds.) ATVA 2011. LNCS, vol. 6996,
pp. 381–396. Springer, Heidelberg (2011)

5. Fleischer, L.K., Hendrickson, B.A., Pinar, A.: On identifying strongly connected


components in parallel. In: Rolim, J.D.P. (ed.) IPDPS-WS 2000. LNCS, vol. 1800,
pp. 505–511. Springer, Heidelberg (2000)
6. Gibson-Robinson, T., Armstrong, P., Boulgakov, A., Roscoe, A.W.: FDR3 — A
modern refinement checker for CSP. In: Ábrahám, E., Havelund, K. (eds.) TACAS
2014. LNCS, vol. 8413, pp. 180–194. Springer, Heidelberg (2014)
7. Laarman, A., van de Pol, J., Weber, M.: Boosting multi-core reachability perfor-
mance with shared hash tables. In: Proceedings of 10th International Conference
on Formal Methods in Computer-Aided Design, FMCAD 2010 (2010)
8. Laarman, A., Langerak, R., van de Pol, J., Weber, M., Wijs, A.: Multi-core
nested depth-first search. In: Bultan, T., Hsiung, P.-A. (eds.) ATVA 2011. LNCS,
vol. 6996, pp. 321–335. Springer, Heidelberg (2011)
9. Laarman, A.W., van de Pol, J.C.: Variations on multi-core nested depth-first
search. In: Proceedings of the 10th International Workshop on Parallel and Dis-
tributed Methods in Verification. Electronic Proceedings in Theoretical Computer
Science, vol. 72, pp. 13–28 (2011)
10. Lowe, G.: Implementing generalised alt: A case study in validated design using
CSP. In: Communicating Process Architectures (2011)
11. McLendon III, W., Hendrickson, B., Plimpton, S.J., Rauchwerger, L.: Finding
strongly connected components in distributed graphs. Journal of Parallel and Dis-
tributed Computing 65(8), 901–910 (2005)
12. Odersky, M., Spoon, L., Venners, B.: Programming in Scala. Artima (2008)
13. Orzan, S.: On Distributed Verification and Verified Distribution. PhD thesis, Free
University of Amsterdam (2004)
14. Reif, J.H.: Depth-first search is inherently sequential. Information Processing Let-
ters 20(5), 229–234 (1985)
15. Roscoe, A.W.: Theory and Practice of Concurrency. Prentice Hall (1998)
16. Roscoe, A.W.: Understanding Concurrent Systems. Springer (2010)
17. Tarjan, R.: Depth-first search and linear graph algorithms. SIAM Journal of Com-
puting 1(2), 146–160 (1972)
18. University of Oxford. Failures-Divergence Refinement—FDR 3 User Manual
(2013), http://www.cs.ox.ac.uk/projects/fdr/manual/index.html
19. Vardi, M.Y., Wolper, P.: An automata-theoretic approach to automatic program
verification. In: Proceedings of Logic in Computer Science (1986)
Basic Problems in Multi-View Modeling

Jan Reineke¹ and Stavros Tripakis²

¹ Saarland University, Germany
² UC Berkeley, USA and Aalto University, Finland

Abstract. Modeling all aspects of a complex system within a single model is a


difficult, if not impossible, task. Multi-view modeling is a methodology where
different aspects of the system are captured by different models, or views. A key
question then is consistency: if different views of a system have some degree
of overlap, how can we guarantee that they are consistent, i.e., that they do not
contradict each other? In this paper we formulate this and other basic problems
in multi-view modeling within an abstract formal framework. We then instantiate
this framework in a discrete, finite-state system setting, and study how some key
verification and synthesis problems can be solved in that setting.

1 Introduction
Real systems are usually complex objects, and grasping all the details of a system at the
same time is often difficult. In addition, each of the various stakeholders in the system
are concerned with different system aspects. For these reasons, modeling and design
teams usually deal only with partial and incomplete views of a system, which are easier
to manage separately. For example, when designing a digital circuit, architects may
be concerned with general (boolean) functionality issues, while ignoring performance.
Other stakeholders, however, may be concerned about timing aspects such as the delay
of the critical path, which ultimately affects the clock rate at which the circuit can
be run. Yet other stakeholders may be interested in a different aspect, namely, energy
consumption of the circuit which affects battery life.
Modeling and simulation are often used to support system design. In this paper,
when we talk about views, we refer concretely to the different models of a system
that designers build. Such models may be useful as models of an existing system: the
system exists, and a model is built in order to study the system. Then, the model is only
a partial or incomplete view of the system, since it focuses on certain aspects and omits
others. For example, an energy consumption model for an airplane ignores control,
air dynamics, and other aspects. Models may also be used for a system-to-be-built: an
energy consumption model as in the example above could be developed as part of the
design process, even before the airplane is built.

This research is partially supported by the National Science Foundation and the Academy of
Finland, via projects ExCAPE: Expeditions in Computer Augmented Program Engineering
and COSMOI: Compositional System Modeling with Interfaces, by the Deutsche Forschungs-
gemeinschaft as part of the Transregional Collaborative Research Centre SFB/TR 14 AVACS,
and by the centers TerraSwarm and iCyPhy (Industrial Cyber-Physical Systems) at UC
Berkeley.


For large systems, each aspect of the system is typically designed by a dedicated
design team. These teams often use different modeling languages and tools to capture
different views, which is generally referred to as multi-view modeling (MVM). MVM
presents a number of challenges, such as the crucial issue of consistency: if different
views of the system are captured by different models, and these models have some
degree of overlap, how can we guarantee that the models are consistent, i.e., that they
do not contradict each other? Understanding the precise meaning of such questions, and
developing techniques to answer them, ideally fully automatically, is the main goal of
this paper.
Toward this goal, we begin in Section 2 by introducing an example of simple 3-
dimensional structure modeling. Even though our focus is on dynamic behaviors, we
will use this static system as an illustrative running example to demonstrate the salient
concepts of our formal MVM framework. The latter is itself presented in Section 3. The
main concepts are as follows: (1) views can be derived from systems using abstraction
functions, which map system behaviors to view behaviors; (2) conformance formalizes
how “faithful” a view is to a system; (3) consistency of a set of views is defined as
existence of a witness system to which all views conform; (4) view reduction allows to
“optimize” views by using the information contained in other views; (5) orthogonality
captures independence between views.
The framework proposed in Section 3 is abstract, in the sense that it does not refer to
specific notions of behaviors, neither to concrete representations of systems and views.
In the rest of the paper we instantiate this abstract framework for the case of discrete
systems. The latter, defined in Section 4, are finite-state symbolic transition systems
consisting of a set of state variables, a predicate over the state variables characterizing
the set of initial states, and a predicate characterizing the transition relation.
In Section 5 we study projections as abstraction functions for discrete systems. Fully-
observable systems, where all variables are observable, are not closed under projection,
therefore we also consider systems with internal (unobservable) variables. We show
how to effectively solve a number of verification and synthesis problems on discrete
systems and views, including view conformance and consistency checking.

2 Running Example: 3D Objects


To illustrate the concept of views we introduce a running example. Consider the 3D
structure shown at the left of Figure 1. It can be modeled as a set of points in a 4 × 4 × 4
space, each point (x, y, z) representing a “box” appearing at coordinate (x, y, z), for
x, y, z ∈ {1, 2, 3, 4}. The object shown to the left of the figure contains 16 such boxes,
and the corresponding set contains 16 points.
Three views of the object are shown to the right of the figure: a top view, a front
view, and a side view. These views can be formalized as 2D projections. Let S be the
set of points representing the 3D object. Then the three views can be formalized as sets
Vtop, Vfront, Vside, where: Vtop = {(x, y) | ∃z : (x, y, z) ∈ S}, Vfront = {(x, z) | ∃y :
(x, y, z) ∈ S}, Vside = {(y, z) | ∃x : (x, y, z) ∈ S}.
The above projections can be seen as abstractions of S. In fact, they are generally
strict abstractions in the sense that some information about S is lost during the abstrac-
tion. In the case of Figure 1, e.g., the same views would be obtained if one were to add

Fig. 1. A 3D structure (left) and 3 views of it (right) – image produced using this tool:
http://www.fi.uu.nl/toepassingen/02015/toepassing_wisweb.en.html

to the object the missing boxes so that no box under the “staircase structure” hangs in
the air.
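As a small illustration of our own (the object and type names are not from the paper), the three projections can be computed directly from the point-set representation of an object:

object ViewsOf3DObjects {
  type Point3 = (Int, Int, Int)

  // Each view keeps two of the three coordinates of every occupied box.
  def top(s: Set[Point3]): Set[(Int, Int)]   = s.map { case (x, y, _) => (x, y) }
  def front(s: Set[Point3]): Set[(Int, Int)] = s.map { case (x, _, z) => (x, z) }
  def side(s: Set[Point3]): Set[(Int, Int)]  = s.map { case (_, y, z) => (y, z) }

  def main(args: Array[String]): Unit = {
    val s: Set[Point3] = Set((1, 1, 1), (2, 2, 2))   // a two-box object used again below
    println(top(s))    // Set((1,1), (2,2))
    println(front(s))  // Set((1,1), (2,2))
    println(side(s))   // Set((1,1), (2,2))
  }
}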

3 Views: A Formalization
Systems: We define a system semantically, as a set of behaviors. As in [15], there is
no restriction on the type of behaviors: they could be discrete traces, continuous tra-
jectories, hybrid traces, or something else. We only assume given a domain of possible
behaviors, U. Then, a system S over domain of behaviors U is a subset of U: S ⊆ U.
View Domains: A view is intuitively an “incomplete picture” of a system. It can be
incomplete in different ways:

– Some behaviors may be missing from the view, i.e., the view may contain only a
subset of system behaviors. (As we shall see when we discuss conformance, the
view may also be a superset.)
– Some parts of a behavior itself may be missing in the view. E.g., if the behavior
refers to a state vector with, say, 10 state variables, the view could refer only to 2
state variables. In this case the view can be seen as a projection.
– More generally, the view may be obtained by some other kind of transformation
(not necessarily a projection) to behaviors. E.g., the original system behaviors may
contain temperature as a state variable, but the view only contains temperature av-
erages over some period of time.

From the above discussion, it appears that: semantically, views can be formalized as sets
of behaviors, just like systems are. However, because of projections or other transfor-
mations, the domain of behaviors of a view is not necessarily the same as the domain of
system behaviors, U. Therefore, we let Di be the domain of behaviors of view i (there
can be more than one view, hence the subscript i). When we refer to a general view
domain, we drop the subscript and simply write D.
In the case of our running example, U = {1, 2, 3, 4}³, and Dtop = Dfront =
Dside = {1, 2, 3, 4}².

Views: A view is a set of behaviors over a given view domain. That is, a view V over
view domain D is defined to be a subset of D: V ⊆ D.

Abstraction Functions: Given a domain of behaviors U and a view domain D, we


would like to relate systems over U and views over D. In order to do this, we will first
introduce abstraction functions, which map behaviors from U to D. An abstraction
function from U to D is defined to be a mapping a : U → D. Abstraction functions can
be projections or other types of transformations, as discussed above.
In the case of our running example, the abstraction functions atop , af ront , aside are
3D-to-2D projections on the corresponding planes.
An abstraction function a can be naturally “lifted” from behaviors to systems. If
S ⊆ U, then a(S) is defined to be: a(S) := {a(σ) | σ ∈ S}. Note that a(S) ⊆ D,
therefore, a(S) is a view over D.

Conformance: Given system S ⊆ U, view V ⊆ D, and abstraction function a : U →


D, we say that V is a complete view of S w.r.t. a if V = a(S). The notion of complete
view is a reasonable way of capturing how “faithful” a given view is to a certain system.
For example, if S is an object containing two boxes, S = {(1, 1, 1), (2, 2, 2)} and atop
is the top view, then V1 = {(1, 1), (2, 2)} is complete w.r.t. atop , whereas V2 = {(2, 2)}
and V3 = {(1, 1), (2, 2), (3, 3)} are not complete.
But faithfulness need not always require a strict equality as in the condition V =
a(S). Depending on the usage one makes of a view, weaker conditions may be ap-
propriate. Because of this, we introduce the notion of conformance. Conformance is
defined with respect to a partial order ⊑ on the set of all views over view domain D.
That is, ⊑ is a partial order on 2^D, the powerset of D. Then, we say that V conforms to
S w.r.t. a and ⊑, denoted V ⊑a S, if V ⊑ a(S).
For example, if one uses the top view to decide whether it is safe to drop a box to the
floor without touching another box during landing, then a view that safely approximates
the set of free (x, y) positions could be acceptable. In this case, the partial order ⊑ is ⊇,
i.e., conformance is defined as V ⊇ atop (S). Indeed, dropping a box to (x, y) ∈ V
would be safe, since (x, y) ∈ V and V ⊇ atop (S) imply (x, y) ∈ atop (S). In another
scenario, it may be more appropriate to require that the view under-approximates a(S),
thus over-approximates the set of free (x, y) positions. For example, if one uses the top
view to decide whether it is safe to drop an object so that it does not hit the floor, then
it is more appropriate to define conformance as V ⊆ atop (S). In this case, ⊑ is ⊆.
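The two conformance orders are easy to state concretely; the following sketch (ours, with the same illustrative projection as before) checks both for the top view of the running example.

object ConformanceSketch {
  type Point3 = (Int, Int, Int)
  def aTop(s: Set[Point3]): Set[(Int, Int)] = s.map { case (x, y, _) => (x, y) }

  // Conformance with ⊑ = ⊇ : the view may over-approximate aTop(S).
  def conformsSuperset(v: Set[(Int, Int)], s: Set[Point3]): Boolean = aTop(s).subsetOf(v)
  // Conformance with ⊑ = ⊆ : the view may under-approximate aTop(S).
  def conformsSubset(v: Set[(Int, Int)], s: Set[Point3]): Boolean = v.subsetOf(aTop(s))

  def main(args: Array[String]): Unit = {
    val s: Set[Point3] = Set((1, 1, 1), (2, 2, 2))
    println(conformsSuperset(Set((1, 1), (2, 2), (3, 3)), s))  // true: over-approximation allowed
    println(conformsSuperset(Set((2, 2)), s))                  // false: misses (1, 1)
    println(conformsSubset(Set((2, 2)), s))                    // true: under-approximation allowed
  }
}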

An Alternative Formalization – Starting with Conformance: In the way we for-


malized things so far, we started with an abstraction function a and a partial order ⊑,
and defined the conformance relation ⊑a with respect to those. As an alternative, we
can start with a conformance relation |= ⊆ 2^D × 2^U, which relates a view V and a
system S, i.e., V |= S, and derive an abstraction function a. We can do this provided that
|= satisfies the conditions described below, and that the domain of views equipped with
⊑, denoted (2^D, ⊑), forms a complete lattice. Let ⊓ denote the greatest lower bound in
this lattice. Note that the interpretation of the lattice is that the smaller an element the
more accurate it is, and x ⊑ y says that y is smaller than x. Therefore, when ⊑ is ⊇, top
is D, bottom is ∅, and ⊓ is ∩. When ⊑ is ⊆, ⊓ is ∪. Then, |= induces an abstraction
function a defined as follows:

a|=(S) := ⊓{V ⊆ D | V |= S}.

For this to work, however, we need |= to have the two following properties:
1. (monotonicity) V1 |= S ∧ V2 ⊑ V1 ⇒ V2 |= S.
2. (conformance preserved by ⊓) ∀W ⊆ 2^D : (∀V ∈ W : V |= S) ⇒ (⊓W) |= S.
Condition 1 says that if V1 conforms to S then any view greater than V1 also conforms
to S. Condition 2 says that if a set of views all conform to a system S, then their greatest
lower bound also conforms to S. Any relation ⊑a defined by an abstraction function a
and an order ⊑ forming a complete lattice has these two properties by construction.
Consistency: Consider a set of views, V1 , V2 , ..., Vn , over view domains D1 , D2 , ..., Dn .
For each view domain Di , consider given a conformance relation |=i (which could
be derived from given abstraction function ai and partial order ⊑i , or defined as a
primitive notion as explained above). We say that V1 , V2 , ..., Vn are consistent w.r.t.
|=1 , |=2 , ..., |=n if there exists a system S over U such that ∀i = 1, ..., n : Vi |=i S. We
call such a system S a witness to the consistency of V1 , V2 , ..., Vn . Clearly, if no such
S exists, then one must conclude that the views are inconsistent, as there is no system
from which these views could be derived. When ⊑i is = for all i, i.e., when Vi = ai (S)
for all i, we say that V1 , ..., Vn are strictly consistent. Note that if ⊑i is ⊇ for all i,
then consistency trivially holds as the empty system is a witness, since Vi ⊇ ∅ = ai (∅)
for all i. Also, if ⊑i is ⊆ for all i and every ai satisfies ai (U) = Di , then consistency
trivially holds as the system U is a witness, since Vi ⊆ Di = ai (U) for all i.
In our 3D objects example, if Vtop is non-empty but Vside is empty, then the two
views are inconsistent w.r.t. strict conformance V = a(S). A less trivial case is when
Vtop = {(1, 1)} and Vside = {(2, 2)}. Again the two views are inconsistent (w.r.t. =):
Vtop asserts that some box must be in the column with (x, y) coordinates (1, 1), but
Vside implies that there is no box whose y coordinate is 1.
The last example may mislead to believe that consistency (w.r.t. =) is equivalent to
“intersection of inverse projection of views being non-empty.” This is not true. Even in
the case where abstraction functions are projections, non-empty intersection of inverse
projections is a necessary, but not a sufficient condition for consistency. To see this,
consider views Vtop = {(1, 1), (3, 3)} and Vside = {(2, 2), (1, 2)} in the context of our
running example. These two views are inconsistent w.r.t. =. Yet the intersection of their
inverse projections is non-empty, and equal to {(1, 1, 2)}.
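To make the last point concrete, the following Scala sketch (ours) checks strict consistency of a top and a side view over the 4 × 4 × 4 domain: it builds the largest candidate witness as the intersection of the inverse projections and tests whether that candidate reproduces both views exactly, which for these projection abstractions happens precisely when the views are consistent w.r.t. =.

object ConsistencySketch {
  type Point3 = (Int, Int, Int)
  val coords = 1 to 4

  def aTop(s: Set[Point3]): Set[(Int, Int)]  = s.map { case (x, y, _) => (x, y) }
  def aSide(s: Set[Point3]): Set[(Int, Int)] = s.map { case (_, y, z) => (y, z) }

  def consistent(vTop: Set[(Int, Int)], vSide: Set[(Int, Int)]): Boolean = {
    // all points compatible with both views (intersection of the inverse projections)
    val candidate: Set[Point3] =
      (for (x <- coords; y <- coords; z <- coords
            if vTop((x, y)) && vSide((y, z))) yield (x, y, z)).toSet
    aTop(candidate) == vTop && aSide(candidate) == vSide
  }

  def main(args: Array[String]): Unit = {
    println(consistent(Set((1, 1)), Set((2, 2))))                  // false: no common witness
    println(consistent(Set((1, 1), (3, 3)), Set((2, 2), (1, 2))))  // false, even though the
    // intersection of the inverse projections is the non-empty set {(1, 1, 2)}
    println(consistent(Set((1, 1)), Set((1, 2))))                  // true: witness {(1, 1, 2)}
  }
}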
View Reduction: Given a set of views V1 , ..., Vn of a system S, it may be possible to
“reduce” each view Vi based on the information contained in the other views, and as
a result obtain views V′1, ..., V′n that are "more accurate" views of S. We use the term
reduction inspired from similar work in abstract interpretation [5,10].
For example, if we assume that conformance is defined as V ⊇ a(S), then the views
Vtop = {(1, 1), (3, 3)} and Vside = {(2, 2), (1, 2)} can be reduced to V′top = {(1, 1)}
and V′side = {(1, 2)}. V′top is still a valid top view, in the sense that for every system S,
if both Vtop ⊇ atop (S) and Vside ⊇ aside (S), then V′top ⊇ atop (S). In addition, V′top is
more accurate than Vtop in the sense that V′top is a strict subset of Vtop . Indeed, V′top does
not contain the "bogus" square (3, 3) which cannot occur in S, as we learn from Vside .
Let us now define the notion of view reduction formally. First, given a conformance
relation between views and systems, |= ⊆ 2^D × 2^U, we define the concretization func-
tion c|= which, given a view V , returns the set of all systems which V conforms to:

c|=(V) := {S ⊆ U | V |= S} = {S ⊆ U | V ⊑ a|=(S)}.

Note that V1, ..., Vn are consistent w.r.t. |=1, ..., |=n iff ⋂_{i=1}^n c|=i(Vi) ≠ ∅. Also observe
that, by definition, a|=(S) |= S. As a consequence, S ∈ c|=(a|=(S)) for all S ⊆ U.
We next lift a|= to sets of systems. For this, we will again assume that (2^D, ⊑) forms
a lattice, with ⊓ denoting its greatest lower bound.¹ Then, if 𝒮 is a set of systems
over U, we define a|=(𝒮) to be the "most accurate" view that conforms to all systems
in 𝒮:

a|=(𝒮) := ⊓{V ⊆ D | c|=(V) ⊇ 𝒮} = ⊓{V ⊆ D | ∀S ∈ 𝒮 : V |= S}.

Lemma 1. The most accurate view that conforms to a set of systems 𝒮 can also be
determined from the individual systems' abstractions:

a|=(𝒮) = ⊔{a|=(S) | S ∈ 𝒮},

where ⊔ denotes the least upper bound in the lattice.

Missing proofs to lemmas and theorems can be found in the technical report [17].
Given the above, and assuming n view domains with corresponding conformance
relations, (D1 , |=1 ), ..., (Dn , |=n ), view reduction can be defined as follows:
reducei(V1, V2, ..., Vn) := a|=i(⋂_{j=1}^n c|=j(Vj)).
Lemma 2. Reduction is a reductive operation, i.e., Vi ⊑ reducei(V1, V2, ..., Vn) for
all i. The set of witnesses to the consistency of views V1, ..., Vn is invariant under re-
duction, i.e., ⋂_{i=1}^n c|=i(reducei(V1, V2, ..., Vn)) = ⋂_{i=1}^n c|=i(Vi).
The second part of the lemma implies that reduction is idempotent, i.e., for all i:
reducei(V′1, ..., V′n) = reducei(V1, ..., Vn), where V′i = reducei(V1, V2, ..., Vn).
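For the projection views of the running example with ⊇-conformance, the general definition boils down to projecting the intersection of the inverse projections. The following Scala sketch (ours, with the same simplifications as before) reproduces the reduction of Vtop and Vside worked out above.

object ReductionSketch {
  type Point3 = (Int, Int, Int)
  val coords = 1 to 4

  def aTop(s: Set[Point3]): Set[(Int, Int)]  = s.map { case (x, y, _) => (x, y) }
  def aSide(s: Set[Point3]): Set[(Int, Int)] = s.map { case (_, y, z) => (y, z) }

  // all points compatible with both views: the intersection of the inverse projections
  def compatible(vTop: Set[(Int, Int)], vSide: Set[(Int, Int)]): Set[Point3] =
    (for (x <- coords; y <- coords; z <- coords
          if vTop((x, y)) && vSide((y, z))) yield (x, y, z)).toSet

  // reduce each view by projecting the compatible set back onto its own domain
  def reduce(vTop: Set[(Int, Int)], vSide: Set[(Int, Int)]): (Set[(Int, Int)], Set[(Int, Int)]) = {
    val c = compatible(vTop, vSide)
    (aTop(c), aSide(c))
  }

  def main(args: Array[String]): Unit = {
    println(reduce(Set((1, 1), (3, 3)), Set((2, 2), (1, 2))))   // (Set((1,1)), Set((1,2)))
  }
}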
Orthogonality: In some fortunate cases different aspects of a system are indepen-
dent of each other. Intuitively, what this means is that each aspect can be defined
separately without the need for communication between development teams to avoid
inconsistencies.
Formally, we say that view domains D1 , ..., Dn are orthogonal if all sets of non-
empty views V1 , ..., Vn from these view domains are mutually irreducible, i.e., if
reducei (V1 , ..., Vn ) = Vi for all i = 1, ..., n. The view domains from our example
of 3D objects, capturing projections onto two dimensions, are not orthogonal, as the
reduction example involving the domains shows. On the other hand, view domains cor-
responding to the projection onto individual dimensions would indeed be orthogonal to
each other.
Alternatively, orthogonal view domains can be defined by requiring that all sets of
non-empty views V1 , ..., Vn from these domains are consistent w.r.t. =.
The following lemma shows that the two definitions of orthogonal domains are
equivalent, if we assume that conformance is defined based on abstraction functions
and the superset and equality relations as the partial orders on views.

¹ Note that when ⊑ is a set-theoretic relation such as ⊆ or ⊇, this obviously holds and ⊓ is
∪ or ∩, respectively. When ⊑ is = then (2^D, =) is not a lattice, and the definition of view reduction given
below does not apply. This is not a problem, as in that case we require views to be complete.

Lemma 3. Given non-empty views V1 , ..., Vn , the following statements are equivalent:

1. V1 , ..., Vn are consistent w.r.t. =a1 , ..., =an .


2. V1 , ..., Vn are mutually irreducible w.r.t. ⊇a1 , ..., ⊇an .
3. V1 , ..., Vn are mutually irreducible w.r.t. ⊆a1 , ..., ⊆an .

A system S ⊆ U is called view definable w.r.t. |=1 , ..., |=n if there exist views V1 ⊆
D1 , ..., Vn ⊆ Dn , such that c|=1 (V1 ) ∩ · · · ∩ c|=n (Vn ) = {S}. In the example of 3D
objects, with 2D projections, the empty object S = {} is view definable, as it is defined
by the empty views. Similarly, all objects Si,j,k = {(i, j, k)} are view definable. Note
that a general cube is not view definable, as there are other objects (e.g., a hollow cube)
which have the same 2D projections.

Verification and Synthesis Problems Related to Views


View conformance checking: given (concrete representation of) system S, view V , and
a certain conformance relation, does V conform to S?
View synthesis: given system S and abstraction function a, synthesize (concrete repre-
sentation of) a(S). Alternatively, given S and conformance relation |=, construct small-
est view V such that V |=S, that is, construct a|= (S).
View consistency checking: given views V1 , ..., Vn and conformance relations |=1 , ...,
|=n , check whether V1 , ..., Vn are consistent w.r.t. |=1 , ..., |=n .
System synthesis from views: given consistent views V1 , ..., Vn and conformance rela-
tions |=1 , ..., |=n , construct a system S such that for all i, Vi |=i S.
View reduction: given views V1 , ..., Vn compute reducei (V1 , V2 , ..., Vn ) for given i.

4 Discrete Systems
Our goal in the rest of this paper is to instantiate the view framework developed in
Section 3. We instantiate it for a class of discrete systems, and we also provide answers
to some of the corresponding algorithmic problems.
We will consider finite-state discrete systems. The state space of such a system can be
represented by a set of boolean variables, X, resulting in 2ⁿ potential states, where n =
|X| is the size of X. A state s over X is a valuation over X, i.e., a function s : X → B,
where B := {0, 1} is the set of booleans. For convenience, we sometimes consider other
finite domains with the understanding that they can be encoded as booleans. A behavior
over X is a finite or infinite sequence of states over X, σ = s0 s1 s2 · · · . U(X) denotes
the set of all possible behaviors over X.
Semantically, a discrete system S over X is a set of behaviors over X, i.e., S ⊆
U(X). For computation, we need a concrete representation of discrete systems. We
will start with a simple representation where all system variables are observable. We
will then discuss limitations of this representation and consider an extension where the
system can also have internal (unobservable) variables in addition to the observable
ones.

Fully-Observable Discrete Systems: A fully-observable discrete system (FOS) is rep-


resented concretely by a triple (X, θ, φ). X is the (finite) set of (boolean) variables. All
variables in X are considered observable. θ is a boolean expression over X, character-
izing the set of initial states of the system. Given state s, we write θ(s) to denote the fact
that s satisfies θ, i.e., s is an initial state. φ is a boolean expression over X ∪ X′, where
X′ is the set of primed copies of variables in X, X′ := {x′ | x ∈ X}, representing the
next-state variables, as usual. φ characterizes pairs of states (s, s′), each representing a
transition of S, i.e., a move from state s to state s′. We write φ(s, s′) to denote that the
pair (s, s′) satisfies φ, i.e., that there is a transition from s to s′.
A behavior of a system (X, θ, φ) is a finite or infinite sequence of states over X,
σ = s0 s1 s2 · · · , such that θ(s0 ) and ∀i : φ(si , si+1 ), i.e., s0 is an initial state and there
is a transition from each si to si+1 (if the latter exists). A state s is reachable if there is
a finite behavior s0 s1 · · · sn , such that s = sn .
We sometimes use S = (X, θ, φ) to denote the concrete (syntactic) representation of
discrete system S, and ⟦S⟧ to denote its semantics, i.e., its set of behaviors.
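As an illustration of ours (not the paper's representation), the following Scala sketch encodes a fully-observable discrete system with θ and φ given as executable predicates over states, together with a bounded enumeration of its behaviors; a symbolic representation, as used in practice, is beyond this sketch.

object FosSketch {
  type State = Map[String, Boolean]

  final case class FOS(vars: Set[String], theta: State => Boolean, phi: (State, State) => Boolean) {
    private def allStates: Seq[State] =
      vars.toList.foldLeft(Seq(Map.empty[String, Boolean])) { (acc, x) =>
        acc.flatMap(s => Seq(s + (x -> false), s + (x -> true)))
      }

    // All behaviors with exactly k states (k >= 1): s0 satisfies theta and every
    // consecutive pair of states satisfies phi.
    def behaviors(k: Int): Seq[List[State]] =
      if (k <= 0) Seq.empty
      else {
        def extend(b: List[State], steps: Int): Seq[List[State]] =
          if (steps == 0) Seq(b.reverse)
          else allStates.filter(s => phi(b.head, s)).flatMap(s => extend(s :: b, steps - 1))
        allStates.filter(theta).flatMap(s0 => extend(List(s0), k - 1))
      }
  }

  def main(args: Array[String]): Unit = {
    // one boolean variable x that alternates between true and false, starting at true
    val alt = FOS(Set("x"), s => s("x"), (s, t) => t("x") == !s("x"))
    alt.behaviors(3).foreach(println)   // List(Map(x -> true), Map(x -> false), Map(x -> true))
  }
}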

Projection (variable hiding): Projection, or variable hiding, is a natural operation on


systems, which can also serve as a basic abstraction function for views, as we shall see
below. Here, we define projection and motivate the introduction of internal variables in
the concrete representation of discrete systems.
Let s be a state over a set of variables X. Given subset Y ⊆ X, the projection
function hY projects s onto the set of variables Y , that is, hY hides from s all variables
in X \ Y. hY (s) is defined to be the new state s′ over Y, that is, the function s′ : Y → B,
such that s′(x) = s(x) for all x ∈ Y.
Projection can be lifted to behaviors in the standard way. If σ = s0 s1 · · · is a behav-
ior over X, then hY (σ) is a behavior over Y defined by hY (σ) := hY (s0 )hY (s1 ) · · · .
Projection can also be lifted to systems. If S is a discrete system over X then hY (S) :=
{hY (σ) | σ ∈ S}.

Non-closure Properties
Non-closure Under Projection: The projection hY (S) is defined semantically, as a
set of behaviors. It is natural to ask whether the syntactic representation of discrete
systems is closed under projection. That is, is it true that for any S = (X, θ, φ), and
Y ⊆ X, there exists S′ = (Y, θ′, φ′), such that ⟦S′⟧ = hY (⟦S⟧)? This is not generally
true:
Lemma 4. There exists a FOS S = (X, θ, φ), and Y ⊆ X, such that there is no FOS
S′ = (Y, θ′, φ′), such that ⟦S′⟧ = hY (⟦S⟧).

Proof. Consider the finite-state system S = ({x, y}, x = 0 ∧ y = true, (x′ = (x +
1) mod 5) ∧ (y′ ↔ (x′ = 0))), where x ∈ {0, 1, 2, 3, 4} and y ∈ B. Let Y = {y}.
Then hY (⟦S⟧) = {y0 y1 · · · | ∀i : yi ⇔ i mod 5 = 0}. We claim that there is no
S′ = (Y, θ′, φ′) such that ⟦S′⟧ = hY (⟦S⟧). The reason is that S′ needs to count modulo
five in order to produce the correct output. But S′ has only one boolean variable y.

As it turns out, we can check whether closure under projection holds for a given
system: see Theorem 2 in Section 5.

Non-closure Under Union


Lemma 5. Fully-observable systems over a set of variables X are not closed under
union, i.e., there exist S1 = (X, θ1 , φ1 ), S2 = (X, θ2 , φ2 ) such that there is no S =
(X, θ, φ) such that ⟦S⟧ = ⟦S1⟧ ∪ ⟦S2⟧.

Proof. Consider as an example S1 = ({x}, θ1 = x, φ1 = x ∧ ¬x′) and S2 = ({x}, θ2 =
¬x, φ2 = ¬x ∧ x′). Both systems allow exactly one transition, from x ↦ true to
x ↦ false and vice versa. A system that represents the union of S1 and S2 needs
to include both transitions. Then, however, it also includes arbitrarily long behaviors
alternating between x ↦ true and x ↦ false.

Discrete Systems with Internal Variables: The above non-closure properties motivate
us to study, in addition to fully-observable discrete systems, a generalization which ex-
tends them with a set of internal, unobservable state variables. Most practical modeling
languages also allow the construction of models with both internal and observable state
variables.
Accordingly, we extend the definition of a discrete system to be in general a tuple
(X, Z, θ, φ), where X, Z are disjoint (finite) sets of variables. X models the observable
and Z the internal variables. θ is a boolean expression over X ∪ Z and φ is a boolean
expression over X ∪ Z ∪ X′ ∪ Z′ . In such a system, we need to distinguish between
behaviors, and observable behaviors. A behavior of a system S = (X, Z, θ, φ) is a
finite or infinite sequence σ over X ∪ Z, defined as above. The observable behavior
corresponding to σ is hX (σ), which is a behavior over X. From now on, ⟦S⟧ denotes
the set of all behaviors (over X ∪ Z) of S, and ⟦S⟧o denotes the set of observable
behaviors (over X) of S.
Note that we allow Z to be empty. In that case, the system has no internal variables,
i.e., it is a FOS. We will continue to represent a FOS by a triple S = (X, θ, φ). A FOS S
satisfies ⟦S⟧ = ⟦S⟧o .
Closure Properties: We have already shown (Lemma 5) that FOS are not closed under
union. They are however closed under intersection:

Lemma 6. Given two FOS S1 = (X, θ1 , φ1 ) and S2 = (X, θ2 , φ2 ), a FOS S such that
⟦S⟧ = ⟦S1⟧ ∩ ⟦S2⟧ is S1 ∧ S2 = (X, θ1 ∧ θ2 , φ1 ∧ φ2 ).

General discrete systems (with internal variables) are closed under intersection,
union, as well as projection.

Lemma 7. Let S1 = (X, Z1 , θ1 , φ1 ) and S2 = (X, Z2 , θ2 , φ2 ) be two systems, such
that Z1 ∩ Z2 = ∅. Let Y ⊆ X and let z be a fresh variable not in X ∪ Z1 ∪ Z2 . Let:

S∩ = (X, Z1 ∪ Z2 , θ1 ∧ θ2 , φ1 ∧ φ2 ),
S∪ = (X, Z1 ∪ Z2 ∪ {z}, (θ1 ∧ z) ∨ (θ2 ∧ ¬z), (z → φ1 ∧ z′ ) ∧ (¬z → φ2 ∧ ¬z′ )),
Sh = (Y, Z1 ∪ (X \ Y ), θ1 , φ1 ).

Then, ⟦S∩⟧o = ⟦S1⟧o ∩ ⟦S2⟧o , ⟦S∪⟧o = ⟦S1⟧o ∪ ⟦S2⟧o , and ⟦Sh⟧o = hY (⟦S1⟧o ).
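
The union construction of Lemma 7 can be mirrored directly in code. The sketch below (Python, illustrative only; it covers the special case Z1 = Z2 = ∅, i.e., both arguments are FOS) adds the fresh internal variable z exactly as in the formula: z selects which component's initial condition and transition relation are in force, and keeps its value along the whole behavior.

def union_system(X, theta1, phi1, theta2, phi2):
    """Build S_union = (X, {z}, theta, phi) as in Lemma 7 (FOS arguments).
       States are dicts over X plus the fresh internal variable 'z'."""
    Z = ['z']
    def theta(s):                      # (theta1 /\ z) \/ (theta2 /\ ~z)
        return (theta1(s) and s['z']) or (theta2(s) and not s['z'])
    def phi(s, t):                     # (z -> phi1 /\ z') /\ (~z -> phi2 /\ ~z')
        if s['z']:
            return phi1(s, t) and t['z']
        return phi2(s, t) and not t['z']
    return X, Z, theta, phi

Projecting the behaviors of the resulting system onto X then yields the union of the observable behaviors of the two components, which is exactly what the lemma states.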

5 Views of Finite-State Discrete Systems


Having defined discrete systems, we now turn to instantiating the view framework for
such systems.

Discrete Views, View Domains, and Abstraction Functions: Discrete views are
finite-state discrete systems. They are represented in general by tuples of the form
(X, Z, θ, φ), and when Z = ∅, by triples of the form (X, θ, φ).
In this paper, we will study projection as the abstraction function for the discrete
view framework. That is, a system will be a discrete system S over a set of observable
variables X, and therefore the domain of system behaviors will be U = U(X). A view
will be a discrete system V over a subset of observable variables Y ⊆ X. Therefore,
the view domain of V is D = U(Y ). Note that both S and V may have (each their own)
internal variables.
Let S = (X, Z, θ, φ) be a discrete system, V = (Y, W, θ′ , φ′ ) be a discrete view,
with Y ⊆ X, and ⪯ be one of the orders =, ⊆, or ⊇. To make notation lighter, we will
write V ⪯ hY (S) instead of ⟦V⟧o ⪯ hY (⟦S⟧). Note that hY (⟦S⟧) = hY (⟦S⟧o ). More
generally, when comparing systems or views, we compare them w.r.t. their observable
behaviors. For instance, when writing V1 ⪯ V2 , we mean ⟦V1⟧o ⪯ ⟦V2⟧o .

Least and Greatest Fully-Observable Views: Let S be a discrete system over set
of observable variables X. Given a set Y ⊆ X, one might ask whether there is a
“canonical” view V of S w.r.t. Y . Clearly, if we allow V to have internal variables, the
answer is yes: it suffices to turn all variables in X \ Y into internal variables in V . Then,
by Lemma 7, V represents precisely the projection of S to Y , i.e., it is a complete view,
it satisfies V = hY (S), and therefore trivially also V ⊇ hY (S) and V ⊆ hY (S). Note
that this is true independently of whether S has internal variables or not.
In this section we study the question for the case where we forbid V from hav-
ing internal variables, i.e., we restrict views to be fully-observable. As FOS are not
closed under projection, there are systems that have no complete fully-observable view.
On the other hand, there can be multiple views V over Y such that V ⊇ hY (S) or
V ⊆ hY (S). In particular, (Y, true, true) ⊇ hY (S) and (Y, false, false) ⊆ hY (S),
for any S and Y . Thus, the question arises, whether there is a least fully-observable
view lv(S, Y ) of S with lv(S, Y ) ⊇ hY (S), such that for any fully-observable view V 
with V  ⊇ hY (S), we have V  ⊇ lv(S, Y ). Similarly, one may ask whether there is a
greatest fully-observable view gv(S, Y ) w.r.t. ⊆hY . These questions are closely related
to whether views are closed under intersection and union. In particular, we can use clo-
sure under intersection to show that a least view always exists. A greatest view, on the
other hand, does not necessarily exist.

Theorem 1. Let S = (X, Z, θ, φ) be any discrete system and let Y ⊆ X. Let ψS
characterize the set of reachable states of S. Then the FOS

(Y, θY = ∃(X ∪ Z) \ Y : θ, φY = ∃(X ∪ Z) \ Y : ψS ∧ ∃(X′ ∪ Z′ ) \ Y′ : φ)

is the unique fully-observable least view lv(S, Y ), that is, lv(S, Y ) ⊇ hY (S), and for
any fully-observable view V′ over Y with V′ ⊇ hY (S), we have V′ ⊇ lv(S, Y ).
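
For finite boolean systems the formula of Theorem 1 can be evaluated directly, by explicit enumeration of valuations (a realistic implementation would use BDDs, as mentioned below in the proof of Theorem 5). The following Python sketch is illustrative only; the function names are ours, and theta/phi are assumed to be Python predicates over valuations given as dicts.

from itertools import product

def valuations(variables):
    return [dict(zip(variables, vals))
            for vals in product([False, True], repeat=len(variables))]

def reachable(V, theta, phi):
    """Reachable states of (V, theta, phi), as frozensets of (var, value) pairs."""
    states = valuations(V)
    reach = {frozenset(s.items()) for s in states if theta(s)}
    changed = True
    while changed:
        changed = False
        for s in states:
            if frozenset(s.items()) in reach:
                for t in states:
                    if phi(s, t) and frozenset(t.items()) not in reach:
                        reach.add(frozenset(t.items()))
                        changed = True
    return reach

def least_view(V, Y, theta, phi):
    """Return (theta_Y, phi_Y) of lv(S, Y), following Theorem 1 literally."""
    hidden = [x for x in V if x not in Y]
    reach = reachable(V, theta, phi)
    def extensions(vY):
        exts = []
        for vals in product([False, True], repeat=len(hidden)):
            s = dict(vY)
            s.update(zip(hidden, vals))
            exts.append(s)
        return exts
    def theta_Y(vY):                  # exists hidden : theta
        return any(theta(s) for s in extensions(vY))
    def phi_Y(vY, vYp):               # exists hidden, hidden' : psi_S /\ phi
        return any(frozenset(s.items()) in reach and phi(s, t)
                   for s in extensions(vY) for t in extensions(vYp))
    return theta_Y, phi_Y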

As Lemma 4 shows, the projection of a system cannot generally be represented as a
fully-observable view. As it turns out, we can effectively check whether this is the case
for a given system S, by checking whether the least view of S conforms to S w.r.t. =.
Theorem 2. Given discrete system S over X and Y ⊆ X, there exists a fully-observable
view V over Y with V = hY (S) iff lv(S, Y ) = hY (S).
Theorem 2 implies that it is decidable to check whether a system admits a fully-
observable complete view V .
Theorem 3. There is a discrete system S over X and a subset Y ⊆ X for which there
is no unique greatest fully-observable view gv(S, Y ) with gv(S, Y ) ⊆ hY (S), such that
for any fully-observable view V′ with V′ ⊆ hY (S), we have V′ ⊆ gv(S, Y ).
Proof. Consider the FOS S = ({x, y}, θ = (x ∧ y) ∨ (¬x ∧ ¬y), φ = (x ∧ ¬x′ ∧ y ∧ y′ ) ∨
(¬x ∧ x′ ∧ ¬y ∧ ¬y′ )). The FOS S1 and S2 from the proof of Lemma 5 are both views
of S for Y = {x}, yet they are incomparable and there is no FOS view conforming to
S w.r.t. ⊆ that is greater than both of them, as their union is not a view of S.

View Conformance Checking for Discrete Systems and Views


Problem 1. Given discrete system S = (X, Z, θ, φ), discrete view V = (Y, W, θV , φV ),
where Y ⊆ X and Z ∩ W = ∅, and partial order ⪯ ∈ {=, ⊆, ⊇}, check whether
V ⪯ hY (S).
Problem 2. Given discrete systems S1 = (X, Z1 , θ1 , φ1 ) and S2 = (X, Z2 , θ2 , φ2 ),
where Z1 ∩ Z2 = ∅, and partial order ⪯ ∈ {=, ⊆, ⊇}, check whether ⟦S1⟧o ⪯ ⟦S2⟧o .
Theorem 4. Problem 1 can be reduced to Problem 2 in polynomial time. Problem 2 is
in PSPACE.
Proof. For the first part of the theorem, observe that discrete systems are closed under
projection. An instance of Problem 1 can be transformed into an instance of Problem 2,
simply by shifting the variables X \Y of S from the observable to the internal variables.
For the second part of the theorem, we limit our attention to the case ⪯ = ⊆, as the
other two cases then follow trivially. Problem 2 can be reduced to the finite state automa-
ton inequivalence problem, which is known to be in PSPACE [9]. As discrete systems
are closed under union, we construct a system S∪ , with ⟦S∪⟧o = ⟦S1⟧o ∪ ⟦S2⟧o . Then
⟦S∪⟧o = ⟦S2⟧o iff ⟦S1⟧o ⊆ ⟦S2⟧o . From S∪ and S2 we can construct NFAs M∪ and M2
that accept a sequence σ iff σ is an observable behavior of S∪ and S2 , respectively.

Theorem 5. Problem 1 is in P for partial order ⪯ = ⊇ if the discrete view V is a FOS.
Proof. First, notice that if Y ⊆ X, then V = (Y, θV , φV ) is a view of S = (X, Z, θ, φ)
if and only if it is a view of the fully-observable system S′ = (X ∪ Z, θ, φ). This is
because hY (S) = hY (S′ ). Thus, in the following, we will assume S to be a FOS with
S = (X, θ, φ).
Let ψS denote the reachable states of S. ψS can, e.g., be computed incrementally
using BDDs. Let Z := X \ Y and Z′ := X′ \ Y′ . Then, V ⊇hY S, if and only if the
following two conditions hold, which can be effectively checked:

1. ∀Y, Z : θ(Y, Z) → θV (Y ) ≡ ∀s : θ(s) → θV (hY (s)), and

2. ∀Y, Z, Y′ , Z′ : ψS (Y, Z) → (φ((Y, Z), (Y′ , Z′ )) → φV (Y, Y′ )) ≡ ∀s, s′ :
ψS (s) → (φ(s, s′ ) → φV (hY (s), hY (s′ ))).
We need to show that Conditions 1 and 2 from above hold, if and only if V ⊇hY S.
Let us first show that Conditions 1 and 2 imply V ⊇hY S:
We show this by induction over the length n of behaviors σ of S.
Base case: let σ = s0 ∈ ⟦S⟧ be any behavior of length 1 of S. Then θ(s0 ) must hold,
which, by Condition 1, implies θV (h(s0 )), which implies that h(s0 ) ∈ ⟦V⟧.
Induction step: let σ = s0 s1 · · · sn−1 sn ∈ ⟦S⟧ be a sequence of length n + 1. As
⟦S⟧ is by definition prefix-closed, s0 s1 · · · sn−1 is also in ⟦S⟧. By the induction hypoth-
esis, we know that h(s0 )h(s1 ) · · · h(sn−1 ) is in ⟦V⟧. As σ ∈ ⟦S⟧, sn−1 is reachable,
thus ψS (sn−1 ) holds. Thus, we can apply Condition 2, and deduce from the fact that
φ(sn−1 , sn ), that φV (h(sn−1 ), h(sn )). This in turn implies that h(s0 )h(s1 ) · · · h(sn−1 )
h(sn ) is a behavior of V .
Now, let us show the opposite direction, i.e., that V ⊇hY S implies Conditions 1
and 2. We show this by contraposition. Assume Condition 1 does not hold. Then, there
is a valuation vY of Y and a valuation vZ of Z, such that θ(vY vZ) holds (where
vY vZ is the valuation that agrees with vY on Y and with vZ on Z), but θV (vY )
does not. Clearly, h(vY vZ) = vY . So, vY vZ ∈ ⟦S⟧, but h(vY vZ) ∉ ⟦V⟧, which
implies that V ⊇hY S does not hold. Now assume that Condition 2 does not hold.
This implies that there are valuations vY, vZ and vY′ , vZ′ , such that ψS (vY vZ) and
φ(vY vZ, vY′ vZ′ ) hold, but φV (vY, vY′ ) does not. As vY vZ is thus reachable, there
must be a behavior s0 · · · (vY vZ) ∈ ⟦S⟧. By φ(vY vZ, vY′ vZ′ ), we also have that
s0 · · · (vY vZ)(vY′ vZ′ ) ∈ ⟦S⟧. Yet, because φV (vY, vY′ ) does not hold, h(s0 ) · · ·
h(vY vZ)h(vY′ vZ′ ) ∉ ⟦V⟧, which concludes the proof.
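
Conditions 1 and 2 thus reduce conformance w.r.t. ⊇ to two universally quantified implications over the (reachable part of the) transition relation, which is what gives the polynomial bound of Theorem 5. A minimal explicit-state sketch (Python, illustrative only; in practice both implications would be discharged symbolically, e.g. with BDDs):

from itertools import product

def valuations(variables):
    return [dict(zip(variables, vals))
            for vals in product([False, True], repeat=len(variables))]

def conforms_superset(X, theta, phi, Y, theta_V, phi_V, reach):
    """Check V conforms to S w.r.t. 'contains h_Y(S)' via Conditions 1 and 2.
       reach: set of frozenset(state.items()) for the reachable states of S,
       e.g. computed by a standard fixpoint iteration."""
    h = lambda s: {x: s[x] for x in Y}          # projection h_Y on one state
    states = valuations(X)
    cond1 = all(theta_V(h(s)) for s in states if theta(s))
    cond2 = all(phi_V(h(s), h(t))
                for s in states if frozenset(s.items()) in reach
                for t in states if phi(s, t))
    return cond1 and cond2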

Theorem 6. Problem 1 is PSPACE-hard even if the discrete view V is fully-observable,
for |Y | ≥ 1 and partial orders =, ⊆. Problem 1 is also PSPACE-hard for |Y | ≥ 1 and
partial order ⊇ if V is not restricted to be fully-observable.

In [13], it is shown that checking the universality of non-deterministic finite automata
(NFA), having the property that all states are final, is PSPACE-hard for alphabets of
size at least 2. In the technical report [17], we show how to reduce this problem to
Problem 1.
View Consistency Checking for Discrete Systems and Views
Problem 3. Given partial order ⪯ ∈ {=, ⊆, ⊇} and discrete views V1 , ..., Vn , with Vi =
(Yi , Wi , θi , φi ) for i = 1, ..., n, check whether there exists a discrete system S = (X, Z, θ,
φ), with X ⊇ Yi for all i, such that Vi ⪯hYi S for all i.
Problem 3 asks to check whether a given number of views are consistent w.r.t. pro-
jection as abstraction function and a given partial order among =, ⊆, ⊇. Note that we
can assume without loss of generality that the witness system has set of observable
variables X = Y1 ∪ · · · ∪ Yn , as any extra variables could be made internal.
Problem 3 is trivially solved by the “all” system θ = φ = true for ⊆ and by the
“empty” system θ = φ = false for ⊇. For =, if we restrict the witness system to be a

FOS, then Problem 3 is trivially decidable as there are only finitely many systems with
X = Y1 ∪ · · · ∪ Yn . Clearly, this is not very efficient. Theorems 7-9 (which also apply to
general discrete systems, not necessarily FOS) provide a non-brute-force method.
Theorem 7. For a set of views V1 , . . . , Vn , with Vi = (Yi , Wi , θi , φi ) for all i, there al-
ways exists a computable unique greatest witness system gw(V1 , . . . , Vn ) = (X, Z, θ, φ),
with X = Y1 ∪ · · · ∪ Yn , w.r.t. partial order ⊇.
Proof. First, observe that Si = (X, Wi , θi , φi ) is the unique greatest witness system for
Vi for systems with the set of variables X, i.e., Vi ⊇hYi Si and for all S = (X, W, θ, φ)
such that Vi ⊇hYi S, we have ⟦Si⟧ ⊇ ⟦S⟧. In fact, Vi =hYi Si . Given two views
Vi , Vj , the unique greatest witness system for both views is Si,j = (X, Wi ∪ Wj , θi ∧
θj , φi ∧ φj ), whose behaviors are exactly the intersection of the behaviors of Si and
Sj (see Lemma 7). Adding any behavior to Si,j would violate either Vi ⊇hYi Si,j or
Vj ⊇hYj Si,j . Generalizing the above, S∧ = (X∧ = Y1 ∪ · · · ∪ Yn , Z∧ = W1 ∪ · · · ∪ Wn ,
θ∧ = θ1 ∧ · · · ∧ θn , φ∧ = φ1 ∧ · · · ∧ φn ) is the unique greatest witness system for the
set of views V1 , . . . , Vn .
Theorem 8. Consistency with respect to = holds if and only if the greatest witness
system gw(V1 , . . . , Vn ) derived in Theorem 7 is a witness with respect to =.
Theorem 9. Problem 3 is PSPACE-complete for partial order =.
Theorem 10. There are discrete views V1 , . . . , Vn , with Vi = (Yi , Wi , θi , φi ) for all i,
for which there is no unique least witness system lw(V1 , . . . , Vn ) = (X, Z, θ, φ), with
X = Y1 ∪ · · · ∪ Yn , w.r.t. partial order ⊆.
Proof. Consider the following two views Vx = ({x}, θx = x, φx = true) and Vy =
({y}, θy = y, φy = true). We provide two witness systems S1 , S2 , both consistent
with Vx , Vy , such that their intersection is not consistent with Vx and Vy , which proves
that there is no unique least witness system for Vx , Vy w.r.t. ⊆:

S1 = ({x, y}, θ1 = x ∧ y, φ1 = (x ⇔ y) ∧ (x′ ⇔ y′ ))

S2 = ({x, y}, θ2 = x ∧ y, φ2 = x′ ∨ y′ )

In every behavior of S1 , x and y take the same value, whereas in S2 , x and y are never
both false. In their intersection S∩ = ({x, y}, θ1 ∧ θ2 , φ1 ∧ φ2 ), neither x nor y can
thus ever be false. So S∩ is neither consistent with Vx nor with Vy . 

View Reduction for Discrete Systems and Views


Problem 4. Given partial order ⪯ ∈ {=, ⊆, ⊇} and discrete views V1 , ..., Vn , with Vi =
(Yi , Wi , θi , φi ) for i = 1, ..., n, compute reducei (V1 , ..., Vn ) for all i = 1, ..., n.
Theorem 11. For partial order ⊇, Problem 4 is solved by the projection of the greatest
witness system to the observable variables of the respective view: let gw(V1 , . . . , Vn ) =
(X, Z, θ, φ), with X = Y1 ∪ · · · ∪ Yn , be the greatest witness system to the consistency of
V1 , ..., Vn w.r.t. partial order ⊇. Then:

reducei (V1 , ..., Vn ) = (Yi , Z ∪ (X \ Yi ), θ, φ).



For partial order ⊆, Problem 4 is often trivial. Specifically, if the sets of observable
variables of all views are incomparable, then no information can be transferred from
one view to another:
Theorem 12. Let V1 , ..., Vn be discrete views with Vi = (Yi , Wi , θi , φi ). Assume Yi \
Yj ≠ ∅ for all i ≠ j. Then, assuming ⪯ is ⊆, the following holds for all i:

reducei (V1 , ..., Vn ) = Vi .

6 Discussion
MVM is not a new topic, and terms such as “view” and “viewpoint” often appear in
system engineering literature, including standards such as ISO 42010 [12]. Despite this
fact, and the fact that MVM is a crucial concern in system design, an accepted mathe-
matical framework for reasoning about views has so far been lacking. This is especially
true for behavioral views, that is, views describing the dynamic behavior of the system,
as opposed to its static structure. Behavioral views are the main focus of our work.
Discrete behavioral views could also be captured in a temporal logic formalism such
as LTL. View consistency could then be defined as satisfiability of the conjunction
φ1 ∧ · · · ∧ φn , where each φi is a view (possibly over a different set of variables).
This definition is however weaker than our definition of strict consistency (w.r.t. =).
Satisfiability of φ1 ∧ · · · ∧ φn is equivalent to checking that the intersection of the in-
verse projections of views is non-empty, which, as we explained earlier, is a necessary
but not sufficient condition for strict consistency.
The same fundamental difference exists between our framework and view consis-
tency as formulated in the context of interface theories, where a special type of interface
conjunction is used [11] (called “fusion” in [2] and “shared refinement” in [7,18]).
Behavioral abstractions/views are also the topic of [15,16]. Their framework is close
to ours, in the sense that it also uses abstraction functions to map behaviors between dif-
ferent levels of abstraction (or between systems and views). The focus of both [15,16]
is to ease the verification task in a heterogeneous (e.g., both discrete and continuous)
setting. Our main focus is checking view consistency. The notion of “heterogeneous
consistency” [15] is different from our notion of view consistency. The notion of “con-
junctive implication” [15] is also different, as views which have an empty intersection
of their inverse projections trivially satisfy conjunctive implication, yet these views can
be inconsistent in our framework. Problems such as view consistency checking are not
considered in [15,16].
Consistency between architectural views, which capture structural but not behavioral
aspects of a system, is studied in [3]. Consistency problems are also studied in [8] using
a static, logic-based framework. Procedures such as join and normalization in relational
databases also relate to notions of static consistency.
An extensive survey of different approaches for multi-view modeling can be found
in [14]. [14] also gives a partial and preliminary formalization, but does not discuss
algorithmic problems. [4] discusses an informal methodology for selecting formalisms,
languages, and tools based on viewpoint considerations. A survey of trends in multi-
paradigm modeling can be found in [1]. Trends and visions in multi-view modeling are
also the topic of [19]. The latter paper also discusses pragmatics of MVM in the context

of the Ptolemy tool. However formal aspects of MVM and algorithmic problems such
as checking consistency are not discussed.
Implicitly, MVM is supported by multi-modeling languages such as UML, SysML,
and AADL. For instance, AADL defines separate “behavior and error annexes” and
having separate models in these annexes can result in inconsistencies. But capabilities
such as conformance or consistency checking are typically not provided by the tools
implementing these standards. Architectural consistency notions in a UML-like frame-
work are studied in [6].
This work is a first step toward a formal and algorithm-supported framework for
multi-view modeling. A natural direction for future work is to study algorithmic prob-
lems such as consistency checking in a heterogeneous setting. Although the framework
of Section 3 is general enough to capture heterogeneity, in this paper we restricted our
attention to algorithmic MVM problems for discrete systems, as we feel that we first
need a solid understanding of MVM in this simpler case.
Other directions for future work include investigating other types of abstraction
functions, generalizing the methods developed in Section 5, e.g., so that ⊆, =, ⊇ can be
arbitrarily combined, and studying algorithmic problems related to orthogonality.

References

1. Amaral, V., Hardebolle, C., Karsai, G., Lengyel, L., Levendovszky, T.: Recent advances in
multi-paradigm modeling. In: Ghosh, S. (ed.) MODELS 2009. LNCS, vol. 6002, pp. 220–
224. Springer, Heidelberg (2010)
2. Benveniste, A., Caillaud, B., Ferrari, A., Mangeruca, L., Passerone, R., Sofronis, C.: Multiple
viewpoint contract-based specification and design. In: de Boer, F.S., Bonsangue, M.M., Graf,
S., de Roever, W.-P. (eds.) FMCO 2007. LNCS, vol. 5382, pp. 200–225. Springer, Heidelberg
(2008)
3. Bhave, A., Krogh, B.H., Garlan, D., Schmerl, B.: View consistency in architectures for cyber-
physical systems. In: ICCPS 2011, pp. 151–160 (2011)
4. Broman, D., Lee, E.A., Tripakis, S., Törngren, M.: Viewpoints, Formalisms, Languages, and
Tools for Cyber-Physical Systems. In: MPM (2012)
5. Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: POPL,
pp. 269–282. ACM (1979)
6. Dijkman, R.M.: Consistency in Multi-Viewpoint Architectural Design. PhD thesis, Univer-
sity of Twente (2006)
7. Doyen, L., Henzinger, T., Jobstmann, B., Petrov, T.: Interface theories with component reuse.
In: EMSOFT, pp. 79–88 (2008)
8. Finkelstein, A., Gabbay, D., Hunter, A., Kramer, J., Nuseibeh, B.: Inconsistency handling in
multiperspective specifications. IEEE TSE 20(8), 569–578 (1994)
9. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-
Completeness. W. H. Freeman (1979)
10. Granger, P.: Improving the results of static analyses programs by local decreasing iteration.
In: Shyamasundar, R.K. (ed.) FSTTCS 1992. LNCS, vol. 652, pp. 68–79. Springer, Heidel-
berg (1992)
11. Henzinger, T.A., Ničković, D.: Independent implementability of viewpoints. In: Calinescu,
R., Garlan, D. (eds.) Monterey Workshop 2012. LNCS, vol. 7539, pp. 380–395. Springer,
Heidelberg (2012)

12. ISO/IEC/IEEE 42010:2011. Systems and software engineering - Architecture description,


the latest edition of the original IEEE Std 1471:2000, Recommended Practice for Architec-
tural Description of Software-intensive Systems. IEEE and ISO (2011)
13. Kao, J.-Y., Rampersad, N., Shallit, J.: On NFAs where all states are final, initial, or both.
Theoretical Computer Science 410(47-49), 5010–5021 (2009)
14. Persson, M., Törngren, M., et al.: A Characterization of Integrated Multi-View Modeling for
Embedded Systems. In: EMSOFT (2013)
15. Rajhans, A., Krogh, B.H.: Heterogeneous verification of cyber-physical systems using be-
havior relations. In: HSCC 2012, pp. 35–44. ACM (2012)
16. Rajhans, A., Krogh, B.H.: Compositional heterogeneous abstraction. In: HSCC 2013, pp.
253–262. ACM (2013)
17. Reineke, J., Tripakis, S.: Basic problems in multi-view modeling. Technical Report
UCB/EECS-2014-3, EECS Department, University of California, Berkeley (January 2014)
18. Tripakis, S., Lickly, B., Henzinger, T.A., Lee, E.A.: A theory of synchronous relational in-
terfaces. ACM Trans. on Progr. Lang. and Sys. (TOPLAS) 33(4) (July 2011)
19. von Hanxleden, R., Lee, E.A., Motika, C., Fuhrmann, H.: Multi-view modeling and pragmat-
ics in 2020. In: Calinescu, R., Garlan, D. (eds.) Monterey Workshop 2012. LNCS, vol. 7539,
pp. 209–223. Springer, Heidelberg (2012)
GPUexplore: Many-Core On-the-Fly State Space
Exploration Using GPUs

Anton Wijs and Dragan Bošnački

Eindhoven University of Technology, The Netherlands

Abstract In recent years, General Purpose Graphics Processors (GPUs)


have been successfully applied in multiple application domains to drasti-
cally speed up computations. Model checking is an automatic method to
formally verify the correctness of a system specification. Such specifica-
tions can be viewed as implicit descriptions of a large directed graph or
state space, and for most model checking operations, this graph must be
analysed. Constructing it, or on-the-fly exploring it, however, is compu-
tationally intensive, so it makes sense to try to implement this for GPUs.
In this paper, we explain the limitations involved, and how to overcome
these. We discuss the possible approaches involving related work, and
propose an alternative, using a new hash table approach for GPUs. Ex-
perimental results with our prototype implementations show significant
speed-ups compared to the established sequential counterparts.

1 Introduction

General Purpose Graphics Processing Units (GPUs) are being applied success-
fully in many areas of research to speed up computations. Model checking [1] is
an automatic technique to verify that a given specification of a complex, safety-
critical (usually embedded) system meets a particular functional property. It
involves very time and memory demanding computations. Many computations
rely on on-the-fly state space exploration. This incorporates interpreting the spec-
ification, resulting in building a graph, or state space, describing all its potential
behaviour. Hence, the state space is not explicitly given, but implicitly, through
the specification. The state space size is not known a priori.
GPUs have been successfully applied to perform computations for probabilis-
tic model checking, when the state space is given a priori [2–4]. However, no
attempts as of yet have been made to perform the exploration itself entirely
using GPUs, due to it not naturally fitting the data parallel approach of GPUs,
but in this paper, we propose a way to do so. Even though current GPUs have
a limited amount of memory, we believe it is relevant to investigate the possibil-
ities of GPU state space exploration, if only to be prepared for future hardware

This work was sponsored by the NWO Exacte Wetenschappen, EW (NWO Physical
Sciences Division) for the use of supercomputer facilities, with financial support
from the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands
Organisation for Scientific Research, NWO).


developments (for example, GPUs are already being integrated in CPUs). We


also believe that the results reported in this paper can be relevant for solving
other on-the-fly graph problems. In this paper, we describe several options to im-
plement basic state space exploration, i.e. reachability analysis, for explicit-state
model checking on GPUs. We focus on CUDA-enabled GPUs of NVIDIA, but
the options can also be implemented using other interfaces. We experimentally
compare these options, and draw conclusions. Where relevant, we use techniques
from related work, but practically all related implementations are focussed on
explicit graph searching, in which the explicit graph is given, as opposed to
on-the-fly constructing the graph. The structure of the paper is as follows: in
Section 2, the required background information is given. Then, Section 3 con-
tains the description of several implementations using different extensions. In
Section 4, experimental results are shown, and finally, Section 5 contains conclu-
sions and discusses possible future work.

2 Background and Related Work

2.1 State Space Exploration

The first question is how a specification should be represented. Most descrip-


tions, unfortunately, are not very suitable for our purpose, since they require the
dynamic construction of a database of data terms during the exploration. GPUs
are particularly unsuitable for dynamic memory allocation. We choose to use a
slightly modified version of the networks of LTSs model [5]. In such a network,
the possible behaviour of each process or component of the concurrent system
design is represented by a process LTS, or Labelled Transition System. An LTS
is a directed graph in which the vertices represent states of a process, and the
edges represent transitions between states. Moreover, each edge has a label indi-
cating the event that is fired by the process. Finally, an LTS has an initial state
sI . A network of LTSs is able to capture the semantics of specifications with
finite-state processes at a level where all data has been abstracted away and
only states remain. It is used in particular in the CADP verification toolbox [6].
Infinite-state processes are out of the scope here, and are considered future work.
In the remainder of this paper, we use the following notations: a network
contains a vector Π of n process LTSs, with n ∈ N. Given an integer n > 0,
1..n is the set of integers ranging from 1 to n. A vector v of size n contains n
elements indexed by 1..n. For i ∈ 1..n, v[i] denotes element i in v, hence Π[i]
refers to the ith LTS.
Besides a finite number of LTSs, a network also contains a finite set V of
synchronisation rules, describing how behaviour of different processes should
synchronise. Through this mechanism, it is possible to model synchronous com-
munication between processes. Each rule ⟨t, α⟩ consists of a vector t of size n,
describing the process events it is applicable on, and a result α, i.e. the system
event resulting from a successful synchronisation. As an example, consider the
first two LTSs from the left in Figure 1, together defining a network with n = 2
Fig. 1. Exploring the state space of a traffic light specification

of a simple traffic light system specification, where process 0 represents the be-
haviour of a traffic light (the states representing the colours of the light) and pro-
cess 1 represents a pedestrian. We also have V = {(start, continue, crossing)},
meaning that there is only a single synchronisation rule, expressing that the start
event of process 0 can only be fired if event continue of process 1 is fired at the
same time, resulting in the event crossing being fired by the system as a whole.
In general, synchronisation rules are not required to involve all processes; in
order to express that a rule is not applicable on a process i ∈ 1..n, we use a
dummy value • indicating this, and define t[i] = •.
State space exploration now commences as follows: first, the two initial states
of the processes (indicated by an incoming transition without a source state) are
combined into a system state vector s = ⟨R, 0⟩. In general, given a vector s, the
corresponding state of Π[i], with i ∈ 1..n, is s[i]. The set of outgoing transitions
(and their corresponding target states or successors of s) can now be determined
using two checks for each transition s[i] −a→ pi , with pi a state of process i:

1. ¬∃⟨t, α⟩ ∈ V. t[i] = a =⇒ s −a→ s′ with s′ [i] = pi ∧ ∀j ∈ 1..n \ {i}. s′ [j] = s[j]
2. ∀⟨t, α⟩ ∈ V. t[i] = a ∧ (∀j ∈ 1..n \ {i}. t[j] ≠ • =⇒ s[j] −t[j]→ pj ) =⇒ s −α→ s′
   with ∀j ∈ 1..n. (t[j] = • ∧ s′ [j] = s[j]) ∨ (t[j] ≠ • ∧ s′ [j] = pj )

The first check is applicable for all independent transitions, i.e. transitions on
which no rule is applicable, hence they can be fired individually, and therefore
directly ‘lifted’ to the system level. The second check involves applying synchro-
nisation rules. In Figure 1, part of the system state space obtained by applying
the defined checks on the traffic network is displayed on the right.
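
The two checks can be read directly as a recipe for successor construction. The following host-side sketch (Python, purely illustrative and unrelated to the CUDA data structures introduced later; the dummy value • is represented by None) computes the successors of a state vector s, with each process LTS given as a dict from (state, label) to a set of target states.

from itertools import product

def successors(s, procs, rules):
    """Successors of state vector s (a tuple of local states).
       procs[i]: dict mapping (state, label) -> set of target states of process LTS i.
       rules: list of (t, alpha), where t is a tuple of labels with None for '•'."""
    n, succs = len(s), set()
    # Check 1: transitions on which no rule is applicable are lifted directly.
    for i in range(n):
        for (state, a), targets in procs[i].items():
            if state == s[i] and not any(t[i] == a for t, _ in rules):
                for p in targets:
                    succs.add(tuple(p if j == i else s[j] for j in range(n)))
    # Check 2: a rule fires if every participating process offers the required label.
    for t, alpha in rules:
        choices = []
        for j in range(n):
            if t[j] is None:
                choices.append([s[j]])            # non-participant keeps its state
            else:
                choices.append(list(procs[j].get((s[j], t[j]), set())))
        if all(choices):
            succs.update(product(*choices))
    return succs

For the traffic light network, procs would hold the two LTSs of Figure 1 and rules the single entry with t = ('start', 'continue') and α = 'crossing'.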

2.2 GPU Programming

NVIDIA GPUs can be programmed using the CUDA interface, which extends the
C and Fortran programming languages. These GPUs contain tens of streaming
multiprocessors (SM) (see Figure 2, with N the number of SMs), each containing
a fixed number of streaming processors (SP), e.g. 192 for the Kepler K20 GPU,

and fast on-chip shared memory. Each SM employs single instruction, multiple
data (SIMD) techniques, allowing for data parallelisation. A single instruction
stream is performed by a fixed size group of threads called a warp. Threads in
a warp share a program counter, hence perform instructions in lock-step. Due
to this, branch divergence can occur within a warp, which should be avoided:
for instance, consider the if-then-else construct if (C) then A else B. If a
warp needs to execute this, and for at least one thread C holds, then all threads
must step through A. It is therefore possible that the threads must step together
through both A and B, thereby decreasing performance. The size of a warp is
fixed and depends on the GPU type, usually it is 32, we refer to it as WarpSize.
A block of threads is a larger group assigned to a single SM. The threads in
a block can use the shared memory to communicate with each other. An SM,
however, can handle many blocks in parallel. Instructions to be performed by
GPU threads can be defined in a function called a kernel. When launching a
kernel, one can specify how many thread blocks should execute it, and how many
threads each block contains (usually a power of two). Each SM then schedules all
the threads of its assigned blocks up to the warp level. Data parallelisation can
be achieved by using the predefined keywords BlockId and ThreadId, referring
to ID of the block a thread resides in, and the ID of a thread within its block,
respectively. Besides that, we refer with WarpNr to the global ID of a warp, and
with WarpTId to the ID of a thread within its warp. These can be computed as
follows: WarpNr = ThreadId/WarpSize and WarpTId = ThreadId %WarpSize.
Most of the data used by a GPU application resides in global memory or device
memory. It embodies the interface between the host (CPU) and the kernel (GPU).
Depending on the GPU type, its size is typically between 1 and 6 GB. It has a high
bandwidth, but also a high latency, therefore memory caches are used. The cache line
of most current NVIDIA GPU L1 and L2 caches is 128 Bytes, which directly corresponds
with each thread in a warp fetching a 32-bit integer. If memory accesses in a kernel can
be coalesced within each warp, efficient fetching can be achieved, since then, the threads
in a warp perform a single fetch together, nicely filling one cache line, instead of different
fetches, which would be serialised by the GPU, thereby losing many clock-cycles. This
plays an important role in the hash table implementation we propose.

Fig. 2. Hardware model of CUDA GPUs (N multiprocessors, each with SPs and shared
memory, connected via the L1/L2 and texture caches to the global memory)
Finally, read-only data structures in global memory can be declared as tex-
tures, by which they are connected to a texture cache. This may be beneficial if

access to the data structure is expected to be random, since the cache may help
in avoiding some global memory accesses.

2.3 Sparse Graph Search on GPUs


In general, the most suitable search strategy for parallelisation is Breadth-First
Search (BFS), since each search level is a set of vertices that can be distributed
over multiple workers. Two operations dominate in BFS: neighbour gathering,
i.e. obtaining the list of vertices reachable from a given vertex via one edge,
and status lookup, i.e. determining whether a vertex has already been visited
before. There exist many parallelisations of BFS; here, we will focus on GPU
versions. Concerning model checking, [7] describes the only other GPU on-line
exploration we found, but it uses both the CPU and GPU, restricting the GPU
to neighbour gathering, and it uses bitstate hashing, hence it is not guaranteed
to be exhaustive. In [8], explicit state spaces are analysed.
The vast majority of GPU BFS implementations are quadratic parallelisa-
tions, e.g. [9, 10]. To mitigate the dependency of memory accesses on the graph
structure, each vertex is considered in each iteration, yielding a complexity of
O(|V |2 + |E|), with V the set of vertices and E the set of edges. In [11], entire
warps are used to obtain the neighbours of a vertex.
There are only a few linear parallelisations in the literature: in [12], a hierar-
chical scheme is described using serial neighbour gathering and multiple queues
to avoid high contention on a single queue. In [13], an approach using prefix sum
is suggested, and a thorough analysis is made to determine how gatherings and
lookups need to be placed in kernels for maximum performance.
All these approaches are, however, not directly suitable for on-the-fly explo-
ration. First of all, they implement status lookups by maintaining an array, but
in on-the-fly exploration, the required size of such an array is not known a pri-
ori. Second of all, they focus on using an adjacency matrix, but for on-the-fly
exploration, this is not available, and the memory access patterns are likely to
be very different.
Related to the first objection, the use of a hash table seems unavoidable.
Not many GPU hash table implementations have been reported, but the ones
in [14, 15] are notable. They are both based on Cuckoo-hashing [16]. In Cuckoo
hashing, collisions are resolved by shuffling the elements along to new locations
using multiple hash functions. Whenever an element must be inserted, and hash
function h1 refers it to a location l already populated by another element, then
the latter element is replaced using the next hash function for that element, i.e.
if it was placed in l using hash function hi , then function hi+1 mod kc , with kc
the number of hash functions, is used. In [15], it is suggested to set kc = 4.
Finally, in [14,15], a comparison is made to radix sorting, in particular of [17].
On a GPU, sorting can achieve high throughput, due to the regular access pat-
terns, making list insertion and sorting faster than hash table insertion. Lookups,
however, are slower than hash table lookups if one uses binary searches, as is
done in [14, 15]. An alternative is to use B-trees for storing elements, improving

[Figure 3 depicts the arrays ProcOffsets, StateOffsets and TransArray with example
offsets, a state vector s[4] s[3] s[2] s[1], and a transition entry with fields Ts2 Ts1 Ts0 Ta Ts .]

Fig. 3. Encodings of a network, a state vector and a transition entry

memory access patterns by grouping the elements in warp-segments.1 Although


we have chosen to use a hash table approach (for on-the-fly exploration, we expe-
rience that the sorting approach is overly complicated, requiring many additional
steps), we will use this idea of warp-segments for our hash table.

3 GPU Parallelisation

Alg. 1 provides a high-level view of state space exploration. As in BFS, one can
clearly identify the two main operations, namely successor generation (line 4),
analogous to neighbour gathering, and duplicate detection (line 5), analogous to
status lookup. Finally, in lines 6-7, states are added to the work sets, Visited
being the set of visited states and Open being the set of states yet to be explored
(usually implemented as a queue). In the next subsections, we will discuss our
approach to implementing these operations.

Algorithm 1. State space exploration
1: Require: network ⟨Π, V⟩, initial state sI
2: Open, Visited ← {sI }
3: while Open ≠ ∅ do
4:     s ← Open; Open ← Open \ s
5:     for all s′ ∈ constructSystemSuccs(s) do
6:         if s′ ∉ Visited then
7:             Visited ← Visited ∪ {s′ }
8:             Open ← Open ∪ {s′ }

3.1 Data Encoding

As mentioned before, memory access patterns are usually the main cause for
performance loss in GPU graph traversal. The first step to minimise this effect is
to choose appropriate encodings of the data. Figure 3 presents on the left how we
encode a network into three 32-bit
integer arrays. The first, called ProcOffsets, contains the start offset for each of
the Π[i] in the second array. The second array, StateOffsets, contains the offsets
for the source states in the third array. Finally, the third array, TransArray, actu-
ally contains encodings of the outgoing transitions of each state. As an example,
let us say we are interested in the outgoing transitions of state 5 of process LTS
8, in some given network. First, we look at position 8 in ProcOffsets, and find
that the states of that process are listed starting from position 67. Then, we look
at position 67+5 in StateOffsets, and we find that the outgoing transitions of
state 5 are listed starting at position 201 in TransArray. Moreover, at position
67+6, we find the end of that list. Using these positions, we can iterate over the
outgoing transitions in TransArray.
1
See http://www.moderngpu.com (visited 18/4/2013).
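
The offset chain just illustrated (position 8 in ProcOffsets gives 67, position 67+5 in StateOffsets gives 201, and position 67+6 marks the end) can be mimicked on the host side as follows. This Python sketch is for illustration only; the parameters correspond to the three arrays described above.

def outgoing_entries(proc_offsets, state_offsets, trans_array, proc, state):
    """Return the transition entries of 'state' of process LTS 'proc'.
       proc_offsets[proc] is where that process' states start in state_offsets;
       state_offsets gives, per state, where its entries start in trans_array."""
    base = proc_offsets[proc]
    start = state_offsets[base + state]
    end = state_offsets[base + state + 1]
    return trans_array[start:end]

# With the example above: proc_offsets[8] == 67, state_offsets[67 + 5] == 201,
# and state_offsets[67 + 6] gives the end of the list of entries.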

One can imagine that these structures are practically going to be accessed
randomly when exploring. However, since this data is never updated, we can
store the arrays as textures, thereby using the texture caches to improve access.
Besides this, we must also encode the transition entries themselves. This is
shown on the right of Figure 3. Each entry fills a 32-bit integer as much as possi-
ble. It contains the following information: the lowest bit (Ts ) indicates whether
or not the transition depends on a synchronisation rule. The next log2 (ca ) num-
ber of bits, with ca the number of different labels in the entire network, encodes
the transition label (Ta ). We encode the labels, which are basically strings, by
integer values, sorting the labels occurring in a network alphabetically. After
that, each log2 (cs ) bits, with cs the number of states in the process LTS own-
ing this transition, encodes one of the target states. If there is non-determinism
w.r.t. label Ta from the source state, multiple target states will be listed, possibly
continuing in subsequent transition entries.
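
The bit layout of a transition entry can be made concrete with a small encoder/decoder. This Python sketch is illustrative only: ca and cs are the label and state counts from the text, and it merely checks that the requested target states still fit in the 32-bit word (in the actual encoding, remaining targets continue in the next entry).

from math import ceil, log2

def pack_entry(sync, label, targets, ca, cs):
    """Pack one entry: lowest bit = sync flag (Ts), then the label ID (Ta),
       then as many target state IDs as fit in 32 bits."""
    label_bits = max(1, ceil(log2(ca)))
    state_bits = max(1, ceil(log2(cs)))
    entry = (label << 1) | (1 if sync else 0)
    shift = 1 + label_bits
    for tgt in targets:
        if shift + state_bits > 32:
            raise ValueError("does not fit; would continue in the next entry")
        entry |= tgt << shift
        shift += state_bits
    return entry

def unpack_entry(entry, num_targets, ca, cs):
    label_bits = max(1, ceil(log2(ca)))
    state_bits = max(1, ceil(log2(cs)))
    sync = entry & 1
    label = (entry >> 1) & ((1 << label_bits) - 1)
    targets = [(entry >> (1 + label_bits + k * state_bits)) & ((1 << state_bits) - 1)
               for k in range(num_targets)]
    return sync, label, targets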
In the middle of Figure 3, the encoding of state vectors is shown. These are
simply concatenations of encodings of process LTS states. Depending on the
number of bits needed per LTS state, which in turn depends on the number of
states in the LTSs, a fixed number of 32-bit integers is required per vector.
Finally, the synchronisation rules need to be encoded. To simplify this, we
rewrite networks such that we only have rules involving a single label, e.g.
(a, a, a). In practice, this can usually be done without changing the meaning.
For the traffic light system, we could rewrite start and continue to crossing. It
allows encoding the rules as bit sequences of size n, where for each process LTS,
1 indicates that the process should participate, and 0 that it should not partic-
ipate in synchronisation. Two integer arrays then suffice, one containing these
encodings, the other containing the offsets for all the labels.
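
A rule over a single label thus becomes an n-bit participation mask. The sketch below (Python, illustrative; the exact layout of the two arrays in the implementation may differ) encodes such masks and retrieves, for a given label, the rules applicable on it.

def encode_rule(participants, n):
    """Bit i is set iff process i participates in the synchronisation rule."""
    mask = 0
    for i in participants:
        mask |= 1 << i
    return mask

def rules_for_label(label, rule_masks, label_offsets):
    """Masks of all rules for 'label':
       label_offsets[label] .. label_offsets[label+1] index into rule_masks."""
    return rule_masks[label_offsets[label]:label_offsets[label + 1]]

def participates(mask, i):
    return (mask >> i) & 1 == 1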

3.2 Successor Generation


At the start of a search iteration, each block fetches a tile of new state vectors
from the global memory. How this is done is explained at the end of Section 3.3.
The tile size depends on the block size BlockSize.
On GPUs, one should realise fine-grained parallelism to obtain good speedups.
Given the fact that each state vector consists of n states, and the outgoing
transitions information needs to be fetched from physically separate parts of the
memory, it is reasonable to assign n threads to each state vector to be explored.
In other words, in each iteration, the tile size is at most BlockSize/n vectors.
Assigning multiple threads per LTS for fetching, as in [11], does not lead to
further speedups, since the number of transition entries to fetch is usually quite
small due to the sparsity of the LTSs, as observed before by us in [4].
We group the threads into vector groups of size n to assign them to state
vectors. Vector groups never cross warp boundaries, unless n > 32. The positive
effect of this is that branch divergence can be kept to a minimum, since the
threads in a vector group work on the same task. For a vector s, each thread
with ID i w.r.t. its vector group (the VGID) fetches the outgoing transitions
of s[i + 1]. Each transition entry T with Ts = 0 can directly be processed, and

the corresponding target state vectors are stored for duplicate detection (see
Section 3.3). For all transitions with Ts = 1, to achieve cooperation between
the threads while limiting the amount of used shared memory, the threads it-
erate over their transitions in order of label ID (LID). To facilitate this, the
entries in each segment of outgoing transitions belonging to a particular state in
TransArray are sorted on LID before exploration starts.
Fig. 4. Fetching transitions (threads th 0 to th 3 each fetch entries for their process
state into a shared buffer; the cnt location holds the lowest LID in the group)

Successors reached through synchronisation are constructed in iterations. In each it-
eration, the threads assigned to s fetch the entries with lowest LID and Ts = 1 from their
list of outgoing transitions, and store these in a designated buffer in the shared memory. The
size of this buffer can be determined before exploration as n times the maximum
number of entries with the same LID and Ts = 1 from any process state in the
network. Then, the thread with VGID 0, i.e. the vector group leader, determines
the lowest LID fetched within the vector group. Figure 4 illustrates this for a
vector with n = 4. Threads th 0 to th 3 have fetched transitions with the lowest
LIDs for their respective process states that have not yet been processed in the
successor generation, and thread th 0 has determined that the next lowest LID to
be processed by the vector group is 1. This value is written in the cnt location.
Since transitions in TransArray are sorted per state by LID, we know that all
possible transitions with LID = 1 have been placed in the vector group buffer.
Next, all threads that fetched entries with the lowest LID, in the example threads
th 0 and th 2 , start scanning the encodings of rules in V applicable on that LID.
We say that thread i owns rule r iff there is no j ∈ 1..n with j < i and r[j] ≠ •.
If a thread encounters a rule that it owns, then it checks the buffer contents to
determine whether the rule is applicable. If it is, it constructs the target state
vectors and stores them for duplicate detection. In the next iteration, all entries
with lowest LID are removed, the corresponding threads fetch new entries, and
the vector group leader determines the next lowest LID to be processed.

3.3 Closed Set Maintenance

Local State Caching. As explained in Section 2, we choose to use a global memory


hash table to store states. Research has shown that in state space exploration,
due to the characteristics of most networks, there is a strong sense of locality, i.e.
in each search iteration, the set of new state vectors is relatively small, and most
of the already visited vectors have been visited about two iterations earlier [18,
19]. This allows effective use of block local state caches in shared memory. Such
a cache, implemented as a linear probing hash table, can be consulted quickly,
and many duplicates can already be detected, reducing the number of global
memory accesses. We implemented the caches in a lockless way, apart from using
a compare-and-swap (CAS) operation to store the first integer of a state vector.
When processing a tile, threads add successors to the cache. When finished,
the block scans the cache, to check the presence of the successors in the global

hash table. Thus, caches also allow threads to cooperatively perform global du-
plicate detection and insertion of new vectors.

Global Hash Table. For the global hash table, we initially used the Cuckoo hash
table of [15]. Cuckoo hashing has the nice property that lookups are done in
constant time, namely, it requires kc memory accesses, with kc the number of
hash functions used.
However, an important aspect of Cuckoo hashing is that elements are relo-
cated in case collisions occur. In [15], key-value pairs are stored in 64-bit integers,
hence insertions can be done atomically using CAS operations. Our state vectors,
though, can encompass more than 64 bits, ruling out completely atomic inser-
tions. After having created our own extension of the hash table of [15] that allows
for larger elements, we experienced in experiments that the number of explored
states far exceeded the actual number of reachable states, showing that in many
cases, threads falsely conclude that a vector was not present (a false negative).
We concluded that this is mainly due to vector relocation, involving non-atomic
removal and insertion, which cannot be avoided for large vectors; once a thread
starts removing a vector, it is not present anymore in the hash table until the
subsequent insertion has finished, and any other thread looking for the vector
will not be able to locate it during that time. It should however be noted, that
although the false negatives may negatively influence the performance, they do
not affect the correctness of our method.
To decrease the number of false negatives, as an alternative, we choose to
implement a hash table using buckets, linear probing and bounded double hash-
ing. It is implemented using an array, each consecutive WarpSize 32-bit integers
forming a bucket. This plays to the strength of warps: when a block of threads
is performing duplicate detection, all the threads in a warp cooperate on check-
ing the presence of a particular s. The first hash function h1 , built as specified
in [15], is used to find the primary bucket. A warp can fetch a bucket with one
memory access, since the bucket size directly corresponds with one cache line.
Subsequently, the bucket contents can be checked in parallel by the warp. This is
similar to the walk-the-line principle of [20], instead that here, the walk is done
in parallel, so we call it warp-the-line. Note that each bucket can contain up to
WarpSize/c vectors, with c the number of 32-bit integers required for a vector. If
the vector is not present and there is a free location, the vector is inserted. If the
bucket is full, h2 is used to jump to another bucket, and so on. This is similar
to [21], instead that we do not move elements between buckets.
The pseudo-code for scanning the local cache and looking up and inserting
new vectors (i.e. find-or-put) in the case that state vectors fit in a single 32-bit
integer is displayed in Alg. 2. The implementation contains the more general case.
The cache is declared extern, meaning that the size is given when launching the
kernel. Once a work tile has been explored and the successors are in the cache,
each thread participates in its warp to iterate over the cache contents (lines 6,
27). If a vector is new (line 8, note that empty slots are marked ‘old’), insertion
in the hash table will be tried up to H ∈ N times. In lines 11-13, warp-the-line is
performed, each thread in a warp investigating the appropriate bucket slot. If any

Algorithm 2. Hash table find-or-put for single integer state vectors


 1: extern volatile shared unsigned int cache []
 2: < process work tile and fill cache with successors >
 3: WarpNr ← ThreadId / WarpSize
 4: WarpTId ← ThreadId % WarpSize
 5: i ← WarpNr
 6: while i < |cache| do
 7:     s ← cache[i]
 8:     if isNewVector(s) then
 9:         for j = 0 to H do
10:             BucketId ← h1 (s)
11:             entry ← Visited[BucketId + WarpTId]
12:             if entry = s then
13:                 setOldVector(cache[i])
14:             s ← cache[i]
15:             if isNewVector(s) then
16:                 for l = 0 to WarpSize do
17:                     if Visited[BucketId + l] = empty then
18:                         if WarpTId = 0 then
19:                             old = atomicCAS(&Visited[BucketId + l], empty, s)
20:                             if old = empty then
21:                                 setOldVector(s)
22:                     if ¬isNewVector(s) then
23:                         break
24:             if ¬isNewVector(s) then
25:                 break
26:             BucketId ← BucketId + h2 (s)
27:     i ← i + BlockSize/WarpSize

thread sets s as old in line 13, then all threads will detect this in line 15, since s is
read from shared memory. If the vector is not old, then it is attempted to insert
it in the bucket (lines 15-23). This is done by the warp leader (WarpTId = 0,
line 18), by performing a CAS. CAS takes three arguments, namely the address
where the new value must be written, the expected value at the address, and
the new value. It only writes the new value if the expected value is encountered,
and returns the encountered value, therefore a successful write has happened if
empty has been returned (line 20). Finally, in case of a full bucket, h2 is used
to jump to the next one (line 26).
As discussed in Section 4, we experienced good speedups and no unresolved
collisions using a double hashing bound of 8, and, although still present, far fewer
false negatives compared to Cuckoo hashing. Finally, it should be noted that
chaining is not a suitable option on a GPU, since it requires memory allocation
at runtime, and the required sizes of the chains are not known a priori.
Recall that the two important data structures are Open and Visited. Given
the limited amount of global memory, and that the state space size is unknown
a priori, we prefer to initially allocate as much memory as possible for Visited.
But also the required size of Open is not known in advance, so how much mem-
ory should be allocated for it without potentially wasting some? We choose to
combine the two in a single hash table by using the highest bit in each vector
encoding to indicate whether it should still be explored or not. The drawback is
that unexplored vectors are not physically close to each other in memory, but
the typically large number of threads can together scan the memory relatively
fast, and using one data structure drastically simplifies implementation. It has

the added benefit that load-balancing is handled by the hash functions, due to
the fact that the distribution over the hash table achieves distribution over the
workers. A consequence is that the search will not be strictly BFS, but this is
not a requirement. At the start of an iteration, each block gathers a tile of new
vectors by scanning predefined parts of the hash table, determined by the block
ID. In the next section, several possible improvements on scanning are discussed.
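
Marking whether a stored vector still has to be explored thus costs one bit of its encoding. A minimal sketch of the flagging (Python, illustrative; in the implementation the flag lives in the first 32-bit integer of a vector in the global hash table):

NEW_FLAG = 1 << 31          # highest bit marks 'still to be explored'

def mark_new(word):
    return word | NEW_FLAG

def mark_old(word):
    return word & ~NEW_FLAG & 0xFFFFFFFF

def is_new(word):
    return (word & NEW_FLAG) != 0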

3.4 Further Extensions


On top of the basic approach, we implemented the following extensions. First
of all, instead of just one, we allow a variable number of search iterations to be
performed within one kernel launch. This improves duplicate detection using the
caches due to them maintaining more of the search history (shared memory data
is lost once a kernel terminates). Second of all, building on the first extension,
we implemented a technique we call forwarding. When multiple iterations are
performed per launch, and a block is not in its final iteration, its threads will
add the unexplored successors they generated in the current iteration to their
own work tile for the next one. This reduces the need for scanning for new work.

4 Implementation and Experiments


We implemented the exploration techniques in CUDA for C.2 The implemen-
tation was tested using 25 models from different sources; some originate from
the distributions of the state-of-the-art model checking toolsets CADP [6] and
mCRL2 [22], and some from the BEEM database [23]. In addition, we added two
we created ourselves. Here, we discuss the results for a representative subset.
Sequential experiments have been performed using Exp.Open [5] with Gen-
erator, both part of CADP. These are highly optimised for sequential use.
Those experiments were performed on a machine with an Intel Xeon E5520
2.27 GHz CPU, 1TB RAM, running Fedora 12. The GPU experiments were
done on machines running CentOS Linux, with a Kepler K20 GPU, an Intel
E5-2620 2.0 GHz CPU, and 64 GB RAM. The GPU has 13 SMs, 6GB global
memory (realising a hash table with about 1.3 billion slots), and 48kB (12,288
integers) shared memory per block. We chose not to compare with the GPU tool
of [7], since it is a CPU-GPU hybrid, and therefore does not clearly allow to
study to what extent a GPU can be used by itself for exploration. Furthermore,
it uses bitstate hashing, thereby not guaranteeing exhaustiveness.
We also conducted experiments with the model checker LTSmin [24] using
the six CPU cores of the machines equipped with K20s. LTSmin uses the most
scalable multi-core exploration techniques currently available.
Table 1 displays the characteristics of the models we consider here. The first
five are models taken from and inspired by those distributed with the mCRL2
toolset (in general ’.1’ suffixed models indicate that we extended the existing
2
The implementation and experimental data is available at
http://www.win.tue.nl/~awijs/GPUexplore.

Fig. 5. Performance with varying nr. of blocks (iters=10)

models to obtain larger state spaces), the next two have been created by us, the
seven after that originate from CADP, and the final five come from the BEEM
database. The latter ones have first been translated manually to mCRL2, since
our input, network of LTSs, uses an action-based representation of system be-
haviour, but BEEM models are state-based, hence this gap needs to be bridged.
Table 1. Benchmark characteristics

Model              #States         #Transitions
1394                   198,692          355,338
1394.1              36,855,184       96,553,318
acs                      4,764           14,760
acs.1                  200,317          895,004
wafer stepper.1      4,232,299       19,028,708
ABP                235,754,220      945,684,122
broadcast           60,466,176      705,438,720
transit              3,763,192       39,925,524
CFS.1              252,101,742    1,367,483,201
asyn3               15,688,570       86,458,183
asyn3.1            190,208,728      876,008,628
ODP                     91,394          641,226
ODP.1                7,699,456       31,091,554
DES                 64,498,297      518,438,860
lamport.8           62,669,317      304,202,665
lann.6             144,151,629      648,779,852
lann.7             160,025,986      944,322,648
peterson.7         142,471,098      626,952,200
szymanski.5         79,518,740      922,428,824

An important question is how the exploration should be configured, i.e. how many
blocks should be launched, and how many iterations should be done per kernel launch.
We tested different configurations for 512 threads per block (other numbers of threads
resulted in reduced performance) using double hashing with forwarding; Figure 5 shows
our results launching a varying number of blocks (note the logscale of the right graph),
each performing 10 iterations per kernel launch. The ideal number of blocks for the K20
seems to be 240 per SM, i.e. 3120 blocks. For GPU standards, this is small, but launching
more often negatively affects performance, probably due to the heavy use of shared
memory.
Figure 6 shows some of our results on varying the number of iterations per kernel
launch. Here, it is less clear which value leads to the best results, either 5
or 10 seems to be the best choice. With
a lower number, the more frequent hash table scanning becomes noticable, while
with higher numbers, the less frequent passing along of work from SMs to each
other leads to too much redundancy, i.e. re-exploration of states, causing the
exploration to take more time.

Fig. 6. Performance with varying nr. of iterations per kernel (blocks=3120)

Fig. 7. Runtime results for various tools

For further experimentation, we opted for 10 iterations per launch. Figure 7
shows our runtime results (note the log scale). The GPU extension combinations
used are Double Hashing (DH), DH+Forwarding (DH+F), and DH without local
caches (NC). The smaller state spaces are represented in the left graph. Here, DH
and NC often do not yet help to speed up exploration; the overhead involved can
lead to longer runtimes compared to sequential runs. However, DH+F is more
often than not faster than sequential exploration. The small differences between
DH and NC, and the big ones between NC and DH+F (which is also the case in
the right graph) indicate that the major contribution of the caches is forwarding,
as opposed to localised duplicate detection, which was the original motivation
for using them. DH+F speeds up DH on average by 42%.
It should be noted that for vectors requiring multiple integers, GPU explo-
ration tends to perform on average 2% redundant work, i.e. some states are
re-explored. In those cases, data races occur between threads writing and read-
ing vectors, since only the first integer of a vector is written with a CAS. However,
we consider these races benign, since it is important that all states are explored,
not how many times, and adding additional locks hurts the performance.
The right graph in Figure 7 includes results for LTSmin using six CPU cores.
This shows that, apart from some exceptions, our GPU implementation on
average has a performance similar to using about 10 cores with LTSmin, based
on the fact that LTSmin demonstrates near-linear speedups when the number of
cores is increased. In case of the exceptions, such as the ABP case, about two orders
of magnitude speedup is achieved. This may seem disappointing, considering that
GPUs have an enormous computation potential. However, on-the-fly exploration
is not a straightforward task for a GPU, and a one order of magnitude speedup
seems reasonable. Still, we believe these results are very promising, and merit fur-
ther study. Existing multi-core exploration techniques, such as in [24], scale well
with the number of cores. Unfortunately, we cannot test whether this holds for our
GPU exploration, apart from varying the number of blocks; the number of SMs
cannot be varied, and any number beyond 15 on a GPU is not yet available.
Concluding, our choices regarding data encoding and successor generation seem
to be effective, and our findings regarding a new GPU hash table, local caches and
forwarding can be useful for anyone interested in GPU graph exploration.

5 Conclusions

We presented an implementation of on-the-fly GPU state space exploration, proposed
a novel GPU hash table, and experimentally compared different configu-
rations and combinations of extensions. Compared to state-of-the-art sequential
implementations, we measured speedups of one to two orders of magnitude. We
think that GPUs are a viable option for state space exploration. Of course, more
work needs to be done in order to really use GPUs to do model checking. For fu-
ture work, we will experiment with changing the number of iterations per kernel
launch during a search, support LTS networks with data, pursue checking safety
properties, and experiment with partial searches [25, 26].

References

1. Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press (2008)
2. Bošnački, D., Edelkamp, S., Sulewski, D., Wijs, A.: Parallel Probabilistic Model
Checking on General Purpose Graphics Processors. STTT 13(1), 21–35 (2011)
3. Bošnački, D., Edelkamp, S., Sulewski, D., Wijs, A.: GPU-PRISM: An Extension
of PRISM for General Purpose Graphics Processing Units. In: Joint HiBi/PDMC
Workshop (HiBi/PDMC 2010), pp. 17–19. IEEE (2010)
4. Wijs, A.J., Bošnački, D.: Improving GPU Sparse Matrix-Vector Multiplication for
Probabilistic Model Checking. In: Donaldson, A., Parker, D. (eds.) SPIN 2012.
LNCS, vol. 7385, pp. 98–116. Springer, Heidelberg (2012)
5. Lang, F.: Exp.Open 2.0: A Flexible Tool Integrating Partial Order, Compositional,
and On-The-Fly Verification Methods. In: Romijn, J.M.T., Smith, G.P., van de
Pol, J. (eds.) IFM 2005. LNCS, vol. 3771, pp. 70–88. Springer, Heidelberg (2005)
6. Garavel, H., Lang, F., Mateescu, R., Serwe, W.: CADP 2010: A Toolbox for
the Construction and Analysis of Distributed Processes. In: Abdulla, P.A., Leino,
K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 372–387. Springer, Heidelberg
(2011)
7. Edelkamp, S., Sulewski, D.: Efficient Explicit-State Model Checking on General
Purpose Graphics Processors. In: van de Pol, J., Weber, M. (eds.) SPIN 2010.
LNCS, vol. 6349, pp. 106–123. Springer, Heidelberg (2010)
8. Barnat, J., Bauch, P., Brim, L., Češka, M.: Designing Fast LTL Model Checking
Algorithms for Many-Core GPUs. J. Par. Distr. Comput. 72, 1083–1097 (2012)
9. Deng, Y., Wang, B., Shuai, M.: Taming Irregular EDA Applications on GPUs. In:
ICCAD 2009, pp. 539–546 (2009)
10. Harish, P., Narayanan, P.J.: Accelerating Large Graph Algorithms on the GPU
Using CUDA. In: Aluru, S., Parashar, M., Badrinath, R., Prasanna, V.K. (eds.)
HiPC 2007. LNCS, vol. 4873, pp. 197–208. Springer, Heidelberg (2007)
11. Hong, S., Kim, S., Oguntebi, T., Olukotun, K.: Accelerating CUDA Graph Algo-
rithms At Maximum Warp. In: PPoPP 2011, pp. 267–276. ACM (2011)
12. Luo, L., Wong, M., Hwu, W.M.: An Effective GPU Implementation of Breadth-
First Search. In: DAC 2010, pp. 52–55. IEEE Computer Society Press (2010)
13. Merrill, D., Garland, M., Grimshaw, A.: Scalable GPU Graph Traversal. In: PPoPP
2012, pp. 117–128. ACM (2012)
14. Alcantara, D., Sharf, A., Abbasinejad, F., Sengupta, S., Mitzenmacher, M., Owens,
J., Amenta, N.: Real-time Parallel Hashing on the GPU. ACM Trans. Graph. 28(5),
154 (2009)
15. Alcantara, D., Volkov, V., Sengupta, S., Mitzenmacher, M., Owens, J., Amenta,
N.: Building an Efficient Hash Table on the GPU. In: GPU Computing Gems Jade
Edition. Morgan Kaufmann (2011)
16. Pagh, R.: Cuckoo Hashing. In: Meyer auf der Heide, F. (ed.) ESA 2001. LNCS,
vol. 2161, pp. 121–133. Springer, Heidelberg (2001)
17. Merrill, D., Grimshaw, A.: High Performance and Scalable Radix Sorting: a Case
Study of Implementing Dynamic Parallelism for GPU Computing. Parallel Pro-
cessing Letters 21(2), 245–272 (2011)
18. Pelánek, R.: Properties of State Spaces and Their Applications. STTT 10(5), 443–
454 (2008)
19. Mateescu, R., Wijs, A.: Hierarchical Adaptive State Space Caching Based on Level
Sampling. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505,
pp. 215–229. Springer, Heidelberg (2009)
20. Laarman, A., van de Pol, J., Weber, M.: Boosting Multi-core Reachability Perfor-
mance with Shared Hash Tables. In: FMCAD 2010, pp. 247–255 (2010)
21. Dietzfelbinger, M., Mitzenmacher, M., Rink, M.: Cuckoo Hashing with Pages. In:
Demetrescu, C., Halldórsson, M.M. (eds.) ESA 2011. LNCS, vol. 6942, pp. 615–627.
Springer, Heidelberg (2011)
22. Cranen, S., Groote, J.F., Keiren, J.J.A., Stappers, F.P.M., de Vink, E.P., Wesselink,
W., Willemse, T.A.C.: An overview of the mCRL2 Toolset and Its Recent Advances.
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 199–213.
Springer, Heidelberg (2013)
23. Pelánek, R.: BEEM: Benchmarks for Explicit Model Checkers. In: Bošnački, D.,
Edelkamp, S. (eds.) SPIN 2007. LNCS, vol. 4595, pp. 263–267. Springer, Heidelberg
(2007)
24. Laarman, A., van de Pol, J., Weber, M.: Multi-Core LTSmin: Marrying Modularity
and Scalability. In: Bobaru, M., Havelund, K., Holzmann, G.J., Joshi, R. (eds.)
NFM 2011. LNCS, vol. 6617, pp. 506–511. Springer, Heidelberg (2011)
25. Torabi Dashti, M., Wijs, A.J.: Pruning State Spaces with Extended Beam Search.
In: Namjoshi, K.S., Yoneda, T., Higashino, T., Okamura, Y. (eds.) ATVA 2007.
LNCS, vol. 4762, pp. 543–552. Springer, Heidelberg (2007)
26. Wijs, A.: What To Do Next?: Analysing and Optimising System Behaviour in Time.
PhD thesis, VU University Amsterdam (2007)
Forward Reachability Computation
for Autonomous Max-Plus-Linear Systems

Dieky Adzkiya1 , Bart De Schutter1 , and Alessandro Abate2,1


1
Delft Center for Systems and Control
TU Delft – Delft University of Technology, The Netherlands
{d.adzkiya,b.deschutter,a.abate}@tudelft.nl
2
Department of Computer Science
University of Oxford, United Kingdom
[email protected]

Abstract. This work discusses the computation of forward reachability
for autonomous (that is, deterministic) Max-Plus-Linear (MPL) sys-
tems, a class of continuous-space discrete-event models that are relevant
tems, a class of continuous-space discrete-event models that are relevant
for applications dealing with synchronization and scheduling. Given an
MPL model and a set of initial states, we characterize and compute its
“reach tube,” namely the sequential collection of the sets of reachable
states (these sets are regarded step-wise as “reach sets”). We show that
the exact computation of the reach sets can be quickly and compactly
performed by manipulations of difference-bound matrices, and derive
explicit worst-case bounds for the complexity of these operations. The
concepts and techniques are implemented within the toolbox VeriSiMPL,
and are practically elucidated by a running example. We further dis-
play the computational performance of the approach by two concluding
numerical benchmarks: the technique comfortably handles reachability
computations over twenty-dimensional MPL models (i.e., models with
twenty continuous variables), and it clearly outperforms an alternative
state-of-the-art approach in the literature.

1 Introduction
Reachability analysis is a fundamental problem in the areas of formal meth-
ods and of systems theory. It is concerned with assessing whether a certain set
of states of a system is attainable from a given set of initial conditions. The
problem is particularly interesting and compelling over models with continuous
components – either in time or in the (state) space. For the first class of models,
reachability has been widely investigated over discrete-space systems, such as
timed automata [1,2], or (timed continuous) Petri nets [3], or hybrid automata
[4]. On the other hand, much research has been done to enhance and scale the

This work has been supported by the European Commission STREP project MoVeS
257005, by the European Commission Marie Curie grant MANTRAS 249295, by the
European Commission IAPP project AMBI 324432, by the European Commission
NoE Hycon2 257462, and by the NWO VENI grant 016.103.020.

reachability analysis of continuous-space models. Among the many approaches
for deterministic dynamical systems, we report here the use of face lifting [5], the
computation of flow-pipes via polyhedral approximations [6,7], the formulation
as solution of Hamilton-Jacobi equations [8] (related to the study of forward and
backward reachability [9]), the use of ellipsoidal techniques [10,11], differential
inclusions [12], support functions [13], and Taylor models [14].
Max-Plus-Linear (MPL) models are discrete-event systems [15] with con-
tinuous variables that express the timing of the underlying sequential events.
Autonomous MPL models are characterized by deterministic dynamics. MPL
models are employed to describe the timing synchronization between interleaved
processes, and as such are widely employed in the analysis and scheduling of
infrastructure networks, such as communication and railway systems [16] and
production and manufacturing lines [17]. They are related to a subclass of timed
Petri nets, namely timed-event graphs [15], however not directly to time Petri
nets [18] nor to timed automata [1]. MPL models are classically analyzed over
properties such as transient and periodic regimes [15]. They can be simulated
(though not verified) via the max-plus toolbox Scilab [19].
Reachability analysis of MPL systems from a single initial condition has been
investigated in [20,21] by the computation of the reachability matrix (as for
discrete-time linear dynamical systems). It has been shown in [22, Sect. 4.13]
that the reachability problem for autonomous MPL systems with a single initial
condition is decidable – this result however does not hold for a general, uncount-
able set of initial conditions. Under the limiting assumption that the set of initial
conditions is expressed as a max-plus polyhedron [23,24], forward reachability
analysis can be performed over the max-plus algebra. In conclusion, to the best
of our knowledge, there exists no computational toolbox for general reachability
analysis of MPL models, nor is it possible to leverage software for related timed-
event graphs or timed Petri nets. As an alternative, reachability computation
for MPL models can be studied using the Multi-Parametric Toolbox (MPT) [25]
(cf. Section 4).
In this work, we extend the state-of-the-art results for forward reachabil-
ity analysis of MPL models by considering an arbitrary (possibly uncountable)
set of initial conditions, and present a new computational approach to forward
reachability analysis of MPL models. We first alternatively characterize MPL dy-
namics by Piece-wise Affine (PWA) models, and show that the dynamics can be
fully represented by Difference-Bound Matrices (DBM) [26, Sect. 4.1], which are
structures that are quite simple to manipulate. We further claim that DBM are
closed over PWA dynamics, which leads to being able to map DBM-sets through
MPL models. We then characterize and compute, given a set of initial states,
its “reach tube,” namely the sequential collection of the sets of reachable states
(aggregated step-wise as “reach sets”). With an emphasis on computational and
implementation aspects, we provide a quantification of the worst-case complexity
of the algorithms discussed throughout the work. Notice that although DBM are
a structure that has been used in reachability analysis of timed automata, this
does not imply that we can employ related techniques for reachability analysis
of MPL systems, since the two modeling frameworks are not equivalent.
While this new approach reduces reachability analysis of MPL models to a
computationally feasible task, the foundations of this contribution go beyond
mere manipulations of DBM: the technique is inspired by the recent work in
[27], which has developed an approach to the analysis of MPL models based
on finite-state abstractions. In particular, the procedure for forward reachabil-
ity computation on MPL models discussed in this work is implemented in the
VeriSiMPL (“very simple”) software toolbox [28], which is freely available. While
the general goals of VeriSiMPL go beyond the topics of this work and are thus
left to the interested reader, in this article we describe the details of the im-
plementation of the suite for reachability analysis within this toolbox over a
running example. With an additional numerical case study, we display the scala-
bility of the tool as a function of model dimension (the number of its continuous
variables): let us emphasize that related approaches for reachability analysis of
discrete-time dynamical systems based on finite abstractions do not reasonably
scale beyond models with a few variables [29], whereas our procedure comfort-
ably handles models with about twenty continuous variables. In this numerical
benchmark we have purposely generated the underlying dynamics randomly: this
allows deriving empirical outcomes that are general and not biased towards possi-
ble structural features of a particular model. Finally, we successfully benchmark
the computation of forward reachability sets against an alternative approach
based on the well-developed MPT software tool [25].

2 Models and Preliminaries


2.1 Max-Plus-Linear Systems
Define IRε , ε and e respectively as IR ∪ {ε}, −∞ and 0. For α, β ∈ IRε , introduce
the two operations α ⊕ β = max{α, β} and α ⊗ β = α + β, where the element
ε is considered to be absorbing w.r.t. ⊗ [15, Definition 3.4]. Given β ∈ IR, the
max-algebraic power of α ∈ IR is denoted by α^{⊗β} and corresponds to αβ in the
conventional algebra. The rules for the order of evaluation of the max-algebraic
operators correspond to those of conventional algebra: max-algebraic power has
the highest priority, and max-algebraic multiplication has a higher priority than
max-algebraic addition [15, Sect. 3.1].
The basic max-algebraic operations are extended to matrices as follows. If
A, B ∈ IR_ε^{m×n}, C ∈ IR_ε^{m×p}, D ∈ IR_ε^{p×n}, and α ∈ IR_ε, then
[α ⊗ A](i, j) = α ⊗ A(i, j); [A ⊕ B](i, j) = A(i, j) ⊕ B(i, j); and
[C ⊗ D](i, j) = ⊕_{k=1}^{p} C(i, k) ⊗ D(k, j); for i = 1, . . . , m and
j = 1, . . . , n. Notice the analogy between ⊕, ⊗ and +, × for matrix and vector
operations in conventional algebra. Given m ∈ IN, the m-th max-algebraic power
of A ∈ IR_ε^{n×n} is denoted by A^{⊗m} and corresponds to A ⊗ · · · ⊗ A (m times).
Notice that A^{⊗0} is an n-dimensional max-plus identity
matrix, i.e. the diagonal and nondiagonal elements are e and ε, respectively. In
this paper, the following notation is adopted for reasons of convenience. A vector
with each component that is equal to 0 (or −∞) is also denoted by e (resp., ε).
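
To make the notation concrete, the operations above can be mirrored in a few lines of Python (an illustration only, unrelated to the VeriSiMPL toolbox), with ε encoded as -inf; the matrix A below is the one used in the running example that follows.

NEG_INF = float('-inf')     # plays the role of epsilon

def oplus(A, B):
    # [A ⊕ B](i, j) = max(A(i, j), B(i, j))
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def otimes(C, D):
    # [C ⊗ D](i, j) = max over k of C(i, k) + D(k, j)
    return [[max(C[i][k] + D[k][j] for k in range(len(D)))
             for j in range(len(D[0]))] for i in range(len(C))]

A = [[2, 5], [3, 3]]                       # reappears in the running example below
E = [[0, NEG_INF], [NEG_INF, 0]]           # A^{⊗0}, the max-plus identity
print(otimes(E, A) == A)                   # True
print(oplus(A, E) == A)                    # True
print(otimes(A, [[3], [0]]))               # [[5], [6]], i.e. A ⊗ [3, 0]^T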

Furthermore, for practical reasons, the state space is taken to be IRn , which also
implies that the state matrix has to be row-finite (cf. Definition 1).
An autonomous (that is, deterministic) MPL model [15, Remark 2.75] is de-
fined as:
x(k) = A ⊗ x(k − 1) , (1)
where A ∈ IR_ε^{n×n}, x(k − 1) = [x_1(k − 1) . . . x_n(k − 1)]^T ∈ IR^n for k ∈ IN. The
independent variable k denotes an increasing discrete-event counter, whereas
the state variable x defines the (continuous) timing of the discrete events. Au-
tonomous MPL models are characterized by deterministic dynamics. Related to
the state matrix A is the notion of regular (or row-finite) matrix and that of
irreducibility.

Definition 1 (Regular (Row-Finite) Matrix, [16, Sect. 1.2]). A max-plus
matrix A ∈ IR_ε^{n×n} is called regular (or row-finite) if A contains at least one
element different from ε in each row.

A matrix A ∈ IR_ε^{n×n} is irreducible if the nondiagonal elements of
⊕_{k=1}^{n−1} A^{⊗k} are finite (not equal to ε). If A is irreducible, there exists
a unique max-plus eigenvalue λ ∈ IR [15, Th. 3.23] and the corresponding eigenspace
E(A) = {x ∈ IR^n : A ⊗ x = λ ⊗ x} [15, Sect. 3.7.2].
Example: Consider the following autonomous MPL model from [16, Sect. 0.1],
representing the scheduling of train departures from two connected stations i =
1, 2 (xi (k) denotes the time of the k-th departure for station i):
x(k) = [ 2 5 ; 3 3 ] ⊗ x(k − 1) , or equivalently ,     (2)

x_1(k) = max{2 + x_1(k − 1), 5 + x_2(k − 1)} ,
x_2(k) = max{3 + x_1(k − 1), 3 + x_2(k − 1)} .

Matrix A is a row-finite matrix and irreducible since A(1, 2) ≠ ε ≠ A(2, 1). □

Proposition 1 ([16, Th. 3.9]). Let A ∈ IR_ε^{n×n} be an irreducible matrix with
max-plus eigenvalue λ ∈ IR. There exist k_0, c ∈ IN such that A^{⊗(k+c)} = λ^{⊗c} ⊗ A^{⊗k},
for all k ≥ k_0. The smallest k_0 and c verifying the property are defined as
the length of the transient part and the cyclicity, respectively.

Proposition 1 allows us to establish the existence of a periodic behavior. Given an
initial condition x(0) ∈ IR^n, there exists a finite k_0(x(0)) such that
x(k + c) = λ^{⊗c} ⊗ x(k), for all k ≥ k_0(x(0)). Notice that we can seek the length of the
transient part k_0(x(0)) specifically for the initial condition x(0), which is in
general less conservative than the global k_0 = k_0(A), as in Proposition 1. Upper
bounds for the length of the transient part k_0 and for its computation have been
discussed in [30].
Example: In the example (2), from Proposition 1 we obtain a max-plus eigen-
value λ = 4, cyclicity c = 2, and a (global) length of the transient part k0 = 2.

The length of the transient part specifically for x(0) = [3, 0]T can be computed
observing the trajectory
[3, 0]^T, [5, 6]^T, [11, 9]^T, [14, 14]^T, [19, 17]^T, [22, 22]^T, [27, 25]^T, [30, 30]^T, [35, 33]^T, [38, 38]^T, . . .

The periodic behavior occurs (as expected) after 2 event steps, i.e. k_0([3, 0]^T) =
2, and shows a period equal to 2, namely x(4) = 4^{⊗2} ⊗ x(2) = 8 + x(2), and
similarly x(5) = 4^{⊗2} ⊗ x(3). Furthermore x(k + 2) = 4^{⊗2} ⊗ x(k) for k ≥ 2. □

2.2 Piece-Wise Affine Systems


This section discusses Piece-wise Affine (PWA) systems [31] generated by an
autonomous MPL model. In the following section, PWA systems will play an
important role in forward reachability analysis. PWA systems are characterized
by a cover of the state space, and by affine (linear plus constant) dynamics that
are active within each set of the cover.
Every autonomous MPL model characterized by a row-finite matrix A ∈ IR_ε^{n×n}
can be expressed as a PWA system in the event domain. The affine dynamics are
characterized, along with their corresponding region, by the coefficients
g = (g_1, . . . , g_n) ∈ {1, . . . , n}^n or, more precisely, as:

R_g = ⋂_{i=1}^{n} ⋂_{j=1}^{n} { x ∈ IR^n : A(i, g_i) + x_{g_i} ≥ A(i, j) + x_j } ;     (3)

x_i(k) = x_{g_i}(k − 1) + A(i, g_i) ,     1 ≤ i ≤ n .     (4)

Implementation: VeriSiMPL employs a backtracking algorithm to generate the
PWA system. Recall that we are looking for all coefficients g = (g_1, . . . , g_n) such
that Rg is not empty. In the backtracking approach, the partial coefficients are
(g1 , . . . , gk ) for k = 1, . . . , n and the corresponding region is

R_{(g_1,...,g_k)} = ⋂_{i=1}^{k} ⋂_{j=1}^{n} { x ∈ IR^n : A(i, g_i) + x_{g_i} ≥ A(i, j) + x_j } .

Notice that if the region associated with a partial coefficient (g1 , . . . , gk ) is empty,
then the regions associated with the coefficients (g1 , . . . , gn ) are also empty, for
all gk+1 , . . . , gn . The set of all coefficients can be represented as a potential
search tree. For a 2-dimensional MPL model, the potential search tree is given
in Fig. 1. The backtracking algorithm traverses the tree recursively, starting from
the root, in a depth-first order. At each node, the algorithm checks whether the
corresponding region is empty: if the region is empty, the whole sub-tree rooted
at the node is skipped (pruned).
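
The backtracking idea can be illustrated with a small Python sketch (not the VeriSiMPL implementation): each inequality A(i, g_i) + x_{g_i} ≥ A(i, j) + x_j is a difference constraint x_j − x_{g_i} ≤ A(i, g_i) − A(i, j), so emptiness of a (partial) region can be detected as a negative cycle among these constraints.

def feasible(constraints, n):
    # constraints: list of (j, i, c) encoding x_j - x_i <= c over variables 0..n-1;
    # Floyd-Warshall on the bound matrix; a negative diagonal entry means "empty"
    INF = float('inf')
    d = [[0 if p == q else INF for q in range(n)] for p in range(n)]
    for (j, i, c) in constraints:
        d[i][j] = min(d[i][j], c)
    for k in range(n):
        for p in range(n):
            for q in range(n):
                if d[p][k] + d[k][q] < d[p][q]:
                    d[p][q] = d[p][k] + d[k][q]
    return all(d[p][p] >= 0 for p in range(n))

def pwa_regions(A):
    # depth-first backtracking over the coefficients g, pruning empty regions
    n = len(A)
    found = []

    def extend(g, constraints):
        i = len(g)
        if i == n:
            found.append(tuple(x + 1 for x in g))   # report 1-based coefficients
            return
        for gi in range(n):
            new = [(j, gi, A[i][gi] - A[i][j]) for j in range(n) if j != gi]
            if feasible(constraints + new, n):      # empty => whole subtree pruned
                extend(g + [gi], constraints + new)

    extend([], [])
    return found

print(pwa_regions([[2, 5], [3, 3]]))   # [(1, 1), (2, 1), (2, 2)]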
The function maxpl2pwa is used to construct a PWA system from an autonomous
MPL model. The autonomous MPL model is characterized by a row-finite state
matrix (Ampl), whereas the PWA system is characterized by a collection of regions
(D) and a set of affine dynamics (A,B). The affine dynamics that are active in the
j-th region are characterized by the j-th column of both A and B. Each column of
A and the corresponding column of B contain the coefficients [g1 , . . . , gn ]T and the
constants [A(1, g1 ), . . . , A(n, gn )]T , respectively. The data structure of D will be
discussed in Section 2.3.
Considering the autonomous MPL example in (2), the following script gener-
ates the PWA system:

>> Ampl = [2 5;3 3], [A,B,D] = maxpl2pwa(Ampl)

It will become clear in Section 2.3 that the nonempty regions of the PWA system
produced by the script are: R(1,1) = {x ∈ IR2 : x1 − x2 ≥ 3}, R(2,1) = {x ∈ IR2 :
e ≤ x1 − x2 ≤ 3}, and R(2,2) = {x ∈ IR2 : x1 − x2 ≤ e}. The affine dynamics
corresponding to a region Rg are characterized by g, e.g. those for region R(2,1)
are given by x1 (k) = x2 (k − 1) + 5, x2 (k) = x1 (k − 1) + 3. 

IR2

R(1) R(2)

R(1,1) R(1,2) R(2,1) R(2,2)

Fig. 1. Potential search tree for a 2-dimensional MPL model

2.3 Difference-Bound Matrices


This section introduces the definition of a DBM [26, Sect. 4.1] and of its canonical-
form representation. DBM provide a simple and computationally advantageous
representation of the MPL dynamics, and will be further used in the next section
to represent the initial conditions and reach sets.
Definition 2 (Difference-Bound Matrix). A DBM is the intersection of finitely
many sets defined by x_j − x_i ≺_{i,j} α_{i,j}, where ≺_{i,j} ∈ {<, ≤}, α_{i,j} ∈ IR ∪ {+∞},
for 0 ≤ i ≠ j ≤ n, and the value of x_0 is always equal to 0.
The special variable x0 is used to represent bounds on a single variable: xi ≤ α
can be written as xi − x0 ≤ α. A “stripe” is defined as a DBM that does not con-
tain x0 . Definition 2 can be likewise given over the input and the corresponding
augmented space.
Implementation: VeriSiMPL represents a DBM in IRn as a 1×2 cell: the first
element is an (n + 1)-dimensional real-valued matrix representing the upper
bound α, and the second element is an (n + 1)-dimensional Boolean matrix
representing the value of ≺. More precisely, the (i+1, j+1)-th element represents
the upper bound and the strictness of the sign of xj − xi , for i = 0, . . . , n
and j = 0, . . . , n (cf. Definition 2). Furthermore, a collection of DBM is also
represented as a 1×2 cell, where the corresponding matrices are stacked along
the third dimension. 
Each DBM admits an equivalent and unique canonical-form representation,
which is a DBM with the tightest possible bounds [26, Sect. 4.1]. The Floyd-
Warshall algorithm can be used to obtain the canonical-form representation of
a DBM, with a complexity that is cubic w.r.t. its dimension. One advantage
of the canonical-form representation is that it is easy to compute orthogonal
projections w.r.t. a subset of its variables, which is simply performed by deleting
rows and columns corresponding to the complementary variables [26, Sect. 4.1].
Implementation: The Floyd-Warshall algorithm has been implemented in the
function floyd_warshall. Given a collection of DBM, this function generates
its canonical-form representation. The following MATLAB script computes the
canonical-form representation of D = {x ∈ IR4 : x1 − x4 ≤ −3, x2 − x1 ≤
−3, x2 − x4 ≤ −3, x3 − x1 ≤ 2}:

>> D = cell(1,2), ind = sub2ind([5,5],[4,1,4,1]+1,[1,2,2,3]+1)


>> D{1} = Inf(5), D{1}(1:6:25) = 0, D{1}(ind) = [-3,-3,-3,2]
>> D{2} = false(5), D{2}(1:6:25) = true, D{2}(ind) = true
>> Dcf = floyd_warshall(D)

Let us discuss the steps in the construction of the DBM D. We first initial-
ize D with IR4 as D = cell(1,2), D{1} = Inf(5), D{1}(1:6:25) = 0, D{2} =
false(5), D{2}(1:6:25) = true. The variable ind contains the location, in lin-
ear index format, of each inequality in the matrix. We define the upper bounds
and the strictness in D{1}(ind) = [-3,-3,-3,2] and D{2}(ind) = true, re-
spectively. The output is Dcf = {x ∈ IR4 : x1 − x4 ≤ −3, x2 − x1 ≤ −3, x2 − x4 ≤
−6, x3 − x1 ≤ 2, x3 − x4 ≤ −1}. Notice that the bounds of x2 − x4 and x3 − x4
are tighter. Moreover, the orthogonal projection of D (or Dcf) w.r.t. {x1 , x2 } is
{x ∈ IR2 : x2 − x1 ≤ −3}. 
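
The same tightening can be mimicked in plain Python on the matrix of upper bounds alone (strictness flags omitted); the numbers reproduce the example above, and a negative diagonal entry after the run would indicate an empty DBM.

INF = float('inf')

def canonical(m):
    # Floyd-Warshall on the bound matrix: m[i][j] is the upper bound of x_j - x_i,
    # with index 0 standing for the special variable x_0 = 0
    n = len(m)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if m[i][k] + m[k][j] < m[i][j]:
                    m[i][j] = m[i][k] + m[k][j]
    return m

# D = {x in IR^4 : x1 - x4 <= -3, x2 - x1 <= -3, x2 - x4 <= -3, x3 - x1 <= 2}
m = [[0 if i == j else INF for j in range(5)] for i in range(5)]
m[4][1] = -3    # x1 - x4 <= -3
m[1][2] = -3    # x2 - x1 <= -3
m[4][2] = -3    # x2 - x4 <= -3
m[1][3] = 2     # x3 - x1 <=  2
canonical(m)
print(m[4][2], m[4][3])    # -6 -1: the tightened bounds on x2 - x4 and x3 - x4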
The following result plays an important role in the computation of reachability
for MPL models.
Proposition 2 ([27, Th. 1]) The image of a DBM with respect to affine dy-
namics (in particular the PWA expression (4) generated by an MPL model) is
a DBM.

Implementation: The procedure to compute the image of a DBM in IR^n w.r.t.
the affine dynamics (4) involves: 1) computing the cross product of the DBM
and IRn ; then 2) determining the DBM generated by the expression of the affine
dynamics (each equation can be expressed as the difference between variables
at event k and k − 1); 3) intersecting the DBM obtained in steps 1 and 2;
4) generating the canonical-form representation; finally 5) projecting the DBM
over the variables at event k, i.e. {x1 (k), . . . , xn (k)}. The worst-case complexity
critically depends on computing the canonical-form representation (in the fourth
step) and is O(n3 ).

The procedure has been implemented in dbm_image. It computes the image
of a collection of DBM w.r.t. the corresponding affine dynamics. The following
example computes the image of D = {x ∈ IR2 : e ≤ x1 − x2 ≤ 3} w.r.t. x1 (k) =
x2 (k − 1) + 5, x2 (k) = x1 (k − 1) + 3:

>> D = cell(1,2), ind = sub2ind([3,3],[2,1]+1,[1,2]+1)


>> D{1} = Inf(3), D{1}(1:4:9) = 0, D{1}(ind) = [3,0]
>> D{2} = false(3), D{2}(1:4:9) = true, D{2}(ind) = true
>> A = [2;1], B = [5;3], Dim = dbm_image(A,B,D)

The image is Dim = {x ∈ IR2 : −1 ≤ x1 − x2 ≤ 2}, which is a DBM. 
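
The five steps can be condensed in Python by adding primed copies of the variables, encoding each affine equation x_i(k) = x_{g_i}(k−1) + c_i as the two difference constraints x_i(k) − x_{g_i}(k−1) ≤ c_i and x_{g_i}(k−1) − x_i(k) ≤ −c_i, tightening with Floyd-Warshall, and keeping only the primed block. This is an illustrative sketch (strictness flags omitted), not the dbm_image routine itself; it reproduces the example above.

INF = float('inf')

def canonical(m):
    n = len(m)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if m[i][k] + m[k][j] < m[i][j]:
                    m[i][j] = m[i][k] + m[k][j]
    return m

def dbm_image(bounds, g, c):
    # bounds[i][j]: upper bound of x_j - x_i over indices 0..n (x_0 = 0);
    # g, c: 1-based coefficients and constants of the affine dynamics
    n = len(bounds) - 1
    N = 2 * n + 1                        # variables x_0, x_1..x_n, x_1'..x_n'
    m = [[0 if i == j else INF for j in range(N)] for i in range(N)]
    for i in range(n + 1):
        for j in range(n + 1):
            m[i][j] = bounds[i][j]
    for i in range(1, n + 1):
        p, q = n + i, g[i - 1]           # encode x_i' - x_{g_i} = c_i
        m[q][p] = min(m[q][p], c[i - 1])
        m[p][q] = min(m[p][q], -c[i - 1])
    canonical(m)
    keep = [0] + list(range(n + 1, N))   # project on x_0 and the primed variables
    return [[m[i][j] for j in keep] for i in keep]

# D = {x in IR^2 : 0 <= x1 - x2 <= 3}, dynamics x1(k) = x2(k-1)+5, x2(k) = x1(k-1)+3
D = [[0, INF, INF], [INF, 0, 0], [INF, 3, 0]]
img = dbm_image(D, g=[2, 1], c=[5, 3])
print(img[2][1], img[1][2])   # 2 and 1, i.e. -1 <= x1(k) - x2(k) <= 2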


The result in Proposition 2 allows computing the image of a DBM in IRn
w.r.t. the MPL model characterized by a row-finite matrix A ∈ IR_ε^{n×n}. In order
to do so, we leverage the corresponding PWA system dynamics and separate
the procedure in the following steps: 1) intersecting the DBM with each region
of the PWA system; then 2) computing the image of nonempty intersections
according to the corresponding affine dynamics (cf. Theorem 2). The worst-case
complexity depends on the last step and is O(|R(A)| · n3 ), where |R(A)| is the
number of regions in the PWA system generated by matrix A.
Proposition 2 can be extended as follows.
Corollary 1 The image of a union of finitely many DBM w.r.t. the PWA sys-
tem generated by an MPL model is a union of finitely many DBM.

3 Forward Reachability Analysis


The goal of forward reachability analysis is to quantify the set of possible states
that can be attained under the model dynamics, from a set of initial conditions.
Two main notions can be defined.
Definition 3 (Reach Set) Given an MPL model and a nonempty set of initial
positions X0 ⊆ IRn , the reach set XN at the event step N > 0 is the set of all
states {x(N ) : x(0) ∈ X0 } obtained via the MPL dynamics.

Definition 4 (Reach Tube) Given an MPL model and a nonempty set of ini-
tial positions X0 ⊆ IRn , the reach tube is defined by the set-valued function
k ↦ Xk for any given k > 0 where Xk is defined.
Unless otherwise stated, in this work we focus on finite-horizon reachability:
in other words, we compute the reach set for a finite index N (cf. Definition 3)
and the reach tube for k = 1, . . . , N , where N < ∞ (cf. Definition 4). While the
reach set can be obtained as a by-product of the (sequential) computations used
to obtain the reach tube, we will argue that it can be as well calculated by a
tailored procedure (one-shot).
In the computation of the quantities defined above, the set of initial conditions
X0 ⊆ IRn will be assumed to be a union of finitely many DBM. In the more

general case of arbitrary sets, these will be over- or under-approximated by


DBM. As it will become clear later, this will in general shape the reach set Xk
at event step k > 0 as a union of finitely many DBM. For later use, we assume
that Xk is a union of |Xk | DBM and in particular that the set of initial conditions
X0 is a union of |X0 | DBM.

3.1 Sequential Computation of the Reach Tube


This approach uses the one-step dynamics for autonomous MPL systems itera-
tively. In each step, we leverage the DBM representation and the PWA dynamics
to compute the reach set.
Given a set of initial conditions X0 , the reach set Xk is recursively defined as
the image of Xk−1 w.r.t. the MPL dynamics as

Xk = I(Xk−1 ) = {A ⊗ x : x ∈ Xk−1 } .

In the dynamical systems and automata literature the mapping I is also known
as Post [32, Definition 2.3]. Under the assumption that X0 is a union of finitely
many DBM, by Corollary 1 it can be shown by induction that the reach set Xk
is also a union of finitely many DBM, for each k ∈ IN.
Implementation: Given a state matrix A and a set of initial conditions X0 ,
the general procedure for obtaining the reach tube works as follows: first, we
construct the PWA system generated by A; then, for each k = 1, . . . , N , the
reach set Xk is obtained by computing I(Xk−1 ).
The worst-case complexity of the procedure (excluding that related to the
generation of PWA system) can be assessed as follows. The complexity of com-
puting I(Xk−1) is O(|Xk−1| · |R(A)| · n^3), for k = 1, . . . , N. This results in an
overall complexity of O(|R(A)| · n^3 · Σ_{k=0}^{N−1} |Xk|). Notice that quantifying explic-
itly the cardinality |Xk | of the DBM union at each step k is not possible in
general (cf. Benchmark in Section 4).
The procedure has been implemented in maxpl_reachtube_for. The inputs
are the PWA system (A, B, D), the initial states (D0), and the event horizon
(N). The set of initial states D0 is a collection of finitely many DBM and the
event horizon N is a natural number. The output is a 1×(N + 1) cell. For each
1 ≤ i ≤ N + 1, the i-th element contains the reach set Xi−1 , which is a collection
of finitely many DBM (cf. Section 2.3).
Let us consider the unit square as the set of initial conditions X0 = {x ∈ IR2 :
0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1}. The following MATLAB script computes the reach
tube for two steps:

>> Ampl = [2 5;3 3], [A,B,D] = maxpl2pwa(Ampl), N = 2


>> D0 = cell(1,2), ind = sub2ind([3,3],[1,2,0,0]+1,[0,0,1,2]+1)
>> D0{1} = Inf(3), D0{1}(1:4:9) = 0, D0{1}(ind) = [0,0,1,1]
>> D0{2} = false(3), D0{2}(1:4:9) = true, D0{2}(ind) = true
>> D0N = maxpl_reachtube_for(A,B,D,D0,N)

The reach sets are DBM given by X1 = {x ∈ IR2 : 1 ≤ x1 − x2 ≤ 2, 5 ≤ x1 ≤ 6,
3 ≤ x2 ≤ 4}, X2 = {x ∈ IR2 : 0 ≤ x1 − x2 ≤ 1, 8 ≤ x1 ≤ 9, 8 ≤ x2 ≤ 9}, and
are shown in Fig. 2 (left). 

Fig. 2. (Left plot) Reach tube for the autonomous MPL model over 2 event steps,
shown over the PWA regions R(1,1), R(2,1), R(2,2) in the (x1, x2) plane. (Right
plot) Time needed to generate the reach tube of autonomous models for different
model sizes (n = 4, 7, 10, 13) and event horizons: running time (in seconds, log
scale) against the event horizon (N), cf. Section 4.
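
The reach sets reported above can be sanity-checked numerically by sampling the unit square and applying the MPL dynamics twice; this is only a containment test written for this example in Python, not the exact DBM computation performed by maxpl_reachtube_for.

import random

def mpl_step(A, x):
    return [max(A[i][j] + x[j] for j in range(len(x))) for i in range(len(A))]

def in_X1(x):
    return 1 <= x[0] - x[1] <= 2 and 5 <= x[0] <= 6 and 3 <= x[1] <= 4

def in_X2(x):
    return 0 <= x[0] - x[1] <= 1 and 8 <= x[0] <= 9 and 8 <= x[1] <= 9

A = [[2, 5], [3, 3]]
random.seed(0)
for _ in range(10000):
    x0 = [random.uniform(0, 1), random.uniform(0, 1)]
    x1 = mpl_step(A, x0)
    x2 = mpl_step(A, x1)
    assert in_X1(x1) and in_X2(x2)
print("all sampled trajectories stay inside the reported reach sets")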

Recall that, given a set of initial conditions X0 and a finite event horizon
N ∈ IN, in order to compute XN , we have to calculate X1 , . . . , XN −1 . If the
autonomous MPL system is irreducible, we can exploit the periodic behavior
(cf. Proposition 1) to simplify the computation.
Proposition 3 Let A ∈ IRn×n ε be an irreducible matrix with max-plus eigen-
value λ ∈ IR and cyclicity c ∈ IN. There exists a k0 (X0 ) = maxx∈X0 k0 (x), such
that Xk+c = λ⊗c ⊗ Xk , for all k ≥ k0 (X0 ).
Proof. Recall that for each x(0) ∈ IRn , there exists a k0 (x(0)) such that x(k +
c) = λ⊗c ⊗ x(k), for all k ≥ k0 (x(0)). Since k0 (X0 ) = maxx∈X0 k0 (x), for each
x(0) ∈ X0 , we have x(k+c) = λ⊗c ⊗x(k), for k ≥ k0 (X0 ). Recall from Definition 3
that Xk = {x(k) : x(0) ∈ X0 }, for all k ∈ IN. 

Thus if the autonomous MPL system is irreducible, we only need to compute
X1, . . . , X_{k0(X0)∧N} in order to calculate XN, for any N ∈ IN, where k0(X0) ∧ N =
min{k0 (X0 ), N }.
If the initial condition X0 is a stripe, the infinite-horizon reach tube can be
computed in a finite time, as stated in the following theorem.
Theorem 1. Let A ∈ IR_ε^{n×n} be an irreducible matrix with cyclicity c ∈ IN. If
X0 is a union of finitely many stripes, then ⋃_{i=0}^{k} Xi = ⋃_{i=0}^{k0(X0)+c−1} Xi, for all
k ≥ k0(X0) + c − 1.
Proof. First we will show that Xk is a union of finitely many stripes for all k ∈ IN.
By using the procedure to compute the image of a DBM w.r.t. an affine dynamics,
it can be shown that the image of a stripe w.r.t. affine dynamics (generated by
an MPL model) is a stripe. Following the arguments after Theorem 2, it can be
shown that the image of a union of finitely many stripes w.r.t. the PWA system
generated by an MPL model is a union of finitely many stripes.
Since a stripe is a collection of equivalence classes [16, Sect. 1.4], then X0 ⊗
α = X0 , for each α ∈ IR. From Proposition 3 and the previous observations,
Xk+c = Xk for all k ≥ k0 (X0 ). 

Example: The set of initial conditions can also be described as a stripe, for
example X0 = {x ∈ IR2 : −1 ≤ x1 − x2 ≤ 1}. The reach sets are stripes given
by X1 = {x ∈ IR2 : 1 ≤ x1 − x2 ≤ 2} and X2 = {x ∈ IR2 : 0 ≤ x1 − x2 ≤ 1}.
Additionally, we obtain X1 = X_{2k−1} and X2 = X_{2k}, for all k ∈ IN. It follows
that the infinite-horizon reach tube is ⋃_{k=0}^{+∞} Xk = ⋃_{k=0}^{2} Xk = {x ∈ IR2 : −1 ≤
x1 − x2 ≤ 2}. □
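
For this 2-dimensional example a stripe is fully described by the interval of d = x1 − x2, so the periodic behaviour of the reach sets can be reproduced with an ad hoc Python sketch that pushes this interval through the three PWA regions; the per-region update rules below are derived from (3)–(4) for A = [2 5; 3 3] and are specific to this example.

def step(lo, hi):
    # propagate the interval lo <= x1 - x2 <= hi through the PWA regions
    pieces = []
    if hi >= 3:                       # R(1,1): x1-x2 >= 3, x1' = x1+2, x2' = x1+3
        pieces.append((-1, -1))       #   => x1' - x2' = -1
    l, h = max(lo, 0), min(hi, 3)     # R(2,1): 0 <= x1-x2 <= 3, x1' = x2+5, x2' = x1+3
    if l <= h:
        pieces.append((2 - h, 2 - l)) #   => x1' - x2' = 2 - (x1 - x2)
    if lo <= 0:                       # R(2,2): x1-x2 <= 0, x1' = x2+5, x2' = x2+3
        pieces.append((2, 2))         #   => x1' - x2' = 2
    # the pieces happen to form a single interval for this example
    return (min(p[0] for p in pieces), max(p[1] for p in pieces))

X = (-1, 1)                           # X0 = {-1 <= x1 - x2 <= 1}
for k in range(1, 6):
    X = step(*X)
    print(k, X)                       # (1, 2), (0, 1), (1, 2), (0, 1), (1, 2)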

3.2 One-Shot Computation of the Reach Set


In this section we discuss a procedure for computing the reach set for a spe-
cific event step N using a tailored (one-shot) procedure. Given a set of initial
conditions X0 , we compute the reach set at event step N by using

XN = (I ◦ · · · ◦ I)(X0) = I^N(X0) = {A^{⊗N} ⊗ x : x ∈ X0} .

Using Corollary 1, it can be seen that the reach set XN is a union of finitely
many DBM.
Implementation: Given a state matrix A, a set of initial conditions X0 and a
finite index N , the general procedure for obtaining XN is: 1) computing A⊗N ;
then 2) constructing the PWA system generated by it; finally 3) computing the
image of X0 w.r.t. the obtained PWA system.
Let us quantify the total complexity of the first and third steps in the pro-
cedure. The complexity of computing N -th max-algebraic power of an n × n
matrix (cf. Section 2.1) is O(log2 (N ) · n3 ). Excluding the generation of the
PWA system – step 2), see above – the overall complexity of the procedure is
O(log2 (N ) · n3 + |X0 | · |R(A⊗N )| · n3 ).
The procedure has been implemented in maxpl_reachset_for. The inputs are
the state matrix (Ampl), the initial states (D0), and the event horizon (N). The set
of initial states D0 is a collection of finitely many DBM (cf. Section 2.3) and the
event horizon N is a natural number. The output is a 1×2 cell: the first element
is the set of initial states and the second one is the reach set at event step N.
Recall that both the initial states and the reach set are a collection of finitely
many DBM.
Let us consider the unit square as the set of initial conditions X0 = {x ∈ IR2 :
0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1}. The following MATLAB script computes the reach set
for two steps:

>> Ampl = [2 5;3 3], N = 2


>> D0 = cell(1,2), ind = sub2ind([3,3],[1,2,0,0]+1,[0,0,1,2]+1)
>> D0{1} = Inf(3), D0{1}(1:4:9) = 0, D0{1}(ind) = [0,0,1,1]
>> D0{2} = false(3), D0{2}(1:4:9) = true, D0{2}(ind) = true
>> D0N = maxpl_reachset_for(Ampl,D0,N)

As expected, the reach set is a DBM given by X2 = {x ∈ IR2 : 0 ≤ x1 − x2 ≤ 1,
8 ≤ x1 ≤ 9, 8 ≤ x2 ≤ 9}. □
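
For N = 2 the one-shot route can be mirrored numerically: compute A^{⊗2} and check sampled points of the unit square against the reported X2 (an illustrative Python check, not the DBM-based maxpl_reachset_for).

import random

def otimes(C, D):
    return [[max(C[i][k] + D[k][j] for k in range(len(D)))
             for j in range(len(D[0]))] for i in range(len(C))]

A = [[2, 5], [3, 3]]
A2 = otimes(A, A)
print(A2)                              # [[8, 8], [6, 8]], i.e. A^{⊗2}

random.seed(0)
for _ in range(10000):
    x0 = [[random.uniform(0, 1)], [random.uniform(0, 1)]]
    x2 = [row[0] for row in otimes(A2, x0)]
    assert 0 <= x2[0] - x2[1] <= 1 and 8 <= x2[0] <= 9 and 8 <= x2[1] <= 9
print("A^{⊗2} maps the unit square into the reported X2")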
Intuitively, the sequential approach involves step-wise computations, and yields
correspondingly more information than the one-shot procedure as an output. The
complexities of both the sequential and one-shot computations depend on the
number of PWA regions corresponding to, respectively, the models related to
matrix A and A⊗N . Thus, in order to compare the performance of both meth-
ods, we need to assess the cardinality of the PWA regions generated by A⊗k ,
for different values of k: from our experiments, it seems that the cardinality of
PWA regions grows if k increases, hence the one-shot approach may not always
result in drastic computational advantages. More work is needed to conclusively
assess this feature.

4 Numerical Benchmark
4.1 Implementation and Setup
The technique for forward reachability computation on MPL models discussed
in this work is implemented in the VeriSiMPL (“very simple”) version 1.3, which
is freely available at [28]. VeriSiMPL is a software tool originally developed to
obtain finite abstractions of Max-Plus-Linear (MPL) models, which enables their
verification against temporal specifications via a model checker. The algorithms
have been implemented in MATLAB 7.13 (R2011b) and the experiments have
been run on a 12-core Intel Xeon 3.47 GHz PC with 24 GB of memory.
In order to test the practical efficiency of the proposed algorithms, we compute
the runtime needed to determine the reach tube of an autonomous MPL system,
for event horizon N = 10 and an increasing dimension n of the MPL model. We
also keep track of the number of regions of the PWA system generated from the
MPL model. For any given n, we generate matrices A with 2 finite elements (in
a max-plus sense) that are randomly placed in each row. The finite elements are
randomly generated integers between 1 and 100. The set of initial conditions is
selected as the unit hypercube, i.e. {x ∈ IRn : 0 ≤ x1 ≤ 1, . . . , 0 ≤ xn ≤ 1}.
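
The randomisation just described is straightforward to reproduce; the helper below is a hypothetical Python sketch, not part of VeriSiMPL.

import random

NEG_INF = float('-inf')    # epsilon

def random_mpl_matrix(n, finite_per_row=2, low=1, high=100):
    # each row gets `finite_per_row` finite entries at random positions,
    # drawn uniformly from {low, ..., high}; all other entries are epsilon
    A = [[NEG_INF] * n for _ in range(n)]
    for i in range(n):
        for j in random.sample(range(n), finite_per_row):
            A[i][j] = random.randint(low, high)
    return A

random.seed(0)
print(random_mpl_matrix(4))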
Over 10 independent experiments, Table 1 reports the average time needed
to generate the PWA system and to compute the reach tube, as well as the
corresponding number of regions. As confirmed by Table 1, the time needed to
compute the reach tube is monotonically increasing w.r.t. the dimension of the
MPL model (as we commented previously this is not the case for the cardinality
of reach sets, which hinges on the structure of the MPL models). For a fixed
model size and dynamics, the growth of the computational time for forward
reachability is linear (in the plot, logarithmic over logarithmic time scale) with
the event horizon as shown in Fig. 2 (right). We have also performed reachability
computations for the case of the set of initial conditions described as a stripe,
which has yielded results that are analogue to those in Table 1.
Table 1. Numerical benchmark, autonomous MPL model: computation of the reach
tube (average over 10 experiments)

size of     time for generation   number of regions   time for generation   number of regions
MPL model   of PWA system         of PWA system       of reach tube         of X10
3           0.09 [sec]            5.80                0.09 [sec]            4.20
5           0.14 [sec]            22.90               0.20 [sec]            6.10
7           0.52 [sec]            89.60               0.72 [sec]            13.40
9           2.24 [sec]            340.80              2.25 [sec]            4.10
11          10.42 [sec]           1.44 ×10^3          15.49 [sec]           3.20
13          46.70 [sec]           5.06 ×10^3          5.27 [min]            16.90
15          3.48 [min]            2.01 ×10^4          25.76 [min]           10.10
17          15.45 [min]           9.07 ×10^4          3.17 [hr]             68.70
19          67.07 [min]           3.48 ×10^5          7.13 [hr]             5.00

4.2 Comparison with Alternative Reachability Computations


To the best of the authors' knowledge, there exist no approaches for general for-
ward reachability computation over MPL models. Forward reachability can be
alternatively assessed only leveraging the PWA characterization of the model dy-
namics (cf. Section 2). Forward reachability analysis of PWA models can be best
computed by the Multi-Parametric Toolbox (MPT, version 2.0) [25]. However,
the toolbox has some implementation requirements: the state space matrix A has
to be invertible – this is in general not the case for MPL models; the reach sets
Xk have to be bounded – in our case the reach sets can be unbounded, partic-
ularly when expressed as stripes; further, MPT deals only with full-dimensional
polytopes – whereas the reach sets of interest may not necessarily be so; finally,
MPT handles convex regions and over-approximates the reach sets Xk when
necessary – our approach computes instead the reach sets exactly.
For the sake of comparison, we have constructed randomized examples (with
invertible dynamics) and run both procedures in parallel, with focus on compu-
tation time rather than the actual obtained reach tubes. Randomly generating
the underlying dynamics allows deriving general results that are not biased to-
wards possible structural features of the model. MPT can handle in a reason-
able time frame models with dimension up to 10: in this instance (as well as
lower-dimensional ones) we have obtained that our approach performs better
(cf. Table 2). Notice that this is despite MPT being implemented in the C lan-
guage, whereas VeriSiMPL runs in MATLAB: this leaves quite some margin of
computational improvement to our techniques.

Table 2. Time for generation of the reach tube of 10-dimensional autonomous MPL
model for different event horizons (average over 10 experiments)

event horizon   20           40           60           80           100
VeriSiMPL       11.02 [sec]  17.94 [sec]  37.40 [sec]  51.21 [sec]  64.59 [sec]
MPT             47.61 [min]  1.19 [hr]    2.32 [hr]    3.03 [hr]    3.73 [hr]

5 Conclusions and Future Work


This work has discussed the computation of forward reachability analysis of Max-
Plus-Linear models by fast manipulations of DBM through PWA dynamics.
Computationally, we are interested in further optimizing the software for
reachability computations, by leveraging symbolic techniques based on the use of
decision diagrams and by developing an implementation in the C language. We
are presently exploring a comparison of the proposed approach with Flow* [14].
We plan to investigate backward reachability, as well as reachability of non-
autonomous models, which embed non-determinism in the form of a control
input, by tailoring or extending the techniques discussed in this work.

References

1. Alur, R., Dill, D.: A theory of timed automata. Theoretical Computer Sci-
ence 126(2), 183–235 (1994)
2. Behrmann, G., David, A., Larsen, K.G.: A tutorial on uppaal. In: Bernardo, M.,
Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Hei-
delberg (2004)
3. Kloetzer, M., Mahulea, C., Belta, C., Silva, M.: An automated framework for formal
verification of timed continuous Petri nets. IEEE Trans. Ind. Informat. 6(3), 460–
471 (2010)
4. Henzinger, T.A., Rusu, V.: Reachability verification for hybrid automata. In:
Henzinger, T.A., Sastry, S.S. (eds.) HSCC 1998. LNCS, vol. 1386, pp. 190–204.
Springer, Heidelberg (1998)
5. Dang, T., Maler, O.: Reachability analysis via face lifting. In: Henzinger, T.A.,
Sastry, S.S. (eds.) HSCC 1998. LNCS, vol. 1386, pp. 96–109. Springer, Heidelberg
(1998)
6. Chutinan, A., Krogh, B.: Computational techniques for hybrid system verification.
IEEE Trans. Autom. Control 48(1), 64–75 (2003)
7. CheckMate, http://users.ece.cmu.edu/~krogh/checkmate/
8. Mitchell, I., Bayen, A., Tomlin, C.: A time-dependent Hamilton-Jacobi formula-
tion of reachable sets for continuous dynamic games. IEEE Trans. Autom. Con-
trol 50(7), 947–957 (2005)
9. Mitchell, I.M.: Comparing forward and backward reachability as tools for safety
analysis. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007. LNCS,
vol. 4416, pp. 428–443. Springer, Heidelberg (2007)
10. Kurzhanskiy, A., Varaiya, P.: Ellipsoidal techniques for reachability analysis of
discrete-time linear systems. IEEE Trans. Autom. Control 52(1), 26–38 (2007)

11. Kurzhanskiy, A., Varaiya, P.: Ellipsoidal toolbox. Technical report, EECS Depart-
ment, University of California, Berkeley (May 2006)
12. Asarin, E., Schneider, G., Yovine, S.: Algorithmic analysis of polygonal hybrid sys-
tems, part i: Reachability. Theoretical Computer Science 379(12), 231–265 (2007)
13. Le Guernic, C., Girard, A.: Reachability analysis of hybrid systems using support
functions. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp.
540–554. Springer, Heidelberg (2009)
14. Chen, X., Ábrahám, E., Sankaranarayanan, S.: Flow*: An Analyzer for Non-linear
Hybrid Systems. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044,
pp. 258–263. Springer, Heidelberg (2013)
15. Baccelli, F., Cohen, G., Olsder, G., Quadrat, J.P.: Synchronization and Linearity,
An Algebra for Discrete Event Systems. John Wiley and Sons (1992)
16. Heidergott, B., Olsder, G., van der Woude, J.: Max Plus at Work–Modeling and
Analysis of Synchronized Systems: A Course on Max-Plus Algebra and Its Appli-
cations. Princeton University Press (2006)
17. Roset, B., Nijmeijer, H., van Eekelen, J., Lefeber, E., Rooda, J.: Event driven
manufacturing systems as time domain control systems. In: Proc. 44th IEEE Conf.
Decision and Control and European Control Conf. (CDC-ECC 2005), pp. 446–451
(December 2005)
18. Merlin, P., Farber, D.J.: Recoverability of communication protocols–implications
of a theoretical study. IEEE Trans. Commun. 24(19), 1036–1043 (1976)
19. Plus, M.: Max-plus toolbox of Scilab (1998),
http://www.cmap.polytechnique.fr/~gaubert/MaxplusToolbox.html
20. Gazarik, M., Kamen, E.: Reachability and observability of linear systems over
max-plus. Kybernetika 35(1), 2–12 (1999)
21. Gaubert, S., Katz, R.: Reachability and invariance problems in max-plus algebra.
In: Benvenuti, L., De Santis, A., Farina, L. (eds.) Positive Systems. LNCIS, vol. 294,
pp. 15–22. Springer, Heidelberg (2003)
22. Gaubert, S., Katz, R.: Reachability problems for products of matrices in semirings.
International Journal of Algebra and Computation 16(3), 603–627 (2006)
23. Gaubert, S., Katz, R.: The Minkowski theorem for max-plus convex sets. Linear
Algebra and its Applications 421(2-3), 356–369 (2007)
24. Zimmermann, K.: A general separation theorem in extremal algebras. Ekonom.-
Mat. Obzor 13(2), 179–201 (1977)
25. Kvasnica, M., Grieder, P., Baotić, M.: Multi-parametric toolbox, MPT (2004)
26. Dill, D.: Timing assumptions and verification of finite-state concurrent systems. In:
Sifakis, J. (ed.) CAV 1989. LNCS, vol. 407, pp. 197–212. Springer, Heidelberg (1990)
27. Adzkiya, D., De Schutter, B., Abate, A.: Finite abstractions of max-plus-linear
systems. IEEE Trans. Autom. Control 58(12), 3039–3053 (2013)
28. Adzkiya, D., Abate, A.: VeriSiMPL: Verification via biSimulations of MPL models.
In: Joshi, K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) QEST 2013. LNCS,
vol. 8054, pp. 274–277. Springer, Heidelberg (2013),
http://sourceforge.net/projects/verisimpl/
29. Yordanov, B., Belta, C.: Formal analysis of discrete-time piecewise affine systems.
IEEE Trans. Autom. Control 55(12), 2834–2840 (2010)
30. Charron-Bost, B., Függer, M., Nowak, T.: Transience bounds for distributed algo-
rithms. In: Braberman, V., Fribourg, L. (eds.) FORMATS 2013. LNCS, vol. 8053,
pp. 77–90. Springer, Heidelberg (2013)
31. Sontag, E.D.: Nonlinear regulation: The piecewise-linear approach. IEEE Trans.
Autom. Control 26(2), 346–358 (1981)
32. Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press (2008)
Compositional Invariant Generation
for Timed Systems

Lacramioara Aştefănoaei, Souha Ben Rayana, Saddek Bensalem, Marius Bozga, and Jacques Combaz

UJF-Grenoble, CNRS VERIMAG UMR 5104, Grenoble F-38041, France

Abstract. In this paper we address the state space explosion problem
inherent to model-checking timed systems with a large number of compo-
nents. The main challenge is to obtain pertinent global timing constraints
from the timings in the components alone. To this end, we make use of
auxiliary clocks to automatically generate new invariants which capture
the constraints induced by the synchronisations between components.
The method has been implemented as an extension of the D-Finder tool
and successfully experimented on several benchmarks.

1 Introduction
Compositional methods in verification have been developed to cope with state
space explosion. Generally based on divide et impera principles, these methods
attempt to break monolithic verification problems into smaller sub-problems by
exploiting either the structure of the system or the property or both. Composi-
tional reasoning can be used in different manners e.g., for deductive verification,
assume-guarantee, contract-based verification, compositional generation, etc.
The development of compositional verification for timed systems remains how-
ever challenging. State-of-the-art tools [7,13,25,18] for the verification of such
systems are mostly based on symbolic state space exploration, using efficient
data structures and particularly involved exploration techniques. In the timed
context, the use of compositional reasoning is inherently difficult due to the
synchronous model of time. Time progress is an action that synchronises contin-
uously all the components of the system. Getting rid of the time synchronisation
is necessary for analysing independently different parts of the system (or of the
property) but becomes problematic when attempting to re-compose the partial
verification results. Nonetheless, compositional verification is actively investi-
gated and several approaches have been recently developed and employed in
timed interfaces [2] and contract-based assume-guarantee reasoning [15,22].
In this paper, we propose a different approach for exploiting compositionality
for analysis of timed systems using invariants. In contrast to exact reachability
analysis, invariants are symbolic approximations of the set of reachable states of
the system. We show that rather precise invariants can be computed composi-
tionally, from the separate analysis of the components in the system and from

Work partially supported by the European Integrated Projects 257414 ASCENS,
288175 CERTAINTY, and STREP 318772 D-MILS.


their composition glue. This method is proved to be sound for the verification
of safety properties. However, it is not complete.
The starting point is the verification method of [9], summarised in Figure 1.
The method exploits compositionality as explained next. Consider a system con-
sisting of components Bi interacting by means of a set γ of multi-party inter-
actions, and let Ψ be a system property of interest. Assume that all Bi as well
as the composition through γ can be independently characterised by means
of component invariants CI (Bi ), respectively interaction invariants II (γ). The
connection between the invariants and the system property Ψ can be intuitively
understood as follows: if Ψ can be proved to be a logical consequence of the con-
junction of components and interaction invariants, then Ψ holds for the system.
In the rule (VR) the symbol “⊢” is used to underline that the logical implication
can be effectively proved (for instance with an SMT solver) and the notation
“B ⊨ □Ψ” is to be read as “Ψ holds in every reachable state of B”.

    ⊢ ⋀_i CI(Bi) ∧ II(γ) → Ψ
    --------------------------  (VR)
        ‖_γ Bi ⊨ □Ψ

Fig. 1. Compositional Verification
The verification rule (VR) in [9] has been developed for untimed systems. Its
direct application to timed systems may be weak as interaction invariants do not
capture global timings of interactions between components. The key contribution
of this paper is to improve the invariant generation method so to better track
such global timings by means of auxiliary history clocks for actions and inter-
actions. At component level, history clocks expose the local timing constraints
relevant to the interactions of the participating components. At composition
level, extra constraints on history clocks are enforced due to the simultaneity of
interactions and to the synchrony of time progress.
As an illustration, let us consider as running example the timed system in
Figure 2 which depicts a “controller” component serving n “worker” components,
one at a time. The interactions between the controller and the workers are defined
by the set of synchronisations {(a | bi ), (c | di ) | i ≤ n}. Periodically, after every
4 units of time, the controller synchronises its action a with the action bi of any
worker i whose clock shows at least 2 units of time. Initially, such a worker
exists because the controller waits for 4n units of time before interacting with
workers. The cycle repeats forever because there is always a worker “willing” to
do b, that is, the system is deadlock-free. Proving deadlock-freedom of the system
requires to establish that when the controller is at location lc1 there is at least
one worker such that yi − x ≥ 4n − 4.

Fig. 2. A Timed System
tuitively, this is because the proposed invariants are too weak to infer such cross
constraints relating the clocks of the controller and the clocks of the workers:
interaction invariants II(γ) relate only locations of components and thus at most
eliminate unreachable configurations like (lc1, . . . , l2i, . . . ), while the component
invariants can only state local conditions on clocks, such as x ≤ 4 at lc1. Using
history clocks allows us to recover additional constraints. For example, after the
first execution of the loop, whenever the controller is at location lc1, there exists
a worker i whose clock has the same value as the controller's. Similarly, history
clocks allow us to infer that different (a | bi) interactions are separated by at
least 4 time units. Altogether, these constraints are sufficient to prove the
deadlock-freedom property.

Related Work. Automatic generation of invariants for concurrent systems is a


long-time studied topic. Yet, to our knowledge, specific extensions or applications
for timed systems are rather limited. As an exception, the papers [5,17] propose
a monolithic, non-compositional method for finding invariants in the case of
systems represented as a single timed automaton.
Compositional verification for timed systems has been mainly considered in
the context of timed interface theories [2] and contract-based assume guarantee
reasoning [15,22]. These methods usually rely upon choosing a “good” decom-
position structure and require individual abstractions for components to be de-
terministic timed I/O automata. Finding the abstractions is in general difficult;
however, their construction can be automated in some cases by using learning
techniques [22]. In contrast to the above, we propose a fully automated method
generating, in a compositional manner, an invariant approximating the reachable
states of a timed system. Abstractions serve also for compositional minimisation,
for instance [10] minimises by constructing timed automata quotients with re-
spect to simulation; these quotients are in turn composed for model-checking.
Our approach is orthogonal in that we do not compose at all. Compositional de-
ductive verification as in [16] is also orthogonal to our work in that, by choosing
a particular class of local invariants to work with, we need not focus on elaborate
proof systems but reason at a level closer to intuition.
The use of additional clocks has been considered, for instance, in [8]. There,
extra reference clocks are added to components to faithfully implement a partial
order reduction strategy for symbolic state space exploration. Time is allowed
to progress desynchronised for individual components and re-synchronised only
when needed, i.e., for direct interaction within components. Clearly, the history
clocks in our work behave in a similar way, however, our use of clocks is as a
helper construction in the generation of invariants and we are totally avoiding
state space exploration. Finally, another successful application of extra clocks
has been provided in [23] for timing analysis of asynchronous circuits. There,
specific history clocks are reset on input signals and used to provide a new time
basis for the construction of an abstract model of output signals of the circuit.

Organisation of the paper. Section 2 recalls the needed definitions for modelling
timed systems and their properties. Section 3 presents our method for composi-
tional generation of invariants. Section 4 describes the prototype implementing
the method and some case studies we experimented with. Section 5 concludes.

2 Timed Systems and Properties

In the framework of the present paper, components are timed automata and
systems are compositions of timed automata with respect to multi-party inter-
actions. The timed automata we use are essentially the ones from [3], however,
slightly adapted to embrace a uniform notation throughout the paper.

Definition 1 (Syntax of a Component). A component is a timed automaton


(L, l0 , A, T, X , tpc) where L is a finite set of locations, l0 is an initial location,
A a finite set of actions, T ⊆ L × (A × C × 2X ) × L is a set of edges labeled with
an action, a guard, and a set of clocks to be reset, X is a finite set of clocks1 ,
and tpc : L → C assigns a time progress condition2 to each location. C is the set
of clock constraints. A clock constraint is defined by the grammar:

C ::= true | false | x#ct | x − y#ct | C ∧ C

with x, y ∈ X , # ∈ {<, ≤, =, ≥, >} and ct ∈ Z. Time progress conditions are


restricted to conjunctions of constraints as x ≤ ct .

Before recalling the semantics of a component, we first fix some notation. Let


V be the set of all clock valuation functions v : X → R≥0 . For a clock constraint
C, C(v) denotes the evaluation of C in v. The notation v + δ represents a new
valuation v′ defined as v′(x) = v(x) + δ, while v[r] represents a new valuation v′
which assigns any x in r to 0 and otherwise preserves the values from v.

Definition 2 (Semantics of a Component). The semantics of a component


B = (L, l0 , A, T, X, tpc) is given by the labelled transition system (Q, A, →) where
Q ⊆ L × V denotes the states of B and → ⊆ Q × (A ∪ R≥0 ) × Q denotes the
transitions according to the rules:
– (l, v) →δ (l, v + δ) if ∀δ′ ∈ [0, δ]. tpc(l)(v + δ′)                      (time progress);
– (l, v) →a (l′, v[r]) if (l, (a, g, r), l′) ∈ T and g(v) ∧ tpc(l′)(v[r])   (action step).

Because the semantics defined above is in general infinite, we work with the
so called zone graph [19] as a finite symbolic representation. The symbolic states
in a zone graph are pairs (l, ζ) where l is a location of B and ζ is a zone, a set
of clock valuations defined by clock constraints. Given a symbolic state (l, ζ),
its successor with respect to a transition t of B is denoted as succ(t, (l, ζ)) and
defined by means of its timed and its discrete successor:

– time-succ((l, ζ)) = (l, ↗ζ ∩ tpc(l))
– disc-succ(t, (l, ζ)) = (l′, (ζ ∩ g)[r] ∩ tpc(l′)) if t = (l, (·, g, r), l′)
– succ(t, (l, ζ)) = norm(time-succ(disc-succ(t, (l, ζ))))
1
Clocks are local. This is essential for avoiding side effects which would break com-
positionality and local analysis.
2
To avoid confusion with invariant properties, we prefer to adopt the terminology of
“time progress condition” from [11] instead of “location invariants”.

where ↗, [r], norm are the usual operations on zones: ↗ζ is the forward diagonal
projection of ζ, i.e., it contains any valuation v for which there exists a real δ
such that v − δ is in ζ; ζ[r] is the set of all valuations in ζ after applying the
resets in r; norm(ζ) corresponds to normalising ζ such that computation of the
set of all successors terminates. Since we are seeking component invariants which
are over-approximations of the reachable states, a more thorough discussion on
normalisation is not relevant for the present paper. The interested reader may
refer to [12] for more precise definitions.
A symbolic execution of a component starting from a symbolic state s0 is a
sequence of symbolic states s0 , s1 , . . . , sn , . . . such that for any i > 0 there exists
a transition t for which si is succ(t, si−1 ).
Given a component B with initial symbolic state s0 and transitions T , the
set of reachable symbolic states Reach(B) is Reach(s0 ) where Reach is defined
recursively for an arbitrary s as:

   Reach(s) = {s} ∪ ⋃t∈T Reach(succ(t, s)).
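The set Reach(s0) can be computed with a standard worklist iteration over symbolic states.
The following generic sketch (ours, for illustration; it assumes symbolic states are hashable
values and that succ returns None when the successor zone is empty) mirrors the recursive
definition above.

```python
# Generic worklist computation of Reach(s0); `succ(t, s)` is assumed to implement the
# zone successor defined above (e.g. on top of a DBM or polyhedra library).
def reach(s0, transitions, succ):
    visited = set()
    frontier = [s0]
    while frontier:
        s = frontier.pop()
        if s in visited:
            continue
        visited.add(s)
        for t in transitions:
            s_next = succ(t, s)
            if s_next is not None and s_next not in visited:
                frontier.append(s_next)
    return visited
```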

In our framework, components communicate by means of interactions, which


are synchronisations between their actions. Given n components Bi , 1 ≤ i ≤ n,
with disjoint sets of actions Ai , an interaction is a subset of actions α ⊆ ∪i Ai
containing at most one action per component, that is, of the form α = {ai }i∈I ,
with ai ∈ Ai for all i ∈ I ⊆ {1, . . . , n}. Given a set of interactions γ ⊆ 2^(∪i Ai),
we denote by Act(γ) the set of actions involved in γ, that is, Act(γ) = ∪α∈γ α.
A timed system is the composition of components Bi for a set of interactions γ
such that Act(γ) = ∪i Ai .
Definition 3 (Timed System). For n components Bi = (Li, l0i, Ai, Ti, Xi, tpci)
with Ai ∩ Aj = ∅ and Xi ∩ Xj = ∅ for any i ≠ j, the composition γ Bi w.r.t. a
set of interactions γ is defined by a timed automaton (L, l̄0, γ, Tγ, X, tpc) where
l̄0 = (l01, . . . , l0n), X = ∪i Xi, L = ×i Li, tpc(l̄) = ∧i tpc(li), and Tγ is such that
for any interaction α = {ai}i∈I we have l̄ →(α,g,r) l̄′ where l̄ = (l1, . . . , ln),
g = ∧i∈I gi, r = ∪i∈I ri, and l̄′(i) = l′i if i ∈ I and l̄′(i) = li otherwise, for
li →(ai,gi,ri) l′i.
In the timed system γ Bi a component Bi can execute an action ai only as
part of an interaction α, ai ∈ α, that is, along with the execution of all other
actions aj ∈ α, which corresponds to the usual notion of multi-party interaction.
Notice that interactions can only restrict the behavior of components, i.e. the
states reached by Bi in γ Bi belong to Reach(Bi ). This property is exploited by
the verification rule (VR) presented throughout this paper.
To give a logical characterisation of components and interactions we use invariants.
An invariant Φ is a state property which holds in every reachable state of a component
(or of a system) B, in symbols, B ⊨ □Φ. We use CI(Bi), respectively II(γ), to denote
component, respectively interaction, invariants. For component invariants, our choice
is to work with their reachable set. More precisely, for a component B with initial
symbolic state s0, CI(B) is the disjunction of (l ∧ ζ) over all (l, ζ) in Reach(s0),
where, to ease the reading, we abuse notation and use l as a place holder for a
state predicate “at (l)” which holds in any symbolic state with location l, that is,
the semantics of at (l) is given by (l, ζ) |= at (l). As an example, the component
invariants for the scenario in Figure 2 with one worker are:

CI (Controller ) = (lc0 ∧ x ≥ 0) ∨ (lc1 ∧ x ≤ 4) ∨ (lc2 ∧ x ≥ 0)


CI (Worker i ) = (l1i ∧ yi ≥ 0) ∨ (l2i ∧ yi ≥ 4).

Interaction invariants are over-approximations of global state spaces allowing


us to disregard certain tuples of local states as unreachable. Interaction invariants
relate locations of different atomic components. They are either boolean e.g., l1 ∨
l2 ∨l3 or linear e.g., l1 +l2 +l3 = 1. These particular examples ensure that at least
(resp. exactly) one of the locations l1 , l2 , l3 are active at any time. Interaction
invariants are computed on the synchronization skeleton of the composition, that
is, a 1-safe Petri net obtained by composing component behaviours according
to the interaction glue. The methods rely on boolean ([9]) / algebraic ([21])
constraint solving and avoid any form of state-space exploration. In the case of
the running example, when the controller is interacting with one worker, the
interaction invariant II({(a | b1), (c | d1)}) is (lc2 ∨ l11) ∧ (l21 ∨ lc0 ∨ lc1).
The proposed invariants³ have the feature that they are inductive. We recall
that an invariant Φ is inductive if it holds initially and if, for a state s s.t. s ⊨ Φ,
we have that s′ ⊨ Φ for any successor s′ of s. Moreover, inductive invariants
have the property that their conjunction is also an inductive invariant.

3 Timed Invariant Generation


As explained in the introduction, a direct application of (VR) may not be useful
in itself in the sense that the component and the interaction invariants alone are
usually not enough to prove global properties, especially when the properties
involve relations between clocks in different components. More precisely, though
component invariants encode timings of local clocks, there is no direct way (the
interaction invariant is orthogonal to timing aspects) to constrain the bounds
on the differences between clocks in different components. To give a concrete
illustration, consider the safety property ΨSafe = (lc1 ∧ l11 → x ≤ y1) that holds
in the running example with one worker. It is not difficult to see that ΨSafe
cannot be deduced from CI(Controller) ∧ CI(Worker 1) ∧ II({(a | b1), (c | d1)}).

3.1 History Clocks for Actions


In this section we show how we can, by means of some auxiliary constructions,
apply (VR) more successfully. To this end, we “equip” components (and later,
interactions) with history clocks, a clock per action; then, at interaction time,
the clocks corresponding to the actions participating in the interaction are reset.
3
The rule (VR) is generic enough to work with other types of invariants. For example,
one could use over-approximations of the reachable set in the case of component
invariants; however, this comes at the price of losing precision.

This basic transformation allows us to automatically compute a new invariant of


the system with history clocks. This new invariant, together with the component
and interaction invariants, is shown to be, after projection of history clocks, an
invariant of the initial system.
Definition 4 (Components with History Clocks). Given a component
model B = (L, l0, A, T, X, tpc), its extension wrt history clocks is a timed
automaton B h = (L, l0, A, T h, X ∪ HA, tpc) where:
– HA = {ha | a ∈ A} ∪ {h0} is the set of history clocks associated to actions,
  and h0 is a history clock dedicated to initialisation. Together with the clocks
  in X, h0 is initialised to zero. All other clocks in HA may be initialised to
  any arbitrary positive value.
– T h = { (l, (a, g, r ∪ {ha := 0}), l′) | (l, (a, g, r), l′) ∈ T }.
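The B → B h step is a purely syntactic transformation of the transition relation. A minimal
sketch follows (ours; components are represented here as plain Python dictionaries, which is
an assumption of this illustration and not the Real-Time BIP format used by the authors'
tool).

```python
# Sketch of the extension with history clocks (Definition 4): every transition labelled
# with action a additionally resets the history clock h_a.
def add_history_clocks(component):
    extended = dict(component)
    extended['clocks'] = component['clocks'] + ['h0'] + ['h_' + a for a in component['actions']]
    extended['transitions'] = [
        (src, act, guard, resets + ['h_' + act], dst)   # same guard, one extra reset
        for (src, act, guard, resets, dst) in component['transitions']
    ]
    return extended
```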
Since there is no timing constraint involving history clocks, these have no
influence on the behaviour. The extended model is, in fact, bisimilar to the
original model. Moreover, any invariant of the composition of Bih corresponds
to an invariant of γ Bi . For the ease of reading, we abuse notation and use ∃HA
to stand for ∃h0 ∃ha1 ∃ha2 . . . ∃han for A = {a1 , a2 , . . . , an }.
Proposition 1 Any symbolic execution in B h corresponds to a symbolic exe-
cution (where all constraints on history clocks are ignored) in B. Moreover, if
γ Bih ⊨ □Φ then γ Bi ⊨ □(∃HA).Φ.
The only operation acting on history clocks is reset. Its effect is that immedi-
ately after an interaction takes place, all history clocks involved in the interaction
are equal to zero. All other history clocks preserve their previous values, thus
they are at least greater in value than all those being reset. This basic but useful
observation is exploited in the following definition, which builds, recursively, all
the inequalities that could hold given an interaction set γ.
Definition 5 (Interaction Inequalities for History Clocks). Given an in-
teraction set γ, we define the following interaction inequalities E(γ):

   E(γ) = ⋁α∈γ ( ⋀ai,aj∈α, ak∉α (hai = haj ≤ hak) ∧ E(γ ⊖ α) )

where γ ⊖ α = {β \ α | β ∈ γ ∧ β ⊈ α}.
Remark 1. We can use the interpreted function "min" as syntactic sugar to have
a more compact expression for E(γ):

   E(γ) = ⋁α∈γ ( ⋀ai,aj∈α (hai = haj ≤ minak∉α hak) ∧ E(γ ⊖ α) ).

Example 1. For γ = {(a | b1 ), (c | d1 )}, corresponding to the interactions be-


tween the controller and one worker in Figure 2, the compact form of E(γ) is:
   
   (ha = hb1 ≤ min(hc, hd1) ∧ hc = hd1) ∨ (hc = hd1 ≤ min(ha, hb1) ∧ ha = hb1).
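For illustration only, the recursive definition of E(γ) translates directly into code. The
following sketch (ours, using the z3 Python bindings; interactions are represented as
frozensets of action names, an assumption of this example) builds E(γ) and, applied to the
interaction set of Example 1, produces a formula equivalent to the disjunction above.

```python
# Recursive construction of E(γ) from Definition 5 as a Z3 formula; each action a gets
# a real-valued history clock h_a.
from z3 import Real, And, Or, BoolVal

def h(a):
    return Real('h_' + a)

def E(gamma):
    gamma = [set(alpha) for alpha in gamma]
    if not gamma:
        return BoolVal(True)
    all_actions = set().union(*gamma)
    disjuncts = []
    for alpha in gamma:
        acts, others = sorted(alpha), sorted(all_actions - alpha)
        conj = [h(acts[0]) == h(a) for a in acts[1:]]    # clocks of α are all equal ...
        conj += [h(acts[0]) <= h(a) for a in others]     # ... and below every other clock
        rest = [frozenset(beta - alpha) for beta in gamma if not beta <= alpha]
        conj.append(E(rest))                             # recurse on γ ⊖ α
        disjuncts.append(And(conj))
    return Or(disjuncts)

print(E([frozenset({'a', 'b1'}), frozenset({'c', 'd1'})]))   # interaction set of Example 1
```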

E(γ) characterises the relations between history clocks during any possible
execution of a system. It can be shown, by induction, that this characterisation
is, in fact, an inductive invariant of the extended system.

Proposition 2 E(γ) is an inductive invariant of γ Bih .

By Proposition 2, and using the fact that component and interaction invari-
ants are inductive, we have that also their conjunction is an inductive invariant
of the system with history clocks. As a consequence of Proposition 1, we can
eliminate the history clocks from ∧i CI (Bih ) ∧ II (γ) ∧ E(γ) and obtain an invari-
ant of the original system. This invariant is usually stronger than CI (Bi ) ∧ II (γ)
and yields more successful applications of the rule (VR).
Example 2. We reconsider the sub-system of a controller and a worker from
Figure 2. We illustrate how the safety property ΨSafe introduced in the beginning
of the section can be shown to hold by using the newly generated invariant. The
invariants for the components with history clocks are:

CI (Controller h ) = (lc0 ∧ h0 = x) ∨
(lc1 ∧ x ≤ 4 ∧ ha ≤ h0 ∧ (ha = hc ≥ 4 + x ∨ x = hc ≤ ha )) ∨
(lc2 ∧ x = ha ∧ hc ≤ h0 ∧ (hc ≥ ha + 8 ∨ hc = ha + 4))
CI (Worker h1 ) = (l11 ∧ (y1 = h0 ∨ y1 = hd1 ≤ hb1 ≤ h0 )) ∨
(l21 ∧ h0 ≥ y1 = hd1 ≥ 4 + hb1 )

By using the interaction invariant described in Section 2 and the equality constraints
E(γ) from Example 1, after the elimination of the existential quantifiers in
∃h0.∃ha.∃hb1.∃hc.∃hd1 (CI(Controller h) ∧ CI(Worker h1) ∧ II(γ) ∧ E(γ)) we
obtain the following invariant Φ:

   Φ = (l11 ∧ lc0 ∧ y1 ≤ x) ∨ (l11 ∧ lc1 ∧ (y1 = x ∨ y1 ≥ x + 4)) ∨
       (l21 ∧ lc2 ∧ (y1 = x + 4 ∨ y1 ≥ x + 8)).

It can be easily checked that Φ ∧ ¬ΨSafe has no satisfying model and this proves
that ΨSafe holds for the system. Note the relations between x and y1 in Φ, which
are not in CI(Controller) ∧ CI(Worker 1) ∧ II(γ).
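This final check is itself a small satisfiability query. The following self-contained replay
(ours, not the authors' prototype; it uses the z3 Python bindings) confirms that Φ, together
with the fact that each component occupies exactly one location, is inconsistent with ¬ΨSafe.

```python
# Replay of the check of Example 2: Φ ∧ ¬Ψ_Safe is unsatisfiable once each component is
# constrained to be in exactly one location.
from z3 import Bools, Reals, And, Or, Not, Implies, Solver, unsat

lc0, lc1, lc2, l11, l21 = Bools('lc0 lc1 lc2 l11 l21')
x, y1 = Reals('x y1')

def exactly_one(*locs):
    return And(Or(list(locs)),
               And([Not(And(a, b)) for i, a in enumerate(locs) for b in locs[i + 1:]]))

phi = Or(And(l11, lc0, y1 <= x),
         And(l11, lc1, Or(y1 == x, y1 >= x + 4)),
         And(l21, lc2, Or(y1 == x + 4, y1 >= x + 8)))
psi_safe = Implies(And(lc1, l11), x <= y1)

s = Solver()
s.add(exactly_one(lc0, lc1, lc2))   # the controller occupies exactly one location
s.add(exactly_one(l11, l21))        # so does the worker
s.add(phi, Not(psi_safe))
print(s.check() == unsat)           # True: Φ entails Ψ_Safe
```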
To sum up, the basic steps described so far are: (1) extend the input com-
ponents Bi to components with history clocks Bih ; (2) compute component in-
variants CI (Bih ) and (3) equality constraints E(γ) from the interactions γ; (4)
finally, eliminate the history clocks in ∧i CI (Bih ) ∧ E(γ) ∧ II (γ), and obtain a
stronger invariant by means of which the application of (VR) is more successful.
We conclude the section with a remark on the size of E(γ). Due to the com-
bination of recursion and disjunction, E(γ) can be large. Much more compact
formulae can be obtained by exploiting non-conflicting interactions, i.e., inter-
actions that do not share actions.

Proposition 3 For γ = γ1 ∪γ2 with Act(γ1 )∩Act (γ2 ) = ∅, E(γ) ≡ E(γ1 )∧E(γ2 ).

Corollary 4 If the interaction model γ has only disjoint interactions, i.e., for
any α1, α2 ∈ γ, α1 ∩ α2 = ∅, then E(γ) ≡ ⋀α∈γ ⋀ai,aj∈α (hai = haj).

Example 3. The interaction set γ in Example 1 is not conflicting. Thus, by apply-


ing Corollary 4, we can simplify the expression of E(γ) to (ha = hb1 )∧(hc = hd1 ).

3.2 History Clocks for Interactions


The equality constraints on history clocks allow to relate the local constraints
obtained individually on components. In the case of non-conflicting interactions,
the relation is rather “tight”, that is, expressed as conjunction of equalities on
history clocks. In contrast, the presence of conflicts lead to a significantly weaker
form. Intuitively, every action in conflict can be potentially used in different
interactions. The uncertainty on its exact use leads to a disjunctive expression
as well as to more restricted equalities and inequalities amongst history clocks.
Nonetheless, the presence of conflicts themselves can be additionally exploited
for the generation of new invariants. That is, in contrast to equality constraints
obtained from interaction, the presence of conflicting actions enforce disequal-
ities (or separation) constraints between all interactions using them. In what
follows, we show a generic way to automatically compute such invariants enforc-
ing differences between the timings of the interactions themselves. To effectively
implement this, we proceed in a similar manner as in the previous section: we
again make use of history clocks and corresponding resets but this time we as-
sociate them to interactions, at the system level.
Definition 6 (Systems with Interaction History Clocks). Given a system
γ Bi, its extension wrt history clocks for interactions is γ h Bih, Γ∗ where:
– Γ∗ is an auxiliary TA having one location l∗ with no time progress condition,
  and for each interaction α in γ a clock hα, i.e., Γ∗ = ({l∗}, Aγ, T, Hγ, ∅) where:
  • the set of actions Aγ = {aα | α ∈ γ}
  • the set of clocks Hγ = {hα | α ∈ γ}
  • T = {(l∗, aα, true, hα := 0, l∗) | α ∈ γ}
– γ h = {(aα | α) | α ∈ γ} with (aα | α) denoting {aα} ∪ {a | a ∈ α}.

Using a similar argument as for Proposition 1, it can be shown that any


invariant of γ h Bih , Γ ∗ corresponds to an invariant of γ Bi by first showing that
any execution of γ h Bih , Γ ∗ corresponds to an execution of γ Bi .
Proposition 5 Any execution in γ h Bih, Γ∗ corresponds to an execution in γ Bi.
Moreover, if γ h Bih, Γ∗ ⊨ □Φ then γ Bi ⊨ □∃Hγ∃HA.Φ, where the new nota-
tion ∃Hγ stands for ∃hα1 ∃hα2 . . . ∃hαn when γ = {α1, α2, . . . , αn}.
We use history clocks for interactions to express additional constraints on
their timing. The starting point is the observation that when two conflicting
interactions compete for the same action a, no matter which one is first, the
other one must wait until the component which owns a is again able to execute
a. This is referred to as a “separation constraint” for conflicting interactions.

Definition 7 (Separation Constraints for Interaction Clocks). Given an
interaction set γ, the induced separation constraints, S(γ), are defined as follows:

   S(γ) = ⋀a∈Act(γ) ⋀α≠β∈γ, a∈α∩β |hα − hβ| ≥ ka

where | · | stands for absolute values and ka denotes the minimum between the
first occurrence time of a and the minimal time elapse between two consecutive
occurrences of a. It is computed⁴ locally on the component executing a.
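As a small illustration (ours, not the authors' tool), S(γ) also has a direct encoding with
the z3 Python bindings; the absolute value is unfolded into a disjunction, and the map k of
per-action constants ka is assumed to be precomputed on the components.

```python
# Separation constraints S(γ) of Definition 7 as a Z3 formula.
from z3 import Real, And, Or, BoolVal

def h_int(alpha):
    return Real('h_' + '|'.join(sorted(alpha)))    # history clock of an interaction

def S(gamma, k):
    constraints = []
    for i, alpha in enumerate(gamma):
        for beta in gamma[i + 1:]:
            for a in set(alpha) & set(beta):       # interactions in conflict on action a
                d = h_int(alpha) - h_int(beta)
                constraints.append(Or(d >= k[a], -d >= k[a]))   # |h_alpha - h_beta| >= k_a
    return And(constraints) if constraints else BoolVal(True)
```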

Example 4. In our running example the only shared actions are a and c within
the controller, and both ka and kc are equal to 4, thus the expression of the
separation constraints reduces to:
 
   S(γ) ≡ ⋀i≠j |hc|di − hc|dj| ≥ 4  ∧  ⋀i≠j |ha|bi − ha|bj| ≥ 4.

Proposition 6 S(γ) is an inductive invariant of γ h Bih , Γ ∗ .

Proof. By induction on the length of computations. For the base case, we assume
that the initial values of the history clocks for interactions in Γ ∗ are such that
they satisfy S(γ). Obviously, such a satisfying initial model always exists: it
suffices to take all hα with a minimal distance between them greater than the
maximum ka , in an arbitrary order.
For the inductive step, let s be the state reached after i steps, s′ a successor,
α an interaction such that s →α s′, a an arbitrary action and β ∈ γ such that
a ∈ β. For any β′ ≠ α, |hβ − hβ′| ≥ ka is unchanged from s to s′ (α is the only
interaction for which hα is reset from s to s′) and thus holds by induction. We
now turn to |hβ − hα|, which at s′ evaluates to hβ. Let sa be the most recent
state reached by an interaction containing a. If no such interaction exists, that
is, if a has no appearance in the i steps to s, let sa be the initial state. On the
path from sa to s′, hβ could not have been reset (otherwise, sa would not be the
most recent one). Thus hβ ≥ ka by the definition of ka.                        □

The invariant S(γ) is defined over the history clocks for interactions. Previ-
ously, the invariant E(γ) has been expressed using history clocks for actions. In
order to “glue” them together in a meaningful way, we need some connection
between history and interaction clocks. This connection is formally addressed by
the constraints E ∗ defined below.

Definition 8 (E∗). Given an interaction set γ, we define E∗(γ) as follows:

   E∗(γ) = ⋀a∈Act(γ) ( ha = minα∈γ, a∈α hα ).

⁴ For instance, by reduction to a shortest path problem in weighted graphs [14].
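Since SMT solvers have no primitive "min" over the reals, E∗(γ) can be encoded by the usual
two-part characterisation of a minimum, as in the following sketch (ours, z3 Python bindings;
interactions are again frozensets of action names, an assumption of this illustration).

```python
# E*(γ) of Definition 8: h_a equals the minimum of the interaction history clocks h_alpha
# over all interactions alpha containing a ("equals one of them and is <= all of them").
from z3 import Real, And, Or, BoolVal

def E_star(gamma):
    conjuncts = []
    for a in sorted(set().union(*gamma)):
        h_a = Real('h_' + a)
        clocks = [Real('h_' + '|'.join(sorted(alpha))) for alpha in gamma if a in alpha]
        conjuncts.append(And(Or([h_a == c for c in clocks]),
                             And([h_a <= c for c in clocks])))
    return And(conjuncts) if conjuncts else BoolVal(True)
```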

By a similar argument as the one in Proposition 2, it can be shown that E ∗ (γ)


is an inductive invariant of the extended system. The connection between E and
E ∗ is given in Proposition 7.
Proposition 7 E ∗ (γ) is an inductive invariant of γ h Bih , Γ ∗ . Moreover, the
equivalence ∃Hγ .E ∗ (γ) ≡ E(γ) is a valid formula.
Proof. To see that E∗(γ) is an invariant it suffices to note that, for any action
a, there is always an interaction α containing a such that ha and hα are both
reset at the same time.
The connection between E and E∗ is shown by induction on the number of
interactions in γ. We only present the base case, γ = {α} (the inductive one, as
well as all proofs, can be found in [4]):

   E(γ) = ⋀ai,aj∈α (hai = haj) ≡ ∃hα. ⋀ai∈α (hai = hα)
        ≡ ∃Hγ. ⋀ai∈Act(γ) (hai = minα∈γ, ai∈α hα) ≡ ∃Hγ.E∗(γ).                □


From Proposition 7, together with Propositions 5 and 6, it follows that
∃HA ∃Hγ .(∧i CI (Bih ) ∧ II (γ) ∧ E ∗ (γ) ∧ S(γ)) is an invariant of γ Bi . This new in-
variant is in general stronger than ∧i CI (Bih )∧II (γ)∧E(γ) and it provides better
state space approximations for timed systems with conflicting interactions.
Example 5. To get some intuition about the invariant generated using separation
constraints, let us reconsider the running example with two workers. The subfor-
mula which we emphasise here is the conjunction of E ∗ and S. The interaction
inequalities for history clocks are:
   E∗(γ) ≡ hb1 = ha|b1 ∧ hb2 = ha|b2 ∧ ha = mini=1,2 (ha|bi) ∧
           hd1 = hc|d1 ∧ hd2 = hc|d2 ∧ hc = mini=1,2 (hc|di)

by recalling the expression of S(γ) from Example 4 we obtain that:


∃Hγ .E ∗ (γ) ∧ S(γ) ≡ |hb2 − hb1 | ≥ 4 ∧ |hd2 − hd1 | ≥ 4
and thus, after quantifier elimination in ∃HA ∃Hγ.(CI(Controller h) ∧ ⋀i CI(Worker hi) ∧
II(γ) ∧ E∗(γ) ∧ S(γ)), we obtain the following invariant Φ:

   Φ = (l11 ∧ l12 ∧ lc0 ∧ x = y1 = y2) ∨
       (l11 ∧ l12 ∧ lc1 ∧ x ≤ 4 ∧ ((y1 = x ∧ y2 − y1 ≥ 4) ∨ y1 ≥ x + 8 ∨
                                    (y2 = x ∧ y1 − y2 ≥ 4) ∨ y2 ≥ x + 8)) ∨
       (l21 ∧ l12 ∧ lc2 ∧ (y1 ≥ x + 8 ∨ (y2 = x + 4 ∧ y1 − y2 ≥ 4))) ∨
       (l11 ∧ l22 ∧ lc2 ∧ (y2 ≥ x + 8 ∨ (y1 = x + 4 ∧ y2 − y1 ≥ 4)))
The newly discovered constraints in Φ are the separation constraints between the workers'
clocks (e.g., y2 − y1 ≥ 4). All in all, Φ is strong enough to prove that the system is
deadlock-free.

4 Implementation and Experiments


The method has been implemented in a Scala (scala-lang.org/) prototype
(www-verimag.imag.fr/~lastefan/tas) which is currently being integrated
with the D-Finder tool [9] for verification of Real-Time BIP systems [1]. The
prototype takes as input components Bi , an interaction set γ and a global safety
property Ψ and checks whether the system satisfies Ψ . Internally, it uses PPL
(bugseng.com/products/ppl) to manipulate zones (essentially polyhedra) and
to compute component invariants. It generates Z3 (z3.codeplex.com) Python
code to check the satisfiability of the formula ∧i CI (Bi ) ∧ II (γ) ∧ Φ∗ ∧ ¬Ψ where
Φ∗ , depending on whether γ is conflicting, stands for E(γ) or E ∗ (γ) ∧ S(γ). If
the formula is not satisfiable, the prototype returns no solution, that is, the
system is guaranteed to satisfy Ψ . Otherwise, it returns a substitution for which
the formula is satisfiable, that is, the conjunction of invariants is true while Ψ is
not. This substitution may correspond to a false positive in the sense that the
state represented by the substitution could be unreachable.
For experiments, we chose three classical benchmarks which we discuss below.
Train Gate Controller: This is a classical example from [3]. The system is
composed of a controller, a gate and a number of trains. For simplicity, Figure 3
depicts only one train interacting with the controller and the gate. The controller
lowers and raises the gate when a train enters, respectively exits. The safety
property of interest is that when a train is at location in, the gate has been
lowered: ∧i (in i → g2 ). When there is only one train in the system, E(γ) is enough
to show safety. When there are more trains, we use the separation constraints.
[Fig. 3. A Controller Interacting with a Train and a Gate: three timed automata, a Train
(locations far, near, in, clock x), a Controller (locations c0, c1, c2, c3, clock z), and a
Gate (locations g0, g1, g2, g3, clock y), synchronising on the actions approach, exit,
lower, and raise.]

Fischer Protocol: This is a well-studied protocol for mutual exclusion [20]. The
protocol specifies how processes can share a resource one at a time by means
of a shared variable to which each process assigns its own identifier number.
After θ time units, the process with the id stored in the variable enters the
critical state and uses the resource. We use an auxiliary component Id Variable
to mimic the role of the shared variable. To keep the size of the generated
invariants manageable, we restrict to the acyclic version. The system with two
concurrent processes is represented in Figure 4. The property of interest is mutual
exclusion: (csi ∧ csj ) → i = j. The component Id Variable has combinatorial
behavior and a large number of actions (2n + 1), thus the generated invariant
is huge except for very small values of n. To overcome this issue, we extracted
from the structure of the generated invariant a weaker inductive one which we
verified for validity locally with Uppaal. Basically, it encodes information like
heqi < hseti → heqi < heq0 ∧ hseti < heq0 for any index i. This invariant,
together with the component invariants for the processes and together with E(γ)
is sufficient to show that mutual exclusion holds.

[Fig. 4. The Fischer Protocol: two Process components (locations ri, wi, csi, clock xi,
actions tryi, seti, eqi, enteri, with guards xi ≤ θ and xi > θ) interacting with an
auxiliary Id Variable component (locations S0, S1, S2).]

Temperature Controller: This example is an adaptation from [9]. It repre-


sents a simplified model of a nuclear plant. The system consists of a controller
interacting with an arbitrary number n of rods (two, in Figure 5) in order to
maintain the temperature between the bounds 450 and 900: when the temper-
ature in the reactor reaches 900 (resp. 450), a rod must be used to cool (resp.
heat) the reactor. The rods are enabled to cool only after 900n units of time.
The global property of interest is the absence of deadlock, that is, the system
can run continuously and keep the temperature between the bounds. To ex-
press this property in our prototype, we adapt from [24] the definition of enabled
states, while in Uppaal, we use the query A[] not deadlock. For one rod, E(γ)
is enough to show the property. For more rods, because interactions are conflict-
ing, we need the separation constraints, which basically bring as new information
conjunctions such as ⋀i (hrest π(i) − hrest π(i−1) ≥ 1350) for π an ordering on the rods.

[Fig. 5. A Controller Interacting with 2 Rods: a Controller (locations lc0, lc1, lc2, clock
t, with heat guarded by t = 450 and cool guarded by t = 900) and Rod components (locations
l0i, l1i, l2i, clock ti, with cooli enabled when ti ≥ 1800 and resti resetting ti).]

The experiments were run on a Dell machine with Ubuntu 12.04, an Intel(R)
Core(TM)i5-2430M processor of frequency 2.4GHz×4, and 5.7GiB memory. The
results, synthesised in Table 1, show the potential of our method in terms of
accuracy (no false positives) and scalability. For larger numbers of components,
the size of the resulting invariants was not problematic for Z3. However, it may
be the case that history clocks considerably increase the size of the generated
formulae. It can also be observed that Uppaal being highly optimised, it has
better scores on the first example in particular and on smaller systems in general.
The timings for our prototype are obtained with the Unix command time while
the results for Uppaal come from the command verifyta which comes with the
Uppaal 4.1.14 distribution.

Table 1. Results from Experiments. The marking "∗" highlights the cases when E alone was
enough to prove the property. The expressions of the form "x + y" are to be read as "the
formula ∧i CI(Bi) ∧ II(γ) ∧ E(γ), resp. E∗(γ) ∧ S(γ), has length x, resp. y".

  Model & Property            Size   Prototype (time / space)    Uppaal (time / states)
  Train Gate Controller &       1∗   0m0.156s / 2.6kB+140B       0ms / 8 states
  mutual exclusion              2    0m0.176s / 3.2kB+350B       0ms / 13 states
                               64    0m4.82s / 530kB+170kB       0m0.210s / 323 states
                              124    0m17.718s / 700kB+640kB     0m1.52s / 623 states
  Fischer &                     2∗   0m0.144s / 3kB              0m0.008s / 14 states
  mutual exclusion              4∗   0m0.22s / 6.5kB             0m0.012s / 156 states
                                6∗   0m0.36s / 12.5kB            0m0.03s / 1714 states
                               14∗   0m2.840s / 112kB            no result in 4 hours
  Temperature Controller &      1∗   0m0.172s / 840B+60B         0m0.01s / 4 states
  absence of deadlock           8    0m0.5s / 23kB+2.4kB         11m0.348s / 57922 states
                               16    0m2.132s / 127kB+9kB        no result in 6 hours
                              124    0m19.22s / 460kB+510kB      no result in 6 hours

5 Conclusion and Future Work


We presented a fully automated compositional method to generate global in-
variants for timed systems described as parallel compositions of timed automata
components using multi-party interactions. The soundness of the method pro-
posed has been proven. In addition, it has been implemented and successfully
tested on several benchmarks. The results show that our method may outper-
form existing exhaustive exploration-based techniques for large systems, thanks
to the use of compositionality and over-approximations.
This work is currently being extended in several directions. First, in order to
achieve a better integration within D-Finder tool [9] and the Real-Time BIP frame-
work [1] we are working on handling urgencies [6] on transitions. Actually, urgen-
cies provide an alternative way to constrain time progress, which is more intuitive
to use by programmers but much difficult to handle in a compositional way. A
second extension concerns the development of heuristics to reduce the size of the
generated invariants. As an example, symmetry-based reduction is potentially in-
teresting for systems containing identical, replicated components. Finally, we are
considering specific extensions to particular classes of timed systems and proper-
ties, in particular, for schedulability analysis of systems with mixed-critical tasks.

References
1. Abdellatif, T., Combaz, J., Sifakis, J.: Model-based implementation of real-time
applications. In: EMSOFT (2010)
2. de Alfaro, L., Henzinger, T.A., Stoelinga, M.: Timed interfaces. In: Sangiovanni-
Vincentelli, A.L., Sifakis, J. (eds.) EMSOFT 2002. LNCS, vol. 2491, pp. 108–122.
Springer, Heidelberg (2002)
3. Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. (1994)
4. Astefanoaei, L., Rayana, S.B., Bensalem, S., Bozga, M., Combaz, J.: Compositional
invariant generation for timed systems. Technical Report TR-2013-5, Verimag Re-
search Report (2013)
5. Badban, B., Leue, S., Smaus, J.-G.: Automated invariant generation for the veri-
fication of real-time systems. In: WING@ETAPS/IJCAR (2010)
6. Basu, A., Bozga, M., Sifakis, J.: Modeling heterogeneous real-time components in
BIP. In: SEFM (2006)
7. Behrmann, G., David, A., Larsen, K.G., Håkansson, J., Pettersson, P., Yi, W.,
Hendriks, M.: UPPAAL 4.0. In: QEST (2006)
8. Bengtsson, J., Jonsson, B., Lilius, J., Yi, W.: Partial order reductions for timed
systems. In: Sangiorgi, D., de Simone, R. (eds.) CONCUR 1998. LNCS, vol. 1466,
pp. 485–500. Springer, Heidelberg (1998)
9. Bensalem, S., Bozga, M., Sifakis, J., Nguyen, T.-H.: Compositional verification for
component-based systems and application. In: Cha, S(S.), Choi, J.-Y., Kim, M.,
Lee, I., Viswanathan, M. (eds.) ATVA 2008. LNCS, vol. 5311, pp. 64–79. Springer,
Heidelberg (2008)
10. Berendsen, J., Vaandrager, F.W.: Compositional abstraction in real-time model
checking. In: Cassez, F., Jard, C. (eds.) FORMATS 2008. LNCS, vol. 5215, pp.
233–249. Springer, Heidelberg (2008)
11. Bornot, S., Sifakis, J.: An algebraic framework for urgency. Information and Com-
putation (1998)
12. Bouyer, P.: Forward analysis of updatable timed automata. Form. Methods Syst.
Des. (2004)
13. Bozga, M., Daws, C., Maler, O., Olivero, A., Tripakis, S., Yovine, S.: Kronos: A
model-checking tool for real-time systems. In: Vardi, M.Y. (ed.) CAV 1998. LNCS,
vol. 1427, pp. 546–550. Springer, Heidelberg (1998)
14. Courcoubetis, C., Yannakakis, M.: Minimum and maximum delay problems in real-
time systems. Formal Methods in System Design (1992)
15. David, A., Larsen, K.G., Legay, A., Møller, M.H., Nyman, U., Ravn, A.P., Skou, A.,
Wasowski, A.: Compositional verification of real-time systems using Ecdar. STTT
(2012)
16. de Boer, F.S., Hannemann, U., de Roever, W.-P.: Hoare-style compositional proof
systems for reactive shared variable concurrency. In: Ramesh, S., Sivakumar, G.
(eds.) FSTTCS 1997. LNCS, vol. 1346, pp. 267–283. Springer, Heidelberg (1997)
17. Fietzke, A., Weidenbach, C.: Superposition as a decision procedure for timed au-
tomata. Mathematics in Computer Science (2012)
18. Gardey, G., Lime, D., Magnin, M., Roux, O.H.: Romeo: A tool for analyzing time
Petri nets. In: Etessami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576,
pp. 418–423. Springer, Heidelberg (2005)
19. Henzinger, T.A., Nicollin, X., Sifakis, J., Yovine, S.: Symbolic model checking for
real-time systems. Inf. Comput. (1994)
20. Lamport, L.: A fast mutual exclusion algorithm. ACM Trans. Comput. Syst. (1987)

21. Legay, A., Bensalem, S., Boyer, B., Bozga, M.: Incremental generation of linear
invariants for component-based systems. In: ACSD (2013)
22. Lin, S.-W., Liu, Y., Hsiung, P.-A., Sun, J., Dong, J.S.: Automatic generation of
provably correct embedded systems. In: Aoki, T., Taguchi, K. (eds.) ICFEM 2012.
LNCS, vol. 7635, pp. 214–229. Springer, Heidelberg (2012)
23. Salah, R.B., Bozga, M., Maler, O.: Compositional timing analysis. In: EMSOFT
(2009)
24. Tripakis, S.: Verifying progress in timed systems. In: Katoen, J.-P. (ed.) ARTS
1999. LNCS, vol. 1601, pp. 299–314. Springer, Heidelberg (1999)
25. Wang, F.: Redlib for the formal verification of embedded systems. In: ISoLA (2006)
Characterizing Algebraic Invariants
by Differential Radical Invariants

Khalil Ghorbal and André Platzer

Carnegie Mellon University, Pittsburgh, PA, 15213, USA


{kghorbal,aplatzer}@cs.cmu.edu

Abstract. We prove that any invariant algebraic set of a given polynomial vector
field can be algebraically represented by one polynomial and a finite set of its
successive Lie derivatives. This so-called differential radical characterization re-
lies on a sound abstraction of the reachable set of solutions by the smallest variety
that contains it. The characterization leads to a differential radical invariant proof
rule that is sound and complete, which implies that invariance of algebraic equa-
tions over real-closed fields is decidable. Furthermore, the problem of generating
invariant varieties is shown to be as hard as minimizing the rank of a symbolic
matrix, and is therefore NP-hard. We investigate symbolic linear algebra tools
based on Gaussian elimination to efficiently automate the generation. The ap-
proach can, e.g., generate nontrivial algebraic invariant equations capturing the
airplane behavior during take-off or landing in longitudinal motion.

Keywords: invariant algebraic sets, polynomial vector fields, real algebraic ge-
ometry, Zariski topology, higher-order Lie derivation, automated generation and
checking, symbolic linear algebra, rank minimization, formal verification

1 Introduction
Reasoning about the solutions of differential equations by means of their conserved
functions and expressions is ubiquitous all over science studying dynamical processes.
It is even crucial in many scientific fields (e.g. control theory or experimental physics),
where a guarantee that the behavior of the system will remain within a certain pre-
dictable region is required. In computer science, the interest of the automated gener-
ation of these conserved expressions, so-called invariants, was essentially driven and
motivated by the formal verification of different aspects of hybrid systems, i.e. systems
combining discrete dynamics with differential equations for the continuous dynamics.
The verification of hybrid systems requires ways of handling both the discrete and
continuous dynamics, e.g., by proofs [15], abstraction [21,27], or approximation [10].
Fundamentally, however, the study of the safety of hybrid systems can be shown to
reduce constructively to the problem of generating invariants for their differential equa-
tions [18]. We focus on this core problem in this paper. We study the case of algebraic

invariant equations, i.e. invariants described by a polynomial equation of the form h = 0
for a polynomial h. (This material is based upon work supported by the National Science
Foundation by NSF CAREER Award CNS-1054246, NSF EXPEDITION CNS-0926181 and grant no.
CNS-0931985. This research is also partially supported by the Defense Advanced Research
Projects Agency under contract no. DARPA FA8750-12-2-0291.) We also only consider algebraic
differential equations (or algebraic
vector fields), i.e. systems of ordinary differential equations in (vectorial) explicit form
dx/dt = p(x), with a polynomial right-hand side p. The class of algebraic vector fields
is far from restrictive and many analytic nonalgebraic functions, such as the square
root, the inverse, the exponential or trigonometric functions, can be exactly modeled as
solutions of ordinary differential equations with a polynomial vector field (a concrete
example will be given in Section 6.2).
While algebraic invariant equations are not the only invariants of interest for hybrid
systems [19,17], they are still intimately related to all other algebraic invariants, such
as semialgebraic invariants. We thus believe the characterization we achieve in
this paper to be an important step forward in understanding the invariance problem of
polynomial vector fields, and hence the hybrid systems with polynomial vector fields.
Our results indicate that algebraic geometry is well suited to reason about and effec-
tively compute algebraic invariant equations. Relevant concepts and results from alge-
braic geometry will be introduced and discussed as needed. The proofs of all presented
results are available in [5].

Content. In Section 2, we introduce a precise algebraic abstraction of the reachable set


of the solution of a generic algebraic initial value problem. This abstraction is used to
give a necessary and sufficient condition for a polynomial h to have the reachable set of
the solution as a subset of the set of its roots. Section 3 builds on top of this characteri-
zation to, firstly, check the invariance of a variety candidate (Section 3.1) and, secondly,
give an algebraic characterization for a variety to be an invariant for a polynomial vector
field (Section 3.2). The characterization of invariant varieties is exploited in Section 4
where the generation of invariant varieties is reduced to symbolic linear algebra com-
putation. The contributions of this work are summarized in Section 5. Finally, Section 6
presents three case studies to highlight the importance of our approach through concrete
and rather challenging examples.

2 Sound and Precise Algebraic Abstraction by Zariski Closure


We consider autonomous1 algebraic initial value problems (see Def. 1 below). A nonau-
tonomous system with polynomial time dependency can be reformulated as an au-
tonomous system by adding a clock variable that reflects the progress of time. In this
section, we investigate algebraic invariant equations for the considered initial value
problems. This study is novel and will turn out to be fruitful from both the theoreti-
cal and practical points of view. The usual approach which assumes the initial value to
be in a region of the space, often an algebraic set, will be discussed in Section 3.
Let x = (x1 , . . . , xn ) ∈ Rn , and x(t) = (x1 (t), . . . , xn (t)), where xi : R → R,
t ↦ xi(t). The initial value x(tι) = (x1(tι), . . . , xn(tι)) ∈ Rn, for some tι ∈ R, will
be denoted by xι . We do not consider any additional constraint on the dynamics, that is
the evolution domain corresponds to the domain of definition.
1
Autonomous means that the rate of change of the system over time depends only on the sys-
tem’s state, not on time.

Definition 1 (Algebraic Initial Value Problem). Let pi , 1 ≤ i ≤ n, be multivariate


polynomials of the polynomial ring R[x]. An algebraic initial value problem is a pair
of an explicit algebraic ordinary differential equations system (or polynomial vector
field), p, and an initial value, xι ∈ Rn :

   dxi/dt = ẋi = pi(x),   1 ≤ i ≤ n,    x(tι) = xι.                          (1)
Since polynomial functions are smooth (C ∞ , i.e. they have derivatives of any order),
they are locally Lipschitz-continuous. By Cauchy-Lipschitz theorem (a.k.a. Picard-
Lindelöf theorem), there exists a unique maximal solution to the initial value prob-
lem (1) defined on some nonempty open set Ut ⊆ R. A global solution defined for
all t ∈ R may not exist in general. For instance, the maximal solution x(t) of the
1-dimensional system {ẋ = x², x(tι) = xι ≠ 0} is defined on R \ {tι + xι⁻¹}.
Algebraic invariant equations for initial value problems are defined as follows.

Definition 2 (Algebraic Invariant Equation (Initial Value Problem))


An algebraic invariant equation for the initial value problem (1) is an expression of
the form h(x(t)) = 0 that holds true for all t ∈ Ut , where h ∈ R[x] and x : Ut → Rn ,
is the (unique) maximal solution of (1).

Notice that any (finite) disjunction of conjunctions of algebraic invariant equations


over the reals is also an algebraic invariant equation (w.r.t. Def. 2) using the following
equivalence (R[x] is an integral domain):

   ⋁i ⋀j (fi,j = 0)  ⟷  ∏i ∑j fi,j² = 0.                                     (2)
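For a concrete sanity check (ours, using the z3 Python bindings; the polynomials f11, f12,
f21 below are arbitrary illustrative choices), the two sides of (2) can be compared for a
small instance by asking an SMT solver whether their equivalence can be violated over the
reals.

```python
# Instance of the equivalence (2) with two clauses:
# (f11 = 0 ∧ f12 = 0) ∨ (f21 = 0)   iff   (f11^2 + f12^2) * f21^2 = 0.
from z3 import Reals, And, Or, Not, Solver

x, y = Reals('x y')
f11, f12, f21 = x, y - 1, x + y
lhs = Or(And(f11 == 0, f12 == 0), f21 == 0)
rhs = (f11**2 + f12**2) * f21**2 == 0

s = Solver()
s.add(Not(lhs == rhs))     # can the two formulations disagree over the reals?
print(s.check())           # expected: unsat
```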

In Def. 2, the function h(x(t)), and hence the polynomial h(x), depend on the fixed
but unknown initial value xι . We implicitly assume this dependency for a clearer nota-
tion and will emphasize it whenever needed. Also, observe that h(x(t)), seen as a real
valued function of time t, is only defined over the open set Ut ⊆ R since the solution
x(t) is itself only defined over Ut. The polynomial function h : Rn → R, x ↦ h(x),
is, however, defined on all of Rn.

Definition 3 (Orbit). The reachable set, or orbit, of the solution x(t) of Eq. (1) is
defined as O(xι) := {x(t) | t ∈ Ut} ⊆ Rn.

The complete geometrical characterization of the orbit requires the exact solution
of Eq. (1). Very few initial value problems admit an analytic solution, although a local
approximation can be always given using Taylor series approximations (such approxi-
mation is for instance used in [10] for the verification of hybrid systems). In this work,
we introduce a sound abstraction of the orbit, O(xι ), using (affine) varieties2 . The idea
is to embed the orbit (which is not a variety in general) in a variety to be defined. The
embedding we will be using is a well-known topological closure operation in algebraic
2
In the literature, some authors use the terminology algebraic sets so that varieties is reserved
for irreducible algebraic sets. Here we will use both terms equally.

geometry called the Zariski closure ([6, Chapter 1]). Varieties, which are sets of points,
can be represented and computed efficiently using their algebraic counterpart: ideals
of polynomials. Therefore, we first recall three useful definitions: an ideal of the ring
R[x], the variety of a subset of R[x], and finally the vanishing ideal of a subset of Rn .

Definition 4 (Ideal). An ideal I is a subset of R[x] that contains the polynomial zero
(0), is stable under addition, and external multiplication. That is, for all h1 , h2 ∈ I, the
sum h1 + h2 ∈ I; and if h ∈ I, then, qh ∈ I for all q ∈ R[x].

For a finite natural number r, we denote by ⟨h1, . . . , hr⟩ the subset of R[x] generated
by the polynomials {h1, . . . , hr}, i.e. the set of linear combinations of the polynomials
hi (where the coefficients are themselves polynomials):

   ⟨h1, . . . , hr⟩ := { ∑i=1..r gi hi | g1, . . . , gr ∈ R[x] }.

By Def. 4, the set ⟨h1, . . . , hr⟩ is an ideal. More interestingly, by Hilbert's Basis The-
orem [7], any ideal I of the Noetherian ring R[x] can be finitely generated by, say,
{h1, . . . , hr}, so that I = ⟨h1, . . . , hr⟩.
Given Y ⊆ R[x], the variety (over the reals), V(Y), is a subset of Rn defined by the
common roots of all polynomials in Y. That is,

   V(Y) := {x ∈ Rn | ∀h ∈ Y, h(x) = 0}.

The vanishing ideal (over the reals), I(S), of S ⊆ Rn is the set of all polynomials
that evaluate to zero for all x ∈ S:

   I(S) := {h ∈ R[x] | ∀x ∈ S, h(x) = 0}.

The Zariski closure Ō(xι ) of the orbit O(xι ) is defined as the variety of the vanish-
ing ideal of O(xι ):
   Ō(xι) := V(I(O(xι))).                                                     (3)
That is, Ō(xι ) is defined as the set of all points that are common roots of all polynomials
that are zero everywhere on the orbit O(xι ). The variety Ō(xι ) soundly overapprox-
imates all reachable states x(t) in the orbit of O(xι ), including the initial value xι :

Proposition 1 (Soundness of Zariski Closure). O(xι ) ⊆ Ō(xι ) .


Therefore, all safety properties that hold true for Ō(xι ), are also true for O(xι ).
The soundness in Proposition 1 corresponds to the reflexivity property of the Zariski
closure: for any subset S of Rn , S ⊆ V (I(S)). Besides, the algebraic geometrical fact
that the Zariski closure Ō(xι ) is the smallest3 variety containing O(xι ) corresponds to
the fact that Ō(xι ) is the most precise algebraic abstraction of O(xι ).
³ Smallest here is to be understood w.r.t. the usual geometrical sense, that is, any other
variety containing O(xι) also contains its closure Ō(xι).

Observe that if the set of generators of I(O(xι)) contains only the zero polynomial,
I(O(xι)) = ⟨0⟩, then Ō(xι) = Rn (the whole space) and the Zariski closure fails
to be informative. For instance, for (non-degenerate) one-dimensional vector fields
(n = 1) that evolve over time, the only univariate polynomial that has infinitely many
roots is the zero polynomial. This points out the limitation of the closure operation used
in this work and raises interesting questions about how to deal with such cases (this will
be left as future work).
The closure operation abstracts time. This means that Ō(xι ) defines a subset of
Rn within which the solution always evolves without saying anything about where the
system will be at what time (which is what a solution would describe and which is
exactly what the abstraction we are defining here gets rid of). In particular, Ō(xι) is
independent of whether the system evolves forward or backward in time.
Although we know that I(O(xι)) is finitely generated, computing all its generators may be
intractable. By the real Nullstellensatz, vanishing ideals over R are in fact exactly the
real radical ideals [1, Section 4.1]. In real algebraic geometry, real radical ideals are
notoriously hard⁴ to compute. However, we shall see in the sequel that Lie derivation gives
us a powerful computational handle that permits us to tightly approximate (and in some
cases even compute) I(O(xι)).
along a vector field is defined as follows.
Definition 5 (Lie Derivative). The Lie derivative of h ∈ R[x] along the vector field
p = (p1, . . . , pn) is defined by:

   Lp(h) := ∑i=1..n (∂h/∂xi) pi.                                             (4)

Higher-order Lie derivatives are defined recursively: Lp^(k+1)(h) := Lp(Lp^(k)(h)), with
Lp^(0)(h) := h.
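Since Lie derivatives of polynomials only involve partial derivatives and polynomial
arithmetic, they are easy to compute with any computer algebra system. The following sketch
(ours, for illustration; it uses sympy, which is not mentioned in the paper) computes
Lp^(k)(h) and applies it to a rotation field for which the unit circle is invariant.

```python
# k-th Lie derivative L_p^(k)(h) of a polynomial h along the polynomial vector field
# dx_i/dt = p_i(x), via repeated symbolic differentiation.
import sympy as sp

def lie(h, p, xs, k=1):
    for _ in range(k):
        h = sp.expand(sum(sp.diff(h, xi) * pi for xi, pi in zip(xs, p)))
    return h

x1, x2 = sp.symbols('x1 x2')
p = [-x2, x1]                  # a rotation vector field
h = x1**2 + x2**2 - 1          # the unit circle
print(lie(h, p, [x1, x2]))     # 0: h is conserved, so h = 0 is an algebraic invariant equation
```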
We state an important property of the ideal I(O(xι)). A similar result is known under
different formulations ([23, Theorem 3.1] and [16, Lemma 3.7]).
Proposition 2. I(O(xι )) is a differential ideal for Lp , i.e. it is stable under the action
of the Lp operator. That is, for all h ∈ I(O(xι )), Lp (h) ∈ I(O(xι )).
In the next section, we give a necessary and sufficient condition for a polynomial h
to be in I(O(xι )), that is for the expression h = 0 to be an algebraic invariant equation
for the initial value problem (1), i.e. h evaluates to 0 all along the orbit of xι .

3 Differential Radical Characterization


In this section, we study the algebraic properties of the Zariski closure Ō(xι ) defined
in the previous section. We then define and characterize invariant algebraic sets of poly-
nomial vector fields.
⁴ Given an ideal I ⊆ R[x], the degree of the polynomials that generate its real radical is
bounded by the degree of the polynomials that generate I to the power of 2^O(n²) [14,
Theorem 5.9].

For h ∈ R[x], we recursively construct an ascending chain of ideals of R[x] by
appending successive Lie derivatives of h to the list of generators:

   ⟨h⟩ ⊂ ⟨h, Lp^(1)(h)⟩ ⊂ · · · ⊂ ⟨h, . . . , Lp^(N−1)(h)⟩ = ⟨h, . . . , Lp^(N)(h)⟩.

Since the ring R[x] is Noetherian, the chain above necessarily has finite length: the
maximal ideal (in the sense of inclusion), the so-called differential radical ideal⁵ of h,
will be denoted by √Lp(h). Its order is the smallest N such that:

   Lp^(N)(h) ∈ ⟨Lp^(0)(h), . . . , Lp^(N−1)(h)⟩.                             (5)
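The order N is computable: each membership test in (5) is an ideal-membership question that
can be answered with a Gröbner basis. The following sketch (ours, not the paper's
implementation; it uses sympy over the rationals purely for illustration) iterates the
construction until the chain stabilises.

```python
# Order N of the differential radical of h: the smallest N such that L_p^(N)(h) lies in the
# ideal generated by the lower-order Lie derivatives (Eq. (5)), via a Groebner-basis test.
import sympy as sp

def lie(h, p, xs):
    return sp.expand(sum(sp.diff(h, xi) * pi for xi, pi in zip(xs, p)))

def differential_radical_order(h, p, xs, max_order=10):
    generators = [sp.expand(h)]
    for n in range(1, max_order + 1):
        candidate = lie(generators[-1], p, xs)          # L_p^(n)(h)
        gb = sp.groebner(generators, *xs, order='grevlex')
        if gb.contains(candidate):                      # chain has stabilised
            return n
        generators.append(candidate)
    raise RuntimeError('order exceeds max_order')

x1, x2 = sp.symbols('x1 x2')
# h = x1 + x2 along p = (x2, -x1): L_p(h) = x2 - x1 and L_p^(2)(h) = -(x1 + x2), so N = 2.
print(differential_radical_order(x1 + x2, [x2, -x1], [x1, x2]))
```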

The following theorem, an important contribution of this work, states a necessary


and sufficient condition for a polynomial h to be in I(O(xι )).

Theorem 1 (Differential Radical Characterization). Let h ∈ R[x], and let N denote
the order of √Lp(h). Then, h ∈ I(O(xι)) if and only if

   ⋀0≤i≤N−1 Lp^(i)(h)(xι) = 0.                                               (6)

The statement of Theorem 1 is general and assumes nothing about xι ∈ Rn . A nat-


ural question to ask is how differential radical characterization can be used to reason
about invariant regions of a given polynomial vector field. By invariant (or stable) re-
gions, we mean, regions S ⊂ Rn from which the trajectory of the solution of the initial
value problem (1), with xι ∈ S, can never escape. In particular, we focus on invariant
algebraic sets where S is variety.

Definition 6 (Invariant Variety). The variety S is an invariant variety for the vector
field p if and only if ∀xι ∈ S, O(xι ) ⊆ S.

Dual to the geometrical point of view in Def. 6, the algebraic point of view is given
by extending the definition of algebraic invariant equation for initial value problems
(Def. 2), to algebraic invariant equation for polynomial vector fields.
Definition 7 (Algebraic Invariant Equation (Vector Field)). The expression h = 0
is an algebraic invariant equation for the vector field p if and only if V (h) is an
invariant variety for p.
Unlike Def. 2, Def. 7, or its geometrical counterpart, Def. 6, corresponds to the typ-
ical object of studies in hybrid system verification as they permit the abstraction of
the continuous part by means of algebraic equations. In the two following sections, we
show how differential radical characterization (Theorem 1) can be used to address two
particular questions: checking the invariance of a variety candidate (Section 3.1) and
characterizing invariant varieties (Section 3.2).
We will say that the polynomial h is a differential radical invariant (for p) if and
only if V(√Lp(h)) is an invariant variety for p.

⁵ The construction of √Lp(h) is very similar to the construction of the radical of an ideal,
except with higher-order Lie derivatives in place of higher powers of polynomials.

3.1 Checking Invariant Varieties by Differential Radical Invariants

The problem we solve in this section is as follows: given a polynomial vector field p,
can we decide whether the equation h = 0 is an algebraic invariant equation for the
vector field p ? Dually, we want to check whether the variety V (h) is invariant for p.
Theorem 2 solves the problem.
Theorem 2. Let h ∈ R[x], and let N denote the order of √Lp(h). Then, V(h) is an
invariant variety for the vector field p (or equivalently h = 0 is an algebraic invariant
equation for p) if and only if

   h = 0 → ⋀1≤i≤N−1 Lp^(i)(h) = 0.                                           (7)

Corollary 1 (Decidability). It is decidable whether the expression h = 0 is an alge-


braic invariant equation for the vector field p assuming real algebraic coefficients for
h and p.

The sound and complete related proof rule from Theorem 2 can be written as follows
(N denotes the order of √Lp(h)):

            h = 0 → ⋀1≤i≤N−1 Lp^(i)(h) = 0
   (DRI)    ―――――――――――――――――――――――――――――                                    (8)
            (h = 0) → [ẋ = p](h = 0)
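The premise of (DRI) is a sentence of real arithmetic, so it can be handed to a solver for
nonlinear real arithmetic. A small concrete check (ours; the vector field and the
hand-computed Lie derivative below are illustrative choices, not an example from the paper):
for p = (x2, x1) and h = x1² + x2² we have Lp(h) = 4·x1·x2 and N = 2, and the premise
h = 0 → Lp(h) = 0 is valid, so the singleton variety V(h) = {(0, 0)} is invariant.

```python
# Discharging the premise of (DRI) for h = x1^2 + x2^2 along p = (x2, x1):
# validity of (h = 0 -> L_p(h) = 0) is checked by refuting its negation over the reals.
from z3 import Reals, Implies, Not, Solver

x1, x2 = Reals('x1 x2')
h = x1**2 + x2**2
lie1 = 4 * x1 * x2          # L_p(h), computed by hand for this vector field

s = Solver()
s.add(Not(Implies(h == 0, lie1 == 0)))
print(s.check())            # expected: unsat, hence h = 0 is an algebraic invariant equation
```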

Using the naive trick in Eq. (2), theoretically, the proof rule can be easily extended to
check for the invariance of any finite disjunction of conjunctions of algebraic invariant
equations for p. This means that we can check for the invariance of any variety for
p, given its algebraic representation. However, in practice, other techniques, outside
the scope of this paper, should be considered to try to keep the degree of the involved
polynomials as low as possible.

3.2 Differential Radical Characterization of Invariant Varieties


In the previous section, we were given a variety candidate of the form V(h) and asked to decide its invariance. In this section, we characterize all invariant
varieties of a vector field p using a differential radical criterion. The following theorem
fully characterizes invariant varieties of polynomial vector fields.

Theorem 3 (Characterization of Invariant Varieties). A variety S is an invariant variety for the vector field p if and only if there exists a polynomial h such that $S = V(\sqrt[L_p]{h})$. As a consequence, every invariant variety corresponds to an algebraic invariant equation involving a polynomial and its higher-order Lie derivatives (N denotes the order of $\sqrt[L_p]{h}$):

$$\bigwedge_{0 \le i \le N-1} L_p^{(i)}(h) = 0 . \qquad (9)$$

Observe how Theorem 3 proves, from the differential radical characterization point of view, the well-known fact about invariant polynomial functions [17, Theorem 3]: if $L_p(h(x)) = 0$, then, for any c ∈ R, $\sqrt[L_p]{h(x) - c} = \langle h(x) - c \rangle$, and so S = V(h(x) − c) is an invariant variety for p.
An algebraic invariant equation for p is defined semantically (Def. 7) as a polyno-
mial that evaluates to zero if it is zero initially (admits xι as a root). Differential radi-
cal invariants are, on the other hand, defined as a structured, syntactically computable,
conjunction of polynomial equations involving one polynomial and its successive Lie
derivatives. By Theorem 3, both coincide.
The explicit formulation of Eq. (5), namely

$$L_p^{(N)}(h) = \sum_{i=0}^{N-1} g_i\, L_p^{(i)}(h), \qquad (10)$$

for some gi ∈ R[x], is computationally attractive as it only involves polynomial


arithmetic on higher-order Lie derivatives of one polynomial, h, which in turn can be
computed automatically by symbolic differentiation. Section 4 exploits this fact to auto-
matically generate differential radical invariants and consequently invariant varieties.

4 Effective Generation of Invariant Varieties


In the previous section, we have seen (Theorem 3) that differential radical ideals charac-
terize invariant varieties. Based on Eq. (10), we explain in this section how we automat-
ically construct differential radical ideals given a polynomial vector field p by deriving
a set of constraints that the coefficients of a parametrized polynomial have to satisfy.
The degree of a polynomial in R[x] is defined as the maximum degree among the (finite) set of degrees of its monomials.⁶ When the degrees of all nonzero monomials of a polynomial h are equal, we say that h is homogeneous, or a form, of degree d.
By introducing an extra variable x0 and multiplying all monomials by a suitable
power of x0 , any polynomial of R[x] can be homogenized to a form in R[x0 ][x]. The
additional variable x0 is considered as a time-independent function: its time derivative
is zero (ẋ0 = p0 = 0). “De-homogenizing” the vector field corresponds to instantiating
x0 with 1, which gives back the original vector field. Geometrically, the homogenization
of polynomials corresponds to the notion of projective varieties in projective geometry,
where the homogenized polynomial is the algebraic representative of the original vari-
ety in the projective plane [3, Chapter 8].
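A minimal sketch of this homogenization step (ours, using sympy; the polynomial is an arbitrary illustration, not one from the paper):

```python
import sympy as sp

x0, x1, x2 = sp.symbols('x0 x1 x2')

def homogenize(h, x0, xs):
    # multiply each monomial by the power of x0 that lifts it to the total degree of h
    d = sp.Poly(h, *xs).total_degree()
    return sp.expand(x0**d * h.subs({v: v / x0 for v in xs}))

h = x1**2 + x2 - 3
print(homogenize(h, x0, [x1, x2]))               # x1**2 + x0*x2 - 3*x0**2
print(homogenize(h, x0, [x1, x2]).subs(x0, 1))   # de-homogenizing (x0 := 1) gives h back
```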
From a computational perspective, working in the projective plane offers a more symmetric representation: all monomials of a form have the same degree. The arithmetic of degrees over forms is also simplified: the degree of a product is the sum of the degrees of the operands. In the remainder of this section, we only consider forms of R[x0 , . . . , xn ]. The symbol x will denote the vector of all involved variables.

⁶ The degree of the zero polynomial (0) is undefined. We assume in this work that all finite degrees are acceptable for the zero polynomial.

If h denotes a form of degree d, and d′ the maximum degree among the degrees of the pᵢ, then the degree of the polynomial $L_p^{(k)}(h)$ is given by:

$$\deg(L_p^{(k)}(h)) = d + k(-1 + d') . \qquad (11)$$

Recall that a form of degree d in R[x0 , . . . , xn ] has

$$m_d \overset{\mathrm{def}}{=} \binom{n+d}{d} \qquad (12)$$

monomials (the binomial coefficient of n + d and d). A parametrized form hα of degree d can therefore be represented by its symbolic coefficients' vector α : R^{m_d}. For this representation to be canonical, one needs to fix an order over monomials of the same degree. We will use the usual lexicographical order, except for x0: x1 > x2 > · · · > xn > x0. We first compare the degrees of x1; if equal, we compare the degrees of x2, and so on until reaching xn and then x0. For instance, for n = 2, a parametrized form hα of degree d = 1 is equal to α1x1 + α2x2 + α3x0. Its related coefficients' vector is α = (α1, α2, α3).
Let hα be a parametrized form of degree d and let α = (α1, . . . , α_{m_d}) denote the coefficients' vector with respect to the monomial order defined above. Since all polynomials in Eq. (10) are forms in projective coordinates, the degree of each term $g_i L_p^{(i)}(h_\alpha)$ matches exactly the degree of $L_p^{(N)}(h_\alpha)$. Hence, by Eq. (11), deg(gᵢ) = (N − i)(−1 + d′). The coefficients' vector of each form gᵢ is then a vector, βᵢ, of size m_{(N−i)(−1+d′)} (see Eq. (12)). We therefore obtain m_{d+N(−1+d′)} biaffine equations: linear in the αᵢ, 1 ≤ i ≤ m_d, and affine in the β_{i,j}, 0 ≤ i ≤ N − 1, 1 ≤ j ≤ m_{(N−i)(−1+d′)}. A concrete example is as follows.
Example 1. Suppose we have n = 2, d′ = 1, p1 = a1x1 + a2x2 and p2 = b1x1 + b2x2. For d = 1, the parametrized form hα is equal to α1x1 + α2x2 + α3x0. Let N = 1. The first-order Lie derivative, Lp(hα), has the same degree, 1, and is equal to α1(a1x1 + a2x2) + α2(b1x1 + b2x2). In this case, g is a form of degree 0, that is, a real number. So it has one coefficient β ∈ R. We therefore obtain $m_1 = \binom{3}{1} = 3$ constraints:

$$\begin{aligned}
(-a_1 + \beta)\alpha_1 + (-b_1)\alpha_2 &= 0\\
(-a_2)\alpha_1 + (-b_2 + \beta)\alpha_2 &= 0\\
\beta\,\alpha_3 &= 0
\end{aligned}
\;\leftrightarrow\;
\begin{pmatrix} -a_1 + \beta & -b_1 & 0\\ -a_2 & -b_2 + \beta & 0\\ 0 & 0 & \beta \end{pmatrix}
\cdot
\begin{pmatrix} \alpha_1\\ \alpha_2\\ \alpha_3 \end{pmatrix} = 0 \,.$$
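The constraints of Example 1 can be derived mechanically. The following sketch (ours, with sympy; symbol names are our own) imposes L_p(h_α) = β·h_α as a polynomial identity in the xᵢ and extracts the matrix shown above:

```python
import sympy as sp

x0, x1, x2 = sp.symbols('x0 x1 x2')
a1, a2, b1, b2, beta = sp.symbols('a1 a2 b1 b2 beta')
alpha = sp.symbols('alpha1 alpha2 alpha3')

X = [x1, x2, x0]
p = [a1*x1 + a2*x2, b1*x1 + b2*x2, 0]        # x0 is the homogenizing variable, its derivative is 0

h_alpha = alpha[0]*x1 + alpha[1]*x2 + alpha[2]*x0
lie = sp.expand(sum(sp.diff(h_alpha, v) * pv for v, pv in zip(X, p)))

# beta*h_alpha - L_p(h_alpha) must vanish identically in x1, x2, x0
residual = sp.Poly(sp.expand(beta*h_alpha - lie), x1, x2, x0)
rows = [residual.coeff_monomial(m) for m in (x1, x2, x0)]
M, _ = sp.linear_eq_to_matrix(rows, list(alpha))
print(M)   # reproduces the 3x3 matrix M_{1,1}(beta) from Example 1
```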
As suggested in Example 1, for a given d and N , and if we concatenate all vectors
βi into one vector β, the equational constraints can be rewritten as a symbolic linear
algebra problem of the following form:

$$M_{d,N}(\beta)\,\alpha = 0, \qquad (13)$$

where α and β are decoupled. The matrix M_{d,N}(β) is called the matrix representation of the ideal membership problem $L_p^{(N)}(h_\alpha) \in^{?} \langle h_\alpha, \ldots, L_p^{(N-1)}(h_\alpha)\rangle$.
Recall that the kernel (or null-space) of a matrix M ∈ R^{r×c}, with r rows and c columns, is the subspace of R^c defined as the preimage of the vector 0 ∈ R^r:

$$\ker(M) \overset{\mathrm{def}}{=} \{x \in \mathbb{R}^c \mid Mx = 0\} .$$

Let s = dim(ker(Md,N (β))) ≤ md . If, for all β, s = 0, then the kernel is {0}.
Hence, α = 0 and, for the chosen N , we have hα = 0: the only differential radi-
cal ideal generated by a form of degree d is the trivial ideal 0. If, however, s ≥ 1,
then, by Theorem 3, we generate an invariant (projective) variety for p. In this case, de-
homogenizing is not always possible. In fact, the constraint on the initial value could
involve x0 , which prevents the de-homogenization (see Example 2). Otherwise, we re-
cover an invariant (affine) variety for the original vector field. This is formally stated in
the following theorem.

Theorem 4 (Effective Generation of Projective Invariant Varieties). Let hα denote a parametrized form of degree d. There exists a real vector β such that dim(ker(M_{d,N}(β))) ≥ 1 if and only if, for α ∈ ker(M_{d,N}(β)), $V(\sqrt[L_p]{h_\alpha}) \subset \mathbb{R}^{n+1}$ is a projective invariant variety for the homogenized vector field.

When s = dim(ker(M_{d,N}(β))) ≥ 1, the subspace ker(M_{d,N}(β)) is spanned by s vectors, e1, . . . , e_s ∈ R^{m_d}, and for α = γ1e1 + · · · + γ_s e_s, for arbitrary (γ1, . . . , γ_s) ∈ R^s, the variety $V(\sqrt[L_p]{h_\alpha})$ is a family of invariant varieties of p (parametrized by γ).

In the sequel, we give a sufficient condition so that, for any given initial value, one gets a variety (different from the trivial whole space) that embeds the reachable set of the trajectory, O(xι). For instance, for a conservative Hamiltonian system, if the total energy function, h, is polynomial (such as the energy function of the perfect pendulum), then, for any initial value xι, $O(x_\iota) \subseteq V(\sqrt[L_p]{h(x) - h(x_\iota)}) = V(h(x) - h(x_\iota))$.

For a generic xι ∈ R^n, if xι satisfies Eq. (6), then, by Theorem 1, hα ∈ I(O(xι)), and $\bar{O}(x_\iota) \subseteq V(\sqrt[L_p]{h_\alpha})$ ([5, Corollary 1]). However, for xι to satisfy Eq. (6), α must be in the intersection of N hyperplanes, H0, . . . , H_{N−1}, each defined explicitly by the condition $L_p^{(i)}(h_\alpha)(x_\iota) = 0$:

$$H_i \overset{\mathrm{def}}{=} \{\alpha \in \mathbb{R}^{m_d} \mid L_p^{(i)}(h_\alpha)(x_\iota) = 0\} . \qquad (14)$$

Proposition 3 (Effective Sound Approximation of O(xι)). Let hα be a parametrized form of degree d, and M_{d,N}(β) the matrix representation of Eq. (10). Let H_i ⊆ R^{m_d}, 0 ≤ i ≤ N − 1, be the hyperplanes defined in Eq. (14). Then, $O(x_\iota) \subseteq V(\sqrt[L_p]{h_\alpha})$, if there exists β such that:

$$\dim(\ker(M_{d,N}(\beta))) > m_d - \dim\Bigl(\bigcap_{i=0}^{N-1} H_i\Bigr) . \qquad (15)$$

The remainder of this section discusses our approach to maximizing the dimension of the kernel of M_{d,N}(β), as well as the complexity of the underlying computation.

Gaussian Elimination. Let β = (β1, . . . , β_s) : R^s. By Theorem 4, we want to find an instance, β*, of β that maximizes dim ker(M_{d,N}(β)), where all the elements of M_{d,N}(β) are affine in β. At each iteration, our algorithm [5, Algorithm 1] assigns new values to the remaining coefficients in β for the matrix M_{d,N}(β) to maximize the dimension of its kernel. A set, M, gathers all the instantiations of M_{d,N}(β). The procedure ends when no further assignment can be done. The algorithm is in fact a typical MapReduce procedure which can be parallelized. A naive approach would be to first extract a basis for the matrix M_{d,N}(β) (which requires symbolic computation capabilities for linear algebra), and then solve for the β that zero the determinant. In practice, however, row-reducing speeds up the computation: we row-reduce M_{d,N}(β) and record any divisions by the pivot element; we then branch on any β that zeroes the denominator.
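For the small matrix of Example 1, the branching on determinant factors can be mimicked directly in a computer algebra system. The sketch below is ours and purely illustrative (it is not the tool of [5]); note that sympy's nullspace treats the remaining symbolic pivots as generically nonzero, which is exactly why the paper's procedure records pivot divisions and branches on them:

```python
import sympy as sp

a1, a2, b1, b2, beta = sp.symbols('a1 a2 b1 b2 beta')
M = sp.Matrix([[beta - a1, -b1, 0],
               [-a2, beta - b2, 0],
               [0, 0, beta]])

print(sp.factor(M.det()))        # beta*(beta**2 - a1*beta - b2*beta + a1*b2 - a2*b1)

# Without constraints on a1, a2, b1, b2 the only generic root is beta = 0;
# instantiating it and computing the kernel reproduces Example 2 below.
kernel = M.subs(beta, 0).nullspace()
print(kernel)                    # [Matrix([[0], [0], [1]])]
```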
Example 2. We apply the algorithm sketched above to Example 1. The determinant of the matrix M_{1,1}(β) is $\beta\bigl(\beta^2 - (a_1 + b_2)\beta - a_2 b_1 + a_1 b_2\bigr)$. Since we do not have any constraints on the parameters a1, a2, b1, b2, the only generic solution for the determinant is β = 0. The kernel of M_{1,1}(0), of dimension 1, is generated by (0, 0, 1). The only candidates in this case are hα(x) = γx0, γ ∈ R. If we de-homogenize (set x0 to 1), then γ = 0 and we find the trivial invariant variety, R^n.
The result of Example 2 is expected, as it studies a generic linear vector field without any a priori constraints on the parameters. This triggers, naturally, an interesting feature of the differential radical characterization: its ability to synthesize vector fields to enforce an invariant variety. For instance, in Example 2, let $\delta \overset{\mathrm{def}}{=} (a_1 - b_2)^2 + 4a_2 b_1$. If δ ≥ 0 and a2 ≠ 0, then the kernel of M_{1,1}(β) is generated by the vector $(a_1 - b_2 \pm \sqrt{\delta},\, 2a_2,\, 0)$ (which is an eigenvector of the matrix M_{1,1}(β)). By Theorem 4, we have an invariant variety given by: $(a_1 - b_2 \pm \sqrt{\delta})\,x_1 + 2a_2 x_2 = 0$. This is also expected for linear systems, as eigenvectors span stable subspaces.

Complexity. By Theorem 4, the generation of invariant varieties is equivalent to maximizing the dimension of the kernel of the matrix M_{d,N}(β) over unconstrained β, which is in turn equivalent to the following unconstrained minimal rank problem:

$$\min_{\beta} \operatorname{rank}(M_{d,N}(\beta)), \qquad (16)$$

where the elements of the vector β are in R. If the vector field p has no parameters, then the entries of the matrix M_{d,N}(β) are either elements of β or real numbers. Under these assumptions, the problem (16) is in PSPACE [2, Corollary 20] over the field of real numbers⁷, and is at least NP-hard (see [2, Corollary 12] and [8, Theorem 8.2]) independently of the underlying field. In fact, deciding whether the rank of M_{d,N}(β) is less than or equal to a given fixed bound is no harder than deciding the corresponding existential first-order theory.
On the other hand, there is an NP-hard lower bound for the feasibility of the original
set of (biaffine) equations in β and α given in Eq. (13). In the simpler bilinear case and,
assuming, as above, that the vector field has no parameters, finding a nontrivial solution
(α = 0 is trivial) is also NP-hard [8, Theorems 3.7 and 3.8].

5 Related Work and Contributions


The contribution of this work is fourfold.
7
The complexity class depends on the underlying field and is worse for fields with nonzero
characteristic.

Sound and Precise Algebraic Abstraction of Reachable Sets (Section 2). Unlike pre-
vious work [28,23,12,11], we start by studying algebraic initial value problems. We
propose a sound abstraction (Proposition 1) to embed (overapproximate) the reachable
set. Our abstraction relies on the Zariski closure operator over affine varieties (closed
sets of the Zariski topology), which allows a clean and sound geometrical abstraction.
From there, we define the vanishing ideal of the closure, and give a necessary and suf-
ficient condition (Theorem 1) for a polynomial equation to be an invariant for algebraic
initial value problems.
Checking Invariant Varieties by Differential Radical Invariants (Section 3.1). The differential radical characterization allows checking for, and falsifying, the invariance of a variety candidate, unlike already existing proof rules [28,12,17], which are sound but can only prove a restrictive class of invariants. From Theorem 2, we derive a sound and complete proof rule (Eq. (8)) and prove that the problem is decidable (Corollary 1) over the real-closed algebraic fields.
Differential Radical Characterization of Invariant Varieties (Section 3.2). The dif-
ferential radical criterion completely characterizes all invariant varieties of polynomial
vector fields. This new characterization (Theorem 3) permits relating invariant varieties to a purely algebraic, well-behaved conjunction of polynomial equations involving one
polynomial and its successive Lie derivatives (Eq. (9)). It naturally generalizes [9,26]
where linear vector fields are handled and [24,12] where only a restrictive class of in-
variant varieties is considered.
Effective Generation of Invariant Varieties (Section 4). Unlike [28,23,11,22], we do
not use quantifier elimination procedures nor Gröbner Bases algorithms for the genera-
tion of invariant varieties. We have developed and generalized the use of symbolic linear
algebra tools to effectively generate families of invariant varieties (Theorem 4) and to
soundly overapproximate reachable sets (Proposition 3). In both cases, the problem re-
quires maximizing the dimension of the kernel of a symbolic matrix. The complexity is
shown to be NP-hard, but in PSPACE, for polynomial vector fields without parameters.
We also generalize the previous related work on polynomial-consecution. In particular,
Theorems 2 and 4 in [12] are special cases of, respectively, Theorem 4 and Proposi-
tion 3, when the order of differential radical ideals is exactly 1.

6 Case Studies
The following challenging example comes up as a subsystem we encountered when studying aircraft dynamics: p1 = −x2, p2 = x1, p3 = x4², p4 = x3x4. It appears frequently whenever Euler angles and the three-dimensional rotation matrix are used to describe the dynamics of rigid-body motions. For some chosen initial value, such as xι = (1, 0, 0, 1), it is an exact algebraic encoding of the trigonometric functions: x1(t) = cos(t), x2(t) = sin(t), x3(t) = tan(t), x4(t) = sec(t). When
d = 2 and N = 1, the matrix M2,1 (β) is 35 × 15, with 90 (out of 525) nonzero
elements, and |β| = 5. The maximum dimension of ker(M2,1 (β)) is 3 attained for
β = 0. The condition of Proposition 3 is satisfied and, for any xι , we find the following
algebraic invariant equations for the corresponding initial value problem:

$$h_1 = x_1^2 + x_2^2 - x_{\iota 1}^2 - x_{\iota 2}^2 = 0, \qquad h_2 = -x_3^2 + x_4^2 + x_{\iota 3}^2 - x_{\iota 4}^2 = 0 .$$

In particular, for the initial value xι = (1, 0, 0, 1), one recovers two trigonometric identities, namely cos(t)² + sin(t)² − 1 = 0 for h1 and −tan(t)² + sec(t)² − 1 = 0 for h2.
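These two invariant functions are easy to re-check symbolically. A short sketch (ours, with sympy; c1 and c2 stand for the constant initial-value terms) confirms that their Lie derivatives along p vanish identically:

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
X = [x1, x2, x3, x4]
p = [-x2, x1, x4**2, x3*x4]

def lie(h):
    return sp.expand(sum(sp.diff(h, v) * pv for v, pv in zip(X, p)))

c1, c2 = sp.symbols('c1 c2')      # placeholders for the constant initial-value terms
h1 = x1**2 + x2**2 - c1
h2 = -x3**2 + x4**2 - c2
print(lie(h1), lie(h2))           # 0 0: h1 = 0 and h2 = 0 are algebraic invariant equations
```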
For N = 3, the matrix M_{2,3}(β) is 126 × 15, with 693 (out of 1890) nonzero elements, and |β| = 55. We found a β for which the dimension of ker(M_{2,3}(β)) is 5. By Theorem 4, we have a family of invariant varieties for p encoded by the following differential radical invariant: h = γ1 − x3²γ2 + x4²γ2 + x2x4γ3 + x1²γ4 + x2²γ4 + x1x4γ5, where γᵢ, 1 ≤ i ≤ 5, are real numbers. In particular, when (γ1, γ2, γ3, γ4, γ5) = (1, 0, 0, 0, 1), we have the following algebraic invariant equation for p:

$$-1 + x_1 x_4 = 0 \;\wedge\; -x_2 x_4 + x_3 = 0 \;\wedge\; -1 - x_3^2 + x_4^2 = 0 . \qquad (17)$$

Interestingly, since xι = (1, 0, 0, 1) satisfies the above equations, we recover, respectively, the following trigonometric identities:

$$-1 + \cos(t)\sec(t) = 0 \;\wedge\; -\sin(t)\sec(t) + \tan(t) = 0 \;\wedge\; -1 - \tan(t)^2 + \sec(t)^2 = 0 .$$

We stress the fact that Eq. (17) is one algebraic invariant equation for p. In fact, any conjunct alone, apart from −1 − x3² + x4² = 0, of Eq. (17) is not an algebraic invariant equation for p. Indeed, we can falsify the candidate −1 + x1x4 = 0 using Theorem 2: the implication −1 + x1x4 = 0 → −x2x4 + x3 = 0 is obviously false in general.
Notice that h1 and h2 can be found separately by splitting the original vector field into two separate vector fields, since the pairs (p1, p2) and (p3, p4) can be decoupled. However, by decoupling, algebraic invariant equations such as Eq. (17) cannot be found. This clearly shows that, in practice, splitting the vector field into independent ones should be done carefully when it comes to generating invariant varieties. This is somewhat counter-intuitive, as decoupling for the purpose of solving is always desirable. In fact, any decoupling breaks an essential link between all involved variables: time.
We proceed to discuss collision avoidance of two airplanes (Section 6.1) and then the
use of invariant varieties to tightly capture the vertical motion of an airplane
(Section 6.2).

6.1 Collision Avoidance


We revisit the linear vector field encoding Dubins' vehicle model for aircraft [4]. Although the system was discussed in many recent papers [20,23,11], we want to highlight an additional algebraic invariant equation that links both airplanes when turning with the same angular velocity. The differential equation system is given by:

$$p_1 = \dot{x}_1 = d_1,\quad p_2 = \dot{x}_2 = d_2,\quad p_3 = \dot{d}_1 = -\omega_1 d_2,\quad p_4 = \dot{d}_2 = \omega_1 d_1,$$
$$p_5 = \dot{y}_1 = e_1,\quad p_6 = \dot{y}_2 = e_2,\quad p_7 = \dot{e}_1 = -\omega_2 e_2,\quad p_8 = \dot{e}_2 = \omega_2 e_1 .$$

The angular velocities ω1 and ω2 can either be zero (straight-line flight) or equal to a constant ω which denotes the standard rate turn (typically 180°/2 min for usual commercial airplanes). When the two airplanes are manoeuvring with the same standard
rate turn ω, apart from the already known invariants, we discovered the following differential radical invariant (which corresponds to a family of invariant varieties):

$$h_1 = \gamma_1 d_1 + \gamma_2 d_2 + \gamma_3 e_1 + \gamma_4 e_2 = 0 \;\wedge\; h_2 = \gamma_2 d_1 - \gamma_1 d_2 + \gamma_4 e_1 - \gamma_3 e_2 = 0,$$

for arbitrary (γ1, . . . , γ4) ∈ R⁴. We have $\sqrt[L_p]{h_1} = \sqrt[L_p]{h_2} = \langle h_1, h_2 \rangle$. Observe also that V(h1) and V(h2) are not invariant varieties for p.
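The claimed relation between h1, h2 and their Lie derivatives can be re-checked with a few lines of symbolic computation. The sketch is ours; it only needs the heading variables d1, d2, e1, e2 since h1 and h2 do not mention the positions, and it assumes ω1 = ω2 = ω:

```python
import sympy as sp

d1, d2, e1, e2, w = sp.symbols('d1 d2 e1 e2 omega')
g1, g2, g3, g4 = sp.symbols('gamma1 gamma2 gamma3 gamma4')

V = [d1, d2, e1, e2]
p = [-w*d2, w*d1, -w*e2, w*e1]        # both airplanes turn with the same rate omega

def lie(h):
    return sp.expand(sum(sp.diff(h, v) * pv for v, pv in zip(V, p)))

h1 = g1*d1 + g2*d2 + g3*e1 + g4*e2
h2 = g2*d1 - g1*d2 + g4*e1 - g3*e2

print(sp.simplify(lie(h1) - w*h2))    # 0: L_p(h1) =  omega*h2
print(sp.simplify(lie(h2) + w*h1))    # 0: L_p(h2) = -omega*h1, so <h1, h2> is closed under L_p
```

In particular, L_p(h1) is not a multiple of h1 alone, which is consistent with the observation that V(h1) and V(h2) are not invariant varieties individually.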

6.2 Longitudinal Motion of an Airplane

The full dynamics of an aircraft are often separated (decoupled) into different modes
where the differential equations take a simpler form by either fixing or neglecting the
rate of change of some configuration variables [25]. The first standard separation used in
stability analysis gives two main modes: longitudinal and lateral-directional. We study
the 6th order longitudinal equations of motion, as they capture the vertical motion (climbing, descending) of an airplane. We believe that a better understanding of the envelope
that soundly contains the trajectories of the aircraft will help tightening the surrounding
safety envelope and hence help trajectory management systems to safely allow more
dense traffic around airports. The current safety envelope is essentially a rough cylinder
that doesn’t account for the real capabilities allowed by the dynamics of the airplane.
We use our automated invariant generation techniques to characterize such an envelope.
The theoretical improvement and the effective underlying computation techniques de-
scribed earlier in this work allow us to push further the limits of automated invariant
generation. We first describe the differential equations (vector field) then show the non-
trivial energy functions (invariant functions for the considered vector field) we were
able to generate. Let g denote the gravity acceleration, m the total mass of an airplane,
M the aerodynamic and thrust moment w.r.t. the y axis, (X, Z) the aerodynamics and
thrust forces w.r.t. axis x and z, and Iyy the second diagonal element of its inertia ma-
trix. The restriction of the nominal flight path of an aircraft to the vertical plane reduces
the full dynamics to the following 6 differential equations [25, Chapter 5] (u: axial
velocity, w: vertical velocity, x: range, z: altitude, q: pitch rate, θ: pitch angle):

X
u̇ = − g sin(θ) − qw ẋ = cos(θ)u + sin(θ)w θ̇ = q
m
Z M
ẇ = + g cos(θ) + qu ż = − sin(θ)u + cos(θ)w q̇ = .
m Iyy

We encode the trigonometric functions using two additional variables for cos(θ) and
sin(θ), making the total number of variables equal to 8. The parameters are considered
unconstrained. Unlike [23], we do not consider them as new time-independent variables, so that the total number of state variables (n), and hence the degree of the vector field, are unchanged. Instead, they are carried along the symbolic row-reduction computation as
symbols in Md,N (β). For the algebraic encoding of the above vector field (n = 8), the
matrix M3,1 (β) is 495 × 165, with 2115 (out of 81675) nonzero elements, and |β| = 9.
We were able to automatically generate the following three invariant functions:
$$\frac{Mz}{I_{yy}} + g\theta + \Bigl(\frac{X}{m} - qw\Bigr)\cos(\theta) + \Bigl(\frac{Z}{m} + qu\Bigr)\sin(\theta),$$
$$\frac{Mx}{I_{yy}} - \Bigl(\frac{Z}{m} + qu\Bigr)\cos(\theta) + \Bigl(\frac{X}{m} - qw\Bigr)\sin(\theta), \qquad -q^2 + \frac{2M\theta}{I_{yy}} .$$

We substituted the intermediate variables that encode sin and cos back to emphasize the
fact that algebraic invariants and algebraic differential systems are suitable to encode
many real complex dynamical systems. Using our Mathematica implementation, the computation took 1 hour on a recent laptop with 4 GB of RAM and a 1.7 GHz Intel Core i5.
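The three invariant functions can be re-verified independently. The sketch below is ours; it uses sympy's trigonometric functions directly instead of the algebraic sin/cos encoding used for the generation, and confirms that their Lie derivatives along the longitudinal dynamics vanish:

```python
import sympy as sp

u, w, x, z, q, th = sp.symbols('u w x z q theta')
g, m, M, X, Z, Iyy = sp.symbols('g m M X Z I_yy')

state = [u, w, x, z, q, th]
f = [X/m - g*sp.sin(th) - q*w,          # u'
     Z/m + g*sp.cos(th) + q*u,          # w'
     sp.cos(th)*u + sp.sin(th)*w,       # x'
     -sp.sin(th)*u + sp.cos(th)*w,      # z'
     M/Iyy,                             # q'
     q]                                 # theta'

def lie(h):
    return sum(sp.diff(h, v) * fv for v, fv in zip(state, f))

E1 = M*z/Iyy + g*th + (X/m - q*w)*sp.cos(th) + (Z/m + q*u)*sp.sin(th)
E2 = M*x/Iyy - (Z/m + q*u)*sp.cos(th) + (X/m - q*w)*sp.sin(th)
E3 = -q**2 + 2*M*th/Iyy

print([sp.simplify(lie(E)) for E in (E1, E2, E3)])   # [0, 0, 0]
```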

Acknowledgments. We thank the anonymous reviewers for their careful reading and detailed comments. We also would like to very much thank Jean-Baptiste Jeannin and Andrew Sogokon for the multiple questions, various comments and fruitful objections they both had on an early version of this work. We are finally grateful to Eric Goubault and Sylvie Putot for the relevant references they pointed out to us on the integrability theory of nonlinear systems.

7 Conclusion

For polynomial vector fields, we give an algebraic characterization of invariant varieties.


This so-called differential radical characterization makes it possible to decide for the
invariance of a given variety candidate. It is, in addition, computationally attractive:
generating invariant varieties requires minimizing the rank of a symbolic matrix and is
hence at least NP-hard. The case studies show how the technique applies successfully
to rather complex systems. We also revisited some known problems in the literature to
exemplify the benefits of having a necessary and sufficient condition: all other known
sound approaches generate a special class of invariant varieties (i.e. miss others).
In the future, we plan to investigate upper bounds for the order of the differential
radical ideal of a given polynomial. Also, invariant varieties are not the only invariants of interest for polynomial vector fields; we also want to consider semialgebraic sets, as they play
a prominent role in both hybrid systems and control theory. Finally, the effective use of
algebraic invariants in general in the context of hybrid systems is still a challenging
problem that we want to explore in more depth.

References
1. Bochnak, J., Coste, M., Roy, M.F.: Real Algebraic Geometry. A series of modern surveys in
mathematics. Springer (2010)
2. Buss, J.F., Frandsen, G.S., Shallit, J.: The computational complexity of some problems of
linear algebra. J. Comput. Syst. Sci. 58(3), 572–596 (1999)
3. Cox, D.A., Little, J., O’Shea, D.: Ideals, Varieties, and Algorithms: An Introduction to Com-
putational Algebraic Geometry and Commutative Algebra. Springer (2007)
4. Dubins, L.E.: On curves of minimal length with a constraint on average curvature, and
with prescribed initial and terminal positions and tangents. American Journal of Mathemat-
ics 79(3), 497–516 (1957)

5. Ghorbal, K., Platzer, A.: Characterizing algebraic invariants by differential radical invariants.
Tech. Rep. CMU-CS-13-129, School of Computer Science, Carnegie Mellon University,
Pittsburgh, PA, 15213 (November 2013),
http://reports-archive.adm.cs.cmu.edu/anon/2013/
abstracts/13-129.html
6. Hartshorne, R.: Algebraic Geometry. Graduate Texts in Mathematics. Springer (1977)
7. Hilbert, D.: Über die Theorie der algebraischen Formen. Mathematische Annalen 36(4), 473–
534 (1890)
8. Hillar, C.J., Lim, L.H.: Most tensor problems are NP-hard. J. ACM 60(6), 45 (2013)
9. Lafferriere, G., Pappas, G.J., Yovine, S.: Symbolic reachability computation for families of
linear vector fields. J. Symb. Comput. 32(3), 231–253 (2001)
10. Lanotte, R., Tini, S.: Taylor approximation for hybrid systems. In: Morari, Thiele (eds.) [13],
pp. 402–416
11. Liu, J., Zhan, N., Zhao, H.: Computing semi-algebraic invariants for polynomial dynamical
systems. In: Chakraborty, S., Jerraya, A., Baruah, S.K., Fischmeister, S. (eds.) EMSOFT, pp.
97–106. ACM (2011)
12. Matringe, N., Moura, A.V., Rebiha, R.: Generating invariants for non-linear hybrid systems
by linear algebraic methods. In: Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337,
pp. 373–389. Springer, Heidelberg (2010)
13. Morari, M., Thiele, L. (eds.): HSCC 2005. LNCS, vol. 3414. Springer, Heidelberg (2005)
14. Neuhaus, R.: Computation of real radicals of polynomial ideals II. Journal of Pure and Ap-
plied Algebra 124(13), 261–280 (1998)
15. Platzer, A.: Differential dynamic logic for hybrid systems. J. Autom. Reasoning 41(2), 143–
189 (2008)
16. Platzer, A.: Logical Analysis of Hybrid Systems: Proving Theorems for Complex Dynamics.
Springer, Heidelberg (2010)
17. Platzer, A.: A differential operator approach to equational differential invariants - (invited
paper). In: Beringer, L., Felty, A.P. (eds.) ITP. LNCS, vol. 7406, pp. 28–48. Springer (2012)
18. Platzer, A.: Logics of dynamical systems. In: LICS, pp. 13–24. IEEE (2012)
19. Platzer, A.: The structure of differential invariants and differential cut elimination. Logical
Methods in Computer Science 8(4), 1–38 (2012)
20. Platzer, A., Clarke, E.M.: Computing differential invariants of hybrid systems as fixedpoints.
In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 176–189. Springer, Heidel-
berg (2008)
21. Rodrı́guez-Carbonell, E., Kapur, D.: An abstract interpretation approach for automatic gen-
eration of polynomial invariants. In: Giacobazzi, R. (ed.) SAS 2004. LNCS, vol. 3148, pp.
280–295. Springer, Heidelberg (2004)
22. Rodrı́guez-Carbonell, E., Tiwari, A.: Generating polynomial invariants for hybrid systems.
In: Morari, Thiele (eds.) [13], pp. 590–605
23. Sankaranarayanan, S.: Automatic invariant generation for hybrid systems using ideal fixed
points. In: Johansson, K.H., Yi, W. (eds.) HSCC, pp. 221–230. ACM (2010)
24. Sankaranarayanan, S., Sipma, H.B., Manna, Z.: Constructing invariants for hybrid systems.
Formal Methods in System Design 32(1), 25–55 (2008)
25. Stengel, R.F.: Flight Dynamics. Princeton University Press (2004)
26. Tiwari, A.: Approximate reachability for linear systems. In: Maler, O., Pnueli, A. (eds.)
HSCC 2003. LNCS, vol. 2623, pp. 514–525. Springer, Heidelberg (2003)
27. Tiwari, A.: Abstractions for hybrid systems. Formal Methods in System Design 32(1), 57–83
(2008)
28. Tiwari, A., Khanna, G.: Nonlinear systems: Approximating reach sets. In: Alur, R., Pappas,
G.J. (eds.) HSCC 2004. LNCS, vol. 2993, pp. 600–614. Springer, Heidelberg (2004)
Quasi-Equal Clock Reduction:
More Networks, More Queries

Christian Herrera, Bernd Westphal, and Andreas Podelski

Albert-Ludwigs-Universität Freiburg, 79110 Freiburg, Germany

Abstract. Quasi-equal clock reduction for networks of timed automata


replaces equivalence classes of clocks which are equal except for unstable
phases, i.e., points in time where these clocks differ on their valuation,
by a single representative clock. An existing approach yields significant
reductions of the overall verification time but is limited to so-called well-
formed networks and local queries, i.e., queries which refer to a single
timed automaton only. In this work we present two new transformations.
The first, for networks of timed automata, summarises unstable phases
without losing information under weaker well-formedness assumptions
than needed by the existing approach. The second, for queries, now sup-
ports the full query language of Uppaal. We demonstrate that the cost
of verifying non-local properties is much lower in transformed networks
than in their original counterparts with quasi-equal clocks.

1 Introduction
Real-time systems often use distributed architectures and communication pro-
tocols to exchange data in real-time. Examples of such protocols are the classes
of TDMA-based protocols [1] and EPL-based protocols [2].
Real-time systems can be modelled and verified by using networks of timed
automata [3]. In [4] a technique that reduces the number of clocks that model the
local timing behaviour and synchronisation activity of distributed components is
presented in order to reduce the verification runtime of properties in networks of
timed automata that fulfill a set of syntactical criteria called well-formedness. In
systems implementing, e.g., TDMA or EPL protocols this technique eliminates
the unnecessary verification overhead caused by the interleaving semantics of
timed automata, where the automata reset their clocks one by one at the end
of each communication phase. This interleaving induces sets of reachable inter-
mediate configurations which grow exponentially in the number of components
in the system. Model checking tools like Uppaal [5] explore these configurations
even when they are irrelevant for the property being verified. This exploration
unnecessarily increases the overall memory consumption and the runtime of verifying the property.
The notion of quasi-equal clocks was presented in [4] to characterise clocks
that evolve at the same rate and whose valuation only differs in unstable phases,
1
CONACYT (Mexico) and DAAD (Germany) sponsor the work of the first author.


i.e., points in time where these clocks are reset one by one. Sets of quasi-equal
clocks induce equivalence classes in networks of timed automata.
Although the technique introduced in [4] shows promising results for trans-
formed networks, the technique has two severe drawbacks. Namely, it loses all the
information from intermediate configurations and it supports only local queries, i.e., properties defined over a single timed automaton of well-formed networks. A
concrete consequence of these drawbacks can be observed in the system with
quasi-equal clocks presented in [6] which implements an EPL protocol. In the
transformed model of this system it is not possible to perform the sanity check
that a given automaton receives configuration data from other system compo-
nents right before this automaton resets its quasi-equal clock. The check involves
querying information of several automata from intermediate configurations. Sys-
tem properties are quite often expressed in terms of several automata.
To overcome these limitations, in this work we revisit the reduction of quasi-
equal clocks in networks of timed automata, and we present an approach based
on the following new idea. For each set of quasi-equal clocks we summarise
unstable configurations using dedicated locations of automata introduced during
network transformation. Queries which explicitly refer to unstable configurations
are rewritten to refer to the newly introduced summary location instead. The
dedicated summary locations also allow us to support complex resetting edges
in the original model, i.e. edges with synchronisation of assignments other than
clock resets. This allows us to extend the class of queries that we support. Our new approach is also a source-to-source transformation, i.e., it can be used with a wide range of model-checking tools.
Our approach aims to provide the modelling engineer with a system optimisation technique which allows him to model systems naturally, without having to optimise them for verification. Our contributions are: (1) We now support
properties referring to multiple timed automata, in particular properties which
query (possibly overlapping) unstable configurations. (2) We enlarge the appli-
cability of our new approach by relaxing the well-formedness criteria presented
in [4]. Our approach allows us to prove in a much simpler and more elegant way
(without a need for the reordering lemma from [4]) that transformed networks
are weakly bisimilar to their original counterparts. We show that properties wrt.
an original network are fully preserved in the transformed network, i.e., the
transformed network satisfies a transformed property if and only if the original
network satisfies the original property. We evaluate our approach on six real
world examples, three of them new, where we observe significant improvements
in the verification cost of non-local queries compared to the cost of verifying
them in the original networks.
The paper is organized as follows. In Section 2, we provide basic definitions.
Section 3 introduces the formal definition of well-formed networks and presents
the algorithm that implements our approach. In Section 4, we formalise the
relation of a well-formed network and its transformed network and prove the
correctness of our approach. In Section 5, we compare the verification time of six
real world examples before and after applying our approach. Section 6 concludes.

Related Work. The methods in [7–9] eliminate clocks by using static analysis
over single timed automata, networks of timed automata, and parametric timed
automata, respectively. The approaches in [7, 8] reduce the number of clocks in
timed automata by detecting equal and active clocks. Two clocks are equal in a
location if both are reset by the same incoming edge, so just one clock for each
set of equal clocks is necessary to determine the future behavior of the system.
A clock is active at a certain location if this clock appears in the invariant of
that location, or in the guard of an outgoing edge of such a location, or another
active clock takes its value when taking an outgoing edge. Non-active clocks play
no role in the future evolution of the system and therefore can be eliminated.
In [9] the same principle of active clocks is used in parametric timed automata.
Our benchmarks use at most one clock per component which is always active,
hence the equal and active approach is not applicable on them.
The work in [10, 11] uses observers, i.e., single components encoding properties
of a system, to reduce clocks in systems. For each location of the observer, the
technique can deactivate clocks if they do not play a role in the future evolution
of this observer. Processing our benchmarks in order to encode properties as per
the observers approach may be more expensive than our method (one observer
per property), and may not guarantee the preservation of information from in-
termediate configurations which in the case of our EPL benchmark is needed. In
general using observers to characterise non-local queries is not straightforward.
In sequential timed automata [12], one set of quasi-equal clocks is syntacti-
cally declared. Those quasi-equal clocks are implicitly reduced by applying the
sequential composition operator. The work in [13] avoids the use of shared clocks
in single timed automaton by replacing shared clocks with fresh ones if the
evolution of these automata does not depend on these clocks. This approach increases the number of clocks (in contrast to ours). Our benchmarks do not
use shared clocks. The approach in [14] detects quasi-equal clocks in networks of
timed automata. Interestingly, the authors demonstrate the feasibility of their
approach in benchmarks that we also use in this paper.

2 Preliminaries

Following the presentation in [15], we here recall the following definitions.


Let X be a set of clocks. The set Φ(X ) of simple clock constraints over X
is defined by the grammar ϕ ::= x ∼ c | x − y ∼ c | ϕ1 ∧ ϕ2 where x, y ∈
X , c ∈ Q≥0 , and ∼ ∈ {<, ≤, ≥, >}. Let Φ(V) be a set of integer constraints
over variables V. The set Φ(X , V) of constraints comprises Φ(X ), Φ(V), and
conjunctions of clock and integer constraints. We use clocks(ϕ) and vars(ϕ)
to respectively denote the set of clocks and variables occurring in a constraint
ϕ. We assume the canonical satisfaction relation “|=” between valuations ν :
X ∪ V → Time ∪ Z and constraints, with Time = R≥0. A timed automaton A is a tuple (L, B, X, V, I, E, ℓini), which consists of a finite set of locations L, where ℓini ∈ L is the initial location, a finite set B of actions comprising the internal action τ, finite sets X and V of clocks and variables, a mapping
I : L → Φ(X), that assigns to each location a clock constraint, and a set of edges E ⊆ L × B × Φ(X, V) × R(X, V) × L. An edge e = (ℓ, α, ϕ, r⃗, ℓ′) ∈ E from location ℓ to ℓ′ involves an action α ∈ B, a guard ϕ ∈ Φ(X, V), and a reset vector r⃗ ∈ R(X, V). A reset vector is a finite, possibly empty sequence of clock resets x := 0, x ∈ X, and assignments v := ψint, where v ∈ V and ψint is an integer expression over V. We write X(A), ℓini(A), etc., to denote the set of clocks, the initial location, etc., of A; clocks(r⃗) and vars(r⃗) to denote the sets of clocks and variables occurring in r⃗, respectively. We use β(e) to denote the set of basic elements (locations, reset vector, etc.) of an edge e ∈ E(A). We use the following operation of complementation on actions, defined by $\overline{\alpha!} = \alpha?$, $\overline{\alpha?} = \alpha!$, and $\bar{\tau} = \tau$. A network N (of timed automata) consists of a finite set A1, . . . , AN of timed automata with pairwise disjoint sets of clocks and pairwise disjoint sets of locations, and a set B(N) ⊆ ⋃_{i=1}^N B(Ai) of broadcast channels. We write A ∈ N if and only if A ∈ {A1, . . . , AN}.
The operational semantics of the network N is the labelled transition system T(N) = (Conf(N), Time ∪ {τ}, {→λ | λ ∈ Time ∪ {τ}}, Cini). The set of configurations Conf(N) consists of pairs of location vectors ⟨ℓ1, . . . , ℓN⟩ from ×_{i=1}^N L(Ai) and valuations of ⋃_{1≤i≤N} X(Ai) ∪ V(Ai) which satisfy the constraint ⋀_{i=1}^N I(ℓi). We write ℓs,i, 1 ≤ i ≤ N, to denote the location which automaton Ai assumes in configuration s = ⟨ℓ⃗s, νs⟩ and νs,i to denote νs|_{V(Ai)∪X(Ai)}. Between two configurations s, s′ ∈ Conf(N) there can be four kinds of transitions. There is a delay transition ⟨ℓ⃗s, νs⟩ →t ⟨ℓ⃗s, νs + t⟩ if νs + t′ ⊨ ⋀_{i=1}^N Ii(ℓs,i) for all t′ ∈ [0, t], where νs + t′ denotes the valuation obtained from νs by time shift t′. There is a local transition ⟨ℓ⃗s, νs⟩ →τ ⟨ℓ⃗s′, νs′⟩ if there is an edge (ℓs,i, τ, ϕ, r⃗, ℓs′,i) ∈ E(Ai), 1 ≤ i ≤ N, such that ℓ⃗s′ = ℓ⃗s[ℓs,i := ℓs′,i], νs ⊨ ϕ, νs′ = νs[r⃗], and νs′ ⊨ Ii(ℓs′,i). There is a synchronization transition ⟨ℓ⃗s, νs⟩ →τ ⟨ℓ⃗s′, νs′⟩ if there are 1 ≤ i, j ≤ N, i ≠ j, a channel b ∈ B(Ai) ∩ B(Aj), and edges (ℓs,i, b!, ϕi, r⃗i, ℓs′,i) ∈ E(Ai) and (ℓs,j, b?, ϕj, r⃗j, ℓs′,j) ∈ E(Aj) such that ℓ⃗s′ = ℓ⃗s[ℓs,i := ℓs′,i][ℓs,j := ℓs′,j], νs ⊨ ϕi ∧ ϕj, νs′ = νs[r⃗i][r⃗j], and νs′ ⊨ Ii(ℓs′,i) ∧ Ij(ℓs′,j). Let b ∈ B be a broadcast channel and 1 ≤ i0 ≤ N such that (ℓs,i0, b!, ϕi0, r⃗i0, ℓs′,i0) ∈ E(Ai0). Let 1 ≤ i1, . . . , ik ≤ N, k ≥ 0, be those indices different from i0 such that there is an edge (ℓs,ij, b?, ϕij, r⃗ij, ℓs′,ij) ∈ E(Aij). There is a broadcast transition ⟨ℓ⃗s, νs⟩ →τ ⟨ℓ⃗s′, νs′⟩ in T(N) if ℓ⃗s′ = ℓ⃗s[ℓs,i0 := ℓs′,i0] · · · [ℓs,ik := ℓs′,ik], νs ⊨ ⋀_{j=0}^k ϕij, νs′ = νs[r⃗i0] · · · [r⃗ik], and νs′ ⊨ ⋀_{j=0}^k Iij(ℓs′,ij). Cini = {⟨ℓ⃗ini, νini⟩} ∩ Conf(N), where ℓ⃗ini = ⟨ℓini,1, . . . , ℓini,N⟩ and νini(x) = 0 for each x ∈ X(Ai), 1 ≤ i ≤ N.
A finite or infinite sequence σ = s0 →λ1 s1 →λ2 s2 . . . is called a transition sequence (starting in s0 ∈ Cini) of N. The sequence σ is called a computation of N if and only if it is infinite and s0 ∈ Cini. We denote the set of all computations of N by Π(N). A configuration s is called reachable (in T(N)) if and only if there exists a computation σ ∈ Π(N) such that s occurs in σ.
The set of basic formulae over N is given by the grammar β ::= ℓ | ¬ℓ | ϕ, where ℓ ∈ L(Ai), 1 ≤ i ≤ n, and ϕ ∈ Φ(X(N), V(N)). A basic formula β is satisfied by a configuration s ∈ Conf(N) if and only if ℓs,i = ℓ, ℓs,i ≠ ℓ, or νs ⊨ ϕ, respectively. A reachability query EPF over N is ∃♦ CF where CF is a configuration formula over N, i.e., any logical connection of basic formulae. We use β(CF) to denote the set of basic formulae in CF. N satisfies ∃♦ CF, denoted by N ⊨ ∃♦ CF, if and only if there is a configuration s reachable in T(N) s.t. s ⊨ CF.
We recall from [4] the following definitions. Given a network N with clocks X, two clocks x, y ∈ X are called quasi-equal, denoted by x ≃ y, if and only if for all computation paths of N, the valuations of x and y are equal, or the valuation of one of them is equal to 0, i.e., if ∀ s0 →λ1 s1 →λ2 s2 · · · ∈ Π(N) ∀ i ∈ N0 • νsi ⊨ (x = 0 ∨ y = 0 ∨ x = y). In the following, we use EC_N to denote the set {Y ∈ X/≃ | 1 < |Y|} of equivalence classes of quasi-equal clocks of N with at least two elements. For each Y ∈ X/≃, we assume a designated representative denoted by rep(Y). For x ∈ Y, we use rep(x) to denote rep(Y). Given a constraint ϕ ∈ Φ(X, V), we write Γ(ϕ) to denote the constraint that is obtained by syntactically replacing each occurrence of a clock x ∈ X in ϕ by the representative rep(x). Given an automaton A ∈ N, a set of clocks X ⊆ X(A), and a set of variables V ⊆ V(A), we use SE_X(A) to denote the set of simple resetting edges of A which reset clocks from X, have action τ, have no variables occurring in their guards, and do not update any variables, i.e., SE_X(A) = {(ℓ, α, ϕ, r⃗, ℓ′) ∈ E(A) | clocks(r⃗) ∩ X ≠ ∅ ∧ α = τ ∧ vars(ϕ) = ∅ ∧ vars(r⃗) = ∅}. We use CE_X(A) to denote the set of complex resetting edges of A which reset clocks from X and have an action different from τ or update some variables, i.e., CE_X(A) = {(ℓ, α, ϕ, r⃗, ℓ′) ∈ E(A) | clocks(r⃗) ∩ X ≠ ∅ ∧ (vars(r⃗) ∩ V ≠ ∅ ∨ α ≠ τ)}. We use LS_X(A) and LC_X(A) to respectively denote the set of locations (source and destination) of simple and complex resetting edges wrt. X of A. We use E_X(A) = SE_X(A) ∪ CE_X(A) to denote the set of resetting edges of A which reset clocks from X, and RES_X(N) to denote the set of automata in N which have a resetting edge, i.e., RES_X(N) = {A ∈ N | E_X(A) ≠ ∅}. A location ℓ (ℓ′) is called a reset (successor) location wrt. Y ∈ EC_N in N if and only if there is a resetting edge in SE_Y(A) ∪ CE_Y(A) from (to) ℓ (ℓ′). We use RL_Y(N) (RL⁺_Y(N)) to denote the set of reset (successor) locations wrt. Y in N and we set RL_{EC_N}(N) := ⋃_{Y∈EC_N} RL_Y(N) and similarly RL⁺_{EC_N}(N).
A configuration s ∈ Conf(N) is called stable wrt. Y ∈ EC_N if and only if all clocks in Y have the same value in s, i.e., if ∀ x ∈ Y • νs(x) = νs(rep(x)). We use SC^Y_N to denote the set of all configurations that are stable wrt. Y and SC_N to denote the set ⋂_{Y∈EC_N} SC^Y_N of globally stable configurations of N. Configurations not in SC_N are called unstable. An edge e of a timed automaton A in network N is called delayed if and only if time must pass before e can be taken, i.e., if ∀ s0 →λ1,E1 s1 . . . sn−1 →λn,En sn ∈ Π(N) • e ∈ En ⟹ ∃ 0 ≤ j < n • λj ∈ Time \ {0} ∧ ∀ j ≤ i < n • E(A) ∩ Ei = ∅, where we write si →λi,Ei si+1, i ∈ N_{>0}, to denote that the transition si →λi si+1 is justified by the set of edges Ei; Ei is empty for delay transitions, i.e., if λi ∈ Time. We say EC_N-reset edges are pre/post delayed in network N if and only if all edges originating in reset or reset successor locations are delayed, i.e., if ∀ e = (ℓ, α, ϕ, r⃗, ℓ′) ∈ E(N) • ℓ ∈ RL_{EC_N}(N) ∪ RL⁺_{EC_N}(N) ⟹ e is delayed.

Fig. 1. Model of a chemical plant controller with quasi-equal clocks (automata A1, with clock x and variable closed, and A2, with clock y; both cycle through waiting and filling phases and reset their clocks at time 60)

3 Reducing Clocks in Networks of Timed Automata


Consider the following motivating example of a distributed chemical plant con-
troller. At the end of every minute, the controller fills two containers with gas,
one for at most 10 seconds and the other for at most 20 seconds. Figure 1 shows a model of this system in the form of the network N1, which is composed of automata
A1 and A2 with respective clocks x and y. Additionally, automaton A1 has the
boolean variable closed that is set to true, i.e., closed := 1 , when A1 has filled
its container. Both automata start in a waiting phase at the point in time 0
and after filling the containers they wait for the next round. Both clocks x and
y, together with the variable closed are respectively reset and updated at the
point in time 60. Yet, in the strict interleaving semantics of networks of timed
automata, these resets occur one after the other.
According to the definition of quasi-equal clocks, clocks x and y are quasi-
equal because their valuations are only different from each other when they
are reset at the point in time 60. Now consider verifying in N1 , whether the
container of automaton A1 is closed before automaton A2 resets its clock. A
query that states this property is ∃♦ φ with configuration formula φ : closed =
1 ∧ y ≥ 60. Clearly in N1 , this query is satisfied only when clocks x and y have
different valuations, i.e., in unstable configurations. Property ∃♦ φ cannot be
treated by the approach in [4] since that approach supports only local queries,
i.e., queries which refer to properties of at most one automaton. The approach
in [4] completely eliminates all unstable configurations, those where quasi-equal
clocks have different valuations, since no alternative representation of them was
proposed for transformed models. Furthermore, N1 does not satisfy the well-
formedness criteria of [4] because the resetting edge also assigns a variable.
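The quasi-equality of x and y in N1 can be seen already on sampled clock valuations along a run. The following toy check is ours, purely illustrative and independent of any model checker; the sampled run is a hypothetical prefix around the reset at time 60:

```python
def quasi_equal(valuations, x, y):
    # x and y are quasi-equal along the sampled run iff in every configuration
    # they agree, or one of them has just been reset to 0
    return all(v[x] == v[y] or v[x] == 0 or v[y] == 0 for v in valuations)

# sampled configurations of N1 around the reset at time 60: A1 resets x first,
# then A2 resets y, so the clocks differ only while one of them is already 0
run = [{'x': 0, 'y': 0}, {'x': 59, 'y': 59}, {'x': 60, 'y': 60},
       {'x': 0, 'y': 60}, {'x': 0, 'y': 0}]
print(quasi_equal(run, 'x', 'y'))   # True
```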

3.1 Transformational Reduction of Quasi-Equal Clocks


In the following we present an algorithm which reduces a given set of quasi-
equal clocks in networks of timed automata and preserves all possible queries.
For simplicity, we impose a set of syntactical criteria called well-formedness rules
over networks of timed automata.
Definition 1 (Well-formed Network). A network N is called well-formed if
and only if it satisfies the following restrictions for each set of quasi-equal clocks
Y ∈ EC N :
Quasi-Equal Clock Reduction: More Networks, More Queries 301

(R1) An edge resets at most one clock x ∈ Y, in the constraint (guard) of this edge there is a clause of the form x ≥ CY, and the source location of that edge has an invariant x ≤ CY for some constant CY > 0, i.e.,

∃ CY ∈ N_{>0} ∀ A ∈ N ∀ (ℓ, α, ϕ, r⃗, ℓ′) ∈ EY(A) ∃ x ∈ Y, ϕ0 ∈ β(ϕ) •
  clocks(r⃗) = {x} ∧ I(ℓ) = x ≤ CY ∧ ϕ0 = x ≥ CY
  ∧ ∀ ϕ1 ∈ β(ϕ) • clocks(ϕ1) ≠ ∅ ⟹ ϕ1 = ϕ0 .

(R2) Resetting edges do not coincide on source locations.

∀ A ∈ N ∀ (ℓi, αi, ϕi, r⃗i, ℓ′i) ≠ (ℓj, αj, ϕj, r⃗j, ℓ′j) ∈ EY(A) • ℓi ≠ ℓj .

(R3) For pairs of edges that synchronise on some channel a ∈ B(N), either all edges reset a clock from Y, or none of these edges resets a clock from Y, or the output a! is in one edge resetting a clock from Y, and the inputs a? are in the edges of automata which do not reset clocks from Y, i.e.,

∀ A1 ≠ A2 ∈ N ∀ ei = (ℓi, αi, ϕi, r⃗i, ℓ′i) ∈ E(Ai), i = 1, 2, α1 = ᾱ2 •
  (e1 ∉ EY(A1) ∧ e2 ∉ EY(A2)) ∨ (e1 ∈ EY(A1) ∧ e2 ∈ EY(A2))
  ∨ (∃ i ∈ {1, 2}, a ∈ B(N) • αi = a! ∧ ei ∈ EY(Ai) ∧ A_{3−i} ∉ RES_Y(N)) .

(R4) At most one clock from Y occurs in the guard of any edge, i.e.,

∀ (ℓ, α, ϕ, r⃗, ℓ′) ∈ E(N) • |clocks(ϕ) ∩ Y| ≤ 1 .

The transformation algorithm presented here, which was developed in order to support all queries, and in particular those interested in unstable configurations, allows us to easily relax the syntactical restrictions presented in [4]. The relaxations done in this work are the following. By restriction R1, now looped edges or those edges from initial locations can reset clocks from Y ∈ EC_N as well as update variables, and we now allow the guard of such edges to conjoin integer constraints over variables. By R2 we now allow more edges from a reset location (but still only one resetting edge from it). By R3, we now allow a resetting edge to have a limited but still useful synchronisation. The new well-formedness criteria are less restrictive than they look at first sight. They allow us to extend the applicability of our new approach by treating three new case studies. Note that the network in Figure 1 satisfies the new well-formedness criteria.
In the following we describe the transformation algorithm K. It works with two given inputs, a well-formed network N and a set of equivalence classes EC_N = {Y1, . . . , Yn} of quasi-equal clocks. The output of K is the transformed network N′ = {A′1, . . . , A′n} ∪ {RY | Y ∈ EC_N} with broadcast channels B(N′) = B(N) ∪ {reset_Y | Y ∈ EC_N}. The automata A′i are obtained by repeatedly applying (in any order) the algorithm K0 to Ai for each equivalence class in EC_N, i.e., A′i = K0(. . . K0(Ai, Y1), . . . , Yn). Algorithm K0 is defined as follows:

K0(A, Y) = A, if A ∉ RES_Y(N), and K0(A, Y) = (L′, B′, X′, V′, I′, E′, ℓ′ini), otherwise,

Fig. 2. The model of the chemical plant controller after applying K (the transformed automata refer only to the representative clock x, and a new resetter automaton RY with locations ℓini,RY and ℓnst,Y performs the reset via the broadcast channel reset_Y, using the counters rst_I^Y and rst_O^Y)

where, per equivalence class,

– intermediate locations for each complex resetting edge are added, L′ = L(A) ∪ Ξ_Y(A) with Ξ_Y(A) = {ξ_{Y,e} | e ∈ CE_Y(A)}, ℓ′ini = ℓini(A),
– the broadcast channel reset_Y is added, B′ = B(A) ∪ {reset_Y}; all clocks except for the representative clock are deleted, X′ = (X(A) \ Y) ∪ {rep(Y)}; and rst variables are added, V′ = V ∪ {rst^I_Y, rst^O_Y},
– quasi-equal clocks occurring in invariants are replaced by the respective representative clock, I′(ℓ) = I(ℓ)[y/rep(y) | y ∈ Y] for ℓ ∈ L(A), and a zero delay invariant is added to each intermediate location, I′(ℓ) = rep(y) ≤ 0 for ℓ ∈ Ξ_Y(A). On non-resetting edges each quasi-equal clock is replaced by the respective representative clock; the input reset_Y? is placed on, and the reset of quasi-equal clocks is removed from, simple edges; and intermediate locations and reset successor locations are linked, respectively,

E′ = {(ℓ, α, ϕ[y/rep(y) | y ∈ Y], r⃗; ρ_e, ℓ′) | e = (ℓ, α, ϕ, r⃗, ℓ′) ∈ E(A) \ E_Y(A)}
   ∪ {(ℓ, reset_Y?, ϕ[y ∼ c/true | y ∈ Y], r⃗[y := 0/ε | y ∈ Y]; ρ_e, ℓ′) | e = (ℓ, τ, ϕ, r⃗, ℓ′) ∈ SE_Y(A)}
   ∪ {(ξ_{Y,e}, α, ϕ[y ∼ c/true | y ∈ Y], r⃗[y := 0/ε | y ∈ Y]; ρ_e, ℓ′), (ℓ, reset_Y?, true, ε, ξ_{Y,e}) | e = (ℓ, α, ϕ, r⃗, ℓ′) ∈ CE_Y(A)}

where ε denotes the empty reset sequence and the reset sequence ρ_e = r1; r2; r3 depends on the edge e as follows:

– r1 = rst^I_Y := rst^I_Y + 1 if e is to a reset location in RL_Y(N), and r1 = ε otherwise,
– r2 = rst^I_Y := rst^I_Y − 1 if e is from a reset location in RL_Y(N) and e ∉ E_Y(A), and r2 = ε otherwise, and
– r3 = rst^O_Y := rst^O_Y − 1 if e ∈ E_Y(A), and r3 = ε otherwise.

The resetter RY for equivalence class Y is the timed automaton

({ℓini,RY, ℓnst,Y}, {reset_Y}, {rep(Y)}, {rst^I_Y := iL_Y, rst^O_Y := n_Y}, I, E, ℓini,RY).

It initializes the variable rst^I_Y to iL_Y := |{A ∈ N | ℓini,A ∈ RL_Y(N)}|, i.e. the number of automata whose initial location is a reset location of Y, and rst^O_Y to n_Y := |RES_Y(N)|, i.e. the number of automata that reset the clocks of Y. There are two locations with the invariants I(ℓini,RY) = true and I(ℓnst,Y) = rep(Y) ≤ 0. The set of edges E consists of

(ℓini,RY, reset_Y!, (rst^I_Y = n_Y ∧ rep(Y) ≥ C_Y), rst^I_Y := 0; rep(Y) := 0, ℓnst,Y)

and

(ℓnst,Y, τ, (rst^O_Y = 0 ∧ rep(Y) ≤ 0), rst^O_Y := n_Y, ℓini,RY),

where C_Y is the time at which the clocks in Y are reset (cf. R1).

Example 1. Applying K to N1 from Figure 1 yields network N1′ (cf. Figure 2). Similar to the algorithm in [4], only the representative clock of each equivalence class remains. All guards and invariants with quasi-equal clocks are re-written to refer to the representative clock, and the reset operation is delegated to the resetter. The variable rst^I_Y together with well-formedness enforces a blocking multicast synchronisation between the resetter and the automata in RES_Y(N).

In order to support non-local queries, and in particular queries for possibly overlapping unstable configurations, the approach presented here introduces one resetter per equivalence class, with two locations each. The location ℓnst,Y represents all unstable configurations wrt. Y. To support complex edges, and thus non-trivial behaviour during unstable phases, complex edges are basically split into two. The first one synchronises with the resetter and the second one carries out the actions of the original complex edge. As long as the second edge has not been taken, the system is unstable. The variable rst^O_Y is introduced to indicate to automaton RY when this instability ends. Its value gives the number of automata which still need to take their reset edge in the current unstable phase.

In N1′, we have thereby eliminated the interleaving induced by resetting the clocks x and y in N1, but the interleaving wrt. variable updates during the reset of quasi-equal clocks is preserved by splitting the complex edge into two. Note that in transformed networks, configurations with the locations ℓnst,Y1, . . . , ℓnst,Yn, where 1 < n, reflect overlapping unstable phases, i.e. instability wrt. multiple equivalence classes at one point in time.

The following function Ω syntactically transforms properties over a well-formed network N into properties over N′ = K(N, EC_N). Function Ω treats queries for source or destination locations of resetting edges specially and outputs an equivalent property which can be verified in N′.

For instance, consider a simple resetting edge e ∈ SE_Y(A) of some A ∈ N. The source location ℓ of e can be assumed in N in different configurations: either the reset time is not yet reached, or the reset time is reached but A did not reset yet, while other automata in RES_Y(N) may have reset their clocks already. In N′, all edges resulting from simple edges fire at once on the broadcast synchronisation, so all source locations are left together. Because the resetter moves to ℓnst,Y, a configuration of N′ which assumes location ℓnst,Y represents all similar configurations of N where all simple edges are in their source or destination location. Thus the location ℓ is reachable in N if and only if (i) N′ reaches ℓnst,Y, or (ii) ℓ is reached while being stable, i.e., not being in ℓnst,Y. A similar reasoning is applied to properties querying elements of a complex resetting edge wrt. Y, but instead of using ℓnst,Y we use the intermediate location ξ_{Y,e} from N′, since this location represents instability before updating any variable that occurs in a complex edge.

In this sense, configurations involving location ℓnst,Y summarise unstable phases of N. Assuming ℓnst,Y in N′ represents both cases for a simple edge, namely that it has already been taken or not, and that the clock x reset by this edge is still C_Y or already 0. Although involving two choices, there are essentially two cases (not four): having taken the reset edge and being unstable implies that x is 0 and some other clocks are still C_Y, or x is still C_Y and some other clocks are already 0. To this end, we introduce fresh existentially quantified variables ℓ̃ and x̃ in Ω0 and conjoin it with a consistency conjunction. By R1, we only need to consider 0 and C_Y as values of x̃; thus the existential quantification can be rewritten into a big disjunction, and hence is a proper query.
Definition 2 (Function Ω). Let Y ∈ EC_N be a set of clocks of a well-formed
network N and let N′ = K(N, EC_N). Let CY be the constant described in
restriction R1. Let ℓnst,Y be the unique non-initial location of RY, the resetter
automaton wrt. Y in N′. Let β be a basic formula over N. Then the function Ω
is defined as follows, where LY = LS_Y(N) ∪ (LC_Y(N) ∩ RL_Y(N)):

Ω0(β) =
  (ℓ ∧ ¬ℓnst,Y) ∨ (ℓnst,Y ∧ ℓ̃)        if β = ℓ, ℓ ∈ LY,
  (¬ℓ ∧ ¬ℓnst,Y) ∨ (ℓnst,Y ∧ ¬ℓ̃)      if β = ¬ℓ, ℓ ∈ LY,
  (Γ(ϕ) ∧ ¬ℓnst,Y) ∨ (ℓnst,Y ∧ ϕ̃)     if β = ϕ, where ϕ̃ = ϕ[x/x̃ | x ∈ X(N)],
  β                                     otherwise.

Ω(CF) = ∃ x̃1, .., x̃k ∃ ℓ̃1, .., ℓ̃m • Ω0(CF)
  ∧ ⋀ (ℓ̃i ⟹ ξY,e)       over the complex resetting edges e = (ℓ, α, ϕ, r, ℓ′) ∈ CE_Y(A)
  ∧ ⋀ ¬(ℓ̃i ∧ ℓ̃j)         over 1 ≤ i ≠ j ≤ m, 1 ≤ p ≤ n, ℓi, ℓj ∈ Lp
  ∧ ⋀ (ℓ̃i ⟹ x̃j = CY)    over 1 ≤ i ≤ m, 1 ≤ j ≤ k, 1 ≤ p ≤ n, xj ∈ Xp ∩ Y, ℓi ∈ Lp ∩ (RL_Y(N) \ RL+_Y(N))
  ∧ ⋀ (ℓ̃i ⟹ x̃j = 0)     over 1 ≤ i ≤ m, 1 ≤ j ≤ k, 1 ≤ p ≤ n, xj ∈ Xp ∩ Y, ℓi ∈ Lp ∩ (RL+_Y(N) \ RL_Y(N))
  ∧ ⋀ (ℓ̃i ⟹ ℓ′)          over the simple resetting edges (ℓ, α, ϕ, r, ℓ′) ∈ SE_Y(A), ℓi ∈ {ℓ, ℓ′}.

For example, for Ω(φ) we obtain, after some simplifications given that A2 has
only simple resetting edges, the following transformed formula:
∃ x̃ ∈ {0, CY} • closed = 1 ∧ ((x ≤ 60 ∧ ¬ℓnst,Y) ∨ (ℓnst,Y ∧ x̃ ≥ 60)).
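Since x̃ ranges over the finite set {0, CY}, the quantifier can be unfolded mechanically; the following is a routine expansion (ours, not taken verbatim from the paper), where the first disjunct collapses because 0 ≥ 60 is false:

```latex
\begin{aligned}
      & \big(\mathit{closed}=1 \wedge ((x \le 60 \wedge \neg\ell_{nst,Y}) \vee (\ell_{nst,Y} \wedge 0 \ge 60))\big) \\
 \vee\ & \big(\mathit{closed}=1 \wedge ((x \le 60 \wedge \neg\ell_{nst,Y}) \vee (\ell_{nst,Y} \wedge C_Y \ge 60))\big) \\
 \equiv\ & \mathit{closed}=1 \wedge \big((x \le 60 \wedge \neg\ell_{nst,Y}) \vee (\ell_{nst,Y} \wedge C_Y \ge 60)\big).
\end{aligned}
```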

4 Formal Relation of a Well-formed Network and Its Transformed Network
In order to prove our approach correct we establish a weak bisimulation relation
between a well-formed network and its respective transformed network. To this
end, we firstly extend the notion of (un)stability to N′ as follows.

Definition 3 (Stable Configuration of N′). Let N be a network and let
Y ∈ EC_N be a set of quasi-equal clocks. Let N′ = K(N, EC_N).
A configuration r ∈ Conf(N′) is called stable wrt. Y if and only if the initial
location ℓini,RY of resetter RY ∈ N′ occurs in r, i.e., if r |= ℓini,RY. We use
SC^Y_N′ to denote the set of all configurations that are stable wrt. Y and SC_N′ to
denote the set ⋂_{Y ∈ EC_N} SC^Y_N′ of globally stable configurations of N′. We call a
configuration r ∉ SC_N′ unstable.

We recall that configurations induced when each clock from Y ∈ EC_N is
reset in well-formed networks N are summarised in transformed networks N′ in
configurations where the location ℓnst,RY occurs together with the valuations of
rst^I_Y and rst^O_Y reflecting these resets. Hence, with the valuations of rst^I_Y and
rst^O_Y we unfold the information summarised in these configurations of N′.

Lemma 1 (Weak Bisimulation)
Any well-formed network N where EC_N-reset edges are pre/post delayed is
weakly bisimilar to N′ = K(N, EC_N), i.e., there is a weak bisimulation relation
S ⊆ Conf(N) × Conf(N′) such that
1. ∀ s ∈ Cini(N) ∃ r • (s, r) ∈ S and ∀ r ∈ Cini(N′) ∃ s • (s, r) ∈ S.
2. For all configuration formulae CF over N, ∀ (s, r) ∈ S • s |= CF ⟹ r |= Ω(CF),
and ∀ r ∈ CONS_EC_N • r |= Ω(CF) ⟹ ∃ s ∈ Conf(N) • (s, r) ∈ S ∧ s |= CF.
3. For all (s, r) ∈ S,
(a) if s −λ→ s′ with
  i. s ∈ SC^Y_N, s′ ∉ SC^Y_N, where Y ∈ EC_N, and justified by a simple
     resetting edge, there is r′ such that r −λ→ r′ and (s′, r′) ∈ S.
  ii. s, s′ ∉ SC^Y_N, or s ∉ SC^Y_N and s′ ∈ SC^Y_N, where Y ∈ EC_N, and
     justified by a simple resetting edge, then r −0→ r and (s′, r) ∈ S.
  iii. s, s′ ∈ SC^Y_N, or s ∈ SC^Y_N and s′ ∉ SC^Y_N, where Y ∈ EC_N, and
     justified by the set CE′_Y ⊆ CE_Y(N) of complex resetting edges wrt. Y,
     then there exist r′, r′′ such that r −τ→ r′ −λ→ r′′ and (s, r′), (s′, r′′) ∈ S.
  iv. s ∉ SC^Y_N, s′ ∈ SC^Y_N, where Y ∈ EC_N, and justified by CE′_Y ⊆
     CE_Y(N), there is r′ s.t. r −λ→ r′ and (s′, r′) ∈ S.
  v. s, s′ ∈ SC^Y_N, r |= ℓnst,RY for some Y ∈ EC_N, and λ = d > 0, there
     exist r′, r′′ such that r −τ→ r′ −λ→ r′′ and (s, r′), (s′, r′′) ∈ S.
  vi. Otherwise there exists r′ such that r −λ→ r′ and (s′, r′) ∈ S.
(b) if r −λ→ r′ with
  i. r ∈ SC^Y_N′, r′ ∉ SC^Y_N′, where Y ∈ EC_N, νr′(rst^O_Y) < N, where N =
     νr(rst^O_Y), there exist s1, . . . , sn where n = N − νr′(rst^O_Y), such that
     s −τ→ s1 −τ→ . . . −τ→ sn and (si, r′) ∈ S, 1 ≤ i ≤ n.
  ii. r ∈ SC^Y_N′, r′ ∉ SC^Y_N′, νr′(rst^O_Y) = νr(rst^O_Y), where Y ∈ EC_N, then
     s −0→ s and (s, r′) ∈ S.
  iii. r |= ℓnst,RY, r′ |= ℓnst,RY, Y ∈ EC_N, then s −0→ s and (s, r′) ∈ S.
  iv. Otherwise there exists s′ such that s −λ→ s′ and (s′, r′) ∈ S.

Proof (sketch). Let N be a well-formed network and let N′ = K(N, EC_N). For
each Y ∈ EC_N use the following six conditions to obtain a weak bisimulation

[Fig. 3 (diagrams omitted). Panel titles: (i) Get unstable by SE-edge; (iv) Get unstable by CE-edge; (i) Reset some SE-edges; (ii) Reset all CE-edges.]

Fig. 3. Some involved weak bisimulation cases between the transition system (TS) of a
well-formed network N and the TS of the network N′ = K(N, EC_N). Dots with the legend
(s̄) s and (r̄) r represent (unstable) stable configurations of N and N′, respectively.
Arrows represent transitions between configurations of the same TS. Configurations s
and r are in transition simulation if they are linked by a dotted line.

relation S between configurations s ∈ Conf(N) and consistent configurations
r ∈ CONS_EC_N of N′, i.e., configurations whose valuations of variables rst^I_Y
match the number of reset locations wrt. Y assumed in r, and, if r is unstable,
whose variables rst^O_Y match the number of intermediate locations assumed in r,
and otherwise match |Y|. (1) Any pair (s, r) matches the valuations of variables and
non-quasi-equal clocks. (2) Automata from N and N′ which do not reset clocks
from Y must coincide on locations. (3) Relate stable configurations from N where
all quasi-equal clocks from Y have the valuation CY and the variable rst^I_Y = |Y|,
to either stable configurations in N′ where the clock rep(Y) = CY, or to unstable
configurations which reflect the synchronisation on the channel resetY. (4)
Take unstable configurations from N and N′ and relate them if they show the
effect of taking the same complex resetting edge. (5) Relate unstable configurations
wrt. Y in N to the unstable configuration in N′ whose variable rst^O_Y = 0
and whose location ℓ_{r,RY} = ℓnst,RY. (6) Relate stable configurations from N which
show the effect of resetting each clock from Y, to the unstable configuration of
N′ which shows the same effect, i.e., the valuation of the variable rst^O_Y = 0. This
last restriction allows N′ to make the return transition to stability.
During stability phases there is a strong bisimulation (one-to-one) between the
networks N and N′. Only during unstable phases is there a weak bisimulation
(one-to-many) in both directions. There are cases (reset of simple edges) where
N simulates one step of N′ with multiple steps, and cases (reset of complex
edges) where N′ simulates one step of N with multiple steps. Figure 3 shows
some involved simulation steps between unstable phases in N and N′. □

Theorem 1. Let N be a well-formed network where EC_N-resets are pre/post
delayed. Let CF be a configuration formula over N. Then
K(N, EC_N) |= ∃♦ Ω(CF) ⇐⇒ N |= ∃♦ CF.

Table 1. Column XX-N (K) gives the figures for case study XX with N sensors (and
K applied). ‘C’ gives the number of clocks in the model, ‘kStates’ the number of visited
states times 103 , ‘M’ memory usage in MB, and ‘t(s)’ verification time in seconds.
FraTTA transformed each of our benchmarks in at most 5 seconds.
(Env.: Intel i3, 2.3GHz, 3GB, Ubuntu 11.04, verifyta 4.1.3.4577 with default options.)

Network C kStates M t(s) Network C kStates M t(s)


EP-21 21 3,145.7 507.4 498.4 FS-8 14 5,217.7 181.4 758.7
EP-21K 1 5,242.9 624.9 167.5 FS-8K 5 1,081.9 41.6 32.8
EP-22 22 6,291.5 1,025.9 1,291.7 FS-10 16 17,951.3 568.2 4,271.2
EP-22K 1 10,485.8 1,269.6 358.3 FS-10K 5 1,215.0 44.1 39.0
EP-23 23 - - - FS-11 17 - - -
EP-23K 1 18,431.8 2,490.2 646.8 FS-126K 5 9,512.3 300.5 1,529.8
TT-6 7 2,986.0 114.9 38.1 CD-14 16 7,078.1 591.8 1,384.3
TT-6K 1 2,759.6 106.7 30.6 CD-14K 2 442 52.5 42.3
TT-7 8 16,839.9 611.5 276.4 CD-16 18 13,441.1 1,996.4 3,806.3
TT-7K 1 15,746.7 577.3 262.7 CD-16K 2 2,031.9 240.9 389.0
TT-8 9 - - - CD-17 19 - - -
TT-8K 1 66,265.9 2,367.7 1,227.6 CD-18K 2 9,1975.3 1,142.8 2,206.1
LS-7 18 553.3 74.5 22.7 CR-6 6 264.5 18.7 2.9
LS-7K 6 605.3 83.2 11.5 CR-6K 1 129.4 18.2 1.5
LS-9 22 8,897.6 1,283.5 686.6 CR-7 7 7,223.7 496.5 136.3
LS-9K 6 9,106.3 1,417.9 238.8 CR-7K 1 2,530.1 342.4 42.5
LS-11 26 - - - CR-8 8 - - -
LS-11K 6 7,694.0 2,188.2 460.6 CR-8K 1 5,057.6 785.0 109.4

Proof. Use Lemma 1 and induction over the length of paths to show that CF
holds in N if and only if Ω(CF) holds in K(N, EC_N). □

5 Experimental Results
We applied our approach to six industrial case studies using FraTTA [16], our
implementation of K. The three case studies FS [17], CR [18], CD [19] are from
the class of TDMA protocols and appear in [4]. The relaxed well-formedness cri-
teria (compared to [4]) allowed us to include the three new case studies EP [6],
TT [20], LS [21]. We verified non-local queries as proposed by the respective
authors of these case studies. None of these queries could be verified with the
approach presented in [4]. Our motivating case study is inspired by the network
from [6], which uses the Ethernet PowerLink protocol in Alstom Power Control
Systems. The network consists of N sensors and one master. The sensors ex-
change information with the master in two phases, the first is isochronous and the
second asynchronous. An error occurs if a sensor fails to update the configuration
data as sent by the master in the beginning of the isochronous phase. Specif-
ically, each sensor should update its internal data before the master has reset
its clock. The query configData := ∀□ A.configData = 1 ∧ A.x = 0 ∧ M.y > 0,
where A is a sensor and M is the master, x and y are quasi-equal clocks from
the same equivalence class, and configData is a Boolean variable set to true by
the edge that resets x when A has successfully updated its configuration data,
checks whether this network is free from errors as explained before. Note that
query configData is non-local and in addition refers to an unstable configuration.
We refer the reader to [17–21] for more information on the other case studies.
Table 1 gives figures for the verification of the non-local queries in instances of the
original and the transformed model. The rows without results indicate the smallest
instances for which we did not obtain results within 24 hours. For all examples ex-
cept for TT, we achieved significant reductions in verification time. The quasi-equal
clocks in the TT model are reset by a broadcast transition so there is no interleaving
of resets in the original model. Still, verification of the transformed TT instances
including transformation time is faster than verification of the original ones. Re-
garding memory consumption, note that verification of the K-models of EP and LS
takes slightly more memory than verification of the original counterparts. We argue
that this is due to all resetting edges being complex in these two networks. Thus,
our transformation preserves the full interleaving of clock resets and the whole set
of unstable locations, whose size is exponential in the number of participating
automata, and it adds the transitions to and from location ℓnst. The shown reduction
of the verification time is due to a smaller size of the DBMs that Uppaal uses to
represent zones [22] and whose size grows quadratically in the number of clocks. If
the resetting edges are simple (as in FS, CD, and CR), our transformation removes
all those unstable configurations.
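To make the "quadratic in the number of clocks" argument concrete, the following minimal sketch (ours, not taken from Uppaal's DBM library) represents a zone over n clocks as an (n+1)×(n+1) bound matrix, so reducing, e.g., 16 clocks to 2 shrinks each stored zone from a 17×17 to a 3×3 matrix:

```python
# Minimal DBM sketch: entry m[i][j] bounds x_i - x_j (index 0 is the constant 0).
# Strictness of bounds is omitted for brevity; a real DBM stores (value, <|<=) pairs.
INF = float("inf")

def empty_dbm(num_clocks):
    """Unconstrained DBM over num_clocks clocks: an (n+1) x (n+1) matrix."""
    n = num_clocks + 1
    m = [[INF] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 0          # x_i - x_i <= 0
        m[0][i] = 0          # 0 - x_i <= 0, i.e. clocks are non-negative
    return m

def size_in_entries(num_clocks):
    return (num_clocks + 1) ** 2

if __name__ == "__main__":
    for c in (2, 5, 16):
        print(c, "clocks ->", size_in_entries(c), "matrix entries")
```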

6 Conclusion
Our new technique reduces the verification time of networks of timed automata
with quasi-equal clocks. It represents all clocks from an equivalence class by one
representative, and it eliminates those configurations induced by automata that
reset quasi-equal clocks one by one. All interleaving transitions which are induced
by simple resetting edges are replaced by just two transitions in the transformed
networks. We use ℓnst-locations to summarise unstable configurations. This allows
us to also reduce the runtime of non-local properties or of properties explicitly
querying unstable phases. With the variables rst^I and rst^O we unfold information
summarised in ℓnst-locations, and together with a careful syntactical transformation
of properties, we reflect all properties of original networks in transformed ones.
Our new approach fixes the two severe drawbacks of [4], which only supports local
queries and whose strong well-formedness conditions rule out many industrial case studies.
Our experiments show the feasibility and potential of the new approach, even if
some interleavings are preserved and only the number of clocks is reduced.

References
1. Rappaport, T.S.: Wireless communications, vol. 2. Prentice Hall (2002)
2. Cena, G., Seno, L., et al.: Performance analysis of ethernet powerlink networks for
distributed control and automation systems. CSI 31(3), 566–572 (2009)
Quasi-Equal Clock Reduction: More Networks, More Queries 309

3. Alur, R., Dill, D.: A theory of timed automata. TCS 126(2), 183–235 (1994)
4. Herrera, C., Westphal, B., Feo-Arenis, S., Muñiz, M., Podelski, A.: Reducing quasi-
equal clocks in networks of timed automata. In: Jurdziński, M., Ničković, D. (eds.)
FORMATS 2012. LNCS, vol. 7595, pp. 155–170. Springer, Heidelberg (2012)
5. Behrmann, G., David, A., Larsen, K.G.: A tutorial on uppaal. In: Bernardo, M.,
Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Hei-
delberg (2004)
6. Limal, S., Potier, S., Denis, B., Lesage, J.: Formal verification of redundant media
extension of ethernet powerlink. In: ETFA, pp. 1045–1052. IEEE (2007)
7. Daws, C., Yovine, S.: Reducing the number of clock variables of timed automata.
In: RTSS, pp. 73–81. IEEE (1996)
8. Daws, C., Tripakis, S.: Model checking of real-time reachability properties using
abstractions. In: Steffen, B. (ed.) TACAS 1998. LNCS, vol. 1384, pp. 313–329.
Springer, Heidelberg (1998)
9. André, É.: Dynamic clock elimination in parametric timed automata. In: FSFMA,
OASICS, pp. 18–31, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2013)
10. Braberman, V., Garbervestky, D., Kicillof, N., Monteverde, D., Olivero, A.: Speed-
ing up model checking of timed-models by combining scenario specialization and
live component analysis. In: Ouaknine, J., Vaandrager, F.W. (eds.) FORMATS
2009. LNCS, vol. 5813, pp. 58–72. Springer, Heidelberg (2009)
11. Braberman, V.A., Garbervetsky, D., Olivero, A.: Improving the verification of
timed systems using influence information. In: Katoen, J.-P., Stevens, P. (eds.)
TACAS 2002. LNCS, vol. 2280, pp. 21–36. Springer, Heidelberg (2002)
12. Muñiz, M., Westphal, B., Podelski, A.: Timed automata with disjoint activity. In:
Jurdziński, M., Ničković, D. (eds.) FORMATS 2012. LNCS, vol. 7595, pp. 188–203.
Springer, Heidelberg (2012)
13. Balaguer, S., Chatain, T.: Avoiding shared clocks in networks of timed automata.
In: Koutny, M., Ulidowski, I. (eds.) CONCUR 2012. LNCS, vol. 7454, pp. 100–114.
Springer, Heidelberg (2012)
14. Muñiz, M., Westphal, B., Podelski, A.: Detecting quasi-equal clocks in timed au-
tomata. In: Braberman, V., Fribourg, L. (eds.) FORMATS 2013. LNCS, vol. 8053,
pp. 198–212. Springer, Heidelberg (2013)
15. Olderog, E.-R., Dierks, H.: Real-time systems - formal specification and automatic
verification. Cambridge University Press (2008)
16. Fitriani, K.: FraTTA: Framework for transformation of timed automata, Master
Team Project, Albert-Ludwigs-Universität Freiburg (2013)
17. Dietsch, D., Feo-Arenis, S., et al.: Disambiguation of industrial standards through
formalization and graphical languages. In: RE, pp. 265–270. IEEE (2011)
18. Gobriel, S., Khattab, S., Mossé, D., et al.: RideSharing: Fault tolerant aggregation
in sensor networks using corrective actions. In: SECON, pp. 595–604. IEEE (2006)
19. Jensen, H., Larsen, K., Skou, A.: Modelling and analysis of a collision avoidance
protocol using SPIN and Uppaal. In: 2nd SPIN Workshop (1996)
20. Steiner, W., Elmenreich, W.: Automatic recovery of the TTP/A sensor/actuator
network. In: WISES, pp. 25–37, Vienna University of Technology (2003)
21. Kordy, P., Langerak, R., et al.: Re-verification of a lip synchronization protocol
using robust reachability. In: FMA. EPTCS, vol. 20, pp. 49–62 (2009)
22. Bengtsson, J., Yi, W.: Timed automata: Semantics, algorithms and tools. In: Desel,
J., Reisig, W., Rozenberg, G. (eds.) ACPN 2003. LNCS, vol. 3098, pp. 87–124.
Springer, Heidelberg (2004)
Are Timed Automata Bad for a Specification Language?
Language Inclusion Checking for Timed Automata

Ting Wang¹, Jun Sun², Yang Liu³, Xinyu Wang¹, and Shanping Li¹
¹ College of Computer Science and Technology, Zhejiang University, China
² ISTD, Singapore University of Technology and Design, Singapore
³ School of Computer Engineering, Nanyang Technological University, Singapore

Abstract. Given a timed automaton P modeling an implementation and a timed
automaton S as a specification, language inclusion checking is to decide whether
the language of P is a subset of that of S. It is known that this problem is un-
decidable and “this result is an obstacle in using timed automata as a specifica-
tion language” [2]. This undecidability result, however, does not imply that all
timed automata are bad for specification. In this work, we propose a zone-based
semi-algorithm for language inclusion checking, which implements simulation
reduction based on Anti-Chain and LU-simulation. Though it is not guaranteed
to terminate, we show that it does in many cases through both theoretical and em-
pirical analysis. The semi-algorithm has been incorporated into the PAT model
checker, and applied to multiple systems to show its usefulness and scalability.

1 Introduction
Timed automata, introduced by Alur and Dill in [2], have emerged as one of the most
popular models to specify and analyze real-time systems. It has been shown that the
reachability problem for timed automata is decidable using the construction of region
graphs [2]. Efficient zone-based methods for checking both safety and liveness proper-
ties have later been developed [14,21]. In [2], it has also been shown that timed automata
in general cannot be determinized, and the language inclusion problem is undecidable,
which “is an obstacle in using timed automata as a specification language”.
In order to avoid undecidability, a number of subclasses of timed automata which are
determinizable (and perhaps serve as a good specification language) have been identi-
fied, e.g., event-clock timed automata [3,17], timed automata restricted to at most one
clock [16] and integer resets timed automata [18]. Recently, Baier et al. [4] described a
method for determinizing an arbitrary timed automaton, which under a boundedness con-
dition, yields an equivalent deterministic timed automaton in finite time. Furthermore,
they show that the boundedness condition is satisfied by several subclasses of timed au-
tomata which are known to be determinizable. However, the method is based on region
graphs and it is well-known that region graphs are inefficient and lead to state space
explosion. Compared to region graphs, zone graphs are often used in existing tools
for real-time system verification, such as U PPAAL [14] and K RONOS [23]. Zone-based
approaches have also been used to solve problems which are related to the language

This research is sponsored in part by NSFC Program (No.61103032) of China.


inclusion problem, like the universality problem (which asks whether a timed automa-
ton accepts all timed words) for timed automata with one-clock only [1]. However, to
the best of our knowledge, there has not been any zone-based method proposed for
language inclusion checking for arbitrary timed automata.
In this work we develop a zone-based method to solve the language inclusion prob-
lem. Formally, given an implementation timed automaton P and a specification timed
automaton S, the language inclusion problem is to decide whether the language of P
is a subset of that of S. It is known that the problem can be converted to a reachabil-
ity problem on the synchronous product of P and determinization of S [16]. Inspired
by [1,4], the main contribution of this work is that we present a semi-algorithm with a
transformation that determinizes S and constructs the product on-the-fly, where zones
are used as a symbolic representation. Furthermore, simulation relations between the
product states are used, which can be obtained through LU-simulation [5] and Anti-
Chain [22]. With the simulation relations, many product states may be skipped, which
often contributes to the termination of our semi-algorithm.
Our semi-algorithm can be applied to arbitrary timed automata, though it may not ter-
minate sometimes. To argue that timed automata can nonetheless serve as a specification
language, we investigate when our approach is terminating, both theoretically and em-
pirically. Firstly, we prove that, with the clock boundedness condition [4], we are able to
construct a suitable well-quasi-order on the product state space to ensure termination.
It thus implies that our semi-algorithm is always terminating for subclasses of timed
automata which are known to be determinizable. Furthermore, we prove that for some
classes of timed automata which may violate the boundedness condition, our semi-
algorithm is always terminating as long as there is a well-quasi-order on the abstract
state space explored. Secondly, using randomly generated timed automata, we show
that our approach terminates for many timed automata which are not determinizable
(and violating the boundedness condition) because of the simulation reduction. Thirdly,
we collect a set of commonly used patterns for specifying timed properties [8,12] and
show that our approach is always terminating for all of those properties. Lastly, our
semi-algorithm has been implemented in the PAT [19] framework, and applied to a
number of benchmark systems to demonstrate its effectiveness and scalability.
The remainder of the paper is organized as follows. Section 2 reviews the notions of
timed automata. Section 3 shows how to reduce language inclusion checking to a reach-
ability problem, which is then solved using a zone-based approach. Section 4 reports
the experimental results. Section 5 reviews related work. Section 6 concludes.

2 Background

In this section, we review the relevant background and define the language inclusion
problem. We start with defining labeled transition systems (LTS). An LTS is a tuple
L = (S, Init, Σ, T ), where S is a set of states; Init ⊆ S is a set of initial states;
Σ is an alphabet; and T ⊆ S × Σ × S is a labeled transition relation. A run of L is
a finite sequence of alternating states and events ⟨s0, e1, s1, e2, · · · , en, sn⟩ such that
(si, ei, si+1) ∈ T for all 0 ≤ i ≤ n − 1. We say the run starts with s0 and ends with sn.
A state s′ is reachable from s iff there is a run starting with s and ending with s′. A state

is always reachable from itself. A run is rooted if it starts with a state in Init. A state
is reachable if there is a rooted run which ends at the state. Given a state s ∈ S and an
event e ∈ Σ, we write post(s, e, L) to denote {s′ | (s, e, s′) ∈ T}. We write post(s, L)
to denote {s′ | ∃e ∈ Σ · (s, e, s′) ∈ T}, i.e., the set of successors of s.
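As an illustration only (class and method names are ours), an LTS and its post operators can be coded directly from this definition:

```python
from collections import defaultdict

class LTS:
    """Labeled transition system (S, Init, Sigma, T) with explicit transitions."""
    def __init__(self, states, init, alphabet, transitions):
        self.states = set(states)
        self.init = set(init)
        self.alphabet = set(alphabet)
        # index transitions by (source, event) for fast successor lookup
        self._succ = defaultdict(set)
        for (s, e, t) in transitions:
            self._succ[(s, e)].add(t)

    def post_event(self, s, e):
        """post(s, e, L): successors of s under event e."""
        return set(self._succ[(s, e)])

    def post(self, s):
        """post(s, L): successors of s under any event."""
        return {t for e in self.alphabet for t in self._succ[(s, e)]}

# Example: two states, one 'a'-transition.
lts = LTS({"s0", "s1"}, {"s0"}, {"a"}, [("s0", "a", "s1")])
assert lts.post("s0") == {"s1"}
```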
Let F ⊆ S be a set of target states. Given two states s0 and s1 in S, we say that s0
is simulated by s1 with respect to F if s0 ∈ F implies that s1 ∈ F; and for any e ∈ Σ,
(s0, e, s0′) ∈ T implies there exists (s1, e, s1′) ∈ T such that s0′ is simulated by s1′. In
order to check whether a state in F is reachable, if we know that s is simulated by s′,
then s can be skipped during system exploration if s′ has been explored already. This is
known as simulation reduction [7].
The original definition of timed automata is finite-state timed Büchi automata [2]
equipped with real-valued clock variables and Büchi accepting condition (to enforce
progress). Later, timed safety automata were introduced in [11] which adopt an intu-
itive notion of progress. That is, instead of having accepting states, each state in timed
safety automata is associated with a local timing constraint called a state invariant. An
automaton can stay at a state as long as the valuation of the clocks satisfies the state
invariant. The reader can refer to [9] for the expressiveness of timed safety automata.
In the following, we focus on timed safety automata as they are supported in the
state-of-the-art model checker UPPAAL [14] and are often used in practice. Hereafter, they are
simply referred to as timed automata.
Let R+ be the set of non-negative real numbers. Given a set of clocks C, we define
Φ(C) as the set of clock constraints. Each clock constraint is inductively defined by:
δ := true|x ∼ n|δ1 ∧δ2 |¬δ1 where ∼∈ {=, ≤, ≥, <, >}; x is a clock in C and n ∈ R+
is a constant. Without loss of generality, we assume that n is an integer constant. The
set of downward constraints obtained with ∼∈ {≤, <} is denoted as Φ≤,< (C). A clock
valuation v for a set of clocks C is a function which assigns a real value to each clock. A
clock constraint can be viewed as the set of clock valuations which satisfy the constraint.
A clock valuation v satisfies a clock constraint δ, written as v ∈ δ, iff δ evaluates to be
true using the clock values given by v. For d ∈ R+, let v + d denote the clock valuation
v′ s.t. v′(c) = v(c) + d for all c ∈ C. For X ⊆ C, let the clock resetting notation [X ↦ 0]v
denote the valuation v′ such that v′(c) = v(c) for all c ∈ C with c ∉ X, and v′(x) = 0 for
all x ∈ X. We write C = 0 to denote the clock valuation where each clock c ∈ C reads 0.
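A direct, dictionary-based rendering of these operations (the helper names are ours) looks as follows:

```python
def delay(v, d):
    """v + d: advance every clock by d time units."""
    return {c: val + d for c, val in v.items()}

def reset(v, X):
    """[X |-> 0]v: set the clocks in X to zero, keep the others."""
    return {c: (0.0 if c in X else val) for c, val in v.items()}

def zero_valuation(clocks):
    """C = 0: every clock reads 0."""
    return {c: 0.0 for c in clocks}

def satisfies_atomic(v, clock, op, n):
    """Check an atomic constraint clock ~ n against valuation v."""
    val = v[clock]
    return {"<": val < n, "<=": val <= n, "=": val == n,
            ">=": val >= n, ">": val > n}[op]

v = zero_valuation({"x", "y"})
assert delay(v, 2.5)["x"] == 2.5
assert reset(delay(v, 2.5), {"x"})["x"] == 0.0
```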
Formally, a timed automaton is a tuple A = (S, Init, Σ, C, L, T ) where S is a finite
set of states; Init ⊆ S is a set of initial states; Σ is an alphabet; C is a finite set of
clocks; L : S → Φ≤,<(C) labels each state with an invariant; T ⊆ S × Σ × Φ(C) ×
2^C × S is a labeled transition relation. Intuitively, a transition (s, e, δ, X, s′) ∈ T can
be fired if δ is satisfied. After event e occurs, clocks in X are set to zero. The (concrete)
semantics of A is an infinite-state LTS, denoted as C(A) = (Sc, Initc, R+ × Σ, Tc),
such that Sc is a set of configurations of A, each of which is a pair (s, v) where s ∈ S
is a state and v is a clock valuation; Initc = {(s, C = 0) | s ∈ Init} is a set of initial
configurations; and Tc is a set of concrete transitions of the form ((s, v), (d, e), (s′, v′))
such that there exists a transition (s, e, δ, X, s′) ∈ T; v + d ∈ δ; v + d ∈ L(s);
[X ↦ 0](v + d) = v′; and v′ ∈ L(s′). Intuitively, the system idles for d time units at
state s and then takes the transition (generating event e) to reach state s′. An example
timed automaton is shown in Fig. 1(a). The initial state is p0. The automaton has a state

[Fig. 1: three timed automata (a), (b), (c); diagrams omitted.]

Fig. 1. Timed automata examples

invariant x ≤ 3 on state p3, which implies that if the control is at p3, it must transit to
the next state before the value of clock x is larger than 3.
A timed automaton A is deterministic iff Init contains only one state and for any
two transitions (s0, e0, δ0, X0, s0′) ∈ T and (s1, e1, δ1, X1, s1′) ∈ T, if s0 = s1 and
e0 = e1, then δ0 and δ1 are mutually exclusive. Otherwise, A is non-deterministic. For
instance, the timed automaton in Fig. 1(c) is non-deterministic as the two transitions
from state s2 are both labeled with a and the guards are not mutually exclusive.
Given ⟨(s0, v0), (d1, e1), (s1, v1), (d2, e2), · · · , (sn, vn)⟩ as a run of C(A), we can
obtain a timed word ⟨(D1, e1), (D2, e2), · · · , (Dn, en)⟩ such that Di = Σ_{j=1}^{i} dj for all
1 ≤ i ≤ n. We define L(A, (s, v)) to be the set of timed words obtained from the
set of all runs starting with (s, v). The language of A, written as L(A), is defined as
the language obtained from any rooted run of A. Two timed automata are equivalent if
they define the same language. In practice, a system model is often composed of several
automata executing in parallel. We skip the details on parallel composition of timed
automata and remark that our approach in this work applies to networks of timed automata.
The language inclusion checking problem is then defined as follows. Given a timed
automaton P and a timed automaton S, how do we check whether L(P) ⊆ L(S)?
In order to simplify the presentation in later sections, we first transform a given
timed automaton to an equivalent one without state invariants, which will not affect our
approach. The idea is to move the state invariants to transition guards. Given a timed
automaton A and a state s with state invariant L(s), we construct a timed automaton A′
with the following two steps. Firstly, if (s, e, δ, X, s′) is a transition from s, change it
to (s, e, δ ∧ L(s), X, s′). Secondly, if (s′, e, δ, X, s) is a transition leading to s, for any
clock constraint of the form x ∼ n where ∼ ∈ {≤, <} in L(s), if x ∉ X, conjunct δ
with x ∼ n. For instance, given the timed automaton in Fig. 1(a), we construct the timed
automaton in Fig. 1(b). The state invariant x ≤ 3 of state p3 is added to the transition
from p2 to p3 and the transition from p3 to p0. By a simple argument, it can be shown
that L(A) = L(A′). Notice that this transformation is not sound if the language of a
timed automaton is defined differently, e.g., with a non-Zenoness assumption. In the
following, we assume that all timed automata are without state invariants.
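The two-step transformation can be sketched as follows. This is our own simplified rendering under an assumed data representation: an invariant is a list of downward atoms (clock, op, n) and a guard is a conjunction represented as a list of atoms.

```python
def remove_invariants(transitions, invariant):
    """
    transitions: list of (s, e, guard, resets, s2) with guard a list of atoms.
    invariant:   dict mapping a state to its list of (clock, op, n) atoms,
                 assumed to be a conjunction of x<n / x<=n constraints.
    Returns the transitions of an equivalent automaton without state invariants.
    """
    new_transitions = []
    for (s, e, guard, resets, s2) in transitions:
        new_guard = list(guard)
        # Step 1: a transition leaving s must also satisfy the invariant of s.
        new_guard += invariant.get(s, [])
        # Step 2: a transition entering s2 must establish the invariant of s2
        # for every clock that is not reset on this transition.
        for (clock, op, n) in invariant.get(s2, []):
            if clock not in resets:
                new_guard.append((clock, op, n))
        new_transitions.append((s, e, new_guard, resets, s2))
    return new_transitions
```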

3 Language Inclusion Checking


In this section, we present our method for solving the language inclusion checking
problem. We fix P = (Sp, Initp, Σp, Cp, Lp, Tp) and S = (Ss, Inits, Σs, Cs, Ls, Ts) to be
the two timed automata such that Sp and Ss are disjoint, as well as Cp and Cs.¹
¹ The proofs in this section can be found at http://www.comp.nus.edu.sg/˜pat/refine ta/paper.pdf

[Fig. 2: levels 0–5 of the unfolding; diagram omitted.]

Fig. 2. Unfolding a timed automaton into an infinite timed tree

3.1 Unfolding Specification

In our method, we construct on-the-fly an unfolding of S in the form of an infinite timed
tree which is equivalent to S. The idea is adopted from the approach in [4]. Before we
present the formal definition, we illustrate the unfolding using an example. Fig. 2 shows
present the formal definition, we illustrate the unfolding using an example. Fig. 2 shows
the infinite timed tree after unfolding the automaton shown in Fig. 1(c). The idea is to
introduce a fresh clock at every level and use the newly introduced clocks to replace
ordinary clocks, i.e., x and y in this example. The benefit of doing this becomes clear
later. At level 0, we are at state s0 and introduce a clock z0 . Now since clock x and
clock y are started at the same time as z0 and the clocks will not be reset before the
transition from s0 takes place, we can use z0 to replace x and y in the transition guard
from s0 at level 0 to s1 at level 1. Because at level 0, the reading of clock z0 is relevant
to the future system behavior, we say that z0 is active. In the tree, we label every node
with a pair (s, A) where s is a state and A is a set of active clocks. Notice that not all
clocks are active. For instance, clock z0 and clock z1 are no longer active at level 2.
One transition from the level 0 node leads to the node of level 1, corresponding to the
transition from state s0 to state s1 in Fig. 1(c). The clock constraint x > 3 is rewritten
to z0 > 3 using only active clocks from the source node. A fresh clock z1 is introduced
along the transition. Notice that the node at level 1 is labelled with a set of two active
clocks. z1 is active at state s1 at level 1 since it can be used to replace clock x which
is reset along the transition, whereas z0 is active because it is used to replace clock y
which is not reset along the transition. The set of active clocks of the node at level 2 is
a singleton z2 since both of the clocks x and y are reset along the transition. z0 and z1
are not active as their reading is irrelevant to future transitions from s2 . Following the
same construction, we build the tree level by level.
In the following, we define the unfolding of S. Let Z = ⟨z0, z1, z2, · · ·⟩ be an infinite
sequence of clocks. The unfolding of S is an infinite timed tree, which can be viewed
as a timed automaton S∞ = (St∞, Init∞, Σ∞, Z, T∞) with infinitely many states.
Furthermore, we assume that S∞ is associated with a function level such that level(n)
is the level of node n in the tree for all n ∈ St∞. A state n in St∞ is of the form
(s, A) where s ∈ Ss and A is a set of clocks. Given any state n, we define a function
fn : Cs → Z which maps ordinary clocks in Cs to active clocks in Z. With an abuse of
notation, given a clock constraint δ on Cs, we write fn(δ) to denote the clock constraint
obtained by replacing clocks in Cs with those in Z according to fn. Given any state
n = (s, A), we define A to be {fn(c) | c ∈ Cs}. The initial states Init∞ and transition
relation T∞ are unfolded as follows.
– For any s ∈ Inits, there is a level-0 node n = (s, {z0}) in St∞ with level(n) = 0
and fn(c) = z0 for all c ∈ Cs.
– For each node n = (s, A) at level i and for each transition (s, e, δ, X, s′) ∈ Ts,
we add a node n′ = (s′, A′) at level i + 1 such that fn′(c) = fn(c) if c ∈ Cs \ X,
fn′(c) = zi+1 if c ∈ X, and level(n′) = i + 1. We add a transition
(n, e, fn(δ), {zi+1}, n′) to T∞.
Note that transitions at the same level have the same set of resetting clocks, which
contains one clock. Given a node n = (s, A) in the tree, observe that not every clock x
in A is active as the clock may never be used to guard any transition from s. Hereafter,
we assume that inactive clocks are always removed.
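The per-level construction can be phrased as a small successor function; the sketch below (our own encoding, in which a node carries the state, its level and the renaming fn) introduces the fresh clock z_{i+1} exactly as described above:

```python
def unfold_successor(node, transition):
    """
    node: (s, level, f) where f maps ordinary clocks of S to active clocks z_i.
    transition: (s, e, guard, resets, s2) of S, with guard a list of atoms
    (clock, op, n) over the ordinary clocks.
    Returns (new_node, unfolded_transition) one level further down the tree.
    """
    s, level, f = node
    (src, e, guard, resets, s2) = transition
    assert src == s
    fresh = f"z{level + 1}"                       # the fresh clock of level i+1
    f2 = {c: (fresh if c in resets else f[c]) for c in f}
    renamed_guard = [(f[c], op, n) for (c, op, n) in guard]
    new_node = (s2, level + 1, f2)
    return new_node, (node, e, renamed_guard, {fresh}, new_node)

# Level-0 node of Fig. 1(c): state s0, both x and y mapped to z0.
root = ("s0", 0, {"x": "z0", "y": "z0"})
succ, edge = unfold_successor(root, ("s0", "a", [("x", ">", 3)], {"x"}, "s1"))
assert succ[2] == {"x": "z1", "y": "z0"}        # x renamed to z1, y still z0
```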

3.2 Zone Abstraction for Language Inclusion Checking


It can be shown that S and S∞ are equivalent [4]. Intuitively, S and S∞ have the
same language, thus the language inclusion problem can be converted to the language
infinitely many clock valuations for each clock. In the following, we tackle the latter
with zone abstraction [14].
In this work, we define a zone (which may or may not be convex) as the maximum
set of clock valuations satisfying a clock constraint. Given a clock constraint δ, let δ ↑
denote the zone reached by delaying an arbitrary amount of time. For X ⊆ C, let
[X ↦ 0]δ denote the zone obtained by setting clocks in X to 0; and let δ[X] denote the
projection of δ on X.
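For intuition, these three zone operations have well-known counterparts on difference bound matrices. The sketch below is ours and deliberately ignores strict vs. non-strict bounds; it assumes a canonical (tightened) matrix indexed by clocks 1..n with index 0 for the constant zero.

```python
INF = float("inf")

def canonical(m):
    """Tighten all bounds (Floyd-Warshall); m[i][j] bounds x_i - x_j."""
    n = len(m)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m

def up(m):
    """delta-up: let time elapse, i.e. drop the upper bounds x_i - 0 <= c."""
    for i in range(1, len(m)):
        m[i][0] = INF
    return m

def reset(m, X):
    """[X |-> 0]delta on a canonical DBM: reset the clocks with indices in X."""
    n = len(m)
    for x in X:
        for j in range(n):
            m[x][j] = m[0][j]
            m[j][x] = m[j][0]
    return m

def project(m, keep):
    """delta[X] on a canonical DBM: keep only rows/columns in `keep` (plus 0)."""
    idx = [0] + sorted(k for k in keep if k != 0)
    return [[m[i][j] for j in idx] for i in idx]
```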
We define an LTS Z∞ = (S, Init, Σ, T), which is a zone graph generated from the
synchronous product of P and the determinization of S∞. A state in S is an abstract
configuration of the form (sp, Xs, δ) such that sp ∈ Sp; Xs is a set of nodes in S∞ as
defined in Section 3.1; and δ is a clock constraint. Recall that a state of S∞ is of the form
(ss, A) where ss ∈ Ss and A is a set of active clocks. Given a set of states Xs of S∞,
we write Act(Xs) to denote the set of all active clocks, i.e., {c | ∃(ss, A) ∈ Xs · c ∈ A}.
δ constrains all clocks in Act(Xs).
The set Init of the zone graph is defined as {(sp, Init∞, (Act(Init∞) = 0)↑) | sp ∈
Initp}. Σ equals Σp. Next, we define T by showing how to generate successors of a
given abstract configuration (sp, Xs, δ). For every state (ss, A) ∈ Xs, let T∞(e, Xs) be
the set of transitions in T∞ which start with a state in Xs and are labeled with event e.
Notice that the guard conditions of transitions in T∞(e, Xs) may not be mutually exclusive.
We define a set of constraints Cons(e, Xs) such that each element in Cons(e, Xs)
is a constraint which conjuncts, for each transition in T∞(e, Xs), either the transition
guard or its negation. Notice that elements in Cons(e, Xs) are by definition mutually
exclusive. Given (sp, Xs, δ) and an outgoing transition (sp, e, gp, Xp, sp′) from sp in P,
for each g ∈ Cons(e, Xs) we generate a successor (sp′, Xs′, δ′) as follows.
– For any state (ss, A) ∈ Xs and any transition ((ss, A), e, gs, Y, (ss′, A′)) ∈ T∞, if
δ ∧ gp ∧ g ∧ gs is not false, then (ss′, A′) ∈ Xs′.

[Fig. 3: two zone graphs, levels L0–L5; diagrams omitted.]

Fig. 3. Zone graphs: (a) Z∞ and (b) Zr^LU

– All states in Xs are at the same level and thus all transitions in T∞(e, Xs) have the
same resetting clock. Let Y be that clock and δ′ = ([Y ∪ Xp ↦ 0](δ ∧ g ∧ gp))↑.
– The transition from (sp, Xs, δ) to (sp′, Xs′, δ′) is labeled with the tuple (e, gp ∧
g, Xp ∪ Y).
We illustrate the above using the example in Fig. 3(a). Consider the abstract
configuration at level 4, which is (p2, {(s2, {z4}), (s0, {z4, z2})}, 0 ≤ x = z4 < z2 ∧
z2 − x ≤ 3 ∧ z2 − z4 ≤ 3). As shown in Fig. 2, there are two transitions from
state (s2, {z4}) which are labeled with event a and one from state (s0, {z4, z2}), which
make up T∞(a, {(s2, {z4}), (s0, {z4, z2})}). The two transitions from (s2, {z4}) have
the same guard z4 > 0 and the one from (s0, {z4, z2}) has the guard z4 > 3. The set
Cons(e, Xs) contains the following constraints: z4 > 0 ∧ z4 > 3, z4 ≤ 0 ∧ z4 > 3,
z4 > 0 ∧ z4 ≤ 3, and z4 ≤ 0 ∧ z4 ≤ 3. Taking the transition from p2 to p1 as an
example, we generate four potential successors, one for each of the constraints in Cons(e, Xs),
as shown above. Two of them are infeasible as the resultant constraints are false. The
remaining two are shown in Fig. 3(a) (the first two from left at level 5). Since z2 is no longer
active for the second successor, the clock constraint of the second successor is modified
to 0 ≤ x = z5 < z4 ∧ z4 − x ≤ 3 ∧ z4 − z5 ≤ 3 so as to remove constraints on z2.
Similarly, we can generate other configurations in Fig. 3(a).
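The case split over Cons(e, Xs) is a plain enumeration of guard/negated-guard combinations. In the sketch below (ours; guards are kept symbolically and only their polarity is chosen, with the actual zone intersection and emptiness check left abstract):

```python
from itertools import product

def cons(guards):
    """
    guards: the distinct guards of the transitions in T_inf(e, Xs).
    Yields one element of Cons(e, Xs) per sign assignment: a list of
    (guard, polarity) pairs, polarity True for the guard, False for its negation.
    The elements are mutually exclusive by construction.
    """
    for signs in product([True, False], repeat=len(guards)):
        yield list(zip(guards, signs))

# The example from Fig. 3(a): guards z4 > 0 and z4 > 3 give four combinations,
# matching z4>0 & z4>3, z4<=0 & z4>3, z4>0 & z4<=3, z4<=0 & z4<=3.
combos = list(cons(["z4 > 0", "z4 > 3"]))
assert len(combos) == 4
```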
In the following, we reduce the language inclusion checking problem to a reachability
problem in Z∞. Notice that one of the constraints in Cons(e, Xs) conjuncts the
negations of all guards of transitions in T∞(e, Xs). Let us denote this constraint as neg.
For instance, given the same abstract state in the middle of level 4 in Fig. 3(a), the
constraint neg in Cons(e, Xs) is z4 ≤ 0 ∧ z4 ≤ 3, which is equivalent to z4 ≤ 0.
Conjoined with the guard condition x > 0 and the initial constraint 0 ≤ x = z4 <
z2 ∧ z2 − x ≤ 3 ∧ z2 − z4 ≤ 3, it becomes false and hence no successor is generated
for neg. Given neg, assume the corresponding successor is (sp′, Xs′, δ′). It is easy to see

that Xs′ is empty. If δ′ is not false, intuitively there exists a time point such that P can
perform e whereas S cannot, which implies that language inclusion does not hold. Thus, we
have the following theorem.

Theorem 1. L(P) ⊆ L(S) iff there is no reachable state (sp , ∅, δ) in Z∞ . 

Theorem 1 therefore reduces our language inclusion problem to a reachability problem
on Z∞. If a state in the form of (sp, ∅, δ) is reachable, then we can conclude that
the language inclusion is false. The remaining problem is that there may be infinitely
many clocks. In the following, we show how to reduce the number of clocks, which is inspired
language inclusion is false. The remaining problem is that there may be infinitely many
clocks. In the following, we show how to reduce the number of clocks, which is inspired
by [4]. Intuitively, given any abstract state (sp , Xs , δ) in the zone graph Z∞ , instead of
always using a new clock in Z, we can reuse a clock which is not currently active, or
equivalently not in Act(Xs ). For instance, given the state on level 1 in Fig. 3(a), there
are two active clocks z0 and z1 . For the successor of this state on level 2, instead of
using z2 , we can reuse z0 and systematically rename z2 to z0 afterwards. The result of
renaming is shown partially in Fig. 3(b) (notice that some zones in Fig. 3(b) are different
from the ones in Fig. 3(a) and some states have been removed, because of simulation
reduction shown next). We denote the zone graph after renaming as Zr . We also denote
the successors of an abstract state ps in Zr as post(ps, Zr ). By a simple argument, it
can be shown that there is a reachable state (sp, ∅, δ) in Z∞ iff there is a reachable state
(sp, ∅, δ′) in Zr.

3.3 Simulation Reduction

We have so far reduced the language inclusion checking problem to a reachability prob-
lem in the potentially infinite-state LTS Zr . Next, we reduce the size of Zr by exploring
simulation relations between states in Zr. We first extend the lower-upper bounds (here-
after LU-bounds) simulation relation defined in [5] to language inclusion checking.
We define two functions L and U . Given a state s in Zr and a clock x ∈ Cp ∪
Z, we perform a depth-first-search to collect all transitions reachable from s without
going through a transition which resets x. Next, we set L(s, x) (resp. U (s, x)) to be the
maximal constant k such that there exists a constraint x > k or x ≥ k (resp. x < k or
x ≤ k) in a guard of those transitions. If such a constant does not exist, we set L(s, x)
(resp. U (s, x)) to −∞. We remark that L(s, x) is always the same as U (s, x) for a clock
in Z because both guard conditions and their negations are used in constructing Zr . For
instance, if we denote the state at level 0 in Fig. 3(b) as s0 , which can be seen as the
initial state in Zr , the function L is then defined such that L(s0 , x) = 3, L(s0 , z0 ) = 3.
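Computing L(s, x) and U(s, x) is a search over the transitions reachable without resetting x. A sketch of this computation (our own graph encoding: each transition is (src, guard_atoms, resets, dst) with guard atoms (clock, op, k)) is:

```python
def lu_bounds(transitions, start, clock):
    """Return (L, U) for `clock` at state `start`: the maximal k appearing in
    guards x>k / x>=k (resp. x<k / x<=k) on transitions reachable from `start`
    without going through a transition that resets `clock`."""
    lower, upper = float("-inf"), float("-inf")
    seen, stack = set(), [start]
    while stack:
        s = stack.pop()
        if s in seen:
            continue
        seen.add(s)
        for (src, guard, resets, dst) in transitions:
            if src != s:
                continue
            for (c, op, k) in guard:
                if c == clock and op in (">", ">="):
                    lower = max(lower, k)
                if c == clock and op in ("<", "<="):
                    upper = max(upper, k)
            # only continue past transitions that do not reset the clock
            if clock not in resets:
                stack.append(dst)
    return lower, upper
```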
Next, we define a relation between two zones using the LU-bounds and show that
the relation constitutes a simulation relation. Given two clock valuations v and v′ at a
state s and the two functions L and U, we write v ⊑LU v′ if for each clock c, either
v′(c) = v(c) or L(s, c) < v′(c) < v(c) or U(s, c) < v(c) < v′(c). Next, given two
zones δ1 and δ2, we write δ1 ⊑LU δ2 to denote that for all v1 ∈ δ1, there is a v2 ∈ δ2
such that v1 ⊑LU v2. The following shows that ⊑LU constitutes a simulation relation.

Lemma 1. Let (s, X, δi) where i ∈ {0, 1} be two states of Zr and F be the set of states
{(s′, ∅, δ′)} in Zr. (s, X, δ1) simulates (s, X, δ0) w.r.t. F if δ0 ⊑LU δ1.

With the above lemma, given an abstract state (s, X, δ) of Zr, we can enlarge the time
constraint δ so as to include all clock valuations which are simulated by some valuations
in δ without changing the result of reachability analysis. In the following, we write
LU(δ) to denote the set {v | ∃v′ ∈ δ · v ⊑LU v′}.² We construct an LTS, denoted
as Zr^LU, which replaces each state (s, X, δ) in Zr with (s, X, LU(δ)). We denote the
successors of a state ps in Zr^LU as post(ps, Zr^LU). By a simple argument, we can show
that there is a reachable state (s, ∅, δ) in Zr iff there is a reachable state (s′, ∅, δ′) in
Zr^LU. For instance, given the Zr obtained after renaming Z∞ shown in Fig. 3(a), Fig. 3(b) shows
the corresponding Zr^LU.
Next, we incorporate another simulation relation in our work, which is inspired by
the Anti-Chain algorithm [22]. The idea is that given two abstract states (s, X, δ) and
(s′, X′, δ′) of Zr^LU, we can infer a simulation relation by comparing X and X′. One
problem is that states in X and X′ may have different sets of active clocks. The exact
names of the clocks, however, do not matter semantically. In order to compare X and
X′ (and compare δ and δ′), we define clock mappings. A mapping from Act(X′) to
Act(X) is an injective function f : Act(X′) → Act(X) which maps every clock in
Act(X′) to one in Act(X). We write X′ ⊆f X if there exists a mapping f such that
for all (ss′, A′) ∈ X′, there exists (ss, A) ∈ X such that ss = ss′ and for all x ∈ A′,
f(x) ∈ A. Notice that there might be clocks in Act(X) which are not mapped to. We
write range(f) to denote the set of clocks which are mapped to in Act(X). With an
abuse of notation, given a constraint δ′ constituted by clocks in Act(X′), we write
f(δ′) to denote the constraint obtained by renaming the clocks according to f. We
write δ ⊆f δ′ if δ[range(f)] ⊆ f(δ′), i.e., the clock valuations which satisfy the
constraint δ[range(f)] (obtained by projecting δ onto the clocks in range(f)) satisfy δ′
after clock renaming. Next, we define a relation between two abstract configurations.
We write (s, X, δ) ≼ (s′, X′, δ′) iff the following are satisfied: s = s′ and there exists
a mapping f such that X′ ⊆f X and δ ⊆f δ′. The next lemma establishes that ≼ is a
simulation relation.
Lemma 2. Let (s, X, δ) and (s′, X′, δ′) be states in Zr^LU. Let F = {(s, ∅, δ0)} be the
set of target states. (s′, X′, δ′) simulates (s, X, δ) w.r.t. F if (s, X, δ) ≼ (s′, X′, δ′).
For example, let ps0 denote the state at level 1 in Fig. 3(a). Let ps1 denote the bold-lined
state at level 1 and ps2 denote the one at level 5 in Fig. 3(b). With the LU simulation
relation, ps0 can be replaced by ps1. A renaming function f can be defined from
clocks in ps1 to clocks in ps2, i.e., f(z0) = z1 and f(z1) = z2. After renaming, ps1
becomes (p1, {(s1, {z2, z1})}, 0 ≤ x = z2 < z1). Therefore, ps2 ≼ ps1 and hence we
do not need to explore from ps2. Similarly, we do not need to explore from the bold-lined
state at level 3 in Fig. 3(b), namely ps3. Notice that without the LU simulation
reduction ps3 ≼ ps1 cannot hold, and the successors of ps3 must be explored.
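Checking (s, X, δ) ≼ (s′, X′, δ′) involves searching for a suitable injection f; a brute-force sketch (ours, with the zone test δ ⊆_f δ′ abstracted into a callback, since it depends on the zone representation) is:

```python
from itertools import permutations

def find_mapping(X, X2, act, act2, zone_subsumed):
    """
    X, X2: node sets of two abstract states (s, X, delta) and (s, X2, delta2),
    each node a pair (location, set_of_active_clocks).
    act, act2: Act(X) and Act(X2).  zone_subsumed(f) decides delta subseteq_f delta2.
    Returns an injection f: act2 -> act witnessing X2 subseteq_f X, or None.
    """
    act, act2 = sorted(act), sorted(act2)
    if len(act2) > len(act):
        return None
    for image in permutations(act, len(act2)):   # enumerate injections
        f = dict(zip(act2, image))
        ok = all(
            any(loc2 == loc and all(f[x] in clocks for x in clocks2)
                for (loc, clocks) in X)
            for (loc2, clocks2) in X2
        )
        if ok and zone_subsumed(f):
            return f
    return None
```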

3.4 Algorithm
In the following, we present our semi-algorithm. Let ZrLU be the tuple (S, Init, Σ, T )
where Init is a set (initp , Inits , LU ((Cp = 0 ∧ z0 = 0)↑ )). Algorithm 1 constructs
² Notice that we may not be able to represent this set as a convex time constraint [5].

Algorithm 1. Language inclusion checking


1: let working := Init;
2: let done := ∅;
3: while working ≠ ∅ do
4:   remove ps = (sp, Xs, δ) from working;
5:   add ps into done and remove all ps′ ∈ done s.t. ps′ ≼ ps;
6:   for all (sp′, Xs′, δ′) ∈ post(ps, Zr^LU) do
7:     if Xs′ = ∅ then
8:       return false;
9:     end if
10:    if ∄ ps′ ∈ done such that (sp′, Xs′, δ′) ≼ ps′ then
11:      put (sp′, Xs′, δ′) into working;
12:    end if
13:  end for
14: end while
15: return true;

Zr^LU on-the-fly while performing reachability analysis with simulation reduction. It
maintains two data structures. One is a set working which stores states in S which are
yet to be explored. The other is a set done which contains states which have already
been explored. Initially, working is set to be Init and done is empty. During the loop
from line 3 to line 14, each time a state is removed from working and added to done.
Notice that in order to keep done small, whenever a state ps is added into done, all
states which are simulated by ps are removed. We generate successors of ps at line 6.
For each successor, if it is a target state, we return false at line 8. If it is simulated by
a state in done, it is ignored. Otherwise, it is added into working so that it will be
explored later. Lastly, we return true at line 15 after exploring all states. We remark that
done is an Anti-Chain [22] as any pair of states in done is incomparable. The following
theorem states that the semi-algorithm always produces correct results.
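A direct rendering of Algorithm 1 follows (ours; the successor function post, the target test, and the simulation check ≼ are passed in as callbacks, since they encapsulate the zone machinery described above):

```python
def language_inclusion(init_states, post, is_target, simulated_by):
    """
    Semi-algorithm: returns False as soon as a target state (s_p, {}, delta)
    is reachable, True if the anti-chain exploration finishes.
    simulated_by(a, b) should hold iff a is simulated by b (a "preceq" b).
    """
    working = list(init_states)
    done = []                     # anti-chain of explored, incomparable states
    while working:
        ps = working.pop()
        # keep `done` small: drop states that the newly explored state subsumes
        done = [q for q in done if not simulated_by(q, ps)]
        done.append(ps)
        for succ in post(ps):
            if is_target(succ):
                return False
            if not any(simulated_by(succ, q) for q in done):
                working.append(succ)
    return True
```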

Theorem 2. Algorithm 1 returns true iff L(P) ⊆ L(S). 

Next, we establish sufficient conditions for the termination of the semi-algorithm with the
theory of well quasi-orders (WQO [15]). A quasi-order (QO) on a set A is a pair
(A, ≼) where ≼ is a reflexive and transitive binary relation in A × A. A QO is a
WQO if for each infinite sequence a0, a1, a2, . . . composed of elements of A,
there exists i < j such that aj ≼ ai. Therefore, if a WQO can be found among the states
of Zr^LU with the simulation relation ≼, our semi-algorithm terminates, as stated in the
following theorem.

Theorem 3. Let S be the set of states of Zr^LU. If (S, ≼) is a WQO, Algorithm 1 is
terminating.

The above theorem implies that our semi-algorithm always terminates given the subclass
of timed automata satisfying the clock boundedness condition [4], including strongly
non-Zeno timed automata, event-clock timed automata and timed automata with integer
resets. That is, if the boundedness condition is satisfied, ZrLU has a bounded number

of clocks, and if the number of clocks is bounded, obviously the set S is finite (with
maximum ceiling zone normalization). Since (S, =) is a WQO if S is finite, by a property
of WQOs, and ‘=’ implies ‘≼’, (S, ≼) is also a WQO in this special case. Furthermore,
the theorem also shows that the semi-algorithm is terminating for all single-clock timed
automata, which may not satisfy the boundedness condition, as a WQO has been shown
in [1].

4 Evaluation

Our method has been implemented with 46K lines of C# code and integrated into the
PAT model checker [19].³ We remark that in our setting, a zone may not be convex
(for instance, due to negation used in constructing Zr ) and thus cannot be represented
as a single difference bound matrix (DBM). Rather it can be represented either as a
difference bound logic formula, as shown in [3], or as a set of DBMs. In this work, the
latter approach is adopted for the efficiency reason. In the following, we evaluate our
approach in order to answer three research questions. All experiment data are obtained
using a PC with Intel(R) Core(TM) i7-2600 CPU at 3.40 GHz and 8.0 GB RAM.
The first question is: are timed automata good to specify commonly used timed
properties? That is, if timed automata are used to model the properties, will our semi-
algorithm terminate? In [8,12], the authors summarized a set of commonly used patterns
for real-time properties. Some of the patterns are shown below where a, b, c are events;
x is a clock and h denotes all the other events. Most of the patterns are self-explanatory
and therefore we refer the readers to [8,12] for details. We remark that although the
patterns below are all single-clock timed automata, a specification may be the parallel
composition of multiple patterns and hence have multiple clocks. Observe that all timed
automata below are deterministic except (g). A simple investigation shows that (g) sat-
isfies the clock boundedness condition and hence our semi-algorithm terminates for all
the properties below.
[Pattern timed automata (diagrams omitted): (a) absence, (b) universality, (c) existence, (d) response, (e) precedence-1, (f) precedence-2, (g) chains, (h) occurrence times.]

The second question is: is the semi-algorithm useful in practice? That is, given a
real-world system, is it scalable? In the following, we model and verify benchmark
³ PAT and the experiment details can be found at http://www.comp.nus.edu.sg/˜pat/refine ta

Table 1. Experiments on Language Inclusion Checking for Timed Systems


System   |Cs|   Det   [≼+LU] stored  total  time   [LU] stored  total  time   [≼] stored  total  time
Fischer*8 1 Yes 91563 224208 28.3 138657 300384 516.7 - - -
Fischer*6 6 No 38603 78332 537.0 - - - - - -
Fischer*6 2 No 27393 58531 6.8 36218 70348 30.3 - - -
Fischer*7 2 No 121782 271895 42.9 159631 326772 661.7 - - -
Railway*8 1 Yes 796154 1124950 142.1 - - - - - -
Railway*6 6 No 23265 33427 7.2 27903 39638 20.4 - - -
Railway*7 7 No 180034 260199 66.7 222806 318698 1352.8 - - -
Lynch*5 1 Yes 3852 11725 0.6 16193 48165 6.0 45488 421582 377.2
Lynch*7 1 Yes 79531 400105 34.9 - - - - - -
Lynch*5 2 No 8091 29686 2.4 63623 208607 151.3 56135 324899 290.1
Lynch*6 2 No 35407 162923 16.7 477930 1828668 5751.1 - - -
FDDI*7 7 Yes 1198 1590 7.4 8064 9592 36.4 8452 11836 125.5
CSMA*7 1 Yes 9840 36255 4.5 - - - - - -

timed systems using our semi-algorithm and evaluate its performance. The benchmark
systems include Fischer’s mutual exclusion protocol (Fischer for short, similarly here-
inafter), Lynch-Shavit’s mutual exclusion protocol (Lynch), railway control system
(Railway), fiber distributed data interface (FDDI), and CSMA/CD protocol (CSMA).
The results are shown in Table 1. The systems are all built as networks of timed au-
tomata, and the number of processes is shown in column ‘System’. The verified prop-
erties are requirements on the systems specified using the timed patterns. Some of the
properties contain one timed automaton with one clock, while the rest are networks of
timed automata with more than one clock (one clock for each timed automaton). In
the table, column ‘|Cs |’ is the number of clocks (processes) in the specification. The
systems in the same group, e.g., Fischer*6 and Fischer*7 both with |Cs | = 2, have
the same specification. Notice that the number of processes in a system and the one
in the specification can be different because we can ‘hide’ events in the systems and
use h in the specifications as shown in the patterns. Column ‘Det’ shows whether the
specification is deterministic or not. The results of our semi-algorithm are shown in
column ‘≼+LU’. In order to show the effectiveness of simulation reduction, we show
the results without ≼ reduction in column ‘LU’ and the results without LU reduction
in column ‘≼’. For each algorithm, column ‘stored’ denotes the number of stored states;
column ‘total’ denotes the total number of generated states; column ‘time’ denotes the
verification time in seconds. Symbol ‘-’ means either the verification time is more than
2 hours or an out-of-memory exception happens. Notice that our semi-algorithm termi-
nates in all cases and all verification results are true. Comparing stored and total, we
can see that many states are skipped due to simulation reduction. From the verification
time we can see that both simulation relations are helpful in reducing the state space.
To the best of our knowledge, there is no existing tool supporting language inclusion
checking of these models.

Table 2. Experiments on Random Timed Automata


|S| |C| Dt = 0.6 Dt = 0.8 Dt = 1.0 Dt = 1.1 Dt = 1.3
4 1 1.00\0.99\0.98 0.99\0.93\0.74 0.99\0.82\0.59 0.99\0.63\0.39 0.89\0.18\0.09
4 2 0.99\0.98\0.94 0.98\0.87\0.68 0.94\0.72\0.51 0.85\0.49\0.33 0.45\0.12\0.06
4 3 0.99\0.98\0.93 0.95\0.82\0.65 0.89\0.67\0.52 0.75\0.42\0.28 0.31\0.10\0.06
6 1 1.00\0.99\0.98 0.99\0.97\0.90 0.99\0.61\0.41 0.97\0.43\0.29 0.83\0.13\0.08
6 2 0.99\0.99\0.98 0.99\0.96\0.88 0.88\0.49\0.32 0.79\0.34\0.22 0.44\0.09\0.05
6 3 0.99\0.99\0.98 0.99\0.94\0.85 0.78\0.44\0.29 0.69\0.31\0.21 0.34\0.11\0.07
8 1 1.00\0.99\0.99 0.99\0.92\0.83 0.96\0.53\0.40 0.94\0.37\0.31 0.55\0.08\0.07
8 2 0.99\0.99\0.99 0.99\0.91\0.84 0.84\0.48\0.37 0.73\0.32\0.25 0.25\0.10\0.09
8 3 0.99\0.99\0.99 0.98\0.91\0.83 0.78\0.47\0.38 0.70\0.40\0.32 0.20\0.08\0.07

The last question is: how good are timed automata as a specification language? We
consider a timed automaton specification to be 'good' if, given an implementation model,
our semi-algorithm answers conclusively on the language inclusion problem. To answer
this question, we extend the approach on generating non-deterministic finite automata
in [20] to automatically generate random timed automata, and then apply our semi-
algorithm for language inclusion checking. Without loss of generality, a generated timed
automaton always has one initial state and the alphabet is {0, 1}. In addition, the follow-
ing parameters are used to control the random generation process: the number of states
|S|, the number of clocks |C|, a parameter Dt for transition density, and a clock ceiling.
For each event in the alphabet, we generate k transitions (and hence the transition den-
sity for the event is Dt = k/|S|) and distribute the transitions randomly among all |S|
states. For each transition, the clock constraint and the resetting clocks are generated
randomly according to the clock ceiling. We remark that if both implementation and
specification models are generated randomly, language inclusion almost always fails.
Thus, in order to have cases where language inclusion does hold, we generate a group
of implementation-specification pairs by generating an implementation first, and then
adding transitions to the implementation to obtain the specification.
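To make the generation procedure concrete, the following Python sketch mirrors the description above: for each event, k = Dt · |S| transitions are distributed randomly over the states, each with clock constraints and resets drawn up to the clock ceiling. It is only an illustration of the setup; the guard and reset probabilities and the data representation are our own choices, not those of the implementation in PAT.

```python
import random

def random_timed_automaton(num_states, num_clocks, density, ceiling, alphabet=("0", "1")):
    """Generate a random timed automaton in the style described in the text.

    Each transition is a tuple (source, event, guard, resets, target), where the
    guard is a list of simple constraints (clock, op, constant) bounded by the
    clock ceiling. The automaton has a single initial state 0.
    """
    states = list(range(num_states))
    clocks = ["x%d" % i for i in range(num_clocks)]
    k = max(1, round(density * num_states))      # transitions per event, Dt = k/|S|
    transitions = []
    for event in alphabet:
        for _ in range(k):
            src, tgt = random.choice(states), random.choice(states)
            # guard and resets are chosen randomly w.r.t. the clock ceiling;
            # the probabilities below are illustrative only
            guard = [(c, random.choice(["<=", ">="]), random.randint(0, ceiling))
                     for c in clocks if random.random() < 0.5]
            resets = [c for c in clocks if random.random() < 0.3]
            transitions.append((src, event, guard, resets, tgt))
    return {"initial": 0, "states": states, "clocks": clocks, "transitions": transitions}

# An implementation/specification pair: the specification is obtained by adding
# transitions to a copy of the implementation, so inclusion holds by construction.
impl = random_timed_automaton(4, 2, density=0.8, ceiling=10)
extra = random_timed_automaton(4, 2, density=0.3, ceiling=10)["transitions"]
spec = {**impl, "transitions": impl["transitions"] + extra}
```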
The experimental results are shown in Table 2⁴. For each combination of |S|, |C| and Dt,
we compute three numbers shown in the form a \ b \ c: a is the percentage of cases in which
our semi-algorithm terminates; c is the percentage of cases satisfying the boundedness
condition (and therefore being determinizable [4]). The gap between a and c thus shows
the effectiveness of our approach on timed automata which may be non-determinizable.
In order to show the effectiveness of simulation reduction, b is the percentage of cases in
which our semi-algorithm terminates without simulation reduction (and with maximum
ceiling zone normalization). We generate 1000 random pairs to calculate each number. In
all cases a ≥ b and b ≥ c; e.g., a is much larger than
⁴ Notice that there are cases where there is only one clock in the specification and yet our semi-
algorithm is not terminating. This is because of using a set of DBMs to represent zones. That
is, because there is no efficient procedure to check whether a zone z is a subset of another
(which is represented as the union of multiple DBMs), the LU-simulation that we discover is
partial and we may unnecessarily explore more states, infinitely more in some cases.

b and c when Dt ≥ 1.0. This result implies that our semi-algorithm terminates even if
the specification may not be ‘determinizable’, which we credit to simulation reduction
and the fact that the semi-algorithm is on-the-fly (so that language inclusion checking
can be done without complete determinization). When transition density increases, the
gap between a and b increases (e.g., when Dt ≥ 1.0, b is always much smaller than
a), which evidences the effectiveness of our simulation reduction. In general, the lower
the density is, the more likely it is that the semi-algorithm terminates. We calculate the
transition density of the timed property patterns and the benchmark systems. We find
that all the events have transition densities less than or equal to 1.0 except the absence
pattern. Based on the results presented in Table 2, we conclude that in practice, our
semi-algorithm has a high probability of terminating. This perhaps supports the view
that timed automata could serve as a good specification language.

5 Related Work
The work in [2] is the first study on the language inclusion checking problem for
timed automata. The work shows that timed automata are not closed under complement,
which is an obstacle in automatically comparing the languages of two timed automata.
Naturally, this conclusion leads to work on identifying determinizable subclasses of
timed automata, with reduced expressiveness. Several subclasses of timed automata
have been identified, i.e., event-clock timed automata [3,17], timed automata with inte-
ger resets [18] or with one clock [16] and strongly non-Zeno timed automata [4].
Our work is inspired by the work in [4] which presents an approach for deciding
when a timed automaton is determinizable. The idea is to check whether the timed
automaton satisfies a clock boundedness condition. The authors show that the condi-
tion is satisfied by event-clock timed automata, timed automata with integer resets and
strongly non-Zeno timed automata. Using region construction, it is shown in [4] that an
equivalent deterministic timed automaton can be constructed if the given timed automa-
ton satisfies the boundedness condition. The work is closely related to [1], in which the
authors proposed a zone-based approach for determinizing timed automata with one
clock. Our work combines [1,4] and extends them with simulation reduction so as to
provide an approach which could be useful for arbitrary timed automata in practice.
In addition, a game-based approach for determinizing timed automata has been pro-
posed in [6,13]. This approach produces an equivalent deterministic timed automaton
or a deterministic over-approximation, which allows one to enlarge the set of timed
automata that can be automatically determinized compared to the one in [4]. In com-
parison, our approach could determinize timed automata which fail the boundedness
condition in [4], and can cover the examples shown in [6]. The work is remotely re-
lated to work in [10]. In particular, it has been shown that under digitization with the
definition of weakly monotonic timed words, whether the language of a closed timed
automaton is included in the language of an open timed automaton is decidable [10].

6 Conclusion
In summary, the contributions of this work are threefold. First, we develop a zone-based
approach for language inclusion checking of timed automata, which is further combined

with simulation reduction for better performance. Second, we investigate, both theoreti-
cally and empirically, when the semi-algorithm terminates. Lastly, we implement the
semi-algorithm in the PAT framework and apply it to benchmark systems. As far as
the authors know, our implementation is the first tool which supports using arbitrary
timed automata as a specification language. More importantly, with the proposed semi-
algorithm and the empirical results, we would like to argue that timed automata do serve
as a specification language in practice. As for future work, we would like to investigate
the language inclusion checking problem with the assumption of non-Zenoness.

References
1. Abdulla, P.A., Ouaknine, J., Quaas, K., Worrell, J.B.: Zone-Based Universality Analysis
for Single-Clock Timed Automata. In: Arbab, F., Sirjani, M. (eds.) FSEN 2007. LNCS,
vol. 4767, pp. 98–112. Springer, Heidelberg (2007)
2. Alur, R., Dill, D.L.: A Theory of Timed Automata. Theoretical Computer Science 126(2),
183–235 (1994)
3. Alur, R., Fix, L., Henzinger, T.A.: Event-clock Automata: A Determinizable Class of Timed
Automata. Theoretical Computer Science 211, 253–273 (1999)
4. Baier, C., Bertrand, N., Bouyer, P., Brihaye, T.: When Are Timed Automata Determiniz-
able? In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.)
ICALP 2009, Part II. LNCS, vol. 5556, pp. 43–54. Springer, Heidelberg (2009)
5. Behrmann, G., Bouyer, P., Larsen, K.G., Pelánek, R.: Lower and Upper Bounds in Zone-
based Abstractions of Timed Automata. International Journal on Software Tools for Tech-
nology Transfer 8(3), 204–215 (2004)
6. Bertrand, N., Stainer, A., Jéron, T., Krichen, M.: A Game Approach to Determinize Timed
Automata. In: Hofmann, M. (ed.) FOSSACS 2011. LNCS, vol. 6604, pp. 245–259. Springer,
Heidelberg (2011)
7. Dill, D.L., Hu, A.J., Wong-Toi, H.: Checking for Language Inclusion Using Simulation Pre-
orders. In: Larsen, K.G., Skou, A. (eds.) CAV 1991. LNCS, vol. 575, pp. 255–265. Springer,
Heidelberg (1992)
8. Gruhn, V., Laue, R.: Patterns for Timed Property Specifications. Electronic Notes in Theo-
retical Computer Science 153(2), 117–133 (2006)
9. Henzinger, T.A., Kopke, P.W., Wong-Toi, H.: The Expressive Power of Clocks. In: Fülöp, Z.
(ed.) ICALP 1995. LNCS, vol. 944, pp. 417–428. Springer, Heidelberg (1995)
10. Henzinger, T.A., Manna, Z., Pnueli, A.: What Good are Digital Clocks? In: Kuich, W. (ed.)
ICALP 1992. LNCS, vol. 623, pp. 545–558. Springer, Heidelberg (1992)
11. Henzinger, T.A., Nicollin, X., Sifakis, J., Yovine, S.: Symbolic Model Checking for Real-
time Systems. Journal of Information and Computation 111(2), 193–244 (1994)
12. Konrad, S., Cheng, B.H.C.: Real-time Specification Patterns. In: ICSE, pp. 372–381 (2005)
13. Krichen, M., Tripakis, S.: Conformance Testing for Real-Time Systems. Formal Methods in
System Design 34(3), 238–304 (2009)
14. Larsen, K.G., Pettersson, P., Wang, Y.: UPPAAL in a Nutshell. Journal on Software Tools for
Technology Transfer 1(1-2), 134–152 (1997)
15. Marcone, A.: Foundations of BQO Theory. Transactions of the American Mathematical So-
ciety 345(2), 641–660 (1994)
16. Ouaknine, J., Worrell, J.: On The Language Inclusion Problem for Timed Automata: Closing
a Decidability Gap. In: LICS, pp. 54–63 (2004)
17. Raskin, J., Schobbens, P.: The Logic of Event Clocks - Decidability, Complexity and Expres-
siveness. Journal of Automata, Languages and Combinatorics 4(3), 247–286 (1999)

18. Suman, P.V., Pandya, P.K., Krishna, S.N., Manasa, L.: Timed Automata with Integer Resets:
Language Inclusion and Expressiveness. In: Cassez, F., Jard, C. (eds.) FORMATS 2008.
LNCS, vol. 5215, pp. 78–92. Springer, Heidelberg (2008)
19. Sun, J., Liu, Y., Dong, J.S., Pang, J.: PAT: Towards flexible verification under fairness. In:
Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 709–714. Springer, Heidel-
berg (2009)
20. Tabakov, D., Vardi, M.Y.: Experimental Evaluation of Classical Automata Constructions.
In: Sutcliffe, G., Voronkov, A. (eds.) LPAR 2005. LNCS (LNAI), vol. 3835, pp. 396–411.
Springer, Heidelberg (2005)
21. Tripakis, S.: Verifying progress in timed systems. In: Katoen, J.-P. (ed.) ARTS 1999. LNCS,
vol. 1601, pp. 299–314. Springer, Heidelberg (1999)
22. De Wulf, M., Doyen, L., Henzinger, T.A., Raskin, J.-F.: Antichains: A New Algorithm for
Checking Universality of Finite Automata. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS,
vol. 4144, pp. 17–30. Springer, Heidelberg (2006)
23. Yovine, S.: Kronos: a Verification Tool for Real-time Systems. Journal on Software Tools for
Technology Transfer 1(1-2), 123–133 (1997)
Formal Design of Fault Detection
and Identification Components
Using Temporal Epistemic Logic

Marco Bozzano, Alessandro Cimatti, Marco Gario, and Stefano Tonetta

Fondazione Bruno Kessler, Trento, Italy


{bozzano,cimatti,gario,tonettas}@fbk.eu

Abstract. Automated detection of faults and timely recovery are fundamental
features for autonomous critical systems. Fault Detection and
Identification (FDI) components are designed to detect faults on-board,
by reading data from sensors and triggering predefined alarms.
The design of effective FDI components is an extremely hard problem,
also due to the lack of a complete theoretical foundation, and of precise
specification and validation techniques.
In this paper, we present the first formal framework for the design of
FDI for discrete event systems. We propose a logical language for the
specification of FDI requirements that accounts for a wide class of prac-
tical requirements, including novel aspects such as maximality and non-
diagnosability. The language is equipped with a clear semantics based on
temporal epistemic logic. We discuss how to validate the requirements
and how to verify that a given FDI component satisfies them. Finally,
we develop an algorithm for the synthesis of correct-by-construction FDI
components, and report on the applicability of the framework on an in-
dustrial case-study coming from aerospace.

1 Introduction
The correct operation of complex critical systems (e.g., trains, satellites, cars)
increasingly relies on the ability to detect when and which faults occur during op-
eration. This function, called Fault Detection and Identification (FDI), provides
information that is vital to drive the containment of faults and their recovery.
This is especially true for fail-operational systems, where the occurrence of faults
should not compromise the ability to carry on critical functions, as opposed to
fail-safe systems, where faults are typically handled by going to a safe state. FDI
is typically carried out by dedicated modules, called FDI components, running
in parallel with the system. An FDI component processes sequences of observa-
tions, made available by predefined sensors, and is required to trigger a set of
predefined alarms in a timely and accurate manner. The alarms are then used
by recovery modules to autonomously guarantee the survival of the system.
Faults are often not directly observable, and their occurrence can only be
inferred by observing the effects that they have on the observable parts of the
system. Moreover, faults may have complex dynamics, and may interact with
each other in complex ways. For these reasons, the design of FDI components


is a very challenging task, as witnessed by a recent Invitation To Tender issued


by the European Space Agency [1]. The key reasons are the lack of a clear and
complete theoretical foundation, supported by clear and effective specification
and validation techniques. As a consequence, the design often results in very con-
servative assumptions, so that the overall system features suboptimal behaviors,
and is not trusted during critical phases.
In this paper, we propose a formal foundation to support the design of FDI by
introducing a pattern based language for the specification of FDI requirements.
Intuitively, an FDI component is specified by stating which are the observable
signals (the inputs of the FDI component), the desired alarms (in terms of the
unobservable state), and defining the relation between the two. The language
supports various forms of delay (exact, finite, bounded) between the occurrence
of faults, and the raising of the corresponding alarm. The patterns are provided
with an underlying formal semantics expressed in epistemic temporal logic [2],
where the knowledge operator is used to express the certainty of a condition,
based on the available observations. The formalization encodes properties such
as alarm correctness (whenever an alarm is raised by the FDI component, then
the associated condition did occur), and alarm completeness (if an alarm is
not raised, then either the associated condition did not occur, or it would have
been impossible to detect it, given the available observations). Moreover, we
precisely characterize two aspects that are important for the specification of FDI
requirements. The first one is the diagnosability of the plant, i.e., whether the
sensors convey enough information to detect the required conditions. We explain
how to deal with non-diagnosable plants by introducing the more fine grained
concept of trace diagnosability, where diagnosability is localized to individual
traces. The second is the maximality of the diagnoser, that is, the ability of the
diagnoser to raise an alarm as soon as and whenever possible.
Within our framework, we cover the problems of (i) validation of a given speci-
fication, (ii) verification of a given diagnoser with respect to a given specification,
and (iii) automated synthesis of a diagnoser from a given specification, using a
synchronous and perfect-recall semantics for the epistemic operator. Moreover,
we provide formal proofs for the correctness of the synthesis algorithm, we show
that the specification language correctly captures the formal semantics and we
clearly define the relation between diagnosability, maximality and correctness.
The framework has been validated on an industrial setting, in a project funded
by the European Space Agency [1]. The framework provides the conceptual foun-
dation underlying a design toolset, which has been applied to the specification,
verification and synthesis of an FDI component for a satellite.
It is important to remark the deep difference between the design of FDI com-
ponents and diagnosis. A diagnosis system can benefit from a powerful computing
platform, it can provide partial diagnoses, and can possibly require further (post-
mortem) inspections. This is typical of approaches that rely on logical reasoning
engines (e.g., SAT solvers [3]). An FDI component, on the contrary, runs on-
board (as part of the on-line control strategy) and is subject to restrictions such
as timing and computation power. FDI design thus requires a deeper theory,

which accounts for the issues of delay in raising the alarms, trace diagnosability,
and maximality. Furthermore, a consistency-based approach [3] is not applica-
ble to the design of FDI: in order to formally verify the effectiveness of an FDI
component as part of an overall fault-management strategy, a formal model of
the FDI component (e.g., as an automaton) is required.
This paper is structured as follows. Section 2 provides some introductory
background. Section 3 formalizes the notion of FDI. Section 4 presents the spec-
ification language. In Section 5 we discuss how to validate the requirements,
and how to verify an FDI component with respect to the requirements. In Sec-
tion 6 we present an algorithm for the synthesis of correct-by-construction FDI
components. The results of evaluating our approach in an industrial setting are
presented in Section 7. Section 8 compares our work with previous related works.
Section 9 concludes the paper with a hint on future work.

2 Background
Plants and FDIs are represented as transition systems. A transition system is a
tuple S = ⟨V, Vo, W, Wo, I, T⟩, where V is the set of state variables and Vo ⊆ V is
the set of observable state variables; W is the set of input variables and Wo ⊆ W is
the set of observable input variables; I is a formula over V defining the initial
states; T is a formula over V, W, V′ (with V′ being the next-state version of the state
variables) defining the transition relation.
A state s is an assignment to the state variables V. We denote with s′ the
corresponding assignment to V′. An input i is an assignment to the input vari-
ables W. The observable part obs(s) of a state s is the projection of s on the
subset Vo of observable state variables. The observable part obs(i) of an input i
is the projection of i on the subset Wo of observable input variables. Given an
assignment a to a set of variables X and X1 ⊆ X, we denote the projection of a
over X1 with a|X1. Thus, obs(s) = s|Vo and obs(i) = i|Wo.
A trace of S is a sequence π = s0, i1, s1, i2, s2, ... of states and inputs such that
s0 satisfies I and, for each k ≥ 0, ⟨sk, ik+1, sk+1⟩ satisfies T. W.l.o.g. we consider
infinite traces only. The observable part of π is obs(π) = obs(s0), obs(i1), obs(s1),
obs(i2), obs(s2), .... Given a sequence π = s0, i1, s1, i2, s2, ... and an integer k ≥ 0,
we denote with π^k the finite prefix s0, i1, ..., sk of π containing the first k+1 states,
and with π[k] the (k+1)-th state sk. We say that s is reachable in S iff there exists
a trace π of S such that s = π[k] for some k ≥ 0. We say that S is deterministic if
i) there are no two distinct initial states s0 and s0′ s.t. obs(s0) = obs(s0′), and
ii) there are no two distinct transitions ⟨s, i1, s1⟩ and ⟨s, i2, s2⟩ from a reachable
state s s.t. obs(i1) = obs(i2) and obs(s1) = obs(s2).
Let S1 = ⟨V1, Vo1, W1, Wo1, I1, T1⟩ and S2 = ⟨V2, Vo2, W2, Wo2, I2, T2⟩ be two
transition systems with ∅ = (V1 \ Vo1) ∩ V2 = V1 ∩ (V2 \ Vo2) = (W1 \ Wo1) ∩ W2 =
W1 ∩ (W2 \ Wo2). We define the synchronous product S1 × S2 as the transition
system ⟨V1 ∪ V2, Vo1 ∪ Vo2, W1 ∪ W2, Wo1 ∪ Wo2, I1 ∧ I2, T1 ∧ T2⟩. Every state s
of S1 × S2 can be considered as the product s1 × s2 such that s1 = s|V1 is a
state of S1 and s2 = s|V2 is a state of S2.

We say that S1 is compatible with S2 iff i) for every initial state s2 of S2, there
exists an initial state s1 of S1 such that s1|Vo1∩Vo2 = s2|Vo1∩Vo2, and ii) for every
reachable state s1 × s2 of S1 × S2 and every transition ⟨s2, i2, s2′⟩ of S2, there exists
a transition ⟨s1, i1, s1′⟩ of S1 such that i1|Wo1∩Wo2 = i2|Wo1∩Wo2 and
s1′|Vo1∩Vo2 = s2′|Vo1∩Vo2.

3 Formal Framework

3.1 Diagnoser

A diagnoser is a machine D that synchronizes with observable traces of the plant


P . D has a set A of Boolean alarm variables that are activated in response to
the monitoring of P . We use diagnoser and FDI component interchangeably,
and call system the composition of the plant and the FDI component.
Formally, given a set A of alarms and a plant transition system P =
⟨VP, VoP, WP, WoP, IP, TP⟩, a diagnoser is a deterministic transition system
D(A, P) = ⟨VD, VoD, WD, WoD, ID, TD⟩ such that VoP ⊆ VoD, WoP ⊆ WoD,
A ⊆ VoD and D(A, P) is compatible with P; when clear from the context, we
use D to indicate D(A, P). Given a trace πP of P we denote with D(πP) the
trace of D matching πP. Only observable variables can be shared among the two
systems and used to perform synchronization. This gives rise to the problem of
partial observability: the diagnoser cannot perfectly track the evolution of the
original system. This makes the diagnoser synthesis problem hard.

3.2 Detection, Identification, and Diagnosis Conditions

The first element for the specification of the FDI requirements is given by the
conditions that must be monitored. Here, we distinguish between detection and
identification, which are the two extreme cases of the diagnosis problem; the first
deals with knowing whether a fault occurred in the system, while the second tries
to identify the characteristics of the fault. Between these two cases there can be
intermediate ones: we might want to restrict the detection to a particular sub-
system, or identification among two similar faults might not be of interest.
For example, a data acquisition system composed of a sensor and a filter
might have several possible faults: the sensor might fail in a single way (sdie)
while the filter might fail in two ways (fdie_high or fdie_low). The detection task
is the problem of understanding when (at least) one of the two components has
failed. The identification task tries to understand exactly which fault occurred.
Similarly, e.g., if we can replace the filter whenever it fails, it might suffice to
know that one of fdie_high or fdie_low occurred (this is sometimes called isolation).
FDI components are generally used to recognize faults. However, there is
no reason to restrict our interest to faults. Recovery procedures might differ
depending on the current state of the plant, therefore, it might be important to
consider other unobservable information of the system.
We call the condition of the plant to be monitored diagnosis condition, de-
noted with β. We assume that for any point in time along a trace execution of

[Figure 1: timeline diagram showing occurrences of β and the corresponding admissible responses of ExactDel(A, β, 2), BoundDel(A, β, 4) and FiniteDel(A, β).]

Fig. 1. Examples of alarm responses to the diagnosis condition β

the plant (and therefore also of the system), β is either true or false based on
what happened before that time point. Therefore, β can be an atomic condi-
tion (including faults), a sequence of atomic conditions, or Boolean combination
thereof. If β is a fault, the fault must be identified; if β is a disjunction of faults,
instead, it suffices to perform the detection, without identifying the exact fault.

3.3 Alarm Conditions


The second element of the specification of FDI requirements is the relation be-
tween a diagnosis condition and the raising of an alarm. This also leads to the
definition of when the FDI is correct and complete with regard to a set of alarms.
An alarm condition is composed of two parts: the diagnosis condition and
the delay. The delay relates the time between the occurrence of the diagnosis
condition and the corresponding alarm; it might be the case that the occurrence
of a fault can go undetected for a certain amount of time. It is important to
specify clearly how long this interval can be at most. Interaction with industrial
experts led us to identify three patterns of alarm conditions, which we denote
with ExactDel(A, β, n), BoundDel(A, β, n), and FiniteDel(A, β):

1. ExactDel(A, β, n) specifies that whenever β is true, A must be triggered


exactly n steps later and A can be triggered only if n steps earlier β was true;
formally, for any trace π of the system, if β is true along π at the time point i,
then A is true in π[i + n] (Completeness); if A is true in π[i], then β must be
true in π[i − n] (Correctness).
2. BoundDel(A, β, n) specifies that whenever β is true, A must be triggered
within the next n steps and A can be triggered only if β was true within the
previous n steps; formally, for any trace π of the system, if β is true along π at
the time point i then A is true in π[j], for some i ≤ j ≤ i+n (Completeness); if A
is true in π[i], then β must be true in π[j  ] for some i − n ≤ j  ≤ i (Correctness).
3. FiniteDel(A, β) specifies that whenever β is true, A must be triggered
in a later step and A can be triggered only if β was true in some previous step;
formally, for any trace π of the system, if β is true along π at the time point i
then A is true in π[j] for some j ≥ i (Completeness); if A is true in π[i], then β
must be true along π in some time point between 0 and i (Correctness).
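To make the three patterns concrete, the following Python sketch checks the correctness and completeness conditions above on a finite prefix, given as two Boolean sequences recording, per time point, whether β and A hold. Restricting to a finite prefix (and requiring the alarm position to fall inside it) is a simplification of ours for illustration only.

```python
def exact_del_ok(beta, alarm, n):
    """ExactDel(A, beta, n): A at i iff beta held exactly n steps earlier."""
    T = len(beta)
    complete = all(not beta[i] or (i + n < T and alarm[i + n]) for i in range(T))
    correct = all(not alarm[i] or (i - n >= 0 and beta[i - n]) for i in range(T))
    return correct and complete

def bound_del_ok(beta, alarm, n):
    """BoundDel(A, beta, n): A within n steps after beta, and only then."""
    T = len(beta)
    complete = all(not beta[i] or any(alarm[j] for j in range(i, min(i + n, T - 1) + 1))
                   for i in range(T))
    correct = all(not alarm[i] or any(beta[j] for j in range(max(0, i - n), i + 1))
                  for i in range(T))
    return correct and complete

def finite_del_ok(beta, alarm):
    """FiniteDel(A, beta): A eventually after beta, and only after some beta."""
    T = len(beta)
    complete = all(not beta[i] or any(alarm[j] for j in range(i, T)) for i in range(T))
    correct = all(not alarm[i] or any(beta[j] for j in range(0, i + 1)) for i in range(T))
    return correct and complete

# Example: beta occurs at step 1, the alarm is raised within the next 4 steps.
beta  = [0, 1, 0, 0, 0, 0, 0, 0]
alarm = [0, 0, 0, 1, 1, 0, 0, 0]
assert bound_del_ok(beta, alarm, 4)
```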
Figure 1 provides an example of admissible responses for the various alarms
to the occurrences of the same diagnosis condition β; note how in the case of
BoundDel(A, β, 4) the alarm can be triggered at any point as long as it is within
the next 4 time-steps. FiniteDel(A, β) is of particular theoretical interest since
it captures the idea of diagnosis as defined in previous works [4].

An alarm condition is actually a property of the whole system since it relates


a condition of the plant with an alarm of the diagnoser. Thus, when we say that
a diagnoser D of P satisfies an alarm condition, we mean that the traces of the
system D × P satisfy it.
Considering our previous example of the data acquisition system, we can
define the following toy specification. β1 = (sdie ∨ fdie_high ∨ fdie_low) indicates
the fault detection condition; therefore we define FiniteDel(A1, β1) as the
finite-delay fault detection. Another example could be identification of the sensor
death (β2 = sdie) within a bound: BoundDel(A2, β2, 5). Finally, we could be
interested in knowing that some fault occurred in the filter, with precise
delay information: ExactDel(A3, (fdie_high ∨ fdie_low), 2).
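For illustration, this toy specification can also be written down as data; the fault names follow the text (rendered here as Python identifiers), and we assume each flag records whether the corresponding fault has occurred so far.

```python
# Diagnosis conditions of the toy data-acquisition example, as predicates over a
# state that records which faults have occurred so far.
def beta1(s):                      # detection: some fault occurred
    return s["sdie"] or s["fdie_high"] or s["fdie_low"]

def beta2(s):                      # identification of the sensor death
    return s["sdie"]

def beta3(s):                      # isolation of a filter fault
    return s["fdie_high"] or s["fdie_low"]

# The toy specification: FiniteDel(A1, beta1), BoundDel(A2, beta2, 5), ExactDel(A3, beta3, 2)
TOY_SPEC = [("A1", "FiniteDel", beta1, None),
            ("A2", "BoundDel",  beta2, 5),
            ("A3", "ExactDel",  beta3, 2)]
```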

3.4 Diagnosability
Given an alarm condition, we need to know whether it is possible to build a
diagnoser for it. In fact, there is no reason in having a specification that cannot
be realized. This property is called diagnosability and was introduced in [5].
In this section, we define the concept of diagnosability for the different types
of alarm conditions. We proceed by first giving the definition of diagnosability in
the traditional way (à la Sampath) in terms of observationally equivalent traces
w.r.t. the diagnosis condition. Then, we prove that a plant P is diagnosable iff
there exists a diagnoser that satisfies the specification. In the following, we will
not provide definitions for finite-delay since they can be obtained by generalizing
the ones for bounded-delay.

Definition 1 Given a plant P and a diagnosis condition β, we say that


ExactDel(A, β, d) is diagnosable in P iff for any pair of traces σ1 ,σ2 and
for all i ≥ 0, if obs(σ1^{i+d}) = obs(σ2^{i+d}) then σ1, i |= β iff σ2, i |= β.

Definition 2 Given a plant P and a diagnosis condition β, we say that


BoundDel(A, β, d) is diagnosable in P iff for any pair of traces σ1 ,σ2 and for
all i ≥ 0 there exists a j, i ≤ j ≤ i + d, s.t. if obs(σ1^j) = obs(σ2^j) and σ1, i |= β
then σ2 , k |= β for some k, j − d ≤ k ≤ j.

An exact-delay alarm condition is not diagnosable in P iff there exists a pair


of traces that violates the conditions of Definition 1: this would be a pair of
traces σ1 and σ2 such that for some i ≥ 0, σ1, i |= β, obs(σ1^{i+d}) = obs(σ2^{i+d}), and
σ2, i ⊭ β. We call such a pair a critical pair. Definition 2 is a generalization of
Sampath’s definition of diagnosability:
Theorem 1. Let α be a propositional formula and βα a condition that holds
in a point of a trace if α holds in some point of its prefix, then α is d-delay
diagnosable (as in [5]) in P iff BoundDel(A, βα , d) is diagnosable in P .
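Definition 1 also suggests a direct, if naive, way to look for critical pairs when a finite set of (bounded) executions of the plant is available: compare observations up to position i + d and check for disagreement on β at position i. The brute-force Python sketch below is for illustration only; in practice diagnosability is checked by model checking (see Section 5).

```python
def find_critical_pair(traces, d):
    """Search for a critical pair violating exact-delay diagnosability.

    Each trace is a finite list of (observation, beta_holds) pairs, one per step,
    where `observation` stands for the observable part of that step (state and
    input together). Returns a witnessing (trace1, trace2, i) or None.
    """
    for t1 in traces:
        for t2 in traces:
            horizon = min(len(t1), len(t2))
            for i in range(max(0, horizon - d)):
                same_obs = all(t1[k][0] == t2[k][0] for k in range(i + d + 1))
                if same_obs and t1[i][1] and not t2[i][1]:
                    return t1, t2, i      # sigma1, i satisfies beta but sigma2, i does not
    return None
```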
The following theorem shows that if a component satisfies the diagnoser spec-
ification then the monitored plant must be diagnosable for that specification. In
Section 6 on synthesis we will show also the converse, i.e., if the specification is
diagnosable then a diagnoser exists.

Theorem 2. Let D be a diagnoser for P . If D satisfies an alarm condition then


the alarm condition is diagnosable in P .

The above definition of diagnosability might be stronger than necessary, since


diagnosability is defined as a global property of the plant. Imagine the situation
in which there is a critical pair and after removing this critical pair from the
possible executions of the system, our system becomes diagnosable. This suggests
that the system was “almost” diagnosable, and an ideal diagnoser would be
able to perform a correct diagnosis in all the cases except one (i.e., the one
represented by the critical pair). To capture this idea, we redefine the problem of
diagnosability from a global property expressed on the plant, to a local property
(expressed on points of single traces).

Definition 3 Given a plant P , a diagnosis condition β and a trace σ1 such that


for some i ≥ 0 σ1 , i |= β, we say that ExactDel(A, β, d) is trace diagnosable
in σ1 , i iff for any trace σ2 such that obs(σ1i+d ) = obs(σ2i+d ) then σ2 , i |= β.

Definition 4 Given a plant P , a diagnosis condition β, and a trace σ1 such that


for some i ≥ 0 σ1 , i |= β, we say that BoundDel(A, β, d) is trace diagnosable
in σ1, i iff there exists j, i ≤ j ≤ i + d such that, for any trace σ2 with
obs(σ1^j) = obs(σ2^j), it holds that σ2, k |= β for some j − d ≤ k ≤ j.

A specification that is trace diagnosable in a plant along all points of all traces
is diagnosable in the classical sense, and we say it is system diagnosable.

3.5 Maximality
As shown in Figure 1, bounded- and finite-delay alarms are correct if they are
raised within the valid bound. However, there are several possible variations of
the same alarm in which the alarm is active in different instants or for different
periods. We address this problem by introducing the concept of maximality.
Intuitively, a maximal diagnoser is required to raise the alarms as soon as possible
and as long as possible (without violating the correctness condition).

Definition 5 D is a maximal diagnoser for an alarm condition with alarm A


in P iff for every trace πP of P , D(πP ) contains the maximum number of points
i such that D(πP), i |= A, in the sense that if D(πP), i ⊭ A then there does not
exist another correct diagnoser D′ of P such that D′(πP), i |= A.

4 Formal Specification
In this section, we present the Alarm Specification Language with Epistemic
operators (ASLK ). This language allows designers to define requirements on the
FDI alarms including aspects such as delays, diagnosability and maximality.
Diagnosis conditions and alarm conditions are formalized using LTL with past
operators [6] (from here on, simply LTL). The definitions of trace diagnosability

and maximality, however, cannot be captured by using a formalization based


on LTL. Therefore, in order to capture these two concepts, we rely on temporal
epistemic logic. The intuition is that this logic enables us to reason on sets of
observationally equivalent traces instead of on single traces (as in LTL). We
assume the reader is familiar with LTL, but we provide a brief introduction
to temporal epistemic logic, and then show how it can be used to verify diagnos-
ability, define requirements for non-diagnosable cases and express the concept of
maximality.

4.1 Temporal Epistemic Logic


Epistemic logic has been used to describe and reason about knowledge of agents
and processes. There are several ways of extending epistemic logic with temporal
operators. We use the logic KL1 [2] and extend it with past operators.
A formula in KL1 is defined as β ::= p | β ∧ β | ¬β | Oβ | Y β | F β | Xβ | Kβ.
Note how this is an extension of LTL on the past, with the addition of the epis-
temic operator K. The intuitive semantics of Oβ is that β was true in the past,
while Y β means that in the previous state β was true; the intuitive semantics of
Kβ is that the diagnoser knows that β holds in the current execution. The formal
semantics of the epistemic operator K is given on indistinguishable traces:
σ1, n |= Kβ iff ∀σ2, obs(σ1^n) = obs(σ2^n) ⇒ σ2, n |= β.
Therefore, Kβ holds at time n in a trace σ1 , if β holds in all traces that are obser-
vational equivalent to σ1 at time n. This definition implicitly forces perfect-recall
in the semantics of the epistemic operator, since we define the epistemic equiva-
lence between traces and not between states. Moreover, the traces are compared
with a synchronous semantics. Therefore, the semantics of our transition system
is synchronous and with perfect-recall (compare [2]).
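Read operationally over a finite set of explicitly given traces, this semantics is a prefix comparison: Kβ holds at time n of σ1 iff β holds at time n in every trace whose observations coincide with those of σ1 up to n. The Python sketch below is our own illustration of this synchronous, perfect-recall reading, not an implementation from the paper.

```python
def knows(traces, sigma1, n, beta_holds):
    """Evaluate K beta at time n of sigma1 w.r.t. the trace set `traces`.

    Each trace is a list of steps (observation, state); `observation` is the
    observable part of the step and beta_holds(state) gives the truth of beta
    at that step. Perfect recall and synchrony are modelled by comparing the
    full observation prefix up to n.
    """
    prefix = [step[0] for step in sigma1[:n + 1]]
    equivalent = [t for t in traces
                  if len(t) > n and [step[0] for step in t[:n + 1]] == prefix]
    return all(beta_holds(t[n][1]) for t in equivalent)
```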

4.2 Diagnosis and Alarm Conditions as LTL Properties


Let P be a set of propositions representing either faults or elementary conditions
for the diagnosis. The set DP of diagnosis conditions over P is any formula β
built with the following rule: β ::= p | β ∧ β | ¬β | Oβ | Y β with p ∈ P. We use
the abbreviations Y n φ = Y Y n−1 φ (with Y 0 φ = φ), O≤n φ = φ ∨ Y φ ∨ · · · ∨ Y n φ
and F ≤n φ = φ ∨ Xφ ∨ · · · ∨ X n φ.
We define the Alarm Specification Language (ASL) in Figure 2, where we
associate to each type of alarm condition an LTL formalization encoding the
concepts of correctness and completeness. Correctness, the first conjunct, intu-
itively says that whenever the diagnoser raises an alarm, then the fault must
have occurred. Completeness, the second conjunct, intuitively encodes that
whenever the fault occurs, the alarm will be raised. In the following, for sim-
plicity, we abuse notation and indicate with ϕ both the alarm condition and
the associated LTL; for an alarm condition ϕ, we denote with Aϕ the asso-
ciated alarm variable A, and with τ (ϕ) the following formulae: τ (ϕ) = Y n β
for ϕ = ExactDel(A, β, n); τ (ϕ) = O≤n β for ϕ = BoundDel(A, β, n);
τ (ϕ) = Oβ for ϕ = FiniteDel(A, β).

Alarm Condition LTL Formulation


ExactDel(A, β, n) G(A → Y n β) ∧ G(β → X n A)
BoundDel(A, β, n) G(A → O≤n β) ∧ G(β → F ≤n A)
FiniteDel(A, β) G(A → Oβ) ∧ G(β → F A)

Fig. 2. Alarm conditions as LTL (ASL)
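Since the table in Figure 2 is schematic in A, β and n, translating a pattern instance into its ASL formalization is a matter of templating. The sketch below writes the formulas in the paper's operator notation (it does not target the input language of any particular model checker).

```python
def asl_formula(pattern, alarm, beta, n=None):
    """Return the ASL (Fig. 2) formalization of an alarm condition as a string."""
    if pattern == "ExactDel":
        return "G({A} -> Y^{n} {b}) & G({b} -> X^{n} {A})".format(A=alarm, b=beta, n=n)
    if pattern == "BoundDel":
        return "G({A} -> O<={n} {b}) & G({b} -> F<={n} {A})".format(A=alarm, b=beta, n=n)
    if pattern == "FiniteDel":
        return "G({A} -> O {b}) & G({b} -> F {A})".format(A=alarm, b=beta)
    raise ValueError("unknown pattern: %s" % pattern)

print(asl_formula("BoundDel", "A2", "sdie", 5))
# G(A2 -> O<=5 sdie) & G(sdie -> F<=5 A2)
```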

Alarm Condition       Diagnosability condition
ExactDel(A, β, n)     G(β → X n KY n β)
BoundDel(A, β, n)     G(β → F ≤n KO≤n β)
FiniteDel(A, β)       G(β → F KOβ)
(a) Diagnosability Property.
[(b) Example of Maximality: timeline diagram of β, KO≤4 β, and the responses of a maximal and a non-maximal alarm A.]

Fig. 3. Diagnosability and Maximality

4.3 Diagnosability as Epistemic Property


We can write the diagnosability tests for the different alarm conditions directly
as epistemic properties that can be verified on single points of the traces (trace
diagnosability) or on the entire plant (system diagnosability) (Figure 3.a). For
example, the diagnosability test for ExactDel(A, β, n) says that it is always
the case that whenever β occurs, exactly n steps afterwards, the diagnoser knows
that n steps before β occurred. Since K is defined on observationally equivalent
traces, the only way to falsify the formula would be to have a trace in which
β occurs, and another one (observationally equivalent at least for the next n
steps) in which β did not occur; but this is in contradiction with the definition
of diagnosability (Definition 1).

4.4 Maximality as Epistemic Property


The property of maximality says that the diagnoser will raise the alarm as soon
as it is possible to know the diagnosis condition, and the alarm will stay up as
long as possible. The property Kτ (ϕ) → A encodes this behavior:
Theorem 3. D is maximal for ϕ in P iff D × P |= G(Kτ (ϕ) → Aϕ ).
Whenever the diagnoser knows that τ (ϕ) is satisfied, it will raise the alarm. For
bounded- and finite-delay alarms, this guarantees also that the alarm will stay
up if possible, since Kτ (ϕ) → XKY τ (ϕ). An example of maximal and non-
maximal alarm is given in Figure 3.b. Note that according to our definition, the
set of maximal alarms is a subset of the non-maximal ones.

4.5 ASLK Specifications


The formalization of ASLK (Figure 4) is obtained by extending ASL (Figure 2)
with the concepts of maximality and diagnosability, defined as epistemic proper-
ties. When maximality is required we add a third conjunct following Theorem 3.

Template     maximality = False                                    maximality = True

diag = System:
  ExactDel   G(A → Y n β) ∧ G(β → X n A)                           G(A → Y n β) ∧ G(β → X n A) ∧ G(KY n β → A)
  BoundDel   G(A → O≤n β) ∧ G(β → F ≤n A)                          G(A → O≤n β) ∧ G(β → F ≤n A) ∧ G(KO≤n β → A)
  FiniteDel  G(A → Oβ) ∧ G(β → F A)                                G(A → Oβ) ∧ G(β → F A) ∧ G(KOβ → A)

diag = Trace:
  ExactDel   G(A → Y n β) ∧ G((β → X n KY n β) → (β → X n A))      G(A → Y n β) ∧ G((β → X n KY n β) → (β → X n A)) ∧ G(KY n β → A)
  BoundDel   G(A → O≤n β) ∧ G((β → F ≤n KO≤n β) → (β → F ≤n A))    G(A → O≤n β) ∧ G((β → F ≤n KO≤n β) → (β → F ≤n A)) ∧ G(KO≤n β → A)
  FiniteDel  G(A → Oβ) ∧ G((β → F KOβ) → (β → F A))                G(A → Oβ) ∧ G((β → F KOβ) → (β → F A)) ∧ G(KOβ → A)

Fig. 4. ASLK specification patterns

When diag = trace instead, we precondition the completeness to the trace diag-
nosability (as defined in Figure 3.a); this means that the diagnoser will raise an
alarm whenever the diagnosis condition is satisfied and the diagnoser is able to
know it. The formalizations presented in the table can be simplified, but are left
as-is to simplify their comprehension. For example, in the case diag = trace, we
do not need to verify the completeness due to the following result:
Theorem 4. Given a diagnoser D for a plant P and a trace diagnosable alarm
condition ϕ, if D is maximal for ϕ, then D is complete.
A similar result holds for ExactDel in the non-maximal case, that becomes:
G(A → Y n β) ∧ G(KY n β → A). Finally, the implications for the completeness in
the trace diagnosability case can be rewritten as, e.g., G((β ∧ F KOβ) → (F A)).
Another interesting result is the following:
Theorem 5. Given a diagnoser D for a plant P and a system diagnosable con-
dition ϕ, if D is maximal for ϕ and ϕ is diagnosable in P then D is complete.
An ASLK specification is built by instantiating the patterns defined in Fig-
ure 4. For example, we would write ExactDelK (A, β, n, trace, T rue) for an
exact-delay alarm A for β with delay n, that satisfies the trace diagnosability
property and is maximal. An introductory example on the usage of ASLK for
the specification of a diagnoser is provided in [7].

5 Validation and Verification of ASLK Specifications


Thanks to the formal characterization of ASLK , it is possible to apply formal
methods for the validation and verification of a set of FDI requirements. In
validation we verify that the requirements capture the interesting behaviors and
exclude the spurious ones, before proceeding with the design of the diagnoser.
In verification, we check that a candidate diagnoser fulfills a set of requirements.

Validation In the following we focus on the validation of an alarm specification,


but the same ideas can be applied to a set of diagnosis conditions. We consider
a set of environmental assumptions E and a specification AP . The environment
assumption E may include both assumption on the plant’s input and an abstrac-
tion of the plant. It can vary in a spectrum starting from trivially no assumption
(E = ⊤), to some LTL properties, to a detailed model of the plant, going through
several intermediate levels. The idea is that throughout the different phases of
the development process, we have access to better versions of the plant model,
and therefore the analysis can be refined. For example, it might be possible to
provide some assumption on the maximum number of faults in the system, or
on their dynamics, before a complete description of the subsystems is available.
Known techniques for requirements validation (e.g.,[8]) include checking their
consistency, their compatibility with some possible scenarios, whether they en-
tail some expected properties and if they are realizable, i.e., if there exists an
implementation satisfying the requirements. In the following we instantiate these
checks for the alarm specification.
In Section 6, we prove that we can always synthesize a diagnoser satisfying
AP , with the only assumption that if AP contains some system diagnosable
alarm condition, then that condition is diagnosable in the plant. This means that
any specification AP is consistent by construction (just consider a diagnosable
plant). Moreover, the check for realizability reduces to checking that the plant
is diagnosable for the system diagnosable conditions in AP . The diagnosability
check can be performed via epistemic model-checking (Section 4.3) or it can be
reduced to an LTL model-checking problem using the twin-plant construction [9].
As check of possible scenarios, we would like that alarms should eventually
be activated, but also that alarms are not always active. This means that for
a given alarm condition ϕ ∈ AP, we are interested in verifying that there is a
trace π ∈ E and a trace π′ ∈ E s.t. π |= F Aϕ and π′ |= F ¬Aϕ. This can be
done by checking the unsatisfiability of (E ∧ ϕ) → G¬Aϕ and (E ∧ ϕ) → GAϕ.
As a check of entailed properties, it is interesting to understand whether there is
some correlation between alarms, in order to simplify the model or to guarantee
some redundancy requirement. To check whether Aϕ is a more general alarm
than Aϕ′ (subsumption) we verify that (E ∧ ϕ ∧ ϕ′) → G(Aϕ′ → Aϕ) is a
tautology. A trivial example of subsumption of alarms is given by the definition of
maximality: any non-maximal alarm is subsumed by its corresponding maximal
version. Finally, we can verify that two alarms are mutually exclusive by checking
the validity of (E ∧ ϕ ∧ ϕ′) → G¬(Aϕ ∧ Aϕ′). In general, the validation of alarm
conditions requires reasoning in temporal epistemic logic, whereas the validation
of diagnosis conditions only requires reasoning on LTL with past.
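Assuming some validity oracle for LTL under the environment assumption E (for instance a wrapper around an LTL model checker; any such `is_valid` interface is hypothetical here), the checks just described amount to constructing a handful of formulas. Following the text, the scenario checks should admit counterexample traces, while subsumption and mutual exclusion are validity checks.

```python
def scenario_formulas(E, phi, A):
    """Formulas that should NOT hold on all traces: the alarm can be raised
    on some trace and stays off on some other trace."""
    return ["(({E}) & ({p})) -> G !{A}".format(E=E, p=phi, A=A),
            "(({E}) & ({p})) -> G {A}".format(E=E, p=phi, A=A)]

def subsumption_formula(E, phi, phi2, A, A2):
    """Valid iff A is more general than A2 (whenever A2 is raised, so is A)."""
    return "(({E}) & ({p}) & ({q})) -> G ({A2} -> {A})".format(E=E, p=phi, q=phi2, A=A, A2=A2)

def mutual_exclusion_formula(E, phi, phi2, A, A2):
    """Valid iff the two alarms are never raised together."""
    return "(({E}) & ({p}) & ({q})) -> G !({A} & {A2})".format(E=E, p=phi, q=phi2, A=A, A2=A2)
```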

Verification. The verification of a system w.r.t. a specification can be performed


via model-checking techniques using the semantics of the alarm conditions:

Definition 6 Let D be a diagnoser for alarms A and plant P . We say that D


satisfies a set AP of ASLK specifications iff for each ϕ in AP there exists an
alarm Aϕ ∈ A and D × P |= ϕ.

To perform these verification steps, we need in general a model checker for KL1
with synchronous perfect recall such as MCK [10]. However, if the specification
falls in the pure LTL fragment (ASL) we can verify it with an LTL model-checker
such as NuSMV [11] thus benefiting from the efficiency of the tools in this area.
Moreover, a diagnoser is required to be compatible with the plant. Therefore,
we need to take care that the synchronous composition of the plant with the
diagnoser does not reduce the behaviors of the plant. This would imply that
there is a state and an observation that are possible for the plant, but not taken
into account by the diagnoser. Compatibility can be checked with dedicated tools
such as Ticc [12] based on game theory. However, here we require compatibility in
all environments and therefore, compatibility can be checked by model checking
by adding a sink state to the diagnoser, so that if we are in a state and we receive
an observation that was not covered by the original diagnoser, we go to the sink
state. Once we have modified the diagnoser, we verify that D × P |= G¬SinkState.
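For an explicit-state diagnoser given as a (partial) deterministic transition function over observations, the sink-state completion described above takes only a few lines; checking compatibility then reduces to verifying that the sink state is unreachable in the product with the plant. The representation below is an illustrative choice of ours, not the paper's data structure.

```python
SINK = "sink"

def complete_with_sink(delta, states, observations):
    """Complete a partial deterministic diagnoser delta: (state, obs) -> state.

    Every (state, observation) pair not covered by the original diagnoser is
    redirected to a fresh sink state; compatibility with the plant then amounts
    to model checking G !SinkState on the product.
    """
    completed = dict(delta)
    for s in list(states) + [SINK]:
        for o in observations:
            completed.setdefault((s, o), SINK)
    return completed
```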

6 Synthesis of a Diagnoser from an ASLK Specification


In this section, we sketch an algorithm to synthesize a diagnoser that satisfies
a given specification AP . The algorithm considers the most expressive case of
ASLK (maximal/trace diagnosable), satisfying, therefore all other cases.
The idea of the algorithm is to generate an automaton that encodes the set
of possible states in which the plant could be after each observation. The result
is achieved by generating the power-set of the states of the plant, and defining
a suitable transition relation among the elements of this set that only considers
the observable information. We call the sets in the power-set belief states. Each
belief state of the automaton can be annotated with the alarms that are satisfied
in all the states of the belief state, obtaining the diagnoser.
Our algorithm resembles the construction by Sampath [5] and Schumann [13].
The main differences are that we consider LTL Past expressions as diagnosis
conditions, and not only fault events as done in previous works. Moreover, instead
of providing a set of possible diagnoses, we provide alarms: we need to be certain
that the alarm condition is satisfied in all possible diagnoses in order to raise the
alarm. This gives rise to a 3-valued alarm system, in which we know that the
fault occurred, know that the fault did not occur, or are uncertain.
Given a plant P = ⟨VP, VoP, WP, WoP, IP, TP⟩, let S be the set of states of
P. The belief automaton is defined as B(P) = ⟨B, E, B0, R⟩ where B = 2^S,
E = 2^(WoP ∪ VoP), and B0 ⊆ B and R : (B × E) → B are defined as follows.
We define B0 = {b | there exists u ∈ 2^VoP s.t. for all s ∈ b, s |= IP and
obs(s) = u}: we assume that the diagnoser can be initialized by observing the
plant, and each initial belief state must, therefore, be compatible with one of the
possible initial observations on the plant. The transition function R is defined as
follows: R(b, e) = {s′ | ∃s ∈ b s.t. ⟨s, i, s′⟩ |= TP, obs(s′) = e|VoP, obs(i) = e|WoP}:
the belief state b′ = R(b, e) is a successor of b iff all the states in b′ are compatible
with the observations from a state in b.
The diagnoser is obtained by annotating each state of the belief automaton
with the corresponding alarms. To do this we explore the belief automaton,

and annotate with Aϕ all the states b that satisfy the temporal property τ (ϕ):
b |= Aϕ iff ∀s ∈ b. s |= τ(ϕ). It might occur that neither Kτ(ϕ) nor K¬τ(ϕ)
holds in a state. In this case there is at least one state in the belief state in which
τ(ϕ) holds and one in which it does not hold. This pair of states represents
uncertainty, and is caused by non-diagnosable traces.
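For a finite, explicit-state plant the construction just described is an observation-driven subset construction followed by an annotation pass. The Python sketch below (our own illustration, restricted to the propositional case τ(ϕ) = p) computes the reachable belief states and labels each with one of three verdicts: alarm (p holds in every state of the belief), safe (p holds in none), or unknown (the uncertain case discussed above).

```python
from collections import deque

def build_diagnoser(initial_states, successors, obs, holds_p):
    """Subset construction for the belief automaton B(P), propositional case.

    initial_states : iterable of plant states
    successors(s)  : iterable of (input, next_state) pairs of state s
    obs(x)         : observable part of a state or an input (hashable)
    holds_p(s)     : truth of the monitored proposition p in state s
    Returns (transitions, annotation): transitions maps (belief, (obs_in, obs_st))
    to the successor belief; annotation maps a belief to 'alarm'/'safe'/'unknown'.
    """
    # initial belief states: initial plant states grouped by their observation
    initial = {}
    for s in initial_states:
        initial.setdefault(obs(s), set()).add(s)
    queue = deque(frozenset(b) for b in initial.values())
    seen = set(queue)
    transitions, annotation = {}, {}
    while queue:
        belief = queue.popleft()
        values = {holds_p(s) for s in belief}
        annotation[belief] = ("alarm" if values == {True}
                              else "safe" if values == {False} else "unknown")
        # group the successors of the belief by the observation they produce
        grouped = {}
        for s in belief:
            for i, s2 in successors(s):
                grouped.setdefault((obs(i), obs(s2)), set()).add(s2)
        for e, target in grouped.items():
            target = frozenset(target)
            transitions[(belief, e)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return transitions, annotation
```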
We define Dϕ as the diagnoser for ϕ. For the propositional case τ(ϕ) = p,
Dϕ = ⟨V Dϕ, VoDϕ, W Dϕ, WoDϕ, I Dϕ, T Dϕ⟩ is a symbolic representation of B(P)
with VoDϕ = VoP ∪ {Aϕ}, WoDϕ = WoP, and such that every state b of Dϕ represents
a state in B (with abuse of notation we do not distinguish between the two) and,
for all v ∈ VoDϕ, v ∈ obs(b) iff for all s ∈ b, v ∈ s, and such that every observation
e of Dϕ represents an observation in E with obs(e) = e|WoDϕ. The following holds:

Theorem 6 (Compatibility). Dϕ is compatible with P .

Theorem 7 (Correctness, Completeness and Maximality). Dϕ is correct


(i.e. Dϕ × P |= G(Aϕ → τ (ϕ))), maximal (i.e. Dϕ × P |= G(K(τ (ϕ)) → Aϕ ))
and complete (under the assumption that if ϕ is system diagnosable, then ϕ is
diagnosable in P ).

All other alarm conditions can be reduced to the propositional case. We build
a new plant P′ by adding a monitor variable τ to P s.t. P′ = P × (G(τ(ϕ) ↔ τ)),
where we abuse notation to indicate the automaton that encodes the monitor
variable. By rewriting the alarm condition as ϕ′ = ExactDel(Aϕ, τ, 0), we
obtain that D × P |= ϕ iff D × P′ |= ϕ′.

7 Industrial Experience

The framework described in this paper has been motivated by, and used in, the
AUTOGEF project [1], funded by the European Space Agency. The main goal
of the project was the definition of a set of requirements for an on-board Fault
Detection, Identification and Recovery (FDIR) component and its synthesis. The
problem was tackled by synthesizing the Fault Detection (FDI) and Fault Recov-
ery (FR) components separately, with the idea that the FDI provides sufficient
diagnosis information for the FR to act on.
The AUTOGEF framework was evaluated using scalable benchmark exam-
ples. Moreover, Thales Alenia Space evaluated AUTOGEF on a case study based
on the EXOMARS Trace Gas Orbiter. This case-study is a significant exemplifi-
cation of the framework described in this paper, since it covers all the phases of
the FDIR development process. The system behavior (including faulty behavior)
was modeled using a formal language and table- and pattern-based description
of the mission phases/modes and observability characteristics of the system. The
specification of FDIR requirements by means of patterns greatly simplified the
accessibility of the tool to engineers that were not experts in formal methods.
Specification of alarms was carried out in the case of finite delay, under the
assumption of trace diagnosability and maximality of the diagnoser. Moreover,
different faults and alarms were associated with specific mission phase/mode and

configurations of the system, which enabled generation of specific alarms (and


recoveries) for each configuration. The specification was validated by performing
diagnosability analysis on the system model. The synthesis routines were run on
a system composed of 11 components, with 10 faults in total, and generated an
FDI component with 754 states. Finally, the correctness of the diagnoser was
verified by using model-checking routines. Synthesis and verification capabilities
have been implemented on top of the NuSMV model checker. We remark that
the ability to define trace diagnosable alarms was crucial for the synthesis of the
diagnoser, since most of the modeled faults were not system diagnosable.
Successful completion of the project, and positive evaluations from the indus-
trial partner and ESA, suggest that a first step towards a formal model-based
design process for FDIR was achieved.

8 Related Work

Previous works on formal FDI development have considered the specification


and synthesis in isolation. Our approach differs from the state of the art because
we provide a comprehensive view on the problem. Due to the lack of specification
formalism for diagnosers, the problem of verifying their correctness, completeness
and maximality was, to the best of our knowledge, unexplored.
Concerning specification and synthesis, [14] is close to our work. The authors
present a way to specify the diagnoser using LTL properties, and present a
synthesis algorithm for this specification. However, problems such as maximality
and trace diagnosability are not taken into account. Interesting in [14] is the
handling of diagnosis conditions with future operators.
Some approaches exist that define diagnosability as epistemic properties. Two
notable examples are [15] and [16], where the latter extends the definition of
diagnosability to a probabilistic setting. However, these works focus on finite-
delay diagnosability only, and do not consider other types of delays and the
problem of trace diagnosability.
Finally, we extend the results on diagnosability checking from [9] in order to
provide an alternative way of checking diagnosability and redefine the concept
of diagnosability at the trace level.

9 Conclusions and Future Work

This paper presents a formal framework for the design of FDI components,
that covers many practically-relevant issues such as delays, non-diagnosability
and maximality. The framework is based on a formal semantics provided by
temporal epistemic logic. We covered the specification, validation, verification
and synthesis steps of the FDI design, and evaluated the applicability of each
step on a case-study from aerospace. To the best of our knowledge, this is the
first work that provides a formal and unified view to all the phases of FDI design.
In the future, we plan to explore the following research directions. First, we
will extend FDI to deal with asynchronous and infinite-state systems. In this
work we addressed the development of FDI for finite state synchronous systems

only. However, it would be of practical interest to consider infinite state systems


and timed/hybrid behaviors. Another interesting line of research is the develop-
ment of optimized ad-hoc techniques for reasoning on the fragment of temporal
epistemic logic that we are using, both for verification and validation, and eval-
uating and improving the scalability of the synthesis algorithms. Finally, we will
work on integrating the FDI component with the recovery procedures.

References
1. European Space Agency: ITT AO/1-6570/10/NL/LvH “Dependability Design Ap-
proach for Critical Flight Software”. Technical report (2010)
2. Halpern, J.Y., Vardi, M.Y.: The complexity of reasoning about knowledge and time.
Lower bounds. Journal of Computer and System Sciences 38(1), 195–237 (1989)
3. Grastien, A., Anbulagan, A., Rintanen, J., Kelareva, E.: Diagnosis of discrete-event
systems using satisfiability algorithms. In: AAAI, vol. 1, pp. 305–310 (2007)
4. Rintanen, J., Grastien, A.: Diagnosability testing with satisfiability algorithms. In:
Veloso, M.M. (ed.) IJCAI, pp. 532–537 (2007)
5. Sampath, M., Sengupta, R., Lafortune, S., Sinnamohideen, K., Teneketzis, D.: Failure Diag-
nosis Using Discrete-Event Models. IEEE Transactions on Control Systems Technology 4(2),
105–124 (1996)
6. Lichtenstein, O., Pnueli, A., Zuck, L.: The glory of the past. In: Parikh, R. (ed.)
Logics of Programs, vol. 193, pp. 196–218. Springer, Heidelberg (1985)
7. Bozzano, M., Cimatti, A., Gario, M., Tonetta, S.: Formal Specification and Syn-
thesis of FDI through an Example. In: Workshop on Principles of Diagnosis, DX
2013 (2013), https://es.fbk.eu/people/gario/dx2013.pdf
8. Cimatti, A., Roveri, M., Susi, A., Tonetta, S.: Validation of requirements for hy-
brid systems: A formal approach. ACM Transactions on Software Engineering and
Methodology 21(4), 22 (2012)
9. Cimatti, A., Pecheur, C., Cavada, R.: Formal Verification of Diagnosability via
Symbolic Model Checking. In: IJCAI, pp. 363–369 (2003)
10. Gammie, P., van der Meyden, R.: MCK: Model checking the logic of knowledge.
In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 479–483. Springer,
Heidelberg (2004)
11. Cimatti, A., Clarke, E., Giunchiglia, E., Giunchiglia, F., Pistore, M., Roveri, M.,
Sebastiani, R., Tacchella, A.: NuSMV 2: An OpenSource Tool for Symbolic Model
Checking. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp.
359–364. Springer, Heidelberg (2002)
12. Adler, B.T., de Alfaro, L., da Silva, L.D., Faella, M., Legay, A., Raman, V., Roy,
P.: Ticc: A Tool for Interface Compatibility and Composition. In: Ball, T., Jones,
R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 59–62. Springer, Heidelberg (2006)
13. Schumann, A.: Diagnosis of discrete-event systems using binary decision diagrams.
In: Workshop on Principles of Diagnosis (DX 2004), pp. 197–202 (2004)
14. Jiang, S., Kumar, R.: Failure diagnosis of discrete event systems with linear-time
temporal logic fault specifications. IEEE Transactions on Automatic Control, pp.
128–133 (2001)
15. Ezekiel, J., Lomuscio, A., Molnar, L., Veres, S.: Verifying Fault Tolerance and Self-
Diagnosability of an Autonomous Underwater Vehicle. In: IJCAI, pp. 1659–1664
(2011)
16. Huang, X.: Diagnosability in concurrent probabilistic systems. In: Proceedings of
the 2013 International Conference on Autonomous Agents and Multi-agent Systems
(2013)
Monitoring Modulo Theories

Normann Decker, Martin Leucker, and Daniel Thoma

Institute for Software Engineering and Programming Languages
Universität zu Lübeck, Germany
{decker,leucker,thoma}@isp.uni-luebeck.de

Abstract. This paper considers a generic approach to enhance tradi-
tional runtime verification techniques towards first-order theories in or-
der to reason about data. This allows especially for the verification of
multi-threaded, object-oriented systems. It presents a general framework
lifting the monitor synthesis for propositional temporal logics to a tempo-
ral logic over structures within some first-order theory. To evaluate such
temporal properties, SMT solving and classical monitoring of proposi-
tional temporal properties are combined. The monitoring procedure was
implemented for linear-time temporal logic (LTL) based on the Z3 SMT
solver and evaluated regarding runtime performance.

1 Introduction
In this paper we consider runtime verification of multi-threaded, object-oriented
systems, representing a major class of today’s practical software. As opposed
to other verification techniques such as model checking or theorem proving,
runtime verification (RV) does not aim at the analysis of the whole system but
at evaluating a correctness property on a particular run, based on log-files or
on-the-fly. To this end, typically a monitor is synthesized from some high-level
specification that is monitoring the run at hand.
In recent years, a variety of synthesis algorithms has been developed, dif-
fering in the underlying expressiveness of the specification formalism and the
resulting monitoring approach. Typically, a variant of linear-time temporal logic
(LTL) is employed as specification language and monitoring is automata-based
or rewriting-based.
Within the setting of multiple, in general arbitrarily many instances of pro-
gram parts, for example in terms of threads or objects, a software engineer is
naturally interested in verifying that the interaction of individual instances fol-
lows general rules. The ability of taking the dynamics of data structures and
values into account is a desirable feature for specification and verification ap-
proaches. As such, the expressiveness of plain propositional temporal logics such
as LTL does not suffice, as they do not allow for specifying complex properties
on data.
In this paper, we enhance traditional runtime verification techniques for propo-
sitional temporal logics by first-order theories for reasoning about data, based
on SMT solvers. As a result, we obtain a powerful tool for verifying complex prop-
erties at runtime which exceeds the expressiveness of previous approaches. The
implementation in our tool jUnitRV [1] also shows that the framework is suitable
for practical applications.

Today’s SMT solvers are highly optimized tools that can check the satisfiabil-
ity of formulae over a variety of first-order theories such as arithmetics, arrays,
lists and uninterpreted functions. They allow for reasoning on a large class of data
structures used in modern software systems. We hence aim at integrating their
capabilities with the efficient monitoring approaches for temporal properties. We
formulate example properties showing the specific strength of our framework in
terms of expressiveness. Our benchmarks for monitoring Java programs show
that such specifications can be monitored efficiently.

Combining Monitoring and SMT. In the following we outline the idea of
our approach by means of a running example. Consider a mutual exclusion
property where a resource must not be accessed while it is locked, stated in LTL
as G(lock → ¬access U unlock). If there are several resource objects available at
runtime, this is too restrictive and one might specifically limit foreign access to
locked resources. Using variables r and p, p′, intended to represent resources and
processes, respectively, and suitable predicates, the property can be stated as
G(lock(p, r) → (∀p′ ≠ p : ¬access(p′, r)) U unlock(p, r)).          (1)
The free variables r and p are then implicitly universally quantified. Formally,
all variables range over some universe that is fixed by the application. In our
example, this could be the set of object identifiers in a certain Java program.

Data theories. To define a formal semantics for expressions such as those above, we
note that we essentially use an LTL formula and exchange propositions for first-
order formulae. In LTL, propositions are evaluated at a position in some word,
i.e. a letter. To evaluate first-order formulae, such a letter must now be a first-
order structure describing a system state. In the example we fixed the universe to
object IDs and used a binary predicate “=” indicating equality. This is covered
by the first-order theory of natural numbers with equality and can be handled
by essentially all SMT solvers. It is possible to use more powerful theories, e.g.,
with linear order or arithmetics. Section 3.4 provides more examples.
What remains are the predicates that are not part of the theory. These specif-
ically characterize the current system state or, more accurately, are interpreted
by the current system state in terms of the current observation. We call them
observation predicates; in the example we use the binary predicates lock, unlock and
access. Inspecting the program states, we obtain, at any time, an observation g
that interprets all observation predicates in terms of relations on the universe.
The first-order formulae reason about the data structures under a specific
observation. We therefore refer to this logic as data logic. Data logic formulae may
have free variables, such as p and r in the example. Summing up, we can define
the semantics of a data logic formula in terms of (1) an observation interpreting
the observation predicates (and possibly observation functions), (2) a theory that
fixes the interpretation of all other predicates and functions and (3) a valuation
that assigns a value from the universe to each free variable.

Temporal data logic. In the example, the data logic replaces the propositional
part that LTL is based on. We generically refer to the logic expressing the
temporal aspect as the temporal logic. Our assumptions on the temporal logic are
that it is linear (defined on words) and that it only uses atomic propositions
to “access” the word. For example, the semantics of some temporal operator
must not depend on the current letter directly but only on the semantics of
some proposition. We formally define that requirement in Section 3 but for now
only remark that typical temporal logics like LTL, the linear μ-calculus or the
temporal logic of calls and returns (CaRet) [2] fit into that schema.
Given a suitable temporal logic and data logic we can define the formalism
we aim at. Taking the temporal logic and replacing the atomic propositions by
data formulae, we obtain what we call a temporal data logic. The theory and
universe are fixed by the data logic and the semantics of temporal data logic
formulae can thus be defined over a sequence of observations. The free variables
are bound universally so the formula is evaluated over the observation sequence
for all possible valuations. The semantics of the formula is the conjunction (more
generally the infimum) of these results.

Monitor construction. Assuming a monitor construction for the (propositional)
temporal logic, we can evaluate a sequence of observations on-the-fly. The idea
is to construct a symbolic monitor that deals atomically with data formulae. In
the example formula (Equation 1) we treat lock(p, r), ∀p = p : ¬access(p , r)
and unlock(p, r) as three atomic propositions, say χ1 ,χ2 and χ3 . We obtain a
temporal logic formula G(χ1 → χ2 U χ3 ) over a set of atomic propositions AP =
{χ1 , χ2 , χ3 }. A monitor can then be constructed that reads words over the finite
symbolic alphabet Σ := 2AP .
The free variables in the formula are p and r and range over the universe of
natural numbers N. Given a valuation θ : {p, r} → N for those, mapping, e.g., p
to θ(p) = 1 and r to θ(r) = 2, we can map an observation g to the letter a ∈ Σ
that contains all formulae that are satisfied by g. For example, say g interprets
the observation predicates as lockg = {(1, 2), (10, 7)} and accessg = unlockg = ∅
(because objects 1 and 10 happen to lock the resources 2 and 7, respectively,
and otherwise nothing happened in the current execution step of the program).
Then, under θ, g is mapped to a = {χ1 , χ2 } ∈ Σ since χ1 and χ2 hold but χ3
does not. The observation g might be mapped to some other symbolic letter for
another valuation θ′. If, for example, θ′(p) = 2, then χ1 does not hold and g is
projected to a′ = {χ2} ∈ Σ.
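To make this projection concrete, the following small Python sketch (our own illustration; the function and dictionary names are hypothetical and not part of jUnitRV) evaluates the three data formulae of the running example against the observation and valuations discussed above:

# Sketch of the projection pi_theta: an observation g, given as relations
# interpreting lock/access/unlock over object IDs, is mapped under a valuation
# theta of the free variables p and r to the symbolic letter
# { chi in AP | (g, theta) |= chi }.

def project(g, theta):
    p, r = theta["p"], theta["r"]
    letter = set()
    if (p, r) in g["lock"]:                                # chi1 = lock(p, r)
        letter.add("chi1")
    if all(q == p for (q, s) in g["access"] if s == r):    # chi2 = forall p' != p: not access(p', r)
        letter.add("chi2")
    if (p, r) in g["unlock"]:                              # chi3 = unlock(p, r)
        letter.add("chi3")
    return letter

g = {"lock": {(1, 2), (10, 7)}, "access": set(), "unlock": set()}
print(project(g, {"p": 1, "r": 2}))    # {'chi1', 'chi2'}  (set order may vary)
print(project(g, {"p": 2, "r": 2}))    # {'chi2'}

Each valuation thus yields one symbolic letter, which is exactly what the symbolic monitor instances described next consume.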
In Section 4 we present a monitoring algorithm that maintains a copy of
the symbolic monitor for each valuation. For a new observation, the algorithm
simulates the individual transition for each copy by projecting the observation
under the specific valuation. As the universe is in general infinite, the number
of monitor instances is infinite as well but the algorithm uses a data structure
to finitely represent the state of all monitor instances.

Related Work. In runtime verification, handling data values to reason about
the computation of a system more precisely has always been a concern. One
of the first works extending LTL by parameters is by Stolz and Bodden [3].
Binding of parameters of propositions takes place in a Prolog-style fashion
and the resulting approach is reasonable for the intended applications. However,
no precise denotational semantics is given.
The works on Eagle and RuleR [4,5] allow the formulation of first-order safety
properties. The corresponding systems come with a rewriting-based semantics and
are well-suited for specifying safety properties of especially finite, yet perhaps
expanding traces. In [6] a runtime verification approach for the temporal evalu-
ation of integer-modulo-constraints was presented. The underlying logic has a de-
cidable satisfiability problem and the overall approach is anticipatory. However,
only limited computations can be followed. To reason about the temporal evolu-
tion of data values along some computation, some form of bounded unrolling like
in bounded model checking [7] can be used. For runtime verification, however, such
an approach is not suitable, as the observed trace cannot be bounded.
Closely related to our work is that of Chen and Rosu [8]. It considers the
setting of sequences of actions which are parameterized by identifiers (ID). The
main idea is to divide the sequence of a program into sub-sequences, called slices,
containing only a single ID, and monitor each slice independently. Hence, in
contrast to our approach, no interdependencies between the different slices can be
checked. Moreover, our monitoring approach is not limited to plain IDs but allows
the user to reason more generally over data in terms of arbitrary (decidable) first-
order theories. The work considers a dedicated temporal logic (LTL) together
with the dedicated notion of parameters, whereas in our framework an arbitrary
linear temporal logic is extended by a first-order theory.
Recently, Bauer et al. presented an approach combining LTL with a variant
of first-order logic for runtime verification [9]. However, their approach restricts
quantification to finite sets always determined in advance by the system observa-
tion. This allows for finitely instantiating quantifiers during monitor execution,
but also profoundly limits the expressiveness of first-order logic. Basically, it is
only possible to evaluate first-order formulae over finite system observations, and
not to express properties in a declarative manner.

2 Preliminaries
First-Order Logic. A signature S = (P, F, ar) consists of finite sets P , F
of predicate and function symbols, respectively, each of some arity defined by
ar : P ∪ F → N. An extension of S is a signature T = (P′, F′, ar′) such that
P ⊆ P′, F ⊆ F′ and ar ⊆ ar′.
The syntax of first-order formulae over the signature S is defined in the usual
way using operators ∨ (or), ∧ (and), ¬ (negation), variables x0 , x1 , . . . , predicate
and function symbols p ∈ P , f ∈ F , quantifiers ∀ (universal), ∃ (existential).
Free variables are not in the scope of some quantifier and are assumed to come
from some set V. The set of all first-order formulae over a signature S is denoted
FO[S]. We consider constants as function symbols f with ar(f ) = 0. A sentence
is a formula without free variables.
An S-structure is a tuple s = (U, s) comprising a non-empty universe U and
a function s mapping each predicate symbol p ∈ P to a relation ps ⊆ U n of arity
n = ar(p) and each function symbol f ∈ F to a function fs : U m → U of arity
m = ar(f ). A T -structure t = (T , t) is an extension of s if T is an extension of
S, T = U and s(r) = t(r) for all symbols r ∈ P ∪ F .
A valuation is a mapping θ : V → U of free variables to values. The set of
all such mappings may be denoted U V . The semantics of first-order formulae is
defined as usual. We write (s, θ) |= χ if a formula χ is satisfied for some structure
s and valuation θ. For sentences, we refer to a sole satisfying structure as a model,
omitting a valuation. The theory T of an S-structure s is the set of all sentences
χ such that s is a model for χ.
Temporal Specifications. We use AP to denote a finite set of atomic propo-
sitions and Σ := 2AP for the finite alphabet over AP . For arbitrary, possibly
infinite alphabets we mostly use Γ . A word over some alphabet Γ is a sequence
of letters from Γ and Γ ∗ , Γ ω denote the sets of finite and infinite words over
Γ , respectively. The syntax of linear-time temporal logic (LTL) is defined in
the usual way over atomic propositions AP using negation, boolean connectives
and temporal operators X (next), U (until), G (globally) and F (eventually).
We refer to the standard LTL semantics over infinite words w ∈ Σ ω as LTLω,
given for an LTL formula ϕ by a mapping ϕω : Σ ω → B, where B = {⊤, ⊥}
denotes the two-valued boolean lattice. The finitary three-valued LTL semantics
LTL3 [10] is given for ϕ by a mapping ϕ3 : Σ ∗ → B3, where B3 = {⊤, ?, ⊥}
denotes the three-valued boolean lattice ordered ⊤ > ? > ⊥. It is defined
for w ∈ Σ ∗ as ϕ3 (w) := ⊤ if ∀u ∈ Σ ω : ϕω (wu) = ⊤, ϕ3 (w) := ⊥ if
∀u ∈ Σ ω : ϕω (wu) = ⊥, and ϕ3 (w) := ? otherwise.
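Since the data monitor introduced later combines the verdicts of all monitor instances by taking the infimum in this lattice, a minimal Python sketch of B3 and its meet may be helpful (our own illustration; the names TOP, UNKNOWN and BOTTOM are ours):

# Three-valued lattice B3 with TOP > UNKNOWN > BOTTOM; `meet` computes the
# infimum, which is later used to combine the verdicts of all monitor instances.
TOP, UNKNOWN, BOTTOM = "true", "?", "false"
_rank = {TOP: 2, UNKNOWN: 1, BOTTOM: 0}

def meet(values):
    return min(values, key=_rank.__getitem__, default=TOP)

assert meet([TOP, TOP]) == TOP
assert meet([TOP, UNKNOWN]) == UNKNOWN
assert meet([UNKNOWN, BOTTOM]) == BOTTOM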

Monitor. A monitor M = (Q, Γ, δ, q0 , λ, Λ) for a temporal property is a Moore
machine where Q is a possibly infinite set of states, Γ is a possibly infinite input
alphabet, δ : Q × Γ → Q is a deterministic transition function and λ : Q → Λ is
a labeling function mapping states to labels from the set Λ.
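A Moore-machine monitor in the sense of this definition can be sketched in a few lines of Python (our own rendering, not the interface of any existing tool):

# Minimal Moore machine: a deterministic transition function delta, an initial
# state q0 and an output labelling lam mapping states to the label set Lambda.
class MooreMonitor:
    def __init__(self, delta, q0, lam):
        self.delta, self.q0, self.lam = delta, q0, lam

    def run(self, word):
        q = self.q0
        for letter in word:           # read the finite input word letter by letter
            q = self.delta(q, letter)
        return self.lam(q)            # output assigned to the state reached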

3 Temporal Data Logic


The aim of the framework is to enable the user to specify and check complex
properties of execution traces. As described above we consider two aspects, time
and data. Note that we refer to discrete time, as opposed to continuous notions
like in timed automata. In this section we therefore formalize how and under
which assumptions two logics considering time (temporal logic) and data (data
logic) can be combined into a specification formalism (temporal data logic) that
can express the temporal behaviour of a system with respect to the data it processes.
The clear separation of the aspects will give rise to a monitoring procedure.

3.1 Temporal Logic


The notion of a temporal logic (TL) that we consider for our monitoring frame-
work is inspired by the intuition for LTL which is widely used for behavioural
specifications, in particular in runtime verification. However, our monitoring ap-
proach relies only on a few specific properties that other, possibly more expressive,
logics provide as well. In the following we identify the required features of a
suitable temporal logic for our framework.
We require the desired temporal behaviour to be specified in a finitary, linear
logic, that is, the semantics is defined on finite words over some alphabet Γ . The
truth values of the semantics need to come from a complete semi-lattice (S, ⊑)
since we will handle multiple monitor instances and combine individual verdicts.
Second, there must be a monitor construction for the logic in question since
our framework is intended to generically lift such a construction for handling
data. We assume that such a construction turns a TL formula ϕ into a Moore
machine Mϕ with output Mϕ (w) = ϕ(w) for w ∈ Γ ∗ . The restriction to Moore
machines is not essential; our constructions are applicable to similar models,
including Mealy machines and we do not rely on a finite state space.
As we aim at replacing atomic propositions, we require that the semantics
of the temporal logic can only distinguish letters by means of the semantics of
such propositions. This allows for lifting the semantics from a propositional to
a complex alphabet where letters have more internal structure.
Proposition semantics. We formalize the distinction of positional and temporal
aspects of a temporal logic formula using a proposition semantics ps : AP → 2Γ
mapping propositions p ∈ AP to the set of letters ps(p) ⊆ Γ that satisfy the
proposition. Given that the semantics of some propositional temporal logic can
be defined by referring to letters only through a proposition semantics, that semantics can be
substituted without influencing the temporal aspect.
We refer to the canonical semantics for Γ = Σ = 2AP as psAP : AP → 2Σ ,
with psAP (p) := {a ⊆ AP | p ∈ a}. It is the “sharpest” in the sense that it
distinguishes maximally many letters by means of combinations of propositions.
Symbolic abstraction. For an alphabet Γ , atomic propositions AP and a propo-
sition semantics ps : AP → 2Γ , let πps : Γ → Σ be a projection with πps (g) :=
{p ∈ AP | g ∈ ps(p)}, mapping a letter g ∈ Γ to the set of propositions that
hold for it. For convenience, we lift the projection to words g1 . . . gn (gi ∈ Γ )
by πps (g1 . . . gn ) := πps (g1 ) . . . πps (gn ). Using πps , we consider the letters from Σ
as symbolic abstractions of Γ wrt. AP and ps in the sense that πps maintains
all the structure of Γ that is relevant for evaluating (boolean combinations of)
propositions from AP .
As argued above, for the purpose of lifting a temporal logic over atomic propo-
sitions to propositions carrying data, i.e., structure, it is essential that the se-
mantics of propositions can be encapsulated and exchanged without influencing
the temporal aspect. We can formalize this requirement on a temporal logic TL
using the symbolic abstraction. We assume the semantics of a TL formula ϕ to
be a mapping that takes linear sequences from Γ ∗ and assigns a truth value from
the complete semi-lattice S. If the semantics satisfies our criterion we can make
the proposition semantics ps : AP → 2Γ an explicit parameter and assume the
semantics of a formula ϕ is given by a mapping ϕ(ps) : Γ ∗ → S, or, gener-
ally, ϕ : (AP → 2Γ ) → (Γ ∗ → S). Moreover, projecting the input word to a
symbolic word and evaluating ϕ(psAP ) on it must not change the result.
Definition 1 (Propositional semantics). Let AP be a set of atomic propo-
sitions and Γ an alphabet. A semantics ϕ : (AP → 2Γ ) → (Γ ∗ → S) is propo-
sitional iff for all proposition semantics ps : AP → 2Γ and all words γ ∈ Γ ∗
ϕ(ps)(γ) = ϕ(psAP )(πps (γ)).
Based on that notion of propositional semantics we can summarize the formal
criteria for a temporal logic to be suitable for our monitoring framework.
Definition 2 (Temporal logic). A temporal logic is a specification formalism
TL over a set of atomic propositions AP that enjoys the following properties.
1. The semantics of formulae ϕ is given for finite words over an input alphabet
Γ by a mapping ϕTL : (AP → 2Γ ) → (Γ ∗ → S) where (S, ⊑) is a complete
semi-lattice.
2. The semantics is propositional.
3. A monitor construction is available that turns a formula ϕ into a Moore
machine Mϕ with output Mϕ (w) = ϕTL (psAP )(w) for w ∈ Σ ∗ .
3.2 Data Logic


To reason about data values our framework can use a so-called data logic DL
based on any first-order theory for which satisfiability is decidable. We assume
the theory is represented by some structure which can be extended by additional
predicate and function symbols that will represent observations from the system
that shall be monitored.
Definition 3 (Data logic). Let T = (P, F, ar) be a signature, t = (D, a) some
T -structure and P′, F′ be additional predicate and function symbols with arity
defined by ar′ : P′ ∪ F′ → N, called observation symbols.
A data logic DL is a tuple (t, G, V, D) such that G = (P ∪ P′, F ∪ F′, ar ∪ ar′)
is an extension of T and V is a finite set of first-order variables.
A DL formula is a first-order formula over the signature G and possibly free
variables from V. A DL formula is called observation-independent, if it does not
contain observation symbols. An observation is a G-structure g = (D, g) that is
an extension of t. The set of all observations is denoted Γ .
The semantics of a DL formula is defined over tuples (g, θ) ∈ Γ × DV consist-
ing of an observation and a valuation θ : V → D of free variables in the usual
way.
For an instance of the monitoring framework the structure t representing the
theory is fixed. An observation-independent DL formula ϕ with free variables
x1 , . . . , xn ∈ V can be evaluated just wrt. t, without considering an observation.
A decision procedure for the theory of t can thus be applied directly. Further, ϕ
can be interpreted as a constraint on the domain of variable valuations DV by
considering the set Θϕ := {θ ∈ DV | (t, θ) |= ϕ}.
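Using the Python bindings of an SMT solver such as Z3, checking whether such a constraint set Θϕ is non-empty amounts to a single satisfiability query; the following is a hedged sketch (the variable names and the concrete formula are ours):

from z3 import Int, Solver, And, sat

p, r = Int('p'), Int('r')           # free variables V = {p, r}, kept non-negative below
phi = And(p >= 0, r >= 0, p != r)   # an observation-independent DL formula

s = Solver()
s.add(phi)
if s.check() == sat:                # Theta_phi is non-empty ...
    print(s.model())                # ... and the model is a satisfying valuation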

3.3 Temporal Data Logic


Given a temporal and a data logic as described above, we can now define their
combination, the temporal data logic TDL. In TDL formulae, data logic formulae are
enclosed in dedicated brackets to clarify which parts come from the data logic.
Definition 4 (Temporal data logic). Let TL be a temporal logic and DL =
(t, G, V, D) a data logic. Let AP be a finite set {χ1 , . . . , χn } where χ1 , . . . , χn
are DL formulae with free variables from V.
A TDL formula is a TL formula over AP. A structured word is a finite
sequence γ ∈ Γ ∗ of DL observations. For a valuation θ ∈ DV , let the proposition
semantics psθ : AP → 2Γ be defined by psθ (χ) := {g ∈ Γ | (g, θ) |= χ} for
χ ∈ AP. The semantics of a TDL formula ϕ is a mapping ϕTDL : Γ ∗ → S
defined for γ ∈ Γ ∗ by

ϕTDL (γ) := ⊓_{θ∈DV} ϕTL (psθ )(γ).

Recall, for psθ we obtain a projection πpsθ : Γ ∗ → Σ ∗ from structured to


symbolic words. In the following we abbreviate πpsθ by πθ . From Definition 2 of
the temporal logic it follows that we can evaluate the semantics of some TDL
formula symbolically, which is an essential step in lifting a monitor construction
for TL to the data setting.
Table 1. Example properties using LTL and CaRet with data

mutex      G(lock(f, t) → (∀t′ ≠ t : ¬access(f, t′)) U unlock(f, t))
access     (open(x) R ¬access(x)) ∧ G(close(x) → G ¬access(x))
iterator   G((iterator(i) ∨ next(i)) → X(hasNext(i, true) R ¬next(i)))
modified   G(iterator(c, i) → G(add(c) → (¬next(i) U finalize(i))))
server     G(request(t, x) → F ∃t′ : response(t′, x, t))
response   G(request(t) ∧ x = time → (time < x + 100 U response(t)))
counter    G(p(x) → X p(x + 1))
velocity   G(s = x ∧ t = y → X(s − x < vmax · (t − y)))
matching   G((call ∧ printOpen(x)) → Xa printClose(x))
bound      G(open(x) → X(¬ret → Ga (open(y) → x > y)))
depth      G(open(x) → X((¬ret ∧ Fa open(x − 1)) ∨ (ret ∧ x = 0)))

Proposition 1. Let ϕ be a TDL formula, DV the valuation space for free variables
in ϕ, χ1 , . . . , χn the data logic formulae used in ϕ and AP = {χ1 , . . . , χn }. For
γ ∈ Γ ∗ we have ϕTDL (γ) = ⊓_{θ∈DV} ϕTL (psAP )(πθ (γ)).

3.4 LTL and CaRet with Data


We now exemplify the instantiation of our framework by means of LTL. More
precisely, we show that the finitary, three-valued LTL3 semantics ϕ3 : Σ ∗ →
B3 can be formulated to comply with Definition 2. It is defined over Σ = 2AP based
on the infinitary LTLω semantics. The inductive definition of LTLω only refers
to letters for atomic propositions. This can be easily reformulated in terms of
an arbitrary proposition semantics ps : AP → 2Γ over an arbitrary alphabet Γ .
Instead of defining pω (w) = ⊤ iff p ∈ w0 , we let pω (ps)(γ) := ⊤ if γ0 ∈ ps(p)
and pω (ps)(γ) := ⊥ otherwise, for γ ∈ Γ ω . The rest of the definition remains
untouched. The definition of the three-valued semantics ϕ3 does not at all refer
to letters directly but only to LTLω . With these simple modifications LTL3 fits
to the notion of temporal logic in the sense of Definition 2. The corresponding
monitor construction proposed in [10,11] can be applied.
Proposition 2. The MMT framework can be instantiated for LTL3 .
The mutual exclusion property presented earlier is one example for a specifi-
cation based on LTL and the theory of IDs. Other common examples of temporal
properties are the correct use of iterators or global request/response properties.
In the propositional versions of such properties the objects in question, itera-
tors, resources or requests, are assumed to be unique. Adding data in terms of
IDs, for example, allows for a much more realistic formulation. Table 1 lists for-
mulations of these properties and also others that cannot be expressed without
distinguishing at least identities. The property modified requires that an itera-
tor must not be used after the collection it corresponds to has been changed.
Further, counting (counter ) or arithmetic constraints (response, velocity), also
on real numbers, are valuable features for a realistic specification.

RLTL and CaRet: Regular and nesting properties. Regular LTL [12] is an ex-
tension of LTL based on regular expressions. CaRet [2] is a temporal logic with
calls and returns expressing non-regular properties. In addition to the LTL oper-
ators, CaRet allows for abstract temporal operators such as Xa and Ga , moving
forward by jumping on a word from a calling position to matching return posi-
tion, reflecting the intuition of procedure calls. For RLTL and CaRet monitor
constructions have been proposed [6,13]. Although both are more complex, the
same arguments as for LTL apply. Example properties are listed in Table 1 and
express matching call- and return values and nesting-depth bounds.

4 Monitoring
In this section we present our monitoring procedure for TDL formulae. It relies
on the observation made in Proposition 1, namely that the TDL semantics for
an input word γ ∈ Γ ∗ is characterized by the TL semantics for projections of γ.
Any TDL formula can be interpreted as a TL formula when considering all
occurring data logic formulae as individual symbols. With this interpretation we
can employ the monitor construction for TL to obtain a monitor over a finite
alphabet constructed from these symbols.
Definition 5 (Symbolic monitor). Let ϕ be a TDL formula and χ1 , . . . , χn
the data logic formulae used in ϕ and AP = {χ1 , . . . , χn }. The symbolic
alphabet for ϕ is the finite set Σ := 2AP . The symbolic monitor for ϕ is the
monitor MΣ constructed for ϕ interpreted as a TL formula over AP.
The symbolic monitor MΣ for a TDL formula ϕ computes the semantics
ϕ(psAP ) : Σ ∗ → S. Following Proposition 1, what remains is to maintain a
monitor for each valuation θ ∈ DV and to individually compute the correspond-
ing projection πθ on the input.
Within this section we present an algorithm for efficiently maintaining these,
in general infinitely many, monitor instances. It uses a data structure, called
constraint tree, that represents finitely many equivalence classes of symbolic
monitors. The constraint tree also allows for easy computation of the infimum
of the outputs of all monitor instances, which is the semantics of the property
on the input trace read so far.

4.1 Representing and Evaluating Observations


While observations are formally defined as first-order structures, we want to
use them algorithmically and must therefore choose a representation. An ac-
tual implementation of an SMT solver already fixes how to represent all objects
essential for handling a certain theory, such as first-order formulae, predicates
and function symbols. We have defined observations to be extensions of a struc-
ture representing the theory and want to handle them practically using an SMT
solver. Consequently, we assume them to be extensions of the structure that the
tool uses to represent and handle a theory. For the purpose of the implementa-
tion, it is a reasonable assumption that the semantics of observation symbols be
expressible or, more precisely, expressed within the considered data theory.
Formally, for DL = (t, G, V, D) where t is a T -structure, we assume that any
observation g ∈ Γ induces a mapping ĝ : F O[G] → F O[T ] s.t. for all DL formulae
χ and all valuations θ ∈ DV we have (g, θ) |= χ iff (t, θ) |= ĝ(χ). Note that
this can be realized by substituting observation predicates by some observation-
independent formula that characterizes their semantics wrt. g. Function symbols
f can be replaced using existential substitution, replacing expressions of the
form e(f (e′)) by ∃z : e(z) ∧ ξf (e′, z), where an observation-free DL formula ξf
characterizes the semantics of f wrt. g.
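As a hedged illustration of this existential substitution (the observation function temp and the concrete values are our own example), a formula temp() > 40 under an observation interpreting temp as the value 42 could be encoded with z3py as:

from z3 import Int, Exists, And, Solver, sat

z = Int('z')
# e(f(e')) becomes Exists z: e(z) /\ xi_f(e', z); here xi_temp(z) is "z = 42",
# i.e. the observation interprets temp() as the value 42.
rewritten = Exists([z], And(z > 40, z == 42))

s = Solver()
s.add(rewritten)
print(s.check() == sat)    # True: temp() > 40 holds under this observation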
As noted earlier, we can also employ observation-free formulae ρ to describe
sets of valuations Θρ ⊆ DV . While this does not allow for representing any arbi-
trary set of valuations, the expressiveness of the data theory suffices to express
any relevant set. If ρ represents an equivalence class wrt. some formula ĝ(χ),
meaning (t, θ) |= ĝ(χ) holds for all θ ∈ Θρ or none, we have that (t, θ) |= ĝ(χ)
iff there is any θ′ ∈ DV such that (t, θ′ ) |= ĝ(χ) ∧ ρ.
Let χ be a DL formula and g ∈ Γ an observation. Let ρ be an observation-free
DL formula such that for all θ1 , θ2 ∈ Θρ we have (t, θ1 ) |= ĝ(χ) iff (t, θ2 ) |= ĝ(χ).
Then, for all θ ∈ Θρ , (g, θ) |= χ iff ĝ(χ) ∧ ρ is satisfiable. Note that ĝ(χ) ∧ ρ is
an observation-free DL formula and that checking it for satisfiability is exactly
what we assume an SMT solver to be able to do.
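The corresponding solver call can be sketched with z3py as follows (our own illustration; ĝ(χ) is written out by hand for the running example with lock interpreted as {(1, 2), (10, 7)} and χ = lock(p, r)):

from z3 import Int, Solver, And, Or, sat

p, r = Int('p'), Int('r')

# g_hat(chi) for chi = lock(p, r): the observation predicate is replaced by an
# observation-independent formula characterising the observed relation.
g_chi = Or(And(p == 1, r == 2), And(p == 10, r == 7))

def holds_for_class(rho, g_chi):
    # Assuming rho is an equivalence class wrt. g_hat(chi), chi holds for all
    # valuations of the class iff rho /\ g_hat(chi) is satisfiable.
    s = Solver()
    s.add(And(rho, g_chi))
    return s.check() == sat

rho1 = And(p == 1, r == 2)   # every valuation in this class locks the resource
rho2 = And(p == 2, r == 2)   # no valuation in this class does

print(holds_for_class(rho1, g_chi))   # True
print(holds_for_class(rho2, g_chi))   # False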

4.2 Constraint Trees


We next introduce constraint trees, a data structure storing the configurations
of a set of instances of some symbolic monitor. It maintains sets of valuations
Θ ⊆ DV represented by constraints and stores for each such set a monitor state.
The desired property regarding the use in our monitoring algorithm is that the
sets of constraints induce a partition of the valuation space.
Definition 6 (Constraint tree). Let MΣ be a symbolic monitor with states Q
and DL a data logic. A constraint tree is a tuple T = (I, L, S1 , S2 , C, λI , λL ) such
that (I ∪ L, S1 , S2 ) is a finite, non-empty binary tree with internal nodes I, leaf
nodes L and successor relations S1 , S2 ⊆ I × (I ∪ L), C is a set of observation-
independent DL formulae called constraints, λI : I → C labels internal nodes
with constraints and λL : L → Q labels leaf nodes with monitor states.
Let the DL formula ρ(v0 . . . vi ) be the conjunction over all constraints along
the path v0 . . . vi−1 , where all S2 -successors are negated and ρ(v0 ) = true. A path
constraint in T is a DL formula ρ(v0 . . . vn ) such that v0 . . . vn is a maximal path
in T . A constraint tree T is consistent if the set of all path constraints in T
induces a partition of DV . The set of all constraint trees is denoted T .
In a constraint tree T , each inner node represents a constraint that is used
to separate the valuation space DV . S1 -branches represent the parts where the
particular constraint holds while in the S2 -branches it does not.
Constraint trees T = (I, L, S1 , S2 , C, λI , λL ) will be used to represent mappings
t : DV → Q assigning a monitor state q ∈ Q to each valuation θ ∈ DV . If T is
consistent, every valuation θ satisfies exactly one path constraint ρ in T which in
turn corresponds to a unique path ending in some leaf node v ∈ L. The mapping
is thereby defined as t(θ) = λL (v). Note that t would not necessarily be well-
defined for constraint trees that are not consistent. Where convenient, we may
identify a path constraint ρ with the set Θρ of valuations satisfying it and write,
e.g., θ ∈ ρ if some valuation θ ∈ Θρ satisfies ρ.
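A direct, hedged Python rendering of this data structure (our own sketch; constraints may for instance be z3 expressions, and leaves carry states of the symbolic monitor) could look as follows:

from dataclasses import dataclass
from typing import Any

@dataclass
class LeafCTree:
    state: Any            # a state of the symbolic monitor M_Sigma

@dataclass
class InnerCTree:
    constraint: Any       # an observation-independent DL formula
    s1: Any               # subtree for valuations where the constraint holds
    s2: Any               # subtree for valuations where it does not hold

def leaves(tree, path=()):
    # Yield (path, monitor_state) pairs; the path records each constraint with
    # a sign, S2-branches being negated, i.e. it encodes the path constraint.
    if isinstance(tree, LeafCTree):
        yield path, tree.state
    else:
        yield from leaves(tree.s1, path + (("+", tree.constraint),))
        yield from leaves(tree.s2, path + (("-", tree.constraint),))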
4.3 Symbolic Monitor Execution


In the following we present an algorithm incrementally processing a sequence
of observations in order to compute the semantics of some TDL formula ϕ. It
maintains a consistent constraint tree as a finite representation of a mapping
of valuations to states of the symbolic monitor MΣ = (Q, Σ, δ, q0, λ, S) for ϕ.
The algorithm starts on the trivial constraint tree consisting only of one leaf
node labeled by the initial state q0 . This means that the monitor instances for
all valuations are in state q0 . Intuitively, for an input word γ ∈ Γ the algorithm
executes one monitor instance for each valuation θ ∈ DV on the respective
projection πθ (γ). For the empty word γ = ε, all projections are equal and all
instances are in the same state. When reading a new observation g ∈ Γ which
is, for all valuations, projected to the same symbolic letter a ∈ Σ, all monitor
instances read the same projection and their state changes equally to δ(q0 , a).
Otherwise, if g is mapped to different symbolic letters for different valuations,
the so far uniformly handled valuation space is split.
Consider two valuations θ, θ ∈ DV and an input symbol g ∈ Γ such that their
projections a = πθ (g) = b = πθ (g) are different. Then there is some proposition
χ ∈ AP that distinguishes a and b, e.g., let χ ∈ a and χ ∉ b. In general,
the behaviour of all monitor instances reading a letter including χ may diverge
from those reading a letter not including χ. Therefore, the algorithm records
this fact by splitting the valuation space in two parts, one for which χ holds
under observation g and another for which it does not. A new node is added to
the tree, labeled by the constraint ĝ(χ) precisely distinguishing the two parts.
A part may be split up further in the same way in case other propositions again
distinguish valuations from it. Additional nodes are created in the constraint
tree accordingly and so the path constraint ρ on the path to a leaf node v ∈ L
characterizes exactly the set of valuations Θρ for which the projection of the
observation g is equal and thus the state of all corresponding monitor instances.
This process is continued when reading further observations. For each part Θρ
represented in the constraint tree, a new observation h ∈ Γ is processed by
checking for each proposition χ ∈ AP if there are valuations in Θρ that observe
a projection including χ by checking satisfiability of ρ ∧ ĥ(χ) and if there are
others observing a projection not including χ by checking the satisfiability of
ρ ∧ ¬ĥ(χ). If one of the formulae is unsatisfiable, meaning that one of the hypothetical
new parts Θρ∧ĥ(χ) = Θρ ∩ Θĥ(χ) and Θρ∧¬ĥ(χ) = Θρ ∩ Θ¬ĥ(χ) is empty, the new
observation h is projected equally wrt. χ for all valuations in the part which
is thus not split by ĥ(χ). Only if both new parts are non-empty, the part is split
by adding a new node to the constraint tree labeled by ĥ(χ). Once all necessary
splits are performed for an observation, all propositions are evaluated yielding
the projections for each (possibly new) part. According to those, the leaf nodes
are updated using the transition function of the symbolic monitor.
The procedure described above is listed explicitly as Algorithm 1. There, for
the set of all constraint trees T , we use constructors InnerCTree : FO[S] ×
T × T → T and LeafCTree : Q → T for sub-trees and leafs, respectively, where
FO[S] is the set of observation-independent DL formulae. For T = LeafCTree(q)
we assume that T consists of a single node v ∈ L that is labeled by λL (v) = q
and for T = InnerCTree(ϕ, T1 , T2 ) we assume that T has at least three nodes
Algorithm 1. Split constraints and simulate monitor steps

function split =
  // recursively process subtrees, accumulate constraints
  case (P, ρ, a, InnerCTree(ϕ, t0, t1), g) then
    InnerCTree(ϕ, split(P, ρ ∧ ¬ϕ, a, t0, g), split(P, ρ ∧ ϕ, a, t1, g))

  // evaluate propositions, split partition if necessary
  case ({χ} ∪ P, ρ, a, LeafCTree(s), g) then
    t0 = if SAT(ρ ∧ ¬ĝ(χ)) then split(P, ρ ∧ ¬ĝ(χ), a, LeafCTree(s), g) else Empty
    t1 = if SAT(ρ ∧ ĝ(χ)) then split(P, ρ ∧ ĝ(χ), a ∪ {χ}, LeafCTree(s), g) else Empty
    if (t0 = Empty) then t1
    else if (t1 = Empty) then t0
    else InnerCTree(ĝ(χ), t0, t1)

  // store new state
  case (∅, ρ, a, LeafCTree(s), g) then
    LeafCTree(δ(s, a))

function step(t: CTree, g ∈ Γ): CTree =
  split(AP, true, ∅, t, g)

v, v1 , v2 such that v is the root of T labeled by λI (v) = ϕ, v1 , v2 are the roots of
T1 and T2 , respectively, and (v, v1 ) ∈ S1 and (v, v2 ) ∈ S2 .
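Under the same assumptions, the split and step functions of Algorithm 1 can be transcribed into Python roughly as follows (a sketch only, not the authors' implementation; it reuses the LeafCTree/InnerCTree classes sketched in Sec. 4.2, follows the convention of Definition 6 that the first child is the branch where the constraint holds, and assumes that g_hat(χ) returns a z3 formula and that delta is the transition function of the symbolic monitor):

from z3 import And, Not, Solver, sat, BoolVal

def SAT(formula):
    s = Solver()
    s.add(formula)
    return s.check() == sat

def split(props, rho, letter, tree, g_hat, delta):
    # Recursively walk the tree, accumulating the path constraint rho.
    if isinstance(tree, InnerCTree):
        return InnerCTree(tree.constraint,
                          split(props, And(rho, tree.constraint), letter, tree.s1, g_hat, delta),
                          split(props, And(rho, Not(tree.constraint)), letter, tree.s2, g_hat, delta))
    if not props:
        # All propositions evaluated for this class: simulate one monitor step.
        return LeafCTree(delta(tree.state, frozenset(letter)))
    chi, rest = props[0], props[1:]
    c = g_hat(chi)            # observation-independent formula for chi under g
    pos = split(rest, And(rho, c), letter | {chi}, tree, g_hat, delta) if SAT(And(rho, c)) else None
    neg = split(rest, And(rho, Not(c)), letter, tree, g_hat, delta) if SAT(And(rho, Not(c))) else None
    if pos is None:
        return neg
    if neg is None:
        return pos
    return InnerCTree(c, pos, neg)   # split the equivalence class on g_hat(chi)

def step(tree, g_hat, AP, delta):
    return split(list(AP), BoolVal(True), frozenset(), tree, g_hat, delta)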
Based on constraint trees as data structure and the algorithm for modify-
ing constraint trees regarding a new observation we can now define the data
monitor for a TDL formula, where, as before, the data logic DL is defined over
observations Γ and the temporal logic TL uses truth values S.
Definition 7 (Data monitor). Let ϕ be a TDL formula, Σ = 2AP the symbolic
alphabet and MΣ = (Q, Σ, δ, q0 , λQ , S) the symbolic monitor for ϕ.
The data monitor for ϕ is a Moore machine MΓ = (T , Γ, step, T0 , λT , S)
using constraint trees T as states.
The transition function step : T × Γ → T is given by Algorithm 1 and the
initial tree T0 consists of a single leaf node labeled with the initial state q0 of
MΣ . For a constraint tree T ∈ T where the leaf nodesL are labeled by λL , the
monitor output is defined by λT : T → S with λ(T ) = v∈L λΣ (λL (v)).

4.4 Correctness
Proposition 3 (Termination). On a constraint tree T , the function step in
Algorithm 1 terminates and has a running time in O(|T | · |Σ|) where |T | is the
number of nodes in T and |Σ| = 2|AP| is the number of abstract symbols.
The monitoring procedure presented above is correct in that the data monitor
MΓ for a TDL formula ϕ computes the correct semantics for all input words.
Theorem 1 (Correctness). Let ϕ be a TDL formula and MΓ the data mon-
itor for ϕ. Then, for all γ ∈ Γ ∗ , MΓ (γ) = ϕTDL (γ).
In order to prove correctness, we first settle some observations. Recall that the
semantics ϕTDL can be represented as the conjunction ⊓_{θ∈DV} ϕTL (psAP )(πpsθ (γ))
over projections πpsθ (γ) (Proposition 1). We fix the data logic DL for
this section and write πθ for πpsθ in the following.
Although the conjunction above is infinite, given a finite word γ ∈ Γ ∗ , the
valuation space DV can be partitioned into finitely many equivalence classes
Θ1 , . . . , Θn such that the projection of γ is unique for each class Θi , i.e., ∀θ, θ′ ∈ Θi :
πθ (γ) = πθ′ (γ). It therefore suffices to maintain this set of equivalence classes,
which can in turn be finitely represented by constraints ρi . Let wi = πθ (γ) ∈ Σ ∗
for θ ∈ Θi be the projection of γ for the class Θi (i ∈ {1, . . . , n}). The semantics
can then be computed as the finite conjunction ϕTDL (γ) = ⊓_{i=1,...,n} ϕTL (psAP )(wi ).
It remains to reason that this partition exists which we do by showing that it
is in fact computed by the monitoring algorithm. More precisely, we show that
the constraint tree T that is the configuration of the monitor MΓ after reading
a word γ is consistent. That is, the path constraints ρ represented by T cover
the whole valuation space and are disjoint. Moreover, for all valuations θ ∈ ρ of
such an equivalence class ρ, the symbolic monitor MΣ behaves the same on all
corresponding projections πθ (γ).

Lemma 1. Let MΓ = (T , Γ, δΓ , T0 , λΓ ) and MΣ = (Q, Σ, δΣ , q0 , λΣ ) be the
data monitor and the symbolic monitor, respectively, for some TDL formula ϕ.
For γ ∈ Γ ∗ , let T = δΓ (T0 , γ) and RT the set of path constraints in T . If T is
consistent, let T (ρ) denote, for ρ ∈ RT , the unique label of the leaf in T corresponding
to ρ. Then, (i) {Θρ | ρ ∈ RT } is a partition of DV and (ii) ∀ρ ∈ RT ∀θ ∈ Θρ : T (ρ) =
δΣ (q0 , πθ (γ)).

We can now prove that the data monitor computes the correct semantics.

Proof (Theorem 1). Let T = δΓ (T0 , γ). We have, using Lemma 1 (i) and (ii),

MΓ (γ) = λΓ (T ) = ⊓_{v∈L} λΣ (λL (v)) =(i) ⊓_{ρ∈RT} λΣ (T (ρ)) =(ii) ⊓_{ρ∈RT} ⊓_{θ∈ρ} λΣ (δΣ (q0 , πθ (γ)))
=(i) ⊓_{θ∈DV} λΣ (δΣ (q0 , πθ (γ))) = ⊓_{θ∈DV} ϕTL (psAP )(πθ (γ)) = ϕTDL (γ).


4.5 Remarks and Optimizations

Impartiality and anticipation. An impartial semantics distinguishes between pre-
liminary and final verdicts. A final verdict for some word indicates that it will
not change for any continuation. Impartiality is desirable as monitoring can be
stopped as soon as a final verdict is encountered (cf. [14,6]). In the context
of our framework this gains even more importance. When the underlying mon-
itor is impartial, a branch already yielding a final verdict can be pruned. This
immensely improves runtime performance. If the symbolic monitor is impartial,
the data monitor (partially) inherits this property in the typical cases. Another
desired property is anticipation, i.e., evaluating to a final verdict as early as
possible. While in general not transferred from the symbolic to the data monitor,
this may still lead to better performance.
Dedicated theories as first-class citizens. The monitoring framework is also flex-
ible in the sense that one can trade efficiency for generality. When the properties
to be monitored are simple enough, it is reasonable to extend the algorithm
to directly evaluate constraints. As we show in the experiments this works well,
in particular for properties concerning only object IDs.

5 Experimental Results

We implemented our framework based on jUnitRV [1], a tool for monitoring
temporal properties for applications running on the Java Virtual Machine. The
previous version of jUnitRV supported classical LTL specifications referring to,
e.g., the invocation of a method of some class. With the approach proposed here
it is now possible, for example, to specify properties that relate to individual
objects and their evolution in time. The implementation is based on a generic
interface to an SMT solver. We present benchmarks using the SMT solver Z3
[15]. For comparison, we additionally implemented a dedicated solver for the
theory of IDs (i.e., conjunctions of equality constraints on natural numbers). For
the benchmarks, we have chosen representative properties from Table 1. The
property mutex is a typical example for interaction patterns in object-oriented
systems. It was evaluated on a program with resource objects and user objects
randomly accessing them. The iterator example was evaluated on a simple pro-
gram using randomly one of two iterator objects for traversing a list. Third, we
evaluated a typical client-server response pattern (server ) on a program simulat-
ing a number of server threads that receive requests and responses. For handling
existential quantification, we rely on Z3. For comparison, we also evaluate the
property G(request(t, x) → F response(x, t)) (server2 ) as a variation that can
be handled by our simple solver. The counter property covers the counting of
natural numbers which is a very elementary aspect in computer programs and
uses an unbounded number of different data values. A property involving a rather
complex theory is velocity. The free variables refer to real numbers as data values
and the constraints that have to be checked are multi-dimensional.
In our experiments we measured the execution time of a program with an in-
tegrated monitor over the number of monitoring steps. The measurements were
taken up to 10⁴ steps. Very simple programs were used, since the measured run-
time is thereby essentially the runtime of the monitoring algorithm. The linear
graphs obtained for every example show that the execution time for a monitoring
step is constant. The most complex properties, velocity and server induce the
most overhead due to a higher computational cost by the SMT solver. However,
even the performance for velocity of 4.2 ms/step is acceptable for many applica-
tions. Thus, employing an SMT solver is viable whenever performance is not a
main concern, for instance in case a monitoring step is not expected to happen
frequently wrt. to the overall computation steps. Our dedicated implementation
is much faster (by factor 100) and hence can only be distinguished in the right-
hand diagram. These results demonstrate, that performance can be improved
for specific settings and the approach can still be employed when performance
is more critical. As mentioned before, the number of calls to the SMT solver is
linear in the size of the constraint tree. Hence, the overhead may increase up
to linearly in the number of runtime objects that need to be tracked. In our
[Two plots of execution time in ms (axes scaled by 10²) over the number of monitoring steps (up to 10⁴); the right-hand plot resolves the configurations using the dedicated equality solver. Legend: Counter EQ, Mutex EQ, Server2 EQ, Iterator EQ, Counter Z3, Mutex Z3, Server2 Z3, Iterator Z3, Velocity Z3, Server Z3.]

Fig. 1. Experimental results

example the maximal size of the constraint tree was six. All experiments were
carried out on an Intel i5 (750) CPU.

6 Conclusion
With the combination of propositional temporal logics and first-order theories,
the framework we propose in this paper allows for a precise, yet high-level and
universal formulation of behavioural properties. This helps the user to avoid
modeling errors by formulating specifications describing a system on a higher
level of abstraction than required for an actual implementation.
The clear separation of the aspects of time and data allows for efficient run-
time verification as the different aspects are handled separately in terms of a
symbolic monitor construction and solving satisfiability for first-order theories.
The independent application of techniques from monitoring and SMT solving
benefits from improvements in both fields.
Our implementation and the experimental evaluation show that the approach
is applicable in the setting of object-oriented systems and that the runtime
overhead is reasonably small. Note that this holds even though the properties expressible
in our framework are hard to analyze. The satisfiability problem, for example,
is already undecidable for the combination of LTL and the very basic theory of
identities.

References
1. Decker, N., Leucker, M., Thoma, D.: jUnitRV –Adding Runtime Verification to jU-
nit. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871, pp.
459–464. Springer, Heidelberg (2013)
2. Alur, R., Etessami, K., Madhusudan, P.: A temporal logic of nested calls and
returns. In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp.
467–481. Springer, Heidelberg (2004)
3. Stolz, V., Bodden, E.: Temporal assertions using AspectJ. Electr. Notes Theor.
Comput. Sci. (2006)
4. Goldberg, A., Havelund, K.: Automated runtime verification with Eagle. In:
MSVVEIS. INSTICC Press (2005)
5. Barringer, H., Rydeheard, D.E., Havelund, K.: Rule systems for run-time monitor-
ing: From Eagle to RuleR. In: Sokolsky, O., Taşıran, S. (eds.) RV 2007. LNCS,
vol. 4839, pp. 111–125. Springer, Heidelberg (2007)
6. Dong, W., Leucker, M., Schallhart, C.: Impartial anticipation in runtime-
verification. In: Cha, S(S.), Choi, J.-Y., Kim, M., Lee, I., Viswanathan, M. (eds.)
ATVA 2008. LNCS, vol. 5311, pp. 386–396. Springer, Heidelberg (2008)
7. Biere, A., Clarke, E., Raimi, R., Zhu, Y.: Verifying safety properties of a
PowerPC™ microprocessor using symbolic model checking without BDDs. In:
Halbwachs, N., Peled, D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 60–71. Springer,
Heidelberg (1999)
8. Chen, F., Roşu, G.: Parametric trace slicing and monitoring. In: Kowalewski, S.,
Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 246–261. Springer, Hei-
delberg (2009)
9. Bauer, A., Küster, J.-C., Vegliach, G.: From propositional to first-order monitoring.
In: Legay, A., Bensalem, S. (eds.) RV 2013. LNCS, vol. 8174, pp. 59–75. Springer,
Heidelberg (2013)
10. Bauer, A., Leucker, M., Schallhart, C.: Monitoring of real-time properties. In: Arun-
Kumar, S., Garg, N. (eds.) FSTTCS 2006. LNCS, vol. 4337, pp. 260–272. Springer,
Heidelberg (2006)
11. Bauer, A., Leucker, M., Schallhart, C.: Runtime verification for LTL and TLTL.
ACM Trans. Softw. Eng. Methodol. (2011)
12. Leucker, M., Sánchez, C.: Regular linear temporal logic. In: Jones, C.B., Liu, Z.,
Woodcock, J. (eds.) ICTAC 2007. LNCS, vol. 4711, pp. 291–305. Springer, Heidel-
berg (2007)
13. Decker, N., Leucker, M., Thoma, D.: Impartiality and anticipation for monitoring of
visibly context-free properties. In: Legay, A., Bensalem, S. (eds.) RV 2013. LNCS,
vol. 8174, pp. 183–200. Springer, Heidelberg (2013)
14. Bauer, A., Leucker, M., Schallhart, C.: The good, the bad, and the ugly, but how
ugly is ugly? In: Sokolsky, O., Taşıran, S. (eds.) RV 2007. LNCS, vol. 4839, pp.
126–138. Springer, Heidelberg (2007)
15. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
Temporal-Logic Based Runtime Observer Pairs
for System Health Management
of Real-Time Systems⋆,⋆⋆

Thomas Reinbacher1, Kristin Yvonne Rozier2 , and Johann Schumann3


1 Vienna University of Technology, Austria
[email protected]
2 NASA Ames Research Center, Moffett Field, CA, USA
[email protected]
3 SGT, Inc., NASA Ames, Moffett Field, CA, USA
[email protected]

Abstract. We propose a real-time, Realizable, Responsive, Unobtrusive Unit (rt-
R2U2) to meet the emerging needs for System Health Management (SHM) of
new safety-critical embedded systems like automated vehicles, Unmanned Aerial
Systems (UAS), or small satellites. SHM for these systems must be able to han-
dle unexpected situations and adapt specifications quickly during flight testing
between closely-timed consecutive missions, not mid-mission, necessitating fast
reconfiguration. They must enable more advanced probabilistic reasoning for di-
agnostics and prognostics while running aboard limited hardware without affect-
ing the certified on-board software. We define and prove correct translations of
two real-time projections of Linear Temporal Logic to two types of efficient ob-
server algorithms to continuously assess the status of the system. A synchronous
observer yields an instant abstraction of the satisfaction check, whereas an asyn-
chronous observer concretizes this abstraction at a later, a priori known, time. By
feeding the system’s real-time status into a statistical reasoning unit, e.g., based
on Bayesian networks, we enable advanced health estimation and diagnosis. We
experimentally demonstrate our novel framework on real flight data from NASA’s
Swift UAS. By on-boarding rt-R2U2 aboard an existing FPGA already built into
the standard UAS design and seamlessly intercepting sensor values through read-
only observations of the system bus, we avoid system integration problems of
software instrumentation or added hardware. The flexibility of our approach with
regard to changes in the monitored specification is not due to the reconfigurability
offered by FPGAs; it is a benefit of the modularity of our observers and would
also be available on non-reconfigurable hardware platforms such as ASICs.

1 Introduction
Autonomous and automated systems, including Unmanned Aerial Systems (UAS),
rovers, and satellites, have a large number of components, e.g., sensors, actuators, and

A full version with appendices containing full proofs of correctness for all observer al-
gorithms is available at http://research.kristinrozier.com/TACAS14.html.
This work was supported in part by the Austrian Research Agency FFG, grant 825891, and
NASA grant NNX08AY50A.
⋆⋆ The rights of this work are transferred to the extent transferable according to title 17 U.S.C. 105.


software, that must function together reliably at mission time. System Health Manage-
ment (SHM) [17] can detect, isolate, and diagnose faults and possibly initiate recovery
activities on such real-time systems. Effective SHM requires assessing the status of the
system with respect to its specifications and estimating system health during mission
time. Johnson et al. [17, Ch.1] recently highlighted the need for new, formal-methods
based capabilities for modeling complex relationships among different sensor data and
reasoning about timing-related requirements; computational expense prevents the cur-
rent best methods for SHM from meeting operational needs.
We need a new SHM framework for real-time systems like the Swift [16] electric
UAS (see Fig. 1), developed at NASA Ames. SHM for such systems requires:
RESPONSIVENESS: the SHM framework must continuously monitor the system. Devi-
ations from the monitored specifications must be detected within a tight and a priori
known time bound, enabling mitigation or rescue measures, e.g., a controlled emer-
gency landing to avoid damage on the ground. Reporting intermediate status and satis-
faction of timed requirements as early as possible is required for enabling responsive
decision-making.
UNOBTRUSIVENESS: the SHM framework must not alter crucial properties of the sys-
tem including functionality (not change behavior), certifiability (avoid re-certification
of flight software/hardware), timing (not interfere with timing guarantees), and toler-
ances (not violate size, weight, power, or telemetry bandwidth constraints). Utilizing
commercial-off-the-shelf (COTS) and previously proven system components is abso-
lutely required to meet today’s tight time and budget constraints; adding the SHM
framework to the system must not alter these components as changes that require them
to be re-certified cancel out the benefits of their use. Our goal is to create the most ef-
fective SHM capability with the limitation of read-only access to the data from COTS
components.
REALIZABILITY: the SHM framework must be usable in a plug-and-play manner by
providing a generic interface to connect to a wide variety of systems. The specification
language must be easily understood and expressive enough to encode e.g. temporal
relationships and flight rules. The framework must adapt to new specifications without
a lengthy re-compilation. We must be able to efficiently monitor different requirements
during different mission stages, like takeoff, approach, measurement, and return.

1.1 Related Work


Existing methods for Runtime Verification (RV) [4] assess system status by automatically
generating, mainly software-based, observers to check the state of the system against a
formal specification. Observations in RV are usually made accessible via software in-
strumentation [15]; they report only when a specification has passed or failed. Such in-
strumentation violates our requirements as it may make re-certification of the system
onerous, alter the original timing behavior, or increase resource consumption [23]. Also,
reporting only the outcomes of specifications violates our responsiveness requirement.
Systems in our application domain often need to adhere to timing-related rules like:
after receiving the command ’takeoff’ reach an altitude of 600 ft within five minutes.
These flight rules can be easily expressed in temporal logics; often in some flavor of lin-
ear temporal logic (LTL), as studied in [7]. Mainly due to promising complexity
results [6,11], restrictions of LTL to its past-time fragment have most often been used for
RV. Though specifications including past time operators may be natural for some other
domains [19], flight rules require future-time reasoning. To enable more intuitive spec-
ifications, others have studied monitoring of future-time claims; see [22] for a survey
and [5, 11, 14, 21, 27, 28] for algorithms and frameworks. Most of these observer algo-
rithms, however, were designed with a software implementation in mind and require a
powerful computer. There are many hardware alternatives, e.g. [12], however all either
resynthesize monitors from scratch or exclude checking real-time properties [2]. Our
unique approach runs the logic synthesis tool once to synthesize as many real-time ob-
server blocks as we can fit on our platform, e.g., FPGA or ASIC; our Sec. 4.1 only inter-
connects these blocks. Others have proposed using Bayesian inference techniques [10]
to estimate the health of a system. However, modeling timing-related behavior with dy-
namic Bayesian networks is very complex and quickly renders practical implementa-
tions infeasible.

1.2 Approach and Contributions

We propose a new paired-observer SHM framework allowing systems like the Swift
UAS to assess their status against a temporal logic specification while enabling advanced
health estimation, e.g., via discrete Bayesian networks (BN) [10] based reasoning. This
novel combination of two approaches, often seen as orthogonal to each other, enables
us to check timing-related aspects with our paired observers while keeping BN health
models free of timing information, and thus computationally attractive. Essentially, we
can enable better real-time SHM by utilizing paired temporal observers to optimize BN-
based decision making. Following our requirements, we call our new SHM framework
for real-time systems a rt-R2U2 (real-time, Realizable, Responsive, Unobtrusive Unit).
Our rt-R2U2 synthesizes a pair of observers for a real-time specification ϕ given in
Metric Temporal Logic (MTL) [1] or a specialization of LTL for mission-time bounded
characteristics, which we define in Sec. 2. To ensure RESPONSIVENESS of our rt-R2U2,
we design two kinds of observer algorithms in Sec. 3 that verify whether ϕ holds at a
discrete time and run them in parallel. Synchronous observers have small hardware foot-
prints (max. eleven two-input gates per operator; see Theorem 3 in Sec. 4) and return
an instant, three-valued abstraction {true, false, maybe} of the satisfaction check of ϕ
with every new tick of the Real Time Clock (RTC) while their asynchronous counter-
parts concretize this abstraction at a later, a priori known time. This unique approach al-
lows us to signal early failure and acceptance of every specification whenever possible
via the asynchronous observer. Note that previous approaches to runtime monitoring
signal only specification failures; signaling acceptance, and particularly early accep-
tance is unique to our approach and required for supporting other system components
such as prognostics engines or decision making units. Meanwhile, our synchronous ob-
server’s three-valued output gives intermediate information that a specification has not
yet passed/failed, enabling probabilistic decision making via a Bayesian Network as
described in [26].
We implement the rt-R2U2 in hardware as a self-contained unit, which runs
externally to the system, to support UNOBTRUSIVENESS; see Sec. 4. Safety-critical embedded systems often use industrial vehicle bus systems, such as CAN and PCI, to interconnect hardware and software components; see Fig. 1. Our rt-R2U2 provides

[Figure 1 (block diagram and timeline): the Swift UAS subsystems (flight computer, laser altimeter, barometric altimeter, IMU & GPS, radio link) are connected via a Common Bus Interface to the rt-R2U2, which comprises event capture with an RTC, the runtime observers for a specification ϕ, and higher-level reasoning with a BN health model producing en ⊧ {ϕ1, .., ϕn} and the system status. Below, a timeline over time stamps n = 0, ..., 30 shows the predicates en ⊧ (alt ≥ 600ft), en ⊧ (pitch ≥ 5°), and en ⊧ (cmd == takeoff).]

Fig. 1. rt-R2U2: An instance of our SHM framework rt-R2U2 for the NASA Swift UAS. Swift
subsystems (top): The laser altimeter maps terrain and determines elevation above ground by
measuring the time for a laser pulse to echo back to the UAS. The barometric altimeter deter-
mines altitude above sea level via atmospheric pressure. The inertial measurement unit (IMU)
reports velocity, orientation (yaw, pitch, and roll), and gravitational forces using accelerometers,
gyroscopes, and magnetometers. Running example (bottom): predicates over Swift UAS sensor
data on execution e; ranging over the readings of the barometric altimeter, the pitch sensor, and
the takeoff command received from the ground station; n is the time stamp as issued by the
Real-Time-Clock.

generic read-only interfaces to these bus systems, supporting our UNOBTRUSIVENESS requirement and sidestepping instrumentation. Events collected on these interfaces are time stamped by an RTC; progress of time is derived from the observed clock signal, resulting in a discrete time base N0. Events are then processed by our runtime observer pairs that check whether a specification holds on a sequence of collected events. Other RV approaches for on-the-fly observers exhibit high overhead [13, 20, 24] or use powerful database systems [3] and thus violate our requirements.
To meet our REALIZABILITY requirement, we design an efficient, highly parallel
hardware architecture, yet keep it programmable to adapt to changes in the specification.
Unlike existing approaches, our observers are designed with an efficient hardware im-
plementation in mind, therefore, avoid recursion and expensive search through memory
and aim at maximizing the benefits of the parallel nature of hardware. We synthesize
rt-R2U2 once and generate a configuration, similar to machine code, to interconnect
and configure the static hardware observer blocks of rt-R2U2, adapting to new specifi-
cations without running CAD or compilation tools like previous approaches. UAS have
very limited bandwidth constraints; transferring a lightweight configuration is prefer-
able to transferring a new image for the whole hardware design. The checks computed
by these runtime observers represent the system’s status and can be utilized by a higher
level reasoner, such as a human operator, Bayesian network, or otherwise, to compute a
health estimation, i.e., a conditional probability expressing the belief that a certain sub-
system is healthy, given the status of the system. In this paper, we compute these health
estimations by adapting the BN-based inference algorithms of [10] in hardware. Our


contributions include synthesis and integration of the synchronous/asynchronous ob-
server pairs, a modular hardware implementation, and execution of a proof-of-concept
rt-R2U2 running on a self-contained Field Programmable Gate Array (FPGA) (Sec. 5).

2 Real-Time Projections of LTL


MTL replaces the temporal operators of LTL with operators that respect time bounds [1].

Definition 1 (Discrete-Time MTL). For atomic proposition σ ∈ Σ, σ is a formula. Let


time bound J = [t, t′ ] with t, t′ ∈ N0 . If ϕ and ψ are formulas, then so are:
¬ϕ ∣ ϕ ∧ ψ ∣ ϕ ∨ ψ ∣ ϕ → ψ ∣ X ϕ ∣ ϕ UJ ψ ∣ ◻J ϕ ∣ ◇J ϕ.
Time bounds are specified as intervals: for t, t′ ∈ N0 , we write [t, t′ ] for the set {i ∈
N0 ∣ t ≤ i ≤ t′ }. We use the functions min, max, dur, to extract the lower time bound
(t), the upper time bound (t′ ), and the duration (t′ − t) of J. We define the satisfaction
relation of an MTL formula as follows: an execution e = (sn ) for n ≥ 0 is an infinite
sequence of states. For an MTL formula ϕ, time n ∈ N0 and execution e, we define ϕ
holds at time n of execution e, denoted en ⊧ ϕ, inductively as follows:
en ⊧ true is true, en ⊧ σ iff σ holds in sn , en ⊧ ¬ϕ iff en ⊭ ϕ,
en ⊧ ϕ ∧ ψ iff en ⊧ ϕ and en ⊧ ψ, en ⊧ X ϕ iff en+1 ⊧ ϕ,

en ⊧ ϕ UJ ψ iff ∃i(i ≥ n) ∶ (i − n ∈ J ∧ ei ⊧ ψ ∧ ∀j(n ≤ j < i) ∶ ej ⊧ ϕ).


With the dualities ◇J ϕ ≡ true UJ ϕ and ¬◇J ¬ϕ ≡ ◻J ϕ we arrive at two additional
operators: ◻J ϕ (ϕ is an invariant within the future interval J) and ◇J ϕ (ϕ holds
eventually within the future interval J). In order to efficiently encode specifications in
practice, we introduce two special cases of ◻J ϕ and ◇J ϕ: τ ϕ ≡ ◻[0,τ ] ϕ (ϕ is an
invariant within the next τ time units) and τ ϕ ≡ ◇[0,τ ] ϕ (ϕ holds eventually within the
next τ time units). For example, the flight rule from Sec. 1, “After receiving the takeoff
command reach an altitude of 600f t within five minutes,” is efficiently captured in MTL
by (cmd == takeoff) → 5 (alt ≥ 600f t), assuming a time-base of one minute and the
atomic propositions (alt ≥ 600f t) and (cmd == takeoff) as in Fig. 1.
Systems in our application domain are usually bounded to a certain mission time.
For example, the Swift UAS has a limited air-time, depending on the available battery
capacity and predefined waypoints. We capitalize on this property to intuitively monitor
standard LTL requirements using a mission-time bounded projection of LTL.
Definition 2 (Mission-Time LTL). For a given LTL formula ξ and a mission time tm ∈
N0 , we denote by ξm the mission-time bounded equivalent of ξ, where ξm is obtained
by replacing every ◻ϕ, ◇ϕ, and ϕ U ψ operator in ξ by the ◻τ ϕ, ◇τ ϕ, and ϕ UJ ψ operators of MTL, where J = [0, tm] and τ = tm.
Inputs to rt-R2U2 are time-stamped events, collected incrementally from the system.
Definition 3 (Execution Sequence). An execution sequence for an MTL formula ϕ,
denoted by ⟨Tϕ ⟩, is a sequence of tuples Tϕ = (v, τe ) where τe ∈ N0 is a time stamp and
v ∈ {true, false, maybe} is a verdict.
We use a superscript integer to access a particular element in ⟨Tϕ ⟩, e.g., ⟨Tϕ0 ⟩ is
the first element in execution sequence ⟨Tϕ ⟩. We write Tϕ .τe to access τe and Tϕ .v to
access v of such an element. We say Tϕ holds if Tϕ .v is true and Tϕ does not hold if
Tϕ .v is false. For a given execution sequence ⟨Tϕ ⟩ = ⟨Tϕ0 ⟩, ⟨Tϕ1 ⟩, ⟨Tϕ2 ⟩, ⟨Tϕ3 ⟩, . . . , the
tuple accessed by ⟨Tϕi ⟩ corresponds to a section of an execution e as follows: for all
times n ∈ [⟨Tϕi−1 ⟩.τe + 1, ⟨Tϕi ⟩.τe ], en ⊧ ϕ in case ⟨Tϕi ⟩.v is true and en ⊭ ϕ in case
⟨Tϕi ⟩.v is false. In case ⟨Tϕi ⟩.v is maybe, neither en ⊧ ϕ nor en ⊭ ϕ is defined.
In the remainder of this paper, we will frequently refer to execution sequences col-
lected from the Swift UAS as shown in Fig. 1. The predicates shown are atomic propo-
sitions over sensor data in our specifications and are sampled with every new time
stamp n issued by the RTC. For example, ⟨Tpitch≥5○ ⟩ = ((false, 0), (false, 1), (false, 2),
(true, 3), . . . , (true, 17), (true, 18)) describes en ⊧ (pitch ≥ 5○ ) sampled over n ∈
[0, 18] and ⟨Tpitch≥5°⟩ contains 19 elements.
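Definition 3 maps directly onto a small data type. The following C sketch is our own illustration (the type and field names are ours, not taken from the rt-R2U2 implementation); it encodes the three-valued verdicts and execution-sequence tuples and prints a prefix of ⟨Tpitch≥5°⟩ from Fig. 1.

#include <stdio.h>

/* Three-valued verdict of Definition 3. */
typedef enum { V_FALSE, V_TRUE, V_MAYBE } verdict_t;

/* One element T_phi = (v, tau_e) of an execution sequence <T_phi>. */
typedef struct {
    verdict_t    v;      /* verdict                       */
    unsigned int tau_e;  /* time stamp issued by the RTC  */
} tuple_t;

int main(void) {
    /* Prefix of <T_pitch>=5deg> from Fig. 1: false at n = 0..2, true from n = 3. */
    tuple_t t_pitch[] = {
        {V_FALSE, 0}, {V_FALSE, 1}, {V_FALSE, 2}, {V_TRUE, 3}, {V_TRUE, 4}
    };
    for (unsigned i = 0; i < sizeof t_pitch / sizeof t_pitch[0]; i++)
        printf("(%s, %u)\n",
               t_pitch[i].v == V_TRUE ? "true"
                 : t_pitch[i].v == V_FALSE ? "false" : "maybe",
               t_pitch[i].tau_e);
    return 0;
}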

3 Asynchronous and Synchronous Observers


The problem of monitoring a real-time specification has been studied extensively in
the past; see [8, 22] for an overview. Solutions include: (a) translating the temporal for-
mula into a finite-state automaton that accepts all the models of the specification [11,
12, 14, 28], (b) restricting MTL to its safety fragment and waiting until the operators’
time bounds have elapsed to decide the truth value afterwards [5,21], and (c) restricting
LTL to its past-time fragment [6,11,24]. Compiling new observers to automata as in (a)
requires re-running the logic synthesis tool to yield a new hardware observer, in automa-
ton or autogenerated VHDL code format as described in [12], which may take dozens of
minutes to complete, violating the REALIZABILITY requirement. Observers generated by (b) are in conflict with the RESPONSIVENESS requirement and (c) do not natively support flight rules. Our observers provide UNOBTRUSIVENESS via a self-contained
hardware implementation. To enable such an implementation, our design needs to re-
frain from dynamic memory, linked lists, and recursion – commonly used in existing
software-based observers, however, not natively available in hardware.
Our two types of runtime observers differ in the times when new outputs are gener-
ated and in the resource footprints required to implement them. A synchronous (time-
triggered) observer is trimmed towards a minimalistic hardware footprint and computes
a three-valued abstraction of the satisfaction check for the specification with each tick of
the RTC, without considering events happening after the current time. An asynchronous
(event-triggered) observer concretizes this abstraction at a later, a priori known, time
and makes use of synchronization queues to take events into account that occur after
the current time.1 Our novel parallel composition of these two observers updates the
status of the system at every tick of the RTC, yielding great responsiveness. An incon-
clusive answer when we can’t yet know true/false is still beneficial as the higher-level
reasoning part of our rt-R2U2 supports reasoning with inconclusive inputs. This al-
lows us to derive an intermediate estimation of system health with the option to initiate
fault mitigation actions even without explicitly knowing all inputs. If exact reasoning
is required, we can re-evaluate system health when the asynchronous observer provides
exact answers.
1 Similar terms have been used by others [9] to refer to monitoring with pairs of observers that do not update with the RTC, incur delays dangerous to a UAS, and require system interaction that violates our requirements (Sec. 1).

In the remainder of this section, we discuss2 both asynchronous and synchronous


observers for the operators ¬ϕ, ϕ ∧ ψ, ◻τ ϕ, ◻J ϕ, and ϕ UJ ψ. Informally, an MTL
observer is an algorithm that takes execution sequences as input and produces another
execution sequence as output. For a given unary operator ●, we say that an observer
algorithm implements en ⊧ ● ϕ, iff for all execution sequences ⟨Tϕ ⟩ as input, it pro-
duces an execution sequence as output that evaluates en ⊧ ● ϕ (analogous for binary
operators).

3.1 Asynchronous Observers


The main characteristic of our asynchronous observers is that they are evaluated with
every new input tuple and that for every generated output tuple T we have that T.v ∈
{true, false} and T.τe ∈ [0, n]. Since verdicts are exact evaluations of a future-time
specification ϕ for each clock tick they may resolve ϕ for clock ticks prior to the current
time n if the information required for this resolution was not available until n.
Our observers distinguish two types of transitions of the signals described by execution sequences. We say a ↑ transition of execution sequence ⟨Tϕ⟩ occurs at time n = ⟨Tϕ^i⟩.τe + 1 iff (⟨Tϕ^i⟩.v ⊕ ⟨Tϕ^{i+1}⟩.v) ∧ ⟨Tϕ^{i+1}⟩.v holds. Similarly, we say a ↓ transition of execution sequence ⟨Tϕ⟩ occurs at time n = ⟨Tϕ^i⟩.τe + 1 iff (⟨Tϕ^i⟩.v ⊕ ⟨Tϕ^{i+1}⟩.v) ∧ ⟨Tϕ^i⟩.v holds (⊕ denotes the Boolean exclusive-or). For example, ↑ and ↓ transitions of ⟨Tpitch≥5°⟩ in Fig. 1 occur at times 3 and 11, respectively.

Negation (¬ ϕ). The observer for ¬ ϕ, as stated in Alg. 1, is straightforward: for every input Tϕ we negate the truth value of Tϕ.v. For the input ⟨Tpitch≥5°⟩ of Fig. 1, the observer generates (. . . , (true, 2), (false, 3), . . . ).
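Because the ↑/↓ transition tests and the negation observer reduce to a few Boolean operations, they can be stated directly in code. The sketch below is a hedged illustration of ours (the function names are ours); it applies the exclusive-or test from above to the pitch signal of Fig. 1 and reports the ↑ transition at n = 3 and the ↓ transition at n = 11.

#include <stdbool.h>
#include <stdio.h>

/* Rising edge at time n = te(i) + 1: (v_i XOR v_{i+1}) AND v_{i+1}. */
static bool rising(bool prev_v, bool cur_v)  { return (prev_v ^ cur_v) && cur_v; }

/* Falling edge at time n = te(i) + 1: (v_i XOR v_{i+1}) AND v_i. */
static bool falling(bool prev_v, bool cur_v) { return (prev_v ^ cur_v) && prev_v; }

int main(void) {
    /* <T_pitch>=5deg> of Fig. 1: false for n in [0,2], true for n in [3,10], false at 11. */
    bool pitch[] = { false, false, false, true, true, true, true,
                     true, true, true, true, false };
    for (unsigned n = 1; n < sizeof pitch / sizeof pitch[0]; n++) {
        if (rising(pitch[n - 1], pitch[n]))  printf("rising  edge at n = %u\n", n);  /* 3  */
        if (falling(pitch[n - 1], pitch[n])) printf("falling edge at n = %u\n", n);  /* 11 */
        /* Negation observer (Alg. 1): the output tuple is simply (!pitch[n], n). */
    }
    return 0;
}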

Invariant within the Next τ Time Stamps (◻τ ϕ). An observer for ◻τ ϕ requires registers m↑ϕ and mτs with domain N0: m↑ϕ holds the time stamp of the latest ↑ transition of ⟨Tϕ⟩ whereas mτs holds the start time of the next tuple in ⟨Tϕ⟩. For the observer in Alg. 2, the check m↑ϕ ≤ (Tϕ.τe − τ) in line 8 tests whether ϕ held for at least the previous τ time stamps. To illustrate the algorithm, consider an observer for ◻5 (pitch ≥ 5°) and the execution in Fig. 1. At time n = 0, we have m↑ϕ = 0 and since ⟨T^0_{pitch≥5°}⟩ does not hold the output is (false, 0). Similarly, the outputs for n ∈ [1, 2] are (false, 1) and (false, 2). At time n = 3, a ↑ transition of ⟨Tpitch≥5°⟩ occurs, thus m↑ϕ = 3. Since the check in line 8 does not hold, the algorithm does not generate a new output, i.e., returns ( , ) designating output is delayed until a later time, which repeats at times n ∈ [4, 7]. At n = 8, the check in line 8 holds and the algorithm returns (true, 3). Likewise, the outputs for n ∈ [9, 10] are (true, 4) and (true, 5). At n = 11, ⟨T^11_{pitch≥5°}⟩ does not hold and the algorithm outputs (false, 11). We note the ability of the observer to re-synchronize its output with respect to its inputs and the RTC. For n ∈ [8, 10], outputs are given for a time prior to n, however, at n = 11 the observer re-synchronizes: the output (false, 11) signifies that en ⊭ ◻5 (pitch ≥ 5°) for n ∈ [6, 11]. By the equivalence ◇τ ϕ ≡ ¬◻τ ¬ϕ, we immediately arrive at an observer for ◇τ ϕ from Alg. 2 by negating both the input and the output tuple.
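The following C sketch is our own rendering of Alg. 2 for readers who prefer executable code over pseudocode; it is not the VHDL implementation of Sec. 4, and names such as m_up and m_ts stand for m↑ϕ and mτs. Driving it with ⟨Tpitch≥5°⟩ from Fig. 1 and τ = 5 reproduces the outputs discussed above: (false, n) for n ∈ [0, 2], ( , ) for n ∈ [3, 7], (true, 3) at n = 8, and (false, 11) at n = 11.

#include <stdbool.h>
#include <stdio.h>

typedef struct { bool known; bool v; unsigned te; } out_t;       /* !known encodes ( , )      */
typedef struct { unsigned m_up, m_ts; bool prev_v; } box_tau_t;  /* m_up = m↑ϕ, m_ts = mτs    */

/* One step of the asynchronous observer for "ϕ holds invariantly within the
   next τ time stamps" (a sketch of Alg. 2). */
static out_t box_tau_step(box_tau_t *o, bool v, unsigned te, unsigned tau) {
    out_t out = { true, v, te };
    if ((o->prev_v ^ v) && v)          /* ↑ transition of the input signal       */
        o->m_up = o->m_ts;             /* time stamp since which ϕ has held      */
    o->m_ts = te + 1;                  /* start time of the next input tuple     */
    o->prev_v = v;
    if (v) {
        if (te >= tau && o->m_up <= te - tau)   /* ϕ held for the last τ steps   */
            out.te = te - tau;
        else
            out.known = false;                  /* verdict still unknown: ( , )  */
    }
    return out;
}

int main(void) {
    bool pitch[] = { false, false, false, true, true, true, true, true,
                     true, true, true, false };            /* Fig. 1, n = 0..11 */
    box_tau_t o = { 0, 0, false };
    for (unsigned n = 0; n < sizeof pitch / sizeof pitch[0]; n++) {
        out_t t = box_tau_step(&o, pitch[n], n, 5);
        if (t.known) printf("n=%2u: (%s, %u)\n", n, t.v ? "true" : "false", t.te);
        else         printf("n=%2u: (_, _)\n", n);
    }
    return 0;
}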

2 Proofs of correctness for every observer algorithm appear in the Appendix.

Algorithm 1. Observer for ¬ ϕ.
1: At each new input Tϕ:
2: Tξ ← (¬ Tϕ.v, Tϕ.τe)
3: return Tξ

Algorithm 2. Observer for ◻τ ϕ. Initially, m↑ϕ = mτs = 0.
1: At each new input Tϕ:
2: Tξ ← Tϕ
3: if ↑ transition of Tξ occurs then
4: m↑ϕ ← mτs
5: end if
6: mτs ← Tϕ.τe + 1
7: if Tξ holds then
8: if m↑ϕ ≤ (Tξ.τe − τ) holds then
9: Tξ.τe ← Tξ.τe − τ
10: else
11: Tξ ← ( , )
12: end if
13: end if
14: return Tξ

Algorithm 3. Observer for ϕ ∧ ψ.
1: At each new input (Tϕ, Tψ):
2: if Tϕ holds and Tψ holds and qϕ ≠ () holds and qψ ≠ () holds then
3: Tξ ← (true, min(Tϕ.τe, Tψ.τe))
4: else if ¬Tϕ holds and ¬Tψ holds and qϕ ≠ () holds and qψ ≠ () holds then
5: Tξ ← (false, max(Tϕ.τe, Tψ.τe))
6: else if ¬Tϕ holds and qϕ ≠ () holds then
7: Tξ ← (false, Tϕ.τe)
8: else if ¬Tψ holds and qψ ≠ () holds then
9: Tξ ← (false, Tψ.τe)
10: else
11: Tξ ← ( , )
12: end if
13: dequeue(qϕ, qψ, Tξ.τe)
14: return Tξ

Algorithm 4. Observer for ◻J ϕ.
1: At each new input Tϕ:
2: Tξ ← ◻dur(J) Tϕ
3: if (Tξ.τe − min(J) ≥ 0) then
4: Tξ.τe ← Tξ.τe − min(J)
5: else
6: Tξ ← ( , )
7: end if
8: return Tξ

Algorithm 5. Observer for ϕ UJ ψ. Initially, mpre = m↑ϕ = 0, m↓ϕ = −∞, and p = false.
1: At each new input (Tϕ, Tψ) in lockstep mode:
2: if ↑ transition of Tϕ occurs then
3: m↑ϕ ← τe − 1
4: mpre ← −∞
5: end if
6: if ↓ transition of Tϕ occurs and Tψ holds then
7: Tϕ.v, p ← true, true
8: m↓ϕ ← τe
9: end if
10: if Tϕ holds then
11: if Tψ holds then
12: if (m↑ϕ + min(J) < τe) holds then
13: mpre ← τe
14: return (true, τe − min(J))
15: else if p holds then
16: return (false, m↓ϕ)
17: end if
18: else if (mpre + dur(J) ≤ τe) holds then
19: return (false, max(m↑ϕ, τe − max(J)))
20: end if
21: else
22: p ← false
23: if (min(J) = 0) holds then
24: return (Tψ.v, τe)
25: end if
26: return (false, τe)
27: end if
28: return ( , )

Invariant within Future Interval (◻J ϕ). The observer for ◻J ϕ, as stated in Alg. 4, builds on an observer for ◻τ ϕ and makes use of the equivalence ◻τ ϕ ≡ ◻[0,τ] ϕ. Intuitively, the observer for ◻τ ϕ returns true iff ϕ holds for at least the next τ time units. We can thus construct an observer for ◻J ϕ by reusing the algorithm for ◻τ ϕ, assigning τ = dur(J) and shifting the obtained output by min(J) time stamps into the past. From the equivalence ◇J ϕ ≡ ¬◻J ¬ϕ, we can immediately derive an observer for ◇J ϕ from the observer for ◻J ϕ. To illustrate the algorithm, consider an observer for ◻[5,10] (alt ≥ 600ft) over the execution in Fig. 1. For n ∈ [0, 4] the algorithm returns ( , ), since (⟨T^{0...4}_{alt≥600ft}⟩.τe − 5) ≥ 0 (line 3 of Alg. 4) does not hold. At n = 5 the underlying observer for ◻5 (alt ≥ 600ft) returns (false, 5), which is transformed (by line 4) into the output (false, 0). For similar arguments, the outputs for n ∈ [6, 9] are (false, 1), (false, 2), (false, 3), and (false, 4). At n ∈ [10, 14], the observer for ◻5 (alt ≥ 600ft) returns ( , ). At n = 15, ◻5 (alt ≥ 600ft) yields (true, 10), which is transformed (by line 4) into the output (true, 5). Note also that X ϕ ≡ ◻[1,1] ϕ.

The remaining observers for the binary operators ϕ ∧ ψ and ϕ UJ ψ take tuples
(Tϕ , Tψ ) as inputs, where Tϕ is from ⟨Tϕ ⟩ and Tψ is from ⟨Tψ ⟩. Since ⟨Tϕ ⟩ and ⟨Tψ ⟩
are execution sequences produced by two different observers, the two elements of the
input tuple (Tϕ , Tψ ) are not necessarily generated at the same time. Our observers for
binary MTL operators thus use two FIFO-organized synchronization queues to buffer
parts of ⟨Tϕ ⟩ and ⟨Tψ ⟩, respectively. For a synchronization queue q we denote by q =()
its emptiness and by ∣q∣ its size.

Conjunction (ϕ ∧ ψ). The observer for ϕ ∧ ψ, as stated in Alg. 3, reads inputs (Tϕ, Tψ) from two synchronization queues, qϕ and qψ. Intuitively, the algorithm follows the rules for conjunction in Boolean logic with additional emptiness checks on qϕ and qψ. The procedure dequeue(qϕ, qψ, Tξ.τe) drops all entries Tϕ in qϕ for which the following holds: Tϕ.τe ≤ Tξ.τe (analogous for qψ). To illustrate the algorithm, consider an observer for ◻5 (alt ≥ 600ft) ∧ (pitch ≥ 5°) and the execution in Fig. 1. For n ∈ [0, 9] the two observers for the involved subformulas immediately output (false, n). For n ∈ [10, 14], the observer for ◻5 (alt ≥ 600ft) returns ( , ), while in the meantime, the atomic proposition (pitch ≥ 5°) toggles its truth value several times, i.e., (true, 10), (false, 11), (false, 12), (true, 13), (false, 14). These tuples need to be buffered in queue q_{pitch≥5°} until the observer for ◻5 (alt ≥ 600ft) generates its next output, i.e., (true, 10) at n = 15. We apply the function aggregate(⟨Tϕ⟩), which repeatedly replaces two consecutive elements ⟨Tϕ^i⟩, ⟨Tϕ^{i+1}⟩ in ⟨Tϕ⟩ by ⟨Tϕ^{i+1}⟩ iff ⟨Tϕ^i⟩.v = ⟨Tϕ^{i+1}⟩.v, to the content of q_{pitch≥5°} once every time an element is added to q_{pitch≥5°}. Therefore, at n = 15: q_{pitch≥5°} = ((true, 10), (false, 12), (true, 13), (false, 14), (true, 15)) and q_{◻5(alt≥600ft)} = ((true, 10)). The observer returns (true, 10) (line 3) and dequeue(qϕ, qψ, 10) yields: q_{pitch≥5°} = ((false, 12), (true, 13), (false, 14), (true, 15)) and q_{◻5(alt≥600ft)} = ().
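The aggregate function compresses runs of equal verdicts so that a synchronization queue only has to store verdict changes. A minimal C sketch of this compression (our own illustration; it operates on a plain array rather than on the block-RAM ring buffers of Sec. 4):

#include <stdbool.h>
#include <stdio.h>

typedef struct { bool v; unsigned te; } tuple_t;

/* aggregate(<T_phi>): repeatedly replace two consecutive elements with equal
   verdicts by the later one; returns the new length of the queue. */
static unsigned aggregate(tuple_t q[], unsigned len) {
    unsigned out = 0;
    for (unsigned i = 0; i < len; i++) {
        if (out > 0 && q[out - 1].v == q[i].v)
            q[out - 1] = q[i];          /* keep only the later of the two tuples */
        else
            q[out++] = q[i];
    }
    return out;
}

int main(void) {
    /* q_pitch>=5deg before aggregation at n = 15 (cf. the example above). */
    tuple_t q[] = { {true,10}, {false,11}, {false,12}, {true,13}, {false,14}, {true,15} };
    unsigned len = aggregate(q, sizeof q / sizeof q[0]);
    for (unsigned i = 0; i < len; i++)   /* prints (true,10) (false,12) (true,13) (false,14) (true,15) */
        printf("(%s,%u) ", q[i].v ? "true" : "false", q[i].te);
    printf("\n");
    return 0;
}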

Until within Future Interval (ϕ UJ ψ). The observer for ϕ UJ ψ, as stated in Alg. 5, reads inputs (Tϕ, Tψ) from two synchronization queues and makes use of a Boolean flag p and three registers m↑ϕ, m↓ϕ, and mpre with domain N0 ∪ {−∞}: m↑ϕ (m↓ϕ) holds the time stamp of the latest ↑ transition (↓ transition) of ⟨Tϕ⟩ and mpre holds the latest time stamp where the observer detected ϕ UJ ψ to hold. Input tuples (Tϕ, Tψ) for the observer are read from synchronization queues in a lockstep mode: (Tϕ, Tψ) is split into (Tϕ′, Tψ′), where Tϕ′.τe = Tψ′.τe and the time stamp Tϕ′′.τe of the next tuple (Tϕ′′, Tψ′′) is Tϕ′.τe + 1. This ensures that the observer outputs only a single tuple at each run and avoids output buffers, which would account for additional hardware resources (see the correctness proof in the Appendix for a discussion). Intuitively, if Tϕ does not hold (lines 22–26) the observer is synchronous to its input and immediately outputs (false, Tϕ.τe). If Tϕ holds (lines 11–20) the time stamp n′ of the output tuple is not necessarily synchronous to the time stamp Tϕ.τe of the input anymore, however, it is bounded by (Tϕ.τe − max(J)) ≤ n′ ≤ Tϕ.τe (see Lemma “unrolling” in the Appendix). To illustrate the algorithm, consider an observer for (pitch ≥ 5°) U[5,10] (alt ≥ 600ft) over the execution in Fig. 1. At time n = 0, we have mpre = 0, m↑ϕ = 0, and m↓ϕ = −∞ and since ⟨T^0_{pitch≥5°}⟩ does not hold, the observer outputs (false, 0) in line 26. The outputs for n ∈ [1, 2] are (false, 1) and (false, 2). At time n = 3, a ↑ transition of ⟨Tpitch≥5°⟩ occurs, thus we assign m↑ϕ = 2 and mpre = −∞ (lines 3 and 4). Since ⟨T^3_{pitch≥5°}⟩ holds and ⟨T^3_{alt≥600ft}⟩ does not hold, the predicate in line 18 is evaluated, which holds, and the algorithm returns (false, max(2, 3 − 10)) = (false, 2). Thus, the observer does not yield a new output in this case, which repeats for times n ∈ [4, 9]. At time n = 10, a ↑ transition of ⟨Talt≥600ft⟩ occurs and the predicate in line 12 is evaluated. Since (2 + 5) < 10 holds, the algorithm returns (true, 5), revealing that en ⊧ (pitch ≥ 5°) U[5,10] (alt ≥ 600ft) for n ∈ [3, 5]. At time n = 11, a ↓ transition of ⟨Tpitch≥5°⟩ occurs and since ⟨T^11_{alt≥600ft}⟩ holds, p and the truth value of the current input ⟨T^11_{pitch≥5°}⟩.v are set true and m↓ϕ = 11. Again, line 12 is evaluated and the algorithm returns (true, 6). At time n = 12, since ⟨T^12_{pitch≥5°}⟩ does not hold, we clear p in line 22 and the algorithm returns (false, 12) in line 26, i.e., en ⊭ (pitch ≥ 5°) U[5,10] (alt ≥ 600ft) for n ∈ [7, 12]. At time n = 13, a ↑ transition of ⟨Tpitch≥5°⟩ occurs, thus m↑ϕ = 12 and mpre = −∞. The predicates in lines 12 and 15 do not hold, so the algorithm returns no new output in line 28. At time n = 14, a ↓ transition of ⟨Tpitch≥5°⟩ occurs, thus p and ⟨T^14_{pitch≥5°}⟩.v are set true and m↓ϕ = 14. The predicate in line 15 holds, and the algorithm outputs (false, 14), revealing that en ⊭ (pitch ≥ 5°) U[5,10] (alt ≥ 600ft) for n ∈ [13, 14].

3.2 Synchronous Observers


The main characteristic of our synchronous observers is that they are evaluated at ev-
ery tick of the RTC and that their output tuples T are guaranteed to be synchronous
to the current time stamp n. Thus, for each time n, a synchronous observer outputs
a tuple T with T.τe = n. This eliminates the need for synchronization queues. In-
puts and outputs of these observers are execution sequences with three-valued verdicts. The underlying abstraction is given by êval: ● → {true, false, maybe}, where ● ∈ {¬ϕ, ϕ ∧ ψ, ◻τ ϕ, ◻J ϕ, ϕ UJ ψ}. The implementation of êval(¬ϕ) and êval(ϕ ∧ ψ) follows the rules for Kleene logic [18]. For the remaining operators we define the verdict Tξ.v of the output tuple (Tξ.v, n), generated for inputs (Tϕ.v, n) (respectively (Tψ.v, n) for ϕ UJ ψ), as:

êval(◻τ ϕ) = true, if Tϕ.v holds and τ = 0; false, if Tϕ.v does not hold; maybe, otherwise.
êval(◻J ϕ) = maybe.
êval(ϕ UJ ψ) = true, if Tϕ.v and Tψ.v hold and min(J) = 0; false, if Tϕ.v does not hold; maybe, otherwise.
To illustrate our synchronous observer algorithms, consider the previously discussed formula ◻5 (alt ≥ 600ft) ∧ (pitch ≥ 5°), which we want to evaluate using the synchronous observer:
ξ = êval(êval(◻5 (alt ≥ 600ft)) ∧ (pitch ≥ 5°))
For n ∈ [0, 9], as in the case of the asynchronous observer, we can immediately output (false, n). At n = 10, êval(◻5 (alt ≥ 600ft)) yields (maybe, n), thus, the observer is inconclusive about the truth value of e10 ⊧ ξ. At n ∈ [11, 12], since (pitch ≥ 5°) does not hold, the outputs are (false, n). For analogous arguments, the output at n = 13 is (maybe, 13), at n = 14 (false, 14), and at n = 15 (maybe, 15). In this way, at times
n ∈ {11, 12, 14} the synchronous observer completes early evaluation of ξ, producing
output that would, without the abstraction, be guaranteed by the exact asynchronous
observer with a delay of 5 time units, i.e., at times n ∈ {16, 17, 19}.
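The three-valued abstraction itself is cheap to compute. The C sketch below is our own illustration of êval for Kleene conjunction and the ◻τ and UJ cases defined above (the actual design maps these functions to the gate-level circuits of Sec. 4); the small main reproduces the inconclusive verdict at n = 10 from the example.

#include <stdio.h>

typedef enum { B_FALSE, B_TRUE, B_MAYBE } tri_t;    /* three-valued verdict */

static tri_t tri_not(tri_t a) {
    return a == B_MAYBE ? B_MAYBE : (a == B_TRUE ? B_FALSE : B_TRUE);
}

static tri_t tri_and(tri_t a, tri_t b) {            /* Kleene conjunction */
    if (a == B_FALSE || b == B_FALSE) return B_FALSE;
    if (a == B_TRUE && b == B_TRUE)   return B_TRUE;
    return B_MAYBE;
}

/* eval^ for "invariant within the next tau time units". */
static tri_t eval_box_tau(tri_t v_phi, unsigned tau) {
    if (v_phi == B_FALSE)             return B_FALSE;
    if (v_phi == B_TRUE && tau == 0)  return B_TRUE;
    return B_MAYBE;
}

/* eval^ for phi U_J psi. */
static tri_t eval_until(tri_t v_phi, tri_t v_psi, unsigned min_J) {
    if (v_phi == B_FALSE)                                  return B_FALSE;
    if (v_phi == B_TRUE && v_psi == B_TRUE && min_J == 0)  return B_TRUE;
    return B_MAYBE;
}

int main(void) {
    /* n = 10 of the example: eval^(box_5(alt >= 600ft)) is maybe, (pitch >= 5deg) is true. */
    tri_t xi = tri_and(eval_box_tau(B_TRUE, 5), B_TRUE);
    tri_t nx = tri_not(xi);
    tri_t ut = eval_until(B_TRUE, B_MAYBE, 0);
    printf("xi=%d not(xi)=%d until=%d  (0=false, 1=true, 2=maybe)\n", xi, nx, ut);
    return 0;
}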

4 Mapping Observers into Efficient Hardware


We introduce a mapping of the observer pairs into efficient hardware blocks and a syn-
thesis procedure to generate a configuration for these blocks from an arbitrary MTL
specification. This configuration is loaded into the control unit of our rt-R2U2, where it
changes the interconnections between a pool of (static) hardware observer blocks and as-
signs memory regions for synchronization queues. This approach enables us to quickly
change the monitored specification (within resource limitations) without re-compiling
the rt-R2U2’s hardware, supporting our REALIZABILITY requirement.
Asynchronous observers require arithmetic operations on time stamps. Registers and
flags as required by the observer algorithm are mapped to circuits that can store informa-
tion, such as flip-flops. For the synchronization queues we turn to block RAMs (abun-
dant on FPGAs), organized as ring buffers. Time stamps are internally stored in registers
of width w = ⌈log2 (n)⌉ + 2, to indicate −∞ and to allow overflows when performing
arithmetical operations on time stamps. Subtraction and relational operators as required by the observer for ◻τ ϕ (Fig. 2) can be built around adders. For example, the check in line 8 of Alg. 2 is implemented using two w-bit wide adders: one for q = Tϕ.τe − τ and one to decide whether m↑ϕ ≥ q. A third adder runs in parallel and assigns a new value to mτs (line 6 of Alg. 2). Detecting a ↑ transition on ⟨Tϕ⟩ maps to an XOR gate and an AND gate, implementing the circuit (Tϕ^{i−1}.v ⊕ Tϕ^i.v) ∧ Tϕ^i.v, where Tϕ^{i−1}.v is the truth value of the previous input, stored in a flip-flop. The multiplexer either writes a new output or sets a flag to indicate ( , ).
Synchronous observers do not require calculations on time stamps and directly map to basic digital logic gates. Fig. 2 shows a circuit representing an êval(◻τ ϕ) observer that accounts for one two-input AND gate, one two-input OR gate, and two Inverter gates. Inputs (i1, i2) and outputs (y1, y2) are encoded (to project the three-valued logic into Boolean logic) as: true (0, 0), false (0, 1), and maybe (1, 0). Input j is set if τ = 0 and cleared otherwise.
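The encoding and the register width are easy to make concrete. The following C sketch is our own illustration of the two-bit projection and of w = ⌈log2(n)⌉ + 2 (it is not code from the hardware design):

#include <math.h>
#include <stdio.h>

/* Two-bit projection of {true, false, maybe} used on the observer interfaces:
   true -> (0,0), false -> (0,1), maybe -> (1,0). */
typedef struct { unsigned b1 : 1, b2 : 1; } enc_t;

static enc_t enc_true(void)  { return (enc_t){0, 0}; }
static enc_t enc_false(void) { return (enc_t){0, 1}; }
static enc_t enc_maybe(void) { return (enc_t){1, 0}; }

/* Register width for time stamps: w = ceil(log2(n)) + 2, the two extra bits
   serving to indicate -infinity and to allow overflow in time-stamp arithmetic. */
static unsigned ts_width(unsigned long n) {
    return (unsigned)ceil(log2((double)n)) + 2;
}

int main(void) {
    enc_t t = enc_true(), f = enc_false(), m = enc_maybe();
    printf("true=(%u,%u) false=(%u,%u) maybe=(%u,%u)\n",
           (unsigned)t.b1, (unsigned)t.b2, (unsigned)f.b1, (unsigned)f.b2,
           (unsigned)m.b1, (unsigned)m.b2);
    printf("w for n = 1000000 time stamps: %u bits\n", ts_width(1000000));  /* 22 */
    return 0;
}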

4.1 Synthesizing a Configuration for the rt-R2U2


The synthesis procedure to translate an MTL specification ξ into a configuration such that the rt-R2U2 instantiates observers for both ξ and êval(ξ) works as follows:

– Preprocessing. By the equivalences given in Sect. 2 rewrite ξ to ξ′, such that operators in ξ′ are from {¬ϕ, ϕ ∧ ψ, ◻τ ϕ, ◻J ϕ, ϕ UJ ψ} (SA1).
– Parsing. Parse ξ ′ to obtain an Abstract Syntax Tree (AST), denoted by AST(ξ ′ ).
The leaves of this tree are the atomic propositions Σ of ξ ′ (SA2).
– Allocating observers. For all nodes q in AST(ξ ′ ) allocate both the corresponding
synchronous and the asynchronous hardware observer block (SA3).
– Adding synchronization queues. ∀q ∈ AST(ξ ′ ): If q is of type ϕ ∧ ψ or ϕ UJ ψ add
queues qϕ and qψ to the inputs of the respective asynchronous observer (MA1).

Algorithm 6. Assigning synchronization queue sizes for AST(ξ ′ ). Let S be a set of nodes;
Initially: w = 0, add all Σ nodes of AST(ξ′) to S; The function wcd: ● → N0 calculates the worst-case delay an asynchronous observer may introduce by: wcd(¬ϕ) = wcd(ϕ ∧ ψ) = 0, wcd(◻τ ϕ) = τ, wcd(◻J ϕ) = wcd(ϕ UJ ψ) = max(J).
1: while S is not empty do
2: s, w ← get next node from S, 0
3: if s is type ϕ UJ ψ or ϕ ∧ ψ then
4: w ← max(∣qϕ ∣, ∣qψ ∣) + wcd(s)
5: end if
6: while s is not a synchronization queue do
7: s, w ← get predecessor of s in AST(ξ ′ ), w + wcd(s)
8: end while
9: Set ∣q∣ = w; (q is opposite synchronization queue of s)
10: Add all ϕ UJ ψ and ϕ ∧ ψ nodes that have unassigned synchronization queue sizes to S
11: end while

– Interconnect and dimensioning. Connect observers and queues according to AST(ξ ′).
Execute Alg. 6 (MA2).

Let {σ1, σ2, σ3} ⊆ Σ and ξ = σ1 → (◇10 (σ2) ∨ ◇100 (σ3)) be an MTL formula we want to synthesize a configuration for. SA1 yields ξ′ = ¬(σ1 ∧ ¬(¬◻10 (¬σ2)) ∧ ¬(¬◻100 (¬σ3))), which simplifies to ξ′ = ¬(σ1 ∧ ◻10 (¬σ2) ∧ ◻100 (¬σ3)). SA2 yields AST(ξ′). SA3 instantiates two ϕ ∧ ψ, three ¬ϕ, one ◻10 Tϕ, and one ◻100 Tϕ observer, both synchronous and asynchronous. MA1 introduces queues qσ1, qξ2, qξ3, qξ4 and MA2 interconnects observers and queues and assigns ∣qσ1∣ = 100, ∣qξ2∣ = 100, ∣qξ3∣ = 10, and ∣qξ4∣ = 0; see Fig. 2.
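The queue-sizing step can be read recursively: every binary observer must buffer one input for as long as the sibling subtree may delay its tuples. The C sketch below is our own simplification of Alg. 6 along these lines, not the paper's exact procedure; on the example formula ξ′ it reproduces the sizes |qσ1| = 100, |qξ4| = 0, |qξ2| = 100, and |qξ3| = 10 from Fig. 2.

#include <stdio.h>

typedef enum { ATOM, NOT, AND, BOX_TAU } kind_t;   /* subset of the operators */

typedef struct node {
    kind_t kind;
    unsigned tau;                 /* bound of BOX_TAU nodes */
    struct node *l, *r;
    const char *name;
} node_t;

static unsigned wcd(const node_t *n) {             /* worst-case delay of one node */
    return n->kind == BOX_TAU ? n->tau : 0;        /* ¬, ∧ and atoms add no delay  */
}

static unsigned subtree_wcd(const node_t *n) {     /* accumulated delay of a subtree */
    if (!n) return 0;
    unsigned l = subtree_wcd(n->l), r = subtree_wcd(n->r);
    return wcd(n) + (l > r ? l : r);
}

static void size_queues(const node_t *n) {
    if (!n) return;
    if (n->kind == AND) {                          /* binary observers get two queues */
        printf("queue for left  input of %s: %u\n", n->name, subtree_wcd(n->r));
        printf("queue for right input of %s: %u\n", n->name, subtree_wcd(n->l));
    }
    size_queues(n->l); size_queues(n->r);
}

int main(void) {
    /* ξ' = ¬(σ1 ∧ (◻10(¬σ2) ∧ ◻100(¬σ3))) from Sec. 4.1 */
    node_t s1 = {ATOM, 0, 0, 0, "σ1"}, s2 = {ATOM, 0, 0, 0, "σ2"}, s3 = {ATOM, 0, 0, 0, "σ3"};
    node_t n2 = {NOT, 0, &s2, 0, "¬σ2"}, n3 = {NOT, 0, &s3, 0, "¬σ3"};
    node_t b10 = {BOX_TAU, 10, &n2, 0, "◻10"}, b100 = {BOX_TAU, 100, &n3, 0, "◻100"};
    node_t and2 = {AND, 0, &b10, &b100, "ξ2∧ξ3"};
    node_t and1 = {AND, 0, &s1, &and2, "σ1∧ξ4"};
    node_t top = {NOT, 0, &and1, 0, "ξ'"};
    size_queues(&top);           /* prints 100, 0 (for σ1∧ξ4) and 100, 10 (for ξ2∧ξ3) */
    return 0;
}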

[Figure 2 (schematics): left, the hardware implementation of the asynchronous ◻τ ϕ observer (adders for q = Tϕ.τe − τ, the comparison m↑ϕ ≥ q, and mτs = Tϕ.τe + 1, edge detection, and a multiplexer) and of the synchronous êval(◻τ ϕ) observer with inputs (i1, i2, j) and outputs (y1, y2); right, the subformulas of AST(ξ) with their asynchronous and synchronous observer pairs (¬ξ5, σ1 ∧ ξ4, ξ2 ∧ ξ3, ◻10 ξ0, ◻100 ξ1, ¬σ2, ¬σ3), the synchronization queues qσ1, qξ4, qξ2, qξ3, and the inputs σ1, σ2, σ3; the depth d of AST(ξ) is 5.]

Fig. 2. Left: hardware implementations for ◻τ ϕ (top) and êval(◻τ ϕ) (bottom). Right: subformulas of AST(ξ), observers, and queues synthesized for ξ. Mapping the observers to hardware yields two levels of parallelism: (i) asynchronous (left) and the synchronous observers (right) run in parallel and (ii) observers for subformulas run in parallel, e.g., ◻10 ξ0 and ◻100 ξ1.

4.2 Circuit Size and Depth Complexity Results


Having discussed how to determine the size of the synchronization queues for our asyn-
chronous MTL observers, we are now in the position to prove space and time complex-
ity bounds.

Theorem 1 (Space Complexity of Asynchronous Observers). The respective asyn-


chronous observer for a given MTL specification ϕ has a space complexity, in terms of
memory bits, bounded by (2 + ⌈log2 (n)⌉) ⋅ (2 ⋅ m ⋅ p), where m is the number of binary
observers (i.e., ϕ ∧ ψ or ϕ UJ ψ) in ϕ, p is the worst-case delay of a single predecessor
chain in AST(ϕ), and n ∈ N0 is the time stamp it is executed.

Theorem 2 (Time Complexity of Asynchronous Observers). The respective asyn-


chronous observer for a given MTL specification ϕ has an asymptotic time complexity
of O( log2 log2 max(p, n) ⋅ d), where p is the maximum worst-case-delay of any ob-
server in AST(ϕ), d the depth of AST(ϕ), and n ∈ N0 the time stamp it is executed.

For our synchronous observers, we prove upper bounds in terms of two-input gates on the size of the resulting circuits. Actual implementations may yield significantly better results on circuit size, depending on the performance of the logic synthesis tool.

Theorem 3 (Circuit-Size Complexity of Synchronous Observers). For a given MTL formula ϕ, the circuit to monitor êval(ϕ) has a circuit-size complexity bounded by 11 ⋅ m, where m is the number of observers in AST(ϕ).

Theorem 4 (Circuit-Depth Complexity of Synchronous Observers). For a given MTL formula ϕ, the circuit to monitor êval(ϕ) has a circuit-depth complexity of 4 ⋅ d.

5 Applying the rt-R2U2 to NASA’s Swift UAS


We implemented our rt-R2U2 as a register-transfer-level VHDL hardware design, which
we simulated in Mentor Graphics ModelSim and synthesized for different FPGAs using the industrial logic synthesis tool Altera Quartus II.3 With our rt-R2U2, we
analyzed raw flight data from NASA’s Swift UAS collected during test flights. The
higher-level reasoning is performed by a health model, modeled as a Bayesian network
(BN) where the nodes correspond to discrete random variables. Fig. 3 shows the relevant
excerpt for reasoning about altitude. Directed edges encode conditional dependencies
between variables, e.g., the sensor reading SL depends on the health of the laser altime-
ter sensor HL . Conditional probability tables at each node define the local dependencies.
During health estimation, verdicts computed by our observer algorithms are provided
as virtual sensor values to the observable nodes SL , SB , SS ; e.g., the laser altimeter
measuring an altitude increase would result in setting SL to state inc. Then, the poste-
riors of the multivariate probability distribution encoded in the BN are calculated [10];
for details of modeling and reasoning see [25].
Our temporal specifications are evaluated by our runtime observers and describe
flight rules (ϕ1 , ϕ2 ) and virtual sensors:
3 Simulation traces are available in the Appendix; tools can be downloaded at http://www.mentor.com and http://www.altera.com.

[Figure 3 (flight data, health model, and observer outputs): Swift UAS flight data (barometric altitude altB in ft, laser altitude altL in ft, vertical velocity vel up in m/s, Euler pitch angle in rad); the Bayesian-network health model for reasoning about altitude with nodes U Altimeter (UA), H LaserAlt (HL), H BaroAlt (HB), S LaserAlt (SL), S Sensors (SS), and S BaroAlt (SB) and their conditional probability tables; the UAS health estimation (posterior marginals Pr(HL = healthy | en ⊧ {σSL, σSB, ϕSS}) and Pr(HB = healthy | en ⊧ {σSL, σSB, ϕSS})); and the UAS status assessment (outputs of the runtime observers) over time stamps n = 0, ..., 23, including en ⊧ altB ≥ 600ft, en ⊧ (cmd == takeoff), en ⊧ êval(ϕ1), and en ⊧ ϕ1. Inputs to the rt-R2U2 are flight data sampled in real time, a health model as BN, and an MTL specification ϕ; outputs are the health estimation (posterior marginals of HL and HB, quantifying the health of the laser and barometric altimeters) and the status of the UAS.]
Fig. 3. Adding SHM to the Swift UAS

ϕ1 = (cmd == takeoff) → ◇10 (altB ≥ 600ft)
ϕ2 = (cmd == takeoff) → ◇∗ (cmd == land)
ϕ1 encodes our running example flight rule; ϕ2 is a mission-bounded LTL property re-
quiring that the command land is received after takeoff, within the projected mission
time, indicated by ∗. Fig. 3 shows the execution sequences produced by both the asyn-
chronous (en ⊧ ϕ1) and the synchronous (en ⊧ êval(ϕ1)) observers for flight rule ϕ1. To keep the presentation accessible we scaled the timeline to just 24 time stamps; the actual implementation uses a resolution of 2^32 time stamps. The synchronous observer is able to prove the validity of ϕ1 immediately at all time stamps but one (n = 1), where the output is (maybe, 1). The asynchronous observer will resolve this inconclusive output at time n = 11, by generating the tuple (false, 1), revealing a violation of ϕ1 at time n = 1. The verdicts of σSL↑, σSL↓, σSB↑, σSB↓, ϕSS↑, and ϕSS↓ are
mapped to inputs SL , SB , SS of the health model:
σSL↑ = (altL − alt′L) > 0    σSL↓ = (altL − alt′L) < 0
σSB↑ = (altB − alt′B) > 0    σSB↓ = (altB − alt′B) < 0
σSB↑ observes whether the first derivative of the barometric altimeter reading is positive and thus holds if the sensor values indicate that the UAS is ascending. We set SB to inc if σSB↑ holds and to dec if σSB↓ holds. The specifications ϕSS↑ and ϕSS↓ subsume the pitch and the velocity readings to an additional, indirect altitude sensor. Due to sensor noise, simple threshold properties on the IMU signals would yield a large number of false positives. Instead, ϕSS↑ and ϕSS↓ use ◻τ ϕ observers as filters, by requiring that the pitch and the velocity signals exceed a threshold for multiple time steps.
ϕSS↑ = ◻10 (pitch ≥ 5°) ∧ ◻5 (vel up ≥ 2 m/s)
ϕSS↓ = ◻10 (pitch < 2°) ∧ ◻5 (vel up ≤ −2 m/s)
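A sketch of how such verdicts feed the BN is given below (our own illustration in C; the names are ours, not from the rt-R2U2 code base): the sign of the first difference of the barometric reading selects the state of the observable node SB.

typedef enum { S_INC, S_DEC, S_NONE } state_t;   /* states of the BN input node S_B */

/* Map the barometric-altimeter verdicts sigma_SB_up / sigma_SB_down to S_B. */
static state_t baro_state(double alt_b, double alt_b_prev) {
    double d = alt_b - alt_b_prev;   /* first difference of the altitude reading */
    if (d > 0) return S_INC;         /* sigma_SB_up holds   -> S_B = inc */
    if (d < 0) return S_DEC;         /* sigma_SB_down holds -> S_B = dec */
    return S_NONE;                   /* neither predicate holds          */
}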

Our real-time SHM analysis matched post-flight analysis by test engineers, includ-
ing successfully pinpointing a laser altimeter failure; see Fig. 3: the barometric altime-
ter, pitch, and the velocity readings indicated an increase in altitude (σSB↑ and ϕSS↑
held) while the laser altimeter indicated a decrease (σSL↓ held). The posterior marginal
Pr(HL = healthy ∣ en ⊧ {σSL , σSB , ϕSS }) of the node HL , inferred from the BN,
dropped from 70% to 8%, indicating a low degree of trust in the laser altimeter reading
during the outage; engineers attribute the failure to the UAS exceeding its operational
altitude.

6 Conclusion
We presented a novel SHM technique that enables both real-time assessment of the
system status of an embedded system with respect to temporal-logic-based specifica-
tions and also supports statistical reasoning to estimate its health at runtime. To ensure
REALIZABILITY, we observe specifications given in two real-time projections of LTL that naturally encode future-time requirements such as flight rules. Real-time health modeling, e.g., using Bayesian networks, allows mitigative reactions inferred from complex relationships between observations. To ensure RESPONSIVENESS, we run both an over-approximative, but synchronous to the real-time clock (RTC), and an exact, but asynchronous to the RTC, observer in parallel for every specification. To ensure UNOBTRUSIVENESS to flight-certified systems, we designed our observer algorithms with a
light-weight, FPGA-based implementation in mind and showed how to map them into
efficient, but reconfigurable circuits. Following on our success using rt-R2U2 to analyze
real flight data recorded by NASA’s Swift UAS, we plan to analyze future missions of
the Swift or small satellites with the goal of deploying rt-R2U2 onboard.

References
1. Alur, R., Henzinger, T.A.: Real-time Logics: Complexity and Expressiveness. In: LICS, pp.
390–401. IEEE (1990)
2. Backasch, R., Hochberger, C., Weiss, A., Leucker, M., Lasslop, R.: Runtime verification for
multicore SoC with high-quality trace data. ACM Trans. Des. Autom. Electron. Syst. 18(2),
18:1–18:26 (2013)
3. Barre, B., Klein, M., Soucy-Boivin, M., Ollivier, P.-A., Hallé, S.: MapReduce for paral-
lel trace validation of LTL properties. In: Qadeer, S., Tasiran, S. (eds.) RV 2012. LNCS,
vol. 7687, pp. 184–198. Springer, Heidelberg (2013)
4. Barringer, H., et al.: RV 2010. LNCS, vol. 6418. Springer, Heidelberg (2010)
5. Basin, D., Klaedtke, F., Müller, S., Pfitzmann, B.: Runtime monitoring of metric first-order
temporal properties. In: FSTTCS, pp. 49–60 (2008)
6. Basin, D., Klaedtke, F., Zălinescu, E.: Algorithms for monitoring real-time properties. In:
Khurshid, S., Sen, K. (eds.) RV 2011. LNCS, vol. 7186, pp. 260–275. Springer, Heidelberg
(2012)
7. Bauer, A., Leucker, M., Schallhart, C.: Comparing LTL semantics for runtime verification. J.
Log. and Comp. 20, 651–674 (2010)
8. Bauer, A., Leucker, M., Schallhart, C.: Runtime verification for LTL and TLTL. ACM Trans.
Softw. Eng. M. 20, 14:1–14:64 (2011)
9. Colombo, C., Pace, G., Abela, P.: Safer asynchronous runtime monitoring using compensa-
tions. FMSD 41, 269–294 (2012)
10. Darwiche, A.: Modeling and Reasoning with Bayesian Networks, 1st edn. Cambridge Uni-
versity Press, New York (2009)
11. Divakaran, S., D’Souza, D., Mohan, M.R.: Conflict-tolerant real-time specifications in metric
temporal logic. In: TIME, pp. 35–42 (2010)

12. Finkbeiner, B., Kuhtz, L.: Monitor circuits for LTL with bounded and unbounded future. In:
Bensalem, S., Peled, D.A. (eds.) RV 2009. LNCS, vol. 5779, pp. 60–75. Springer, Heidelberg
(2009)
13. Fischmeister, S., Lam, P.: Time-aware instrumentation of embedded software. IEEE Trans.
Ind. Informatics 6(4), 652–663 (2010)
14. Geilen, M.: An improved on-the-fly tableau construction for a real-time temporal logic. In:
Hunt Jr., W.A., Somenzi, F. (eds.) CAV 2003. LNCS, vol. 2725, pp. 394–406. Springer, Hei-
delberg (2003)
15. Havelund, K.: Runtime verification of C programs. In: Suzuki, K., Higashino, T., Ulrich, A.,
Hasegawa, T. (eds.) TestCom/FATES 2008. LNCS, vol. 5047, pp. 7–22. Springer, Heidelberg
(2008)
16. Ippolito, C., Espinosa, P., Weston, A.: Swift UAS: An electric UAS research platform for
green aviation at NASA Ames Research Center. In: CAFE EAS IV (April 2010)
17. Johnson, S., Gormley, T., Kessler, S., Mott, C., Patterson-Hine, A., Reichard, K., Philip Scan-
dura, J.: System Health Management: with Aerospace Applications. Wiley & Sons (2011)
18. Kleene, S.C.: Introduction to Metamathematics. North Holland (1996)
19. Lichtenstein, O., Pnueli, A., Zuck, L.: The glory of the past. In: Parikh, R. (ed.) Logic of
Programs 1985. LNCS, vol. 193, pp. 196–218. Springer, Heidelberg (1985)
20. Lu, H., Forin, A.: The design and implementation of P2V, an architecture for zero-overhead
online verification of software programs. Tech. Rep. MSR-TR-2007-99 (2007)
21. Maler, O., Nickovic, D., Pnueli, A.: On synthesizing controllers from bounded-response
properties. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 95–107.
Springer, Heidelberg (2007)
22. Maler, O., Nickovic, D., Pnueli, A.: Checking temporal properties of discrete, timed and con-
tinuous behaviors. In: Avron, A., Dershowitz, N., Rabinovich, A. (eds.) Pillars of Computer
Science. LNCS, vol. 4800, pp. 475–505. Springer, Heidelberg (2008)
23. Pike, L., Niller, S., Wegmann, N.: Runtime verification for ultra-critical systems. In: Khur-
shid, S., Sen, K. (eds.) RV 2011. LNCS, vol. 7186, pp. 310–324. Springer, Heidelberg (2012)
24. Reinbacher, T., Függer, M., Brauer, J.: Real-time runtime verification on chip. In: Qadeer, S.,
Tasiran, S. (eds.) RV 2012. LNCS, vol. 7687, pp. 110–125. Springer, Heidelberg (2013)
25. Schumann, J., Mbaya, T., Mengshoel, O., Pipatsrisawat, K., Srivastava, A., Choi, A., Dar-
wiche, A.: Software health management with Bayesian Networks. Innovations in Systems
and SW Engineering 9(4), 271–292 (2013)
26. Schumann, J., Rozier, K.Y., Reinbacher, T., Mengshoel, O.J., Mbaya, T., Ippolito, C.: To-
wards real-time, on-board, hardware-supported sensor and software health management for
unmanned aerial systems. In: PHM (2013)
27. Tabakov, D., Rozier, K.Y., Vardi, M.Y.: Optimized temporal monitors for SystemC. Formal
Methods in System Design 41(3), 236–268 (2012)
28. Thati, P., Roşu, G.: Monitoring Algorithms for Metric Temporal Logic specifications.
ENTCS 113, 145–162 (2005)
Status Report on Software Verification
(Competition Summary SV-COMP 2014)

Dirk Beyer

University of Passau, Germany

Abstract. This report describes the 3rd International Competition on


Software Verification (SV-COMP 2014), which is the third edition of
a thorough comparative evaluation of fully automatic software verifiers.
The reported results represent the state of the art in automatic soft-
ware verification, in terms of effectiveness and efficiency. The verification
tasks of the competition consist of nine categories containing a total of
2 868 C programs, covering bit-vector operations, concurrent execution,
control-flow and integer data-flow, device-drivers, heap data structures,
memory manipulation via pointers, recursive functions, and sequential-
ized concurrency. The specifications include reachability of program la-
bels and memory safety. The competition is organized as a satellite event
at TACAS 2014 in Grenoble, France.

1 Introduction
Software verification is an important part of software engineering, which is re-
sponsible for guaranteeing safe and reliable performance of the software systems
that our economy and society relies on. The latest research results need to be
implemented in verification tools, in order to transfer the theoretical knowledge
to engineering practice. The Competition on Software Verification (SV-COMP) 1
is a systematic comparative evaluation of the effectiveness and efficiency of the
state of the art in software verification. The benchmark repository of SV-COMP 2
is a collection of verification tasks that represent the current interest and abil-
ities of tools for software verification. For the purpose of this competition, the
verification tasks are arranged in nine categories, according to the characteristics
of the programs and the properties to verify. Besides the verification tasks that
are used in this competition and written in the programming language C, the
SV-COMP repository also contains tasks written in Java 3 and as Horn clauses 4 .
The main objectives of the Competition on Software Verification are to:
1. provide an overview of the state of the art in software-verification technology,
2. establish a repository of software-verification tasks that is widely used,
3. increase visibility of the most recent software verifiers, and
4. accelerate the transfer of new verification technology to industrial practice.
1 http://sv-comp.sosy-lab.org
2 https://svn.sosy-lab.org/software/sv-benchmarks/trunk
3 https://svn.sosy-lab.org/software/sv-benchmarks/trunk/java
4 https://svn.sosy-lab.org/software/sv-benchmarks/trunk/clauses


The large attendance at the past competition sessions at TACAS witnesses


that the community is interested in the topic and that the competition really
helps achieving the above-mentioned objectives (1) and (3). Also, objective (2)
is achieved: an inspection of recent publications on algorithms for software veri-
fication reveals that it becomes a standard for evaluating new algorithms to use
the established verification benchmarks from the SV-COMP repository.
The difference of SV-COMP to other competitions 5 6 7 8 9 10 11 12 13 is that we
focus on evaluating tools for fully automatic verification of program source code
in a standard programming language [1, 2]. The experimental evaluation is per-
formed on dedicated machines that provide the same limited amount of resources
to each verification tool.

2 Procedure
The procedure for the competition was not changed in comparison to the previ-
ous editions [1,2], and consisted of the phases (1) benchmark submission (collect
and classify new verification tasks), (2) training (teams inspect verification tasks
and train their verifiers), and (3) evaluation (verification runs with all competi-
tion candidates and review of the system descriptions by the competition jury).
All systems and their descriptions were again archived and stamped for identifi-
cation with SHA hash values. Also, before public announcement of the results,
all teams received the preliminary results of their verifier for approval. After the
competition experiments for the ‘official’ categories were finished, some teams
participated in demonstration categories, in order to experiment with new cate-
gories and new rules for future editions of the competition.

3 Definitions and Rules


As a new feature of the competition and to streamline the specification of the
various properties, we introduced a syntax for properties (described below). The
definition of verification tasks was not changed (taken from [2]).
Verification Tasks. A verification task consists of a C program and a property.
A verification run is a non-interactive execution of a competition candidate on
a single verification task, in order to check whether the following statement is
correct: “The program satisfies the property.” The result of a verification run is
a triple (answer, witness, time). answer is one of the following outcomes:
5 http://www.satcompetition.org
6 http://www.smtcomp.org
7 http://ipc.icaps-conference.org
8 http://www.qbflib.org/competition.html
9 http://fmv.jku.at/hwmcc12
10 http://www.cs.miami.edu/~tptp/CASC
11 http://termination-portal.org
12 http://fm2012.verifythis.org
13 http://rers-challenge.org

TRUE: The property is satisfied (i.e., no path that violates the property exists).
FALSE: The property is violated (i.e., there exists a path that violates the
property) and a counterexample path is produced and reported as witness.
UNKNOWN: The tool cannot decide the problem, or terminates by a tool
crash, or exhausts the computing resources time or memory (i.e., the compe-
tition candidate does not succeed in computing an answer TRUE or FALSE).
For the counterexample path that must be produced as witness for the result
FALSE, we did not require a particular fixed format. (Future editions of SV-
COMP will support machine-readable error witnesses, such that error witnesses
can be automatically validated by a verifier.) The time is measured as consumed
CPU time until the verifier terminates, including the consumed CPU time of all
processes that the verifier started. If time is equal to or larger than the time
limit, then the verifier is terminated and the answer is set to ‘timeout’ (and
interpreted as UNKNOWN). The verification tasks are partitioned into nine
separate categories and one category Overall that contains all verification tasks.
The categories, their defining category-set files, and the contained programs are
explained under Verification Tasks on the competition web site.
Properties. The specification to be verified is stored in a file that is given
as parameter to the verifier. In the repository, the specifications are available
in .prp files in the main directory.
The definition init(main()) gives the initial states of the program by a call of
function main (with no parameters). The definition LTL(f) specifies that formula
f holds at every initial state of the program. The LTL (linear-time temporal logic)
operator G f means that f globally holds (i.e., everywhere during the program
execution), and the operator F f means that f eventually holds (i.e., at some
point during the program execution). The proposition label(ERROR) is true if
the C label ERROR is reached, and the proposition end is true if the program
execution terminates (e.g., return of function main, program exit, abort).
Label Unreachability. The reachability property perror is encoded in the program
source code using a C label and expressed using the following specification (the
interpretation of the LTL formula is given in Table 1):
CHECK( init(main()), LTL(G ! label(ERROR)) )
The new syntax (in comparison to previous SV-COMP editions) allows a more
general specification of the reachability property, by decoupling the specification
from the program source code, and thus, not requiring the label to be named
ERROR.
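As an illustration of such a verification task, consider the following toy C program (it is our own example and not part of the SV-COMP benchmark repository; only the __VERIFIER_nondet_uint convention is taken from SV-COMP). The expected answer is TRUE: the sum of two equal unsigned integers is always even, so the label ERROR is unreachable.

/* Property: CHECK( init(main()), LTL(G ! label(ERROR)) ) */
extern unsigned int __VERIFIER_nondet_uint(void);

int main(void) {
    unsigned int x = __VERIFIER_nondet_uint();   /* arbitrary input value */
    unsigned int y = x + x;                      /* always even (mod 2^32) */
    if (y % 2 != 0) {
ERROR:  return 1;                                /* never reached */
    }
    return 0;
}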
Memory Safety. The memory-safety property pmemsafety (only used in one cat-
egory) consists of three partial properties and is expressed using the following
specification (interpretation of formulas given in Table 1):
CHECK( init(main()), LTL(G valid-free) )
CHECK( init(main()), LTL(G valid-deref) )
CHECK( init(main()), LTL(G valid-memtrack) )

Table 1. Formulas used in the competition, together with their interpretation


Formula Interpretation
G ! label(ERROR) The C label ERROR is not reachable on any finite execution of
the program.
G valid-free All memory deallocations are valid (counterexample: invalid free).
More precisely: There exists no finite execution of the program
on which an invalid memory deallocation occurs.
G valid-deref All pointer dereferences are valid (counterexample: invalid
dereference). More precisely: There exists no finite execution of
the program on which an invalid pointer dereference occurs.
G valid-memtrack All allocated memory is tracked, i.e., pointed to or deallocated
(counterexample: memory leak). More precisely: There exists
no finite execution of the program on which the program lost
track of some previously allocated memory.
F end All program executions are finite and end on proposition end
(counterexample: infinite loop). More precisely: There exists
no execution of the program on which the program never
terminates.

Table 2. Scoring schema for SV-COMP 2013 and 2014 (taken from [2])
Reported result Points Description
UNKNOWN 0 Failure to compute verification result
FALSE correct +1 Violation of property in program was correctly found
FALSE incorrect −4 Violation reported but property holds (false alarm)
TRUE correct +2 Correct program reported to satisfy property
TRUE incorrect −8 Incorrect program reported as correct (missed bug)

The verification result FALSE for the property pmemsafety is required to include
the violated partial property: FALSE(p), with p ∈ {pvalid−free, pvalid−deref ,
pvalid−memtrack}, means that the (partial) property p is violated. According to the
requirements for verification tasks, all programs in category MemorySafety violate
at most one (partial) property p ∈ {pvalid−free , pvalid−deref , pvalid−memtrack}. Per
convention, function malloc is assumed to always return a valid pointer, i.e., the
memory allocation never fails, and function free always deallocates the memory
and makes the pointer invalid for further dereferences.
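A correspondingly small illustrative task for memory safety (again our own toy example, not taken from the repository) violates pvalid−free by deallocating the same block twice, so the expected answer is FALSE(pvalid−free):

#include <stdlib.h>

int main(void) {
    int *p = malloc(sizeof(int));   /* per the convention above, malloc always succeeds */
    *p = 42;
    free(p);                        /* valid deallocation                               */
    free(p);                        /* double free: violates valid-free                 */
    return 0;
}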
Program Termination. The termination property ptermination (only used in a
demonstration category) is based on the proposition end and expressed using the
following specification (interpretation in Table 1):
CHECK( init(main()), LTL(F end) )
Evaluation by Scores and Run Time. The scoring schema was not changed
from SV-COMP 2013 to 2014 and is given in Table 2. The ranking is decided
based on the sum of points and for equal sum of points according to success
run time, which is the total CPU time over all verification tasks for which the
verifier reported a correct verification result. Sanity tests on obfuscated versions
of verification tasks (renaming of variable and function names; renaming of file)

Table 3. Competition candidates with their system-description references and repre-


senting jury members

Competition candidate Ref. Jury member Affiliation


Blast 2.7.2 [31] Vadim Mutilin ISP RAS, Moscow, Russia
Cbmc [24] Michael Tautschnig Queen Mary U, London, UK
CPAchecker [25] Stefan Löwe U Passau, Germany
CPAlien [27] Petr Muller TU Brno, Czech Republic
CSeq-Lazy [20] Bernd Fischer Stellenbosch U, South Africa
CSeq-Mu [33] Gennaro Parlato U Southampton, UK
Esbmc 1.22 [26] Lucas Cordeiro FUA, Manaus, Brazil
FrankenBit [15] Arie Gurfinkel SEI, Pittsburgh, USA
Llbmc [11] Stephan Falke KIT, Karlsruhe, Germany
Predator [9] Tomas Vojnar TU Brno, Czech Republic
Symbiotic 2 [32] Jiri Slaby Masaryk U, Brno, Czech Rep.
Threader [29] Corneliu Popeea TU Munich, Germany
Ufo [14] Aws Albarghouthi U Toronto, Canada
Ultimate Automizer [16] Matthias Heizmann U Freiburg, Germany
Ultimate Kojak [10] Alexander Nutz U Freiburg, Germany

did not reveal any discrepancy of the results. Opting-out from Categories and
Computation of Score for Meta Categories were defined as in SV-COMP
2013 [2]. The Competition Jury consists again of the chair and one member of
each participating team. Team representatives are indicated in Table 3.
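The scoring rules of Table 2 can be condensed into a small decision function; the sketch below is only an illustration of the schema (the names and encoding are made up, and it is not part of the competition infrastructure).

  /* expected_safe: 1 if the property holds for the task, 0 otherwise;
     answer: 1 = TRUE, 0 = FALSE, -1 = UNKNOWN                          */
  int points(int expected_safe, int answer) {
    if (answer < 0)  return 0;                        /* UNKNOWN: no verification result        */
    if (answer == 0) return expected_safe ? -4 : +1;  /* FALSE: false alarm vs. correct result  */
    return expected_safe ? +2 : -8;                   /* TRUE: correct result vs. missed bug    */
  }

Note that a missed bug is penalized twice as heavily as a false alarm, reflecting the emphasis on soundness.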

4 Participating Teams
Table 3 provides an overview of the participating competition candidates. The
detailed summary of the achievements for each verifier is presented in Sect. 5.
A total of 15 competition candidates participated in SV-COMP 2014: Blast
2.7.2 14, Cbmc 15, CPAchecker 16, CPAlien 17, CSeq-Lazy 18, CSeq-Mu,
Esbmc 1.22 19, FrankenBit 20, Llbmc 21, Predator 22, Symbiotic 2 23,
Threader 24, Ufo 25, Ultimate Automizer 26, and Ultimate Kojak 27.

14 http://forge.ispras.ru/projects/blast
15 http://www.cprover.org/cbmc
16 http://cpachecker.sosy-lab.org
17 http://www.fit.vutbr.cz/~imuller/cpalien
18 http://users.ecs.soton.ac.uk/gp4/cseq/cseq.html
19 http://www.esbmc.org
20 http://bitbucket.org/arieg/fbit
21 http://llbmc.org
22 http://www.fit.vutbr.cz/research/groups/verifit/tools/predator
23 https://sf.net/projects/symbiotic
24 http://www7.in.tum.de/tools/threader
25 http://bitbucket.org/arieg/ufo
26 http://ultimate.informatik.uni-freiburg.de/automizer
27 http://ultimate.informatik.uni-freiburg.de/kojak

Table 4. Technologies and features that the verification tools offer (incl. demo track)
Features: Bounded Model Checking, Explicit-Value Analysis, Predicate Abstraction, Concurrency Support, ARG-based Analysis, Bit-precise Analysis, Symbolic Execution, Ranking Functions, Lazy Abstraction, Interval Analysis, Shape Analysis, Interpolation, CEGAR

Verification tool (incl. demo track):
AProVE ✓ ✓
Blast 2.7.2 ✓ ✓ ✓ ✓ ✓
Cbmc ✓ ✓ ✓
CPAlien ✓ ✓
CPAchecker ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
CSeq-Lazy ✓ ✓
CSeq-Mu ✓ ✓
Esbmc 1.22 ✓ ✓ ✓
FuncTion ✓ ✓
FrankenBit ✓ ✓ ✓
Llbmc ✓
Predator ✓
Symbiotic 2 ✓
T2 ✓ ✓ ✓ ✓ ✓ ✓ ✓
Tan ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Threader ✓ ✓ ✓ ✓ ✓
Ufo ✓ ✓ ✓ ✓ ✓ ✓ ✓
Ultimate Automizer ✓ ✓ ✓ ✓
Ultimate Kojak ✓ ✓ ✓ ✓
Ultimate Büchi ✓ ✓ ✓ ✓ ✓
Table 4 lists the features and technologies that are used in the verification
tools. Counterexample-guided abstraction refinement (CEGAR) [8], predicate
abstraction [13], bounded model checking [6], lazy abstraction [19], and inter-
polation for predicate refinement [18] are implemented in many verifiers. Other
features that were implemented include symbolic execution [22], the construction
of an abstract reachability graph (ARG) as proof of correctness [3], and shape
analysis [21]. Only a few tools support the verification of concurrent programs.
Computing ranking functions [28] for proving termination is a feature that is
implemented in tools that participated in the demo category on termination.

Table 5. Quantitative overview over all results — Part 1 (score / CPU time)
Categories (points max. / verification tasks): BitVectors (86 / 49), Concurrency (136 / 78), ControlFlow (1 261 / 843), DeviceDrivers (2 766 / 1 428), HeapManip. (135 / 80)

Blast 2.7.2 (V. Mutilin, Moscow, Russia): ControlFlow 508 / 32 000 s, DeviceDrivers 2 682 / 13 000 s; other categories: —
Cbmc (M. Tautschnig, London, UK): BitVectors 86 / 2 300 s, Concurrency 128 / 29 000 s, ControlFlow 397 / 42 000 s, DeviceDrivers 2 463 / 390 000 s, HeapManip. 132 / 12 000 s
CPAchecker (S. Löwe, Passau, Germany): BitVectors 78 / 690 s, Concurrency 0 / 0.0 s, ControlFlow 1009 / 9 000 s, DeviceDrivers 2 613 / 28 000 s, HeapManip. 107 / 210 s
CPAlien (P. Muller, Brno, Czech Republic): ControlFlow 455 / 6 500 s, HeapManip. 71 / 70 s; other categories: —
CSeq-Lazy (B. Fischer, Stellenbosch, ZA): Concurrency 136 / 1 000 s; other categories: —
CSeq-Mu (G. Parlato, Southampton, UK): Concurrency 136 / 1 200 s; other categories: —
Esbmc 1.22 (L. Cordeiro, Manaus, Brazil): BitVectors 77 / 1 500 s, Concurrency 32 / 30 000 s, ControlFlow 949 / 35 000 s, DeviceDrivers 2 358 / 140 000 s, HeapManip. 97 / 970 s
FrankenBit (A. Gurfinkel, Pittsburgh, USA): ControlFlow 986 / 6 300 s, DeviceDrivers 2 639 / 3 000 s; other categories: —
Llbmc (S. Falke, Karlsruhe, Germany): BitVectors 86 / 39 s, Concurrency 0 / 0.0 s, ControlFlow 961 / 13 000 s, DeviceDrivers 0 / 0.0 s, HeapManip. 107 / 130 s
Predator (T. Vojnar, Brno, Czech Republic): BitVectors -92 / 28 s, Concurrency 0 / 0.0 s, ControlFlow 511 / 3 400 s, DeviceDrivers 50 / 9.9 s, HeapManip. 111 / 9.5 s
Symbiotic 2 (J. Slaby, Brno, Czech Republic): BitVectors 39 / 220 s, Concurrency -82 / 5.7 s, ControlFlow 41 / 39 000 s, DeviceDrivers 980 / 2 200 s, HeapManip. 105 / 15 s
Threader (C. Popeea, Munich, Germany): Concurrency 100 / 3 000 s; other categories: —
Ufo (A. Albarghouthi, Toronto, Canada): ControlFlow 912 / 14 000 s, DeviceDrivers 2 642 / 5 700 s; other categories: —
Ultimate Automizer (M. Heizmann, Freiburg, Germany): ControlFlow 164 / 6 000 s; other categories: —
Ultimate Kojak (A. Nutz, Freiburg, Germany): BitVectors -23 / 1 100 s, Concurrency 0 / 0.0 s, ControlFlow 214 / 5 100 s, DeviceDrivers 0 / 0.0 s, HeapManip. 18 / 35 s

Table 6. Quantitative overview over all results — Part 2 (score / CPU time)
Categories (points max. / verification tasks): MemorySafety (98 / 61), Recursive (39 / 23), SequentializedConcurrency (364 / 261), Simple (67 / 45), Overall (4 718 / 2 868)

Blast 2.7.2 (V. Mutilin, Moscow, Russia): Simple 30 / 5 400 s; other categories: —
Cbmc (M. Tautschnig, London, UK): MemorySafety 4 / 11 000 s, Recursive 30 / 11 000 s, SequentializedConcurrency 237 / 47 000 s, Simple 66 / 15 000 s, Overall 3 501 / 560 000 s
CPAchecker (S. Löwe, Passau, Germany): MemorySafety 95 / 460 s, Recursive 0 / 0.0 s, SequentializedConcurrency 97 / 9 200 s, Simple 67 / 430 s, Overall 2 987 / 48 000 s
CPAlien (P. Muller, Brno, Czech Republic): MemorySafety 9 / 690 s; other categories: —
CSeq-Lazy (B. Fischer, Stellenbosch, ZA): — in all categories
CSeq-Mu (G. Parlato, Southampton, UK): — in all categories
Esbmc 1.22 (L. Cordeiro, Manaus, Brazil): MemorySafety -136 / 1 500 s, Recursive -53 / 4 900 s, SequentializedConcurrency 244 / 38 000 s, Simple 31 / 27 000 s, Overall 975 / 280 000 s
FrankenBit (A. Gurfinkel, Pittsburgh, USA): Simple 37 / 830 s; other categories: —
Llbmc (S. Falke, Karlsruhe, Germany): MemorySafety 38 / 170 s, Recursive 3 / 0.38 s, SequentializedConcurrency 208 / 11 000 s, Simple 0 / 0.0 s, Overall 1 843 / 24 000 s
Predator (T. Vojnar, Brno, Czech Republic): MemorySafety 14 / 39 s, Recursive -18 / 0.12 s, SequentializedConcurrency -46 / 7 700 s, Simple 0 / 0.0 s, Overall -184 / 11 000 s
Symbiotic 2 (J. Slaby, Brno, Czech Republic): MemorySafety -130 / 7.5 s, Recursive 6 / 0.93 s, SequentializedConcurrency -32 / 770 s, Simple -22 / 13 s, Overall -220 / 42 000 s
Threader (C. Popeea, Munich, Germany): — in all categories
Ufo (A. Albarghouthi, Toronto, Canada): SequentializedConcurrency 83 / 4 800 s, Simple 67 / 480 s; other categories: —
Ultimate Automizer (M. Heizmann, Freiburg, Germany): Recursive 12 / 850 s, SequentializedConcurrency 49 / 3 000 s, Overall 399 / 10 000 s; other categories: —
Ultimate Kojak (A. Nutz, Freiburg, Germany): MemorySafety 0 / 0.0 s, Recursive 9 / 54 s, SequentializedConcurrency 9 / 1 200 s, Simple 0 / 0.0 s, Overall 139 / 7 600 s

Table 7. Overview of the top-three verifiers for each category (CPU time in s)

Rank  Candidate  Score  CPU Time  Solved Tasks  False Alarms  Missed Bugs
BitVectors
1 Llbmc 86 39 49
2 Cbmc 86 2 300 49
3 CPAchecker 78 690 45
Concurrency
1 CSeq-Lazy 136 1 000 78
2 CSeq-Mu 136 1 200 78
3 Cbmc 128 29 000 76 1
ControlFlow
1 CPAchecker 1009 9 000 764 2
2 FrankenBit 986 6 300 752 2
3 Llbmc 961 13 000 783 14
DeviceDrivers
1 Blast 2.7.2 2 682 13 000 1 386 2
2 Ufo 2 642 5 700 1 354 2 3
3 FrankenBit 2 639 3 000 1 383 5 5
HeapManipulation
1 Cbmc 132 12 000 78
2 Predator 111 9.5 68
3 Llbmc 107 130 66
MemorySafety
1 CPAchecker 95 460 59
2 Llbmc 38 170 31
3 Predator 14 39 43 12
Recursive
1 Cbmc 30 11 000 22 1
2 Ultimate Automizer 12 850 9
3 Ultimate Kojak 9 54 7
SequentializedConcurrency
1 Esbmc 1.22 244 38 000 187 2
2 Cbmc 237 47 000 225 10
3 Llbmc 208 11 000 191 3 3
Simple
1 CPAchecker 67 430 45
2 Ufo 67 480 45
3 Cbmc 66 15 000 44
Overall
1 Cbmc 3 501 560 000 2 597 3 90
2 CPAchecker 2 987 48 000 2 421 12
3 Llbmc 1 843 24 000 1 123 3 17

5 Results and Discussion


The results that we obtained in the competition experiments and reported in
this article represent the state of the art in fully automatic and publicly avail-
able software-verification tools. The results show achievements in effectiveness
(number of verification tasks that can be solved, correctness of the results) and
efficiency (resource consumption in terms of CPU time). All reported results
were approved by the participating teams.
The verification runs were natively executed on dedicated unloaded compute
servers with a 3.4 GHz 64-bit Quad Core CPU (Intel i7-2600) and a GNU/Linux
operating system (x86_64-linux). The machines had (at least) 16 GB of RAM, of
which exactly 15 GB were made available to the verification tools. The run-time
limit for each verification run was 15 min of CPU time. The tables report the run
time in seconds of CPU time; all measured values are rounded to two significant
digits. One complete competition run with all candidates on all verification tasks
required a total of 51 days of CPU time.
Tables 5 and 6 present a quantitative overview over all tools and all categories.
The tools are listed in alphabetical order; every table cell for competition results
lists the score in the first row and the CPU time for successful runs in the
second row. We indicated the top-three candidates by formatting their score in
bold face and in larger font size. The entry ‘—’ means that the verifier opted-
out from the respective category. For the calculation of the score and for the
ranking, the scoring schema in Table 2 was applied, the scores for meta categories
(Overall and ControlFlow; consisting of several sub-categories) were computed
using normalized scores as defined in last year’s report [2].
Table 7 reports the top-three verifiers for each category. The run time refers
to successfully solved verification tasks. The columns ‘False Alarms’ and ‘Missed
Bugs’ report the number of verification tasks for which the tool reported wrong
results: reporting a counterexample path but the property holds (false positive)
and claiming that the program fulfills the property although it actually contains
a bug (false negative), respectively.
Score-Based Quantile Functions for Quality Assessment. As described in
the previous competition report [2], score-based quantile functions are a helpful
visualization of the results. The competition web page 28 presents such a plot
for each category, while we illustrate in Fig. 1 only the category Overall (all
verification tasks). A total of eight verifiers participated in category Overall, for
which the quantile plot shows the overall performance over all categories. (Note
that the scores are normalized as described last year [2].)
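As a rough illustration of how the data points of such a plot (defined in the caption of Fig. 1) can be derived from per-task results, consider the following sketch; the data layout and names are invented, and it is not the competition's actual evaluation code.

  #include <stdio.h>
  #include <stdlib.h>

  typedef struct { int correct; double score; double cpu_time; } Result;

  static int by_time(const void *a, const void *b) {
    double d = ((const Result *)a)->cpu_time - ((const Result *)b)->cpu_time;
    return (d > 0) - (d < 0);
  }

  /* Print one (x, y) data point per correct run, as described in the caption of Fig. 1. */
  void quantile_points(Result *r, int n) {
    double x = 0.0;
    for (int i = 0; i < n; i++)
      if (!r[i].correct) x += r[i].score;  /* scores of incorrect results are negative        */
    qsort(r, n, sizeof *r, by_time);       /* consider correct runs from fastest to slowest   */
    for (int i = 0; i < n; i++) {
      if (!r[i].correct) continue;
      x += r[i].score;                     /* accumulated score of the n fastest correct runs */
      printf("%f %f\n", x, r[i].cpu_time); /* y is the run time of the n-th fastest correct run */
    }
  }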
Overall Quality Measured in Scores (Right End of Graph). Cbmc is the winner of
this category, because the x-coordinate of the right-most data point represents
the highest total score (and thus, the total value) of the completed verification
work (cf. Table 7; right-most x-coordinates match the score values in the table).
Amount of Incorrect Verification Work (Left End of Graph). The left-most data
points of the quantile functions represent the total negative score of a verifier
28 http://sv-comp.sosy-lab.org/2014/results
Fig. 1. Quantile functions: For each competition candidate, we plot all data points (x, y)
such that the maximum run time of the n fastest correct verification runs is y and x is
the accumulated score of all incorrect results and those n correct results. A logarithmic
scale is used for the time range from 1 s to 1000 s, and a linear scale is used for the
time range between 0 s and 1 s. The graphs are decorated with symbols at every 15-th
data point.

(x-coordinate), i.e., amount of incorrect verification work. Verifiers should start
with a score close to zero; CPAchecker is best in this aspect (also the right-most
columns of category Overall in Table 7 report this: only 12 false alarms and no
missed bug for all 2 868 verification tasks).
Characteristics of the Verification Tools. The plot visualizations also help under-
standing how the verifiers work internally: (1) The y-coordinate of the left-most
data point refers to the ‘easiest’ verification task for the verifier. We can see
that verifiers that are based on a Java virtual machine need some start-up time
(CPAchecker, Ultimate). (2) The y-coordinate of the right-most data point
refers to the successfully solved verification task that the verifier spent most
time on (this is mostly just below the time limit). We can read the ranking of
verifiers in this category from right to left. (3) The area below a graph is pro-
portional to the accumulated CPU time for successfully solved tasks. We can
identify the most resource-efficient verifiers by looking at the graphs close to the
x-axis. (4) Also the shape of the graph can give interesting insights: From the two
horizontal lines just below the time limit (at 850 s and 895 s, resp.), we can see
that two of the bounded model checkers (Cbmc, Esbmc 1.22) return a result just
before the time limit is reached. The quantile plot for category DeviceDrivers64
(not available here, but on the competition web page) shows an interesting bend
at about 20 s of run time for verifier CPAchecker: the verifier gives up with
one strategy (without abstraction) and performs an internal restart for using
another strategy (with abstraction and CEGAR-based refinement).

Table 8. Quantitative overview over results in category Termination
Columns: Termination-crafted (89 points max., 47 verif. tasks), Termination-ext (265 points max., 199 verif. tasks), Errors (false alarms / missed bugs); rows give the competition candidate with its representing team member, score, and CPU time.
AProVE [12] 58 0
J. Giesl, Aachen, Germany 360 s 0s
FuncTion [34] 20 0
C. Urban, Paris, France 220 s 0s
T2 [7] 46 50
M. Brockschmidt, Cambridge, UK 80 s 64 s
Tan [23] 12 23 2
C. Wintersteiger, Oxford, UK 33 s 590 s 1
Ultimate Büchi [17] 57 117
M. Heizmann, Freiburg, Germany 250 s 4 800 s

Robustness, Soundness, and Completeness. The best tools of each category show that state-of-the-art verification technology significantly progressed
in terms of wrong verification results. Table 7 reports, in its last two columns,
the number of false alarms and missed bugs, respectively, for the best verifiers
in each category: There is a low number of false alarms (wrong bug reports),
which witnesses that verification technology can avoid wasted developer time
being spent on investigation of spurious bug reports. Also in terms of sound-
ness, the results look promising, considering that the most missed bugs (wrong
safety claims) were reported by bounded model checkers. In three categories, the
top-three verifiers did not report any wrong result.

Demonstration Categories. For the first time in SV-COMP, we performed experiments in demonstration categories, i.e., categories for which we wanted to
try out new applications of verification, new properties, or new rules. For the
demonstration categories, we neither rank the results nor assign awards.
Termination. Checking program termination is also an important objective of
software verification. We started with two sets of verification tasks: category
Termination-crafted is a community-contributed set of verification tasks that
were designed by verification researchers for the purpose of evaluating termina-
tion checkers (programs were collected from well-known papers in the area),
and category Termination-ext is a selection of verification tasks from exist-
ing categories for which the result was determined during the demonstration
runs.

Table 9. Re-verification of verification results using error witnesses; verification time in s of CPU time; path length in number of edges; expected result is ‘false’ in all cases

Verification task          Cbmc verification    Path length    CPAchecker re-verification
parport_false 37 179 11
eureka_01_false 0.36 42 64
Tripl.2.ufo.BOUNDED-10.pals.c 0.81 356 53
Tripl.2.ufo.UNBOUNDED.pals.c 0.80 355 44
gigaset.ko_false 44 140 120
tcm_vhost-ko–32_7a 26 197 62
vhost_net-ko–32_7a 21 89 72
si4713-i2c-ko–111_1a 430 75 12

Table 8 shows the results, which are promising: five teams participated, namely
AProVE 29, FuncTion 30, T2 31, Tan 32, and Ultimate Büchi 33.
Also, the quality of the termination checkers was extremely good: almost all
tools had no false positive (‘false alarms’, the verifier reported the program
would not terminate although it does) and no false negative (‘missed bug’, the
verifier reported termination but infinite looping is possible).
Device-Driver Challenge. Competitions are always looking for hard problems.
We received some unsolved problems from the LDV project 34 . Three teams
participated and could compute answers to 6 of the 15 problems: Cbmc found 3,
CPAchecker found 4, and Esbmc found 2 solutions to the problems.
Error-Witnesses. One of the objectives of program verification is to provide
a witness for the verification result. This is an open problem of verification
technology: there is no commonly supported witness format yet, and the verifiers
are not producing accurate witnesses that can be automatically assessed for
validity 35 . The goal of this demonstration category is to change this (restricted
to error witnesses for now): in cooperation with interested groups we defined a
format for error witnesses and the verifiers were asked to produce error paths in
that format, in order to validate their error paths with another verification tool.
Three tools participated in this category: Cbmc, CPAchecker, and Esbmc.
The demo revealed many interesting insights on practical issues of using a com-
mon witness format, serving as a test before introducing it as a requirement to
29 http://aprove.informatik.rwth-aachen.de
30 http://www.di.ens.fr/~urban/FuncTion.html
31 http://research.microsoft.com/en-us/projects/t2
32 http://www.cprover.org/termination/cta/index.shtml
33 http://ultimate.informatik.uni-freiburg.de/BuchiAutomizer
34 http://linuxtesting.org/project/ldv
35 There was research already on reusing previously computed error paths, but by the
same tool and in particular, using tool-specific formats: for example, Esbmc was
extended to reproduce errors via instantiated code [30], and CPAchecker was used
to re-check previously computed error paths by interpreting them as automata that
control the state-space search [5].

the next edition of the competition. We will report here only a few cases to show
how this technique can help. We selected a group of verification tasks (with ex-
pected verification result ‘false’) that Cbmc could solve, but CPAchecker was
not able to compute a verification result. We started CPAchecker again on the
verification task, now together with Cbmc’s error witness. Table 9 reports the
details of eight such runs: CPAchecker can prove the error witnesses of Cbmc
valid, although it could not find the bug in the program without the hints from
the witness. In some cases this is efficient (first and last row) and sometimes it
is quite inefficient: the matching algorithm needs improvement. The matching
is based purely on syntactical hints (sequence of tokens of the source program).
This technique of re-verifying a program with a different verification tool signifi-
cantly increases the confidence in the verification result (and makes false-alarms
unnecessary).

6 Conclusion
The third edition of the Competition on Software Verification had more partic-
ipants than before: the participation in the ‘official’ categories increased from
eleven to fifteen teams, and five teams took part in the demonstration on ter-
mination checking. The number of benchmark problems increased to a total of
2 868 verification tasks (excluding demonstration categories). The organizer and
the jury made sure that the competition follows the high quality standards of the
TACAS conference, in particular to respect the important principles of fairness,
community support, transparency, and technical accuracy.
The results showcase the progress in developing new algorithms and data
structures for software verification, and in implementing efficient tools for fully-
automatic program verification. The best verifiers have shown good quality in the
categories that they focus on, in terms of robustness, soundness, and complete-
ness. The participants represent a variety of general approaches — SMT-based
model checking, bounded model checking, symbolic execution, and program anal-
ysis showed their different, complementing strengths. Also, the SV-COMP repos-
itory of verification tasks has grown considerably: it now contains termination
problems and problems for regression verification [4], but also Horn clauses and
some Java programs in addition to C programs.
Acknowledgement. We thank K. Friedberger for his support during the evalua-
tion phase and for his work on the benchmarking infrastructure, the competition
jury for making sure that the competition is well-grounded in the community,
and the teams for making SV-COMP possible through their participation.

References
1. Beyer, D.: Competition on software verification (SV-COMP). In: Flanagan, C.,
König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 504–524. Springer, Heidelberg
(2012)
2. Beyer, D.: Second competition on software verification. In: Piterman, N., Smolka,
S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 594–609. Springer, Heidelberg (2013)

3. Beyer, D., Henzinger, T.A., Jhala, R., Majumdar, R.: The software model checker
Blast. Int. J. Softw. Tools Technol. Transfer 9(5-6), 505–525 (2007)
4. Beyer, D., Löwe, S., Novikov, E., Stahlbauer, A., Wendler, P.: Precision reuse for
efficient regression verification. In: Proc. ESEC/FSE, pp. 389–399. ACM (2013)
5. Beyer, D., Wendler, P.: Reuse of verification results - conditional model checking,
precision reuse, and verification witnesses. In: Bartocci, E., Ramakrishnan, C.R.
(eds.) SPIN 2013. LNCS, vol. 7976, pp. 1–17. Springer, Heidelberg (2013)
6. Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic model checking without BDDs.
In: Cleaveland, W.R. (ed.) TACAS 1999. LNCS, vol. 1579, pp. 193–207. Springer,
Heidelberg (1999)
7. Brockschmidt, M., Cook, B., Fuhs, C.: Better termination proving through cooper-
ation. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 413–429.
Springer, Heidelberg (2013)
8. Clarke, E.M., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided
abstraction refinement for symbolic model checking. J. ACM 50(5), 752–794 (2003)
9. Dudka, K., Peringer, P., Vojnar, T.: Predator: A shape analyzer based on symbolic
memory graphs (Competition contribution). In: Ábrahám, E., Havelund, K. (eds.)
TACAS 2014. LNCS, vol. 8413, pp. 412–414. Springer, Heidelberg (2014)
10. Ermis, E., Nutz, A., Dietsch, D., Hoenicke, J., Podelski, A.: Ultimate Kojak (Com-
petition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS,
vol. 8413, pp. 421–423. Springer, Heidelberg (2014)
11. Falke, S., Merz, F., Sinz, C.: LLBMC: Improved bounded model checking of C
programs using LLVM (Competition contribution). In: Piterman, N., Smolka, S.A.
(eds.) TACAS 2013. LNCS, vol. 7795, pp. 623–626. Springer, Heidelberg (2013)
12. Giesl, J., Schneider-Kamp, P., Thiemann, R.: AProVE 1.2: Automatic termina-
tion proofs in the dependency pair framework. In: Furbach, U., Shankar, N. (eds.)
IJCAR 2006. LNCS (LNAI), vol. 4130, pp. 281–286. Springer, Heidelberg (2006)
13. Graf, S., Saïdi, H.: Construction of abstract state graphs with Pvs. In: Grumberg,
O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 72–83. Springer, Heidelberg (1997)
14. Albarghouthi, A., Gurfinkel, A., Li, Y., Chaki, S., Chechik, M.: UFO: Verification
with interpolants and abstract interpretation. In: Piterman, N., Smolka, S.A. (eds.)
TACAS 2013. LNCS, vol. 7795, pp. 637–640. Springer, Heidelberg (2013)
15. Gurfinkel, A., Belov, A.: FrankenBit: Bit-precise verification with many bits (Com-
petition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS,
vol. 8413, pp. 408–411. Springer, Heidelberg (2014)
16. Heizmann, M., Christ, J., Dietsch, D., Hoenicke, J., Lindenmann, M., Musa, B.,
Schilling, C., Wissert, S., Podelski, A.: Ultimate automizer with unsatisfiable cores
(Competition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014.
LNCS, vol. 8413, pp. 418–420. Springer, Heidelberg (2014)
17. Heizmann, M., Hoenicke, J., Leike, J., Podelski, A.: Linear ranking for linear lasso
programs. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp.
365–380. Springer, Heidelberg (2013)
18. Henzinger, T.A., Jhala, R., Majumdar, R., McMillan, K.L.: Abstractions from
proofs. In: Proc. POPL, pp. 232–244. ACM (2004)
19. Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Lazy abstraction. In: Proc.
POPL, pp. 58–70. ACM (2002)
20. Inverso, O., Tomasco, E., Fischer, B., La Torre, S., Parlato, G.: Lazy-CSeq: A
lazy sequentialization tool for C (Competition contribution). In: Ábrahám, E.,
Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 398–401. Springer, Hei-
delberg (2014)

21. Jones, N.D., Muchnick, S.S.: A flexible approach to interprocedural data-flow anal-
ysis and programs with recursive data structures. In: POPL, pp. 66–74 (1982)
22. King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–
394 (1976)
23. Kröning, D., Sharygina, N., Tsitovich, A., Wintersteiger, C.M.: Termination anal-
ysis with compositional transition invariants. In: Touili, T., Cook, B., Jackson, P.
(eds.) CAV 2010. LNCS, vol. 6174, pp. 89–103. Springer, Heidelberg (2010)
24. Kröning, D., Tautschnig, M.: CBMC – C bounded model checker (Competition
contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413,
pp. 389–391. Springer, Heidelberg (2014)
25. Löwe, S., Mandrykin, M., Wendler, P.: CPAchecker with sequential combination
of explicit-value analyses and predicate analyses (Competition contribution). In:
Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 392–394.
Springer, Heidelberg (2014)
26. Morse, J., Ramalho, M., Cordeiro, L., Nicole, D., Fischer, B.: ESBMC 1.22 (Com-
petition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS,
vol. 8413, pp. 405–407. Springer, Heidelberg (2014)
27. Muller, P., Vojnar, T.: CPAlien: Shape analyzer for CPAChecker (Competition
contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413,
pp. 395–397. Springer, Heidelberg (2014)
28. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear rank-
ing functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp.
239–251. Springer, Heidelberg (2004)
29. Popeea, C., Rybalchenko, A.: Threader: A verifier for multi-threaded programs
(Competition contribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013.
LNCS, vol. 7795, pp. 633–636. Springer, Heidelberg (2013)
30. Rocha, H., Barreto, R., Cordeiro, L., Neto, A.D.: Understanding programming bugs
in ANSI-C software using bounded model checking counter-examples. In: Derrick,
J., Gnesi, S., Latella, D., Treharne, H. (eds.) IFM 2012. LNCS, vol. 7321, pp. 128–
142. Springer, Heidelberg (2012)
31. Shved, P., Mandrykin, M., Mutilin, V.: Predicate analysis with BLAST 2.7. In:
Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 525–527.
Springer, Heidelberg (2012)
32. Slaby, J., Strejček, J.: Symbiotic 2: More precise slicing (Competition contribution).
In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 415–417.
Springer, Heidelberg (2014)
33. Tomasco, E., Inverso, O., Fischer, B., La Torre, S., Parlato, G.: MU-CSeq: Sequen-
tialization of C programs by shared memory unwindings (Competition contribu-
tion). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp.
402–404. Springer, Heidelberg (2014)
34. Urban, C., Miné, A.: An abstract domain to infer ordinal-valued ranking functions.
In: Shao, Z. (ed.) ESOP 2014. LNCS, vol. 8410, pp. 412–431. Springer, Heidelberg
(2014)
CBMC – C Bounded Model Checker
(Competition Contribution)

Daniel Kroening1 and Michael Tautschnig2
1 University of Oxford, UK
2 Queen Mary University of London, UK

Abstract. CBMC implements bit-precise bounded model checking for C programs and has been developed and maintained for more than ten
years. CBMC verifies the absence of violated assertions under a given
loop unwinding bound. Other properties, such as SV-COMP’s ERROR
labels or memory-safety properties, are reduced to assertions via auto-
mated instrumentation. Only recently support for efficiently checking
concurrent programs, including support for weak memory models, has
been added. Thus, CBMC is now capable of finding counterexamples in
all of SV-COMP’s categories. As back end, the competition submission
of CBMC uses MiniSat 2.2.0.

1 Overview
The C Bounded Model Checker (CBMC) [2] demonstrates the violation of as-
sertions in C programs, or proves safety of the assertions under a given bound.
CBMC implements a bit-precise translation of an input C program, annotated
with assertions and with loops unrolled to a given depth, into a formula. If the
formula is satisfiable, then an execution leading to a violated assertion exists.
For SV-COMP, satisfiability of the formula is decided using MiniSat 2.2.0 [4].

2 Architecture
Bounded model checkers such as CBMC reduce questions about program paths
to constraints that can be solved by off-the-shelf SAT or SMT solvers. With the
SAT back end, and given a program annotated with assertions, CBMC outputs a
CNF formula the solutions of which describe program paths leading to assertion
violations. In order to do so, CBMC performs the following main steps, which
are outlined in Figure 1, and are explained below.

Front end. The command-line front end first configures CBMC according to
user-supplied parameters, such as the bit-width. The C parser utilises an off-
the-shelf C preprocessor (such as gcc -E) and builds a parse tree from the pre-
processed source. Source file- and line information is maintained in annotations.
Type checking populates a symbol table with type names and symbol identifiers
by traversing the parse tree. Each symbol is assigned bit-level type information.
CBMC aborts if any inconsistencies are detected at this stage.


Fig. 1. CBMC architecture (command-line front end, C parser, type checking, GOTO conversion, static analysis & instrumentation, symbolic execution, CNF conversion, SAT solver, counterexample analysis)

Intermediate Representation. CBMC uses GOTO programs as intermediate representation. In this language, all non-linear control flow, such as if or switch-
statements, loops and jumps, is translated to equivalent guarded goto statements.
These statements are branch instructions that include (optional) conditions.
CBMC generates one GOTO program per C function found in the parse tree. Fur-
thermore, it adds a new main function that first calls an initialisation function for
global variables and then calls the original program entry function.
At this stage, CBMC performs a light-weight static analysis to resolve function
pointers to a case split over all candidate functions, resulting in a static call
graph. Furthermore, assertions to guard against invalid pointer operations or
memory leaks are inserted.
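As a rough illustration of this representation (schematic only; CBMC's concrete GOTO syntax and label handling differ), a structured conditional corresponds to guarded branches:

  void example(int x) {
    int y;
    /* source statement: if (x > 0) y = 1; else y = 2; */
    if (!(x > 0)) goto ELSE_BRANCH;  /* guarded branch with (negated) condition */
    y = 1;
    goto JOIN;                       /* unconditional branch                    */
  ELSE_BRANCH:
    y = 2;
  JOIN:
    (void)y;
  }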

Middle end. CBMC performs symbolic execution by eagerly unwinding loops up to a fixed bound, which can be specified by the user on a per-loop basis or
globally, for all loops. In the course of this unwinding step, CBMC also translates
GOTO statements to static single assignment (SSA) form. Constant propagation
and expression simplification are key to efficiency, and prevent exploration of
certain infeasible branches. At the end of this process the program is represented
as a system of equations over renamed program variables in guarded statements.
The guards determine whether an assignment is actually performed in a given
concrete program execution. In [1] we presented an extension to perform efficient
bounded model checking of concurrent programs, which symbolically encodes
partial orders over read and write accesses to shared variables.
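A small schematic example of this step (illustrative only; the renaming scheme shown here is simplified): for the fragment below with an unwinding bound of 2, unwinding and SSA renaming yield one fresh variable per assignment and one guard per unwound iteration.

  #include <assert.h>

  void example(void) {
    int x = 0;
    while (x < 2)   /* assume an unwinding bound of 2 for this illustration */
      x = x + 1;
    assert(x == 2);
  }
  /* Schematic result of unwinding and SSA renaming:
       x1 = 0
       g1 = (x1 < 2);   x2 = g1 ? x1 + 1 : x1
       g2 = (x2 < 2);   x3 = g2 ? x2 + 1 : x2
     property: x3 == 2; the guards g1 and g2 record whether each assignment
     is actually performed on a concrete execution.                          */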

Back end. While CBMC also supports SMT solvers as back ends, we use Min-
iSat 2.2.0 in this competition. Consequently, the resulting equation is translated
into a CNF formula by bit-precise modelling of all expressions plus the Boolean
guards [3]. A model computed by the SAT solver corresponds to a path violat-
ing at least one of the assertions in the program under scrutiny, and the model
is translated back to a sequence of assignments to provide a human-readable
counterexample. Conversely, if the formula is unsatisfiable, no assertion can be
violated within the given unwinding bounds.

3 Strengths and Weaknesses


As a bounded model checker, and in absence of additional loop transforma-
tions or k-induction, CBMC cannot provide proofs of correctness for programs
with unbounded loops in general. Yet we decided to enforce termination with a TRUE/FALSE answer within the time bounds specified in SV-COMP to
provide best-effort answers. Consequently there may be unsound results on cer-
tain benchmarks. To reduce the number of such results, the wrapper script (see
below) runs CBMC with increasing loop bounds of 2, 6, 12, 17, 21, and 40 until
the timeout is reached. These values were obtained as educated guesses informed
by the training phase.
Apart from this fundamental limitation, we observed several errors (both false
positives and false negatives) caused by current limitations in treatment of point-
ers. This affects at least one benchmark in the Concurrency category and possibly
several in MemorySafety.
The strengths of bounded model checking, on the other hand, are its predict-
able performance and amenability to the full spectrum of categories.

4 Tool Setup
The competition submission is based on CBMC version 4.5. The full source code
of the competing version is available at
http://svn.cprover.org/svn/cbmc/releases/cbmc-4.5-sv-comp-2014/.
To process a benchmark FOO.c (with properties in FOO.prp), the script
cbmc-wrapper.sh should be invoked as follows:
cbmc-wrapper.sh --propertyfile FOO.prp --32 FOO.c
for all categories with a 32-bit memory model; for those with a 64-bit memory
model, --32 should be replaced by --64.

5 Software Project
CBMC is maintained by Daniel Kroening with patches supplied by the com-
munity. It is made publicly available under a BSD-style license. The source code
and binaries for popular platforms are available at http://www.cprover.org/cbmc.

References
1. Alglave, J., Kroening, D., Tautschnig, M.: Partial orders for efficient bounded model
checking of concurrent software. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS,
vol. 8044, pp. 141–157. Springer, Heidelberg (2013)
2. Clarke, E., Kroening, D., Lerda, F.: A tool for checking ANSI-C programs. In:
Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 168–176. Springer,
Heidelberg (2004)
3. Clarke, E.M., Kroening, D., Yorav, K.: Behavioral consistency of C and Verilog
programs using Bounded Model Checking. In: DAC, pp. 368–371 (2003)
4. Eén, N., Sörensson, N.: An extensible SAT-solver. In: Giunchiglia, E., Tacchella, A.
(eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)
CPAchecker with Sequential Combination
of Explicit-Value Analyses
and Predicate Analyses
(Competition Contribution)

Stefan Löwe1, Mikhail Mandrykin2, and Philipp Wendler1
1 University of Passau, Germany
2 Institute for System Programming of Russian Academy of Science, Russia

Abstract. CPAchecker is a framework for software verification, built on the foundations of Configurable Program Analysis (CPA). For
the SV-COMP’14, we file a CPAchecker configuration that runs up to
five analyses in sequence. The first two analyses of our approach utilize
the explicit-value domain for modeling the state space, while the remain-
ing analyses are based on predicate abstraction. In addition to that, a
bit-precise counterexample checker comes into action whenever an anal-
ysis finds a counterexample. The combination of conceptually different
analyses is key to the success of our verification approach, as the diversity
of verification tasks is taken into account.

1 Software Architecture
CPAchecker, which is built on the foundations of Configurable Program
Analysis (CPA), strives for high extensibility and reuse. As such, auxiliary anal-
yses, such as tracking the program counter, modeling the call stack, and keeping
track of function pointers, all of which is required for virtually any verification
tool, are implemented as independent CPAs. The same is true for the main anal-
yses, such as, e.g., the explicit-value analysis and the analysis based on predicate
abstraction, which are also available as decoupled CPAs within CPAchecker.
All these CPAs can be enabled and flexibly recombined on a per-demand basis
without the need of changing adjacent CPAs. Other algorithms, like CEGAR,
counterexample checks, parallel or sequential combinations of analyses, as the
one being filed to this year’s SV-COMP’14, can be plugged together by simply
passing the according configuration options to the CPAchecker framework.
CPAchecker, which is written in Java, uses the C parser of the Eclipse CDT
project1 , and MathSAT52 for solving SMT formulae and interpolation queries.

1 http://www.eclipse.org/cdt/
2 http://mathsat.fbk.eu/

Fig. 1. Overview of the sequential combination used for reachability problems

2 Verification Approach

CPAchecker gets as input a specification and the source of a C program, which


is then transformed into a control flow automaton (CFA) of the input program.
During the analysis, this CFA is traversed, gradually building the abstract reach-
ability graph (ARG). The nodes of the ARG represent the reachable states of the
program, containing all relevant information, such as the program counter, the
call stack and the information collected by the main CPAs, like explicit variable
assignments or boolean combinations of predicates about program variables.
For reachability problems, we use a sequential combination [1] of up to five
analyses using explicit-value analysis and predicate abstraction. The general ap-
proach of our sequential combination is as follows. Once any analysis in the
sequence reports the verdict true, this result is returned. In case a counterexam-
ple is found and validated by a subsequent counterexample check, the verdict
false is returned. If the counterexample is found to be spurious, or when the
current analysis reaches a predefined time limit, the next analysis takes over.
The sequence starts with an explicit-value analysis without abstraction or re-
finement for 20 seconds. The motivation here is that many control-flow-intense
programs can be solved with this approach in very little time. However, this
simple analysis easily falls prey to state-space explosion. This is why a more
sophisticated analysis of the same domain, including an abstract-refine loop [3],
is started in case the first one does not come up with a result. Next in line are
three analyses using predicate abstraction with adjustable block encoding [2].
The switch to conceptually different analyses is motivated by the fact that
different programs have different characteristics. The third and
fourth analyses model program variables as real variables and use only linear
arithmetic. The first of these two configurations computes predicate abstrac-
tions only at loop heads (ABE-l) and runs for at most ten minutes. The second
one additionally abstracts at function call and return sites (ABE-lf), and shows
different performance characteristics. The final analysis, a bit-precise predicate
analysis, is used if all previous analyses failed to provide a result (reasoning
about bit vectors is too expensive to use it on all programs). In addition, an
analysis similar to the last one, but lacking the abstract-refine loop, checks fi-
nite counterexamples found by any of the previously mentioned analyses. The
bounded model checker CBMC3 is used to check counterexamples of the last
3 http://www.cprover.org/cbmc

analysis for an even higher confidence in the result. For checking memory safety
properties, we use a bounded analysis consisting of concrete memory graphs in
combination with an instance of the explicit-value analysis mentioned above.

3 Strengths and Weaknesses


Similarly to our submissions of previous years, though far more sophisticated, the key
idea of the submitted configuration is the combination of conceptually different
analyses. In addition, the predicate analysis now has support for bit vectors, and
also allows for more precise and efficient support for pointer aliasing by encod-
ing possibly aliased memory locations with uninterpreted functions. However,
CPAchecker lacks support for multi-threaded or recursive programs. Efficient
tracking of heap memory remains an issue, yet solvable, e.g., by summarization.

4 Setup and Configuration


CPAchecker is available at http://cpachecker.sosy-lab.org . The submitted
version is 1.2.11-svcomp14b. The command line for running CPAchecker is
scripts/cpa.sh -sv-comp14 -disable-java-assertions -heap 10000m -spec property.prp program.i

Please add the parameter -64 for C programs assuming a 64-bit environment.
For machines with less RAM, the amount of memory given to the Java VM
needs to be adjusted with the parameter -heap. CPAchecker will print the
verification result and the name of the output directory to the console. Additional
information (such as the error path) will be written to files in this directory.

5 Project and Contributors


CPAchecker is an open-source project led by Dirk Beyer from the Software
Systems Lab at the University of Passau. Several other research groups use and
contribute to CPAchecker, such as the Institute for System Programming of the
Russian Academy of Sciences, the University of Paderborn and the University
of Technology in Brno. We would like to thank all contributors for their work on
CPAchecker. The full list can be found at http://cpachecker.sosy-lab.org .

References
1. Beyer, D., Henzinger, T.A., Keremoglu, M.E., Wendler, P.: Conditional model check-
ing: A technique to pass information between verifiers. In: Proc. FSE. ACM (2012)
2. Beyer, D., Keremoglu, M.E., Wendler, P.: Predicate abstraction with adjustable-
block encoding. In: Proc. FMCAD, pp. 189–197, FMCAD (2010)
3. Beyer, D., Löwe, S.: Explicit-state software model checking based on CEGAR and
interpolation. In: Cortellessa, V., Varró, D. (eds.) FASE 2013. LNCS, vol. 7793, pp.
146–162. Springer, Heidelberg (2013)
CPA LIEN: Shape Analyzer for CPAChecker
(Competition Contribution)

Petr Muller and Tomáš Vojnar

FIT, Brno University of Technology, IT4Innovations Centre of Excellence, Czech Republic

Abstract. CPA LIEN is a configurable program analysis framework instance. It uses an extension of the symbolic memory graphs (SMGs) abstract domain for
shape analysis of programs manipulating the heap. In particular, CPA LIEN ex-
tends SMGs with a simple integer value analysis in order to handle programs
with both pointers and integer data. The current version of CPA LIEN is an early
prototype intended as a basis for a future research in the given area. The version
submitted for SV-COMP’14 does not contain any shape abstraction, but it is still
powerful enough to participate in several categories.

1 Verification Approach

CPA LIEN is an analyzer of pointer manipulating programs written in the C language. It intends to handle industrial, often highly optimized code. CPA LIEN is able to detect
common memory manipulation errors like invalid dereferences, invalid deallocations,
and memory leaks.
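For illustration, a small made-up example of the kind of defect CPA LIEN targets: the only pointer to an allocated node is overwritten before the node is freed, so the program leaks memory.

  #include <stdlib.h>

  struct node { int data; struct node *next; };

  int main(void) {
    struct node *head = malloc(sizeof *head);
    if (head != NULL) {
      head->data = 0;
      head->next = NULL;
    }
    head = NULL;   /* the only pointer to the allocated node is lost: a memory leak */
    return 0;
  }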
CPA LIEN is an offspring of the successful Predator shape analyzer [1]. Predator
implements a sound shape analysis of programs manipulating list-like data structures
of various kinds. While Predator’s ability to handle programs with complex lists is great
(as witnessed by the tool winning gold medals in the appropriate categories in the first
two SV-COMP competitions), we were unsuccessful with extending Predator to handle
other data structures than lists and to also handle data other than pointers.
Therefore, we decided to redesign Predator’s abstract domain of Symbolic Memory
Graphs (SMGs) within the extensible framework of CPAChecker, another successful
verification framework [4], which, however, so far lacked a support for shape analysis.
Consequently, CPA LIEN is implemented as an extension of CPAChecker and hence as
an instance of the underlying Configurable Program Analysis (CPA) [3] framework.
Compared with the use of SMGs in Predator [2], the abstract domain of CPA LIEN
does not yet use any shape abstractions, which means the analysis will not terminate on
programs building unbounded dynamic data structures (unless an error is found). On
the other hand, CPA LIEN combines usage of SMGs with a simple integer value anal-
ysis. Where possible, integer values are tracked explicitly for variables. When explicit
values are not available, we infer information about value equality or nonequality from
assumptions. The combination of this light-weight explicit integer value analysis and pointer analysis based on SMGs works well for enough test cases from the SV-COMP benchmark to get a positive score in categories where we participate.
(This work was supported by the Czech Science Foundation project 14-11384S and the EU/Czech IT4Innovations Centre of Excellence project CZ.1.05/1.1.00/02.0070.)
The CPA framework allows one to merge the encountered states to reduce the gen-
erated state space. This feature is, however, not used in CPA LIEN. To compute the cov-
ering relation, which is used by the high-level CPA reachability algorithm to determine
the end of the state space search, CPA LIEN uses the SMG join operation. CPA LIEN
also uses several specialized helper analyses provided by the CPAChecker framework
to deal with certain specific tasks. These helper analyses are the Location, CallStack,
and FunctionPointer CPAs.

2 Software Architecture
CPA LIEN builds upon the CPAChecker framework for implementation, execution, and
combination of instances of the CPA formalism. CPAChecker implements a reachabil-
ity analysis algorithm over a generic CPA and also provides several other algorithms.
CPA LIEN is an implementation of a CPA instance, consisting of the abstract domain
definition and the transfer relation between the states. Symbolic execution is driven by
CPAChecker. CPAChecker also provides a C language parsing capability, wrapping a C
parser present in the Eclipse CDT. Both CPAChecker and CPA LIEN are written in Java.

3 Strengths and Weaknesses


A general strength of CPA LIEN comes from its implementation within the generic CPAChecker
framework, which offers potential for future combinations of the SMG-based
shape analysis with other analyses.
Currently, CPA LIEN is, however, mainly focused on heap manipulating programs
as its integer value analysis plays just a supporting role without an ambition to handle
harder problems. Moreover, CPA LIEN is an early prototype, and it so far lacks any
shape abstraction. Therefore, CPA LIEN does not terminate on many of the benchmark
test cases from the Memory Safety category for which CPA LIEN is suited otherwise.
For the Heap Manipulation category, the results are better: there are significantly more
correct answers, with just a few timeouts and only a single false positive reported. Even
the simple integer value analysis combined with the SMG domain managed to provide
a correct answer for many test cases from the Control Flow and Integer Value category,
especially those in the Product Lines sub-category.
Generally, the results correspond with the prototype status of the tool. Apart from the
already mentioned missing abstraction, the tool still has many implementation issues.
It also has deficiencies in handling some C language elements, like implicit type con-
versions. Another roadblock is CPA LIEN’s handling of external functions (functions
with the body unavailable to the verifier). CPA LIEN takes a stance that any unknown
function can contain incorrect code; therefore, the memory safety of programs calling
unknown functions cannot be proved. An UNKNOWN answer is given for these cases.
Therefore, CPA LIEN’s results could be improved by modeling the common C library
functions, because many programs use them.
With these limitations being reflected by the results, we still argue that after their
resolution, CPA LIEN will form a promising base for further research on shape analysis

and its integration with other specialized analyses, providing heap analysis capabilities
still missing in the CPAChecker ecosystem.
4 Tool Setup and Configuration
CPA LIEN is available online at the project page:
http://www.fit.vutbr.cz/research/groups/verifit/tools/cpalien/
It is a modified version of the upstream CPAChecker, containing code not yet present
in the upstream repository. For the participation in the competition, we have prepared a
tarball. The only dependency needed to run CPA LIEN is Java version 7.
For running the verifier, we have prepared a wrapper script to provide the output
required by the competition rules. The script is run in the following way:
$ ./cpalien.sh target_program.c
Upon completion, a single line with the answer is provided. More information about
the verification result, such as the error path, is provided in the output directory. The
tool does not adhere to competition requirements with respect to property files: it does
not allow a property file to be passed as a parameter. This was caused by our incorrect
reading of the requirements. The property file is expected to be present in the same
directory as the verification task.
CPA LIEN participates in the Heap Manipulation, Memory Safety and Control Flow
and Integer Variable categories. We opt out from the remaining ones.
5 Software Project and Contributors
CPA LIEN is an extension of the CPAChecker project, building on the CPAChecker
heavily. CPA LIEN is developed by the VeriFIT 1 group at the Brno University of Tech-
nology. A significant part of the SMG code was contributed by Alexander Driemeyer
from University of Passau, whom we would like to thank. CPAChecker is a project
developed mainly by the Software Systems Lab2 at the University of Passau. Both
CPA LIEN and CPAChecker are distributed under the Apache 2.0 license.
References
1. Dudka, K., Peringer, P., Vojnar, T.: Predator: A Practical Tool for Checking Manipulation of
Dynamic Data Structures Using Separation Logic. In: Gopalakrishnan, G., Qadeer, S. (eds.)
CAV 2011. LNCS, vol. 6806, pp. 372–378. Springer, Heidelberg (2011)
2. Dudka, K., Peringer, P., Vojnar, T.: Byte-Precise Verification of Low-Level List Manipulation.
In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 215–237. Springer,
Heidelberg (2013)
3. Beyer, D., Henzinger, T.A., Théoduloz, G.: Configurable Software Verification: Concretizing
the Convergence of Model Checking and Program Analysis. In: Damm, W., Hermanns, H.
(eds.) CAV 2007. LNCS, vol. 4590, pp. 504–518. Springer, Heidelberg (2007)
4. Beyer, D., Keremoglu, M.E.: CPACHECKER: A Tool for Configurable Software Verification.
In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 184–190. Springer,
Heidelberg (2011)

1 http://www.fit.vutbr.cz/research/groups/verifit/
2 http://www.sosy-lab.org/
Lazy-CSeq: A Lazy Sequentialization Tool for C 
(Competition Contribution)

Omar Inverso1, Ermenegildo Tomasco1, Bernd Fischer2, Salvatore La Torre3, and Gennaro Parlato1
1 Electronics and Computer Science, University of Southampton, UK
2 Division of Computer Science, Stellenbosch University, South Africa
3 Dipartimento di Informatica, Università degli Studi di Salerno, Italy
{oi2c11,et1m11,gennaro}@ecs.soton.ac.uk, [email protected], [email protected]

Abstract. We describe a version of the lazy sequentialization schema by La Torre, Madhusudan, and Parlato that is optimized for bounded programs, and
avoids the re-computation of the local state of each process at each context switch.
Lazy-CSeq implements this sequentialization schema for sequentially consistent
C programs using POSIX threads. Experiments show that it is very competitive.

1 Introduction
Sequentialization translates concurrent programs into (under certain assumptions) equiv-
alent non-deterministic sequential programs and so reduces concurrent verification to
its sequential counterpart. The widely used (e.g., in CSeq [2,3] or Rek [1]) sequential-
ization schema by Lal and Reps (LR) [6] considers only round-robin schedules with
K rounds, which bounds the number of context switches between the different threads.
LR first replaces the shared global memory by K indexed copies. It then executes the
individual threads to completion, simulating context switches by non-deterministically
incrementing the index. The first thread works with the initial memory guesses, while
the remaining threads work with the values left by their predecessors. The initial guesses
are also stored in a second set of copies; after all threads have terminated these are used
to ensure consistency (i.e., the last thread has ended its execution in each round with
initial guesses for the next round).
LR explores a large number of configurations unreachable by the concurrent pro-
gram, due to the completely non-deterministic choice of the global memory copies and
the late consistency check. The lazy sequentialization schema by La Torre, Madhu-
sudan, and Parlato (LMP) [4,5] avoids this non-determinism, but at each context switch
it re-computes from scratch the local state of each process. This can lead to verifica-
tion conditions of exponential size when constructing the formula in a bounded model
checking approach (due to function inlining). However, for bounded programs this re-
computation can be avoided and the sequentialized program can instead jump to the
context switch points. Lazy-CSeq implements this improved bounded LMP schema
(bLMP) for sequentially consistent C programs that use POSIX threads.

This work was partially funded by the MIUR grant FARB 2011-2012, Università degli Studi
di Salerno (Italy).


2 Verification Approach

Overview. bLMP considers only round-robin schedules with K rounds. It further as-
sumes that the concurrent program (and thus in particular the number of possible threads)
is bounded and that all jumps are forward jumps, which are both enforced in Lazy-CSeq
by unrolling. Unlike LR, however, bLMP does not run the individual threads to comple-
tion in one fell swoop; instead, it repeatedly calls the sequentialized thread functions in
a round-robin fashion. For each thread it maintains the program locations at which the
previous round’s context switch has happened and thus the computation must resume
in the next round. The sequentialized thread functions then jump (in multiple hops)
back to these stored locations. bLMP also keeps the thread-local variables persistent (as
static) and thus, unlike the original LMP, does not need to re-compute their values
from saved copies of previous global memory states before it resumes the computation.

Data Structures. bLMP only stores and maintains, for each thread, a flag denoting
whether the thread is active, the thread’s original arguments, and an integer denoting the
program location at which the previous context switch has happened. Since it does not
need any copy of the shared global memory, heap allocation needs no special treatment
during the sequentialization and can be delegated entirely to the backend model checker.
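A minimal sketch of this bookkeeping, using the array names that also appear in the driver snippet below (the bound and the element types are illustrative, not Lazy-CSeq's actual declarations):

  #define MAX_THREADS 4                 /* illustrative bound on thread instances          */
  unsigned char active_tr[MAX_THREADS]; /* thread created and not yet terminated?          */
  void         *thr_args[MAX_THREADS];  /* original argument of each thread function       */
  unsigned int  pc[MAX_THREADS];        /* location of the previous round's context switch */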

Main Driver. The sequentialized program’s main function orchestrates the analysis.
It consists of a sequence of small code snippets, one for each thread and each round, that check the thread's active flag (maintained by Lazy-CSeq's implementation of the pthread_create and pthread_join functions) and, if this is set, non-deterministically increment the next context switch point pc_cs (which must be smaller than the thread's size), call the sequentialized thread function with the original arguments, and store the context switch point for the next round:

  if (active_tr[thr_idx] == 1) {
    pc_cs = pc[thr_idx] + nondet_uint();
    assume(pc_cs <= SIZE_<thr_idx>);
    thread_<thr_idx>(thr_args[thr_idx]);
    pc[thr_idx] = pc_cs;
  }

Lazy-CSeq obtains from the unrolling phase the set of thread instances that the original concurrent program can possibly create within the given bounds. This allows the static construction of the main driver. Note that the choice of the context switch points in the driver is the only additional non-determinism introduced by the sequentialization.
Thread Translation. The sequentialized program also contains a function for each
thread instance (including the original main) identified during the unrolling phase.
Within the function each statement is guarded by a check whether its location is be-
fore the stored location or after the next context switch non-deterministically chosen
by the driver. In the former case, the statement has already been executed in a previ-
ous round, and the simulation jumps ahead one hop; in the latter case, the statement
will be executed in a future round, and the simulation jumps to the thread’s exit. Each
jump target (corresponding either directly to a goto label or indirectly to a branch of
an if statement) is also guarded by an additional check to ensure that the jump does
not jump over the context switch. Since bLMP only explores states reachable in the
original concurrent program, assert statements need no special treatment during the
sequentialization and can be delegated entirely to the backend model checker.
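To make this concrete, the following sketch (ours, not the exact code emitted by Lazy-CSeq) shows how a single statement at location 1 of a thread function could look after the translation; pc, pc_cs, thr_idx, and the labels are illustrative names carried over from the driver snippet above.

    /* Illustrative sketch only: one guarded statement of a sequentialized
       thread function; all identifiers are hypothetical.                    */
    extern unsigned int pc[], pc_cs, thr_idx;

    void thread_0(void *arg)
    {
        static int x;                        /* thread-local state kept persistent      */
        (void)arg;
        if (1 < pc[thr_idx]) goto loc_2;     /* already executed in an earlier round    */
        if (1 >= pc_cs)      goto thr_exit;  /* belongs to a future round: jump to exit */
        x = x + 1;                           /* the original statement at location 1    */
    loc_2:
    thr_exit:
        return;
    }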

3 Architecture, Implementation, and Availability


Architecture. Lazy-CSeq is implemented as a source-to-source transformation tool in
Python (v2.7.1). Like CSeq [2,3] and MU-CSeq [7] it uses the pycparser (v2.10,
github.com/eliben/pycparser) to parse a C program into an abstract syn-
tax tree (AST). However, in order to produce the right jump targets Lazy-CSeq un-
rolls all loops and replicates the thread functions. The sequentialized program can then
be processed independently by any sequential verification tool for C. Lazy-CSeq has
been tested with CBMC (v4.5, www.cprover.org/cbmc/) and ESBMC (v1.22,
www.esbmc.org).
A small wrapper script bundles up translation and verification. It also invokes Lazy-
CSeq repeatedly, with the parameters -f2 -w2 -r2 -d135, -f4 -w4 -r1 -d145,
-f16 -w1 -r1 -d220, and -f11 -w1 -r11 -d150. Here f and w are the unwind
bound for for (i.e. bounded) and while (i.e. potentially unbounded) loops, respec-
tively, r is the number of rounds, and d is the depth option for the backend. We leave the
analysis running to completion every time, without timeouts or memory limits. When
the result is TRUE, the script restarts the analysis with the next set of parameters. As
soon as the script gets FALSE, it returns FALSE. Only if the analysis with the last set of
parameters finishes with the result TRUE does the script return TRUE.

Availability and Installation. Lazy-CSeq can be downloaded from
http://users.ecs.soton.ac.uk/gp4/cseq/lazy-cseq-0.1.zip; it also requires installation
of the pycparser. It can be installed as a global Python script. In the competition
we only used CBMC as a sequential verification backend; this must be installed in the
same directory as Lazy-CSeq.

Call. Lazy-CSeq should be called in the installation directory as follows:


lazy-cseq.py -i<file> --spec<specfile> --witness<logfile>
Strengths and Weaknesses. Since Lazy-CSeq is not a full verification tool but only a
concurrency pre-processor, we only competed in the Concurrency category. Here it
achieved a perfect score.

References
1. Chaki, S., Gurfinkel, A., Strichman, O.: Time-bounded analysis of real-time systems. In: FM-
CAD, pp. 72–80 (2011)
2. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Sequentialization Tool for C (Competition Con-
tribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 616–618.
Springer, Heidelberg (2013)
3. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Concurrency Pre-Processor for Sequential C
Verification Tools. In: ASE, pp. 710–713 (2013)

4. La Torre, S., Madhusudan, P., Parlato, G.: Reducing context-bounded concurrent reachability
to sequential reachability. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp.
477–492. Springer, Heidelberg (2009)
5. La Torre, S., Madhusudan, P., Parlato, G.: Sequentializing parameterized programs. In: FIT,
EPTCS 87, pp. 34–47 (2012)
6. Lal, A., Reps, T.W.: Reducing concurrent analysis under a context bound to sequential analy-
sis. Formal Methods in System Design 35(1), 73–97 (2009)
7. Tomasco, E., Inverso, O., Fischer, B., La Torre, S., Parlato, G.: MU-CSeq: Sequentialization
of C Programs by Shared Memory Unwindings (Competition Contribution). In: Ábrahám,
E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 402–404. Springer, Heidelberg
(2014)
MU-CSeq: Sequentialization of C Programs
by Shared Memory Unwindings
(Competition Contribution)

Ermenegildo Tomasco1, Omar Inverso1, Bernd Fischer2, Salvatore La Torre3, and Gennaro Parlato1
1 Electronics and Computer Science, University of Southampton, UK
2 Division of Computer Science, Stellenbosch University, South Africa
3 Dipartimento di Informatica, Università di Salerno, Italy
{et1m11,oi2c11,gennaro}@ecs.soton.ac.uk, [email protected], [email protected]

Abstract. We implement a new sequentialization algorithm for multi-threaded C


programs with dynamic thread creation as a new CSeq module. The novel basic
idea of this algorithm is to fix (by a nondeterministic guess) the sequence of write
operations in the shared memory and then simulate the behavior of the program
according to any scheduling that respects this choice. Simulation is done thread-
by-thread and the thread creation mechanism is replaced by function calls.

1 Introduction
Sequentialization translates a concurrent program into a corresponding sequential one
while preserving a given verification property (e.g., reachability). The idea is to reuse
in the domain of concurrent programs the technology developed for the analysis of se-
quential programs. This simplifies and speeds up the development of robust tools for
concurrent programs. It also allows the designers to focus only on the concurrency
aspects and provides them with a framework in which they can quickly check the effec-
tiveness of their solutions. A sequentialization tool can be designed as a front-end for a
number of analysis tools that share the same input language, and thus many alternatives
are immediately available.
We design a new sequentialization algorithm for multi-threaded C programs with
dynamic thread creation. Its main novelty is the idea of memory unwinding (MU). We
fix (by a nondeterministic guess) the sequence of write operations in the shared mem-
ory and then simulate the behavior of the program according to any scheduling that
respects this choice. We can then use the number of writes in the shared memory as a
parameter of the bounded analysis, which is orthogonal to the number of context switches
underlying previous research on sequentializations based on
the notion of bounded context-switching (e.g., [10,6,7,2,1,8,9]). Moreover, MU-CSeq
naturally accommodates the simulation of dynamic thread creation by function calls.
We implement MU-CSeq as a new module of the tool CSeq [3,4]. Other modules of
CSeq implement the Lal/Reps algorithm [6] and a lazy sequentialization scheme aimed
at exploiting bounded model checking [5].

This work was partially funded by the MIUR grant FARB 2011-2012, Università degli Studi
di Salerno (Italy).


2 Verification Approach
Overview. MU-CSeq translates a multi-threaded C program P into a standard C program P’.
The source-to-source translation is parameterized over the number of writes
Nw in the shared memory and the maximum number of threads Nt . The overall scheme
consists of guessing a sequence σ of Nw writes and then simulating any execution of P
that matches σ. The simulation is done thread-by-thread, starting from the original main
function; when a new thread is created the simulation of the current thread is suspended
until the simulation of the new thread has ended. When the number of threads passes
the bound Nt , each new thread creation operation is just ignored.
Modules of P’. The main function of P’ is in charge of guessing a consistent sequence
of writes σ and starting the simulation of P. P’ has a function for each function (including
the main) and each thread of P. The translation of the modules of P into the corresponding
modules of P’ consists of: 1) adding a few lines of control code to handle creation and
execution of threads, and 2) replacing the reads and writes in the shared memory with
calls to read and write functions, respectively.
Guessing the Sequence of Writes. We use a global two-dimensional array mem that
corresponds to the temporal unwinding of the shared memory according to the memory
updates. Here, each column corresponds to an updating event (i.e., a write) in σ and
each row corresponds to a variable. The entry mem[i,j] contains the value of the i-th
shared variable after the j-th write in σ. We use a second global array sigma to store
for each write the involved variable and the thread that has executed the write. To guess
the writes, we assign non-deterministic values to these arrays. The main function of P’
then uses assume statements to check the consistency of the values stored in the guessed
arrays before starting the simulation of P .
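A minimal sketch of these data structures and of the guessing code is given below. It is our illustration of the description above: the names N_SHARED, NW, nondet_int, and assume are assumed, and the consistency condition shown (a write changes only its own variable) is one natural choice that need not coincide with the exact checks generated by MU-CSeq.

    /* Illustrative sketch only: shared-memory unwinding with NW guessed writes. */
    #define N_SHARED 2                 /* number of shared variables (assumed)   */
    #define NW       10                /* bound on the number of writes (Nw)     */

    int  nondet_int(void);
    void assume(int cond);

    int mem[N_SHARED][NW + 1];                       /* mem[v][j]: value of v after write j   */
    struct { int var; int thread; } sigma[NW + 1];   /* variable and thread of the j-th write */

    void guess_writes(void)
    {
        for (int j = 1; j <= NW; j++) {
            sigma[j].var    = nondet_int();
            sigma[j].thread = nondet_int();
            for (int v = 0; v < N_SHARED; v++)
                mem[v][j] = nondet_int();
            /* one natural consistency check: write j changes only sigma[j].var */
            for (int v = 0; v < N_SHARED; v++)
                if (v != sigma[j].var)
                    assume(mem[v][j] == mem[v][j - 1]);
        }
    }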
Accessing Global Memory. On executing each thread t, we store in a variable thr_pos
the index of the last executed write in σ. This variable is updated by read and write.
On calling write for the assignment x=e, thr_pos is updated to the corresponding index
and then mem[x,thr_pos]=e is checked. On calling read for variable x, first thr_pos is
nondeterministically updated to any index between its current value and the next write
in σ by t, and then mem[x,thr_pos] is returned.
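The access functions could then be sketched as follows (again our illustration, building on the declarations of the previous sketch; the helper next_write_of, which returns the index of the given thread's next write in σ, is hypothetical).

    /* Illustrative sketch only: read/write wrappers over the unwound memory.  */
    unsigned thr_pos;                                /* last executed write of the
                                                        currently simulated thread */
    unsigned next_write_of(int thr, unsigned from);  /* hypothetical helper        */
    unsigned nondet_uint(void);

    void write(int thr, int var, int val)
    {
        thr_pos = next_write_of(thr, thr_pos);       /* the corresponding index    */
        assume(sigma[thr_pos].var == var && sigma[thr_pos].thread == thr);
        assume(mem[var][thr_pos] == val);            /* guessed value must match   */
    }

    int read(int thr, int var)
    {
        unsigned next = next_write_of(thr, thr_pos);
        thr_pos += nondet_uint();
        assume(thr_pos < next);                      /* anywhere before thr's next write */
        return mem[var][thr_pos];
    }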
Thread Creation and Execution. Thread creation and execution are implemented as
function calls in P’. Thus, if a thread t2 is created from a thread t1, the simulation of t1
stops until the call to t2 has terminated. Before the simulation of t2 starts, the current
value of thr_pos is stored in a local variable such that when t2 has terminated, the
simulation of t1 restarts from this index. Accordingly, the simulation of t2 starts from
the current value of thr_pos. When either the last statement of thread t2 is reached, or
a write after the last guessed write for t2 is executed, or an index greater than Nw is
guessed for a read, then all the calls of thread t2 are returned, including the call that has
started the thread simulation. After the return we check that all write operations that t2
has to execute actually happened.
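A possible rendering of thread creation as a plain function call, with the thr_pos bookkeeping just described (illustrative names only; the real translation is generated per thread instance):

    /* Illustrative sketch only: simulating pthread_create(&t, 0, thr2, arg). */
    void thread_thr2(void *arg);        /* translated body of the created thread */

    void simulate_create_thr2(void *arg)
    {
        unsigned saved = thr_pos;       /* remember where the creator stopped     */
        thread_thr2(arg);               /* the new thread starts from the current
                                           thr_pos and is simulated to completion */
        thr_pos = saved;                /* the creator resumes from its own index */
    }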

3 Architecture, Tool Setup, and Configuration


Architecture. Our sequentialization is implemented as a source-to-source
transformation in Python (v2.7.1), within the CSeq tool. It uses pycparser (v2.10,
github.com/eliben/pycparser) to parse a C program into an abstract syntax


tree (AST), and then traverses the AST to construct the sequentialized version, as out-
lined above. The resulting program can be processed independently by any verification
tool for C. MU-CSeq has been tested with CBMC (v4.2, www.cprover.org/cbmc/)
and ESBMC (v1.22, www.esbmc.org). For the competition we use a wrapper script
that bundles up the translation and calls CBMC for verification. We use the parame-
ters -w24 -t17 -f17 -unwind1 -depth4000 -MaxThreadCreate3, where
w (resp., t) is the bound on the number of write operations (resp., of spawned threads), f
is the unwind bound for for loops and unwind is the unwind bound for the remaining loops,
depth is the depth option for the backend, and MaxThreadCreate is the bound on
the number of threads that are spawned in any while loop. No timeouts or memory
limits are used in the analysis. The wrapper returns the output from CBMC.
Availability and Installation. MU-CSeq can be downloaded from
http://users.ecs.soton.ac.uk/gp4/cseq/mu-cseq-0.1.zip; it also requires installation
of the pycparser. It can be installed as a global Python script. In the competition we
only used CBMC as a sequential verification backend; this must be installed in the same
directory as MU-CSeq.
Call. The tool should be called in the installation directory as follows:
mu-cseq.py -i<file> --spec<specfile> --witness<logfile>
Strengths and Weaknesses. Since MU-CSeq is not a full verification tool but only a
concurrency pre-processor, we only competed in the Concurrency category. Here it
achieved a perfect score.

References
1. Bouajjani, A., Emmi, M., Parlato, G.: On sequentializing concurrent programs. In: Yahav, E.
(ed.) SAS 2011. LNCS, vol. 6887, pp. 129–145. Springer, Heidelberg (2011)
2. Emmi, M., Qadeer, S., Rakamaric, Z.: Delay-bounded scheduling. In: POPL, pp. 411–422
(2011)
3. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Sequentialization Tool for C (Competition
Contribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp.
616–618. Springer, Heidelberg (2013)
4. Fischer, B., Inverso, O., Parlato, G.: CSeq: A Concurrency Pre-Processor for Sequential C
Verification Tools. In: ASE, pp. 710–713 (2013)
5. Inverso, O., Tomasco, E., Fischer, B., La Torre, S., Parlato, G.: Lazy-CSeq: A Lazy Se-
quentialization tool for C (Competition Contribution). In: Ábrahám, E., Havelund, K. (eds.)
TACAS 2014. LNCS, vol. 8413, pp. 398–401. Springer, Heidelberg (2014)
6. Lal, A., Reps, T.W.: Reducing concurrent analysis under a context bound to sequential anal-
ysis. Formal Methods in System Design 35(1), 73–97 (2009)
7. La Torre, S., Madhusudan, P., Parlato, G.: Reducing context-bounded concurrent reachability
to sequential reachability. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643,
pp. 477–492. Springer, Heidelberg (2009)
8. La Torre, S., Madhusudan, P., Parlato, G.: Sequentializing parameterized programs. In: FIT,
EPTCS 87, pp. 34–47 (2012)
9. La Torre, S., Parlato, G.: Scope-bounded Multistack Pushdown Systems: Fixed-Point, Se-
quentialization, and Tree-Width. In: FSTTCS. LIPIcs, vol. 18, pp. 173–184 (2012)
10. Qadeer, S., Wu, D.: KISS: keep it simple and sequential. In: PLDI, pp. 14–24 (2004)
ESBMC 1.22
(Competition Contribution)

Jeremy Morse1, Mikhail Ramalho2, Lucas Cordeiro2, Denis Nicole1, and Bernd Fischer3
1 Electronics and Computer Science, University of Southampton, UK
2 Electronic and Information Research Center, Federal University of Amazonas, Brazil
3 Division of Computer Science, Stellenbosch University, South Africa
[email protected]

Abstract. We have implemented an improved memory model for ESBMC which


better takes into account C’s memory alignment rules and optimizes the gen-
erated SMT formulae. This simultaneously improves ESBMC’s precision and
performance.

1 Overview
ESBMC is a context-bounded symbolic model checker that allows the verification of
single- and multi-threaded C code with shared variables and locks. ESBMC was origi-
nally branched off CBMC (v2.9) [4] and has inherited its object-based memory model.
With the increasingly large SV-COMP benchmarks this is now reaching its limits. We
have thus implemented an improved memory model for ESBMC; however, we opted for
an incremental change and have kept the underlying object-based model in place, rather
than adapting a fully byte-precise memory model as for example used by LLBMC [7].
We believe this strikes the right balance between precision and scalability.
In this paper we focus on the differences from the ESBMC version used in last year’s
competition (1.20) and, in particular, on the memory model; an overview of ESBMC’s
architecture and more details are given in our previous work [1–3, 5].

2 Differences to ESBMC 1.20


In the last year we have mostly made changes to improve ESBMC’s stability, precision,
and performance. In addition to the improved memory model (see below) we made a
wide range of bug fixes and replaced the string-based accessor functions of the interme-
diate representation (which also go back to CBMC v2.9) by proper accessor functions.
This change alone improves ESBMC’s speed by roughly a factor of two.

3 Memory Model
The correct implementation of operations involving pointers is a significant challenge
in model checking C programs. As a bounded model checker, ESBMC reduces the


bounded program traces to first order logic, which requires us to eliminate pointers in
the model checker. We follow CBMC’s approach and use a static analysis to approx-
imate for each pointer variable the set of data objects (i.e., memory chunks) at which
it might point at some stage in the program execution. The data objects are numbered,
and a pointer target is represented by a pair of integers identifying the data object and
the offset within the object. The value of a pointer variable is then the set of (object,
offset)-pairs to which the pointer may point at the current execution step. The result
of a dereference is the union of the sets of values associated with each of the (object,
offset)-pairs.
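As an illustration (ours, not ESBMC's actual encoding): if the analysis determines that a pointer may target one of two arrays a and b, a dereference can be resolved by a case split over the (object, offset) pair; p_obj and p_off stand for the two components of the encoded pointer value.

    /* Illustrative only: resolving a dereference once the points-to set {a, b}
       is known; any other (object, offset) pair has no valid target and would
       be reported as an invalid dereference.                                   */
    int a[4], b[8];

    int deref(int p_obj, int p_off)
    {
        if (p_obj == 1)
            return a[p_off];            /* object 1: array a                */
        else if (p_obj == 2)
            return b[p_off];            /* object 2: array b                */
        return 0;                       /* invalid dereference (error case) */
    }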
The performance of this approach suffers if pointer offsets cannot be statically de-
termined, e.g., if a program reads a byte from an arbitrary offset into a structure. The
resulting SMT formula is large and unwieldy, and its construction is error-prone. To
avoid this, we extended the static pointer analysis to determine the weakest alignment
guarantee that a particular pointer variable provides, and inserted padding in structures
to make all fields align to word boundaries, as prescribed by C’s semantics.
These guarantees, in combination with enforcing memory access alignment rules,
allow us to significantly reduce the number of valid dereference behaviours and thus
the size of the resulting formula, and to detect alignment errors which we have previ-
ously ignored. In circumstances where the underlying type of a memory allocation is
unclear (e.g., dynamically allocated memory with nondeterministic size), we fall back
to allocating a byte array and piecing together higher level types from the bytes.
Other model checkers (in particular LLBMC [7]) treat all memory as a single byte
array, upon which all pointer accesses are decomposed into byte operations. This can
lead to performance problems due to the repeated updates to the memory array that
need to be reflected in the SMT formula.

4 Competition Approach
In bounded model checking, the choice of the unwinding bounds can make a huge
difference. In contrast to previous years, where we only used a single experimentally
determined unwinding bound, we now operate an explicit iterative deepening schema
(n = 8, 12, 16). This replaces the iterative deepening that is implicit in the k-induction
that we used last year [5]. In addition, we no longer use the partial loops option [3].
For categories other than MemorySafety we only check for the reachability of the
error label and ignore all other built-in properties. We use a small script that implements
iterative deepening and calls ESBMC with the main parameters set as follows:
esbmc --timeout 15m --memlimit 15g --64 --unwind <n>
--no-unwinding-assertions --no-assertions --error-label ERROR
--no-bounds-check --no-div-by-zero-check --no-pointer-check <f>

Here, --no-unwinding-assertions removes the unwinding assertion and thus


a correctness claim is not a full correctness proof; however, this increases ESBMC's
performance in the competition. The script also sets the specific parameters for the
MemorySafety category. The run script and a self-contained binary for 64-bit Linux
environments are available at www.esbmc.org/download.html; other versions
are available on request. For the competition we used the Z3 solver (V4.0).

5 Results
With the approach described above, ESBMC correctly claims 1837 benchmarks cor-
rect and finds existing errors in 557. However, it also finds unexpected errors for 38
benchmarks and fails to find the expected errors in another 52. The failures are con-
centrated in the MemorySafety and Recursive categories, where we produce 36
and 15 unexpected results, respectively. In MemorySafety, these are caused by differences
between the memory model assumed by the competition and the one implemented
in ESBMC; in particular, in 22 cases ESBMC detects an unchecked dereference of a
pointer to a freshly allocated memory chunk, which can lead to a null pointer violation
and so mask the result expected by the benchmark. In Recursive, all unexpected
results are false alarms, which are caused by bounding the programs. Additionally,
ESBMC produces 259 time-outs, which mostly stem from the larger benchmarks in
ldv-consumption, ldv-linux-3.4-simple, seq-mthreaded, and eca.
The remaining programs fail due to parsing errors (16), conversion error (1), or dif-
ferent internal (mostly out-of-memory) errors during the symbolic execution (108). ES-
BMC produces good results for all categories but MemorySafety and Recursive;
however, since we did not opt out of these, our overall result suffered substantially.
ESBMC’s performance has improved greatly over last year’s version (v1.20). The
number of errors detected has gone up from 448 to 557, while the number of un-
expected and missed errors has gone down, from 53 to 38 and from 209 to 52, re-
spectively. The biggest improvements are in the categories Sequentialized and
ControlFlowInteger.
Demonstration Section. We took part in the stateful verification, error-witness check-
ing, and device-driver challenges tracks. In particular, we use EZProofC [6] to collect
and manipulate the counterexample produced by ESBMC in order to reproduce the
identified error for the first round (B1) of the error-witness checking.
Acknowledgements. The third author thanks Samsung for financial support.
References
1. Cordeiro, L., Fischer, B.: Verifying Multi-Threaded Software using SMT-based Context-
Bounded Model Checking. In: ICSE, pp. 331–340 (2011)
2. Cordeiro, L., Fischer, B., Marques-Silva, J.: SMT-based bounded model checking for embed-
ded ANSI-C software. IEEE Trans. Software Eng. 38(4), 957–974 (2012)
3. Cordeiro, L., Morse, J., Nicole, D., Fischer, B.: Context-Bounded Model Checking with
ESBMC 1.17 (Competition Contribution). In: Flanagan, C., König, B. (eds.) TACAS 2012.
LNCS, vol. 7214, pp. 534–537. Springer, Heidelberg (2012)
4. Kroening, D., Clarke, E., Yorav, K.: Behavioral Consistency of C and Verilog Programs Using
Bounded Model Checking. In: DAC, pp. 368–371. IEEE (2003)
5. Morse, J., Cordeiro, L., Nicole, D., Fischer, B.: Handling Unbounded Loops with ESBMC
1.20 (Competition Contribution). In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS,
vol. 7795, pp. 619–622. Springer, Heidelberg (2013)
6. Rocha, H., Barreto, R., Cordeiro, L., Neto, A.D.: Understanding Programming Bugs in ANSI-
C Software Using Bounded Model Checking Counter-Examples. In: Derrick, J., Gnesi, S.,
Latella, D., Treharne, H. (eds.) IFM 2012. LNCS, vol. 7321, pp. 128–142. Springer, Heidelberg
(2012)
7. Sinz, C., Falke, S., Merz, F.: A Precise Memory Model for Low-Level Bounded Model Check-
ing. In: SSV, USENIX (2010)
FrankenBit: Bit-Precise Verification
with Many Bits
(Competition Contribution)

Arie Gurfinkel1 and Anton Belov2
1 Carnegie Mellon Software Engineering Institute, USA
2 University College Dublin, Ireland

Abstract. Bit-precise software verification is an important and difficult


problem. While there has been an amazing progress in SAT solving, Sat-
isfiability Modulo Theory of Bit Vectors, and bit-precise Bounded Model
Checking, proving bit-precise safety, i.e. synthesizing a safe inductive in-
variant, remains a challenge. In this paper, we present FrankenBit
— a tool that combines bit-precise invariant synthesis with BMC coun-
terexample search. As the name suggests, FrankenBit combines a large
variety of existing verification tools and techniques, including LLBMC,
UFO, Z3, Boolector, MiniSAT and STP.

1 Verification Approach
FrankenBit combines two orthogonal techniques: one searches for bit-precise
counterexamples, and the other synthesizes bit-precise inductive invariants. The
counterexample search is done using Bounded Model Checking, and is delegated
completely to LLBMC [11]. Invariant synthesis is implemented by first unsoundly
approximating programs using Linear Arithmetic (LA), then computing induc-
tive invariants for the approximation, and using those to guide the search for
bit-precise invariants. The details of this approach are described in [7].

2 Software Architecture
The architecture of FrankenBit is shown in Fig. 1. First, the input C source
is processed and compiled into LLVM [10] bitcode using the UFO front-end
(UFO-FE) [1]. This involves normalizing with a custom CIL [12] pass, compiling
with llvm-gcc, and simplifying using customized optimizations from LLVM ver-
sion 2.6. The front-end is often sufficient to decide simple verification tasks. Sec-
ond, two threads are started, one used to synthesize an inductive invariant (left
part of Fig. 1), and the other to search for a counterexample (right part of Fig. 1).

This material is based upon work funded and supported by the Department of De-
fense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for
the operation of the Software Engineering Institute, a federally funded research and
development center. This material has been approved for public release and unlim-
ited distribution. DM-0000870. The second author is financially supported by SFI
PI grant BEACON (09/IN.1/I2618).


[Fig. 1. FrankenBit: Software architecture — the C source is compiled by UFO-FE into LLVM bitcode, which feeds an invariant-synthesis thread (UFO-MUZ, LA → BV translation, Misper with Boolector, AIGER, and MUSer2, and safety/inductiveness checks with Z3 and Z3/PDR) and a counterexample-search thread (LLBMC); depending on the outcomes the tool returns TRUE, FALSE, or UNKNOWN.]

Invariants. Invariants are synthesized using our new algorithm Misper [7]. First,
Z3/PDR engine [8] of UFO (UFO-MUZ) abstracts the input over Linear Arith-
metic (LA) and synthesizes LA invariant. If this fails, synthesis is aborted. Sec-
ond, the LA invariant and abstraction are converted to bit-vectors (LA → BV).
Third, the candidate bit-vector (BV) invariant is checked using Z3 [4]. If the
candidate is not inductive, it is weakened until it becomes inductive using Mis-
per that, in turn, uses Boolector [3] for bit-blasting, aiger for CNF conversion,
and MUSer2 [2] for extraction of Minimal Unsatisfiable Subformulas (MUSes).
Finally, the safety of the weakened invariant is checked again with Z3 (Z3 safety),
and, if necessary, strengthened using the bit-precise version of Z3/PDR.
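A minimal example of why this detour is needed (our illustration, not taken from the tool or its benchmarks): over mathematical integers the invariant x >= 0 is inductive for the loop below and proves the assertion, but over 32-bit bit-vectors the increment can wrap around, so a candidate invariant obtained from the LA approximation still has to pass the inductiveness and safety checks and, if needed, be repaired by the weakening and strengthening steps described above.

    /* Illustrative only: an invariant that is inductive over Z but not over
       32-bit bit-vectors.                                                    */
    #include <assert.h>

    extern int nondet(void);

    int main(void)
    {
        int x = 0;
        while (nondet())
            x = x + 1;       /* over bit-vectors, INT_MAX + 1 wraps to INT_MIN   */
        assert(x >= 0);      /* provable with x >= 0 over Z, not over bit-vectors */
        return 0;
    }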

Counterexamples. The search for counterexamples is delegated to LLBMC [11],


which itself uses STP [6] and MiniSAT [5]. In order to run LLBMC on bitcode
files produced by UFO-FE, they are first dis-assembled using llvm-dis from
LLVM v2.9 and then re-assembled using llvm-as from LLVM v3.2.
FrankenBit is written in Python and borrows code from Spacer [9].

3 Tool Setup and Configuration


FrankenBit is available for download from bitbucket.org/arieg/fbit/wiki/svcomp14.wiki.
The options for running the tool are:
./bin/fbit.py [-m64] --cex=TRACE --spec=SPEC input
where -m64 turns on the 64-bit model, --cex and --spec are the locations of the
counter-example and the specification files, respectively, and input is a C file.
The result is printed on the output terminal: TRUE, FALSE, UNKNOWN, if the prop-
erty evaluates, respectively, to true, false, or unknown on the input.
FrankenBit is participating in the following categories: Simple, Control
Flow and Integer Variables, and Device Drivers Linux 64-bit.

References
1. Albarghouthi, A., Gurfinkel, A., Li, Y., Chaki, S., Chechik, M.: UFO: Verifica-
tion with Interpolants and Abstract Interpretation (Competition Contribution).
In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 637–640.
Springer, Heidelberg (2013)
2. Belov, A., Marques-Silva, J.: MUSer2: An Efficient MUS Extractor. JSAT 8(1/2)
(2012)
3. Brummayer, R., Biere, A.: Boolector: An Efficient SMT Solver for Bit-Vectors and
Arrays. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505,
pp. 174–177. Springer, Heidelberg (2009)
4. de Moura, L., Bjørner, N.: Z3: An Efficient SMT Solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
5. Eén, N., Sörensson, N.: An Extensible SAT-solver. In: Giunchiglia, E., Tacchella,
A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)
6. Ganesh, V., Dill, D.L.: A Decision Procedure for Bit-Vectors and Arrays. In:
Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 519–531.
Springer, Heidelberg (2007)
7. Gurfinkel, A., Belov, A., Marques-Silva, J.: Synthesizing Safe Bit-Precise Invari-
ants. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp.
93–108. Springer, Heidelberg (2014)
8. Hoder, K., Bjørner, N.: Generalized Property Directed Reachability. In: Cimatti,
A., Sebastiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 157–171. Springer, Hei-
delberg (2012)

9. Komuravelli, A., Gurfinkel, A., Chaki, S., Clarke, E.M.: Automatic Abstraction
in SMT-Based Unbounded Software Model Checking. In: Sharygina, N., Veith, H.
(eds.) CAV 2013. LNCS, vol. 8044, pp. 846–862. Springer, Heidelberg (2013)
10. Lattner, C., Adve, V.S.: LLVM: A Compilation Framework for Lifelong Program
Analysis & Transformation. In: CGO, pp. 75–88. IEEE Computer Society (2004)
11. Merz, F., Falke, S., Sinz, C.: LLBMC: Bounded Model Checking of C and C++
Programs Using a Compiler IR. In: Joshi, R., Müller, P., Podelski, A. (eds.) VSTTE
2012. LNCS, vol. 7152, pp. 146–161. Springer, Heidelberg (2012)
12. Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: Intermediate Language
and Tools for Analysis and Transformation of C Programs. In: Horspool, R.N. (ed.)
CC 2002. LNCS, vol. 2304, pp. 213–228. Springer, Heidelberg (2002)
Predator: A Shape Analyzer
Based on Symbolic Memory Graphs
(Competition Contribution)

Kamil Dudka, Petr Peringer, and Tomáš Vojnar

FIT, Brno University of Technology, IT4Innovations Centre of Excellence, Czech Republic

Abstract. Predator is a shape analyzer that uses the abstract domain of symbolic
memory graphs in order to support various forms of low-level memory manipu-
lation commonly used in optimized C code. This paper briefly describes the ver-
ification approach taken by Predator and its strengths and weaknesses revealed
during its participation in the Software Verification Competition (SV-COMP’14).

1 Verification Approach
Predator is a shape analyzer that uses the abstract domain of symbolic memory graphs
(SMGs) in order to support various forms of low-level memory manipulation commonly
used in optimized C code. Compared to the separation logic-based works [1] by which our
work is inspired, SMGs allow one to easily apply various graph-based algorithms to
efficiently manipulate the low-level memory representation.
The formal definition of SMGs can be found in [2] together with algorithms of all
the operations needed for use of SMGs in a fully automatic shape analysis. This is in
particular the case of a specialised unary abstraction operator and a binary join oper-
ator that aid termination of the SMG-based shape analysis. The join operator is based
on an algorithm that simultaneously traverses a pair of input SMGs and merges their
corresponding nodes. The core of the join algorithm is also used by the algorithm imple-
menting the abstraction operator to merge pairs of neighbouring nodes, together with
their sub-SMGs (describing the data structures nested below them), into a single list
segment. For checking entailment of SMGs, Predator again reuses the join algorithm
(extended to compare generality of the SMGs being joined).
Predator requires all external functions to be properly modelled wrt. memory safety
in order to exclude any side effects that could possibly break soundness of the analy-
sis. Our distribution of Predator includes models of memory allocation functions (like
malloc or free) and selected memory manipulating functions (memset, memcpy,
memmove, etc.).
Since SV-COMP’13, the core algorithms of shape analysis were reimplemented
in order to match their description presented in [2]. Consequently, the current imple-
mentation is much easier to follow, but at the same time also faster and more precise (as
witnessed by the results of SV-COMP’14).

This work was supported by the Czech Science Foundation project 14-11384S and the
EU/Czech IT4Innovations Centre of Excellence project CZ.1.05/1.1.00/02.0070.


2 Software Architecture

Predator is implemented as a GCC (GNU Compiler Collection) plug-in, which makes


the tool easy to use without a need to manually preprocess the source code. GCC as an
industrial-strength compiler takes care of parsing the C code into an intermediate rep-
resentation (known as GIMPLE). The input code is symbolically executed by Predator
using the algorithms proposed in [2] with the aim of precisely interpreting various low-
level memory operations (such as pointer arithmetic, valid use of pointers with invalid
targets, operations with memory blocks, or reinterpretation of the memory contents).
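An example of such an idiom (illustrative, not taken from the competition benchmarks) is the Linux-style intrusive list, where pointer arithmetic recovers the enclosing node from a pointer to its embedded link field; the intermediate pointer value targets the interior of the node and is only meaningful because it is immediately adjusted back to the node base.

    /* Illustrative only: low-level pointer arithmetic over struct layout of the
       kind covered by Predator's byte-precise memory representation.           */
    #include <stddef.h>

    struct list_head { struct list_head *next; };
    struct node      { int data; struct list_head link; };

    static struct node *node_of(struct list_head *l)
    {
        /* recover the base address of the enclosing node from the address of
           its embedded link field                                             */
        return (struct node *)((char *)l - offsetof(struct node, link));
    }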
Predator is written in C++ and requires Boost libraries, mainly to enable using legacy
compilers for building it. The Predator GCC plug-in can be loaded into any GCC with
a plug-in support up to GCC 4.8.2 (which was the latest release in 2013).
Compared to SV-COMP’13, Predator uses an improved algorithm for live variable
analysis (based on a points-to analysis). The improved live variable analysis makes the
shape analysis run five times faster in certain cases (e.g. the Merge-Sort algorithm case
study from [2]).

3 Strengths and Weaknesses

The main strength of Predator is its byte-precise representation of reachable memory


configurations, which makes it possible to successfully verify certain low-level pointer-
intensive programs in the MemorySafety and HeapManipulation categories. The key
design principle of Predator is soundness, which was again confirmed by reaching zero
false negatives on the whole benchmark of SV-COMP’14. On the other hand, Predator
does not check spuriousness of possible counterexamples for now, which caused numer-
ous false positives (and consequently a significant loss of score). Since SV-COMP’13,
the MemorySafety category has been extended by case studies that cause problems to
Predator either by operating on data structures not covered by the current abstraction al-
gorithm (trees and skip lists), or by the requirement to track non-pointer data along with
the shapes of data structures (for example, tracking the length of lists). Compared to the
SV-COMP’13 version, Predator now finally achieved the full score in the ProductLines
subcategory. Moreover, the correct results were now delivered five times faster than the
partially correct results in this (sub)category last year.
Results in the ControlFlowInteger and BitVectors categories still suffer from a high ratio
of false positives, caused mainly by an overly coarse analysis of integers. Due to un-
defined external functions, Predator was not able to analyze many test cases in the
DeviceDrivers64 and Concurrency categories.

4 Tool Setup and Configuration

The source code of the Predator release¹ used in the competition can be downloaded
from the project web page. The file README-SVCOMP-2014 included in the archive
¹ http://www.fit.vutbr.cz/research/groups/verifit/tools/predator/download/predator-2013-10-30-d1bd405.tar.gz

describes how to build Predator from source code and how to apply the tool on the com-
petition benchmarks. After successfully building the tool from sources, the sl build
directory contains a script named check-property.sh, which needs to be invoked
once for each input program. Besides the name of the input program, the script requires
a mandatory option --propertyfile specifying the property to be verified. Com-
piler flags needed to compile the input program with GCC must be specified after the
file name of the input program. For programs relying on a particular target architecture
(such as preprocessed C sources), it is important to use the -m32 or -m64 compiler
flags to specify the architecture. The script also provides an optional argument --trace
that allows one to write the error trace to a file. The verification result is printed to the
standard output on success. Otherwise, the verification outcome should be treated as
UNKNOWN. The script does not check for exceeding any resource limits on its own.
Although we use a global configuration of Predator for all categories, the tool pro-
vides many useful compile-time options via the sl/config.h configuration file. The
default configuration is tweaked to obtain good overall results in both the competi-
tion benchmark and Predator’s regression test-suite. The configuration can be further
tweaked to improve the results in a particular category, however, at the cost of losing
some points in other categories.

5 Software Project and Contributors


Predator is an open source software project developed at Brno University of Technol-
ogy (BUT) and distributed under the GNU General Public License version 3, which
allows Predator to be used for both commercial and non-commercial purposes. There
is no binary distribution of Predator, but it can be easily built from sources on any up
to date distribution of Linux. The interaction with the compiler is facilitated by the
Code Listener infrastructure [3], which is shared with Forester (a shape analyser based
on forest automata [4]), including a suite of regression tests. Both Code Listener and
Forester are projects also developed at BUT. Besides our development teams, we have
numerous external contributors listed in the docs/THANKS file inside the distribution
of Predator. Collaboration on further development of Predator is welcome.

References
1. Berdine, J., Calcagno, C., Cook, B., Distefano, D., O’Hearn, P.W., Wies, T., Yang, H.: Shape
Analysis for Composite Data Structures. In: Damm, W., Hermanns, H. (eds.) CAV 2007.
LNCS, vol. 4590, pp. 178–192. Springer, Heidelberg (2007)
2. Dudka, K., Peringer, P., Vojnar, T.: Byte-Precise Verification of Low-Level List Manipulation.
In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 215–237. Springer,
Heidelberg (2013)
3. Dudka, K., Peringer, P., Vojnar, T.: An Easy to Use Infrastructure for Building Static Analy-
sis Tools. In: Moreno-Dı́az, R., Pichler, F., Quesada-Arencibia, A. (eds.) EUROCAST 2011,
Part I. LNCS, vol. 6927, pp. 527–534. Springer, Heidelberg (2012)
4. Habermehl, P., Holı́k, L., Rogalewicz, A., Šimáček, J., Vojnar, T.: Forest Automata for Veri-
fication of Heap Manipulation. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 424–440. Springer, Heidelberg (2011)
Symbiotic 2: More Precise Slicing
(Competition Contribution)

Jiri Slaby and Jan Strejček

Faculty of Informatics, Masaryk University


Botanická 68a, 60200 Brno, Czech Republic
{slaby,strejcek}@fi.muni.cz

Abstract. Symbiotic 2 keeps the concept and the structure of the orig-
inal bug-finding tool Symbiotic, but it uses a more precise slicing based
on a field-sensitive pointer analysis instead of the field-insensitive analysis of
the original tool. The paper discusses this improvement and its conse-
quences. We also briefly recall basic principles of the tool, its strong and
weak points, installation, and running instructions. Finally, we comment on
the results achieved by Symbiotic 2 in the competition.

1 Verification Approach and Software Architecture


Both Symbiotic [6] and Symbiotic 2 implement our verification approach pro-
posed earlier [4]. The approach combines three standard techniques, namely
code instrumentation, program slicing [7], and symbolic execution [3]. While the
approach was originally designed for the detection of bugs described by state-
machines, Symbiotic 2 still supports only one kind of bug: reachability of an
ERROR label. Hence, we briefly recall the approach restricted just to this simple
kind of errors. We explain the structure of the tool simultaneously.
1. Code instrumentation inserts assert(0) at each ERROR label (see the sketch after
   this list). It is performed by a bash script calling sed. The instrumented code is
   translated into the Llvm bitcode by the Clang compiler.
2. Program slicing removes instructions of the instrumented code that do not
affect reachability of the inserted assert(0) statements. This code size re-
duction is crucial for the overall efficiency of Symbiotic 2. The slicer is
implemented in C++ as a plug-in for the Llvm optimizer opt.
3. Symbolic execution either reaches assert(0), or correctly finishes the exe-
cution without reaching assert(0), or it runs out of time or memory etc.
These possibilities correspond to answers FALSE, TRUE, UNKNOWN, respectively.
We use the symbolic executor Klee [2] whose outputs are translated to
TRUE/FALSE/UNKNOWN by a simple bash script.
The whole pipeline is executed stepwise by another bash script.
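For illustration, a hypothetical benchmark fragment after step 1 could look as follows; reaching the ERROR label now manifests as an assertion violation, which the later stages preserve and Klee can detect.

    /* Illustrative only: an ERROR label after the sed-based instrumentation. */
    #include <assert.h>

    int check(int x)
    {
        if (x < 0)
            goto ERROR;
        return 0;
    ERROR:
        assert(0);        /* inserted by step 1 of the pipeline */
        return 1;
    }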
All improvements of Symbiotic 2 over the tool Symbiotic competing in
SV-COMP 2013 are in the slicer. We have fixed some bugs in the original slicer

The authors are supported by the Czech Science Foundation grant P202-10-1469.


(invalid treatment of several instructions and functions with variable number


of arguments). While the original fixpoint algorithm for slicing [7] is relatively
simple, it gets more complicated for programs with pointers as one instruction
can influence the following one without any syntactic overlap. We need to use a
pointer analysis (also called points-to analysis) to know which pointers can point
to the same target. Both Symbiotic and Symbiotic 2 use Andersen's pointer
analysis [1], but Symbiotic 2 replaces the original field-insensitive analysis by a
field-sensitive one. This means that every field of a struct is now handled as a
different pointer target. Similarly, we handle the first 64 elements of each array as
distinct targets. The field-sensitive analysis is computationally more demanding.
Thus we have also added some type filters that speed up the pointer analysis and
make its results more accurate. All these improvements of the slicer significantly
reduce the number of incorrect answers produced by Symbiotic 2. For example,
Symbiotic 2 produces only correct answers for the category ProductLines while
Symbiotic produces 131 incorrect answers for the same category.
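The following toy example (ours, not from the benchmark suite) shows what field-sensitivity buys during slicing: a field-insensitive analysis only records that both pointers point somewhere into s, so the write through pa must be kept, whereas the field-sensitive analysis distinguishes &s.a from &s.b and allows the slicer to drop that write without affecting the reachability of ERROR.

    /* Illustrative only: field-(in)sensitivity and its effect on slicing. */
    struct S { int a; int b; };

    int main(void)
    {
        struct S s;
        int *pa = &s.a, *pb = &s.b;
        *pa = 1;          /* irrelevant to ERROR under a field-sensitive analysis */
        *pb = 0;
        if (*pb)
            goto ERROR;
        return 0;
    ERROR:
        return 1;
    }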

2 Strengths and Weaknesses


Our tool is applicable to all competition benchmarks satisfying two restrictions:
the studied property is ERROR label reachability and the benchmark code is a
sequential C program. Hence, the results of Symbiotic 2 in the competition
categories MemorySafety and Concurrency should be ignored (we missed the
opt-out deadline). The first restriction can be removed by the implementation
of a more sophisticated code instrumentation. The second restriction comes di-
rectly from the approach as symbolic execution and program slicing are primarily
designed for sequential programs.
We first discuss strong and weak aspects of the approach and then we talk
about additional strong and weak aspects of the tool. Our approach is based
on symbolic execution which should produce only correct answers. On the other
hand, symbolic execution suffers from the path explosion problem and relies on
expensive (and often even undecidable) SMT solving. Hence, application of sym-
bolic execution leads to many UNKNOWN answers, which is also the main weakness
of the approach. To reduce this weakness, we combine symbolic execution with
slicing which is the only theoretical source of incorrect answers (namely false
positives) of our approach. Indeed, slicing can in some cases remove an infinite
loop, and a potentially unreachable ERROR label located below that loop thus
becomes reachable. However, this situation is very rare in practice (e.g. it does not
appear in the competition benchmarks) and we do not see it as a problem. An
orthogonal method to reduce the high cost of symbolic execution is to use some
of its variants suppressing the path explosion problem. For example, we plan to
apply compact symbolic execution [5] instead of the classic one.
The strong aspect of the tool is its simple architecture: it is a sequence of
scripts and standalone tools that are easy to replace (for example, if there is a
better symbolic executor for Llvm bitcode, we can deploy it in few minutes). The
main weakness of Symbiotic 2 lies in the incorrect results which are sometimes

reported. Even if the number of incorrect results is substantially lower than in


the case of Symbiotic, it is still relatively high. All the incorrect results are due
to imperfection of our implementation.

3 Tool Setup and Configuration


Before using Symbiotic 2, ensure that the target system contains 32-bit li-
braries (for 32-bit benchmarks) and Llvm with Clang. Llvm and Clang
have to be in version 3.2 exactly. Then, Symbiotic 2 can be downloaded from
http://sf.net/projects/symbiotic/. Due to a bug in Klee causing absolute
paths to be built in, Klee has to be run from a pre-defined path. Hence
we are obliged to change the current directory to /opt/ and untar the down-
loaded Symbiotic 2 archive there. Running the tool is then straightforward.
When the current directory is /opt/symbiotic/, the tool can be invoked for
each <benchmark.c> from the set by ./runme <benchmark.c>. For benchmarks
intended for 64-bit, set the MFLAG=-m64 environment variable. The answers provided
by Symbiotic 2 are as required by the competition rules: TRUE/FALSE/UNKNOWN.
If the result for <benchmark.c> is FALSE, discovered error paths can be found
in <benchmark.c>-klee-out/.

4 Software Project and Contributors


Symbiotic 2 was contributed mostly by the authors of this paper and Marek
Trtı́k. Jiri Slaby is a contact person. The tool is licensed under the GNU GPLv2
License unless specified otherwise for its parts.

References
1. Andersen, L.O.: Program Analysis and Specialization for the C Programming Lan-
guage. PhD thesis, DIKU, University of Copenhagen (1994)
2. Cadar, C., Dunbar, D., Engler, D.: KLEE: Unassisted and automatic generation
of high-coverage tests for complex systems programs. In: Proceedings of OSDI, pp.
209–224. USENIX Association (2008)
3. King, J.C.: Symbolic execution and program testing. Communications of
ACM 19(7), 385–394 (1976)
4. Slabý, J., Strejček, J., Trtı́k, M.: Checking properties described by state machines:
On synergy of instrumentation, slicing, and symbolic execution. In: Stoelinga, M.,
Pinger, R. (eds.) FMICS 2012. LNCS, vol. 7437, pp. 207–221. Springer, Heidelberg
(2012)
5. Slaby, J., Strejček, J., Trtı́k, M.: Compact symbolic execution. In: Van Hung, D.,
Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp. 193–207. Springer, Heidelberg
(2013)
6. Slaby, J., Strejček, J., Trtı́k, M.: Symbiotic: Synergy of instrumentation, slicing,
and symbolic execution (competition contribution). In: Piterman, N., Smolka, S.A.
(eds.) TACAS 2013. LNCS, vol. 7795, pp. 630–632. Springer, Heidelberg (2013)
7. Weiser, M.: Program slicing. In: Proceedings of ICSE, pp. 439–449. IEEE (1981)
Ultimate Automizer with Unsatisfiable Cores
(Competition Contribution)

Matthias Heizmann, Jürgen Christ, Daniel Dietsch, Jochen Hoenicke,


Markus Lindenmann, Betim Musa, Christian Schilling,
Stefan Wissert, and Andreas Podelski

University of Freiburg, Germany

Abstract. Ultimate Automizer is an automatic software verification


tool for C programs. This tool is a prototype implementation of an
automata-theoretic approach that allows a modular verification of pro-
grams. Furthermore, this is the first implementation of a novel interpola-
tion technique where interpolants are not obtained from an interpolating
theorem prover but from a combination of a live variable analysis, inter-
procedural predicate transformers and unsatisfiable cores.

1 Verification Approach
Ultimate Automizer verifies a C program by first executing several program
transformations and then performing an interpolation-based variant of trace
abstraction [4]. As a first step, we translate the C program into a Boogie [6]
program. The heap of the system is modeled via arrays in this Boogie pro-
gram [7]. Next, the Boogie program is translated into an interprocedural control
flow graph [9]. As an optimization, we do not label the edges with single program
statements but with loop free code blocks of the program [11]. Our verification
algorithm then performs the following steps iteratively:
1. We take a sequence of statements π that leads from the start of the main
procedure to an error location and analyze its correctness (resp. feasibility).
In this analysis an SMT solver is used.
2. We consider this sequence of statements as a standalone program Pπ and
compute a correctness proof for Pπ in form of a Hoare annotation.
3. We find a larger program P̂π that has the same correctness proof [4].
4. We consider the preceding step as a semantical decomposition of the original
   program P into one part P̂π whose correctness is already proven and one remaining
   part Prest := P \ P̂π , on which we continue. The programs P, P̂π , Prest
   are represented by automata. This allows us to compute and represent the
   remaining part of the program Prest (the part for which correctness was not
   yet proven). Furthermore, this automata-theoretic representation allows us
   to apply minimization [10] to represent the programs P, P̂π , Prest efficiently.
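As a simplified illustration of steps 1–3 (our example, not from the paper): for the program below, the trace that skips the loop and takes the error branch is infeasible; a Hoare annotation such as {true} x=0 {x >= 0} [x < 0] {false} proves the corresponding standalone trace program correct, and since x >= 0 is also preserved by the loop body, the same annotation proves a larger program containing arbitrarily many loop iterations.

    /* Illustrative only: all error traces of this program are covered by the
       single assertion x >= 0 obtained from one infeasible trace.            */
    extern int nondet(void);

    int main(void)
    {
        int x = 0;
        while (nondet())
            x = x + 1;     /* preserves x >= 0                */
        if (x < 0)
            goto ERROR;    /* unreachable, since x >= 0 holds */
        return 0;
    ERROR:
        return 1;
    }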

This work is supported by the German Research Council (DFG) as part of the
Transregional Collaborative Research Center “Automatic Verification and Analysis
of Complex Systems” (SFB/TR14 AVACS)


Our previous competition candidate [2] followed a similar approach in which


the above mentioned Hoare annotation was computed by an interpolating SMT
solver via Craig interpolation. Computation of Craig interpolants is known to be
difficult, especially for the theory of arrays. This competition candidate follows
a novel approach [8] to obtain a Hoare annotation for a sequence of statements.
The predicates that represent the Hoare annotation are obtained using interpro-
cedural predicate transformers. The arguments of these predicate transformers
are not the statements of the sequence but generalized statements that are ob-
tained from a live variable analysis and from unsatisfiable cores of the feasibility
analysis.

2 Software Architecture

Ultimate Automizer is one toolchain of the software analysis framework Ul-


timate which is implemented in Java. Ultimate offers data structures for
different representations of a program, plugins which analyze or transform a
program, and an interface for the communication with SMT-LIBv2 compatible
theorem provers. For parsing C programs, we use the C parser of the Eclipse
CDT project¹. The operations on nested word automata are implemented in
the Ultimate Automata Library. Our SMT queries can be answered by any
SMT-LIBv2 compatible solver that supports quantifiers and the theory of arrays.

3 Discussion of Approach

Currently we model primitive data types (int, float,...) as integers Z or real


numbers R. We report unknown whenever we find a potential counterexample
whose infeasibility cannot be shown because of this imprecision.
The main flaw of our implementation is the translation from C to Boogie. We
failed to finish this translation in time and our submitted competition candidate
is unable to verify programs that contain pointers or arrays.

4 Tool Setup and Configuration

Our competition candidate assumes that version 4.3.2.ff265c6c6ccf of the SMT


solver Z3² is installed and that the directory of the Z3 binary is part of the PATH
variable. Our competition candidate is included in a command-line version of
Ultimate Automizer that can be downloaded from the following website:
https://ultimate.informatik.uni-freiburg.de/automizer/

The zip archive in which Ultimate Automizer is shipped contains the Python
script automizerSV-COMP.py which wraps input and output for the SV-COMP.
¹ https://www.eclipse.org/cdt/
² https://z3.codeplex.com/

Using the following command, the C program fnord.c is verified with respect to
the property file prop.prp and an error path is written to the file errPath.txt.
python AutomizerSvcomp.py prop.prp fnord.c errPath.txt

5 Software Project and Contributors


Our software analysis framework Ultimate was started as a bachelor thesis [1].
In the last years, many students contributed plugins or improved the frame-
work itself. A list of all developers is available on our website. An instance of
Ultimate is running on our web server and is available via a web interface.

6 Demonstration Category Termination


We also participated in the demonstration category on termination with Ulti-
mate Büchi Automizer which is our tool for termination analysis. The un-
derlying approach is based on Büchi automata and has not been published yet.
As a subroutine the tool Ultimate Lasso Ranker [3,5] is used. We thank Jan
Leike and Alexander Nutz for their contributions to our termination analysis.

References
1. Dietsch, D.: STALIN: A plugin-based modular framework for program analysis.
Bachelor Thesis, Albert-Ludwigs-Universität, Freiburg, Germany (2008)
2. Heizmann, M., et al.: Ultimate automizer with SMTInterpol. In: Piterman, N.,
Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 641–643. Springer, Heidel-
berg (2013)
3. Heizmann, M., Hoenicke, J., Leike, J., Podelski, A.: Linear ranking for linear lasso
programs. In: Van Hung, D., Ogawa, M. (eds.) ATVA 2013. LNCS, vol. 8172, pp.
365–380. Springer, Heidelberg (2013)
4. Heizmann, M., Hoenicke, J., Podelski, A.: Software model checking for people who
love automata. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp.
36–52. Springer, Heidelberg (2013)
5. Leike, J.: Ranking function synthesis for linear lasso programs. Master’s thesis,
University of Freiburg, Germany (2013)
6. Leino, K.R.M.: This is Boogie 2. Manuscript working draft, Microsoft Research,
Redmond, WA, USA (June 2008),
http://research.microsoft.com/en-us/um/people/leino/papers/krml178.pdf
7. Lindenmann, M.: A simple but sufficient memory model for ultimate. Master’s
thesis, University of Freiburg, Germany (2012)
8. Musa, B.: Trace abstraction with unsatisfiable cores. Bachelor’s thesis, University
of Freiburg, Germany (2013)
9. Reps, T.W., Horwitz, S., Sagiv, S.: Precise interprocedural dataflow analysis via
graph reachability. In: POPL 1995, pp. 49–61. ACM (1995)
10. Schilling, C.: Minimization of nested word automata. Master’s thesis, University
of Freiburg, Germany (2013)
11. Wissert, S.: Adaptive block encoding for recursive control flow graphs. Master’s
thesis, University of Freiburg, Germany (2013)
Ultimate Kojak
(Competition Contribution)

Evren Ermis, Alexander Nutz , Daniel Dietsch,


Jochen Hoenicke, Andreas Podelski

University of Freiburg, Germany


{ermis,nutz,dietsch,hoenicke,podelski}@informatik.uni-freiburg.de

Abstract. Ultimate Kojak is a symbolic software model checker for


C programs. It is based on CEGAR and Craig interpolation. The basic
algorithm, described in an earlier work [1], was extended to be able to
deal with recursive programs using nested word automata and nested
(tree) interpolants.

1 Verification Approach

Ultimate Kojak computes inductive invariants from interpolants to prove the


correctness of a program. A program is represented by a program graph. In a
program graph, a vertex is a pair consisting of a program location and an invari-
ant describing the abstract program state. An edge is labelled with a transition
formula that corresponds to a block of program statements. A program assertion
is represented by a transition to an error state where the transition is labelled
with the negated assertion. The goal is to show the unreachability of all error
states. The program graph is refined by the algorithm presented in a paper by
Ermis et al. [1]. This algorithm computes a sequence of interpolants for an infea-
sible error path and adds them to the invariant annotated at the corresponding
vertices. Since the interpolants are only invariants for the particular error path,
we also have to add new vertices for the case where the interpolants do not hold.
This is achieved by splitting every vertex on the error path into two new vertices
where each receives a new invariant: the old invariant conjoined with the inter-
polant for the first, and the old invariant conjoined with the negated interpolant
for the second. Afterwards, the algorithm removes all infeasible edges, thereby
refining the abstraction.
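To make the refinement step concrete, the following is a small schematic sketch of our own (hypothetical data structures and a placeholder SMT callback, not the actual Ultimate Kojak code): it splits one vertex of the program graph along an interpolant, duplicates the incident edges, and then drops edges whose transition formula is reported infeasible.

# Schematic sketch of one refinement step (hypothetical structures, not the
# actual Ultimate Kojak implementation).  A program graph is a set of
# vertices (location, invariant) and a set of labelled edges between them.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Vertex:
    location: str
    invariant: str             # invariant formula, kept symbolic as a string

@dataclass
class Graph:
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)     # triples (src, transition, dst)

def split_vertex(g, v, interpolant, is_feasible):
    """Split vertex v along `interpolant`: the two copies carry Inv ∧ I and
    Inv ∧ ¬I, incident edges are duplicated, and edges whose transition became
    infeasible (decided by the SMT callback `is_feasible`) are removed."""
    pos = Vertex(v.location, f"({v.invariant}) && ({interpolant})")
    neg = Vertex(v.location, f"({v.invariant}) && !({interpolant})")
    g.vertices.discard(v)
    g.vertices.update({pos, neg})
    new_edges = set()
    for (src, trans, dst) in g.edges:
        for s in ([pos, neg] if src == v else [src]):
            for d in ([pos, neg] if dst == v else [dst]):
                if is_feasible(s.invariant, trans, d.invariant):
                    new_edges.add((s, trans, d))
    g.edges = new_edges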
The newest version of Ultimate Kojak implements this algorithm and ex-
tends it by handling inter-procedural control flow. We use nested word automata
to represent programs containing procedures [2]. These automata have call and
return transitions and support procedure summaries to prove the correctness
of recursive programs. A return transition conceptually has two predecessors:
the node representing the call site and the node representing the exit point of
the called procedure. Therefore our error paths are in fact trees. To obtain

* Corresponding author.


Fig. 1. We split a node that represents program location ℓ, and that has earlier been
annotated with the invariant formula Inv, with the interpolant I. The node and its
incoming and outgoing edges are duplicated and one copy of the node is labeled with
I and the other with ¬I.

interpolants for these error paths, we use tree (nested) interpolation [2,3]. Ul-
timate Kojak utilizes block encoding [4] to summarize loop-free segments of
the program, such that the focus is put on loops.

2 Software Architecture
Ultimate Kojak is a toolchain in the Ultimate1 Verification Framework,
which is implemented in Java. Ultimate manages different representations of
a program and passes them between its plug-ins which may analyse, transform,
or visualize the representation. Ultimate also provides an interface for com-
munication with SMT-LIBv2 compatible SMT solvers. For parsing C programs,
we use the C parser provided by the CDT2 project. We use Z33 for feasibility
checks of error paths and transition formulas. Interpolation is done by our own
algorithm, which is not yet published [5].

3 Discussion – Strengths and Weaknesses


In our approach, every refinement of the abstraction tends to introduce a rather
high number of new edges into the graph, especially when dealing with hyper-
edges (i.e. return edges), which may lead to a quickly growing model. Currently,
interpolants are computed by our own interpolation method, which may introduce
quantifiers. This makes Ultimate Kojak capable of handling arrays; however,
quantifiers that cannot be easily eliminated put a heavy load on the SMT solver.
Due to some remaining issues in our C translation, we have problems with
certain constructs, which prevent us from verifying many benchmarks, especially
those into which standard library header files were inlined.
1 http://ultimate.informatik.uni-freiburg.de
2 http://eclipse.org/cdt/
3 http://z3.codeplex.com/

In principle, Ultimate Kojak can handle any program that can be formal-
ized in a logic that the attached SMT solver supports. Currently, we do not
support bit-precise treatment of integers or concurrent programs.

4 Tool Setup and Configuration


A commandline version of Ultimate Kojak can be downloaded from
http://ultimate.informatik.uni-freiburg.de/kojak
The downloaded archive contains a python script KojakSVComp.py that provides
support for the SV-Comp-compatible input and output of the tool. The directory
into which the archive content was extracted is used as the working directory of the tool.
The verification is started by a command like
python KojakSvComp.py prop.prp example.c errorPathOutput.txt
An installation of the SMT solver Z3 is required.4 The Z3 executable must be
in your PATH environment variable.

5 Software Project and Contributors


Ultimate Kojak has been developed at the Chair of Software Engineering at
the University of Freiburg as part of the Ultimate verification framework. Over
the years, numerous contributors have helped to transform what initially started as
a students' project into a chair-maintained verification framework on which model
checkers such as Ultimate Kojak and Ultimate Automizer5
rely. We would like to thank all the developers and contributors, particularly
Matthias Heizmann, Jürgen Christ, Mohamed Abdelazim Sherif, Mostafa Mah-
moud Mohamed, Markus Lindenmann, Betim Musa, Christian Schilling, and Ste-
fan Wissert.

References
1. Ermis, E., Hoenicke, J., Podelski, A.: Splitting via interpolants. In: Kuncak, V.,
Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 186–201. Springer, Hei-
delberg (2012)
2. Heizmann, M., Hoenicke, J., Podelski, A.: Nested interpolants. In: Hermenegildo,
M.V., Palsberg, J. (eds.) POPL, pp. 471–482. ACM (2010)
3. Christ, J., Hoenicke, J.: Extending proof tree preserving interpolation to sequences
and trees (work in progress). In: SMT Workshop, pp. 72–86 (2013)
4. Beyer, D., Cimatti, A., Griggio, A., Erkan Keremoglu, M., Sebastiani, R.: Software
model checking via large-block encoding. In: FMCAD, pp. 25–32. IEEE (2009)
5. Musa, B.: Trace abstraction with unsatisfiable cores. Bachelor’s thesis, University
of Freiburg, Germany (2013)

4 We use version 4.3.2 for Windows; any recent version should work.
5 http://ultimate.informatik.uni-freiburg.de/automizer/
Discounting in LTL

Shaull Almagor1, Udi Boker2 , and Orna Kupferman1


1
The Hebrew University, Jerusalem, Israel
2
The Interdisciplinary Center, Herzliya, Israel

Abstract. In recent years, there has been growing need and interest in formalizing and
reasoning about the quality of software and hardware systems. As opposed to
traditional verification, which addresses the question of whether a system satisfies
a given specification, reasoning about quality addresses the question
of how well the system satisfies the specification.
refine the “eventually” operators of temporal logic to discounting operators: the
satisfaction value of a specification is a value in [0, 1], where the longer it takes
to fulfill eventuality requirements, the smaller the satisfaction value is.
In this paper we introduce an augmentation by discounting of Linear Tem-
poral Logic (LTL), and study it, as well as its combination with propositional
quality operators. We show that one can augment LTL with an arbitrary set of
discounting functions, while preserving the decidability of the model-checking
problem. Further augmenting the logic with unary propositional quality opera-
tors preserves decidability, whereas adding an average-operator makes the model-
checking problem undecidable. We also discuss the complexity of the problem,
as well as various extensions.

1 Introduction
One of the main obstacles to the development of complex hardware and software sys-
tems lies in ensuring their correctness. A successful paradigm addressing this obstacle
is temporal-logic model checking – given a mathematical model of the system and a
temporal-logic formula that specifies a desired behavior of it, decide whether the model
satisfies the formula [5]. Correctness is Boolean: a system can either satisfy its specifi-
cation or not satisfy it. The richness of today’s systems, however, justifies specification
formalisms that are multi-valued. The multi-valued setting arises directly in systems
with quantitative aspects (multi-valued / probabilistic / fuzzy) [9–11, 16, 23], but is ap-
plied also to Boolean systems, where it originates from the semantics of the
specification formalism itself [1, 7].
When considering the quality of a system, satisfying a specification should no longer
be a yes/no matter. Different ways of satisfying a specification should induce differ-
ent levels of quality, which should be reflected in the output of the verification pro-
cedure. Consider for example the specification G(request → F(response_grant ∨
response_deny)) (“every request is eventually responded to, with either a grant or a de-
nial”). There should be a difference between a computation that satisfies it with re-
sponses generated soon after requests and one that satisfies it with long waits.
Moreover, there may be a difference between grant and deny responses, or cases in
which no request is issued. The issue of generating high-quality hardware and software

E. Ábrahám and K. Havelund (Eds.): TACAS 2014, LNCS 8413, pp. 424–439, 2014.

c Springer-Verlag Berlin Heidelberg 2014
Discounting in LTL 425

systems attracts a lot of attention [13, 26]. Quality, however, is traditionally viewed as
an art, or as an amorphic ideal. In [1], we introduced an approach for formalizing qual-
ity. Using it, a user can specify quality formally, according to the importance he gives to
components such as security, maintainability, runtime, and more, and then can formally
reason about the quality of software.
As the example above demonstrates, we can distinguish between two aspects of the
quality of satisfaction. The first, to which we refer as “temporal quality”, concerns the
waiting time to satisfaction of eventualities. The second, to which we refer as “propositional
quality”, concerns prioritizing related components of the specification. Propositional
quality was studied in [1]. In this paper we study temporal quality as well as the combinations
of both aspects. One may try to reduce temporal quality to propositional quality by a re-
peated use of the X (“next”) operator or by a use of bounded (prompt) eventualities [2, 3].
Both approaches, however, partition the future into finitely many zones and are limited:
correctness of LTL is Boolean, and thus has an inherent dichotomy between satisfaction
and dissatisfaction. On the other hand, the distinction between “near” and “far” is not
dichotomous.
This suggests that in order to formalize temporal quality, one must extend LTL to
an unbounded setting. Realizing this, researchers have suggested to augment temporal
logics with future discounting [8]. In the discounted setting, the satisfaction value of spec-
ifications is a numerical value, and it depends, according to some discounting function,
on the time waited for eventualities to get satisfied.
In this paper we add discounting to Linear Temporal Logic (LTL), and study it, as
well as its combination with propositional quality operators. We introduce LTLdisc [D]
– an augmentation by discounting of LTL. The logic LTLdisc [D] is actually a family of
logics, each parameterized by a set D of discounting functions – strictly decreasing
functions from ℕ to [0, 1] that tend to 0 (e.g., linear decaying, exponential decaying, etc.).
LTLdisc [D] includes a discounting-“until” (Uη ) operator, parameterized by a function
η ∈ D. We solve the model-checking threshold problem for LTLdisc [D]: given a Kripke
structure K, an LTLdisc [D] formula ϕ and a threshold t ∈ [0, 1], the algorithm decides
whether the satisfaction value of ϕ in K is at least t.
In the Boolean setting, the automata-theoretic approach has proven to be very use-
ful in reasoning about LTL specifications. The approach is based on translating LTL
formulas to nondeterministic Büchi automata on infinite words [28]. Applying this ap-
proach to the discounted setting, which gives rise to infinitely many satisfaction values,
poses a big algorithmic challenge: model-checking algorithms, and in particular those
that follow the automata-theoretic approach, are based on an exhaustive search, which
cannot be simply applied when the domain becomes infinite. A natural relevant exten-
sion to the automata-theoretic approach is to translate formulas to weighted automata
[22]. Unfortunately, these extensively-studied models are complicated and many prob-
lems become undecidable for them [15]. We show that for threshold problems, we can
translate LTLdisc [D] formulas into (Boolean) nondeterministic Büchi automata, with the
property that the automaton accepts a lasso computation iff the formula attains a value
above the threshold on that computation. Our algorithm relies on the fact that the lan-
guage of an automaton is non-empty iff there is a lasso witness for the non-emptiness.

We cope with the infinitely many possible satisfaction values by using the discounting be-
havior of the eventualities and the given threshold in order to partition the state space into
a finite number of classes. The complexity of our algorithm depends on the discounting
functions used in the formula. We show that for standard discounting functions, such as
exponential decaying, the problem is PSPACE-complete – not more complex than stan-
dard LTL. The fact our algorithm uses Boolean automata also enables us to suggest a
solution for threshold satisfiability, and to give a partial solution to threshold synthesis.
In addition, it allows to adapt the heuristics and tools that exist for Boolean automata.
Before we continue to describe our contribution, let us review existing work on dis-
counting. The notion of discounting has been studied in several fields, such as economy,
game-theory, and Markov decision processes [25]. In the area of formal verification, it
was suggested in [8] to augment the μ-calculus with discounting operators. The discount-
ing suggested there is exponential; that is, with each iteration, the satisfaction value of the
formula decreases by a multiplicative factor in (0, 1]. Algorithmically, [8] shows how to
evaluate discounted μ-calculus formulas with arbitrary precision. Formulas of LTL can
be translated to the μ-calculus, thus [8] can be used in order to approximately model-
check discounted-LTL formulas. However, the translation from LTL to the μ-calculus
involves an exponential blowup [6] (and is complicated), making this approach ineffi-
cient. Moreover, our approach allows for arbitrary discounting functions, and the algo-
rithm returns an exact solution to the threshold model-checking problem, which is more
difficult than the approximation problem.
Closer to our work is [7], where CTL is augmented with discounting and weighted-
average operators. The motivation in [7] is to introduce a logic whose semantics is not
too sensitive to small perturbations in the model. Accordingly, formulas are evaluated
on weighted-systems or on Markov-chains. Adding discounting and weighted-average
operators to CTL preserves its appealing complexity, and the model-checking problem
for the augmented logic can be solved in polynomial time. As is the case in the Boolean
semantics, the expressive power of discounted CTL is limited. The fact that the same
combination of discounting and weighted-average operators leads to undecidability in the
context of LTL witnesses the technical challenges of the LTLdisc [D] setting.
Perhaps closest to our approach is [19], where a version of discounted-LTL was in-
troduced. Semantically, there are two main differences between the logics. The first is
that [19] uses discounted sum, while we interpret discounting without accumulation,
and the second is that the discounting there replaces the standard temporal operators, so
all eventualities are discounted. As discounting functions tend to 0, this strictly restricts
the expressive power of the logic, and one cannot specify traditional eventualities in it.
On the positive side, it enables a clean algebraic characterization of the semantics, and
indeed the contribution in [19] is a comprehensive study of the mathematical properties
of the logic. Yet, [19] does not study algorithmic questions about the logic. We, on
the other hand, focus on the algorithmic properties of the logic, and specifically on the
model-checking problem.

Let us now return to our contribution. After introducing LTLdisc [D] and studying its
model-checking problem, we augment LTLdisc [D] with propositional quality operators.
Beyond the operators min, max, and ¬, which are already present, two basic proposi-
tional quality operators are the multiplication of an LTLdisc [D] formula by a constant
in [0, 1], and the averaging between the satisfaction values of two LTLdisc [D] formulas
[1]. We show that while the first extension does not increase the expressive power of
LTLdisc [D] or its complexity, the latter causes the model-checking problem to become
undecidable. In fact, model checking becomes undecidable even if we allow averaging
in combination with a single discounting function. Recall that this is in contrast with
the extension of discounted CTL with an average operator, where the complexity of the
model-checking problem stays polynomial [7].
We consider additional extensions of LTLdisc [D]. First, we study a variant of the
discounting-eventually operators in which we allow the discounting to tend to arbitrary
values in [0, 1] (rather than to 0). This captures the intuition that we are not always pes-
simistic about the future, but can be, for example, ambivalent about it, by tending to 1/2.
We show that all our results hold under this extension. Second, we add to LTLdisc [D] past
operators and their discounting versions (specifically, we allow a discounting-“since” op-
erator, and its dual). In the traditional semantics, past operators enable clean specifica-
tions of many interesting properties, make the logic exponentially more succinct, and
can still be handled within the same complexity bounds [17, 18]. We show that the same
holds for the discounted setting. Finally, we show how LTLdisc [D] and algorithms for it
can be used also for reasoning about weighted systems.
Due to lack of space, most proofs are omitted, and can be found in the full version, in
the authors’ home pages.

2 The Logic LTLdisc [D]


The linear temporal logic LTLdisc [D] generalizes LTL by adding discounting temporal
operators. The logic is actually a family of logics, each parameterized by a set D of dis-
counting functions.
Let ℕ = {0, 1, ...}. A function η : ℕ → [0, 1] is a discounting function if
lim_{i→∞} η(i) = 0, and η is strictly monotonic-decreasing. Examples for natural discount-
ing functions are η(i) = λ^i, for some λ ∈ (0, 1), and η(i) = 1/(i + 1).
Given a set of discounting functions D, we define the logic LTLdisc [D] as follows.
The syntax of LTLdisc [D] adds to LTL the operator ϕUη ψ (discounting-Until), for every
function η ∈ D. Thus, the syntax is given by the following grammar, where p ranges
over the set AP of atomic propositions and η ∈ D.
ϕ := True | p | ¬ϕ | ϕ ∨ ϕ | Xϕ | ϕUϕ | ϕUη ϕ.
The semantics of LTLdisc [D] is defined with respect to a computation π = π_0, π_1, . . . ∈
(2^AP)^ω. Given a computation π and an LTLdisc [D] formula ϕ, the truth value of ϕ in π
is a value in [0, 1], denoted [[π, ϕ]]. The value is defined by induction on the structure of
ϕ as follows, where π^i = π_i, π_{i+1}, . . ..

– [[π, True]] = 1.
– [[π, p]] = 1 if p ∈ π_0, and 0 if p ∉ π_0.
– [[π, ¬ϕ]] = 1 − [[π, ϕ]].
– [[π, ϕ ∨ ψ]] = max{[[π, ϕ]], [[π, ψ]]}.
– [[π, Xϕ]] = [[π^1, ϕ]].
– [[π, ϕUψ]] = sup_{i≥0} {min{[[π^i, ψ]], min_{0≤j<i} {[[π^j, ϕ]]}}}.
– [[π, ϕUη ψ]] = sup_{i≥0} {min{η(i)·[[π^i, ψ]], min_{0≤j<i} {η(j)·[[π^j, ϕ]]}}}.

The intuition is that events that happen in the future have a lower influence, and the
rate by which this influence decreases depends on the function η. 1 For example, the sat-
isfaction value of a formula ϕUη ψ in a computation π depends on the best (supremum)
value that ψ can get along the entire computation, while considering the discounted sat-
isfaction of ψ at a position i, as a result of multiplying it by η(i), and the same for the
value of ϕ in the prefix leading to the i-th position.
We add the standard abbreviations Fϕ ≡ TrueUϕ, and Gϕ = ¬F¬ϕ, as well as their
quantitative counterparts: Fη ϕ ≡ TrueUη ϕ, and Gη ϕ = ¬Fη ¬ϕ. We denote by |ϕ| the
number of subformulas of ϕ.
A computation of the form π = u · v^ω, for u, v ∈ (2^AP)^*, with v ≠ ε, is called a
lasso computation. We observe that since a specific lasso computation has only finitely
many distinct suffixes, the inf and sup in the semantics of LTLdisc [D] can be replaced
with min and max, respectively, when applied to lasso computations.
The semantics is extended to Kripke structures by taking the path that admits the low-
est satisfaction value. Formally, for a Kripke structure K and an LTLdisc [D] formula ϕ
we have that [[K, ϕ]] = inf {[[π, ϕ]] : π is a computation of K}.
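As an illustration of the semantics on lasso computations, the following small evaluator is our own sketch and is not part of the paper; formulas are encoded as nested tuples such as ('U', ϕ, ψ) and ('Ud', η, ϕ, ψ), and the numeric cutoff in the discounting case is an extra assumption used only to guarantee termination when all values are 0.

# Illustrative evaluator for LTLdisc[D] on lasso computations u·v^ω (our own
# sketch).  Formulas are nested tuples: ('true',), ('ap', p), ('not', f),
# ('or', f, g), ('X', f), ('U', f, g), ('Ud', eta, f, g).

def _canon(i, u_len, v_len):
    # Canonical index of the suffix π^i of the lasso.
    return i if i < u_len else u_len + (i - u_len) % v_len

def value(f, pi, i=0):
    """Satisfaction value [[π^i, f]] for the lasso pi = (u, v)."""
    u, v = pi
    n = len(u) + len(v)                  # number of distinct suffixes
    i = _canon(i, len(u), len(v))
    op = f[0]
    if op == 'true':
        return 1.0
    if op == 'ap':
        return 1.0 if f[1] in (u + v)[i] else 0.0
    if op == 'not':
        return 1.0 - value(f[1], pi, i)
    if op == 'or':
        return max(value(f[1], pi, i), value(f[2], pi, i))
    if op == 'X':
        return value(f[1], pi, i + 1)
    if op == 'U':
        # On a lasso, the sup is attained within one round through all suffixes.
        best, pref = 0.0, 1.0
        for k in range(i, i + n):
            best = max(best, min(value(f[2], pi, k), pref))
            pref = min(pref, value(f[1], pi, k))
        return best
    if op == 'Ud':
        eta = f[1]
        best, pref, k = 0.0, 1.0, 0
        # Every term is bounded by eta(k), which tends to 0, so we may stop
        # once eta(k) cannot improve on the best value found so far (the 1e-9
        # cutoff only guarantees termination when all values are 0).
        while eta(k) > best and eta(k) > 1e-9:
            best = max(best, min(eta(k) * value(f[3], pi, i + k), pref))
            pref = min(pref, eta(k) * value(f[2], pi, i + k))
            k += 1
        return best
    raise ValueError(f"unknown operator {op!r}")

# Example: π = {a}·{b}^ω and F_η b = True U_η b with η(i) = 0.5^i gives 0.5.
pi = (({'a'},), ({'b'},))
print(value(('Ud', lambda i: 0.5 ** i, ('true',), ('ap', 'b')), pi))   # 0.5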

Example 1. Consider a lossy disk: at every moment in time there is a chance that some
bit flips its value. Fixing flips is done by a global error-correcting procedure. This
procedure manipulates the entire content of the disk, such that initially it causes more
errors in the disk, but the longer it runs, the more bits it fixes.
Let init and terminate be atomic propositions indicating when the error-correcting
procedure is initiated and terminated, respectively. The quality of the disk (that is, a mea-
sure of the amount of correct bits) can be specified by the formula ϕ = GFη (init ∧
¬Fμ terminate) for some appropriate discounting functions η and μ. Intuitively, ϕ gets
a higher satisfaction value the shorter the waiting time is between initiations of the error-
correcting procedure, and the longer the procedure runs (that is, not terminated) in be-
tween these initiations. Note that the “worst case” nature of LTLdisc [D] fits here. For
instance, running the procedure for a very short time, even once, will cause many errors.

3 LTLdisc [D] Model Checking


In the Boolean setting, the model-checking problem asks, given an LTL formula ϕ and a
Kripke structure K, whether [[K, ϕ]] = True. In the quantitative setting, the
1 Observe that in our semantics the satisfaction value of future events tends to 0. One may think
of scenarios where future events are discounted towards another value in [0, 1] (e.g., discounting
towards 1/2 as ambivalence regarding the future). We address this in Section 5.

model-checking problem is to compute [[K, ϕ]], where ϕ is now an LTLdisc [D] formula.
A simpler version of this problem is the threshold model-checking problem: given ϕ, K,
and a threshold v ∈ [0, 1], decide whether [[K, ϕ]] ≥ v. In this section we show how we
can solve the latter.
Our solution uses the automata-theoretic approach, and consists of the following steps.
We start by translating ϕ and v to an alternating weak automaton Aϕ,v such that L(Aϕ,v) ≠
∅ iff there exists a computation π such that [[π, ϕ]] > v. The challenge here is that ϕ has in-
finitely many satisfaction values, naively implying an infinite-state automaton. We show
that using the threshold and the discounting behavior of the eventualities, we can restrict
attention to a finite resolution of satisfaction values, enabling the construction of a finite
automaton. Complexity-wise, the size of Aϕ,v depends on the functions in D. In Sec-
tion 3.3, we analyze the complexity for the case of exponential-discounting functions.
The second step is to construct a nondeterministic Büchi automaton B that is equiva-
lent to Aϕ,v . In general, alternation removal might involve an exponential blowup in the
state space [21]. We show, by a careful analysis of Aϕ,v , that we can remove its alterna-
tion while only having a polynomial state blowup.
We complete the model-checking procedure by composing the nondeterministic Büchi
automaton B with the Kripke structure K, as done in the traditional, automata-based,
model-checking procedure.
The complexity of model-checking an LTLdisc [D] formula depends on the discounting
functions in D. Intuitively, the faster the discounting tends to 0, the less states there will be.
For exponential-discounting, we show that the complexity is NLOGSPACE in the system
(the Kripke structure) and PSPACE in the specification (the LTLdisc [D] formula and the
threshold), staying in the same complexity classes of standard LTL model-checking.
We conclude the section by showing how to use the generated nondeterministic Büchi
automaton for addressing threshold satisfiability and synthesis.

3.1 Alternating Weak Automata

For a given set X, let B +(X) be the set of positive Boolean formulas over X (i.e., Boolean
formulas built from elements in X using ∧ and ∨), where we also allow the formulas
True and False. For Y ⊆ X, we say that Y satisfies a formula θ ∈ B + (X) iff the
truth assignment that assigns true to the members of Y and assigns false to the members
of X \ Y satisfies θ. An alternating Büchi automaton on infinite words is a tuple A =
⟨Σ, Q, q_in, δ, α⟩, where Σ is the input alphabet, Q is a finite set of states, q_in ∈ Q is an
initial state, δ : Q × Σ → B^+(Q) is a transition function, and α ⊆ Q is a set of accepting
states. We define runs of A by means of (possibly) infinite DAGs (directed acyclic graphs).
A run of A on a word w = σ_0 · σ_1 · · · ∈ Σ^ω is a (possibly) infinite DAG G = ⟨V, E⟩
satisfying the following (note that there may be several runs of A on w).

– V ⊆ Q × ℕ is as follows. Let Q_l ⊆ Q denote all states in level l. Thus, Q_l = {q :
⟨q, l⟩ ∈ V}. Then, Q_0 = {q_in}, and Q_{l+1} satisfies ⋀_{q∈Q_l} δ(q, σ_l).
– For every l ∈ ℕ, Q_l is minimal with respect to containment.
– E ⊆ ⋃_{l≥0} (Q_l × {l}) × (Q_{l+1} × {l + 1}) is such that for every state q ∈ Q_l, the
set {q' ∈ Q_{l+1} : E(⟨q, l⟩, ⟨q', l + 1⟩)} satisfies δ(q, σ_l).

Thus, the root of the DAG contains the initial state of the automaton, and the states asso-
ciated with nodes in level l + 1 satisfy the transitions from states corresponding to nodes
in level l. The run G accepts the word w if all its infinite paths satisfy the acceptance con-
dition α. Thus, in the case of Büchi automata, all the infinite paths have infinitely many
nodes q, l such that q ∈ α (it is not hard to prove that every infinite path in G is part
of an infinite path starting in level 0). A word w is accepted by A if there is a run that
accepts it. The language of A, denoted L(A), is the set of infinite words that A accepts.
When the formulas in the transition function of A contain only disjunctions, then A
is nondeterministic, and its runs are DAGs of width 1, where at each level there is a single
node.
The alternating automaton A is weak, denoted AWA, if its state space Q can be par-
titioned into sets Q1 , . . . , Qk , such that the following hold: First, for every 1 ≤ i ≤ k
either Qi ⊆ α, in which case we say that Qi is an accepting set, or Qi ∩ α = ∅, in
which case we say that Qi is rejecting. Second, there is a partial-order ≤ over the sets,
and for every 1 ≤ i, j ≤ k, if q ∈ Qi , s ∈ Qj , and s ∈ δ(q, σ) for some σ ∈ Σ, then
Qj ≤ Qi . Thus, transitions can lead only to states that are smaller in the partial order.
Consequently, each run of an AWA eventually gets trapped in a set Qi and is accepting
iff this set is accepting.

3.2 From LTLdisc [D] to AWA


Our model-checking algorithm is based on translating an LTLdisc [D] formula ϕ to an
AWA. Intuitively, the states of the AWA correspond to assertions of the form ψ > t
or ψ < t for every subformula ψ of ϕ, and for certain thresholds t ∈ [0, 1]. A lasso
computation is then accepted from state ψ > t iff [[π, ψ]] > t. The assumption about
the computation being a lasso is needed only for the “only if” direction, and it does not
influence the proof’s generality since the language of an automaton is non-empty iff there
is a lasso witness for its non-emptiness. By setting the initial state to ϕ > v, we are done.
Defining the appropriate transition function for the AWA follows the semantics of
LTLdisc [D] in the expected manner. A naive construction, however, yields an infinite-state
automaton (even if we only expand the state space on-the-fly, as discounting
formulas can take infinitely many satisfaction values). As can be seen in the proof of
Theorem 1, the “problematic” transitions are those that involve the discounting operators.
The key observation is that, given a threshold v and a computation π, when evaluating a
discounted operator on π, one can restrict attention to two cases: either the satisfaction
value of the formula goes below v, in which case this happens after a bounded prefix,
or the satisfaction value always remains above v, in which case we can replace the dis-
counted operator with a Boolean one. This observation allows us to expand only a finite
number of states on-the-fly.
Before describing the construction of the AWA, we need the following lemma, which
reduces an extreme satisfaction of an LTLdisc [D] formula, meaning satisfaction with a
value of either 0 or 1, to a Boolean satisfaction of an LTL formula. The proof proceeds
by induction on the structure of the formulas.

Lemma 1. Given an LTLdisc [D] formula ϕ, there exist LTL formulas ϕ+ and ϕ<1 such
that |ϕ+ | and |ϕ<1 | are both O(|ϕ|) and the following hold for every computation π.
1. If [[π, ϕ]] > 0 then π |= ϕ+ , and if [[π, ϕ]] < 1 then π |= ϕ<1 .
2. If π is a lasso, then if π |= ϕ+ then [[π, ϕ]] > 0 and if π |= ϕ<1 then [[π, ϕ]] < 1.

Henceforth, given an LTLdisc [D] formula ϕ, we refer to ϕ+ as in Lemma 1.


Consider an LTLdisc [D] formula ϕ. By Lemma 1, if there exists a computation π such
that [[π, ϕ]] > 0, then ϕ+ is satisfiable. Conversely, since ϕ+ is a Boolean LTL formula,
then by [27] we know that ϕ+ is satisfiable iff there exists a lasso computation π that
satisfies it, in which case [[π, ϕ]] > 0. We conclude with the following.
Corollary 1. Consider an LTLdisc [D] formula ϕ. There exists a computation π such that
[[π, ϕ]] > 0 iff there exists a lasso computation π  such that [[π  , ϕ]] > 0, in which case
π  |= ϕ+ as well.

Remark 1. The curious reader may wonder why we do not prove that [[π, ϕ]] > 0 iff
π |= ϕ+ for every computation π. As it turns out, a translation that is valid also for
computations with no period is not always possible. For example, as is the case with
the prompt-eventuality operator of [14], the formula ϕ = G(Fη p) is such that the set of
computations π with [[π, ϕ]] > 0 is not ω-regular, thus one cannot hope to define an LTL
formula ϕ+ .

We start with some definitions. For a function f : ℕ → [0, 1] and for k ∈ ℕ, we
define f^{+k} : ℕ → [0, 1] as follows. For every i ∈ ℕ we have that f^{+k}(i) = f(i + k).
Let ϕ be an LTLdisc [D] formula over AP. We define the extended closure of ϕ, denoted
xcl(ϕ), to be the set of all the formulas ψ of the following classes:
1. ψ is a subformula of ϕ.
2. ψ is a subformula of θ+ or ¬θ+, where θ is a subformula of ϕ.
3. ψ is of the form θ1 Uη+k θ2 for k ∈ ℕ, where θ1 Uη θ2 is a subformula of ϕ.

Observe that xcl(ϕ) may be infinite, and that it has both LTLdisc [D] formulas (from
Classes 1 and 3) and LTL formulas (from Class 2).

Theorem 1. Given an LTLdisc [D] formula ϕ and a threshold v ∈ [0, 1], there exists an
AWA Aϕ,v such that for every computation π the following hold.
1. If [[π, ϕ]] > v, then Aϕ,v accepts π.
2. If Aϕ,v accepts π and π is a lasso computation, then [[π, ϕ]] > v.

Proof. We construct Aϕ,v = ⟨Q, 2^AP, Q_0, δ, α⟩ as follows.


The state space Q consists of two types of states. Type-1 states are assertions of the
form (ψ > t) or (ψ < t), where ψ ∈ xcl(ϕ) is of Class 1 or 3 and t ∈ [0, 1]. Type-2
states correspond to LTL formulas of Class 2. Let S be the set of Type-1 and Type-2
states for all ψ ∈ xcl(ϕ) and thresholds t ∈ [0, 1]. Then, Q is the subset of S constructed
on-the-fly according to the transition function defined below. We later show that Q is
indeed finite.

The transition function δ : Q × 2^AP → B^+(Q) is defined as follows. For Type-2
states, the transitions are as in the standard translation from LTL to AWA [27] (see the
full version for details). For the other states, we define the transitions as follows. Let
σ ∈ 2^AP.
– δ((True > t), σ) = True if t < 1, and False if t = 1.
– δ((True < t), σ) = False.
– δ((False > t), σ) = False.
– δ((False < t), σ) = True if t > 0, and False if t = 0.
– δ((p > t), σ) = True if p ∈ σ and t < 1, and False otherwise.
– δ((p < t), σ) = False if p ∈ σ or t = 0, and True otherwise.
– δ((ψ1 ∨ ψ2 > t), σ) = δ((ψ1 > t), σ) ∨ δ((ψ2 > t), σ).
– δ((ψ1 ∨ ψ2 < t), σ) = δ((ψ1 < t), σ) ∧ δ((ψ2 < t), σ).
– δ((¬ψ1 > t), σ) = δ((ψ1 < 1 − t), σ).
– δ((¬ψ1 < t), σ) = δ((ψ1 > 1 − t), σ).
– δ((Xψ1 > t), σ) = (ψ1 > t).
– δ((Xψ1 < t), σ) = (ψ1 < t).
– δ((ψ1 Uψ2 > t), σ) = δ((ψ2 > t), σ) ∨ [δ((ψ1 > t), σ) ∧ (ψ1 Uψ2 > t)] if 0 < t < 1;
False if t ≥ 1; δ(((ψ1 Uψ2)+), σ) if t = 0.
– δ((ψ1 Uψ2 < t), σ) = δ((ψ2 < t), σ) ∧ [δ((ψ1 < t), σ) ∨ (ψ1 Uψ2 < t)] if 0 < t ≤ 1;
True if t > 1; False if t = 0.
– δ((ψ1 Uη ψ2 > t), σ) = δ((ψ2 > t/η(0)), σ) ∨ [δ((ψ1 > t/η(0)), σ) ∧ (ψ1 Uη+1 ψ2 > t)]
if 0 < t/η(0) < 1; False if t/η(0) ≥ 1; δ(((ψ1 Uη ψ2)+), σ) if t/η(0) = 0 (i.e., t = 0).
– δ((ψ1 Uη ψ2 < t), σ) = δ((ψ2 < t/η(0)), σ) ∧ [δ((ψ1 < t/η(0)), σ) ∨ (ψ1 Uη+1 ψ2 < t)]
if 0 < t/η(0) ≤ 1; True if t/η(0) > 1; False if t/η(0) = 0 (i.e., t = 0).
We provide some intuition for the more complex parts of the transition function: con-
sider, for example, the transition δ((ψ1 Uη ψ2 > t), σ). Since η is decreasing, the highest
possible satisfaction value for ψ1 Uη ψ2 is η(0). Thus, if η(0) ≤ t (equivalently, t/η(0) ≥ 1),
≥ 1),
then it cannot hold that ψ1 Uη ψ2 > t, so the transition is to False. If t = 0, then we only
need to ensure that the satisfaction value of ψ1 Uη ψ2 is not 0. To do so, we require that
(ψ1 Uη ψ2 )+ is satisfied. By Corollary 1, this is equivalent to the satisfiability of the for-
mer. So the transition is identical to that of the state (ψ1 Uη ψ2 )+ . Finally, if 0 < t < η(0),
then (slightly abusing notation) the assertion ψ1 Uη ψ2 > t is true if either η(0)ψ2 > t is
true, or both η(0)ψ1 > t and ψ1 Uη+1 ψ2 > t are true.
The initial state of Aϕ,v is (ϕ > v). The accepting states are those of the form
(ψ1 Uψ2 < t), as well as accepting states that arise in the standard translation of Boolean
LTL to AWA (in Type-2 states). Note that each path in the run of Aϕ,v eventually gets
trapped in a single state. Thus, Aϕ,v is indeed an AWA. The intuition behind the accep-
tance condition is as follows. Getting trapped in a state of the form (ψ1 Uψ2 < t) is
allowed, as the eventuality is satisfied with value 0. On the other hand, getting stuck in
other states (of Type-1) is not allowed, as they involve eventualities that are not satisfied
in the threshold promised for them.
This concludes the definition of Aϕ,v . Finally, observe that while the construction as
described above is infinite (indeed, uncountable), only finitely many states are reachable

from the initial state (ϕ > v), and we can compute these states in advance. Intuitively,
this follows from the fact that once the proportion between t and η(i) goes above 1, for
Type-1 states associated with threshold t and subformulas with a discounting function
η, we do not have to generate new states.
A detailed proof of the finiteness and correctness of Aϕ,v is given in the full version.
Since Aϕ,v is a Boolean automaton, L(Aϕ,v) ≠ ∅ iff it accepts a lasso computation.
Combining this observation with Theorem 1, we conclude with the following.
Corollary 2. For an LTLdisc [D] formula ϕ and a threshold v ∈ [0, 1], it holds that
L(Aϕ,v) ≠ ∅ iff there exists a computation π such that [[π, ϕ]] > v.

3.3 Exponential Discounting


The size of the AWA generated as per Theorem 1 depends on the discounting functions.
In this section, we analyze its size for the class of exponential discounting functions,
showing that it is singly exponential in the specification formula and in the threshold.
This class is perhaps the most common class of discounting functions, as it describes what
happens in many natural processes (e.g., temperature change, capacitor charge, effective
interest rate, etc.) [8, 25].
For λ ∈ (0, 1) we define the exponential-discounting function exp_λ : ℕ → [0, 1]
by exp_λ(i) = λ^i. For the purpose of this section, we restrict to λ ∈ (0, 1) ∩ ℚ. Let
E = {exp_λ : λ ∈ (0, 1) ∩ ℚ}, and consider the logic LTLdisc [E].
For an LTLdisc [E] formula ϕ, we define F(ϕ) = {λ1, . . . , λk} to be the set of all values
λ for which the operator U_{exp_λ} appears in ϕ. Let ||ϕ|| be the length of the description
of ϕ. That is, in addition to |ϕ|, we include in ||ϕ|| the length, in bits, of describing F(ϕ).
Theorem 2. Given an LTLdisc [E] formula ϕ and a threshold v ∈ [0, 1] ∩ ℚ, there exists
an AWA Aϕ,v such that for every computation π the following hold.

1. If [[π, ϕ]] > v, then Aϕ,v accepts π.


2. If Aϕ,v accepts π and π is a lasso computation, then [[π, ϕ]] > v.

Furthermore, the number of states of Aϕ,v is singly exponential in |ϕ| and in the de-
scription of v.
The proof follows from the following observation. Let λ ∈ (0, 1) and v ∈ (0, 1). When
discounting by exp_λ, the number of states in the AWA constructed as per Theorem 1 is
proportional to the maximal number i such that λ^i > v, which is at most log_λ v = log v / log λ,
which is polynomial in the description length of v and λ. A similar (yet more complicated)
consideration is applied for the setting of multiple discounting functions and negations.
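As a concrete illustration of this bound (our own snippet, not from the paper), one can count how many discount levels λ^k stay above the threshold v; this count matches ⌈log_λ v⌉ and is polynomial in the number of bits of v and λ.

# Illustrative count of the discount levels that exceed a threshold v when
# discounting with exp_λ (our own snippet); it matches the ⌈log_λ v⌉ bound.
import math

def levels_above(lam, v):
    k = 0
    while lam ** k > v:
        k += 1
    return k                      # number of k with λ^k > v

lam, v = 0.5, 0.01
print(levels_above(lam, v))                         # 7
print(math.ceil(math.log(v) / math.log(lam)))       # ⌈log_λ v⌉ = 7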

3.4 From Aϕ,v to an NBA


Every AWA can be translated to an equivalent nondeterministic Büchi automaton (NBA,
for short), yet the state blowup might be exponential [BKR10, MH84]. By carefully ana-
lyzing the AWA Aϕ,v generated in Theorem 1, we show that it can be translated to an
NBA with only a polynomial blowup.

The idea behind our complexity analysis is as follows. Translating an AWA to an NBA
involves alternation removal, which proceeds by keeping track of entire levels in a run-
DAG . Thus, a run of the NBA corresponds to a sequence of subsets of Q. The key to the
reduced state space is that the number of such subsets is only |Q|O(|ϕ|) and not 2|Q| . To
see why, consider a subset S of the states of A. We say that S is minimal if it does not
include two states of the form ϕ < t1 and ϕ < t2 , for t1 < t2 , nor two states of the form
ϕUη+i ψ < t and ϕUη+j ψ < t, for i < j, and similarly for “>”. Intuitively, sets that are
not minimal hold redundant assertions, and can be ignored. Accordingly, we restrict the
state space of the NBA to have only minimal sets.
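One way to picture the redundancy argument is the following sketch of our own (illustrative only, not the paper's formal construction; the analogous rule for shifted U_{η+i} copies is omitted): within a level, a weaker assertion on the same formula is implied by a stronger one and can be dropped.

# Illustrative reduction of a level to a "minimal" set of Type-1 assertions
# (our own reading of the redundancy argument).  Assertions are triples
# (formula, op, threshold) with op in {'<', '>'}.

def minimize(assertions):
    # (ϕ < t1) implies (ϕ < t2) whenever t1 < t2, so only the smallest '<'
    # threshold per formula is kept; dually, only the largest '>' threshold.
    best = {}
    for (phi, op, t) in assertions:
        key = (phi, op)
        if key not in best:
            best[key] = t
        elif op == '<':
            best[key] = min(best[key], t)
        else:
            best[key] = max(best[key], t)
    return {(phi, op, t) for (phi, op), t in best.items()}

S = {('p U q', '<', 0.3), ('p U q', '<', 0.7), ('r', '>', 0.2), ('r', '>', 0.5)}
print(sorted(minimize(S)))   # [('p U q', '<', 0.3), ('r', '>', 0.5)]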
Lemma 2. For an LTLdisc [D] formula ϕ and v ∈ [0, 1], the AWA Aϕ,v constructed in
Theorem 1 with state space Q can be translated to an NBA with |Q|O(|ϕ|) states.

3.5 Decision Procedures for LTLdisc [D]


Model Checking and Satisfiability. Consider a Kripke structure K, an LTLdisc [D]
formula ϕ, and a threshold v. By checking the emptiness of the intersection of K with
A¬ϕ,1−v, we can solve the threshold model-checking problem. Indeed, L(A¬ϕ,1−v) ∩
L(K) ≠ ∅ iff there exists a lasso computation π that is induced by K such that [[π, ϕ]] < v,
which happens iff it is not true that [[K, ϕ]] ≥ v.
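The overall procedure can be summarized by the following schematic driver (our own outline; build_awa, remove_alternation, product, and is_empty are hypothetical placeholders standing for the constructions of Sections 3.2-3.4 and a standard emptiness check).

# Schematic driver for threshold model checking (our own outline).  The four
# helper arguments are hypothetical placeholders, not real APIs.

def threshold_model_check(K, phi, v, build_awa, remove_alternation, product, is_empty):
    """Decide whether [[K, phi]] >= v (schematic)."""
    # A_{¬phi,1-v} accepts exactly the lasso computations pi with
    # [[pi, ¬phi]] > 1 - v, i.e., [[pi, phi]] < v.
    awa = build_awa(('not', phi), 1 - v)
    nba = remove_alternation(awa)          # only a polynomial blowup (Lemma 2)
    # [[K, phi]] >= v  iff  no computation of K is accepted by the NBA,
    # i.e., the product language is empty.
    return is_empty(product(K, nba))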
The complexity of the model-checking procedure depends on the discounting func-
tions in D. For the set of exponential-discounting functions E, we provide the following
concrete complexities, showing that it stays in the same complexity classes of standard
LTL model-checking.

Theorem 3. For a Kripke structure K, an LTLdisc [E] formula ϕ, and a threshold v ∈



[0, 1] ∩ ℚ, the problem of deciding whether [[K, ϕ]] > v is in NLOGSPACE in the number
of states of K, and in PSPACE in |ϕ| and in the description of v.

Proof. By Theorem 2 and Lemma 2, the size of an NBA B corresponding to ϕ and v is


singly exponential in |ϕ| and in the description of v. Hence, we can check the emptiness
of the intersection of K and B via standard “on the fly” procedures, getting the stated
complexities.

Note that the complexity in Theorem 3 is only NLOGSPACE in the system, since
our solution does not analyze the Kripke structure, but only takes its product with the
specification’s automaton. This is in contrast to the approach of model checking temporal
logic with (non-discounting) accumulative values, which, when decidable, involves a
doubly-exponential dependency on the size of the system [4].
Finally, observe that the NBA obtained in Lemma 2 can be used to solve the threshold-
satisfiability problem: given an LTLdisc [D] formula ϕ and a threshold v ∈ [0, 1], we can
decide whether there is a computation π such that [[π, ϕ]] ∼ v, for ∼∈ {<, >}, and return
such a computation when the answer is positive. This is done by simply deciding whether
there exists a word that is accepted by the NBA.

Threshold Synthesis. Consider an LTLdisc [D] formula ϕ, and assume a partition of
the atomic propositions in ϕ into input and output signals. We can use the NBA Aϕ,v in

order to address the synthesis problem, as stated in the following theorem (see the full
version for the proof).
Theorem 4. Consider an LTLdisc [D] formula ϕ. If there exists a transducer T all of
whose computations π satisfy [[π, ϕ]] > v, then we can generate a transducer T′ all of
whose computations τ satisfy [[τ, ϕ]] ≥ v.

4 Adding Propositional Quality Operators


As model checking is decidable for LTLdisc [D], one may wish to push the limit and ex-
tend the expressive power of the logic. In particular, of great interest is the combination of
discounting with propositional quality operators [1].

4.1 Adding the Average Operator


A well-motivated extension is the introduction of the average operator ⊕, with the se-
mantics [[π, ϕ ⊕ ψ]] = ([[π, ϕ]] + [[π, ψ]])/2. The work in [1] proves that extending LTL by this
operator, as well as with other propositional quantitative operators, enables clean spec-
ification of quality and results in a logic for which the model-checking problem can be
solved in PSPACE.
We show that adding the ⊕ operator to LTLdisc [D] gives a logic, denoted LTLdisc⊕ [D],
for which the validity and model-checking problems are undecidable. The validity prob-
lem asks, given an LTLdisc⊕ [D] formula ϕ over the atomic propositions AP and a thresh-
old v ∈ [0, 1], whether [[π, ϕ]] > v for every π ∈ (2AP )ω .
In the undecidability proof, we show a reduction from the 0-halting problem for two-
counter machines. A two-counter machine M is a sequence (l1 , . . . , ln ) of commands
involving two counters x and y. We refer to {1, . . . , n} as the locations of the machine.
There are five possible forms of commands:
INC(c), DEC(c), GOTO li , IF c=0 GOTO li ELSE GOTO lj , HALT,
where c ∈ {x, y} is a counter and 1 ≤ i, j ≤ n are locations. Since we can always
check whether c = 0 before a DEC(c) command, we assume that the machine never
reaches DEC(c) with c = 0. That is, the counters never have negative values. Given a
counter machine M, deciding whether M halts is known to be undecidable [20]. Given
M, deciding whether M halts with both counters having value 0, termed the 0-halting
problem, is also undecidable: given a counter machine M, we can replace every HALT
command with a code that clears the counters before halting.
Theorem 5. The validity problem for LTLdisc⊕ [D] is undecidable (for every nonempty
set of discounting functions D).
The proof goes along the following lines: We construct from M an LTLdisc⊕ [D]
formula ϕ such that M 0-halts iff there exists a computation π such that [[π, ϕ]] = 1/2.
The idea behind the construction is as follows. The computation that ϕ is verified with
corresponds to a description of a run of M, where every triplet ⟨li, α, β⟩ is encoded as
the string i x^α y^β #.
The formula ϕ will require the following properties of the computation π (recall that
the setting is quantitative, not Boolean):

1. The first configuration in π is the initial configuration of M, namely ⟨l1, 0, 0⟩, or 1#
in our encoding.
2. The last configuration in π is ⟨HALT, 0, 0⟩, or k in our encoding, where k is a line
whose command is HALT.
3. π represents a legal run of M, up to the consistency of the counters between transi-
tions.
4. The counters are updated correctly between configurations.
Properties 1-3 can easily be captured by an LTL formula. Property 4 utilizes the expres-
sive power of LTLdisc⊕ [D], as we now explain. The intuition behind Property 4 is the
following. We compare the value of a counter before and after a command, such that the
formula takes a value smaller than 1/2 if a violation is encountered, and 1/2 otherwise. Since
the value of counters can change by at most 1, the essence of this formula is the ability
to test equality of counters.
We start with a simpler case, to demonstrate the point. Let η ∈ D be a discounting
function. Consider the formula CountA := aUη ¬a and the computation a^i b^j #^ω. It
holds that [[a^i b^j #^ω, CountA]] = η(i). Similarly, it holds that [[a^i b^j #^ω, aU(bUη ¬b)]] =
η(j). Denote the latter by CountB. Let CompareAB := (CountA ⊕ ¬CountB) ∧
(¬CountA ⊕ CountB). We now have that
[[a^i b^j #^ω, CompareAB]] = min{(η(i) + 1 − η(j))/2, (η(j) + 1 − η(i))/2} = 1/2 − |η(i) − η(j)|/2,
and observe that the latter is 1/2 iff i = j (and is less than 1/2 otherwise). This is because
η is strictly decreasing, and in particular an injection.
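The closed form can be sanity-checked numerically; the snippet below is our own illustration, plugging η(n) = 2^(-n) into the expression above and confirming that the value equals 1/2 exactly when i = j.

# Numeric sanity check of the counter-comparison gadget (our own snippet):
# the value of CompareAB on a^i b^j #^ω equals 1/2 iff i = j.

def compare_value(eta, i, j):
    count_a = eta(i)           # value of CountA on a^i b^j #^ω
    count_b = eta(j)           # value of CountB on a^i b^j #^ω
    return min((count_a + 1 - count_b) / 2, (count_b + 1 - count_a) / 2)

eta = lambda n: 0.5 ** n
for i, j in [(2, 2), (2, 3), (5, 1)]:
    print(i, j, compare_value(eta, i, j))   # 0.5 only when i == j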
Thus, we can compare counters. To apply this technique to the encoding of a com-
putation, we use formulas that “parse” the input and find successive occurrences of a
counter.
Since it is easy to reduce the validity problem to the model-checking problem (by
considering a Kripke structure that generates all computations), we can conclude with
the following.
Theorem 6. The model-checking problem for LTLdisc⊕ [D] is undecidable.

4.2 Adding Unary Multiplication Operators


As we have seen in Section 4.1, adding the operator ⊕ to LTLdisc [D] makes model check-
ing undecidable. One may still want to find propositional quality operators that we can
add to the logic preserving its decidability. In this section we describe one such oper-
ator. We extend LTLdisc [D] with the operator ▷λ, for λ ∈ (0, 1), with the semantics
[[π, ▷λ ϕ]] = λ · [[π, ϕ]]. This operator allows the specifier to manually change the satis-
faction value of certain subformulas. This can be used to express importance, reliability,
etc. of subformulas. For example, in G(request → (response ∨ ▷2/3 Xresponse)), we limit
the satisfaction value of computations in which a response is given with a delay to 2/3.
Note that the operator ▷λ is similar to a one-time application of U_{exp_λ^{+1}}; thus ▷λ ϕ is
equivalent to False U_{exp_λ^{+1}} ϕ. In practice, it is better to handle ▷λ formulas directly, by
adding the following transitions to the construction in the proof of Theorem 1.
– δ((▷λ ϕ > t), σ) = δ((ϕ > t/λ), σ) if t/λ < 1, and False if t/λ ≥ 1.
– δ((▷λ ϕ < t), σ) = δ((ϕ < t/λ), σ) if t/λ ≤ 1, and True if t/λ > 1.

5 Extensions
LTLdisc [D] with Past Operators A useful augmentation of LTL is the addition of past
operators [18]. These operators enable the specification of clearer and more succinct for-
mulas while preserving the PSPACE complexity of model checking. In the full version,
we add discounting-past operators to LTLdisc [D] and show how to perform model check-
ing on the obtained logic. The solution goes via 2-way weak alternating automata and
preserves the complexity of LTLdisc [D].
Weighted Systems. In LTLdisc [D], the verified system need not be weighted in order to
get a quantitative satisfaction – it stems from taking into account the delays in satisfying
the requirements. Nevertheless, LTLdisc [D] also naturally fits weighted systems, where
the atomic propositions have values in [0, 1]. In the full version we extend the semantics of
LTLdisc [D] to weighted Kripke structures, whose computations assign weights in [0, 1]
to every atomic proposition. We solve the corresponding model-checking problem by
properly extending the construction of the automaton Aϕ,v .
Changing the Tendency of Discounting. One may observe that in our discounting
scheme, the value of future formulas is discounted toward 0. This, in a way, reflects an
intuition that we are pessimistic about the future. While in some cases this fits the needs
of the specifier, it may well be the case that we are ambivalent to the future. To capture
this notion, one may want the discounting to tend to 1/2. Other values are also possible. For
example, it may be that we are optimistic about the future, say when a system improves
its performance while running and we know that components are likely to function better
in the future. We may then want the discounting to tend, say, to 3/4.
To capture this notion, we define the operator Oη,z, parameterized by η ∈ D and
z ∈ [0, 1], with the semantics [[π, ϕ Oη,z ψ]] = sup_{i≥0} {min{η(i)·[[π^i, ψ]] + (1 − η(i))·z,
min_{0≤j<i} η(j)·[[π^j, ϕ]] + (1 − η(j))·z}}. The discounting function η determines the rate
of convergence, and z determines the limit of the discounting. In the full version, we
show how to augment the construction of Aϕ,v with the operator O in order to solve the
model-checking problem.

6 Discussion
An ability to specify and to reason about quality would take formal methods a signifi-
cant step forward. Quality has many aspects, some of which are propositional, such as
prioritizing one satisfaction scheme on top of another, and some are temporal, for ex-
ample having higher quality for implementations with shorter delays. In this work we
provided a solution for specifying and reasoning about temporal quality, augmenting the
commonly used linear temporal logic (LTL). A satisfaction scheme, such as ours, that
is based on elapsed times introduces a big challenge, as it implies infinitely many sat-
isfaction values. Nonetheless, we showed the decidability of the model-checking prob-
lem, and for the natural exponential-decaying satisfactions, the complexity remains as
the one for standard LTL, suggesting the interesting potential of the new scheme. As for
combining propositional and temporal quality operators, we showed that the problem is,
in general, undecidable, while certain combinations, such as adding priorities, preserve
the decidability and the complexity.

Acknowledgement. We thank Eleni Mandrali for pointing to an error in an earlier ver-


sion of the paper.

References
1. Almagor, S., Boker, U., Kupferman, O.: Formalizing and reasoning about quality. In: Fomin,
F.V., Freivalds, R., Kwiatkowska, M., Peleg, D. (eds.) ICALP 2013, Part II. LNCS, vol. 7966,
pp. 15–27. Springer, Heidelberg (2013)
2. Almagor, S., Hirshfeld, Y., Kupferman, O.: Promptness in ω-regular automata. In: Bouajjani,
A., Chin, W.-N. (eds.) ATVA 2010. LNCS, vol. 6252, pp. 22–36. Springer, Heidelberg (2010)
3. Bojańczyk, M., Colcombet, T.: Bounds in ω-regularity. In: 21st LICS, pp. 285–296 (2006)
4. Boker, U., Chatterjee, K., Henzinger, T.A., Kupferman, O.: Temporal Specifications with
Accumulative Values. In: 26th LICS, pp. 43–52 (2011)
5. Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press (1999)
6. Dam, M.: CTL and ECTL as fragments of the modal μ-calculus. TCS 126, 77–96 (1994)
7. de Alfaro, L., Faella, M., Henzinger, T., Majumdar, R., Stoelinga, M.: Model checking dis-
counted temporal properties. TCS 345(1), 139–170 (2005)
8. de Alfaro, L., Henzinger, T., Majumdar, R.: Discounting the future in systems theory.
In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS,
vol. 2719, pp. 1022–1037. Springer, Heidelberg (2003)
9. Droste, M., Rahonis, G.: Weighted automata and weighted logics with discounting.
TCS 410(37), 3481–3494 (2009)
10. Droste, M., Vogler, H.: Weighted automata and multi-valued logics over arbitrary bounded
lattices. TCS 418, 14–36 (2012)
11. Faella, M., Legay, A., Stoelinga, M.: Model checking quantitative linear time logic. Electr.
Notes Theor. Comput. Sci. 220(3), 61–77 (2008)
12. Gastin, P., Oddoux, D.: Fast LTL to büchi automata translation. In: Berry, G., Comon, H.,
Finkel, A. (eds.) CAV 2001. LNCS, vol. 2102, pp. 53–65. Springer, Heidelberg (2001)
13. Kan, S.H.: Metrics and Models in Software Quality Engineering. Addison-Wesley Longman
Publishing Co. (2002)
14. Kupferman, O., Piterman, N., Vardi, M.Y.: From Liveness to Promptness. In: Damm, W.,
Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 406–419. Springer, Heidelberg (2007)
15. Krob, D.: The equality problem for rational series with multiplicities in the tropical semiring
is undecidable. International Journal of Algebra and Computation 4(3), 405–425 (1994)
16. Kwiatkowska, M.: Quantitative verification: models techniques and tools. In:
ESEC/SIGSOFT FSE, pp. 449–458 (2007)
17. Laroussinie, F., Schnoebelen, P.: A hierarchy of temporal logics with past. In: Enjalbert, P.,
Mayr, E.W., Wagner, K.W. (eds.) STACS 1994. LNCS, vol. 775, pp. 47–58. Springer, Hei-
delberg (1994)
18. Lichtenstein, O., Pnueli, A., Zuck, L.: The glory of the past. In: Parikh, R. (ed.) Logic of
Programs 1985. LNCS, vol. 193, pp. 196–218. Springer, Heidelberg (1985)
19. Mandrali, E.: Weighted LTL with discounting. In: Moreira, N., Reis, R. (eds.) CIAA 2012.
LNCS, vol. 7381, pp. 353–360. Springer, Heidelberg (2012)
20. Minsky, M.: Computation: Finite and Infinite Machines, 1st edn. Prentice Hall (1967)
21. Miyano, S., Hayashi, T.: Alternating finite automata on ω-words. TCS 32, 321–330 (1984)
22. Mohri, M.: Finite-state transducers in language and speech processing. Computational Lin-
guistics 23(2), 269–311 (1997)
23. Moon, S., Lee, K., Lee, D.: Fuzzy branching temporal logic. IEEE Transactions on Systems,
Man, and Cybernetics, Part B 34(2), 1045–1055 (2004)

24. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proc. 16th POPL, pp. 179–
190 (1989)
25. Shapley, L.: Stochastic games. Proc. of the National Academy of Science 39 (1953)
26. Spinellis, D.: Code Quality: The Open Source Perspective. Addison-Wesley Professional
(2006)
27. Vardi, M.Y.: An automata-theoretic approach to linear temporal logic. In: Moller, F.,
Birtwistle, G. (eds.) Logics for Concurrency. LNCS, vol. 1043, pp. 238–266. Springer, Hei-
delberg (1996)
28. Vardi, M., Wolper, P.: An automata-theoretic approach to automatic program verification. In:
1st LICS, pp. 332–344 (1986)
Symbolic Model Checking of Stutter-Invariant
Properties Using Generalized Testing Automata*

Ala Eddine Ben Salem1,2 , Alexandre Duret-Lutz1,


Fabrice Kordon2, and Yann Thierry-Mieg2
1
LRDE, EPITA, Le Kremlin-Bicêtre, France
{ala,adl}@lrde.epita.fr
2 Sorbonne Universités, UPMC Univ. Paris 06,

CNRS UMR 7606, LIP6, F-75005, Paris, France


{Fabrice.Kordon,Yann.Thierry-Mieg}@lip6.fr

Abstract. In a previous work, we showed that a kind of ω-automata known


as Transition-based Generalized Testing Automata (TGTA) can outperform the
Büchi automata traditionally used for explicit model checking when verifying
stutter-invariant properties.
In this work, we investigate the use of these generalized testing automata to
improve symbolic model checking of stutter-invariant LTL properties. We propose
an efficient symbolic encoding of stuttering transitions in the product between a
model and a TGTA. Saturation techniques available for decision diagrams then
benefit from the presence of stuttering self-loops on all states of TGTA. Exper-
imentation with this approach confirms that it outperforms the symbolic approach
based on (transition-based) Generalized Büchi Automata.

1 Introduction
Model checking for Linear-time Temporal Logic (LTL) is usually based on converting
the negation of the property to check into an ω-automaton B , composing that automa-
ton with a model M given as a Kripke structure, and finally checking the language
emptiness of the resulting product B ⊗ M [21].
One way to implement this procedure is the explicit approach where B and M are
represented as explicit graphs. B is usually a Büchi automaton or a generalization us-
ing multiple acceptance sets. We use Transition-based Generalized Büchi Automata
(TGBA) for their conciseness. When the property to verify is stutter-invariant [8], test-
ing automata [13] should be preferred to Büchi automata. Instead of observing the
values of state propositions in the system, testing automata observe the changes of
these values, making them suitable to represent stutter-invariant properties. In previ-
ous work [1], we showed how to generalize testing automata using several acceptance
sets, and allowing a more efficient emptiness check. Our comparison showed these
Transition-based Generalized Testing Automata (TGTA) to be superior to TGBA for
model-checking of stutter-invariant properties.
Another implementation of this procedure is the symbolic approach where the au-
tomata and their products are represented by means of decision diagrams (a concise
way to represent large sets or relations) [3]. Symbolic encodings for generalized Büchi

* This work has been partially supported by the project ImpRo/ANR-2010-BLAN-0317.
automata are pretty common [17]. With such encodings, we can compute, in one step, the
sets of all direct successors (PostImage) or predecessors (PreImage) of any set of states.
Using this technique, many symbolic emptiness-check algorithms have been proposed [9, 19, 14].
These algorithms manipulate fixpoints on the transition relation, which can be optimized using
saturation techniques [4, 20].
However, these approaches do not offer any reduction when verifying stutter-invariant
properties. So far, and to the best of our knowledge, testing automata have never been
used in symbolic model checking. Our goal is therefore to propose a symbolic approach
for model checking using TGTA, and compare it to the symbolic approach using TGBA.
In particular, we show that the computation of fixpoints on the transition relation of the
product can be sped up with a dedicated evaluation of stuttering transitions. We exploit
a separation of the transition relation into two terms, one of which greatly benefits from
saturation techniques.
This paper is organized as follows. Section 2 presents the symbolic model-checking
approach for TGBA. For generality we define our symbolic structures using predicates
over state variables in order to remain independent of the decision diagrams used to
actually implement the approach. Section 3 focuses on the encoding of TGTA in the
same framework. We first show how a TGTA can be encoded, then we show how to im-
prove the encoding of the Kripke structure and the product to benefit from saturation in
the encoding of stuttering transitions in the TGTA. Finally, Section 4 compares the two
approaches experimentally with an implementation that uses hierarchical Set Decision
Diagrams (SDD) [20] (a particular type of Decision Diagrams on integer variables, on
which we can apply user-defined operations). On our large, BEEM-based benchmark,
our symbolic encoding of TGTA appears to be superior to TGBA.

2 Symbolic LTL Model Checking Using TGBA


We first recall how to perform the automata-theoretic approach to LTL model checking
using symbolic encodings of TGBA and Kripke structures. This setup will serve as a
baseline to measure our improvements from later sections.
Throughout the paper, let AP designate the finite set of atomic propositions of the
model. Any state of the model is labeled by a valuation of these atomic propositions.
Let Σ = 2AP denote the set of these valuations, which we interpret either as sets or as
Boolean conjunctions. For instance if AP = {a, b}, then Σ = 2AP = {{a, b}, {a}, {b}, ∅},
or equivalently Σ = {ab, ab̄, āb, āb̄}. An execution of the model is an infinite sequence
of such valuations, i.e., an element of Σω.

2.1 Kripke Structures and Their Symbolic Encoding


The executions of the model can be represented by a Kripke structure M .
Definition 1 (Kripke Structure). A Kripke structure over Σ is a tuple M = ⟨S, S0, R, L⟩, where:
– S is a finite set of states,
– S0 ⊆ S is the set of initial states,
– R ⊆ S × S is the transition relation,
– L : S → Σ is a state-labeling function.
An execution w = ℓ0 ℓ1 ℓ2 . . . ∈ Σω is accepted by M if there exists an infinite sequence
s0, s1, . . . ∈ Sω such that s0 ∈ S0 and ∀i ∈ N, (L(si) = ℓi) ∧ ((si, si+1) ∈ R). The language
accepted by M is the set L(M) ⊆ Σω of executions it accepts.
In symbolic model checking we encode such a structure with predicates that rep-
resent sets of states or transitions [18]. These predicates are then implemented using
decision diagrams [3].
Definition 2 (Symbolic Kripke Structure). A Kripke structure M = ⟨S, S0, R, L⟩ can
be encoded by the following predicates, where s, s′ ∈ S and ℓ ∈ Σ:
– PS0(s) is true iff s ∈ S0,
– PR(s, s′) is true iff (s, s′) ∈ R,
– PL(s, ℓ) is true iff L(s) = ℓ.
In the sequel, we use the notations S0(s), R(s, s′) and L(s, ℓ) instead of PS0(s), PR(s, s′)
and PL(s, ℓ). A Symbolic Kripke structure is therefore a triplet of predicates K = ⟨S0, R, L⟩
on state variables.
Variables s and s′ used above are typically implemented using decision diagrams
to represent either a state or a set of states. In a typical encoding [3], states are rep-
resented by conjunctions of Boolean variables. For instance if S = {0, 1}³, a state
s = (1, 0, 1) would be encoded as s1 s̄2 s3. Similarly, s1 s3 would encode the set of states
{(1, 0, 1), (1, 1, 1)}. With this encoding, S0, R and L are propositional formulae which
can be implemented with BDDs or other kinds of decision diagrams. In our implemen-
tation, we used SDDs on integer variables [20].
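To make Def. 2 concrete, here is a minimal sketch of such an encoding for a tiny, made-up structure. Plain Python sets stand in for the decision diagrams (the actual implementation described later uses SDDs); the state-variable names and the structure itself are ours, chosen only for illustration.

    # Minimal illustration of Def. 2: the three predicates of a Symbolic Kripke
    # structure, written with plain Python sets instead of decision diagrams.
    # A state is the frozenset of boolean state variables that are true in it.

    def state(*true_vars):
        return frozenset(true_vars)

    # S0: the set of initial states.
    S0 = {state("s1")}

    # R: the transition relation, as a set of (s, s') pairs.
    R = {
        (state("s1"), state("s1", "s3")),
        (state("s1", "s3"), state("s1")),
    }

    # L: the state labelling, here a map from each state to the set of atomic
    # propositions (over AP = {"a"}) that hold in it.
    L = {
        state("s1"): frozenset(),
        state("s1", "s3"): frozenset({"a"}),
    }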

2.2 TGBA and Their Symbolic Encoding


Transition-based Generalized Büchi Automata (TGBA) [11] are a generalization of the
Büchi Automata (BA) commonly used for model checking. In our context, the TGBA
represents the negation of the LTL property to verify. We chose to use TGBA rather
than BA since they allow a more compact representation of properties [7].
Definition 3 (TGBA). A TGBA over the alphabet Σ = 2AP is a tuple B = ⟨Q, Q0, δ, F⟩
where:
– Q is a finite set of states,
– Q0 ⊆ Q is a set of initial states,
– δ ⊆ Q × Σ × Q is a transition relation, where each element (q, ℓ, q′) represents a
transition from state q to state q′ labeled by the valuation ℓ,
– F ⊆ 2δ is a set of acceptance sets of transitions.
B accepts an execution ℓ0 ℓ1 . . . ∈ Σω if there exists an infinite path (q0, ℓ0, q1)(q1, ℓ1, q2)
. . . ∈ δω that visits each acceptance set infinitely often: q0 ∈ Q0 and ∀f ∈ F, ∀i ∈
N, ∃j ≥ i, (qj, ℓj, qj+1) ∈ f.
The language accepted by B is the set L(B) ⊆ Σω of the executions it accepts.
We target TGBA in this paper because their use of generalized and transition-based
acceptance makes them more concise than traditional Büchi automata [11]. Generalized
acceptance is classically used in symbolic model checking [9] and using transition-
based acceptance is not a problem [17]. People working with (classical) Büchi automata
can adjust to our definitions by “pushing” the acceptance of states to their outgoing
transitions [7].
Any LTL formula ϕ can be converted into a TGBA whose language is the set of ex-
ecutions that satisfy ϕ [7]. Figure 1(a) shows a TGBA derived from the LTL formula
F Ga. The Boolean expression over AP = {a} that labels each transition represents the
valuation of atomic propositions that hold in this transition (in this example, Σ = {a, ā}).
Any infinite path in this example is accepted if it visits infinitely often the only accep-
tance set containing transition (1, a, 1).
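For concreteness, the same automaton can also be written out as a plain data structure following Def. 3. The figure itself is not reproducible from the extracted text, so the transition set below is an assumed reading of Fig. 1(a) that is merely consistent with the description (two states, a single acceptance set containing the transition (1, a, 1)); it should not be taken as a verbatim copy of the figure.

    # A TGBA for F G a, following Def. 3 (assumed reading of Fig. 1(a)).
    # Labels are valuations over AP = {a}: "a" and "!a".
    Q  = {0, 1}
    Q0 = {0}
    delta = {
        (0, "a", 0), (0, "!a", 0),   # state 0 loops on any valuation
        (0, "a", 1),                 # guess that 'a' holds from now on
        (1, "a", 1),                 # stay in state 1 as long as 'a' holds
    }
    F = [{(1, "a", 1)}]              # the unique acceptance set (marked transition)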
Like Kripke structures, TGBAs can be encoded by predicates [18] on state variables.

Definition 4 (Symbolic TGBA). A TGBA ⟨Q, Q0, δ, F⟩ is symbolically encoded by a
triplet of predicates ⟨Q0, Δ, {Δf}f∈F⟩ where:
– Q0(q) is true iff q ∈ Q0,
– Δ(q, ℓ, q′) is true iff (q, ℓ, q′) ∈ δ,
– ∀f ∈ F, Δf(q, ℓ, q′) is true iff (q, ℓ, q′) ∈ f.

2.3 Symbolic Product of a TGBA with a Kripke Structure

We now show how to build a synchronous product by composing the symbolic repre-
sentation of a TGBA with that of a Kripke structure, inspired by Sebastiani et al. [18].

Definition 5 (Symbolic Product for TGBA). Given a Symbolic Kripke structure K =
⟨S0, R, L⟩ and a Symbolic TGBA A = ⟨Q0, Δ, {Δf}f∈F⟩ sharing a set AP of atomic
propositions, the Symbolic Product K ⊗ A = ⟨P0, T, {Tf}f∈F⟩ is defined by the predi-
cates P0, T and Tf encoding respectively the set of initial states, the transition relation
and the acceptance transitions of the product:
– (s, q) denotes the state variables of the product (s for the Kripke structure and q for
the TGBA),
– P0(s, q) = S0(s) ∧ Q0(q),
– T((s, q), (s′, q′)) = ∃ℓ [R(s, s′) ∧ L(s, ℓ) ∧ Δ(q, ℓ, q′)], where (s′, q′) encodes the next
state variables,
– ∀f ∈ F, Tf((s, q), (s′, q′)) = ∃ℓ [R(s, s′) ∧ L(s, ℓ) ∧ Δf(q, ℓ, q′)].

[Figure 1: automata drawings, not reproduced here.]

Fig. 1. TGBA (a) and TGTA (b) for the LTL property ϕ = F Ga. Acceptance transitions are
indicated by a dot.
The labels ℓ are used to ensure that a transition (q, ℓ, q′) of A is synchronized with a
state s of K such that L(s, ℓ). This way, we ensure that the product recognizes only the
executions of K that are also recognized by A. However we do not need to remember
how product transitions are labeled to check K ⊗ A for emptiness. A product can be seen
as a TGBA without labels on transitions.
In symbolic model checkers, the exploration of the product is based on the following
PostImage operation [18]. For any set of states encoded by a predicate P,
PostImage(P)(s′, q′) = ∃(s, q) [P(s, q) ∧ T((s, q), (s′, q′))] returns a predicate representing
the set of states reachable in one step from a state in P.
Because in TGBA the acceptance conditions are based on transitions, we also define
PostImage(P, f), which computes the successors of P reached using only transitions from an
acceptance set f ∈ F: PostImage(P, f)(s′, q′) = ∃(s, q) [P(s, q) ∧ Tf((s, q), (s′, q′))].
These two operations are at the heart of the symbolic emptiness check presented in
the next section.
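Phrased operationally, both image operations are a single relational product. The sketch below mirrors the definitions on explicit Python sets (the names are ours); a symbolic engine performs the same computation with one decision-diagram operation per call.

    # Relational-product view of the image operations of Sec. 2.3, with explicit
    # Python sets standing in for decision diagrams.  P is a set of product states
    # (s, q); T and T_f are sets of pairs (source_state, destination_state).

    def post_image(P, T):
        # Successors of P through the full transition relation T.
        return {dst for (src, dst) in T if src in P}

    def post_image_acc(P, T_f):
        # Successors of P reached using only transitions of acceptance set f.
        return {dst for (src, dst) in T_f if src in P}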

2.4 Symbolic Emptiness Check


One way to check if a product is not empty is to find a reachable Strongly Connected
Component that contains transitions from all acceptance sets (we call it an accepting
SCC). Figure 2 shows such an algorithm implemented using symbolic operations. It
mimics the algorithm FEASIBLE of Kesten et al. [14] and can be seen as a forward vari-
ant of OWCTY (One Way Catch Them Young [9]) that uses PostImage computations
instead of PreImage. Line 3 computes the set P of all reachable states of the product.
The main loop on lines 4–8 refines P at each iteration. Lines 5–6 keep only the states
of P that can be reached from a cycle in P. Lines 7–8 then remove all cycles that never
visit some acceptance set f ∈ F. Eventually the main loop will reach a fixpoint where
P contains all states that are reachable from an accepting SCC. The product is empty iff
that set is empty.
There are many variants of such symbolic emptiness checks. We selected this variant
mainly for its simplicity, as our contributions are mostly independent of the chosen
algorithm: essentially, we will improve the cost of computing Reach(P) (used lines 3
and 8).

1  Input: PostImage, P0 and F
2  begin
3      P ← Reach(P0)
4      while P changes do
5          while P changes do
6              P ← PostImage(P)
7          for f in F do
8              P ← Reach(PostImage(P, f ))
9      return P ≠ ∅

   Auxiliary function:
1  Reach(P)
2      while P changes do
3          P ← P ∪ PostImage(P)
4      return P

Fig. 2. Forward-variant of OWCTY, a symbolic emptiness check
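The algorithm of Fig. 2 translates almost line for line into code. The following sketch is an explicit-set rendering of it, assuming the post_image helper from the previous sketch (repeated here for completeness); in the actual tool the same loop runs on decision diagrams and the Reach fixpoints are computed with saturation.

    # Forward OWCTY emptiness check (explicit-set mirror of Fig. 2).
    post_image = lambda P, T: {dst for (src, dst) in T if src in P}  # as above

    def reach(P, T):
        # Least fixpoint: all states reachable from P (the Reach helper of Fig. 2).
        old = None
        while P != old:
            old = P
            P = P | post_image(P, T)
        return P

    def owcty_forward(P0, T, acceptance_sets):
        # Returns True iff the product is non-empty (line 9 of Fig. 2).
        P = reach(P0, T)                          # line 3: all reachable states
        old = None
        while P != old:                           # line 4: refine until fixpoint
            old = P
            inner_old = None
            while P != inner_old:                 # lines 5-6: keep states reachable
                inner_old = P                     #            from a cycle in P
                P = post_image(P, T)
            for T_f in acceptance_sets:           # lines 7-8: drop cycles that never
                P = reach(post_image(P, T_f), T)  #            visit acceptance set f
        return len(P) > 0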


3 Symbolic Approach Using TGTA


Testing automata [10] are a kind of automata that recognize only stutter-invariant prop-
erties. In previous work [1] we generalized them as Transition-based Generalized Test-
ing Automata (TGTA). In this section, we show how to encode a TGTA for symbolic
model checking.

Definition 6. A property, i.e., a set of infinite sequences P ⊆ Σω, is stutter-invariant iff
any sequence ℓ0 ℓ1 ℓ2 . . . ∈ P remains in P after repeating any valuation ℓi or omitting
duplicate valuations. Formally, P is stutter-invariant iff ℓ0 ℓ1 ℓ2 . . . ∈ P ⟺ ℓ0^i0 ℓ1^i1 ℓ2^i2 . . . ∈
P for any i0 > 0, i1 > 0, . . .
Theorem 1. An LTL property is stutter-invariant iff it can be expressed as an LTL for-
mula that does not use the X operator [16].

3.1 Transition-Based Generalized Testing Automata


While a TGBA observes the value of the atomic propositions AP, a TGTA observes the
changes in these values. If a valuation of AP does not change between two consecutive
valuations of an execution, we say that a TGTA executes a stuttering transition.
If A and B are two valuations, A ⊕ B denotes the symmetric set difference, i.e., the
set of atomic propositions that differ (e.g., ab̄ ⊕ ab = {a} ⊕ {a, b} = {b}). Technically,
this can be implemented with an XOR operation on bitsets (hence the symbol ⊕).
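For instance, with each atomic proposition mapped to one bit of an integer (an encoding we pick purely for illustration), the changeset of two consecutive valuations is literally one XOR:

    # Changesets as a bitwise XOR: bit 0 encodes 'a', bit 1 encodes 'b'.
    A_BIT, B_BIT = 0b01, 0b10

    val1 = A_BIT           # the valuation a b-bar, i.e. {a}
    val2 = A_BIT | B_BIT   # the valuation a b,     i.e. {a, b}

    changeset = val1 ^ val2    # only 'b' changed
    assert changeset == B_BIT  # i.e. the changeset {b}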

Definition 7. A TGTA over the alphabet Σ is a tuple T = ⟨Q, Q0, U, δ, F⟩ where:
– Q is a finite set of states,
– Q0 ⊆ Q is a set of initial states,
– U : Q0 → 2Σ is a function mapping each initial state to a set of symbols of Σ,
– δ ⊆ Q × Σ × Q is the transition relation, where each element (q, c, q′) represents a
transition from state q to state q′ labeled by a changeset c interpreted as a (possibly
empty) set of atomic propositions whose value must change between q and q′,
– F ⊆ 2δ is a set of acceptance sets of transitions,
and such that all stuttering transitions (i.e., transitions labeled by ∅) are self-loops
and every state has a stuttering self-loop. More formally, we can define a partition of
δ = δ∅ ∪ δ∗ where:
– δ∅ = {(q, ∅, q) | q ∈ Q} is the stuttering transition relation,
– δ∗ = {(q, ℓ, q′) ∈ δ | ℓ ≠ ∅} is the non-stuttering transition relation.
An execution ℓ0 ℓ1 ℓ2 . . . ∈ Σω is accepted by T if there exists an infinite path (q0, ℓ0 ⊕
ℓ1, q1)(q1, ℓ1 ⊕ ℓ2, q2)(q2, ℓ2 ⊕ ℓ3, q3) . . . ∈ δω where:
– q0 ∈ Q0 with ℓ0 ∈ U(q0) (the execution is recognized by the path),
– ∀f ∈ F, ∀i ∈ N, ∃j ≥ i, (qj, ℓj ⊕ ℓj+1, qj+1) ∈ f (each acceptance set is visited in-
finitely often).
The language accepted by T is the set L(T) ⊆ Σω of executions it accepts.

Figure 1(b) shows a TGTA recognizing the LTL formula F Ga. Acceptance sets are
represented using dots as in TGBAs. Transitions are labeled by changesets: e.g., the
transition (0, {a}, 1) means that the value of a changes between states 0 and 1. Initial
valuations are shown above the initial arrows: U(0) = {a}, U(1) = {ā} and U(2) = {a}. As
an illustration, the execution ā; a; a; a; . . . is accepted by the run 1 →{a} 2 →∅ 2 →∅ 2 . . .
because the value of a only changes between the first two steps.
Theorem 2. Any stutter-invariant property can be translated into an equivalent
TGTA [1].
Note that Def. 7 differs from our previous work [1] because we now enforce a par-
tition of δ such that stuttering transitions can only be self-loops. However, the TGTA
resulting from the LTL translation we presented previously [1] already has this prop-
erty. We will use it to optimize the symbolic computation in Section 3.3.
Finally, a TGTA’s symbolic encoding is similar to that of a TGBA.
Definition 8 (Symbolic TGTA). A TGTA T = ⟨Q, Q0, U, δ, F⟩ is symbolically encoded
by a triplet of predicates ⟨U0, Δ⊕, {Δ⊕f}f∈F⟩ where:
– U0(q, ℓ) is true iff (q ∈ Q0) ∧ (ℓ ∈ U(q)),
– Δ⊕(q, c, q′) is true iff (q, c, q′) ∈ δ,
– ∀f ∈ F, Δ⊕f(q, c, q′) is true iff (q, c, q′) ∈ f.

3.2 Symbolic Product of a TGTA with a Kripke Structure


The product between a TGTA and a Kripke structure is similar to the TGBA case, except
that we have to deal with changesets. The transitions (s, s′) of a Kripke structure that
must be synchronized with a transition (q, c, q′) of a TGTA are all the transitions such
that the labels of s and s′ differ by the changeset c.
In order to reduce the number of symbolic operations when computing the symbolic
product of a TGTA with a Kripke structure, we introduce a changeset-based encoding
of the Kripke structure (only the transition relation changes).
Definition 9 (Changeset-based symbolic Kripke structure). A Kripke structure M =
⟨S, S0, R, L⟩ can be encoded by the changeset-based symbolic Kripke structure K⊕ =
⟨S0, R⊕, L⟩, where:
– the predicate R⊕(s, c, s′) is true iff ((s, s′) ∈ R ∧ (L(s) ⊕ L(s′)) = c),
– the predicates S0 and L have the same definition as for a Symbolic Kripke structure
K (Def. 2).
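In the explicit-set style of the earlier sketches, R⊕ can be derived from R and L by attaching the symmetric difference of the source and destination labels to every transition; a symbolic implementation builds the same relation directly at the decision-diagram level. The function name is ours.

    # Sketch of Def. 9: derive the changeset-labelled relation from R and L.
    # R is a set of (s, s') pairs; L maps each state to the frozenset of atomic
    # propositions it satisfies.  '^' is the symmetric set difference.

    def changeset_relation(R, L):
        return {(s, L[s] ^ L[s2], s2) for (s, s2) in R}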
In practice, the (changeset-based or not) symbolic transition relation of the Kripke
structure should be constructed directly from the model and atomic propositions of the
formula to check. In Section 4.2, we discuss how we build such changeset-based Kripke
structures in our setup.
The procedure requires reconstruction of the symbolic transition relation for each
formula (or at least for each set of atomic propositions used in the formulas). However
the cost of this construction is not significant with respect to the complexity of the
overall model checking procedure (overall on our benchmark, less than 0.16% of total
time was spent building these transition relations).
Adjusting the symbolic encoding of the Kripke structure to TGTA allows us to ob-
tain the following natural definition of the symbolic product using TGTA:
Definition 10 (Symbolic Product for TGTA). Given a changeset-based Symbolic Kripke
structure K⊕ = ⟨S0, R⊕, L⟩ and a Symbolic TGTA A⊕ = ⟨U0, Δ⊕, {Δ⊕f}f∈F⟩ sharing the
same set of atomic propositions AP, the Symbolic Product K⊕ ⊗ A⊕ = ⟨P0, T, {Tf}f∈F⟩
is defined by the following predicates:
– The set of initial states is encoded by: P0(s, q) = ∃ℓ [S0(s) ∧ L(s, ℓ) ∧ U0(q, ℓ)],
– The transition relation of the product is:
T((s, q), (s′, q′)) = ∃c [R⊕(s, c, s′) ∧ Δ⊕(q, c, q′)],
– The definition of Tf is similar to T, replacing Δ⊕ with Δ⊕f.
The definitions of PostImage(P) and PostImage(P, f ) are the same as in the TGBA
approach, with the new expressions of T and T f above.
As in the TGBA approach, the product in the TGTA approach can be seen as a TGBA (or
a TGTA) without labels on transitions, and the same emptiness check algorithm (Fig. 2)
can be used for both products.

3.3 Exploiting Stuttering Transitions to Improve Saturation in the TGTA Approach
Among symbolic approaches for evaluating a fixpoint on a transition relation, the sat-
uration algorithm offers gains of one to three orders of magnitude [4] in both time and
memory, especially when applied to asynchronous systems [5].
The saturation algorithm does not use a breadth-first exploration of the product (i.e.,
each iteration in the function Reach (Fig. 2) is not a “global” PostImage() computation).
Instead, saturation recursively repeats “local” fixpoints by recognizing and exploiting
transition locality and identity transformations on state variables [5].
This algorithm considers that the system state consists of k discrete variables encoded
by a Decision Diagram, and that the transition relation is expressed as a disjunction of
terms called transition clusters. Each cluster typically only reads or writes a limited
subset of k′ ≤ k variables, called the support of the cluster. During the least
fixpoint computing the reachable states, the saturation technique reorders [12]
the evaluation of (“local” fixpoints on) the clusters in order to avoid the construction of
useless intermediate Decision Diagram nodes.
The ordering used by saturation is determined from the support of each cluster.
We now show how to decompose the transition relation of the product K ⊕ ⊗ A⊕ to
exhibit clusters having a smaller support, favoring the saturation technique.
We base our decomposition on the fact that in a TGTA, all stuttering transitions are
self-loops and every state has a stuttering self-loop (δ∅ in Def. 7). Therefore, stuttering
transitions in the Kripke structure can be mapped to stuttering transitions in the product
regardless of the TGTA state.
Let us separate stuttering and non-stuttering transitions in the transition relation T of
the product between a Kripke structure and a TGTA (K⊕ ⊗ A⊕):

    T((s, q), (s′, q′)) = [R⊕(s, ∅, s′) ∧ Δ⊕(q, ∅, q′)] ∨ ∃c [R⊕∗(s, c, s′) ∧ Δ⊕∗(q, c, q′)]

where R⊕∗ and Δ⊕∗ encode respectively the non-stuttering transitions of the model and
of the TGTA:
– Δ⊕∗(q, c, q′) is true iff (q, c, q′) ∈ δ∗ (see Def. 7),
– R⊕∗(s, c, s′) is true iff R⊕(s, c, s′) ∧ (c ≠ ∅).
According to the definition of δ∅ in Def. 7, the predicate Δ⊕(q, ∅, q′) encodes the set
of the TGTA's stuttering self-loops and can be replaced by the predicate equal(q, q′),
simplifying T:

    T((s, q), (s′, q′)) = [R⊕(s, ∅, s′) ∧ equal(q, q′)] ∨ ∃c [R⊕∗(s, c, s′) ∧ Δ⊕∗(q, c, q′)]   (1)

where the first disjunct is denoted T∅((s, q), (s′, q′)) and the second T∗((s, q), (s′, q′)).

The transition relation (1) is a disjunction of T∗, synchronizing updates of both the TGTA
and the Kripke structure, and T∅, corresponding to the stuttering transitions of the Kripke
structure. Since all states in the TGTA have a stuttering self-loop, T∅ does not depend on
the TGTA state. In practice, the predicate equal(q, q′) is an identity relation for variable
q [5] and is simplified away (i.e., the term T∅ can be applied to a decision diagram
without consulting or updating the variable q [12]). Hence q is not part of the cluster
supports in T∅ (while q is part of the cluster supports in T∗). This gives more freedom
to the saturation technique for reordering the application of the clusters in T∅.
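To see why T∅ has the smaller support, the decomposition of equation (1) can be mirrored on explicit sets as two independent successor functions: the stuttering term never looks at the TGTA state q, while the non-stuttering term synchronizes both components. This is only an illustration of the shape of the two clusters, not of the saturation engine itself; all names are ours.

    # Successor computation split along equation (1), on explicit sets.
    # R_changeset: (s, changeset, s') triples of the changeset-based Kripke structure;
    # delta_star:  (q, changeset, q') triples, the non-stuttering TGTA transitions.
    EMPTY = frozenset()

    def post_stutter(P, R_changeset):
        # T-empty term: the model stutters and the TGTA state q is copied unchanged,
        # so the property automaton is never consulted here.
        return {(s2, q) for (s, q) in P
                        for (r, c, s2) in R_changeset
                        if r == s and c == EMPTY}

    def post_nonstutter(P, R_changeset, delta_star):
        # T-star term: model and TGTA move together on the same non-empty changeset.
        return {(s2, q2) for (s, q) in P
                         for (r, c, s2) in R_changeset
                         if r == s and c != EMPTY
                         for (p, c2, q2) in delta_star
                         if p == q and c2 == c}

    def post_product(P, R_changeset, delta_star):
        return post_stutter(P, R_changeset) | post_nonstutter(P, R_changeset, delta_star)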
Note that in the product of a TGBA with a Kripke structure (Def. 5) there is no T∅ term
that could be extracted, since there is no stuttering hypothesis in general. This severely
limits the possibilities of the saturation algorithm in the TGBA approach.
In the symbolic emptiness check presented in Fig. 2, the function Reach corresponds
to a least fixpoint performed using saturation. As we shall see experimentally in the next
section, the better encoding of T∅ (without q in its support) in the product of a TGTA with
a Kripke structure greatly favors the saturation technique, leading to gains of roughly
one order of magnitude.

4 Experimentation

We now compare the approaches presented in this paper. The symbolic model-checking
approach using TGBA, presented in Section 2, serves as our baseline. We first describe
our implementation and selected benchmarks, prior to discussing the results.

4.1 Implementation

All approaches are implemented on top of three libraries1 : Spot, SDD/ITS, and LTSmin.
Spot is a model-checking library providing several bricks that can be combined to
build model checkers [7]. In our implementation, we reused the modules providing a
translation from an LTL formula into a TGBA and into a TGTA [1].
SDD/ITS is a library for symbolic representation of state spaces in the form of
Instantiable Transition Systems (ITS): an abstract interface for symbolic Labeled Tran-
sition Systems (LTS). The symbolic encoding of ITS is based on Hierarchical Set De-
cision Diagrams (SDD) [20]. SDDs allow a compact symbolic representation of states
and transition relation.
1 Respectively http://spot.lip6.fr, http://ddd.lip6.fr, and
http://fmt.cs.utwente.nl/tools/ltsmin.
The algorithms presented in this paper can be implemented using any kind of de-
cision diagram (such as OBDDs), but the use of the SDD software library makes it easy
to benefit from the automatic saturation mechanism described in [12].
LTSmin [2] can generate state spaces from various input formalisms (µCRL, DVE,
GNA, MAPLE, PROMELA, ...) and store the obtained LTS in a concise symbolic for-
mat, called Extended Table Format (ETF). We used LTSmin to convert DVE models
into ETF for our experiments. This approach offers good generality for our tool, since
it can process any formalism supported by the LTSmin tool.
Our symbolic model checker takes as input an ETF file and an LTL formula. The LTL
formula is converted into a TGBA or a TGTA, which is then encoded using an ITS. The
ETF model is also symbolically encoded using an ITS (see Sec. 4.2). The two obtained
ITSs are then composed to build a symbolic product, which is also an ITS. Finally, the
OWCTY emptiness check is applied to this product.

4.2 Using ETF to Build a Changeset-Based Symbolic Kripke Structure


An ETF file2 produced by LTSmin is a text-based serialization of the symbolic repre-
sentation of the transition relation of a model whose states consist of k integer variables.
Transitions are described in the following tabular form:
0/1 0/1 * *
1/2 * 0/1 *
...
where each column corresponds to a variable, and each line describes the effect of a
symbolic transition on the corresponding variables. The notation “in/out” means that
the variable must have the value “in” for the transition to fire, and the value is then
updated to “out”. A “*” means that the variable is not consulted or updated by the tran-
sition. Each line may consequently encode a set of explicit transitions that differ only
by the values of the starred variables: the support of a transition is the set of unstarred
variables.
A changeset-based symbolic Kripke structure, as defined in Sec. 3.2, can easily be
obtained from such a description. To obtain the changeset associated with a line of the
file, it is enough to compute the difference between the values of the atomic propositions
evaluated on the in values and their values on the out values. Because they do not
change, starred variables have no influence on the changeset.
Note that an empty changeset does not necessarily correspond to a line where all
variables are starred. Even when the in and out values are different, they may have no
influence on the atomic propositions, and the resulting changeset may be empty. For
instance, if the only atomic proposition considered is p = (v1 > 1) (where v1 denotes
the first-column variable), then the changeset associated with the first line is ∅, and the
changeset for the second line is {p}.
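The following sketch shows this changeset computation for a single ETF line, under two simplifying assumptions of ours: the line is already split into per-variable "in/out" or "*" entries, and each atomic proposition is given as a predicate over the whole variable vector (starred variables keep one arbitrary value on both sides, so they cannot flip a proposition by themselves). The actual tool performs this computation while building the symbolic transition relation.

    # Changeset of one ETF transition line (illustrative only).
    # entries: list of "in/out" or "*" strings, one per state variable.
    # atomic_props: dict mapping proposition name -> predicate over the variable vector.
    # base: a concrete assignment used for the starred (unchanged) variables.

    def line_changeset(entries, atomic_props, base):
        src, dst = list(base), list(base)
        for idx, entry in enumerate(entries):
            if entry != "*":
                i_val, o_val = entry.split("/")
                src[idx], dst[idx] = int(i_val), int(o_val)
        # Atomic propositions whose truth value differs between source and destination.
        return {p for p, pred in atomic_props.items() if pred(src) != pred(dst)}

    # Example from the text: p = (v1 > 1); the first line yields the empty changeset,
    # the second yields {p}.
    props = {"p": lambda v: v[0] > 1}
    print(line_changeset(["0/1", "0/1", "*", "*"], props, base=[0, 0, 0, 0]))  # set()
    print(line_changeset(["1/2", "*", "0/1", "*"], props, base=[0, 0, 0, 0]))  # {'p'}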

4.3 Benchmark
We evaluated the TGBA and TGTA approaches on the following models and formulae:
2 http://fmt.cs.utwente.nl/tools/ltsmin/doc/etf.html
Table 1. Characteristics of our selected benchmark models. The stuttering-ratio rep-
resents the percentage of stuttering transitions in the model. Since the definition of
stuttering depends on the atomic propositions of the formula, we give an average over
the 200 properties checked against each model.

BEEM model            states (×10³)   stut. ratio
at.5                  31 999          95%
bakery.4              157             83%
bopdp.3               1 040           91%
elevator.4            888             74%
brp2.3                40              79%
fischer.5             101 028         89%
lamport_nonatomic.5   95 118          92%
lamport.7             38 717          93%
lann.6                144 151         52%
lann.7                160 025         64%
lifts.7               5 126           93%
peterson.5            131 064         83%
pgm_protocol.8        3 069           92%
phils.8               43 046          89%
production_cell.6     14 520          85%
reader_writer.3       604             88%

– Our models come from the BEEM benchmark [15], a suite of models for explicit
model checking, which contains some models that are considered difficult for sym-
bolic model checkers [2]. Table 1 summarizes the 16 models we selected as repre-
sentatives of the overall benchmark.
– BEEM provides a few LTL formulae, but they mostly represent safety properties
and can thus be checked without building a product. Therefore, for each model,
we randomly generated 200 stutter-invariant LTL formulae: 100 verified formulae
(empty product) and 100 violated formulae (non-empty product). We consequently
have a total of 3200 (model, formula) pairs.
All tests were run on a 64bit Linux system running on an Intel Xeon E5645 at 2.40GHz.
Executions that exceeded 30 minutes or 4GB of RAM were aborted and are reported
with time and memory above these thresholds in our graphics.
In all approaches evaluated, symbolic products are encoded using the same variable
ordering: we used the symbolic encoding named “log-encode with top-order” by Se-
bastiani et al. [18].

4.4 Results
The results of our experimental3 comparisons are presented by the two scatter plot ma-
trices of Fig. 3 and Fig. 4. The scatter plot highlighted at the bottom of Fig. 3 compares
the time-performance of the TGTA approach against the reference TGBA approach.4
Each point of the scatter plot represents a measurement for a pair (model, formula). For
the highlighted plot, the x-axis represents the TGBA approach and the y-axis represents
the TGTA approach, so the 3060 points below the diagonal correspond to cases where the
TGTA approach is better, and the 131 points above the diagonal correspond to cases
where the TGBA approach is better. (In scatter plot matrices, each point below the di-
agonal is in favor of the approach displayed on the right, while each point above the
3 The results, models, formulae and tools used in these tests, can be downloaded from
http://www.lrde.epita.fr/~ala/TACAS-2014/Benchmark.html
4 We recommend viewing these plots online.
[Figure 3: scatter plot matrix comparing run time (s) between TGBA (nosat), TGBA (sat), TGTA (nosat) and TGTA (sat); plot data not reproduced here.]

Fig. 3. Time-comparison of the TGBA and TGTA approaches, with saturation enabled
“(sat)” or disabled “(nosat)”, on a set of 3199 pairs (model, formula). Timeouts and
out-of-memory errors are plotted on separate lines on the top or right edges of the
scatter plots. Each plot also displays the number of cases that are above or below
the main diagonal (including timeouts and out-of-memory errors), i.e., the number of
(model, formula) pairs for which one approach was better than the other. Additional
diagonals show the location of ×10 and /10 ratios. Points are plotted with transparency
to better highlight dense areas and lessen the importance of outliers.

diagonal is in favor of the approach displayed at the top). Axes use a logarithmic scale.
The colors distinguish violated formulae (non-empty products) from verified formulae
(empty products). In order to show the influence of the saturation technique, we also ran
the TGBA and TGTA approaches with saturation disabled. In our comparison matrix,
the labels “(sat)” and “(nosat)” indicate whether saturation was enabled or not. Fig. 4
gives the memory view of this experiment.
[Figure 4: scatter plot matrix comparing memory consumption (MB) between TGBA (nosat), TGBA (sat), TGTA (nosat) and TGTA (sat); plot data not reproduced here.]

Fig. 4. Comparison of the memory consumption of the TGBA and TGTA approaches,
with or without saturation, on the same set of problems.

As shown by the highlighted scatter plots in Fig. 3 and 4, the TGTA approach clearly
outperforms the traditional TGBA-based approach by one order of magnitude. This is
due to the combination of two factors: saturation and the exploitation of stuttering.
The saturation technique does not significantly improve model checking using
TGBA (compare “TGBA (sat)” against “TGBA (nosat)” at the top of Fig. 3 and 4). In
fact, saturation is of limited benefit in the TGBA approach: in the transition relation of
Def. 5, each conjunction must consult the variable q representing the state of the TGBA,
so q is part of the supports and constrains the reordering of the clusters evaluated by
saturation. The situation is different in the TGTA approach, where the T∅ term of the
transition relation of the product (equation (1)) does not involve the state q of the TGTA:
here, saturation strongly improves performance (compare “TGTA (sat)” against
“TGTA (nosat)”).
Overall, the improvement to this symbolic technique was only made possible because
the TGTA representation makes it easy to process the stuttering behaviors separately
from the rest. These stuttering transitions represent a large part of the models' transi-
tions, as shown by the stuttering-ratios of Table 1. Using these stuttering-ratios, we can
estimate, on our benchmark, the relative importance of the term T∅ compared to T∗ in equation (1).

5 Conclusion
Testing automata [10] are a way to improve the explicit model checking approach when
verifying stutter-invariant properties, but they had not been used for symbolic model
checking. In this paper, we gave the first symbolic approach using testing automata, with
generalized acceptance (TGTA), and compare it to a more classical symbolic approach
(using TGBA).
On our benchmark, using TGTA, we were able to gain one order of magnitude over
the TGBA-based approach.
We have shown that fixpoints over the transition relation of a product between a
Kripke structure and a TGTA can benefit from the saturation technique, especially be-
cause part of their expression is only dependent on the model, and can be evaluated
without consulting the transition relation of the property automaton. The improvement
was possible only because TGTA makes it possible to process stuttering behaviors
specifically, in a way that helps the saturation technique.
In future work, we plan to evaluate the use of TGTA in the context of hybrid ap-
proaches, mixing both explicit and symbolic approaches [18, 6].

References
1. Ben Salem, A.-E., Duret-Lutz, A., Kordon, F.: Model checking using generalized testing
automata. In: Jensen, K., van der Aalst, W.M., Ajmone Marsan, M., Franceschinis, G., Kleijn,
J., Kristensen, L.M. (eds.) ToPNoC VI. LNCS, vol. 7400, pp. 94–122. Springer, Heidelberg
(2012)
2. Blom, S., van de Pol, J., Weber, M.: LTSmin: Distributed and symbolic reachability. In:
Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 354–359. Springer,
Heidelberg (2010)
3. Burch, J.R., Clarke, E.M., McMillan, K.L., Dill, D.L., Hwang, L.: Symbolic model checking:
10²⁰ states and beyond. In: Proc. of the Fifth Annual IEEE Symposium on Logic in Computer
Science, pp. 1–33. IEEE Computer Society Press (1990)
4. Ciardo, G., Marmorstein, R., Siminiceanu, R.: Saturation unbound. In: Garavel, H., Hatcliff,
J. (eds.) TACAS 2003. LNCS, vol. 2619, pp. 379–393. Springer, Heidelberg (2003)
5. Ciardo, G., Yu, A.J.: Saturation-based symbolic reachability analysis using conjunctive and
disjunctive partitioning. In: Borrione, D., Paul, W. (eds.) CHARME 2005. LNCS, vol. 3725,
pp. 146–161. Springer, Heidelberg (2005)
6. Duret-Lutz, A., Klai, K., Poitrenaud, D., Thierry-Mieg, Y.: Self-loop aggregation product
— A new hybrid approach to on-the-fly LTL model checking. In: Bultan, T., Hsiung, P.-A.
(eds.) ATVA 2011. LNCS, vol. 6996, pp. 336–350. Springer, Heidelberg (2011)
7. Duret-Lutz, A., Poitrenaud, D.: SPOT: an extensible model checking library using transition-
based generalized Büchi automata. In: Proc. of MASCOTS 2004, pp. 76–83. IEEE Computer
Society Press (2004)
8. Etessami, K.: Stutter-invariant languages, ω-automata, and temporal logic. In: Halbwachs,
N., Peled, D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 236–248. Springer, Heidelberg
(1999)
9. Fisler, K., Fraer, R., Kamhi, G., Vardi, M.Y., Yang, Z.: Is there a best symbolic cycle-detection
algorithm? In: Margaria, T., Yi, W. (eds.) TACAS 2001. LNCS, vol. 2031, pp. 420–434.
Springer, Heidelberg (2001)
10. Geldenhuys, J., Hansen, H.: Larger automata and less work for LTL model checking. In:
Valmari, A. (ed.) SPIN 2006. LNCS, vol. 3925, pp. 53–70. Springer, Heidelberg (2006)
11. Giannakopoulou, D., Lerda, F.: From states to transitions: Improving translation of LTL for-
mulæ to Büchi automata. In: Peled, D.A., Vardi, M.Y. (eds.) FORTE 2002. LNCS, vol. 2529,
pp. 308–326. Springer, Heidelberg (2002)
12. Hamez, A., Thierry-Mieg, Y., Kordon, F.: Hierarchical set decision diagrams and automatic
saturation. In: van Hee, K.M., Valk, R. (eds.) PETRI NETS 2008. LNCS, vol. 5062, pp.
211–230. Springer, Heidelberg (2008)
13. Hansen, H., Penczek, W., Valmari, A.: Stuttering-insensitive automata for on-the-fly detec-
tion of livelock properties. In: Proc. of FMICS 2002, ENTCS, vol. 66(2) (2002)
14. Kesten, Y., Pnueli, A., Raviv, L.-O.: Algorithmic verification of linear temporal logic spec-
ifications. In: Larsen, K.G., Skyum, S., Winskel, G. (eds.) ICALP 1998. LNCS, vol. 1443,
pp. 1–16. Springer, Heidelberg (1998)
15. Pelánek, R.: BEEM: Benchmarks for explicit model checkers. In: Bošnački, D., Edelkamp,
S. (eds.) SPIN 2007. LNCS, vol. 4595, pp. 263–267. Springer, Heidelberg (2007)
16. Peled, D., Wilke, T.: Stutter-invariant temporal properties are expressible without the next-
time operator. Information Processing Letters 63(5), 243–246 (1995)
17. Rozier, K.Y., Vardi, M.Y.: A multi-encoding approach for LTL symbolic satisfiability check-
ing. In: Butler, M., Schulte, W. (eds.) FM 2011. LNCS, vol. 6664, pp. 417–431. Springer,
Heidelberg (2011)
18. Sebastiani, R., Tonetta, S., Vardi, M.Y.: Symbolic systems, explicit properties: On hybrid
approaches for LTL symbolic model checking. In: Etessami, K., Rajamani, S.K. (eds.) CAV
2005. LNCS, vol. 3576, pp. 350–363. Springer, Heidelberg (2005)
19. Somenzi, F., Ravi, K., Bloem, R.: Analysis of symbolic SCC hull algorithms. In: Aagaard,
M.D., O’Leary, J.W. (eds.) FMCAD 2002. LNCS, vol. 2517, pp. 88–105. Springer, Heidel-
berg (2002)
20. Thierry-Mieg, Y., Poitrenaud, D., Hamez, A., Kordon, F.: Hierarchical set decision diagrams
and regular models. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505,
pp. 1–15. Springer, Heidelberg (2009)
21. Vardi, M.Y.: An automata-theoretic approach to linear temporal logic. In: Moller, F.,
Birtwistle, G. (eds.) Logics for Concurrency. LNCS, vol. 1043, pp. 238–266. Springer, Hei-
delberg (1996)
Symbolic Synthesis for Epistemic Specifications
with Observational Semantics

Xiaowei Huang and Ron van der Meyden

School of Computer Science and Engineering
UNSW Australia
{xiaoweih,meyden}@cse.unsw.edu.au

Abstract. The paper describes a framework for the synthesis of protocols for
distributed and multi-agent systems from specifications that give a program struc-
ture that may include variables in place of conditional expressions, together with
specifications in a temporal epistemic logic that constrain the values of these vari-
ables. The epistemic operators are interpreted with respect to an observational
semantics. The framework generalizes the notion of knowledge-based program
proposed by Fagin et al (Dist. Comp. 1997). An algorithmic approach to the syn-
thesis problem is developed that computes all solutions, using a reduction to epis-
temic model checking, that has been implemented using symbolic techniques. An
application of the approach to synthesize mutual exclusion protocols is presented.

1 Introduction
In concurrent, distributed or multi-agent systems it is typical that agents must act on the
basis of local data to coordinate to ensure global properties of the system. This leads
naturally to the consideration of the notion of what an agent knows about the global
state, given the state of its local data structures. Epistemic logic, or the logic of knowl-
edge [9] has been developed as a formal language within which to express reasoning
about this aspect of concurrent systems. In particular, knowledge-based programs [10],
a generalization of standard programs in which agents condition their actions on for-
mulas expressed in a temporal-epistemic logic, have been proposed as a framework for
expressing designs of distributed protocols at the knowledge level. Many of the inter-
esting analyses of problems in distributed computing based on notions of knowledge
(e.g. [13]) can be cast in the form of knowledge-based programs.
Knowledge-based programs have the advantage of abstracting from the details of
how information is encoded in an agent’s local state, enabling a focus on what an agent
needs to know in order to decide between its possible actions. On the other hand, this
abstraction means that knowledge-based programs do not have an operational seman-
tics. They are more like specifications than like programs in this regard: obtaining an
implementation of a knowledge-based program requires that concrete properties of the
agent’s local state be found that are equivalent to the conditions on the agent’s knowl-
edge used in the program.
This gap has meant that knowledge-based analyses have been largely conducted as
pencil and paper exercises to date, and only limited automated support for knowledge-
based programming has been available. One approach to automation that has emerged

in the last ten years is the development of epistemic model checking tools [11,16].
These give a partial solution to the gap, in that they allow a putative implementation
of a knowledge-based program to be verified for correctness (for examples, see [2,3]).
However, this leaves open the question of how such an implementation is to be obtained,
which still requires human insight.
Our contribution in this paper is to develop and implement an approach that auto-
mates the construction of implementations for knowledge-based programs for the case
of the observational semantics for knowledge-based programs. (In earlier work [14]
we dealt with stronger semantics for a more limited program syntax, see Section 7 for
discussion). Our approach is to reduce the problem to model checking, enabling the
investment in epistemic model checking to be leveraged to automatically synthesize
implementations of knowledge-based programs. In particular, we build on symbolic
techniques for epistemic model checking.
We in fact generalize the notion of knowledge-based program to a more liberal notion
that we call epistemic protocol specification, based on a protocol template together with
a set of temporal-epistemic formulas that constrain how the template is to be instanti-
ated. This enables our techniques to be applied also to cover ideas such as the sound
local proposition generalization of knowledge-based programs [8]. We illustrate the
approach through an application of the knowledge-based programming methodology
to the development of protocols for mutual exclusion. We give an abstract knowledge-
based specification of a protocol for mutual exclusion, and show how our approach can
automatically extract different protocols solving this problem.

2 A Semantic Model for Knowledge and Time


For brevity, we present the theory of our approach at the level of semantic structures,
since the symbolic algorithms we use work at this level. However, the input to our
synthesis system is given in a programming notation, and, for clarity of exposition, we
use this notation to present examples. For motivation, the reader may prefer to read
the example in Section 4 first. Details of the mapping from programming syntax to the
semantic structures are fairly standard, and left to the reader’s intuition.
Let V be a finite set of boolean variables and Ags be a finite set of agents. The
language CTLK(V, Ags) has the syntax:

φ ::= v | ¬φ | φ1 ∨ φ2 | EXφ | E(φ1Uφ2 ) | EGφ | Ki φ

where v ∈ V and i ∈ Ags. This is CTL plus the construct Ki φ, which says that agent i
knows that φ holds. We freely use standard operators that are definable in terms of the
above, e.g., AFφ = ¬EG¬φ and AGφ = ¬E(trueU¬φ).
A (finite) model is a tuple M = (S , I, −→, {∼i }i∈Ags , F , π) where S is a (finite) set of
states, I ⊆ S is a set of initial states, →⊆ S × S is a transition relation, ∼i : S × S →
{0, 1} is an indistinguishability relation of agent i, component F is a fairness condition
(explained below), and π : S → P(V) is a truth assignment (here P(V) denotes the
powerset of V.) A path in M from a state s ∈ S is a finite or infinite sequence s = s0 −→
s1 −→ s2 −→ . . . We assume that −→ is serial, i.e. for each s ∈ S there exists t ∈ S such
that s −→ t. We model fairness using the condition F by taking this to be a generalized
Büchi fairness condition, expressed as a set of sets of states: F = {α1, . . . , αk} where
each αi ⊆ S. An infinite path s = s0 −→ s1 −→ s2 −→ . . . is fair with respect to F if,
for each i = 1 . . . k, there are infinitely many indices j for which s j ∈ αi . Let rch(M) be
the set of fair reachable states of M, i.e., the set of states sn (for some n) such that there
exists a fair path s0 −→ s1 −→ . . . −→ sn −→ sn+1 −→ . . . with s0 ∈ I an initial state.
We assume that I ⊆ rch(M), i.e., all initial states are the source of a fair path.
The semantics of the language is given by a satisfaction relation M, s |= φ, where
s ∈ rch(M) is a fair reachable state. This relation is defined inductively as follows:
1. M, s |= v if v ∈ π(s),
2. M, s |= ¬φ if not M, s |= φ,
3. M, s |= φ1 ∨ φ2 if M, s |= φ1 or M, s |= φ2 ,
4. M, s |= EXφ if there exists t ∈ rch(M) such that s −→ t and M, t |= φ,
5. M, s |= E(φ1Uφ2 ) if there exists a fair path s = s0 −→ s1 −→ . . . −→ sn → . . . such
that M, sk |= φ1 for k = 0 . . . n − 1 and M, sn |= φ2 ,
6. M, s |= EGφ if there exists a fair path s = s0 −→ s1 −→ s2 −→ . . . with M, sk |= φ for
all k ≥ 0,
7. M, s |= Ki φ if for all t ∈ rch(M) with s ∼i t we have M, t |= φ.
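As an illustration of clause 7, the set of reachable states satisfying Ki φ can be computed from the set satisfying φ in one pass, by grouping reachable states according to what agent i observes. The sketch below uses explicit Python sets and an observation function obs_i with s ∼i t iff obs_i(s) = obs_i(t); symbolic epistemic model checkers perform the analogous computation on decision diagrams, and the helper names here are ours.

    # States satisfying K_i phi, given the states satisfying phi (clause 7).
    # reachable: the fair reachable states rch(M); obs_i(s): agent i's observation
    # of state s, so s ~_i t iff obs_i(s) == obs_i(t).

    def sat_knows(reachable, sat_phi, obs_i):
        by_obs = {}
        for t in reachable:                      # group reachable states by observation
            by_obs.setdefault(obs_i(t), []).append(t)
        return {s for s in reachable
                  if all(t in sat_phi for t in by_obs[obs_i(s)])}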
We are interested in models in which each of the agents runs a protocol in which it
chooses its actions based on local information. To this end, we describe how a model
may be obtained from the agents running such protocols in the context of an environ-
ment, which provides shared structure through which the agents can communicate.
An environment for agents Ags is a tuple E = ⟨Vare, Ie, Acts, −→e, Fe⟩, where
1. Vare is a finite set of variables, from which we derive a set of states S e = P(Vare ),
2. Ie is a subset of S e , representing the initial states,
3. Acts = Πi∈Ags Actsi is a set of joint actions, where each Actsi is a finite set of actions
that may be performed by agent i,
4. −→e ⊆ S e × Acts × S e is a transition relation, labelled by joint actions,
5. Fe is a generalized Büchi fairness condition over the states S e .
Intuitively, a joint action a represents a choice of action ai for each agent, performed
simultaneously, and the transition relation resolves this into an effect on the state. We
assume that −→e is serial in the sense that for all s ∈ Se and a ∈ Acts there exists t ∈ Se
such that s −a→ t.
Semantically, a concrete protocol for agent i ∈ Ags in such an environment E may
be represented by a tuple Proti = ⟨PVari, LVari, OVari, Ii, Actsi, −→i⟩, where
1. PVari ⊆ Vare is a subset of the variables of E, called the parameter variables of the
protocol,
2. LVari is a finite set of variables, understood as the local variables of the agent,
3. OVari ⊆ PVari ∪ LVari is the set of variables of the above two types that are observ-
able to the agent, and on the basis of which the agent computes what it knows,
4. Ii is a subset of P(LVari ), representing the initial states of the protocol,
5. Actsi is the set of actions that the agent is able to perform (this must match the set
of actions associated to this agent in the environment),
6. −→i ⊆ P(PVari ∪ LVari ) × Actsi × P(LVari ) is a serial labelled transition relation.
We assume that the sets Vare and LVari , for i ∈ Ags, are mutually disjoint.1
Note that the transition relation −→i indicates how an agent’s local variables are
updated when performing an action, which may depend on the current values of the pa-
rameter variables in the environment. This transition relation does not specify a change
in the value of the parameter variables: changes to these are determined in the envi-
ronment on the basis of the actions that this agent, and others, perform in the given
step.
Given an environment E and a collection {Proti}i∈Ags of concrete protocols for the
agents, we may construct a model M(E, {Proti}i∈Ags) = (S, I, −→, {∼i}i∈Ags, F, π) as fol-
lows. The set of states is S = P(Vare ∪ ⋃i∈Ags LVari), i.e., the set of all assignments to the
environment and local variables. We represent such states in the form s = se ∪ ⋃i∈Ags li,
where se ⊆ Vare and each li ⊆ LVari. Such a state s is taken to be an initial state in I if
se ∈ Ie and li ∈ Ii for all agents i. That is, I is the set of states where the environment
and each of the agents is in an initial state. The epistemic indistinguishability relation
for agent i over the states S is defined by s ∼i t iff s ∩ OVari = t ∩ OVari, i.e., the states
s and t have the same values for all of agent i's observable variables. The transition
relation −→ is given by se ∪ ⋃i∈Ags li −→ s′e ∪ ⋃i∈Ags l′i if there exists a joint action a
such that se −a→e s′e and (se ∩ PVari) ∪ li −ai→i l′i for each agent i. We take the fairness
condition F to contain

    {se ∪ ⋃i∈Ags li | se ∈ α, l1 ∈ P(LVar1), . . . , ln ∈ P(LVarn)}

for each α ∈ Fe. That is, we impose the environment's fairness constraints on the envi-
ronment portion of the state. The assignment π is given by π(s) = s.
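A direct explicit-state reading of the joint-transition part of this construction is sketched below. It is meant only to clarify how a global state decomposes into an environment part and one local part per agent, and how a joint action synchronizes them; it does not reflect the symbolic construction used by the implementation, and all names are ours.

    from itertools import product as cartesian

    # One global step of M(E, {Prot_i}): a joint action is a tuple (a_1, ..., a_n).
    # env_trans: set of (s_e, joint_action, s_e') triples;
    # prot_trans[i]: set of (visible_state, a_i, l_i') triples, where visible_state
    #                ranges over frozensets of PVar_i and LVar_i;  pvars[i] = PVar_i.

    def global_successors(s_e, locals_, env_trans, prot_trans, pvars, joint_actions):
        succs = set()
        for a in joint_actions:
            for (e, act, e2) in env_trans:
                if e == s_e and act == a:
                    # each agent i must be able to update its local state under a[i]
                    choices = []
                    for i, l_i in enumerate(locals_):
                        visible = (s_e & pvars[i]) | l_i
                        choices.append([l2 for (v, ai, l2) in prot_trans[i]
                                           if v == visible and ai == a[i]])
                    for new_locals in cartesian(*choices):
                        succs.add((e2, tuple(new_locals)))
        return succs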

3 Epistemic Protocol Specifications


Protocol templates generalize concrete protocols by introducing some variables that
may be instantiated with a boolean expression in the observable variables in order to
obtain a concrete protocol. Formally, a protocol template for agent i ∈ Ags is a tuple
Pi = ⟨KVari, PVari, LVari, OVari, Ii, Actsi, −→i⟩, where:
1. KVari is a set of variables, disjoint from all the other sets of variables, that we call
the template variables,
2. PVari , LVari , OVari , Actsi are, respectively, a set of parameter variables, local vari-
ables, observable variables and actions of agent i, exactly as in a concrete protocol;
as in concrete protocols, we obtain a set of local states P(LVari ),
3. Ii ⊆ P(LVari ) is a set of initial local states,
4. −→i ⊆ P(KVari ∪ PVari ∪ LVari ) × Actsi × P(LVari ) is a transition relation that
describes how local states are updated, depending on the value of the template
variables, parameter variables, local variables and action performed.
1
We could also include a fairness condition, but exclude this here for brevity. We do not assume
that LVari ⊆ OVari : this allows the impact on knowledge of particular local variables to be
studied, and helps in managing the complexity of our technique, which scales exponentially in
the number of observable variables.
An epistemic protocol specification is a tuple S = ⟨Ags, E, {Pi}i∈Ags, Φ⟩, consisting
of a set of agents Ags, an environment E for Ags, a collection of protocol templates
{Pi}i∈Ags for environment E, and a collection of epistemic logic formulas Φ over the
agents Ags and variables Vare ∪ ⋃i∈Ags (KVari ∪ LVari).
Epistemic protocol specifications generalize the notion of knowledge-based program
[9,10]. Essentially, these are epistemic protocol specifications in which, for each agent
i ∈ Ags and each template variable v ∈ KVari , the set Φ contains a formula of the form
AG(v ⇔ Ki ψ). That is, each template variable is associated with a formula of the form
Ki ψ, expressing some property of agent i’s knowledge, and we require that the meaning
of the template variable be equivalent to this property.
Epistemic protocol specifications also encompass the sound local proposition inter-
pretation of knowledge-based programs proposed by Engelhardt et al. [8]: these asso-
ciate to each template variable v a formula ψ and require that v be interpreted by a local
proposition (under the observational semantics, this is just a condition on the observable
variables), such that the system satisfies AG(v ⇒ ψ). (By the assumption of locality of
v, this is equivalent to satisfying AG(v ⇒ Ki ψ).)
To implement an epistemic protocol specification with respect to the observational
semantics, we need to replace each template variable v in each agent i’s protocol tem-
plate by an expression over the agent’s observable variables, in such a way that the
specification formulas are satisfied in the model obtained by executing the resulting
standard program. We now formalize this semantics.
Let θ be a substitution mapping each template variable v ∈ KVari, for i ∈ Ags, to a
boolean expression over the observable variables OVari of agent i's protocol Pi. If we
apply this substitution to Pi, we obtain a standard protocol Piθ = ⟨PVari, LVari, OVari, Ii,
Actsi, −→θi⟩, where the template variables KVari have been removed, and all the other
components are as in the protocol template, except that we derive the concrete transition
relation −→θi ⊆ P(PVari ∪ LVari) × Actsi × P(LVari) from the transition relation −→i ⊆
P(KVari ∪ PVari ∪ LVari) × Actsi × P(LVari) in the protocol template, as follows.
Since, for each v ∈ KVari, the value θ(v) is an expression over the variables OVari,
which is a subset of PVari ∪ LVari, we may evaluate θ(v) on states in P(PVari ∪ LVari).
Given a state s ∈ P(PVari ∪ LVari), define sθ ∈ P(KVari) by sθ = {v ∈ KVari | s |= θ(v)}.
We then define −→θi by s −a→θi l′i when sθ ∪ s −a→i l′i, for s ∈ P(PVari ∪ LVari),
a ∈ Actsi and l′i ∈ P(LVari).
The substitution θ may also be applied to the specification formulas in Φ. Each φ ∈ Φ
is a formula over the variables Vare ∪ ⋃i∈Ags (KVari ∪ LVari). Replacing each occurrence of
a variable v ∈ ⋃i∈Ags KVari by the formula θ(v) over Vare ∪ ⋃i∈Ags LVari, we obtain a
formula φθ over Vare ∪ ⋃i∈Ags LVari. We write Φθ for {φθ | φ ∈ Φ}.
We say that such a substitution θ provides an implementation of the epistemic pro-
tocol specification S, provided M(E, {Pi θ}i∈Ags ) |= Φθ. The problem we study in this
paper is the following: given an environment E and an epistemic protocol specification
S, synthesize an implementation θ. This is an inherently complex problem. To provide
a fair comparison with the performance of our implementation, we measure it here as a
function of the size of a succinct representation (by means of boolean formulas for the
environment and protocol components, or programs PTIME-encodable by such formu-
las). Since the output implementation θ could be of exponential size in the
number of observable variables, we measure the complexity of determining the exis-


tence of an implementation: even this is already hard, as the following result shows:

Theorem 1. The problem of determining the existence of an implementation of a given
epistemic protocol specification is NEXPTIME-complete.
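One naive way to see the reduction to model checking (not the algorithm of the paper, which computes all solutions symbolically) is to enumerate candidate substitutions θ, i.e., one boolean function over the observable variables per template variable, instantiate the templates, and model-check Φθ on the resulting model. The sketch below assumes a single set of observable variables for simplicity; build_model and check are placeholders for the constructions of Sections 2 and 3. The doubly exponential enumeration also illustrates why even deciding existence is hard (Theorem 1).

    from itertools import product as cartesian

    # Brute-force view of synthesis: try every mapping of template variables to
    # boolean functions of the observable variables (doubly exponential in the
    # number of observables; the paper's symbolic algorithm avoids this blow-up).

    def all_boolean_functions(observable_vars):
        rows = list(cartesian([False, True], repeat=len(observable_vars)))
        for outputs in cartesian([False, True], repeat=len(rows)):
            table = dict(zip(rows, outputs))
            yield lambda obs, table=table: table[tuple(obs)]

    def synthesize(template_vars, observable_vars, build_model, check):
        for choices in cartesian(all_boolean_functions(observable_vars),
                                 repeat=len(template_vars)):
            theta = dict(zip(template_vars, choices))
            model, spec = build_model(theta)      # M(E, {P_i theta}) and Phi theta
            if check(model, spec):                # epistemic model checking
                return theta                      # theta is an implementation
        return None                               # no implementation exists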

4 Example: Mutual Exclusion

To illustrate our approach we use a running example concerned with mutual exclusion.
Mutual exclusion protocols [7] are intended for settings where it is required that only
one of a set of agents has access to a resource (e.g. a printer, or a write access to a
file) at a given time. There exists a large literature on this topic, with many different
approaches to its solution [17].
To model the structure of a mutual exclusion protocol, we suppose that each agent
has three states: waiting, trying, and critical. Intuitively, while in the waiting
state, the agent does not require the resource, and it idles for some period of time until
it decides that it needs access to the resource. It then enters the trying state, where
it waits for permission to use the resource. Once this permission has been obtained, it
enters the critical state, within which it may use the resource. Once done, it exits
the critical state and returns to the waiting state. The overall structure of the protocol
is therefore a cycle waiting → trying → critical → waiting. To ensure fair
sharing of the resource, we assume that no agent remains in its critical state forever.
To avoid the situation where two agents are using the resource at the same time, the
specification requires that no two agents are in the critical state simultaneously. In
order for a solution to the mutual exclusion problem to satisfy this specification, the
agents need to share some information about their state and to place an appropriate
guard on the transition from the trying state to the critical state. Mutual exclusion
protocols differ in their approach to these requirements by providing different ways for
agents to use shared variables to distribute and exploit information about their state.
Our application of the synthesis methodology assumes that the designer has some
intuitions concerning what information needs to be distributed, and writes the protocol
and environment so as to reflect these ideas concerning information distribution.
However, given a pattern of communication, it may still be a subtle matter to determine
what information an agent can deduce from some particular values of its observable
variables. We use the epistemic specification to relate the information distributed and
the conditions used by the agent to make state transitions.
A general structure for a mutual exclusion protocol is given as a protocol template
in Figure 1. The code uses a simple programming language, containing a Dijkstra style
nondeterministic-if construct if e1 → P1 [] . . . [] ek → Pk fi, which nondeterministically
executes one of the statements Pi for which the corresponding guard ei evaluates to
true. The final ek may be the keyword otherwise which represents the negation of the
disjunction of the preceding ei . If there is no otherwise clause and none of the guards in
a conditional are true then the program defaults to a skip action. Evaluation of guards in
if and while statements is assumed to take zero time, and a transition occurs only once
an action is encountered in the execution. This applies also to an exit from a while loop.
/* protocol for agent i; initially state[i] = waiting */
while True do
begin
  /* waiting section: wait for some amount of time before entering the trying section */
  while state[i] = waiting do
    if True → skip [] True → EnterTry fi;
  /* trying section: wait until the condition represented by template variable xi holds */
  while state[i] = trying do
    if xi → EnterCrit [] otherwise → skip fi;
  /* critical section: stay critical for a random amount of time,
     return to waiting when done */
  while state[i] = critical do
    if True → skip [] True → ExitCrit fi
end

Fig. 1. Protocol template for a mutual exclusion solution
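As a reading aid only (this is not part of the formal development, and the function and variable names are ours), the template of Fig. 1 can be paraphrased as the following Python step function for a single agent; the argument guard_xi plays the role of the template variable xi under some candidate implementation θ(xi ), and random.choice resolves the nondeterministic choices of the Dijkstra-style if.

import random

WAITING, TRYING, CRITICAL = "waiting", "trying", "critical"

def agent_step(local_state, guard_xi):
    """One transition of the Fig. 1 template for a single agent.
    local_state: the current value of state[i];
    guard_xi: the boolean value of the template variable x_i under
    some candidate implementation theta(x_i).
    Returns the action taken in this step."""
    if local_state == WAITING:
        # waiting section: idle for a nondeterministic amount of time
        return random.choice(["skip", "EnterTry"])
    if local_state == TRYING:
        # trying section: enter the critical section only when x_i holds
        return "EnterCrit" if guard_xi else "skip"
    if local_state == CRITICAL:
        # critical section: stay critical for a nondeterministic time
        return random.choice(["skip", "ExitCrit"])
    raise ValueError(local_state)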
Variables in the programming notation are allowed to be of finite types (these are boolean-encoded in the translation to the semantic level). We assume that a vector of vari-
ables state indexed by agent names records the state in {waiting, trying, critical}
of each agent. Thus, mutual exclusion can be specified by the formula
AG ⋀i, j∈Ags, i≠ j ¬(state[i] = critical ∧ state[ j] = critical) .    (1)
The protocol template also uses three actions for the agent: EnterTry, EnterCrit and
ExitCrit, which correspond to entering the trying, critical and waiting states respec-
tively. We take the variables state[i] to be included in the set of environment variables
Vare . When there are n agents, with Ags = {0 . . . n − 1}, we assume the code for the
environment transition always includes the following:
for i = 0 . . . n − 1 do
if i.EnterTry → state[i] := trying
[] i.EnterCrit → state[i] := critical
[] i.ExitCrit → state[i] := waiting
fi
(Here i.a is a proposition that holds during the computation of any transition in which
agent i performs the action a.) Additional code describing the effect of these actions
may be included, which represents the way that the agents distribute information to
each other concerning their state. A number of different instantiations of this additional
code for these actions are discussed below.
In our epistemic specifications, we include in Φ, for each agent i, the following
constraint on the template variable xi that guards entry to the critical section:
AG(xi ⇔ Ki (AX(⋀ j∈Ags ( j ≠ i ⇒ state[ j] ≠ critical))))    (2)
Intuitively, this states that agent i enters its critical section when it knows that, after the next transition, no other agent will be in its critical section. Note that this formula falls
within the structure of the specifications for knowledge-based programs as discussed above. We also include in Φ the formula
⋀i∈Ags AG(state[i] = trying ⇒ AF state[i] = critical)    (3)
which requires that the protocol synthesized ensures that whenever an agent starts try-
ing, it is eventually able to enter its critical section.
One of the benefits of knowledge-based programs is that they enable the essential
reasons for correctness of a protocol to be abstracted in a way that separates the infor-
mation on the basis of which an agent acts from the way that this information is encoded
in the state of the system. This, it is argued, allows for simpler correctness proofs that
display the commonalities between different protocols solving the same problem.
This can be seen in the present specification: if the agents follow this specification,
then they will not violate mutual exclusion. The proof of this is straightforward; we
sketch it informally. Suppose that there is a violation of mutual exclusion, and let t
be the earliest time that we have state[i] = critical ∧ state[ j] = critical for some pair of agents i ≠ j. Then either i or j performs EnterCrit to enter its critical section at time t − 1. Assuming, without loss of generality, that it is agent i, we have xi at time t − 1, so by (2), we must have Ki (AX(⋀k∈Ags (k ≠ i ⇒ state[k] ≠ critical))) at time t − 1. But then (since validity of Ki φ ⇒ φ is immediate from the semantics of the knowledge operator), it follows that AX(state[ j] ≠ critical) at time t − 1,
contradicting the fact that the protocol makes a transition, in the next step, to a state
where state[ j] = critical.
We note that only the implication from left to right in (2) is used in this argument,
and it would also be valid if we removed the knowledge operator. This is an example of
a general point that led to the “sound local proposition” generalization of knowledge-
based programs proposed in [8]. However, weakening (2) to only the left to right part
allows the trivial implementation θ(xi ) = False, where no agent ever enters its critical
section. The implication from right to left in (2) amounts to saying that rather than this
very weak implementation, we want the strongest possible implementation where an
agent enters its critical section whenever it has sufficient information. Here the knowl-
edge operator is essential since, in general, the non-local condition inside the knowledge
operator will not be equivalent to a local proposition implementing xi .
The description above is not yet a complete solution to the mutual exclusion problem:
it remains to describe how agents distribute information about their state, and how the
data structures encoding this information are related to a local condition of the agent’s
state that can be substituted for the template variable so as to satisfy the epistemic
specification. We consider here two distinct patterns of information passing, based on
two overall systems architectures. In both cases KVari = {xi } for all agents i.
Ring Architecture: In the ring architecture we consider n agents Ags = {0, . . . , n − 1}
in a ring, with agent i able to communicate with agent i + 1 mod n. This communication
pattern is essentially that of token ring protocols. In this case we assume that communi-
cation is by means of a single bit for each agent i, represented by a variable bit[i]. We
take Vare = {bit[i], state[i] | i = 0 . . . n − 1} and let PVari = {bit[i], state[i]} and
LVari = ∅ and OVari = {bit[i]}. Agent i is able to affect its own bit as well as the bit of
agent i + 1 mod n through its actions. More precisely, we add to the above code for the
environment state transitions the following semantics for the ExitCrit actions:
for i = 0 . . . n − 1 do
if i.ExitCrit then begin bit[i] := ¬bit[i]; bit[i + 1 mod n] := ¬bit[i + 1 mod n] end
That is, on exiting the critical section, the agent flips the value of its own bit, as well
as the value of its successor’s bit. To ensure fairness, we also add to the environment,
for each agent i, the Büchi fairness constraint state[i] ≠ waiting, which says that the agent does not remain forever in the waiting state, but eventually tries to go critical. This ensures that this agent takes its turn and does not forever block other agents who may be trying to enter their critical section. We also add the fairness constraints state[i] ≠ critical to ensure that no agent stays in its critical section forever. (However, we do not include state[i] ≠ trying as a fairness constraint: it is up to the protocol to ensure
that an agent is eventually able to enter its critical section once it starts trying!)
Broadcast Architecture: In the broadcast architecture, we assume that the n agents
broadcast their state to all other agents. In this case, no additional variables are needed
and we take Vare = {state[ j] | j = 0 . . . n − 1}. Also, for each agent i, we take PVari =
OVari = Vare and LVari = ∅. The only code required for the actions EnterTry, EnterCrit
and ExitCrit is that given above for updating the variables state[i]. We do not
need to assume eventual progression from waiting to trying in this case (we allow an agent to wait forever), so the only fairness constraints are state[i] ≠ critical, ensuring that no agent is forever critical.
Implementation Example: We describe an example of an implementation in the case
of the ring architecture for mutual exclusion described above. We assume that initially,
bit[i] = 0 for all agents i. Consider the substitution defined by θ(xi ) = ¬bit[i] if i = 0 and θ(xi ) = bit[i] if i ≠ 0. (Note that these are boolean expressions in the observable
variables OVari = {bit[i]}.) It can be shown that this yields an implementation of the
epistemic protocol specification for the ring architecture (we discuss our automated
synthesis of this implementation below.) Intuitively, in this implementation, agent 0
initially holds the token, represented by bit[0] = 0. After using the token to enter its
critical section, it sets bit[0] = 1 to relinquish the token, and bit[1] = 1 in order
to pass the token to agent 1. Thus, for agent 1, holding the token is represented by
bit[1] being true. The same holds for the remaining agents. (Obviously, there is an
asymmetry in these conditions for the agents, but any solution needs to somehow break
the symmetry in the initial state.) Intuitively, specification formula (2) holds because the
implementation maintains the invariant that at most one of the conditions θ(xi ) guarding
entry to the agents’ critical sections holds at any time, and when it is false, the agent is
not in its critical section. Thus, the agent i for which θ(xi ) is true knows that no other
agent is in, or is able to enter, its critical section. Consequently, it knows that no other
agent will be in its critical section at the next moment of time.
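The token-ring reading above can be checked on a random simulation. The Python sketch below is illustrative only: it collapses the template of Fig. 1 into a direct state machine, applies the ExitCrit semantics of the ring architecture, uses the synthesized substitution θ(x0 ) = ¬bit[0] and θ(xi ) = bit[i] for i ≠ 0 as the guard, and asserts the mutual exclusion invariant (1) along the run. All names are ours.

import random

def run_ring(n=4, steps=2000, seed=0):
    """Randomly simulate the ring architecture under the synthesized
    substitution theta(x_0) = not bit[0], theta(x_i) = bit[i] for i != 0,
    checking the mutual exclusion invariant of formula (1)."""
    rng = random.Random(seed)
    state = ["waiting"] * n
    bit = [0] * n                          # initially bit[i] = 0 for all i

    def guard(i):                          # the synthesized theta(x_i)
        return (bit[i] == 0) if i == 0 else (bit[i] == 1)

    for _ in range(steps):
        # every agent chooses an action on the basis of the pre-step state
        actions = []
        for i in range(n):
            if state[i] == "waiting":
                actions.append(rng.choice(["skip", "EnterTry"]))
            elif state[i] == "trying":
                actions.append("EnterCrit" if guard(i) else "skip")
            else:  # critical
                actions.append(rng.choice(["skip", "ExitCrit"]))
        # the environment then applies the actions, as in Section 4
        for i, a in enumerate(actions):
            if a == "EnterTry":
                state[i] = "trying"
            elif a == "EnterCrit":
                state[i] = "critical"
            elif a == "ExitCrit":
                state[i] = "waiting"
                bit[i] ^= 1                # flip the agent's own bit ...
                bit[(i + 1) % n] ^= 1      # ... and its successor's bit
        assert state.count("critical") <= 1, (state, bit)

run_ring()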
5 Reduction of Synthesis to Model Checking
We now show how the synthesis of implementations of epistemic protocol specifications S can be reduced to the problem of epistemic model checking. The approach
essentially constructs a model that encodes all possible guesses of the environment, and
then uses model checking to determine which guesses actually yield an implementation.
The consideration of all guesses is done in bulk, using symbolic techniques.
For each agent i, let Oi be the set of boolean assignments to OVari ; this represents
the set of possible observations that agent i can make. We may associate to each o ∈ Oi
a conjunction ψo of literals over variables v in OVari , containing literal v if o(v) = 1 and
¬v otherwise.
Since an implementation θ(v) for a template variable v is a boolean condition over
observable variables, we may equivalently view this as corresponding to the set of ob-
servations on which it holds. This set can in turn be represented by its characteristic
mapping from Oi to boolean values. To represent the entire implementation θ, we intro-
duce for each agent i ∈ Ags a set of new boolean variables Xi , containing the variables xi,o,v , where o ∈ Oi and v ∈ KVari . Let X = ⋃i∈Ags Xi . We call X the implementation
variables of the epistemic protocol specification S.
A candidate assignment θ for an implementation of the epistemic protocol specifica-
tion, can be represented by a state χθ over the variables X, such that for an observation
o ∈ Oi and variable v ∈ KVari , we have xi,o,v ∈ χθ iff θ(v) holds with respect to assign-
ment o. Conversely, given a state χ over the variables X, we can construct an assignment
θχ mapping, for each agent i, the variables KVari to boolean conditions over OVari , by
θχ (v) = ⋁o∈Oi , xi,o,v ∈χ ψo .
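The two directions of this correspondence are easy to compute explicitly. The Python sketch below is purely illustrative (observations are represented as tuples of variable/value pairs and boolean conditions as formula strings); it is not the encoding used by our implementation.

from itertools import product

def observations(ovars):
    """All boolean assignments in O_i to the observable variables ovars."""
    for bits in product([0, 1], repeat=len(ovars)):
        yield tuple(zip(ovars, bits))

def psi(o):
    """The conjunction of literals psi_o characterising observation o."""
    return " & ".join(v if b else "!" + v for v, b in o)

def chi_of_theta(theta, ovars, kvars):
    """The state chi_theta over the implementation variables x_{i,o,v}:
    (o, v) is in chi iff theta(v) holds under observation o."""
    return {(o, v) for o in observations(ovars) for v in kvars
            if theta[v](dict(o))}

def theta_of_chi(chi, ovars, kvars):
    """The reverse direction: theta_chi(v) is the disjunction of psi_o
    over the observations o with x_{i,o,v} in chi."""
    return {v: " | ".join("(" + psi(o) + ")"
                          for o in observations(ovars) if (o, v) in chi)
            for v in kvars}

# e.g. agent i != 0 of the ring architecture: OVar_i = {bit[i]}, theta(x_i) = bit[i]
theta = {"x_i": lambda obs: obs["bit_i"] == 1}
chi = chi_of_theta(theta, ["bit_i"], ["x_i"])
print(theta_of_chi(chi, ["bit_i"], ["x_i"]))     # {'x_i': '(bit_i)'}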
To reduce synthesis to model checking, we construct a system in which the state
space is based on the variables X as well as a state of a model for the implementation. Given an environment E = ⟨Vare , Ie , Acts, −→e ⟩, we define an environment E X = ⟨VareX , IeX , Acts, −→eX ⟩ as follows. The variables making up states are defined to be
VareX = Vare ∪ X. The initial states are given by IeX = {s ∪ χ | s ∈ Ie , χ ∈ P(X)}, i.e., an
initial state is obtained by adding any assignment to variables X to an initial state of E.
The set of actions Acts is the same as in the environment E. Finally, the transition rela-
tion −→eX is given by s ∪ χ −→eX s′ ∪ χ′ iff s −→e s′ and χ = χ′ , where s, s′ ∈ P(Vare ) and χ, χ′ ∈ P(X).
Additionally, for each agent i, we transform its protocol template Pi = ⟨KVari , PVari , LVari , OVari , Ii , Actsi , −→i ⟩ into a concrete protocol PiX = ⟨PVariX , LVari , OVariX , Ii , Actsi , −→iX ⟩ for the environment E X . The local variables LVari and the initial states Ii are
exactly as in the protocol template. The parameter variables are given by PVariX =
PVari ∪ X, and the observable variables are given by OVariX = OVari ∪ X. The tran-
sition relation −→iX ⊆ P(PVari ∪ X ∪ LVari ) × Actsi × P(LVari ) is derived from the
transition relation −→i ⊆ P(KVari ∪ PVari ∪ LVari ) × Actsi × P(LVari ) as follows. For s ∈ P(PVari ∪ LVari ) and χ ∈ P(X), define κ(s, χ) ∈ P(KVari ) by
κ(s, χ) = {v ∈ KVari | s ∪ χ |= ⋁o∈Oi (ψo ∧ xi,o,v )} .
For li ∈ P(LVari ) and a ∈ Actsi , we then let s ∪ χ −→^a_iX li iff s ∪ κ(s, χ) −→^a_i li .
Intuitively, since the assignment χ to the variables X encodes an implementation θ,
we make these variables an input to the transformed protocol, which uses them to make
decisions that depend on the protocol template variables when executing the protocol
template. In particular, when an observation o = s ∩ OVari ∈ Oi (equivalently, s |= ψo )
satisfies xi,o,v ∈ χ, this corresponds to the template variable v taking the value true on
state s according to the implementation θ(v). We therefore execute a transition of the
protocol template in which v is taken to be true.
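In the same illustrative encoding used in the sketch after the definition of θχ above, the function κ can be computed directly from its definition: the observation made at s is s restricted to OVari , and v is included exactly when xi,o,v ∈ χ.

def kappa(s, chi, ovars, kvars):
    """kappa(s, chi): the knowledge template variables that the encoded
    implementation makes true at s; s maps PVar_i u LVar_i to 0/1."""
    o = tuple((v, s[v]) for v in ovars)    # the observation: s restricted to OVar_i
    return {v for v in kvars if (o, v) in chi}

# The transformed transition of P_i^X then runs the original template with
# KVar_i fixed to kappa(s, chi); e.g., reusing the agent_step sketch given
# after Fig. 1 (with local_state read off s in whatever encoding is used):
#   action = agent_step(local_state, "x_i" in kappa(s, chi, ["bit_i"], ["x_i"]))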
Note that the definition of the sets OVariX makes the variables X observable to all the
agents: this effectively makes the particular implementation being run common knowl-
edge to the agents, as it is in the system that we obtain from each concrete imple-
mentation. However, the combined transformed environment and transformed protocol
templates represent not just one implementation, but all possible implementations. This
is stated formally in the following result.
Theorem 2. Let S = ⟨Ags, E, {Pi }i∈Ags , Φ⟩ be an epistemic protocol specification, and
let X be the set of implementation variables of S. For each implementation θ of S, we
have M(E X , {PiX }i∈Ags ), s |= Φθ for all initial states s of M(E X , {PiX }i∈Ags ) with s∩X = χθ .
Conversely, suppose that χ ∈ P(X) is such that M(E X , {PiX }i∈Ags ), s |= Φθχ for all initial
states s of M(E X , {PiX }i∈Ags ) with s ∩ X = χ. Then θχ is an implementation of S.
This result gives a reduction from the synthesis problem to the well understood prob-
lem of model checking. Any algorithm for model checking specifications expressible in
the framework can now be applied. In particular, symbolic model checking techniques
apply. We have implemented the above approach as an extension of binary-decision di-
agram (BDD) based epistemic model checking algorithms already implemented in the
epistemic model checker MCK [11], which handles formulas in CTL∗ Kn with fairness
constraints using BDD based representations. The model checking techniques involved
are largely standard, as in [6], with a trivial extension to handle the epistemic operators
(these just require BDDs representing the set of reachable states and an equivalence on observable variables). We make one optimization, based on the observation that the
variables X encoded in the state do not actually change on any given run. We can there-
fore reduce the number of BDD variables required to represent the transition relation
by retaining only one copy of these variables. Also, we first compute the observations
o ∈ Oi that can occur at reachable states in any putative implementation, to reduce the
set X to variables xi,o,v where o is in fact a possible observation.
We note that the reduction does entail a blowup in the number of variables. Suppose
we have n agents, with the number of observable variables of agent i being ki . Then the
size of the set Xi could be as large as 2^ki |KVari |, so that |X| = Σi=1...n 2^ki |KVari | is the
number of new variables that need to be included in the BDD computation. With BDD-
based symbolic model checking currently typically viable for numbers of BDD variables in the order of hundreds, this places an inherent limit on the size of example that
we can expect to handle using our technique. Evidently, the technique favours examples
in which the number of observable variables per agent is kept small. This is reflected in
the results obtained for our running example, which we now discuss.
6 Solutions to the Mutual Exclusion Example
We have applied our implementation of the above reduction to the epistemic protocol specifications for mutual exclusion described in Section 4. Our technique computes the
Table 1. Running times (s) of Synthesis Experiments

No. of Agents        2      3      4      5      6      7      8
Ring (time)        0.3    1.7    5.5   17.2  157.7  509.1    597
  (No. BDD vars)    22     33     44     55     66     77     88
Broadcast (time)   0.2  194.2
  (No. BDD vars)    34    105    356
set of all possible implementations. We now describe the implementations obtained for
the two versions of this specification.
We note that, as defined above, two implementations, corresponding to substitu-
tions θ1 and θ2 for the template variables, may be behaviorally equivalent, yet for-
mally distinct. Define the equivalence relation ∼ on such substitutions by θ1 ∼ θ2 if
M(E, {Pi θ1 }i∈Ags ) and M(E, {Pi θ2 }i∈Ags ) have the same set of reachable states, and for all
such reachable states s, and all template variables v, we have M(E, {Pi θ1 }i∈Ags ), s |= θ1 (v)
iff M(E, {Pi θ2 }i∈Ags ), s |= θ2 (v). Intuitively, this means that θ1 and θ2 are equivalent, ex-
cept on unreachable states. We treat such implementations as identical and return only
one element of each equivalence class.
Ring Architecture: We have already discussed one of the possible implementations
of the epistemic protocol specification for the ring architecture as the example in Sec-
tion 4, viz., that in which θ(x0 ) = ¬bit0 and θ(xi ) = biti for i ≠ 0. Our synthesis
system returns this as one of the implementations synthesized. As discussed above, this
implementation essentially corresponds to a token ring protocol in which agent 0 ini-
tially holds the token. By symmetry, it is easily seen that we can take any agent k to
be the one initially holding the token, and each such choice yields an implementation,
with θ(xk ) = ¬bitk and θ(xi ) = biti for i ≠ k. Our synthesis system returns all these
solutions, but also confirms that there are no others. Thus, up to symmetry, there is
essentially just one implementation for this specification.
We note that, whatever the total number of agents n, the number of variables observ-
able to agent i is just one, so we have |Xi | = 2 and we add |X| = 2n variables to the
underlying BDD for model checking in order to perform synthesis. This gives a slow
growth rate in the number of BDD variables as we scale the number of agents, and
enables us to deal with moderate size instances. Table 1 gives the performance results
for our implementation as we scale the number of agents.2 The total number of BDD
variables per state (i.e., the environment variables, local protocol and program counter
variables and X) is also indicated.
Broadcast Architecture: In case of the broadcast architecture, the number of variables
that need to be added for synthesis increases much more rapidly. In case of n agents,
we have |Xi | = 2^2n (since we need two bits to represent each agent's state variable state[ j]), and |X| = n · 2^2n . Accordingly, the approach works only on modest scale
examples. We describe the solutions obtained in the case of 3 agents. Our synthesis
procedure computes that there exist 6 distinct solutions, which amount essentially to
2 Our experiments were conducted on a Debian Linux system, 3.3GHz Intel i5-2500 CPU, with each process allocated up to 500M memory.
[Figure: a directed graph over the states TTT:0, WTT:1, TTW:0, TWT:2, TWW:0, WWT:2 and WTW:1.]
Fig. 2. Structure of a ME protocol synthesized (3 agents, broadcast architecture)
one solution under permutation of the roles of the agents. To understand this solution,
note first that if any agent is in its critical section, all others know this, but cannot know
whether the agent will exit its critical section in the next step. It follows that no agent
is able to enter its critical section in the next step. It therefore suffices to consider the
behavior of the solution on states where no agent is in its critical section, but at least one
agent is in state trying. We describe this by means of the graph in Figure 2. Vertices
in this graph indicate the protocol state inhabited by each of the agents, as well as the
agent that the protocol selects for entry to the critical state, e.g., WTT:1 indicates that
agent 0 is in state waiting, and agents 1 and 2 are in state trying, and that agent
1 enters its critical state in the next step. The edges point to possible successor states
reached at the time the selected agent next exits its critical state. (Note that, at this time,
no other agent has had the opportunity to enter its critical state, but another agent may
have moved from waiting to trying, so there is some nondeterminism in the graph.)
It can be verified by inspection (a focus on the upper triangle suffices, since only one
agent is trying in states in the lower triangle) that the solution is fair: there is no cycle
where an agent is constantly trying but never selected for entry to the critical section.
7 Related Work
Most closely related to our work in this paper are results on the complexity of verifying
and deciding the existence of knowledge-based programs [20,10], with respect to what
is essentially the observational semantics. The key idea of these complexity results is
similar to the one we have used in our construction: guess a knowledge assignment that
indicates at which observations (local states, in their terminology) a knowledge formula
holds, and verify that this corresponds to an implementation. However, our epistemic
specifications are syntactically more expressive than knowledge-based programs, and
some of the details of their work are more complex, in that a labelling of runs by sub-
formulas of knowledge formulas is also required. In part this is because of the focus on
linear time temporal logic in this work, compared to our use of branching time temporal
logic. This work also does not consider any concrete implementation of the theoretical
results using symbolic techniques. The complexity bounds for determining the exis-
tence of an implementation of a knowledge-based program in [20,10] (NP-complete
for atemporal knowledge-based programs and PSPACE-complete for programs with LTL-based knowledge conditions) are lower than our NEXPTIME bound in Theorem 1
because they are based on an explicit-state rather than variable-based representation.
Our focus in this paper is on the observational semantics for knowledge. Other se-
mantics have been studied from the point of view of synthesis. Van der Meyden and
Vardi [18] consider, for the synchronous perfect recall semantics, the problem of syn-
thesizing a protocol satisfying a formula in a linear time temporal epistemic logic in a
given environment (with no limitations on the program structure of the solution). They
show the problem to be decidable only in the case of a single agent. Some restrictions
on environments and specifications are identified in [19] under which the problem be-
comes decidable. The problem can also be shown to be decidable for knowledge-based
programs that run only a finitely bounded number of steps: a symbolic technique for
implementing such programs with respect to synchronous semantics including perfect
recall and clock based semantics is developed in [14].
A number of papers have also applied model checking of knowledge properties
to synthesize distributed control strategies [4,12,15]. These works do not deal with
knowledge-based programs per se, however, and it is not guaranteed that the implement-
ing condition is equivalent to the desired knowledge property in the protocol
synthesized. However, these solutions would be included in the space of solutions of
specifications expressible in our more general framework.
Bonollo et al. [5] have previously proposed knowledge-based specifications for dis-
tributed mutual exclusion. However, this work deals only with the specification level,
and does not relate the specification developed to any concrete implementations.
Bar-David and Taubenfeld [1] considered the automated synthesis of mutual ex-
clusion protocols. In some respects their approach is more general than ours, in that
they synthesize the entire program structure, not just the implementations of conditions
within a program template. However, they do not consider epistemic specifications.
Also, compared to our symbolic approach, they essentially conduct a brute-force search over all possible implementations up to a given size of program code (with some optimizations to avoid redundant work), and they use explicit-state model checking to verify
an implementation. This limits the number of agents to which their approach can be
expected to scale: they consider only two-agent systems. They mention a construction by which a two-agent solution can be used to construct an n-agent solution, but this does
not amount to generation of all possible solutions for the n-agent case.
8 Conclusion
Our focus in this paper has been to develop an approach that enables the space of all
solutions to an epistemic protocol specification to be explored. Our implementation
gives the first tool with this capability with respect to the observational semantics for
knowledge, opening up the ability to more effectively explore the overall methodology
of the knowledge-based approach to concurrent systems design through experimenta-
tion with examples beyond the simple mutual exclusion protocol we have considered.
Application of the tool to the synthesis of fault-tolerant protocols, where the flow of knowledge is considerably more subtle than in the reliable setting we have considered, is one area that we intend to explore in future work. The use of alternative model checking approaches in place of the BDD-based algorithm we have used (e.g., SAT-based algorithms) is also worth exploring.
References
1. Bar-David, Y., Taubenfeld, G.: Automatic discovery of mutual exclusion algorithms. In: Fich,
F.E. (ed.) DISC 2003. LNCS, vol. 2848, pp. 136–150. Springer, Heidelberg (2003)
2. Bataineh, O.A., van der Meyden, R.: Abstraction for epistemic model checking of dining-
cryptographers based protocols. In: Proc. TARK, pp. 247–256 (2011)
3. Baukus, K., van der Meyden, R.: A knowledge based analysis of cache coherence. In: Davies,
J., Schulte, W., Barnett, M. (eds.) ICFEM 2004. LNCS, vol. 3308, pp. 99–114. Springer,
Heidelberg (2004)
4. Bensalem, S., Peled, D., Sifakis, J.: Knowledge based scheduling of distributed systems. In:
Manna, Z., Peled, D.A. (eds.) Time for Verification. LNCS, vol. 6200, pp. 26–41. Springer,
Heidelberg (2010)
5. Bonollo, U., van der Meyden, R., Sonenberg, E.: Knowledge-based specification: Investigat-
ing distributed mutual exclusion. In: Bar Ilan Symposium on Foundations of AI (2001)
6. Clarke, E., Grumberg, O., Peled, D.: Model Checking. The MIT Press (1999)
7. Dijkstra, E.W.: Solution of a problem in concurrent programming control. Commun.
ACM 8(9), 569 (1965)
8. Engelhardt, K., van der Meyden, R., Moses, Y.: Knowledge and the logic of local proposi-
tions. In: Proc. Conf. Theoretical Aspects of Knowledge and Rationality, pp. 29–41 (1998)
9. Fagin, R., Halpern, J., Moses, Y., Vardi, M.: Reasoning About Knowledge. MIT Press (1995)
10. Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y.: Knowledge-based programs. Distributed
Computing 10(4), 199–225 (1997)
11. Gammie, P., van der Meyden, R.: MCK: Model checking the logic of knowledge. In: Alur, R.,
Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 479–483. Springer, Heidelberg (2004)
12. Graf, S., Peled, D., Quinton, S.: Achieving distributed control through model checking. For-
mal Methods in System Design 40(2), 263–281 (2012)
13. Halpern, J.Y., Zuck, L.D.: A little knowledge goes a long way: Knowledge-based derivations
and correctness proofs for a family of protocols. J. ACM 39(3), 449–478 (1992)
14. Huang, X., van der Meyden, R.: Symbolic synthesis of knowledge-based program imple-
mentations with synchronous semantics. In: Proc. TARK, pp. 121–130 (2013)
15. Katz, G., Peled, D., Schewe, S.: Synthesis of distributed control through knowledge accumu-
lation. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 510–525.
Springer, Heidelberg (2011)
16. Lomuscio, A., Qu, H., Raimondi, F.: MCMAS: A model checker for the verification of multi-
agent systems. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 682–688.
Springer, Heidelberg (2009)
17. Srimani, P., Das, S.R. (eds.): Distributed Mutual Exclusion Algorithms. IEEE (1992)
18. van der Meyden, R., Vardi, M.Y.: Synthesis from knowledge-based specifications (Extended
abstract). In: Sangiorgi, D., de Simone, R. (eds.) CONCUR 1998. LNCS, vol. 1466, pp.
34–49. Springer, Heidelberg (1998)
19. van der Meyden, R., Wilke, T.: Synthesis of distributed systems from knowledge-based spec-
ifications. In: Abadi, M., de Alfaro, L. (eds.) CONCUR 2005. LNCS, vol. 3653, pp. 562–576.
Springer, Heidelberg (2005)
20. Vardi, M.Y.: Implementing knowledge-based programs. In: Proc. Conf. on Theoretical As-
pects of Rationality and Knowledge, pp. 15–30 (1996)
Synthesis for Human-in-the-Loop Control Systems
Wenchao Li1,⋆ , Dorsa Sadigh2 , S. Shankar Sastry2 , and Sanjit A. Seshia2
1 SRI International, Menlo Park, USA
[email protected]
2 University of California, Berkeley, USA
{dsadigh,sastry,sseshia}@eecs.berkeley.edu
Abstract. Several control systems in safety-critical applications involve the in-
teraction of an autonomous controller with one or more human operators. Exam-
ples include pilots interacting with an autopilot system in an aircraft, and a driver
interacting with automated driver-assistance features in an automobile. The cor-
rectness of such systems depends not only on the autonomous controller, but also
on the actions of the human controller. In this paper, we present a formalism for
human-in-the-loop (HuIL) control systems. Particularly, we focus on the problem
of synthesizing a semi-autonomous controller from high-level temporal specifica-
tions that expect occasional human intervention for correct operation. We present
an algorithm for this problem, and demonstrate its operation on problems related
to driver assistance in automobiles.
1 Introduction
Many safety-critical systems are interactive, i.e., they interact with a human being, and
the human operator’s role is central to the correct working of the system. Examples
of such systems include fly-by-wire aircraft control systems (interacting with a pilot),
automobiles with driver assistance systems (interacting with a driver), and medical de-
vices (interacting with a doctor, nurse, or patient). We refer to such interactive control
systems as human-in-the-loop control systems. The costs of incorrect operation in the
application domains served by these systems can be very severe. Human factors are
often the reason for failures or “near failures”, as noted by several studies (e.g., [1,7]).
One alternative to human-in-the-loop systems is to synthesize a fully autonomous
controller from a high-level mathematical specification. The specification typically cap-
tures both assumptions about the environment and correctness guarantees that the con-
troller must provide, and can be specified in a formal language such as linear temporal
logic (LTL) [15]. While this correct-by-construction approach looks very attractive, the
existence of a fully autonomous controller that can satisfy the specification is not al-
ways guaranteed. For example, in the absence of adequate assumptions constraining
its behavior, the environment can be modeled as being overly adversarial, causing the
synthesis algorithm to conclude that no controller exists. Additionally, the high-level
specification might abstract away from inherent physical limitations of the system, such
as insufficient range of sensors, which must be taken into account in any real implemen-
tation. Thus, while full manual control puts too high a burden on the human operator,
⋆ This work was performed when the first author was at UC Berkeley.
some element of human control is desirable. However, at present, there is no system-
atic methodology to synthesize a combination of human and autonomous control from
high-level specifications. In this paper, we address this limitation of the state of the
art. Specifically, we consider the following question: Can we devise a controller that
is mostly automatic and requires only occasional human interaction for correct opera-
tion? We formalize this problem of human-in-the-loop (HuIL) synthesis and establish
formal criteria for solving it.
A particularly interesting domain is that of automobiles with “self-driving” features,
otherwise also termed as “driver assistance systems”. Such systems, already capable
of automating tasks such as lane keeping, navigating in stop-and-go traffic, and paral-
lel parking, are being integrated into high-end automobiles. However, these emerging
technologies also give rise to concerns over the safety of an ultimately driverless car.
Recognizing the safety issues and the potential benefits of vehicle automation, the Na-
tional Highway Traffic Safety Administration (NHTSA) recently published a statement
that provides descriptions and guidelines for the continual development of these tech-
nologies [13]. Particularly, the statement defines five levels of automation ranging from
vehicles without any control systems automated (Level 0) to vehicles with full automa-
tion (Level 4). In this paper, we focus on Level 3 which describes a mode of automation
that requires only limited driver control:
“Level 3 - Limited Self-Driving Automation: Vehicles at this level of automa-
tion enable the driver to cede full control of all safety-critical functions un-
der certain traffic or environmental conditions and in those conditions to rely
heavily on the vehicle to monitor for changes in those conditions requiring
transition back to driver control. The driver is expected to be available for oc-
casional control, but with sufficiently comfortable transition time. The vehicle
is designed to ensure safe operation during the automated driving mode.” [13]
Essentially, this mode of automation stipulates that the human driver can act as a fail-
safe mechanism and requires the driver to take over control should something go wrong.
The challenge, however, lies in identifying the complete set of conditions under which
the human driver has to be notified ahead of time. Based on the NHTSA statement, we
identify four important criteria required for a human-in-the-loop controller to achieve
this level of automation.
1. Monitoring. The controller should be able to determine if human intervention is
needed based on monitoring past and current information about the system and its
environment.
2. Minimally Intervening. The controller should only invoke the human operator when
it is necessary, and does so in a minimally intervening manner.
3. Prescient. The controller can determine if a specification may be violated ahead of
time, and issues an advisory to the human operator in such a way that she has suffi-
cient time to respond.
4. Conditionally Correct. The controller should operate correctly until the point when
human intervention is deemed necessary.
We further elaborate and formally define these concepts later in Section 3. In gen-
eral, a human-in-the-loop controller, as shown in Figure 1, consists of three components: an automatic controller, a human operator, and an advisory control mechanism that orchestrates the switching between the auto-controller and the human
operator.1 In this setting, the auto-controller and the human operator can be viewed as
two separate controllers, each capable of producing outputs based on inputs from the en-
vironment, while the advisory controller is responsible for determining precisely when
the human operator should assume control while giving her enough time to respond.
Fig. 1. Human-in-the-Loop Controller: Component Overview
In this paper, we study the construction of such controller in the context of reactive
synthesis from LTL specifications. Reactive synthesis is the process of automatically
synthesizing a discrete system (e.g., a finite-state Mealy transducer) that reacts to en-
vironment changes in such a way that the given specification (e.g., a LTL formula) is
satisfied. There has been growing interest recently in the control and robotics commu-
nities (e.g., [20,9]) to apply this approach to automatically generate embedded control
software. In summary, the main contributions of this paper are:
• A formalization of human-in-the-loop control systems and the problem of synthesiz-
ing such controllers from high-level specifications, including four key criteria these
controllers must satisfy.
• An algorithm for synthesizing human-in-the-loop controllers that satisfy the afore-
mentioned criteria.
• An application of the proposed technique to examples motivated by driver-assistance
systems for automobiles.
The paper is organized as follows. Section 2 describes a motivating example of car following. Section 3 provides a formalism and characterization
of the human-in-the-loop controller synthesis problem. Section 4 reviews material on
reactive controller synthesis from temporal logic. Section 5 describes our algorithm for
the problem. We then present case studies of safety critical driving scenarios in Sec-
tion 6. Finally, we discuss related work in Section 7 and conclude in Section 8.
1 In this paper, we do not consider explicit dynamics of the plant. Therefore it can be considered as part of the environment also.
2 Motivating Example
Consider the example in Figure 2. Car A is the autonomous vehicle, car B and C are
two other cars on the road. We assume that the road has been divided into discretized
regions that encode all the legal transitions for the vehicles on the map, similar to the
discretization setup used in receding horizon temporal logic planning [21]. The objec-
tive of car A is to follow car B. Note that car B and C are part of the environment and
cannot be controlled. The notion of following can be stated as follows. We assume that
car A is equipped with sensors that allows it to see two squares ahead of itself if its view
is not obstructed, as indicated by the enclosed region by blue dashed lines in Figure 2a.
In this case, car B is blocking the view of car A, and thus car A can only see regions 3,
4, 5 and 6. Car A is said to be able to follow car B if it can always move to a position
where it can see car B. Furthermore, we assume that car A and C can move at most 2
squares forward, but car B can move at most 1 square ahead, since otherwise car B can
out-run or out-maneuver car A.
[Figure: two copies of a 2×5 road grid with cells numbered 1–10 and cars A, B and C placed on it. (a) A’s Sensing Range. (b) Failed to Follow.]
Fig. 2. Controller Synthesis – Car A Following Car B
Given this objective, and additional safety rules such as cars not crashing into one
another, our goal is to automatically synthesize a controller for car A such that:
• car A follows car B whenever possible;
• and in situations where the objective may not be achievable, switches control to the
human driver while allowing sufficient time for the driver to respond and take control.
In general, it is not always possible to come up with a fully automatic controller that
satisfies all requirements. Figure 2b illustrates such a scenario where car C blocks the
view as well as the movement path of car A after two time steps. The brown arrows
indicate the movements of the three cars in the first time step, and the purple arrows
indicate the movements of car B and C in the second time step. The position of a car X at time t is indicated by Xt . In this failure scenario, the autonomous vehicle needs to
notify the human driver since it has lost track of car B.
Hence, human-in-the-loop synthesis is tasked with producing an autonomous con-
troller along with advisories for the human driver in situations where her attention is
required. Our challenge, however, is to identify the conditions that we need to monitor
and notify the driver when they may fail. In the next section, we discuss how human
constraints such as response time can be simultaneously considered in the solution, and
mechanisms for switching control between the auto-controller and the human driver.
3 Formal Model of HuIL Controller
3.1 Preliminaries
Consider a Booleanized space over the input and output alphabet X = 2X and Y =
2Y , where X and Y are two disjoint sets of variables representing inputs and outputs
respectively, we model a discrete controller as a finite-state transducer. A finite-state
(Mealy) transducer (FST) is a tuple M = (Q, q0 , X , Y, ρ, δ), where Q is the set of
states, q0 ∈ Q is the initial state, ρ : Q × X → Q is the transition function, and
δ : Q × X → Y is the output function. Given an input sequence x = x0 x1 . . ., a run
of M is the infinite sequence q = q0 q1 . . . of states such that qk+1 = ρ(qk , xk ) for all
k ≥ 0. The run q on x produces the word M (x) = δ(q0 , x0 )δ(q1 , x1 ) . . .. The language
of M is then denoted by the set L(M ) = {(x, y)ω | M (x) = y}.
To characterize correctness of M , we assume that we can label if a state is unsafe
or not, by using a function F : Q → {true, false}, i.e. a state q is failure-prone if
F (q) = true. We elaborate on F later in Section 5.1.
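A minimal Python rendering of this definition (an illustrative sketch, not a representation used later in the paper; input and output letters are left as arbitrary Python values):

class FST:
    """A finite-state Mealy transducer M = (Q, q0, X, Y, rho, delta)."""
    def __init__(self, q0, rho, delta, failure=lambda q: False):
        self.q0 = q0
        self.rho = rho          # rho(q, x): transition function
        self.delta = delta      # delta(q, x): output function
        self.failure = failure  # the labelling F : Q -> {true, false}

    def run(self, xs):
        """The output word M(x) produced on a finite input prefix xs,
        together with the states visited along the run."""
        q, states, out = self.q0, [self.q0], []
        for x in xs:
            out.append(self.delta(q, x))
            q = self.rho(q, x)
            states.append(q)
        return out, states

# A one-state transducer that echoes its input:
echo = FST(q0=0, rho=lambda q, x: 0, delta=lambda q, x: x)
print(echo.run("abc")[0])    # ['a', 'b', 'c']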
3.2 Agents as Automata
We model two of the three agents in a human-in-the-loop controller, the automatic con-
troller AC and the advisory controller VC, as finite-state transducers (FSTs). The human
operator can be viewed as another FST HC that uses the same input and output interface
as the auto-controller. The overall controller HuIL is then a composition of the models
of HC, AC and VC.
We use a binary variable auto to denote the internal advisory signal that VC sends
to both AC and HC. Hence, X HC = X AC = X ∪ {auto}, and Y VC = {auto}. When
auto = false, it means the advisory controller is requiring the human operator to
take over control, and the auto-controller can have control otherwise.
We assume that the human operator (e.g., driver behind the wheel) can take control at
any time by transitioning from a “non-active” state to an “active” state, e.g., by hitting
a button on the dashboard or simply pressing down the gas pedal or the brake. When
HC is in the “active” state, the human operator essentially acts as the automaton that
produces outputs to the plant (e.g., a car) based on environment inputs. We use a binary
variable active to denote if HC is in the “active” state. When active = true, the
output of HC overwrites the output of AC, i.e., the output of HC is the output of HuIL. The “overwrite” action happens when a sensor senses the human operator is in control, e.g., putting her hands on the wheel. Similarly, when active = false, the output of HuIL is the output of AC. Note that even though the human operator is modeled as
a FST here, since we do not have direct control of the human operator, it can in fact
be any arbitrary relation mapping X to Y. Considering more complex human driver
models is left as a future direction [17].
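The switching logic of this composition can be sketched as follows. This is illustrative Python with an interface we made up for the purpose (advise/step methods): VC produces auto, both AC and HC read the environment input extended with auto, and HC's output overrides AC's exactly when active is sensed to be true.

def huil_step(ac, hc, vc, env_input):
    """One synchronous step of the composed controller HuIL."""
    auto = vc.advise(env_input)            # Y_VC = {auto}
    x = dict(env_input, auto=auto)         # X_HC = X_AC = X u {auto}
    y_ac = ac.step(x)                      # auto-controller's proposed output
    y_hc, active = hc.step(x)              # the human may take control at any time
    return (y_hc if active else y_ac), auto

# Trivial stand-ins so that the sketch runs; real AC/VC come from synthesis.
class ConstController:
    def __init__(self, out): self.out = out
    def step(self, x): return self.out
    def advise(self, x): return True       # always keep auto-control

class PassiveHuman:
    def step(self, x): return ({}, False)  # never takes over

print(huil_step(ConstController({"y": 1}), PassiveHuman(),
                ConstController(None), {"x": 0}))   # -> ({'y': 1}, True)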
3.3 Criteria for Human-in-the-Loop Controllers
One key distinguishing factor of a human-in-the-loop controller from a traditional controller is the involvement of a human operator. Hence, human factors such as response
time cannot be disregarded. In addition, we would like to minimize the need to engage
the human operator. Based on the NHTSA statement, we derive four criteria for any
effective human-in-the-loop controller, as stated below.
1. Monitoring. An advisory auto is issued to the human operator under specific con-
ditions. These conditions in turn need to be determined unambiguously at runtime,
potentially based on history information but not predictions. In a reactive setting,
this means we can use trace information only up to the point when the environment
provides a next input from the current state.
2. Minimally intervening. Our mode of interaction requires only selective human inter-
vention. An intervention occurs when HC transitions from the “non-active” state to
the “active” state (we discuss mechanisms for suggesting a transition from “active”
to “non-active” in Section 5.3, after prompted by the advisory signal auto being
false). However, frequent transfer of control would mean constant attention is
required from the human operator, thus nullifying the benefits of having the auto-
controller. In order to reduce the overhead of human participation, we want to mini-
mize a joint objective function C that combines two elements: (i) the probability that
when auto is set to false, the environment will eventually force AC into a failure
scenario, and (ii) the cost of having the human operator taking control. We formalize
this objective function in Sec. 5.1.
3. Prescient. It may be too late to seek the human operator’s attention when failure is
imminent. We also need to allow extra time for the human to respond and study the
situation. Thus, we require an advisory to be issued ahead of any failure scenario. In
the discrete setting, we assume we are given a positive integer T representing human
response time (which can be driver-specific), and require that auto is set to false
at least T number of transitions ahead of a state (in AC) that is unsafe.
4. Conditionally-Correct. The auto-controller is responsible for correct operation as
long as auto is set to true. Formally, if auto = true when AC is at a state q,
then F (q) = false. Additionally, when auto is set to false, the auto-controller
should still maintain correct operation in the next T − 1 time steps, during or after which we assume the human operator takes over control. Formally, if auto changes from true to false when AC is at a state q, let RT (q) be the set of states reachable from q within T − 1 transitions; then F (q′ ) = false for all q′ ∈ RT (q).
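On an explicit-state auto-controller the last two requirements amount to a bounded reachability check. The Python sketch below is illustrative (succ and failure are assumed to be given; failure plays the role of the labelling F from Section 3.1): RT (q) is computed by a depth-bounded breadth-first search, and the conditionally-correct condition asks that no state in it is failure-prone.

from collections import deque

def reach_within(q, succ, bound):
    """R_T(q): the states reachable from q in at most `bound` transitions."""
    seen, frontier = {q}, deque([(q, 0)])
    while frontier:
        p, d = frontier.popleft()
        if d == bound:
            continue
        for r in succ(p):
            if r not in seen:
                seen.add(r)
                frontier.append((r, d + 1))
    return seen

def conditionally_correct_at(q, succ, failure, T):
    """If auto is switched to false at q, the auto-controller must stay
    safe for the next T - 1 transitions: F(q') = false for all q' in R_T(q)."""
    return not any(failure(p) for p in reach_within(q, succ, T - 1))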
Now we are ready to state the HuIL Controller Synthesis Problem: Given a model
of the system and its specification expressed in a formal language, synthesize a HuIL
controller HuIL that is, by construction, monitoring, minimally intervening, prescient,
and conditionally correct.
In this paper, we study the synthesis of a HuIL controller in the setting of synthesis of
reactive systems from linear temporal logic (LTL) specifications. We give background
on this setting in Section 4, and propose an algorithm for solving the HuIL controller
synthesis problem in Section 5.
4 Synthesis from Temporal Logic
4.1 Linear Temporal Logic
An LTL formula is built from atomic propositions AP , Boolean connectives (i.e. nega-
tions, conjunctions and disjunctions), and temporal operators X (next) and U (until).
In this paper, we consider AP = X ∪ Y .
LTL formulas are usually interpreted over infinite words (traces) w ∈ Σ ω , where
Σ = 2AP . The language of an LTL formula ψ is the set of infinite words that satisfy
ψ, given by L(ψ) = {w ∈ Σ ω | w |= ψ}. One classic example is the LTL formula
G (p → F q), which means every occurrence of p in a trace must be followed by some
q in the future.
An LTL formula ψ is satisfiable if there exists an infinite word that satisfies ψ, i.e.,
∃w ∈ (2AP )ω such that w |= ψ. A transducer M satisfies an LTL formula ψ if L(M ) ⊆
L(ψ). We write this as M |= ψ. Realizability is the problem of determining whether
there exists a transducer M with input alphabet X = 2X and output alphabet Y = 2Y
such that M |= ψ.
4.2 Synthesis from GR(1) Specification
Synthesis is the process of automatically finding an implementation that satisfies a given
specification. However, the complexity of deciding the realizability of an LTL formula
can be prohibitively high (2EXPTIME-complete [16]). Piterman et al. [14] describe
a more efficient algorithm for synthesizing a subclass of LTL properties, known as
Generalized Reactivity (1) [GR(1)] formulas. In this paper, we consider (unrealizable)
specifications given in the GR(1) subclass. A GR(1) formula has the form ψ = ψ env →
ψ sys , where ψ env represents the environment assumptions and ψ sys represents the sys-
tem guarantees. The syntax of GR(1) formulas is given as follows. We require ψ l for
l ∈ {env, sys} to be a conjunction of sub-formulas in the following forms:
• ψil : a Boolean formula that characterizes the initial states.
• ψtl : an LTL formula that characterizes the transition, in the form G B, where B is
a Boolean combination of variables in X ∪ Y and expression X u where u ∈ X if
l = env and u ∈ X ∪ Y if l = sys.
• ψfl : an LTL formula that characterizes fairness, in the form G F B, where B is a
Boolean formula over variables in X ∪ Y .
4.3 Games and Strategies
In general, the synthesis problem can be viewed as a two-player game between the
system sys and the environment env. Following [14], a finite-state two-player game is
defined by its game graph, represented by the tuple G = (Qg , θg , ρenv , ρsys , W in) for
input variables X controlled by the environment env and output variables Y controlled
by the system sys, where Qg ⊆ 2X∪Y is the state space of the game, θg is a Boolean for-
mula over X ∪ Y that specifies the initial states of the game structure, ρenv ⊆ Qg × 2X
is the environment transition relation relating a present state in Qg to the possible next
inputs the environment can pick in 2X , ρsys ⊆ Qg × 2X × 2Y is the system transition
relation relating a present state in Qg and a next input in 2X picked by the environ-
ment to the possible next outputs the system can pick in 2Y , and W in is the winning
condition. Given a set of GR(1) specifications, i.e. ψienv , ψisys , ψtenv , ψtsys , ψfenv , ψfsys ,
we can define a game structure G by setting θg = ψienv ∧ ψisys , ρenv = ψtenv with
all occurrences of X u replaced by u′ 2 , ρsys = ψtsys with all occurrences of X u replaced by u′ , and W in as ψfenv → ψfsys . A play π of G is a maximal sequence of states
2 We use the primed copies u′ of u to denote the next input/output variables.
π = q0 q1 . . . such that q0 |= θg and (qi , qi+1 ) ∈ ρenv ∧ ρsys for all i ≥ 0. A
play π is winning for the system iff it is infinite and π |= W in. Otherwise, π is winning
for the environment. The set of states from which there exists a winning strategy for the
environment is called the winning region for env.
A finite-memory strategy for env in G is a tuple S env = (Γ env , γ env0 , η env ), where
Γ env
is a finite set representing the memory, γ env0 ∈ Γ env is the initial memory con-
tent, and η env ⊆ Qg × Γ env × X × Γ env is a relation mapping a state in G and some
memory content γ env ∈ Γ env to the possible next inputs the environment can pick and
an updated memory content. A strategy S env is winning for env from a state q if all
plays starting in q and conforming to S env are won by env. Following the terminology
used in [8], if a strategy S env is winning from an initial state q satisfying θg , then it is
called a counterstrategy for env. The existence of a counterstrategy is equivalent to the
specification being unrealizable. We refer the readers to [8] for details on how a coun-
terstrategy can be extracted from intermediate results of the fix-point computation for
the winning region for env. On the other hand, a winning strategy for the system can
be turned into an implementation, e.g., a sequential circuit with |X| inputs, |X| + |Y |
state-holding elements (flip-flops), and |Y | outputs that satisfies the given GR(1) speci-
fication. In this paper, the synthesized implementation is effectively the auto-controller
in the proposed HuIL framework, and can be viewed as a Mealy machine with state
space Q ⊆ 2X∪Y . We refer the readers to [14] for details of this synthesis process.
4.4 Counterstrategy Graph
The counterstrategy can be conveniently viewed as a transition system. A counterstrat-
egy graph Gc is a discrete transition system Gc = (Qc , Qc0 ⊆ Qc , ρc ⊆ Qc × Qc ),
where Qc ⊆ Qg × Γ env is the state space, Qc0 = Qg0 × γ env0 is the set of initial states,
and ρc = η env ∧ ρsys is the transition relation. In a nutshell, Gc describes evolutions
of the game state where env adheres to η env and sys adheres to ρsys . For convenience,
we use a function θc : Qc → 2X∪Y to denote the game state (an assignment to X and
Y ) associated with a state q c ∈ Qc . A run π c of Gc is a maximal sequence of states
π c = q0c q1c . . . such that q0c ∈ Qc0 and (qic , qi+1c ) ∈ ρc for all i ≥ 0.
We can also view Gc as a directed graph, where each state in Qc is given its own
node, and there is an edge from node qic to node qjc if given the current state at qic , there
exists a next input picked from the counterstrategy for which the system can produce a
legal next output so that the game proceeds to a new state at qjc .
5 HuIL Controller Synthesis
Given an unrealizable specification, a counterstrategy S env exists for env which de-
scribes moves by env such that it can force a violation of the system guarantees. The
key insight of our approach for synthesizing a HuIL controller is that we can synthesize
an advisory controller that monitors these moves and prompts the human operator with
sufficient time ahead of any danger. These moves are essentially assumptions on the
environment under which the system guarantees can be ensured. When these assump-
tions are not violated (the environment may behave in a benign way in reality), the
auto-controller fulfills the objective of the controller. On the other hand, if any of the
assumptions is violated, as flagged by the advisory controller, then the control is safely
switched to the human operator in a way that she can have sufficient time to respond.
The challenge, however, is to decide when an advisory should be sent to the human
operator, in a way that it is also minimally intervening to the human operator. We use
the following example to illustrate our algorithm.
Example 1. Consider X = {x}, Y = {y} and the following GR(1) sub-formulas which
together form ψ = ψ env → ψ sys .
1. ψfenv = G (F ¬x)
2. ψtsys = G (¬x → ¬y)
3. ψfsys = G (F y)
[Figure 3: (a) Counterstrategy graph Gc for unrealizable specification ψ. (b) Condensed graph Ĝc for Gc after contracting the SCC.]
Specification ψ is not realizable. Figure 3a shows the computed counterstrategy
graph Gc . The literal x̄ (ȳ) denotes the negation of the propositional variable x (y). The
memory content is denoted by γi with γ0 being the initial memory content. The three
shaded states on the left are the initial states. The literals on the edges indicate that the
environment first chooses x̄ and then the system chooses ȳ (the system is forced to pick ȳ due to ψtsys ). Observe that, according to the counterstrategy, the system will be forced
to pick ȳ perpetually. Hence, the other system guarantee ψfsys cannot be satisfied.

5.1 Weighted Counterstrategy Graph


Recall that a counterstrategy can be viewed as a discrete transition system or a directed
graph Gc . We consider two types of imminent failures (violation of some system guar-
antee specification) described by Gc .
• Safety violation. For a node (state) q1c ∈ Qc , if there does not exist a node q2c such
that (q1c , q2c ) ∈ ρc , then we say q1c is failure-imminent. In this scenario, after env
picks a next input according to the counterstrategy, sys cannot find a next output
such that all of the (safety) guarantees are satisfied (some ψisys or ψtsys is violated).
• Fairness violation. If a node q c is part of a strongly connected component (SCC)
scc in Qc , then we say q c is failure-doomed. For example, the node (x̄, ȳ, γ1 ) in
Figure 3a is a failure-doomed node. Starting from q c , env can always pick inputs in
such a way that the play is forced to get stuck in scc. Clearly, all other states in scc
are also failure-doomed.

Now we make the connection of the labeling function F for a controller M to the
counterstrategy graph Gc which describes behaviors that M should not exhibit. Con-
sider an auto-controller M and a state q (represented by the assignment xy) in M .
F (q) = true if and only if there exist some q c ∈ Qc such that θc (q c ) = xy and q c is
either failure-imminent or failure-doomed. In practice, it is not always the case that the
environment will behave in the most adversarial way. For example, a car in front may
yield if it is blocking our path. Hence, even though the specification is not realizable, it
is still important to assess, at any given state, whether it will actually lead to a violation.
For simplicity, we assume that the environment will adhere to the counterstrategy once
it enters a failure-doomed state.
We can convert Gc to its directed acyclic graph (DAG) embedding Ĝc = (Q̂c , Q̂c0 , ρ̂c )
by contracting each SCC in Gc to a single node. Figure 3b shows the condensed graph
Ĝc of Gc shown in Figure 3a. We use a surjective function fˆ : Qc → Q̂c to describe
the mapping of nodes from Gc to Ĝc . We say a node q̂ ∈ Q̂c is failure-prone if a node
q c ∈ Qc is either failure-imminent or failure-doomed and fˆ(q c ) = q̂.
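As an illustration of this condensation step (a sketch only, assuming the networkx library is available; it is not the authors' implementation), the SCC contraction and the marking of failure-prone nodes can be written as:

    # Condense G^c to the DAG Ghat^c and mark failure-prone nodes (sketch).
    import networkx as nx

    def condense_and_mark(Gc, failure_imminent):
        # Gc: nx.DiGraph over counterstrategy states
        # failure_imminent: set of states of G^c with no outgoing edge
        Ghat = nx.condensation(Gc)            # DAG; node attribute 'members' is the SCC
        failure_prone = set()
        for n, data in Ghat.nodes(data=True):
            members = data["members"]
            # failure-doomed: non-trivial SCC, or a single state with a self-loop
            doomed = len(members) > 1 or any(Gc.has_edge(m, m) for m in members)
            if doomed or (members & failure_imminent):
                failure_prone.add(n)
        return Ghat, failure_prone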
Recall from Section 3.3 that the notion of minimally-intervening requires the mini-
mization of a cost function C, which involves the probability that auto is set to false.
Thus far, we have not associated any probabilities with transitions taken by the environ-
ment or the system. While our approach can be adapted to work with any assignment
of probabilities, for ease of presentation, we make a particular choice in this paper.
Specifically, we assume that at each step, the environment picks a next-input uniformly
at random from the set of possible legal actions (next-inputs) obtained from η env given
the current state. In Example 1 and correspondingly Figure 3a, this means that it is
equally likely for env to choose x̄ or x from any of the states. We use c(q) to denote the
total number of legal actions that the environment can take from a state q.
In addition, we take into account the cost of having the human operator perform
the maneuver instead of the auto-controller. In general, this cost increases with longer
human engagement. Based on these two notions, we define a weight function ϖ, which
assigns a weight to an edge e ∈ Q̂c × Q̂c in Ĝc , recursively as follows. For an edge
between q̂i and q̂j ,

    ϖ(q̂i , q̂j ) = 1                                      if q̂j is failure-prone
    ϖ(q̂i , q̂j ) = pen(q̂i ) × len(q̂i ) / c(q̂i )          otherwise
where pen : Q̂c → Q+ is a user-defined penalty parameter3, and len : Q̂c → Z+ is the
length (number of edges) of the shortest path from a node q̂i to any failure-prone node
in Ĝc . Intuitively, a state far away from any failure-prone state is less likely to cause a
failure since the environment would need to make multiple consecutive moves all in an
adversarial way. However, if we transfer control at this state, the human operator will
have to spend more time in control, which is not desirable for a HuIL controller. Next,
we describe how to use this edge-weighted DAG representation of a counterstrategy
graph to derive a HuIL controller that satisfies the criteria established earlier.
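A direct way to realize this weight assignment is sketched below (hypothetical helper names; networkx is assumed for the distance computation, and we also assume every node of the condensed graph can reach some failure-prone node, which holds for a counterstrategy that forces a violation):

    # Edge weights on the condensed DAG, following the weight function above.
    import networkx as nx

    def assign_weights(Ghat, failure_prone, pen, c):
        # pen: node -> user-defined penalty;  c: node -> number of legal env actions
        # dist[q] = len(q): length of a shortest path from q to a failure-prone node
        dist = nx.multi_source_dijkstra_path_length(Ghat.reverse(copy=True), failure_prone)
        weight = {}
        for (qi, qj) in Ghat.edges():
            if qj in failure_prone:
                weight[(qi, qj)] = 1.0
            else:
                weight[(qi, qj)] = pen[qi] * dist[qi] / c[qi]
        return weight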

5.2 Counterstrategy-Guided Synthesis


Suppose we have a counterstrategy graph Gc that summarizes all possible ways for the
environment to force a violation of the system guarantees. Consider an outgoing edge
3
pen(q̂i ) should be chosen such that ϖ(q̂i , q̂j ) < 1.

from a non-failure-prone node q̂ in Ĝc (the condensed graph of Gc ). This edge encodes a


particular condition where the environment makes a next-move given some last move
made by the environment and the system. If some of these next-moves by the environ-
ment are disallowed, such that none of the failure-prone nodes are reachable from any
initial state, then we have effectively eliminated the counterstrategy. This means that if
we assert the negation of the corresponding conditions as additional ψtenv (environment
transition assumptions), then we can obtain a realizable specification.
Formally, we mine assumptions of the form φ = ⋀i (G (ai → ¬X bi )), where ai is
a Boolean formula describing a set of assignments over variables in X ∪ Y , and bi is a
Boolean formula describing a set of assignments over variables in X.
Under the assumption φ, if (φ ∧ ψ env ) → ψ sys is realizable, then we can automati-
cally synthesize an auto-controller that satisfies ψ. In addition, the key observation here
is that mining φ is equivalent to finding a set of edges in Ĝc such that, if these edges
are removed from Ĝc , then none of the failure-prone nodes is reachable from any initial
state. We denote such set of edges as E φ , where each edge e ∈ E φ corresponds to a con-
junct in φ. For example, if we remove the three outgoing edges from the source nodes
in Figure 3b, then the failure-prone node is not reachable. Removing these three edges
corresponds to adding the following environment assumption, which can be monitored
at runtime.
(G ((x ∧ y) → ¬X x̄)) ∧ (G ((x̄ ∧ ȳ) → ¬X x̄)) ∧ (G ((x ∧ ȳ) → ¬X x̄))
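For illustration, a runtime monitor for assumptions of this shape can be sketched as follows (our own encoding; the predicate representation is an assumption and the sketch is not the advisory controller generated by the tool):

    # Monitor for mined assumptions of the form G(a_i -> not X b_i):
    # a_i is a predicate over the current assignment to X and Y,
    # b_i is a predicate over the next environment input.
    def make_monitor(assumptions):
        def step(current_state, next_input):
            # returns the auto flag: False as soon as some assumption is violated
            for a, b in assumptions:
                if a(current_state) and b(next_input):
                    return False      # advise the human operator to take over
            return True
        return step

    # the three conjuncts displayed above (x, y are Booleans)
    assumptions = [
        (lambda s: s["x"] and s["y"],             lambda i: not i["x"]),
        (lambda s: not s["x"] and not s["y"],     lambda i: not i["x"]),
        (lambda s: s["x"] and not s["y"],         lambda i: not i["x"]),
    ]
    monitor = make_monitor(assumptions)
    print(monitor({"x": True, "y": True}, {"x": False}))   # -> False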
Human factors play an important role in the design of a HuIL controller. The crite-
ria established for a HuIL controller in Section 3.3 also require it to be prescient and
minimally intervening. Hence, we want to mine assumptions that reflect these crite-
ria as well. The notion of prescient essentially requires that none of the failure-prone
nodes is reachable from a non-failure-prone node with less than T steps (edges). The
weight function ϖ introduced earlier can be used to characterize the cost of a failing
assumption resulting in the advisory controller prompting the human operator to take
over control (by setting auto to false). Formally, we seek E φ such that the total cost
of switching control Σe∈E φ ϖ(e) is minimized.
We can formulate this problem as an s-t min-cut problem for directed acyclic graphs.
Given Ĝc , we first compute the subset of nodes Q̂cT ⊆ Q̂c that are backward reachable
within T − 1 steps from the set of failure-prone nodes (when T = 1, Q̂cT is the set of
failure-prone nodes). We assume that Q̂c0 ∩ Q̂cT = ∅. Next, we remove the set of nodes
Q̂cT from Ĝc and obtain a new graph ĜcT . Since ĜcT is again a DAG, we have a set of
source nodes and a set of terminal nodes. Thus, we can formulate an s-t min-cut problem
by adding a new source node that has an outgoing edge (with a sufficiently large weight)
to each of the source nodes and a new terminal node that has an incoming edge (with
a sufficiently large weight) from each of the terminal nodes. This s-t min-cut problem
can be easily solved by standard techniques [6]. The overall approach is summarized in
Algorithm 1.
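The reduction can be sketched as follows (illustrative Python with hypothetical names, using networkx's max-flow/min-cut routine; it is not the code of the implementation described in Section 6):

    # s-t min-cut formulation on the reduced DAG to obtain the edge set E^phi (sketch).
    import networkx as nx

    def mine_edge_set(GhatT, sources, terminals, weight):
        # GhatT: the condensed DAG after removing the nodes within T-1 steps
        #        of a failure-prone node; weight: edge -> cost
        BIG = 10 ** 9                      # "sufficiently large" capacity
        H = nx.DiGraph()
        for (u, v) in GhatT.edges():
            H.add_edge(u, v, capacity=weight[(u, v)])
        for u in sources:
            H.add_edge("s*", u, capacity=BIG)
        for v in terminals:
            H.add_edge(v, "t*", capacity=BIG)
        cut_value, (S, T) = nx.minimum_cut(H, "s*", "t*")
        # E^phi: the original edges crossing the cut
        return [(u, v) for (u, v) in GhatT.edges() if u in S and v in T]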
Theorem 1. Given a GR(1) specification ψ and a response time parameter T , Algo-
rithm 1 is guaranteed to either produce a fully autonomous controller satisfying ψ,
or a HuIL controller, modeled as a composition of an auto-controller AC, a human
operator and an advisory controller VC, that is monitoring, prescient with parameter

Algorithm 1 . Counterstrategy-Guided HuIL Controller Synthesis


Input: GR(1) specification ψ = ψ env → ψ sys .
Input: T : parameter for minimum human response time.
Output: AC and VC. HuIL is then a composition of AC, VC and HC.
if ψ is realizable then
Synthesize transducer M |= ψ (using standard GR(1) synthesis);
HuIL := M (fully autonomous).
else
Generate Gc from ψ (assume a single Gc ; otherwise the algorithm is performed iteratively);

Generate the DAG embedded Ĝc from Gc .


Assign weights to Ĝc using ϖ;
Reduce Ĝc to ĜcT by removing Q̂cT – nodes that are within T − 1 steps of any
failure-prone node;
Formulate a s-t min-cut problem with ĜcT ;
Solve the s-t min-cut problem to obtain E φ ;
Add assumptions φ to ψ to obtain the new specification ψnew := (φ ∧ ψ env ) → ψ sys ;
Synthesize AC so that M |= ψnew ;
Synthesize VC as a (stateless) monitor that outputs auto = false if φ is violated.
end if


T , minimally intervening4 with respect to the cost function fC = Σe∈E φ ϖ(e), and
conditionally correct5 .
Proof. (Sketch) When ψ is realizable, a fully autonomous controller is synthesized and
unconditionally satisfies ψ. Now consider the case when ψ is not realizable.
The HuIL controller is monitoring as φ only comprises a set of environment transi-
tions up to the next environment input.
It is prescient by construction. The auto flag advising the human operator to take
over control is set to false precisely when φ is violated. When φ is violated, it cor-
responds to the environment making a next-move from the current state q according to
some edge e = (q̂i , q̂j ) ∈ E φ . Consider any q c ∈ Qc such that fˆ(q c ) = q̂i , θc (q c ) = q.
Since q̂i ∈ Q̂cT by the construction of ĜcT , q̂i is at least T transitions away from any
failure-prone state in Ĝc . This means q c must also be at least T transitions away from
any failure-imminent state or failure-doomed state in Qc . Hence, by the definition of F
with respect to a failure-imminent or failure-doomed state in Section 5.1, q is (and auto
is set) at least T transitions ahead of any state that is unsafe.
The HuIL controller is also conditionally correct. By the same reasoning as above,
for any state q′ ∈ RT (q), F (q′ ) = false, i.e., q′ is safe.
Finally, since auto is set to false precisely when φ is violated, and φ in turn is
constructed based on the set of edges E φ , which minimizes the cost function fC =
Σe∈E φ ϖ(e), the HuIL controller is minimally-intervening with respect to the cost
function fC .
4
We assume the counterstrategy we use to mine the assumptions is an optimal one – it forces a
violation of the system guarantees as quickly as possible.
5
We assume that all failure-prone nodes are at least T steps away from any initial node.

5.3 Switching from Human Operator to Auto-Controller


Once control has been transferred to the human operator, when should the human yield
control to the autonomous controller again? One idea is for the HuIL controller to con-
tinually monitor the environment after the human operator has taken control, checking
if a state is reached from which the auto-controller can ensure that it satisfies the specifi-
cation (under assumption φ), and then the advisory controller can signal a driver telling
her that the auto-controller is ready to take back control. We note that alternative ap-
proaches may exist and we plan to investigate this further in future work.

6 Experimental Results
Our algorithm is implemented as an extension to the GR(1) synthesis tool RATSY [4].
Due to space constraints, we discuss only the car-following example (as shown in Section 2)
here and refer the readers to http://verifun.eecs.berkeley.edu/tacas14/ for other examples.
Recall the car-following example shown in Section 2. We describe some of the more
interesting specifications below and their corresponding LTL formulas. pA , pB , pC are
used to denote the positions of car A, B and C respectively.
• Any position can be occupied by at most one car at a time (no crashing):
 
G (pA = x → (pB ≠ x ∧ pC ≠ x))
where x denotes a position on the discretized space. The cases for B and C are
similar, but they are part of ψenv .
• Car A is required to follow car B:
 
G (vAB = true ∧ pA = x) → X (vAB = true)
where vAB = true iff car A can see car B.
• Two cars cannot cross each other if they are right next to each other. For example,
when pC = 5, pA = 6 and pC = 8 (in the next cycle), then pA ≠ 7 (in the next cycle). In LTL,

G (((pC = 5) ∧ (pA = 6) ∧ (X pC = 8)) → (X (pA ≠ 7)))

The other specifications can be found in the link described at the beginning of this
section. Observe that car C can in fact force a violation of the system guarantees in one
step under two situations – when pC = 5, pB = 8 and pA = 4, or pC = 5, pB = 8 and
pA = 6. Both are situations where car C is blocking the view of car A, causing it to
lose track of car B. The second failure scenario is illustrated in Figure 2b.
Applying our algorithm to this (unrealizable) specification with T = 1, we obtain
the following assumption φ.
 
φ = G (((pA = 4) ∧ (pB = 6) ∧ (pC = 1)) → ¬X ((pB = 8) ∧ (pC = 5))) ∧
    G (((pA = 4) ∧ (pB = 6) ∧ (pC = 1)) → ¬X ((pB = 6) ∧ (pC = 3))) ∧
    G (((pA = 4) ∧ (pB = 6) ∧ (pC = 1)) → ¬X ((pB = 6) ∧ (pC = 5)))

In fact, φ corresponds to three possible evolutions of the environment from the initial
state. In general, φ can be a conjunction of conditions at different time steps as env and
sys progress. The advantage of our approach is that it can produce φ such that we can
synthesize an auto-controller that is guaranteed to satisfy the specification if φ is not
violated, together with an advisory controller that prompts the driver (at least) T (T =
1 in this case) time steps ahead of a potential failure when φ is violated.

7 Related Work
Similar to [9], we synthesize a discrete controller from temporal logic specifications.
Wongpiromsarn et al. [21] consider a receding horizon framework to reduce the synthe-
sis problem to a set of simpler problems for a short horizon. Livingston et al. [11,12]
exploit the notion of locality that allows “patching” a nominal solution. They update the
local parts of the strategy as new data accumulates allowing incremental synthesis. The
key innovation in this paper is that we consider synthesizing interventions to combine
an autonomous controller with a human operator.
Our work is inspired by the recent works on assumption mining. Chatterjee et al. [5]
construct a minimal environment assumption by removing edges from the game graph
to ensure safety assumptions, then compute liveness assumptions to put additional fair-
ness constraints on the remaining edges. Li et al. [10] and later Alur et al. [2] use a
counterstrategy-guided approach to mine environment assumptions for GR(1) specifi-
cations. We adapt this approach to the synthesis of human-in-the-loop control systems.
In recent years, there has been an increasing interest in human-in-the-loop systems
in the control systems community. Anderson et al. [3] study obstacle avoidance and
lane keeping for semiautonomous cars. They use a model predictive control for their
autonomous control. Our approach, unlike this one, seeks to provide correctness guar-
antees in the form of temporal logic properties. Vasudevan et al. [19] focus on learning
and predicting a human model based on prior observations. Based on the measured level
of threat, the controller intervenes and overwrites the driver’s input. However, we believe
that allowing an auto-controller to override the human inputs is unsafe especially since
it is hard to fully model the environment. We propose a different paradigm where we
allow the human to take control if the autonomous system predicts failure. Finally, hu-
man’s reaction time while driving is an important consideration in this paper. The value
of reaction time can range from 1 to 2.5 seconds for different tasks and drivers [18].

8 Conclusions
In this paper, we propose a synthesis approach for designing human-in-the-loop con-
trollers. We consider a mode of interaction where the controller is mostly autonomous
but requires occasional intervention by a human operator, and study important criteria
for devising such controllers. Further, we study the problem in the context of controller
synthesis from (unrealizable) temporal-logic specifications. We propose an algorithm
based on mining monitorable conditions from the counterstrategy of the unrealizable
specifications. Preliminary results on applying this approach to driver assistance in au-
tomobiles are encouraging. One limitation of the current approach is the use of an ex-
plicit counterstrategy graph (due to weight assignment). We plan to explore symbolic
algorithms in the future.

Acknowledgment. This work was supported in part by TerraSwarm, one of six centers
of STARnet, a Semiconductor Research Corporation program sponsored by MARCO
and DARPA. This work was also supported by the NSF grants CCF-1116993 and
CCF-1139138.
References
1. Federal Aviation Administration. The interfaces between flight crews and modern flight sys-
tems (1995)
2. Alur, R., et al.: Counter-strategy guided refinement of gr(1) temporal logic specifications. In:
The Conference on Formal Methods in Computer-Aided Design, pp. 26–33 (2013)
3. Anderson, S.J., et al.: An optimal-control-based framework for trajectory planning, threat as-
sessment, and semi-autonomous control of passenger vehicles in hazard avoidance scenarios.
International Journal of Vehicle Autonomous Systems 8(2), 190–216 (2010)
4. Bloem, R., Cimatti, A., Greimel, K., Hofferek, G., Könighofer, R., Roveri, M., Schuppan, V.,
Seeber, R.: RATSY – A new requirements analysis tool with synthesis. In: Touili, T., Cook,
B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 425–429. Springer, Heidelberg (2010)
5. Chatterjee, K., Henzinger, T.A., Jobstmann, B.: Environment assumptions for synthesis.
In: van Breugel, F., Chechik, M. (eds.) CONCUR 2008. LNCS, vol. 5201, pp. 147–161.
Springer, Heidelberg (2008)
6. Costa, M.-C., et al.: Minimal multicut and maximal integer multiflow: A survey. European
Journal of Operational Research 162(1), 55–69 (2005)
7. Kohn, L.T., et al.: To err is human: Building a safer health system. Technical report, A report
of the Committee on Quality of Health Care in America, Institute of Medicine (2000)
8. Könighofer, R., et al.: Debugging formal specifications using simple counterstrategies. In:
Conference on Formal Methods in Computer-Aided Design, pp. 152–159 (2009)
9. Kress-Gazit, H., et al.: Temporal-logic-based reactive mission and motion planning. IEEE
Transactions on Robotics 25(6), 1370–1381 (2009)
10. Li, W., et al.: Mining assumptions for synthesis. In: Conference on Formal Methods and
Models for Codesign, pp. 43–50 (2011)
11. Livingston, S.C., et al.: Backtracking temporal logic synthesis for uncertain environments.
In: Conference on Robotics and Automation, pp. 5163–5170 (2012)
12. Livingston, S.C., et al.: Patching task-level robot controllers based on a local μ-calculus
formula. In: Conference on Robotics and Automation, pp. 4588–4595 (2013)
13. National Highway Traffic Safety Administration. Preliminary statement of policy concerning
automated vehicles (May 2013)
14. Piterman, N., Pnueli, A., Sa’ar, Y.: Synthesis of reactive(1) designs. In: Emerson, E.A.,
Namjoshi, K.S. (eds.) VMCAI 2006. LNCS, vol. 3855, pp. 364–380. Springer, Heidelberg
(2006)
15. Pnueli, A.: The temporal logic of programs. In: Annual Symposium on Foundations of Com-
puter Science, pp. 46–57 (1977)
16. Rosner, R.: Modular synthesis of reactive systems. Ph.D. dissertation, Weizmann Institute of
Science (1992)
17. Sadigh, D., et al.: Data-driven probabilistic modeling and verification of human driver be-
havior. In: Formal Verification and Modeling in Human-Machine Systems (2014)
18. Triggs, T.J., et al.: Reaction time of drivers to road stimuli (1982)
19. Vasudevan, R., et al.: Safe semi-autonomous control with enhanced driver modeling. In:
American Control Conference, pp. 2896–2903 (2012)
20. Wongpiromsarn, T., et al.: Receding horizon temporal logic planning for dynamical systems.
In: Conference on Decision and Control, pp. 5997–6004 (2009)
21. Wongpiromsarn, T., et al.: Receding horizon temporal logic planning. IEEE Transactions on
Automatic Control 57(11), 2817–2830 (2012)
Learning Regular Languages over Large Alphabets

Oded Maler and Irini-Eleftheria Mens

CNRS-VERIMAG
University of Grenoble
France

Abstract. This work is concerned with regular languages defined over large al-
phabets, either infinite or just too large to be expressed enumeratively. We define
a generic model where transitions are labeled by elements of a finite partition of
the alphabet. We then extend Angluin’s L∗ algorithm for learning regular lan-
guages from examples for such automata. We have implemented this algorithm
and we demonstrate its behavior where the alphabet is the set of natural numbers.

1 Introduction

The main contribution of this paper is a generic algorithm for learning regular languages
defined over a large alphabet Σ. Such an alphabet can be infinite, like N or R or just
so large, like Bn for very large n, that it is impossible or impractical to treat it in an
enumerative way, that is, to write down δ(q, a) for every a ∈ Σ. The obvious solution
is to use a symbolic representation where transitions are labeled by predicates which are
applicable to the alphabet in question. Learning algorithms infer an automaton from a
finite set of words (the sample) for which membership is known. Over small alphabets,
the sample should include the set S of all the shortest words that lead to each state and,
in addition, the set S · Σ of all their Σ-continuations. Over large alphabets this is not
a practical option and as an alternative we develop a symbolic learning algorithm over
symbolic words which are only partially backed up by the sample. In a sense, our algo-
rithm is a combination of automaton learning and learning of non-temporal functions.
Before getting technical, let us discuss briefly some motivation.
Finite automata are among the corner stones of Computer Science. From a practical
point of view they are used daily in various domains ranging from syntactic analy-
sis, design of user interfaces or administrative procedures to implementation of digital
hardware and verification of software and hardware protocols. Regular languages ad-
mit a very nice, clean and comprehensive theory where different formalisms such as
automata, logic, regular expressions, semigroups and grammars are shown to be equiv-
alent. As for learning from examples, a problem introduced by Moore [Moo56], the
Nerode right-congruence relation [Ner58] which declares two input histories as equiv-
alent if they lead to the same future continuations, provides a crisp characterization of
what a state in a dynamical system is in terms of observable input-output behavior.
All algorithms for learning automata from examples, starting with the seminal work of
Gold [Gol72] and culminating in the well-known L∗ algorithm of Angluin [Ang87] are
based on this concept [DlH10].


One weakness, however, of the classical theory of regular languages is that it is rather
“thin” and “flat”. In other words, the alphabet is often considered as a small set devoid of
any additional structure. On such alphabets, classical automata are good for expressing
and exploring the temporal (sequential, monoidal) dimension embodied by the concate-
nation operations, but less good in expressing “horizontal” relationships. To make this
statement more concrete, consider the verification of a system consisting of n automata
running in parallel, making independent as well as synchronized transitions. To express
the set of joint behaviors of this product of automata as a formal language, classical
theory will force you to use the exponential alphabet of global states and indeed, a large
part of verification is concerned with fighting this explosion using constructs such as
BDDs and other logical forms that exploit the sparse interaction among components.
This is done, however, without a real interaction with classical formal language theory
(one exception is the theory of traces [DR95] which attempts to treat this issue but in a
very restricted context).1
These and other considerations led us to use symbolic automata as a generic frame-
work for recognizing languages over large alphabets where transitions outgoing from a
state are labeled, semantically speaking, by subsets of the alphabet. These subsets are
expressed syntactically according to the specific alphabet used: Boolean formulae when
Σ = Bn or by some classes of inequalities when Σ = N. Determinism and complete-
ness of the transition relation, which are crucial for learning and minimization, can be
enforced by requiring that the subsets of Σ that label the transitions outgoing from a
given state form a partition of the alphabet.
Readers working on program verification or hybrid automata are, of course, aware
of automata with symbolic transition guards but it should be noted that in our model no
auxiliary variables are added to the automaton. Let us stress this point by looking at a
popular extension of automata to infinite alphabets, initiated by Kaminski and Francez
[KF94] using register automata to accept data languages (see [BLP10] for theoretical
properties and [HSJC12] for learning algorithms). In that framework, the automaton
is augmented with additional registers that can store some input letters. The registers
can then be compared with newly-read letters and influence transitions. With register
automata one can express, for example, the requirement that your password at login is
the same as the password at sign-up. This very restricted use of memory makes register
automata much simpler than more notorious automata with variables whose emptiness
problem is typically undecidable. The downside is that beyond equality they do not
really exploit the potential richness of the alphabets/theories.
Our approach is different: we do allow the values of the input symbols to influence
transitions via predicates, possibly of a restricted complexity. These predicates involve
domain constants and they partition the alphabet into finitely many classes. For exam-
ple, over the integers a state may have transitions labeled by conditions of the form
c1 ≤ x ≤ c2 which give real (but of limited resolution) access to the input domain. On
the other hand, we insist on a finite (and small) memory so that the exact value of x
cannot be registered and has no future influence beyond the transition it has triggered.
The symbolic transducers, recently introduced by [VHL+ 12], are based on the same

1
This might also be the reason that Temporal Logic is more popular in verification than regular
expressions because the nature of until is less global and less synchronous than concatenation.

principle. Many control systems, artificial (sequential machines working on quantized


numerical inputs) as well as natural (central nervous system, the cell), are believed to
operate in this manner.
We then develop a symbolic version of Angluin’s L∗ algorithm for learning regular
sets from queries and counter-examples whose output is a symbolic automaton. The
main difference relative to the concrete algorithm is that in the latter, every transition
δ(q, a) in a conjectured automaton has at least one word in the sample that exercises
it. In the symbolic case, a transition δ(q, a) where a is a set of concrete symbols, will
be backed up in the sample only by a subset of a. Thus, unlike concrete algorithms
where a counter-example always leads to a discovery of one or more new states, in
our algorithm it may sometimes only modify the boundaries between partition blocks
without creating new states.
The rest of the paper is organized as follows. In Section 2 we provide a quick sum-
mary of learning algorithms over small alphabets. In Section 3 we define symbolic
automata and then extend the structure which underlies all automaton learning algo-
rithms, namely the observation table, to be symbolic, where symbolic letters represent
sets, and where entries in the table are supported only by partial evidence. In Section 4
we write down a symbolic learning algorithm and illustrate the behavior of a prototype
implementation on learning subsets of N∗ . We conclude by a discussion of past and
future work.

2 Learning Concrete Automata

We briefly survey Angluin’s L∗ algorithm [Ang87] for learning regular sets from mem-
bership queries and counter-examples, with slightly modified definitions to accommo-
date for its symbolic extension. Let Σ be a finite alphabet and let Σ ∗ be the set of
sequences (words) over Σ. Any order relation < over Σ can be naturally lifted to a
lexicographic order over Σ ∗ . With a language L ⊆ Σ ∗ we associate a characteristic
function f : Σ ∗ → {0, 1}.
A deterministic finite automaton over Σ is a tuple A = (Σ, Q, δ, q0 , F ), where Q is a
non-empty finite set of states, q0 ∈ Q is the initial state, δ : Q×Σ → Q is the transition
function, and F ⊆ Q is the set of final or accepting states. The transition function δ can
be extended to δ : Q × Σ ∗ → Q, where δ(q, ε) = q and δ(q, u · a) = δ(δ(q, u), a)
for q ∈ Q, a ∈ Σ and u ∈ Σ ∗ . A word w ∈ Σ ∗ is accepted by A if δ(q0 , w) ∈ F ,
otherwise w is rejected. The language recognized by A is the set of all accepted words
and is denoted by L(A).
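As a concrete reference point (our own illustration, not code from the paper), such an automaton and its extended transition function can be written in a few lines of Python; the instance below is a DFA for the language a · Σ ∗ over Σ = {a, b} used in the example later in this section:

    # A concrete DFA with the extended transition function (sketch).
    class DFA:
        def __init__(self, sigma, delta, q0, accepting):
            self.sigma = sigma            # finite alphabet
            self.delta = delta            # dict: (state, letter) -> state
            self.q0 = q0
            self.accepting = set(accepting)

        def run(self, word):
            q = self.q0
            for a in word:                # delta(q, u.a) = delta(delta(q, u), a)
                q = self.delta[(q, a)]
            return q

        def accepts(self, word):
            return self.run(word) in self.accepting

    dfa = DFA({"a", "b"},
              {("q0", "a"): "q1", ("q0", "b"): "q2",
               ("q1", "a"): "q1", ("q1", "b"): "q1",
               ("q2", "a"): "q2", ("q2", "b"): "q2"},
              "q0", {"q1"})
    assert dfa.accepts("ab") and not dfa.accepts("ba")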
Learning algorithms, represented by the learner, are designed to infer an unknown
regular language L (the target language). The learner aims to construct a finite automa-
ton that recognizes the target language by gathering information from the teacher. The
teacher knows the target language and can provide information about it. It can answer
two types of queries: membership queries, i.e., whether a word belongs to the target
language, and equivalence queries, i.e., whether a conjectured automaton suggested by
the learner is the right one. If this automaton fails to accept L the teacher responds to
the equivalence query by a counter-example, a word misclassified by the conjectured
automaton.

In the L∗ algorithm, the learner starts by asking membership queries. All information
provided is suitably gathered in a table structure, the observation table. Then, when the
information is sufficient, the learner constructs a hypothesis automaton and poses an
equivalence query to the teacher. If the answer is positive then the algorithm terminates
and returns the conjectured automaton. Otherwise the learner accommodates the in-
formation provided by the counter-example into the table, asks additional membership
queries until it can suggest a new hypothesis and so on, until termination.
A prefix-closed set S ⊎ R ⊂ Σ ∗ is a balanced Σ-tree if ∀a ∈ Σ: 1) For every s ∈ S,
s · a ∈ S ∪ R, and 2) For every r ∈ R, r · a ∉ S ∪ R. Elements of R are called boundary
elements or leaves.

Definition 1 (Observation Table). An observation table is a tuple T = (Σ, S, R, E, f )


such that Σ is an alphabet, S ∪ R is a finite balanced Σ-tree, E is a subset of Σ ∗ and
f : (S ∪ R) · E → {0, 1} is the classification function, a restriction of the characteristic
function of the target language L.

The set (S ∪ R) · E is the sample associated with the table, that is, the set of words
whose membership is known. The elements of S admit a tree structure isomorphic to a
spanning tree of the transition graph rooted in the initial state. Each s ∈ S corresponds
to a state q of the automaton for which s is an access sequence, one of the shortest words
that lead from the initial state to q. The elements of R should tell us about the back- and
cross-edges in the automaton and the elements of E are “experiments” that should be
sufficient to distinguish between states. This works by associating with every s ∈ S ∪ R
a specialized classification function fs : E → {0, 1}, defined as fs (e) = f (s·e), which
characterizes the row of the observation table labeled by s. To build an automaton from
a table it should satisfy certain conditions.

Definition 2 (Closed, Reduced and Consistent Tables). An observation table T is:


– Closed if for every r ∈ R, there exists an s ∈ S, such that fr = fs ;
– Reduced if for every s, s′ ∈ S, fs ≠ fs′ ;
– Consistent if for every s, s′ ∈ S, fs = fs′ implies fs·a = fs′·a , ∀a ∈ Σ.

Note that a reduced table is trivially consistent and that for a closed and reduced table
we can define a function g : R → S mapping every r ∈ R to the unique s ∈ S such
that fs = fr . From such an observation table T = (Σ, S, R, E, f ) one can construct an
automaton AT = (Σ, Q, q0 , δ, F ) where Q = S, q0 = ε, F = {s ∈ S : fs (ε) = 1} and

    δ(s, a) = s · a        when s · a ∈ S
    δ(s, a) = g(s · a)     when s · a ∈ R

The learner attempts to keep the table closed at all times. The table is not closed
when there is some r ∈ R such that fr is different from fs for all s ∈ S. To close
the table, the learner moves r from R to S and adds the Σ-successors of r to R. The
extended table is then filled up by asking membership queries until it becomes closed.
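The closing step can be sketched in Python as follows (an illustration with our own naming; words are represented as strings, ε as the empty string, and membership_query stands for the teacher's MQ oracle):

    # Closing a concrete observation table (sketch).
    def row(f, s, E):
        # the function f_s restricted to the experiments E
        return tuple(f[s + e] for e in E)

    def close_table(S, R, E, f, sigma, membership_query):
        changed = True
        while changed:
            changed = False
            for r in list(R):
                if all(row(f, r, E) != row(f, s, E) for s in S):
                    R.remove(r); S.append(r)          # r becomes a state
                    for a in sigma:                   # add its continuations to R
                        R.append(r + a)
                        for e in E:
                            f.setdefault(r + a + e, membership_query(r + a + e))
                    changed = True
        return S, R, f

    # e.g. closing T0 of Fig. 1 for the target a . Sigma*:
    # close_table([""], ["a", "b"], [""], {"": 0, "a": 1, "b": 0}, "ab",
    #             lambda w: int(w.startswith("a")))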
Variants of the L∗ algorithm differ in the way they treat counter-examples, as de-
scribed in more detail in [BR04]. The original algorithm [Ang87] adds all the prefixes
of the counter-example to S and thus possibly creating inconsistency that should be

fixed. The version proposed in [MP95] for learning ω-regular languages adds all the
suffixes of the counter-example to E. The advantage of this approach is that the table
always remains consistent and reduced with S corresponding exactly to the set of states.
A disadvantage is the possible introduction of redundant columns that do not contribute
to further discrimination between states. The symbolic algorithm that we develop in this
paper is based on an intermediate variant, referred to in [BR04] as the reduced obser-
vation algorithm, where some prefixes of the counter-example are added to S and some
suffixes are added to E.
Example: We illustrate the behavior of the L∗ algorithm while learning L = aΣ ∗
over Σ = {a, b}. We use +w to indicate a counter-example w ∈ L rejected by the
conjectured automaton, and −w for the opposite case. Initially, the observation table
is T0 = (Σ, S, R, E, f ) with S = E = {ε} and R = Σ and we ask membership
queries for all words in (S ∪ R) · E = {ε, a, b} to obtain table T0 , shown in Fig. 1. The
table is not closed so we move a to S, add its continuations, aa and ab to R and ask
membership queries to obtain the closed table T1 , from which the hypothesis automaton
A1 of Fig. 2 is derived. In response to the equivalence query for A1 , a counter-example
−ba is presented, its prefixes b and ba are added to S and their successors are added
to R, resulting in table T2 of Fig. 1. This table is not consistent: two elements ε and b
in S are equivalent but their a-successors a and ba are not. Adding a to E and asking
membership queries yields a consistent table T3 whose automaton A3 is the minimal
automaton recognizing L.

T0 (E = {ε}): ε:0, a:1, b:0
T1 (E = {ε}): S: ε:0, a:1;  R: b:0, aa:1, ab:1
T2 (E = {ε}): S: ε:0, a:1, b:0, ba:0;  R: aa:1, ab:1, bb:0, baa:0, bab:0
T3 (E = {ε, a}): S: ε:(0,1), a:(1,1), b:(0,0), ba:(0,0);  R: aa:(1,1), ab:(1,1), bb:(0,0), baa:(0,0), bab:(0,0)
Fig. 1. Observation tables of L∗ while learning a · Σ ∗

3 Symbolic Automata
Symbolic automata are automata over large alphabets where from each state there is a
small number of outgoing transitions labelled by subsets of Σ that form a partition of
the alphabet. Let Σ be a large and possibly infinite alphabet, that we call the concrete
alphabet. Let ψ be a total surjective function from Σ to a finite (symbolic) alphabet Σ.
For each symbolic letter a ∈ Σ we assign a Σ-semantics [a]ψ = {a ∈ Σ : ψ(a) = a}.
Since ψ is total and surjective, the set {[a]ψ : a ∈ Σ} forms a partition of Σ. We will

(Diagrams of the hypothesis automata A1 and A3 ; A3 is the minimal automaton recognizing a · Σ ∗ .)
Fig. 2. Hypothesis automata for a · Σ ∗

often omit ψ from the notation and use [a] where ψ, which is always present, is clear
from the context. The Σ-semantics can be extended to symbolic words of the form
w = a1 · a2 · · · ak ∈ Σ ∗ as the concatenation of the concrete one-letter languages
associated with the respective symbolic letters or, recursively speaking, [ε] = {ε} and
[w · a] = [w] · [a] for w ∈ Σ ∗ , a ∈ Σ.
Definition 3 (Symbolic Automaton). A deterministic symbolic automaton is a tuple
A = (Σ, Σ, ψ, Q, δ, q0 , F ), where
– Σ is the input alphabet,
– Σ is a finite alphabet, decomposable into Σ = ⋃q∈Q Σq ,
– ψ = {ψq : q ∈ Q} is a family of total surjective functions ψq : Σ → Σq ,
– Q is a finite set of states,
– δ : Q × Σ → Q is a partial transition function decomposable into a family of total
functions δq : {q} × Σq → Q,
– q0 is the initial state and F is the set of accepting states.
Automaton A can be viewed as representing a concrete deterministic automaton A
whose transition function is defined as δ(q, a) = δ(q, ψq (a)) and its accepted concrete
language is L(A) = L(A).
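To fix intuition for the numerical alphabets treated later, the following sketch (our own; the interval representation, bounds and state names are assumptions) implements a symbolic automaton over Σ = N whose symbolic letters are half-open intervals forming a partition at each state; the instance accepts the words whose first letter lies in [20, 50), in the spirit of the example of Section 4:

    # A symbolic automaton over the naturals with interval-labeled transitions.
    class SymbolicAutomaton:
        def __init__(self, partitions, delta, q0, accepting):
            # partitions: state -> list of (lo, hi, symbolic_letter) covering N
            # delta: (state, symbolic_letter) -> state
            self.partitions, self.delta = partitions, delta
            self.q0, self.accepting = q0, set(accepting)

        def psi(self, q, a):
            for lo, hi, letter in self.partitions[q]:
                if lo <= a < hi:
                    return letter
            raise ValueError("the partition at q must cover the alphabet")

        def accepts(self, word):
            q = self.q0
            for a in word:
                q = self.delta[(q, self.psi(q, a))]   # concrete delta(q, a)
            return q in self.accepting

    TOP = 10 ** 9                                      # stand-in for "no upper bound"
    A = SymbolicAutomaton(
        {"q0": [(0, 20, "a0"), (20, 50, "a1"), (50, TOP, "a3")],
         "q1": [(0, TOP, "a2")],
         "q2": [(0, TOP, "a4")]},
        {("q0", "a0"): "q2", ("q0", "a1"): "q1", ("q0", "a3"): "q2",
         ("q1", "a2"): "q1", ("q2", "a4"): "q2"},
        "q0", {"q1"})
    assert A.accepts([30, 7]) and not A.accepts([7, 30])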
Remark: The association of a symbolic language with a symbolic automaton is more
subtle because we allow different partitions of Σ and hence different input alphabets
at different states, rendering the transition function partial with respect to Σ. When in
a state q and reading a symbolic letter a ∈ Σq′ (the alphabet of some other state q′ ), the transition to be taken is well defined only
when [a] ⊆ [a′ ] for some a′ ∈ Σq . The model can, nevertheless, be made deterministic
and complete over a refinement of the symbolic alphabet. Let

    Σ′ = ∏q∈Q Σq , with the Σ-semantics [(a1 , . . . , an )] = [a1 ] ∩ . . . ∩ [an ],

and let Σ̃ = {b ∈ Σ′ : [b] ≠ ∅}. We can then define an ordinary automaton Ã =
(Σ̃, Q, δ̃, q0 , F ) where, by construction, for every b ∈ Σ̃ and every q ∈ Q, there
is a ∈ Σq such that [b] ⊆ [a] and hence one can define the transition function as
δ̃(q, b) = δ(q, a). This model is more comfortable for language-theoretic studies but
we stick in this paper to Definition 3 as it is more efficient. A similar choice has been
made in [IHS13].

Proposition 1 (Closure under Boolean Operations). Languages accepted by deter-


ministic symbolic automata are closed under Boolean operations.

Proof. Closure under negation is immediate by complementing the set of accepting


states. For intersection we adapt the standard product construction as follows. Let L1 , L2
be languages recognized by the symbolic automata A1 = (Σ, Σ1 , ψ1 , Q1 , δ1 , q01 , F1 )
and A2 = (Σ, Σ2 , ψ2 , Q2 , δ2 , q02 , F2 ), respectively. Let A = (Σ, Σ, ψ, Q, δ, q0 , F ),
where

– Q = Q1 × Q2 , q0 = (q01 , q02 ), F = F1 × F2
– For every (q1 , q2 ) ∈ Q
• Σ(q1 ,q2 ) = {(a1 , a2 ) ∈ Σ1 × Σ2 | [a1 ] ∩ [a2 ] ≠ ∅}
• ψ(q1 ,q2 ) (a) = (ψ1,q1 (a), ψ2,q2 (a)) ∀a ∈ Σ
• δ((q1 , q2 ), (a1 , a2 )) = (δ1 (q1 , a1 ), δ2 (q2 , a2 )) ∀(a1 , a2 ) ∈ Σ(q1 ,q2 )

It is sufficient to observe that the corresponding implied concrete automata A1 , A2 and


A satisfy δ((q1 , q2 ), a) = (δ1 (q1 , a), δ2 (q2 , a)) and the standard proof that L(A) =
L(A1 ) ∩ L(A2 ) follows.
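When the symbolic letters are integer intervals, the alphabet Σ(q1 ,q2 ) of a product state can be computed by pairwise intersection; the following sketch (hypothetical helper, illustration only) keeps exactly the pairs whose Σ-semantics overlap:

    # Product letters of two interval partitions: keep pairs whose semantics intersect.
    def product_letters(letters1, letters2):
        # letters1, letters2: lists of half-open intervals (lo, hi)
        product = []
        for (lo1, hi1) in letters1:
            for (lo2, hi2) in letters2:
                lo, hi = max(lo1, lo2), min(hi1, hi2)
                if lo < hi:                       # non-empty intersection
                    product.append(((lo1, hi1), (lo2, hi2), (lo, hi)))
        return product

    # e.g. intersecting the partitions {[0,20), [20,100)} and {[0,50), [50,100)}
    print(product_letters([(0, 20), (20, 100)], [(0, 50), (50, 100)]))
    # three product letters, with intersections (0,20), (20,50) and (50,100)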

Equivalence queries are implemented by constructing a product automaton which


accepts the symmetric difference between the target language and the conjectured one,
finding shortest paths to accepting states and selecting a lexicographically minimal
word.

Definition 4 (Balanced Symbolic Σ-Tree). A balanced symbolic Σ-tree is a tuple


(Σ, S, R, ψ) where

– S ⊎ R is a prefix-closed subset of Σ ∗
– Σ = ⋃s∈S Σs is a symbolic alphabet
– ψ = {ψs }s∈S is a family of total surjective functions of the form ψs : Σ → Σs .

It is required that for every s ∈ S and a ∈ Σs , s · a ∈ S ∪ R and for any r ∈ R and


a ∈ Σ, r · a ∉ S ∪ R. Elements of R are called boundary elements of the tree.

We will use observation tables whose rows are symbolic words and hence an entry
in the table will constitute a statement about the inclusion or exclusion of a large set
of concrete words in the language. We will not ask membership queries concerning all
those words, but only for a small representative sample that we call evidence.

Definition 5 (Symbolic Observation Table). A symbolic observation table is a tuple


T = (Σ, Σ, S, R, ψ, E, f , μ) such that

– Σ is an alphabet,
– (Σ, S, R, ψ) is a finite balanced symbolic Σ-tree (with R being its boundary),
– E is a subset of Σ ∗ ,
– f : (S ∪ R) · E → {0, 1} is the symbolic classification function

– μ : (S ∪ R) · E → 2Σ∗ − {∅} is an evidence function satisfying μ(w) ⊆ [w]. The
image of the evidence function is prefix-closed: w · a ∈ μ(w · a) ⇒ w ∈ μ(w).

We use, as for the concrete case, fs : E → {0, 1} to denote the partial evaluation
of f to some symbolic word s ∈ S ∪ R, such that, fs (e) = f (s · e). Note that the
set E consists of concrete words but this poses no problem because elements of E are
used only to distinguish between states and do not participate in the derivation of the
symbolic automaton from the table. The notions of closed, consistent and reduced table
are similar to the concrete case.
The set MT = (S ∪ R) · E is called the symbolic sample associated with T . We
require that for each word w ∈ MT there is at least one concrete w ∈ μ(w) whose
membership in L, denoted by f (w), is known. The set of such words is called the
concrete sample and is defined as MT = {s · e : s ∈ μ(s), s ∈ S ∪ R, e ∈ E}. A table
where all evidences of the same symbolic word admit the same classification is called
evidence-compatible.

Definition 6 (Table Conditions). A table T = (Σ, Σ, S, R, ψ, E, f , μ) is


– Closed if ∀r ∈ R, ∃s = g(r) ∈ S, fr = fs ,
– Reduced if ∀s, s′ ∈ S, fs ≠ fs′ ,
– Consistent if ∀s, s′ ∈ S, fs = fs′ implies fs·a = fs′·a , ∀a ∈ Σs .
– Evidence compatible if ∀w ∈ MT , ∀w1 , w2 ∈ μ(w), f (w1 ) = f (w2 ).

When a table T is evidence compatible the symbolic classification function f can be


defined for every s ∈ (S ∪ R) and e ∈ E as f (s · e) = f (s · e), s ∈ μ(s).

Theorem 1 (Automaton from Table). From a closed, reduced and evidence compat-
ible table T = (Σ, Σ, S, R, ψ, E, f , μ) one can construct a deterministic symbolic
automaton compatible with the concrete sample.

Proof. Let AT = (Σ, Σ, ψ, Q, δ, q0 , F ) where:


– Q = S, q0 = ε
– F = {s ∈ S | fs (ε) = 1}
– δ : Q × Σ → Q is defined as δ(s, a) = s · a when s · a ∈ S, and δ(s, a) = g(s · a) when s · a ∈ R
By construction and like the L∗ algorithm, AT classifies correctly the symbolic sample.
Due to evidence compatibility this holds also for the concrete sample.

4 The Algorithm
In this section we present a symbolic learning algorithm starting with an intuitive verbal
description. From now on we assume that the alphabet is ordered and use a0 to denote
its minimal element. We assume that the teacher always provides the smallest counter-example
with respect to length and lexicographic order on Σ ∗ . Also, when we choose an evi-
dence for a new symbolic word w in a membership query we always take the smallest
possible element of [w].
The algorithmic scheme is similar to the concrete L∗ algorithm but differs in the
treatment of counter-examples and the new concept of evidence compatibility. When the
table is not closed, S ∪ R is extended until closure. Then a conjectured automaton AT

is constructed and an equivalence query is posed. If the answer is positive we are done.
Otherwise the teacher provides a counter-example leading possibly to the extension of
E and/or S ∪ R. Whenever such an extension occurs, additional membership queries
are posed to fill the table. The table is always kept evidence compatible and reduced
except temporarily during the processing of counter-examples.
The learner starts with the symbolic table T = (Σ, Σ, S, R, ψ, E, f , μ), where
Σ = {a0 }, S = {ε}, R = {a0 }, E = {ε}, and μ(a0 ) = {a0 }. Whenever T is
not closed, there is some r ∈ R such that fr = fs for every s ∈ S. To make the
table closed we move r from R to S and add to R the word r  = r · a, where a
is a new symbolic letter with [a] = Σ, and extend the evidence function by letting
μ(r  ) = μ(r) · a0 .
When a counter-example w is presented, it is of course not part of the concrete sam-
ple. It admits a factorization w = u · a · v, where u is the largest prefix of w such that
u ∈ μ(u) for some u ∈ S ∪ R. There are two cases, the second of which is particular
to our symbolic algorithm.

1. u ∈ R: Assume that g(u) = s ∈ S and since the table is reduced, fu = fs for
any other s ∈ S. Because w is the shortest counter-example, the classification of
s · a · v in the automaton is correct (otherwise s · a · v, for some s ∈ [s] would
constitute a shorter counter-example) and different from that of u · a · v. Thus we
conclude that u deserves to be a state and should be added to S. To distinguish
between u and s we add a · v to E, possibly with some of its suffixes (see [BR04]
for a more detailed discussion of counter-example treatment). As u is a new state
we need to add its continuations to R. We distinguish two cases depending on a:
(a) If a = a0 is the smallest element of Σ then a new symbolic letter a is added
to Σ, with [a] = Σ and μ(u · a) = μ(u) · a0 , and the symbolic word u · a is
added to R.
(b) If a ≠ a0 then two new symbolic letters, a and a′ , are added to Σ with [a] =
{b : b < a}, [a′ ] = {b : b ≥ a} and μ(u · a) = μ(u) · a0 , μ(u · a′ ) = μ(u) · a.
The words u · a and u · a′ are added to R.
2. u ∈ S: In this case the counter-example indicates that u · a was wrongly assumed
to be part of [u · a] for some a ∈ Σu , and a was wrongly assumed to be part of
[a]. There are two cases:
(a) There is some a′ ≠ a such that the classification of u · a′ · v by the symbolic
automaton agrees with the classification of u · a · v. In this case we just move
a and all letters greater than a from [a] to [a′ ] and no new state is added.
(b) If there is no such symbolic letter, we create a new a′ with [a′ ] = {b ∈ [a] :
b ≥ a} and update [a] to be [a] − [a′ ]. We let μ(u · a′ ) = μ(u) · a and add
u · a′ to R.

A detailed description is given in Algorithm 1 with major procedures in Algorithm 2. A


statement of the form Σ = Σ ∪ {a} indicates the introduction of a new symbolic letter
a ∈ Σ. We use M Q and EQ as shorthands for membership and equivalence queries,
respectively. Note also that for every r ∈ R, μ(r) is always a singleton.
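For Σ = N, the alphabet refinement performed on counter-examples can be pictured as maintaining, at each state, a sorted list of cut points and inserting a new cut when case 2(b) applies; the following is a simplified sketch (our own, not the tool's code) of that single step:

    # Splitting the block that contains the misclassified concrete letter 'a'
    # at 'a' itself, so that [a'] = {b in [a] : b >= a} becomes a new block.
    import bisect

    def refine_partition(cuts, a):
        # cuts: sorted interior boundaries; block i is [cuts[i-1], cuts[i])
        if a in cuts:
            return cuts                  # 'a' is already a block boundary
        new_cuts = list(cuts)
        bisect.insort(new_cuts, a)
        return new_cuts

    print(refine_partition([20], 50))    # -> [20, 50]: blocks [0,20), [20,50), [50,inf)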
We illustrate the behavior of the algorithm on L = {a · u : b ≤ a < c, u ∈ Σ ∗ }
for two constants b < c in Σ. The table is initialized to T0 = (Σ, Σ, S, R, ψ, E, f , μ),

Algorithm 1. The symbolic algorithm


1: procedure SYMBOLIC
2: learned = false
3: Initialize the table T = (Σ, Σ, S, R, ψ, E, f , μ)
4: Σ = {a}; ψε (a) = a, ∀a ∈ Σ
5: S = {ε}; R = {a}; E = {ε}
6: μ(a) = {a0 }
7: Ask MQ on ε and a0 to fill f
8: if T is not closed then
9: CLOSE
10: end if
11: repeat
12: if EQ(AT ) then  ▷ AT is correct
13: learned = true
14: else  ▷ A counter-example w is provided
15: M = M ∪ {w}
16: COUNTER-EX(w)  ▷ Process counter-example
17: end if
18: until learned
19: end procedure

where Σ = {a0 }, μ(a0 ) = {a0 }, S = {ε}, E = {ε}, R = {a0 } and ψ = {ψε } with
ψε (a) = a0 , ∀a ∈ Σ. We ask membership queries to learn f (ε) and f (a0 ). Table T0 ,
shown in Fig. 3, is closed, reduced and evidence compatible and its related hypothesis
automaton A0 consists of only one rejecting state, as shown in Fig. 4. The teacher
responds to this conjecture by the counter-example +b. Since b ∉ μ(a0 ) and ε ∈ S, we
are in Case 2-(b) of the counter-example treatment, where there is no symbolic word
that classifies b correctly. We create a new symbolic letter a1 with μ(a1 ) = {b} and
modify ψε to ψε (a) = a0 when a < b and ψε (a) = a1 otherwise. The derived table
T1 is not closed since for a1 ∈ R there is no element s ∈ S such that fa1 = fs .
To close the table we move a1 from R to S and introduce a new symbolic letter a2
to represent the continuations of a1 . We define ψa1 with ψa1 (a) = a2 for all a ∈ Σ,
μ(a1 · a2 ) = {b · a0 } and add the symbolic word a1 · a2 to R. We ask membership
queries for the missing words and construct a new observation table T2 .
This table is closed and reduced, resulting in a new hypothesis automaton A2 . The
counter-example provided by the teacher is −c. This is case 2-(a) of the counter-example
treatment as there exists a symbolic letter a0 that agrees with the classification of c. We
move c and all elements greater than it from [a1 ] to [a0 ], that is, ψε (a) = a0 when
a < b, ψε (a) = a1 when b ≤ a < c and ψε (a) = a3 otherwise. Table T3 is closed,
reduced and evidence compatible leading to the hypothesis automaton A3 for which
−ab is a counter-example where a ∈ μ(a0 ) and a0 ∈ R. Thus we are now in case 1
and since the counter-example is considered to be the shortest, a0 is a new state of the
automaton, different from
. We move a0 to S and add a new symbolic letter a4 to Σ,
which represents the transition from a0 , with μ(a0 ·a4 ) = {a0 ·a0 }. Now ψa0 (a) = a4 .
However the obtained table T4 is not reduced since fε (e) = fa0 (e) for all e ∈ E. We

Procedures 2. Closing the table and processinging counter-examples


1: procedure CLOSE  ▷ Make the table closed
2: while ∃r ∈ R such that ∀s ∈ S, fr ≠ fs do
3: Σ′ = Σ ∪ {a}; ψ′ = ψ ∪ {ψr } with ψr (a) = a, ∀a ∈ Σ
4: S′ = S ∪ {r}; R′ = (R − {r}) ∪ {r · a}
5: μ(r · a) = μ(r) · a0
6: Ask MQ for all words in {μ(r · a) · e : e ∈ E}
7: T = (Σ, Σ′ , S′ , R′ , ψ′ , E, f ′ , μ′ )
8: end while
9: end procedure

1: procedure COUNTER-EX(w)  ▷ Process counter-example
2: Find a factorization w = u · a · v, a ∈ Σ, u, v ∈ Σ ∗ such that
3: ∃u ∈ MT , u ∈ μ(u) and ∀u′ ∈ M , u · a ∉ μ(u′ )
4: if u ∈ R then
5: if a = a0 then  ▷ Case 1(a)
6: Σ′ = Σ ∪ {a}; ψ′ = ψ ∪ {ψu }, with ψu (σ) = a, ∀σ ∈ Σ
7: S′ = S ∪ {u}; R′ = (R − {u}) ∪ {u · a}; E′ = E ∪ {suffixes of a · v}
8: μ(u · a) = μ(u) · a0
9: Ask MQ for all words in {μ(u · a) · e : e ∈ E′ }
10: else  ▷ Case 1(b)
11: Σ′ = Σ ∪ {a, a′ }
12: ψ′ = ψ ∪ {ψu }, with ψu (σ) = a if σ < a, and ψu (σ) = a′ otherwise
13: S′ = S ∪ {u}; R′ = (R − {u}) ∪ {u · a, u · a′ }; E′ = E ∪ {suffixes of a · v}
14: μ(u · a) = μ(u) · a0 ; μ(u · a′ ) = μ(u) · a
15: Ask MQ for all words in {(μ(u · a) ∪ μ(u · a′ )) · e : e ∈ E′ }
16: end if
17: T′ = (Σ, Σ′ , S′ , R′ , ψ′ , E′ , f ′ , μ)
18: else  ▷ Case 2(a),(b)
19: Find a ∈ Σu such that a ∈ [a]
20: if there is no a′ ∈ Σ : fu·a′ = fμ(u·a) on E then
21: Σ′ = Σ ∪ {a′ }; R′ = R ∪ {u · a′ }
22: μ(u · a′ ) = μ(u) · a
23: Ask MQ for all words in {μ(u · a′ ) · e : e ∈ E}
24: end if
25: ψu (σ) = ψu (σ) if σ ∉ [a]; a if σ ∈ [a] and σ < a; a′ otherwise
26: T = (Σ, Σ′ , S, R′ , ψ, E, f ′ , μ)
27: end if
28: if T is not closed then
29: CLOSE
30: end if
31: end procedure

T0 (E = {ε}): ε:0;  a0 :0
T1 (E = {ε}): ε:0;  a0 :0, a1 :1
T2 (E = {ε}): S: ε:0, a1 :1;  R: a0 :0, a1 a2 :1
T3 (E = {ε}): S: ε:0, a1 :1;  R: a0 :0, a1 a2 :1, a3 :0
T4 (E = {ε}): S: ε:0, a1 :1, a0 :0;  R: a0 a4 :0, a1 a2 :1, a3 :0
T5 (E = {ε, b}): S: ε:(0,1), a1 :(1,1), a0 :(0,0);  R: a0 a4 :(0,0), a1 a2 :(1,1), a3 :(0,0)
Fig. 3. Symbolic observation tables for (b ≤ a < c) · Σ ∗

(Automaton A0 has a single rejecting state with [a0 ] = Σ. Automaton A2 uses [a0 ] = {a | a < b}, [a1 ] = {a | a ≥ b}, [a2 ] = Σ. Automata A3 and A5 use [a0 ] = {a | a < b}, [a1 ] = {a | b ≤ a < c}, [a3 ] = {a | a ≥ c}, with [a2 ] = Σ in A3 and [a2 ] = [a4 ] = Σ in A5 .)
Fig. 4. Hypothesis automata for (b ≤ a < c) · Σ ∗

add experiment b to E and fill the gaps using membership queries, resulting in table T5
which is closed, reduced and evidence compatible. The derived automaton A5 is the
right one and the algorithm terminates.
It is easy to see that for large alphabets our algorithm is much more efficient than
L∗ . For example, when Σ = {1..100}, b = 20 and c = 50, the L∗ algorithm will
need around 400 queries while ours will ask fewer than 10. The symbolic algorithm is
influenced not by the size of the alphabet but by the resolution (partition size) with
which we observe it. Fig. 5 shows a larger automaton over the same alphabet learned
by our procedure.
Fig. 5. An automaton learned by our procedure using 418 membership queries, 27 equivalence
queries with a table of size 46 × 11

5 Discussion
We have defined a generic algorithmic scheme for automaton learning, targeting lan-
guages over large alphabets that can be recognized by finite symbolic automata having
a modest number of states and transitions. Some ideas similar to ours have been pro-
posed for the particular case of parametric languages [BJR06] and recently in a more
general setting [HSM11, IHS13] including partial evidential support and alphabet re-
finement during the learning process.2
The genericity of the algorithm comes from the semantic approach (alphabet parti-
tions) but of course, each and every domain will have its own semantic and syntactic
specialization in terms of the size and shape of the alphabet partitions. In this work we
have implemented an instantiation of this scheme for the alphabet Σ = (N, ≤) and
the adaptation to real numbers is immediate. When dealing with numbers, the partition
into a finite number of intervals (and convex sets in higher dimensions) is very natural
and used in many application domains ranging from quantization of sensor readings
to income tax regulations. It will be interesting to compare the expressive power and
succinctness of symbolic automata with other approaches for representing numerical
time series and to compare our algorithm with other inductive inference techniques for
sequences of numbers.
As a first excursion into the domain, we have made quite strong assumptions on
the nature of the equivalence oracle, which, already for small alphabets, is a bit too
strong and pedagogical to be realistic. We assumed that it provides the shortest counter-
example and also that it chooses always the minimal available concrete symbol. We
can relax the latter (or both) and replace the oracle by random sampling, as already
proposed in [Ang87] for concrete learning. Over large alphabets, it might be even more
appropriate to employ probabilistic convergence criteria à la PAC learning [Val84] and
be content with a correct classification of a large fraction of the words, thus tolerating
imprecise tracing of boundaries in the alphabet partitions. This topic, as well as the
challenging adaptation of our framework to languages over Boolean vectors are left for
future work.
Acknowledgement. This work was supported by the French project EQINOCS (ANR-
11-BS02-004). We thank Peter Habermehl, Eugene Asarin and anonymous referees for
useful comments and pointers to the literature.

References
[Ang87] Angluin, D.: Learning regular sets from queries and counterexamples. Information
and Computation 75(2), 87–106 (1987)
[BJR06] Berg, T., Jonsson, B., Raffelt, H.: Regular inference for state machines with param-
eters. In: Baresi, L., Heckel, R. (eds.) FASE 2006. LNCS, vol. 3922, pp. 107–121.
Springer, Heidelberg (2006)
[BLP10] Benedikt, M., Ley, C., Puppis, G.: What you must remember when processing data
words. In: AMW (2010)
2
Let us remark that the modification of partition boundaries is not always a refinement in the
precise mathematical sense of the term.

[BR04] Berg, T., Raffelt, H.: Model Checking. In: Broy, M., Jonsson, B., Katoen, J.-P.,
Leucker, M., Pretschner, A. (eds.) Model-Based Testing of Reactive Systems. LNCS,
vol. 3472, pp. 557–603. Springer, Heidelberg (2005)
[DlH10] De la Higuera, C.: Grammatical inference: learning automata and grammars. Cam-
bridge University Press (2010)
[DR95] Diekert, V., Rozenberg, G.: The Book of Traces. World Scientific (1995)
[Gol72] Gold, E.M.: System identification via state characterization. Automatica 8(5), 621–
636 (1972)
[HSJC12] Howar, F., Steffen, B., Jonsson, B., Cassel, S.: Inferring canonical register automata.
In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 251–
266. Springer, Heidelberg (2012)
[HSM11] Howar, F., Steffen, B., Merten, M.: Automata learning with automated alphabet
abstraction refinement. In: Jhala, R., Schmidt, D. (eds.) VMCAI 2011. LNCS,
vol. 6538, pp. 263–277. Springer, Heidelberg (2011)
[IHS13] Isberner, M., Howar, F., Steffen, B.: Inferring automata with state-local alphabet ab-
stractions. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871,
pp. 124–138. Springer, Heidelberg (2013)
[KF94] Kaminski, M., Francez, N.: Finite-memory automata. Theoretical Computer Sci-
ence 134(2), 329–363 (1994)
[Moo56] Moore, E.F.: Gedanken-experiments on sequential machines. In: Automata Studies.
Annals of Mathematical Studies, vol. 34, pp. 129–153. Princeton (1956)
[MP95] Maler, O., Pnueli, A.: On the learnability of infinitary regular sets. Information and
Computation 118(2), 316–326 (1995)
[Ner58] Nerode, A.: Linear automaton transformations. Proceedings of the American Mathe-
matical Society 9(4), 541–544 (1958)
[Val84] Valiant, L.G.: A theory of the learnable. Communications of the ACM 27(11), 1134–
1142 (1984)
[VHL+ 12] Veanes, M., Hooimeijer, P., Livshits, B., Molnar, D., Björner, N.: Symbolic finite
state transducers: algorithms and applications. In: POPL, pp. 137–150 (2012)
Verification of Concurrent Quantum Protocols by Equivalence Checking

Ebrahim Ardeshir-Larijani¹,²,⋆, Simon J. Gay², and Rajagopal Nagarajan³,⋆⋆

¹ Department of Computer Science, University of Warwick, UK
[email protected]
² School of Computing Science, University of Glasgow, UK
[email protected]
³ Department of Computer Science, School of Science and Technology, Middlesex University, UK
[email protected]

Abstract. We present a tool which uses a concurrent language for de-
scribing quantum systems, and performs verification by checking equiv-
alence between specification and implementation. In general, simulation
of quantum systems using current computing technology is infeasible.
We restrict ourselves to the stabilizer formalism, in which there are
efficient simulation algorithms. In particular, we consider concurrent
quantum protocols that behave functionally in the sense of computing
a deterministic input-output relation for all interleavings of the concur-
rent system. Crucially, these input-output relations can be abstracted by
superoperators, enabling us to take advantage of linearity. This allows
us to analyse the behaviour of protocols with arbitrary input, by simu-
lating their operation on a finite basis set consisting of stabilizer states.
Despite the limitations of the stabilizer formalism and also the range of
protocols that can be analysed using this approach, we have applied our
equivalence checking tool to specify and verify interesting and practical
quantum protocols from teleportation to secret sharing.

1 Introduction

There have been significant advances in quantum information science over the
last few decades and technologies based on these developments are at a stage well
suited for deployment in a range of industrial applications. The construction of
practical, general purpose quantum computers has been challenging. The only
large scale quantum computer available today is manufactured by the Cana-
dian company D-Wave. However, it does not appear to be general purpose and
not everyone is convinced that it is truly quantum. On the other hand, quantum

⋆ Supported by the Centre for Discrete Mathematics and its Applications (DIMAP), University of Warwick, EPSRC award EP/D063191/1.
⋆⋆ Partially supported by “Process Algebra Approach to Distributed Quantum Computation and Secure Quantum Communication”, Australian Research Council Discovery Project DP110103473.
communication and cryptography have made large strides and are now well estab-
lished. Physical restrictions of quantum communication, like preserving photon
states over long distances, are gradually being resolved, for example, by quantum
repeaters [11] and using quantum teleportation. Various Quantum Key Distri-
bution networks have been built, including the DARPA Quantum Network in
Boston, the SeCoQC network around Vienna and the Tokyo QKD Network.
There is no doubt that quantum communication and quantum cryptographic
protocols will become an integral part of our society’s infrastructure.
On the theoretical side, quantum key distribution protocols such as BB84
have been proved to be unconditionally secure [20]. It is important to understand
that this is an information-theoretic proof, which does not necessarily guarantee
that implemented systems are unconditionally secure. That is why alternative
approaches, such as those based on formal methods, could be useful in analysing
behaviour of implemented systems.
The area of formal verification, despite being a relatively young field, has
found numerous applications in hardware and software technologies. Today, ver-
ification techniques span a wide spectrum from model checking and theorem
proving to process calculus, all of which have helped us to gain a better under-
standing of interactive and complicated distributed systems. This work repre-
sents another milestone in our ongoing programme of applying formal methods to
quantum systems. In this paper, we present a concurrent language for describing
quantum systems, and perform verification by equivalence checking. The goal in
equivalence checking is to show that the implementation of a program is identical
to its specification, on all possible executions of the program. This is different
from property based model checking, where an intended property is checked over
all possible execution paths of a program. The key idea of this paper is to check
equivalence of quantum protocols by using their superoperator semantics (Sec-
tion 4). Superoperators are linear, so they are completely defined by their effect
on a basis of the appropriate space. To show that two protocols are equivalent,
we show that their associated superoperators are equivalent by simulating the
protocols for every state in a basis of the input vector space. By choosing a basis
that consists of stabilizer states, we can do this simulation efficiently.
One of the main challenges here is the explosion of states arising from branch-
ing and concurrency of programs. This is in addition to the need to deal with the
explosion of space needed for specifying quantum states. For a quantum state
with n qubits (quantum bits), we need to consider 2^n complex coefficients. To
avoid this problem we restrict ourselves to the stabilizer formalism [1]. In this for-
malism, quantum states can be described in polynomial space and also for certain
quantum operations, the evolution of stabilizer states can be done in polynomial
time. Although one cannot do universal quantum computation within the stabi-
lizer formalism, many important protocols such as Teleportation [7], Quantum
Error Correction [8] as well as quantum entanglement can be analysed within
it. Crucially, quantum error correction is a prerequisite for fault tolerant quan-
tum computing [24]. The latter is necessary for building a scalable quantum
computer, capable of doing universal quantum computing.
Contributions. This paper extends our previous work [4] substantially in two
ways. First, we now use a concurrent modelling language, which means that
we can explicitly represent concurrency and communication in order to model
protocols more realistically. Second, we have analysed a much wider range of
examples, including several standard quantum protocols.
The paper is organised as follows. In Section 2 we give preliminaries from
quantum computing and the stabilizer formalism. In Sections 3 and 4, we give
the syntax and semantics of our concurrent modelling language. Section 5 de-
scribes our equivalence checking technique and Section 6 presents example pro-
tocols. In Section 7 we give details of our equivalence checking tool, and present
experimental results. Section 8 reviews related work and Section 9 concludes.

2 Background
In this section, we give a very concise introduction to quantum computing. For
more details, we refer to [22]. The basic element of quantum information is a
qubit (quantum bit). Qubits are vectors in an inner product vector space which is
called a Hilbert space.¹ A quantum state describes qubits and has the general
form |Ψ⟩ = α_1 |00...0⟩ + ... + α_n |11...1⟩, where the α_i ∈ ℂ are called amplitudes,
satisfying |α_1|² + ... + |α_n|² = 1. The so-called ket notation, or Dirac notation,
is used to distinguish the unit vectors |0⟩ and |1⟩ from the classical bits 0 and 1.
Also, |00...0⟩ corresponds to the tensor product of unit vectors (i.e. |0⟩ ⊗ |0⟩ ⊗ ... ⊗ |0⟩).
There are two kinds of operations on quantum states, unitary transformations
and measurement. The side effect of the measurement operation is classical infor-
mation: for example, the outcome of measuring the above state |Ψ⟩ is a classical
bit string, ranging from (00...0) with probability |α_1|² to (11...1) with probability |α_n|².
Note that measurement is a destructive operation and it changes the state of a
qubit permanently. Qubits can be entangled. For example, a two qubit entangled
state |00⟩ + |11⟩, which is called a Bell state, cannot be decomposed into two
single qubit states. Measuring one of the qubits will fix the state of the other
qubit, even if they are physically separated.
Some basic quantum operations and their matrix representation are shown in
Figure 1. A model for describing a quantum system is the quantum circuit

$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Fig. 1. Pauli operators

model, analogous to the classical circuit model. Each quantum circuit consists of
unitary and measurement gates. Unitary gates can be applied to one or more
¹ Normally, Hilbert space is defined with additional conditions which we are not concerned with in this paper.
qubits. In a certain kind of multiple qubit gates, which are called controlled gates,
there are one or more control qubits and one or more target qubits. Depending
on the value of the control qubit, a unitary gate is applied to the target qubit.
Controlled-X (or CNot ) and Toffoli [22, p. 29] gates are examples of controlled
gates. Quantum circuits are normally described in the following way: single wires
represent qubits, double wires represent classical bits. Single gates and measure-
ment gates are depicted with squares, whereas controlled gates are shown with a
point representing the control qubit and a circle depicting the target qubit with
a vertical wire connecting them.
The stabilizer formalism is a useful scheme which characterises a small but
important part of quantum mechanics. The core idea of the stabilizer formalism
is to represent certain quantum states, which are called stabilizer states, with
their stabilizer group, instead of an exponential number of complex amplitudes.
For an n-qubit quantum stabilizer state |ϕ⟩, the stabilizer group is defined by
Stab(|ϕ⟩) = {S | S|ϕ⟩ = +1·|ϕ⟩}. This group can be represented elegantly by its n
generators in the Pauli group (i.e. P^n for P in Figure 1). Several algorithms have
been developed for specifying and manipulating stabilizer states using group
representation of quantum states, see [1]. Importantly, the effect of Clifford Op-
erators (members of the normaliser of Pauli group, known as Clifford group) on
stabilizer states can be simulated by a polynomial time algorithm. Consequently,
we have the following important theorem which guarantees stabilizer states can
be specified in polynomial space and certain operations and measurement can
be done in polynomial time:
Theorem 1. (Gottesman-Knill, [22, p. 464]) Any quantum computation which
consists of only the following components:
1. State preparation, Hadamard gates, Phase gates, Controlled-Not gates and
Pauli gates.
2. Measurement gates.
3. Classical control conditions on the outcomes of measurements.
can be efficiently simulated on a classical computer.
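To illustrate why the simulation of Theorem 1 is efficient, here is a minimal sketch of the generator-update rules used by the tableau algorithm of Aaronson and Gottesman [1] for the Hadamard, phase and controlled-not gates; it tracks only the n stabilizer generators (no destabilizer rows and no measurement), and the class and method names are our own.

class StabilizerTableau:
    """n-qubit stabilizer state |0...0>, represented by n generators.
    Generator i is (-1)^r[i] * prod_j X_j^x[i][j] Z_j^z[i][j]."""
    def __init__(self, n):
        self.n = n
        self.x = [[0] * n for _ in range(n)]
        self.z = [[int(i == j) for j in range(n)] for i in range(n)]  # Z_i stabilizes |0...0>
        self.r = [0] * n

    def h(self, a):                      # Hadamard on qubit a: exchanges X and Z
        for i in range(self.n):
            self.r[i] ^= self.x[i][a] & self.z[i][a]
            self.x[i][a], self.z[i][a] = self.z[i][a], self.x[i][a]

    def s(self, a):                      # phase gate on qubit a
        for i in range(self.n):
            self.r[i] ^= self.x[i][a] & self.z[i][a]
            self.z[i][a] ^= self.x[i][a]

    def cnot(self, a, b):                # CNOT with control a and target b
        for i in range(self.n):
            self.r[i] ^= self.x[i][a] & self.z[i][b] & (self.x[i][b] ^ self.z[i][a] ^ 1)
            self.x[i][b] ^= self.x[i][a]
            self.z[i][a] ^= self.z[i][b]

# Example: prepare the Bell state (|00> + |11>)/sqrt(2); its generators become XX and ZZ.
t = StabilizerTableau(2)
t.h(0)
t.cnot(0, 1)

Each gate update touches O(n) bits per generator, so a circuit of Clifford gates is simulated in time polynomial in the number of qubits, as the theorem states.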
The density operator is an alternative way of describing a quantum state where
we need to deal with uncertainty. For instance, an ensemble of a quantum state
{(|φi  , pi )}, where pi s are probabilities, can be represented by the following
density operator: 
ρ := pi |φi  φi |
i
where |φi  φi | denote outer product. Density operators are positive and Hermi-
tian, meaning they satisfy |ϕ: ϕ| ρ |ϕ ≥ 0 and ρ† = ρ († denotes transpose of
the complex conjugate) respectively. Also a composite quantum system can be
elegantly described in the language of density operators. In particular, one can
obtain a reduced density operator by applying a partial trace operation on the
density operator of a composite system, see [22, p. 105].
Superoperators are linear transforms on the space of density operators. Note
that for an n-qubit system, the space of density operators has dimension 2^{2n}.
Quantum information systems can be abstracted using superoperators. In the
present paper, we take advantage of the linearity of superoperators: a superop-
erator is uniquely defined by its action on the elements of a basis of the space on
which it acts, which in our case is a space of density operators. We can therefore
check equality of superoperators by checking that for a given basis element as
input, they produce the same output. In this paper, we are interested in systems
which are in the stabilizer formalism. The following result [16] explicitly con-
structs a basis for the space of density operators which only consists of stabilizer
states and hence it can be used in our equivalence checking technique.
Theorem 2. The space of density operators for n-qubit states, considered as a
(2^n)²-dimensional real vector space, has a basis consisting of density matrices of
n-qubit stabilizer states.
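For the single-qubit case this can be checked directly: the density matrices of the stabilizer states |0⟩, |1⟩, |+⟩ and |i⟩ already span the 4-dimensional real space of Hermitian 2×2 matrices. The following numpy sketch (our illustration; this particular basis choice is not taken from [16]) verifies this by computing a matrix rank.

import numpy as np

# Single-qubit stabilizer states |0>, |1>, |+> = (|0>+|1>)/sqrt(2), |i> = (|0>+i|1>)/sqrt(2)
kets = [np.array([1, 0]), np.array([0, 1]),
        np.array([1, 1]) / np.sqrt(2), np.array([1, 1j]) / np.sqrt(2)]
rhos = [np.outer(k, k.conj()) for k in kets]        # density matrices |psi><psi|

def to_real_vector(rho):
    # A Hermitian 2x2 matrix is determined by the real diagonal and the
    # real/imaginary parts of one off-diagonal entry.
    return np.array([rho[0, 0].real, rho[1, 1].real, rho[0, 1].real, rho[0, 1].imag])

rank = np.linalg.matrix_rank(np.array([to_real_vector(r) for r in rhos]))
print(rank)   # 4: the four stabilizer density matrices form a basis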
Equality test: Another useful property of stabilizer states is that there is an
efficient way of checking whether two stabilizer states are equal. One may test
equality by using an algorithm for inner product of two states as in [1]. How-
ever, in [4] a novel approach is introduced which checks the linear dependence of
stabilizer generators of two states. This is possible using polynomial time algo-
rithms for obtaining stabilizer normal forms, introduced in [5]. Let |φ⟩ and |ψ⟩
be stabilizer states and Stab(|φ⟩), Stab(|ψ⟩) be their stabilizer groups; it can be
easily seen that
$$|\phi\rangle = |\psi\rangle \iff \mathrm{Stab}(|\phi\rangle) = \mathrm{Stab}(|\psi\rangle)$$
Therefore it suffices to show that Stab(|φ⟩) ⊆ Stab(|ψ⟩) and Stab(|φ⟩) ⊇
Stab(|ψ⟩), using the independence checking approach, which results in the follow-
ing proposition, proved in [4].
Proposition 1. There is a polynomial time algorithm which decides for any
stabilizer states |φ⟩ and |ψ⟩, whether or not |φ⟩ = |ψ⟩.
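The flavour of this check can be seen in the following simplified sketch, which encodes each generator as a length-2n bit vector over GF(2) (x-part followed by z-part) and compares the row spaces of the two generator sets by Gaussian elimination. For brevity it ignores the ±1 signs of the generators, which the full algorithm of [4] must also track, so it is an illustration of the idea rather than the complete equality test.

def rref_gf2(rows):
    """Reduced row-echelon form over GF(2); rows are equal-length lists of 0/1."""
    rows = [r[:] for r in rows]
    pivot_row = 0
    for col in range(len(rows[0]) if rows else 0):
        for i in range(pivot_row, len(rows)):     # find a pivot in this column
            if rows[i][col]:
                rows[pivot_row], rows[i] = rows[i], rows[pivot_row]
                break
        else:
            continue
        for i in range(len(rows)):                # eliminate the column elsewhere
            if i != pivot_row and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[pivot_row])]
        pivot_row += 1
    return rows[:pivot_row]                       # drop zero rows

def same_span_gf2(gens1, gens2):
    """True iff the two generator sets span the same GF(2) subspace."""
    return rref_gf2(gens1) == rref_gf2(gens2)

# Generators of the Bell state as (x-part | z-part) bit vectors:
bell     = [[1, 1, 0, 0],   # X X
            [0, 0, 1, 1]]   # Z Z
bell_alt = [[1, 1, 1, 1],   # Y Y (= XX * ZZ up to phase)
            [0, 0, 1, 1]]   # Z Z
print(same_span_gf2(bell, bell_alt))   # True: the same group modulo signs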

3 Specification of Concurrent Quantum Protocols


In this section we present a concurrent language for specifying quantum proto-
cols, based on CCS [21]. This is different from our previous work [4] since our
focus now is on concurrent communicating quantum systems. We will, however,
restrict attention to protocols that receive input at the beginning and produce
output at the end of their execution, rather than considering more general contin-
uous interaction. One reason to use this language is to illustrate how designing
concurrent quantum protocols can be difficult and non-intuitive compared to
classical protocols. For example, quantum measurement is a destructive action
and if it is used wrongly between parallel processes, it can destroy the effect of
the whole system. Our language is similar to qCCS [27]. Communication in our
system is done using synchronous message passing or handshaking. This struc-
ture can be extended to describe asynchronous communication similar to the
approach of CCS. However, there is another reason to consider synchronicity,
namely the lack of quantum memory. Therefore a synchronous system is closer
p ::= t | t || t

t ::= nil | c!x.t | c?x.t | a:= measure x . t | U(x) . t |
newqubit x . t | input list . t | output list . t |
if x then U(y) . t | match list then U(x) . t

list ::= x:val | list,x:val

val ::= 0 | 1

Fig. 2. Syntax of the Concurrent Language

to the current technological level of development of quantum communication.


The syntax of our language is summarised in Figure 2.
Here t||t denotes parallel composition of processes and . represents pro-
cess prefixes, similar to CCS. Terminated process is represented by nil. Prefix
input list defines the input state of a protocol and output list stands
for the intended output of a protocol. Often for the output we need to deallocate
qubits. In [23], allocation and deallocation of qubits is presented by assuming
that there is an operating system which gives/resets access to a pool of qubits
(i.e. qubits are not created or destroyed). However, in this paper we do not make
that assumption, and therefore allocation/deallocation of qubits has the physical
meaning of creating qubits or applying a partial trace operation on quantum states.
For sending and receiving classical and quantum bits we use prefixes c!x and
c?x. Measurement is done using prefix a:=measure x, where the outcome of
measuring qubit x is assigned to a classical variable a. Conditionals, if and
match impose classical conditions on the system in order to apply particular
unitaries. The classical values val correspond to classical bits 0 and 1.

4 Semantics of Quantum Protocols


In this section we explain how the semantics of our specification language can
be understood using superoperators. We apply π-calculus style reduction rules
to our concurrent syntax in order to derive sequential interleavings, removing
all communication and parallel composition. Finally, inspired by the semantics
of Selinger’s quantum programming language (QPL) [23], we argue that each
sequential interleaving is equivalent to a QPL program, and thus can be charac-
terised by a superoperator.
The reduction rules are in Figure 3. Here α denotes prefixes other than com-
munication (i.e. not c!x or c?x). Structural congruence is defined as associativity
and commutativity of parallel composition, similarly to π-calculus, and is de-
noted by ≡. Finally, τ represents the silent action and substitution of v with
x is denoted by [v/x]. Using these rules, the transition graph for a concurrent
process is a tree, because we do not have any form of loop or recursion, in which
interleavings are represented by paths from the root to the leaves. For a pro-
tocol P , each interleaving i ∈ I(P ) (i.e. the set of all interleavings of P ), can
$$\frac{P \equiv P' \quad P' \xrightarrow{\alpha} Q' \quad Q' \equiv Q}{P \xrightarrow{\alpha} Q}\ \text{(R-Cong)} \qquad\qquad \alpha.P \xrightarrow{\alpha} P\ \ \text{(R-Act)}$$

$$c!v.P \parallel c?x.Q \xrightarrow{\tau} P \parallel Q[v/x]\ \ \text{(R-Com)} \qquad\qquad \frac{P \xrightarrow{\alpha} P'}{P \parallel Q \xrightarrow{\alpha} P' \parallel Q}\ \text{(R-Par)}$$

Fig. 3. Reduction Rules

be translated to a QPL program, whose semantics is defined by a superopera-
tor [23]. In this setting we assume that the input expression can be placed at
the beginning of i, and only used once. Similarly, the output expression occurs
at the end of each i, where we stop interpreting sequential interleavings. The
sequential QPL program obtained in this way contains measurement operators
which, when simulated, may generate branching behaviour due to quantum ran-
domness. The semantics of QPL merges these branches by calculating a weighted
sum of the density matrix representations of their final output quantum states.
In the stabilizer formalism we cannot calculate these weighted sums in general,
so instead we require that all of the final output quantum states are equal and
hence the weighted sum becomes trivial. We further restrict our attention to
protocols that are functional in the sense of computing a deterministic input-
output relation despite their internal non-determinism due to concurrency and
quantum randomness.

Definition 1 (Configuration). Let I^Q : Q → ℕ denote the index of qubit variables
in a quantum state. Configurations of a concurrent quantum program are defined
by tuples of the form (σ, κ, |φ⟩), where σ represents the classical variable assign-
ment, κ represents the channel variable assignment and |φ⟩ represents a quantum
state corresponding to the qubit variables mapped by I^Q.

Definition 2. A concurrent quantum protocol P is functional if
1. for a given quantum input |ψ⟩ and a given interleaving i ∈ I(P), all execution
   paths produce the same output quantum state; and
2. for a given quantum input |ψ⟩, the unique output quantum state is the same
   across all interleavings.

Proposition 2. Any functional concurrent protocol P, specified with the lan-
guage in Figure 2, defines a unique superoperator, which we denote by [[P]].

Proof. The QPL programs corresponding to all of the interleavings of P map in-
puts to outputs in the same way, and therefore all define the same superoperator,
which we take to be [[P ]].

Remark 1. Quantum measurement in this paper is slightly different from QPL.
Here the classical outcomes of measurement are assigned to classical variables
for modelling convenience. Nevertheless, one can easily translate our mea-
surement expressions into QPL’s if-then-else style form.
Remark 2. We separate classical and quantum data as in Definition 1 for a sim-
pler implementation. Nonetheless, classical and quantum data can be mixed in
a similar way to [23], using tuples of stabiliser states.

For example, the concurrent protocol


input x . a!x . nil | a?y . X(y) . nil |
newqubit u . b!u . nil | b?u . Z(u) . output u . nil
has the following interleavings, among others:

I1: input x; X(x); newqubit u; Z(u); output u
I2: input x; newqubit u; X(x); Z(u); output u
I3: newqubit u; input x; Z(u); X(x); output u.
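These interleavings can be enumerated mechanically. The sketch below (a Python illustration, not the tool's Java scheduler) shuffles the two residual sequential components of the example, i.e. the action sequences that remain once the communications on a and b have been resolved by R-Com, and yields all ten order-preserving interleavings, of which I1–I3 above are three.

def interleavings(seqs):
    """Yield all order-preserving interleavings (shuffles) of the given sequences."""
    seqs = [list(s) for s in seqs if s]
    if not seqs:
        yield []
        return
    for i, s in enumerate(seqs):
        head, rest = s[0], seqs[:i] + [s[1:]] + seqs[i+1:]
        for tail in interleavings(rest):
            yield [head] + tail

# Residual components of the example after the communications are resolved:
alice = ["input x", "X(x)"]
bob   = ["newqubit u", "Z(u)", "output u"]
all_runs = list(interleavings([alice, bob]))
print(len(all_runs))    # 10 = (2+3)! / (2! * 3!)
print(all_runs[0])      # ['input x', 'X(x)', 'newqubit u', 'Z(u)', 'output u']  (= I1)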

5 Checking Equivalence of Quantum Protocols


In the following we define the equivalence between two concurrent protocols. In
the classical theory of computation there are several ways of checking equiv-
alence of concurrent systems using semantics, namely bisimulation-based, au-
tomata-based, game semantics and trace semantics. In this paper we focus on
superoperator semantics where each interleaving of a system is described by a
superoperator. We require that protocols be functional as in Definition 2.
In our system, we represent density matrices (mixed states) implicitly. This is
done by interpreting protocols with pure stabilizer states on different runs of the
protocol’s model. However, it may be possible to work with mixed stabilizer states
directly [1].
Having defined superoperators for concurrent quantum protocols, we explain
how to verify the correctness of the protocols. We show that the specification
and implementation of a protocol are equivalent by proving that their corresponding
superoperators are equal. Since superoperators are linear, it suffices to
execute the protocol for all elements in the stabilizer basis set of Theorem 2 in order to
capture its effect on the input. This process is done in two phases: first, the functionality
of the specification and the implementation is checked; second, their equivalence
is established. In either case we need to schedule and interpret the protocols
in the stabilizer formalism and apply equality tests to the reached states for every
basis input.
The following algorithm describes how to do these two steps: for a concurrent
program P, let I(P, v) denote all possible interleavings of the program with initial
state v (from the basis set of Theorem 2, denoted by B), produced by a scheduler
and indexed by integers from 1 upwards. Let I_i denote the i-th interleaving and
suppose StabSim*(P, v, i) denotes the final state given by the stabilizer simulation
algorithm of [1] applied to I_i, on initial basis state v. Finally, let EQ_S(v, w) be
the equality test algorithm from Section 2. Then Figure 4 shows the equivalence
checking algorithm for two concurrent programs P1 and P2 , and establishes the
following result.
for all v ∈ B do
  for all i ∈ {1, 2} do
    |φ_i^v⟩ = StabSim*(P_i, v, 1)
    for all j ∈ I(P_i, v) − {1} do
      if ¬EQ_S(StabSim*(P_i, v, j), |φ_i^v⟩) then
        return P_i non-functional
      end if
    end for
  end for
  if ¬EQ_S(|φ_1^v⟩, |φ_2^v⟩) then
    return P_1 ≇ P_2
  end if
end for
return P_1 ≅ P_2

Fig. 4. Algorithm for checking equivalence of concurrent protocols
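A direct rendering of the loop of Figure 4 in Python might look as follows; stabilizer_basis, interleavings_of, stab_sim and eq_s are placeholders for the basis of Theorem 2, the scheduler, the stabilizer simulator of [1] and the equality test of Proposition 1. The actual tool is implemented in Java, so this is only a sketch of the control flow.

def check_equivalence(p1, p2, stabilizer_basis, interleavings_of, stab_sim, eq_s):
    """Return 'non-functional', 'not equivalent' or 'equivalent', following Fig. 4."""
    for v in stabilizer_basis:
        reference = {}
        for idx, p in enumerate((p1, p2)):
            runs = interleavings_of(p, v)
            reference[idx] = stab_sim(p, v, runs[0])
            for run in runs[1:]:
                # functionality check: every interleaving must reach the same state
                if not eq_s(stab_sim(p, v, run), reference[idx]):
                    return f"P{idx + 1} is non-functional"
        # equivalence check on this basis element
        if not eq_s(reference[0], reference[1]):
            return "P1 and P2 are not equivalent"
    return "P1 and P2 are equivalent"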

Proposition 3. Given two functional concurrent quantum protocols, which only
use operations in the stabilizer formalism, one can systematically decide whether
they are equivalent with respect to their superoperator semantics on every possible
input, by iteration on the stabilizer basis.

Proposition 4. Checking equivalence of concurrent quantum protocols has over-
all (time) complexity of O(N · 2^{2n} · poly(m + n)), where n is the number of in-
put qubits (basis size), m is the number of qubits inside a program (i.e. those
created by newqubit) and N is the number of interleavings of the processes, where
$$N = \frac{\left(\sum_{i=1}^{M} n_i\right)!}{\prod_{i=1}^{M} (n_i!)}$$
for M processes each having n_i atomic instructions.
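For a feel for the growth of N, it can be computed directly as a multinomial coefficient; the process sizes below are made-up values used only for illustration.

from math import factorial

def num_interleavings(sizes):
    """N = (sum n_i)! / prod(n_i!) for processes with n_i atomic instructions each."""
    total = factorial(sum(sizes))
    for n in sizes:
        total //= factorial(n)
    return total

print(num_interleavings([2, 3]))      # 10
print(num_interleavings([7, 7, 7]))   # 399072960 -- the schedule space grows quickly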

Remark 3. In classical computing, the equivalence checking problem or imple-
mentation verification of concurrent systems (where only the containment prob-
lem is considered, not the simulation problem), is PSPACE-complete (see [18]
for details).

6 Examples
We have analysed a range of quantum protocols covering quantum communica-
tion, quantum fault-tolerance and quantum cryptography using our equivalence
checking tool. In this section, we present two of the protocols we have verified.
The remaining protocols that we have analysed, in particular, fault tolerant pro-
tocols such as one bit teleportation and remote CNOT as well as error correction
protocols can be found at http://go.warwick.ac.uk/eardeshir/qec.
Teleportation [7]: The goal in this protocol is to teleport a quantum state
from Alice to Bob without physically transferring qubits, using quantum en-
tanglement. Before starting the communication between the two parties, Alice
and Bob, an entangled pair is established and shared between them. Alice then
entangles the input qubit with her half of the entangled qubit by applying a
[Circuit diagram: the input |ψ⟩ and two |0⟩ qubits; H and CNOT prepare the entangled pair, Alice applies CNOT and H and measures her two qubits, and Bob applies X and Z corrections to output |ψ⟩]
Fig. 5. Teleportation circuit

controlled-not gate followed by a Hadamard gate. She then measures her qubits
and sends the classical outcome to Bob. Depending on the four classical
outcomes of Alice’s measurements, Bob applies certain X and Z operations and
recovers the input state on his entangled qubit. The circuit which implements
quantum teleportation is shown in Figure 5.
However the circuit model does not provide a high level interface and does
not capture the notion of physical separation between Alice and Bob. Through
our concurrent language, we provide a programming interface and we can also
describe the implementation of teleportation reflecting physical separation. The
specification and implementation programs for teleportation in our concurrent
language are shown in Figure 6.

//Implementation:
//Preparing EPR pair and sending to Alice and Bob:
newqubit y . newqubit z . H(y) . CNOT(y,z) . c!y . d!z . nil
|
//Alice’s process:
(input x . c?y . CNOT(x,y) . H(x) . m := measure x . n := measure y .
 b!m . b!n . nil
|
//Bob’s process:
d?w . b?m . b?n . if n then X(w) . if m then Z(w) . output w . nil)

//Specification:
input x . output x . nil

Fig. 6. Teleportation: Specification and Implementation

Quantum Secret Sharing: This protocol was first introduced by Hillery et
al. [19]. The original problem of secret sharing involves an agent Alice sending a
message to two agents Bob and Charlie, one of whom is dishonest. Alice doesn’t
know which one of the agents is dishonest, so she must encode the message so that
Bob and Charlie must collaborate to retrieve it. For the quantum version of this
protocol the three agents need to share a maximally entangled three-qubit state,
called the GHZ state, prior to the execution of the protocol: |000⟩ + |111⟩. In
Figure 7, we assume that Charlie will end up with the original qubit (a variation
of the protocol will allow Bob to end up with it). First, Alice entangles the input
qubit with her entangled qubit from the GHZ state. Then Alice measures her
qubits and sends the outcome to Charlie. Bob also measures his qubit and sends
// Preparing GHZ state and sending to Alice, Bob and Charlie:
newqubit a . newqubit b . newqubit c . H(a) . CNOT(a,b) . CNOT(b,c) .
d!a . e!b . f!c . nil
|
//Alice, who wants to share her qubit:
(input x . d?a . CNOT(x,a) . H(x) . m := measure x .
 n := measure a . t!m . w!n . nil
|
//Bob, who is chosen as a collaborator:
(e?b . H(b) . o := measure b . u!o . nil
|
//Charlie, who recovers the original qubit from Alice:
f?c . t?m . w?n . u?o .
if o then Z(c) . if m then X(c) . if n then Z(c) . output c . nil))

Fig. 7. Secret Sharing Implementation. The specification is the same as Teleportation.

the outcome to Charlie. Finally, Charlie is able to retrieve the original qubit once
he has access to the bits from Alice and Bob. The specification of secret sharing
is similar to teleportation, expressed in Figure 6. The security of this protocol is
a consequence of no-cloning theorem and is discussed in [19]. The specification
for this protocol is the same as for teleportation.
We conclude with some final remarks. First, we can easily add more inputs
to each of our protocols, which means that we are checking e.g. teleportation of
one qubit in the presence of entanglement with other qubits. This follows from
linearity, but it is never explicitly stated in standard presentations of telepor-
tation. Second, we can model different implementations of a protocol, e.g. by
changing the amount of concurrency. These differences are invisible at the level
of circuit diagrams.

7 Equivalence Checker and Experimental Results


We have implemented a concurrent equivalence checker in Java [3]. The parser
for the concurrent language is produced using SableCC [15]. The input of this
tool is a concurrent protocol as described in Section 3. The scheduler generates
all possible interleavings arising from execution of concurrent protocols. Each
interleaving on a given input basis is passed to the interpreter which interprets
programs using the Aaronson-Gottesman algorithm. The verification procedure
consists of two steps. First, functionality is checked for a given input pro-
tocol; then, equivalence of the two protocols is determined. Both steps use
the equality test algorithm in Section 2. The experimental results of verifica-
tion of protocols based on the models presented in Section 6 and in [3], are
summarized in Figure 8. Note that the specification of error corrections and
Z/X-teleportation are the same as Figure 6, whereas remote CNOT’s are spec-
ified by a single application of CNOT gate on two qubit inputs. The tool was
run on a 2.5GHz Intel Core i3 machine with 4GB RAM. We would like to com-
pare our results with those produced by the model checker QMC, but we have
not been successful in running all the examples. This is partly because QMC is
based on a different approach to verification, i.e. temporal logic model checking,
rather than equivalence checking. The tool Quantomatic is not a fully automatic
tool, therefore we were not able to provide a comparison with case studies in
that tool as well. The experimental results show how concurrency affects quan-
tum systems. Not surprisingly, with more sharing of entanglement and increased
classical and quantum communication, we have to deal with a larger number of
interleavings. We have verified (Figure 8) sequential models of protocols in our
current and previous tools [4]. Because of the complex structure of measurements
in the five qubit code, we were not able to model this protocol in the sequen-
tial equivalence checker. The scheduler in our previous tool is slower than the
one in our current work. This is because we were building program graphs for
extracting schedules, whereas in this work schedules are directly obtained from
abstract syntax tree. Comparing the results in Figure 8 shows that sequential
models are analysed more quickly because they do not deal with concurrency.
However, error correction protocols are inherently sequential, so their sequential
and concurrent models are very similar and produce similar results.

Protocol                  No. Interleaving      CM    No. Branch     SM    SEC
Teleportation                          400     343            16     39     43
Dense Coding                           100     120             4     22     30
Bit flip code                           16      62            16     60     61
Phase flip code                         16      63            16     61     62
Five qubit code                         64     500            64    451      *
X-Teleportation                         32      63             8     18     25
Z-Teleportation                         72      78             8     19     27
Remote CNOT                          78400   12074            64    112    140
Remote CNOT(A)                       23040    4882            64    123    156
Quantum Secret Sharing               88480   13900            32     46     60
Fig. 8. Experimental results of equivalence checking of quantum protocols. The
columns headed by CM and SM show the results of verification of concurrent and
sequential models of protocols in the current tool. Column SEC shows verification
times for sequential models in our previous tool [4]. The number of branches for SM
and SEC models are the same. Times are in milliseconds.

8 Related Work

In recent years there have been several approaches to the formal analysis of
quantum information systems. In this section we review some of the work that
is most relevant to this paper.
We have already mentioned the QMC system. QMC checks properties in
Quantum Computation Tree Logic (QCTL) [6] on models which lie within the
stabilizer formalism. It can be used to check some protocols in a process-oriented
style similar to that of the present paper; however, it simulates the protocols on
all stabilizer states as inputs, not just the smaller set of stabilizer states that
form a basis for the space of density matrices, and is therefore less efficient.
Our previous work [4] uses a similar approach to the present paper, but lim-
ited to sequential protocols. It therefore lacks the ability to explore, for a given
protocol, different models with different degrees of concurrency.
Process calculus can also be used to analyse quantum systems. Gay and Na-
garajan introduced CQP [17] based on the π-calculus; bisimulation for CQP has
been developed and applied by Davidson et al. [10,9]. Ying et al. have developed
qCCS [27] based on classical CCS, and studied its theory of bisimulation [14].
These are theoretical investigations which have not yet produced tools.
Wille et al. [26] consider two reversible circuits and then check their equiv-
alence with respect to a target functionality (specification). To this end, tech-
niques based on Boolean SAT and Quantum Binary Decision Diagrams [25] have
been used. However, these methods are only applicable to quantum circuits with
classical inputs/outputs.
Abramsky and Coecke [2] have developed diagrammatic reasoning techniques
for quantum systems, based on a category-theoretic formulation of quantum me-
chanics. Quantomatic [12] is a tool based on this formalism, which uses graph
rewriting in order to reason about quantum systems. The interface of Quan-
tomatic is graphical, in contrast to our tool which uses a programming lan-
guage syntax. Also, our tool verifies quantum protocols in a fully automatic
way, whereas Quantomatic is a semi-automatic tool which needs a considerable
amount of user intervention (see [13] for an example and discussion).

9 Conclusions and Future work


We have presented a technique and a tool for the verification of concurrent
quantum protocols, using equivalence checking. In our work, we require that a
concurrent protocol computes a function, in the sense that, (1) each interleaving
yields a deterministic input-output relation; and (2) all interleavings yield the
same input-output relation. The semantics of each interleaving of a quantum
protocol is defined by a superoperator, which is a linear transformation from a
protocol’s input space to its output space. In particular, since superoperators are
linear, we can analyse the behaviour of a concurrent protocol on arbitrary inputs
using a suitable stabilizer basis [16]. This enables us to reduce the problem of
checking equivalence over a continuum of quantum states to a tractable problem
of checking equivalence over a discrete set of states. We have presented results
comparing the execution times of three approaches to verification: (1) analysis
of concurrent models; (2) analysis of sequentialised models with the concurrent
equivalence checker; and (3) analysis of sequential models with our previous
sequential equivalence checker [4].
In this work we are not able to analyse algorithms with non-stabilizer ele-
ments, like Shor’s and Grover’s algorithm. Also, continuously running protocols
with input/output at intermediate points, need a more general notion of equiva-
lence such as bisimulation. And of course our tool cannot be used directly when
the correctness of protocols can not be specified as an equivalence, such as the


security property of QKD. Nevertheless, our tool has been successfully applied
to a range of examples as shown in Figure 8.
One area for future work is extending the scope of our tool to go beyond
the stabilizer formalism. There are a number of cases studied in [1] which involve
non-stabilizer states and operations. We should be able to extend our techniques
in a straightforward way, if the number of non-stabilizer operations is limited.
A more comprehensive modelling language e.g. implementing functional fea-
tures, as well as improvement of the classical aspects of our tool, is highly desir-
able. Another area for research is extending our equivalence checking technique
to other quantum modelling languages, such as CQP or qCCS. It would be in-
teresting to investigate whether the bisimulation relations in [10] or [14] can be
automated. In this paper, we have only considered an interleaving model of con-
currency. Of course, one could consider other models of concurrency, for example
true concurrency, and see whether it can be characterized by superoperators.
Finally, it will be interesting to develop a stabilizer based technique for anal-
ysis of security protocols such as QKD.

References
1. Aaronson, S., Gottesman, D.: Improved simulation of stabilizer circuits. Phys. Rev.
A 70, 052328 (2004)
2. Abramsky, S., Coecke, B.: A categorical semantics of quantum protocols. In: Pro-
ceedings of the 19th Annual IEEE Symposium on Logic in Computer Science, pp.
415–425 (2004)
3. Ardeshir-Larijani, E.: Quantum equivalence checker (2013),
http://go.warwick.ac.uk/eardeshir/qec
4. Ardeshir-Larijani, E., Gay, S.J., Nagarajan, R.: Equivalence checking of quantum
protocols. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795,
pp. 478–492. Springer, Heidelberg (2013)
5. Audenaert, K.M.R., Plenio, M.B.: Entanglement on mixed stabilizer states: normal
forms and reduction procedures. New Journal of Physics 7(1), 170 (2005)
6. Baltazar, P., Chadha, R., Mateus, P.: Quantum computation tree logic—model
checking and complete calculus. International Journal of Quantum Informa-
tion 6(2), 219–236 (2008)
7. Bennett, C.H., Brassard, G., Crépeau, C., Jozsa, R., Peres, A., Wootters, W.K.:
Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-
Rosen channels. Phys. Rev. Lett. 70, 1895–1899 (1993)
8. Calderbank, A.R., Shor, P.W.: Good quantum error-correcting codes exist. Phys.
Rev. A 54, 1098–1105 (1996)
9. Davidson, T.A.S., Gay, S.J., Nagarajan, R., Puthoor, I.V.: Analysis of a quantum
error correcting code using quantum process calculus. EPTCS 95, 67–80 (2012)
10. Davidson, T.A.S.: Formal Verification Techniques Using Quantum Process Calcu-
lus. PhD thesis, University of Warwick (2011)
11. de Riedmatten, H., Marcikic, I., Tittel, W., Zbinden, H., Collins, D., Gisin, N.:
Long distance quantum teleportation in a quantum relay configuration. Physical
Review Letters 92(4), 047904 (2004)
12. Dixon, L., Duncan, R.: Graphical reasoning in compact closed categories for quan-
tum computation. Annals of Mathematics and Artificial Intelligence 56(1), 23–42
(2009)
13. Duncan, R., Lucas, M.: Verifying the Steane code with quantomatic.
arXiv:1306.4532 (2013)
14. Feng, Y., Duan, R., Ying, M.: Bisimulation for quantum processes. In: Proceed-
ings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages, pp. 523–534. ACM (2011)
15. Gagnon, E.: SableCC, an object-oriented compiler framework. Master’s thesis,
School of Computer Science, McGill University (1998)
16. Gay, S.J.: Stabilizer states as a basis for density matrices. arXiv:1112.2156 (2011)
17. Gay, S.J., Nagarajan, R.: Communicating Quantum Processes. In: Proceedings
of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, pp. 145–157. ACM (2005)
18. Harel, D., Kupferman, O., Vardi, M.Y.: On the complexity of verifying concurrent
transition systems. Information and Computation 173(2), 143–161 (2002)
19. Hillery, M., Bužek, V., Berthiaume, A.: Quantum secret sharing. Phys. Rev. A 59,
1829–1834 (1999)
20. Mayers, D.: Unconditional Security in Quantum Cryptography. Journal of the
ACM 48(3), 351–406 (2001)
21. Milner, R.: Communication and concurrency. Prentice Hall (1989)
22. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information.
Cambridge University Press (2000)
23. Selinger, P.: Towards a quantum programming language. Mathematical Structures
in Computer Science 14(4), 527–586 (2004)
24. Shor, P.W.: Fault-tolerant quantum computation. In: Proceedings of the 37th An-
nual Symposium on Foundations of Computer Science, FOCS 1996. IEEE Com-
puter Society, Washington, DC (1996)
25. Viamontes, G.F., Markov, I.L., Hayes, J.P.: Quantum Circuit Simulation. Springer
(2009)
26. Wille, R., Grosse, D., Miller, D., Drechsler, R.: Equivalence checking of reversible
circuits. In: 39th International Symposium on Multiple-Valued Logic, pp. 324–330
(2009)
27. Ying, M., Feng, Y., Duan, R., Ji, Z.: An algebra of quantum processes. ACM Trans.
Comput. Logic 10(3), 19:1–19:36 (2009)
Computing Conditional Probabilities
in Markovian Models Efficiently

Christel Baier, Joachim Klein, Sascha Klüppelholz, and Steffen Märcker

Institute for Theoretical Computer Science


Technische Universität Dresden, Germany

Abstract. The fundamentals of probabilistic model checking for Marko-


vian models and temporal properties have been studied extensively in
the past 20 years. Research on methods for computing conditional prob-
abilities for temporal properties under temporal conditions is, however,
comparably rare. For computing conditional probabilities or expected
values under ω-regular conditions in Markov chains, we introduce a new
transformation of Markov chains that incorporates the effect of the con-
dition into the model. For Markov decision processes, we show that the
task to compute maximal reachability probabilities under reachability
conditions is solvable in polynomial time, while it was conjectured to be
computationally hard. Using adaptions of known automata-based meth-
ods, our algorithm can be generalized for computing the maximal condi-
tional probabilities for ω-regular events under ω-regular conditions. The
feasibility of our algorithms is studied in two benchmark examples.

1 Introduction

Probabilistic model checking has become a prominent technique for the quanti-
tative analysis of systems with stochastic phenomena. Tools like PRISM [20] or
MRMC [18] provide powerful probabilistic model checking engines for Markovian
models and temporal logics such as probabilistic computation tree logic (PCTL)
for discrete models and its continuous-time counterpart CSL (continuous stochas-
tic logic) or linear temporal logic (LTL) as formalism to specify complex path
properties. The core task for the quantitative analysis is to compute the prob-
ability of some temporal path property or the expected value of some random
variable. For finite-state Markovian models with discrete probabilities, this task
is solvable by a combination of graph algorithms, matrix-vector operations and
methods for solving linear equation systems or linear programming techniques
[25,9,15,7]. Although probabilistic model checking is a very active research topic
and many researchers have suggested sophisticated methods e.g. to tackle the
state explosion problem or to provide algorithms for the analysis of infinite-state

This work was in part funded by the DFG through the CRC 912 HAEC, the cluster
of excellence cfAED, the project QuaOS, the DFG/NWO-project ROCKS, and by
the ESF young researcher group IMData 100098198, and the EU-FP-7 grant 295261
(MEALS).

stochastic models or probabilistic games, there are important classes of prop-
erties that are not directly supported by existing probabilistic model checkers.
Among these are conditional probabilities that are well-known in probability the-
ory and statistics, but have been neglected by the probabilistic model checking
community. Exceptions are [1,2] where PCTL has been extended by a condi-
tional probability operator and recent approaches for discrete and continuous-
time Markov chains and patterns of path properties with multiple time- and
cost-bounds [13,17].
The usefulness of conditional probabilities for anonymization protocols has
been illustrated in [1,2]. Let us provide here some more intuitive examples that
motivate the study of conditional probabilities. For systems with unreliable com-
ponents one might ask for the conditional probability to complete a task success-
fully within a given deadline, under the condition that no failure will occur
that prevents the completion of the task. If multiple tasks θ1 , . . . , θk have to be
completed, assertions on the conditional probability or the conditional costs to
complete task θi , under the condition that some other task θj will be completed
successfully might give important insights on how to schedule the tasks without
violating some service-level agreements. For another example, the conditional
expected energy requirements for completing a task, under the condition that a
certain utility value can be guaranteed, can provide useful insights for the design
of power management algorithms. Conditional probabilities can also be useful
for assume-guarantee-style reasoning. In these cases, assumptions on the stimuli
of the environment can be formalized by a path property ψ and one might then
reason about the quantitative system behavior using the conditional probability
measure under the condition that ψ holds.
Given a purely stochastic system model M (e.g. a Markov chain), the analysis
under conditional probability measures can be carried out using standard meth-
ods for unconditional probabilities, as we can simply rely on the mathematical
definition of the conditional probability for ϕ (called here the objective) under
condition ψ:
$$\mathrm{Pr}^{\mathcal{M}}_s(\varphi \mid \psi) = \frac{\mathrm{Pr}^{\mathcal{M}}_s(\varphi \wedge \psi)}{\mathrm{Pr}^{\mathcal{M}}_s(\psi)}$$

where s is a state in M with Pr^M_s(ψ) > 0. If both the objective ϕ and the
condition ψ are ω-regular path properties, e.g. specified by LTL formulas or
some ω-automaton, then ϕ ∧ ψ is again ω-regular, and the above quotient is
computable with standard techniques. This approach has been taken by Andrés
and van Rossum [1,2] for the case of discrete Markov chains and PCTL path
formulas, where ϕ ∧ ψ is not a PCTL formula, but a ω-regular property of some
simple type if nested state formulas are viewed as atoms. Recently, an automata-
based approach has been developed for continuous-time Markov chains and CSL
path formulas built by cascades of the until-operator with time- and cost-bounds
[13]. This approach has been adapted in [17] for discrete-time Markov chains and
PCTL-like path formulas with multiple bounded until-operators.
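To make the quotient definition above concrete, the following sketch (a toy example of ours, not taken from the paper) computes a conditional reachability probability on a small acyclic Markov chain by enumerating all paths up to absorption and forming Pr(ϕ ∧ ψ)/Pr(ψ); the state names and probabilities are invented for the example.

# Toy acyclic Markov chain: absorbing states have no outgoing transitions.
P = {
    "s0":   {"a": 0.5, "b": 0.5},
    "a":    {"goal": 0.8, "fail": 0.2},
    "b":    {"goal": 0.3, "fail": 0.7},
    "goal": {}, "fail": {},
}

def prob_visiting(state, targets, visited=frozenset(), prob=1.0):
    """Probability mass of the paths from `state` that visit every state in `targets`."""
    visited = visited | {state}
    if not P[state]:                      # absorbed: check which targets were seen
        return prob if targets <= visited else 0.0
    return sum(prob_visiting(t, targets, visited, prob * p)
               for t, p in P[state].items())

objective, condition = {"goal"}, {"a"}
num = prob_visiting("s0", objective | condition)   # Pr(phi and psi) = 0.4
den = prob_visiting("s0", condition)               # Pr(psi)         = 0.5
print(num / den)                                   # conditional probability 0.8

The enumeration works only because the toy chain is acyclic; for chains with cycles one computes both probabilities with the standard linear-equation machinery instead.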
For models that support both the representation of nondeterministic and prob-
abilistic behaviors, such as Markov decision processes (MDPs), reasoning about
(conditional) probabilities requires the resolution of the nondeterministic choices
by means of schedulers. Typically, one is interested in guarantees that can be
given even for worst-case scenarios. That is, we are interested in the maximal
(or minimal) conditional probability for the objective ϕ under condition ψ when
ranging over all schedulers. Unfortunately, there is no straightforward reduction
to unconditional maximal (or minimal) probabilities, simply because extrema
of quotients cannot be computed by the quotient of the extremal values of the
numerator and the denominator. [1,2] present a model checking algorithm for
MDP and PCTL extended by a conditional probability operator. The essential
features are algorithms for computing the maximal or minimal conditional prob-
abilities for the case where both the objective and the condition are given as
PCTL path formulas. These algorithms rely on transformations of the given
MDP into an acyclic one and the fact that for PCTL objectives and conditions
optimal schedulers that are composed by two memoryless schedulers (so-called
semi history-independent schedulers) exist. The rough idea is to consider all
semi history-independent schedulers and compute the conditional probabilities
for them directly. This method suffers from the combinatorial blow-up and leads
to an exponential-time algorithm. [1,2] also present reduction and bounding
heuristics to omit some semi history-independent schedulers, but these cannot
avoid the exponential worst-case time complexity. We are not aware of an imple-
mentation of these methods.
Contribution. The theoretical main contribution is twofold. First, for discrete
Markov chains we present an alternative approach that relies on a transformation
where we switch from the original Markov chain M to a modified Markov chain
Mψ such that the conditional probabilities in M agree with the (unconditional)
probabilities in Mψ for all measurable path properties ϕ. That is, Mψ only de-
pends on the condition ψ, but not on the objective ϕ. Second, for MDPs we
provide a polynomial-time algorithm for computing maximal conditional prob-
abilities when both the objective ϕ and the condition ψ are reachability prop-
erties. (This task was suspected to be computationally hard in [2].) Moreover,
we show that adaptions of known automata-based approaches are applicable to
extend this method for ω-regular objectives and conditions. In both cases, the
time complexity of our methods is roughly the same as for computing (extremal)
unconditional probabilities for properties of the form ϕ ∧ ψ.
Outline. Section 2 summarizes the relevant concepts of Markov chains and
MDPs. The theoretical foundations of our approach will be presented for Markov
chains in Section 3 and for MDPs in Section 4. Section 5 reports on experimental
results. Section 6 contains some concluding remarks. Omitted proofs and other
additional material can be found in the technical report [6].

2 Preliminaries
We briefly summarize our notations used for Markov chains and Markov decision
processes. Further details can be found in textbooks on probability theory and
Markovian models, see e.g. [24,19,16].
Markov Chains. A Markov chain is a pair M = (S, P) where S is a countable
set of states and P : S × S → [0, 1] a function, called the transition probability
function, such that Σ_{s'∈S} P(s, s') = 1 for each state s. Paths in M are finite or
infinite sequences s_0 s_1 s_2 ... of states built by transitions, i.e., P(s_{i−1}, s_i) > 0
for all i ≥ 1. If π = s_0 s_1 ... s_n is a finite path then first(π) = s_0 denotes the
first state of π, and last(π) = s_n the last state of π. The notation first(π) will
be used also for infinite paths. We refer to the value
$$\Pr(\pi) = \prod_{1 \leq i \leq n} P(s_{i-1}, s_i)$$

as the probability for π. The cylinder set Cyl (π) is the set of all infinite paths ς
where π is a prefix of ς. We write FPaths(s) for the set of all finite paths π with
first(π) = s. Similarly, Paths(s) stands for the set of infinite paths starting in s.
Given a state s, the probability space induced by M and s is defined using
classical measure-theoretic concepts. The underlying sigma-algebra is generated
by the cylinder sets of finite paths. This sigma-algebra does not depend on
s. We refer to the elements of this sigma-algebra as (measurable) path events.
The probability measure Pr^M_s is defined on the basis of standard measure ex-
tension theorems that yield the existence of a probability measure Pr^M_s with
Pr^M_s(Cyl(π)) = Pr(π) for all π ∈ FPaths(s), while the cylinder sets of paths π
with first(π) ≠ s have measure 0 under Pr^M_s.

Markov Decision Processes (MDPs). MDPs can be seen as a generaliza-
tion of Markov chains where the operational behavior in a state s consists of
a nondeterministic selection of an enabled action α, followed by a probabilis-
tic choice of the successor state, given s and α. Formally, an MDP is a tuple
M = (S, Act, P ) where S is a finite set of states, Act a finite set of actions and
P : S × Act × S → [0, 1] a function such that for all states s ∈ S and α ∈ Act:
$$\sum_{s' \in S} P(s, \alpha, s') \in \{0, 1\}$$

We write Act(s) for the set of actions α that are enabled in s, i.e., P(s, α, s') > 0
for some s' ∈ S. For technical reasons, we require that Act(s) ≠ ∅ for all states
s. State s is said to be probabilistic if Act(s) = {α} is a singleton, in which case
we also write P(s, s') rather than P(s, α, s'). A trap state is a probabilistic state
s with P (s, s) = 1. Paths are finite or infinite sequences s0 s1 s2 . . . of states such
that for all i ≥ 1 there exists an action αi with P (si−1 , αi , si ) > 0. (For our
purposes, the actions are irrelevant in paths.) Several notations that have been
introduced for Markov chains can now be adapted for Markov decision processes,
such as first(π), FPaths(s), Paths(s).
Reasoning about probabilities for path properties in MDPs requires the selec-
tion of an initial state and the resolution of the nondeterministic choices between
the possible transitions. The latter is formalized via schedulers, often also called
policies or adversaries, which take as input a finite path and select an action to be
executed. For the purposes of this paper, it suffices to consider deterministic, pos-
sibly history-dependent schedulers, i.e., partial functions S : FPaths → Act such
that S(π) ∈ Act(last(π)) for all finite paths π. Given a scheduler S, an S-path
is any path that might arise when the nondeterministic choices in M are resolved
using S. Thus, π = s_0 s_1 ... s_n is an S-path iff P(s_{k−1}, S(s_0 s_1 ... s_{k−1}), s_k) > 0
for all 1 ≤ k ≤ n. In this case, S[π] denotes the scheduler “S after π” given
by S[π](t0 t1 . . . tk ) = S(s0 s1 . . . sn t1 . . . tk ) if sn = t0 . The behavior of S[π]
for paths not starting in sn is irrelevant. The probability of π under S is the
product of the probabilities of its transitions:
$$\mathrm{Pr}^{S}(\pi) = \prod_{k=1}^{n} P\big(s_{k-1}, S(s_0 s_1 \ldots s_{k-1}), s_k\big)$$

Infinite S-paths are defined accordingly.


For a pointed MDP (M, sinit ), i.e. an MDP as before with some distinguished
initial state sinit ∈ S, the behavior of (M, sinit ) under S is purely probabilistic and
can be formalized by an infinite tree-like Markov chain M^S_s where the states are
the finite S-paths starting in s. The probability measure Pr^S_{M,s} for measurable
sets of the infinite paths in the Markov chain M^S_s can be transferred to infinite
S-paths in M starting in s. Thus, if Φ is a path event then Pr^S_{M,s}(Φ) denotes
its probability under scheduler S for starting state s. For a worst-case analysis
of a system modeled by an MDP M, one ranges over all initial states and all
schedulers (i.e., all possible resolutions of the nondeterminism) and considers the
maximal or minimal probabilities for Φ. If Φ represents a desired path property,
then Pr^min_{M,s}(Φ) = inf_S Pr^S_{M,s}(Φ) is the probability for Φ in M that can be
guaranteed even for the worst-case scenarios. Similarly, if Φ stands for a bad
(undesired) path event, then Pr^max_{M,s}(Φ) = sup_S Pr^S_{M,s}(Φ) is the least upper bound
that can be guaranteed for the likelihood of Φ in M.
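As a small illustration of how a scheduler removes the nondeterminism, the sketch below (our simplification, restricted to memoryless deterministic schedulers) turns an MDP transition function into the transition function of the induced Markov chain, on which the unconditional probabilities discussed above can then be computed with the standard machinery; the dictionary encoding is an assumption made for the example.

def induced_chain(mdp, scheduler):
    """mdp: dict state -> dict action -> dict successor -> probability.
    scheduler: dict state -> chosen action (memoryless, deterministic).
    Returns the transition function of the induced Markov chain."""
    return {s: dict(actions[scheduler[s]]) for s, actions in mdp.items()}

# Tiny example MDP with one nondeterministic state s0:
mdp = {
    "s0":   {"alpha": {"goal": 0.9, "fail": 0.1},
             "beta":  {"goal": 0.5, "s0": 0.5}},
    "goal": {"tau": {"goal": 1.0}},      # trap states
    "fail": {"tau": {"fail": 1.0}},
}
chain = induced_chain(mdp, {"s0": "alpha", "goal": "tau", "fail": "tau"})
print(chain["s0"])   # {'goal': 0.9, 'fail': 0.1}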

Temporal-Logic Notations, Path Properties. Throughout the paper, we
suppose that the reader is familiar with ω-automata and temporal logics. See e.g.
[8,14,5]. We often use LTL- and CTL-like notations and identify LTL-formulas
with the set of infinite words over the alphabet 2^AP that are models for the
formulas, where AP denotes the underlying set of atomic propositions. For the
Markov chain or MDP M under consideration we suppose then that they are
extended by a labeling function L : S → 2^AP, with the intuitive meaning that
precisely the atomic propositions in L(s) hold for state s. At several places, we
will use temporal state and path formulas where single states or sets of states
in M are used as atomic propositions with the obvious meaning. Similarly, if
M arises by some product construction, (sets of) local states will be treated as
atomic propositions. For the interpretation of LTL- or CTL-like formulas in M,
the probability annotations (as well as the action labels in case of an MDP) are
ignored and M is viewed as an ordinary Kripke structure.
By a path property we mean any language consisting of infinite words over
2AP . Having in mind temporal logical specifications, we use the logical operators
∨, ∧, ¬ for union, intersection and complementation of path properties. A path
property Φ is said to be measurable if the set of infinite paths π in M satisfying
Φ is a path event, i.e., an element of the induced sigma-algebra. Indeed, all
ω-regular path properties are measurable [25]. We abuse notations and identify
measurable path properties and the induced path event. Thus,
    PrS M,s (ϕ) = PrS M,s { π ∈ Paths(s) : π |= ϕ }
denotes the probability for ϕ under scheduler S and starting state s.

Assumptions. For the methods proposed in the following sections, we suppose


that the state space of the given Markov chain and the MDP is finite and that
all transition probabilities are rational.

3 Conditional Probabilities in Markov Chains

In what follows, let M = (S, P ) be a finite Markov chain as in Section 2 and ψ


an ω-regular condition. We present a transformation M ⇝ Mψ such that the
conditional probabilities PrM s ( ϕ | ψ ) agree with the (unconditional) probabilities
PrMψ sψ ( ϕ ) for all ω-regular objectives ϕ. Here, s is a state in M with PrM s (ψ) > 0
and sψ the “corresponding” state in Mψ . We first treat the case where ψ is a
reachability condition and then explain a generalization for ω-regular conditions.

Reachability Condition. Let G ⊆ S be a set of goal states and ψ = ♦G.


Intuitively, the Markov chain Mψ arises from M by a monitoring technique that
runs in parallel to M and operates in two modes. In the initial mode “before
or at G”, briefly called before mode, the attempt is to reach G by avoiding all
states s with s ⊭ ∃♦G. The transition probabilities for the states in before mode
are modified accordingly. As soon as G has been reached, Mψ switches to the
normal mode where Mψ behaves as M. In what follows, we write sbef and snor
for the copies of state s in the before and normal mode, respectively. For V ⊆ S,
let V bef = { sbef : s ∈ V, s |= ∃♦G }, the set of V -states where PrM s (♦G) is
positive, and V nor = { snor : s ∈ V }. The Markov chain Mψ = (Sψ , Pψ ) is defined as
follows. The state space of Mψ is Sψ = S bef ∪ S nor . For s ∈ S \ G and v ∈ S with
s |= ∃♦G and v |= ∃♦G:

    Pψ (sbef , v bef ) = P (s, v) · PrM v (♦G) / PrM s (♦G)

For s ∈ G, we define Pψ (sbef , v nor ) = P (s, v), modeling the switch from before
to normal mode. For the states in normal mode, the transition probabilities are
given by Pψ (snor , v nor ) = P (s, v). In all other cases, Pψ (·) = 0. For the labeling
with atomic propositions, we suppose that each state s in M and its copies sbef
and snor in Mψ satisfy the same atomic propositions.
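The construction of Mψ is straightforward to implement once the reachability probabilities PrM s (♦G) are available; the sketch below (an illustration under our own dictionary-based representation, not the implementation described later in the paper) builds Pψ over the bef/nor copies exactly as defined above. Note that each before-mode row sums to 1, since PrM s (♦G) = ∑v P (s, v)·PrM v (♦G) for every state s ∉ G.

# Illustrative sketch of the transformation M ~> M_psi for psi = <>G.
# P[s][v] is the transition probability of the Markov chain M;
# reach[s] = Pr^M_s(<>G) is assumed to have been computed beforehand.
def build_M_psi(P, G, reach):
    P_psi = {}
    for s in P:
        if s in G:
            # switch from before to normal mode with the original probabilities
            P_psi[("bef", s)] = {("nor", v): P[s][v] for v in P[s]}
        elif reach[s] > 0:
            # before mode: stay among states that can still reach G, rescaled
            P_psi[("bef", s)] = {("bef", v): P[s][v] * reach[v] / reach[s]
                                 for v in P[s] if reach[v] > 0}
        # normal mode: M_psi behaves exactly as M
        P_psi[("nor", s)] = {("nor", v): P[s][v] for v in P[s]}
    return P_psi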
By applying standard arguments for finite Markov chains we obtain that
PrMψ sbef (♦Gbef ) = 1 for all states s in M with s |= ∃♦G. (This is a simple conse-
quence of the fact that all states in S bef can reach Gbef .) Thus, up to the switch
from G to Gbef , the condition ♦G (which we impose for M) holds almost surely
for Mψ . For each path property ϕ, there is a one-to-one correspondence between
the infinite paths π in M with π |= ϕ ∧ ♦G and the infinite paths πψ in Mψ


with πψ |= ϕ. More precisely, each path πψ in Mψ induces a path πψ |M in M by
dropping the mode annotations. Vice versa, each path π in M can be augmented
with mode annotations to obtain a path πψ in Mψ with πψ |M = π, provided that
π either contains some G-state or consists of states s with s |= ∃♦G. This yields
a one-to-one correspondence between the cylinder sets in Mψ and the cylinder
sets spanned by finite paths of M that never enter some state s with s ⊭ ∃♦G
without having visited G before.

Theorem 1 (Soundness of the transformation). If Φ is a path event for
M (i.e., a measurable set of infinite paths) then Φ|M = { πψ : π ∈ Φ, π |=
♦G ∨ □∃♦G } is measurable in Mψ . Moreover, for each state s of M with s |= ∃♦G:

    PrM s ( Φ | ♦G ) = PrMψ sbef ( Φ|M )

Hence, PrM s ( ϕ | ♦G ) = PrMψ sbef ( ϕ ) for all measurable path properties ϕ.

Thus, once Mψ has been constructed, conditional probabilities for arbitrary


path properties in M can be computed by standard methods for computing
unconditional probabilities in Mψ , with the same asymptotic costs. (The size of
Mψ is linear in the size of M.) Mψ can be constructed in time polynomial in
the size of M as the costs are dominated by the computation of the reachability
probabilities PrMs (♦G). Mψ can also be used to reason about the conditional
expected value of a random function f on infinite paths in M, as we have:
  
    EM ( f | ♦G ) = EMψ ( fψ | ♦(Gbef ∪ Gnor ) )

where fψ (π′) = f (π′|M ) and EN (·) denotes the expected-value operator in N .
An important instance is expected accumulated rewards to reach a certain set
of states. See [6].

ω -regular conditions. Suppose now that the condition ψ is given by a deter-


ministic ω-automaton A with, e.g., Rabin or Streett acceptance. To construct a
Markov chain that incorporates the probabilities in M under the condition ψ, we
rely on the standard techniques for the quantitative analysis of Markov chains
against automata-specifications [26,5]. The details are straightforward, we just
give a brief outline. First, we build the standard product M ⊗ A of M and A,
which is again a Markov chain. Let G be the union of the bottom strongly con-
nected components C of N = M ⊗ A that meet the acceptance condition of A.
Then, the probability PrM s (ψ) equals PrN ⟨s,qs⟩ (♦G), where ⟨s, qs ⟩ is the state in
M ⊗ A that “corresponds” to s. We then apply the transformation N ⇝ Nψ
as explained above and obtain that for all measurable path properties ϕ:

    PrM s ( ϕ | ψ ) = PrN ⟨s,qs⟩ ( ϕ | ♦G ) = PrNψ ⟨s,qs⟩ ( ϕ )

for all states s in M where PrM s (ψ) is positive. This shows that the task to com-
pute conditional probabilities for ω-regular conditions is solvable by algorithms
for computing (unconditional) probabilities for ω-regular path properties.
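As an illustration of this pipeline (not the authors' implementation), the goal set G can be obtained from the bottom strongly connected components of the product chain, here sketched with the networkx library (assumed to be available) and a Rabin acceptance condition given as pairs (Ei , Fi ) of state sets.

# Illustrative sketch: goal set G for an omega-regular condition, given the
# underlying graph of the product chain N = M (x) A and Rabin pairs (E_i, F_i);
# a BSCC C is accepting iff C has no E_i-state and some F_i-state, for some i.
import networkx as nx

def accepting_bsccs(edges, rabin_pairs):
    g = nx.DiGraph(edges)                        # edges: iterable of (state, successor)
    goal = set()
    for scc in nx.strongly_connected_components(g):
        is_bottom = all(t in scc for s in scc for t in g.successors(s))
        if is_bottom and any(not (scc & E) and (scc & F) for (E, F) in rabin_pairs):
            goal.update(scc)
    return goal                                   # G = union of accepting BSCCs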
4 Conditional Probabilities in Markov Decision Processes

We now consider the task to compute maximal conditional probabilities in MDPs.


We start with the “core problem” where the objective and the condition are
reachability properties. The general case of ω-regular objectives and conditions
will be treated in Section 4.2.

4.1 Conditional Reachability Probabilities in MDPs

Let (M, sinit ) be a pointed MDP where M = (S, Act, P ) and let F , G ⊆ S such
that sinit |= ∃♦G, in which case Prmax M,sinit (♦G) > 0. The task is to compute

    maxS PrS M,sinit ( ♦F | ♦G ) = maxS [ PrS M,sinit ( ♦F ∧ ♦G ) / PrS M,sinit ( ♦G ) ]

where S ranges over all schedulers for M such that PrS M,sinit (♦G) > 0. By the
results of [1,2], there exists a scheduler S maximizing the conditional probability
for ♦F , given ♦G. (This justifies the use of max rather than sup.)
Only for simplicity, we assume that F ∩ G = ∅. Thus, there are just two cases
for the event ♦F ∧ ♦G: “either F before G, or G before F ”. We also suppose
sinit ∉ F ∪ G and that all states s ∈ S are accessible from sinit .

Step 1: Normal form Transformation

We first present a transformation M ⇝ M′ such that the maximal conditional
probability for “♦F , given ♦G” agrees with the maximal conditional probabil-
ity for “♦F ′ , given ♦G′ ” in M′ where F ′ and G′ consist of trap states. This
can be seen as some kind of normal form for maximal conditional reachability
probabilities and relies on the following observation.

Lemma 1 (Scheduler improvement). For each scheduler S there is a sched-


uler T such that for all states s with PrS M,s (♦G) > 0:

    (1) PrS M,s ( ♦F | ♦G ) ≤ PrT M,s ( ♦F | ♦G )
    (2) PrT[π] M,t (♦G) = Prmax M,t (♦G) for all t ∈ F and π ∈ Πs...t
    (3) PrT[π] M,u (♦F ) = Prmax M,u (♦F ) for all u ∈ G and finite paths π ∈ Πs...u
where Πs...u denotes the set consisting of all finite paths s0 s1 . . . sn in M with
s0 = s, sn = u and {s0 , s1 , . . . , sn−1 } ∩ (F ∪ G) = ∅.

Recall that S[π] denotes the scheduler “S after π”. The idea is that T behaves
as S as long as neither F nor G has been reached. As soon as a G-state (resp. F -
state) has been entered, T mimics some scheduler that maximizes the probability
to reach F (resp. G). This scheduler satisfies (2) and (3) by construction. Item
(1) follows after some calculations (see [6]).
As a consequence of Lemma 1, for studying the maximal conditional proba-


bility for ♦F given ♦G, it suffices to consider schedulers T satisfying conditions
(2) and (3). Let M′ be the MDP that behaves as M as long as no state in F or
G has been visited. After visiting an F -state t, M′ moves probabilistically to a
fresh goal state with probability Prmax M,t (♦G) or to a fail state with the remaining
probability. Similarly, after visiting a G-state u, M′ moves probabilistically to
the goal state or to a new state stop. Formally, M′ = (S′, Act, P′) where the
state space of M′ is S′ = S ∪ T and

    T = { goal , stop, fail }.

The transition probabilities in M′ for the states in S \ (F ∪ G) agree with those in
M, i.e., P′(s, α, s′) = P (s, α, s′) for all s ∈ S \ (F ∪ G), s′ ∈ S and α ∈ Act. The
states t ∈ F and u ∈ G are probabilistic in M′ with the transition probabilities:

    P′(t, goal ) = Prmax M,t (♦G)          P′(u, goal ) = Prmax M,u (♦F )
    P′(t, fail ) = 1 − Prmax M,t (♦G)      P′(u, stop) = 1 − Prmax M,u (♦F )

The three fresh states goal , fail and stop are trap states. Then, by Lemma 1:

Corollary 1 (Soundness of the normal form transformation). For all


states s in M with s |= ∃♦G:

    Prmax M,s ( ♦F | ♦G ) = Prmax M′,s ( ♦goal | ♦(goal ∨ stop) )
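A direct implementation of the normal-form construction above is sketched below (illustration only; the dictionary representation, the pseudo-action name "move", the self-loops on the trap states, and the precomputed values pmax_G[t] = Prmax M,t (♦G) and pmax_F[u] = Prmax M,u (♦F ) are assumptions on our part).

# Sketch of step 1: build the normal-form MDP M' from M.
def normal_form(P, F, G, pmax_G, pmax_F):
    P_prime = {t: {"loop": {t: 1.0}} for t in ("goal", "stop", "fail")}  # trap states
    for s, actions in P.items():
        if s in F:
            # an F-state becomes probabilistic: goal with Pr^max(<>G), else fail
            P_prime[s] = {"move": {"goal": pmax_G[s], "fail": 1.0 - pmax_G[s]}}
        elif s in G:
            # a G-state: goal with Pr^max(<>F), else stop
            P_prime[s] = {"move": {"goal": pmax_F[s], "stop": 1.0 - pmax_F[s]}}
        else:
            P_prime[s] = actions                  # behave exactly as M otherwise
    return P_prime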

Optional simplification of M′. Let W be the set of states w in M′ such that


for some scheduler S, the goal-state is reachable from w via some S-path, while
the trap state stop will not be reached along S-paths from w. Then, all states
in W can be made probabilistic with successors goal and fail and efficiently
computable transition probabilities. This transformation of M′ might yield a
reduction of the reachable states, while preserving the maximal conditional prob-
abilities for ♦goal , given ♦(goal ∨ stop). For details, see [6].

Step 2: Reduction to Ordinary Maximal Reachability Probabilities


We now apply a further transformation M′ ⇝ Mϕ|ψ such that the maximal
conditional probability for ϕ = ♦goal , given ψ = ♦(goal ∨ stop), in M′ agrees
with the maximal (unconditional) probability for ♦goal in Mϕ|ψ .
Let us first sketch the rationale behind this transformation. Infinite paths in M′ that
violate the condition ψ do not “contribute” to the conditional probability for
♦goal . The idea is now to “redistribute” their probabilities to all the paths sat-
isfying the condition ψ. Speaking roughly, we aim to mimic a stochastic process
that generates a sequence π0 , π1 , π2 , . . . of sample paths in M′ starting in sinit
until a path πi is obtained where the condition ψ holds. To formalize this “redis-
tribution procedure” by switching from M′ to some new MDP Mϕ|ψ we need
some restart mechanism to discard generated prefixes of paths πj violating ψ by
returning to the initial state sinit , from which the next sample run πj+1 will be
generated. Note that by discarding paths that do not satisfy ψ, the proportion of
the paths satisfying ϕ ∧ ψ and the paths satisfying ψ is not affected and almost
surely a path satisfying ψ will be generated. Thus, the conditional probability
for ϕ ∧ ψ given ψ under some scheduler of the original MDP agrees with the
(unconditional) probability for ϕ under the corresponding scheduler of the new
MDP Mϕ|ψ .
The restart policy is obvious for finite paths that enter the trap state fail .
Instead of staying in fail , we simply restart the computation by returning to the
initial state sinit . The second possibility to violate ψ are paths that never enter
one of the three trap states in T . To treat such paths we rely on well-known
results for finite-state MDPs stating that for all schedulers S almost all S-paths
eventually enter an end component (i.e., a strongly connected sub-MDP), stay
there forever and visit all its states infinitely often [11,12]. The idea is that we
equip all states s that belong to some end component without any T -state with
the restart-option, i.e., we add the nondeterministic alternative to return to the
initial state sinit . To enforce that such end components will be left eventually
by taking the restart-transition, one might impose strong fairness conditions for
the schedulers in Mϕ|ψ . Such fairness assumptions are, however, irrelevant for
maximal reachability conditions [3,4].
Let B be the set of (bad) states v such that there exists a scheduler S that
never visits one of the three trap states goal , stop or fail when starting in v:

    v ∈ B   iff   there exists a scheduler S such that PrS M′,v ( ♦T ) = 0

The MDP Mϕ|ψ = (S′, Act ∪ {τ }, Pϕ|ψ ) has the same state space as the normal
form MDP M′. Its action set extends the action set of M′ by a fresh action
symbol τ for the restart-transitions. For the states s ∈ S′ \ B with s ≠ fail ,
the new MDP Mϕ|ψ behaves as M′, i.e., Pϕ|ψ (s, α, s′) = P′(s, α, s′) for all
s ∈ S′ \ (B ∪ {fail }), α ∈ Act and s′ ∈ S′. The fresh action τ is not enabled in
the states s ∈ S′ \ (B ∪ {fail }). For the fail-state, Mϕ|ψ returns to the initial state,
i.e., Pϕ|ψ (fail , τ, sinit ) = 1 and Pϕ|ψ (fail , τ, s′) = 0 for all states s′ ∈ S′ \ {sinit }.
No other action than τ is enabled in fail . For the states v ∈ B, Mϕ|ψ decides
nondeterministically to behave as M′ or to return to the initial state sinit . That is,
if v ∈ B, α ∈ Act, s′ ∈ S′ then Pϕ|ψ (v, α, s′) = P′(v, α, s′) and Pϕ|ψ (v, τ, sinit ) = 1.
In all remaining cases, we have Pϕ|ψ (v, τ, ·) = 0.
Paths in M′ that never reach one of the three trap states, or that end up in fail ,
do not “contribute” to the conditional probability for ♦goal , given ♦(goal ∨ stop).
Instead, the probability of these ψ-violating paths in M′ is “redistributed” to
the probabilities for ♦goal and ♦stop when switching to conditional probabilities.
This is mimicked by the restart-transitions to sinit in Mϕ|ψ .
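The restart step itself only adds transitions and is easy to implement on top of the normal form; in the sketch below (ours), the set B of bad states is assumed to have been computed beforehand by the graph analysis mentioned in the next paragraph.

# Sketch of step 2: add restart-transitions (fresh action "tau") to M'.
def add_restarts(P_prime, B, s_init):
    P_cond = {}
    for s, actions in P_prime.items():
        if s == "fail":
            P_cond[s] = {"tau": {s_init: 1.0}}        # only the restart is enabled
        elif s in B:
            P_cond[s] = dict(actions)                  # keep the original actions ...
            P_cond[s]["tau"] = {s_init: 1.0}           # ... and offer the restart
        else:
            P_cond[s] = actions                        # unchanged, tau not enabled
    return P_cond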

Theorem 2 (Soundness of step 2). For the initial state s = sinit , we have:

    Prmax M′,s ( ♦goal | ♦(goal ∨ stop) ) = Prmax Mϕ|ψ,s ( ♦goal )

Algorithm and Complexity. As an immediate consequence of Theorem 2,


the task to compute maximal conditional reachability probabilities in MDPs is
reducible to the task to compute maximal ordinary (unconditional) reachability


probabilities, which is solvable using linear programming techniques [24,7]. The
size of the constructed MDP is linear in the size of M′, which again is linear
in the size of M. The construction of M′ and Mϕ|ψ is straightforward. For M′
we need to compute ordinary maximal reachability probabilities in M. Using
standard algorithms for the qualitative fragment of PCTL, the set B of bad
states is computable by a graph analysis in polynomial time (see e.g. [5]). Thus,
maximal conditional probabilities for reachability objectives and conditions can
be computed in time polynomial in the size of M.
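The maximal (unconditional) reachability probabilities that all of the above relies on can be obtained by linear programming or policy iteration; the sketch below (an illustration under our toy dictionary representation, not the method prescribed by the paper) uses plain value iteration, which only approximates the exact values up to the chosen tolerance.

# Value iteration for Pr^max_{M,s}(<>target) on an MDP given as P[s][a][t].
def max_reachability(P, targets, eps=1e-10):
    vals = {s: (1.0 if s in targets else 0.0) for s in P}
    while True:
        change = 0.0
        for s in P:
            if s in targets or not P[s]:
                continue
            best = max(sum(pr * vals[t] for t, pr in succ.items())
                       for succ in P[s].values())
            change = max(change, abs(best - vals[s]))
            vals[s] = best
        if change < eps:
            return vals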

4.2 Conditional Probabilities in MDPs for Other Events


Using standard automata-based techniques, our method can be generalized to
deal with ω-regular properties for both the objective and the condition.
ω-regular objectives under reachability conditions. Using a standard
automata-based approach, the suggested technique is also applicable to com-
pute maximal conditional probabilities Prmax M,s ( ϕ | ♦G ). Here, we deal with a
deterministic ω-automaton A for ϕ and then compute the maximal conditional
probabilities Prmax N,⟨s,qs⟩ ( ♦F | ♦G ) in the product-MDP N = M ⊗ A where F is
the union of all end components in M ⊗ A satisfying the acceptance condition
of A. Here, ⟨s, qs ⟩ denotes the state in M ⊗ A that “corresponds” to s.
(co-)safety conditions. If ψ is a regular co-safety condition then we can use a
representation of ψ by a deterministic finite automaton (DFA) B, switch from
M to the product-MDP M ⊗ B with the reachability condition stating that
some final state of B should be visited. With slight modifications, an analogous
technique is applicable for regular safety conditions, in which case we use a DFA
for the bad prefixes of ψ. See [6]. This approach is also applicable for MDPs with
positive state rewards and if ψ is a reward-bounded reachability condition ♦≤r a.
ω-regular conditions. If the condition ψ and the objective ϕ are ω-regular
then the task to compute Prmax M,s ( ϕ | ψ ) is reducible to the task of computing
maximal conditional probabilities for reachability objectives and some strong
fairness condition ψ  . The idea is to simply use deterministic Streett automata
A and B for ϕ and ψ and then to switch from M to the product-MDP M⊗A⊗B.
The condition ψ can then be replaced by B’s acceptance condition. The goal set
F of the objective ♦F arises by the union of all end components in M ⊗ A ⊗ B
where the acceptance conditions of both A and B hold. 
It remains to explain how to compute Prmax M,s ( ϕ | ψ ) where ϕ = ♦F is a
reachability objective and ψ is a strong fairness (i.e., Streett) condition, say:

    ψ = ⋀1≤i≤k ( □♦Ri → □♦Gi )

We can rely on very similar ideas as for reachability conditions (see Section
4.1). The construction of a normal form MDP M′ (step 1) is roughly the same
except that we deal only with two fresh trap states: goal and fail . The restart
mechanism in step 2 can be realized by switching from M′ to a new MDP Mϕ|ψ
that is defined in the same way as in Section 4.1, except that restart-transitions
are only added to those states v where v ∈ Ri for some i ∈ {1, . . . , k}, and v
is contained in some end component that does not contain goal and does not
contain any Gi -state. For further details we refer to the extended version [6].
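To make the role of the Streett condition concrete, the following helper (our illustration) decides whether a candidate end component C, in which every state is visited infinitely often, satisfies a strong fairness condition given as pairs (Ri , Gi ) of state sets.

# Our illustration: does a state set C (e.g. an end component) satisfy
# the Streett condition  AND_i ( []<>R_i -> []<>G_i ) ?
def satisfies_streett(C, streett_pairs):
    # inside an end component, []<>R_i holds iff C meets R_i (likewise for G_i)
    return all(not (C & R) or (C & G) for (R, G) in streett_pairs)

print(satisfies_streett({"r1", "g1"}, [({"r1"}, {"g1"})]))   # True
print(satisfies_streett({"r1", "x"},  [({"r1"}, {"g1"})]))   # False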

5 PRISM Implementation and Experiments


We have implemented most of the algorithms proposed in this paper in the pop-
ular model checker Prism [21], extending the functionality of version 4.1. Our
implementation is based on the “explicit” engine of Prism, i.e., the analysis is
carried out using an explicit representation of the reachable states and transi-
tions. We have extended the explicit engine to handle LTL path properties for
Markov chains using deterministic Rabin automata and Prism’s infrastructure.
For Markov chains, we implemented the presented transformation M ⇝ Mψ
where ψ and ϕ are given as LTL formulas. The presented method for reachability
conditions ψ = ♦G has been adapted in our implementation for the more general
case of constrained reachability conditions ψ = H U G. Our implementation also
supports a special treatment of conditions ψ consisting of a single step-bounded
modality ♦≤n , U≤n or □≤n . Besides the computation of conditional probabilities,
our implementation also provides the option to compute conditional expected
rewards under (constrained) reachability or ω-regular conditions. We used the
three types of expected rewards supported by Prism: the expected accumulated
reward until a target set F is reached or within the next n ∈ N steps, and
the expected instantaneous reward obtained in the n-th step. For MDPs, our
current implementation only supports the computation of maximal conditional
probabilities for reachability objectives and reachability conditions based on the
algorithm presented in Section 4.1.
Experiments with Markov Chains. To evaluate our transformation-based
approach for Markov chains we carried out a series of experiments with the
Markov chain model for the bounded retransmissions protocol presented in [10].
The model specifications are from the Prism benchmark suite [22] (see
http://www.prismmodelchecker.org/casestudies/brp.php). A sender has to
transmit N fragments of a file using a simple protocol over lossy channels, where
the probability of losing a message is 0.01, while the probability of losing an ac-
knowledgment is 0.02. A parameter M specifies the maximum number of retries
for each fragment. We applied our method to compute:

    (B1) PrM s ( ♦ “second retry for fragment” | □ ¬“finish with error” )
    (B2) PrM s ( ♦ “finish with success” | ♦ “2 fragments transmitted” )
    (B3) PrM s ( □ ¬“retry” | ♦ (“finish with success” ∧ “retries ≤ 2”) )

All calculations for this paper were carried out on a computer with 2 Intel E5-
2680 8-core CPUs at 2.70 GHz with 384Gb of RAM. Table 1 lists results for the
calculation of the conditional probabilities (B1)–(B3), with N = 128 fragments
and M = 10 retries. We report the number of states and the time for building the
Table 1. Statistics for the computation of (B1), (B2), (B3) for N = 128, M = 10

       model M            PrM s ( ϕ | ψ ) via transformation                 via quotient
       states    build    states of Mψ   M ⇝ Mψ   calc in Mψ   total time    total time
(B1)   18,701    0.5 s    17,805         19.2 s    5.5 s        24.7 s        58.7 s
(B2)   18,701    0.5 s    18,679          1.7 s   17.0 s        18.7 s        39.2 s
(B3)   18,701    0.5 s     3,976         10.5 s    1.2 s        11.7 s        14.9 s

model and statistics for the calculation of PrM s ( ϕ | ψ ) with the method presented
in Section 3 and via the quotient of PrM s ( ϕ ∧ ψ ) and PrM s ( ψ ). In addition to
the total time for the calculation, for our method we list as well the size of
the transformed model Mψ , the time spent in the transformation phase and
the time spent to calculate the probabilities of ϕ in Mψ . In these experiments,
our transformation method outperforms the quotient approach by separating
the treatment of ψ and ϕ. As expected, the particular condition significantly
influences the size of Mψ and the time spent for the calculation in Mψ . We
plan to allow caching of Mψ if the task is to treat multiple objectives under the
same condition ψ. We have carried out experiments for conditional rewards with
similar scalability results as well, see [6].
Experiments with MDPs. We report on experimental studies with our im-
plementation of the calculation of Prmax M,s ( ♦F | ♦G ) for the initial state s =
sinit of the parameterized MDP presented in [23]; see also [22], http://www.
prismmodelchecker.org/casestudies/wlan.php. It models a two-way hand-
shake mechanism of the IEEE 802.11 (WLAN) medium access control scheme
with two senders S1 and S2 that compete for the medium. As messages get
corrupted when both senders send at the same time (called a collision), a prob-
abilistic back-off mechanism is employed. The model deals with the case where
a single message from S1 and S2 should be successfully sent. We consider here:

    (W1) Prmax M,s ( ♦ “c2 collisions” | ♦ “c1 collisions” )
    (W2) Prmax M,s ( ♦ “deadline t expired without success of S1 ” | ♦ “c collisions” )

The parameter N specifies the maximal number of back-offs that each sender per-
forms. The atomic propositions “c collisions” are supported by a global counter
variable in the model that counts the collisions (up to the maximal interesting
value for the property). For (W2), the deadline t is encoded in the model by a
global variable counting down until the deadline is expired.
Calculating (W1). Table 2 lists results for the calculation of (W1) with c2 = 4
and c1 = 2. We report the number of states and the time for building the
model. The states in the transformed MDP Mϕ|ψ consist of the states in the
original MDP M plus the three trap states introduced in the transformation.
We list the time for the transformation M ⇝ Mϕ|ψ and for the computation
in Mϕ|ψ separately. For comparison, we list as well the time for calculating
the unconditional probabilities Prmax M ( ϕ ) and Prmax M ( ψ ) for all states in the
Table 2. Statistics for the calculation of (W1) with c1 = 2 and c2 = 4



    model M              Prmax M,s ( ϕ | ψ )                      Prmax M (ϕ)   Prmax M (ψ)
N     states     build   M ⇝ Mϕ|ψ   calc in Mϕ|ψ   total time     total          total
3     118,280    2.3 s      1.6 s        3.2 s        4.8 s        1.1 s          0.4 s
4     345,120    5.5 s      3.2 s        9.0 s       12.3 s        1.6 s          1.3 s
5   1,295,338   21.0 s     12.6 s       33.8 s       46.5 s        3.9 s          4.9 s
6   5,007,668   99.4 s     38.8 s      126.0 s      164.9 s       12.7 s         18.7 s

Table 3. Statistics for the calculation of (W2) with N = 3

      model M               Prmax M,s ( ϕ | ψ )                     Prmax M (ϕ)   Prmax M (ψ)
t    c      states   build  M ⇝ Mϕ|ψ   calc in Mϕ|ψ   total time    total          total
50   1     539,888  10.0 s     6.4 s        0.4 s        6.8 s       6.0 s          0.1 s
50   2     539,900   9.5 s     7.1 s        4.6 s       11.7 s       6.0 s          0.6 s
100  1   4,769,199  95.1 s   194.6 s        2.4 s      197.1 s     192.0 s          0.5 s
100  2   4,769,235  93.3 s   199.8 s       85.5 s      285.5 s     184.4 s         10.4 s

model, which account for a large part of the transformation. As can be seen, our
approach scales reasonably well.
Calculating (W2). Table 3 lists selected results and statistics for (W2) with
N = 3, deadline t ∈ {50, 100} and number of collisions in the condition c ∈ {1, 2}.
Again, the time for the transformation is dominated by the computations of
Prmax M (ϕ) and Prmax M (ψ). However, in contrast to (W1), the time for the com-
putation in Mϕ|ψ is significantly lower. The complexity in practice thus varies
significantly with the particularities of the model and the condition.

6 Conclusion
We presented new methods for the computation of (maximal) conditional prob-
abilities via reductions to the computation of ordinary (maximal) probabilities
in discrete Markov chains and MDPs. These methods rely on transformations of
the model to encode the effect of conditional probabilities. For MDPs we concen-
trated on the computation of maximal conditional probabilities. Our techniques
are, however, also applicable for reasoning about minimal conditional probabil-
ities as: Prmin M,s ( ϕ | ψ ) = 1 − Prmax M,s ( ¬ϕ | ψ ). By our results, the complexity of

the problem that asks whether the (maximal) conditional probability meets a
given probability bound is not harder than the corresponding question for uncon-
ditional probabilities. This is reflected in our experiments. In our experiments
with Markov chains, our new method outperforms the naı̈ve approach. In future
work, we will extend our implementation for MDPs, which currently only supports
reachability objectives and conditions and study methods for the computation
of maximal or minimal expected conditional accumulated rewards.
References
1. Andrés, M.E., van Rossum, P.: Conditional probabilities over probabilistic and
nondeterministic systems. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008.
LNCS, vol. 4963, pp. 157–172. Springer, Heidelberg (2008)
2. Andrés, M.E.: Quantitative Analysis of Information Leakage in Probabilistic and
Nondeterministic Systems. PhD thesis, UB Nijmegen (2011)
3. Baier, C.: On the algorithmic verification of probabilistic systems. Universität
Mannheim, Habilitation Thesis (1998)
4. Baier, C., Groesser, M., Ciesinski, F.: Quantitative analysis under fairness con-
straints. In: Liu, Z., Ravn, A.P. (eds.) ATVA 2009. LNCS, vol. 5799, pp. 135–150.
Springer, Heidelberg (2009)
5. Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press (2008)
6. Baier, C., Klein, J., Klüppelholz, S., Märcker, S.: Computing conditional prob-
abilities in Markovian models efficiently. Technical report, TU Dresden (2014),
http://wwwtcs.inf.tu-dresden.de/ALGI/PUB/TACAS14/
7. Bianco, A., De Alfaro, L.: Model checking of probabilistic and non-deterministic
systems. In: Thiagarajan, P.S. (ed.) FSTTCS 1995. LNCS, vol. 1026, pp. 499–513.
Springer, Heidelberg (1995)
8. Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press (2000)
9. Courcoubetis, C., Yannakakis, M.: The complexity of probabilistic verification.
Journal of the ACM 42(4), 857–907 (1995)
10. D’Argenio, P.R., Jeannet, B., Jensen, H.E., Larsen, K.G.: Reachability analysis of
probabilistic systems by successive refinements. In: de Alfaro, L., Gilmore, S. (eds.)
PAPM-PROBMIV 2001. LNCS, vol. 2165, pp. 39–56. Springer, Heidelberg (2001)
11. de Alfaro, L.: Formal Verification of Probabilistic Systems. PhD thesis, Stanford
University, Department of Computer Science (1997)
12. de Alfaro, L.: Computing minimum and maximum reachability times in probabilis-
tic systems. In: Baeten, J.C.M., Mauw, S. (eds.) CONCUR 1999. LNCS, vol. 1664,
pp. 66–81. Springer, Heidelberg (1999)
13. Gao, Y., Xu, M., Zhan, N., Zhang, L.: Model checking conditional CSL for
continuous-time Markov chains. Information Processing Letters 113(1-2), 44–50
(2013)
14. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games.
LNCS, vol. 2500. Springer, Heidelberg (2002)
15. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects of Computing 6, 512–535 (1994)
16. Haverkort, B.: Performance of Computer Communication Systems: A Model-Based
Approach. Wiley (1998)
17. Ji, M., Wu, D., Chen, Z.: Verification method of conditional probability based on
automaton. Journal of Networks 8(6), 1329–1335 (2013)
18. Katoen, J.-P., Zapreev, I.S., Hahn, E.M., Hermanns, H., Jansen, D.N.: The ins and
outs of the probabilistic model checker MRMC. Performance Evaluation 68(2), 90–
104 (2011)
19. Kulkarni, V.: Modeling and Analysis of Stochastic Systems. Chapman & Hall
(1995)
20. Kwiatkowska, M., Norman, G., Parker, D.: Probabilistic symbolic model checking
with PRISM: A hybrid approach. STTT 6(2), 128–142 (2004)
21. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic
real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
22. Kwiatkowska, M., Norman, G., Parker, D.: The PRISM benchmark suite. In: QEST
2012. IEEE (2012)
23. Kwiatkowska, M., Norman, G., Sproston, J.: Probabilistic model checking of the
IEEE 802.11 wireless local area network protocol. In: Hermanns, H., Segala, R.
(eds.) PAPM-PROBMIV 2002. LNCS, vol. 2399, pp. 169–187. Springer, Heidelberg
(2002)
24. Puterman, M.: Markov Decision Processes: Discrete Stochastic Dynamic Program-
ming. John Wiley & Sons, Inc., New York (1994)
25. Vardi, M.: Automatic verification of probabilistic concurrent finite-state programs.
In: FOCS 1985, pp. 327–338. IEEE (1985)
26. Vardi, M.Y.: Probabilistic linear-time model checking: An overview of the
automata-theoretic approach. In: Katoen, J.-P. (ed.) ARTS 1999. LNCS, vol. 1601,
pp. 265–276. Springer, Heidelberg (1999)
Permissive Controller Synthesis
for Probabilistic Systems

Klaus Dräger3 , Vojtěch Forejt1 , Marta Kwiatkowska1 , David Parker2 , and Mateusz Ujma1

1 Department of Computer Science, University of Oxford, UK
2 School of Computer Science, University of Birmingham, UK
3 EECS, Queen Mary, University of London, UK

Abstract. We propose novel controller synthesis techniques for proba-


bilistic systems modelled using stochastic two-player games: one player
acts as a controller, the second represents its environment, and probabil-
ity is used to capture uncertainty arising due to, for example, unreliable
sensors or faulty system components. Our aim is to generate robust con-
trollers that are resilient to unexpected system changes at runtime, and
flexible enough to be adapted if additional constraints need to be im-
posed. We develop a permissive controller synthesis framework, which
generates multi-strategies for the controller, offering a choice of control
actions to take at each time step. We formalise the notion of permis-
siveness using penalties, which are incurred each time a possible con-
trol action is blocked by a multi-strategy. Permissive controller synthesis
aims to generate a multi-strategy that minimises these penalties, whilst
guaranteeing the satisfaction of a specified system property. We estab-
lish several key results about the optimality of multi-strategies and the
complexity of synthesising them. Then, we develop methods to perform
permissive controller synthesis using mixed integer linear programming
and illustrate their effectiveness on a selection of case studies.

1 Introduction
Probabilistic model checking is used to automatically verify systems with stochas-
tic behaviour. Systems are modelled as, for example, Markov chains, Markov
decision processes, or stochastic games, and analysed algorithmically to verify
quantitative properties specified in temporal logic. Applications include checking
the safe operation of fault-prone systems (“the brakes fail to deploy with prob-
ability at most 10−6 ”) and establishing guarantees on the performance of, for
example, randomised communication protocols (“the expected time to establish
connectivity between two devices never exceeds 1.5 seconds”).
A closely related problem is that of controller synthesis. This entails construct-
ing a model of some entity that can be controlled (e.g., a robot, a vehicle or a
machine) and its environment, formally specifying the desired behaviour of the
system, and then generating, through an analysis of the model, a controller that
will guarantee the required behaviour. In many applications of controller syn-
thesis, a model of the system is inherently probabilistic. For example, a robot’s

sensors and actuators may be unreliable, resulting in uncertainty when detecting


and responding to its current state; or messages sent wirelessly to a vehicle may
fail to be delivered with some probability.
In such cases, the same techniques that underly probabilistic model checking
can be used for controller synthesis. For example, we can model the system
as a Markov decision process (MDP), specify a property φ in a probabilistic
temporal logic such as PCTL and LTL, and then apply probabilistic model
checking. This yields an optimal strategy (policy) for the MDP, which instructs
the controller as to which action should be taken in each state of the model in
order to guarantee that φ will be satisfied. This approach has been successfully
applied in a variety of application domains, to synthesise, for example: control
strategies for robots [21], power management strategies for hardware [16], and
efficient PIN guessing attacks against hardware security modules [27].
Another important dimension of the controller synthesis problem is the pres-
ence of uncontrollable or adversarial aspects of the environment. We can take
account of this by phrasing the system model as a game between two players,
one representing the controller and the other the environment. Examples of this
approach include controller synthesis for surveillance cameras [23], autonomous
vehicles [11] or real-time systems [1]. In our setting, we use (turn-based) stochas-
tic two-player games, which can be seen as a generalisation of MDPs where de-
cisions are made by two distinct players. Probabilistic model checking of such a
game yields a strategy for the controller player which guarantees satisfaction of
a property φ, regardless of the actions of the environment player.
In this paper, we tackle the problem of synthesising robust and flexible con-
trollers, which are resilient to unexpected changes in the system at runtime. For
example, one or more of the actions that the controller can choose at runtime
might unexpectedly become unavailable, or additional constraints may be im-
posed on the system that make some actions preferable to others. One motivation
for our work is its applicability to model-driven runtime control of adaptive sys-
tems [5], which uses probabilistic model checking in an online fashion to adapt or
reconfigure a system at runtime in order to guarantee the satisfaction of certain
formally specified performance or reliability requirements.
We develop novel, permissive controller synthesis techniques for systems mod-
elled as stochastic two-player games. Rather than generating strategies, which
specify a single action to take at each time-step, we synthesise multi-strategies,
which specify multiple possible actions. As in classical controller synthesis, gener-
ation of a multi-strategy is driven by a formally specified quantitative property:
we focus on probabilistic reachability and expected total reward properties. The
property must be guaranteed to hold, whichever of the specified actions are taken
and regardless of the behaviour of the environment. Simultaneously, we aim to
synthesise multi-strategies that are as permissive as possible, which we quan-
tify by assigning penalties to actions. These are incurred when a multi-strategy
blocks (does not make available) a given action. Actions can be assigned different
penalty values to indicate the relative importance of allowing them. Permissive
controller synthesis amounts to finding a multi-strategy whose total incurred
penalty is minimal, or below some given threshold.
We formalise the permissive controller synthesis problem and then establish


several key theoretical results. In particular, we show that randomised multi-
strategies are strictly more powerful than deterministic ones, and we prove that
the permissive controller synthesis problem is NP-hard for either class. We also
establish upper bounds, showing that the problem is in NP and PSPACE for the
deterministic and randomised cases, respectively.
Next, we propose practical methods for synthesising multi-strategies using
mixed integer linear programming (MILP) [25]. We give an exact encoding for
deterministic multi-strategies and an approximation scheme (with adaptable pre-
cision) for the randomised case. For the latter, we prove several additional results
that allow us to reduce the search space of multi-strategies. The MILP solution
process works incrementally, yielding increasingly permissive multi-strategies,
and can thus be terminated early if required. This is well suited to scenarios
where time is limited, such as online analysis for runtime control, as discussed
above, or “anytime verification” [26]. Finally, we implement our techniques and
evaluate their effectiveness on a range of case studies.
An extended version of this paper, with proofs, is available as [13].
Related Work. Permissive strategies in non-stochastic games were first studied
in [2] for parity objectives, but permissivity was defined solely by comparing
enabled actions. Bouyer et al. [3] showed that optimally permissive memoryless
strategies exist for reachability objectives and expected penalties, contrasting
with our (stochastic) setting, where they may not. The work in [3] also studies
penalties given as mean-payoff and discounted reward functions, and [4] extends
the results to the setting of parity games. None of [2,3,4] consider stochastic
games or even randomised strategies, and they provide purely theoretical results.
As in our work, Kumar and Garg [20] consider control of stochastic systems
by dynamically disabling events; however, rather than stochastic games, their
models are essentially Markov chains, which the possibility of selectively dis-
abling branches turns into MDPs. Finally, although tackling a rather different
problem (counterexample generation), [28] is related in that it also uses MILP
to solve probabilistic verification problems.

2 Preliminaries
We denote by Dist(X) the set of discrete probability distributions over a set X.
A Dirac distribution is one that assigns probability 1 to some s ∈ X. The support
of a distribution d ∈ Dist (X) is defined as supp(d) = {x ∈ X | d(x) > 0}.

Stochastic Games. In this paper, we use turn-based stochastic two-player


games, which we often refer to simply as stochastic games. A stochastic game
takes the form G = ⟨S♦ , S□ , s̄, A, δ⟩, where S = S♦ ∪ S□ is a finite set of states,
each associated with player ♦ or □, s̄ ∈ S is an initial state, A is a finite set of
actions, and δ : S × A → Dist (S) is a (partial) probabilistic transition function.
An MDP is a stochastic game with S□ = ∅. Each state s of a stochastic game
G has a set of enabled actions, given by A(s) = {a ∈ A | δ(s, a) is defined}. The
unique player ◦ such that s ∈ S◦ picks the action a ∈ A(s) to be taken in state
s. Then, the next state is determined randomly according to the distribution


δ(s, a), i.e., a transition to state s′ occurs with probability δ(s, a)(s′). A path is a
(finite or infinite) sequence ω = s0 a0 s1 a1 . . . of such transitions through G. We
denote by IPath s (FPath s ) the set of all infinite (finite) paths starting in s. We
omit the subscript s when s is the initial state s̄.
A strategy σ : FPath → Dist (A) for player ◦ ∈ {♦, □} of G is a resolution of
the choices of actions in each state from S◦ , based on the execution so far. In
standard fashion [19], a pair of strategies σ and π for ♦ and □ induces, for any
state s, a probability measure Pr σ,π G,s over IPath s . A strategy σ is deterministic
if σ(ω) is a Dirac distribution for all ω, and randomised if not. In this work,
we focus purely on memoryless strategies, where σ(ω) depends only on the last
state of ω, treating the strategy as a function σ : S◦ → Dist (A). The case of
history-dependent strategies is an interesting topic for future research. We write
ΣG◦ for the set of all (memoryless) player ◦ strategies in G.
Properties and Rewards. In order to synthesise controllers, we need a formal
description of their required properties. In this paper, we use two common classes
of properties: probabilistic reachability and expected total reward, which we will
express in an extended version of the temporal logic PCTL [18].
For probabilistic reachability, we write properties of the form φ = P⋈p [ F g ],
where ⋈ ∈ {≤, ≥}, p ∈ [0, 1] and g ⊆ S is a set of target states, meaning that the
probability of reaching a state in g satisfies the bound ⋈ p. Formally, for a specific
pair of strategies σ ∈ ΣG♦ , π ∈ ΣG□ for G, the probability of reaching g under σ
and π is Pr σ,π G,s (F g) = Pr σ,π G,s ({s0 a0 s1 a1 · · · ∈ IPath s | si ∈ g for some i}). We
say that φ is satisfied under σ and π, denoted G, σ, π |= φ, if Pr σ,π G,s (F g) ⋈ p.


For rewards, we augment stochastic games with reward structures, which are
functions of the form r : S × A → R≥0 mapping state-action pairs to non-negative
reals. In practice, we often use these to represent “costs” (e.g. elapsed time or
energy consumption), despite the terminology “rewards”.
The total reward for reward structure r along an infinite path ω = s0 a0 s1 a1 . . .
is r(ω) = ∑∞j=0 r(sj , aj ). For strategies σ ∈ ΣG♦ and π ∈ ΣG□ , the expected total
reward is defined as Eσ,π G,s (r) = ∫ω∈IPath s r(ω) dPr σ,π G,s . For technical reasons, we
will always assume that the maximum possible reward supσ,π Eσ,π G,s (r) is finite
(which can be checked with an analysis of the game’s underlying graph). An
expected reward property is written φ = Rr⋈b [ C ] (where C stands for cumulative),
meaning that the expected total reward for r satisfies ⋈ b. We say that φ is
satisfied under strategies σ and π, denoted G, σ, π |= φ, if Eσ,π G,s (r) ⋈ b.
In fact, probabilistic reachability can be easily reduced to expected total re-
wards. Thus, in the techniques presented in this paper, we focus purely on ex-
pected total reward.
Controller Synthesis. To perform controller synthesis, we model the system
as a stochastic game G = ⟨S♦ , S□ , s̄, A, δ⟩, where player ♦ represents the con-
troller and player □ represents the environment. A specification of the required
behaviour of the system is a property φ, either a probabilistic reachability prop-
erty P⋈p [ F t ] or an expected total reward property Rr⋈b [ C ].
Definition 1 (Sound strategy). A strategy σ ∈ ΣG♦ for player ♦ in stochastic


game G is sound for a property φ if G, σ, π |= φ for any strategy π ∈ ΣG□ .

The classical controller synthesis problem asks whether there is a sound strategy.
We can determine whether this is the case by computing the optimal strategy
for player ♦ in game G [12,15]. This problem is known to be in NP ∩ co-NP, but,
in practice, methods such as value or policy iteration can be used efficiently.
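As a simplified illustration of such a value iteration (our own representation and names, not the algorithm of any particular tool), the sketch below computes, for each state of a stochastic game, the expected total reward when the controller minimises and the environment maximises; for an upper-bound property Rr≤b [ C ], a sound controller strategy exists iff the resulting value in the initial state is at most b.

# Illustrative value iteration for a stochastic game given as delta[s][a][t]
# with rewards rew[(s, a)]; controller states minimise, environment states maximise.
def game_value(delta, rew, controller_states, eps=1e-10):
    val = {s: 0.0 for s in delta}
    while True:
        change = 0.0
        for s, actions in delta.items():
            if not actions:
                continue                              # e.g. a terminal/target state
            outcomes = [rew.get((s, a), 0.0) +
                        sum(p * val[t] for t, p in succ.items())
                        for a, succ in actions.items()]
            new = min(outcomes) if s in controller_states else max(outcomes)
            change = max(change, abs(new - val[s]))
            val[s] = new
        if change < eps:
            return val
# A sound strategy for an upper bound b on the expected total reward exists
# iff val[s_init] <= b, and one is obtained by picking a minimising action per state.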
Fig. 1. A stochastic game G for Ex. 1

Example 1. Fig. 1 shows a stochastic game G, with controller and environment
player states drawn as diamonds and squares, respectively. It models the control
of a robot moving between 4 locations (s0 , s2 , s3 , s5 ). When moving east (s0 →s2
or s3 →s5 ), it may be impeded by a second robot, depending on the position of the lat-
ter. If it is blocked, there is a chance that it does not successfully move to the
next location. We use a reward structure moves, which assigns 1 to the controller
actions north, east, south, and define property φ = Rmoves ≤5 [ C ], meaning that the
expected number of moves to reach s5 is at most 5. A sound strategy (found
by minimising moves) chooses south in s0 and east in s3 , yielding an expected
number of moves of 3.5.

3 Permissive Controller Synthesis


We now define a framework for permissive controller synthesis, which gener-
alises classical controller synthesis by producing multi-strategies that offer the
controller flexibility about which actions to take in each state.

3.1 Multi-strategies
Multi-strategies generalise the notion of strategies, as defined in Section 2.
Definition 2 (Multi-strategy). A (memoryless) multi-strategy for a game
G=S♦ , S , s, A, δ is a function θ:S♦ →Dist(2A ) with θ(s)(∅) = 0 for all s ∈ S♦ .
As for strategies, a multi-strategy θ is deterministic if θ always returns a Dirac
distribution, and randomised otherwise. We write ΘGdet and ΘGrand for the sets of
all deterministic and randomised multi-strategies in G, respectively.
A deterministic multi-strategy θ chooses a set of allowed actions in each state
s ∈ S♦ , i.e., those in the unique set B ⊆ A for which θ(s)(B) = 1. The re-
maining actions A(s) \ B are said to be blocked in s. In contrast to classical
controller synthesis, where a strategy σ can be seen as providing instructions
about precisely which action to take in each state, in permissive controller syn-
thesis a multi-strategy provides multiple actions, any of which can be taken. A
randomised multi-strategy generalises this by selecting a set of allowed actions
in state s randomly, according to distribution θ(s).
We say that a controller strategy σ complies with multi-strategy θ if it picks


actions that are allowed by θ. Formally (taking into account the possibility of
randomisation), σ complies with θ if, for any state s and non-empty subset
B ⊆ A(s), there is a distribution ds,B ∈ Dist (B) such that, for all a ∈ A(s),
σ(s)(a) = ∑B∋a θ(s)(B)·ds,B (a).
Now, we can define the notion of a sound multi-strategy, i.e., one that is
guaranteed to satisfy a property φ when complied with.

Definition 3 (Sound multi-strategy). A multi-strategy θ for game G is sound


for a property φ if any strategy σ that complies with θ is sound for φ.

Example 2. We return to the stochastic game from Ex. 1 (see Fig. 1) and re-use
the property φ = Rmoves ≤5 [ C ]. The strategy that picks south in s0 and east in s3
results in an expected reward of 3.5 (i.e., 3.5 moves on average to reach s5 ). The
strategy that picks east in s0 and south in s2 yields expected reward 5. Thus a
(deterministic) multi-strategy θ that picks {south, east } in s0 , {south} in s2 and
{east} in s3 is sound for φ since the expected reward is always at most 5.

3.2 Penalties and Permissivity

The motivation for multi-strategies is to offer flexibility in the actions to be


taken, while still satisfying a particular property φ. Generally, we want a multi-
strategy θ to be as permissive as possible, i.e. to impose as few restrictions
as possible on actions to be taken. We formalise the notion of permissivity by
assigning penalties to actions in the model, which we then use to quantify the
extent to which actions are blocked by θ. Penalties provide expressivity in the
way that we quantify permissivity: if it is more preferable that certain actions
are allowed than others, then these can be assigned higher penalty values.
A penalty scheme is a pair (ψ, t), comprising a penalty function ψ : S♦ × A →
R≥0 and a penalty type t ∈ {sta, dyn}. The function ψ represents the impact of
blocking each action in each controller state of the game. The type t dictates how
penalties for individual actions are combined to quantify the permissiveness of
a specific multi-strategy. For static penalties (t = sta), we simply sum penalties
across all states of the model. For dynamic penalties (t = dyn), we take into
account the likelihood that blocked actions would actually have been available,
by using the expected sum of penalty values.
More precisely, for a penalty scheme (ψ, t) and a multi-strategy θ, we define
the resulting penalty for θ, denoted pen t (ψ, θ), as follows. First, we define the
local penalty for θ at state s as pen loc (ψ, θ, s) = ∑B⊆A(s) ∑a∉B θ(s)(B)·ψ(s, a).
If θ is deterministic, pen loc (ψ, θ, s) is simply the sum of the penalties of actions
that are blocked by θ in s. If θ is randomised, pen loc (ψ, θ, s) gives the expected
penalty value in s, i.e. the sum of penalties weighted by the probability with
which θ blocks them in s.
Now, for the static case, we sum the local penalties over all states, i.e. we put
pen sta (ψ, θ) = ∑s∈S♦ pen loc (ψ, θ, s). For the dynamic case, we use the (worst-
case) expected sum of local penalties. We define an auxiliary reward structure
ψ′ given by the local penalties: ψ′(s, a) = pen loc (ψ, θ, s) for all a ∈ A(s). Then:
pen dyn (ψ, θ) = sup{ Eσ,π G,s̄ (ψ′) | σ ∈ ΣG♦ , π ∈ ΣG□ and σ complies with θ }.
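For instance, the local and static penalties of a (possibly randomised) multi-strategy follow directly from these definitions; the sketch below (ours) represents θ(s) as a map from allowed action sets to probabilities and, purely for illustration, assumes that only north and east are enabled in state s3 of the running example.

# Our sketch: static penalty of a multi-strategy theta, where theta[s] maps an
# allowed set of actions (a frozenset) to the probability of offering it in s.
def local_penalty(psi, theta, enabled, s):
    # expected total penalty of the actions blocked in s
    return sum(prob * sum(psi[(s, a)] for a in enabled[s] if a not in allowed)
               for allowed, prob in theta[s].items())

def static_penalty(psi, theta, enabled):
    return sum(local_penalty(psi, theta, enabled, s) for s in theta)

# Randomised multi-strategy from Ex. 3 (penalty 1 per blocked controller action):
enabled = {"s3": {"north", "east"}}
psi = {("s3", "north"): 1, ("s3", "east"): 1}
theta = {"s3": {frozenset({"north"}): 0.7, frozenset({"north", "east"}): 0.3}}
print(static_penalty(psi, theta, enabled))    # 0.7: east is blocked with probability 0.7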

3.3 Permissive Controller Synthesis


We can now formally define the central problem studied in this paper.
Definition 4 (Permissive controller synthesis). Consider a game G, a class
of multi-strategies ⋆ ∈ {det , rand }, a property φ, a penalty scheme (ψ, t) and a
threshold c ∈ Q≥0 . The permissive controller synthesis problem asks: does there
exist a multi-strategy θ ∈ Θ⋆G that is sound for φ and satisfies pen t (ψ, θ) ≤ c?
Alternatively, in a more quantitative fashion, we can aim to synthesise (if it
exists) an optimally permissive sound multi-strategy.
Definition 5 (Optimally permissive). Let G, ⋆, φ and (ψ, t) be as in Defn. 4.
A sound multi-strategy θ̂ ∈ Θ⋆G is optimally permissive if its penalty pen t (ψ, θ̂)
equals inf{pen t (ψ, θ) | θ ∈ Θ⋆G and θ is sound for φ}.

Example 3. We return to Ex. 2 and consider a static penalty scheme (ψ, sta)
assigning 1 to the actions north, east , south (in any state). The deterministic
multi-strategy θ from Ex. 2 is optimally permissive for φ = Rmoves ≤5 [ C ], with
penalty 1 (just north in s3 is blocked). If we instead use φ′ = Rmoves ≤16 [ C ], the multi-
strategy θ′ that extends θ by also allowing north is now sound and optimally
permissive, with penalty 0. Alternatively, the randomised multi-strategy θ′′ that
picks 0.7:{north}+0.3:{north, east } in s3 is sound for φ with penalty just 0.7.
Next, we establish several fundamental results about the permissive controller
synthesis problem. Proofs can be found in [13].
Optimality. Recall that two key parameters of the problem are the type of
multi-strategy sought (deterministic or randomised) and the type of penalty
scheme used (static or dynamic). We first note that randomised multi-strategies
are strictly more powerful than deterministic ones, i.e. they can be more permis-
sive (yield a lower penalty) for the same property φ.
Theorem 1. The answer to a permissive controller synthesis problem (for ei-
ther a static or dynamic penalty scheme) can be “no” for deterministic multi-
strategies, but “yes” for randomised ones.
This is why we explicitly distinguish between classes of multi-strategies when
defining permissive controller synthesis. This situation contrasts with classi-
cal controller synthesis, where deterministic strategies are optimal for the same
classes of properties φ. Intuitively, randomisation is more powerful in this case
because of the trade-off between rewards and penalties: similar results exist in,
for example, multi-objective controller synthesis on MDPs [14].
Second, we observe that, for the case of static penalties, the optimal penalty
value for a given property (the infimum of achievable values) may not actually
be achievable by any randomised multi-strategy.
Theorem 2. For permissive controller synthesis using a static penalty scheme,


an optimally permissive randomised multi-strategy does not always exist.
If, on the other hand, we restrict our attention to deterministic strategies, then
an optimally permissive multi-strategy does always exist (since the set of deter-
ministic, memoryless multi-strategies is finite). For randomised multi-strategies
with dynamic penalties, the question remains open.
Complexity. Next, we present complexity results for the different variants of
the permissive controller synthesis problem. We begin with lower bounds.

Theorem 3. The permissive controller synthesis problem is NP-hard, for either


static or dynamic penalties, and deterministic or randomised multi-strategies.

We prove NP-hardness by reduction from the Knapsack problem, where weights


of items are represented by penalties, and their values are expressed in terms
of rewards to be achieved. The most delicate part is the proof for randomised
strategies, where we need to ensure that the multi-strategy cannot benefit from
picking certain actions (corresponding to items being put to the Knapsack) with
probability other than 0 or 1. For upper bounds, we have the following.
Theorem 4. The permissive controller synthesis problem for deterministic (resp.
randomised) strategies is in NP (resp. PSPACE) for dynamic/ static penalties.
For deterministic multi-strategies it is straightforward to show NP membership
in both the dynamic and static penalty case, since we can guess a multi-strategy
satisfying the required conditions and check its correctness in polynomial time.
For randomised multi-strategies, with some technical effort we can encode exis-
tence of the required multi-strategy as a formula of the existential fragment of
the theory of real arithmetic, solvable with polynomial space [7]. See [13].
A natural question is whether the PSPACE upper bound for randomised
multi-strategies can be improved. We show that this is likely to be difficult, by
giving a reduction from the square-root-sum problem. We use a variant of the
problem that asks, for positive rationals x1 , . . . , xn and y, whether ∑ni=1 √xi ≤ y.
This problem is known to be in PSPACE, but establishing a better complexity
bound is a long-standing open problem in computational geometry [17].

Theorem 5. There is a reduction from the square-root-sum problem to the per-


missive controller synthesis problem with randomised multi-strategies, for both
static and dynamic penalties.

4 MILP-Based Synthesis of Multi-strategies


We now consider practical methods for synthesising multi-strategies that are
sound for a property φ and optimally permissive for some penalty scheme. Our
methods use mixed integer linear programming (MILP), which optimises an
objective function subject to linear constraints that mix both real and integer
variables. A variety of efficient, off-the-shelf MILP solvers exists.
An important feature of the MILP solvers we use is that they work incre-
mentally, producing a sequence of increasingly good solutions. Here, that means
generating a series of sound multi-strategies that are increasingly permissive. In
practice, when resources are constrained, it may be acceptable to stop early and
accept a multi-strategy that is sound but not necessarily optimally permissive.
4.1 Deterministic Multi-strategies
We first consider synthesis of deterministic multi-strategies. Here, and in the
rest of this section, we assume that the property φ is of the form Rr≥b [ C ]. Upper
bounds on expected rewards (φ = Rr≤b [ C ]) can be handled by negating rewards
and converting to a lower bound. For the purposes of encoding into MILP, we
rescale r and b such that supσ,π Eσ,π G,s (r) < 1 for all s, and rescale every (non-zero)
penalty such that ψ(s, a) ≥ 1 for all s and a ∈ A(s).
Static Penalties. Fig. 2 shows an encoding into MILP of the problem of finding
an optimally permissive deterministic multi-strategy for property φ = Rrb [ C ]
and a static penalty scheme (ψ, sta). The encoding uses 5 types of variables:
ys,a ∈ {0, 1}, xs ∈ R≥0, αs ∈ {0, 1}, βs,a,t ∈ {0, 1} and γt ∈ [0, 1], where s, t ∈ S
and a ∈ A. So the worst-case size of the MILP problem is O(|A|·|S|2 ·κ), where
κ stands for the longest encoding of a number used.
Variables ys,a encode a multi-strategy θ: ys,a =1 iff θ allows action a in s
(constraint (2) enforces at least one action per state). Variables xs represent
the worst-case expected total reward (for r) from state s, under any controller
strategy complying with θ and under any environment strategy. This is captured
by constraints (3)–(4) (which amounts to minimising the reward in an MDP).
Constraint (1) imposes the required bound of b on the reward from s̄.
The objective function minimises the static penalty (the sum of all local
penalties) minus the expected reward in the initial state. The latter acts as
a tie-breaker between solutions with equal penalties (but, thanks to rescaling, is
always dominated by the penalties and therefore does not affect optimality).
As an additional technicality, we need to ensure that the values of xs are the
least solution of the defining inequalities, to deal with the possibility of zero
reward loops [24]. To achieve this, we use an approach similar to the one taken
in [28]. It is sufficient to ensure that xs = 0 whenever the minimum expected
reward from s achievable under θ is 0, which is the case if and only if, starting
from s, it is possible to avoid ever taking an action with positive reward.
In our encoding, αs = 1 if xs is positive (constraint (5)). The binary variables
βs,a,t = 1 represent, for each such s and each action a allowed in s, a choice of
successor t ∈ supp(δ(s, a)) (constraint (6)). The variables γs then represent a
ranking function: if r(s, a) = 0, then γs > γt(s,a) (constraint (8)). If a positive
reward could be avoided starting from s, there would in particular be an infinite
sequence s0 , a1 , s1 , . . . with s0 = s and, for all i, si+1 = t(si , ai ) and r(si , ai ) = 0,
and therefore γsi > γsi+1 . Since S is finite, this sequence would have to enter a
loop, leading to a contradiction.
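To make this concrete, the following is a minimal sketch (not the tool's implementation, which accesses MILP solvers such as CPLEX, SCIP and lp_solve directly; see Section 5) of how the constraints of Fig. 2 could be assembled with the off-the-shelf PuLP modelling library. The representation of the game as plain Python dictionaries (S_c, S_e, A, delta, r, psi, the bound b and the initial state s0) is an assumption made here for illustration, and the strict inequality in (8) is approximated with a small constant eps.

```python
import pulp

def static_penalty_milp(S_c, S_e, A, delta, r, psi, b, s0, eps=1e-6):
    # S_c / S_e: controller / environment states; A[s]: actions available in s;
    # delta[(s, a)]: dict mapping successor t to probability delta(s,a)(t);
    # r[(s, a)]: reward; psi[(s, a)]: penalty; b: reward bound; s0: initial state.
    S = list(S_c) + list(S_e)
    prob = pulp.LpProblem("permissive_synthesis", pulp.LpMinimize)
    y = {(s, a): pulp.LpVariable(f"y_{s}_{a}", cat="Binary") for s in S for a in A[s]}
    x = {s: pulp.LpVariable(f"x_{s}", lowBound=0) for s in S}
    alpha = {s: pulp.LpVariable(f"alpha_{s}", cat="Binary") for s in S}
    beta = {(s, a, t): pulp.LpVariable(f"beta_{s}_{a}_{t}", cat="Binary")
            for s in S for a in A[s] for t in delta[(s, a)]}
    gamma = {s: pulp.LpVariable(f"gamma_{s}", lowBound=0, upBound=1) for s in S}

    # objective: static penalty of blocked actions, minus reward in the initial state
    prob += (pulp.lpSum((1 - y[(s, a)]) * psi[(s, a)] for s in S_c for a in A[s])
             - x[s0])
    prob += x[s0] >= b                                                       # (1)
    for s in S_c:
        prob += pulp.lpSum(y[(s, a)] for a in A[s]) >= 1                     # (2)
        for a in A[s]:
            prob += x[s] <= (pulp.lpSum(p * x[t] for t, p in delta[(s, a)].items())
                             + r[(s, a)] + (1 - y[(s, a)]))                  # (3)
    for s in S_e:
        for a in A[s]:
            prob += x[s] <= pulp.lpSum(p * x[t] for t, p in delta[(s, a)].items())  # (4)
            prob += y[(s, a)] == 1                                           # (7)
    for s in S:
        prob += x[s] <= alpha[s]                                             # (5)
        for a in A[s]:
            prob += y[(s, a)] == (1 - alpha[s]) + pulp.lpSum(
                beta[(s, a, t)] for t in delta[(s, a)])                      # (6)
            for t in delta[(s, a)]:
                # (8): strict inequality gamma_t < ... approximated via eps
                prob += gamma[t] + eps <= gamma[s] + (1 - beta[(s, a, t)]) + r[(s, a)]
    prob.solve()
    return {(s, a): pulp.value(y[(s, a)]) for s in S_c for a in A[s]}
```

The returned values of ys,a directly describe the synthesised multi-strategy: action a is allowed in controller state s iff ys,a = 1.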
Dynamic Penalties. Next, we show how to compute an optimally permissive
sound multi-strategy for a dynamic penalty scheme (ψ, dyn). This case is more
 
Minimise:   −xs̄ + Σ_{s∈S♦} Σ_{a∈A(s)} (1 − ys,a)·ψ(s, a)   subject to:

  xs̄ ≥ b                                                                     (1)
  Σ_{a∈A(s)} ys,a ≥ 1                                  for all s ∈ S♦         (2)
  xs ≤ Σ_{t∈S} δ(s, a)(t)·xt + r(s, a) + (1 − ys,a)    for all s ∈ S♦, a ∈ A(s)   (3)
  xs ≤ Σ_{t∈S} δ(s, a)(t)·xt                           for all s ∈ S□, a ∈ A(s)   (4)
  xs ≤ αs                                              for all s ∈ S          (5)
  ys,a = (1 − αs) + Σ_{t∈supp(δ(s,a))} βs,a,t          for all s ∈ S, a ∈ A(s)    (6)
  ys,a = 1                                             for all s ∈ S□, a ∈ A(s)   (7)
  γt < γs + (1 − βs,a,t) + r(s, a)                     for all (s, a, t) ∈ supp(δ)   (8)

Fig. 2. MILP encoding for deterministic multi-strategies with static penalties

Minimise: zs subject to (1), . . . , (7) and:



s = ψ(s, a)·(1 − ys,a ) for all s ∈ S♦ (9)
a∈A(s)

zs  δ(s, a)(t)·zt + s − c·(1 − ys,a ) for all s ∈ S♦ , a ∈ A(s) (10)
t∈S

zs  δ(s, a)(t)·zt for all s ∈ S , a ∈ A(s) (11)
t∈S

Fig. 3. MILP encoding for deterministic multi-strategies with dynamic penalties

subtle since the optimal penalty can be infinite. Hence, our solution proceeds
in two steps as follows. Initially, we determine if there is some sound multi-
strategy. For this, we just need to check for the existence of a sound strategy,
using standard algorithms for solution of stochastic games [12,15].
If there is no sound multi-strategy, we are done. If there is, we use the MILP
problem in Fig. 3 to determine the penalty for an optimally permissive sound
multi-strategy. This MILP encoding extends the one in Fig. 2 for static penal-
ties, adding variables s and zs , representing the local and the expected penalty
in state s, and three extra sets of constraints. Equations (9) and (10) define
the expected penalty in controller states, which is the sum of penalties for all
disabled actions and those in the successor states, multiplied by their transition
probability. The behaviour of environment states is captured by Equation (11),
where we only maximise the penalty, without incurring any penalty locally.
The constant c in (10) is chosen to be no lower than any finite penalty achievable by a deterministic multi-strategy, a possible value being
Σ_{i=0}^∞ (1 − p^|S|)^i · p^|S| · i · |S| · pen_max,
where p is the smallest non-zero probability assigned by δ, and pen_max is the maximal local penalty over all states. If the MILP problem has
a solution, this is the optimal dynamic penalty over all sound multi-strategies.
If not, no deterministic sound multi-strategy has finite penalty and the optimal
penalty is ∞ (recall that we established there is some sound multi-strategy).
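As a side note, the series above for the constant c admits a closed form, which is convenient when instantiating c in an implementation:

  c = Σ_{i=0}^∞ (1 − p^|S|)^i · p^|S| · i · |S| · pen_max = |S| · pen_max · p^|S| · Σ_{i=0}^∞ i·(1 − p^|S|)^i = |S| · pen_max · (1 − p^|S|) / p^|S|,

using the identity Σ_{i≥0} i·x^i = x/(1 − x)² with x = 1 − p^|S|.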
Fig. 4. Transformed game for approximating randomised multi-strategies (Section 4.2)

In practice, we might choose a lower value of c than the one above, resulting in
a multi-strategy that is sound, but possibly not optimally permissive.

4.2 Approximating Randomised Multi-strategies


As shown in Section 3, randomised multi-strategies can outperform deterministic
ones. The MILP encodings in Figs. 2 and 3, though, cannot be adapted to the
randomised case, since this would need non-linear constraints.
Instead, in this section, we propose an approximation which finds the optimal
randomised multi-strategy θ in which each probability θ(s, B) is a multiple of
1/M for a given granularity M. Any such multi-strategy can then be simulated by
a deterministic one on a transformed game, allowing synthesis to be carried out
using the MILP-based methods described in the previous section.
The transformed game is illustrated in Fig. 4. For each controller state s, we
add two layers of states: gadgets sj (for 1 ≤ j ≤ n) representing the subsets
B ⊆ A(s) with θ(s, B) > 0, and selectors si (for 1 ≤ i ≤ m), which distribute
probability among the gadgets. The si are reached from s via a transition using
fixed probabilities p1 , . . . , pm which need to be chosen appropriately (see below).
For efficiency, we want to minimise the number of gadgets n and selectors m for
each state s. We now present several results used to achieve this.
First, note that, if |A(s)| = k, a randomised multi-strategy chooses probabilities for all n = 2^k − 1 non-empty subsets of A(s). Below, we show that it suffices
to consider randomised multi-strategies whose support in each state has just two
subsets, allowing us to reduce the number of gadgets from n = 2^k − 1 to n = 2,
resulting in a smaller MILP problem to solve for multi-strategy synthesis.
Theorem 6. 1. For a (static or dynamic) penalty scheme (ψ, t) and any sound
multi-strategy θ we can construct another sound multi-strategy θ′ such that
pen_t(ψ, θ) ≥ pen_t(ψ, θ′) and |supp(θ′(s))| ≤ 2 for any s ∈ S♦.
2. Furthermore, for static penalties, we can construct θ′ such that, for each
state s ∈ S♦, if supp(θ′(s)) = {B1, B2}, then either B1 ⊆ B2 or B2 ⊆ B1.
Part 2 of Theorem 6 states that, for static penalties, we can further reduce
the possible multi-strategies that we need to consider. This, however, does not
extend to dynamic penalties (see [13]).
Lastly, we define the probabilities p1, . . . , pm on the transitions to selectors
in Fig. 4. We let m = 1 + ⌈log2 M⌉ and pi = li/M, where l1, . . . , lm ∈ N are
defined recursively as follows: l1 = ⌈M/2⌉ and li = ⌈(M − (l1 + · · · + li−1))/2⌉ for 2 ≤ i ≤ m.
Assuming n = 2, as discussed above, this allows us to encode any probability
distribution (l/M, (M − l)/M) between two subsets B1 and B2.
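As an illustration of this construction, the following sketch computes m, the li and the pi for a given granularity M (the ceiling operations reflect the definition above and should be read as one possible instantiation) and checks that every probability l/M with l ∈ {0, . . . , M} can indeed be split between two gadgets:

```python
import math
from itertools import product

def selector_probabilities(M):
    # m = 1 + ceil(log2 M); each l_i takes (rounding up) half of what remains of M
    m = 1 + math.ceil(math.log2(M))
    ls, remaining = [], M
    for _ in range(m):
        li = math.ceil(remaining / 2)
        ls.append(li)
        remaining -= li
    return ls, [li / M for li in ls]

def all_distributions_encodable(M):
    # the mass routed to gadget B1 is a subset sum of {l_1, ..., l_m}, divided by M
    ls, _ = selector_probabilities(M)
    subset_sums = {sum(choice) for choice in product(*[(0, li) for li in ls])}
    return all(l in subset_sums for l in range(M + 1))

ls, ps = selector_probabilities(100)
print(ls)                                 # [50, 25, 13, 6, 3, 2, 1, 0], summing to 100
print(all_distributions_encodable(100))   # True
```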
Table 1. Experimental results for synthesising optimal deterministic multi-strategies

Name [param.s]          Param. values   States   Ctrl. states   Property                   Penalty   Time (s)
cloud [vm]              5                8,841          2,177   P≥0.9999 [ F deployed ]     0.001       9.08
                        6               34,953          8,705   P≥0.999 [ F deployed ]      0.01       72.44
android [r, s]          1, 48            2,305            997   R^time_≤10000 [ C ]         0.0009      0.58
                        2, 48            9,100          3,718   R^time_≤10000 [ C ]         0.0011     10.64
                        3, 48           23,137          9,025   R^time_≤10000 [ C ]         0.0013     17.34
mdsm [N]                3               62,245          9,173   P≤0.1 [ F deviated ]       52          50.97
                        3               62,245          9,173   P≤0.01 [ F deviated ]     186          15.84
investor [vinit, vmax]  5, 10           10,868          3,344   R^profit_≥4.98 [ C ]        1           3.32
                        10, 15          21,593          6,644   R^profit_≥8.99 [ C ]        1          18.99
team-form [N]           3               12,476          2,023   P≥0.9999 [ F done1 ]        0.8980      0.12
                        4               96,666         13,793   P≥0.9999 [ F done1 ]        0.704       2.26
cdmsn [N]               3                1,240            604   P≥0.9999 [ F prefer1 ]      2           0.46

The following result states that, by varying the granularity M , we can get
arbitrarily close to the optimal penalty for a randomised multi-strategy and, for
the case of static penalties, defines a suitable choice of M .
Theorem 7. Let θ be a sound multi-strategy. For any ε > 0, there is an M and
a sound multi-strategy θ′ of granularity M satisfying pen_t(ψ, θ′) − pen_t(ψ, θ) ≤ ε.
Moreover, for static penalties it suffices to take M = ⌈(Σ_{s∈S,a∈A(s)} ψ(s, a))/ε⌉.
5 Experimental Results
We have implemented our techniques within PRISM-games [9], an extension of
the PRISM model checker for performing model checking and strategy synthe-
sis on stochastic games. PRISM-games can thus already be used for (classical)
controller synthesis problems on stochastic games. To this, we add the ability
to synthesise multi-strategies using the MILP-based method described in Sec-
tion 4. Our implementation currently uses CPLEX to solve MILP problems. It
also supports SCIP and lp_solve, but in our experiments (run on a PC with a
1.7GHz i7 Core processor and 4GB RAM) these were slower in all cases.
We investigated the applicability and performance of our approach on a va-
riety of case studies, some of which are existing benchmark examples and some
of which were developed for this work. These are described in detail below and
the files used can be found online [29].
Deterministic Multi-strategy Synthesis. We first discuss the generation
of optimal deterministic multi-strategies, the results of which are summarised
in Table 1. In each row, we first give details of the model: the case study, any
parameters used, the number of states (|S|) and of controller states (|S♦ |). Then,
we show the property φ used, the penalty value of the optimal multi-strategy
and the time to generate it. Below, we give further details for each case study,
illustrating the variety of ways that permissive controller synthesis can be used.
cloud: We adapt a PRISM model from [6] to synthesise deployments of services
across virtual machines (VMs) in a cloud infrastructure. Our property φ specifies
that, with high probability, services are deployed to a preferred subset of VMs,
and we then assign unit (dynamic) penalties to all actions corresponding to
deployment on this subset. The resulting multi-strategy has very low expected
penalty (see Table 1) indicating that the goal φ can be achieved whilst the
controller experiences reduced flexibility only on executions with low probability.
android: We apply permissive controller synthesis to a model created for run-
time control of an Android application that provides real-time stock monitoring
(see [29] for details). We extend the application to use multiple data sources and
synthesise a multi-strategy which specifies an efficient runtime selection of data
sources (φ bounds the total expected response time). We use static penalties,
assigning higher values to actions that select the two most efficient data sources
at each time point and synthesise a multi-strategy that always provides a choice
of at least two sources (in case one becomes unavailable), while preserving φ.
mdsm: Microgrid demand-side management (MDSM) is a randomised scheme for
managing local energy usage. A stochastic game analysis [8] previously showed
it is beneficial for users to selfishly deviate from the protocol, by ignoring a
random back-off mechanism designed to reduce load at busy times. We synthesise
a multi-strategy for a (potentially selfish) user, with the goal (φ) of bounding
the probability of deviation (at either 0.1 or 0.01). The resulting multi-strategy
could be used to modify the protocol, restricting the behaviour of this user to
reduce selfish behaviour. To make the multi-strategy as permissive as possible,
restrictions are only introduced where necessary to ensure φ. We also guide where
restrictions are made by assigning (static) penalties at certain times of the day.
investor: This example [22] synthesises strategies for a futures market investor,
who chooses when to reserve shares, operating in a (malicious) market which can
periodically ban him from investing. We generate a multi-strategy that achieves
90% of the maximum expected profit (obtainable by a single strategy) and assign
(static) unit penalties to all actions, showing that, after an immediate share
purchase, the investor can choose his actions freely and still meet the 90% target.
team-form: This example [10] synthesises strategies for forming teams of agents
in order to complete a set of collaborative tasks. Our goal (φ) is to guarantee that
a particular task is completed with high probability (0.9999). We use (dynamic)
unit penalties on all actions of the first agent and synthesise a multi-strategy
representing several possibilities for this agent while still achieving the goal.
cdmsn: Lastly, we apply permissive controller synthesis to a model of a protocol
for collective decision making in sensor networks (CDMSN) [8]. We synthesise
strategies for nodes in the network such that consensus is achieved with high
probability (0.9999). We use (static) penalties inversely proportional to the en-
ergy associated with each action a node can perform to ensure that the multi-
strategy favours more efficient solutions.
Analysis. Unsurprisingly, permissive controller synthesis is slightly more costly
to execute than (classical) controller synthesis. But we successfully synthesised
Table 2. Experimental results for approximating optimal randomised multi-strategies

Name†       Param.s   States   Ctrl. states   Property                  Pen. (det.)   Pen. (randomised)
                                                                                      M=100     M=200     M=300
android     1, 1          49             10   P≥0.9999 [ F done ]         1.01          0.91      0.905     0.903
android     1, 10        481            112   P≥0.999 [ F done ]         19.13         18.14∗    17.73∗    17.58∗
cloud       5          8,841          2,177   P≥0.9999 [ F deployed ]     1             0.91      0.905     0.906∗
investor    5, 10     10,868          3,344   R^profit_≥4.98 [ C ]        1             1         1∗        0.996∗
team-form   3         12,476          2,023   P≥0.9999 [ F done1 ]      264           263.96    263.95    263.94∗

† See Table 1 for parameter names.
∗ Sound but possibly non-optimal multi-strategy obtained after 5 minute MILP time-out.

deterministic multi-strategies for a wide range of models and properties, with
model sizes ranging up to approximately 100,000 states. The performance and
scalability of our method is affected (as usual) by the state space size. But,
in particular, it is affected by the number of actions in controller states, since
these result in integer MILP variables, which are the most expensive part of the
solution. Performance is also sensitive to the penalty scheme used: for example,
states with all penalties equal to zero can be dealt with more efficiently.
Randomised Multi-strategy Synthesis. Finally, Table 2 presents results for
approximating optimal randomised multi-strategies on several models from Ta-
ble 1. We show the (static) penalty values for the generated multi-strategies for
3 different levels of precision (i.e. granularities M ; see Section 4.2) and compare
them to those of the deterministic multi-strategies for the same models.
The MILP encodings for randomised multi-strategies are larger than deter-
ministic ones and thus slower to solve, so we impose a time-out of 5 minutes. We
are able to generate a sound multi-strategy for all the examples; in some cases
it is optimally permissive, in others it is not (denoted by a ∗ in Table 2). As
would be expected, we generally observe smaller penalties with increasing values
of M . In the instance where this is not true (cloud, M =300), we attribute this to
the size of the MILP problem, which grows with M . For all examples, we built
randomised multi-strategies with smaller penalties than the deterministic ones.

6 Conclusions
We have presented a framework for permissive controller synthesis on stochastic
two-player games, based on generation of multi-strategies that guarantee a spec-
ified objective and are optimally permissive with respect to a penalty function.
We proved several key properties, developed MILP-based synthesis methods and
evaluated them on a set of case studies. Topics for future work include synthesis
for more expressive temporal logics and using history-dependent multi-strategies.
Acknowledgements. The authors are part supported by ERC Advanced Grant
VERIWARE and EPSRC projects EP/K038575/1 and EP/F001096/1.
References
1. Behrmann, G., Cougnard, A., David, A., Fleury, E., Larsen, K.G., Lime, D.:
UPPAAL-tiga: Time for playing games! In: Damm, W., Hermanns, H. (eds.) CAV
2007. LNCS, vol. 4590, pp. 121–125. Springer, Heidelberg (2007)
2. Bernet, J., Janin, D., Walukiewicz, I.: Permissive strategies: from parity games to
safety games. ITA 36(3), 261–275 (2002)
3. Bouyer, P., Duflot, M., Markey, N., Renault, G.: Measuring permissivity in finite
games. In: Bravetti, M., Zavattaro, G. (eds.) CONCUR 2009. LNCS, vol. 5710, pp.
196–210. Springer, Heidelberg (2009)
4. Bouyer, P., Markey, N., Olschewski, J., Ummels, M.: Measuring permissiveness in
parity games: Mean-payoff parity games revisited. In: Bultan, T., Hsiung, P.-A.
(eds.) ATVA 2011. LNCS, vol. 6996, pp. 135–149. Springer, Heidelberg (2011)
5. Calinescu, R., Ghezzi, C., Kwiatkowska, M., Mirandola, R.: Self-adaptive software
needs quantitative verification at runtime. CACM 55(9), 69–77 (2012)
6. Calinescu, R., Johnson, K., Kikuchi, S.: Compositional reverification of probabilis-
tic safety properties for large-scale complex IT systems. In: LSCITS (2012)
7. Canny, J.: Some algebraic and geometric computations in PSPACE. In: Proc.
STOC 1988, pp. 460–467. ACM, New York (1988)
8. Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: Automatic verifi-
cation of competitive stochastic systems. In: Flanagan, C., König, B. (eds.) TACAS
2012. LNCS, vol. 7214, pp. 315–330. Springer, Heidelberg (2012)
9. Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: PRISM-games: A
model checker for stochastic multi-player games. In: Piterman, N., Smolka, S.A.
(eds.) TACAS 2013. LNCS, vol. 7795, pp. 185–191. Springer, Heidelberg (2013)
10. Chen, T., Kwiatkowska, M., Parker, D., Simaitis, A.: Verifying team formation
protocols with probabilistic model checking. In: Leite, J., Torroni, P., Ågotnes,
T., Boella, G., van der Torre, L. (eds.) CLIMA XII 2011. LNCS, vol. 6814, pp.
190–207. Springer, Heidelberg (2011)
11. Chen, T., Kwiatkowska, M., Simaitis, A., Wiltsche, C.: Synthesis for multi-
objective stochastic games: An application to autonomous urban driving. In: Joshi,
K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) QEST 2013. LNCS, vol. 8054,
pp. 322–337. Springer, Heidelberg (2013)
12. Condon, A.: On algorithms for simple stochastic games. In: Advances in Compu-
tational Complexity Theory. DIMACS Series, vol. 13, pp. 51–73 (1993)
13. Draeger, K., Forejt, V., Kwiatkowska, M., Parker, D., Ujma, M.: Permissive con-
troller synthesis for probabilistic systems. Technical Report CS-RR-14-01, Depart-
ment of Computer Science, University of Oxford (2014)
14. Etessami, K., Kwiatkowska, M., Vardi, M., Yannakakis, M.: Multi-objective model
checking of Markov decision processes. LMCS 4(4), 1–21 (2008)
15. Filar, J., Vrieze, K.: Competitive Markov Decision Processes. Springer (1997)
16. Forejt, V., Kwiatkowska, M., Norman, G., Parker, D., Qu, H.: Quantitative multi-
objective verification for probabilistic systems. In: Abdulla, P.A., Leino, K.R.M.
(eds.) TACAS 2011. LNCS, vol. 6605, pp. 112–127. Springer, Heidelberg (2011)
17. Garey, M.R., Graham, R.L., Johnson, D.S.: Some NP-complete geometric problems.
In: STOC 1976, pp. 10–22. ACM, New York (1976)
18. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects of Computing 6(5), 512–535 (1994)
19. Kemeny, J., Snell, J., Knapp, A.: Denumerable Markov Chains. Springer (1976)
20. Kumar, R., Garg, V.: Control of stochastic discrete event systems modeled by
probabilistic languages. IEEE Trans. Automatic Control 46(4), 593–606 (2001)
21. Lahijanian, M., Wasniewski, J., Andersson, S., Belta, C.: Motion planning and
control from temporal logic specifications with probabilistic satisfaction guarantees.
In: Proc. ICRA 2010, pp. 3227–3232 (2010)
22. McIver, A., Morgan, C.: Results on the quantitative mu-calculus qMu. ACM Trans-
actions on Computational Logic 8(1) (2007)
23. Ozay, N., Topcu, U., Murray, R., Wongpiromsarn, T.: Distributed synthesis of
control protocols for smart camera networks. In: Proc. ICCPS 2011 (2011)
24. Puterman, M.: Markov Decision Processes: Discrete Stochastic Dynamic Program-
ming. John Wiley and Sons (1994)
25. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley & Sons
(1998)
26. Shankar, N.: A tool bus for anytime verification. In: Usable Verification (2010)
27. Steel, G.: Formal analysis of PIN block attacks. TCS 367(1-2), 257–270 (2006)
28. Wimmer, R., Jansen, N., Ábrahám, E., Becker, B., Katoen, J.-P.: Minimal critical
subsystems for discrete-time Markov models. In: Flanagan, C., König, B. (eds.)
TACAS 2012. LNCS, vol. 7214, pp. 299–314. Springer, Heidelberg (2012); Extended
version available as technical report SFB/TR 14 AVACS 88
29. http://www.prismmodelchecker.org/files/tacas14pcs/
Precise Approximations of the Probability
Distribution of a Markov Process in Time:
An Application to Probabilistic Invariance

Sadegh Esmaeil Zadeh Soudjani1 and Alessandro Abate2,1


1
Delft Center for Systems & Control, TU Delft, The Netherlands
2
Department of Computer Science, University of Oxford, UK
{S.EsmaeilZadehSoudjani,A.Abate}@tudelft.nl

Abstract. The goal of this work is to formally abstract a Markov process evolving over a general state space as a finite-state Markov chain,
with the objective of precisely approximating the state probability dis-
tribution of the Markov process in time. The approach uses a partition
of the state space and is based on the computation of the average tran-
sition probability between partition sets. In the case of unbounded state
spaces, a procedure for precisely truncating the state space within a com-
pact set is provided, together with an error bound that depends on the
asymptotic properties of the transition kernel of the Markov process. In
the case of compact state spaces, the work provides error bounds that
depend on the diameters of the partitions, and as such the errors can be
tuned. The method is applied to the problem of computing probabilistic
invariance of the model under study, and the result is compared to an
alternative approach in the literature.

1 Introduction

Verification techniques and tools for deterministic, discrete time, finite-state sys-
tems have been available for many years [9]. Formal methods in the stochastic
context are typically limited to discrete state structures, either in continuous or
in discrete time [3, 12]. Stochastic processes evolving over continuous (uncount-
able) spaces are often related to undecidable problems (the exception being
when they admit analytical solutions). It is thus of interest to resort to formal
approximation techniques that allow solving corresponding problems over finite
discretizations of the original models. In order to relate the approximate solu-
tions to the original problems, it is of interest to come up with precise bounds on
the error introduced by the approximations. The use of formal approximation
techniques for such complex models can be looked at from the perspective of the
research on abstraction techniques, which are of wide use in formal verification.
Successful numerical schemes based on Markov chain approximations of sto-
chastic systems in continuous time have been introduced in the literature, e.g.
[10]. However, the finite abstractions are only related to the original models
asymptotically (at the limit), with no explicit error bounds. This approach has

been applied to the approximate study of probabilistic reachability or safety of
stochastic hybrid models in [8, 15]. In [1] a technique has been introduced to
instead provide formal abstractions of discrete-time, continuous-space Markov
models [2], with the objective of investigating their probabilistic invariance
(safety) by employing probabilistic model checking over a finite Markov chain.
In view of scalability, the approach has been improved and optimized in [7].
In this work we show that the approach in [1, 7] can be successfully employed
to approximately compute the statistics in time of a stochastic process over
a continuous state space. This additionally leads to an alternative method for
probabilistic safety analysis of the process. We first provide a forward recursion
for the approximate computation of the state distribution of a Markov process
in time. The computation of the state distribution is based on a state-space
partitioning procedure, and on the abstraction of the Markov process as a finite-
state Markov chain. An upper bound on the error related to the approximation is
formally derived. Based on the information from the state distribution, we show
how the method can be used to approximately compute probabilistic invariance
(safety) for discrete-time stochastic systems over general state spaces.
Probabilistic safety is the dual problem to probabilistic reachability. Over
deterministic models reachability and safety have been vastly studied in the
literature, and computational algorithms and tools have been developed based
on both forward and backward reachability for these systems. Similarly, for the
probabilistic models under study, we compare the presented approach (based
on forward computations) with the existing approaches in the literature [1, 5–7]
(which hinge on backward computations), particularly in terms of the introduced
error.
The article is structured as follows. Section 2 introduces the model under study
and discusses some structural assumptions needed for the abstraction procedure.
The procedure comprises two separate parts: Section 3 describes the truncation
of the dynamics of the model, whereas Section 4 details the abstraction of the
dynamics (approximation of the transition kernel) – both parts formally assess
the associated approximation error. Section 5 discusses the application of the
procedure to the computation of probabilistic invariance, and compares it against
an alternative approach in the literature.

2 Model, Preliminaries, and Goal of This Work

We consider a discrete time Markov process M defined over a general state
space, which is characterized by a pair (S, Ts), where S is the continuous state
space that we assume endowed with a metric and Borel measurable. We denote
by (S, B(S), P) the probability structure on S, with B(S) being the associated
sigma algebra and P a probability measure to be characterized shortly. Ts is
a conditional stochastic kernel that assigns to each point s ∈ S a probability
measure Ts(·|s), so that for any measurable set A ∈ B(S), P(s(1) ∈ A | s(0) = s) = ∫_A Ts(ds̄|s). We assume that the stochastic kernel Ts admits a density
function ts, namely Ts(ds̄|s) = ts(s̄|s)ds̄.
Suppose that the initial state of the Markov process M is random and dis-
tributed according to the density function π0 : S → R≥0 . The state distribu-
tion of M at time t ∈ N ≐ {1, 2, 3, . . .} is characterized by a density function
πt : S → R≥0, which fully describes the statistics of the process at t and is in
particular such that, for all A ∈ B(S),

  P(s(t) ∈ A) = ∫_A πt(s)ds,

where the symbol P is loosely used to indicate the probability associated to events
over the product space S t+1 with elements s = [s(0), s(1), . . . , s(t)], whereas the
bold typeset is constantly used in the sequel to indicate vectors.
The state density functions πt(·) can be computed recursively, as follows:

  πt+1(s̄) = ∫_S ts(s̄|s)πt(s)ds   ∀s̄ ∈ S.   (1)

In practice the forward recursion in (1) rarely yields a closed form for the density
function πt+1 (·). A special instance where this is the case is represented by a
linear dynamical system perturbed by Gaussian process noise: due to the closure
property of the Gaussian distribution with respect to addition and multiplication
by a constant, it is possible to explicitly write recursive formulas for the mean and
the variance of the distribution, and thus express in a closed form the distribution
in time of the solution of the model. In more general cases, it is necessary to
numerically (hence, approximately) compute the density function of the model
in time.
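For the special linear-Gaussian instance just mentioned, the closed-form propagation amounts to two scalar recursions. The sketch below assumes a Gaussian initial state N(m0, v0) — an assumption made here so that the closure property applies; the running example below instead starts from a uniform distribution:

```python
def gaussian_propagation(a, b, sigma, m0, v0, N):
    # s(t+1) = a*s(t) + b + sigma*w(t), w(t) ~ N(0,1), s(0) ~ N(m0, v0):
    # the state stays Gaussian, with mean and variance evolving as below
    m, v = m0, v0
    moments = [(m, v)]
    for _ in range(N):
        m = a * m + b               # mean recursion
        v = a * a * v + sigma ** 2  # variance recursion
        moments.append((m, v))
    return moments                  # pi_t is the density of N(m_t, v_t)

print(gaussian_propagation(a=0.8, b=0.0, sigma=0.1, m0=0.5, v0=0.02, N=5))
```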
This article provides a numerical approximation of the density function of
M as the probability mass function (pmf) of a finite-state Markov chain Mf in
time. The Markov chain Mf is obtained as an abstraction of the concrete Markov
process M. The abstraction is associated with a guaranteed and tunable error
bound, and algorithmically it leverages a state-space partitioning procedure. The
procedure is comprised of two steps:

1. since the state space S is generally unbounded, it is first properly truncated;


2. subsequently, a partition of the truncated dynamics is introduced.

Section 3 discusses the error generated by the state-space truncation, whereas


Section 4 describes the construction of the Markov chain by state-space parti-
tioning. We employ the following example throughout the article as a running
case study.
Example 1. Consider the one-dimensional stochastic dynamical system

s(t + 1) = as(t) + b + σw(t),

where the parameters a, σ > 0, whereas b ∈ R, and such that w(·) is a process
comprised of independent, identically distributed random variables with a stan-
dard normal distribution. The initial state of the process is selected uniformly in
the bounded interval [β0 , γ0 ] ⊂ R. The solution of the model is a Markov process,
evolving over the state space S = R, and fully characterized by the conditional
density function

  ts(s̄|s) = φσ(s̄ − as − b),   where   φσ(u) = (1/(σ√(2π))) e^(−u²/(2σ²)).   □
We raise the following assumptions in order to be able to later relate the state
density function of M to the probability mass function of Mf .
Assumption 1. For given sets Γ ⊂ S² and Λ0 ⊂ S, there exist positive constants ϵ and ε0, such that ts(s̄|s) and π0(s) satisfy the following conditions:

  ts(s̄|s) ≤ ϵ   ∀(s, s̄) ∈ S²\Γ,   and   π0(s) ≤ ε0   ∀s ∈ S\Λ0.   (2)
Assumption 2. The density functions π0 (s) and ts (s̄|s) are (globally) Lipschitz
continuous, namely there exist finite constants λ0 , λf , such that the following
Lipschitz continuity conditions hold:
  |π0(s) − π0(s′)| ≤ λ0 ‖s − s′‖   ∀s, s′ ∈ Λ0,   (3)
  |ts(s̄|s) − ts(s̄′|s)| ≤ λf ‖s̄ − s̄′‖   ∀s, s̄, s̄′ ∈ S.   (4)
Moreover, there exists a finite constant Mf such that

  Mf = sup { ∫_S ts(s̄|s)ds  |  s̄ ∈ S }.   (5)

The Lipschitz constants λ0 , λf are effectively computed by taking partial deriva-


tives of the density functions π0 (·), ts (·|s) and maximizing its norm. The sets Λ0
and Γ will be used to truncate the support of density functions π0 (·) and ts (·|·),
respectively. Assumption 1 enables the precise study of the behavior of density
functions πt (·) over the truncated part of the state space. Further, the Lipschitz
continuity conditions in Assumption 2 are essential to derive error bounds re-
lated to the abstraction of the Markov process over the truncated state space.
In order to compute these error bounds, we assign the infinity norm to the space
of bounded measurable functions over the state space S, namely
  ‖f‖∞ = sup_{s∈S} |f(s)|   ∀f : S → R.

In the sequel the function IA (·) denotes the indicator function of a set A ⊆ S,
namely IA (s) = 1, if s ∈ A; else IA (s) = 0.
Example 1 (Continued). Select the interval Λ0 = [β0, γ0] and define the set Γ
by the linear inequality

  Γ = {(s, s̄) ∈ R² : |s̄ − as − b| ≤ ασ}.

The initial density function π0 of the process can be represented by the function
ψ0(s) = I_[β0,γ0](s)/(γ0 − β0).
Then Assumption 1 holds with ϵ = φ1(α)/σ and ε0 = 0. The constant Mf in
Assumption 2 is equal to 1/a. Lipschitz continuity, as per (3) and (4), holds for
constants λ0 = 0 and λf = 1/(σ²√(2πe)). □
3 State-Space Truncation Procedure, with Error Quantification

We truncate the support of the density functions π0, ts to the sets Λ0, Γ respectively, and recursively compute support sets Λt, as in (7), that are associated to
the density functions πt. Then we employ the quantities ϵ, ε0 in Assumption 1 to
compute error bounds εt, as in (6), on the value of the density functions πt outside the sets Λt. Finally we truncate the unbounded state space to Υ = ∪_{t=0}^N Λt.
As intuitive, the error related to the spatial truncation depends on the be-
havior of the conditional density function ts over the eliminated regions of the
state space. Suppose that sets Γ, Λ0 are selected such that Assumption 1 is satisfied with constants ϵ, ε0: then Theorem 2 provides an upper bound on the error
obtained from evaluating the density functions in time πt(·) over the truncated
regions of the state space.
Theorem 1. Under Assumption 1 the functions πt satisfy the bound

  0 ≤ πt(s) ≤ εt   ∀s ∈ S\Λt,

where the quantities {εt}_{t=0}^N are defined recursively by

  εt+1 = ϵ + Mf εt,   (6)

whereas the support sets {Λt}_{t=0}^N are computed as

  Λt+1 = Πs̄ (Γ ∩ (Λt × S)),   (7)

where Πs̄ denotes the projection map along the second set of coordinates¹.
Remark 1. Notice that if the shape of the sets Γ and Λ0 is computationally
manageable (e.g., polytopes) then it is possible to implement the computation
of the recursion in (7) by available software tools, such as the MPT toolbox [11].
Further, notice that if for some t0 , Λt0 +1 ⊃ Λt0 , then for all t ≥ t0 , Λt+1 ⊃ Λt .
Similarly, we have that
– if for some t0 , Λt0 +1 ⊂ Λt0 , then for all t ≥ t0 , Λt+1 ⊂ Λt .
– if for some t0 , Λt0 +1 = Λt0 , then for all t ≥ t0 , Λt = Λt0 .
To clarify the role of Γ in the computation of Λt , we emphasize that Λt+1 =
∪s∈Λt Ξ(s), where Ξ depends only on Γ and is defined by the set-valued map
Ξ : S → 2S , Ξ(s) = {s̄ ∈ S|(s, s̄) ∈ Γ }.
Figure 1 provides a visual illustration of the recursion in (7). 
Let us introduce a quantity κ(t, Mf), which plays a role in the solution of (6)
and will be frequently used shortly:

  κ(t, Mf) = (1 − Mf^t)/(1 − Mf)   if Mf ≠ 1,      κ(t, Mf) = t   if Mf = 1.   (8)

¹ Recall that both Γ and Λt × S are defined over S² = S × S.
The following theorem provides a truncation procedure, valid over a finite time
horizon {0, 1, . . . , N}, which reduces the state space S to the set Υ = ∪_{t=0}^N Λt.
The theorem also formally quantifies the associated truncation error.

Theorem 2. Suppose that the state space of the process M has been truncated
to the set Υ = ∪_{t=0}^N Λt. Let us introduce the following recursion to compute
functions μt : S → R≥0 as an approximation of the density functions πt:

  μt+1(s̄) = I_Υ(s̄) ∫_S ts(s̄|s)μt(s)ds,   μ0(s) = I_{Λ0}(s)π0(s)   ∀s̄ ∈ S.   (9)

Then the introduced approximation error is ‖πt − μt‖∞ ≤ εt, for all t ≤ N.

To recapitulate, Theorem 2 leads to the following procedure to approximate the
density functions πt of M over an unbounded state space S:
1. truncate π0 in such a way that μ0 has a bounded support Λ0 ;
2. truncate the conditional density function ts (·|s) over a bounded set for all
s ∈ S, then quantify Γ ⊂ S 2 as the support of the truncated density function;
3. leverage the recursion in (7) to compute the support sets Λt ;
4. use the recursion in (9) to compute the approximate density functions μt over
the set Υ = ∪_{t=0}^N Λt. Note that the recursion in (9) is effectively computed
over the set Υ , since μt (s) = 0 for all s ∈ S\Υ .
Note that we could as well deal with the support of μt (·) over the time-varying
sets Λt by adapting recursion (9) with IΛt+1 instead of IΥ . While employing the
(larger) set Υ may lead to a memory increase at each stage, it will considerably
simplify the computations of the state-space partitioning and the abstraction as
a Markov chain: indeed, employing time-varying sets Λt would render the parti-
tioning procedure also time-dependent, and the obtained Markov chain would be
time-inhomogeneous. We opt to work directly with Υ to avoid these difficulties.
Example 1 (Continued). We can easily obtain a closed form for the sets Λt = [βt, γt], via

  βt+1 = aβt + b − ασ,   γt+1 = aγt + b + ασ.

The set Υ is the union of the intervals [βt, γt]. The error of the state-space truncation
over the set Υ is

  ‖πt − μt‖∞ ≤ εt = κ(t, Mf)·φ1(α)/σ,   Mf = 1/a.   □
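These recursions are immediate to evaluate numerically; a small sketch (using the parameter values that are chosen later in this example, taken here as an assumption):

```python
import math

def phi1(u):                       # standard normal density
    return math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)

def truncation(a, b, sigma, beta0, gamma0, alpha, N):
    # Lambda_t = [beta_t, gamma_t];  eps_{t+1} = eps + Mf*eps_t with eps_0 = 0,
    # where eps = phi1(alpha)/sigma and Mf = 1/a (Example 1)
    eps_kernel, Mf = phi1(alpha) / sigma, 1.0 / a
    beta, gamma, eps = beta0, gamma0, 0.0
    sets, errors = [(beta, gamma)], [eps]
    for _ in range(N):
        beta = a * beta + b - alpha * sigma
        gamma = a * gamma + b + alpha * sigma
        eps = eps_kernel + Mf * eps
        sets.append((beta, gamma))
        errors.append(eps)
    return sets, errors

sets, errors = truncation(a=1.2, b=0.0, sigma=0.1, beta0=0.0, gamma0=1.0, alpha=2.4, N=5)
print(sets[-1], errors[-1])        # Lambda_5 is roughly [-1.79, 4.27]; eps_5 = kappa(5, 1/a)*phi1(alpha)/sigma
```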


4 State-Space Partitioning Procedure, with Error Quantification

In this section we assume that the sets Γ, Λ0 have been properly selected so that
Υ = ∪_{t=0}^N Λt is bounded. In order to formally abstract process M as a finite
Fig. 1. Graphical representation of the recursion in (7) for Λt

Markov chain Mf and to approximate its state density functions, we select a
finite partition of the bounded set Υ as Υ = ∪_{i=1}^n Ai, where the sets Ai have
non-trivial measure. We then complete the partition over the whole state space
S = ∪_{i=1}^{n+1} Ai by considering the set An+1 = S\Υ. This results in a finite Markov
chain Mf with n+1 discrete abstract states in the set Nn+1 ≐ {1, 2, · · · , n, n+1},
and characterized by the transition probability matrix P = [Pij] ∈ R^((n+1)²),
where the probability of jumping from any pair of states i to j (Pij) is computed as

  Pij = (1/L(Ai)) ∫_{Aj} ∫_{Ai} ts(s̄|s) ds ds̄   ∀i ∈ Nn,
  P(n+1)j = δ(n+1)j,                                        (10)
for all j ∈ Nn+1 , and where δ(n+1)j is the Kronecker delta function (the abstract
state n + 1 of Mf is absorbing), and L(·) denotes the Lebesgue measure of a
set. The quantities in (10) are well-defined since the set Υ is bounded and the
measures L(Ai ), i ∈ Nn , are finite and non-trivial.
Notice that matrix P for the Markov chain Mf is stochastic, namely

  Σ_{j=1}^{n+1} Pij = Σ_{j=1}^{n+1} (1/L(Ai)) ∫_{Aj} ∫_{Ai} ts(s̄|s) ds ds̄
                    = (1/L(Ai)) ∫_{Ai} ( Σ_{j=1}^{n+1} ∫_{Aj} ts(s̄|s) ds̄ ) ds
                    = (1/L(Ai)) ∫_{Ai} ∫_S ts(s̄|s) ds̄ ds = (1/L(Ai)) ∫_{Ai} ds = 1.

The initial distribution of Mf is> the pmf p0 = [p0 (1), p0 (2), . . . , p0 (n+1)], and it
is obtained from π0 as p0 (i) = Ai π0 (s)ds, ∀i ∈ Nn+1 . Then the pmf associated
to the state distribution of Mf at time t can be computed as pt = p0 P t .
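For the running example, this abstraction can be put together with a few lines of numerical code. The sketch below uses a uniform partition of an interval containing Υ (the interval and grid size are chosen by hand here) and, as a simplification with respect to (10), approximates the average over each source cell Ai by evaluating the Gaussian kernel at the midpoint of Ai; numpy and scipy are assumed to be available:

```python
import numpy as np
from scipy.stats import norm

def abstract_chain(a, b, sigma, lo, hi, delta):
    # Uniform partition of [lo, hi] (an interval containing Upsilon) into cells A_1..A_n.
    # P[i, j] approximates (1/L(A_i)) * int_{A_j} int_{A_i} t_s(sbar|s) ds dsbar by
    # replacing the average over A_i with the kernel evaluated at the midpoint of A_i.
    edges = np.arange(lo, hi + delta, delta)
    mids = 0.5 * (edges[:-1] + edges[1:])
    n = len(mids)
    P = np.zeros((n + 1, n + 1))
    for i, s in enumerate(mids):
        cdf = norm.cdf(edges, loc=a * s + b, scale=sigma)  # kernel is N(a*s + b, sigma^2)
        P[i, :n] = cdf[1:] - cdf[:-1]
        P[i, n] = 1.0 - P[i, :n].sum()   # mass leaving Upsilon goes to state n+1
    P[n, n] = 1.0                        # abstract state n+1 is absorbing
    return mids, P

mids, P = abstract_chain(a=0.8, b=0.0, sigma=0.1, lo=-1.0, hi=1.5, delta=0.05)
# uniform initial density on [0, 1], discretised on the same partition: p0(i) = int_{A_i} pi_0
p0 = np.append(((mids >= 0.0) & (mids <= 1.0)).astype(float), 0.0)
p0 /= p0.sum()
p5 = p0 @ np.linalg.matrix_power(P, 5)   # p_t = p_0 P^t
psi5 = p5[:-1] / 0.05                    # values of the piece-wise constant density in (11)
```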
It is intuitive that the discrete pmf pt of the Markov chain Mf approximates
the continuous density function πt of the Markov process M. In the rest of the
section we show how to formalize this relationship: pt is used to construct an
approximation function, denoted by ψt, of the density function πt. Theorem 3
shows that ψt is a piece-wise constant approximation (with values that are the
entries of the pmf pt normalized by the Lebesgue measure of the associated
partition set) of the original density function πt . Moreover, under the continuity
assumption in (4) (ref. Lemma 1) we can establish the Lipschitz continuity of
πt , which enables the quantification in Theorem 3 of the error of its piece-wise
constant approximation.

Lemma 1. Suppose that the inequality in (4) holds. Then the state density functions πt(·) are globally Lipschitz continuous with constant λf for all t ∈ N:

  |πt(s) − πt(s′)| ≤ λf ‖s − s′‖   ∀s, s′ ∈ S.

Theorem 3. Under Assumptions 1 and 2, the functions πt(·) can be approximated by piece-wise constant functions ψt(·), defined as

  ψt(s) = Σ_{i=1}^n (pt(i)/L(Ai)) I_{Ai}(s)   ∀t ∈ N,   (11)

where I_B(·) is the indicator function of a set B ⊂ S. The approximation error
is upper-bounded by the quantity

  ‖πt − ψt‖∞ ≤ εt + Et   ∀t ∈ N,   (12)

where Et can be recursively computed as

  Et+1 = Mf Et + λf δ,   E0 = λ0 δ,   (13)

and δ is an upper bound on the diameters of the partition sets {Ai}_{i=1}^n, namely
δ = sup{‖s − s′‖ : s, s′ ∈ Ai, i ∈ Nn}.

Note that the functions ψt are defined over the whole state space S, but (11)
implies that they are equal to zero outside the set Υ .
Corollary 1. The recursion in (13) admits the explicit solution

  Et = [κ(t, Mf) λf + Mf^t λ0] δ,

where κ(t, Mf) is introduced in (8).


Underlying Theorem 3 is the fact that ψt(·) are in general sub-stochastic
density functions:

  ∫_S ψt(s)ds = ∫_S Σ_{i=1}^n (pt(i)/L(Ai)) I_{Ai}(s) ds = Σ_{i=1}^n (pt(i)/L(Ai)) ∫_S I_{Ai}(s) ds
              = Σ_{i=1}^n (pt(i)/L(Ai)) L(Ai) = Σ_{i=1}^n pt(i) = 1 − pt(n + 1) ≤ 1.
This is clearly due to the fact that we are operating on the dynamics of M
truncated over the set Υ . It is thus intuitive that the approximation procedure
and the derived error bounds are also valid for the case of sub-stochastic density
functions, namely

  ∫_S ts(s̄|s)ds̄ ≤ 1   ∀s ∈ S,   ∫_S π0(s)ds ≤ 1,

the only difference being that the obtained Markov chain Mf is as well sub-
stochastic.
Further, whenever the Lipschitz continuity requirement on the initial density
function, as per (3) in Assumption 2, does not hold, (for instance, this is the
case when the initial state of the process is deterministic) we can relax this
continuity assumption on the initial distribution of the process by starting the
discrete computation from the time step t = 1. In this case we define the pmf
p1 = [p1(1), p1(2), . . . , p1(n + 1)], where

  p1(i) = ∫_{Ai} ∫_S ts(s̄|s)π0(s) ds ds̄   ∀i ∈ Nn+1,

and derive pt = p1 P t−1 for all t ∈ N. Theorem 3 follows along similar lines,
except for eqn. (13), where the initial error is set to E0 = 0 and the time-
dependent terms Et can be derived as Et = κ(t, Mf )λf δ.
It is important to emphasize the computability of the derived errors and
the fact that they can be tuned. Further, in order to attain abstractions that
are practically useful, it is imperative to seek improvements on the derived error
bounds: in particular, the approximation errors can be computed locally (under
corresponding local Lipschitz continuity assumptions), following the procedures
discussed in [7].
Example 1 (Continued). The error of the proposed Markov chain abstraction can
be expressed as

  ‖πt − ψt‖∞ ≤ κ(t, Mf)·( δ/(σ²√(2πe)) + φ1(α)/σ ),   Mf = 1/a.

The error can be tuned in two distinct ways:
1. by selecting larger values for α, which on the one hand leads to a less conservative truncation, but on the other requires the partition of a larger interval;
2. by reducing the partition diameter δ, which of course results in a larger cardinality of the partition sets.
Let us select values b = 0, β0 = 0, γ0 = 1, σ = 0.1, and time horizon N = 5.
For a = 1.2 we need to partition the interval Υ = [−0.75α, 2.49 + 0.75α], which
results in the error ‖πt − ψt‖∞ ≤ 86.8δ + 35.9φ1(α) for all t ≤ N. For a = 0.8 we
need to partition the smaller interval Υ = [−0.34α, 0.33 + 0.34α], which results
in the error ‖πt − ψt‖∞ ≤ 198.6δ + 82.1φ1(α) for all t ≤ N. In the case of a = 1.2,
we partition a larger interval and obtain a smaller error, while for a = 0.8 we
partition a smaller interval with correspondingly a larger error. It is obvious that


the parameters δ, α can be chosen properly to ensure that a certain error precision
is met. This simple model admits a solution in closed form, and its state density
functions can be obtained as the convolution of a uniform distribution (the
contribution of initial state) and a zero-mean Gaussian distribution with time-
dependent variance (the contributions of the process noise). Figure 2 displays the
original and the approximated state density functions for the set of parameters
α = 2.4, δ = 0.05. 
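The numerical coefficients quoted above can be reproduced directly from κ(t, Mf) and the constants of this example; a short sanity-check sketch:

```python
import math

def kappa(t, Mf):
    return t if Mf == 1 else (1 - Mf ** t) / (1 - Mf)

def error_coefficients(a, sigma, N):
    # ||pi_t - psi_t||_inf <= kappa(N, 1/a) * ( delta/(sigma^2*sqrt(2*pi*e)) + phi1(alpha)/sigma )
    k = kappa(N, 1.0 / a)
    return k / (sigma ** 2 * math.sqrt(2 * math.pi * math.e)), k / sigma

print(error_coefficients(a=1.2, sigma=0.1, N=5))  # approximately (86.8, 35.9)
print(error_coefficients(a=0.8, sigma=0.1, N=5))  # approximately (198.6, 82.1)
```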

[Two plots: state densities ψt(s) and πt(s), for t = 1, . . . , 5, over s, for a = 1.2 (left) and a = 0.8 (right).]

Fig. 2. Piece-wise constant approximation of the state density function ψt(·), compared
to the actual function πt(·) (derived analytically) for a = 1.2 (left) and a = 0.8 (right)

5 Application of the Formal Approximation Procedure to the Probabilistic Invariance Problem
The problem of probabilistic invariance (safety) for general Markov processes has
been theoretically characterized in [2] and further investigated computationally
in [1, 4–6]. With reference to a discrete-time Markov process M over a continuous
state space S, and a safe set A ∈ B(S), the goal is to quantify the probability

  p^N_s(A) = P{s(t) ∈ A, for all t ∈ [0, N] | s(0) = s}.

More generally, it is of interest to quantify the probability p^N_{π0}(A), where the
initial condition of the process s(0) is a random variable characterized by the
density function π0 (·). We present a forward computation of probabilistic in-
variance by application of the approximation procedure above in Section 5.1,
then review results on backward computation [1, 4–6] in Section 5.2. Section 5.3
compares the two approaches.

5.1 Forward Computation of Probabilistic Invariance

The approach for approximating the density function of a process in time can
be easily employed for the approximate computation of probabilistic invariance.
Define sub-density functions Wt : S → R≥0, characterized by

  Wt+1(s̄) = I_A(s̄) ∫_S Wt(s)ts(s̄|s)ds,   W0(s̄) = I_A(s̄)π0(s̄)   ∀s̄ ∈ S.   (14)
Then the solution of the problem is obtained as p^N_{π0}(A) = ∫_S WN(s)ds. A com-
parison of the recursions in (14) and in (9) reveals how probabilistic invariance
can be computed as a special case of the approximation procedure. In apply-
ing the procedure, the only difference consists in replacing set Υ by A, and in
restricting Assumption 2 to hold over the safe set (the solution over the com-
plement of this set is trivially known) – in this case the error related to the
truncation of the state space can be disregarded. The procedure consists in par-
titioning the safe set, in constructing the Markov chain Mf as per (10), and in
computing ψt(·) as an approximation of Wt(·) based on (11). The error of this
approximation is ‖Wt − ψt‖∞ ≤ Et, which results in the following:

  | p^N_{π0}(A) − ∫_A ψN(s)ds | ≤ EN L(A) = κ(N, Mf) λf δ L(A) ≐ Ef.

Note that these sub-density functions satisfy the inequalities

  1 ≥ ∫_S W0(s)ds ≥ ∫_S W1(s)ds ≥ · · · ≥ ∫_S WN(s)ds ≥ 0.
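Continuing the earlier sketch (same hypothetical midpoint-based construction, now with the partition restricted to the safe set, and with a uniform initial distribution over A assumed purely for illustration), the discrete analogue of recursion (14) is a plain matrix-vector iteration:

```python
import numpy as np
from scipy.stats import norm

def forward_invariance(a, b, sigma, lo, hi, delta, N):
    # A = [lo, hi]; sub-stochastic matrix over a uniform partition of A, with the
    # kernel evaluated at the midpoint of each source cell (as in the earlier sketch)
    edges = np.arange(lo, hi + delta, delta)
    mids = 0.5 * (edges[:-1] + edges[1:])
    P = np.zeros((len(mids), len(mids)))
    for i, s in enumerate(mids):
        cdf = norm.cdf(edges, loc=a * s + b, scale=sigma)
        P[i, :] = cdf[1:] - cdf[:-1]
    w = np.full(len(mids), delta / (hi - lo))   # W_0: uniform initial density over A
    for _ in range(N):
        w = w @ P                               # discrete analogue of recursion (14)
    return w.sum()                              # approximates p^N_{pi_0}(A)

print(forward_invariance(a=0.8, b=0.0, sigma=0.1, lo=0.0, hi=1.0, delta=0.005, N=10))
```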

5.2 Backward Computation of Probabilistic Invariance

The contributions in [1, 4–6] have characterized specifications in PCTL with
a formulation based on backward recursions. In particular, the computation of
probabilistic invariance is obtained via the value functions Vt : S → [0, 1], which
are characterized as
?
Vt (s) = IA (s) Vt+1 (s̄)ts (s̄|s)ds̄, VN (s) = IA (s) ∀s ∈ S. (15)
S
>
The desired probabilistic invariance is expressed as pN π0 (A) = S V0 (s)π0 (s)ds.
The value functions always map the state space to the interval [0, 1] and they
are non-increasing, Vt (s) ≤ Vt+1 (s) for any fixed s ∈ S. [1, 4–6] discuss efficient
algorithms for the approximate computation of the quantity p^N_{π0}(A), relying on
different assumptions on the model under study. The easiest and most straight-
forward procedure is based on the following assumption [1].
Assumption 3. The conditional density function of the process is globally Lips-
chitz continuous within the safe set with respect to the conditional state. Namely,
there exists a finite constant λb , such that

  |t(s̄|s) − t(s̄|s′)| ≤ λb ‖s − s′‖   ∀s, s′, s̄ ∈ A.

A finite constant Mb is introduced as Mb = sup_{s∈A} ∫_A ts(s̄|s)ds̄ ≤ 1.
The procedure introduces a partition of the safe set A = ∪_{i=1}^n Ai and extends
it to S = ∪_{i=1}^{n+1} Ai, with An+1 = S\A. Then it selects arbitrary representative
points si ∈ Ai and constructs a finite-state Markov chain Mb over the finite
state space {s1, s2, . . . , sn+1}, endowed with transition probabilities

  P(si, sj) = ∫_{Aj} ts(s̄|si)ds̄,   P(sn+1, sj) = δ(n+1)j,   (16)

for all i ∈ Nn, j ∈ Nn+1. The error of such an approximation is [5]:

  Eb = κ(N, Mb) λb δ L(A),

where δ is the maximum partition diameter and L(A) is the Lebesgue measure of the set A.
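For comparison, a corresponding sketch of the backward recursion (15) over the chain Mb of (16), again with the midpoints of a hypothetical uniform partition taken as the (arbitrary) representative points:

```python
import numpy as np
from scipy.stats import norm

def backward_invariance(a, b, sigma, lo, hi, delta, N):
    # A = [lo, hi]; representative points s_i are the cell midpoints;
    # P[i, j] = int_{A_j} t_s(sbar | s_i) dsbar, as in (16)
    edges = np.arange(lo, hi + delta, delta)
    reps = 0.5 * (edges[:-1] + edges[1:])
    P = np.zeros((len(reps), len(reps)))
    for i, s in enumerate(reps):
        cdf = norm.cdf(edges, loc=a * s + b, scale=sigma)
        P[i, :] = cdf[1:] - cdf[:-1]
    V = np.ones(len(reps))            # V_N = 1 on the safe set (0 outside, hence dropped)
    for _ in range(N):
        V = P @ V                     # discrete analogue of recursion (15)
    return reps, V                    # V[i] approximates p^N_{s_i}(A)

reps, V = backward_invariance(a=0.8, b=0.0, sigma=0.1, lo=0.0, hi=1.0, delta=0.005, N=10)
```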

5.3 Comparison of the Two Approaches


We first compare the two constructed Markov chains. The Markov chain in the
forward approach Mf is a special case of Markov chain of backward approach
Mb , where the representative points can be selected intelligently to determine
the average probability of jumping from one partition set to another. More specif-
ically, the quantities (10) are a special case of those in (16) (based on mean value
theorem for integration). We claim that this leads to a less conservative (smaller)
error bound for the approximation.
The forward computation is in general more informative than the backward
computation since it provides not only the solution of the safety problem in time,
but also the state distribution over the safe set. Further the forward approach
may provide some insight to the solution of the infinite-horizon safety problem
[16, 17], for a given initial distribution. Finally, the forward approach can be
used to approximate the solution of the safety problem over unbounded safe sets,
while boundedness of the safe set is required in all the results in the literature
that are based on backward computations.
Next, we compare errors and related assumptions. The error computations
rely on different assumptions: the Lipschitz continuity of the conditional density
function with respect to the current or to the next states, respectively. Further,
the constants Mf and Mb are generally different and play an important role in
the error. Mb represents the maximum probability of remaining inside the safe
set, while Mf is an indication of the maximum concentration at one point over
a single time-step of the process evolution. Mb is always less than or equal to
one, while Mf could be any finite positive number.
Example 1 (Continued). The constants λf, Mf and λb, Mb for the one-dimensional dynamical system of Example 1 are

  λf = 1/(σ²√(2πe)),   λb = a·λf,   Mf = 1/a,   Mb ≤ 1.
If 0 < a < 1 the system trajectories converge to an equilibrium point (in ex-
pected value). In this case the system state gets higher chances of being in the
neighborhood of the equilibrium in time and the backward recursion provides a
better error bound. If a > 1 the system trajectories tend to diverge with time. In
this case the forward recursion provides a much better error bound, compared
to the backward recursion.
For the numerical simulation we select a safety set A = [0, 1], a noise level
σ = 0.1, and a time horizon N = 10. The solution of the safety problem for the
two cases a = 1.2 and a = 0.8 is plotted in Figure 3. We have computed constants
λf = 24.20, Mb = 1 in both cases, while λb = 29.03, Mf = 0.83 for the first case,
and λb = 19.36, Mf = 1.25 for the second case. We have selected the center
of the partition sets (distributed uniformly over the set A) as representative
points for Markov chain Mb . In order to compare the two approaches, we have
assumed the same computational effort (related to the same partition size of
δ = 0.7 × 10−4 ), and have obtained an error Ef = 0.008, Eb = 0.020 for a =
1.2 and Ef = 0.056, Eb = 0.014 for a = 0.8. The simulations show that the
forward approach works better for a = 1.2, while the backward approach is
better suitable for a = 0.8. Note that the approximate solutions provided by
the two approaches are very close: the difference of the transition probabilities
computed via the Markov chains Mf , Mb are in the order of 10−8 , and the
difference in the approximate solutions (black curve in Figure 3) is in the order
of 10−6 . This has been due to the selection of very fine partition sets that have
resulted in small abstraction errors. 

[Two plots: p_s(A) as a function of s for a = 1.2 (left) and a = 0.8 (right), each showing the approximate solution together with the forward and backward error bands.]

Fig. 3. Approximate solution of the probabilistic invariance problem (black line), together with error intervals of forward (red band) and backward (blue band) approaches,
for a = 1.2 (left) and a = 0.8 (right)

Remark 2. Over deterministic models, [13] compares forward and backward
reachability analysis and provides insights on their differences: the claim is that
for systems with significant contraction, forward reachability is more effective than
backward reachability because of numerical stability issues. On the other hand, for
the probabilistic models under study, the result indicates that under Lipschitz con-
tinuity of the transition kernel the backward approach is more effective in systems
with convergence in the state distribution. If we treat deterministic systems as spe-
cial (limiting) instances of stochastic systems, our result is not contradicting with
[13] since the Lipschitz continuity assumption on the transition kernels of proba-
bilistic models does not hold over deterministic ones. 
Motivated by the previous case study we study how convergence properties of a
Markov process are related to the constant Mf .
Theorem 4. Assume that the initial density function π0 (s) is bounded and that
the constant Mf is finite and Mf < 1. If the state space is unbounded, the
sequence of density functions {πt (s)|t ≥ 0} uniformly exponentially converges to
zero. The sequence of probabilities P{s(t) ∈ A} and the corresponding solution
of the safety problem for any compact safe set A exponentially converge to zero.
Theorem 4 indicates that under the invoked assumptions the probability “spreads
out” over the unbounded state space as time progresses. Moreover, the theorem en-
sures the absence of absorbing sets [16, 17], which are indeed known to characterize
the solution of infinite-horizon properties. Example 2 studies the relationship be-
tween constant Mf and the stability of linear stochastic difference equations.

Example 2. Consider the stochastic linear difference equations

  x(t + 1) = Ax(t) + w(t),   x(·), w(·) ∈ R^n,
where w(·) are i.i.d. random vectors with known distributions. For such systems
Mf = 1/| det A|, then the condition Mf < 1 implies instability of the system in
expected value. Equivalently, mean-stability of the system implies Mf ≥ 1. Note
that for this class of systems Mf > 1 does not generally imply stability, since
det A is only the product of the eigenvalues of the system. 
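The value Mf = 1/|det A| can be seen with a change of variables in the definition (5) (a worked step added here for clarity): for this system ts(s̄|s) = tw(s̄ − As), so that for every fixed s̄,

  ∫_{R^n} ts(s̄|s) ds = ∫_{R^n} tw(s̄ − As) ds = (1/|det A|) ∫_{R^n} tw(u) du = 1/|det A|,

via the substitution u = s̄ − As, whose Jacobian has determinant ±det A. Since this value is uniform in s̄, the supremum in (5) equals 1/|det A|.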
The Lipschitz constants λf and λb have a different nature. Example 3 clarifies
this point.
Example 3. Consider the dynamical system
s(t + 1) = f (s(t), w(t)),
where w(·) are i.i.d. with known distribution tw (·). Suppose that the vector
field f : R^n × R^n → R^n is continuously differentiable and that the matrix ∂f/∂w
is invertible. Then the implicit function theorem guarantees the existence and
uniqueness of a function g : Rn × Rn → Rn such that w(t) = g(s(t + 1), s(t)).
The conditional density function of the system in this case is [14]:

  ts(s̄|s) = | det( ∂g/∂s̄ (s̄, s) ) | · tw(g(s̄, s)).
The Lipschitz constants λf, λb are specified by the dependence of the function g(s̄, s)
on the variables s̄, s, respectively. As a special case, the invertibility of ∂f/∂w is
guaranteed for systems with additive process noise, namely f(s, w) = fa(s) + w.
Then g(s̄, s) = s̄ − fa(s), λf is the Lipschitz constant of tw(·), while λb is the
product of the Lipschitz constants of tw(·) and of fa(·). □

References
1. Abate, A., Katoen, J.-P., Lygeros, J., Prandini, M.: Approximate model checking
of stochastic hybrid systems. European Journal of Control 6, 624–641 (2010)
2. Abate, A., Prandini, M., Lygeros, J., Sastry, S.: Probabilistic reachability and
safety for controlled discrete time stochastic hybrid systems. Automatica 44(11),
2724–2734 (2008)
3. Baier, C., Katoen, J.-P., Hermanns, H.: Approximate symbolic model checking of
continuous-time Markov chains (Extended abstract). In: Baeten, J.C.M., Mauw, S.
(eds.) CONCUR 1999. LNCS, vol. 1664, pp. 146–162. Springer, Heidelberg (1999)
4. Esmaeil Zadeh Soudjani, S., Abate, A.: Adaptive gridding for abstraction and veri-
fication of stochastic hybrid systems. In: Proceedings of the 8th International Con-
ference on Quantitative Evaluation of Systems, Aachen, DE, pp. 59–69 (September
2011)
5. Esmaeil Zadeh Soudjani, S., Abate, A.: Higher-Order Approximations for Verifica-
tion of Stochastic Hybrid Systems. In: Chakraborty, S., Mukund, M. (eds.) ATVA
2012. LNCS, vol. 7561, pp. 416–434. Springer, Heidelberg (2012)
6. Esmaeil Zadeh Soudjani, S., Abate, A.: Probabilistic invariance of mixed
deterministic-stochastic dynamical systems. In: ACM Proceedings of the 15th In-
ternational Conference on Hybrid Systems: Computation and Control, Beijing,
PRC, pp. 207–216 (April 2012)
7. Esmaeil Zadeh Soudjani, S., Abate, A.: Adaptive and sequential gridding proce-
dures for the abstraction and verification of stochastic processes. SIAM Journal on
Applied Dynamical Systems 12(2), 921–956 (2013)
8. Koutsoukos, X., Riley, D.: Computational methods for reachability analysis of sto-
chastic hybrid systems. In: Hespanha, J.P., Tiwari, A. (eds.) HSCC 2006. LNCS,
vol. 3927, pp. 377–391. Springer, Heidelberg (2006)
9. Kurshan, R.P.: Computer-Aided Verification of Coordinating Processes: The
Automata-Theoretic Approach. Princeton Series in Computer Science. Princeton
University Press (1994)
10. Kushner, H.J., Dupuis, P.G.: Numerical Methods for Stochastic Control Problems
in Continuous Time. Springer, New York (2001)
11. Kvasnica, M., Grieder, P., Baotić, M.: Multi-parametric toolbox, MPT (2004)
12. Kwiatkowska, M., Norman, G., Segala, R., Sproston, J.: Verifying quantitative
properties of continuous probabilistic timed automata. In: Palamidessi, C. (ed.)
CONCUR 2000. LNCS, vol. 1877, pp. 123–137. Springer, Heidelberg (2000)
13. Mitchell, I.M.: Comparing forward and backward reachability as tools for safety
analysis. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007. LNCS,
vol. 4416, pp. 428–443. Springer, Heidelberg (2007)
14. Papoulis, A.: Probability, Random Variables, and Stochastic Processes, 3rd edn.
McGraw-Hill (1991)
15. Prandini, M., Hu, J.: Stochastic reachability: Theory and numerical approximation.
In: Cassandras, C.G., Lygeros, J. (eds.) Stochastic Hybrid Systems. Automation
and Control Engineering Series, vol. 24, pp. 107–138. Taylor & Francis Group/CRC
Press (2006)
16. Tkachev, I., Abate, A.: On infinite-horizon probabilistic properties and stochastic
bisimulation functions. In: Proceedings of the 50th IEEE Conference on Decision
and Control and European Control Conference, Orlando, FL, pp. 526–531 (De-
cember 2011)
17. Tkachev, I., Abate, A.: Characterization and computation of infinite-horizon spec-
ifications over Markov processes. Theoretical Computer Science 515, 1–18 (2014)
SACO: Static Analyzer for Concurrent Objects

Elvira Albert1, Puri Arenas1, Antonio Flores-Montoya2, Samir Genaim1,
Miguel Gómez-Zamalloa1, Enrique Martin-Martin1,
German Puebla3, and Guillermo Román-Díez3
1 Complutense University of Madrid (UCM), Spain
2 Technische Universität Darmstadt (TUD), Germany
3 Technical University of Madrid (UPM), Spain

Abstract. We present the main concepts, usage and implementation of
SACO, a static analyzer for concurrent objects. Interestingly, SACO is
able to infer both liveness (namely termination and resource bounded-
ness) and safety properties (namely deadlock freedom) of programs based
on concurrent objects. The system integrates auxiliary analyses such as
points-to and may-happen-in-parallel, which are essential for increasing
the accuracy of the aforementioned more complex properties. SACO pro-
vides accurate information about the dependencies which may introduce
deadlocks, loops whose termination is not guaranteed, and upper bounds
on the resource consumption of methods.

1 Introduction
With the trend towards parallel systems and the emergence of multi-core computing, the construction of tools that help in analyzing and verifying the behaviour
of concurrent programs has become fundamental. Concurrent programs contain
several processes that work together to perform a task and communicate with
each other. Communication can be programmed using shared variables or mes-
sage passing. When shared variables are used, one process writes into a variable
that is read by another; when message passing is used, one process sends a mes-
sage that is received by another. Shared memory communication is typically
implemented using low-level concurrency and synchronization primitives. Such programs are in general more difficult to write, debug, and analyze, while their main advantage is efficiency. The message passing model uses higher-level concurrency
constructs that help in producing concurrent applications in a less error-prone
way and also more modularly. Message passing is the essence of actors [1], the
concurrency model used in concurrent objects [9], in Erlang, and in Scala.
This paper presents the SACO system, a Static Analyzer for Concurrent Objects. Essentially, each concurrent object is a monitor and allows at most
one active task to execute within the object. Scheduling among the tasks of
an object is cooperative, or non-preemptive, such that the active task has to
release the object lock explicitly (using the await instruction). Each object has
an unbounded set of pending tasks. When the lock of an object is free, any task
in the set of pending tasks can grab the lock and start executing. When the
result of a call is required by the caller to continue executing, the caller and the
callee methods can be synchronized by means of future variables, which act as
proxies for results initially unknown, as their computations are still incomplete.
The figure below overviews the main components of SACO, whose distinguish-
ing feature is that it infers both liveness and safety properties.
[Figure: overview of SACO's components. Input: the program and the analysis parameters. Auxiliary analyses: points-to (pp) and may-happen-in-parallel (MHP, pairs), plus size analysis. Advanced analyses: deadlock (no/cycles), termination (yes/scc), and resources (unknown/bound). Visualizer: Eclipse plug-in and web interface.]
SACO receives as input a program and a selection of the analysis parameters.
Then it performs two auxiliary analyses: points-to and may-happen-in-parallel
(MHP), which are used for inferring the more complex properties in the next
phase. As regards liveness, we infer termination as well as resource bound-
edness, i.e., find upper bounds on the resource consumption of methods. Both
analyses require the inference of size relations, which are gathered in a previous
step. Regarding safety, we infer deadlock freedom, i.e., there is no state in which
a non-empty set of tasks cannot progress because all tasks are waiting for the
termination of other tasks in the set, or otherwise we show the tasks involved
in a potential deadlock set. Finally, SACO can be used from a command line
interface, a web interface, and an Eclipse plugin. It can be downloaded and/or
used online from its website http://costa.ls.fi.upm.es/saco.
2 Auxiliary Analyses
We describe the auxiliary analyses used in SACO by means of the example below:
 1  class PrettyPrinter{
 2    void showIncome(Int n){...}
 3    void showCoin(){...}
 4  } //end class
 5  class VendingMachine{
 6    Int coins;
 7    PrettyPrinter out;
 8    void insertCoins(Int n){
 9      Fut<void> f;
10      while (n>0){
11        n=n−1;
12        f=this ! insertCoin();
13        await f?; }
14    }
15    void insertCoin(){
16      coins=coins+1;
17    }
18    Int retrieveCoins(){
19      Fut<void> f;
20      Int total=0;
21      while (coins>0){
22        coins=coins−1;
23        f=out ! showCoin();
24        await f?;
25        total=total+1; }
26      return total;
27    }
28  } //end class
29  //main method
30  main(Int n){
31    PrettyPrinter p;
32    VendingMachine v;
33    Fut<Int> f;
34    p=new PrettyPrinter();
35    v=new VendingMachine(0,p);
36    v ! insertCoins(n);
37    f=v ! retrieveCoins();
38    await f?;
39    Int total=f.get;
40    p!showIncome(total);
41  }

We have a class PrettyPrinter to display some information and a class VendingMachine
with methods to insert a number of coins and to retrieve all coins. The main
method executes on the object This, which is the initial object, and receives
as parameter the number of coins to be inserted. Besides This, two other concur-
rent objects are created at Line 34 (L34 for short) and L35. Objects can be seen as
buffers in which tasks are posted and that execute in parallel. In particular, two
tasks are posted at L36 and L37 on object v. insertCoins executes asynchronously
on v. However, the await at L38 synchronizes the execution of This with the com-
pletion of the task retrieveCoins in v by means of the future variable f. Namely, at
the await, if the task spawned at L37 has not finished, the processor is released and
any available task on the This object could take it. The result of the execution of
retrieveCoins is obtained by means of the blocking get instruction which blocks the
execution of This until the future variable f is ready. In general, the use of get can
introduce deadlocks. In this case, the await at L38 ensures that retrieveCoins has
finished and thus the execution will not block.
Points-to Analysis. Inferring the set of memory locations to which a reference
variable may point-to is a classical analysis in object-oriented languages. In
SACO we follow Milanova et al. [11] and abstract objects by the sequence of
allocation sites of all objects that lead to its creation. E.g., if we create an
object o1 at program point pp1 , and afterwards call a method of o1 that creates
an object o2 at program point pp2 , then the abstract representation of o2 is
pp1 .pp2 . In order to ensure termination of the inference process, the analysis is
parametrized by k, the maximal length of these sequences. In the example, for
any k ≥ 2, assuming that the allocation site of the This object is ψ, the points-to
analysis abstracts v and out to ψ.35 and ψ.34, respectively. For k = 1, they would
be abstracted to 35 and 34. As variables can be reused, the information that
the analysis gives is specified at the program point level. Basically, the analysis results are defined by a function P(op, pp, v) which, for a given (abstract) object op, a program point pp, and a variable v, returns the set of abstract objects to which v may point. For instance, P(ψ, 36, v) = 35 should be read as: when
executing This and instruction L36 is reached, variable v points to an object
whose allocation site is 35. Besides, we can trivially use the analysis results to find out to which task a future variable f points, i.e., P(op, pp, f) = o.m, where o is an abstract object and m a method name, e.g., P(ψ, 37, f) = 35.retrieveCoins.
Points-to analysis allows making any analysis object-sensitive [11]. In addition, in
SACO we use it: (1) in the resource analysis in order to know to which object the
cost must be attributed, and (2) in the deadlock analysis, where the abstraction
of future variables above is used to spot dependencies among tasks.
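To make the abstraction concrete, the following sketch (hypothetical data structures, not SACO's implementation) shows k-limited allocation-site sequences and a points-to table in the spirit of P(op, pp, v):

  K = 2                                    # maximal length of allocation-site sequences

  def abstract_new(creator, site):
      """Abstract object created at allocation site `site` by code running on `creator`."""
      return (creator + (site,))[-K:]      # keep only the last K allocation sites

  psi = ("psi",)                           # abstraction of the initial object This
  v   = abstract_new(psi, 35)              # -> ('psi', 35), i.e. the object created at L35
  out = abstract_new(psi, 34)              # -> ('psi', 34)

  # Points-to results at the program-point level (here with k = 2, so objects are
  # abstracted as psi.35 and psi.34); futures are abstracted as object.method.
  P = {
      (psi, 36, "v"): {v},
      (psi, 37, "f"): {(v, "retrieveCoins")},
  }
  print(P[(psi, 37, "f")])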
May-Happen-in-Parallel. An MHP analysis [10,3] provides a safe approximation
of the set of pairs of statements that can execute in parallel across several objects,
or in an interleaved way within an object. MHP allows ensuring the absence of data races, i.e., of situations in which several objects access the same data in parallel and at least one of them modifies such data. Also, it is crucial for improving the accuracy of deadlock,
termination and resource analysis. The MHP analysis implemented in SACO [3]
can be understood as a function MHP(op , pp) which returns the set of program
points that may happen in parallel with pp when executing in the abstract object
op . A remarkable feature of our analysis is that it performs a local analysis of meth-
ods followed by a composition of the local results, and it has a polynomial complex-
ity. In our example, SACO infers that the execution of showIncome (L2) cannot hap-
pen in parallel with any instruction in retrieveCoins (L18–L27), since retrieveCoins
must be finished in the await at L38. Similarly, it also reveals that showCoin (L3)
cannot happen in parallel with showIncome. On the other hand, SACO detects that
the await (L24) and the assignment (L16) may happen in parallel. This could be a
problem for the termination of retrieveCoins, as the shared variable coins that con-
trols the loop may be modified in parallel, but our termination analysis can over-
come this difficulty. Since the result of the MHP analysis refines the control-flow,
we could also consider applying the MHP and points-to analyses continuously to
refine the results of each other. In SACO we apply them only once.
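As an illustration of how the MHP result is queried (a hypothetical representation of the analysis output, not SACO's API), the sketch below checks which updates of the shared variable coins may interleave with the await at L24:

  mhp_pairs = {frozenset({16, 24})}        # inferred: L16 may happen in parallel with L24
  updates   = {"coins": [16, 22]}          # program points that write the field `coins`

  def interfering_updates(field, point):
      """Updates of `field` that may happen in parallel with program point `point`."""
      return [u for u in updates[field] if frozenset({u, point}) in mhp_pairs]

  print(interfering_updates("coins", 24))  # -> [16]: only insertCoin's update interferes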

3 Advanced Analyses
Termination Analysis. The main challenge is in handling shared-memory con-
current programs. When execution interleaves from one task to another, the
shared-memory may be modified by the interleaved task. The modifications can
affect the behavior of the program and change its termination behavior and its
resource consumption. Inspired by the rely-guarantee principle used for com-
positional verification and analysis [12,5] of thread-based concurrent programs,
SACO incorporates a novel termination analysis for concurrent objects [4] which
assumes a property on the global state in order to prove termination of a loop
and then proves that this property holds. The property to prove is the finiteness of the shared data involved in the termination proof, i.e., proving that such shared memory is updated a finite number of times. Our analysis is based on a
circular style of reasoning since the finiteness assumptions are proved by proving
termination of the loops in which that shared-memory is modified. Crucial for
accuracy is the use of the information inferred by the MHP analysis which allows
us to restrict the set of program points on which the property has to be proved
to those that may actually interleave its execution with the considered loop.
Consider the function retrieveCoins from Sec. 2. At the await (L24) the value
of the shared variable coins may change, since other tasks may take the object’s
lock and modify coins. In order to prove termination, the analysis first assumes
that coins is updated a finite number of times. Under this assumption the loop is
terminating because eventually the value of coins will stop being updated by other
tasks, and then it will decrease at each iteration of the loop. The second step is
to prove that the assumption holds, i.e., that the instructions updating coins are
executed a finite number of times. The only update instruction that may happen
in parallel with the await is in insertCoin (L16), which is called from insertCoins and
this from main. Since these three functions are terminating (their termination
can be proved without any assumption), the assumption holds and therefore
retrieveCoins terminates. Similarly, the analysis can prove the termination of the
other functions, thus proving the whole program terminating.
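The circular argument can be sketched as follows (hypothetical summaries, not SACO's implementation): each loop is proved terminating under the assumption that its interfering updates, as restricted by MHP, are executed finitely often, and that assumption is discharged by proving termination of the tasks containing those updates.

  # Loop summaries: has a local ranking function, and the program points that may
  # interleave with it and update its shared data (from the MHP analysis).
  loops = {
      "retrieveCoins": {"has_local_rank": True, "interfering_points": [16]},
      "insertCoins":   {"has_local_rank": True, "interfering_points": []},
  }
  containing_task = {16: "insertCoins"}    # which task contains each interfering update

  def terminates(loop, assumed=frozenset()):
      if loop in assumed:                  # circular assumption: the finiteness of the
          return True                      # interleaved updates is what is being proved
      info = loops[loop]
      if not info["has_local_rank"]:
          return False
      return all(terminates(containing_task[p], assumed | {loop})
                 for p in info["interfering_points"])

  print(all(terminates(l) for l in loops))   # -> True: the whole program terminates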

Resource Analysis. SACO can measure different types of costs (e.g., number
of execution steps, memory created, etc.) [2]. In the output, it returns upper
bounds on the worst-case cost of executing the concurrent program. The results
of our termination analysis provide useful information for cost: if the program
is terminating then the size of all data is bounded (we use x+ to refer to the
maximal value for x). Thus, we can give cost bounds in terms of the maximum
and/or minimum values that the involved data can reach. Still, we need novel
techniques to infer upper bounds on the number of iterations of loops whose
execution might interleave with instructions that update the shared memory.
SACO incorporates a novel approach which is based on the combination of local
ranking functions (i.e., ranking functions obtained by ignoring the concurrent
interleaving behaviors) with upper bounds on the number of visits to the in-
structions which update the shared memory. As in termination, the function
MHP is used to restrict the set of points whose visits have to be counted to
those that indeed may interleave.
Consider again the loop inside retrieveCoins. Ignoring concurrent interleavings,
a local ranking function RF = coins is easily computed. In order to obtain an
upper bound on the number of iterations considering interleavings, we need to
calculate the number of visits to L16, the only instruction that updates coins and that may happen in parallel with the await in L24. We need to add the number of visits to L16 for every
path of calls reaching it, in this case main–insertCoins–insertCoin only. By applying
the analysis recursively we obtain that L16 is visited n times. Combining the
local ranking function and the number of visits to L16 we obtain that an upper
bound on the number of iterations of the loop in retrieveCoins is coins+ ∗ n.
Finally, we use the results of points-to analysis to infer the cost at the level of
the distributed components (i.e., the objects). Namely, we give an upper bound
of the form c(ψ)∗(...) + c(35)∗(coins+ ∗ n ...) + c(34)∗(...) which distinguishes the
cost attributed to each abstract object o by means of its associated marker c(o).
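The way the bound is assembled can be sketched as follows (symbolic bounds as plain strings; a hypothetical simplification, not SACO's cost expressions):

  def loop_bound(local_rank_bound, visits_bound=None):
      """Iteration bound: the local ranking function bound, multiplied by the number
      of visits to the interfering update points when the loop can be interleaved."""
      return f"{local_rank_bound} * {visits_bound}" if visits_bound else local_rank_bound

  print(loop_bound("coins+", "n"))   # retrieveCoins: -> "coins+ * n"
  print(loop_bound("n"))             # insertCoins: no interference, bound is "n"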

Deadlock Analysis. The combination of non-blocking (await) and blocking (get) mechanisms to access futures may give rise to complex deadlock situations.
SACO provides a rigorous formal analysis which ensures deadlock freedom, as
described in [6]. Similarly to other deadlock analyses, our analysis is based on
constructing a dependency graph which, if acyclic, guarantees that the program
is deadlock free. In order to construct the dependency graph, we use points-to
analysis to identify the set of objects and tasks created along any execution.
[Figure: dependency graph for the example, with object nodes main, v35, and p34, task nodes v35.insertCoins, v35.insertCoin, v35.retrieveCoins, and p34.showCoin, solid edges labelled with program points 13, 24, and 38, and dotted ownership edges from tasks to their objects.]
Given this information, the construction of the graph is done by a traversal of the program in which we analyze await and get instructions in order to detect possible deadlock situations. However, without further temporal information, our dependency graphs would be extremely imprecise. The crux of our analysis is the use of the MHP analysis which allows us to label the
dependency graph with the program points of the synchronization instructions
that introduce the dependencies and, thus, that may potentially induce dead-
locks. In a post-process, we discard infeasible cycles in which the synchronization
instructions involved in the circular dependency cannot happen in parallel. The
dependency graph for our example is shown above. Circular nodes represent
objects and squares tasks. Solid edges are tagged with the program point that
generated them (await or get instructions). Dotted edges go from each task to
their objects indicating ownership. In our example, there are no cycles in the
graph. Thus, the program is deadlock free.
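The construction and filtering of cycles can be sketched as follows (a hypothetical encoding of the example's graph, not SACO's data structures): a cycle is only reported when all the synchronization points labelling it may pairwise happen in parallel.

  from collections import defaultdict

  edges = [  # (from, to, program point of the await/get that created the dependency)
      ("main", "v35.retrieveCoins", 38),
      ("v35.insertCoins", "v35.insertCoin", 13),
      ("v35.retrieveCoins", "p34.showCoin", 24),
  ]
  mhp_pairs = set()                        # here no pair of these points may interleave

  def feasible_cycles(edges, mhp_pairs):
      succ = defaultdict(list)
      for a, b, pp in edges:
          succ[a].append((b, pp))
      cycles = []
      def walk(start, node, labels, seen):   # naive enumeration, fine for tiny graphs
          for nxt, pp in succ[node]:
              if nxt == start:
                  cycles.append(labels + [pp])
              elif nxt not in seen:
                  walk(start, nxt, labels + [pp], seen | {nxt})
      for n in list(succ):
          walk(n, n, [], {n})
      return [c for c in cycles              # discard cycles whose labels cannot interleave
              if all(frozenset({p, q}) in mhp_pairs
                     for i, p in enumerate(c) for q in c[i + 1:])]

  print(feasible_cycles(edges, mhp_pairs))   # -> []: no feasible cycles, deadlock free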
4 Related Tools and Conclusions


We have presented a powerful static analyzer for an actor-based concurrency model, which has lately been regaining much attention due to its adoption in Erlang, Scala, and concurrent objects (e.g., there are libraries in Java implementing concurrent objects). As regards related tools, there is another tool [7] which
performs deadlock analysis of concurrent objects but, unlike SACO, it does not
rely on MHP and points-to analyses. We refer to [3,6] for detailed descriptions
on the false positives that our tool can give. Regarding termination, we only
know of the TERMINATOR tool [8] for thread-based concurrency. As far as we
know, there are no other cost analyzers for imperative concurrent programs.

Acknowledgments. This work was funded partially by the EU project FP7-
ICT-610582 ENVISAGE: Engineering Virtualized Services (http://www.envisage-
project.eu) and by the Spanish projects TIN2008-05624 and TIN2012-38137.

References
1. Agha, G.A.: Actors: A Model of Concurrent Computation in Distributed Systems.
MIT Press, Cambridge (1986)
2. Albert, E., Arenas, P., Genaim, S., Gómez-Zamalloa, M., Puebla, G.: Cost Analysis
of Concurrent OO Programs. In: Yang, H. (ed.) APLAS 2011. LNCS, vol. 7078,
pp. 238–254. Springer, Heidelberg (2011)
3. Albert, E., Flores-Montoya, A.E., Genaim, S.: Analysis of May-Happen-in-Parallel
in Concurrent Objects. In: Giese, H., Rosu, G. (eds.) FMOODS/FORTE 2012.
LNCS, vol. 7273, pp. 35–51. Springer, Heidelberg (2012)
4. Albert, E., Flores-Montoya, A., Genaim, S., Martin-Martin, E.: Termination and
Cost Analysis of Loops with Concurrent Interleavings. In: Van Hung, D., Ogawa,
M. (eds.) ATVA 2013. LNCS, vol. 8172, pp. 349–364. Springer, Heidelberg (2013)
5. Cook, B., Podelski, A., Rybalchenko, A.: Proving Thread Termination. In: PLDI
2007, pp. 320–330. ACM (2007)
6. Flores-Montoya, A.E., Albert, E., Genaim, S.: May-Happen-in-Parallel Based Dead-
lock Analysis for Concurrent Objects. In: Beyer, D., Boreale, M. (eds.) FMOODS/-
FORTE 2013. LNCS, vol. 7892, pp. 273–288. Springer, Heidelberg (2013)
7. Giachino, E., Laneve, C.: Analysis of Deadlocks in Object Groups. In: Bruni, R.,
Dingel, J. (eds.) FMOODS/FORTE 2011. LNCS, vol. 6722, pp. 168–182. Springer,
Heidelberg (2011)
8. http://research.microsoft.com/enus/um/cambridge/projects/terminator/
9. Johnsen, E.B., Hähnle, R., Schäfer, J., Schlatte, R., Steffen, M.: ABS: A Core
Language for Abstract Behavioral Specification. In: Aichernig, B.K., de Boer, F.S.,
Bonsangue, M.M. (eds.) FMCO 2011. LNCS, vol. 6957, pp. 142–164. Springer,
Heidelberg (2011)
10. Lee, J.K., Palsberg, J.: Featherweight X10: A Core Calculus for Async-Finish Par-
allelism. In: PPoPP 2010, pp. 25–36. ACM (2010)
11. Milanova, A., Rountev, A., Ryder, B.G.: Parameterized Object Sensitivity for
Points-to Analysis for Java. ACM Trans. Softw. Eng. Methodol. 14, 1–41 (2005)
12. Popeea, C., Rybalchenko, A.: Compositional Termination Proofs for Multi-
threaded Programs. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS,
vol. 7214, pp. 237–251. Springer, Heidelberg (2012)
VeriMAP: A Tool for Verifying Programs
through Transformations

Emanuele De Angelis1,*, Fabio Fioravanti1,
Alberto Pettorossi2, and Maurizio Proietti3
1 DEC, University ‘G. D'Annunzio', Pescara, Italy
{emanuele.deangelis,fioravanti}@unich.it
2 DICII, University of Rome Tor Vergata, Rome, Italy
[email protected]
3 IASI-CNR, Rome, Italy
[email protected]
* Supported by the National Group of Computing Science (GNCS-INDAM).

Abstract. We present VeriMAP, a tool for the verification of C pro-
grams based on the transformation of constraint logic programs, also
called constrained Horn clauses. VeriMAP makes use of Constraint Logic
Programming (CLP) as a metalanguage for representing: (i) the opera-
tional semantics of the C language, (ii) the program, and (iii) the prop-
erty to be verified. Satisfiability preserving transformations of the CLP
representations are then applied for generating verification conditions
and checking their satisfiability. VeriMAP has an interface with various
solvers for reasoning about constraints that express the properties of
the data (in particular, integers and arrays). Experimental results show
that VeriMAP is competitive with respect to state-of-the-art tools for
program verification.

1 The Transformational Approach to Verification


Program verification techniques based on Constraint Logic Programming (CLP),
or equivalently constrained Horn clauses (CHC), have gained increasing popularity in recent years [2,4,8,17]. Indeed, CLP has been shown to be a powerful,
flexible metalanguage for specifying the program syntax, the operational seman-
tics, and the proof rules for many different programming languages and program
properties. Moreover, the use of the CLP-based techniques allows one to enhance
the reasoning capabilities provided by Horn clause logic by taking advantage of
the many special purpose solvers that are available for various data domains,
such as integers, arrays, and other data structures.
Several verification tools, such as ARMC [18], Duality [15], ELDARICA [12],
HSF [7], TRACER [13], μZ [11], implement reasoning techniques within CLP
(or CHC) by following approaches based on interpolants, satisfiability modulo
theories, counterexample-guided abstraction refinement, and symbolic execution
of CLP programs.
Our tool for program verification, called VeriMAP, is based on transformation
techniques for CLP programs [3,4,19]. The current version of VeriMAP can
be used for verifying safety properties of C programs that manipulate integers and arrays. We assume that: (i) a safety property of a program P is defined by a pair ⟨ϕinit, ϕerror⟩ of formulas, and (ii) safety holds iff no execution of P starting from an initial configuration that satisfies ϕinit terminates in a final configuration that satisfies ϕerror.
From the CLP representation of the given C program and of the property,
VeriMAP generates a set of verification conditions (VC’s) in the form of CLP
clauses. The VC generation is performed by a transformation that consists in
specializing (with respect to the given C program and property) a CLP program
that defines the operational semantics of the C language and the proof rules
for verifying safety. Then, the CLP program made out of the generated VC’s
is transformed by applying unfold/fold transformation rules [5]. This transfor-
mation ‘propagates’ the constraints occurring in the CLP clauses and derives
equisatisfiable, easier to analyze VC’s. During constraint propagation VeriMAP
makes use of constraint solvers for linear (integer or rational) arithmetic and
array formulas. In a subsequent phase the transformed VC’s are processed by a
lightweight analyzer that basically consists in a bounded unfolding of the clauses.
Since safety is in general undecidable, the analyzer may not be able to detect the
satisfiability or the unsatisfiability of the VC’s and, if this is the case, the verifi-
cation process continues by iterating the transformation and the propagation of
the constraints in the VC’s.
The main advantage of the transformational approach to program verifica-
tion over other approaches is that it allows one to construct highly parametric,
configurable verification tools. In fact, one could modify VeriMAP so as to deal
with other programming languages, different language features, and different
properties to be proved. This modification can be done by reconfiguring the in-
dividual modules of the tool, and in particular, (i) by replacing the CLP clauses
that define the language semantics and proof rules, (ii) by designing a suitable
strategy for specializing the language semantics and proof rules so as to automat-
ically generate the VC’s for any given program and property, (iii) by designing
suitable strategies for transforming the VC’s by plugging-in different constraint
solvers and replacement rules (which are clause rewriting rules) depending on
the theories of the data structures that are used, (iv) by replacing the lightweight
analyzer currently used in VeriMAP by other, more precise analyzers available
for CLP programs. These module reconfigurations may require considerable ef-
fort (and this is particularly true for the design of the strategies of Point (iii)),
but then, by composing the different module versions we get, we will have at our
disposal a rich variety of powerful verification procedures.
Another interesting feature of the transformational approach is that at each
step of the transformation, we get a set of VC’s which is equisatisfiable with
respect to the initial set. This feature allows us both (i) to compose together
various verification strategies, each one being expressed by a sequence of trans-
formations, and (ii) to use VeriMAP as a front-end for other verifiers (such as
those we have mentioned above) that can take as input VC’s in the form of
CLP clauses. Finally, the use of satisfiability preserving transformations eases
the task of guaranteeing that VeriMAP computes sound results, as the soundness
of the transformation rules can be proved once and for all, before performing
any verification using VeriMAP.

2 The VeriMAP Tool: Architecture and Usage


Architecture. The VeriMAP tool consists of three modules (see Figure 1).
(1) A C-to-CLP Translator (C2CLP) that constructs a CLP encoding of the
C program and of the property given as input. C2CLP first translates the given
C program into CIL, the C Intermediate Language of [16]. (2) A Verification
Conditions Generator (VCG) that generates a CLP program representing the
VC’s for the given program and property. The VCG module takes as input
also the CLP representations of the operational semantics of CIL and of the
proof rules for establishing safety. (3) An Iterated Verifier (IV ) that attempts
to determine whether or not the VC’s are satisfiable by iteratively applying
unfold/fold transformations to the input VC’s, and analyzing the derived VC’s.

[Fig. 1. The VeriMAP architecture: the C-to-CLP Translator takes the C program and the property; the Verification Conditions Generator combines them with the CIL Interpreter and the Proof Rules; the Iterated Verifier (Unfold/Fold Transformer and Analyzer) answers true/false/unknown; the Transformation Strategies are built from Unfolding Operators, Generalization Operators, Constraint Solvers (over a Constraint Domain), and Replacement Rules (over a Data Theory).]
The C2CLP module is based on a modified version of the CIL tool [16]. This
module first parses and type-checks the input C program, annotated with the
property to be verified, and then transforms it into an equivalent program writ-
ten in CIL that uses a reduced set of language constructs. During this transfor-
mation, in particular, commands that use while’s and for’s are translated into
equivalent commands that use if-then-else’s and goto’s. This transformation step
simplifies the subsequent processing steps. Finally, C2CLP generates as output
the CLP encoding of the program and of the property by running a custom
implementation of the CIL visitor pattern [16]. In particular, for each program
command, C2CLP generates a CLP fact of the form at(L, C), where C and L
represent the command and its label, respectively. C2CLP also constructs the
clauses for the predicates phiInit and phiError representing the formulas ϕinit
and ϕerror that specify the safety property.
The VCG module generates the VC’s for the given program and property
by applying a program specialization technique based on equivalence preserving
unfold/fold transformations of CLP programs [5]. Similarly to what has been
proposed in [17], the VCG module specializes the interpreter and the proof rules
with respect to the CLP representation of the program and safety property gen-
erated by C2CLP (that is, the clauses defining at, phiInit, and phiError).
The output of the specialization process is the CLP representation of the VC’s.
This specialization process is said to ‘remove the interpreter’ in the sense that
it removes every reference to the predicates used in the CLP definition of the
interpreter in favour of new predicates corresponding to (a subset of) the ‘pro-
gram points’ of the original C program. Indeed, the structure of the call-graph
of the CLP program generated by the VCG module corresponds to that of the
control-flow graph of the C program.
The IV module consists of two submodules: (i) the Unfold/Fold Transformer,
and (ii) the Analyzer. The Unfold/Fold Transformer propagates the constraints
occurring in the definition of phiInit and phiError through the input VC’s
thereby deriving a new, equisatisfiable set of VC’s. The Analyzer checks the
satisfiability of the VC’s by performing a lightweight analysis. The output of this
analysis is either (i) true, if the VC’s are satisfiable, and hence the program is
safe, or (ii) false, if the VC’s are unsatisfiable, and hence the program is unsafe
(and a counterexample may be extracted), or (iii) unknown, if the lightweight
analysis is unable to determine whether or not the VC’s are satisfiable. In this
last case the verification continues by iterating the propagation of constraints
by invoking again the Unfold/Fold Transformer submodule. At each iteration,
the IV module can also apply a Reversal transformation [4], with the effect of
reversing the direction of the constraint propagation (either from phiInit to
phiError or vice versa, from phiError to phiInit).
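The control flow of the Iterated Verifier can be sketched as follows (the helper functions are hypothetical placeholders, not VeriMAP's actual API):

  # Hypothetical placeholders for the corresponding VeriMAP components:
  def unfold_fold_transform(vcs, direction):   # propagate constraints through the VC's
      return vcs

  def lightweight_analysis(vcs):               # bounded unfolding of the clauses
      return "unknown"

  def reverse(direction):
      return "backward" if direction == "forward" else "forward"

  def iterated_verifier(vcs, max_iterations=10):
      direction = "forward"                    # from phiInit towards phiError
      for _ in range(max_iterations):
          vcs = unfold_fold_transform(vcs, direction)   # equisatisfiable, easier VC's
          answer = lightweight_analysis(vcs)            # 'true' | 'false' | 'unknown'
          if answer in ("true", "false"):
              return answer                    # program proved safe / unsafe
          direction = reverse(direction)       # Reversal: swap the propagation direction
      return "unknown"

  print(iterated_verifier(vcs=["...VC clauses..."]))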
The VCG and IV modules are realized by using MAP [14], a transformation
engine for CLP programs (written in SICStus Prolog), with suitable concrete
versions of Transformation Strategies. There are various versions of the transfor-
mation strategies which, as indicated in [4], are defined in terms of: (i) Unfold-
ing Operators, which guide the symbolic evaluation of the VC’s, by controlling
the expansion of the symbolic execution trees, (ii) Generalization Operators [6],
which guarantee termination of the Unfold/Fold Transformer and are used (to-
gether with widening and convex-hull operations) for the automatic discovery of
loop invariants, (iii) Constraint Solvers, which check satisfiability and entailment
within the Constraint Domain at hand (for example, the integers or the ratio-
nals), and (iv) Replacement Rules, which guide the application of the axioms
and the properties of the Data Theory under consideration (like, for example,
the theory of arrays), and their interaction with the Constraint Domain.
Usage. VeriMAP can be downloaded from http://map.uniroma2.it/VeriMAP
and can be run by executing the following command: ./VeriMAP program.c,
where program.c is the C program annotated with the property to be verified.
VeriMAP has options for applying custom transformation strategies and for
exiting after the execution of the C2CLP or VCG modules, or after the execution
of a given number of iterations of the IV module.
3 Experimental Evaluation

We have experimentally evaluated VeriMAP on several benchmark sets. The first
benchmark set for our experiments consisted of 216 safety verification problems
of C programs acting on integers (179 of which are safe, and the remaining 37
are unsafe). None of these programs deal with arrays. Most problems have been
taken from the TACAS 2013 Software Verification Competition [1] and from the
benchmark sets of other tools used in software model checking, like DAGGER [9],
TRACER [13] and InvGen [10]. The size of the input programs ranges from a
dozen to about five hundred lines of code.
In Table 1 we summarize the verification results obtained by VeriMAP and the
following three state-of-the-art CLP-based software model checkers for C pro-
grams: (i) ARMC [18], (ii) HSF(C) [7], and (iii) TRACER [13] using the strongest
postcondition (SPost) and the weakest precondition (WPre) options.

Table 1. Verification results using VeriMAP, ARMC, HSF(C), and TRACER. Time
is in seconds. The time limit for timeout is five minutes. (∗) These errors are due to
incorrect parsing, or excessive memory requirements, or similar other causes.

                     VeriMAP      ARMC    HSF(C)  TRACER (SPost)  TRACER (WPre)
correct answers          185       138       160              91            103
  safe problems          154       112       138              74             85
  unsafe problems         31        26        22              17             18
incorrect answers          0         9         4              13             14
  missed bugs              0         1         1               0              0
  false alarms             0         8         3              13             14
errors (∗)                 0        18         0              20             22
timeout                   31        51        52              92             77
total time          10717.34  15788.21  15770.33        27757.46       23259.19
average time           57.93    114.41     98.56          305.03         225.82

The results of the experiments show that our approach is competitive with
state-of-the-art verifiers. Besides the above benchmark set, we have used Ver-
iMAP on a small benchmark set of verification problems of C programs acting
on integers and arrays. These problems include programs for computing the
maximum elements of arrays and programs for performing array initialization,
array copy, and array search. Also for this benchmark, the results we have ob-
tained show that our transformational approach is effective and quite efficient
in practice.
All experiments have been performed on an Intel Core Duo E7300 2.66Ghz
processor with 4GB of memory running GNU/Linux, using a time limit of five
minutes. The source code of all the verification problems we have considered is
available at http://map.uniroma2.it/VeriMAP.
4 Future Work

The current version of VeriMAP deals with safety properties of a subset of the
C language where, in particular, pointers and recursive procedures do not occur.
Moreover, the user is only allowed to configure the transformation strategies by
choosing among some available submodules for unfolding, generalization, con-
straint solving, and replacement rules (see Figure 1). Future work will be devoted
to make VeriMAP a more flexible tool so that the user may configure other pa-
rameters, such as: (i) the programming language and its semantics, (ii) the class
of properties and their proof rules (thus generalizing an idea proposed in [8]),
and (iii) the theory of the data types in use, including those for dynamic data
structures, such as lists and heaps.

References
1. Beyer, D.: Second Competition on Software Verification (SV-COMP 2013). In:
Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 594–609.
Springer, Heidelberg (2013)
2. Bjørner, N., McMillan, K., Rybalchenko, A.: On solving universally quantified Horn
clauses. In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 105–
125. Springer, Heidelberg (2013)
3. De Angelis, E., Fioravanti, F., Pettorossi, A., Proietti, M.: Verification of impera-
tive programs by constraint logic program transformation. In: SAIRP 2013, Elec-
tronic Proceedings in Theoretical Computer Science, vol. 129, pp. 186–210 (2013)
4. De Angelis, E., Fioravanti, F., Pettorossi, A., Proietti, M.: Verifying Programs via
Iterated Specialization. In: PEPM 2013, pp. 43–52. ACM (2013)
5. Fioravanti, F., Pettorossi, A., Proietti, M.: Transformation rules for locally strat-
ified constraint logic programs. In: Bruynooghe, M., Lau, K.-K. (eds.) Program
Development in Computational Logic. LNCS, vol. 3049, pp. 291–339. Springer,
Heidelberg (2004)
6. Fioravanti, F., Pettorossi, A., Proietti, M., Senni, V.: Generalization strategies for
the verification of infinite state systems. Theory and Practice of Logic Program-
ming 13(2), 175–199 (2013)
7. Grebenshchikov, S., Gupta, A., Lopes, N.P., Popeea, C., Rybalchenko, A.: HSF(C):
A software verifier based on Horn clauses. In: Flanagan, C., König, B. (eds.) TACAS
2012. LNCS, vol. 7214, pp. 549–551. Springer, Heidelberg (2012)
8. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing soft-
ware verifiers from proof rules. In: PLDI 2012, pp. 405–416. ACM (2012)
9. Gulavani, B.S., Chakraborty, S., Nori, A.V., Rajamani, S.K.: Automatically re-
fining abstract interpretations. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS
2008. LNCS, vol. 4963, pp. 443–458. Springer, Heidelberg (2008)
10. Gupta, A., Rybalchenko, A.: InvGen: An efficient invariant generator. In: Boua-
jjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 634–640. Springer,
Heidelberg (2009)
11. Hoder, K., Bjørner, N., de Moura, L.: µZ– An efficient engine for fixed points with
constraints. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806,
pp. 457–462. Springer, Heidelberg (2011)
12. Hojjat, H., Konečný, F., Garnier, F., Iosif, R., Kuncak, V., Rümmer, P.: A verifi-
cation toolkit for numerical transition systems. In: Giannakopoulou, D., Méry, D.
(eds.) FM 2012. LNCS, vol. 7436, pp. 247–251. Springer, Heidelberg (2012)
13. Jaffar, J., Murali, V., Navas, J.A., Santosa, A.E.: TRACER: A symbolic execution
tool for verification. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS,
vol. 7358, pp. 758–766. Springer, Heidelberg (2012)
14. The MAP system, http://www.iasi.cnr.it/~proietti/system.html
15. McMillan, K.L., Rybalchenko, A.: Solving constrained Horn clauses using interpo-
lation. MSR Technical Report 2013-6, Microsoft Report (2013)
16. Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: Intermediate language
and tools for analysis and transformation of C programs. In: Horspool, R.N. (ed.)
CC 2002. LNCS, vol. 2304, pp. 209–265. Springer, Heidelberg (2002)
17. Peralta, J.C., Gallagher, J.P., Saglam, H.: Analysis of imperative programs through
analysis of Constraint Logic Programs. In: Levi, G. (ed.) SAS 1998. LNCS,
vol. 1503, pp. 246–261. Springer, Heidelberg (1998)
18. Podelski, A., Rybalchenko, A.: ARMC: The logical choice for software model check-
ing with abstraction refinement. In: Hanus, M. (ed.) PADL 2007. LNCS, vol. 4354,
pp. 245–259. Springer, Heidelberg (2007)
19. De Angelis, E., Fioravanti, F., Pettorossi, A., Proietti, M.: Verifying Array Pro-
grams by Transforming Verification Conditions. In: McMillan, K.L., Rival, X. (eds.)
VMCAI 2014. LNCS, vol. 8318, pp. 182–202. Springer, Heidelberg (2014)
CIF 3: Model-Based Engineering
of Supervisory Controllers

D.A. van Beek, W.J. Fokkink, D. Hendriks, A. Hofkamp,
J. Markovski, J.M. van de Mortel-Fronczak, and M.A. Reniers
Manufacturing Networks Group
Eindhoven University of Technology, Eindhoven, The Netherlands

Abstract. The engineering of supervisory controllers for large and com-
plex cyber-physical systems requires dedicated engineering support. The
Compositional Interchange Format language and toolset have been devel-
oped for this purpose. We highlight a model-based engineering framework
for the engineering of supervisory controllers and explain how the CIF
language and accompanying tools can be used for typical activities in that
framework such as modeling, supervisory control synthesis, simulation-
based validation, verification, and visualization, real-time testing, and
code generation. We mention a number of case studies for which this
approach was used in the recent past. We discuss future developments
on the level of language and tools as well as research results that may be
integrated in the longer term.

1 Introduction
A supervisory controller coordinates the behavior of a (cyber-physical) system
from discrete-event observations of its state. Based on such observations the
supervisory controller decides on the activities that the uncontrolled system can
safely perform or on the activities that (are more likely to) lead to acceptable
system behavior. Engineering of supervisory controllers is a challenging task
in practice, amongst other reasons because of the high complexity of the uncontrolled
system.
In model-based engineering, models are used in the design process, instead of
directly implementing a solution. The Compositional Interchange Format (CIF)
is an automata-based modeling language that supports the entire model-based
engineering development process of supervisory controllers, including modeling,
supervisory controller synthesis (deriving a controller from its requirements),
simulation-based validation, verification, and visualization, real-time testing, and
code generation. CIF 3 is a substantially enhanced new version of CIF, after CIF
1 [BRRS08] and CIF 2 [NBR12]. It has been improved based on feedback from
industry, as well as new theoretical advances. The various versions of CIF have
been developed in European projects HYCON, HYCON2, Multiform, and C4C.
CIF is actively being developed by the Manufacturing Networks Group (until recently named the Systems Engineering Group) of the Mechanical Engineering department at the Eindhoven University of Technology (TU/e) [BHSR13]. The CIF tooling (see cif.se.wtb.tue.nl) is available under
the MIT open source license (see opensource.org/licenses/MIT).
In Section 2, we introduce a simplified version of the framework for model-
based engineering of supervisory controllers. In Section 3, we outline the role
CIF and its related tools play in this framework. The most prominent features
of CIF and its tools are highlighted there. In Section 4, we briefly discuss some
industrial cases where the framework and tooling have been applied. Finally, in
Section 5, we present a number of enhancements that are being considered for
future addition to the CIF language and its tool set.

2 Model-Based Engineering of Supervisory Controllers

Fig. 1 depicts an overview of model-based engineering of supervisory controllers.


Our starting point is a model (uncontrolled hybrid plant ) of the uncontrolled
system. The goal is to obtain a supervisory controller either by supervisory
controller synthesis or by design. A hybrid observer forms an interface be-
tween the plant model and its supervisory controller. The first purpose of the
observer is to interface the variable-based continuous world of the plant to the event-based world of the discrete-event controller. Its second purpose is the generation of additional events, from the state observed at the hybrid plant. They can be interpreted as virtual sensors by the controller, abstracting away timed behavior. Examples are a timeout, or an event that signals that a certain combination of values of physical quantities has occurred.
Fig. 2 depicts the workflow of the simplified framework for model-based engineering of supervisory controllers. First, the modeler manually designs an uncontrolled hybrid plant model, and a hybrid observer model. Next, from these two models, an abstraction of the uncontrolled system (uncontrolled discrete-event plant) is manually created.
[Fig. 1. Overview of model-based engineering of supervisory controllers: an observer-based supervisor, consisting of a supervisory controller (discrete) and a hybrid observer (timed/hybrid), exchanges sensor and actuator events internally and sensor and actuator variables with the uncontrolled hybrid plant.]
From the uncontrolled discrete-event plant and discrete-event control require-
ments, a discrete-event supervisory controller is synthesized (generated). This
controller is safe by construction. To also ensure that all relevant behavior is
present, additional liveness verification can be performed on the supervisory
controller and the uncontrolled discrete-event plant. For timed verification, the
uncontrolled hybrid plant and hybrid observer can be used instead of the un-
controlled discrete-event plant. The automated synthesis and verification enable
the designer to perform rapid iterative corrections and improvements of the
plant model and the control requirements.
[Fig. 2. Workflow for model-based engineering of supervisory controllers, relating the uncontrolled hybrid plant, control requirements, uncontrolled discrete-event plant, hybrid observer, controller synthesis, supervisory controller, observer-based supervisor, verification, simulation/visualization (with a user-supplied image), and code generation.]
The supervisory controller is com-
bined with the hybrid observer, resulting in the actual controller (observer-based
supervisor ). This model is used for model-based validation, by means of real-
time interactive simulation and visualization, based on a user-supplied image
of the system. This brings a higher confidence that the models fulfill expected
properties. The mentioned simulation-based visualization can also be used for
validating the other models, such as the discrete-event and hybrid plant models.
As a final step, actual real-time control code is generated for the implementation of the controller.

3 The Role of the CIF Language and Tools

The CIF language (see [BHSR13]) is based on networks of hybrid automata
with invariants, and non-linear and/or discontinuous differential algebraic equa-
tions. The ability of CIF to model large-scale systems is due to the orthogonal
combination of parametrized process definition and instantiation (reuse, hier-
archy), grouping of arbitrary components in sub-scopes, and an import mecha-
nism. Components (automata) in a CIF model can interact in several different
ways: multi-party synchronization via shared events, allowing communication
via shared data; monitor automata to provide the functionality of nonblocking
input events as defined in input-output automata [LSV01]; shared variables (lo-
cal write, global read). Furthermore, CIF supports urgent events (events that
must happen as soon as they are enabled by all synchronizing automata), a rich
set of data types and expressions (e.g. lists, sets, dictionaries, tuples), functions,
stochastic distributions, conditional updates, and multi-assignments.
CIF is well suited to model plants and supervisors in the application domain of
cyber-physical systems (the blocks uncontrolled hybrid plant, uncontrolled discrete-
event plant, hybrid observer, and control requirements in Fig. 2) as collections of
automata using both discrete events and continuous-time behavior (see the cases
mentioned in Section 4) in the same style as hybrid automata [ACH+ 95, Hen00,
AGH+ 00, LSV01]. Developing a CIF model for the uncontrolled hybrid plant is an
iterative process of developing a model and using simulation and visualization to
increase confidence in the model.
The main difference between CIF and the other currently available hybrid
automata-based languages and toolsets, is that CIF covers the complete inte-
grated tool chain for the development of supervisory controllers for complex
cyber-physical systems. For the specification of CIF models for plants, require-
ments, observers, etc, an Eclipse-based (eclipse.org) textual editor is available,
which features syntax highlighting, and continuous background syntax and type
checking.
An important feature of the CIF toolset is the simulator. It can be employed
to validate each of the CIF models mentioned before in isolation. Additionally
it may be used to validate the controller when put in the context of the un-
controlled hybrid plant (as indicated in Fig. 2). Based on a CIF model, the
simulator generates code for high-performance visual simulation. The simulator
allows interactive exploration of the behavior of the controlled system, by using
the interactive visualization-based simulation mentioned in the previous section.
This requires user-supplied SVG vector images (w3.org/TR/SVG11). The simula-
tor is highly configurable and versatile, allowing for automatic testing of various
use cases, and the production of various forms of output.
To support event-based supervisory controller synthesis, CIF features plant
and requirement automata, marker predicates [CL07], as well as controllable
and uncontrollable events. Furthermore, the tools include efficient implementa-
tions of the traditional event-based supervisory controller synthesis algorithms
as presented in [WR87, CL07].
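As a point of reference for the kind of algorithm involved, the following is a compact sketch of the classical event-based synthesis fixpoint in the style of [WR87, CL07] over an explicit product automaton; it is not the CIF implementation, and the data structures (states, trans, unctrl, marked, bad) are hypothetical.

  def synthesize(states, trans, unctrl, marked, bad):
      """trans maps a state to {event: successor}; bad are requirement violations."""
      good = set(states) - set(bad)
      while True:
          # (1) controllability: drop states from which some uncontrollable event
          #     leads outside the current candidate set
          good = {s for s in good
                  if all(t in good
                         for e, t in trans.get(s, {}).items() if e in unctrl)}
          # (2) nonblockingness: keep only states that can reach a marked state
          #     through transitions staying inside the candidate set
          reach = {m for m in marked if m in good}
          changed = True
          while changed:
              changed = False
              for s in good - reach:
                  if any(t in reach for t in trans.get(s, {}).values() if t in good):
                      reach.add(s)
                      changed = True
          if reach == good:
              return good              # fixpoint: the supremal set of safe states
          good = reach

  # Tiny usage example with hypothetical states; 'u' is uncontrollable.
  states = {0, 1, 2}
  trans  = {0: {"c": 1}, 1: {"u": 2}, 2: {}}
  print(synthesize(states, trans, unctrl={"u"}, marked={1}, bad={2}))   # -> set()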
For verification, a transformation of CIF models to Uppaal [NRS+ 11] is avail-
able. Compositionality has been a central concern when designing CIF, because
a compositional semantics facilitates property-preserving model transformations
and compositional verification techniques. Currently, over twenty transforma-
tions for various purposes are available ([BHSR13]), some via CIF 2.
Interoperability with other languages and tools is achieved by means of model
transformations, external functions, and co-simulation via the Matlab/Simulink
S-function interface [The05]. Programmable Logic Controller (PLC) code gener-
ation conforming to the IEC 61131-3 standard [JT10] is also available, allowing
for implementation of CIF controllers in actual systems.

4 Applications
CIF has been used in an industrial context for a number of years now. We
mention some of the more prominent applications.
– Development of a coordinator for maintenance procedures for a high-tech
Océ printer [MJB+ 10]
– Improving evolvability of a patient communication control system using
state-based supervisory control synthesis [TBR12]
– Application of supervisory control theory to theme park vehicles [FMSR12]
– Supervisory control of MRI subsystems [Geu12]
– Design of a supervisory controller for a Philips MRI-scanner [Dij13]
– Design and real-time implementation of a supervisory controller for baggage
handling [Kam13]
Although these projects showed different parts of the suggested model-based
engineering framework, together they demonstrate all mentioned activities.
Currently, a project on control and performance analysis of wafer flow in wafer
handlers is carried out at ASML, and a project on design and implementation of
a certified controller with multiple control levels for a baggage handling system
is carried out for Vanderlande Industries. Also in these projects supervisory
controller synthesis and verification are considered.

5 Future Developments
CIF is constantly being improved and extended. A planned extension of CIF is
the addition of point-to-point communication by means of channels. Our experi-
ence with industrial cases has shown that these are well suited to model physical
movements of objects. The channels will be fully integrated into the language.
For instance, supervisors will be able to prohibit channel communications.
For verification, we intend to support a larger class of CIF models for the
transformation to Uppaal. We will develop model transformations to other model
checking tools as experimented with in [MR12a]. For performance analysis we are
considering model transformations to MRMC and/or PRISM [MR12b, MER13].
The Manufacturing Networks Group also works on extensions of supervisory
control theory, such as the domain of plant models to which synthesis may be
applied, and the expressivity of the logic for requirements. See [HFR13] for a
first publication of this line of research. As soon as such extensions reach an
acceptable level of maturity, they are incorporated in the CIF tooling.

Acknowledgments. The research leading to these results has received fund-
ing from the EU FP7 Programme under grant agreements no. FP7-ICT-223844
(C4C), no. FP7-ICT-224249 (MULTIFORM), and the Network of Excellence
HYCON (IST-2002-2.3.2.5). The research leading to these results has received
funding from the European Union Seventh Framework Programme [FP7/2007-
2013] under grant agreement no. 257462 HYCON2 Network of excellence.

References
[ACH+ 95] Alur, R., Courcoubetis, C., Halbwachs, N., Henzinger, T.A., Ho, P.-H.,
Nicollin, X., Olivero, A., Sifakis, J., Yovine, S.: The algorithmic analysis
of hybrid systems. Theoretical Computer Science 138(1), 3–34 (1995)
[AGH+ 00] Alur, R., Grosu, R., Hur, Y., Kumar, V., Lee, I.: Modular specification of
hybrid systems in CHARON. In: Lynch, N.A., Krogh, B.H. (eds.) HSCC
2000. LNCS, vol. 1790, pp. 6–19. Springer, Heidelberg (2000)
[BHSR13] van Beek, D.A., Hendriks, D., Swartjes, L., Reniers, M.A.: Report on the
extensions of the CIF and transformation algorithms. Technical Report HY-
CON Deliverable D6.2.4 (2013)
580 D.A. van Beek et al.

[BRRS08] van Beek, D.A., Reniers, M.A., Rooda, J.E., Schiffelers, R.R.H.: Concrete
syntax and semantics of the Compositional Interchange Format for hybrid
systems. In: IFAC World Congress 2008, pp. 7979–7986, IFAC (2008)
[CL07] Cassandras, C.G., Lafortune, S.: Introduction to Discrete Event Systems,
2nd edn. Springer (2007)
[Dij13] van Dijk, D.: Supervisory control of a Philips MRI-scanner. Master’s thesis,
Eindhoven University of Technology (2013)
[FMSR12] Forschelen, S.T.J., van de Mortel-Fronczak, J.M., Su, R., Rooda, J.E.: Ap-
plication of supervisory control theory to theme park vehicles. Discrete
Event Dynamic Systems 22(4), 511–540 (2012)
[Geu12] Geurts, J.W.P.: Supervisory control of MRI subsystems. Master’s thesis,
Eindhoven University of Technology (2012)
[Hen00] Henzinger, T.A.: The theory of hybrid automata. In: Verification of Digital
and Hybrid Systems. NATO ASI Series F: Computer and Systems Science,
vol. 170, pp. 265–292. Springer (2000)
[HFR13] van Hulst, A., Fokkink, W.J., Reniers, M.A.: Maximal synthesis for
Hennessy-Milner Logic. In: ACSD 2013, pp. 1–10. IEEE (2013)
[JT10] John, K.H., Tiegelkamp, M.: IEC 61131-3: Programming Industrial Au-
tomation Systems, 2nd edn. Springer (2010)
[Kam13] Kamphuis, R.H.J.: Design and real-time implementation of a supervisory
controller for baggage handling at Veghel Airport. Master’s thesis, Eind-
hoven University of Technology (2013)
[LSV01] Lynch, N., Segala, R., Vaandrager, F.: Hybrid I/O automata revisited.
In: Di Benedetto, M.D., Sangiovanni-Vincentelli, A.L. (eds.) HSCC 2001.
LNCS, vol. 2034, pp. 403–417. Springer, Heidelberg (2001)
[MER13] Markovski, J., Estens Musa, E.S., Reniers, M.A.: Extending a synthesis-
centric model-based systems engineering framework with stochastic model
checking. ENTCS 296, 163–181 (2013)
[MJB+10] Markovski, J., Jacobs, K.G.M., van Beek, D.A., Somers, L.J.A.M., Rooda,
J.E.: Coordination of resources using generalized state-based requirements.
In: WODES 2010, pp. 300–305. IFAC (2010)
[MR12a] Markovski, J., Reniers, M.A.: An integrated state- and event-based frame-
work for verifying liveness in supervised systems. In: ICARCV 2012, pp.
246–251. IEEE (2012)
[MR12b] Markovski, J., Reniers, M.A.: Verifying performance of supervised plants.
In: ACSD 2012, pp. 52–61. IEEE (2012)
[NBR12] Nadales Agut, D.E., van Beek, D.A., Rooda, J.E.: Syntax and semantics of
the compositional interchange format for hybrid systems. Journal of Logic
and Algebraic Programming 82(1), 1–52 (2012)
[NRS+11] Nadales Agut, D.E., Reniers, M.A., Schiffelers, R.R.H., Jørgensen, K.Y.,
van Beek, D.A.: A semantic-preserving transformation from the Composi-
tional Interchange Format to UPPAAL. In: IFAC World Congress 2011, pp.
12496–12502, IFAC (2011)
[TBR12] Theunissen, R.J.M., van Beek, D.A., Rooda, J.E.: Improving evolvability
of a patient communication control system using state-based supervisory
control synthesis. Advanced Engineering Informatics 26(3), 502–515 (2012)
[The05] The MathWorks, Inc. Writing S-functions, version 6 (2005),
http://www.mathworks.com
[WR87] Wonham, W.M., Ramadge, P.J.: On the supremal controllable sublanguage
of a given language. SIAM Journal on Control and Optimization 25(3),
637–659 (1987)
EDD: A Declarative Debugger
for Sequential Erlang Programs

Rafael Caballero1, Enrique Martin-Martin1, Adrian Riesco1, and Salvador Tamarit2
1 Universidad Complutense de Madrid, Madrid, Spain
[email protected], [email protected], [email protected]
2 Babel Research Group, Universitat Politécnica de Madrid, Madrid, Spain
[email protected]

Abstract. Declarative debuggers are semi-automatic debugging tools
that abstract the execution details to focus on the program semantics.
This paper presents a tool implementing this approach for the sequential
subset of Erlang, a functional language with dynamic typing and strict
evaluation. Given an erroneous computation, it first detects an erroneous
function (either a “named” function or a lambda-abstraction), and then
continues the process to identify the fragment of the function responsible
for the error. Among its features it includes support for exceptions, pre-
defined and built-in functions, higher-order functions, and trusting and
undo commands.

1 Introduction
Declarative debugging, also known as algorithmic debugging, is a well-known
technique that requires from the user only knowledge about the intended behavior
of the program, that is, the expected results of the program computations,
abstracting the execution details and hence presenting a declarative approach. It
has been successfully applied in logic [5], functional [6], and object-oriented [4]
programming languages. In [3,2] we presented a declarative debugger for se-
quential Erlang. These works gave rise to EDD, the Erlang Declarative Debugger
presented in this paper. EDD has been developed in Erlang. EDD, its documenta-
tion, and several examples are available at https://github.com/tamarit/edd
(check the README.md file for installing the tool).
As usual in declarative debugging the tool is started by the user when an
unexpected result, called the error symptom, is found. The debugger then builds
internally the so-called debugging tree, whose nodes correspond to the auxiliary
computations needed to obtain the error symptom. Then the user is questioned

Research supported by EU project FP7-ICT-610582 ENVISAGE, Spanish
projects StrongSoft (TIN2012-39391-C04-04), DOVES (TIN2008-05624), and VI-
VAC (TIN2012-38137), and Comunidad de Madrid PROMETIDOS (S2009/
TIC-1465). Salvador Tamarit was partially supported by research project POLCA,
Programming Large Scale Heterogeneous Infrastructures (610686), funded by the
European Union, STREP FP7.

about the validity of some tree nodes until the error is found. In our proposal,
the debugger first concentrates on the function calls that occurred during the com-
putation. The goal is to find a function call that returned an invalid result,
but such that all the function calls occurring in the function body returned a
valid result. The associated node is called a buggy node. We prove in [3] that
such a function is a wrong function, and that every program producing an error
symptom1 contains at least one wrong function. An important novelty of our
debugger w.r.t. similar tools is that it allows using zoom debugging to detect an
erroneous fragment of code inside the wrong function. At this stage the user is
required to answer questions about the validity of certain variable matchings,
or about the branch that should be selected in a case/if statement for a given
context. The theoretical results in [2] ensure that this code is indeed erroneous,
and that a wrong function always contains an erroneous statement.
The rest of the paper is organized as follows: Section 2 introduces Erlang and
EDD. Section 3 describes the questions that can be asked by the tool and the
errors that can be detected. Section 4 concludes and presents the future work.

2 Erlang and EDD

In this section we introduce some pieces of Erlang [1] which are relevant for
our presentation. At the same time we introduce the basic features of our tool.
Erlang is a concurrent language with a sequential subset that is a functional
language with dynamic typing and strict evaluation. Programs are structured
using modules, which contain functions defined by collections of clauses.
Example 1. Obtain the square of a number X without using products. This is
possible defining Y = X − 1 and considering X 2 = (Y + 1)2 = Y 2 + Y + Y + 1.
-module(mathops).
-export([square/1]).
square(0) -> 0;
square(X) when X>0 -> Y=X-1, DoubleY=X+X, square(Y)+DoubleY+1.
Observe that variables start with an uppercase letter or underscore. In order
to evaluate a function call Erlang scans sequentially the function clauses until a
match is found. Then, the variables occurring in the head are bound. In our ex-
ample the second clause of the function square is erroneous: the underlined sub-
term X+X should be Y+Y. Using this program we check that mathops:square(3)
is unexpectedly evaluated to 15. Then, we can start EDD, obtaining the debug-
ging session in Fig. 1, where the user answers are boxed. Section 3.1 explains all
the possible answers to the debugger questions. Here we only use ‘n’ (standing
for ‘no’), indicating that the result is invalid, and ‘y’ (standing for ‘yes’), indicat-
ing that it is valid. After two questions the debugger detects that the tree node
containing the call mathops:square(1) is buggy (it produces an invalid result
while its only child mathops:square(0) returns a valid result). Consequently,
1 Note that, if the module has multiple errors that compensate each other, there is no
error symptom and hence declarative debugging cannot be applied.

> edd:dd("mathops:square(3)").
mathops:square(1) = 3? n
mathops:square(0) = 0? y
Call to a function that contains an error:
mathops:square(1) = 3
Please, revise the second clause:
square(X) when X > 0 -> Y=X-1, DoubleY=X+X, square(Y)+DoubleY+1.
Continue the debugging session inside this function? y
In the function square(1) matching with second clause succeed.
Is this correct? y
Given the context: X = 1
the following variable is assigned: DoubleY = 2? n
This is the reason for the error:
Variable DoubleY is badly assigned 2 in the expression:
DoubleY = X + X (line 4).

Fig. 1. EDD session corresponding to the call mathops:square(3)

the tool points out the second clause of square as wrong. Next, the user is asked
if zoom debugging must be used. The user agrees with inspecting the code as-
sociated to the buggy node function call. The debugger proceeds asking about
the validity of the chosen function clause, which is right (the second one), and
about the validity of the value for DoubleY, which is incorrect (it should be 0
since X=1 implies Y=0). The session finishes pointing to this incorrect match-
ing as the source of the error. Observe that an incorrect matching is not always
associated with wrong code, because it could depend on a previous value that con-
tains an incorrect result. However, the correctness results in [3] ensure that only
matchings with real errors are displayed as errors by our tool. Note in this ses-
sion the improvement with respect to the trace, the standard debugging facility
for Erlang programs. While the trace shows every computation step, our tool
focuses first on function calls, simplifying and shortening the debugging process.
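For reference, the repair that follows from this session (our own fix, not output
produced by EDD) is to compute DoubleY from Y instead of X in the second clause:
square(0) -> 0;
square(X) when X>0 -> Y=X-1, DoubleY=Y+Y, square(Y)+DoubleY+1.
With this clause, mathops:square(3) evaluates to 9, as expected.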
The next example shows that Erlang allows more sophisticated expressions in
the function bodies, including case or if expressions.
Example 2. Select the appropriate food taking into account different preferences.
-module(meal).
-export([food/1]).
food(Preferences) ->
case Preferences of
{vegetarian,ovo_vegetarian} -> omelette;
{vegetarian,_lacto_vegetarian} -> yogurt;
{vegetarian,vegan} -> salad;
_Else -> fish
end.
Now we can evaluate the expression meal:food({vegetarian,vegan}) and
we obtain the unexpected result yogurt. This time the first phase of the debugger
is not helpful: it points readily to the only clause of food. In order to obtain
more precise information we use zoom debugging, and the debugger asks:
For the case expression:


(... omitted for the sake of space ... )
Is there anything incorrect?
1.- The context: Preferences = {vegetarian,vegan}
2.- The argument value: {vegetarian,vegan}.
3.- Enter in the second clause.
4.- The bindings: _lacto_vegetarian = vegan
5.- The final value: yogurt.
6.- Nothing.
[1/2/3/4/5/6]? 3
This question asks whether anything is invalid in the evaluation of the case expression
with respect to its intended meaning. It is important to mention that if something
is wrong, the answer must indicate the first wrong item. In our case the
context and the case argument are correct, but we did not expect to use the
second branch/clause but the third. Therefore we answer 3, indicating that this
is the first error in the list. The next question is:
Which clause did you expect to be selected [1/2/3/4]? 3
As explained above, we expected to use the third clause for this context. Then
the debugger stops indicating the error:
This is the reason for the error:
The pattern of the second clause of case expression:
case Preferences of {vegetarian, _lacto_vegetarian} -> yogurt end

Indeed there is an erroneous underscore in _lacto_vegetarian. It converts
the constant into an anonymous variable, and thus the branch was incorrectly
selected. The debugger has found the error, indicating that the second branch is
wrong and that, in particular, its pattern is incorrectly defined. Note
that, with a trace-debugger, programmers proceed instruction by instruction
checking whether the bindings, the branches selected in case/if expressions or
inner function calls are correct. Our tool fulfills a similar task, although it dis-
cards inner function calls—they were checked in the previous phase. Moreover,
the navigation strategy automatically guides the session without requiring the user
to choose breakpoints and steps, finally pointing out the piece
of code causing the bug. Therefore, it provides a simpler and clearer way of
finding bugs in concrete functions, although the complexity is similar.
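For reference, the repair that follows from this session (our own fix, not output
produced by EDD) is to drop the spurious underscore so that the atom is matched
literally:
food(Preferences) ->
case Preferences of
{vegetarian,ovo_vegetarian} -> omelette;
{vegetarian,lacto_vegetarian} -> yogurt;
{vegetarian,vegan} -> salad;
_Else -> fish
end.
With this pattern, meal:food({vegetarian,vegan}) selects the third branch and
returns salad.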

3 Using the Tool


3.1 User Answers
The possible answers to the debugging questions during the first phase are:
– yes (y)/ no (n): the statement is valid/invalid.
– trusted (t): the user knows that the function or lambda-abstraction used in the
evaluation is correct, so further questions about it are not necessary. In this
case all the calls to this function are marked as valid.
– inadmissible (i): the question does not apply because the arguments should
not take these values. The statement is marked as valid.
– don’t know (d): the answer is unknown. The statement is marked as unknown,
and might be asked again if it is required for finding the buggy node.
– switch strategy (s): changes the navigation strategy. The navigation strate-
gies provided by the tool are explained below.
– undo (u): reverts to the previous question.
– abort (a): finishes the debugging session.

In the case of zoom debugging, the answer trusted does not make sense and is
never available, while the answers yes, no, and inadmissible cannot be used in
some situations, for instance in compound questions about case/if expressions.
The rest of the answers are always available.
The tool includes a memoization feature that stores the answers yes, no,
trusted, and inadmissible, preventing the system from asking the same question
twice. It is worth noting that don’t know is used to ease the interaction with the
debugger but it may introduce incompleteness; if the debugger reaches a deadlock
due to these answers it presents two alternatives to the user: either answering
some of the discarded questions to find the buggy node or showing the possible
buggy code, depending on the answers to the nodes marked as unknown.
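To illustrate the idea (a minimal sketch of our own, not EDD's actual code;
prompt_user/1 is a hypothetical helper that reads an answer from the user), such an
answer store can be kept in a map keyed by the question:
ask(Question, Memo) ->
    case maps:find(Question, Memo) of
        {ok, Answer} ->
            {Answer, Memo};                 % already answered, never ask again
        error ->
            Answer = prompt_user(Question), % hypothetical: reads y/n/t/i/d/... from the user
            case lists:member(Answer, [yes, no, trusted, inadmissible]) of
                true  -> {Answer, maps:put(Question, Answer, Memo)}; % definitive answers are stored
                false -> {Answer, Memo}     % don't know and the like are not memoised
            end
    end.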

3.2 Strategies
As indicated in the introduction, the statements are represented in suitable de-
bugging trees, which represent the structure of the wrong computation. The
system can internally utilize two different navigation strategies [7,8], Divide &
Query and Top Down Heaviest First, in order to choose the next node and there-
fore the next question presented to the user. Top Down selects as next node the
largest child of the current node, while Divide & Query selects the node whose
subtree is closest to half the size of the whole tree. In this way, Top Down sessions
usually present more questions to the user, but they are presented in a logical
order, while Divide & Query leads to shorter sessions of unrelated questions.
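As an illustration only (a simplified sketch of our own, not EDD's implementation),
the two strategies can be phrased over a debugging tree represented as a
{Question, Children} tuple:
%% Size of a debugging tree and the list of all its subtrees.
tree_size({_Q, Cs}) -> 1 + lists:sum([tree_size(C) || C <- Cs]).
subtrees({_Q, Cs} = T) -> [T | lists:append([subtrees(C) || C <- Cs])].
%% Top Down Heaviest First: ask next about the largest child of the current node
%% (assumes the current node has at least one child).
next_top_down({_Q, Cs}) ->
    hd(lists:sort(fun(A, B) -> tree_size(A) >= tree_size(B) end, Cs)).
%% Divide & Query: ask about the node whose subtree is closest to half the whole tree.
next_divide_query(Tree) ->
    Half = tree_size(Tree) / 2,
    hd(lists:sort(fun(A, B) ->
                       abs(tree_size(A) - Half) =< abs(tree_size(B) - Half)
                   end, subtrees(Tree))).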

3.3 Detected Errors


Next we summarize the different types of errors detected by our debugger. As we
have seen, the first phase always ends with a wrong function. The errors found
during the zoom debugging phase are:

Wrong case argument, which indicates that the argument of a specific case
statement has not been coded as the user expected.
Wrong pattern, which indicates that a pattern in the function arguments or
in a case/if branch is wrong.
Wrong guard, which indicates that a guard in either a function clause or in a
case/if branch is wrong.
Wrong binding, which indicates that a variable binding is incorrect.

4 Concluding Remarks and Ongoing Work

EDD is a declarative debugger for sequential Erlang. Program errors are found
by asking questions about the intended behavior of some parts of the program
being debugged, until the bug is found. Regarding usability, EDD provides sev-
eral features that make it a useful tool for debugging real programs, such as sup-
port for built-in functions and external libraries, anonymous (lambda) functions,
higher-order values, don’t know and undo answers, memoization, and trusting
mechanisms, among others. See [2,3] for details.
We have used this tool to debug several programs developed by others. This
gives us confidence in its robustness, but also illustrates an important point of
declarative debugging: it does not require the person in charge of debugging to
know the details of the implementation; it only requires knowing the intended
behavior of the functions, which is much easier and more intuitive, hence allowing
a simpler form of debugging than other approaches, like tracing or breakpoints.
As future work we plan to extend this proposal to include the concurrent
features of Erlang. This extension first requires extending our calculus with
these features. Then, we must identify the errors that can be detected in this
new framework, define the debugging tree, and adapt the tool to work with these
modifications.

References
1. Armstrong, J., Williams, M., Wikstrom, C., Virding, R.: Concurrent Programming
in Erlang, 2nd edn. Prentice-Hall (1996)
2. Caballero, R., Martin-Martin, E., Riesco, A., Tamarit, S.: A zoom-declarative de-
bugger for sequential Erlang programs. Submitted to the JLAP
3. Caballero, R., Martin-Martin, E., Riesco, A., Tamarit, S.: A declarative debugger
for sequential Erlang programs. In: Veanes, M., Viganò, L. (eds.) TAP 2013. LNCS,
vol. 7942, pp. 96–114. Springer, Heidelberg (2013)
4. Insa, D., Silva, J.: An algorithmic debugger for Java. In: Lanza, M., Marcus, A.
(eds.) Proc. of ICSM 2010, pp. 1–6. IEEE Computer Society (2010)
5. Naish, L.: Declarative diagnosis of missing answers. New Generation Comput-
ing 10(3), 255–286 (1992)
6. Nilsson, H.: How to look busy while being as lazy as ever: the implementation of a
lazy functional debugger. Journal of Functional Programming 11(6), 629–671 (2001)
7. Silva, J.: A comparative study of algorithmic debugging strategies. In: Puebla, G.
(ed.) LOPSTR 2006. LNCS, vol. 4407, pp. 143–159. Springer, Heidelberg (2007)
8. Silva, J.: A survey on algorithmic debugging strategies. Advances in Engineering
Software 42(11), 976–991 (2011)
APTE: An Algorithm
for Proving Trace Equivalence

Vincent Cheval

School of Computer Science, University of Birmingham, UK

Abstract. This paper presents APTE, a new tool for automatically
proving the security of cryptographic protocols. It focuses on proving
trace equivalence between processes, which is crucial for specifying pri-
vacy type properties such as anonymity and unlinkability.
The tool can handle protocols expressed in a calculus similar to the
applied-pi calculus, which allows us to capture most existing protocols that
rely on classical cryptographic primitives. In particular, APTE handles
private channels and else branches in protocols with a bounded number of
sessions. Unlike most equivalence verifier tools, APTE is guaranteed to
terminate.
Moreover, APTE is the only tool that extends the usual notion of
trace equivalence by considering “side-channel” information leaked to
the attacker such as the length of messages and the execution times. We
illustrate APTE on different case studies which allowed us to automat-
ically (re)-discover attacks on protocols such as the Private Authentica-
tion protocol or the protocols of the electronic passports.

1 Introduction

Cryptographic protocols are small distributed programs specifically designed to
ensure the security of our communications on public channels like the Internet. It
is therefore essential to verify and prove the correctness of these cryptographic
protocols. Symbolic models have proved their usefulness for verifying crypto-
graphic protocols. However, the many sources of unboundedness in the modelling of
the capabilities of an attacker make it extremely difficult to verify the security
properties of a cryptographic protocol by hand. Thus, developing automatic veri-
fication tools for security protocols is a necessity. Since the 1980s, many tools have
been developed to automatically verify cryptographic protocols (e.g. Scyther
[10], ProVerif [3], Avispa [13] and others) but they mainly focus on trace prop-
erties such as authentication and secrecy, which typically specify that a protocol
cannot reach a bad state. However, many interesting security properties such as
anonymity, unlinkability, privacy, cannot be expressed as a trace property but
require the notion of equivalence property. Intuitively, these properties specify
the indistinguishability of some instances of the protocols. We focus here on
the notion of trace equivalence which is well-suited for the analysis of security
protocols.


Existing Tools. To our knowledge, there are only three tools that can handle
equivalence properties: ProVerif [3], SPEC [12] and AKiSs [9]. The tool
ProVerif was originally designed to prove trace properties, but it can also
check some equivalence properties (so-called diff-equivalence) [4] that are usu-
ally too strong to model a real intruder since they consider that the intruder has
complete knowledge of the internal states of all the honest protocols executed.
Note that this is the only tool that can handle an unbounded number of sessions
of a protocol with a large class of cryptographic primitives in practice. However,
the downside of ProVerif is that it may not terminate and it may also return a
false-negative. More recently, the tool AKiSs [9] was developed in order to decide
the trace equivalence of bounded processes that do not contain non-trivial else
branching. This tool was proved to be sound and complete and accepts a large
class of primitives but the algorithm was only conjectured to terminate. Finally,
the tool SPEC [12] is based on a decision procedure for open-bisimulation for
bounded processes. The scope is however limited: open-bisimulation coincides
with trace equivalence only for determinate processes and the procedure also as-
sumes a fixed set of primitives (symmetric encryption and pairing), and a pattern
based message passing, hence, in particular, no non-trivial else branching.
Hence, some interesting protocols cannot be handled by these tools. This is par-
ticularly the case for the Private Authentication protocol [2], or the protocols of
the electronic passport [1] since they rely on a conditional with a non-trivial else
branch to be properly modelled. Moreover, even though recent work [6] led to a
new release of ProVerif that can deal with the Private Authentication protocol,
ProVerif is still not able to handle the protocols of the electronic passport and
yields a false positive due to the too strong equivalence that it proves.
Lastly, none of the existing tools takes into account the fact that an attacker
can always observe the length of a message, even though it can leak information
on private data. For example, in most of existing encryption schemes, the length
of ciphertext depends on the length of its plaintext. Thus the ciphertext {m}k
corresponding to the encryption of a message m by the key k can always be
distinguished from the ciphertext {m, m}k corresponding to the encryption of
the message m repeated twice by the key k. This is simply due to the fact that
{m, m}k is longer than {m}k . However, these two messages would be considered
as indistinguishable in all previous mentioned tools.

2 Trace Equivalence
Our tool is based on a symbolic model where the messages exchanged over the
network are represented by terms. They are built from a set of variables and
names by applying function symbols modelling the cryptographic primitives.
For example, the function symbol senc (resp. sdec) represents the symmetric
encryption (resp. decryption) primitive and the term senc(m, k) models the en-
cryption of a message m by a key k. The behaviour of each primitive is modelled
by a rewriting system. As such, a term sdec(senc(m, k), k) will be rewritten to m,
modelling the fact that decrypting a ciphertext with the key that was used to en-
crypt it indeed yields the plaintext. Moreover, to take into account the length
of messages, we associate to each cryptographic primitive f a length function
ω_f : N^n → N, where n is the arity of f. Intuitively, a length function represents
the length of the outcome of a cryptographic primitive depending on the lengths
of its inputs. For example, the length function for the pairing primitive could be
the function ω_pair(x, y) = x + y + 1. The length function ω_senc(x, y) = x for the
symmetric encryption would imply that the length of a ciphertext is always the size
of its plaintext. Typically, each encryption scheme would have a specific length
function that depends on the characteristics of the encryption scheme.
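As a small worked example with these two length functions (writing pair(·,·) for the
pairing symbol and |u| for the length of a term u): |senc(m, k)| = ω_senc(|m|, |k|) = |m|,
whereas |senc(pair(m, m), k)| = ω_senc(ω_pair(|m|, |m|), |k|) = 2|m| + 1. This is exactly
why the two ciphertexts mentioned in the introduction can be told apart by an attacker
who observes the length of messages.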

Equivalence of sequences of terms. By interacting with a protocol, an attacker
may obtain a sequence of messages, meaning that he knows not only each mes-
sage but also the order in which he obtained them. Properties like anonymity
rely on the notion of indistinguishability between two sequences of messages.
This is called static equivalence and denoted by ∼. For example, consider the
two sequences of messages σ_m = [k; senc(m, k)] and σ_n = [k; senc(n, k)] where
m, n are two random numbers. The two sequences σ_n and σ_m are indistinguish-
able. Indeed, even if the attacker can decrypt the second message using the first
message, i.e. k, he obtains in both cases two random numbers and so does not
gain any particular information. However, the two sequences [k; senc(m, k); m]
and [k; senc(m, k); n] are distinguishable since the attacker can compare the
plaintext of the ciphertext with the third message he obtained.
When the attacker can also compare the sizes of messages, the equivalence is
called length static equivalence and denoted ∼_ω.

Processes. Participants in the protocol are modelled as processes whose grammar
is as follows:

    P | Q     P + Q     new k; P     out(u, v); P     in(u, x); P
    if u = v then P else Q     let x = u in P

where P, Q are processes, u, v are terms and x is a variable. The nil process is
denoted 0. The process P + Q represents the non-deterministic choice between P
and Q. The process new k is the creation of a fresh name. The process out(u, v)
represents the emission of the message v into the channel u. Similarly, in(u, x) is
the process that receives a message on the channel u and binds it to x. Typically,
an attacker can interact with the process by emitting or receiving messages from
honest participants through public channels. Hence we represent possible inter-
actions of the attacker with P by the notion of trace, that is a pair (s, σ) where
s is the sequence of actions that the attacker performs and σ is the sequence of
messages that the attacker receives from the honest participants. A process is
said to be determinate when the execution of the process is deterministic. For
example, a process containing the choice operator is not determinate.
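To give a flavour (in a simplified notation of our own), consider the process
out(c, senc(m, k)); out(c, k); 0 run over a public channel c: one of its traces is the
pair (s, σ) where s records the two output actions that the attacker observes on c
and σ = [senc(m, k); k] is the sequence of messages he thereby obtains.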

Definition 1 ((length) trace equivalence). Let P and Q be two processes.
The processes P and Q are trace equivalent if for all traces (s, σ) of P there
exists a trace (s′, σ′) of Q such that s = s′ and σ ∼ σ′ (and conversely).
Moreover, we say that P and Q are in length trace equivalence when the length
static equivalence ∼_ω is used to compare the sequences of messages σ and σ′, i.e.
σ ∼_ω σ′.
Intuitively, this definition indicates that whatever actions the attacker
performs on P, the same actions can be performed on Q and the sequences of
messages obtained in both cases are indistinguishable, and conversely.

3 The Tool APTE


We present the tool APTE, which decides trace equivalence for bounded
(possibly non-determinate, possibly with non-trivial else branches) processes
that use standard primitives, namely signature, pairing, symmetric and asym-
metric encryptions and any one-way functions such as hash, mac, etc. Moreover,
by specifying the linear length function of each cryptographic primitive, a user
can also use APTE to decide the length trace equivalence between processes.
When the trace equivalence or length trace equivalence between two processes
is not satisfied, APTE provides a witness of the non-equivalence, i.e. it displays
the actions that the attacker has to perform in order to distinguish the two pro-
cesses. Note that, even though it is not the main purpose, APTE can also be
used to verify reachability properties on a protocol.

Theoretical foundations. APTE relies on symbolic traces, that is, a finite rep-
resentation of infinitely many traces, to decide the equivalence. In particular,
from each symbolic trace of the two processes two sets of constraint systems
are extracted. The two processes are then equivalent if and only if all pairs of sets of
constraint systems are symbolically equivalent. The algorithm used in APTE to
decide the symbolic equivalence between sets of constraint systems was proved
to be complete, sound and to always terminate in [7,5]. It relies on a set of rules
on constraint systems that simplify the sets of constraint systems given as input
to render the decision trivial. The extension to length trace equivalence between
processes was presented and proved in [8].

Implementation details. The tool is implemented in OCaml1 and the source code
comprises about 12K lines. The source code is highly modular: each mathematical notion
used in the algorithm is implemented in a separate module. To facilitate any new
extension and optimisation of the tool, the data structures are always hidden in
the modules, i.e. we only use abstract types (sometimes called opaque types) in
the interface files. The format of the comments in the interface files is the one
of Ocamldoc that generates a LaTex file with the documented interfaces.

Availability. APTE is open source software distributed under the GNU
General Public Licence 3.0. It can be downloaded at http://projects.lsv.
ens-cachan.fr/APTE/, where a mailing list, some relevant examples, and a
list of publications related to or using APTE are available.
1 http://caml.inria.fr

Anonymity on PrA          Status                                           Execution time
Original                  satisfies trace equivalence, but length attack   0.01 sec
Fix (one session)         safe                                             0.08 sec
Fix (two sessions)        safe                                             > 2 days

Unlinkability on BAC      Status                                           Execution time
French                    attack                                           0.09 sec
UK                        safe ?                                           > 2 days

Unlinkability on PaA      Status                                           Execution time
Original                  length attack                                    0.08 sec
Fix                       safe                                             2.8 sec

Others                    Status                                           Execution time
Anonymity on PaA          safe                                             3.2 sec
Needham-Schroeder         attack                                           0.01 sec
Needham-Schroeder-Lowe    safe                                             0.4 sec

These results were obtained by using APTE on a 2.9 GHz Intel Core i7, 8 GB DDR3.

Fig. 1. Experimental results

Upcoming features. Nowadays, all computers are equipped with a multi-core
processor and sometimes with several processors; we are therefore currently working
on a distributed version of APTE that will take advantage of such multi-core
processors and of clusters. Moreover, we would like to include new cryptographic
primitives such as XOR, blind signatures and re-encryption, which are used in very
interesting protocols, e.g. Caveat Coercitor [11].

4 Experimental Results
We used APTE on several case studies found in the literature. Figure 1
summarises the results. In particular, we focused on the Private Authentication
(PrA) protocol [2] and the protocols of the electronic passport (a description of
the protocols can be found in [5]). The two key results that we obtained using
APTE are a new attack on the anonymity of the Private Authentication proto-
col and a new attack on the unlinkability of the Passive Authentication protocol
(PaA) of the electronic passport. Both attacks rely on the attacker being able
to observe the length of messages. In both cases, we propose possible fixes and
show their security with APTE for a few sessions of the protocols. Observe that
execution times for the trace equivalence and length trace equivalence are very
similar. However, depending on how many sessions you consider, the execution
time varies greatly. For example, in the case of the Private Authentication pro-
tocol, one session is computed in less than a second whereas two sessions take
more than two days. Using APTE, we also rediscovered an existing attack on
the unlinkability of the Basic Access Control protocol (BAC) used in the French
electronic passport. Note that proving the unlinkability of the BAC protocol for
the UK passport took too much time and so we stopped the execution after two
days. We applied APTE to prove the anonymity of the Passive Authentication
protocol. Lastly, since all reachability properties can be expressed by an equiv-
alence, APTE is also able to find the very classical attack on the secrecy of the
Needham-Schroeder protocol and prove the secrecy of the Needham-Schroeder-
Lowe protocol.

References
1. Machine readable travel document. Technical Report 9303, International Civil Avi-
ation Organization (2008)
2. Abadi, M., Fournet, C.: Private authentication. Theoretical Computer Sci-
ence 322(3), 427–476 (2004)
3. Blanchet, B.: An Efficient Cryptographic Protocol Verifier Based on Prolog Rules.
In: 14th Computer Security Foundations Workshop, CSFW 2001 (2001)
4. Blanchet, B., Abadi, M., Fournet, C.: Automated verification of selected equiva-
lences for security protocols. Journal of Logic and Algebraic Programming 75(1),
3–51 (2008)
5. Cheval, V.: Automatic verification of cryptographic protocols: privacy-type prop-
erties. Phd thesis. ENS Cachan, France (2012)
6. Cheval, V., Blanchet, B.: Proving more observational equivalences with ProVerif.
In: Basin, D., Mitchell, J.C. (eds.) POST 2013. LNCS, vol. 7796, pp. 226–246.
Springer, Heidelberg (2013)
7. Cheval, V., Comon-Lundh, H., Delaune, S.: Trace equivalence decision: Negative
tests and non-determinism. In: 18th ACM Conference on Computer and Commu-
nications Security, CCS 2011 (2011)
8. Cheval, V., Cortier, V., Plet, A.: Lengths may break privacy – or how to check
for equivalences with length. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS,
vol. 8044, pp. 708–723. Springer, Heidelberg (2013)
9. Ciobâcă, Ş.: Automated Verification of Security Protocols with Applications to
Electronic Voting. Thèse de doctorat, Laboratoire Spécification et Vérification.
ENS Cachan, France (December 2011)
10. Cremers, C.J.F.: Unbounded verification, falsification, and characterization of se-
curity protocols by pattern refinement. In: CCS 2008: Proceedings of the 15th ACM
Conference on Computer and Communications Security, pp. 119–128. ACM, New
York (2008)
11. Grewal, G., Ryan, M., Bursuc, S., Ryan, P.: Caveat coercitor: Coercion-evidence
in electronic voting. In: IEEE Symposium on Security and Privacy, pp. 367–381.
IEEE Computer Society (2013)
12. Tiu, A., Dawson, J.: Automating open bisimulation checking for the spi calculus.
In: Proc. 23rd IEEE Computer Security Foundations Symposium (CSF 2010), pp.
307–321. IEEE Computer Society Press (2010)
13. Viganò, L.: Automated security protocol analysis with the avispa tool. In: Pro-
ceedings of the XXI Mathematical Foundations of Programming Semantics (MFPS
2005). ENTCS, vol. 155, pp. 61–86. Elsevier (2006)
The Modest Toolset: An Integrated Environment
for Quantitative Modelling and Verification

Arnd Hartmanns and Holger Hermanns

Saarland University – Computer Science, Saarbrücken, Germany

Abstract. Probabilities, real-time behaviour and continuous dynamics
are the key ingredients of quantitative models enabling formal studies of
non-functional properties such as dependability and performance. The
Modest Toolset is based on networks of stochastic hybrid automata
(SHA) as an overarching semantic foundation. Many existing automata-
based formalisms are special cases of SHA. The toolset aims to facilit-
ate reuse of modelling expertise via Modest, a high-level compositional
modelling language; to allow reuse of existing models by providing im-
port and export facilities for existing languages; and to permit reuse of
existing tools by integrating them in a unified modelling and analysis
environment.

1 Introduction
Our reliance on complex safety-critical or economically vital systems such as fly-
by-wire controllers, networked industrial automation systems or “smart” power
grids increases at an ever-accelerating pace. The necessity to study the reliab-
ility and performance of these systems is evident. Over the last two decades,
significant progress has been made in the area of formal methods to allow the
construction of mathematically precise models of such systems and automatic-
ally evaluate properties of interest on the models. Classically, model checking has
been used to study functional correctness properties such as safety or liveness.
However, since a correct system implementation may still be prohibitively slow
or energy-consuming, performance requirements need to be considered as well.
The desire to evaluate both qualitative as well as quantitative properties fostered
the development of integrative approaches that combine probabilities, real-time
aspects or costs with formal verification techniques [1].
The Modest Toolset is an integrated collection of tools for the creation
and analysis of formally specified behavioural models with quantitative aspects.
It constitutes the second generation [8] of tools revolving around the Modest
modelling language [7]. By now, it has become a versatile and extensible toolset
based on the rich semantic foundation of networks of stochastic hybrid automata
(SHA), supporting multiple input languages and multiple analysis backends.

This work is supported by the Transregional Collaborative Research Centre SFB/TR
14 AVACS, the NWO-DFG bilateral project ROCKS, and the 7th EU Framework
Programme under grant agreements 295261 (MEALS) and 318490 (SENSATION).


[Figure: a lattice of automata models ordered by expressiveness, from LTS and DTMC
at the bottom up to SHA at the top; moving upwards adds nondeterminism, discrete
probabilities, real time, continuous dynamics and continuous probability.
Key: SHA stochastic hybrid automata; PHA probabilistic hybrid automata; STA
stochastic timed automata; HA hybrid automata; PTA probabilistic timed automata;
TA timed automata; MDP Markov decision processes; LTS labelled transition systems;
DTMC discrete-time Markov chains.]

Fig. 1. Submodels of stochastic hybrid automata

The Modest Toolset’s aim is to incorporate the state of the art in research
on the analysis of stochastic hybrid systems and special cases thereof, such as
probabilistic real-time systems. In particular, it goes beyond the usual “research
prototype” by providing a single, stable, easy-to-install and easy-to-use package.
In this paper, we illustrate how SHA provide a unified formalism for quant-
itative modelling that subsumes a wide variety of well-known automata-based
models (Section 2); we highlight the Modest Toolset’s approach to model-
ling and model reuse through its support of three very different input languages
(Section 3); we give an overview of the available analysis backends for different
specialisations of SHA (Section 4); and we provide some background on technical
aspects of the toolset and its cross-platform user interface (Section 5).

Related work. Two tools have substantially inspired the design of the Modest
Toolset: Möbius [9] is a prominent multiple-formalism, multiple-solution
tool. Focussing on performance and dependability evaluation, its input form-
alisms include Petri nets, Markov chains and stochastic process algebras. Cadp
[11], in contrast, is a tool suite for explicit-state system verification, comprising
about fifty interoperable components, supporting various input languages and
analysis approaches. The Modest Toolset has so far focused on reusing ex-
isting tools on the analysis side whereas Möbius and Cadp rely on their own
implementations.

2 A Common Semantic Foundation


The Modest Toolset is built around a single overarching semantic model:
networks of stochastic hybrid automata (SHA), i.e. sets of automata that run
asynchronously and can communicate via shared actions and global variables.
While action labels are used for synchronisation, a state-based approach is used
for verification, i.e. the valuations of the global variables act as atomic proposi-
tions observable in properties. SHA combine three key modelling concepts:
Continuous dynamics To represent continuous processes, such as physical
laws or chemical reactions, the evolution of general continuous variables over

Modest:
process Channel()
{
  snd palt {
  :99: delay(2)
       rcv
  : 1: // msg lost
       {==}
  };
  Channel()
}

Guarded Commands:
module Channel
  l: [0..1]; // control loc
  c: clock;  // for delay
  invariant
    l = 1 => c <= 2
  endinvariant
  [snd] l = 0 -> 0.01:(l' = 0)
              + 0.99:(l' = 1) & (c' = 0)
  [rcv] l = 1 & c >= 2 -> (l' = 0)
endmodule

Uppaal TA: (shown as a graphical automaton in the original figure)

Fig. 2. Modelling a channel with loss probability 0.01 and transmission delay 2

time can be described using differential (in)equations. Continuous variables


with constant derivative 1 are used as clocks to model real-time systems.
Nondeterminism To model concurrency (via an interleaving semantics) or the
absence of knowledge over some choice, to abstract from details, and to rep-
resent the influence of an unknown environment, nondeterministic choices
can be used. The number of choices may be finite or (countably or uncount-
ably) infinite. The latter can be used to model nondeterministic delays.
Probability Probabilistic choices represent the case where an outcome is un-
certain, but the probabilities of the outcomes are known. Such choices may
be inherent to the system under study, e.g. in a randomised algorithm, or
they may represent external influences such as failure rates where statistical
data is available. Again, these choices may be discrete (“probabilistic”) or
continuous (“stochastic”), and they can be used to represent random delays.
In the syntactic representation of an SHA, each of these aspects is easy to
identify. By restricting the occurrence of certain aspects, various well-known
automata models appear as special cases of SHA as shown in Fig. 1. Addition-
ally, sampling from the exponential distribution can be combined with clocks to
obtain exponentially-distributed delays, allowing models based on continuous-
time Markov chains to be represented as SHA, too.
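For instance (an illustration of the encoding, not concrete Modest syntax), a
continuous-time Markov chain transition with rate λ = 0.5 can be represented by
sampling a value x from the exponential distribution with rate 0.5, resetting a clock c,
and taking the transition once c reaches x; the resulting delay is exponentially
distributed with mean 1/λ = 2.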

3 Input Languages for Every Taste


As of the current version 2.0, the Modest Toolset can process models specified
in three very different input languages:
Modest is a high-level textual modelling language. It is inspired by process al-
gebras, but has an expressive programming language-like syntax that leads to
concise models. Modest was originally introduced with a STA semantics [7]
and has recently been extended to allow the modelling of SHA [12].
Guarded Commands Probabilistic guarded commands are a low-level textual
modelling language. Easy to learn with few key language constructs, it can be
seen as the “assembly language” of quantitative modelling. It is the language

[Figure: the input languages Modest, Guarded Commands and Uppaal TA are translated
into networks of stochastic hybrid automata, which are analysed by the backends
prohver (using PHAVer), mcpta (using Prism), mctau (using Uppaal) and modes,
producing the analysis results.]

Fig. 3. Schematic overview of the Modest Toolset’s components

of the Prism [17] model checker, so its support within the Modest Toolset
allows the reuse of many existing Prism models.
Uppaal TA Uppaal is built upon a graphical interface to model (probabil-
istic) timed automata [3]. A textual language is used for expressions and to
specify the composition of components. The Modest Toolset can import
and export Uppaal TA models. It supports a useful subset of the language’s
advanced features such as parameterised templates and C-style functions.
Fig. 2 shows a comparison of the three languages for a small example PTA
model. Through the use of the intermediate networks-of-SHA representation,
models can be freely converted between the input languages.

4 Multiple Analysis Backends


A prime goal of the Modest Toolset is to facilitate the reuse of existing
analysis tools for specific subsets of SHA where possible in order to concentrate
development effort on key areas where current tool support is still lacking or non-
existent. The following analysis backends are part of version 2.0 of the toolset:
prohver Computes upper bounds on max. probabilities of probabilistic safety
properties in SHA [12]. Relies on a modified PHAVer [10] for a HA analysis.
mcpta Performs model checking of PTA using Prism for the probabilistic ana-
lysis; supports probabilistic and expected-time/expected-reward reachability
properties in unbounded, time- and cost-bounded variants [14].
mctau Connects to Uppaal for model checking of TA [4], for which it is more
efficient than mcpta. Automatically overapproximates probabilistic choices
with nondeterminism for PTA, providing a quick first check of such models.
modes Performs statistical model checking and simulation of STA with an
emphasis on the sound handling of nondeterministic models [5,6,16]. Its trace
generation facilities are useful for model debugging and visualisation.
Fig. 3 gives a schematic overview of the input languages and analysis backends
that form the Modest Toolset.

5 An Integrated Toolset
As presented in the previous sections, the Modest Toolset consists of several
components and concepts. Several of its analysis backends have been developed
independently and presented separately before. However, it is their combination

Fig. 4. The mime graphical user interface for modelling (left) and analysis (right)

and integration that give rise to the advance in utility that the toolset presents.
This integration is visible in the main interfaces of the toolset:
mime is the toolset’s graphical user interface. It provides a modern editor
for the supported textual input languages and gives full access to the analysis
backends and their configuration. mime is cross-platform, based on web techno-
logies such as HTML5, Javascript and the WebSocket protocol. Fig. 4 shows two
screenshots of the mime interface. For scripting and automation scenarios, all
backends are also available as standalone command-line tools.
The toolset itself is built around a small set of object-oriented program-
ming interfaces for input components, SHA-to-SHA model conversions, model
restrictions (to enforce certain subsets of SHA) and analysis backends. Adding
a new input language, for example, can be accomplished by implementing the
IInputFormalism interface and providing a semantics in terms of networks of
SHA; for mime support, syntax highlighting information can be included.
The Modest Toolset is implemented in C#. This allows the same binary
distribution to run on 32- and 64-bit Windows, Mac OS and Linux machines.
Libraries with a C interface are easy to use from C#. modes uses the runtime
bytecode generation facilities in the standard Reflection.Emit namespace to
generate fast simulation code for the specific model at hand.

6 Conclusion
We have presented the Modest Toolset, version 2.0, highlighting how it fa-
cilitates reuse of modelling expertise via Modest, a high-level compositional
modelling language, while allowing reuse of existing models by providing import
and export facilities for existing languages; and how it permits reuse of existing
tools by integrating them in a unified modelling and analysis environment.
The toolset and the Modest language have been used on several case studies,
most notably to analyse safety properties of a wireless bicycle brake [2] and
to evaluate stability, availability and fairness characteristics of power micro-
generation control algorithms [15]. For a more extensive list of case studies, we
refer the interested reader to [13].
The Modest Toolset, including example models, is available for download
on its website, which also provides documentation, a list of relevant publications
and the description of several case studies, at www.modestchecker.net.

Planned improvements and extensions include distributed simulation and
graphical automata modelling. We are very open to collaborations on case stud-
ies, new input languages and connecting to more analysis backends.

References
1. Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.P.: Performance evaluation
and model checking join forces. Commun. ACM 53(9), 76–85 (2010)
2. Baró Graf, H., Hermanns, H., Kulshrestha, J., Peter, J., Vahldiek, A., Vasudevan, A.:
A verified wireless safety critical hard real-time design. In: WoWMoM. IEEE (2011)
3. Behrmann, G., David, A., Larsen, K.G.: A tutorial on uppaal. In: Bernardo, M.,
Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Heidel-
berg (2004)
4. Bogdoll, J., David, A., Hartmanns, A., Hermanns, H.: mctau: Bridging the gap
between modest and UPPAAL. In: Donaldson, A., Parker, D. (eds.) SPIN 2012.
LNCS, vol. 7385, pp. 227–233. Springer, Heidelberg (2012)
5. Bogdoll, J., Ferrer Fioriti, L.M., Hartmanns, A., Hermanns, H.: Partial order meth-
ods for statistical model checking and simulation. In: Bruni, R., Dingel, J. (eds.)
FMOODS/FORTE 2011. LNCS, vol. 6722, pp. 59–74. Springer, Heidelberg (2011)
6. Bogdoll, J., Hartmanns, A., Hermanns, H.: Simulation and statistical model check-
ing for Modestly nondeterministic models. In: Schmitt, J.B. (ed.) MMB/DFT 2012.
LNCS, vol. 7201, pp. 249–252. Springer, Heidelberg (2012)
7. Bohnenkamp, H.C., D’Argenio, P.R., Hermanns, H., Katoen, J.P.: MoDeST: A
compositional modeling formalism for hard and softly timed systems. IEEE Trans.
Software Eng. 32(10), 812–830 (2006)
8. Bohnenkamp, H.C., Hermanns, H., Katoen, J.-P.: motor: The modest Tool En-
vironment. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp.
500–504. Springer, Heidelberg (2007)
9. Courtney, T., Gaonkar, S., Keefe, K., Rozier, E., Sanders, W.H.: Möbius 2.3: An
extensible tool for dependability, security, and performance evaluation of large and
complex system models. In: DSN, pp. 353–358. IEEE (2009)
10. Frehse, G.: PHAVer: Algorithmic verification of hybrid systems past HyTech. In:
Morari, M., Thiele, L. (eds.) HSCC 2005. LNCS, vol. 3414, pp. 258–273. Springer,
Heidelberg (2005)
11. Garavel, H., Lang, F., Mateescu, R., Serwe, W.: Cadp 2011: a toolbox for the
construction and analysis of distributed processes. STTT 15(2), 89–107 (2013)
12. Hahn, E.M., Hartmanns, A., Hermanns, H., Katoen, J.P.: A compositional mod-
elling and analysis framework for stochastic hybrid systems. Formal Methods in
System Design 43(2), 191–232 (2013)
13. Hartmanns, A.: Modest - a unified language for quantitative models. In: FDL, pp.
44–51. IEEE (2012)
14. Hartmanns, A., Hermanns, H.: A Modest approach to checking probabilistic timed
automata. In: QEST, pp. 187–196. IEEE Computer Society (2009)
15. Hartmanns, A., Hermanns, H., Berrang, P.: A comparative analysis of decentralized
power grid stabilization strategies. In: Winter Simulation Conference (2012)
16. Hartmanns, A., Timmer, M.: On-the-fly confluence detection for statistical model
checking. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871,
pp. 337–351. Springer, Heidelberg (2013)
17. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic
real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
Bounds2: A Tool for Compositional
Multi-parametrised Verification

Antti Siirtola

Helsinki Institute for Information Technology HIIT
Department of Computer Science and Engineering, Aalto University

Abstract. Bounds2 is a two-part tool for parametrised verification. The
instance generator inputs a parametrised system implementation and
specification, computes cut-offs for the values of the parameters and
outputs the specification and implementation instances up to the cut-
offs. After that, the outputted instances are verified by using an instance
checker. Bounds2 is unique since it lends support to compositional rea-
soning through three refinement-based notions of correctness and allows
for parametrising not only the number of processes but also the size
of data types as well as the structure of a system. Bounds2 provides a
sound and complete approach to parametrised verification under explicit
assumptions checked automatically by the tool. The decidable fragment
covers, e.g., mutual exclusion properties of systems with shared resources.

1 Introduction
Modern software systems are not only multithreaded but also object-oriented
and component-based. Such systems have several natural parameters, such as
the number of processes and the number of data objects. Moreover, some com-
ponents, like external libraries and subsystems concurrently under construction,
may only be available in the interface specification form. That is why there is
an evident need for verification techniques that can handle multi-parametrised
systems in a compositional way.
Bounds2 is a tool that enables parametrised verification by establishing upper
bounds, i.e., cut-offs, for the values of parameters such that if there is a bug in an
implementation instance with a parameter value greater than the cut-off, then
there is an analogous bug in an implementation instance where the values of
the parameters are within the cut-offs. When using Bounds2, implementations
and specifications are composed of labelled transition systems (LTSs) with ex-
plicit input and output events by using parallel composition and hiding. We can
also use several kinds of parameters: types represent the sets of the identifiers
of replicated components or the sets of data values of an arbitrary size, typed
variables refer to the identities of individual components or data values and rela-
tion symbols represent binary relations over replicated processes. Correctness is
understood either as a refinement, which can be trace inclusion [1] or alternating
simulation [2], or the compatibility of the components of the implementation [2].
Hence, Bounds2 enables compositional reasoning, too.


The tool consists of two parts. The instance generator determines cut-offs
for the types, computes the allowed parameter values up to the cut-offs and
outputs the corresponding finite state verification tasks. It can also apply a
limited form of abstraction if the cut-offs cannot be determined without it. After
that, the outputted instances are verified by an instance checker specific to the
notion of correctness. The trace refinement and compatibility checkers exploit the
refinement checker FDR2 [1] to verify the instances, while the alternating simulation
checker makes use of the MIO Workbench [3] refinement checker as well.
is publicly available at [4].

2 The Description Language


To get a grasp of the CSP-based (Communicating Sequential Processes [1]) for-
malism used by Bounds2, consider an arbitrary number of processes competing
for access to an arbitrary number of shared variables that store values of a
parametric type. Each variable should be written to in a mutually exclusive way,
so the interface of the shared variables is formalised as follows:

1 type P type V type D


2 var p:P var v:V var d:D
3 chan writebeg : P,V,D chan writeend : P,V,D
4 plts VarIF =
5 lts
6 I = []p,d: ?writebeg(p,v,d) -> W(p,d)
7 W(p,d) = writeend(p,v,d) -> I
8 from I
9 plts VarsIF = ||v: VarIF

Types P and V represent the set of the identifiers of processes and variables,
respectively, and D denotes the domain of the shared variables. Variables p, v
and d are used to refer to an individual process, a shared variable and a value of
a shared variable, respectively. The event writebeg(p,v,d) (writeend(p,v,d))
denotes that the process p starts (is finished with) writing the value d to the
shared variable v. The input events are marked by ?, the other events are outputs.
A parametrised LTS (PLTS) VarIF captures the interface of a shared variable v:
only one process can access v at a time. As we let v range over all identifiers
of shared variables and compose the instances of VarIF in parallel, we obtain
the PLTS VarsIF which captures the joint interface of all the shared variables.
Suppose that we also have an implementation VarImpl of the variable inter-
face, the interface PrIF of a process p and the alphabet PrAlph of PrIF without
the write events. In order to check that (a) all the variable and process inter-
faces are compatible, i.e., they can co-operate in some environment such that
whenever an output is sent, it is matched by an input, (b) the implementation
of the variable refines its interface and (c) no two processes access the variable
simultaneously, we specify the following parametrised verification tasks:

10 compatibility: verify (||p: PrIF) || VarsIF



11 alternating simulation: verify VarImpl against VarIF


12 trace refinement: verify (||p: PrIF) \ ((_)p: PrAlph) against VarsIF

We can also define binary relations over parametric types. For example, in
order to specify a total order TO in which the processes access the shared variables
we could write as follows:

relv TO : P,P var p1 : P var p2 : P var p3 : P


vc irrefl = \/p1: ! p1 TO p1
vc asymm = \/p1,p2: ! (p1 TO p2 & p2 TO p1)
vc trans = \/p1,p2,p3: p1 TO p2 & p2 TO p3 -> p1 TO p3
vc total = \/p1,p2: !p1 = p2 -> p1 TO p2 | p2 TO p1
trace refinement: verify (||p1,p2: [p1 TO p2] Pr2IF) \ ((_)p: PrAlph)
against VarsIF when (irrefl & asymm & trans & total)

In this case, we also need a PLTS Pr2IF which describes the behaviour of the
process interface from the viewpoint of two processes p1 and p2 such that p1
comes before p2 in the total order.
Once we have proved that a system implementation refines its interface spec-
ification, we can use the specification, which is usually much smaller, in place of
the system implementation in further verification efforts. This is possible since
the input formalism of Bounds2 is compositional.

3 Novel Features

The first version of Bounds was introduced in [5] and it featured the support
for process types, relation symbols and trace refinement (tref). The novelties of
Bounds2 introduced here are fourfold.

Sound and/or complete verification. The input language of Bounds2 is Turing


complete in its full generality, but there are explicit conditions under which
cut-offs can be computed. The main restriction is that each type must classify
either as a data type or a process type, but not both. In the above example,
V is regarded as a process type since v is only used in the replicated parallel
compositions (line 9) and D is considered a data type since d only occurs in the
replicated choices (line 6). The variable p is used both in a replicated choice and
a replicated parallel composition (lines 6,10,12), but sometimes, like in this case,
Bounds2 can convert replicated choices into replicated parallel compositions,
which means that P is classified as a process type. Depending on the notion of
correctness, there are also limitations on non-determinism, hiding and the use
of relation symbols. The decidable fragment covers, e.g., the mutual exclusion
properties of concurrent systems with shared resources [6–8], and if any of the
assumptions is removed, parametrised verification becomes undecidable [6–8].
All the conditions are checked automatically by the tool. If any of them is
violated, the user is asked to provide a cut-off manually. In this case, verification
is not sound but we can still detect bugs. Sometimes, the tool can perform an

over-approximating conversion of replicated choices to replicated parallel com-


positions. In this case, the verification is sound but false negatives are possible.
The exact and over-approximating conversions as well as the possibility to enter
a cut-off manually are novel features of Bounds2.

More expressive input language. The main novelty of Bounds2 is a more ex-
pressive input language with a larger decidable fragment. This is enabled by the
introduction of a replicated choice (data types), the classification of events into
inputs and outputs, and two new notions of correctness: compatibility (comp)
and alternating simulation (altsim) [2]. A replicated choice adds to expressive-
ness, since it allows us to express components with a parametrised state space.
Earlier, it was only possible to parametrise the number of concurrent compo-
nents. Distinguishing between input and output events allows us to consider the
compatibility of PLTSs representing software interfaces and gives rise to another
refinement, the alternating simulation, which is a natural notion of the correct-
ness of software interfaces. However, the support for the alternating simulation
is currently not as good as for the two other notions of correctness, since the back
end alternating simulation checker, MIO Workbench, can only handle relatively
small models. The theoretical background of the extensions is described in [7, 8].

Faster operation. Bounds2 determines structural cut-offs for types based on


the results in [7, 8]. After that, the tool computes the allowed values for all
parameters up to the cut-offs by using a search-tree-based enumeration algorithm
described in [5]. Since the values of relation symbols are defined in the universal
fragment of first order logic, this is basically equivalent to computing the model
class of a first order formula up to the cut-offs.
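To make this enumeration step concrete, the following Python sketch (illustrative only, not Bounds2 code; the names is_total_order and relation_values are ours) brute-forces all values of a binary relation symbol such as TO from Section 2 over process sets up to a given cut-off, keeping exactly those values that satisfy the universally quantified constraints irrefl, asymm, trans and total:

from itertools import product

def is_total_order(rel, procs):
    # rel is a set of ordered pairs over procs
    for p1, p2, p3 in product(procs, repeat=3):
        if (p1, p1) in rel:                                          # irrefl
            return False
        if (p1, p2) in rel and (p2, p1) in rel:                      # asymm
            return False
        if (p1, p2) in rel and (p2, p3) in rel and (p1, p3) not in rel:
            return False                                             # trans
        if p1 != p2 and (p1, p2) not in rel and (p2, p1) not in rel:
            return False                                             # total
    return True

def relation_values(cutoff):
    """All (size, relation) pairs up to the cut-off satisfying the constraints."""
    for n in range(1, cutoff + 1):
        procs = range(n)
        pairs = list(product(procs, repeat=2))
        # brute force over all subsets of P x P; Bounds2 prunes this
        # enumeration with a search tree instead
        for bits in product([False, True], repeat=len(pairs)):
            rel = {p for p, keep in zip(pairs, bits) if keep}
            if is_total_order(rel, procs):
                yield n, rel

print(sum(1 for _ in relation_values(3)))   # 1 + 2 + 6 total orders = 9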
For the recent version of the tool, we have implemented a version of the
search algorithm which is parallelised according to the exploratory partitioning
scheme [9]. Initially, computation proceeds sequentially in a breadth-first search
manner but when the search tree becomes wide enough (several times wider than
the number of processor cores available), the subtrees are processed in parallel.
This way, we can ensure that the work load is distributed evenly between threads,
since some subtrees are much bigger than the others. Once the values of the
parameters up to the cut-offs are computed, Bounds2 generates and outputs the
corresponding instances of the system specification and implementation. Also
this phase is parallelised in Bounds2 by using the input partitioning scheme [9].
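The sketch below (ours, written against Python's standard concurrent.futures module, not the Bounds2 implementation) shows the exploratory partitioning idea in isolation: the search tree is widened sequentially in breadth-first order until the frontier is several times wider than the number of cores, after which each remaining subtree is explored by a separate worker process. The constant WIDTH_FACTOR and the expand callback are assumptions of the sketch:

from concurrent.futures import ProcessPoolExecutor
import os

WIDTH_FACTOR = 4   # frontier should be several times wider than the core count

def explore_subtree(node, expand):
    """Sequentially enumerate all leaves below 'node'."""
    stack, leaves = [node], []
    while stack:
        n = stack.pop()
        children = expand(n)
        if children:
            stack.extend(children)
        else:
            leaves.append(n)
    return leaves

def parallel_search(root, expand, cores=None):
    """Exploratory partitioning: widen the tree breadth-first and sequentially,
    then hand the remaining subtrees to a process pool. 'expand' must be a
    picklable top-level function returning the children of a node."""
    cores = cores or os.cpu_count() or 1
    leaves, frontier = [], [root]
    while frontier and len(frontier) < WIDTH_FACTOR * cores:
        next_frontier = []
        for node in frontier:
            children = expand(node)
            if children:
                next_frontier.extend(children)
            else:
                leaves.append(node)          # already a leaf
        if not next_frontier:                # tree exhausted sequentially
            return leaves
        frontier = next_frontier
    with ProcessPoolExecutor(max_workers=cores) as pool:
        for sub in pool.map(explore_subtree, frontier, [expand] * len(frontier)):
            leaves.extend(sub)
    return leaves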
Additionally, the internal data structures are optimised. Previously, all states
and events were stored as strings, which was highly inefficient. In Bounds2, the
strings are first converted into integers for faster analysis and finally back into
strings when the instances are outputted.
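A minimal illustration of the interning idea (ours, not the tool's actual data structure): each distinct state or event string is mapped to an integer once, comparisons and hashing then operate on integers, and the mapping is inverted only when instances are printed.

class Interner:
    """Map strings to small integers for fast comparison, and back for output."""
    def __init__(self):
        self._ids = {}        # string -> int
        self._strings = []    # int -> string

    def intern(self, s):
        if s not in self._ids:
            self._ids[s] = len(self._strings)
            self._strings.append(s)
        return self._ids[s]

    def resolve(self, i):
        return self._strings[i]

interner = Interner()
e1 = interner.intern("writebeg(p1,v1,d1)")   # events are compared as integers...
e2 = interner.intern("writebeg(p1,v1,d1)")
assert e1 == e2
print(interner.resolve(e1))                  # ...and printed as strings at output time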

Improved reduction. The cut-offs computed earlier are rough structural ones.
They are fast and easy to compute but they are often far from optimal, espe-
cially in the case of data types. Therefore, Bounds2 tries to improve the cut-offs
further by analysing the instances up to the structural cut-offs. Basically, the
tool discards instances that can be obtained as a composition of smaller ones as

described in [7, 8]. Also this additional reduction is sound and complete and it
is an important enhancement over Bounds1, because the discarded instances are
always the biggest ones which are the most expensive to verify.

4 Experimental Results

We have made several case studies with Bounds2 and compared its performance
against Bounds1 [5]. We have not compared Bounds2 with other parametrised
verification tools, since most of them are targeted to low level software with fi-
nite data [10–14] whereas our focus is on higher level applications which are not
only multithreaded but also object-oriented and component-based. The compar-
ison with other tools would be difficult anyway, since the tools address different
decidable fragments, and we are not aware of any other tools for parametrised
refinement checking.
For each system, the table below lists the number of types (typ), relation
symbols (rel) and variables (var) used in the model, the notion of correctness
(corr) and the structural cut-offs for types. For both versions of Bounds, the
number of instances outputted and the running times of the instance generator
(tG ) and the instance checker (tC ) are reported. For Bounds2, the former running
times are given with a single core (tG1 ) and six cores (tG6 ) being used. We can
see that the cut-offs provided by the tool are often very small and compared
with Bounds1, the new version not only has a broader application domain but
also operates faster and produces fewer instances that need to be checked. We can
also see that the bottleneck in the verification chain is typically not Bounds2
but the back end finite state verification tool. The experiments were made on a
six-core AMD Phenom II with 8GB of memory running Ubuntu 12.04 LTS.

                    parameters                    |       Bounds1       |          Bounds2
system              typ rel var  corr    cut-offs | out   tG     tC     | out  tG1    tG6    tC
HCP                  2   0   0   tref    2,9      |   not supported     |  4   0.15s  <0.1s  0.3s
cache consistency    2   0   0   tref    1,13     |   not supported     |  5   15s    6s     9s
tokenring            1   1   2   tref    4        |  3    <0.1s  0.2s   |  3   <0.1s  <0.1s  0.2s
twotoken             1   2   2   tref    5        | 30    32s    5s     | 30   32s    8s     5s
miniSRS comp         2   0   0   comp    2,1      |   not supported     |  2   <0.1s  <0.1s  0.15s
SRS                  2   1   0   tref    2,3      | 14    0.4s   1m35s  |  6   0.1s   <0.1s  1m4s
SRSwithData          3   1   0   tref    2,3,1    |   not supported     |  5   0.1s   0.1s   1m4s
res io               2   0   1   altsim  2,1      |   not supported     |  2   <0.1s  <0.1s  5.7s
peterson io          2   0   1   altsim  2,1      |   not supported     |  1   <0.1s  <0.1s  11s
taDOM2+              2   2   0   tref    2,3      | 28    11s    >12h   | 14   6.5s   1.7s   6h36m

5 Conclusions

Bounds2 is a cut-off-based tool for parametrised verification. The cut-offs pro-


vided by the tool are often as good as we can intuitively hope for, which is
necessary for practical parametrised verification. The distinctive feature of the

tool is sound and complete verification with the support for compositional rea-
soning and the possibility to parametrise both the number of processes and the
size of data types as well as the structure of the system. We believe that the
tool will be useful in the analysis of multithreaded, component-based, object-
oriented software systems, which involve both process and data parameters and
where some components may only be available in the interface specification form.
Hence, Bounds2 nicely complements other parametrised verification tools most
of which are targeted for low level software acting on finite data.

Acknowledgements. We would like to thankfully acknowledge the funding


from the SARANA project in the SAFIR 2014 program and the Academy of
Finland project 139402.

References
1. Roscoe, A.W.: Understanding Concurrent Systems. Springer (2010)
2. De Alfaro, L., Henzinger, T.: Interface automata. ACM SIGSOFT Software Engi-
neering Notes 26(5), 109–120 (2001)
3. Bauer, S.S., Mayer, P., Schroeder, A., Hennicker, R.: On weak modal compatibility,
refinement, and the MIO workbench. In: Esparza, J., Majumdar, R. (eds.) TACAS
2010. LNCS, vol. 6015, pp. 175–189. Springer, Heidelberg (2010)
4. Siirtola, A.: Bounds website, http://www.cs.hut.fi/u/siirtoa1/bounds
5. Siirtola, A.: Bounds: from parameterised to finite-state verification. In: Caillaud,
B., Carmona, J., Hiraishi, K. (eds.) ACSD 2011, pp. 31–35. IEEE (2011)
6. Siirtola, A.: Algorithmic Multiparameterised Verification of Safety Properties. Pro-
cess Algebraic Approach. PhD thesis, University of Oulu (2010)
7. Siirtola, A., Heljanko, K.: Parametrised compositional verification with multiple
process and data types. In: Carmona, J., Lazarescu, M.T., Pietkiewicz-Koutny, M.
(eds.) ACSD 2013, pp. 67–76. IEEE (2013)
8. Siirtola, A.: Parametrised interface automata (unpublished draft) (2013),
http://www.cs.hut.fi/u/siirtoa1/papers/pia_paper.pdf
9. Grama, A., Gupta, A., Karypis, G., Kumar, V.: Introduction to Parallel Comput-
ing. Addison Wesley (2003)
10. Delzanno, G., Raskin, J.-F., Van Begin, L.: Towards the automated verification of
multithreaded Java programs. In: Katoen, J.-P., Stevens, P. (eds.) TACAS 2002.
LNCS, vol. 2280, pp. 173–187. Springer, Heidelberg (2002)
11. Ghilardi, S., Ranise, S.: Backward reachability of array-based systems by SMT
solving: termination and invariant synthesis. Log. Meth. Comput. Sci. 6(4) (2010)
12. Kaiser, A., Kroening, D., Wahl, T.: Dynamic cutoff detection in parameterized
concurrent programs. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS,
vol. 6174, pp. 645–659. Springer, Heidelberg (2010)
13. La Torre, S., Madhusudan, P., Parlato, G.: Model-checking parameterized concur-
rent programs using linear interfaces. In: Touili, T., Cook, B., Jackson, P. (eds.)
CAV 2010. LNCS, vol. 6174, pp. 629–644. Springer, Heidelberg (2010)
14. Yang, Q., Li, M.: A cut-off approach for bounded verification of parameterized
systems. In: Kramer, J., Bishop, J., Devanbu, P.T., Uchitel, S. (eds.) ICSE 2010,
pp. 345–354. ACM (2010)
On the Correctness
of a Branch Displacement Algorithm

Jaap Boender1 and Claudio Sacerdoti Coen2


1
Foundations of Computing Group
Department of Computer Science
School of Science and Technology
Middlesex University, London, UK
[email protected]
2
Dipartimento di Scienze dell’Informazione,
Università degli Studi di Bologna, Italy
[email protected]

Abstract. The branch displacement problem is a well-known problem


in assembler design. It revolves around the feature, present in several
processor families, of having different instructions, of different sizes, for
jumps of different displacements. The problem, which is provably NP-
hard, is then to select the instructions such that one ends up with the
smallest possible program.
During our research with the CerCo project on formally verifying a C
compiler, we have implemented and proven correct an algorithm for this
problem. In this paper, we discuss the problem, possible solutions, our
specific solutions and the proofs.

Keywords: formal verification, interactive theorem proving, assembler,


branch displacement optimisation.

1 Introduction

The problem of branch displacement optimisation, also known as jump encoding,


is a well-known problem in assembler design [3]. Its origin lies in the fact that
in many instruction sets, the encoding (and therefore size) of some instructions
depends on the distance to their operand (the instruction ’span’). The branch
displacement optimisation problem consists of encoding these span-dependent
instructions in such a way that the resulting program is as small as possible.
This problem is the subject of the present paper. After introducing the prob-
lem in more detail, we will discuss the solutions used by other compilers, present
the algorithm we use in the CerCo assembler, and discuss its verification, that
is the proofs of termination and correctness using the Matita proof assistant [1].

Research supported by the CerCo project, within the Future and Emerging Techno-
logies (FET) programme of the Seventh Framework Programme for Research of the
European Commission, under FET-Open grant number 243881.


Formulating the final statement of correctness and finding the loop invariants
have been non-trivial tasks and are, indeed, the main contribution of this paper.
It has required considerable care and fine-tuning to formulate not only the min-
imal statement required for the subsequent proof of correctness of the assembler,
but also the minimal set of invariants needed for the proof of correctness of the
algorithm.
The research presented in this paper has been executed within the CerCo
project which aims at formally verifying a C compiler with cost annotations.
The target architecture for this project is the MCS-51, whose instruction set
contains span-dependent instructions. Furthermore, its maximum addressable
memory size is very small (64 Kb), which makes it important to generate pro-
grams that are as small as possible. With this optimisation, however, comes
increased complexity and hence increased possibility for error. We must make
sure that the branch instructions are encoded correctly, otherwise the assembled
program will behave unpredictably.
All Matita files related to this development can be found on the CerCo web-
site, http://cerco.cs.unibo.it. The specific part that contains the branch
displacement algorithm is in the ASM subdirectory, in the files PolicyFront.ma,
PolicyStep.ma and Policy.ma.

2 The Branch Displacement Optimisation Problem


In most modern instruction sets that have them, the only span-dependent in-
structions are branch instructions. Taking the ubiquitous x86-64 instruction set
as an example, we find that it contains eleven different forms of the unconditional
branch instruction, all with different ranges, instruction sizes and semantics (only
six are valid in 64-bit mode, for example). Some examples are shown in Figure 1
(see also [4]).

Instruction          Size (bytes)   Displacement range
Short jump           2              -128 to 127 bytes
Relative near jump   5              −2^32 to 2^32 − 1 bytes
Absolute near jump   6              one segment (64-bit address)
Far jump             8              entire memory (indirect jump)

Fig. 1. List of x86 branch instructions

The chosen target architecture of the CerCo project is the Intel MCS-51,
which features three types of branch instructions (or jump instructions; the two
terms are used interchangeably), as shown in Figure 2.
Conditional branch instructions are only available in short form, which means
that a conditional branch outside the short address range has to be encoded using
three branch instructions (for instructions whose logical negation is available, it
can be done with two branch instructions, but for some instructions this is not
the case). The call instruction is only available in absolute and long forms.

Instruction               Size (bytes)   Execution time (cycles)   Displacement range
SJMP (‘short jump’)       2              2                         -128 to 127 bytes
AJMP (‘absolute jump’)    2              2                         one segment (11-bit address)
LJMP (‘long jump’)        3              3                         entire memory

Fig. 2. List of MCS-51 branch instructions

Note that even though the MCS-51 architecture is much less advanced and
much simpler than the x86-64 architecture, the basic types of branch instruction
remain the same: a short jump with a limited range, an intra-segment jump and
a jump that can reach the entire available memory.
Generally, in code fed to the assembler as input, the only difference between
branch instructions is semantics, not span. This means that a distinction is made
between an unconditional branch and the several kinds of conditional branch,
but not between their short, absolute or long variants.
The algorithm used by the assembler to encode these branch instructions into
the different machine instructions is known as the branch displacement algorithm.
The optimisation problem consists of finding as small an encoding as possible,
thus minimising program length and execution time.
Similar problems, e.g. the branch displacement optimisation problem for other
architectures, are known to be NP-complete [7,9], which could make finding an
optimal solution very time-consuming.
The canonical solution, as shown by Szymanski [9] or more recently by Dick-
son [2] for the x86 instruction set, is to use a fixed point algorithm that starts
with the shortest possible encoding (all branch instruction encoded as short
jumps, which is likely not a correct solution) and then iterates over the source
to re-encode those branch instructions whose target is outside their range.
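A minimal Python rendering of this smallest fixed point scheme, restricted to short and long jumps only, might look as follows; the instruction representation, sizes and the short-jump range are illustrative and not tied to any particular assembler:

SHORT_SIZE, LONG_SIZE, OTHER_SIZE = 2, 3, 1
SHORT_RANGE = range(-128, 128)

def addresses(program, encoding):
    """Address of each instruction under the current choice of jump encodings."""
    addr, out = 0, []
    for i, (_, kind, _) in enumerate(program):
        out.append(addr)
        if kind == 'jmp':
            addr += SHORT_SIZE if encoding[i] == 'short' else LONG_SIZE
        else:
            addr += OTHER_SIZE
    return out

def smallest_fixed_point(program):
    """Start with every jump short; lengthen out-of-range jumps until stable.
    Each instruction is a (label, kind, target) triple with kind 'jmp' or 'other'."""
    labels = {lab: i for i, (lab, _, _) in enumerate(program) if lab}
    encoding = ['short' if kind == 'jmp' else None for _, kind, _ in program]
    changed = True
    while changed:
        changed = False
        addr = addresses(program, encoding)
        for i, (_, kind, target) in enumerate(program):
            if kind != 'jmp' or encoding[i] == 'long':
                continue
            displacement = addr[labels[target]] - (addr[i] + SHORT_SIZE)
            if displacement not in SHORT_RANGE:
                encoding[i] = 'long'     # encodings only grow, so this terminates
                changed = True
    return encoding

prog = [('L0', 'other', None)] + [(None, 'other', None)] * 200 + [(None, 'jmp', 'L0')]
print(smallest_fixed_point(prog)[-1])    # 'long': the jump back to L0 is out of range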

Adding Absolute Jumps

In both papers mentioned above, the encoding of a jump is only dependent on


the distance between the jump and its target: below a certain value a short jump
can be used; above this value the jump must be encoded as a long jump.
Here, termination of the smallest fixed point algorithm is easy to prove. All
branch instructions start out encoded as short jumps, which means that the
distance between any branch instruction and its target is as short as possible
(all the intervening jumps are short). If, in this situation, there is a branch
instruction b whose span is not within the range for a short jump, we can be
sure that we can never reach a situation where the span of b is so small that it
can be encoded as a short jump. This argument continues to hold throughout
the subsequent iterations of the algorithm: short jumps can change into long
jumps, but not vice versa, as spans only increase. Hence, the algorithm either
terminates early when a fixed point is reached or when all short jumps have been
changed into long jumps.

(a) Example of a program where a long jump becomes absolute:

        jmp X
        ...
L0:     ...
% Start of new segment if
% jmp X is encoded as short
        ...
        jmp L0

(b) Example of a program where the fixed-point algorithm is not optimal:

L0:     jmp X
X:      ...
        ...
L1:     ...
% Start of new segment if
% jmp X is encoded as short
        ...
        jmp L1
        ...
        jmp L1
        ...
        jmp L1
        ...

Also, we can be certain that we have reached an optimal solution: a short


jump is only changed into a long jump if it is absolutely necessary.
However, neither of these claims (termination nor optimality) hold when we
add the absolute jump. With absolute jumps, the encoding of a branch instruc-
tion no longer depends only on the distance between the branch instruction
and its target. An absolute jump is possible when instruction and target are in
the same segment (for the MCS-51, this means that the first 5 bits of their
addresses have to be equal). It is therefore entirely possible for two branch in-
structions with the same span to be encoded in different ways (absolute if the
branch instruction and its target are in the same segment, long if this is not the
case).
This invalidates our earlier termination argument: a branch instruction, once
encoded as a long jump, can be re-encoded during a later iteration as an absolute
jump. Consider the program shown in Figure 3a. At the start of the first iteration,
both the branch to X and the branch to L0 are encoded as small jumps. Let us
assume that in this case, the placement of L0 and the branch to it are such that
L0 is just outside the segment that contains this branch. Let us also assume
that the distance between L0 and the branch to it is too large for the branch
instruction to be encoded as a short jump.
All this means that in the second iteration, the branch to L0 will be encoded as
a long jump. If we assume that the branch to X is encoded as a long jump as well,
the size of the branch instruction will increase and L0 will be ‘propelled’ into the
same segment as its branch instruction, because every subsequent instruction
will move one byte forward. Hence, in the third iteration, the branch to L0 can
be encoded as an absolute jump. At first glance, there is nothing that prevents
us from constructing a configuration where two branch instructions interact in
such a way as to iterate indefinitely between long and absolute encodings.
This situation mirrors the explanation by Szymanski [9] of why the branch dis-
placement optimisation problem is NP-complete. In this explanation, a condition

for NP-completeness is that programs are allowed to contain pathological


jumps. These are branch instructions that can normally not be encoded as a
short(er) jump, but gain this property when some other branch instructions are
encoded as a long(er) jump. This is exactly what happens in Figure 3a. By en-
coding the first branch instruction as a long jump, another branch instruction
switches from long to absolute (which is shorter).
In addition, our previous optimality argument no longer holds. Consider the
program shown in Figure 3b. Suppose that the distance between L0 and L1 is
such that if jmp X is encoded as a short jump, there is a segment border just after
L1 . Let us also assume that all three branches to L1 are in the same segment,
but far enough away from L1 that they cannot be encoded as short jumps.
Then, if jmp X were to be encoded as a short jump, which is clearly possible,
all of the branches to L1 would have to be encoded as long jumps. However,
if jmp X were to be encoded as a long jump, and therefore increase in size, L1
would be ‘propelled’ across the segment border, so that the three branches to L1
could be encoded as absolute jumps. Depending on the relative sizes of long and
absolute jumps, this solution might actually be smaller than the one reached by
the smallest fixed point algorithm.

3 Our Algorithm
3.1 Design Decisions
Given the NP-completeness of the problem, finding optimal solutions (using, for
example, a constraint solver) can potentially be very costly.
The SDCC compiler [8], which has a backend targeting the MCS-51 instruc-
tion set, simply encodes every branch instruction as a long jump without taking
the distance into account. While certainly correct (the long jump can reach any
destination in memory) and a very fast solution to compute, it results in a less
than optimal solution in terms of output size and execution time.
On the other hand, the gcc compiler suite, while compiling C on the x86
architecture, uses a greatest fixed point algorithm. In other words, it starts with
all branch instructions encoded as the largest jumps available, and then tries to
reduce the size of branch instructions as much as possible.
Such an algorithm has the advantage that any intermediate result it returns is
correct: the solution where every branch instruction is encoded as a large jump is
always possible, and the algorithm only reduces those branch instructions whose
destination address is in range for a shorter jump. The algorithm can thus be
stopped after a determined number of steps without sacrificing correctness.
The result, however, is not necessarily optimal. Even if the algorithm is run
until it terminates naturally, the fixed point reached is the greatest fixed point,
not the least fixed point. Furthermore, gcc (at least for the x86 architecture)
only uses short and long jumps. This makes the algorithm more efficient, as
shown in the previous section, but also results in a less optimal solution.
In the CerCo assembler, we opted at first for a least fixed point algorithm,
taking absolute jumps into account.

Here, we ran into a problem with proving termination, as explained in the


previous section: if we only take short and long jumps into account, the jump
encoding can only switch from short to long, but never in the other direction.
When we add absolute jumps, however, it is theoretically possible for a branch
instruction to switch from absolute to long and back, as previously explained.
Proving termination then becomes difficult, because there is nothing that pre-
cludes a branch instruction from oscillating back and forth between absolute and
long jumps indefinitely.
To keep the algorithm in the same complexity class and more easily prove ter-
mination, we decided to explicitly enforce the ‘branch instructions must always
grow longer’ requirement: if a branch instruction is encoded as a long jump in
one iteration, it will also be encoded as a long jump in all the following itera-
tions. Therefore the encoding of any branch instruction can change at most two
times: once from short to absolute (or long), and once from absolute to long.
There is one complicating factor. Suppose that a branch instruction is en-
coded in step n as an absolute jump, but in step n + 1 it is determined that
(because of changes elsewhere) it can now be encoded as a short jump. Due
to the requirement that the branch instructions must always grow longer, the
branch instruction will be encoded as an absolute jump in step n + 1 as well.
This is not necessarily correct. A branch instruction that can be encoded as
a short jump cannot always also be encoded as an absolute jump, as a short
jump can bridge segments, whereas an absolute jump cannot. Therefore, in this
situation we have decided to encode the branch instruction as a long jump, which
is always correct.
The resulting algorithm, therefore, will not return the least fixed point, as it
might have too many long jumps. However, it is still better than the algorithms
from SDCC and gcc, since even in the worst case, it will still return a smaller
or equal solution.
Experimenting with our algorithm on the test suite of C programs included
with gcc 2.3.3 has shown that on average, about 25 percent of jumps are encoded
as short or absolute jumps by the algorithm. As not all instructions are jumps,
this does not make for a large reduction in size, but it can make for a reduction
in execution time: if jumps are executed multiple times, for example in loops,
the fact that short jumps take less cycles to execute than long jumps can have
great effect.
As for complexity, there are at most 2n iterations, where n is the number of
branch instructions. Practical tests within the CerCo project on small to medium
pieces of code have shown that in almost all cases, a fixed point is reached in 3
passes. Only in one case did the algorithm need 4. This is not surprising: after
all, the difference between short/absolute and long jumps is only one byte (three
for conditional jumps). For a change from short/absolute to long to have an
effect on other jumps is therefore relatively uncommon, which explains why a
fixed point is reached so quickly.

3.2 The Algorithm in Detail


The branch displacement algorithm forms part of the translation from pseudo-
code to assembler. More specifically, it is used by the function that translates
pseudo-addresses (natural numbers indicating the position of the instruction in
the program) to actual addresses in memory. Note that in pseudocode, all in-
structions are of size 1.
Our original intention was to have two different functions, one function policy :
N → {short jump, absolute jump, long jump} to associate jumps to their in-
tended encoding, and a function δ : N → Word to associate pseudo-addresses
to machine addresses. δ would use policy to determine the size of jump instruc-
tions. This turned out to be suboptimal from the algorithmic point of view and
impossible to prove correct.
From the algorithmic point of view, in order to create the policy function,
we must necessarily have a translation from pseudo-addresses to machine ad-
dresses (i.e. a δ function): in order to judge the distance between a jump and
its destination, we must know their memory locations. Conversely, in order to
create the δ function, we need to have the policy function, otherwise we do not
know the sizes of the jump instructions in the program.
Much the same problem appears when we try to prove the algorithm correct:
the correctness of policy depends on the correctness of δ, and the correctness
of δ depends on the correctness of policy.
We solved this problem by integrating the policy and δ algorithms. We
now have a function δ : N → Word × bool which associates a pseudo-address
to a machine address. The boolean denotes a forced long jump; as noted in the
previous section, if during the fixed point computation an absolute jump changes
to be potentially re-encoded as a short jump, the result is actually a long jump.
It might therefore be the case that jumps are encoded as long jumps without
this actually being necessary, and this information needs to be passed to the
code generating function.
The assembler function encodes the jumps by checking the distance between
source and destination according to δ, so it could select an absolute jump in a
situation where there should be a long jump. The boolean is there to prevent
this from happening by indicating the locations where a long jump should be
encoded, even if a shorter jump is possible. This has no effect on correctness,
since a long jump is applicable in any situation.
The algorithm, shown in Figure 4, works by folding the function f over the
entire program, thus gradually constructing sigma. This constitutes one step in
the fixed point calculation; successive steps repeat the fold until a fixed point is
reached. We have abstracted away the case where an instruction is not a jump,
since the size of these instructions is constant.
Parameters of the function f are:
– a function labels that associates a label to its pseudo-address;
– old sigma, the δ function returned by the previous iteration of the fixed
point calculation;
– instr, the instruction currently under consideration;

function f(labels, old sigma, instr, ppc, acc)
    ⟨added, pc, sigma⟩ ← acc
    if instr is a backward jump to j then
        length ← jump size(pc, sigma1(labels(j)))              ▹ compute jump distance
    else if instr is a forward jump to j then
        length ← jump size(pc, old sigma1(labels(j)) + added)
    end if
    old length ← old sigma1(ppc)
    new length ← max(old length, length)                       ▹ length must never decrease
    old size ← old sigma2(ppc)
    new size ← instruction size(instr, new length)             ▹ compute size in bytes
    new added ← added + (new size − old size)                  ▹ keep track of total added bytes
    new sigma ← old sigma
    new sigma1(ppc + 1) ← pc + new size
    new sigma2(ppc) ← new length                               ▹ update σ
    return ⟨new added, pc + new size, new sigma⟩
end function

Fig. 4. The heart of the algorithm

– ppc, the pseudo-address of instr;


– acc, the fold accumulator, which contains added (the number of bytes added
to the program size with respect to the previous iteration), pc (the highest
memory address reached so far), and of course sigma, the δ function under
construction.

The first two are parameters that remain the same through one iteration, the
final three are standard parameters for a fold function (including ppc, which is
simply the number of instructions of the program already processed).
The δ functions used by f are not of the same type as the final δ func-
tion: they are of type δ : N → N × {short jump, absolute jump, long jump}; a
function that associates a pseudo-address with a memory address and a jump
length. We do this to ease the comparison of jump lengths between iterations.
In the algorithm, we use the notation sigma1 (x) to denote the memory address
corresponding to x, and sigma2 (x) for the jump length corresponding to x.
Note that the δ function used for label lookup varies depending on whether
the label is behind our current position or ahead of it. For backward branches,
where the label is behind our current position, we can use sigma for lookup,
since its memory address is already known. However, for forward branches, the
memory address of the address of the label is not yet known, so we must use
old sigma.
We cannot use old sigma without change: it might be the case that we have
already increased the size of some branch instructions before, making the pro-
gram longer and moving every instruction forward. We must compensate for this
by adding the size increase of the program to the label’s memory address ac-
cording to old sigma, so that branch instruction spans do not get compromised.
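To make the shape of this computation concrete, the following Python sketch is our reading of one iteration step and of the surrounding fixed point loop; it is not the Matita development. Jump kinds only ever grow, a demotion from absolute back to short is recorded as a forced long jump, and the sizes and 2 KB segment rule are the illustrative MCS-51-style values from Figure 2:

ORDER = {'short': 0, 'absolute': 1, 'long': 2}
SIZE = {'short': 2, 'absolute': 2, 'long': 3}     # MCS-51 unconditional jumps

def smallest_kind(src, dest):
    """Cheapest jump bridging src -> dest (illustrative MCS-51-like rules)."""
    if -128 <= dest - src <= 127:
        return 'short'
    if src // 2048 == dest // 2048:               # same 2 KB segment
        return 'absolute'
    return 'long'

def step(jumps, labels, old_sigma):
    """One fold over the program. 'jumps' maps ppc -> target label (or None for
    non-jumps, abstracted as size 1); sigma maps ppc -> (address, kind, forced)."""
    sigma = {0: (0, 'short', False)}
    pc = added = 0
    for ppc in range(len(jumps)):
        old_addr, old_kind, _ = old_sigma[ppc]
        old_size = old_sigma[ppc + 1][0] - old_addr
        target, kind, forced = jumps[ppc], 'short', False
        if target is not None:
            dest_ppc = labels[target]
            dest = (sigma[dest_ppc][0] if dest_ppc <= ppc       # backward jump
                    else old_sigma[dest_ppc][0] + added)        # forward jump
            kind = smallest_kind(pc, dest)
            if ORDER[kind] < ORDER[old_kind]:
                if old_kind == 'absolute' and kind == 'short':
                    kind, forced = 'long', True   # short/absolute swap is unsafe
                else:
                    kind = old_kind               # jump lengths never decrease
        size = SIZE[kind] if target is not None else 1
        sigma[ppc] = (pc, kind, forced)
        added += size - old_size
        pc += size
        sigma[ppc + 1] = (pc, 'short', False)
    return sigma, added

def fixed_point(jumps, labels):
    """Iterate 'step' until sigma stabilises (at most 2n + 1 passes)."""
    n = len(jumps)
    old, pc = {}, 0
    for ppc in range(n):                          # seed: every jump short
        old[ppc] = (pc, 'short', False)
        pc += SIZE['short'] if jumps[ppc] is not None else 1
    old[n] = (pc, 'short', False)
    for _ in range(2 * n + 1):
        new, _added = step(jumps, labels, old)
        if new == old:
            break
        old = new
    return old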

sigma policy specification ≡ λprogram.λsigma.
    sigma 0 = 0 ∧
    let instr list ≡ code program in
    ∀ppc. ppc < |instr list| →
        let pc ≡ sigma ppc in
        let instruction ≡ fetch pseudo instruction instr list ppc in
        let next pc ≡ sigma (ppc + 1) in
        next pc = pc + instruction size sigma ppc instruction ∧
        (pc + instruction size sigma ppc instruction < 2^16 ∨
         (∀ppc′. ppc′ < |instr list| → ppc < ppc′ →
             let instruction′ ≡ fetch pseudo instruction instr list ppc′ in
             instruction size sigma ppc′ instruction′ = 0) ∧
         pc + instruction size sigma ppc instruction = 2^16)

Fig. 5. Main correctness statement

4 The Proof
In this section, we present the correctness proof for the algorithm in more de-
tail. The main correctness statement is shown, slightly simplified, in Figure 5.
Informally, this means that when fetching a pseudo-instruction at ppc, the trans-
lation by δ of ppc + 1 is the same as δ(ppc) plus the size of the instruction at
ppc. That is, an instruction is placed consecutively after the previous one, and
there are no overlaps. The rest of the statement deals with memory size: either
the next instruction fits within memory (next pc < 2^16) or it ends exactly at
the memory limit, in which case it must be the last translated instruction in the
program (enforced by specifying that the size of all subsequent instructions is 0:
there may be comments or cost annotations that are not translated).
Finally, we enforce that the program starts at address 0, i.e. δ(0) = 0. It may
seem strange that we do not explicitly include a safety property stating that
every jump instruction is of the right type with respect to its target (akin to
the lemma from Figure 7), but this is not necessary. The distance is recalculated
according to the instruction addresses from δ, which implicitly expresses safety.
Since our computation is a least fixed point computation, we must prove ter-
mination in order to prove correctness: if the algorithm is halted after a number
of steps without reaching a fixed point, the solution is not guaranteed to be
correct. More specifically, branch instructions might be encoded which do not
coincide with the span between their location and their destination.
Proof of termination rests on the fact that the encoding of branch instructions
can only grow larger, which means that we must reach a fixed point after at most
2n iterations, with n the number of branch instructions in the program. This
worst case is reached if at every iteration, we change the encoding of exactly one
branch instruction; since the encoding of any branch instruction can change first
from short to absolute, and then to long, there can be at most 2n changes.

4.1 Fold Invariants


In this section, we present the invariants that hold during the fold of f over the
program. These will be used later on to prove the properties of the iteration.
During the fixed point computation, the δ function is implemented as a trie for
ease of access; computing δ(x) is achieved by looking up the value of x in the
trie. Actually, during the fold, the value we pass along is a pair N × ppc pc map.
The first component is the number of bytes added to the program so far with
respect to the previous iteration, and the second component, ppc pc map, is the
actual δ trie (which we’ll call strie to avoid confusion).
out of program none ≡ λprefix.λstrie.
    ∀i. i < 2^16 → (i > |prefix| ↔ lookup opt i (snd strie) = None)

The first invariant states that any pseudo-address not yet examined is not
present in the lookup trie.
not jump default ≡ λprefix.λstrie. ∀i. i < |prefix| →
    ¬is jump (nth i prefix) → lookup i (snd strie) = short jump

This invariant states that when we try to look up the jump length of a pseudo-
address where there is no branch instruction, we will get the default value, a
short jump.
jump increase ≡ λprefix.λop.λp. ∀i. i < |prefix| →
    let oj ≡ lookup i (snd op) in
    let j ≡ lookup i (snd p) in jmpleq oj j

This invariant states that between iterations (with op being the previous iter-
ation, and p the current one), jump lengths either remain equal or increase.
It is needed for proving termination. We now proceed with the safety lem-
mas. The lemma in Figure 6 is a temporary formulation of the main property
sigma policy specification. Its main difference from the final version is that
it uses instruction size jmplen to compute the instruction size. This func-
tion uses j to compute the span of branch instructions (i.e. it uses the δ under
construction), instead of looking at the distance between source and destination.

sigma compact unsafe ≡ λprefix.λstrie. ∀n. n < |prefix| →
    match lookup opt n (snd strie) with
      None ⇒ False
      Some ⟨pc, j⟩ ⇒
        match lookup opt (n + 1) (snd strie) with
          None ⇒ False
          Some ⟨pc1, j1⟩ ⇒
            pc1 = pc + instruction size jmplen j (nth n prefix)

Fig. 6. Temporary safety property



sigma safe ≡ λprefix.λlabels.λold strie.λstrie. ∀i. i < |prefix| →
    ∀dest label. is jump to (nth i prefix) dest label →
        let paddr ≡ lookup labels dest label in
        let ⟨j, src, dest⟩ ≡
            if paddr ≤ i then
                let ⟨_, j⟩ ≡ lookup i (snd strie) in
                let ⟨pc plus jl, _⟩ ≡ lookup (i + 1) (snd strie) in
                let ⟨addr, _⟩ ≡ lookup paddr (snd strie) in
                ⟨j, pc plus jl, addr⟩
            else
                let ⟨_, j⟩ ≡ lookup i (snd strie) in
                let ⟨pc plus jl, _⟩ ≡ lookup (i + 1) (snd old strie) in
                let ⟨addr, _⟩ ≡ lookup paddr (snd old strie) in
                ⟨j, pc plus jl, addr⟩
        in
        match j with
          short jump ⇒ short jump valid src dest
          absolute jump ⇒ absolute jump valid src dest
          long jump ⇒ True

Fig. 7. Safety property

This is because δ is still under construction; we will prove below that after the
final iteration, sigma compact unsafe is equivalent to the main property in Fig-
ure 7 which holds at the end of the computation. We compute the distance using
the memory address of the instruction plus its size. This follows the behaviour
of the MCS-51 microprocessor, which increases the program counter directly
after fetching, and only then executes the branch instruction (by changing the
program counter again).
There are also some simple properties to make sure that our policy remains
consistent, and to keep track of whether the fixed point has been reached. We
do not include them here in detail. Two of these properties give the values of δ
for the start and end of the program; δ(0) = 0 and δ(n), where n is the number
of instructions up until now, is equal to the maximum memory address so far.
There are also two properties that deal with what happens when the previous
iteration does not change with respect to the current one. added is a variable
that keeps track of the number of bytes we have added to the program size by
changing the encoding of branch instructions. If added is 0, the program has not
changed and vice versa.
We need to use two different formulations, because the fact that added is 0
does not guarantee that no branch instructions have changed. For instance, it
is possible that we have replaced a short jump with an absolute jump, which
does not change the size of the branch instruction. Therefore policy pc equal
states that old sigma1 (x) = sigma1 (x), whereas policy jump equal states that

old sigma2 (x) = sigma2 (x). This formulation is sufficient to prove termination
and compactness.
Proving these invariants is simple, usually by induction on the prefix length.

4.2 Iteration Invariants


These are invariants that hold after the completion of an iteration. The main
difference between these invariants and the fold invariants is that after the com-
pletion of the fold, we check whether the program size does not exceed 64
Kb, the maximum memory size the MCS-51 can address. The type of an itera-
tion therefore becomes an option type: None in case the program becomes larger
than 64 Kb, or Some δ otherwise. We also no longer pass along the number of
bytes added to the program size, but a boolean that indicates whether we have
changed something during the iteration or not.
If the iteration returns None, which means that it has become too large for
memory, there is an invariant that states that the previous iteration cannot
have every branch instruction encoded as a long jump. This is needed later in
the proof of termination. If the iteration returns Some δ, the fold invariants are
retained without change.
Instead of using sigma compact unsafe, we can now use the proper invariant:
sigma compact ≡ λprogram.λsigma.
    ∀n. n < |program| →
        match lookup opt n (snd sigma) with
          None ⇒ False
          Some ⟨pc, j⟩ ⇒
            match lookup opt (n + 1) (snd sigma) with
              None ⇒ False
              Some ⟨pc1, j1⟩ ⇒
                pc1 = pc + instruction size n (nth n program)

This is almost the same invariant as sigma compact unsafe, but differs in that
it computes the sizes of branch instructions by looking at the distance between
position and destination using δ. In actual use, the invariant is qualified: δ is
compact if there have been no changes (i.e. the boolean passed along is true).
This is to reflect the fact that we are doing a least fixed point computation: the
result is only correct when we have reached the fixed point.
There is another, trivial, invariant in case the iteration returns Some δ: it must
hold that fst sigma < 2^16. We need this invariant to make sure that addresses
do not overflow.
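For intuition, the compactness invariant can be read as the following executable check over a candidate sigma, here represented as a Python dict from pseudo-addresses to (address, jump length) pairs mirroring the trie described above; the function instruction_size is assumed to be given, and the sketch is ours rather than part of the formal development:

def is_compact(program, sigma, instruction_size, mem_size=2**16):
    """Instructions are placed consecutively, without overlaps, and the program
    fits in memory (cf. sigma compact); sigma maps ppc -> (address, jump length)."""
    if sigma.get(0, (None, None))[0] != 0:
        return False
    for ppc, instr in enumerate(program):
        if ppc not in sigma or ppc + 1 not in sigma:
            return False
        pc, _ = sigma[ppc]
        next_pc, _ = sigma[ppc + 1]
        if next_pc != pc + instruction_size(sigma, ppc, instr):
            return False
    return sigma[len(program)][0] <= mem_size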
The proof of nec plus ultra goes as follows: if we return None, then the
program size must be greater than 64 Kb. However, since the previous iteration
did not return None (because otherwise we would terminate immediately), the
program size in the previous iteration must have been smaller than 64 Kb.
Suppose that all the branch instructions in the previous iteration are encoded
as long jumps. This means that all branch instructions in this iteration are long

jumps as well, and therefore that both iterations are equal in the encoding of
their branch instructions. Per the invariant, this means that added = 0, and
therefore that all addresses in both iterations are equal. But if all addresses
are equal, the program sizes must be equal too, which means that the program
size in the current iteration must be smaller than 64 Kb. This contradicts the
earlier hypothesis, hence not all branch instructions in the previous iteration are
encoded as long jumps.
The proof of sigma compact follows from sigma compact unsafe and the fact
that we have reached a fixed point, i.e. the previous iteration and the current
iteration are the same. This means that the results of instruction size jmplen
and instruction size are the same.

4.3 Final Properties

These are the invariants that hold after 2n iterations, where n is the pro-
gram size (we use the program size for convenience; we could also use the
number of branch instructions, but this is more complex). Here, we only need
out of program none, sigma compact and the fact that δ(0) = 0.
Termination can now be proved using the fact that there is a k ≤ 2n, with n
the length of the program, such that iteration k is equal to iteration k + 1. There
are two possibilities: either there is a k < 2n such that this property holds, or
every iteration up to 2n is different. In the latter case, since the only changes
between the iterations can be from shorter jumps to longer jumps, in iteration 2n
every branch instruction must be encoded as a long jump. In this case, iteration
2n is equal to iteration 2n + 1 and the fixed point is reached.

5 Conclusion

In the previous sections we have discussed the branch displacement optimisation


problem, presented an optimised solution, and discussed the proof of termination
and correctness for this algorithm, as formalised in Matita.
The algorithm we have presented is fast and correct, but not optimal; a true
optimal solution would need techniques like constraint solvers. While outside the
scope of the present research, it would be interesting to see if enough heuristics
could be found to make such a solution practical for implementing in an exist-
ing compiler; this would be especially useful for embedded systems, where it is
important to have as small a solution as possible.
In itself the algorithm is already useful, as it results in a smaller solution than
the simple ‘every branch instruction is long’ used up until now—and with only
64 Kb of memory, every byte counts. It also results in a smaller solution than
the greatest fixed point algorithm that gcc uses. It does this without sacrificing
speed or correctness.
The certification of an assembler that relies on the branch displacement al-
gorithm described in this paper was presented in [6]. The assembler computes
the δ map as described in this paper and then works in two passes. In the first

pass it builds a map from instruction labels to addresses in the assembly code.
In the second pass it iterates over the code, translating every pseudo jump at
address src to a label l associated to the assembly instruction at address dst to
a jump of the size dictated by (δ src) to (δ dst). In case of conditional jumps,
the translated jump may be implemented with a series of instructions.
The proof of correctness abstracts over the algorithm used and only relies
on sigma policy specification (page 5). It is a variation of a standard 1-
to-many forward simulation proof [5]. The relation R between states just maps
every code address ppc stored in registers or memory to (δ ppc). To identify the
code addresses, an additional data structure is always kept together with the
source state and is updated by the semantics. The semantics is preserved only for
those programs whose source code operations (f ppc1 . . . ppcn ) applied to code
addresses ppc1 . . . ppcn are such that f (δ ppc1) . . . (δ ppcn) = f ppc1 . . . ppcn.
For example, an injective δ preserves a binary equality test f for code addresses,
but not pointer subtraction.
The main lemma (fetching simulation), which relies on sigma policy
specification and is established by structural induction over the source code,
says that fetching an assembly instruction at position ppc is equal to fetching
the translation of the instruction at position (δ ppc), and that the new incremen-
ted program counter is at the beginning of the next instruction (compactness).
The only exception is when the instruction fetched is placed at the end of code
memory and is followed only by dead code. Execution simulation is trivial be-
cause of the restriction over well behaved programs w.r.t. sigma. The condition
δ 0 = 0 is necessary because the hardware model prescribes that the first in-
struction to be executed will be at address 0. For the details see [6].
Instead of verifying the algorithm directly, another solution to the problem
would be to run an optimisation algorithm, and then verify the safety of the
result using a verified validator. Such a validator would be easier to verify than
the algorithm itself and it would also be efficient, requiring only a linear pass over
the source code to test the specification. However, it is surely also interesting
to formally prove that the assembler never rejects programs that should be
accepted, i.e. that the algorithm itself is correct. This is the topic of the current
paper.

5.1 Related Work


As far as we are aware, this is the first formal discussion of the branch displace-
ment optimisation algorithm.
The CompCert project is another verified compiler project. Their backend [5]
generates assembly code for (amongst others) subsets of the PowerPC and x86
(32-bit) architectures. At the assembly code stage, there is no distinction between
the span-dependent jump instructions, so a branch displacement optimisation
algorithm is not needed.

References
1. Asperti, A., Sacerdoti Coen, C., Tassi, E., Zacchiroli, S.: User interaction with the
Matita proof assistant. Automated Reasoning 39, 109–139 (2007)
2. Dickson, N.G.: A simple, linear-time algorithm for x86 jump encoding. CoRR
abs/0812.4973 (2008)
3. Hyde, R.: Branch displacement optimisation (2006),
http://groups.google.com/group/alt.lang.asm/msg/d31192d442accad3
4. Intel: Intel 64 and IA-32 Architectures Developer’s Manual,
http://www.intel.com/content/www/us/en/processors/architectures-
software-developer-manuals.html
5. Leroy, X.: A formally verified compiler back-end. Journal of Automated Reas-
oning 43, 363–446 (2009), http://dx.doi.org/10.1007/s10817-009-9155-4, doi:
10.1007/s10817-009-9155-4
6. Mulligan, D.P., Sacerdoti Coen, C.: On the correctness of an optimising as-
sembler for the intel MCS-51 microprocessor. In: Hawblitzel, C., Miller, D.
(eds.) CPP 2012. LNCS, vol. 7679, pp. 43–59. Springer, Heidelberg (2012),
http://dx.doi.org/10.1007/978-3-642-35308-6_7
7. Robertson, E.L.: Code generation and storage allocation for machines with span-
dependent instructions. ACM Trans. Program. Lang. Syst. 1(1), 71–83 (1979),
http://doi.acm.org/10.1145/357062.357067
8. Small device C compiler 3.1.0 (2011), http://sdcc.sourceforge.net/
9. Szymanski, T.G.: Assembling code for machines with span-dependent instructions.
Commun. ACM 21(4), 300–308 (1978),
http://doi.acm.org/10.1145/359460.359474
Analyzing the Next Generation Airborne
Collision Avoidance System

Christian von Essen1,⋆ and Dimitra Giannakopoulou2,⋆


1
Verimag, Grenoble, France
[email protected]
2
NASA Ames Research Center
Moffett Field, CA, USA
[email protected]

Abstract. The next generation airborne collision avoidance system,


ACAS X, departs from the traditional deterministic model on which
the current system, TCAS, is based. To increase robustness, ACAS X
relies on probabilistic models to represent the various sources of uncer-
tainty. The work reported in this paper identifies verification challenges
for ACAS X, and studies the applicability of probabilistic verification and
synthesis techniques in addressing these challenges. Due to shortcom-
ings of off-the-shelf probabilistic analysis tools, we developed a frame-
work that is designed to handle systems with similar characteristics as
ACAS X. We describe the application of our framework to ACAS X, and
the results and recommendations that our analysis produced.

Keywords: Markov decision processes, probabilistic verification, prob-


abilistic synthesis, aircraft collision avoidance.

1 Introduction

The current onboard collision avoidance standard, TCAS [7], has been successful
in preventing mid-air collisions. However, its deterministic logic limits robust-
ness in the presence of unanticipated pilot responses, as exposed by the collision
of two aircraft in 2002 over Überlingen, Germany [4]. To increase robustness,
Lincoln Laboratory has been developing a new system, ACAS X, which uses
probabilistic models to represent uncertainty. Simulation studies with recorded
radar data have confirmed that this novel approach leads to a significant im-
provement in safety and operational performance. The Federal Aviation Admin-
istration (FAA) has formed a team of organizations to mature the system, aiming
to make ACAS X the next international standard for collision avoidance.
The adoption of a completely new algorithmic approach to a safety-critical
system naturally poses a significant challenge for verification and certification.

⋆ The first author performed this work while employed by SGT Inc. as an intern at the
NASA Ames Research Center. This work was funded under the System-wide Safety
Analysis Technologies Project of the Aviation Safety Program, NASA ARMD.


Our goal in this work is to study the applicability of formal probabilistic verifi-
cation and synthesis techniques, which go beyond simulation studies [8,5]. Our
study was driven by tasks defined in collaboration with the ACAS X team to be
complementary to their verification efforts. During the course of our work, we
identified shortcomings of existing tools, which lead us to develop a framework
customized for ACAS X (or similar systems). In our framework, models are ex-
pressed in a traditional programming language for increased expressiveness, and
verification and synthesis algorithms are designed for scalability and efficiency.
The contributions of this work can be summarized as follows: 1) Develop-
ment of a faithful model for synthesis of the ACAS X controller, based on the
Lincoln Laboratory publications [6]; 2) Development of customized verification
and synthesis algorithms for efficient handling of ACAS X (and like) systems;
3) Identification of design and verification challenges for ACAS X as related to
probabilistic verification and synthesis; 4) Results obtained from the application
of our framework to ACAS X and recommendations for the ACAS X effort.
The results of our work will serve as input for the certification of ACAS X.
Due to access restrictions, we analyze a previous version of the system [6], but
are currently working with the ACAS X team to extend our work to the current
version. We believe that ACAS X presents researchers in probabilistic verification
and synthesis with a unique opportunity to focus on a relevant, safety-critical
case study. For this reason, we are preparing a public release of our models and
framework, to encourage other members of the community to build on our work.
The remainder of this paper is organized as follows. Section 2 describes the
ACAS X system as designed and deployed by the ACAS X team. In addition
to these techniques, our work implements and applies formal verification and
synthesis approaches, described in Sections 3 and 4. We discuss implementation
details in Section 5, with Section 6 concluding the paper.

2 The ACAS X System


Model Description. Similarly to the current standard TCAS, ACAS X [6] uses
several sources to estimate the current state of the plane on which it is deployed,
and the planes in its vicinity. If it detects the possibility of an imminent collision
(less than 40 seconds away), it produces vertical maneuver advisories (to climb
or descend) in order to avoid the collision. Both TCAS and ACAS X operate at
a frequency of one state update and advisory per second.
The ACAS X model consists of two airplanes on collision course. Loss of
Horizontal Separation, from now on denoted as LHS, describes the situation where
two airplanes are in the exact same location when their height difference is
ignored. A Near Mid-Air Collision (NMAC) occurs when the two airplanes are
within 100 ft of each other when LHS occurs. We refer to the plane equipped with
ACAS X as our plane (often referred to as ownship in the literature), and the
other plane as intruder (similarly to [6]).
The model has 5 parameters: (1) h ∈ [−1000, 1000] ft, the height difference between the two planes; (2) dh0, dh1 ∈ [−2500, 2500] ft/min, our and the intruder's climbing rates; (3) adv, the advisory produced by ACAS X one second ago; (4) ps, the pilot state. Pilot state and advisories can take the following values
— note that the pilot can either follow the advisory (i.e., ps = adv) or perform
random maneuvers (i.e., ps = COC), since studies have shown that pilots may
not react immediately or at all to an advisory:

– COC stands for “clear of conflict” — the pilot is free to choose how to control
the plane.
– CLI1500 / DES1500 stand for “climb / descend with 1500 ft/min”, respectively; they advise the pilot to change the climbing rate with an acceleration of 1/4 g until reaching a climbing rate of 1500 ft/min / −1500 ft/min, respectively.
– Advisories SCLI1500 / SDES1500 and SCLI2500 / SDES2500 are similar but employ an acceleration of 1/3 g. Moreover, SCLI2500 / SDES2500 target a final climbing rate of 2500 ft/min / −2500 ft/min, respectively.

In describing the dynamics of the system, we use X ∼ P to denote that X


is sampled according to probability distribution P . Moreover, N (μ, δ) denotes a
normal distribution with mean μ and standard deviation δ. Lastly, we denote by
{p1 : e1 , p2 : e2 , . . .} the distribution in which ei has probability pi . Given a state
(dh0 , dh1 , h, adv, ps) and an advisory a, the dynamics of the system are given
by the following equations, which together describe a continuous probability
distribution αc(dh0′, dh1′, h′, adv′, ps′ | dh0, dh1, h, adv, ps, a), where the primed
versions of variables (e.g., dh0′) characterize the next state. In these equations,
function f returns the appropriate acceleration in ft/s² if the desired climbing
rate has not been reached yet, and 0 otherwise.

adv′ = a;    dh1′ ∼ dh1 + 60·N(0, 3);    h′ = h + ((dh0 + dh0′)/2 − (dh1 + dh1′)/2)/60

ps′ ∼   {1 : a}                   if a = COC ∨ a = ps
        {1/6 : a, 5/6 : COC}      if a ∈ {CLI1500, DES1500} ∧ a ≠ ps
        {1/5 : a, 4/5 : COC}      if a ∈ {SCLI*, SDES*} ∧ a ≠ ps

dh0′ ∼ dh0 + 60·N(0, 3)           if ps′ = COC
dh0′ ∼ dh0 + {1 : f(dh0, ps′)}    otherwise
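To make these dynamics concrete, the following Python sketch samples one successor state of the continuous model. It is only an illustration of the equations above, not the authors' implementation: the string encoding of advisories, the 32.2 ft/s² value for 1 g, and the factor-60 unit conversion inside f are our own assumptions.

import random

G_FT_S2 = 32.2   # 1 g in ft/s^2 (assumed constant; not stated in the paper)

TARGET = {"CLI1500": 1500.0, "DES1500": -1500.0, "SCLI1500": 1500.0,
          "SDES1500": -1500.0, "SCLI2500": 2500.0, "SDES2500": -2500.0}

def f(dh0, ps):
    # Change of our climbing rate (ft/min over one 1 s step) under pilot state ps:
    # 1/4 g for CLI/DES advisories, 1/3 g for strengthened ones, 0 once the advised
    # rate is reached.  The ft/s^2 -> (ft/min)/s conversion (factor 60) is our reading.
    if ps == "COC":
        return 0.0
    accel = G_FT_S2 * (1.0 / 3.0 if ps.startswith("S") else 1.0 / 4.0) * 60.0
    gap = TARGET[ps] - dh0
    return max(-accel, min(accel, gap))      # move towards the target, never past it

def step(dh0, dh1, h, adv, ps, a):
    # Sample one successor (dh0', dh1', h', adv', ps') given the selected advisory a.
    adv_n = a
    dh1_n = dh1 + 60.0 * random.gauss(0.0, 3.0)           # intruder moves randomly
    if a == "COC" or a == ps:                             # pilot response model
        ps_n = a
    elif a in ("CLI1500", "DES1500"):
        ps_n = a if random.random() < 1.0 / 6.0 else "COC"
    else:                                                 # strengthened advisories
        ps_n = a if random.random() < 1.0 / 5.0 else "COC"
    if ps_n == "COC":
        dh0_n = dh0 + 60.0 * random.gauss(0.0, 3.0)       # free random maneuver
    else:
        dh0_n = dh0 + f(dh0, ps_n)                        # pilot follows the advisory
    h_n = h + ((dh0 + dh0_n) / 2.0 - (dh1 + dh1_n) / 2.0) / 60.0
    return dh0_n, dh1_n, h_n, adv_n, ps_n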

Model Discretization. Similarly to [6], we generate an ACAS X controller


by analyzing a Markov Decision Process (MDP) obtained through discretiza-
tion of the above model. In our implementation, the number of discrete values
that replace each continuous parameter is configurable by a resolution vector
(rdh0 , rdh1 , rh ), where rdh0 , rdh1 , rh define the number of points below and above
0 used to discretise dh0 , dh1 , h, respectively. Formally, the set of discretization
points is defined as Drdh0 ,rdh1 ,rh = {−2500, −2500 + 2500/rdh0 , . . . , 2500} ×
{−2500, −2500 + 2500/rdh1 , . . . , 2500} × {−1000, −1000 + 1000/rh, . . . , 1000}.
The resolution of the controller defined in [6] is (10, 10, 10).
The following two techniques are then employed in [6] to calculate the transition distribution over D_{rdh0,rdh1,rh}. Instead of sampling from the continuous normal distribution (N(0, 3), N(0, 3)) in the equations for dh0′ and dh1′, we sample from the distribution {1/6 : (0, δ), 1/6 : (0, −δ), 1/3 : (0, 0), 1/6 : (δ, 0), 1/6 : (−δ, 0)}, where δ = 3·√3. This is called sigma point sampling. After having modified the equations with sigma point sampling, we obtain a discrete probability distribution α̃(dh0′, dh1′, h′, adv′, ps′ | dh0, dh1, h, adv, ps, a).
Secondly, linear interpolation matches the points of α̃ to the discretization points in D_{rdh0,rdh1,rh}. Let γdh0 be the distance between two discretization points of the climbing rate of our plane, and let γdh1 and γh be defined analogously. We define the function η to capture how “close” a point (dh0, dh1, h) is to a discretization point (dh0′, dh1′, h′) immediately surrounding it as

η((dh0, dh1, h), (dh0′, dh1′, h′)) = (1 − |dh0 − dh0′|/γdh0) · (1 − |dh1 − dh1′|/γdh1) · (1 − |h − h′|/γh),

and 0 for all other points. Based on these, we define the transition relation as α(sd | s, a) = Σ_{s′} α̃(s′ | s, a) · η(s′, sd).
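This interpolation step can be illustrated with a short Python sketch that distributes the probability mass of one continuous point over the (up to eight) grid points surrounding it, using the weight η defined above. The uniform grids and all helper names are our own illustrative choices, not the paper's code.

import numpy as np

def make_grid(lo, hi, r):
    # 2r+1 evenly spaced discretization points between lo and hi
    return np.linspace(lo, hi, 2 * r + 1)

def interpolation_weights(point, grids):
    # Return (grid_index_tuple, eta) pairs for the cell containing `point`;
    # eta is the product of (1 - normalized distance) per axis, so the weights
    # of the surrounding corners sum to 1.
    lows, axis_w = [], []
    for x, g in zip(point, grids):
        x = min(max(x, g[0]), g[-1])                     # clamp to the grid
        i = min(int(np.searchsorted(g, x, side="right")) - 1, len(g) - 2)
        gamma = g[i + 1] - g[i]                          # spacing of this axis
        w_hi = (x - g[i]) / gamma
        lows.append(i)
        axis_w.append((1.0 - w_hi, w_hi))                # weight of lower / upper corner
    corners = []
    for c in range(8):                                   # 2^3 corners of the cell
        idx, eta = [], 1.0
        for d in range(3):
            bit = (c >> d) & 1
            idx.append(lows[d] + bit)
            eta *= axis_w[d][bit]
        corners.append((tuple(idx), eta))
    return corners

# Example with the resolution (10, 10, 10) of [6]
grids = (make_grid(-2500.0, 2500.0, 10),   # dh0
         make_grid(-2500.0, 2500.0, 10),   # dh1
         make_grid(-1000.0, 1000.0, 10))   # h
for idx, eta in interpolation_weights((-120.0, 310.0, 42.0), grids):
    print(idx, round(eta, 4))

Summing these η-weighted masses over the sigma points of α̃ then yields the grid transition α(sd | s, a) defined above.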
Controller Generation. In order to generate a controller, each ACAS X ad-
visory receives a cost/reward, where costs are rewards with negative values.
Reward COC is associated with switching from any alerting state to COC; Alert
is a cost associated with switching from COC to either CLI1500 or DES1500;
Reversal is a cost associated with switching from any climbing to any descend-
ing advisory, or vice versa; Strengthening is a cost associated with switching
from any climb/descent advisory with goal 1500 ft / min to SCLI2500/SDES2500,
respectively; NMAC is a cost associated with the occurrence of an NMAC.
We henceforth refer to the costs/rewards as “weights”, thus describing the
fact that they capture the relative importance of different quality criteria of the
controller. Let c(s, a) be the sum of costs and rewards ACAS X receives for
selecting advisory a in state s (for example, if a in state s activates an alert,
then c(s, a) = Alert). Moreover, let E_{γ(s′|s,a)}[α(s′)] describe the expected value of some function α under the probability distribution γ(s′ | s, a) over the successor states s′ of s when selecting action a. We then calculate a table equivalent to the family of functions T_t(s, a) := c(s, a) + E_{γ(s′|s,a)}[min_{a′∈A(s′)} T_{t−1}(s′, a′)], 1 ≤ t ≤ 40, where A(s) stands for the set of advisories admissible in s. Further, T_0(s, a) = NMAC if s models an NMAC, and 0 otherwise. Essentially, for each state and each advisory, the table stores the expected accumulated cost.
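This table computation is a finite-horizon value iteration, which can be sketched in Python as follows. The hooks transitions(s, a), cost(s, a), actions(s) and is_nmac(s) are hypothetical placeholders for the quantities defined above; the paper does not prescribe such an API.

def synthesize_table(states, actions, transitions, cost, is_nmac, nmac_cost, horizon=40):
    # T_t(s, a) = c(s, a) + E[ min_{a'} T_{t-1}(s', a') ] for 1 <= t <= horizon,
    # with T_0(s, a) = nmac_cost if s models an NMAC and 0 otherwise.
    # transitions(s, a) must return (successor, probability) pairs.
    T_prev = {(s, a): (nmac_cost if is_nmac(s) else 0.0)
              for s in states for a in actions(s)}
    for _ in range(1, horizon + 1):
        # best achievable value in each successor state under T_{t-1}
        best_prev = {s: min(T_prev[(s, a)] for a in actions(s)) for s in states}
        T = {}
        for s in states:
            for a in actions(s):
                expected = sum(p * best_prev[s2] for s2, p in transitions(s, a))
                T[(s, a)] = cost(s, a) + expected
        T_prev = T
    return T_prev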
Controller Deployment. The generated controller is deployed as look-up ta-
ble T_t(s, a) described previously. Linear interpolation is used to determine the advisory for a state s in the continuous world at time t until loss of horizontal separation by: arg min_{a∈A(s)} Σ_{s′∈D_{rdh0,rdh1,rh}} η(s, s′) · T_t(s′, a).
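A possible reading of this run-time lookup, reusing the interpolation helper sketched earlier, is given below; the table layout and the `admissible` hook are our own assumptions, since the concrete encoding of the deployed table is not described.

def select_advisory(state, t, table, grids, admissible):
    # state = (dh0, dh1, h, adv, ps); `table` is assumed to map
    # (t, grid_index, adv, ps, a) to the stored cost T_t(s', a).
    dh0, dh1, h, adv, ps = state
    corners = interpolation_weights((dh0, dh1, h), grids)
    def score(a):
        return sum(eta * table[(t, idx, adv, ps, a)] for idx, eta in corners)
    return min(admissible(state), key=score)   # interpolated arg-min over advisories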
Figure 1(a) illustrates a part of the interpolated strategy generated according
to [6]. In the figures, note that LHS occurs at time 0, located on the left hand side
of the plots, so time in the plots flows from right to left. Thorough examination
of such plots is part of the validation of ACAS X but goes beyond the scope of
this paper. Our framework can easily generate such plots, though.
Fig. 1. Two controllers generated with the same weight in different resolutions: (a) resolution (10, 10, 10); (b) resolution (20, 20, 20). The x-axis shows time until LHS, the y-axis the height difference. Parameters dh0 and dh1 are zero throughout, and adv = ps = COC. Color indicates the selected advisory: black (0) for COC, red (1) for CLI1500, yellow (2) for DES1500.

We would like to point out two features of the generated controller. Firstly, if the airplanes start out at the same height, then the controller waits for a long time before giving an advisory, as witnessed by the black space between the two "tails" on the right. This is because it is very unlikely that the two planes will remain at the same height for a long time (due to their random movement), and it is therefore better to wait until the intruder either starts climbing or descending, and then go in the opposite direction. Secondly, notice the "mouth" shape close to time 0 and around height difference 0. In this collision situation, ACAS X is not giving any advisory, although one would intuitively expect that some advisory would be more informative to the pilot than COC, which may be misleading. This is an artifact of the costs used for synthesis, and we describe a technique that identifies situations like these in Section 3.

3 Verification

To complement the ACAS X work that primarily uses simulation, we apply for-
mal analysis techniques to evaluate the ACAS X controller. Simulation-based
techniques are studied and discussed in Section 4, where we explore the design-
space of controllers and compare different generated controllers among them-
selves. In this section, we evaluate the ACAS X controller 1) in terms of the
quality criteria used for its generation, and 2) through model checking of PCTL
[3] properties, which are ideal for probabilistic models such as ACAS X’s. For
evaluation, we use models discretized at different resolutions, and could even use
different model characteristics and parameters (although we do not do the latter
in the experiments presented here).
The type of analysis that we perform provides a value v(s) for each state of
the discretized model. To easily compare results of analyses with each other and
with simulations, we define a probability distribution I(s) over the states of the
model as follows (similarly to [6]). The only states we consider are those at 40
seconds from LHS, and in which ps = adv = COC. Over those states, we first
define a continuous distribution over (dh0 , dh1 , h) ∈ R3 by sampling dh0 and
dh1 uniformly from [−1000, 1000] ft / min, denoted as dh0 ∼ U (−1000, 1000) and
dh1 ∼ U (−1000, 1000). To make a collision likely, and therefore to provoke the
controller into action, h is sampled from 40((dh1 − dh0 )/60) + N (0, 25).

To define an analogous distribution of Drdh0 ,rdh1 ,rh , we assign probability


masses to all three parameters so as to soak up the probability of the space
around them. That is, the probability of picking sample point dh0 is defined
as: P(dh0 − γdh0 /2 ≤ H0 ≤ dh0 + γdh0 /2), with H0 ∼ U (−1000, 1000). Note
that γdh0 , γdh1 and γh are defined as in Section 2. We define the discretized
probability of dh1 analogously. The discretized probability of h is defined as:
P(h − γh /2 ≤ H ≤ h + γh /2), where H ∼ 40((dh1 − dh0 )/60) + N (0, 25), i.e.,
the probability distribution of h depends on dh0 and dh1 . We then use I to
calculate the expected value EI(s) [v(s)].
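The construction of the discretized initial distribution can be sketched in Python as follows. The renormalization at the end is our own choice (the paper does not say how mass falling outside the grid is treated), and the grids are assumed to be uniformly spaced as in Section 2.

import math

def unif_mass(x, gamma, lo=-1000.0, hi=1000.0):
    # P(x - gamma/2 <= U <= x + gamma/2) for U ~ U(lo, hi)
    a, b = max(x - gamma / 2.0, lo), min(x + gamma / 2.0, hi)
    return max(b - a, 0.0) / (hi - lo)

def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def initial_distribution(dh0_grid, dh1_grid, h_grid):
    # Discretized I(s) over grid states at 40 s to LHS with ps = adv = COC:
    # dh0 and dh1 soak up U(-1000, 1000) mass, h soaks up the mass of
    # 40*(dh1 - dh0)/60 + N(0, 25) around each grid point.
    g0 = dh0_grid[1] - dh0_grid[0]          # gamma_dh0
    g1 = dh1_grid[1] - dh1_grid[0]          # gamma_dh1
    gh = h_grid[1] - h_grid[0]              # gamma_h
    dist = {}
    for dh0 in dh0_grid:
        p0 = unif_mass(dh0, g0)
        if p0 == 0.0:
            continue
        for dh1 in dh1_grid:
            p1 = unif_mass(dh1, g1)
            if p1 == 0.0:
                continue
            mu = 40.0 * (dh1 - dh0) / 60.0
            for h in h_grid:
                ph = norm_cdf(h + gh / 2.0, mu, 25.0) - norm_cdf(h - gh / 2.0, mu, 25.0)
                if p0 * p1 * ph > 0.0:
                    dist[(dh0, dh1, h, "COC", "COC")] = p0 * p1 * ph
    total = sum(dist.values())
    return {s: p / total for s, p in dist.items()}       # renormalize truncated tails

The expected value E_{I(s)}[v(s)] of an analysis result v is then simply sum(p * v(s) for s, p in I.items()).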

3.1 Influence of Resolution on Controller Evaluation


Our first step in evaluating the ACAS X controller involves calculating its per-
formance in evaluation models of different resolutions for the two climbing rates
and the height difference: (10, 10, 20), . . . , (10, 10, 50), (20, 20, 10) . . . (50, 50, 10)
and (20, 20, 20) . . . (50, 50, 50). 

For each of these resolutions, Figure 2 presents the evolution of the probability of seeing an NMAC versus the resolution. The three lines represent the three groups of increasing resolutions. Line "Height" represents resolutions (10, 10, n), while line "Climbing Rate" represents the resolutions (n, n, 10) and line "All" represents the resolutions (n, n, n), for n ∈ {10, 20, 30, 40, 50}.

Fig. 2. P(NMAC) of baseline controller in various resolutions.
These plots indicate that the probability of NMAC drops as we increase resolu-
tion. This in turn indicates (though does not guarantee) that a coarse resolution
provides a conservative estimate for the quality criteria of the controller. Lines
“Height” and “Climbing Rate” indicate that increasing the resolution of the
height difference has a stronger influence on the quality of the analysis than the
resolution of the climbing rate. This observation is reinforced by comparing lines
“Height” and “All”. The difference between these two lines is small, despite the
fact that an n-fold increase in resolution of the climbing rate leads to an n²-fold increase in state space.

3.2 PCTL Model Checking


The PCTL model checking engine that we have developed enables users to: (1)
vary the resolution of the model to get more precise results, and (2) analyse non-
trivial properties expressed in the PCTL formal property language. In contrast
to simulation, PCTL model checking allows an exhaustive search of the state
space and can thus uncover scenarios that simulations might easily miss. This is
important given the low probability of some of the properties we want to check.
Fig. 3. Trace plots for properties 1 and 3. The x-axis displays time to LHS, the y-axis the values of (dh0, dh1, h). The color of line h depicts the advisory, tagged above the line.

Property 1: Near Mid-Air Collision. Studies the probability of a near mid-air collision, formally P=? [F NMAC]. During analysis, we observed that the
most likely cases of this undesirable scenario stem from late reactions from
the pilot. We therefore decided to instead concentrate on NMACs that occur
despite immediate reactions to advisories by the pilot. We formulate this as
P=? (F NMAC | G adv = ps), i.e., what is the probability of reaching an NMAC
state although the pilot always reacts immediately.
The highest probability over all initial states that we encounter with the
conditional probability formula is 2.30 · 10−8 , as opposed to 6.92 · 10−4 with the
original formula. This confirms that the vast majority of NMACs happen because
the pilot does not react fast enough or at all. To understand the NMACs that
occur despite the fact that the pilot reacts to advisories, we analyzed some traces
that are most likely to fulfill P=? (F NMAC | G adv = ps). Figure 3 depicts such a
scenario: initially, our airplane is 1000ft below the intruder and we are climbing
with 2500 ft / min. The intruder, on the other hand, starts out with a climbing
rate of −250 ft / min. Until 22 seconds to LHS, the two airplanes maintain their
course, and therefore the height difference shrinks. If both planes were to continue
to maintain their course, then our plane would be well above the intruder at time
0 to LHS, so ACAS X does not alert.
At this point, the climbing rate of the intruder starts increasing, and the verti-
cal distance becomes −150 ft. The height difference levels off as a result of the
intruder’s increase in climbing rate from now on. ACAS X signals the DES1500
advisory seven seconds later, and SDES2500 one second after that. As a result,
our airplane starts descending steeply until it reaches −2500 ft / min. At the point
of the first alarm, the vertical distance is 50 ft, i.e., our plane is slightly above
the intruder. Unfortunately, the climbing rate of the intruder starts decreasing
at exactly the same point and from that point on, the two climbing rates are
not different enough to carry our plane outside of the danger zone and we end
up with a vertical distance of 100 ft, and hence an NMAC.
Traces like these capture exactly the type of unforeseen behaviour that led to
the Überlingen accident [4], and probabilistic model checking can detect cases
like these easily. We consider it encouraging that the most likely case of colli-
sion requires relatively complex behaviour of the intruder (first increasing the
climbing rate, then decreasing it, at exactly the right point in time).

Property 2: No advisory despite collision. Studies the probability of issu-


ing no advisory although a future NMAC is likely, formally P=? [F(P=1 [X COC] ∧
P>0.1 [F NMAC])]. This formula was motivated by our previous observation
of Figure 1(a) in Section 2, according to which there is an area where ACAS X issues no advisory although an NMAC is imminent. Figure 4 shows the probability of the formula for all states in which dh0 = dh1 = 0 ft/min and adv = ps = COC. This probability is 1 until about 12 seconds away if the height difference between the planes is less than 100 ft. Model checking the formula, however, reveals that among all initial states, the highest probability is 0.3%, so getting into such a situation is improbable.

Fig. 4. Probability of fulfilling property 2. Plot parameters as in Figure 1(a); color depicts the probability.
Property 3: Split Advisory. Studies the probability of issuing an alert, switch-
ing it off, and then switching an alert on again (a split advisory), formally
P=? [F(¬ COC ∧P=1 [X COC] ∧ P>0 [F ¬ COC])]. Even though during controller gen-
eration ACAS X penalizes reversals, these costs only reflect immediate changes
in controller advisories. Split advisories are also undesirable, but are harder to
capture during controller generation. The PCTL property described above can
however be used to study how likely such situations are. Analysis of the model
checking results revealed that a main cause for such situations is the pilot not
following the advisory. We therefore refined the property similarly to Property
1, by checking cases where split advisories occur under the condition that the
pilot always reacts immediately to advisories.
Figure 3 depicts a split advisory scenario under the refined property. Initially
(at 40 seconds to LHS), our plane is 830 ft above the intruder and descending with
2500 ft / min, while the intruder is in level flight. The vertical distance is therefore
decreasing. Around 19 seconds into the scenario, the intruder starts descending,
and soon after, ACAS X advises CLI1500 and maintains this advisory for 2
seconds, before switching it off again. Accordingly, the rate of descent of our
plane gradually reduces to 1500 ft / min. The advisory is then switched off, as
the intruder stops descending, effectively moving out of the way of our plane.
ACAS X switches to COC but, a second later, gives advisories DES1500, followed
by SDES2500, as the intruder’s rate of descent increases again.
Let us further analyze this generated scenario. The first climb advisory aimed
at avoiding a collision that would be likely if our plane continued to descend at
the same rate. It could not force the pilot to increase the rate of descent further,
since 2500 ft / min already is the maximum. Therefore, climbing was the only
possibility. Then the intruder stopped descending, which reduced the probability
of colliding with our current climbing rate. This may have caused ACAS X to
shut the advisory off. Shortly before ACAS X switched the advisory back on, the
difference in climbing rates was 1000 ft/min, and the height difference was −30 ft. Since we were about 15 seconds away from LHS, this amounted to a decreased vertical distance of about 260 ft. ACAS X decided to increase the vertical distance by increasing the rate of descent.

Fig. 5. Fictional and actual Pareto fronts: (a) simplified Pareto curve; (b) points generated for two objectives (axis labels: −Alerts, −P(NMAC)).
It would be interesting to study whether the cost function of ACAS X may
encourage such cases of split advisories. Given that (Alert + COC < Reversal),
it is possible that ACAS X decided to gain a small reward for selecting COC after
the first advisory, and additionally avoid the cost of a reversal that would be
incurred if the advisory was switched directly from a climb to a descent.

4 ACAS X Design Challenges


The generation of the ACAS X controller depends on two major design issues
that have so far been unexplored: the selection of weights, and the discretization
resolution. As reported in [6], the weights were selected based on an intuition of
the relative importance of the different quality criteria. In this section, we study
more systematic techniques for selecting controller weights, and investigate how
discretization resolution influences the generated controller.

4.1 Generating Controller Weights


Our goal is to systematically explore deterministic controllers whose perfor-
mances exceed requirements on NMAC, Alert, etc., provided by domain or certi-
fication experts. We refer to these requirements as “targeted performance”, or
simply “target”. Central to achieving this goal is an existing result, which states
that the performances of all controllers that can be generated by weights form a
convex Pareto front [2]. The Pareto front is n-dimensional, where n is the number
of costs/rewards. The performances of all possible controllers (even controllers
using randomization and memory) lie on the inside of the Pareto front.
For example, Figure 5(a) illustrates a two-dimensional Pareto front. The per-
formances of all deterministic controllers (green dots in the plot) lie on the ver-
tices of the Pareto front. The targeted performance is depicted as a black dot
in Figure 5(a). The box with a lower left corner at this target and extending to
infinity in all dimensions, defines the section of the Pareto front in which we are
interested. To find this section, we modified an algorithm presented in [9].
While the details of the approach are beyond the scope of this paper, the idea
can be summarized as follows. Initially, the optimal controller for each dimension
is generated, i.e., the controller with the lowest P(NMAC), the controller with the
lowest expected number of Alerts (i.e., zero), etc. We add the performance of
these controllers to the approximation of the Pareto front. These points, illus-
trated as the two green dots on the axes in Figure 5(a), reflect the performance
of the corresponding controller in terms of the selected quality attributes.
We then keep adding points to the Pareto front in the following way. We
calculate the convex hull of the points generated so far. This hull defines a set of
n-dimensional faces (lines, in our picture) that connect the points. Further, the
hull defines a lower bound for new points (the Pareto front is convex, so missing
points must lie on or above the hull). In the picture, the lines connecting the
green dots form the hull. The generated points also define an upper bound on
the space of controller performances, illustrated by the dashed lines in the figure.
The direction (normal) of the dashed line (separating hyperplane) is given by
the weights we used to generate the point. If there are any more points we can
generate, then these points exist between the hull and the upper bound.
Since we want to find new points in the box defined by the target, we pick
new weights so as to refine the face (by lowering the upper bound or breaking
up the face) above which there is a point that 1) lies inside the upper bound
2) lies above the target 3) is maximally far away from the face (as defined by
the Euclidean distance). We continue until we either prove that the target lies outside the upper bound (which means that no controller fulfilling the minimal
requirement exists) or until we have found enough points above the target.
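A much simplified, two-dimensional sketch of this exploration loop is given below. synthesize(w) is a hypothetical hook that runs the weighted synthesis of Section 2 and returns the performance point of the resulting controller; the refinement strategy shown here (splitting every facet on which a strictly better point is found) only approximates the sandwich algorithm of [9] adapted in the text, and it omits the pruning against the target box.

def explore_pareto(synthesize, target, eps=1e-9, max_points=20):
    # Both objectives are minimized; synthesize(w) returns the point (x1, x2)
    # achieved by the controller that is optimal for weights w = (w1, w2).
    points = {synthesize((1.0, 0.0)), synthesize((0.0, 1.0))}   # per-objective optima
    first = sorted(points)
    segments = [(first[0], first[-1])]
    while segments and len(points) < max_points:
        p, q = segments.pop()
        # supporting weights of the facet p-q (non-negative on a Pareto facet)
        w = (abs(p[1] - q[1]), abs(q[0] - p[0]))
        if w == (0.0, 0.0):
            continue
        r = synthesize(w)
        # if r lies strictly below the facet, it is a new vertex: split the facet
        if w[0] * r[0] + w[1] * r[1] < w[0] * p[0] + w[1] * p[1] - eps:
            points.add(r)
            segments.extend([(p, r), (r, q)])
    inside = [x for x in points if x[0] <= target[0] and x[1] <= target[1]]
    return sorted(points), inside   # all generated points, and those meeting the target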
Figure 5(b) presents a subset of the points generated by this approach on
Alert and NMAC exclusively. The target point and the box it defines are plotted in
black, and the points generated are plotted in red. The algorithm first generated
8 points outside the box. The first point generated within the target box (the
9th overall) is depicted in blue. We generated 10 more points after we found it.
We note that all subsequent 10 points that are generated also lie within the box.
The same effect has been observed for three dimensions. We conclude that this
algorithm is good at approximating the interesting part of the Pareto front (that
inside the box) once it finds the first point that meets the target specifications.
We have checked this algorithm against various targets, and it always either
finds a controller meeting the requirement, or proves that no such controller
exists. Note that finding a controller in the box is an NP-complete problem (easy
adaptation of proof from [1]). In the worst case, the algorithm has to generate all
points of the Pareto front of the model, of which there are exponentially many.
However, as the next section shows, little more than 100 points suffice to find a
controller meeting the requirement for various resolutions.
We believe that this technique can be very helpful as the ACAS X controller model evolves. Each evolution (be it a change in discretization or a change in parameters) necessitates tuning the weights anew (as witnessed by the
first experiment in the next section). Our approach makes it possible to select these weights semi-automatically by presenting domain experts with the trade-offs. They can
then select a controller they deem sufficient, or select an area for further refine-
ment.

4.2 Discretization Resolution


To study the effects of discretization resolution on the quality of the obtained
controller, we designed a number of experiments described in this section. We will
from now on refer to the controller presented in [6] as the “baseline” controller.
Experiment 1. This experiment aims to analyze the performance of controllers
generated at resolutions (20, 20, 20), (30, 30, 30), (40, 40, 40) and (50, 50, 50), us-
ing the weights of the baseline controller. Our expectation was that a higher
resolution would lead to a better performance, at least in terms of P(NMAC). How-
ever, the experiments showed that the controllers we generate by this method
do not necessarily perform better in all the quality attributes. Instead, higher
resolution controllers have a significantly higher P(NMAC) and significantly fewer
alerts than the baseline controller in the same resolutions.
The reason becomes clear when we consider the controller plots in Figure 1(a)
and Figure 1(b). The area in which an alert is signalled by the controller is sig-
nificantly smaller in Figure 1(b) when compared to Figure 1(a). To understand
the reason for this effect, we analyzed the controllers using the techniques from
Section 3. It turns out that controllers in higher resolutions indeed perform better
in the sense of having a higher expected reward than the baseline controller. Intu-
itively, the controllers use the additional information they receive from a higher res-
olution to improve the score they receive. To this end, the controllers improve their
score by reducing the expected number of alerts, at the cost of a higher P(NMAC).
This experiment made it clear to us that weights may balance out the quality
attributes of a controller differently, when different resolutions are considered. As
a consequence, we believe that it is more meaningful to systematically explore
the design space of controllers based on specific target quality attributes, as
presented in Section 4.1. One could then compute weights based on these target
values, and within the resolution where the generation will occur.
Experiment 2. Given the first experiment, we decided to study whether it is
possible to generate controllers that are better than the baseline controller in all
quality attributes, in higher resolutions. To generate a controller that performs
better than the baseline controller in a given resolution R = (rh , rdh0 , rdh1 ), we
first evaluate the performance of the baseline controller in resolution R. The result
is a vector v = (NMAC, Alert, Strengthening, Reversal, COC), which summa-
rizes the performance of the baseline controller when model checked in resolution
R (see Section 3 for more details). We then use the technique described above to
approximate the Pareto front above v. From the generated controllers that meet
the specification, we then pick the one with the lowest P(NMAC).
Figure 6(a) illustrates the obtained results. The bars show, for resolution fac-
tor n the performance of the baseline controller when checked against resolutions
(n, n, 10) (Climbing Rate), (10, 10, n) (Height) and (n, n, n) (All) respectively. It
can be seen that we were almost unable to decrease P(NMAC) using the climbing
rate alone. The relative performance of these controllers is consistently around
99.5%. When we increase the resolution of the height, then we get a relative per-
formance of about 85%. Finally, when increasing the resolution of both we see a
relative performance of about 83%. As witnessed in Section 3, the discretization
of height seems to have the biggest influence on controller quality. Interestingly,
the relative performance does not improve as we increase the resolution.

 

 
  
 
Fig. 6. Plots for Experiment 3: (a) controller quality vs. resolution; (b) controller quality checked in (50,50,100).

To further judge the quality of the generated controllers, we checked them


against resolution (50, 50, 100) and present the results in Figure 6(b). On the x-
axis, we have the controller resolution, while on the y-axis we have the probability
of a Near Mid-Air Collision. As before, “Height” stands for the controllers of
resolution (10, 10, n), “Climbing Rate” for the controllers of resolution (n, n, 10)
and “All” for the controllers of resolution (n, n, n). This experiment confirms
that increasing the resolution of the height difference between the two planes
has the most impact up to and including (10, 10, 30), after which we notice no
further improvement. In contrast to this, we notice further improvements in
category “All”. Our experiments indicate that the best ratio of resolution for
the three parameters is (n, n, 3 · n).
Experiment 3. Let v_R(c) denote the quality vector of a controller c in resolution R (i.e., the vector of P(NMAC), P(Alert), etc.). We organized this experiment to study if ∀ c1, c2, R1, R2 : v_{R1}(c1) ≥ v_{R1}(c2) ∧ R2 > R1 ⟹ v_{R2}(c1) ≥ v_{R2}(c2)
holds. To this end, we compared the performance of the controller we gener-
ated in resolution (20, 20, 20) to the baseline controller in resolutions (20, 20, 20)
and (50, 50, 100), and present the results in the following table. Note that the
higher resolution controller performs better than the baseline in all dimensions
in resolution (20, 20, 20); specifically, it is very close to the target performance
in everything but NMAC, where it is notably better.
                               NMAC            Alert     Strengthening  Reversal   COC
(10, 10, 10) in (20, 20, 20)   −4.850 · 10⁻⁴   −0.6310   −0.083         −0.019     0.629
(20, 20, 20) in (20, 20, 20)   −4.186 · 10⁻⁴   −0.6306   −0.081         −0.019     0.631
(10, 10, 10) in (50, 50, 100)  −2.897 · 10⁻⁴   −0.6245   −0.078         −0.020     0.622
(20, 20, 20) in (50, 50, 100)  −2.313 · 10⁻⁴   −0.6308   −0.078         −0.019     0.630

This attests to the efficacy of our Pareto front algorithm. When comparing this to the analysis results in resolution (50, 50, 100), we observe that while the higher resolution controller and the baseline controller are still very close in all characteristics except NMAC, the higher resolution controller is no longer strictly
better in all dimensions. For example, it uses slightly more alerts and slightly
more reversals. This is offset by the fact that the P(NMAC) of the higher resolution
controller is still significantly better than that of the baseline controller. To
summarize, the general tendencies of the relation of controllers when checked in
higher resolutions are the same, but the exact relations are not preserved.

4.3 Bayesian Model Checking


In this section, we evaluate the generated controllers using simulation (where
discretization is not required), and compare the results with model checking. To
this aim, we implemented a parallel Bayesian model checking engine [10], which
simulates the system based on the dynamic equations of Section 2. We used the
same initial distribution as [6], described in Section 3. In [6], the authors also
report on the use of a Bayesian network instead of the dynamic equations.
This approach allows us to run simulations, and to state “given the traces observed, the probability that property π holds lies in interval [a, b] with confidence c”. The level of confidence and the size of the interval are configurable. In the
following, we use this framework to estimate the probability that an NMAC
happens when using the baseline controller, and compare the results to Ex-
periment 2. Our analysis reports that the probability of NMAC lies in range
[2.48 · 10−4 , 2.58 · 10−4 ] with probability 95%. We needed to generate 38,796,000
samples to get this level of confidence for the given interval size.
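The flavor of this estimation can be conveyed with the following Python sketch, which maintains a Beta posterior over the unknown probability and stops once a fixed-width interval around the posterior mean carries the desired posterior mass. It is a simplification of the scheme of [10]; simulate_once() is a hypothetical hook returning True when a sampled trace exhibits an NMAC, and the uniform prior and burn-in are our own choices.

from scipy.stats import beta

def bayesian_estimate(simulate_once, half_width=0.05e-4, confidence=0.95,
                      prior=(1.0, 1.0), max_samples=50_000_000):
    a, b = prior                                   # Beta prior (uniform by default)
    for n in range(1, max_samples + 1):
        if simulate_once():                        # trace satisfies the property
            a += 1.0
        else:
            b += 1.0
        mean = a / (a + b)                         # posterior mean estimate
        lo, hi = max(mean - half_width, 0.0), min(mean + half_width, 1.0)
        coverage = beta.cdf(hi, a, b) - beta.cdf(lo, a, b)
        if n > 1000 and coverage >= confidence:    # small burn-in before stopping
            return (lo, hi), n
    return (lo, hi), max_samples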
We additionally applied this simulation technique to controllers of resolution
(10, 10, 10), . . . , (10, 10, 50) generated previously. The following table presents
the probability of seeing an NMAC for each of them.
Resolution       10             20             30             40             50
P(NMAC) · 10⁴    [2.51, 2.61]   [2.17, 2.27]   [2.08, 2.18]   [2.12, 2.22]   [2.27, 2.37]
We conclude that the trend follows that depicted in Figure 6(a): improvements
in performance are significant until we reach resolution (10, 10, 30), at which
point they taper off. We were unable to perform this analysis on controllers with
resolution larger than (20, 20, 20) because we could not fit the whole table into
memory at once. For (20, 20, 20), though, we obtain P(NMAC) ∈ [2.06 · 10⁻⁴, 2.16 · 10⁻⁴], i.e., a number very close to that of the controller generated for (10, 10, 30).

5 Implementation
We originally used existing probabilistic model checking tools for ACAS X but
encountered several limitations. First, we could not express the linear interpola-
tion needed in the controller evaluation. Second, we not only require capabilities
for the specification of a model, but also for loading generated controllers for
subsequent verification. Last but not least, for our multiple experiments involv-
ing increasing resolution, the state spaces we generate grow prohibitively large,
and there is a considerable slow-down that could benefit from parallelization,
which is unavailable in current releases of existing tools.
More specifically, the controller has 40 · ((2rdh0 + 1) · (2rdh1 + 1) · (2rh + 1) · 13) states in resolution (rdh0, rdh1, rh). So, for example, the model
from [6] has 4, 815, 720 states overall. A controller with resolution (50, 50, 50)
has 535, 756, 520 states. We wrote a simplified version of the model in [6] for
PRISM [8] (without linear interpolation, but with sigma point sampling). While
PRISM succeeded in loading the model as a BDD model, analyzing it was not
possible (we aborted conversion to the hybrid representation after 10 min).
These problems motivated us to create our own framework that takes advan-
tage of two key insights into the ACAS X model. Firstly, if we want to calculate
the values of any property in this model at time t, then we only need to keep
the value of time t − 1 in memory. This alone leads to a reduction of memory
consumption to 2.5%. Secondly, since we need to calculate value iteration steps
only a relatively small number of times for each state, it is possible to avoid
storing the transition matrix in memory and generate the values on-demand.
In addition, we parallelized value iteration, and the speed-up obtained in ex-
periments using up to 12 cores was almost linear (1.94 for 2 cores, 3.37 for 4
cores, 4.67 for 6 cores, 6.47 for 8 cores, 7.54 for 10 cores, 8.93 for 12 cores). Paral-
lelization proved essential for our experiments involving increasing discretization
resolution; generating the Pareto fronts for all cases took about 2 days, as op-
posed to more than a month.
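Both insights, together with the parallelization, can be illustrated by the following Python sketch: values of step t are computed from a single array holding the best values of step t−1, successors are regenerated on demand instead of being stored, and the states of each step are processed in parallel chunks. The multiprocessing layout and all hook names are ours; the actual engine is not written in Python.

import numpy as np
from multiprocessing import Pool

def backup_chunk(args):
    # One value-iteration backup for a chunk of states: successors(i, a) regenerates
    # the (successor index, probability) list on demand, so no transition matrix is stored.
    chunk, prev, actions, successors, cost = args
    out = np.empty((len(chunk), len(actions)))
    for row, i in enumerate(chunk):
        for col, a in enumerate(actions):
            out[row, col] = cost(i, a) + sum(p * prev[j] for j, p in successors(i, a))
    return out

def value_iteration(n_states, actions, successors, cost, terminal, horizon=40, workers=12):
    # `terminal` holds the per-state values at t = 0; only one such array is ever kept.
    prev = terminal.copy()
    chunks = np.array_split(np.arange(n_states), workers)
    with Pool(workers) as pool:                    # run under `if __name__ == "__main__":`
        for _ in range(horizon):
            args = [(c, prev, actions, successors, cost) for c in chunks]
            table = np.vstack(pool.map(backup_chunk, args))   # parallel backup, T_t(s, a)
            prev = table.min(axis=1)               # keep only min_a T_t(s, a) for step t+1
    return table                                   # the full table of the last step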

6 Conclusions and Future Work


ACAS X is a safety-critical system that the FAA plans on introducing as the
new standard for collision avoidance. The system that will be deployed is the
look-up table that is generated by the techniques described in [6]. It is therefore
reasonable that a large number of the verification efforts would focus on the
verification of the generated controller in operation. However, we believe that it
is meaningful to take advantage of the existence of models for additional formal
analysis both of the controller itself, and of the design choices.
Our experiments related to the effects of resolution on controller generation
were particularly interesting. For example, we observed that height discretization
is more effective than climbing rate alone, when exploring the space of controllers
better than a particular target. We therefore recommend increasing height reso-
lution first, when there is an upper bound on controller size that does not allow
for uniform discretization of all variables. In the future, we intend to carry out
more experiments in this domain in order to give more precise recommendations.
Some of the results that we obtained were also unexpected: the fact that a
higher resolution may balance the weights of quality attributes differently and
therefore result in a drop in performance of NMAC; or the fact that the relative
performance of two controllers may change when moving to higher resolutions.
This cautions us, in exploring the space of controllers, to ultimately evaluate their
relative performance in simulation. However, the Pareto-front-based techniques
for controller generation provide a systematic way of generating and comparing
controllers that can complement designer intuition.
PCTL model checking also proves valuable in studying properties of generated
controllers. However, more useful than the model checking itself is the capability to visualize its results and to generate traces that help with understanding the model checking results. We therefore found this latter aspect of our tools most helpful, together with a simulator that we built, which allows us to interactively explore generated controllers.
the model checker, to allow replay of the generated traces.
The techniques and tools that we developed are general, and the customization
for memory savings is applicable to problems that have a similar nature; for
example, it could be used in the domain of car collision avoidance systems,
which is important as we move towards self-driving cars. Our work on analysis
of ACAS X will continue beyond this paper. Our plans for future work include the
modeling of a reasonably adversarial pilot for the intruder plane, and alternative
representations of the look-up table for verification and deployment. Moreover,
we plan to study a version of ACAS X that is targeted to unmanned vehicles, as
well as experiment with the evaluation of generated controllers in the context of
hybrid verification tools, which the ACAS X team has expertise in.

Acknowledgement. We thank Guillaume Brat, and members of the ACAS X


team Ryan Gardner, Mykel Kochenderfer and Yanni Kouskoulas, for valuable
discussions and feedback.

References
1. Chatterjee, K.: Markov decision processes with multiple long-run average objec-
tives. In: Arvind, V., Prasad, S. (eds.) FSTTCS 2007. LNCS, vol. 4855, pp. 473–484.
Springer, Heidelberg (2007)
2. Forejt, V., Kwiatkowska, M., Parker, D.: Pareto curves for probabilistic model
checking. In: Chakraborty, S., Mukund, M. (eds.) ATVA 2012. LNCS, vol. 7561,
pp. 317–332. Springer, Heidelberg (2012)
3. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects of Computing 6, 102–111 (1994)
4. Johnson, C.: Final report: review of the BFU Überlingen accident report. Contract C/1.369/HQ/SS/04 to Eurocontrol (2004), http://www.dcs.gla.ac.uk/~johnson/Eurocontrol/Ueberlingen/Ueberlingen_Final_Report.PDF
5. Katoen, J.-P., Zapreev, I.S., Hahn, E.M., Hermanns, H., Jansen, D.N.: The ins and
outs of the probabilistic model checker MRMC. Perform. Eval. 68(2) (2011)
6. Kochenderfer, M.J., Chryssanthacopoulos, J.P.: Robust airborne collision avoid-
ance through dynamic programming. Project Report ATC-371, Massachusetts In-
stitute of Technology, Lincoln Laboratory (2011)

7. Kuchar, J., Drumm, A.C.: The traffic alert and collision avoidance system. Lincoln
Laboratory Journal 16(2), 277 (2007)
8. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic
real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
9. Rennen, G., van Dam, E.R., den Hertog, D.: Enhancement of sandwich algorithms
for approximating higher-dimensional convex Pareto sets. INFORMS Journal on
Computing 23(4), 493–517 (2011)
10. Zuliani, P., Platzer, A., Clarke, E.M.: Bayesian statistical model checking with
application to Stateflow/Simulink verification. Formal Methods in System De-
sign 43(2), 338–367 (2013)
Environment-Model Based Testing
of Control Systems: Case Studies *

Erwan Jahier1 , Simplice Djoko-Djoko1, Chaouki Maiza1 , and Eric Lafont2


1VERIMAG-CNRS, Grenoble, France
2 ATOS-WORLDGRID, Grenoble, France

Abstract. A reactive system reacts to an environment it tries to control. Lurette


is a black-box testing tool for such closed-loop systems. It focuses on environ-
ment modeling using Lutin, a language designed to perform guided random ex-
ploration of the System Under Test (SUT) environment, taking into account the
feedback. The test decision is automated using Lustre oracles resulting from the
formalisation of functional requirements.
In this article, we report on experimentations conducted with Lurette on two
industrial case studies. One deals with a dynamic system which simulates the
behavior of the temperature and the pressure of a fluid in a pipe. The other one
reports on how Lurette can be used to automate the processing of an existing test
booklet of a Supervisory Control and Data Acquisition (SCADA) library module.

Keywords: Reactive systems, Control-command, Dynamic systems, SCADA, Test


Booklets, Black-box testing, Requirements engineering, Synchronous Languages.

1 Introduction
Lurette is a black-box testing tool for reactive systems that automates the test decision
and the stimulation of the System Under Test (SUT). Lurette is based on two syn-
chronous languages: Lustre [1], to specify test oracles, and Lutin [2], to model reactive
environments. Lurette does not require analyzing the code; thus it can deal with any
kind of reactive systems, as the experimentations reported below illustrate.
The COMON project∗ gathered three industrial companies that design control-command systems for nuclear plants. Corys Tess designs plant simulators used in par-
ticular for training operators. Atos Worldgrid designs software and hardware of com-
puterized control rooms. Rolls-Royce designs the software and hardware of classified
automatisms in charge of the plant security. The goal for this consortium was to take
advantage of the partners complementarity to set up a development framework based
on early simulations, model refinements, continuous integration, and automatic testing.
During the project, the consortium has crafted a case study representative of each of the
partners activity [3]. They also wanted to experiment on their own designs the Lurette
languages and methodology. This article presents those experimentations.
We first recall Lurette principles in Section 2, and briefly present enough of the
Lustre and the Lutin languages to be able to understand the examples. Section 3 presents
* This work was supported by the COMON Minalogic project [2009-2012] funded by the French government (DGCIS/FUI), la Metro, and the city of Grenoble – http://comon.minalogic.net/


the Corys case study and demonstrates the use of Lurette on a library object used to
simulate the behavior of the temperature and the pressure of a fluid in a pipe. Section 4
presents the Atos case study that illustrates how Lurette can be used to automate the run
of existing test plans designed for a Supervisory Control and Data Acquisition (SCADA)
library object. We discuss related work and conclude in Sections 5 and 6.

2 Black-Box Testing of Reactive Systems Using Lustre and Lutin

Test of Reactive Systems. A reactive system is a combination of hardware or soft-


ware (or both) that (1) acquires inputs (set_inputs), (2) performs a computation step
(step), and (3) provides outputs (get_outputs). Testing a reactive system consists in
writing or generating scripts that call the set_inputs and the step functions in turn.
Such test scripts can be done offline, but a reactive system is meant to react to stimuli
coming from its environment (e.g., from sensors), and to control it (e.g., using actua-
tors). Thus a realistic test sequence should use get_outputs to provide input vectors
that take into account the SUT inputs/outputs sequence history (i.e., its trace).
Stimulation. The environment is also a reactive system that executes in closed-loop
with the SUT. It can be very versatile or underspecified. This motivated the design of
Lutin [2], a language to program stochastic reactive systems and environment models.
Oracles. The test decision is deterministic and can be automated by formalizing the
SUT's expected properties via predicates over traces. A Lurette oracle is a program that
returns as first output a Boolean that formalizes some requirements. Lurette reports a
property violation each time one oracle returns false. As oracles for reactive systems
often involve time, a language where time is a first-class concept like Lustre [1] is a
legitimate choice. Moreover, Lustre can formalise any safety property [4].
Coverage. To decide when to stop generating tests, we use a notion of requirements
coverage. Consider the requirement “stable(X,30) ⇒ stable(Y,5)” where the
“stable(V,n)” predicate states if a variable V was stable during the last n seconds;
it is always satisfied if X is never stable. But from a coverage point of view, it is more
interesting to generate input sequences where X is stable. That is an example where ran-
dom simulations are not sufficient and a language to program some guided scenarios is
useful. Note also that if X is a SUT input, covering this requirement is easy. If X is an
output, it is more complicated, as it requires driving the SUT, which sometimes requires deep expertise of its internals (but this is always the case when designing tests).
Lurette. Lurette handles the test harness, by reading test parameters, executing all the
reactive systems in turn (SUT, environment, oracles), computing requirements cover-
age, and displaying a test report1. It does not impose the use of Lustre or Lutin. The
reaction steps can be either time-triggered, event-triggered, or both. In the experimenta-
tions we report in this article, the SUT was time-triggered. More detailed presentations
of Lurette can be found in [3,5]. We now present a few Lustre and Lutin programs to
illustrate their main characteristics. We use those examples in the forthcoming sections.
1 cf http://www-verimag.imag.fr/lurette.html for tools, manuals, and tutorials.
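The closed-loop harness can be summarized by the Python-style sketch below. The objects and their set_inputs/step/get_outputs/react/check methods mirror the interface described in this section, but the concrete API is our own illustration, not Lurette's.

def run_test(sut, env, oracles, max_steps):
    # One Lurette-style closed-loop run: stimulate the SUT with the environment
    # model, feed both traces to the oracles, and stop on a violation.
    sut_outs = sut.get_outputs()                     # initial observation of the SUT
    for step in range(max_steps):
        env_outs = env.react(sut_outs)               # environment reacts to the SUT trace
        if env_outs is None:                         # the Lutin program may terminate
            return "environment finished", step
        sut.set_inputs(env_outs)
        sut.step()                                   # one reaction of the SUT
        sut_outs = sut.get_outputs()
        for oracle in oracles:
            if not oracle.check(env_outs, sut_outs): # first oracle output is the verdict
                return "requirement violated", step
    return "no violation observed", max_steps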

Lustre. Lustre allows defining reactive programs via sets of data-flow equations that
are virtually executed in parallel. Equations are structured into nodes. Nodes transform
input sequences into output sequences. The Lustre node r_edge below processes one
Boolean input sequence, and computes one Boolean output sequence.

node r_edge ( x : bool ) returns ( r : bool );
let
   r = x -> x and not pre ( x );
tel

This node defines its output with one equation and four operators (i.e., predefined
nodes). The memory operator “pre” gives access to the previous value in a sequence:
if x holds the sequences (x1 ,x2 ,. . . ), then pre(x) holds (⊥,x1,x2 ,. . . ), where ⊥ denotes
an undefined value. The arrow operator “->” modifies the value of the first element of a
sequence: if x holds (x1 ,x2 ,x3 ,. . . ), then init->x holds (init,x2 ,x3 ,. . . ). This operator is
useful for sequences that are undefined at their first instant, such as pre(x). The “and”
and “not” operators are the logical conjunction and negation lifted over sequences.
Hence, r_edge(x) is equal to x at the first instant, and then is true if and only if x is
true at the current instant and false at the previous one. This node detects rising edges.
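For readers unfamiliar with Lustre, the stream semantics of this node can be mimicked with a few lines of Python; this is only an illustration of the pre and -> operators on finite sequences, not of how Lustre programs are actually executed.

def r_edge(xs):
    # Rising-edge detection over a Boolean sequence: r = x -> (x and not pre x)
    out, pre_x = [], None            # pre x is undefined at the first instant
    for i, x in enumerate(xs):
        r = x if i == 0 else (x and not pre_x)   # '->' selects x at the first instant
        out.append(r)
        pre_x = x
    return out

print(r_edge([False, True, True, False, True]))  # [False, True, False, False, True]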
Lutin. Lutin is a probabilistic extension of Lustre with an explicit control structure
based on regular operators: sequence (fby, for “followed by”), Kleene star (loop), and
choice (|). At each step, the Lutin interpreter (1) computes the set of reachable con-
straints, which depends on the current control-state; (2) removes from it unsatisfiable
constraints, which depends on the current data-state (input and memories); (3) draws a
constraint among the satisfiable ones (control-level non-determinism); (4) draws a point
in the solution set of the constraint (data-level non-determinism). This chosen point de-
fines the output for the current reaction. The solver of the current Lutin interpreter uses
Binary Decision Diagrams (BDD) and convex polyhedron libraries [6]. It is thus able
to deal with any combination of logical operators and linear constraints. Let us first
illustrate the Lutin syntax and semantics with a program using equality constraints.
node sn_gen () returns ( sn : int ) =
   loop [10 ,20] sn =1 fby
   loop [20 ,30] sn =2

This node generates an integer finite sequence, without using any input. It first uses
an atomic constraint that binds sn to 1, during between (uniformly) 10 and 20 reaction
steps. Then it uses sn=2 during between 20 and 30 steps, and then stops. A constraint
can actually have any number of solutions, as in the x_gen node below.
node x_gen ( i : real ) returns ( x : real ) = loop { 0 < x and x < i }

At each step, the elected constraint is simplified by constant propagation of inputs


and memory values, and solved. Here, when i is negative, the constraint is not satisfi-
able and the program stops. Otherwise, one solution is drawn in the solution set ]0;i[.
Lutin also has a notion of typed macro, which is useful to structure constraints.

let abs ( z : real ): real = if z < 0.0 then -z else z


let zone1 (x , y : real ): bool = abs ( x +3.0* y ) < 3.0 and abs (20.0* x - y +2.0) <5.0
let zone4 (x , y : real ): bool = abs (x - y +6.0) < 3.0 and abs ( -5.0* x +y -2.0) <7.0

The first macro defines the absolute value of any real. The next ones define two zones
where a pair of real values (x,y) evolves. We present below a last example (used later)
that illustrates how to use Lutin to guide the random exploration of the environment.
node x_y_gen () returns (x , y : real ) =
loop { {|3: zone1 (x , y ) |1: zone4 (x , y )} fby loop ~50:5 x = pre x and y = pre y }

For the first reaction, a point is drawn in zone1 with a probability of 3/(3+1)=0.75 or
in zone4 with a probability of 1/(3+1)=0.25. Then x and y keep their previous values
for 50 steps on average, with a standard deviation of 5. This process then starts again thanks to the outer loop. Preventing the environment outputs from changing at each reaction produces better coverage for requirements guarded by stability conditions (which is common in control-command applications). More generally, an overly chaotic environment might set the SUT into degraded modes, which would prevent the test of nominal modes. Lutin also has constructs to execute nodes (run) or constraints (&>) in parallel,
as well as exceptions 2 .

3 Automatic Testing of an Alices Library Object


In this section, we report on a case study provided by Corys, a 300-person company
that develops and commercializes the Alices workbench. Alices is a data-flow graphical
programming language tool for modeling, simulating and analyzing dynamic systems
in the domain of energy and transportation. Simulators of energy production plants
implemented in Alices are typically used to train operators.

3.1 The SUT: The Node_Liquid_SPL Alices Object


Corys asked us to test one of their most frequently used library objects, which is named
Node_Liquid_SPL. This object simulates the behavior of the temperature and the pres-
sure of a fluid in a pipe transporting homogeneous liquids through hydraulic networks.
It is defined using mass and energy conservation equations:

dM dh
∑ Qei − h ∑ Qmi
= ∑ Qmi = i i
dt i dt M
where ∑i Qmi and ∑i Qei are respectively the sum of the mass flow (kg/s) and the sum
of the powers arriving in the node; M and h are the mass (kg) and the mass enthalpy
(J/kg) of the system; t is the time. The SUT is made of this object connected to two
pipes, themselves connected to two objects (load loss) that models the fluid mass flow
and transported power. The resulting equations are discretized and solved using the
Newton-Raphson method. Table 1 describes the SUT input/output variables. We have
shortened some variable names for the sake of readability.
2 cf http://www-verimag.imag.fr/Lutin.html for more information.

Table 1. Description of the SUT input/output variables

Name Producer Meaning Unit


Pin Env. Limit condition for input pressure Pa
Pout Env. Limit condition for output pressure Pa
Tin Env. Limit condition for input temperature °c
Tout Env. Limit condition for output temperature °c
T_amb Env. Temperature of the ambiant env °c
Qe_amb SUT Power exchanged with the ambiant env W
Qe1 SUT Power exchanged with the first pipe W
Qe2 SUT Power exchanged with the second pipe W
Qm1 SUT Mass flow exchanged with the first pipe kg/s
Qm2 SUT Mass flow exchanged with the second pipe kg/s
M SUT Mass of the system kg
h SUT Mass enthalpy of the system J/kg
T SUT Temperature of the system °c
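To relate the conservation equations above to the variables of Table 1, a crude explicit-Euler sketch of one simulation step is given below. The actual object is solved implicitly with the Newton-Raphson method and models further effects (such as the exchange with the ambient environment); the temperature_of hook, the explicit scheme, and the sign convention taken from the discussion in Section 3.3 are our own reading.

def node_step(M, h, Qm1, Qm2, Qe1, Qe2, Qe_amb, dt, temperature_of):
    # One explicit-Euler step of dM/dt = sum_i Qm_i and
    # dh/dt = (sum_i Qe_i - h * sum_i Qm_i) / M, with the first pipe flowing in
    # and the second flowing out (hence the minus signs).
    sum_Qm = Qm1 - Qm2                     # net mass flow into the node (kg/s)
    sum_Qe = Qe1 - Qe2 + Qe_amb            # net power into the node (W)
    M_next = M + dt * sum_Qm               # mass balance
    h_next = h + dt * (sum_Qe - h * sum_Qm) / M
    T_next = temperature_of(h_next)        # hypothetical enthalpy -> temperature law
    return M_next, h_next, T_next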

3.2 The SUT Environment

The input variables to stimulate this node are the limit conditions for the pressure (Pin
and Pout), the temperature (Tin and Tout), and the ambient temperature (T_amb).
The admissible values for those inputs are part of the object documentation, which
states that pressure values vary within [10000.0, 190.0e5], and temperature values vary
within [5.0, 365.0]. Moreover, Corys wanted to test this node in average conditions, and
therefore required that the stimuli generator satisfies the following constraints:

– temperature and pressure cannot vary more than 10% between two instants;
– orders change only when mass and temperature values are stable (i.e., they do not
change of more than 1% between two steps).

To stimulate the SUT, we therefore designed a Lutin program that is a direct for-
malization of the preceding constraints. We use the limit_der macro, which can be
used both to test whether an input varies more than a given percentage (limit_der(1.0,M) to test if M varies less than 1%), and to constrain the derivative of some output (limit_der(10.0,Pin) to constrain Pin to vary by less than 10%).
let limit_der ( pc : real ; x : real ref ) : bool = abs (x - pre x ) < abs ( pc /100.0 * pre x )
node liquid_spl_env ( M , T : real ) returns (
   Pin , Pout : real [10000.0; 190.0e5]; Tin , Tout , Tamb : real [5.0; 365.0];
) =
   -- a few aliases to make it more readable
   let inputs_are_stable = limit_der (1.0 , M ) and limit_der (1.0 , T ) in
   let dont_change = -- outputs keep their previous values
      Pin = pre Pin and Tin = pre Tin and
      Pout = pre Pout and Tout = pre Tout and Tamb = pre Tamb in
   let change = -- outputs do not vary more than 10%
      limit_der (10.0 , Pin ) and limit_der (10.0 , Pout ) and
      limit_der (10.0 , Tin ) and limit_der (10.0 , Tout ) and limit_der (10.0 , Tamb )
   in -- a simple scenario
   true -- the first instant
   fby loop { if inputs_are_stable then change else dont_change }

The main node liquid_spl_env has two real inputs (produced by the SUT), and
five real outputs. At the first instant, the only constraints on output variables are the
ones mentioned in their declarations; a random value is drawn in their respective inter-
val domains. For example, Tamb is drawn between 5 and 365. Then, for the remaining
instants, variables keep their previous values if one of the environment inputs (M or T)
varies more than 1%; otherwise they vary at random, but without exceeding 10%. One
could of course imagine more complex scenarios. However, this has not been necessary in order to cover the expected properties that we present in the following.
Note the feedback loop: the SUT reacts to its environment, which itself reacts to
the SUT by testing the stability of M and T. This is typical of what offline test vector
generators cannot do when they ignore the reactive nature of the SUT.

3.3 The Oracles


In order to automate the test decision, we need to formalize the SUT expected proper-
ties. Actually, such requirements were not explicitly written in the object documen-
tation. Hence we asked the Corys engineer responsible for the Alices library to
write down how he expects this object to behave. He came up with the following
requirements.

1. if the sum of powers (coming from the Qe1, Qe2, and Qe_amb sensors) and the sum of incoming mass flows (coming from the Qm1 and Qm2 sensors) are positive, then the mass and the temperature of the node increase;
2. if the sum of powers Qe and the sum of mass flows Qm are negative, then the mass and the temperature of the node decrease;
3. if the sum of powers Qe is zero, and the sum of mass flows Qm is positive, then the mass increases;
4. if the sum of powers Qe is zero, and the sum of mass flows Qm is negative, then the mass decreases;
5. if the sum of mass flows Qm is zero, and the sum of powers Qe is negative, then the temperature decreases;
6. if the sum of mass flows Qm is zero, and the sum of powers Qe is positive, then the temperature increases.

A possible Lustre formalization of the first requirement is:


Qe = Qe1 + Qe2 + Qe_amb;
Qm = Qm1 + Qm2;
ok1 = (Qe >= 0.0 and Qm >= 0.0) => (increase(M, 0.0) and increase(T, 0.0));

where increase is defined as follows:


node increase (x: real; threshold: real) returns (y: bool);
let y = true -> (x - pre(x) >= threshold); tel
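
For illustration, the remaining requirements can be formalized along the same lines. The sketch below is ours and is not the oracle actually used in the case study: the decrease node is the obvious dual of increase, the equations ok2 to ok6 mirror the requirement numbering, and "positive/negative" is read loosely with >= and <=, as in ok1, before the fixes discussed next.

node decrease (x: real; threshold: real) returns (y: bool);
let y = true -> (pre(x) - x >= threshold); tel

-- sketch of the remaining requirements, prior to the fixes below
ok2 = (Qe <= 0.0 and Qm <= 0.0) => (decrease(M, 0.0) and decrease(T, 0.0));
ok3 = (Qe = 0.0 and Qm >= 0.0) => increase(M, 0.0);
ok4 = (Qe = 0.0 and Qm <= 0.0) => decrease(M, 0.0);
ok5 = (Qm = 0.0 and Qe <= 0.0) => decrease(T, 0.0);
ok6 = (Qm = 0.0 and Qe >= 0.0) => increase(T, 0.0);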

When we ran Lurette with the SUT, the environment, and the oracles described above, all the oracles were violated after a few steps. After several discussions with the person who wrote down the requirements, we ended up with Lurette runs that worked fine for hours. We now summarize the fixes we needed to perform.

First Problem. We formalized the sentences “the sum of powers (coming from Qe1, Qe2, and Qe_amb sensors)” and “the sum of mass flows (coming from Qm1 and Qm2)” as Qe=Qe1+Qe2+Qe_amb and Qm=Qm1+Qm2. However, the node connectors are oriented: the first pipe flows in, whereas the second pipe flows out. The correct interpretation therefore leads to the following definitions: Qe=Qe1-Qe2+Qe_amb and Qm=Qm1-Qm2.
Second Problem. We misinterpreted “are positive/negative” in the requirements. Indeed, when one compares to 0 a sum of values that are computed only up to a certain precision (0.1 for mass flows, and 100 for powers), one has to specify tolerance levels. Hence, for example, the second property should be rewritten as: “if Qe<=-Tol_Qe and Qm<=-Tol_Qm then the mass and temperature of the node decrease”, where Tol_Qe=300 (three times the precision of the power sensors) and Tol_Qm=0.2 (twice the precision of the mass flow sensors).
Third Problem. In properties 5 and 6, the statements “the sum of powers Qe is positive/negative” should take into account the mass enthalpy of the node (Qe-h.Qm instead of just Qe).
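
To make the three fixes concrete, here is how the corresponding oracle definitions could look once they are applied. This is again a sketch of ours, matching Version 2 of Table 2 below: Tol_Qe corresponds to Tol_P in the table, and decrease is the helper sketched in Section 3.3.

const Tol_Qm : real = 0.2;   -- twice the precision of the mass flow sensors
const Tol_Qe : real = 300.0; -- three times the precision of the power sensors (Tol_P in Table 2)

Qe = Qe1 - Qe2 + Qe_amb;     -- first fix: the second pipe flows out
Qm = Qm1 - Qm2;
P  = Qe - h * Qm;            -- third fix: take the mass enthalpy into account
ok2 = (Qe <= -Tol_Qe and Qm <= -Tol_Qm) => (decrease(M, 0.0) and decrease(T, 0.0));
ok5 = (Qm >= -Tol_Qm and Qm <= Tol_Qm and P <= -Tol_Qe) => decrease(T, 0.0);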
Fourth Problem. At this stage, the requirement fixes we performed allow running simulations that last several minutes without violating any oracle. After more steps (around 1000 steps on average), property 5 is violated. This time, the problem was more subtle and required a deeper investigation by the Corys engineer. His conclusion was that the convergence criteria (the thresholds parametrizing the differential equation solver) used in this simulation were too small. By setting a convergence criterion of 1 (versus 0.1) for the mass flow, and of 1000 (versus 100) for the power, no oracle is violated, even if we run the simulation for hours. Since the convergence criteria determine the precision of the sensor computations, we need to modify the values of Tol_Qm and Tol_P again. These new convergence criteria are actually the ones typically used in Alices for modeling pipes in power plants, which explains why this problem was (probably) never triggered before by Alices users.

Table 2. Summary of requirement fixes. Version 2 results from fixing the first three problems; Version 3 results from fixing the fourth problem.

Name    Unit  Meaning              Version 1        Version 2        Version 3  Involved Req.
Qm      kg/s  Sum of mass flows    Qm1+Qm2          Qm1-Qm2          ditto      all
Qe      W     Sum of powers        Qe1+Qe2+Qe_amb   Qe1-Qe2+Qe_amb   ditto      all
P       W     Node power           Qe               Qe-h.Qm          ditto      all
Tol_Qm  kg/s  Mass flow tolerance  0                0.2              2          3,4
Tol_P   W     Power tolerance      0                300              3000       1,2,5,6

3.4 Discussion and Lessons Learned from This First Experiment


The first three problems were due to a lack of precision in the formulation of the requirements. One could argue that a specialist in the design of physical-system simulators would have interpreted such requirements correctly in the first place. Still, the less a requirement is subject to misinterpretation, the better. This experiment stresses that Lurette can be seen as an engineering tool that helps to write consistent and precise requirements. The fourth problem was much more interesting for the Corys engineers: it revealed a real feature of this very frequently used object, which behaves unexpectedly when used with an unusual convergence criterion.
The principal lesson of this experiment is that writing executable requirements is not that difficult and can be very effective. Indeed, the experiment was conducted by an engineer who knew nothing about Lustre, Lutin, Alices, or dynamic systems modeling. Still, he was able to pinpoint four issues in less than one week of work, with only a few interactions with the Alices library supervisor.
We performed a similar study during the COMON project on voters designed in Scade by Rolls-Royce. Their voters were much simpler, with no internal state, so their formalization into Lustre oracles ended up being equivalent to the Scade implementation. We believe that using oracles in this context is still useful, as it amounts to having two teams implement the same specification, which is a classical strategy to gain confidence in software implementations. In such cases, Lutin stimulators can still be useful to compare two implementations thoroughly. In the particular case of the Rolls-Royce voters, this was not necessary, as we were able to prove their equivalence by state exploration (using the Lesar model checker [7]). This illustrates the synergy between formal-based testing and formal verification.

4 Timed Test Plans Automation


4.1 Test Plans: A Standard Practice in Industry
A standard practice in industry is to base test campaigns on test plans. The test plans of our three partners in the COMON project were actually very similar: they consist of a three-column table, with one column for the time (physical or logical), one for the stimulation, which specifies what the tester should do to perform the test, and a last column that specifies what the tester should observe in reaction to these stimulations. Corys developed, in collaboration with EDF and AREVA, a tool (I&C Simulation) to automate the play of such test plans, for both the stimulation and the decision parts. This tool processes scripts in which one can ask to set a variable to a particular value at a specific time, and then check that another variable takes a specific value at another specific time.
One problem with such test plans, whether automated or not, is that they are overly deterministic, both at the data and at the temporal levels. In the case studies we had addressed so far with Lurette, the strategy was different: it consisted in writing high-level constraints both for generating several stochastic scenarios (Lutin) and for checking several traces (Lustre). This allows covering many more cases with the same specifications. Nevertheless, engineers are used to writing test plans, and several years of know-how are associated with their design. This is why we find it interesting to report how Lutin and Lustre can be used to implement test plans, and to show how easy it is to add a little bit of data and temporal looseness.
In this section, we present a test plan provided by Atos, targeting a generic library object. This plan was extracted from an existing test campaign conducted some years ago. We first demonstrate how to automate the play of this test plan in a very faithful and deterministic manner, as could have been done with the I&C Simulation tool of Corys, for example. Then we demonstrate the benefits of our languages for relaxing and generalizing the constraints on both the stimulation and the observation parts, which leads to tests that cover more cases and are easier to maintain.

4.2 The SUT: A SCADA Generic Object

A SCADA (Supervisory Control and Data Acquisition [8]) system is a remote management system for large-scale processes in real-time telemetry and remote-control industrial installations (manufacturing, food processing, energy). It typically handles thousands of data items in real time (e.g., coming from sensors) and presents a relevant synthesis in graphical form to operators so that they can monitor and control the system. Atos develops and commercialises several SCADA systems dedicated to the supervision and control of power-generating plants (nuclear, fuel, gas).
The purpose of the generic object we want to test is to monitor the operating area of a pair of numeric values (which typically come from the physical process) and to raise alarms when dreaded events occur. The space in which the monitored point evolves is divided into several operating domains (nominal, degraded, etc.), and into several zones. When the point enters a forbidden zone, an alarm should be raised; when it remains in an accumulating zone for too long, another alarm should be raised; in an authorized zone, there is nothing to check. The zone shapes differ for each domain. The system chooses a domain depending on various criteria on the evolution of the operating point. The operator can ask to favor some domain, and he can force it (i.e., ask more categorically). The number and the shape of the domains (which can overlap) and zones are parameters of the generic object.
The SUT is such a parametrized object, with four domains and five zones; zone 2
is forbidden; zones 3 and 5 are accumulating; zones 1 and 4 are authorized. The SUT
environment is made of two integers (X and Y) that hold the monitored point coordinates,
and three Boolean inputs per domain so that the operator can ask to choose a domain
(dd1 to dd4), force a domain (fd1 to fd4), or un-force it (ud1 to ud4).
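
For illustration only, here is what a zone membership check can look like in Lustre when a zone is a simple rectangle. The node below is our own simplification: the actual zone shapes (and their number) are parameters of the generic object and need not be rectangular.

node in_rect_zone (X, Y, xmin, xmax, ymin, ymax: int) returns (inz: bool);
let inz = X >= xmin and X <= xmax and Y >= ymin and Y <= ymax; tel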
The Test Plan CRT_019_S04. The existing test campaign we based our work on consisted of 21 test plans. The CRT_019_S04 plan is one of them, and is shown in Table 3. It is split into seven logical steps and four stages. At each step, the operator sets the values of the variables mentioned in the action column, and checks (visually) that the system behaves as specified in the expected result column.
The Atos I/O Stimulator. In order to ease the testing of their SCADA objects, Atos developed an in-house tool called the Input/Output stimulator. This tool processes scripts and is basically able to (1) set SCADA internal variable values; (2) display messages; (3) suspend the script until the operator presses a key (WAIT). This stimulator eases the play of test plans by automating the run of the action column and by limiting the intervention of the operator to a few key presses. In the CRT_019_S04 plan, each of the four stages corresponds to a WAIT statement in the corresponding I/O stimulator script. The checking of the expected results is still done by the tester.

4.3 Implementing Automated Test Plans with Lutin and Lustre

The first step in implementing an automated version of this test plan with Lurette was to connect our languages' APIs to the Atos SCADA. To do so, we re-used the infrastructure that had been set up for the I/O stimulator. We also added a layer in charge of interfacing an event-triggered workbench (the SCADA) with time-triggered programs (Lutin/Lustre).

Table 3. The CRT_019_S04 test plan

Step 1
  Action: Launch stage 1, which elects domain 1 and sets X,Y to (25,40) (in zone 1)
  Expected result / Comment: Check the image display
Step 2
  Action: Launch stage 2, which sets X,Y to (40,28), in the forbidden zone 2
  Expected result / Comment: Check the operating point (position, color); check the alarm raised in the alarm function; write down the timestamp
Step 3
  Action: Launch stage 3, which elects domain 2 instead of domain 1
  Expected result / Comment: Check that the alarm above remains, at the timestamp of step 2; X,Y remains in the forbidden zone 2
Step 4
  Action: Force domain 3
  Expected result / Comment: Check that the alarm above remains, at the timestamp of step 2; X,Y remains in the forbidden zone 2
Step 5
  Action: Force domain 4
  Expected result / Comment: The alarm above disappears; X,Y is now in an authorized zone 4
Step 6
  Action: Unforce domain 4
  Expected result / Comment: Domain 2 is elected; X,Y is back in the forbidden zone 2; the alarm is raised at the current timestamp
Step 7
  Action: Launch stage 4, which sets X,Y to (-9,25), in the authorized zone 4
  Expected result / Comment: The alarm disappears; X,Y is in the authorized zone 4

From Lurette to the SCADA, we generate an event each time a variable value changes (beyond a given threshold). From the SCADA to Lurette, we perform a periodic sampling of the variable values. This sampling is done at 1 Hz, to avoid data races and to remain deterministic and reproducible: indeed, 1 second is enough for the SUT to process all the events resulting from a change of all the interface variables. Note that it would have been easy and interesting to test what happens at higher rates.

The « Expected result » column of Table 3 in Lustre. In order to detect bad behaviors, we formalize the observation column of the CRT_019_S04 test plan with a Lustre oracle that monitors the following inputs: the step number (sn ∈ [1,7]); the current zone (czone ∈ [1,5]); the alarm of zone 2 (A2); the elected domain (d_elec); the current timestamp (ts_c); and the timestamp of alarm A2 (ts_a2). Here again, we have shortened the variable names for the sake of readability.
node crt019_s04 (sn: int; czone: int; A2: bool; d_elec, ts_c, ts_a2: int)
returns (ok: bool);
var ok1, ok2, ok3, ok4, ok5, ok6, ok7: bool; lts_a2: int;
let
  lts_a2 = 0 -> if r_edge(A2) then ts_c else pre(lts_a2);
  ok1 = (sn = 1 => (czone = 1));
  ok2 = (sn = 2 => (czone = 2 and A2));
  ok3 = (sn = 3 => (czone = 2 and A2 and ts_a2 = lts_a2));
  ok4 = (sn = 4 => (czone = 2 and A2 and ts_a2 = lts_a2));
  ok5 = (sn = 5 => (czone = 4 and not A2));
  ok6 = (sn = 6 => (czone = 2 and d_elec = 2 and ts_a2 = ts_c));
  ok7 = (sn = 7 => (czone = 4 and not A2));
  ok  = ok1 and ok2 and ok3 and ok4 and ok5 and ok6 and ok7;
tel

The local variables ok1 to ok7 encode the seven steps of the third column. In order to « write down the timestamp » at step 2, we define a local variable lts_a2 as follows: initially set to 0, it takes the value of the current timestamp ts_c when A2 is raised (r_edge(A2)), and keeps its previous value otherwise (pre(lts_a2)). To encode the expected result of steps 3 and 4, we compare the timestamp of A2 provided as input (ts_a2) with its locally computed counterpart (lts_a2).
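
The r_edge operator used above is the usual rising-edge detector; if it is not already provided (e.g., by the helper definitions of Section 2), a standard Lustre definition of it is the following sketch:

node r_edge (x: bool) returns (y: bool);
let y = x -> (x and not pre(x)); tel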
The « Action » column of Table 3 in Lutin. We first present a completely deterministic Lutin program that mimics the behavior of an operator processing this test plan. Then we show how slight modifications of this program lead to a stimuli generator that covers many more cases. Let us first define a few Boolean macros to enhance the readability of the programs. The tfff macro below binds its first parameter to true, and all the other ones to false.
let tfff (x, y, z, t: bool): bool = x and not y and not z and not t

Similarly, we define ftff, which binds its second parameter to true; f7 and f8 bind all their parameters to false. The integer input sn is used to choose the instant at which we change steps. It can be controlled by a physical operator or by another Lutin node that sequentially assigns values from 1 to 7 (similar to the sn_gen node of Section 2). The fourteen outputs of this node control which domain to display (display domain i if ddi is true), to force (force domain i if fdi is true), or to un-force (un-force domain i if udi is true).
node crt019_s04 (sn: int) returns
  (X, Y: real; dd1, dd2, dd3, dd4, fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4: bool) =
loop {
  sn = 1 and X = 25.0 and Y = 40.0 and tfff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)

As long as the sn input is equal to 1, the outputs of the crt019_s04 node satisfy the constraint above, which states that only the first domain should be displayed, and that no domain is forced or unforced. X and Y are set in the authorized zone 1. When sn becomes equal to 2, control passes to the constraint below, which is the same as the previous one except that the point is set somewhere in zone 2.
} fby loop {
  sn = 2 and X = 40.0 and Y = 28.0 and tfff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
} fby loop {
  sn = 3 and X = 40.0 and Y = 28.0 and ftff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
} fby loop {
  sn = 4 and X = 40.0 and Y = 28.0 and ftff(dd1, dd2, dd3, dd4) and
  fd3 and f7(fd1, fd2, fd4, ud1, ud2, ud3, ud4)
} fby loop {
  sn = 5 and X = 40.0 and Y = 28.0 and ftff(dd1, dd2, dd3, dd4) and
  fd4 and f7(fd1, fd2, fd3, ud1, ud2, ud3, ud4)
} fby loop {
  sn = 6 and X = 40.0 and Y = 28.0 and ftff(dd1, dd2, dd3, dd4) and
  ud4 and f7(fd1, fd2, fd3, fd4, ud1, ud2, ud3)
} fby loop {
  sn = 7 and X = -9.0 and Y = 25.0 and ftff(dd1, dd2, dd3, dd4) and
  f8(fd1, fd2, fd3, fd4, ud1, ud2, ud3, ud4)
}

This Lutin program, once run with the oracle of Section 4.3, allows test automation. However, it suffers from the same flaw as its original non-automated counterpart: it can be tedious to maintain. Indeed, if for some reason the shape of zone 1 is changed, the chosen point (25,40) might no longer be part of zone 1. Choosing pseudo-randomly any point in zone 1 using the Lutin constraint solver makes the plan more robust to software evolution. Moreover, with the same effort, it covers many more cases. In the same spirit, we can further loosen this plan by replacing “choose a point in the authorized zone 1” with “choose a point in any authorized zone” (cf. the x_y_gen node of Section 2). In steps 3, 4, and 5, we could also randomize the choice of the domain to be forced. Actually, by loosening this plan in this way, we obtain a plan that covers more cases than the twenty other plans of the test campaign!
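
When the stimulator is loosened in this way, the oracle must be loosened accordingly. For instance, if step 1 may start in any authorized zone, a possible adaptation of ok1 (our sketch, not part of the original oracle) is:

ok1 = (sn = 1 => (czone = 1 or czone = 4)); -- zones 1 and 4 are the authorized zones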

4.4 Discussion and Lessons Learned from This Second Experiment


The original test plan was not deterministic, since the time between each step change was controlled by a physical tester. However, this non-determinism is easy to simulate with Lutin, for example using the sn generator (sn_gen) presented in Section 2. The advantage of the Lutin non-determinism over the human one is its reproducibility: one just needs to store the seed used by the Lutin pseudo-random engine to replay the exact same simulation.
This test plan does not illustrate the feedback capability of Lurette. However, plans where the tester must perform specific actions depending on the behavior of the SUT are very common.
We have shown a way to use Lurette and its associated languages to automate the run of an existing test plan designed to be exercised by a human operator. The initial set-up for automated plans seems to require more effort, as the behavior of each variable has to be described precisely at each step, while the original plan was more allusive. But the Lurette version has four major advantages: (1) it can be run automatically, (2) each run is reproducible, (3) it covers many more cases, and (4) it is more robust to software (or specification) evolutions.
The last two points are the most important. Indeed, Atos experimented with completely automated test scripts, but gave up because they were too difficult to maintain. One reason was that their scripts were too sensitive to minor time or data value changes. The use of languages with a clean semantics with respect to time and parallelism eases the writing of more abstract and general properties that can serve as oracles for several test scenarios. The concision and robustness arguments hold both for the oracles and for the stimulators, and from both the data and the temporal points of view.
In previous experiments ([3,5] and Section 3), the methodology was to derive oracles and stimulators from informal requirements. The initial stimulator is made of very general constraints. Then, to increase oracle coverage, Lutin scenarios are designed. During the COMON project, we also experimented with this methodology on the SCADA object, and the coverage was actually comparable. This “direct formalization approach” is more modular, as some variable sets can be defined separately, whereas with test plans one needs to describe all variables at the same time. Moreover, it allows writing specific scenarios only when necessary, as some oracles are covered in the first place using simple constraints on the environment. Once all the easy cases have been explored at random with minimal effort, what remains is the difficult work of driving the SUT into configurations that exhibit interesting cases. This is a job for SUT experts. Relieving testers of the tedious and systematic part, and letting them focus on the interesting parts using high-level languages, could restore interest in testing, which often has a poor reputation. Writing Lutin programs is a creative activity, and generalising its use could make it easier to relocate test teams.

5 Related Work
Automating the test decision with executable oracles is a simple and helpful idea used by many others. The real distinction between Lurette and other tools lies in the way the SUT inputs are generated. In the following, we group approaches according to their input generation technique: source-based, model-based, or environment-model based. We found no work dealing with the automated testing of SCADA systems. For dynamical systems workbenches (such as Alices), the literature is quite abundant and mostly concerns Simulink [9]. Hence we focus here on works targeting Simulink, and refer to the related work section of [3] for a broader and complementary positioning of Lurette in the testing of reactive systems.
Source Code Based Testing (White-Box). White-box testing tries to increase structural coverage by analysing the SUT source code using techniques coming from formal verification, such as model checking [10], constraint solving, or search-based exploration [11,12]. Such approaches are completely automated, but they face the same limitations as formal verification with respect to state-space explosion. Several industrial tools use white-box techniques to test Simulink designs, e.g., Safety Test Builder [13] or Design Verifier [14].
Model-Based Testing (Grey-Box). A very popular approach in the literature [15,16] consists in viewing the SUT as a black box and designing a more or less detailed model of it. This model should be faithful enough to provide valuable insights, and small enough to remain analyzable. The model structure is sometimes used to define coverage criteria. The model is used both for the test decision and for the stimuli generation. T-VEC [17,18] and Reactis Tester [19] are industrial tools using this approach to generate tests offline. With Lurette, we also use a model of the SUT, but this model is only used for the oracles; input generation relies on the exploration of environment models. A way to combine this approach with Lurette would be to use such SUT models to generate Lutin scenarios that guide the SUT to specific states and increase coverage.
Environment Model Based Testing (Black-Box). While the white-box approach aims at increasing structural coverage, the main goal of black-box testing is to increase (functional) requirements coverage [20]. Time Partition Testing (TPT) is an industrial black-box tool distributed by Piketec [21]. Like Lurette, TPT has its own formalism to model the environment and automate the SUT stimulation [22,23]. It is a graphical formalism based on hierarchical hybrid automata that is able to react online to the SUT outputs. The major difference with Lutin is that those automata are deterministic. TPT uses Python oracles to automate the test decision, although Lustre is arguably better suited for specifying high-level timed properties.
Another way to explore the environment state space, which has been experimented with on Simulink programs [24,25], is to perform heuristic search (evolutionary algorithms, simulated annealing [26]). The idea is to associate with each SUT input a set of possible parametrized generators (ramp, sine, impulse, spline). The search algorithms generate input sequences by playing with several parameters, such as the number of steps each generator is used for, their order, or the amplitude of the signal. A fitness function estimates the distance of the trace to the requirements. Then another trace is generated with other parameters, until an optimal solution is found. A limitation of these generators is that they are not able to react to the SUT outputs. More generally, for systems that have a complex internal state, it can be very difficult to drive the SUT into a specific operating mode; to do that, the knowledge of an expert is mandatory (as is the ability to react to the SUT outputs). Instead of guiding a random exploration via heuristics, the Lurette proposal consists in asking experts to write programs that perform a guided random exploration of the SUT input state space. A way to combine both approaches could be to let evolutionary algorithms choose some parameters of the Lutin programs, such as choice point weights or variable bounds.

6 Conclusion
The main lesson of the first experiment is that writing executable requirements is not that difficult and leads to precise and consistent requirements. This study gave Corys engineers new insights into one of their most frequently used objects.
The second experiment demonstrates a way to automate the execution of timed test plans. Test plans are commonly used in industry, and automating their execution aroused great interest among our industrial partners. Lutin and Lustre improve their use by allowing the design of more abstract test plans that are more robust to temporal and data changes. One noteworthy outcome of this study is that the resulting randomized and automated test plan actually covers more cases than the 21 test plans of the original test suite.
There is a synergy between automated oracles and automated stimulus generation. Indeed, generating thousands of simulation traces would be useless without an automatic test decision. Conversely, designing executable requirements to automate the decision for a few manually generated scenarios might not be worth the effort.
This work also demonstrates that synchronous languages are not only useful for designing critical systems (as the success of Scade demonstrates), but can also be used to validate dynamic systems models (Alices) or event-based asynchronous systems (SCADA). The language-based approach of Lurette allows performing several kinds of tests (unit, integration, system, non-regression) in various domains [3,5].
From an industrial-use perspective, a general-purpose library and specialized domain-specific ones still remain to be developed. That situation may progress in the near future, as the interest expressed in Lurette by the three industrial partners of the COMON project is one of the reasons that led to the creation of the Argosim company in 2013. Argosim is developing the Stimulus tool based on the Lurette principles [27].

References
1. Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous dataflow programming
language Lustre. Proceedings of the IEEE 79(9), 1305–1320 (1991)
2. Raymond, P., Roux, Y., Jahier, E.: Lutin: a language for specifying and executing reactive
scenarios. EURASIP Journal on Embedded Systems (2008)

3. Jahier, E., Halbwachs, N., Raymond, P.: Engineering functional requirements of reactive
systems using synchronous languages. In: International Symposium on Industrial Embedded
Systems, SIES 2013, Porto, Portugal (2013)
4. Halbwachs, N., Fernandez, J.C., Bouajjani, A.: An executable temporal logic to express safety properties and its connection with the language Lustre. In: ISLIP 1993, Quebec (1993)
5. Jahier, E., Raymond, P., Baufreton, P.: Case studies with lurette v2. Software Tools for Tech-
nology Transfer 8(6), 517–530 (2006)
6. Jahier, E., Raymond, P.: Generating random values using binary decision diagrams and con-
vex polyhedra. In: CSTVA, Nantes, France (2006)
7. Raymond, P.: Synchronous program verification with Lustre/Lesar. In: Modeling and Verifica-
tion of Real-Time Systems. ISTE/Wiley (2008)
8. Bailey, D., Wright, E.: Practical SCADA for industry. Elsevier (2003)
9. The Mathworks: Simulink/Stateflow, http://www.mathworks.com
10. Hamon, G., de Moura, L., Rushby, J.: Generating efficient test sets with a model checker. In:
Software Engineering and Formal Methods, pp. 261–270 (2004)
11. Satpathy, M., Yeolekar, A., Ramesh, S.: Randomized directed testing (REDIRECT) for Simulink/Stateflow models. In: Proceedings of the 8th ACM International Conference on Em-
bedded Software, EMSOFT 2008, pp. 217–226. ACM, New York (2008)
12. Zhan, Y., Clark, J.A.: A search-based framework for automatic testing of MATLAB/Simulink
models. Journal of Systems and Software 81(2), 262–285 (2008)
13. TNI Software: Safety Test Builder,
http://www.geensoft.com/fr/article/safetytestbuilder/
14. The Mathworks: Design verifier, http://www.mathworks.com/products
15. Broy, M., Jonsson, B., Katoen, J.-P., Leucker, M., Pretschner, A. (eds.): Model-Based Testing
of Reactive Systems. LNCS, vol. 3472. Springer, Heidelberg (2005)
16. Zander, J., Schieferdecker, I., Mosterman, P.J.: 1. In: A Taxonomy of Model-based Testing
for Embedded Systems from Multiple Industry Domains, pp. 3–22. CRC Press (2011)
17. T-VEC: T-vec tester, http://www.t-vec.com
18. Blackburn, M., Busser, R., Nauman, A., Knickerbocker, R., Kasuda, R.: Mars polar lander
fault identification using model-based testing. In: 8th IEEE International Conference on En-
gineering of Complex Computer Systems, pp. 163–169 (2002)
19. Reactive Systems: Testing and validation of Simulink models with Reactis. White paper
20. Cu, C., Jeppu, Y., Hariram, S., Murthy, N., Apte, P.: A new input-output based model cover-
age paradigm for control blocks. In: 2011 IEEE Aerospace Conference, pp. 1–12 (2011)
21. Piketec: TPT, http://www.piketec.com
22. Lehmann, E.: Time partition testing: A method for testing dynamic functional behaviour. In:
Proceedings of TEST 2000, London, Great Britain (2000)
23. Bringmann, E., Kramer, A.: Model-based testing of automotive systems. In: 2008 1st Inter-
national Conference on Software Testing, Verification, and Validation, pp. 485–493 (2008)
24. Vos, T.E., Lindlar, F.F., Wilmes, B., Windisch, A., Baars, A.I., Kruse, P.M., Gross, H., We-
gener, J.: Evolutionary functional black-box testing in an industrial setting. Software Quality
Control 21(2), 259–288 (2013)
25. Baresel, A., Pohlheim, H., Sadeghipour, S.: Structural and functional sequence test of dy-
namic and state-based software with evolutionary algorithms. In: Cantú-Paz, E., et al. (eds.)
GECCO 2003. LNCS, vol. 2724, pp. 2428–2441. Springer, Heidelberg (2003)
26. McMinn, P.: Search-based software test data generation: a survey: Research articles. Softw.
Test. Verif. Reliab. 14(2), 105–156 (2004)
27. Argosim: Stimulus, http://www.argosim.com
Author Index

Abate, Alessandro 248, 547 Eldib, Hassan 62


Adzkiya, Dieky 248 Emmes, Fabian 140
Albert, Elvira 562 Ermis, Evren 421
Alberti, Francesco 15 Esmaeil Zadeh Soudjani, Sadegh 547
Almagor, Shaull 424 von Essen, Christian 620
Ardeshir-Larijani, Ebrahim 500
Arenas, Puri 562 Falke, Stephan 140
Armando, Alessandro 31 Finkbeiner, Bernd 78
Armstrong, Philip 187 Fioravanti, Fabio 568
Aştefănoaei, Lacramioara 263 Fischer, Bernd 398, 402, 405
Flores-Montoya, Antonio 562
Baier, Christel 515 Fokkink, Wan J. 575
van Bert, Dirk A. 575 Forejt, Vojtěch 531
Belov, Anton 93, 408 Fuhs, Carsten 140, 156
Ben Rayana, Souha 263
Ben Salem, Ala Eddine 440 Gario, Marco 326
Bensalem, Saddek 263 Gay, Simon J. 500
Beyer, Dirk 373 Genaim, Samir 562
Boender, Jaap 605 Ghilardi, Silvio 15
Boker, Udi 424 Ghorbal, Khalil 279
Bošnački, Dragan 233 Giannakopoulou, Dimitra 620
Boulgakov, Alexandre 187 Gibson-Robinson, Thomas 187
Bozga, Marius 263 Giesl, Jürgen 140
Bozzano, Marco 326 Gómez-Zamalloa, Miguel 562
Brockschmidt, Marc 140 Griggio, Alberto 46
Gurfinkel, Arie 93, 408
Caballero, Rafael 581
Hartmanns, Arnd 593
Carbone, Roberto 31
Heizmann, Matthias 172, 418
Chen, Hong-Yi 156
Hendriks, Dennis 575
Cheval, Vincent 587
Hermanns, Holger 593
Christ, Jürgen 418
Herrera, Christian 295
Cimatti, Alessandro 46, 326
Hoenicke, Jochen 418, 421
Combaz, Jacques 263
Hofkamp, Albert 575
Compagna, Luca 31
Huang, Xiaowei 455
Cook, Byron 156
Huth, Michael 109
Cordeiro, Lucas 405
Inverso, Omar 398, 402
De Angelis, Emanuele 568
Decker, Normann 341 Jahier, Erwan 636
De Schutter, Bart 248
Dietsch, Daniel 418, 421 Klein, Joachim 515
Djoko-Djoko, Simplice 636 Klüppelholz, Sascha 515
Dräger, Klaus 531 Kordon, Fabrice 440
Dudka, Kamil 412 Kroening, Daniel 389
Duret-Lutz, Alexandre 440 Kuo, Jim Huan-Pu 109

Kupferman, Orna 1, 424 Ramalho, Mikhail 405


Kwiatkowska, Marta 531 Reinbacher, Thomas 357
Reineke, Jan 217
Lafont, Eric 636 Reniers, Michel A. 575
La Torre, Salvatore 398, 402 Riesco, Adrian 581
Leike, Jan 172 Román-Dı́ez, Guillermo 562
Leucker, Martin 341 Roscoe, Andrew W. 187
Li, Shanping 310 Rozier, Kristin Yvonne 357
Li, Wenchao 470
Lindenmann, Markus 418 Sacerdoti Coen, Claudio 605
Liu, Yang 310 Sadigh, Dorsa 470
Lowe, Gavin 202 Sastry, S. Shankar 470
Löwe, Stefan 392 Schaumont, Patrick 62
Schilling, Christian 418
Maiza, Chaouki 636 Schumann, Johann 357
Maler, Oded 485 Seshia, Sanjit A. 470
Mandrykin, Mikhail 392 Sharygina, Natasha 15
Märcker, Steffen 515 Siirtola, Antti 599
Markovski, Jasen 575 Slaby, Jiri 415
Marques-Silva, Joao 93 Strejček, Jan 415
Martin-Martin, Enrique 562, 581 Sun, Jun 310
Mens, Irini-Eleftheria 485
van der Meyden, Ron 455 Tamarit, Salvador 581
Morse, Jeremy 405 Tautschnig, Michael 389
van de Mortel-Fronczak, Joanna M. 575 Tentrup, Leander 78
Mover, Sergio 46 Thierry-Mieg, Yann 440
Muller, Petr 395 Thoma, Daniel 341
Musa, Betim 418 Tomasco, Ermenegildo 398, 402
Tonetta, Stefano 46, 326
Nagarajan, Rajagopal 500 Tripakis, Stavros 217
Nicole, Denis 405
Nimkar, Kaustubh 156 Ujma, Mateusz 531
Nutz, Alexander 421
Vojnar, Tomáš 395, 412
O’Hearn, Peter 156
Wang, Chao 62
Parker, David 531 Wang, Ting 310
Parlato, Gennaro 398, 402 Wang, Xinyu 310
Peringer, Petr 412 Wendler, Philipp 392
Pettorossi, Alberto 568 Westphal, Bernd 295
Piskac, Ruzica 124 Wies, Thomas 124
Platzer, André 279 Wijs, Anton 233
Podelski, Andreas 295, 418, 421 Wissert, Stefan 418
Proietti, Maurizio 568
Puebla, German 562 Zufferey, Damien 124
