L12-Checking Concurrent Programs

The document discusses methods to increase confidence in concurrent programs, focusing on model checking and formal methods. It highlights the advantages and disadvantages of formal methods, various approaches to model checking, and specific tools like TLA+ for verifying properties of concurrent systems. The importance of concurrent programming in distributed systems is emphasized, along with real-world applications and the growing need for formal specifications in the industry.

Uploaded by

Chang Jingyan

Checking Concurrent Programs
CS3211 Parallel and Concurrent Programming
Increase the confidence in your concurrent programs
• My code works and I don’t know why
• No confidence that the code will always work
• Will making a change break my code?
• How to increase the confidence in your code
• (Classic testing methods)
• Sanitizers (in CS3211 lecture 6)
• Model checking – mathematically proving that the code is correct
• Many approaches out there
• Unfortunately, these approaches are not really used in the industry (much)

CS3211 L12 - Model Checking 2


Model and checking the model

Model checking of a formal specification
• Build the model using a special Domain Specific Language (DSL)
• Write a formal specification, usually done in a new language
• Then check the model
• Are all the constraints met?
• Does anything unexpected happen?
• Does it deadlock?
• Why spend time with this?
• Check things make sense before starting the (costly) implementation
• Prove certain properties for existing code
• Allow for aggressive optimizations without compromising correctness

aka “formal methods” approach

Formal Methods
Advantages
• Rigorous
• Verify all traces exhaustively
• Produce a system run that violates the requirement (a counterexample)

Disadvantages
• The specification itself may be faulty
• Tedious to come up with a complete specification
• Time consuming

Approaches to Model Checking
• Write a formal specification of the system and check it
• Use a model checker or proof assistant
• Various degrees of automation
• The checker usually checks the states exhaustively
• Use the specification to write the code
• Manually write code
• Automatically generate code from specification
• Add formal specification (invariants that should hold) in the code, as
comments
• Use a model checker to check the invariants
• Difficult to make the model checker understand the code
• Use some symbolic execution
• Limited in functionalities for concurrent code

Model checkers for concurrent programs
• TLA+ (TLC) - Temporal Logic of Actions+
• Focusses on temporal properties
• Good for modeling concurrent systems (and distributed systems)
• Coq Proof Assistant
• Can extract programs to OCaml, Haskell and Scheme
• Good for interactive proof methods
• Alloy (alloy analyzer)
• Focusses on relational logic
• Good for modeling structures

TLA+
• Proposed by Leslie Lamport in 1999
• Lamport describes TLA+ as a "quixotic attempt to overcome engineers' antipathy
towards mathematics"
• High-level language for modeling programs and systems – especially
concurrent and distributed ones
• Based on the idea that the best way to describe things precisely is
with simple mathematics
• Approach
• A specification in TLA+ is written
• The specification is proven (verified) using a checker by exhaustively testing
the states
• Manually write the code based on the TLA+ spec

How does it work?
• The model checker finds all possible system behaviours (states) up to
some number of execution steps
• Examines the behaviours for violations of desired properties
such as safety (invariants) and liveness
• TLA+ specifications use basic set theory to define safety (bad things
won't happen) and temporal logic to define liveness (good things
eventually happen)
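
The exhaustive search described above can be sketched in Python (not TLA+). This is only a minimal illustration of explicit-state exploration, not how TLC is implemented; the function check_invariant and the toy wrap-around counter are hypothetical names made up for this sketch.

```python
from collections import deque

def check_invariant(init_states, next_states, invariant):
    """Breadth-first search over all reachable states.

    Returns None if the invariant holds in every reachable state,
    otherwise the shortest trace (list of states) ending in a violation.
    """
    parent = {s: None for s in init_states}
    queue = deque(init_states)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            # Rebuild the path from an initial state to the bad state.
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for succ in next_states(state):
            if succ not in parent:
                parent[succ] = state
                queue.append(succ)
    return None

# Hypothetical toy system: a counter that wraps around modulo 4.
def wrap_next(x):
    return [(x + 1) % 4]

ok = check_invariant([0], wrap_next, lambda x: x < 4)    # invariant holds
bad = check_invariant([0], wrap_next, lambda x: x != 3)  # invariant is violated
```

When the invariant fails, the returned trace plays the role of the counterexample run that a model checker reports.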

TLA+
• Temporal (time)
• Logic (Boolean logic) of
• Actions (state machines)
• Plus (some stuff)

Boolean Logic
• A predicate is an expression that returns a Boolean

Actions – state machines
• State machines
• States
• Transitions

State machine for playing chess

Formalizing the actions

The action is the transition
• This is a test, not an assignment!
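
To illustrate "test, not assignment": in TLA+ an action such as x' = x + 1 is a formula over the current value x and the next value x'. A rough Python analogue (the function name increment is a hypothetical choice) is a predicate that accepts or rejects a proposed pair of states.

```python
def increment(x, x_next):
    """Action as a predicate: True when the step x -> x_next is allowed."""
    return x_next == x + 1

# The action does not compute the next state. It only accepts or rejects
# a proposed pair (current state, next state).
allowed = [(a, b) for a in range(3) for b in range(4) if increment(a, b)]
```

Here allowed ends up containing exactly the pairs that satisfy the predicate; nothing is ever assigned to x.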

Actions are tests

Temporal – state transitions over time
• Infinite amount of time
• TLA+ can ask questions like:
• Is something always true?
• Is something ever true?
• If X happens, must Y happen afterwards?

Count to three

Count to three, refactored
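
The spec shown on these slides did not survive extraction. As a stand-in, here is a Python sketch of a "count to three" state machine, assuming x starts at 1 and steps 1 -> 2 -> 3. The names init, next_step and CANDIDATES are hypothetical, and the initial condition and the step are both written as predicates, in the spirit of the previous slides.

```python
def init(x):
    """Initial-state predicate (assumed: counting starts at 1)."""
    return x == 1

def next_step(x, x_next):
    """Next-state predicate: the only allowed steps are 1 -> 2 and 2 -> 3."""
    return (x == 1 and x_next == 2) or (x == 2 and x_next == 3)

CANDIDATES = range(0, 5)  # hypothetical finite universe of values to try

def behaviour():
    """Follow allowed steps from the initial state until none is left."""
    (start,) = [x for x in CANDIDATES if init(x)]
    trace = [start]
    while True:
        options = [y for y in CANDIDATES if next_step(trace[-1], y)]
        if not options:
            return trace
        trace.append(options[0])

trace = behaviour()
```

The run stops at 3 because no step is allowed from there.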

TLA+ toolbox: IDE and checker

Deadlock in TLA+

• Infinite time in TLA+

Count to three, updated

Doing nothing is always an option!

Count to three, with stuttering
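
A Python sketch (not the TLA+ from the slide) of what stuttering changes, assuming the counter steps 1 -> 2 -> 3. A state with no allowed successor is what a checker reports as a deadlock. Once "do nothing" is allowed, every state has itself as a successor. All names here are hypothetical.

```python
def step(x, x_next):
    """The real counting steps: 1 -> 2 and 2 -> 3."""
    return (x == 1 and x_next == 2) or (x == 2 and x_next == 3)

def step_or_stutter(x, x_next):
    """Either a real step, or a stuttering step that leaves x unchanged."""
    return step(x, x_next) or x_next == x

def successors(action, x, values=range(1, 4)):
    return [y for y in values if action(x, y)]

# States with no successor at all, with and without stuttering.
deadlocked = [x for x in range(1, 4) if not successors(step, x)]
deadlocked_with_stutter = [x for x in range(1, 4)
                           if not successors(step_or_stutter, x)]
```

Without stuttering, state 3 is reported as stuck. With stuttering nothing is stuck, but from state 1 the system may now either move to 2 or stay at 1 forever.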

The power of temporal properties
• A property applies to the whole system over time
• Not just to individual states
• Checking these properties is important
• Humans are bad at this
• Programming languages are bad at this too
• TLA+ can help with this!

Properties in TLA+
• Always true
• In every state, x>0
• Eventually true
• At some point in time, x=2
• Eventually always
• A condition eventually becomes true and then stays true
• x eventually becomes 3 and then stays there
• Leads to
• If x ever becomes 2, then it will become 3 later
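
These four kinds of property are written in TLA+ as []P (always), <>P (eventually), <>[]P (eventually always) and P ~> Q (leads to). The Python sketch below evaluates them on an infinite behaviour represented as a finite prefix followed by a cycle that repeats forever. That representation and all function names are assumptions made for this illustration.

```python
def always(p, prefix, cycle):
    return all(p(s) for s in prefix + cycle)

def eventually(p, prefix, cycle):
    return any(p(s) for s in prefix + cycle)

def eventually_always(p, prefix, cycle):
    # From some point on only cycle states occur,
    # so p must hold on the whole cycle.
    return all(p(s) for s in cycle)

def leads_to(p, q, prefix, cycle):
    # Every position where p holds must be followed, now or later, by q.
    states = prefix + cycle
    for i, s in enumerate(states):
        if p(s):
            future = states[i:] + cycle  # rest of the run plus one full lap
            if not any(q(t) for t in future):
                return False
    return True

counting = ([1, 2], [3])  # behaviour 1, 2, 3, 3, 3, ...
stuck = ([1], [2])        # behaviour 1, 2, 2, 2, ...

results = {
    "always x>0": always(lambda x: x > 0, *counting),
    "eventually x=2": eventually(lambda x: x == 2, *counting),
    "eventually always x=3": eventually_always(lambda x: x == 3, *counting),
    "x=2 leads to x=3": leads_to(lambda x: x == 2, lambda x: x == 3, *counting),
    "stuck: x=2 leads to x=3": leads_to(lambda x: x == 2, lambda x: x == 3, *stuck),
}
```

A model checker has to establish such properties for every possible behaviour, not just the two sample behaviours used here.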

Properties for “count to three”

Adding properties to the script

Oh no! Model checker says we have errors!

Stuttering caused a loop!

Fixing the error
• Make sure every possible transition is followed
• Don’t get stuck in an infinite loop

Add fairness! TLA+ can model this
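
A Python sketch of why fairness removes the looping counterexample, assuming the counter steps 1 -> 2 -> 3 with stuttering allowed. In this toy system the only way to run forever is to stutter in one state, so a behaviour that never reaches 3 must stutter forever at 1 or at 2. Weak fairness on the step action rules out stuttering forever while that step stays enabled. The names below are hypothetical, and the check is a shortcut that only works for this tiny example.

```python
STATES = [1, 2, 3]

def step(x, y):
    """The real counting steps: 1 -> 2 and 2 -> 3."""
    return (x, y) in {(1, 2), (2, 3)}

def enabled(x):
    """True when some real step is possible from x."""
    return any(step(x, y) for y in STATES)

def counterexamples(weakly_fair):
    """States where a behaviour may stutter forever without reaching 3."""
    found = []
    for x in STATES:
        # Under weak fairness, stuttering forever is only allowed
        # in states where the step is not enabled.
        may_stutter_forever = not (weakly_fair and enabled(x))
        if may_stutter_forever and x != 3:
            found.append(x)
    return found

without_fairness = counterexamples(weakly_fair=False)
with_fairness = counterexamples(weakly_fair=True)
```

Without fairness the checker can report a loop at 1 or at 2. With weak fairness, the only place left to stutter forever is 3, which is the desired end state.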

The complete spec with fairness

A more complicated example
• Very exciting: we can count to three!
• What about a more complicated problem?
• What about concurrency?
• Property checking is where TLA+ is powerful and it can help

Producer/consumer problem
Producer:
• Check if queue is not full
• If true, then write item to queue
Consumer:
• Check if queue is not empty
• If true, then read item from queue
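
The two steps of each role can be sketched sequentially in Python. The capacity MAX_ITEMS and the function names are hypothetical. Note that the check and the write (or read) are separate steps, which matters as soon as other processes can run in between.

```python
from collections import deque

MAX_ITEMS = 2  # hypothetical queue capacity

def producer_step(queue, item):
    """Check-then-write. Returns True if the item was enqueued."""
    if len(queue) < MAX_ITEMS:  # check: queue is not full
        queue.append(item)      # act: write item to queue
        return True
    return False

def consumer_step(queue):
    """Check-then-read. Returns the item, or None if the queue was empty."""
    if queue:                   # check: queue is not empty
        return queue.popleft()  # act: read item from queue
    return None

q = deque()
wrote = [producer_step(q, item) for item in "abc"]  # third write is refused
read = [consumer_step(q) for _ in range(3)]         # third read finds nothing
```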

Embedded concurrency!

Temporal properties for the producer/consumer

• 8 states, no errors
• BUT only for 1 producer and 1 consumer!

Concurrent version with multiple producers/consumers
• Use the Plus in TLA+

• We need
• A set of producers
• A set of consumers
• Use the set-description part of TLA+

Plus… set theory!

Running the script
• Run the model checker with 2 producers and 2 consumers
• Use the AlwaysWithinBounds property
• There are 38 states
• Error: Invariant AlwaysWithinBounds is violated!
• The design does not work
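
The kind of violation reported here can be reproduced with a small Python model (not the TLA+ spec from the slides). It assumes each process alternates between a "check" step and an "act" step, and that the queue holds at most MAX_ITEMS = 1 item. These details and all names are assumptions, so the number of states explored need not match the figures above.

```python
from collections import deque

MAX_ITEMS = 1  # hypothetical queue capacity

def successors(state, n_prod):
    """state = (queue_length, pcs). pcs[i] is 'check' or 'act' for process i.
    Processes 0..n_prod-1 are producers, the rest are consumers."""
    length, pcs = state
    for i, pc in enumerate(pcs):
        is_producer = i < n_prod
        if pc == "check":
            can_go = length < MAX_ITEMS if is_producer else length > 0
            if can_go:
                yield (length, pcs[:i] + ("act",) + pcs[i + 1:])
        else:
            new_length = length + 1 if is_producer else length - 1
            yield (new_length, pcs[:i] + ("check",) + pcs[i + 1:])

def always_within_bounds(state):
    return 0 <= state[0] <= MAX_ITEMS

def find_violation(n_prod, n_cons):
    """Breadth-first search. Returns a shortest violating trace, or None."""
    init = (0, ("check",) * (n_prod + n_cons))
    parent = {init: None}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not always_within_bounds(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for succ in successors(state, n_prod):
            if succ not in parent:
                parent[succ] = state
                queue.append(succ)
    return None

single = find_violation(1, 1)  # one producer, one consumer
multi = find_violation(2, 2)   # two producers, two consumers
```

With two producers, both can pass the "not full" check before either writes, so the queue length exceeds the bound. The trace in multi shows that interleaving step by step.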

Fixing the error
• TLA+ won’t tell you how to fix the error
• You must fix the spec
• Easy to test the fixes
• Update the spec to use atomic operations (or locks)
• Re-run the model checker!
• You gain confidence in your design
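
A Python sketch of the atomic fix, under the same assumed toy model (capacity MAX_ITEMS = 1, hypothetical names). Once check and act form a single indivisible step, no process can be caught between the two, so the model state reduces to the queue length.

```python
from collections import deque

MAX_ITEMS = 1  # hypothetical queue capacity

def successors_atomic(length, n_prod, n_cons):
    """Check and act happen in one step, so no other process can interleave."""
    if n_prod > 0 and length < MAX_ITEMS:
        yield length + 1  # some producer checks and writes atomically
    if n_cons > 0 and length > 0:
        yield length - 1  # some consumer checks and reads atomically

def reachable_lengths(n_prod, n_cons):
    seen = {0}
    frontier = deque([0])
    while frontier:
        length = frontier.popleft()
        for succ in successors_atomic(length, n_prod, n_cons):
            if succ not in seen:
                seen.add(succ)
                frontier.append(succ)
    return seen

lengths = reachable_lengths(2, 2)
within_bounds = all(0 <= n <= MAX_ITEMS for n in lengths)
```

Re-running the exhaustive search on the changed model is what gives confidence that the fix removes the violation.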

The power of TLA+
• TLA+ can be used to model large concurrent systems
• Such as distributed systems!

• Examples where TLA+ can help:


• https://hillelwayne.com/modeling-deployments/
• https://hillelwayne.com/talks/distributed-systems-tlaplus/
• Learn more: https://learntla.com/index.html

Why are concurrent programs important?
• They are everywhere nowadays because we all use distributed
systems
• Distributed systems use the most complex programs
• systems that span the world
• serve millions of users
• and are always available
• Incredibly relevant today as everything is a distributed system!

Definition of Distributed Systems
• A distributed system is a system where multiple processes located on
networked computers communicate via messages to achieve a
common goal
• "A distributed system is one in which the failure of a computer you
didn't even know existed can render your own computer unusable."
(Leslie Lamport)
• Examples: client-server applications, MapReduce, grid computing,
peer-to-peer networks, Skype, cloud computing, email clients, music
streaming, FTP connections, Hadoop, web service compositions, video
streaming, etc.

Major Challenges (1)
• No global clock, no ordering of events
• Events happen at different times
• Different interleavings of events are possible
• The participants see the events interleaving in different ways
• Use a logical clock (happens-before)
~ sounds like a memory model is needed
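
One common form of logical clock is the Lamport clock, sketched here in Python. The class and method names are hypothetical. Each process keeps a counter, increments it on every local event, and on receiving a message jumps past the timestamp carried by that message.

```python
class LamportClock:
    """Per-process logical clock. Timestamps respect happens-before."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """A local event: advance the clock by one."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event. The returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time):
        """Receiving: move past both the local time and the sender's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
a.tick()                     # local event on a
stamp = a.send()             # a sends a message to b
received = b.receive(stamp)  # b receives it
```

If one event happens before another, its timestamp is smaller. The converse does not hold: a smaller timestamp does not imply happens-before.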

Major Challenges (2)
Resilience – Consistency – Consensus

Possible errors due to the concurrent nature of the system:


• Deadlocks
• Livelock/starvation
• Lack of consensus
~ knowing classical synchronization problems might
help you solve some particular situation

2015: Formal Methods at AWS *
• Precise description of system in TLA+ (PlusCal language, similar to C)
• In 6 large, complex real-world systems
• 7 teams
• Found subtle bugs
• Confidence to make aggressive optimizations without sacrificing
correctness
• Use formal specification to teach system to new engineers

* How Amazon Web Services Uses Formal Methods by Chris Newcombe et al. (Communications of the
ACM, 2015)

2015: Formal Methods at AWS

2021: Using Lightweight Formal Methods to Validate a Key-Value
Storage Node in Amazon S3
• S3’s new ShardStore storage node
• Built in Rust
• Crash consistency, concurrency, I/O, etc.
• Specs alongside the code
• Reference model spec
• Decompose correctness checks
• Sequential correctness
• Crashes
• Concurrency
• Accept weaker correctness guarantees than full formal verification
• Adding continuous validation

* Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3 by James
Bornholt et al., SOSP2021. https://www.youtube.com/watch?v=YdxvOPenjWI
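
A rough idea of what checking an implementation against a reference model can look like, sketched in Python rather than Rust. This is an illustration of the general technique only, not ShardStore's code. ReferenceModel, LogStore and check_against_model are hypothetical names, and the append-only log is an invented stand-in for a real implementation.

```python
import random

class ReferenceModel:
    """Executable specification: a plain dictionary."""
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value
    def delete(self, key):
        self.data.pop(key, None)
    def get(self, key):
        return self.data.get(key)

class LogStore:
    """Hypothetical implementation under test: an append-only log."""
    def __init__(self):
        self.log = []
    def put(self, key, value):
        self.log.append((key, value))
    def delete(self, key):
        self.log.append((key, None))  # None acts as a tombstone
    def get(self, key):
        for k, v in reversed(self.log):  # newest entry for the key wins
            if k == key:
                return v
        return None

def check_against_model(seed, steps=200):
    """Apply the same random operations to both and compare every read."""
    rng = random.Random(seed)
    model, impl = ReferenceModel(), LogStore()
    for _ in range(steps):
        op = rng.choice(["put", "get", "delete"])
        key = rng.choice("abc")
        if op == "put":
            value = rng.randrange(100)
            model.put(key, value)
            impl.put(key, value)
        elif op == "delete":
            model.delete(key)
            impl.delete(key)
        elif model.get(key) != impl.get(key):
            return False
    return True

all_agree = all(check_against_model(seed) for seed in range(20))
```

Random testing of this kind gives weaker guarantees than a proof, but the check is cheap to run continuously as the code changes.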

Conclusion
• Formal verification (model checking) brings guarantees and allows us
to check properties
• Formal checking of concurrent code is here to stay
• More engineers will need to write formal specs for their code
• Industry is adopting model checkers, especially for newly
developed systems

• References:
• https://www.youtube.com/watch?v=tqwcz-Yt9gQ
