Synchronization in
Distributed Systems
CS-4513
D-Term 2007
(Slides include materials from Operating System Concepts, 7th ed., by Silberschatz, Galvin, & Gagne,
Modern Operating Systems, 2nd ed., by Tanenbaum, and Distributed Systems: Principles & Paradigms, 2nd
ed., by Tanenbaum and Van Steen)
CS-4513, D-Term 2007 Synchronization 1
Issue
• Synchronization within one system is hard
enough
• Semaphores
• Messages
• Monitors
• …
• Synchronization among processes in a
distributed system is much harder
Example
• File locking in NFS
• Not supported directly within NFS v.3
• Needs a separate lock manager service to supplement
NFS
What about using Time?
• make recompiles if foo.c is newer than foo.o
• Scenario
• make on machine A to build foo.o
• Test on machine B; find and fix a bug in foo.c
• Re-run make on machine B
• Nothing happens!
• Why?
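One plausible answer to the question above is clock skew between the two machines. The sketch below is illustrative only: the function name `needs_rebuild`, the skew value, and the timestamps are all hypothetical, chosen to show how a timestamp comparison can go wrong when machine B's clock runs behind machine A's.

```python
def needs_rebuild(source_mtime: float, object_mtime: float) -> bool:
    """make's rule: rebuild only if the source is newer than the object."""
    return source_mtime > object_mtime


# Assumed scenario: times are in seconds; B's clock runs 120 s behind A's.
SKEW_B = -120.0

object_mtime = 1000.0           # foo.o stamped by machine A at true time 1000
source_mtime = 1060.0 + SKEW_B  # foo.c edited on B at true time 1060, stamped 940

# foo.c really is newer, but its timestamp says otherwise.
stale = needs_rebuild(source_mtime, object_mtime)
print("rebuild needed?", stale)
```

With synchronized clocks the same edit would carry the stamp 1060 and trigger a rebuild.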
Synchronizing Time on
Distributed Computers
• See Tanenbaum & Van Steen, §6.1.1, 6.1.2
for descriptions of
• Solar Time
• International Atomic Time
• GPS, etc.
• §6.1.3 for Clock Synchronization
algorithms
NTP (Network Time Protocol)
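[Figure: message timeline between A and B. A sends the request at T1; B receives it at T2 and replies at T3; A receives the reply at T4.]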
• A requests time of B at its own T1
• B receives request at its T2, records it
• B responds at its T3, sending values of T2 and T3
• A receives response at its T4
• Question: what is θ = TB – TA?
NTP (Network Time Protocol)
[Figure: same message timeline, T1 through T4]
• Question: what is θ = TB – TA?
• Assume transit time is approximately the same
both ways
• Then θ ≈ ((T2 – T1) + (T3 – T4)) / 2
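The offset estimate can be worked through numerically. The following is a minimal sketch, assuming symmetric transit times as stated above; the function name `ntp_offset_and_delay` and the sample timestamps are made up for illustration and are not taken from any NTP implementation.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate B's offset relative to A and the round-trip delay.

    t1, t4 are read from A's clock; t2, t3 are read from B's clock.
    Assumes the transit time is about the same in both directions.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # estimate of TB - TA
    delay = (t4 - t1) - (t3 - t2)          # round trip minus B's processing time
    return offset, delay


# Hypothetical exchange: B is 5 s ahead of A, each leg takes 0.1 s,
# and B spends 0.02 s preparing its reply.
t1 = 100.00   # A sends (A's clock)
t2 = 105.10   # B receives (B's clock)
t3 = 105.12   # B replies (B's clock)
t4 = 100.22   # A receives (A's clock)

offset, delay = ntp_offset_and_delay(t1, t2, t3, t4)
print(f"offset ~ {offset:.3f} s, round-trip delay ~ {delay:.3f} s")
```

If the two legs are not symmetric, the error in the offset estimate is bounded by half the round-trip delay.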
NTP (continued)
• Servers organized as strata
– Stratum 1 server adjusts itself directly to a stratum 0 reference clock (e.g., a WWV receiver)
– Stratum 2 adjusts self to Stratum 1 servers
– Etc.
• Within a stratum, servers adjust with each
other
Adjusting the Clock
• If TA is slow, increase the clock rate slightly
• The clock speeds up and catches up gradually
• If TA is fast, decrease the clock rate slightly
• The clock slows down gradually; it is never set backward
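Gradual adjustment can be pictured as changing how much time each timer tick adds to the software clock. This is a rough sketch under assumed numbers: the function `slewed_ticks`, the 10 ms tick, and the 1 ms per-tick limit are all hypothetical and not drawn from any particular operating system.

```python
def slewed_ticks(offset: float, tick: float = 0.010, max_adjust: float = 0.001):
    """Return the per-tick increments needed to absorb `offset` gradually.

    offset > 0 means the local clock is slow (must gain time);
    offset < 0 means it is fast (must lose time).
    Each tick adds `tick` plus a bounded correction, so the clock
    always moves forward as long as max_adjust < tick.
    """
    increments = []
    remaining = offset
    while abs(remaining) > 1e-12:
        # Clamp the correction applied on this tick to [-max_adjust, +max_adjust].
        adjust = max(-max_adjust, min(max_adjust, remaining))
        increments.append(tick + adjust)
        remaining -= adjust
    return increments


slow_clock = slewed_ticks(0.0035)    # clock is 3.5 ms slow
fast_clock = slewed_ticks(-0.002)    # clock is 2 ms fast
print(len(slow_clock), "ticks to catch up;", len(fast_clock), "ticks to fall back")
```

Every increment stays positive, so timestamps taken during the correction remain monotonic.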
Berkeley Algorithm
• Berkeley Algorithm
• Time Daemon polls other systems
• Computes average time
• Tells other machines how to adjust their clocks
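The averaging step can be sketched as follows. This is a simplified illustration that ignores message delay; the function name `berkeley_adjustments`, the machine names, and the clock readings (in minutes) are assumptions made for the example.

```python
def berkeley_adjustments(daemon_time: float, reported_times: dict):
    """Average all clocks (daemon included) and compute each machine's correction.

    reported_times maps a machine name to the time it reported when polled.
    Returns (average, adjustments), where a positive adjustment means
    "advance your clock" and a negative one means "slow down".
    """
    all_times = [daemon_time] + list(reported_times.values())
    average = sum(all_times) / len(all_times)
    adjustments = {name: average - t for name, t in reported_times.items()}
    adjustments["daemon"] = average - daemon_time
    return average, adjustments


# Hypothetical readings in minutes past midnight: 3:00, 3:25, and 2:50.
average, adjustments = berkeley_adjustments(180.0, {"A": 205.0, "B": 170.0})
print(average, adjustments)
```

Note that the result is only an agreed-upon time; nothing ties it to real time unless the daemon's own clock is corrected separately.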
Problem
• Time is not a reliable method of
synchronization
• Users mess up clocks
• (and forget to set their time zones!)
• Unpredictable delays in Internet
• Relativistic issues
• If A and B are far apart physically, and
• two events TA and TB are very close in time, then
• which comes first? how do you know?
Example
• At midnight PDT, bank posts interest to your
account based on current balance.
• At 3:00 AM EDT, you withdraw some cash.
• Does interest get paid on the cash you just
withdrew?
• Depends upon which event came first!
• What if the two transactions are made on different replicas?
Solution — Logical Clocks
• Not “clocks” at all
• Just monotonic counters
• Lamport’s temporal logic
• Definition: a → b means
• a occurs before b
• I.e., all processes agree that a happens, then later b
happens
• E.g., send(message) → receive(message)
Logical Clocks (continued)
• Every machine maintains its own logical
“clock” C
• Transmit C with every message
• If C_received > C_own, then adjust C_own forward to
C_received + 1
• Result: Anything that is known to follow
something else in logical time has larger
logical clock value.
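The rules above can be sketched as a small counter class. This is a minimal illustration, not a full implementation: the class name `LamportClock` and its method names are invented here, and the sketch also ticks the counter on every local event and on every receive, which is the usual convention but goes slightly beyond what the slide states.

```python
class LamportClock:
    """A logical clock: a monotonic counter, not a measure of real time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the counter for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, received: int) -> int:
        """Move past both our own counter and the one carried by the message."""
        self.time = max(self.time, received) + 1
        return self.time


a, b = LamportClock(), LamportClock()
for _ in range(3):
    a.tick()                 # three local events on machine A
stamp = a.send()             # A's clock is now 4
b.receive(stamp)             # B jumps from 0 to 5
print(stamp, b.time)
```

The receive always ends up with a larger value than the send, which is exactly the property send(message) → receive(message) requires.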
Variations
• See Tanenbaum & Van Steen, §6.2
• Note: Grapevine timestamps for updating its
registries behave somewhat like logical
clocks.
Mutual Exclusion in Distributed Systems
• Prevent inconsistent usage or updates to
shared data
• Two approaches
• Token
• Permission
Centralized Permission Approach
• One process is elected coordinator for a resource
• All others ask permission.
• Possible responses
– Okay; denied (ask again later); none (caller waits)
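A coordinator of this kind can be sketched as a holder plus a FIFO queue. The sketch below models only the "okay" and "no reply (caller waits)" responses; the class name `Coordinator` and its methods are hypothetical, and message passing is replaced by direct method calls for brevity.

```python
from collections import deque


class Coordinator:
    """Grants one resource to one process at a time, first come first served."""

    def __init__(self) -> None:
        self.holder = None       # process currently holding the resource
        self.waiting = deque()   # processes whose requests got no reply yet

    def request(self, pid):
        """Return "OK" if granted now; None means no reply, so the caller waits."""
        if self.holder is None:
            self.holder = pid
            return "OK"
        self.waiting.append(pid)
        return None

    def release(self, pid):
        """Release the resource and return the next process to be sent OK, if any."""
        if pid != self.holder:
            raise ValueError(f"process {pid} does not hold the resource")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


coord = Coordinator()
first = coord.request(1)     # granted immediately
second = coord.request(2)    # queued, no reply
next_holder = coord.release(1)
print(first, second, next_holder)
```

Serving the queue in arrival order is what makes the sharing fair; a different queue discipline would change that property.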
Centralized Permissions (continued)
• Advantages
– Mutual exclusion guaranteed by coordinator
– “Fair” sharing possible without starvation
– Simple to implement
• Disadvantages
– Single point of failure (coordinator crashes)
– Performance bottleneck
–…
Decentralized Permissions
• n coordinators; ask all
• E.g., n replicas
• Must have agreement of m > n/2
• Advantage
• No single point of failure
• Disadvantage
• Lots of messages
• Really messy
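The majority rule can be illustrated with a toy model in which each coordinator votes for whichever requester reaches it first. The names `vote` and `acquire`, the dictionary representation of a coordinator, and the process names are assumptions for the example; releasing votes after a failed attempt is deliberately left out.

```python
def vote(coordinator: dict, pid: str) -> bool:
    """A coordinator grants permission to the first requester only."""
    if coordinator["holder"] is None:
        coordinator["holder"] = pid
    return coordinator["holder"] == pid


def acquire(coordinators: list, pid: str) -> bool:
    """Ask every coordinator; succeed only with m > n/2 grants."""
    grants = sum(vote(c, pid) for c in coordinators)
    return grants > len(coordinators) / 2


# Hypothetical race among n = 5 coordinators: P's requests reach three of
# them first, Q's requests reach the other two first.
coordinators = [{"holder": None} for _ in range(5)]
for c in coordinators[:3]:
    vote(c, "P")
for c in coordinators[3:]:
    vote(c, "Q")

p_wins = acquire(coordinators, "P")   # 3 of 5 votes
q_wins = acquire(coordinators, "Q")   # 2 of 5 votes
print(p_wins, q_wins)
```

Because two disjoint majorities cannot exist among n coordinators, at most one requester can succeed at a time.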
Distributed Permissions
• Use Lamport’s logical clocks
• Requestor sends reliable messages to all
other processes (including self)
• Waits for OK replies from all other processes
• Replying process
• If not interested in resource, reply OK
• If currently using resource, queue request, don’t
reply
• If also interested, reply OK if the requestor's timestamp is earlier;
queue the request if the requestor's timestamp is later
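The three cases for a replying process can be written as one decision function. This is a sketch only: the names `State` and `on_request` are invented, requests are modeled as (timestamp, process id) pairs, and breaking timestamp ties by process id is an assumption not stated on the slide.

```python
from enum import Enum


class State(Enum):
    RELEASED = "not interested"
    WANTED = "interested"
    HELD = "using resource"


def on_request(state: State, own_request, incoming) -> str:
    """Decide how to answer an incoming request.

    own_request and incoming are (timestamp, pid) tuples; own_request is
    only meaningful when state is WANTED. Tuples compare by timestamp
    first, then pid (assumed tie-break).
    """
    if state is State.RELEASED:
        return "OK"
    if state is State.HELD:
        return "QUEUE"
    # Both want the resource: the earlier logical timestamp wins.
    return "OK" if incoming < own_request else "QUEUE"


# Hypothetical timestamps: process 0 requests at 8, process 2 at 12.
reply_from_1 = on_request(State.RELEASED, None, (8, 0))
reply_from_2 = on_request(State.WANTED, (12, 2), (8, 0))
reply_from_0 = on_request(State.WANTED, (8, 0), (12, 2))
print(reply_from_1, reply_from_2, reply_from_0)
```

A queued request is answered with OK once the local process has finished with the resource.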
Distributed Permissions (continued)
• Process 0 and Process 2 want resource
• Process 1 replies OK because not interested
• Process 0 has lower time-stamp, thereby goes first
• …
Distributed Permissions (continued)
• Advantage
– No central bottleneck
– Fewer messages than Decentralized
• Disadvantage
– n points of failure
– i.e., failure of one node to respond locks up
system
Token system
• Organize processes in logical ring
• Each process knows successor
• Token is passed around ring
• If process is interested in resource, it waits for token
• Releases token when done
• If node is dead, process skips over it
• Passes token to successor of dead process
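Skipping dead nodes when passing the token can be sketched as a walk around the ring. The function name `pass_token`, the list representation of the ring, and the set of live processes are assumptions for illustration; how a process learns that its successor is dead (e.g., a missing acknowledgement) is not modeled.

```python
def pass_token(ring: list, alive: set, holder):
    """Return the next live process after `holder`, walking the ring in order.

    Dead processes are skipped. If the holder is the only live process,
    the token comes back to it.
    """
    n = len(ring)
    start = ring.index(holder)
    for step in range(1, n + 1):
        candidate = ring[(start + step) % n]
        if candidate in alive:
            return candidate
    raise RuntimeError("no live process left in the ring")


ring = [0, 1, 2, 3]
alive = {0, 1, 3}                          # process 2 has crashed
after_1 = pass_token(ring, alive, 1)       # skips 2
after_3 = pass_token(ring, alive, 3)       # wraps around to 0
print(after_1, after_3)
```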
Token system (continued)
• Advantages
• Fairness, no starvation
• Recovery from crashes if token is not lost
• Disadvantage
• Crash of process holding token
• Difficult to detect; difficult to regenerate exactly one token
Next Time
• Election algorithms for synchronization
• Consistency and Replication