Overview
1 Consistency & Replication
Introduction
Data-Centric Consistency Models
Client-Centric Consistency Models
Replica Management
Consistency Protocols
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 1 / 41
Consistency & Replication
Consistency & replication
Introduction (what’s it all about)
Data-centric consistency
Client-centric consistency
Replica management
Consistency protocols
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 2 / 41
Consistency & Replication Introduction
Performance and scalability
Main issue
To keep replicas consistent, we generally need to ensure that all conflicting
operations are done in the the same order everywhere
Conflicting operations
From the world of transactions:
Read–write conflict: a read operation and a write operation act
concurrently
Write–write conflict: two concurrent write operations
Issue
Guaranteeing global ordering on conflicting operations may be a costly operation,
downgrading scalability Solution: weaken consistency requirements so that
hopefully global synchronization can be avoided
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 3 / 41
Consistency & Replication Data-Centric Consistency Models
Data-centric consistency models
Consistency model
A contract between a (distributed) data store and processes, in which the data
store specifies precisely what the results of read and write operations are in the
presence of concurrency.
Essential
A data store is a distributed collection of storages:
Process Process Process
Local copy
Distributed data store
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 4 / 41
Consistency & Replication Data-Centric Consistency Models
Continuous Consistency
Observation
We can actually talk a about a degree of consistency:
replicas may differ in their numerical value
replicas may differ in their relative staleness
there may be differences with respect to (number and order) of
performed update operations
Conit
Consistency unit => specifies the data unit over which consistency is to
be measured.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 5 / 41
Consistency & Replication Data-Centric Consistency Models
Example: Conit
Conit (contains the variables x and y )
Each replica has a vector clock: ([known] time @ A, [known] time @ B)
B sends A operation [h5, Bi: g := g + 45]; A has made this operation
permanent (cannot be rolled back)
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 6 / 41
Consistency & Replication Data-Centric Consistency Models
Example: Conit
Conit (contains the variables x and y )
A has three pending operations => order deviation = 3
A has missed two operation from B(6, B), (7, B), yielding a max diff of 482
units => (2, 482)
Likewise B has missed three operation from A(8, A), (9, A), (10, A), yielding
a max diff of 686 units => (3, 686)
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 7 / 41
Consistency & Replication Data-Centric Consistency Models
Sequential consistency
Definition
The result of any execution is the same as if the operations of all processes
were executed in some sequential order, and the operations of each
individual process appear in this sequence in the order specified by its
program.
P1: W(x)a P1: W(x)a
P2: W(x)b P2: W(x)b
P3: R(x)b R(x)a P3: R(x)b R(x)a
P4: R(x)b R(x)a P4: R(x)a R(x)b
(a) (b)
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 8 / 41
Consistency & Replication Data-Centric Consistency Models
Causal consistency
Definition
Writes that are potentially causally related must be seen by all processes in
the same order. Concurrent writes may be seen in a different order by
different processes.
P1: W(x)a
P2: R(x)a W(x)b
P3: R(x)b R(x)a
P4: R(x)a R(x)b
(a)
P1: W(x)a
P2: W(x)b
P3: R(x)b R(x)a
P4: R(x)a R(x)b
(b)
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 9 / 41
Consistency & Replication Data-Centric Consistency Models
Grouping operations
Definition
Accesses to synchronization variables are sequentially consistent.
No access to a synchronization variable is allowed to be performed
until all previous writes have completed everywhere.
No data access is allowed to be performed until all previous accesses
to synchronization variables have been performed.
Basic idea
You don’t care that reads and writes of a series of operations are
immediately known to other processes. You just want the effect of the
series itself to be known.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 10 / 41
Consistency & Replication Data-Centric Consistency Models
Grouping operations
P1: Acq(Lx) W(x)a Acq(Ly) W(y)b Rel(Lx) Rel(Ly)
P2: Acq(Lx) R(x)a R(y) NIL
P3: Acq(Ly) R(y)b
Observation
Weak consistency implies that we need to lock and unlock data (implicitly
or not).
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 11 / 41
Consistency & Replication Client-Centric Consistency Models
Client-centric consistency models
Overview
System model
Monotonic reads
Monotonic writes
Read-your-writes
Write-follows-reads
Goal
Show how we can perhaps avoid systemwide consistency, by concentrating
on what specific clients want, instead of what should be maintained by
servers.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 12 / 41
Consistency & Replication Client-Centric Consistency Models
Consistency for mobile users
Example
Consider a distributed database to which you have access through your
notebook. Assume your notebook acts as a front end to the database.
At location A you access the database doing reads and updates.
At location B you continue your work, but unless you access the same
server as the one at location A, you may detect inconsistencies:
your updates at A may not have yet been propagated to B
you may be reading newer entries than the ones available at A
your updates at B may eventually conflict with those at A
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 13 / 41
Consistency & Replication Client-Centric Consistency Models
Consistency for mobile users
Note
The only thing you really want is that the entries you updated and/or read
at A, are in B the way you left them in A. In that case, the database will
appear to be consistent to you.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 14 / 41
Consistency & Replication Client-Centric Consistency Models
Basic architecture
Client moves to other location
and (transparently) connects to
other replica
Replicas need to maintain
client-centric consistency
Wide-area network
Distributed and replicated database
Read and write operations
Portable computer
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 15 / 41
Consistency & Replication Client-Centric Consistency Models
Monotonic reads
Definition
If a process reads the value of a data item x, any successive read operation
on x by that process will always return that same or a more recent value.
L1: WS( x 1) R( x 1)
L2: WS( x 1;x 2) R( x 2)
L1: WS( x 1) R( x 1)
L2: WS( x 2) R( x 2)
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 16 / 41
Consistency & Replication Client-Centric Consistency Models
Client-centric consistency: notation
Notation
WS(xi [t]) is the set of write operations (at Li ) that lead to version xi
of x (at time t)
WS(xi [t1 ]; xj [t2 ]) indicates that it is known that WS(xi [t1 ]) is part of
WS(xj [t2 ]).
Note: Parameter t is omitted from figures.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 17 / 41
Consistency & Replication Client-Centric Consistency Models
Monotonic reads
Example
Automatically reading your personal calendar updates from different
servers. Monotonic Reads guarantees that the user sees all updates, no
matter from which server the automatic reading takes place.
Example
Reading (not modifying) incoming mail while you are on the move. Each
time you connect to a different e-mail server, that server fetches (at least)
all the updates from the server you previously visited.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 18 / 41
Consistency & Replication Client-Centric Consistency Models
Monotonic writes
Definition
A write operation by a process on a data item x is completed before any
successive write operation on x by the same process.
L1: W( x 1)
L2: WS( x 1) W(x 2 )
L1: W( x 1)
L2: W(x 2 )
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 19 / 41
Consistency & Replication Client-Centric Consistency Models
Monotonic writes
Example
Updating a program at server S2 , and ensuring that all components on
which compilation and linking depends, are also placed at S2 .
Example
Maintaining versions of replicated files in the correct order everywhere
(propagate the previous version to the server where the newest version is
installed).
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 20 / 41
Consistency & Replication Client-Centric Consistency Models
Read your writes
Definition
The effect of a write operation by a process on data item x, will always be
seen by a successive read operation on x by the same process.
L1: W( x 1)
Example
L2: WS( x 1;x 2) R( x 2)
Updating your Web page and
guaranteeing that your Web
L1: W( x 1) browser shows the newest
L2: WS( x 2) R( x 2) version instead of its cached
copy.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 21 / 41
Consistency & Replication Client-Centric Consistency Models
Writes follow reads
Definition
A write operation by a process on a data item x following a previous read
operation on x by the same process, is guaranteed to take place on the
same or a more recent value of x that was read.
L1: WS( x 1) R( x 1) Example
L2: WS( x 1;x 2) W( x 2) See reactions to posted
articles only if you have the
L1: WS( x 1) R( x 1) original posting (a read “pulls
L2: WS( x 2) W( x 3)
in” the corresponding write
operation).
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 22 / 41
Consistency & Replication Replica Management
Distribution protocols
Replica server placement
Content replication and placement
Content distribution
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 23 / 41
Consistency & Replication Replica Management
Replica placement
Essence
Figure out what the best K places are out of N possible locations.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 24 / 41
Consistency & Replication Replica Management
Replica placement
Essence
Figure out what the best K places are out of N possible locations.
Select best location out of N − K for which the average distance to
clients is minimal. Then choose the next best server. (Note: The
first chosen location minimizes the average distance to all clients.)
Computationally expensive.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 24 / 41
Consistency & Replication Replica Management
Replica placement
Essence
Figure out what the best K places are out of N possible locations.
Select best location out of N − K for which the average distance to
clients is minimal. Then choose the next best server. (Note: The
first chosen location minimizes the average distance to all clients.)
Computationally expensive.
Select the K -th largest autonomous system and place a server at
the best-connected host. Computationally expensive.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 24 / 41
Consistency & Replication Replica Management
Replica placement
Essence
Figure out what the best K places are out of N possible locations.
Select best location out of N − K for which the average distance to
clients is minimal. Then choose the next best server. (Note: The
first chosen location minimizes the average distance to all clients.)
Computationally expensive.
Select the K -th largest autonomous system and place a server at
the best-connected host. Computationally expensive.
Position nodes in a d-dimensional geometric space, where distance
reflects latency. Identify the K regions with highest density and place
a server in every one. Computationally cheap.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 24 / 41
Consistency & Replication Replica Management
Content replication
Distinguish different processes
A process is capable of hosting a replica of an object or data:
Permanent replicas: Process/machine always having a replica
Server-initiated replica: Process that can dynamically host a replica
on request of another server in the data store
Client-initiated replica: Process that can dynamically host a replica
on request of a client (client cache)
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 25 / 41
Consistency & Replication Replica Management
Content replication
Server-initiated replication
Client-initiated replication
Permanent
replicas
Server-initiated replicas
Client-initiated replicas
Clients
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 26 / 41
Consistency & Replication Replica Management
Server-initiated replicas
C2
Server without
copy of file F
P
Client Server with
Q copy of F
C1
File F
Server Q counts access from C1 and
C2 as if they would come from P
Keep track of access counts per file, aggregated by considering server
closest to requesting clients
Number of accesses drops below threshold D => drop file
Number of accesses exceeds threshold R => replicate file
Number of access between D and R => migrate file
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 27 / 41
Consistency & Replication Replica Management
Content distribution
Model
Consider only a client-server combination:
Propagate only notification/invalidation of update (often used for
caches)
Transfer data from one copy to another (distributed databases):
passive replication
Propagate the update operation to other copies: active replication
Note
No single approach is the best, but depends highly on available bandwidth
and read-to-write ratio at replicas.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 28 / 41
Consistency & Replication Replica Management
Content distribution: client/server system
Pushing updates: server-initiated approach, in which update is
propagated regardless whether target asked for it.
Pulling updates: client-initiated approach, in which client requests
to be updated.
Issue Push-based Pull-based
1: List of client caches None
2: Update (and possibly fetch update) Poll and update
3: Immediate (or fetch-update time) Fetch-update time
1: State at server
2: Messages to be exchanged
3: Response time at the client
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 29 / 41
Consistency & Replication Replica Management
Content distribution
Observation
We can dynamically switch between pulling and pushing using leases: A
contract in which the server promises to push updates to the client until
the lease expires.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 30 / 41
Consistency & Replication Replica Management
Content distribution
Issue
Make lease expiration time dependent on system’s behavior (adaptive leases):
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 31 / 41
Consistency & Replication Replica Management
Content distribution
Issue
Make lease expiration time dependent on system’s behavior (adaptive leases):
Age-based leases: An object that hasn’t changed for a long time, will not
change in the near future, so provide a long-lasting lease
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 31 / 41
Consistency & Replication Replica Management
Content distribution
Issue
Make lease expiration time dependent on system’s behavior (adaptive leases):
Renewal-frequency based leases: The more often a client requests a
specific object, the longer the expiration time for that client (for that object)
will be
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 31 / 41
Consistency & Replication Replica Management
Content distribution
Issue
Make lease expiration time dependent on system’s behavior (adaptive leases):
State-based leases: The more loaded a server is, the shorter the expiration
times become
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 31 / 41
Consistency & Replication Replica Management
Content distribution
Issue
Make lease expiration time dependent on system’s behavior (adaptive leases):
Age-based leases: An object that hasn’t changed for a long time, will not
change in the near future, so provide a long-lasting lease
Renewal-frequency based leases: The more often a client requests a
specific object, the longer the expiration time for that client (for that object)
will be
State-based leases: The more loaded a server is, the shorter the expiration
times become
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 31 / 41
Consistency & Replication Consistency Protocols
Consistency protocols
Consistency protocol
Describes the implementation of a specific consistency model.
Continuous consistency
Primary-based protocols
Replicated-write protocols
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 32 / 41
Consistency & Replication Consistency Protocols
Continuous consistency: Numerical errors
Principal operation
Every server Si has a log, denoted as log (Si ).
Consider a data item x and let weight(W ) denote the numerical
change in its value after a write operation W . Assume that
∀W : weight(W ) > 0
W is initially forwarded to one of the N replicas, denoted as
origin(W ). TW [i, j] are the writes executed by server Si that
originated from Sj :
X
TW [i, j] = {weight(W )|origin(W ) = Sj & W ∈ log (Si )}
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 33 / 41
Consistency & Replication Consistency Protocols
Continuous consistency: Numerical errors
Note
Actual value v (t) of x:
N
X
v (t) = vinit + TW [k, k]
k=1
value vi of x at replica i:
N
X
vi = vinit + TW [i, k]
k=1
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 34 / 41
Consistency & Replication Consistency Protocols
Continuous consistency: Numerical errors
Problem
We need to ensure that v (t) − vi < δi for every server Si .
Approach
Let every server Sk maintain a view TWk [i, j] of what it believes is the
value of TW [i, j]. This information can be gossiped when an update is
propagated.
Note
0 ≤ TWk [i, j] ≤ TW [i, j] ≤ TW [j, j]
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 35 / 41
Consistency & Replication Consistency Protocols
Continuous consistency: Numerical errors
Solution
Sk sends operations from its log to Si when it sees that TWk [i, k] is
getting too far from TW [k, k], in particular, when
TW [k, k] − TWk [i, k] > δi /(N − 1)
Question
To what extent are we being pessimistic here: where does δi /(N − 1)
come from?
Note
Staleness can be done analogously, by essentially keeping track of what
has been seen last from Si (see book).
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 36 / 41
Consistency & Replication Consistency Protocols
Primary-based protocols
Primary-backup protocol
Client Client
Primary server
for item x Backup server
W1 W5 R1 R2
W4 W4
W3 W3 Data store
W2 W3
W4
W1. Write request R1. Read request
W2. Forward request to primary R2. Response to read
W3. Tell backups to update
W4. Acknowledge update
W5. Acknowledge write completed
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 37 / 41
Consistency & Replication Consistency Protocols
Primary-based protocols
Example primary-backup protocol
Traditionally applied in distributed databases and file systems that require
a high degree of fault tolerance. Replicas are often placed on same LAN.
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 38 / 41
Consistency & Replication Consistency Protocols
Primary-based protocols
Primary-backup protocol with local writes
Client Client
Old primary New primary
for item x for item x Backup server
R1 R2 W1 W3
W5 W5
W4 W4 Data store
W5 W2
W4
W1. Write request R1. Read request
W2. Move item x to new primary R2. Response to read
W3. Acknowledge write completed
W4. Tell backups to update
W5. Acknowledge update
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 39 / 41
Consistency & Replication Consistency Protocols
Primary-based protocols
Example primary-backup protocol with local writes
Mobile computing in disconnected mode (ship all relevant files to user
before disconnecting, and update later on).
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 40 / 41
Consistency & Replication Consistency Protocols
Replicated-write protocols
Quorum-based protocols
Ensure that each operation is carried out in such a way that a majority vote is
established: distinguish read quorum and write quorum:
Read quorum
A B C D A B C D A B C D
E F G H E F G H E F G H
I J K L I J K L I J K L
NR = 3, N W = 10 NR = 7, NW = 6 NR = 1, N W = 12
W
required: NR + NW > N and NW > N/2
Wondimagegn D. (AAIT ) Distributed System Programming December 30, 2021 41 / 41