Chapter 6 Synchronization
• Synchronization and coordination are two closely related
phenomena. In process synchronization we make sure that one
process waits for another to complete its operation. In data
synchronization, the problem is to ensure that two sets of
data are the same. In coordination, the goal is to manage the
interactions and dependencies between activities in a
distributed system.
• Coordination in distributed systems is often much more difficult
than in uniprocessor or multiprocessor systems.
Clock Synchronization
Logical Clocks
• Synchronization based on “relative time”.
• Note that (with this mechanism) there is no
requirement for “relative time” to have any
relation to the “real time”.
• What’s important is that the processes in the
Distributed System agree on the ordering in
which certain events occur.
• Such “clocks” are referred to as Logical Clocks.
Lamport’s Logical Clocks
• First point: if two processes do not interact, then their
clocks do not need to be synchronized – they can
operate concurrently without fear of interfering with
each other.
• Second (critical) point: it does not matter that two
processes share a common notion of what the “real”
current time is. What does matter is that the processes
have some agreement on the order in which certain
events occur.
• Lamport used these two observations to define the
“happens-before” relation (also often referred to within
the context of Lamport’s Timestamps).
The “Happens-Before” Relation (1)
• If A and B are events in the same process, and A
occurs before B, then we can state that:
• A “happens-before” B is true.
• Equally, if A is the event of a message being sent by
one process, and B is the event of the same message
being received by another process, then A “happens-
before” B is also true.
• (Note that a message cannot be received before it is
sent, since it takes a finite, nonzero amount of time to
arrive … and, of course, time is not allowed to run
backwards).
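The happens-before rules above translate directly into Lamport's clock update rules: increment before each local event, piggyback the timestamp on every message, and on receipt jump past the sender's timestamp. A minimal sketch (class and method names are illustrative, not from the slides):

```python
# Lamport's logical clock: if A "happens-before" B, then clock(A) < clock(B).
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        # Every local event ticks the clock.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event: tick, then piggyback the timestamp.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Receiving is an event: jump past the sender's timestamp if needed.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Usage: a message from process 1 to process 2.
p1, p2 = LamportClock(), LamportClock()
t_send = p1.send()           # event A on process 1
t_recv = p2.receive(t_send)  # event B on process 2: A happens-before B
```

Note that the converse does not hold: clock(A) < clock(B) alone does not prove that A happened before B, only the ordering of related events is guaranteed.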
Lamport’s Logical Clocks (1)
Physical Clock
• Physical clocks are used to adjust the time of nodes. Each node in the
system can share its local time with the other nodes in the system. The time
is set based on UTC, which serves as the reference clock for the nodes in
the system.
• The basis for keeping global time is called Coordinated Universal Time,
abbreviated as UTC.
• UTC is the basis of all modern civil timekeeping and is a worldwide
standard.
• The accuracy of the radio stations that broadcast UTC is about ± 1 msec, but
due to random atmospheric fluctuations that can affect the length of the
signal path, in practice the accuracy is no better than ± 10 msec.
Physical Clock Synchronization Algorithms
If one machine has a UTC receiver, the goal becomes keeping all
the other machines synchronized to it. If no machine has a UTC
receiver, each machine keeps track of its own time, and the goal is
to keep all the machines together as well as possible.
Cristian's Algorithm
• Getting the current time from a "time server", using periodic client
requests.
• Problem: the delay introduced by the network request/response
(latency) must be compensated for.
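The latency problem is commonly handled by measuring the round-trip time of the request and assuming the one-way delay is half of it. A minimal sketch under that symmetric-delay assumption (function names are illustrative):

```python
import time

def estimate_server_time(request_time_fn, local_clock=time.monotonic):
    """Cristian-style estimate: ask the server for its time and
    compensate for network latency with half the round-trip time."""
    t0 = local_clock()
    server_time = request_time_fn()   # network request to the time server
    t1 = local_clock()
    round_trip = t1 - t0
    # Assume symmetric delays: the reply is (round_trip / 2) old on arrival.
    return server_time + round_trip / 2

# Usage with a fake "server" whose clock runs 5 seconds ahead of ours:
fake_server = lambda: time.monotonic() + 5.0
estimate = estimate_server_time(fake_server)
```

If the outbound and return paths have very different delays, the estimate is biased; repeating the request and keeping the sample with the smallest round-trip time reduces the error.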
Network Time Protocol (NTP)
• Degrees of separation from the UTC source are
defined as strata.
• A reference clock -- which receives true time from a
dedicated transmitter or satellite navigation system --
is categorized as stratum-0;
• A computer that is directly linked to the reference
clock is stratum-1;
• A computer that receives its time from a stratum-1
computer is stratum-2, and so on.
• Accuracy is reduced with each additional degree of
separation.
Mutual Exclusion within Distributed Systems
Mutual Exclusion
A Centralized Algorithm (1)
Distributed Mutual Exclusion
• Based on work by Ricart and Agrawala
(1981).
• Requirement of their solution: total ordering
of all events in the distributed system (which
is achievable with Lamport’s timestamps).
• Note that messages in their system contain
three pieces of information:
1. The critical region ID.
2. The requesting process ID.
3. The current time.
Mutual Exclusion: Distributed Algorithm
1. When a process (the “requesting process”) decides to enter a critical
region, a message is sent to all processes in the Distributed System
(including itself).
2. What happens at each process depends on the “state” of the critical region.
3. If not in the critical region (and not waiting to enter it), a process sends
back an OK to the requesting process.
4. If in the critical region, a process will queue the request and will not send
a reply to the requesting process.
5. If waiting to enter the critical region, a process will:
a) Compare the timestamp of the new message with that in its queue
(note that the lowest timestamp wins).
b) If the received timestamp wins, an OK is sent back, otherwise the
request is queued (and no reply is sent back).
6. When all the processes send OK, the requesting process can safely enter
the critical region.
7. When the requesting process leaves the critical region, it sends an OK to
all the processes in its queue, then empties its queue.
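The steps above can be sketched as a small single-process simulation of the Ricart-Agrawala rules, with message passing modeled as direct method calls and ties between equal timestamps broken by process ID (class and method names are illustrative):

```python
# Ricart-Agrawala distributed mutual exclusion (simulation sketch).
class Process:
    def __init__(self, pid, all_processes):
        self.pid = pid
        self.all = all_processes      # shared list of every process
        self.clock = 0                # Lamport timestamp
        self.state = "RELEASED"       # RELEASED, WANTED, or HELD
        self.request_ts = None
        self.deferred = []            # requesters queued while we hold/win
        self.ok_count = 0

    def request_cs(self):
        self.clock += 1
        self.state = "WANTED"
        self.request_ts = (self.clock, self.pid)  # ties broken by pid
        self.ok_count = 0
        for p in self.all:            # send request to all (incl. self)
            p.on_request(self.request_ts, self)

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts[0]) + 1
        if self.state == "HELD" or (
            self.state == "WANTED" and self.request_ts < ts
        ):
            self.deferred.append(sender)  # queue the request, reply later
        else:
            sender.on_ok()                # not interested (or we lose): OK

    def on_ok(self):
        self.ok_count += 1
        if self.ok_count == len(self.all):
            self.state = "HELD"           # all OKs received: enter CS

    def release_cs(self):
        self.state = "RELEASED"
        waiting, self.deferred = self.deferred, []
        for p in waiting:                 # now answer everyone we deferred
            p.on_ok()

# Usage: p0 enters; p1's concurrent request is deferred until p0 releases.
procs = []
p0, p1 = Process(0, procs), Process(1, procs)
procs.extend([p0, p1])
p0.request_cs()    # p0 collects both OKs and enters the critical region
p1.request_cs()    # p0 holds the region, so p1's request is queued
p0.release_cs()    # deferred OK sent: p1 now enters
```

A real implementation would use actual message channels and handle process failures; this sketch only demonstrates the request/OK/defer logic.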
Distributed Algorithm (1)
• Three different cases:
1. If the receiver is not accessing the resource and does
not want to access it, it sends back an OK message
to the sender.
2. If the receiver already has access to the resource,
it simply does not reply. Instead, it queues the
request.
3. If the receiver wants to access the resource as well
but has not yet done so, it compares the timestamp
of the incoming message with the one contained in
the message that it has sent everyone. The lowest
one wins.
Distributed Algorithm (2)
A Token Ring Algorithm
Election Algorithms
Location systems
When looking at very large distributed systems that are
dispersed across a wide-area network, it is often necessary
to take proximity into account.
Just imagine a distributed system organized as an overlay
network in which two processes are neighbors in the
overlay network, but are actually placed far apart in the
underlying network.
If these two processes communicate a lot, it may be better to
ensure that they are also physically placed in each other's
proximity.
GPS: Global Positioning System
Let us start by considering how to determine your geographical
position anywhere on Earth. This positioning problem is by itself
solved through a highly specific, dedicated distributed system,
namely GPS, which is an acronym for Global Positioning
System.
Although it was initially used mainly for military applications, it has
by now found its way into many civilian applications, notably traffic
navigation. However, many more application domains exist. For
example, modern smartphones now allow owners to track each
other's position.
GPS uses a constellation of satellites (at least 24 operational), each
circulating in an orbit at a height of approximately 20,000 km.
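The core of GPS positioning is computing a position from measured distances to reference points with known positions. As a hedged illustration of that principle only, here is a two-dimensional version with three fixed beacons and exact distances; real GPS works in three dimensions and must additionally solve for the receiver's clock bias as a fourth unknown. All names and coordinates below are made up for the example:

```python
import math

def locate_2d(beacons, dists):
    """Solve for (x, y) given three beacons (xi, yi) and distances di.
    Subtracting the circle equation of beacon 1 from those of beacons
    2 and 3 cancels the quadratic terms, leaving a 2x2 linear system."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = dists
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1          # nonzero iff beacons not collinear
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Usage: recover a receiver at (1, 2) from its distances to three beacons.
beacons = [(0, 0), (10, 0), (0, 10)]
dists = [math.dist(b, (1.0, 2.0)) for b in beacons]
x, y = locate_2d(beacons, dists)
```

With noisy measurements (as in practice) one would use more beacons and a least-squares fit instead of an exact solve.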
Distributed event matching
• As a final subject concerning the coordination among processes,
we consider distributed event matching. Event matching, or more
precisely, notification filtering, is at the heart of publish-
subscribe systems. The problem boils down to the following: given
a published notification, efficiently determine which subscriptions
it matches and notify the corresponding subscribers.
Gossip-based coordination
As a final topic in coordination, we take a look at a few important examples in
which gossiping is deployed. In the following, we look at aggregation, large-
scale peer sampling, and overlay construction, respectively.
Aggregation
• Gossiping can be used to aggregate values spread across the nodes, for
example to compute their average: when two nodes gossip, each replaces its
local value with the mean of the two, so all values gradually converge to the
system-wide average.
• Gossiping can also be used to discover nodes that have a few outgoing
wide-area links, to subsequently apply directional gossiping.
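Gossip-based averaging can be sketched in a few lines: in each exchange, two randomly chosen nodes replace their values with the pairwise mean, which preserves the sum and shrinks the spread until every node holds (approximately) the global average. A minimal simulation (names are illustrative):

```python
import random

def gossip_average(values, rounds=200, seed=42):
    """Pairwise averaging gossip: each exchange picks two random nodes
    and sets both to the mean of their current values."""
    rng = random.Random(seed)
    values = list(values)
    n = len(values)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)        # two distinct random nodes
        mean = (values[i] + values[j]) / 2
        values[i] = values[j] = mean          # both adopt the mean
    return values

# Usage: four nodes converge toward the global average (2.5 here).
vals = gossip_average([1, 2, 3, 4])
```

Because each exchange conserves the sum, the fixed point is exactly the system-wide average; no node ever needs a global view of the system.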
A peer-sampling service
A solution is to construct a fully decentralized peer-sampling service, or PSS
for short.
Each node maintains a list of c neighbors, where, ideally, each of these
neighbors represents a randomly chosen live node from the current set of nodes.
This list of neighbors is also referred to as a partial view.
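One round of a PSS can be sketched as a view shuffle: two nodes exchange random subsets of their partial views, then each merges what it received, removes duplicates, and truncates back to c entries. This is a minimal sketch of the shuffle step only (real PSS designs such as Cyclon also age entries and pick exchange partners from the view itself; all names below are illustrative):

```python
import random

def shuffle_views(view_a, view_b, c, rng):
    """One gossip exchange between two nodes of a peer-sampling
    service: swap random subsets of the partial views, then merge,
    deduplicate, and truncate each view back to at most c entries."""
    half = max(1, c // 2)
    sent_a = rng.sample(view_a, min(half, len(view_a)))
    sent_b = rng.sample(view_b, min(half, len(view_b)))
    # dict.fromkeys deduplicates while keeping the received entries first.
    new_a = list(dict.fromkeys(sent_b + view_a))[:c]
    new_b = list(dict.fromkeys(sent_a + view_b))[:c]
    return new_a, new_b

# Usage: two nodes with disjoint partial views of size c = 4.
rng = random.Random(1)
a, b = shuffle_views(["n1", "n2", "n3", "n4"],
                     ["n5", "n6", "n7", "n8"], c=4, rng=rng)
```

Repeated across many random pairs, such shuffles keep each partial view close to a uniform random sample of the live nodes.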
Gossip-based overlay construction
The lowest layer constitutes an unstructured peer-to-peer system in
which nodes periodically exchange entries of their partial views,
with the aim of providing a peer-sampling service.
The lowest layer passes its partial view to the higher layer,
where an additional selection of entries takes place. This then
leads to a second list of neighbors corresponding to the desired
topology.