CS3551 Unit 1
UNIT I INTRODUCTION
Logical Time: Physical Clock Synchronization: NTP – A Framework for a System of Logical
Clocks – Scalar Time – Vector Time; Message Ordering and Group Communication: Message
Ordering Paradigms – Asynchronous Execution with Synchronous Communication –
Synchronous Program Order on Asynchronous System – Group Communication – Causal
Order – Total Order; Global State and Snapshot Recording Algorithms: Introduction – System
Model and Definitions – Snapshot Algorithms for FIFO Channels.
TEXT BOOKS
1. George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems Concepts and Design”, Fifth Edition, Pearson Education, 2012.
2. Pradeep K Sinha, “Distributed Operating Systems: Concepts and Design”, Prentice Hall of India, 2007.
3. Tanenbaum A S, Van Steen M, “Distributed Systems: Principles and Paradigms”, Pearson
Education, 2007.
4. Liu M L, “Distributed Computing: Principles and Applications”, Pearson Education, 2004.
5. Nancy A Lynch, “Distributed Algorithms”, Morgan Kaufmann Publishers, 2003.
6. Arshdeep Bahga, Vijay Madisetti, “Cloud Computing: A Hands-On Approach”, Universities Press, 2014.
UNIT I
1.1 INTRODUCTION
Computation began on a single processor; such uni-processor computing can be termed centralized computing. As the demand for increased processing capability grew, multiprocessor systems came into existence. The advent of multiprocessor systems led to the development of distributed systems with a high degree of scalability and resource sharing. Modern parallel computing is a subset of distributed computing.
A distributed system is a collection of independent computers, interconnected via a network, capable of collaborating on a task. Distributed computing is computing performed in a distributed system.
Concurrency
Transparency
Quality of service
QoS parameters
A distributed system must offer the following QoS:
➢ Performance
➢ Reliability
➢ Availability
➢ Security
Differences between centralized and distributed systems
➢ In centralized systems, several jobs are done on a particular central processing unit (CPU). In distributed systems, jobs are distributed among several processors; the processors are interconnected by a computer network.
➢ Centralized systems have shared memory and shared variables. Distributed systems have no global state, i.e., no shared memory and no shared variables.
➢ In centralized systems, clocking is present. In distributed systems, there is no global clock.
The interaction of the layers of the network with the operating system and
middleware is shown in Fig 1.2. The middleware contains important library functions
for facilitating the operations of DS.
The distributed system uses a layered architecture to break down the complexity of
system design. The middleware is the distributed software that drives the distributed
system, while providing transparency of heterogeneity at the platform level.
1.3 Motivation
The following are the key points that act as driving forces behind distributed systems:
• Increased performance/cost ratio: The resource sharing and remote access features
of DS naturally increase the performance / cost ratio.
• Scalable: The number of systems operating in a distributed environment can be
increased as the demand increases.
Message Passing vs Distributed Shared Memory (DSM):
➢ In message passing, processes can communicate with other processes and can be protected from one another by having private address spaces. In DSM, a process does not have a private address space, so one process can alter the execution of another.
➢ Message passing can be used between heterogeneous computers. DSM cannot be used between heterogeneous computers.
➢ In message passing, synchronization between processes is through message-passing primitives. In DSM, synchronization is through locks and semaphores.
➢ Processes communicating via message passing must execute at the same time. Processes communicating through DSM may execute with non-overlapping lifetimes.
➢ Efficiency: In message passing, all remote data accesses are explicit, so the programmer is always aware of whether a particular operation is in-process or involves the expense of communication. In DSM, any particular read or update may or may not involve communication by the underlying runtime support.
A Send primitive has two parameters: the destination, and the buffer in the
user space that holds the data to be sent.
The Receive primitive also has two parameters: the source from which the
data is to be received and the user buffer into which the data is to be received.
There are two ways of sending data when the Send primitive is called:
➢ Buffered: The standard option copies the data from the user buffer to the kernel
buffer. The data later gets copied from the kernel buffer onto the network. For the
Receive primitive, the buffered option is usually required because the data may
already have arrived when the primitive is invoked, and needs a storage place in the
kernel.
➢ Unbuffered: The data gets copied directly from the user buffer onto the network.
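As a minimal, hypothetical sketch (the Channel class and the kernel_buffer/network names below are inventions for illustration, not an API from these notes), the following Python code mirrors the two-parameter Send/Receive primitives and the buffered versus unbuffered options described above.

```python
# Hypothetical sketch: a buffered Send copies the user buffer into a kernel-side
# buffer before it reaches the network; an unbuffered Send writes to the
# "network" directly; Receive copies arrived data into the caller's buffer.
from collections import deque

class Channel:
    def __init__(self):
        self.kernel_buffer = deque()     # stands in for the kernel buffer
        self.network = deque()           # stands in for the network link

    def send(self, data, buffered=True):
        if buffered:
            self.kernel_buffer.append(bytes(data))              # user -> kernel copy
            self.network.append(self.kernel_buffer.popleft())   # kernel -> network
        else:
            self.network.append(bytes(data))                    # user -> network

    def receive(self, user_buffer):
        data = self.network.popleft()            # data staged by the "kernel"
        user_buffer[:len(data)] = data           # copy into the user buffer
        return len(data)

ch = Channel()
ch.send(b"hello", buffered=True)
buf = bytearray(16)
n = ch.receive(buf)                              # buf[:n] == b'hello'
```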
Blocking primitives
• The primitive commands wait for the message to be delivered. The execution of the
processes is blocked.
• The sending process must wait after a send until an acknowledgement is made by the receiver.
• The receiving process must wait for the expected message from the sending process.
• Receipt is determined by polling a common buffer or by an interrupt.
• This is a form of synchronization or synchronous communication.
• A primitive is blocking if control returns to the invoking process after the processing
for the primitive completes.
Non Blocking primitives
• If send is nonblocking, it returns control to the caller immediately, before the message
is sent.
• The advantage of this scheme is that the sending process can continue computing in
parallel with the message transmission, instead of having the CPU go idle.
• This is a form of asynchronous communication.
• A primitive is non-blocking if control returns back to the invoking process immediately after invocation, even though the operation has not completed.
• For a non-blocking Send, control returns to the process even before the data is copied out of the user buffer.
• For a non-blocking Receive, control returns to the process even before the data may have arrived from the sender.
Synchronous
• A Send or a Receive primitive is synchronous if both the Send() and Receive()
handshake with each other.
• The processing for the Send primitive completes only after the invoking processor learns that the other corresponding Receive primitive has also been invoked and that the receive operation has been completed.
• The processing for the Receive primitive completes when the data to be received is copied into the receiver’s user buffer.
Asynchronous
• A Send primitive is said to be asynchronous, if control returns back to the invoking
process after the data item to be sent has been copied out of the user-specified buffer.
• It does not make sense to define asynchronous Receive primitives.
• The asynchronous Send completes when the data has been copied out of the user’s buffer. Checking for completion may be necessary if the user wants to reuse the buffer from which the data was sent.
Fig 1.12 a) Blocking synchronous send and blocking receive; b) Non-blocking synchronous send and blocking receive; c) Blocking asynchronous send; d) Non-blocking asynchronous send
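To make the blocking/non-blocking distinction concrete, here is a hedged Python sketch: transmit() is a hypothetical stand-in for copying data onto the network and getting an acknowledgement; a blocking send simply calls it, while a non-blocking send hands it to a worker thread and returns a handle that can later be checked before the user buffer is reused.

```python
# Hypothetical sketch of blocking vs non-blocking Send.
import time
from concurrent.futures import ThreadPoolExecutor

def transmit(data):
    time.sleep(0.1)                 # simulated transmission + acknowledgement
    return len(data)

def blocking_send(data):
    return transmit(data)           # control returns only after completion

executor = ThreadPoolExecutor(max_workers=1)

def nonblocking_send(data):
    return executor.submit(transmit, data)   # returns a handle immediately

blocking_send(b"sync data")                  # caller is blocked for ~0.1 s
handle = nonblocking_send(b"async data")
# ... the caller keeps computing in parallel with the transmission ...
handle.result()                              # check/wait before reusing the buffer
```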
Processor synchrony indicates that all the processors execute in lock-step with their clocks
synchronized.
No processor begins executing the next step of code until all the processors have completed executing the previous steps of code assigned to each of them; this is typically enforced using a form of barrier synchronization.
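A minimal sketch of this lock-step behaviour, with Python threads standing in for processors and threading.Barrier enforcing that no worker begins step k+1 until all have finished step k (an illustration of the abstraction only):

```python
# Barrier-based lock-step execution of steps across workers.
import threading

N_WORKERS, N_STEPS = 3, 2
barrier = threading.Barrier(N_WORKERS)

def worker(wid):
    for step in range(N_STEPS):
        print(f"worker {wid} executing step {step}")
        barrier.wait()            # block until all workers complete this step

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads: t.start()
for t in threads: t.join()
```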
RMI vs RPC:
➢ RMI uses an object-oriented paradigm where the user needs to know the object and the method of the object to be invoked. RPC is not object oriented and does not deal with objects; rather, it calls specific subroutines that are already established.
➢ With RPC, a remote call looks like a local call; RPC handles the complexities involved in passing the call from the local to the remote computer. RMI likewise handles the complexities of passing along the invocation from the local to the remote computer, but instead of passing a procedural call, RMI passes a reference to the object and the method that is being called.
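As one concrete illustration of RPC transparency (a sketch using Python's standard xmlrpc modules rather than any particular RPC/RMI system named above; the port number is an arbitrary choice), the client below invokes add() as if it were a local call while the library marshals it to the remote server:

```python
# Minimal RPC sketch using Python's standard library xmlrpc modules.
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy
import threading

def add(x, y):
    return x + y                        # the remote subroutine

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = ServerProxy("http://localhost:8000")
print(proxy.add(2, 3))                  # looks like a local call; prints 5
```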
Asynchronous Execution:
A communication among processes is considered asynchronous when every communicating process can have a different observation of the order of the messages being exchanged. In an asynchronous execution:
• there is no processor synchrony and there is no bound on the drift rate of processor clocks
• message delays are finite but unbounded
• there is no upper bound on the time taken by a process to execute a step
Synchronous Execution:
A communication among processes is considered synchronous when every process observes the same order of messages within the system. In the same manner, the execution is considered synchronous when every individual process in the system observes the same total order of all the events that happen within it. In a synchronous execution:
• processors are synchronized and the clock drift rate between any two processors is bounded
• message delivery times are such that they occur in one logical step or round
• there is a known upper bound on the time taken by a process to execute a step
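The round-based flavour of synchronous execution can be sketched as a simple simulation (all names here are hypothetical): in each round every process takes one bounded step, and any message sent in round r is delivered at the start of round r + 1.

```python
# Hypothetical round-based simulation of synchronous execution.
def synchronous_run(processes, n_rounds):
    inboxes = {p: [] for p in processes}
    for r in range(n_rounds):
        outboxes = {}
        for p, step in processes.items():
            outboxes[p] = step(r, inboxes[p])        # one step per process per round
        inboxes = {p: [] for p in processes}
        for sender, out in outboxes.items():
            for dest, msg in out:
                inboxes[dest].append((sender, msg))  # delivered in the next round

# Example: two processes ping-pong an increasing counter.
def make_step(me, peer):
    def step(r, msgs):
        value = msgs[0][1] if msgs else 0
        print(f"round {r}: {me} received {msgs}, sends {value + 1} to {peer}")
        return [(peer, value + 1)]
    return step

procs = {"p1": make_step("p1", "p2"), "p2": make_step("p2", "p1")}
synchronous_run(procs, n_rounds=3)
```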
• Hence, graph algorithms are the basis for a large number of higher-level communication, data dissemination, object location, and object search functions.
• These algorithms must have the capacity to deal with highly dynamic graph characteristics. They are expected to function like routing algorithms.
• The performance of these algorithms has a direct impact on user-perceived latency, data traffic, and load in the network.
➢ Time and global state in a distributed system
• Geographically remote resources demand synchronization based on logical time.
• Logical time is relative and eliminates the overhead of providing physical time for applications. Logical time can (i) capture the logic and inter-process dependencies and (ii) track the relative progress at each process (a minimal scalar-clock sketch follows this item).
• Maintaining the global state of the system across space involves the role of the time dimension for consistency. This can be done with extra effort in a coordinated manner.
• Deriving appropriate measures of concurrency also involves the time dimension, as the execution and communication speeds of threads may vary a lot.
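As a minimal sketch of the scalar (Lamport) logical time treated later in this unit, the clock below is incremented on every local or send event and, on a receive, advances to max(local, piggybacked) + 1, so that causally related events get increasing timestamps:

```python
# Minimal scalar (Lamport) clock sketch.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):                      # internal or send event
        self.time += 1
        return self.time                 # timestamp piggybacked on messages

    def receive(self, msg_time):         # receive event
        self.time = max(self.time, msg_time) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
ts = p1.tick()                           # p1 sends a message timestamped ts
p2.receive(ts)                           # p2 advances past the sender's timestamp
assert p2.time > ts                      # send happens-before receive
```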
➢ Synchronization/coordination mechanisms
• Synchronization is essential for the distributed processes to facilitate concurrent execution without affecting other processes.
• The synchronization mechanisms also involve resource management and concurrency management mechanisms.
• Some techniques for providing synchronization are:
✓ Physical clock synchronization: Physical clocks usually diverge in their values due to hardware limitations. Keeping them synchronized is a fundamental challenge to maintain common time (a minimal offset-estimation sketch follows this list).
✓ Leader election: All the processes need to agree on which process will play the role of a distinguished process or a leader process. A leader is necessary even for many distributed algorithms because there is often some asymmetry.
✓ Mutual exclusion: Access to the critical resource(s) has to be coordinated.
✓ Deadlock detection and resolution: This is done to avoid duplicate work, and deadlock resolution should be coordinated to avoid unnecessary aborts of processes.
✓ Termination detection: This requires cooperation among the processes to detect the specific global state of quiescence.
✓ Garbage collection: Detecting garbage requires coordination among the processes.
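For the physical clock synchronization item above, the sketch below shows the standard NTP-style offset/delay estimation from a single request/response exchange; the four timestamps are assumed to be given (t1 client send, t2 server receive, t3 server send, t4 client receive):

```python
# NTP-style offset/delay estimation from one request/response exchange.
def ntp_offset_delay(t1, t2, t3, t4):
    """t1: client send, t2: server receive, t3: server send, t4: client receive."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # estimated server-minus-client offset
    delay = (t4 - t1) - (t3 - t2)            # estimated round-trip network delay
    return offset, delay

# Example: the server's clock runs a few units ahead of the client's.
print(ntp_offset_delay(t1=100.0, t2=107.0, t3=108.0, t4=106.0))  # (4.5, 5.0)
```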
➢ Group communication, multicast, and ordered message delivery
• A group is a collection of processes that share a common context and collaborate on a
common task within an application domain. Group management protocols are needed
for group communication wherein processes can join and leave groups dynamically,
or fail.
• The concurrent execution of remote processes may sometimes violate the semantics and order of the distributed program. Hence, a formal specification of the semantics of ordered delivery needs to be formulated, and then implemented.
➢ Monitoring distributed events and predicates
• Predicates defined on program variables that are local to different processes are used for specifying conditions on the global system state.
➢ Performance
User-perceived latency in distributed systems must be reduced. The common issues in performance are:
✓ Metrics: Appropriate metrics must be defined for measuring the performance of theoretical distributed algorithms and their implementations.
✓ Measurement methods/tools: The distributed system is a complex entity; appropriate methodology and tools must be developed for measuring the performance metrics.
1.8.3 Applications of distributed computing and newer challenges
The deployment environment of distributed systems ranges from mobile
systems to cloud storage. All the environments have their own challenges:
➢ Mobile systems
• Mobile systems, which use wireless communication in a shared broadcast medium, have issues related to the physical layer such as transmission range, power, battery power consumption, interfacing with the wired Internet, signal processing, and interference.
• The issues pertaining to the higher layers include routing, location management, channel allocation, localization and position estimation, and mobility management.
• Apart from the above mentioned common challenges, the architectural differences of mobile networks demand varied treatment. The two architectures are:
✓ Base-station approach (cellular approach): The geographical region is divided into hexagonal physical locations called cells. A powerful base station transmits signals to all other nodes in its range.
✓ Ad-hoc network approach: This is an infrastructure-less approach which does not have any base station to transmit signals; instead, all the responsibility is distributed among the mobile nodes.
✓ It is evident that both approaches work in different environments with different principles of communication. Designing a distributed system to cater to these varied needs is a great challenge.
➢ Sensor networks
• A sensor is a processor with an electro-mechanical interface that is capable of sensing physical parameters.
• Sensors are low-cost devices with limited computational power and battery life. They are designed to handle streaming data and route it to external computer networks and processes.
• They are susceptible to faults and have to reconfigure themselves.
• These features introduce a whole new set of challenges, such as position estimation and time estimation, when designing a distributed system.
➢ Ubiquitous or pervasive computing
• In ubiquitous systems the processors are embedded in the environment to perform application functions in the background.
• Examples: intelligent devices, smart homes, etc.
• They are distributed systems with recent advancements operating in wireless environments through actuator mechanisms.
• They can be self-organizing and network-centric, with limited resources.
➢ Peer-to-peer computing
• Peer-to-peer (P2P) computing is computing over an application-layer network where all interactions among the processors are at the same level.
• The processes do not share a global memory and communicate solely by passing messages. The processes do not share a global clock that is instantaneously accessible to them.
• Process execution and message transfer are asynchronous: a process may execute an action spontaneously, and a process sending a message does not wait for the delivery of the message to be complete.
• The global state of a distributed computation is composed of the states of the processes and the communication channels. The state of a process is characterized by the state of its local memory and depends upon the context.
• The state of a channel is characterized by the set of messages in transit in the channel.
1.9.1 A MODEL OF DISTRIBUTED EXECUTIONS
When all the above conditions are satisfied, it can be concluded that a→b, i.e., the events are causally related. Consider two events c and d: if both c→d and d→c are false, i.e., they are not causally related, then c and d are said to be concurrent events, denoted c||d.
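One mechanical way to test these relations is with vector timestamps (treated later under Vector Time); the sketch below is illustrative: a happened-before b iff a's timestamp is componentwise less than or equal to b's and the two differ, and two events are concurrent iff neither precedes the other.

```python
# Illustrative test of -> and || using vector timestamps.
def happened_before(va, vb):
    return all(x <= y for x, y in zip(va, vb)) and va != vb

def concurrent(va, vb):
    return not happened_before(va, vb) and not happened_before(vb, va)

a, b, c = [2, 1, 0], [3, 1, 2], [0, 0, 1]
print(happened_before(a, b))   # True: a -> b (a causally precedes b)
print(concurrent(a, c))        # True: a || c (neither precedes the other)
```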
A system that supports the causal ordering model satisfies the following property: for any two messages mij and mkj sent to the same destination process pj, if send(mij) → send(mkj), then the delivery of mij at pj precedes the delivery of mkj.
A distributed snapshot represents a state in which the distributed system might have been. A snapshot of the system is a single configuration of the system.
▪ The global state of a distributed system is a collection of the local states of its
components, namely, the processes and the communication channels.
▪ The state of a process at any time is defined by the contents of processor registers,
stacks, local memory, etc. and depends on the local context of the distributed
application.
▪ The state of a channel is given by the set of messages in transit in the channel.
The state of a channel is difficult to state formally because a channel is a distributed entity and its state depends upon the states of the processes it connects. Let SCij denote the state of channel Cij, defined as the set of messages in transit on the channel, i.e., the messages mij whose send event is recorded in the state of pi but whose receive event is not recorded in the state of pj.
A distributed snapshot should reflect a consistent state. A global state is consistent if it could have been observed by an external observer. For a successful global state recording, all recorded states must be consistent:
• If we have recorded that a process P has received a message from a process Q, then we should have also recorded that process Q had actually sent that message (a small consistency-check sketch follows these conditions).
• Otherwise, a snapshot will contain the recording of messages that have been received
but never sent.
• The reverse condition (Q has sent a message that P has not received) is allowed.
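A small illustrative sketch of this consistency check (all names hypothetical): given, per channel, the sets of message identifiers recorded as sent and as received, a recorded global state is consistent if every recorded receive has a matching recorded send, and the messages sent but not yet received form the channel state.

```python
# Hypothetical recording: per channel (src, dst), the sets of message ids
# recorded as sent by src and as received by dst in their local states.
def consistent(sent, received):
    for channel, msgs in received.items():
        if not msgs <= sent.get(channel, set()):
            return False                 # received but never sent: inconsistent
    return True

def channel_states(sent, received):
    """Messages in transit: recorded as sent but not yet received."""
    return {ch: msgs - received.get(ch, set()) for ch, msgs in sent.items()}

sent     = {("Q", "P"): {"m1", "m2"}}
received = {("Q", "P"): {"m1"}}
print(consistent(sent, received))        # True; m2 is still in transit
print(channel_states(sent, received))    # {('Q', 'P'): {'m2'}}
```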
The notion of a global state can be graphically represented by what is called a cut. A cut
represents the last event that has been recorded for each process.
The history of each process pi is given by hi = ⟨ei^0, ei^1, ei^2, ...⟩, where each event ei^k is either an internal action of the process or the sending or receiving of a message. We denote by si^k the state of process pi immediately before the k-th event occurs. The state si in the global state S corresponding to the cut C is that of pi immediately after the last event processed by pi in the cut, ei^ci. The set of events {ei^ci} is called the frontier of the cut.