Distributed Systems: Chapter 01: Introduction
(3rd Edition)
Distributed System
Definition
A distributed system is a collection of autonomous computing elements that
appears to its users as a single coherent system.
Characteristic features
Autonomous computing elements, also referred to as nodes, be they
hardware devices or software processes.
Single coherent system: users or applications perceive a single system ⇒
nodes need to collaborate.
Introduction: What is a distributed system? Middleware and distributed systems
[Figure: a middleware layer spanning multiple computers that are connected by a network]
Introduction: Design goals
Introduction: Design goals Supporting resource sharing
Sharing resources
Canonical examples
Cloud-based shared storage and files
Peer-to-peer assisted multimedia streaming
Shared mail services (think of outsourced mail systems)
Shared Web hosting (think of content distribution networks)
Observation
“The network is the computer”
Introduction: Design goals Making distribution transparent
Distribution transparency
Types
Transparency   Description
Access         Hide differences in data representation and how an object is accessed
Location       Hide where an object is located
Relocation     Hide that an object may be moved to another location while in use
Migration      Hide that an object may move to another location
Replication    Hide that an object is replicated
Concurrency    Hide that an object may be shared by several independent users
Failure        Hide the failure and recovery of an object
Degree of transparency
Observation
Aiming at full distribution transparency may be too much:
There are communication latencies that cannot be hidden
Completely hiding failures of networks and nodes is (theoretically and
practically) impossible
  You cannot distinguish a slow computer from a failing one
  You can never be sure that a server actually performed an operation before a crash
Full transparency will cost performance, exposing distribution of the system
  Keeping replicas exactly up-to-date with the master takes time
  Immediately flushing write operations to disk for fault tolerance
Conclusion
Distribution transparency is a nice goal, but achieving it is a different story,
and it should often not even be aimed at.
Introduction: Design goals Being scalable
Observation
Many developers of modern distributed systems easily use the adjective
“scalable” without making clear why their system actually scales.
Observation
Most systems account only, to a certain extent, for size scalability. Often a
solution: multiple powerful servers operating independently in parallel. Today,
the challenge still lies in geographical and administrative scalability.
Examples
Computational grids: share expensive resources between different
domains.
Shared equipment: how to control, manage, and use a shared radio
telescope constructed as a large-scale shared sensor network?
Scaling techniques
Hide communication latencies, distribute/partition work and data, and
replicate/cache data across the system.
Observation
If we can tolerate inconsistencies, we may reduce the need for global
synchronization, but tolerating inconsistencies is application dependent.
Introduction: Design goals Pitfalls
Introduction: Types of distributed systems High performance distributed computing
Parallel computing
Observation
High-performance distributed computing started with parallel computing
[Figure: parallel computing architectures: a shared-memory multiprocessor (processors P sharing memories M through an interconnect) and a distributed-memory multicomputer (each processor P with its private memory M, communicating over an interconnect)]
Cluster computing
Grid computing
Note
To allow for collaborations, grids generally use virtual organizations. In
essence, this is a grouping of users (or better: their IDs) that will allow for
authorization on resource allocation.
Cloud computing
[Figure: cloud computing layers: Infrastructure as a Service (computation (VMs), storage (block, file); e.g., Amazon EC2, Amazon S3), Platform as a Service, and Software as a Service (e.g., Google Docs)]
Introduction: Types of distributed systems Distributed information systems
Integrating applications
Situation
Organizations were confronted with many networked applications, but achieving
interoperability was painful.
Basic approach
A networked application is one that runs on a server making its services
available to remote clients. Simple integration: clients combine requests for
(different) applications; send that off; collect responses, and present a coherent
result to the user.
Next step
Allow direct application-to-application communication, leading to Enterprise
Application Integration.
Issue: all-or-nothing
[Figure: a nested transaction, split into subtransactions that run on different servers]
[Figure: a TP monitor: a client application sends a transaction request to the TP monitor, which forwards requests to several servers and combines their replies into a single reply to the client]
Observation
In many cases, the data involved in a transaction is distributed across several
servers. A TP Monitor is responsible for coordinating the execution of a
transaction.
[Figure: middleware as a communication facilitator in Enterprise Application Integration: client applications communicate through communication middleware]
Introduction: Types of distributed systems Pervasive systems
Observation
Emerging next-generation of distributed systems in which nodes are small,
mobile, and often embedded in a larger system, characterized by the fact that
the system naturally blends into the user’s environment.
Mobile computing
Distinctive features
A myriad of different mobile devices (smartphones, tablets, GPS devices,
remote controls, active badges, and so on).
Mobile implies that a device’s location is expected to change over time ⇒
change of local services, reachability, etc. Keyword: discovery.
Communication may become more difficult: no stable route, but also
perhaps no guaranteed connectivity ⇒ disruption-tolerant networking.
Sensor networks
Characteristics
The nodes to which sensors are attached are:
Many (10s-1000s)
Simple (small memory/compute/communication capacity)
Often battery-powered (or even battery-less)
Two extremes
[Figure: (a) all sensor data is sent directly to the operator's site; (b) each sensor node can process and store data, and in response to a query from the operator's site the sensors send only answers]
Distributed Systems: Chapter 02: Architectures
(3rd Edition)
Architectural styles
Basic idea
A style is formulated in terms of
(replaceable) components with well-defined interfaces
the way that components are connected to each other
the data exchanged between components
how these components and connectors are jointly configured into a
system.
Connector
A mechanism that mediates communication, coordination, or cooperation
among components. Example: facilities for (remote) procedure call,
messaging, or streaming.
Architectures: Architectural styles Layered architectures
Layered architecture
[Figure: layered organizations: requests flow down through layers N, N-1, N-2, ..., 1 and results flow back up; a lower layer may also make an upcall to a handle registered by a higher layer; each layer offers a service to the layer above through an interface]
Two-party communication
Server
from socket import *

HOST, PORT = '', 12345                 # listen on all interfaces, arbitrary port
s = socket(AF_INET, SOCK_STREAM)
s.bind((HOST, PORT))                   # bind the socket to a local address
s.listen(1)                            # mark it as a passive (server) socket
(conn, addr) = s.accept()              # returns new socket and addr. client
while True:                            # forever
    data = conn.recv(1024)             # receive data from client
    if not data: break                 # stop if client stopped
    conn.send(data + b"*")             # return sent data plus an "*"
conn.close()                           # close the connection
Client
from socket import *

HOST, PORT = 'localhost', 12345        # address of the server
s = socket(AF_INET, SOCK_STREAM)
s.connect((HOST, PORT))                # connect to server (block until accepted)
s.send(b'Hello, world')                # send some data
data = s.recv(1024)                    # receive the response
print(data)                            # print the result
s.close()                              # close the connection
Application Layering
Observation
This layering is found in many distributed information systems, using traditional
database technology and accompanying applications.
[Figure: a simple search engine organized in layers: a keyword expression is turned into database queries by a query generator; a ranking algorithm produces a ranked list of page titles; an HTML generator returns an HTML page containing the list]
Architectures: Architectural styles Object-based and service-oriented architectures
Object-based style
Essence
Components are objects, connected to each other through procedure calls.
Objects may be placed on different machines; calls can thus execute across a
network.
[Figure: an object-based architectural style: objects encapsulate state, offer methods through an interface, and interact via (remote) method calls]
Encapsulation
Objects are said to encapsulate data and offer methods on that data without
revealing the internal implementation.
Architectures: Architectural styles Resource-based architectures
RESTful architectures
Essence
View a distributed system as a collection of resources, individually managed by
components. Resources may be added, removed, retrieved, and modified by
(remote) applications.
1 Resources are identified through a single naming scheme
2 All services offer the same interface
3 Messages sent to or from a service are fully self-described
4 After executing an operation at a service, that component forgets
everything about the caller
Basic operations
Operation Description
PUT Create a new resource
GET Retrieve the state of a resource in some representation
DELETE Delete a resource
POST Modify a resource by transferring a new state
Example: Amazon's Simple Storage Service (S3)
Essence
Objects (i.e., files) are placed into buckets (i.e., directories). Buckets cannot be
placed into buckets. Operations on ObjectName in bucket BucketName require
the following identifier:
http://BucketName.s3.amazonaws.com/ObjectName
Typical operations
All operations are carried out by sending HTTP requests:
Create a bucket/object: PUT, along with the URI
Listing objects: GET on a bucket name
Reading an object: GET on a full URI
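As an illustration of these operations, here is a minimal Python sketch that issues the corresponding HTTP requests using the third-party requests library. It ignores S3's required authentication (signed Authorization headers), and the bucket and object names are made up.

import requests

BUCKET = "mybucket"                          # hypothetical bucket name
BASE = "http://" + BUCKET + ".s3.amazonaws.com"

requests.put(BASE + "/")                     # create the bucket: PUT on the bucket URI
requests.put(BASE + "/notes.txt",            # create an object: PUT on the full URI
             data=b"hello, world")
listing = requests.get(BASE + "/")           # list objects: GET on the bucket name
obj = requests.get(BASE + "/notes.txt")      # read an object: GET on the full URI
print(listing.status_code, obj.status_code)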
On interfaces
Issue
Many people like RESTful approaches because the interface to a service is so
simple. The catch is that much needs to be done in the parameter space.
Simplifications
Assume an interface bucket offering an operation create, requiring an input
string such as mybucket, for creating a bucket “mybucket.”
SOAP
import bucket
bucket.create("mybucket")
RESTful
PUT "http://mybucket.s3.amazonaws.com/"
Conclusions
Are there any to draw?
Architectures: Architectural styles Publish-subscribe architectures
Coordination
Temporal and referential coupling
                          Temporally coupled    Temporally decoupled
Referentially coupled     Direct                Mailbox
Referentially decoupled   Event-based           Shared data space

[Figure: (a) an event-based architectural style: components publish notifications and subscribe to them via an event bus; (b) a shared data space: components publish data into, and subscribe to, a shared (persistent) data space]
More details (Linda tuple space: out stores a tuple, rd reads a matching tuple, in reads and removes a matching tuple)
Calling out(t) twice in a row leads to storing two copies of tuple t ⇒ a
tuple space is modeled as a multiset.
Both in and rd are blocking operations: the caller will be blocked until a
matching tuple is found, or has become available.
Alice
import linda                 # PyLinda tuple-space module (assumed available)
linda.connect()              # connect to the tuple-space server
blog = linda.universe._rd(("MicroBlog", linda.TupleSpace))[1]
blog._out(("alice", "gtcn", "This graph theory stuff is not easy"))
blog._out(("alice", "distsys", "I like systems more than graphs"))
Chuck
import linda
linda.connect()
blog = linda.universe._rd(("MicroBlog", linda.TupleSpace))[1]
t1 = blog._rd(("bob", "distsys", str))    # blocks until a matching tuple appears
t2 = blog._rd(("alice", "gtcn", str))
t3 = blog._rd(("bob", "gtcn", str))
Architectures: Middleware organization Wrappers
Problem
The interfaces offered by a legacy component are most likely not suitable for all
applications.
Solution
A wrapper or adapter offers an interface acceptable to a client application. Its
functions are transformed into those available at the component.
Organizing wrappers
[Figure: organizing wrappers: (a) a separate wrapper for each pair of applications; (b) all applications connected through a central broker]
Architectures: Middleware organization Interceptors
Problem
Middleware contains solutions that are good for most applications ⇒ you may
want to adapt its behavior for specific applications.
[Figure: using interceptors: a client application's intercepted call B.doit(val) passes through the application stub and the object middleware, where a message-level interceptor can adapt it before the local OS sends the message to object B]
Architectures: System architecture Centralized organizations
[Figure: alternative client-server organizations (a)-(e): from placing only part of the user interface on the client machine to also placing the application and part of the database there, with the remainder on the server machine]
Three-tiered architecture
[Figure: a three-tiered architecture: the client sends a request to the application server, which requests data from the database server and waits for it; the data is returned and the application server returns the reply to the waiting client]
Architectures: System architecture Decentralized organizations: peer-to-peer systems
Alternative organizations
Vertical distribution
Comes from dividing distributed applications into three logical layers, and
running the components from each layer on a different server (machine).
Horizontal distribution
A client or server may be physically split up into logically equivalent parts, but
each part is operating on its own share of the complete data set.
Peer-to-peer architectures
Processes are all equal: the functions that need to be carried out are
represented by every process ⇒ each process will act as a client and a server
at the same time (i.e., acting as a servent).
Structured P2P
Essence
Make use of a semantic-free index: each data item is uniquely associated with
a key, in turn used as an index. Common practice: use a hash function
key(data item) = hash(data item’s value).
P2P system now responsible for storing (key,value) pairs.
[Figure: an example of a structured P2P overlay with nodes having 4-bit identifiers (0100, 0101, 0110, 0111, 1100, 1101, 1110, 1111); the key of a data item determines which node stores it]
Example: Chord
Principle
Nodes are logically organized in a ring. Each node has an m-bit identifier.
Each data item is hashed to an m-bit key.
Data item with key k is stored at the node with the smallest identifier id ≥ k,
called the successor of key k.
The ring is extended with various shortcut links to other nodes.
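A small Python sketch of this placement rule (not of the real Chord protocol); the value of m and the set of node identifiers are made up for illustration:

import hashlib

m = 5                                              # identifier space: 0 .. 2**m - 1
nodes = sorted([1, 4, 9, 11, 14, 18, 20, 21, 28])  # hypothetical node identifiers

def key(value):
    # hash a data item's value to an m-bit key
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % (2 ** m)

def successor(k):
    # node with the smallest identifier id >= k, wrapping around the ring
    for node_id in nodes:
        if node_id >= k:
            return node_id
    return nodes[0]

item = "some data item"
print("key =", key(item), "stored at node", successor(key(item)))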
Example: Chord
[Figure: a Chord ring for identifiers 0..31; actual nodes and nonexisting node positions are marked on the ring, and shortcut links connect actual nodes across it]
lookup(3)@9 : 28 → 1 → 4  (a lookup for key 3 issued at node 9 is routed via nodes 28 and 1 to node 4, the successor of key 3)
Unstructured P2P
Essence
Each node maintains an ad hoc list of neighbors. The resulting overlay
resembles a random graph: an edge ⟨u, v⟩ exists only with a certain probability
P[⟨u, v⟩].
Searching
Flooding: issuing node u passes request for d to all neighbors. Request
is ignored when receiving node had seen it before. Otherwise, v searches
locally for d (recursively). May be limited by a Time-To-Live: a maximum
number of hops.
Random walk: issuing node u passes request for d to randomly chosen
neighbor, v . If v does not have d, it forwards request to one of its
randomly chosen neighbors, and so on.
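Both strategies can be sketched in a few lines of Python on a made-up overlay (given as an adjacency list); this only simulates the forwarding logic on a single machine, not real message passing:

import random

overlay = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}   # made-up overlay
has_item = {5}                                                      # nodes storing d

def flood(u, ttl, seen=None):
    # forward the request to all neighbors; ignore if already seen or TTL exceeded
    seen = seen if seen is not None else set()
    if u in seen or ttl < 0:
        return False
    seen.add(u)
    return u in has_item or any(flood(v, ttl - 1, seen) for v in overlay[u])

def random_walk(u, max_hops):
    # forward the request to one randomly chosen neighbor at a time
    for _ in range(max_hops):
        if u in has_item:
            return True
        u = random.choice(overlay[u])
    return u in has_item

print(flood(1, ttl=3), random_walk(1, max_hops=20))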
Model
Assume N nodes and that each data item is replicated across r randomly
chosen nodes.
Random walk
P[k] is the probability that the item is found after exactly k attempts:
    P[k] = \frac{r}{N}\left(1 - \frac{r}{N}\right)^{k-1}
S ("search size") is the expected number of nodes that need to be probed:
    S = \sum_{k=1}^{N} k \cdot P[k] = \sum_{k=1}^{N} k \cdot \frac{r}{N}\left(1 - \frac{r}{N}\right)^{k-1} \approx N/r   for 1 \ll r \le N
Flooding
Flood to d randomly chosen neighbors.
After k steps, some R(k) = d \cdot (d-1)^{k-1} nodes will have been reached
(assuming k is small).
With a fraction r/N of the nodes having the data item, the item will have been
found once \frac{r}{N} \cdot R(k) \ge 1.
Comparison
If r/N = 0.001, then S ≈ 1000
With flooding and d = 10, k = 4, we contact 7290 nodes.
Random walks are more communication efficient, but might take longer
before they find the result.
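A quick numerical check of this comparison (a Python sketch using the formulas above):

r_over_N = 0.001                        # fraction of nodes holding the data item

S = 1 / r_over_N                        # random walk: S ~ N/r expected probes
print("random walk: about %d probes" % S)

d = 10
for k in range(1, 5):
    R = d * (d - 1) ** (k - 1)          # nodes reached after k flooding steps
    print("flooding, k=%d: %d nodes contacted, item likely found: %s"
          % (k, R, r_over_N * R >= 1))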
Super-peer networks
Essence
It is sometimes sensible to break the symmetry in pure peer-to-peer networks:
When searching in unstructured P2P systems, having index servers
improves performance
Deciding where to store data can often be done more efficiently through
brokers.
[Figure: a super-peer network: weak peers connect to super peers, and the super peers form their own overlay network]
Edge-server architecture
Essence
Systems deployed on the Internet where servers are placed at the edge of the
network: the boundary between enterprise networks and the actual Internet.
[Figure: edge servers placed at the boundary between enterprise networks or ISPs and the core Internet]
Architectures: System architecture Hybrid Architectures
[Figure: a BitTorrent client node performs a lookup for file F, obtains a list of nodes, and then exchanges blocks with K out of N nodes that have (parts of) F]
Exchange blocks
A file is divided into equally sized pieces (typically each being 256 KB).
Peers exchange blocks of pieces, typically some 16 KB each.
A can upload a block d of piece D only if it has piece D.
Neighbor B belongs to the potential set P_A of A if B has a block that A needs.
If B ∈ P_A and A ∈ P_B: A and B are in a position to trade a block.
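A tiny Python sketch of the potential-set idea above, simplified to whole pieces instead of blocks; the piece sets are made up:

def potential_set(my_pieces, neighbor_pieces):
    # neighbors that have at least one piece we still need
    return {n for n, pieces in neighbor_pieces.items() if pieces - my_pieces}

A = {0, 1, 2}                                    # pieces node A already has
neighbors_of_A = {"B": {2, 3}, "C": {0, 1}}      # pieces each neighbor has
print("P_A =", potential_set(A, neighbors_of_A)) # {'B'}: only B has a piece A needs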
BitTorrent phases
Bootstrap phase
A has just received its first piece (through optimistic unchoking: a node from
A's neighbor set N_A unselfishly provides the blocks of a piece to get a newly
arrived node started).
Trading phase
|P_A| > 0: there is (in principle) always a peer with whom A can trade.
[Figure: the ratio |P|/|N| (fraction of the neighbor set that is in the potential set), plotted for neighbor-set sizes |N| = 5, 10, and 40]
Processes 3.1 Threads
Introduction to Threads
Basic idea
We build virtual processors in software, on top of physical processors:
Context Switching
Contexts
Processor context: The minimal collection of values stored in the
registers of a processor used for the execution of a series of
instructions (e.g., stack pointer, addressing registers, program
counter).
Thread context: The minimal collection of values stored in
registers and memory, used for the execution of a series of
instructions (i.e., processor context, state).
Process context: The minimal collection of values stored in
registers and memory, used for the execution of a thread (i.e.,
thread context, but now also at least MMU register values).
Context Switching
Observations
1 Threads share the same address space. Thread context switching
can be done entirely independently of the operating system.
2 Process switching is generally more expensive as it involves
getting the OS in the loop, i.e., trapping to the kernel.
3 Creating and destroying threads is much cheaper than doing so
for processes.
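Observation 3 can be made concrete with a rough measurement sketch in Python; the absolute numbers depend heavily on the OS and hardware:

import time
from threading import Thread
from multiprocessing import Process

def noop():
    pass

def average_cost(factory, n=50):
    # create, start, and join n workers; return the average time per worker
    start = time.time()
    for _ in range(n):
        w = factory(target=noop)
        w.start()
        w.join()
    return (time.time() - start) / n

if __name__ == "__main__":                       # guard needed for multiprocessing
    print("thread:  %.6f s" % average_cost(Thread))
    print("process: %.6f s" % average_cost(Process))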
Main issue
Should an OS kernel provide threads, or should they be implemented as
user-level packages?
User-space solution
All operations can be completely handled within a single process ⇒
implementations can be extremely efficient.
All services provided by the kernel are done on behalf of the process in
which a thread resides ⇒ if the kernel decides to block a thread, the
entire process will be blocked.
Threads are used when there are lots of external events: threads block
on a per-event basis ⇒ if the kernel can’t distinguish threads, how can it
support signaling events to them?
Kernel solution
The whole idea is to have the kernel contain the implementation of a thread
package. This means that all thread operations are carried out via system calls.
Operations that block a thread are no longer a problem: the kernel
schedules another available thread within the same process.
Handling external events is simple: the kernel (which catches all events)
schedules the thread associated with the event.
The problem is (or used to be) the loss of efficiency due to the fact that
each thread operation requires a trap to the kernel.
Conclusion – but
One can try to mix user-level and kernel-level threads into a single concept;
however, the performance gain has not turned out to outweigh the increased complexity.
Improve performance
Starting a thread is much cheaper than starting a new process.
Having a single-threaded server prohibits simple scale-up to a
multiprocessor system.
As with clients: hide network latency by reacting to the next request while
the previous one is being replied to.
Better structure
Most servers have high I/O demands. Using simple, well-understood
blocking calls simplifies the overall structure.
Multithreaded programs tend to be smaller and easier to understand due
to simplified flow of control.
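A minimal sketch of this dispatcher/worker organization in Python, in the style of the earlier socket example (the port number is made up): a blocking recv in one worker thread does not stall the other clients.

from socket import socket, AF_INET, SOCK_STREAM
from threading import Thread

def worker(conn):
    # serve one client with simple blocking calls
    while True:
        data = conn.recv(1024)
        if not data:
            break
        conn.send(data + b"*")
    conn.close()

s = socket(AF_INET, SOCK_STREAM)
s.bind(('', 12345))                              # made-up port
s.listen(5)
while True:                                      # dispatcher loop
    conn, addr = s.accept()                      # wait for the next request
    Thread(target=worker, args=(conn,), daemon=True).start()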
Processes 3.2 Virtualization
Virtualization
Observation
Virtualization is becoming increasingly important:
Hardware changes faster than software
Ease of portability and code migration
Isolation of failing or attacked components
[Figure: (a) a program using interface A on top of a system implementing A; (b) the same program on a different system B, running on top of an implementation that mimics interface A on top of interface B]
Architecture of VMs
Observation
Virtualization can take place at very different levels, strongly depending
on the interfaces as offered by various system components:
Library
System calls
[Figure: (a) a process virtual machine: an application runs on a runtime system on top of the operating system and hardware; (b) a virtual machine monitor: guest operating systems with their applications run on top of a VMM on the hardware]
Practice
We’re seeing VMMs run on top of existing operating systems.
Perform binary translation: while executing an application or
operating system, translate instructions to that of the underlying
machine.
Distinguish sensitive instructions: traps to the original kernel (think
of system calls, or privileged instructions).
Sensitive instructions are replaced with calls to the VMM.
Processes 3.3 Clients
Essence
A major part of client-side software is focused on (graphical) user
interfaces.
[Figure: the basic organization of the X Window System: applications use Xlib on their local OS and communicate via the X protocol with the X kernel, which contains the terminal's device drivers]
Client-Side Software
Processes 3.4 Servers
Basic model
A server is a process that waits for incoming service requests at a
specific transport address. In practice, there is a one-to-one mapping
between a port and a service.
Type of servers
Superservers: Servers that listen to several ports, i.e., provide several
independent services. In practice, when a service request comes
in, they start a subprocess to handle the request (UNIX inetd)
Iterative vs. concurrent servers: Iterative servers can handle only one
client at a time, in contrast to concurrent servers
Stateless servers
Never keep accurate information about the status of a client after having
handled a request:
Don’t record whether a file has been opened (simply close it again after
access)
Don’t promise to invalidate a client’s cache
Don’t keep track of your clients
Consequences
Clients and servers are completely independent
State inconsistencies due to client or server crashes are reduced
Possible loss of performance because, e.g., a server cannot anticipate
client behavior (think of prefetching file blocks)
Stateful servers
Keeps track of the status of its clients:
Record that a file has been opened, so that prefetching can be done
Knows which data a client has cached, and allows clients to keep
local copies of shared data
Observation
The performance of stateful servers can be extremely high, provided
clients are allowed to keep local copies. As it turns out, reliability is not
a major problem.
Distributed Systems: Principles and Paradigms
Chapter 04: Communication
Layered Protocols
Low-level layers
Transport layer
Application layer
Middleware layer
Communication 4.1 Layered Protocols
The OSI reference model: layers and protocols
7  Application    (application protocol)
6  Presentation   (presentation protocol)
5  Session        (session protocol)
4  Transport      (transport protocol)
3  Network        (network protocol)
2  Data link      (data link protocol)
1  Physical       (physical protocol)
(the two protocol stacks are connected by the physical network)
Drawbacks
Focus on message-passing only
Often unneeded or unwanted functionality
Violates access transparency
Low-level layers
Recap
Physical layer: contains the specification and implementation of
bits, and their transmission between sender and receiver
Data link layer: prescribes the transmission of a series of bits into
a frame to allow for error and flow control
Network layer: describes how packets in a network of computers
are to be routed.
Observation
For many distributed systems, the lowest-level interface is that of the
network layer.
Transport Layer
Important
The transport layer provides the actual communication facilities for
most distributed systems.
Note
IP multicasting is often considered a standard available service (which
may be dangerous to assume).
Middleware Layer
Observation
Middleware is invented to provide common services and protocols that
can be used by many different applications
Note
What remains are truly application-specific protocols...
such as?
Types of communication
[Figure: viewing the middleware as an intermediate message-storage facility between client and server: a client can synchronize at request submission, at request delivery, or after the request has been processed by the server]
Distinguish
Transient versus persistent communication
Asynchronous versus synchronous communication
Aside (from javatpoint.com): understanding synchronous vs asynchronous
Before understanding AJAX, let's first look at the classic web application model
and the Ajax web application model.
In the classic (synchronous) model, the full page is refreshed at request time
and the user is blocked until the request completes.
In the Ajax (asynchronous) model, the full page is not refreshed at request time
and the user gets the response from the Ajax engine without being blocked.
https://www.javatpoint.com/understanding-synchronous-vs-asynchronous
Communication 4.2 Remote Procedure Call
Observations
Application developers are familiar with simple procedure model
Well-engineered procedures operate in isolation (black box)
There is no fundamental reason not to execute procedures on a
separate machine
Asynchronous RPCs
Essence
Try to get rid of the strict request-reply behavior, but let the client
continue without waiting for an answer from the server.
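The idea can be sketched in Python by handing the call to a thread pool and collecting the result later; the server URL and the compute method are hypothetical, and this merely emulates an asynchronous RPC on top of a synchronous one:

from concurrent.futures import ThreadPoolExecutor
from xmlrpc.client import ServerProxy

pool = ThreadPoolExecutor(max_workers=4)
server = ServerProxy("http://localhost:8000")    # hypothetical XML-RPC server

future = pool.submit(server.compute, 6, 7)       # issue the call, do not wait
print("client continues with other work...")     # not blocked on the reply
print("result:", future.result())                # collect the answer later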
[Figure: (a) a traditional RPC: the client blocks while the server calls the local procedure and returns the results; (b) an asynchronous RPC: the server immediately accepts the request and the client continues without waiting for the results]
Lecture 4.1
Overview
Distributed systems require that computations
running in different address spaces, potentially on
different hosts, be able to communicate. For a basic
communication mechanism, the Java language
supports sockets, which are flexible and sufficient for
general communication. However, sockets require
the client and server to engage in application-level
protocols to encode and decode messages for
exchange, and the design of such protocols is
cumbersome and can be error-prone.
Cont.
The Java language's RMI system assumes
the homogeneous environment of the Java
Virtual Machine, and the system can
therefore take advantage of the Java object
model whenever possible.
RMI Interfaces and Classes
The interfaces and classes that are
responsible for specifying the remote
behavior of the RMI system are defined in the
java.rmi and the java.rmi.server packages.
The figure in the next slide shows the
relationship between these interfaces and
classes:
[Figure: the relationship between the interfaces and classes in the java.rmi and java.rmi.server packages]
The Remote Interface
All remote interfaces extend, either directly or
indirectly, the interface java.rmi.Remote. The Remote
interface defines no methods, as shown here:
public interface Remote {}
Example:
public interface HelloInterface extends Remote {
    public String say() throws RemoteException;
}
Implementing a Remote Interface
The general rules for a class that implements
a remote interface are as follows:
The class can implement any number of
remote interfaces.
The class can extend another remote
implementation class.
The class can define methods that do not
appear in the remote interface, but those
methods can only be used locally and are not
available remotely.
System Architectural Overview
The RMI system consists of three layers:
The stub/skeleton layer - client-side stubs (proxies)
and server-side skeletons
The remote reference layer - remote reference
behavior (such as invocation to a single object or to a
replicated object)
The transport layer - connection set up and
management and remote object tracking
The application layer sits on top of the RMI system.
The relationship between the layers is shown in the
following figure.
[Figure: the application layer on top of the RMI system's three layers: stub/skeleton, remote reference, and transport]
Cont.
A remote method invocation from a client to a remote server
object travels down through the layers of the RMI system to the
client-side transport, then up through the server-side transport to
the server
The Transport Layer
In general, the transport layer of the RMI system is
responsible for:
Setting up connections to remote address spaces.
Managing connections.
Monitoring connection "liveness."
Listening for incoming calls.
Maintaining a table of remote objects that reside in the
address space.
Setting up a connection for an incoming call.
Locating the dispatcher for the target of the remote call
and passing the connection to this dispatcher.
Client Interfaces
The Remote Interface
package java.rmi; public interface Remote {}
The java.rmi.Remote interface serves to identify all remote objects; all
remote objects must directly or indirectly implement this interface. Note
that all remote interfaces must be declared public.
Communication 4.3 Message-Oriented Communication
Message-Oriented Communication
Transient messaging
Message-queuing systems (e.g., JMS)
Message brokers
Example: IBM WebSphere
Message-Oriented Middleware and the Java Message Service API (JMS)
javax.jms
Home page: http://www.oracle.com/technetwork/java/jms/index.html
Ref: D2212 Network Programming with Java, Lecture 8,
by Vladimir Vlassov and Leif Lindbäck, KTH/ICT/SCS, HT 2015
Message-Oriented Middleware (MOM)
Enables the exchange of general-purpose
messages in a distributed application.
Data is exchanged by message queuing, typically
asynchronously.
Reliable message delivery is achieved using
message queues, and by providing security,
transactions and the required administrative
services.
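The queuing idea can be sketched with Python's standard library; a real MOM (e.g., a JMS provider) additionally offers persistence, transactions, security, and administration, which this toy example does not:

import queue
import threading

mailbox = queue.Queue()                          # stands in for a message queue

def producer():
    for i in range(3):
        mailbox.put("message %d" % i)            # send and continue immediately

def consumer():
    for _ in range(3):
        print("received:", mailbox.get())        # blocks until a message arrives

threading.Thread(target=producer).start()
threading.Thread(target=consumer).start()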
The Java Message Service API (JMS)
• JMS provides a Java API for an existing message queue. The
JMS specification defines how to call the provider.
• Asynchronous message production
• Asynchronous message consumption by a message listener
registered as consumer.
– Message-driven EJBs (Enterprise Java Beans) asynchronously
consume messages.
• Reliable messaging: Can ensure that a message is delivered
once and only once.
• JMS provider is a messaging agent performing messaging
Example (injecting a JMS connection factory and a topic, then creating and closing a connection):

// Administered objects injected by the container (looked up in JNDI)
@Resource(mappedName = "jms/MyConnectionFactory")
private static ConnectionFactory connectionFactory;
@Resource(mappedName = "jms/MyTopic")
private static Topic topic;

// Create a connection to the JMS provider; close it when done
Connection connection = connectionFactory.createConnection();
...
connection.close();
• A JMS message has three parts: a header (with fields such as JMSDeliveryMode
and JMSExpiration, which are set by the send or publish method), properties,
and a body.