Cs6601 Ds 2m Rejinpaul III
www.rejinpaul.com
UNIT I
1. What is meant by a distributed system?
1. We define a distributed system as a collection of autonomous computers linked by a
network, with software designed to produce an integrated computing facility.
2. A system in which hardware or software components located at networked computers
communicate and coordinate their actions only by message passing.
3. A collection of two or more independent computers which coordinate their processing
through synchronous or asynchronous message passing.
4. A collection of independent computers that appear to the users of the system as a
single computer.
2. What is the significance of a distributed system?
a. Concurrency of computers.
b. No global clock.
c. Independent failures.
3. Why do we need a distributed system?
a. Functional distribution: Computers have different functional capabilities (i.e.,
sharing of resources with specific functionalities).
b. Load distribution/balancing: Assign tasks to processors such that the overall
system performance is optimized.
c. Replication of processing power: Independent processors working on the same
task.
d. A distributed system consisting of a collection of microcomputers may have
processing power that no supercomputer will ever achieve.
e. Physical separation: Systems that rely on the fact that computers are physically
separated (e.g., to satisfy reliability requirements).
f. Economics: Collections of microprocessors offer a better price/performance ratio
than large mainframes. Mainframes: 10 times faster, 1000 times as expensive.
4. Examples of distributed system?
a. Internet
b. Intranet
c. Mobile and ubiquitous computing.
5. What is meant by location aware computing?
Mobile computing is the performance of computing tasks while users are on the move
and away from their home intranet but still provided with access to resources via the devices
they carry with them. They can continue to access the Internet, they can continue to access
resources in their home intranet, and there is increasing provision for users to utilize resources
such as printers that are conveniently nearby as they move around. The last of these is known as
location-aware computing.
6. What are the two types of resource sharing?
a. Hardware sharing: Printers, plotters, large disks and other peripherals are shared
to reduce costs.
b. Data sharing is important in many applications:
1. Software developers working in a team need to access each other’s code and
share the same development tools.
2. Many commercial applications enable users to access shared data objects in a
single active database.
3. The rapidly growing area of groupware tools enables users to cooperate
within a network.
7. List the importance of data sharing?
Software developers working in a team need to access each other’s code and share
the same development tools.
Many commercial applications enable users to access shared data objects in a
single active database.
The rapidly growing area of groupware tools enables users to cooperate within a
network.
8. List the technological components of the Web.
HTML
HTTP (a request-reply protocol)
URLs
9. List the challenges of distributed systems.
a. Heterogeneity: standards and protocols; middleware; virtual machines.
b. Openness: publication of services; notification of interfaces.
c. Security: firewalls; encryption.
d. Scalability: replication; caching; multiple servers.
e. Failure handling: failure tolerance; recovery/roll-back; redundancy.
f. Concurrency: concurrency control to ensure data consistency.
g. Transparency: middleware; location-transparent naming; anonymity.
available bandwidth.
c. Jitter. Jitter is the variation in the time taken to deliver a series of messages.
Jitter is relevant to multimedia data. For example, if consecutive samples of audio
data are played with differing time intervals, the sound will be badly distorted.
33. What is a synchronous DS?
1) The time to execute each step of a process has known lower and upper bounds.
2) Each message transmitted over a channel is received within a known bounded
time.
3) Each process has a local clock whose drift rate from real time has a known bound.
4) It is possible to suggest likely upper and lower bounds for process execution time,
message delay and clock drift rates in a distributed system, but it is difficult to
arrive at realistic values and to provide guarantees of the chosen values.
5) In a synchronous system it is possible to use timeouts, for example to detect the
failure of a process.
34. What is an asynchronous DS?
1. Many distributed systems, such as the Internet, qualify as asynchronous systems.
2. An asynchronous distributed system is one in which there are no bounds on:
1. Process execution speeds: for example, one process step may take only a
picosecond and another a century; all that can be said is that each step may take an arbitrarily
long time.
2. Message transmission delays-for example, one message from process A to
process B may be delivered in negligible time and another may take several years. In other
words, a message may be received after an arbitrarily long time.
3. Clock drift rates- again, the drift rate of a clock is arbitrary.
35. What is omission failure?
The faults classified as omission failures refer to cases when a process or communication
channel fails to perform actions that it is supposed to do.
36. What is meant by arbitrary failure?
1) The term arbitrary or Byzantine failure is used to describe the worst possible
failure semantics, in which any type of error may occur. For example, a process
may set wrong values in its data items, or it may return a wrong value in response
to an invocation.
2) An arbitrary failure of a process is one in which it arbitrarily omits intended
processing steps or takes unintended processing steps. Arbitrary failures in
processes cannot be detected by seeing whether the process responds to
invocations, because it might arbitrarily omit to reply.
37. List the characteristics of networks hidden by the stream abstraction.
a) Message sizes: The application can choose how much data it writes to a stream or
reads from it. It may deal in very small or very large sets of data. The underlying
implementation of a TCP stream decides how much data to collect before
transmitting it as one or more IP packets.
b) Lost messages: The TCP protocol uses an acknowledgement scheme. As an
example of a simple scheme (which is not used in TCP), the sending end keeps a
record of each IP packet sent and the receiving end acknowledges all the arrivals. If
the sender does not receive an acknowledgement within a timeout, it retransmits
the message.
c) Flow control: The TCP protocol attempts to match the speeds of the processes
that read from and write to a stream. If the writer is too fast for the reader, then it is
b. Java’s object serialization, which is concerned with the flattening and external
data representation of any single object or tree of objects that may need to be
transmitted in a message or stored on a disk. It is for use only by Java.
c. XML (Extensible Markup Language), which defines a textual format for
representing structured data. It was originally intended for documents containing
textual self-describing structured data (for example, documents accessible on the
Web), but it is now also used to represent the data sent in messages exchanged by
clients and servers in web services.
16 MARK QUESTIONS
UNIT II
1. Draw the Middleware Architecture.
Layers, from top to bottom:
Application
RMI, RPC and events
Request-reply protocol
External data representation
Operating system
generally different from that of a local object reference. Remote object references are
analogous to local ones in that:
The remote object to receive a remote method invocation is specified by
the invoker as a remote object reference.
Remote object references may be passed as arguments and results of
remote method invocations.
Remote interfaces: Every remote object has a remote interface that specifies which of its
methods can be invoked remotely.
The class of a remote object implements the methods of its remote interface, for
example as public instance methods in Java. Objects in other processes can
invoke only the methods that belong to its remote interface.
9. Define RMI.
Each process contains a collection of objects, some of which
can receive both local and remote invocations whereas the other objects can
receive only local invocations as shown in figure.
Method invocations between objects in different processes, whether in the same
computer or not, are known as remote method invocations. Method invocations
between objects in the same process are local method invocations. We refer to
objects that can receive remote invocations as remote objects.
10. What are the main choices to be considered in the design of RMI?
RMI invocation semantics
a. These follow from request-reply protocols, where doOperation can be implemented in
different ways to provide different delivery guarantees.
b. The main choices are:
i. Retry request message: Controls whether to retransmit the request message until
either a reply is received or the server is assumed to have failed.
ii. Duplicate filtering: When retransmissions are used, controls whether to filter out
duplicate requests at the server.
iii. Retransmission of results: Controls whether to keep a history of
result messages to enable lost results to be retransmitted without re-executing the
operations at the server.
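
The three choices above can be illustrated with a minimal sketch. It is not the doOperation implementation referred to in the answer: the names Server, handle, do_operation and drop_reply are hypothetical, and message loss is simulated by a callback rather than by a real network.

```python
class Server:
    """Toy server that filters duplicates and keeps a history of results."""

    def __init__(self):
        self.history = {}    # request id -> reply already sent
        self.executions = 0  # how many times the operation really ran

    def handle(self, request_id, value):
        # Duplicate filtering: a request id seen before is not re-executed.
        if request_id in self.history:
            # Retransmission of results: resend the saved reply.
            return self.history[request_id]
        self.executions += 1
        reply = value + 1  # stands in for the real operation
        self.history[request_id] = reply
        return reply


def do_operation(server, request_id, value, drop_reply, max_retries=5):
    """Retry request message: resend until a reply arrives or we give up.

    drop_reply(attempt) returns True when the reply to that attempt is
    lost (an assumption used here to simulate a timeout).
    """
    for attempt in range(max_retries):
        reply = server.handle(request_id, value)
        if drop_reply(attempt):
            continue  # timed out waiting for the reply, so retransmit
        return reply
    raise TimeoutError("server assumed to have failed")
```

With all three mechanisms enabled, a request whose first replies are lost is still executed only once at the server, which is the behaviour usually described as at-most-once invocation semantics.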
processes. Other kernels have no notion of other computers built into them, and an
additional service is required for external communication.
Memory manager: Management of physical and virtual memory. It describes the
utilization of memory management techniques for efficient data copying and
sharing.
Supervisor: Dispatching of interrupts, system call traps and other exceptions; control
of memory management unit and hardware caches; processor and floating-point
unit register manipulation. This is known as the Hardware Abstraction Layer in
Windows.
24. Define process.
A process consists of an execution environment together with one or more threads.
25. Define thread.
A thread is the operating system abstraction of an activity (the term derives from
the phrase ‘thread of execution’). An execution environment is the unit of
resource management: a collection of local kernel managed resources to which its
threads have access.
26. Define Unix address space.
This representation of an address space as a sparse set of disjoint regions is a
generalization of the UNIX address space, which has three regions: a fixed,
unmodifiable text region containing program code; a heap, part of which is
initialized by values stored in the program’s binary file, and which is
extensible towards higher virtual addresses; and a stack, which is extensible
towards lower virtual addresses.
27. List the uses of shared region.
The uses of shared regions include the following:
Libraries: Library code can be very large and would waste considerable memory if it
was loaded separately into every process that used it.
Kernel: Often the kernel code and data are mapped into every address space at the
same location.
Data sharing and communication: Two processes, or a process and the kernel, might need to share data
in order to cooperate on some task. It can be considerably more efficient for the
Thread-per-connection Architecture
Thread-per-object Architecture
29. Compare process and threads.
a. Creating a new thread within an existing process is cheaper than creating a process.
b. More importantly, switching to a different thread within the same process is cheaper than
switching between threads belonging to different processes.
c. Threads within a process may share data and other resources conveniently and efficiently
compared with separate processes.
d. But by the same token threads within processes are not protected from one another.
30. Explain thread lifetime.
A new thread is created on the same Java virtual machine (JVM) as its
creator, in the SUSPENDED state. After it is made RUNNABLE with the start()
method, it executes the run() method of an object designated in its
constructor. The JVM and the threads on top of it all execute in a process
on top of the underlying operating system. Threads can be assigned a
priority so that a Java implementation that supports priorities will run a
particular thread in preference to any thread with lower priority.
16 MARKS QUESTION
8 MARKS QUESTION:
1. Explain the implementation of RMI.
2. Explain RPC in detail.
3. Explain the core OS layers functionality.
4. Explain the process creation with an example.
5. Explain in detail about threads.
6. How are invocations made concurrent in a DS?
UNIT III
File Length
Creation timestamp
Read timestamp
Write timestamp
Attribute timestamp
Reference count
Owner
File Type
filedes = open(name, mode)
filedes = creat(name, mode)
status = close(filedes)
count = read(filedes, buffer, n)
count = write(filedes, buffer, n)
pos = lseek(filedes, offset, whence)
status = unlink(name)
status = link(name1, name2)
status = stat(name, buffer)
i. Access transparency
ii. Location transparency
iii. Mobility transparency
iv. Performance transparency
v. Scaling transparency
Changes to a file by one client should not interfere with the operation of other
clients simultaneously accessing or changing the same file. This is the well-known
issue of concurrency control. The need for concurrency control for access to
shared data in many applications is widely accepted and techniques are
known for its implementation, but they are costly. Most current file services
follow modern UNIX standards in providing advisory or mandatory file-level or
record-level locking.
copy of the file when one has failed. Few file services support replication fully, but most
support the caching of files or portions of files locally, a limited form of replication.
The directory service provides a mapping between text names for files and their UFIDs.
Clients may obtain the UFID of a file by quoting its text name to the directory
service. The directory service provides the functions needed to generate directories,
to add new file names to directories and to obtain UFIDs from directories. It is a client
of the flat file service; its directories are stored in files of the flat file service. When a
hierarchic file-naming scheme is adopted, as in UNIX, directories hold references to
other directories.
Lookup(Dir, Name) → FileId, throws NotFound
Locates the text name in the directory and returns the relevant UFID. If Name is
not in the directory, throws an exception.
UnName(Dir, Name), throws NotFound
If Name is in the directory, the entry containing Name is removed from the
directory. If Name is not in the directory, throws an exception.
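
As a rough sketch of these two operations, the directory service can be modelled as a mapping from directory identifiers to name-to-UFID tables. The class name DirectoryService, the in-memory dictionaries and the add_name helper are assumptions made for illustration; a real directory service would store its directories in flat file service files.

```python
class NotFound(Exception):
    """Raised when a name is not present in the directory."""


class DirectoryService:
    def __init__(self):
        self._dirs = {}  # directory id -> {text name: UFID}

    def add_name(self, dir_id, name, ufid):
        # Hypothetical helper so the example has something to look up.
        self._dirs.setdefault(dir_id, {})[name] = ufid

    def lookup(self, dir_id, name):
        """Return the UFID bound to name, or raise NotFound."""
        try:
            return self._dirs[dir_id][name]
        except KeyError:
            raise NotFound(name) from None

    def un_name(self, dir_id, name):
        """Remove the entry for name, or raise NotFound."""
        try:
            del self._dirs[dir_id][name]
        except KeyError:
            raise NotFound(name) from None
```

After un_name succeeds, a later lookup of the same name raises NotFound, matching the exception behaviour described for both operations.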
Lookup(DirFH,Name)→FH,Attr Returns File handles and attributes for the File Name
in the directory DirFH.
taken place.
The adaptive algorithm for setting freshness interval t outlined above reduces
the traffic considerably for most files.
The name is resolved when it is translated into data about the named resource or
object, often in order to invoke an action upon it. The association between a
name and an object is called a binding. In general, names are bound to
attributes of the named objects, rather than the implementation of the objects
themselves. An attribute is the value of a property associated with an object.
URI: Uniform Resource Identifiers came about from the need to identify resources
on the Web, and other Internet resources such as electronic mailboxes. An
important goal was to identify resources in a coherent way, so that they could
all be processed by common software such as browsers. URIs are ‘uniform’ in
that their syntax incorporates that of indefinitely many individual types of
resource identifier (i.e., URI schemes), and there are procedures for managing the
global namespace of schemes. The advantage of uniformity is that it eases the
process of introducing new types of identifier, as well as using existing types of
identifier in new contexts without disrupting existing usage.
Uniform Resource Names are URIs that are used as pure resource names rather than
The process of locating naming data from more than one name server in order
to resolve a name is called navigation. The client name resolution software
carries out navigation on behalf of the client. It communicates with name servers
as necessary to resolve a name.
In multicast navigation, a client multicasts the name to be resolved and the required object
type to the group of name servers. Only the server that holds the named
attributes responds to the request. Unfortunately, however, if the name proves to
be unbound, the request is greeted with silence.
Server-controlled navigation may be non-recursive or recursive. Under
non-recursive server-controlled navigation, any name server may be chosen by the
client. This server communicates by multicast or iteratively with its peers in the
style described above, as though it were a client. Under recursive
server-controlled navigation, the client once more contacts a single server. If this
server does not store the name, the server contacts a peer storing a prefix of the
name, which in turn attempts to resolve it. This procedure continues recursively
until the name is resolved.
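
A small sketch may help to picture recursive server-controlled navigation. It assumes path-like names and an in-process NameServer class, both invented for this example; real name servers communicate over a network, and the choice of the longest matching prefix is an illustrative assumption.

```python
class NameServer:
    def __init__(self, bindings=None, peers=None):
        self.bindings = bindings or {}  # full name -> attribute
        self.peers = peers or {}        # name prefix -> peer NameServer

    def resolve(self, name):
        """Resolve name here, or hand it to a peer holding a prefix of it."""
        if name in self.bindings:
            return self.bindings[name]
        matches = [prefix for prefix in self.peers if name.startswith(prefix)]
        if not matches:
            raise KeyError(name)  # the name is unbound
        # Contact the peer with the longest matching prefix; that peer
        # repeats the same procedure, so resolution proceeds recursively.
        return self.peers[max(matches, key=len)].resolve(name)
```

The client contacts only the first server; each server along the chain returns the answer it obtained from its peer.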
This original scheme was soon seen to suffer from three major shortcomings:
The Domain Name System is a name service design whose main naming database is
used across the Internet. It was devised principally by Mockapetris and
specified in RFC 1034 and RFC 1035. DNS replaced the original Internet
naming scheme, in which all host names and addresses were held in a single
central master file and downloaded by FTP to all computers that required them.
Domain Names: The DNS is designed for use in multiple implementations, each of
which may have its own name space. In practice, however, only one is in
widespread use, and that is one used for naming across the internet. The
internet DNS name space is partitioned both organizationally and according to
geography. The names are written with the highest-level domain on the right.
The original top-level organizational domains in use across the internet were:
26. What is meant by zone?
The DNS naming data are divided into zones. A zone contains the following data:
Attribute data for names in a domain, less any subdomains administered by lower-level
authorities.
The names and addresses of at least two name servers that provide authoritative data for the
zone. These are versions of zone data that can be relied upon as being reasonably up to
date.
The names of name servers that hold authoritative data for delegated subdomains, and
‘glue’ data giving the IP addresses of these servers.
Zone-management parameters, such as those governing the caching and replication of
zone data.
Data in write operations received from clients is stored in the memory cache at the
server and written to disk before a reply is sent to the client. This is called
write-through caching. The client can be sure that the data is stored persistently as
soon as the reply has been received.
Data in write operations is stored only in the memory cache. It will be written to
disk when a commit operation is received for the relevant file. The client can be
sure that the data is persistently stored only when a reply to a commit operation for
the relevant file has been received. Standard NFS clients use this mode of
operation, issuing a commit whenever a file that was open for writing is
closed.
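
The difference between the two modes can be seen in a toy model. This is only a sketch: ToyFileServer, its dictionaries standing in for the memory cache and the disk, and the crash method are all assumptions for illustration, not part of NFS.

```python
class ToyFileServer:
    def __init__(self, write_through):
        self.write_through = write_through
        self.cache = {}  # volatile memory cache
        self.disk = {}   # persistent storage

    def write(self, name, data):
        self.cache[name] = data
        if self.write_through:
            # Write-through: data reaches disk before the reply is sent.
            self.disk[name] = data
        return "ok"

    def commit(self, name):
        # Commit mode: cached data is flushed only when commit arrives.
        if name in self.cache:
            self.disk[name] = self.cache[name]
        return "ok"

    def crash(self):
        # A crash loses the memory cache but not the disk contents.
        self.cache.clear()
```

In write-through mode a write survives a crash as soon as its reply is received; in commit mode it survives only if commit was acknowledged first.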
Name management is separated from other services largely because of the openness of
distributed systems, which brings the following motivations:
Unification
Integration
A namespace is the collection of all valid names recognized by a particular service. The service will
attempt to look up a valid name, even though that name may prove not to correspond to any object.
Name space requires a syntactic definition to separate.
PART B
UNIT IV
9. What are strata?
The NTP service is provided by a network of servers located across the Internet.
Primary servers are connected directly to a time source such as a radio clock
receiving UTC; secondary servers are synchronized, ultimately, with primary
servers. The servers are connected in a logical hierarchy called a
synchronization subnet whose levels are called strata.
10. Enumerate the mode of synchronization in NTP servers.
NTP servers synchronize with one another in one of three modes: multicast, procedure-call
and symmetric mode.
Multicast mode is intended for use on a high-speed LAN. One or more servers
periodically multicast the time to the servers running in other computers connected
by the LAN, which set their clocks assuming a small delay. This mode can achieve
only relatively low accuracies, but ones that nonetheless are considered sufficient for
many purposes.
Procedure-call mode is similar to the operation of Cristian’s algorithm. In this mode,
one server accepts requests from other computers, which it processes by replying with
its timestamp (current clock reading). This mode is suitable when higher accuracies
are required than can be achieved with multicast, or where multicast is not supported
in hardware.
Symmetric mode is intended for use by the servers that supply time information in
LANs and by the higher levels of the synchronization subnet, where the highest
accuracies are to be achieved.
11. What is filter dispersion?
NTP servers apply a data filtering algorithm to successive pairs of offset and delay
measurements; it estimates the offset o and calculates the quality of this estimate as a
statistical quantity called the filter dispersion.
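
The answer above does not give the formulas behind each measurement. As a sketch, the standard NTP calculation of one offset and delay sample from four timestamps is shown below; the parameter names t1 to t4 are a naming assumption, with t1 and t4 read from the client's clock and t2 and t3 from the server's clock.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Return (offset, delay) for one request-reply exchange.

    t1: client sends request    (client clock)
    t2: server receives request (server clock)
    t3: server sends reply      (server clock)
    t4: client receives reply   (client clock)
    """
    # Total time spent in transit, excluding server processing time.
    delay = (t4 - t1) - (t3 - t2)
    # Estimated offset of the server clock relative to the client clock,
    # exact when the two transmission times are equal.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay
```

For example, if the server clock is 5 units ahead, each direction takes 2 units and the server takes 1 unit to reply, then timestamps 100, 107, 108 and 105 give an offset of 5 and a delay of 4.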
12. What is synchronization dispersion?
Peers with lower stratum numbers are more favoured than those in higher strata
because they are ‘closer’ to the primary time sources. Also, those with the lowest
synchronization dispersion are relatively favoured. This is the sum of the
filter dispersions measured between the server and the root of the
synchronization subnet.
VC4: When pi receives a timestamp t in a message, it sets Vi[j] := max(Vi[j], t[j]) for
j = 1, 2, ..., N (the merge operation).
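
Rule VC4 can be seen in context with a short sketch of a vector clock. Rules VC1 to VC3 are not shown in these notes, so the versions used here (start at zero, increment the local entry before timestamping an event, attach the vector to every message) are the standard ones and should be read as assumptions; the class and method names are also illustrative.

```python
class VectorClock:
    def __init__(self, n, i):
        self.v = [0] * n  # assumed VC1: all entries start at zero
        self.i = i        # index of the owning process

    def tick(self):
        # Assumed VC2: increment own entry before timestamping an event.
        self.v[self.i] += 1
        return list(self.v)

    def send(self):
        # Assumed VC3: the timestamp is piggybacked on the message.
        return self.tick()

    def receive(self, t):
        # VC4: merge by taking the component-wise maximum.
        self.v = [max(a, b) for a, b in zip(self.v, t)]
        # The receive is itself an event, so it is timestamped too.
        return self.tick()
```

Processes here are indexed from 0 rather than from 1 as in the rule, which is only a convenience of the language.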
16. What do you mean by distributed garbage?
An object is considered to be garbage if there are no longer any references to it
anywhere in the distributed system. The memory taken up by that object can be
reclaimed once it is known to be garbage.
17. Define Global History
Let us return to our general system p of N processes pi(i=1,2,3,…..N)
Here a series of events occurs at each process, and that we may characterize the
execution of each process by its history
18. What is meant by cut?
Consider the events occurring at processes p1 and p2 shown in figure
predicates .
UNIT V
1. What do you mean by DSM?
Distributed shared memory (DSM) is an abstraction used for sharing data
between computers that do not share physical memory. Processes access
DSM by reads and updates to what appears to be ordinary memory within their
address space. However, an underlying runtime system ensures transparently
that processes executing at different computers observe the updates made by
one another.
2. List the three approaches of DSM structure
Hardware
Paged virtual memory
Middleware
3. Define Sequential consistency
A DSM system is said to be sequentially consistent if for any execution there is
some interleaving of the series of operations issued by all the processes that
satisfies the following two criteria:
SC1: The interleaved sequence of operations is such that if R(x) a occurs in the
sequence, then either the last write operation that occurs before it in the
interleaved sequence is W(x) a, or no write operation occurs before it and a is the
initial value of x.
SC2: The order of operations in the interleaving is consistent with the program
order in which each individual client executed them.
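
The two criteria can be checked mechanically for a given candidate interleaving. The sketch below assumes a representation invented for this example: each operation is a tuple (process, kind, variable, value) with kind "R" or "W", and programs maps each process to its operations in program order. It tests one interleaving only; showing that an execution is sequentially consistent means finding some interleaving that passes both checks.

```python
def satisfies_sc1(interleaving, initial):
    """Every read returns the most recent write, or the initial value."""
    memory = dict(initial)
    for _process, kind, var, value in interleaving:
        if kind == "W":
            memory[var] = value
        elif memory.get(var) != value:
            return False
    return True


def satisfies_sc2(interleaving, programs):
    """Each process's operations appear in its own program order."""
    for process, program in programs.items():
        seen = [op for op in interleaving if op[0] == process]
        if seen != program:
            return False
    return True
```

An interleaving that places a write of 1 to x before a read of x returning 0 fails SC1, while moving that read before the write satisfies it.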
4. What is coherence
Coherence is an example of a weaker form of consistency. Under coherence,
every process agrees on the order of write operations to the same location,
but they do not necessarily agree on the ordering of write operations to different
locations. We can think of coherence as sequential consistency on a
location-by-location basis. Coherent DSM can be implemented by taking a
protocol for implementing sequential consistency and applying it separately to
each unit of replicated data – for example, each page.
5. What is meant by weaker consistency
This model exploits knowledge of synchronization operations in order to relax
programmer uses a lock to implement a critical section, then a DSM system can
assume that no other process may access the data items accessed under mutual
exclusion within it. It is therefore redundant for the DSM system to propagate
updates to these items until the process leaves the critical section. While items
are left with ‘inconsistent’ values some of the time, they are not accessed at
those points; the execution appears to be sequentially consistent.
6. What is granularity
An issue that is related to the structure of DSM is the granularity of sharing.
Conceptually, all processes share the entire contents of a DSM. As programs
sharing DSM execute, however, only certain parts of the data are actually
shared and then only for certain times during the execution. It would clearly
be very wasteful for the DSM implementation always to transmit the entire
contents of DSM as processes access and update it.
7. What is meant by multiple reader/writer sharing
Write-update: The updates made by a process are made locally and multicast to all
other replica managers possessing a copy of the data item, which immediately
modify the data read by local processes. Processes read the local copies of data
items, without the need for communication. In addition to allowing multiple
readers, several processes may write the same data item at the same time; this
is known as multiple-reader/multiple-writer sharing.
message that transfers the lock to a waiting process – ensuring that the lock’s
recipient has copies of the data it needs before it accesses them.
13. What is meant by causal consistency?
Causal consistency: Reads and writes may be related by the happened-before relationship.
This is defined to hold between memory operations when either (a) they are made by
the same process; (b) a process reads a value written by another process; or (c)
there exists a sequence of such operations linking the two operations. The
model’s constraint is that the value returned by a read must be consistent with the
happened-before relationship.
14. Define processor consistency
Processor consistency: The memory is both coherent and adheres to the pipelined
RAM model (see below). The simplest way to think of processor consistency is
that the memory is coherent and that all processes agree on the ordering of any
two write accesses made by the same process – that is, they agree with its
program order.
15. Define CORBA
CORBA is a middleware design that allows application programs to communicate
with one another irrespective of their programming languages, their hardware
and software platforms, the networks they communicate over and their
implementers.
Applications are built from CORBA objects, which implement interfaces defined in
CORBA’s interface definition language, IDL. Clients access the methods in
the IDL interfaces of CORBA objects by means of RMI. The middleware
component that supports RMI is called the Object Request Broker or ORB.
16. What are the parameter-passing semantics in CORBA IDL?
Passing CORBA objects:
Any parameter whose type is specified by the name of an IDL interface, such as the return
value Shape in line 7, is a reference to a CORBA object and the value of a remote object
reference is passed.
Passing CORBA primitive and constructed types:
Arguments of primitive and constructed types are copied and passed by value. On arrival, a
new value is created in the recipient’s process. For example, the struct GraphicalObject
passed as argument (in line 7) produces a new copy of this struct at the server.
Type Object:
Object is the name of a type whose values are remote object references. It is effectively a
common supertype of all IDL interface types such as Shape and ShapeList.
17. What is meant by CORBA naming service
It is a binder that provides operations including rebind for servers to register the remote
object references of CORBA objects by name and resolve for clients to look them up by
name. The names are structured in a hierarchic fashion, and each name in a path is inside a
structure called a Name Component. This makes access in a simple example seem rather
complex.
18. What is meant by CORBA security service?
CORBA Security Service
The CORBA Security Service [Blakley 1999, Baker 1997, OMG 2002b] includes
the following:
Authentication of principals (users and servers); generating credentials for
principals (that is, certificates stating their rights); delegation of credentials is
supported.
Access control can be applied to CORBA objects when they receive remote
method invocations. Access rights may for example be specified in access control
lists (ACLs).
Security of communication between clients and objects, protecting messages for
integrity and confidentiality.
Auditing by servers of remote method invocations.
Facilities for non-repudiation. When an object carries out a remote invocation on
behalf of a principal, the server creates and stores credentials that prove that the
invocation was done by that server on behalf of the requesting principal.
19. What is meant by CORBA notification services
The CORBA Notification Service extends the CORBA Event Service, retaining all
of its features including event channels, event consumers and event suppliers.
The event service provides no support for filtering events or for specifying
delivery requirements. Without the use of filters, all the consumers attached to
a channel have to receive the same notifications as one another. And
without the ability to specify delivery requirements, all of the notifications
sent via a channel are given the delivery guarantees built into the
implementation.
20. What is meant by CORBA event services?
The CORBA Event Service specification defines interfaces allowing objects of
interest, called suppliers, to communicate notifications to subscribers, called
consumers. The notifications are communicated as arguments or results of
ordinary synchronous CORBA remote method invocations. Notifications may
be propagated either by being pushed by the supplier to the consumer or pulled
by the consumer from the supplier.
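
The push and pull styles described above can be contrasted in a few lines. This is a plain in-process sketch, not the CORBA Event Service interfaces: the classes Supplier and Consumer and all their method names are hypothetical, and ordinary method calls stand in for remote invocations.

```python
from collections import deque


class Consumer:
    def __init__(self):
        self.received = []

    def push(self, notification):
        # Push model: the supplier invokes this, passing the
        # notification as an argument.
        self.received.append(notification)


class Supplier:
    def __init__(self):
        self._pending = deque()
        self._consumers = []

    def connect(self, consumer):
        self._consumers.append(consumer)

    def publish_push(self, notification):
        for consumer in self._consumers:
            consumer.push(notification)

    def queue(self, notification):
        # Hold a notification until some consumer pulls it.
        self._pending.append(notification)

    def try_pull(self):
        # Pull model: the consumer invokes this and the notification
        # is the result; None means nothing is available.
        return self._pending.popleft() if self._pending else None
```

In the push style the supplier decides when delivery happens; in the pull style the consumer does, at the cost of polling.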