Unit 1 - Cloud Computing - Digital Content
Please read this disclaimer before proceeding:
This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only for the respective group /
learning community. If you are not the addressee, you should not
disseminate, distribute, or copy it through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
this document from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.
CS8791
Cloud Computing
Computer Science and Engineering
2019 – 2023 / IV Year
Created by:
August 2022
TABLE OF CONTENTS
Lecture Plan
Lecture Notes
Assignments
Part A Q & A
Part B Questions
Assessment Schedule
COURSE OBJECTIVES
To understand the concept of cloud computing.
To appreciate the evolution of cloud from the existing technologies.
To have knowledge on the various issues in cloud computing.
To be familiar with the lead players in cloud.
To appreciate the emergence of cloud as the next generation computing paradigm.
PRE REQUISITES
CS8791 Cloud Computing builds on the following prerequisite courses:
CS8591 Computer Networks
CS8493 Operating Systems
CS8491 Computer Architecture
SYLLABUS
UNIT I INTRODUCTION
Introduction to Cloud Computing - Definition of Cloud - Evolution of Cloud Computing - Underlying Principles of Parallel and Distributed Computing - Cloud Characteristics - Elasticity in Cloud - On-demand Provisioning.
COURSE OUTCOMES
CO      K Level  PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
C203.1  K2       2   1   -   -   -   -   -   -   -   -    -    -    2    2    1
C203.2  K3       3   2   1   -   3   -   -   -   -   -    -    -    2    2    -
C203.3  K3       3   2   1   -   2   -   -   -   -   -    -    -    2    1    -
C203.4  K3       3   2   1   1   2   -   -   -   -   -    -    -    -    -    -
C203.5  K3       3   2   1   1   2   -   -   -   -   -    -    -    -    -    -
C203.6  K3       2   1   -   -   1   -   -   -   -   -    -    -    -    -    -
LECTURE PLAN
UNIT 1 INTRODUCTION
ACTIVITY BASED LEARNING
Lecture Notes
Unit 1 - Introduction
INTRODUCTION TO CLOUD COMPUTING
Web 2.0 technologies play a central role in making cloud computing an attractive
opportunity for building computing systems. They have transformed the Internet
into a rich application and service delivery platform, mature enough to serve
complex needs. Service orientation allows cloud computing to deliver its capabilities
with familiar abstractions, while virtualization confers on cloud computing the
necessary degree of customization, control, and flexibility for building production
and enterprise systems.
DEFINITION OF CLOUD
According to NIST, cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
EVOLUTION OF CLOUD COMPUTING
HARDWARE EVOLUTION
Computerization has pervaded nearly every facet of our personal and professional
lives. Computer evolution has been both rapid and fascinating. The first step along
the evolutionary path of computers occurred in 1930, when binary arithmetic was
developed and became the foundation of computer processing technology,
terminology, and programming languages. In 1939, John Vincent Atanasoff and Clifford
Berry built the Atanasoff-Berry Computer, an electronic computer capable of
operating digitally. Computations were performed
using vacuum-tube technology. In 1941, the introduction of Konrad Zuse’s Z3 at the
German Laboratory for Aviation in Berlin was one of the most significant events in
the evolution of computers because this machine supported both floating-point and
binary arithmetic.
First-Generation Computers:
The Mark I was designed and developed in 1943 at Harvard University. It was a
general-purpose electromechanical programmable computer.
Colossus was an electronic computer built in Britain at the end of 1943. It was
the world's first programmable, digital, electronic computing device.
First-generation computers were built using hard-wired circuits and vacuum tubes.
Colossus was used in secret during World War II to help decipher teleprinter
messages encrypted by German forces using the Lorenz SZ40/42 machine.
The ENIAC (Electronic Numerical Integrator and Computer) was built in 1946.
It was the first Turing-complete digital computer, capable of being reprogrammed
to solve a full range of computing problems. Although the ENIAC was similar in
purpose to the Colossus, it was much faster and far more flexible.
ENIAC contained 18,000 thermionic valves, weighed over 60,000 pounds, and
consumed about 25 kilowatts of electrical power.
Second-Generation Computers:
The inefficient thermionic valves were replaced with smaller and more reliable
transistors.
Despite using transistors and printed circuits, these computers were still bulky and
expensive. They were therefore used mainly by universities and government
agencies.
The integrated circuit or microchip was developed by Jack St. Clair Kilby, an
achievement for which he received the Nobel Prize in Physics in 2000.
Third-Generation Computers:
The development of the integrated circuit was the hallmark of the third
generation of computers.
Instead of punched cards and printouts, users interacted with third generation
computers through keyboards and monitors and interfaced with an operating
system, which allowed the device to run many different applications at one time
with a central program that monitored the memory.
Computers for the first time became accessible to a mass audience because they
were smaller and cheaper than their predecessors.
Fourth-Generation Computers:
The fourth-generation computers that were being developed at this time utilized a
microprocessor that put the computer’s processing capabilities on a single
integrated circuit chip.
In November 1971, Intel released the world's first commercially available
microprocessor, the Intel 4004, which was also the first complete CPU on a single
chip. The 4004 was capable of "only" 60,000 instructions per second.
The microprocessors that evolved from the 4004 allowed manufacturers to begin
developing personal computers small enough and cheap enough to be purchased
by the general public.
The first commercially available personal computer was the MITS Altair 8800,
released at the end of 1974.
Even though microprocessing power, memory and data storage capacities have
increased by many orders of magnitude since the invention of the 4004 processor,
the technology for large-scale integration (LSI) or very-large-scale integration
(VLSI) microchips has not changed all that much. For this reason, most of today’s
computers still fall into the category of fourth-generation computers.
Fifth-Generation Computers:
In the fifth generation, VLSI technology became ULSI (Ultra Large Scale
Integration) technology, resulting in the production of microprocessor chips
having ten million electronic components.
Earlier computer generations were categorized on the basis of hardware alone,
whereas the fifth generation also took software into account.
This generation is based on parallel processing hardware and AI (Artificial
Intelligence) software.
The computers of the fifth generation had high capability and large memory
capacity.
The Advanced Research Projects Agency (ARPA) of the United States Department
of Defense funded a research project to develop a network. The idea was to
build a computer network that could continue to function in the event of a
disaster such as a nuclear war.
Packet switching was incorporated into the proposed design for the ARPANET in
1967.
ARPANET development began with two network nodes in 1969. Later by the end
of 1971, fifteen sites were connected to the young ARPANET.
An Interface Message Processor (IMP) at each site handled the interface to the
ARPANET.
In 1968, Bolt Beranek and Newman, Inc. (BBN) unveiled the final version of the
Interface Message Processor (IMP) specifications.
The ARPANET project and international working groups led to the development of
various protocols and standards by which multiple separate networks could
become a single network or "a network of networks".
The ARPANET grew into a massive network of networks and is now known as the
Internet.
The protocols in the Physical Layer, the Data Link Layer, and the Network
Layer used within the network were implemented on separate Interface Message
Processors (IMPs).
Since the lower protocol layers were provided by the IMP-host interface, the
Network Control Program (NCP) essentially provided a Transport Layer consisting of
the ARPANET Host-to-Host Protocol (AHHP) and the Initial Connection Protocol
(ICP). AHHP defined
procedures to transmit a unidirectional, flow-controlled data stream between two
hosts. The ICP defined the procedure for establishing a bidirectional pair of such
streams between a pair of host processes.
Application protocols such as File Transfer Protocol (FTP), used for file transfers,
and Simple Mail Transfer Protocol (SMTP), used for sending email, accessed
network services through an interface to the top layer of the NCP.
NCP was officially rendered obsolete when the ARPANET changed its core
networking protocols from NCP to the more flexible and powerful TCP/IP protocol
suite, marking the start of the modern Internet.
Transmission Control Protocol (TCP) and Internet Protocol (IP), which together form
the protocol suite commonly known as TCP/IP, emerged as the protocols for the
ARPANET. This resulted in the fledgling definition of the Internet as a set of
connected TCP/IP internets.
TCP/IP remains the standard protocol for the Internet.
TCP/IP quickly became the most widely used network protocol in the world.
TCP converts messages into streams of packets at the source, and these packets are
reassembled back into messages at the destination. IP handles the dispatch and
addressing of these packets and makes sure that they reach their destination, even
when the route passes through multiple intermediate nodes.
Evolution of IPv6:
The amazing growth of the Internet throughout the 1990s caused a vast
reduction in the number of free IP addresses available under IPv4.
Internet Protocol Version 4 (IPv4) is the initial version used on the first generation
of the Internet and is still in dominant use. It was designed to address up to
approximately 4.3 billion (2^32) hosts. However, the explosive growth of the
Internet has led to IPv4 address exhaustion, which entered its final stage in 2011,
when the global IPv4 address allocation pool was exhausted.
Because of the growth of the Internet and the depletion of available IPv4
addresses, a new version of IP, IPv6, was developed in the mid-1990s, which
provides vastly larger addressing capabilities and more efficient routing of
Internet traffic. IPv6 uses 128 bits for the IP address.
IPv6 is sometimes called the Next Generation Internet Protocol (IPng) or TCP/IPv6.
IPv6 was widely available from industry as an integrated TCP/IP protocol and was
supported by most new Internet networking equipment.
In 1945, Vannevar Bush proposed the Memex, a hypothetical microfilm-based device
for storing, annotating, and linking an individual's books and records.
The Memex would also be able to create 'trails' of linked and branching sets of
pages, combining pages from the published microfilm library with personal
annotations or additions captured on a microfilm recorder.
Memex influenced Ted Nelson and Douglas Engelbart for the invention of
hypertext.
HyperCard by Apple was the first hypertext editing system available to the general
public.
In the 1990s, the Mosaic browser was developed at the National
Center for Supercomputing Applications (NCSA), a research institute at the
University of Illinois.
In the fall of 1990, Berners-Lee developed the first web browser featuring an
integrated editor that could create hypertext documents.
Berners-Lee enhanced the server and browser by adding support for the FTP
protocol.
Mosaic was the first widely popular web browser available to the general public. It
helped spread use and knowledge of the web across the world.
Mosaic provided support for graphics, sound, and video clips. Innovations
including the use of bookmarks and history files were added.
Mosaic became even more popular, helping further the growth of the World Wide
Web.
The Mosaic programming team then went on to develop another web browser, which
they named Netscape Navigator.
Netscape released the first beta version of its browser, Mozilla 0.96b, over the
Internet. The final version, named Mozilla 1.0, was released in December 1994.
In 1995, Microsoft Internet Explorer arrived as both a graphical Web browser and
the name for a set of technologies.
As servers began to be grouped into clusters to improve availability and
performance, a key to efficient cluster management was engineering where the data
was to be held. This process became known as data residency.
Grid computing expands on the techniques used in clustered computing models,
where multiple independent clusters appear to act like a grid simply because they
are not all located within the same domain.
The Globus Toolkit is an open source software toolkit used for building grid
systems and applications. It is being developed and maintained by the Globus
Alliance and many others all over the world.
Cloud-resident entities such as data centers have taken the concepts of grid
computing and bundled them into service offerings that appeal to other entities
that do not want the burden of infrastructure but do want the capabilities hosted
from those data centers.
One of the best known of the new cloud service offerings is Amazon's Simple
Storage Service (S3), a third-party storage solution.
SERVER VIRTUALIZATION
Virtualization technology is a way of reducing the majority of hardware acquisition
and maintenance costs, which can result in significant savings for any company.
Parallel Processing:
Parallel processing is performed by the simultaneous execution of program
instructions that have been allocated across multiple processors, with the objective
of running a program in less time.
Vector Processing:
Vector processing was developed to increase processing performance by allowing a
single instruction to operate on entire arrays (vectors) of numbers rather than on
one data element at a time.
The next step in the evolution of parallel processing was the introduction of
multiprocessing. Here, two or more processors share a common workload.
In symmetric multiprocessing (SMP) systems, each processor is equally capable of,
and responsible for, managing the workflow as it passes through the system.
In this form of computing, all the processing elements are interconnected to act
as one very large computer.
The earliest massively parallel processing systems all used serial computers as
individual processing units in order to maximize the number of units available for
a given size and cost.
Parallel Computing Systems
The terms parallel computing and distributed computing are often used
interchangeably, even though they mean slightly different things.
The term parallel implies a tightly coupled system, whereas
distributed refers to a wider class of systems, including
those that are tightly coupled.
More precisely, the term parallel computing refers to a model in which
the computation is divided among several processors sharing the
same memory.
The architecture of a parallel computing system is often characterized by
the homogeneity of components: each processor is of the same
type and it has the same capability as the others.
The shared memory has a single address space, which is accessible to all
the processors.
Parallel programs are then broken down into several units of execution that
can be allocated to different processors and can communicate with each
other by means of shared memory.
Originally, parallel systems were considered to be those architectures that
featured multiple processors sharing the same physical memory and that
were regarded as a single computer.
Over time, these restrictions have been relaxed, and parallel systems now
include all architectures that are based on the concept of shared memory,
whether this is physically present or created with the support of libraries,
specific hardware, and a highly efficient networking infrastructure.
For example, a cluster whose nodes are connected through an
InfiniBand network and configured with a distributed shared memory
system can be considered a parallel system.
The term distributed computing encompasses any architecture or system
that allows the computation to be broken down into units and executed
concurrently on different computing elements, whether these are
processors on different nodes, processors on the same computer, or cores
within the same processor.
Distributed computing includes a wider range of systems and applications
than parallel computing and is often considered a more general term.
Even though it is not a rule, the term distributed often implies that the
locations of the computing elements are not the same and such elements
might be heterogeneous in terms of hardware and software features.
Classic examples of distributed computing systems are Computing
Grids and Internet Computing Systems.
Elements of Parallel Computing
Silicon-based processor chips are reaching their physical limits. Processing speed
is constrained by the speed of light, and the density of transistors packaged in a
processor is constrained by thermodynamic limitations.
A viable solution to overcome this limitation is to connect multiple processors
working in coordination with each other to solve “Grand Challenge” problems.
The first step in this direction led to the development of parallel computing, which
encompasses techniques, architectures, and systems for performing multiple
activities in parallel. As discussed earlier, the term parallel computing has blurred
its edges with the term distributed computing and is often used in place of the
latter term.
Even though it is not a rule, the term distributed often implies that the locations
of the computing elements are not the same and such elements might be
heterogeneous in terms of hardware and software features.
Classic examples of distributed computing systems are Computing Grids and
Internet Computing Systems.
Hardware Architectures for Parallel Processing
The core elements of parallel processing are CPUs. Based on the number of
instruction streams and data streams that can be processed simultaneously,
computing systems are classified into the following four categories (Flynn's
classification):
Single-instruction, Single-data (SISD) systems
Single-instruction, Multiple-data (SIMD) systems
Multiple-instruction, Single-data (MISD) systems
Multiple-instruction, Multiple-data (MIMD) systems
Single – Instruction , Single Data (SISD) systems
SISD computing system is a uni-processor machine capable of executing a
single instruction, which operates on a single data stream.
Machine instructions are processed sequentially, hence computers adopting
this model are popularly called sequential computers.
Most conventional computers are built using SISD model.
All the instructions and data to be processed have to be stored in primary
memory.
The speed of processing element in the SISD model is limited by the rate at
which the computer can transfer information internally.
Dominant representative SISD systems are IBM PC, Macintosh, and
workstations.
Single – Instruction, Multiple Data (SIMD) systems
SIMD computing system is a multiprocessor machine capable of executing
the same instruction on all the CPUs but operating on different data streams.
Machines based on this model are well suited for scientific computing since
they involve lots of vector and matrix operations.
For instance, the statement Ci = Ai * Bi can be passed to all the processing
elements (PEs); the data elements of vectors A and B can be divided into
multiple sets (N sets for an N-PE system), and each PE can process one
data set (see the sketch below).
Dominant representative SIMD systems are Cray’s Vector processing machine
and Thinking Machines Cm*, and GPGPU accelerators.
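To make the SIMD idea concrete, the short sketch below uses Python with NumPy (an illustrative choice; the notes do not prescribe any language or library) to show how a single vectorized statement applies the same multiply operation across all data elements, in the spirit of Ci = Ai * Bi being issued to every processing element.

    import numpy as np

    N = 100_000
    A = np.random.rand(N)   # vector A
    B = np.random.rand(N)   # vector B

    # One logical instruction over many data elements: NumPy dispatches the
    # multiplication across the whole array (and may use the CPU's SIMD units).
    C = A * B

    # Equivalent scalar, SISD-style loop, shown only for contrast.
    C_scalar = np.empty(N)
    for i in range(N):
        C_scalar[i] = A[i] * B[i]

    assert np.allclose(C, C_scalar)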
Multiple – Instruction , Single Data (MISD) systems
MISD computing system is a multiprocessor machine capable of executing
different instructions on different PEs, all of them operating on the same data
set.
For example, y = sin(x) + cos(x) + tan(x)
Machines built using the MISD model are not useful in most applications.
A few machines have been built, but none of them is available commercially.
This type of system is more of an intellectual exercise than a practical
configuration.
Multiple – Instruction , Multiple Data (MIMD) systems
MIMD computing system is a multiprocessor machine capable of executing
multiple instructions on multiple data sets.
Each PE in the MIMD model has separate instruction and data streams,
hence machines built using this model are well suited to any kind of
application.
Unlike SIMD and MISD machines, PEs in MIMD machines work asynchronously.
MIMD machines are broadly categorized into shared-memory MIMD and
distributed-memory MIMD, based on the way PEs are coupled to the main
memory.
Shared Memory MIMD machines
All the PEs are connected to a single global memory and they all have access
to it.
Systems based on this model are also called tightly coupled multi processor
systems.
The communication between PEs in this model takes place through the
shared memory.
Modification of the data stored in the global memory by one PE is visible to
all other PEs.
Dominant representative shared memory MIMD systems are silicon graphics
machines and Sun/IBM SMP ( Symmetric Multi-Processing).
Distributed Memory MIMD machines
PEs have a local memory. Systems based on this model are also called
loosely coupled multi-processor systems.
The communication between PEs in this model takes place through the
interconnection network, the inter process communication channel, or IPC.
The network connecting PEs can be configured to tree, mesh, cube, and so
on.
Each PE operates asynchronously, and if communication/synchronization
among tasks is necessary, they can do so by exchanging messages between
them.
Shared Vs Distributed MIMD model
The shared memory MIMD architecture is easier to program but is less
tolerant to failures and harder to extend with respect to the distributed
memory MIMD model.
Failures in a shared-memory MIMD system affect the entire system, whereas this is
not the case in the distributed model, in which each of the PEs can be easily
isolated.
Moreover, shared memory MIMD architectures are less likely to scale
because the addition of more PEs leads to memory contention.
This is a situation that does not happen in the case of distributed memory, in
which each PE has its own memory.
As a result, distributed memory MIMD architectures are most popular today.
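The following minimal Python sketch (an assumption added for illustration, not part of the original notes) contrasts the two MIMD styles: threads cooperating through a shared data structure (shared-memory MIMD) versus processes with private memories exchanging messages over an IPC channel (distributed-memory MIMD).

    import threading
    import multiprocessing as mp

    def shared_memory_worker(shared, lock, value):
        # Shared-memory style: every thread sees and updates the same list.
        with lock:
            shared.append(value * value)

    def message_passing_worker(inbox, outbox):
        # Distributed-memory style: each process has private memory and
        # cooperates only by exchanging messages through queues.
        value = inbox.get()
        outbox.put(value * value)

    if __name__ == "__main__":
        # Shared-memory MIMD: tightly coupled threads.
        shared, lock = [], threading.Lock()
        threads = [threading.Thread(target=shared_memory_worker, args=(shared, lock, v)) for v in range(4)]
        for t in threads: t.start()
        for t in threads: t.join()
        print("shared-memory result:", sorted(shared))

        # Distributed-memory MIMD: loosely coupled processes with message passing.
        inbox, outbox = mp.Queue(), mp.Queue()
        procs = [mp.Process(target=message_passing_worker, args=(inbox, outbox)) for _ in range(4)]
        for p in procs: p.start()
        for v in range(4): inbox.put(v)
        results = [outbox.get() for _ in range(4)]
        for p in procs: p.join()
        print("message-passing result:", sorted(results))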
[Figure: SISD and SIMD architectures]
[Figure: MISD and MIMD architectures]
[Figure: Shared-memory and distributed-memory machines]
Approaches to Parallel Programming
A sequential program is one that runs on a single processor and has a single line
of control.
To make many processors collectively work on a single program, the program
must be divided into smaller independent chunks so that each processor can work
on separate chunks of the problem.
The program decomposed in this way is a parallel program.
A wide variety of parallel programming approaches are available.
The most prominent among them are the following.
Data Parallelism
Process Parallelism
Farmer-and-worker model
These three models are all suitable for task-level parallelism. In the case of
data-level parallelism, the divide-and-conquer technique is used to split data into
multiple sets, and each data set is processed on different PEs using the same
instruction.
This approach is highly suitable to processing on machines based on the SIMD
model.
In the case of process parallelism, a given operation has multiple (but distinct)
activities that can be processed on multiple processors.
In the case of the farmer-and-worker model, a job-distribution approach is used:
one processor is configured as the master (farmer) and all the remaining PEs are
designated as workers; the master assigns jobs to the worker PEs and, on
completion, they inform the master, which in turn collects the results (a short
sketch of this model is given below).
These approaches can be utilized in different levels of parallelism.
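As a concrete illustration of the farmer-and-worker (master-worker) model, the sketch below (an assumed example in Python, not taken from the notes) lets a master process split the data, farm the chunks out to a pool of worker processes, and collect the partial results.

    import multiprocessing as mp

    def worker_job(chunk):
        # Every worker applies the same operation to its own chunk of the data.
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        data = list(range(1_000))

        # The master (farmer) splits the data into one chunk per worker.
        n_workers = 4
        chunks = [data[i::n_workers] for i in range(n_workers)]

        with mp.Pool(processes=n_workers) as pool:        # create the workers
            partial_sums = pool.map(worker_job, chunks)   # assign jobs, wait for completion

        # The master collects the partial results and combines them.
        print("sum of squares:", sum(partial_sums))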
Levels of Parallelism
Levels of parallelism are decided based on the lumps of code (grain size) that can
be a potential candidate for parallelism.
The table below shows the levels of parallelism. All these approaches have a
common goal: to boost processor efficiency by hiding latency. To conceal latency,
there must be another thread ready to run whenever a lengthy operation occurs. The
idea is to execute concurrently two or more single-threaded applications, such as
compiling, text formatting, database searching, and device simulation.

Grain Size    Code Item                          Parallelized By
Large         Separate and heavyweight process   Programmer
Medium        Function or procedure              Programmer
Fine          Loop or instruction block          Parallelizing compiler
Very fine     Instruction                        Processor
Distributed Computing Systems
Distributed computing studies the models, architectures, and algorithms used for
building and managing distributed systems.
A distributed system is a collection of independent computers that appears to its
users as a single coherent system.
A distributed system is one in which components located at networked computers
communicate and coordinate their actions only by passing messages.
As specified in this definition, the components of a distributed system
communicate with some sort of message passing.
Components of distributed System
A distributed system is the result of the interaction of several components that
traverse the entire computing stack from hardware to software.
It emerges from the collaboration of several elements that, by working together,
give users the illusion of a single coherent system.
The figure provides an overview of the different layers that are involved in
providing the services of a distributed system.
At the very bottom layer, computer and network hardware constitute the physical
infrastructure; these components are directly managed by the operating system,
which provides the basic services for inter process communication (IPC), process
scheduling and management, and resource management in terms of file system
and local devices.
Taken together these two layers become the platform on top of which specialized
software is deployed to turn a set of networked computers into a distributed
system.
Architectural Styles for Distributed Computing
Although a distributed system comprises the interaction of several layers, the
middleware layer is the one that enables distributed computing, because it
provides a coherent and uniform runtime environment for applications.
There are many different ways to organize the components that, taken together,
constitute such an environment.
The interactions among these components and their responsibilities give structure
to the middleware and characterize its type or, in other words, define its
architecture.
Architectural styles aid in understanding and classifying the organization of
software systems in general and distributed computing in particular.
The architectural styles are classified into two major classes:
Software architectural styles: relate to the logical organization of the software.
System architectural styles: describe the physical organization of
distributed software systems in terms of their major components.
Software Architectural styles
Software architectural styles are based on the logical arrangement of software
components.
They are helpful because they provide an intuitive view of the whole system,
despite its physical deployment.
They also identify the main abstractions that are used to shape the components
of the system and the expected interaction patterns between them.
Data Centered Architecture:
These architectures identify the data as the fundamental element of the
software system, and access to shared data is the core characteristic of
data-centered architectures.
Within the context of distributed and parallel computing systems, the integrity of
data is the overall goal of such systems.
The repository architectural style is the most relevant reference model in this
category. It is characterized by two main components: the central data structure,
which represents the current state of the system, and a collection of independent
components, which operate on the central data.
The ways in which the independent components interact with the central data
structure can be very heterogeneous.
In particular, repository-based architectures differentiate and specialize
further into subcategories according to the choice of control discipline
to apply to the shared data structure. Of particular interest are
databases and blackboard systems.
In the repository systems, the dynamics of the system is controlled by
independent components, which by issuing an operation on the central
repository, trigger the selection of specific processes that operate on
data.
Blackboard Architecture:
The blackboard architectural style is characterized by three main components:
Knowledge sources: These are entities that update the knowledge base that
is maintained in the blackboard.
Blackboard: This represents the data structure that is shared among the
knowledge sources and stores the knowledge base of the application.
Control: The control is the collection of triggers and procedures that govern
the interaction with the blackboard and update the status of the knowledge
base.
Knowledge sources represent the intelligent agents sharing the blackboard; they
react opportunistically to changes in the knowledge base, almost in the same way
that a group of specialists brainstorms in a room in front of a blackboard.
Blackboard models have become popular and widely used for artificial intelligence
applications in which the blackboard maintains the knowledge about a domain in
the form of assertions and rules, which are entered by domain experts.
These operate through a control shell that controls the problem-solving activity of
the system. Particular and successful applications of this model can be found in
the domains of speech recognition and signal processing.
Data Flow Architecture:
In data-flow architectures, the availability and movement of data is the core
feature: data-flow styles explicitly incorporate the pattern of data flow, since
their design is determined by an orderly motion of data from component to
component, which is the form of communication between them.
Styles within this category differ in one of the following ways: how the control is
exerted, the degree of concurrency among components, and the topology that
describes the flow of data.
Batch Sequential: The batch sequential style is characterized by an ordered
sequence of separate programs executing one after the other. These programs
are chained together by providing as input for the next program the output
generated by the last program after its completion, which is most likely in the
form of a file. This design was very popular in the mainframe era of computing
and still finds applications today. For example, many distributed applications for
scientific computing are defined by jobs expressed as sequence of programs that,
for example, pre-filter, analyze, and post process data. It is very common to
compose these phases using the batch sequential style.
Pipe-and-Filter Style: It is a variation of the previous style for expressing the
activity of a software system as a sequence of data transformations. Each
component of the processing chain is called a filter, and the connection between
one filter and the next is represented by a data stream (see the sketch below).
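A minimal pipe-and-filter sketch in Python is given below (an illustrative assumption, not part of the original notes): each filter is a generator that transforms the data stream it receives and hands it to the next filter, mirroring a pre-filter, analyze, post-process chain.

    def read_source(lines):
        # Source filter: emit raw records one at a time.
        for line in lines:
            yield line

    def pre_filter(stream):
        # Drop empty records and normalize case.
        for item in stream:
            item = item.strip()
            if item:
                yield item.lower()

    def analyze(stream):
        # Transform each record (here: count its words).
        for item in stream:
            yield (item, len(item.split()))

    def post_process(stream):
        # Format the results for output.
        for text, count in stream:
            yield f"{count} words: {text}"

    if __name__ == "__main__":
        raw = ["Cloud computing basics", "", "  Parallel AND distributed systems  "]
        pipeline = post_process(analyze(pre_filter(read_source(raw))))
        for result in pipeline:
            print(result)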
Virtual Machine Architecture:
The virtual machine class of architectural styles is characterized by the presence
of an abstract execution environment (generally referred to as a virtual machine)
that simulates features that are not available in the hardware or software.
Applications and systems are implemented on top of this layer and become
portable over different hardware and software environments.
The general interaction flow for systems implementing this pattern is – the
program (or the application) defines its operations and state in an abstract
format, which is interpreted by the virtual machine engine. The interpretation of a
program constitutes its execution. It is quite common in this scenario that the
engine maintains an internal representation of the program state.
Popular examples within this category are rule based systems, interpreters, and
command language processors.
Rule-Based Style:
This architecture is characterized by representing the abstract execution
environment as an inference engine. Programs are expressed in the form of rules
or predicates that hold true. The input data for applications is generally
represented by a set of assertions or facts that the inference engine uses to
activate rules or to apply predicates, thus transforming data.
Software Architectural Styles & System Architectural Styles
The examples of rule-based systems can be found in the networking domain:
Network Intrusion Detection Systems (NIDS) often rely on a set of rules to
identify abnormal behaviors connected to possible intrusion in computing
systems.
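The toy Python sketch below (assumed for illustration; the rule set and fact names are invented, not taken from any real NIDS) shows the essence of the rule-based style: facts are asserted, and an inference engine fires every rule whose condition matches them.

    # Fact base describing the current (hypothetical) network situation.
    facts = {"failed_logins": 7, "port_scan_detected": True, "source_ip": "10.0.0.5"}

    # Each rule is a (condition, action) pair evaluated against the facts.
    rules = [
        (lambda f: f.get("failed_logins", 0) > 5,
         lambda f: f"ALERT: brute-force suspected from {f['source_ip']}"),
        (lambda f: f.get("port_scan_detected", False),
         lambda f: f"ALERT: port scan detected from {f['source_ip']}"),
    ]

    def infer(facts, rules):
        # The inference engine activates every rule whose condition holds.
        return [action(facts) for condition, action in rules if condition(facts)]

    for alert in infer(facts, rules):
        print(alert)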
Interpreter Style: This style is characterized by the presence of an engine that
interprets, and thereby executes, programs expressed in an abstract or virtual
format.
Call and Return Architecture:
This identifies all systems that are organized into components mostly connected
together by method calls.
The activity of systems modelled in this way is characterized by a chain of method
calls whose overall execution and composition identify the execution of one or more
operations.
There are three categories in this style: 1) Top-down style: developed with the
imperative programming paradigm; 2) Object-oriented style: based on object
programming models; and 3) Layered style: provides the implementation at
different levels of abstraction of the system.
System Architectural Styles
System architectural styles cover the physical organization of components and
processes over a distributed infrastructure. Two fundamental reference styles are
as follows:
1) Client / Server
The information and the services of interest can be centralized and accessed
through a single access point: the server. Multiple clients are interested in such
services, and the server must be appropriately designed to efficiently serve
requests coming from different clients (a minimal socket-based sketch is given
after the peer-to-peer description below).
2) Peer- to – Peer
Symmetric architectures in which all the components, called peers, play the same
role and incorporate both client and server capabilities of the client/server model.
More precisely, each peer acts as a server when it processes requests from other
peers and as a client when it issues requests to other peers.
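Returning to the client/server style introduced above, the following minimal Python socket sketch (an illustrative assumption, not part of the notes) shows a single server centralizing a service and a client sending it a request over TCP.

    import socket
    import threading
    import time

    HOST, PORT = "127.0.0.1", 5050   # assumed local address for the demo

    def server():
        # The server is the single access point: it listens and answers requests.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind((HOST, PORT))
            srv.listen()
            conn, _ = srv.accept()
            with conn:
                request = conn.recv(1024).decode()
                conn.sendall(("echo: " + request).encode())

    def client(message):
        # A client connects to the central server and issues a request.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.connect((HOST, PORT))
            cli.sendall(message.encode())
            print(cli.recv(1024).decode())

    if __name__ == "__main__":
        t = threading.Thread(target=server)
        t.start()
        time.sleep(0.5)               # give the server a moment to start listening
        client("hello from a thin client")
        t.join()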
Message-based communication
The abstraction of the message has played an important role in the evolution of
the models and technologies enabling distributed computing.
Recall the definition of distributed computing: a system in which components located
at networked computers communicate and coordinate their actions only by
passing messages. The term message, in this case, identifies any discrete amount
of information that is passed from one entity to another. It encompasses any form
of data representation that is limited in size and time, whether this is an
invocation of a remote procedure, a serialized object instance, or a generic
message.
The term message-based communication model can be used to refer to any
model for IPC.
Several distributed programming paradigms eventually use message-based
communication despite the abstractions that are presented to developers for
programming the interactions of distributed components.
Here are some of the most popular and important:
Message Passing : This paradigm introduces the concept of a message as the
main abstraction of the model. The entities exchanging information explicitly
encode in the form of a message the data to be exchanged. The structure and the
content of a message vary according to the model. Examples of this model are
the Message Passing Interface (MPI) and OpenMP.
Remote Procedure Call (RPC) : This paradigm extends the concept of
procedure call beyond the boundaries of a single process, thus triggering the
execution of code in remote processes.
Distributed Objects : This is an implementation of the RPC model for the
object-oriented paradigm and contextualizes this feature for the
remote invocation of methods exposed by objects. Examples of distributed object
infrastructures are Common Object Request Broker Architecture (CORBA),
Component Object Model (COM, DCOM, and COM+), Java Remote Method
Invocation (RMI), and .NET Remoting.
Distributed agents and active objects: Programming paradigms based on
agents and active objects involve by definition the presence of instances, whether
they are agents or objects, regardless of the existence of requests.
Web Service: An implementation of the RPC concept over HTTP; thus allowing
the interaction of components that are developed with different technologies. A
Web service is exposed as a remote object hosted on a Web server, and method
invocations are transformed into HTTP requests, using specific protocols such as
Simple Object Access Protocol (SOAP) or Representational State Transfer (REST).
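As a small illustration of the Web-service flavour of message-based communication, the Python sketch below (assumed; the endpoint URL is a placeholder, not a real service) encodes a remote invocation as an HTTP POST carrying a JSON message and reads the structured reply.

    import json
    import urllib.request

    def call_remote_add(a, b, endpoint="http://localhost:8000/add"):
        # Encode the invocation as a message: a JSON body inside an HTTP POST.
        payload = json.dumps({"a": a, "b": b}).encode("utf-8")
        request = urllib.request.Request(
            endpoint,
            data=payload,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        # The HTTP response is the reply message carrying the result.
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())["result"]

    if __name__ == "__main__":
        # Prints the sum only if a matching service is listening at the endpoint;
        # otherwise urlopen raises an error, since the URL is just a placeholder.
        print(call_remote_add(2, 3))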
Characteristics of Cloud
The five essential characteristics of cloud computing according to NIST are:
On-demand self-service: A consumer can unilaterally provision computing capabilities,
such as server time and network storage, as needed automatically without requiring
human interaction with each service provider.
Broad network access: Capabilities are available over the network and accessed
through standard mechanisms that promote use by heterogeneous thin or thick client
platforms (e.g., mobile phones, tablets, laptops and workstations).
Resource pooling: The provider's computing resources are pooled to serve multiple
consumers using a multi-tenant model, with different physical and virtual resources
dynamically assigned and reassigned according to consumer demand. There is a sense
of location independence in that the customer generally has no control or knowledge
over the exact location of the provided resources but may be able to specify location at a
higher level of abstraction (e.g., country, state or datacenter). Examples of resources
include storage, processing, memory and network bandwidth.
Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases
automatically, to scale rapidly outward and inward commensurate with demand. To the
consumer, the capabilities available for provisioning often appear to be unlimited and can
be appropriated in any quantity at any time.
Measured service: Cloud systems automatically control and optimize resource use by
leveraging a metering capability at some level of abstraction appropriate to the type of
service (e.g., storage, processing, bandwidth and active user accounts). Resource usage
can be monitored, controlled and reported, providing transparency for the provider and
consumer.
Elasticity Vs Scalability in Cloud
Elasticity
Cloud elasticity is a system's ability to dynamically manage the available resources
according to the current workload requirements. Elasticity is a vital feature of cloud
infrastructure. It comes in handy when the system is expected to experience sudden
spikes of user activity and, as a result, a drastic increase in workload demand.
For example:
Streaming Services. Netflix drops a new season of Mindhunter. The notification
triggers a significant number of users to get on the service and watch the episodes.
Resource-wise, this is an activity spike that requires swift resource allocation.
E-commerce. Amazon runs a Prime Day event with many special offers, sell-offs,
promotions, and discounts. It attracts an immense number of customers to the
service, who perform different activities: searching for products, bidding, buying
items, writing reviews, and rating products.
This diverse activity requires a very flexible system that can allocate resources to
one sector without dragging down others.
Scalability
Cloud scalability is the ability of the system's infrastructure to adequately handle
growing workload requirements while retaining consistent performance.
Unlike cloud elasticity, which is more of a makeshift resource allocation, scalability
is a part of the infrastructure design.
There are several types of cloud scalability:
Vertical Scaling or Scale-Up - the ability to handle an increasing workload by
adding resources to the existing infrastructure. It is a short-term solution to cover
immediate needs.
Horizontal Scaling or Scale-Out - the expansion of the existing infrastructure
with new elements to tackle more significant workload requirements. It is a long-term
solution aimed at covering present and future resource demands, with room for
expansion (a minimal autoscaling sketch is given at the end of this discussion).
Diagonal Scaling - a more flexible solution that combines the addition and
removal of resources according to the current workload requirements. It is the
most cost-effective scalability solution by far.
Let’s take a call center, for example. The typical call center is continuously
growing. New employees come in to handle an increasing number of customer
requests gradually, and new features are introduced to the system (like sentiment
analysis, embedded analytics, etc.). In this case, cloud scalability is used to keep
the system’s performance as consistent and efficient as possible over an extended
time and growth.
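To make the scale-out/scale-in idea concrete, here is a minimal threshold-based autoscaling sketch in Python (purely illustrative; the metric source, thresholds, and instance limits are assumptions, not any provider's real API).

    import random
    import time

    MIN_INSTANCES, MAX_INSTANCES = 2, 10
    SCALE_OUT_THRESHOLD = 0.75   # add capacity above 75% average CPU
    SCALE_IN_THRESHOLD = 0.30    # remove capacity below 30% average CPU

    def get_average_cpu():
        # Placeholder for a real monitoring/metering service.
        return random.uniform(0.1, 0.95)

    def autoscale(instances):
        load = get_average_cpu()
        if load > SCALE_OUT_THRESHOLD and instances < MAX_INSTANCES:
            instances += 1           # scale out: provision one more instance
        elif load < SCALE_IN_THRESHOLD and instances > MIN_INSTANCES:
            instances -= 1           # scale in: release an idle instance
        print(f"average CPU {load:.0%} -> {instances} instances")
        return instances

    if __name__ == "__main__":
        instances = MIN_INSTANCES
        for _ in range(5):           # one evaluation per monitoring interval
            instances = autoscale(instances)
            time.sleep(0.1)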
On demand provisioning in Clouds:
Cloud provisioning refers to the processes for the deployment and integration of
cloud computing services within an enterprise IT infrastructure. This is a broad
term that incorporates the policies, procedures and an enterprise’s objective in
sourcing cloud services and solutions from a cloud service provider.
Cloud provisioning primarily defines how, what and when an organization will
provision cloud services. These services can be internal, public or hybrid cloud
products and solutions. There are three different delivery models:
Dynamic/On-Demand Provisioning: The customer or requesting application is
provided with resources on run time.
User Provisioning: The user/customer adds a cloud device or device
themselves.
Post-Sales/Advanced Provisioning: The customer is provided with the
resource upon contract/service signup.
Assignments
Assignment - I
Question Bank
Unit 1 - Part A – (CO1, K2)
First-Generation Computers
Second-Generation Computers
Third-Generation Computers
Fourth-Generation Computers
Fifth-Generation Computers
3. What is ARPANET?
5. Discuss the evolution of IPv6.
Because of the growth of the Internet and the depletion of available IPv4
addresses, a new version of IP, IPv6, was developed in the mid-1990s, which
provides vastly larger addressing capabilities and more efficient routing of
Internet traffic. IPv6 uses 128 bits for the IP address.
6. What are the two models of computing? List the key elements of
computing.
The two fundamental and dominant models of computing are sequential and parallel.
The key elements of computing are:
architectures
compilers
applications
problem-solving environments.
13. What are the basic building blocks of architectural styles?
Repository
Data-centered
Blackboard
Rule-based system
Virtual machine
Interpreter
Communicating processes
Independent components
Event systems
16. What are the components that characterize the blackboard
architectural styles?
Knowledge sources: These are entities that update the knowledge base
that is maintained in the blackboard.
Blackboard: This represents the data structure that is shared among the
knowledge sources and stores the knowledge base of the application.
Control: The collection of triggers and procedures that govern the interaction
with the blackboard and update the status of the knowledge base.
Client / Server
Peer- to – Peer
19. Identify two major models for client design in client/server model.
The two major models for client design are: Thin-client model and Fat-
client model.
Thin-client model: In this model, the load of data processing and
transformation is put on the server side, and the client has a light
implementation that is mostly concerned with retrieving and returning the
data it is being asked for, with no considerable further processing.
Presentation
Application logic
Data storage
23. What are the characteristics of cloud computing?
On-demand self-service
Broad network access
Resource pooling
Rapid elasticity
Measured service
25. Define scalability.
Cloud scalability is the ability of the system's infrastructure to adequately handle
growing workload requirements while retaining consistent performance. Types of
cloud scalability include vertical (scale-up), horizontal (scale-out), and diagonal
scalability.
Unit 1 - Part B - Questions: (CO1, K2)
Supportive Online Courses
Relevant Online Courses
Real-time Applications
Real world Examples of Cloud
Cloud Storage:
Dropbox
Gmail
Facebook
Marketing:
Maropost
Hubspot
Adobe Marketing Cloud
Education:
SlideRocket
Ratatype
Amazon Web Services
Healthcare:
ClearData
Dell’s Secure Healthcare Cloud
IBM Cloud
Real Life Analogies
What would you prefer: to buy or to rent a car?
Buy Your Own Car
Buying a car is a big investment, and there are a lot of important decisions to take
into account. Some people like all the different options, and others don’t want to
bother with thousands of decisions. When buying a car you have full control over
everything: its make and model, cost, interior, etc. Additionally, you've got to worry
about taxes, insurance, inspections, and all sorts of maintenance. You've got the
control, but it comes with a hassle.
Renting a Car
Then how about renting a car? You have fewer and simpler decisions to make.
You just need to select a car from what’s available, and you can switch your car if
something comes up.
Rent when you need; pay when you use. You don’t have to worry about
maintenance costs, tax, and insurance since they are included in your rental fee.
On the other hand, there are obviously some disadvantages. You’re limited by
what’s available from the rental vendor, you may not be allowed to customize the
car, and the car is not dedicated to you all the time.
Translating the Analogy to Cloud Computing
This simple real life analogy is easily translatable to Cloud Computing.
Buying your own car is similar to setting up your own on-premise data center. You
have the flexibility to customize whatever you like, starting from physical
infrastructure, the security system, hardware and software, etc. However, you
also have to invest a lot of money upfront, and you will also need to manage it
once it is operating.
On the other hand, instead of building your own data center, you can rent
computation power and storage from a cloud provider. You can scale in and out
when necessary and pay only for what you use. No long-term commitment is
required; you can start and stop anytime.
Contents Beyond Syllabus
1. Serverless Computing
Serverless architecture (also known as serverless computing or function as a
service, FaaS) is a software design pattern where applications are hosted by a third-
party service, eliminating the need for server software and hardware management
by the developer. Applications are broken up into individual functions that can be
invoked and scaled individually.
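A minimal function-as-a-service sketch is shown below, written in the style of an AWS Lambda Python handler (the event fields and the local test call are assumptions for illustration): the platform invokes the function on demand and scales it automatically, so the developer manages no servers.

    import json

    def lambda_handler(event, context):
        # 'event' carries the request payload; 'context' carries runtime metadata.
        name = (event or {}).get("name", "world")
        return {
            "statusCode": 200,
            "body": json.dumps({"message": f"Hello, {name}!"}),
        }

    if __name__ == "__main__":
        # Local invocation for testing; in production the platform calls the handler.
        print(lambda_handler({"name": "cloud"}, None))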
ASSESSMENT SCHEDULE
Tentative schedule for the assessments during the 2022-2023 odd semester
Prescribed Text Books & References
Text Books and References
TEXT BOOKS:
T1: Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, "Distributed and Cloud
Computing, From Parallel Processing to the Internet of Things", Morgan
Kaufmann Publishers, 2012.
REFERENCES:
Mini Project Suggestions
Mini Project Ideas
1. Design a dynamic website about yourself, showcasing your talents, skill sets,
interests, your opinions on current affairs, and hobbies, and host it on the
AWS cloud.
3. Design a model secure health cloud for a hospital near your residence.
8. Can you implement a simple application that integrates Big Data and Machine
Learning concepts and deploy it on a cloud platform?
10. Can you think of any solution for COVID-19 (data gathering, data
management) using cloud?
Thank you