Unit 1 - Cloud Computing - Digital Content


1

2
Please read this disclaimer before proceeding:
This document is confidential and intended solely for the educational purposes of
the RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only for the respective group /
learning community for which it is intended. If you are not the addressee, you should not
disseminate, distribute or copy it through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
it from your system. If you are not the intended recipient, you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.

3
CS8791
Cloud Computing
Computer Science and Engineering
2019 – 2023 / IV Year
Created by:

Dr. T. Sethukarasi, Prof & Head, RMKEC

Ms. T. Sumitha, AP/CSE, RMKEC

Ms. S. Keerthiga, AP/CSE, RMKEC

August 2022

4
TABLE OF CONTENTS

S.No  Description  Page Number

1  Course Objectives  6
2  Pre Requisites (Course Names with Code)  7
3  Syllabus (With Subject Code, Name, LTPC details)  8
4  Course Outcomes  9
5  CO-PO/PSO Mapping  10
6  Lecture Plan  11
7  Activity Based Learning  12
8  Lecture Notes  14
9  Assignments  45
10  Part A Q & A  48
11  Part B Qs  56
12  Supportive Online Certification Courses (NPTEL, Swayam, Coursera, Udemy)  57
13  Real Time Applications in Day-to-Day Life and to Industry  60
14  Contents beyond the Syllabus  63
15  Assessment Schedule  64
16  Prescribed Text Books & Reference Books  66
17  Mini Project Suggestions  68

5
COURSE OBJECTIVES

To understand the concept of cloud computing.

To appreciate the evolution of cloud from the existing technologies.

To have knowledge on the various issues in cloud computing.

To be familiar with the lead players in cloud.

To appreciate the emergence of cloud as the next generation computing paradigm.

6
PRE REQUISITES

CS8791 CLOUD COMPUTING

CS8591 Computer Networks [Sem V]
CS8493 Operating Systems [Sem VI]
CS8491 Computer Architecture [Sem VII]

7
SYLLABUS

CS8791 CLOUD COMPUTING
L T P C: 3 0 0 3
UNIT I INTRODUCTION
Introduction to Cloud Computing – Definition of Cloud – Evolution of
Cloud Computing – Underlying Principles of Parallel and Distributed
Computing – Cloud Characteristics – Elasticity in Cloud – On-demand
Provisioning.
UNIT II CLOUD ENABLING TECHNOLOGIES
Service Oriented Architecture – REST and Systems of Systems – Web
Services – Publish and Subscribe Model – Basics of Virtualization – Types
of Virtualization – Implementation Levels of Virtualization – Virtualization
Structures – Tools and Mechanisms – Virtualization of CPU –Memory –
I/O Devices –Virtualization Support and Disaster Recovery.
UNIT III CLOUD ARCHITECTURE, SERVICES AND STORAGE
Layered Cloud Architecture Design – NIST Cloud Computing Reference
Architecture – Public, Private and Hybrid Clouds – IaaS – PaaS – SaaS –
Architectural Design Challenges – Cloud Storage – Storage-as-a-Service –
Advantages of Cloud Storage – Cloud Storage Providers – S3.
UNIT IV RESOURCE MANAGEMENT AND SECURITY IN CLOUD
Inter Cloud Resource Management – Resource Provisioning and Resource
Provisioning Methods – Global Exchange of Cloud Resources – Security
Overview – Cloud Security Challenges –Software-as-a-Service Security –
Security Governance – Virtual Machine Security – IAM –Security
Standards.
UNIT V CLOUD TECHNOLOGIES AND ADVANCEMENTS
Hadoop – Map Reduce – Virtual Box – Google App Engine –
Programming Environment for Google App Engine – Open Stack –
Federation in the Cloud – Four Levels of Federation –Federated Services
and Applications – Future of Federation.

8
COURSE OUTCOMES

At the end of the course, the student should be able to:

S.No | Description | CO | HKL
1 | Describe the principles of Parallel and Distributed Computing and the evolution of cloud computing from existing technologies | CO1 | K2
2 | Implement different types of Virtualization technologies and Service Oriented Architecture systems | CO2 | K3
3 | Elucidate the concepts of the NIST Cloud Computing architecture and its design challenges | CO3 | K3
4 | Analyse the issues in Resource provisioning and Security governance in clouds | CO4 | K3
5 | Choose among various cloud technologies for implementing applications | CO5 | K3
6 | Install and use current cloud technologies | CO6 | K3

*HKL - Highest Knowledge Level
9
CO - PO / PSO MAPPING

Program Outcomes (PO) and Program Specific Outcomes (PSO)

CO | HKL | PO-1 | PO-2 | PO-3 | PO-4 | PO-5 | PO-6 | PO-7 | PO-8 | PO-9 | PO-10 | PO-11 | PO-12 | PSO1 | PSO2 | PSO3
(PO level) | | K3 | K4 | K5 | K5 | K3, K4, K5 | A3 | A2 | A3 | A3 | A3 | A3 | A2 | | |
C203.1 | K2 | 2 | 1 | - | - | - | - | - | - | - | - | - | - | 2 | 2 | 1
C203.2 | K3 | 3 | 2 | 1 | - | 3 | - | - | - | - | - | - | - | 2 | 2 | -
C203.3 | K3 | 3 | 2 | 1 | - | 2 | - | - | - | - | - | - | - | 2 | 1 | -
C203.4 | K3 | 3 | 2 | 1 | 1 | 2 | - | - | - | - | - | - | - | - | - | -
C203.5 | K3 | 3 | 2 | 1 | 1 | 2 | - | - | - | - | - | - | - | - | - | -
C203.6 | K3 | 2 | 1 | - | - | 1 | - | - | - | - | - | - | - | - | - | -

Correlation Level: 1 - Slight (Low), 2 - Moderate (Medium), 3 - Substantial (High). If there is no correlation, put "-".

10
LECTURE PLAN

UNIT I - INTRODUCTION

S.No | Topic | CO | Highest Cognitive Level | Mode of Delivery | Delivery Resources | LU Outcomes
(The Proposed Lecture Date, Actual Lecture Date and Remark columns are left blank in the plan.)

1 | Introduction to Cloud Computing | CO1 | K2 | MD1 & MD5 | R1 | Discuss the need for cloud computing and the shift of industries from traditional computing to the cloud.
2 | Definition of Cloud | CO1 | K2 | MD1 & MD5 | R1 | Write the definition of cloud.
3 | Evolution of Cloud Computing | CO1 | K2 | MD1 & MD5 | R1 | Describe the evolution of cloud computing over the decades using a timeline.
4 | Principles of Parallel Computing | CO1 | K2 | MD1 & MD5 | R1 | Describe the principles of parallel computing and Flynn's classification.
5 | Principles of Distributed Processing | CO1 | K2 | MD1 & MD5 | R1 | Describe the principles of distributed computing systems and their approaches.
6 | Characteristics of Cloud | CO1 | K2 | MD1 & MD5 | R1 | List the five essential characteristics of cloud.
7 | Elasticity | CO1 | K2 | MD1 & MD5 | R1 | Discuss an important feature of cloud: Elasticity.
8 | Elasticity Vs Scalability | CO1 | K2 | MD1 & MD5 | R1 | Describe the difference between Elasticity and Scalability.
9 | On demand Provisioning | CO1 | K2 | MD1 & MD5 | T1 | Describe the concept of On-demand Provisioning.

ASSESSMENT COMPONENTS
AC 1. Unit Test
AC 2. Assignment
AC 3. Course
AC 4. Course Quiz
AC 5. Case
AC 6. Record Work
AC 7. Lab / Mini Project
AC 8. Lab Model Exam
AC 9. Project Review

MODE OF DELIVERY
MD1. Oral Presentation
MD2. Tutorial
MD3. Seminar
MD4. Hands On
MD5. Videos
MD6. Field Visit

11
ACTIVITY BASED LEARNING

Create a presentation by collecting pictures from the Internet, text books and
magazines to showcase how Parallel Computing Machines, Mainframe Computers,
Super Computers, Grid setups and Cloud Data centers look in real life.

12
Lecture Notes

13
Unit 1 - Introduction

14
INTRODUCTION TO CLOUD COMPUTING

Cloud Computing is the evolution of a variety of technologies that have come


together to alter an organization’s approach to building out an IT infrastructure.

Cloud computing is a technological advancement that focuses on the way we design


computing systems, develop applications, and leverage existing services for building
software. It is based on the concept of dynamic provisioning, which is applied not
only to services but also to compute capability, storage, networking, and information
technology (IT) infrastructure in general. Resources are made available through the
Internet and offered on a pay-per-use basis from cloud computing vendors.

Web 2.0 technologies play a central role in making cloud computing an attractive
opportunity for building computing systems. They have transformed the Internet
into a rich application and service delivery platform, mature enough to serve
complex needs. Service orientation allows cloud computing to deliver its capabilities
with familiar abstractions, while virtualization confers on cloud computing the
necessary degree of customization, control, and flexibility for building production
and enterprise systems.

DEFINITION OF CLOUD

Definition proposed by the U.S. National Institute of Standards and Technology


(NIST):

Cloud computing is a model for enabling ubiquitous, convenient, on-demand


network access to a shared pool of configurable computing resources (e.g.,
networks, servers, storage, applications, and services) that can be rapidly
provisioned and released with minimal management effort or service provider
interaction.

15
EVOLUTION OF CLOUD COMPUTING

HARDWARE EVOLUTION
Computerization has pervaded nearly every facet of our personal and professional
lives. Computer evolution has been both rapid and fascinating. The first step along
the evolutionary path of computers occurred in 1930, when binary arithmetic was
developed and became the foundation of computer processing technology,
terminology, and programming languages. In 1939, John Atanasoff and Clifford Berry invented an
electronic computer capable of operating digitally. Computations were performed
using vacuum-tube technology. In 1941, the introduction of Konrad Zuse’s Z3 at the
German Laboratory for Aviation in Berlin was one of the most significant events in
the evolution of computers because this machine supported both floating-point and
binary arithmetic.
First-Generation Computers:

The Mark I was designed and developed in 1943 at Harvard University. It was a
general-purpose electromechanical programmable computer.

Colossus was an electronic computer built in Britain at the end of 1943. It
was the world's first programmable, digital, electronic computing device.

First-generation computers were built using hard-wired circuits and vacuum tubes.

Data was stored using paper punch cards.

Colossus was used in secret during World War II to help decipher teleprinter
messages encrypted by German forces using the Lorenz SZ40/42 machine.

The ENIAC (Electronic Numerical Integrator and Computer) was built in 1946.
This was the first Turing-complete, digital computer capable of being
reprogrammed to solve a full range of computing problems. Although the ENIAC
was similar to the Colossus, it was much faster, more flexible, and Turing-
complete.

16
ENIAC contained 18,000 thermionic valves, weighed over 60,000 pounds, and
consumed 25 kilowatts of electrical power.

ENIAC was capable of performing 100,000 calculations a second.

Second-Generation Computers:

The inefficient thermionic valves were replaced with smaller and more reliable
transistors.

Transistorized computers marked the advent of second-generation computers,


which dominated in the late 1950s and early 1960s.

Despite using transistors and printed circuits, these computers were still bulky and
expensive. They were therefore used mainly by universities and government
agencies.

The integrated circuit or microchip was developed by Jack St. Claire Kilby, an
achievement for which he received the Nobel Prize in Physics in 2000.

Third-Generation Computers:

The development of the integrated circuit was the hallmark of the third
generation of computers.

Transistors were miniaturized and placed on silicon chips, called semiconductors,


which drastically increased the speed and efficiency of computers.

Instead of punched cards and printouts, users interacted with third generation
computers through keyboards and monitors and interfaced with an operating
system, which allowed the device to run many different applications at one time
with a central program that monitored the memory.

The integrated circuit allowed the development of minicomputers that began to


bring computing into many smaller businesses.

Computers for the first time became accessible to a mass audience because they
were smaller and cheaper than their predecessors.

17
Fourth-Generation Computers:

The fourth-generation computers that were being developed at this time utilized a
microprocessor that put the computer’s processing capabilities on a single
integrated circuit chip.

By combining the microprocessor with random access memory (RAM), developed by Intel, fourth-
generation computers were faster than ever before and had much smaller
footprints.

In November 1971, Intel released the world’s first commercial microprocessor, the
Intel 4004. The 4004 was the first complete CPU on one chip and became the first
commercially available microprocessor. The 4004 processor was capable of “only”
60,000 instructions per second.

The microprocessors that evolved from the 4004 allowed manufacturers to begin
developing personal computers small enough and cheap enough to be purchased
by the general public.

The first commercially available personal computer was the MITS Altair 8800,
released at the end of 1974.

Even though microprocessing power, memory and data storage capacities have
increased by many orders of magnitude since the invention of the 4004 processor,
the technology for large-scale integration (LSI) or very-large-scale integration
(VLSI) microchips has not changed all that much. For this reason, most of today’s
computers still fall into the category of fourth-generation computers.

Fifth-Generation Computers:

In the fifth generation, VLSI technology became ULSI (Ultra Large Scale
Integration) technology, resulting in the production of microprocessor chips
having ten million electronic components.

Earlier computer generations were categorized on the basis of hardware only,
but fifth-generation technology included software as well.

18
This generation is based on parallel processing hardware and AI (Artificial
Intelligence) software.

The computers of the fifth generation had high capability and large memory
capacity.

INTERNET SOFTWARE EVOLUTION

The Advanced Research Projects Agency (ARPA) of the United States Department
of Defense funded a research project to develop a network. The idea was to
develop a computer network that could continue to function in the event of a
disaster such as a nuclear war.

The project was known as ARPANET (Advanced Research Projects Agency Network),
the world's first operational packet-switching network.

Packet switching was incorporated into the proposed design for the ARPANET in
1967.

ARPANET development began with two network nodes in 1969; by the end
of 1971, fifteen sites were connected to the young ARPANET.

An Interface Message Processor (IMP) at each site handled the interface to the
ARPANET.

In 1968, Bolt, Beranek and Newman, Inc. (BBN) unveiled the final version of the
Interface Message Processor (IMP) specifications.

Computers connected to the ARPANET used a common standard to communicate.
The standard used by the ARPANET was known as NCP (Network Control Protocol).
Protocols are sets of standards that govern the transmission of data over a
network. NCP provided connections and flow control between processes running
on different ARPANET host computers.

The ARPANET project and international working groups led to the development of
various protocols and standards by which multiple separate networks could
become a single network or "a network of networks".

19
The ARPANET became a massive network of networks and is now known as the
Internet.

Establishing a Common Protocol for the Internet:

The protocols in the Physical Layer, the Data Link Layer, and the Network
Layer used within the network were implemented on separate Interface Message
Processors (IMPs).

Since lower protocol layers were provided by the IMP-host interface, NCP
essentially provided a Transport Layer consisting of the ARPANET Host-to-Host
Protocol (AHHP) and the Initial Connection Protocol (ICP). AHHP defined
procedures to transmit a unidirectional, flow-controlled data stream between two
hosts. The ICP defined the procedure for establishing a bidirectional pair of such
streams between a pair of host processes.

Application protocols such as File Transfer Protocol (FTP), used for file transfers,
and Simple Mail Transfer Protocol (SMTP), used for sending email, accessed
network services through an interface to the top layer of the NCP.

NCP was officially rendered obsolete when the ARPANET changed its core
networking protocols from NCP to the more flexible and powerful TCP/IP protocol
suite, marking the start of the modern Internet.

Transmission Control Protocol (TCP) and Internet Protocol (IP), together known as
the TCP/IP protocol suite, emerged as the protocols for the ARPANET. This
resulted in the fledgling definition of the Internet as a set of connected TCP/IP
internets. TCP/IP remains the standard protocol suite for the Internet.

TCP/IP quickly became the most widely used network protocol in the world.

TCP converts messages into streams of packets at the source; they are reassembled
back into messages at the destination. IP handles the dispatch of these packets:
it handles the addressing and makes sure that packets reach their destination,
possibly passing through multiple intermediate nodes.

20
Evolution of IPv6:

For locating individual computers on the network, the Internet provides IP


addresses.

IP addresses are used by the Internet infrastructure to direct internet packets to


their destinations.

The amazing growth of the Internet throughout the 1990s caused a vast
reduction in the number of free IP addresses available under IPv4.

Internet Protocol version 4 (IPv4) defines an IP address as a 32-bit number.

Internet Protocol Version 4 (IPv4) is the initial version used on the first generation
of the Internet and is still in dominant use. It was designed to address up to about 4.3
billion (2^32, roughly 4.3 x 10^9) hosts. However, the explosive growth of the Internet
has led to IPv4 address exhaustion, which entered its final stage in 2011, when the
global IPv4 address allocation pool was exhausted.

Because of the growth of the Internet and the depletion of available IPv4
addresses, a new version of IP, IPv6, was developed in the mid-1990s, which
provides vastly larger addressing capabilities and more efficient routing of
Internet traffic. IPv6 uses 128 bits for the IP address.

IPv6 is sometimes called the Next Generation Internet Protocol (IPng) or TCP/IP
v6.

IPv6 was widely available from industry as an integrated TCP/IP protocol and was
supported by most new Internet networking equipment.
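As a quick, illustrative check of the address-space difference, the following is a
minimal sketch using only the Python standard library; the addresses shown are
the reserved documentation examples.

import ipaddress

# IPv4 uses 32-bit addresses; IPv6 uses 128-bit addresses.
print(f"IPv4 address space: 2**32  = {2**32:,}")      # about 4.3 billion
print(f"IPv6 address space: 2**128 = {2**128:.3e}")   # about 3.4 x 10**38

v4 = ipaddress.ip_address("192.0.2.1")     # example IPv4 address (documentation range)
v6 = ipaddress.ip_address("2001:db8::1")   # example IPv6 address (documentation range)
print(v4.version, v6.version)              # prints: 4 6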

Finding a Common Method to Communicate Using the Internet Protocol:

The word hypertext was coined by Ted Nelson in the early 1960s.

The Memex, proposed by Vannevar Bush, was an automated library system.

The Memex was an electromechanical desk linked to an extensive archive of microfilms,
able to display books, writings, or any document from a library.

21
The Memex would also be able to create 'trails' of linked and branching sets of
pages, combining pages from the published microfilm library with personal
annotations or additions captured on a microfilm recorder.

Memex influenced Ted Nelson and Douglas Engelbart for the invention of
hypertext.

Engelbart developed computer tools to augment human capabilities and
productivity. As part of this effort he developed the mouse, the
graphical user interface (GUI), and the first working hypertext system, named
NLS (derived from oN-Line System).

NLS was designed to cross-reference research papers for sharing among
geographically distributed researchers. NLS provided groupware capabilities,
screen sharing among remote users, and reference links for moving between
sentences within a research paper and from one research paper to another.

HyperCard by Apple was the first hypertext editing system available to the general
public.

Building a Common Interface to the Internet:

In the early 1990s, the Mosaic browser was developed at the National
Center for Supercomputing Applications (NCSA), a research institute at the
University of Illinois; members of its team later created the Netscape browser.

In the fall of 1990, Berners-Lee developed the first web browser featuring an
integrated editor that could create hypertext documents.

Berners-Lee enhanced the server and browser by adding support for the FTP
protocol.

Mosaic was the first widely popular web browser available to the general public. It
helped spread use and knowledge of the web across the world.

Mosaic provided support for graphics, sound, and video clips. Innovations
including the use of bookmarks and history files were added.

22
Mosaic became even more popular, helping further the growth of the World Wide
Web.

Mosaic Communications was later renamed Netscape Communications in 1994.

The Mosaic programming team developed a new web browser, which they named
Netscape Navigator. Netscape released the first beta version of this browser
(code-named Mozilla 0.96b) over the Internet; the final version, Netscape
Navigator 1.0 (code-named Mozilla), was released in December 1994.

In 1995, Microsoft Internet Explorer arrived as both a graphical Web browser and
the name for a set of technologies.

In July 1995, Microsoft released the Windows 95 operating system, which


included built-in support for dial-up networking and TCP/IP, two key technologies
for connecting a PC to the Internet. It also included an add-on to the operating
system called Internet Explorer 1.0.

Netscape released a free, open source software version of Netscape named


Mozilla in 2002.

Mozilla has steadily gained market share, particularly on non-Windows platforms


such as Linux, largely because of its open source foundation. Mozilla Firefox,
released in November 2004, became very popular almost immediately.

The Appearance of Cloud Formations—From One Computer to a Grid of


Many:

Computers were clustered together to form a single larger computer in order to


simulate a supercomputer and harness greater processing power.

Clustering allowed one to configure computers using special protocols so they


could “talk” to each other.

A key to efficient cluster management was engineering where the data was to be
held. This process became known as data residency.

23
Grid computing expands on the techniques used in clustered computing models,
where multiple independent clusters appear to act like a grid simply because they
are not all located within the same domain.

A major obstacle to overcome in the migration from a clustering model to grid


computing was data residency.

The Globus Toolkit is an open source software toolkit used for building grid
systems and applications. It is being developed and maintained by the Globus
Alliance and many others all over the world.

The toolkit provided by Globus allows people to share computing power,


databases, instruments, and other online tools securely across corporate,
institutional, and geographic boundaries without sacrificing local autonomy.

The cloud is helping to further propagate the grid computing model.

Cloud-resident entities such as data centers have taken the concepts of grid
computing and bundled them into service offerings that appeal to other entities
that do not want the burden of infrastructure but do want the capabilities hosted
from those data centers.

One of the best known of the new cloud service offerings is Amazon's S3
(Simple Storage Service), a third-party storage solution.

SERVER VIRTUALIZATION

Virtualization is a method of running multiple independent virtual operating


systems on a single physical computer. The creation and management of virtual
machines has often been called platform virtualization.

Platform virtualization is performed on a given computer (hardware platform) by


software called a control program.

The control program creates a simulated environment, a virtual computer, which


enables the device to use hosted software specific to the virtual environment,
called guest software.

24
Virtualization technology is a way of reducing the majority of hardware acquisition
and maintenance costs, which can result in significant savings for any company.

Parallel Processing:

Parallel processing is performed by the simultaneous execution of program


instructions that have been allocated across multiple processors with the objective
of running a program in less time.

Parallel processing also improves performance by allowing the interleaved,
simultaneous execution of multiple programs.

The next advancement in parallel processing was multiprogramming.

In a multiprogramming system, multiple programs submitted by users are allowed


to use the processor for a short time, each taking turns and having exclusive time
with the processor in order to execute instructions.

Vector Processing:

The next step in the evolution of parallel processing was the introduction of
multiprocessing. Here, two or more processors share a common workload.

The earliest versions of multiprocessing were designed as a master/slave model,


where one processor (the master) was responsible for all of the tasks to be
performed and it only off-loaded tasks to the other processor (the slave) when
the master processor determined, based on a predetermined threshold, that work
could be shifted to increase performance.

Vector processing was developed to increase processing performance by


operating in a multitasking manner.

Symmetric Multiprocessing Systems:

The next advancement was the development of symmetric multiprocessing


systems (SMP) to address the problem of resource management in master/ slave
models.

25
In SMP systems, each processor is equally capable and responsible for managing
the workflow as it passes through the system.

The primary goal is to achieve sequential consistency, in other words, to make


SMP systems appear to be exactly the same as a single-processor,
multiprogramming platform.

Massively Parallel Processing Systems:

Massive parallel processing systems refer to a computer system with many


independent arithmetic units or entire microprocessors, which run in parallel.

In this form of computing, all the processing elements are interconnected to act
as one very large computer.

The earliest massively parallel processing systems all used serial computers as
individual processing units in order to maximize the number of units available for
a given size and cost.

Single-chip implementations of massively parallel processor arrays are becoming


ever more cost effective due to the advancements in integrated circuit technology.

26
Parallel Computing Systems

The term parallel computing and distributed computing are often used
interchangeably, even though they mean slightly different things.
The term parallel implies a tightly coupled system, whereas
distributed systems refer to a wider class of systems, including
those that are tightly coupled.
More precisely, the term parallel computing refers to a model in which
the computation is divided among several processors sharing the
same memory.
The architecture of a parallel computing system is often characterized by
the homogeneity of components: each processor is of the same
type and it has the same capability as the others.
The shared memory has a single address space, which is accessible to all
the processors.
Parallel programs are then broken down into several units of execution that
can be allocated to different processors and can communicate with each
other by means of shared memory.
Originally, parallel systems were considered to be those architectures that
featured multiple processors sharing the same physical memory and that
were regarded as a single computer.
Over time, these restrictions have been relaxed, and parallel systems now
include all architectures that are based on the concept of shared memory,
whether this is physically present or created with the support of libraries,
specific hardware, and a highly efficient networking infrastructure.
For example, a cluster whose nodes are connected through an
InfiniBand network and configured with a distributed shared memory
system can be considered a parallel system.
The term distributed computing encompasses any architecture or system
that allows the computation to be broken down into units and executed
concurrently on different computing elements, whether these are
processors on different nodes, processors on the same computer, or cores
within the same processor.
Distributed computing includes a wider range of systems and applications
than parallel computing and is often considered a more general term.
Even though it is not a rule, the term distributed often implies that the
locations of the computing elements are not the same and such elements
might be heterogeneous in terms of hardware and software features.
Classic examples of distributed computing systems are Computing
Grids and Internet Computing Systems.

27
Elements of Parallel Computing

Silicon-based processor chips are reaching their physical limits. Processing speed
is constrained by the speed of light, and the density of transistors packaged in a
processor is constrained by thermodynamic limitations.
A viable solution to overcome this limitation is to connect multiple processors
working in coordination with each other to solve “Grand Challenge” problems.
The first step in this direction led to the development of parallel computing, which
encompasses techniques, architectures, and systems for performing multiple
activities in parallel. As discussed earlier, the term parallel computing has blurred
its edges with the term distributed computing and is often used in place of the
latter term.
Even though it is not a rule, the term distributed often implies that the locations
of the computing elements are not the same and such elements might be
heterogeneous in terms of hardware and software features.
Classic examples of distributed computing systems are Computing Grids and
Internet Computing Systems.

What are the factors that influenced Parallel Processing?


The development of parallel processing is being influenced by many factors. The
prominent among them include the following:
Computational requirements are ever increasing in the areas of both
scientific and business computing. The technical computing problems, which
require high-speed computational power, are related to life sciences,
aerospace, geographical information systems, mechanical design and
analysis etc.
Sequential architectures are reaching physical limitations, as they
are constrained by the speed of light and the laws of thermodynamics.
Hardware improvements such as pipelining, superscalar execution and the like are non-
scalable and require sophisticated compiler technology. Developing such
compiler technology is a difficult task.
Vector processing works well for certain kinds of problems. It is suitable
mostly for scientific problems (involving lots of matrix operations) and
graphical processing. It is not useful for other areas, such as databases.
The technology of parallel processing is mature and can be exploited
commercially. There is already significant R&D work on development tools
and environments.

28
Hardware architectures for parallel
Processing
The core elements of parallel processing are CPUs. Based on the number of
instruction and data streams that can be processed simultaneously, computing
systems are classified into the following four categories:
Single-instruction, Single-data (SISD) systems
Single-instruction, Multiple-data (SIMD) systems
Multiple-instruction, Single-data (MISD) systems
Multiple-instruction, Multiple-data (MIMD) systems
Single – Instruction , Single Data (SISD) systems
SISD computing system is a uni-processor machine capable of executing a
single instruction, which operates on a single data stream.
Machine instructions are processed sequentially, hence computers adopting
this model are popularly called sequential computers.
Most conventional computers are built using SISD model.
All the instructions and data to be processed have to be stored in primary
memory.
The speed of processing element in the SISD model is limited by the rate at
which the computer can transfer information internally.
Dominant representative SISD systems are IBM PC, Macintosh, and
workstations.
Single – Instruction, Multiple Data (SIMD) systems
SIMD computing system is a multiprocessor machine capable of executing
the same instruction on all the CPUs but operating on different data streams.
Machines based on this model are well suited for scientific computing since
they involve lots of vector and matrix operations.
For instance, the statement Ci = Ai * Bi can be passed to all the processing
elements (PEs); the data elements of vectors A and B can be divided
into multiple sets (N sets for an N-PE system), and each PE can process one
data set (a short sketch of this data-parallel idea follows this list).
Dominant representative SIMD systems are Cray's vector processing machines,
Thinking Machines Cm*, and GPGPU accelerators.
Multiple – Instruction , Single Data (MISD) systems
MISD computing system is a multiprocessor machine capable of executing
different instructions on different PEs, all of them operating on the same data
set.
For example, y = sin(x) + cos(x) + tan(x)
Machines built using the MISD model are not useful in most applications.
A few machines have been built, but none of them are available commercially.
This type of system is more of an intellectual exercise than a practical
configuration.
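As promised above, here is a minimal data-parallel sketch of the SIMD idea
Ci = Ai * Bi in Python, assuming NumPy is available; a single elementwise-multiply
operation is applied across all elements of the two vectors, the way one instruction
would be broadcast to all PEs.

import numpy as np

# Vectors A and B; on an N-PE SIMD machine each PE would hold one slice of these.
A = np.arange(1, 9, dtype=np.float64)         # [1, 2, ..., 8]
B = np.arange(10, 90, 10, dtype=np.float64)   # [10, 20, ..., 80]

# One "instruction" (elementwise multiply) applied to all data elements at once.
C = A * B
print(C)   # [ 10.  40.  90. 160. 250. 360. 490. 640.]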

29
Hardware architectures for parallel
Processing
Multiple – Instruction , Multiple Data (MIMD) systems
MIMD computing system is a multi processor machine capable of executing
multiple instructions on multiple data sets.
Each PE in the MIMD model has separate instruction and data streams,
hence machines built using this model are well suited to any kind of
application.
Unlike SIMD and MISD machines, PEs in MIMD machines work asynchronously.
MIMD machines are broadly categorized into shared-memory MIMD and
distributed memory MIMD based on the way PEs are coupled to the main
memory
Shared Memory MIMD machines
All the PEs are connected to a single global memory and they all have access
to it.
Systems based on this model are also called tightly coupled multi processor
systems.
The communication between PEs in this model takes place through the
shared memory.
Modification of the data stored in the global memory by one PE is visible to
all other PEs.
Dominant representative shared memory MIMD systems are Silicon Graphics
machines and Sun/IBM SMP (Symmetric Multi-Processing) systems.
Distributed Memory MIMD machines
PEs have a local memory. Systems based on this model are also called
loosely coupled multi-processor systems.
The communication between PEs in this model takes place through the
interconnection network, the inter process communication channel, or IPC.
The network connecting PEs can be configured to tree, mesh, cube, and so
on.
Each PE operates asynchronously, and if communication/synchronization
among tasks is necessary, they can do so by exchanging messages between
them.
Shared Vs Distributed MIMD model
The shared memory MIMD architecture is easier to program but is less
tolerant to failures and harder to extend with respect to the distributed
memory MIMD model.
Failures in a shared-memory MIMD system affect the entire system, whereas this is
not the case in the distributed model, in which each of the PEs can be easily
isolated.
Moreover, shared memory MIMD architectures are less likely to scale
because the addition of more PEs leads to memory contention.
This is a situation that does not happen in the case of distributed memory, in
which each PE has its own memory.
As a result, distributed memory MIMD architectures are most popular today.
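A minimal sketch of the distributed-memory idea in Python, using the standard
multiprocessing module: each process (a "PE") keeps its own private memory and
the two cooperate only by exchanging messages over a pipe, which stands in for
the interconnection network.

from multiprocessing import Process, Pipe

def worker(conn):
    # This "PE" has its own private memory; it sees only what arrives as a message.
    data = conn.recv()                     # receive a message from the other PE
    conn.send(sum(x * x for x in data))    # send back a partial result
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send([1, 2, 3, 4])          # message passing instead of shared memory
    print(parent_end.recv())               # prints: 30
    p.join()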

30
(Figure: SISD and SIMD architectures)

31
(Figure: MISD and MIMD architectures)

32
(Figure: Shared-memory and distributed-memory MIMD machines)

33
Approaches to Parallel
Programming
A sequential program is one that runs on a single processor and has a single line
of control.
To make many processors collectively work on a single program, the program
must be divided into smaller independent chunks so that each processor can work
on separate chunks of the problem.
The program decomposed in this way is a parallel program.
A wide variety of parallel programming approaches are available.
The most prominent among them are the following.
Data Parallelism
Process Parallelism
Farmer-and-worker model
The three models listed above are suitable for task-level parallelism. In the case of
data-level parallelism, the divide-and-conquer technique is used to split data into
multiple sets, and each data set is processed on different PEs using the same
instruction.
This approach is highly suitable for processing on machines based on the SIMD
model.
In the case of Process Parallelism, a given operation has multiple (but distinct)
activities that can be processed on multiple processors.
In the case of the Farmer-and-Worker model, a job distribution approach is used: one
processor is configured as the master (farmer) and all the remaining PEs are designated
as slaves (workers); the master assigns jobs to the slave PEs and, on completion, they
inform the master, which in turn collects the results (a minimal sketch of this model
follows below).
These approaches can be utilized in different levels of parallelism.
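A minimal sketch of the farmer-and-worker model in Python, using a multiprocessing
pool: the master (farmer) splits the job with a divide-and-conquer step, hands the
chunks to worker processes, and collects their results. The worker task shown is only
a placeholder.

from multiprocessing import Pool

def work(chunk):
    # Placeholder worker task: each worker processes one chunk independently.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1, 101))
    chunks = [data[i:i + 25] for i in range(0, len(data), 25)]   # divide-and-conquer split
    with Pool(processes=4) as farmer:          # the "farmer" distributes jobs to workers
        partials = farmer.map(work, chunks)    # workers report results back to the master
    print(sum(partials))                       # prints: 5050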
Levels of Parallelism
Levels of parallelism are decided based on the chunks of code (grain size) that
are potential candidates for parallelism.
The table below shows the levels of parallelism. All these approaches have a common
goal:
To boost processor efficiency by hiding latency.
To conceal latency, there must be another thread ready to run whenever a
lengthy operation occurs.
The idea is to execute concurrently two or more single-threaded applications,
such as compiling, text formatting, database searching, and device simulation.

34
Approaches to Parallel
Programming

Grain Size | Code Item | Parallelized By
Large | Separate and heavyweight process | Programmer
Medium | Function or procedure | Programmer
Fine | Loop or instruction block | Parallelizing compiler
Very Fine | Instruction | Processor

35
Distributed Computing Systems

Distributed computing studies the models, architectures, and algorithms used for
building and managing distributed systems.
A distributed system is a collection of independent computers that appears to its
users as a single coherent system.
A distributed system is one in which components located at networked computers
communicate and coordinate their actions only by passing messages.
As specified in this definition, the components of a distributed system
communicate with some sort of message passing.
Components of distributed System
A distributed system is the result of the interaction of several components that
traverse the entire computing stack from hardware to software.
It emerges from the collaboration of several elements that, by working together,
give users the illusion of a single coherent system.
The figure provides an overview of the different layers that are involved in
providing the services of a distributed system.
At the very bottom layer, computer and network hardware constitute the physical
infrastructure; these components are directly managed by the operating system,
which provides the basic services for inter process communication (IPC), process
scheduling and management, and resource management in terms of file system
and local devices.
Taken together these two layers become the platform on top of which specialized
software is deployed to turn a set of networked computers into a distributed
system.

36
Architectural styles for distributed
computing
Although a distributed system comprises the interaction of several layers, the
middleware layer is the one that enables distributed computing, because it
provides a coherent and uniform runtime environment for applications.
There are many different ways to organize the components that, taken together,
constitute such an environment.
The interactions among these components and their responsibilities give structure
to the middleware and characterize its type or, in other words, define its
architecture.
Architectural styles aid in understanding and classifying the organization of
software systems in general and distributed computing in particular.
The architectural styles are classified into two major classes
Software Architectural styles : Relates to the logical organization of the software.
System Architectural styles: styles that describe the physical organization of
distributed software systems in terms of their major components.
Software Architectural styles
Software architectural styles are based on the logical arrangement of software
components.
They are helpful because they provide an intuitive view of the whole system,
despite its physical deployment.
They also identify the main abstractions that are used to shape the components
of the system and the expected interaction patterns between them.

Category | Most Common Architectural Styles
Data Centered | Repository; Blackboard
Data Flow | Pipe and filter; Batch sequential
Virtual Machine | Rule-based; Interpreter
Call and return | Main program and subroutine call / top-down systems; Layered systems
Independent Components | Communicating processes; Event systems

37
Software Architectural styles
Data Centered Architecture:
These architectures identify the data as the fundamental element of the
software system, and access to shared data is the core characteristic of
data-centered architectures.
Within the context of distributed and parallel computing systems, the integrity of
data is the overall goal of such systems.
The repository architectural style is the most relevant reference model in this
category. It is characterized by two main components: the central data structure,
which represents the current state of the system, and a collection of independent
components, which operate on the central data.
The ways in which the independent components interact with the central data
structure can be very heterogeneous.
In particular, repository-based architectures differentiate and specialize
further into subcategories according to the choice of control discipline
to apply to the shared data structure. Of particular interest are
databases and blackboard systems.
In the repository systems, the dynamics of the system is controlled by
independent components, which by issuing an operation on the central
repository, trigger the selection of specific processes that operate on
data.
Black Board Architecture:
The black board architectural style is characterized by three main components:
Knowledge sources : These are entities that update the knowledge base that
is maintained in the black board.
Blackboard : This represents the data structure that is shared among the
knowledge sources and stores the knowledge base of the application.
Control: The control is the collection of triggers and procedures that govern
the interaction with the blackboard and update the status of the knowledge
base.
Knowledge sources represent the intelligent agents sharing the blackboard; they react
opportunistically to changes in the knowledge base, almost in the same way that
a group of specialists brainstorms in a room in front of a blackboard.
Blackboard models have become popular and widely used for artificial intelligence
applications in which the blackboard maintains the knowledge about a domain in
the form of assertions and rules, which are entered by domain experts.
These operate through a control shell that controls the problem-solving activity of
the system. Particular and successful applications of this model can be found in
the domains of speech recognition and signal processing.

38
Software Architectural styles
Data Flow Architecture:
Access to data is the core feature; data-flow styles explicitly incorporate the
pattern of data flow, since their design is determined by an orderly motion of data
from component to component, which is the form of communication between
them.
Styles within this category differ in one of the following ways: how the control is
exerted, the degree of concurrency among components, and the topology that
describes the flow of data.
Batch Sequential: The batch sequential style is characterized by an ordered
sequence of separate programs executing one after the other. These programs
are chained together by providing as input for the next program the output
generated by the last program after its completion, which is most likely in the
form of a file. This design was very popular in the mainframe era of computing
and still finds applications today. For example, many distributed applications for
scientific computing are defined by jobs expressed as sequence of programs that,
for example, pre-filter, analyze, and post process data. It is very common to
compose these phases using the batch sequential style.
Pipe-and-Filter Style: It is a variation of the previous style for expressing the
activity of a software system as sequence of data transformations. Each
component of the processing chain is called a filter, and the connection between
one filter and the next is represented by a data stream.
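A small sketch of the pipe-and-filter style using Python generators: each filter
transforms the data stream it receives and hands it to the next filter in the chain.

def read_source(lines):
    # Filter 1: produce a stream of raw records.
    for line in lines:
        yield line

def clean(stream):
    # Filter 2: strip whitespace and drop empty records.
    for item in stream:
        item = item.strip()
        if item:
            yield item

def to_upper(stream):
    # Filter 3: transform each record.
    for item in stream:
        yield item.upper()

raw = ["  alpha ", "", "beta", "  gamma"]
pipeline = to_upper(clean(read_source(raw)))   # filters connected by data streams
print(list(pipeline))                          # ['ALPHA', 'BETA', 'GAMMA']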
Virtual Machine Architecture:
The virtual machine class of architectural styles is characterized by the presence
of an abstract execution environment (generally referred to as a virtual machine)
that simulates features that are not available in the hardware or software.
Applications and systems are implemented on top of this layer and become
portable over different hardware and software environments.
The general interaction flow for systems implementing this pattern is – the
program (or the application) defines its operations and state in an abstract
format, which is interpreted by the virtual machine engine. The interpretation of a
program constitutes its execution. It is quite common in this scenario that the
engine maintains an internal representation of the program state.
Popular examples within this category are rule based systems, interpreters, and
command language processors.
Rule-Based Style:
This architecture is characterized by representing the abstract execution
environment as an inference engine. Programs are expressed in the form of rules
or predicates that hold true. The input data for applications is generally
represented by a set of assertions or facts that the inference engine uses to
activate rules or to apply predicates, thus transforming data.

39
Software Architectural styles &
System Architectural Styles
Examples of rule-based systems can be found in the networking domain:
Network Intrusion Detection Systems (NIDS) often rely on a set of rules to
identify abnormal behaviors connected to possible intrusions into computing
systems.
Interpreter Style: This style is characterized by the presence of an engine that
interprets the program to be executed.
Call and Return Architecture:
This identifies all systems that are organized into components mostly connected
together by method calls.
The activity of systems modelled in this way is characterized by a chain of method
calls whose overall execution and composition identify the execution of one or more
operations.
There are three categories in this class: 1) Top-down style, developed with imperative
programming; 2) Object-oriented style, based on object programming models; and 3)
Layered style, which provides the implementation in different levels of abstraction of
the system.
System Architectural Styles
System architectural styles cover the physical organization of components and
processes over a distributed infrastructure. Two fundamental reference styles are
as follows
1) Client / Server
The information and the services of interest can be centralized and accessed
through a single access point: the server. Multiple clients are interested in such
services and the server must be appropriately designed to efficiently serve
requests coming from different clients.
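A minimal client/server sketch in Python using standard TCP sockets (the port
number is an arbitrary choice for illustration): the server is the single access
point; it accepts a request and replies to the client. In practice the two functions
would run in separate processes or on separate machines, with the server started first.

import socket

HOST, PORT = "127.0.0.1", 5000      # the single access point: the server

def server():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen()
        conn, addr = srv.accept()              # serve one client request
        with conn:
            request = conn.recv(1024)
            conn.sendall(b"echo: " + request)  # reply to the client

def client():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((HOST, PORT))
        cli.sendall(b"hello")
        print(cli.recv(1024))                  # prints: b'echo: hello'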

40
Software Architectural styles &
System Architectural Styles
2) Peer-to-Peer
Symmetric architectures in which all the components, called peers, play the same
role and incorporate both the client and server capabilities of the client/server model.
More precisely, each peer acts as a server when it processes requests from other
peers and as a client when it issues requests to other peers.

Models for Inter process Communication


Distributed systems are composed of a collection of concurrent processes
interacting with each other by means of a network connection.
IPC is a fundamental aspect of distributed systems design and implementation.
IPC is used to either exchange data and information or coordinate the activity of
processes.
IPC is what ties together the different components of a distributed system, thus
making them act as a single system.
There are several different models in which processes can interact with each
other; these map to different abstractions for IPC.
Among the most relevant that we can mention are shared memory, remote
procedure call (RPC), and message passing.
At a lower level, IPC is realized through the fundamental tools of network
programming.
Sockets are the most popular IPC primitive for implementing communication
channels between distributed processes.

41
Software Architectural styles &
System Architectural Styles
Message-based communication
The abstraction of message has played an important role in the evolution of the
model and technologies enabling distributed computing.
Recall the definition of distributed computing: one in which components located
at networked computers communicate and coordinate their actions only by
passing messages. The term message, in this case, identifies any discrete amount
of information that is passed from one entity to another. It encompasses any form
of data representation that is limited in size and time, whether this is an
invocation of a remote procedure, a serialized object instance, or a generic
message.
The term message-based communication model can be used to refer to any
model for IPC.
Several distributed programming paradigms eventually use message-based
communication despite the abstractions that are presented to developers for
programming the interactions of distributed components.
Here are some of the most popular and important:
Message Passing: This paradigm introduces the concept of a message as the
main abstraction of the model. The entities exchanging information explicitly
encode the data to be exchanged in the form of a message. The structure and the
content of a message vary according to the model. Examples of this model are
the Message Passing Interface (MPI) and OpenMP.
Remote Procedure Call (RPC): This paradigm extends the concept of a
procedure call beyond the boundaries of a single process, thus triggering the
execution of code in remote processes (a minimal sketch of this model appears at
the end of this list).
Distributed Objects : This is an implementation of the RPC model for the
object-oriented paradigm and contextualizes this feature for the
remote invocation of methods exposed by objects. Examples of distributed object
infrastructures are Common Object Request Broker Architecture (CORBA),
Component Object Model (COM, DCOM, and COM+), Java Remote Method
Invocation (RMI), and .NET Remoting.
Distributed agents and active objects: Programming paradigms based on
agents and active objects involve by definition the presence of instances, whether
they are agents or objects, regardless of the existence of requests.
Web Service: An implementation of the RPC concept over HTTP, thus allowing
the interaction of components that are developed with different technologies. A
Web service is exposed as a remote object hosted on a Web server, and method
invocations are transformed into HTTP requests, using specific protocols such as
Simple Object Access Protocol (SOAP) or Representational State Transfer (REST).
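As referenced above, here is a minimal sketch of the RPC model over HTTP using
Python's standard xmlrpc modules; the host, port and the exposed procedure are
arbitrary choices for illustration. The server registers a procedure, and the client
invokes it as if it were a local call.

# --- server side ---
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    # Procedure exposed for remote invocation.
    return a + b

server = SimpleXMLRPCServer(("127.0.0.1", 8000), allow_none=True)
server.register_function(add, "add")
# server.serve_forever()            # uncomment to actually run the server

# --- client side (run in another process) ---
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://127.0.0.1:8000/")
# print(proxy.add(2, 3))            # the remote call looks like a local call; prints 5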

42
Characteristics of Cloud
The five essential characteristics of Cloud computing according to NIST are
On-demand self-service: A consumer can unilaterally provision computing capabilities,
such as server time and network storage, as needed automatically without requiring
human interaction with each service provider.
Broad network access: Capabilities are available over the network and accessed
through standard mechanisms that promote use by heterogeneous thin or thick client
platforms (e.g., mobile phones, tablets, laptops and workstations).
Resource pooling: The provider's computing resources are pooled to serve multiple
consumers using a multi-tenant model, with different physical and virtual resources
dynamically assigned and reassigned according to consumer demand. There is a sense
of location independence in that the customer generally has no control or knowledge
over the exact location of the provided resources but may be able to specify location at a
higher level of abstraction (e.g., country, state or datacenter). Examples of resources
include storage, processing, memory and network bandwidth.
Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases
automatically, to scale rapidly outward and inward commensurate with demand. To the
consumer, the capabilities available for provisioning often appear to be unlimited and can
be appropriated in any quantity at any time.
Measured service: Cloud systems automatically control and optimize resource use by
leveraging a metering capability at some level of abstraction appropriate to the type of
service (e.g., storage, processing, bandwidth and active user accounts). Resource usage
can be monitored, controlled and reported, providing transparency for the provider and
consumer.
Elasticity Vs Scalability in Cloud
Elasticity
Cloud elasticity is a system's ability to dynamically manage the available resources
according to the current workload requirements. Elasticity is a vital feature of cloud
infrastructure. It comes in handy when the system is expected to experience sudden
spikes of user activity and, as a result, a drastic increase in workload demand.
For example:
Streaming Services. Netflix is dropping a new season of Mindhunter. The notification
triggers a significant number of users to get on the service and watch or upload the
episodes. Resource-wise, it is an activity spike that requires swift resource allocation.
E-commerce. Amazon has a Prime Day event with many special offers, sell-offs,
promotions, and discounts. It attracts an immense amount of customers on the service
who are doing different activities. Actions include searching for products, bidding, buying
stuff, writing reviews, rating products.

43
Elasticity Vs Scalability
This diverse activity requires a very flexible system that can allocate resources to
one sector without dragging down others.
Scalability
Cloud scalability is the ability of the system's infrastructure to adequately handle
growing workload requirements while retaining consistent performance.
Unlike cloud elasticity, which is more of a makeshift resource allocation, scalability
is a part of the infrastructure design.
There are several types of cloud scalability:
Vertical Scaling or Scale-Up - the ability to handle an increasing workload by
adding resources to the existing infrastructure. It is a short term solution to cover
immediate needs.
Horizontal Scaling or Scale-Out - the expansion of the existing infrastructure
with new elements to tackle more significant workload requirements. It is a long-
term solution aimed at covering present and future resource demands with room for
expansion.
Diagonal scalability is a more flexible solution that combines the addition and
removal of resources according to the current workload requirements. It is the
most cost-effective scalability solution by far.
Let’s take a call center, for example. The typical call center is continuously
growing. New employees come in to handle an increasing number of customer
requests gradually, and new features are introduced to the system (like sentiment
analysis, embedded analytics, etc.). In this case, cloud scalability is used to keep
the system’s performance as consistent and efficient as possible over an extended
time and growth.
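The elasticity idea above can be reduced to a toy threshold-based auto-scaling
policy, sketched below; the utilisation metric fed into it and the provisioning call
that would act on the returned count are assumed to exist elsewhere.

def autoscale(current_load, instances, min_instances=2, max_instances=20,
              scale_out_at=0.75, scale_in_at=0.30):
    # current_load: average utilisation of the running instances (0.0 - 1.0).
    if current_load > scale_out_at and instances < max_instances:
        return instances + 1      # scale out on a demand spike
    if current_load < scale_in_at and instances > min_instances:
        return instances - 1      # scale in when demand drops
    return instances              # workload within the comfortable band

# Example: a spike (a new season drop, a Prime Day sale) pushes utilisation to 90%.
print(autoscale(0.90, instances=5))   # prints: 6
print(autoscale(0.20, instances=5))   # prints: 4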
On demand provisioning in Clouds:
Cloud provisioning refers to the processes for the deployment and integration of
cloud computing services within an enterprise IT infrastructure. This is a broad
term that incorporates the policies, procedures and an enterprise’s objective in
sourcing cloud services and solutions from a cloud service provider.
Cloud provisioning primarily defines how, what and when an organization will
provision cloud services. These services can be internal, public or hybrid cloud
products and solutions. There are three different delivery models:
Dynamic/On-Demand Provisioning: The customer or requesting application is
provided with resources at run time (see the sketch below).
User Provisioning: The user/customer adds a cloud device or service
themselves.
Post-Sales/Advanced Provisioning: The customer is provided with the
resources upon contract/service signup.
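As a concrete illustration of dynamic/on-demand provisioning on a public cloud
(referenced above), the sketch below uses the AWS boto3 SDK to request a virtual
machine at run time. It assumes boto3 is installed and AWS credentials and a
default region are already configured; the AMI ID shown is only a placeholder.

import boto3

ec2 = boto3.resource("ec2")   # assumes credentials/region are configured

instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI ID; replace with a real image
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
print("Provisioned on demand:", instances[0].id)

# When the workload is over, the resource is released just as easily:
# instances[0].terminate()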

44
Assignments

45
Assignment - I

Differentiate the Elasticity and Scalability features of cloud. (CO1, K2)

Submit a survey report on the Indian perspective of Parallel Computing and CDAC's role in it. (CO1, K3)

Are there any differences between Parallel Computing and Cloud Computing? Justify your answer. (CO2, K3)

Explore and analyze the on-demand provisioning feature provided by the Amazon AWS Cloud.

46
Question Bank

47
Unit 1 - Part A – (CO1, K2)

1. Define Cloud Computing.

Cloud computing is a model for enabling ubiquitous, convenient, on-demand


network access to a shared pool of configurable computing resources (e.g.,
networks, servers, storage, applications, and services) that can be rapidly
provisioned and released with minimal management effort or service provider
interaction.

2. What are the generations of computer?

First-Generation Computers

Second-Generation Computers

Third-Generation Computers

Fourth-Generation Computers

Fifth-Generation Computers

3. What is ARPANET?

ARPA (Advanced Research Projects Agency), an agency of the U.S. Department
of Defense, funded a research project to develop a network.

The project was known as ARPANET [Advanced Research Projects Agency
Network] – the world’s first operational packet-switching network.

4. List the common protocols used for the internet.

File Transfer Protocol (FTP)

Simple Mail Transfer Protocol (SMTP)

Transmission Control Protocol (TCP)

Internet Protocol (IP)

48
5. Discuss the evolution of IPv6.

An IP address is a numerical label assigned to each device connected to


a computer network that uses the Internet Protocol for communication. IP
addresses are used by the Internet infrastructure to direct internet packets
to their destinations.

Internet Protocol version 4 (IPv4) defines an IP address as a 32-bit number.

Because of the growth of the Internet and the depletion of available IPv4
addresses, a new version of IP, IPv6, was developed in the mid-1990s, which
provides vastly larger addressing capabilities and more efficient routing of
Internet traffic. IPv6 uses 128 bits for the IP address.

6. What are the two models of computing? List the key elements of
computing.

The two fundamental and dominant models of computing are sequential


and parallel.

The four key elements of computing developed during the sequential


computing era and parallel computing era are:

 architectures

 compilers

 applications

 problem-solving environments.

7. Differentiate parallel and distributed computing.

Parallel Computing: Model in which the computation is divided among several
processors sharing the same memory.
Distributed Computing: Model in which computation is broken down into units and
executed concurrently on different computing elements, whether these are
processors on different nodes, processors on the same computer, or cores within
the same processor with their own memory.

49

Parallel Computing: Tightly coupled systems.
Distributed Computing: Some distributed systems might be loosely coupled, while
others might be tightly coupled.

Parallel Computing: Uses a single computer.
Distributed Computing: Uses multiple computers.

8. Define parallel processing and parallel programming.

Processing of multiple tasks simultaneously on multiple processors is called


parallel processing.

The parallel program consists of multiple active processes (tasks)


simultaneously solving a given problem.

A given task is divided into multiple subtasks using a divide-and-conquer


technique, and each subtask is processed on a different central processing
unit (CPU).

Programming on a multiprocessor system using the divide-and-conquer


technique is called Parallel Programming.

9. Classify the computing systems based on the number of instruction


and data streams that can be processed.

Single-instruction, single-data (SISD) systems


Single-instruction, multiple-data (SIMD) systems

Multiple-instruction, single-data (MISD) systems

Multiple-instruction, multiple-data (MIMD) systems 50


10. List the levels of parallelism.

Levels of parallelism are decided based on grain size.

Grain Size    Code Item                           Parallelized By
Large         Separate and heavyweight process    Programmer
Medium        Function or procedure               Programmer
Fine          Loop or instruction block           Parallelizing compiler
Very fine     Instruction                         Processor

11. Define distributed systems and distributed computing.

A distributed system is a collection of independent computers that


appears to its users as a single coherent system.

A distributed system is one in which components located at networked


computers communicate and coordinate their actions only by passing
messages.

The components of a distributed system communicate with some sort of


message passing.

Distributed computing studies the models, architectures, and algorithms


used for building and managing distributed systems.

12. State the architectural styles for distributed computing.

Architectural styles aid in understanding and classifying the organization of


software systems in general and distributed computing in particular.

Two major classes of architectural styles:

 Software architectural styles

 System architectural styles

51
13. What are the basic building blocks of architectural styles?

Basic building blocks of architectural styles are components and connectors.

Component - a unit of software that encapsulates a function or a feature of


the system.

Examples: programs, objects, processes, pipes, and filters.

Connector - communication mechanism that allows cooperation and


coordination among components.

14. What is the use of software architectural styles?

Software architectural styles are based on the logical arrangement of software


components. They are helpful because they provide an intuitive view of the
whole system, despite its physical deployment. They also identify the main
abstractions that are used to shape the components of the system and the
expected interaction patterns between them.

15. What are the categories of software architectural styles?

Category                  Most Common Architectural Styles

Data-centered             Repository, Blackboard

Data flow                 Pipe and filter, Batch sequential

Virtual machine           Rule-based system, Interpreter

Call and return           Main program and subroutine call/top-down systems,
                          Object-oriented systems, Layered systems

Independent components    Communicating processes, Event systems

52
16. What are the components that characterize the blackboard
architectural styles?

The blackboard architectural style is characterized by three main components:

Knowledge sources: These are entities that update the knowledge base
that is maintained in the blackboard.

Blackboard: This represents the data structure that is shared among the
knowledge sources and stores the knowledge base of the application.

Control: The control is the collection of triggers and procedures that


govern the interaction with the blackboard and update the status of the
knowledge base.

17. Define system architectural styles.

System architectural styles cover the physical organization of components and


processes over a distributed infrastructure. Two fundamental reference styles
are as follows

Client / Server

Peer- to – Peer

18. What are the important operations in client/ server paradigm?

The important operations in the client – server paradigm, illustrated by the sketch after this list, are:

request (client side)

accept (client side)

listen (server side)

response (server side)
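Purely as an illustration beyond the short answer above, the Python socket sketch
below walks through these operations; the loopback address, port number and message
contents are assumptions. Note that the socket library's accept() call is the server
accepting an incoming connection, whereas "accept (client side)" in the paradigm
refers to the client accepting the server's response.

# Minimal sketch of the client/server operations with Python sockets.
# The loopback address, port number and message contents are assumptions.
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 50007        # assumed free local port

def server():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)                              # listen (server side)
        conn, _ = srv.accept()                     # socket-level accept of the connection
        with conn:
            request = conn.recv(1024)              # server receives the client's request
            conn.sendall(b"reply to " + request)   # response (server side)

threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)                                    # give the server time to start listening

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect((HOST, PORT))
    cli.sendall(b"hello")                          # request (client side)
    print(cli.recv(1024))                          # accept (client side): the reply arrives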

19. Identify two major models for client design in client/server model.

The two major models for client design are: Thin-client model and Fat-
client model.

53
Thin-client model: In this model, the load of data processing and
transformation is put on the server side, and the client has a light
implementation that is mostly concerned with retrieving and returning the
data it is being asked for, with no considerable further processing.

Fat-client model: In this model, the client component is also responsible


for processing and transforming the data before returning it to the user,
whereas the server features a relatively light implementation that is mostly
concerned with the management of access to the data.

20. What are the major components in client/ server model?

The three major components in the client-server model:

Presentation

Application logic

Data storage

21. Define Interprocess Communication [IPC].

Distributed systems are composed of a collection of concurrent processes


interacting with each other by means of a network connection. Therefore, IPC
is a fundamental aspect of distributed systems design and implementation. IPC
is used to either exchange data and information or coordinate the activity of
processes. IPC is what ties together the different components of a distributed
system, thus making them act as a single system.

22. Identify the models for message based communication.

a. Point-to-point message model
   Sub-categories of the point-to-point communication model:
    direct communication
    queue-based communication

b. Publish-and-subscribe message model (see the sketch after this list)
   Two major roles: the publisher and the subscriber.
   Two major strategies for dispatching the event to the subscribers:
    push strategy and pull strategy

c. Request-reply message model
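As a simple illustration (beyond the scope of a short answer), the Python sketch below
models the publish-and-subscribe model with the push strategy: the publisher keeps a
list of subscriber callbacks and pushes each event to all of them. The class name and
event string are assumptions for the example.

# Minimal sketch of the publish-and-subscribe model using the push strategy.
# Class and event names are illustrative assumptions.

class Publisher:
    def __init__(self):
        self.subscribers = []                 # registered subscriber callbacks

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        for callback in self.subscribers:     # push strategy: the publisher
            callback(event)                   # notifies every subscriber itself

broker = Publisher()
broker.subscribe(lambda e: print("subscriber 1 got:", e))
broker.subscribe(lambda e: print("subscriber 2 got:", e))
broker.publish("new-episode-released")        # both subscribers are notified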

54
23.What are the characteristics of cloud computing?

The five essential characteristics of Cloud computing according to NIST are

On-demand self-service

Broad network access

Resource pooling

Rapid elasticity

Measured service

24.Define elasticity in cloud computing.

Cloud elasticity is a system’s ability to dynamically manage available resources
according to the current workload requirements. Elasticity is a vital feature of
cloud infrastructure.

25.Define scalability.

Cloud scalability is the ability of the system’s infrastructure to handle growing
workload requirements while retaining consistent, adequate performance.

Unlike cloud elasticity, which is short-term, on-the-fly resource allocation,
scalability is part of the infrastructure design.

There are several types of cloud scalability:

Vertical Scaling or Scale-Up

Horizontal Scaling or Scale-Out

Diagonal scalability

55
Unit 1 - Part B - Questions: (CO1, K2)

1. Explain in detail about the evolution of cloud computing.

2. What are the underlying principles of parallel and distributed computing?

3. Discuss the elements of parallel computing in detail.

4. Illustrate the hardware architectures for parallel processing.

5. Discuss the elements of distributed computing.

6. Elaborate the architectural styles for distributed computing.

7. Describe the technologies for distributed computing.

8. Explain the characteristics of Cloud computing.

56
Supportive Online
Courses

57
Relevant Online Courses

1. Cloud Computing (NPTEL / Swayam)
   https://nptel.ac.in/courses/106/105/106105167/#

2. Cloud Computing Applications, Part 1: Cloud Systems and Infrastructure (Coursera)
   https://www.coursera.org/learn/cloud-applications-part1?specialization=cloud-computing#syllabus

3. Projects in Cloud Computing (Udemy)
   https://www.udemy.com/course/projects-in-cloud-computing/

4. Google Cloud Computing Foundations Course (NPTEL / Swayam)
   https://swayam.gov.in/nd1_noc20_cs55/preview#:~:text=The%20Google%20Cloud%20Computing%20Foundations,Google%20Cloud%20Platform%20fits%20in.

5. Fast Track to Cloud Comprehension -- IaaS, PaaS, SaaS on AWS, Azure, Google and Co (Experfy)
   https://www.experfy.com/training/courses/fast-track-to-cloud-comprehension-iaas-paas-saas-on-aws-azure-google-and-co?code=CLOUD50

58
Real-time
Applications

59
Real world Examples of Cloud
Cloud Storage:
Dropbox
Gmail
Facebook

Marketing:
Maropost
Hubspot
Adobe Marketing Cloud

Education:
SlideRocket
Ratatype
Amazon Web Services

Healthcare:
ClearData
Dell’s Secure Healthcare Cloud
IBM Cloud

60
Real Life Analogies
What would you prefer: to buy a car or to rent one?
Buy Your Own Car
Buying a car is a big investment, and there are a lot of important decisions to take
into account. Some people like all the different options, and others don’t want to
bother with thousands of decisions. When buying a car you have full control over
everything: its make and model, cost, interior, etc. Additionally, you’ve got to worry
about taxes, insurance, inspections, and all sorts of maintenance. You’ve got the
control, but it comes with a hassle.
Renting a Car
Then how about renting a car? You have fewer and simpler decisions to make.
You just need to select a car from what’s available, and you can switch your car if
something comes up.
Rent when you need; pay when you use. You don’t have to worry about
maintenance costs, tax, and insurance since they are included in your rental fee.
On the other hand, there are obviously some disadvantages. You’re limited by
what’s available from the rental vendor, you may not be allowed to customize the
car, and the car is not dedicated to you all the time.
Translating the Analogy to Cloud Computing
This simple real life analogy is easily translatable to Cloud Computing.
Buying your own car is similar to setting up your own on-premise data center. You
have the flexibility to customize whatever you like, starting from physical
infrastructure, the security system, hardware and software, etc. However, you
also have to invest a lot of money upfront, and you will also need to manage
it later when it’s operating.
On the other hand, instead of building your own data center, you can rent
computation power and storage from the cloud provider. You can scale in and out
when necessary. Just pay when you use. No specific commitment takes place. You
can start and stop anytime.

61
Contents Beyond
Syllabus

62
1. Serverless Computing
Serverless architecture (also known as serverless computing or function as a
service, FaaS) is a software design pattern where applications are hosted by a third-
party service, eliminating the need for server software and hardware management
by the developer. Applications are broken up into individual functions that can be
invoked and scaled individually.

Why Serverless Architecture?

Hosting a software application on the internet usually involves managing some


kind of server infrastructure. Typically this means a virtual or physical server that
needs to be managed, as well as the operating system and other web server
hosting processes required for your application to run. Using a virtual server from
a cloud provider such as Amazon or Microsoft does mean the elimination of the
physical hardware concerns, but still requires some level of management of the
operating system and the web server software processes.

With a serverless architecture, you focus purely on the individual functions in your
application code. Services such as Twilio Functions, AWS Lambda and Microsoft
Azure Functions take care of all the physical hardware, virtual machine operating
system, and web server software management. You only need to worry about your
code.
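For a concrete (illustrative) picture, a serverless function is typically just a handler
that the platform invokes once per event. The sketch below follows the AWS Lambda
Python handler convention; the event field ("name") and the response shape are
assumptions for the example.

# Minimal sketch of a serverless (FaaS) function in the AWS Lambda Python style.
# The event field ("name") and the response shape are assumptions; the platform
# provisions, runs and scales this function for each invocation.
import json

def lambda_handler(event, context):
    name = event.get("name", "world")      # read input from the triggering event
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Local test call; on the real platform, the cloud provider invokes the handler.
if __name__ == "__main__":
    print(lambda_handler({"name": "RMKEC"}, None))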

How does Serverless or FaaS Differ from PaaS?

PaaS, or Platform as a Service, products such as Heroku, Azure Web


Apps and AWS Elastic Beanstalk offer many of the same benefits as Serverless
(sometimes called Function as a Service or FaaS). They do eliminate the need for
management of server hardware and software. The primary difference is in the
way you compose and deploy your application, and therefore the scalability of
your application.
With PaaS, your application is deployed as a single unit and is developed in the
traditional way using some kind of web framework like ASP.NET, Flask, Ruby on
Rails, Java Servlets, etc. Scaling is only done at the entire application level. You
can decide to run multiple instances of your application to handle additional load.

63
ASSESSMENT SCHEDULE
Tentative schedule for the Assessment During 2022-2023 odd
semester

S.No   Name of the Assessment   Scheduled Date   Portion

1      Unit Test 1              26.08.2022       UNIT 1
2      IAT 1                    19.09.2022       UNIT 1 & 2
3      Unit Test 2              12.10.2022       UNIT 3
4      IAT 2                    04.11.2022       UNIT 3 & 4
5      Revision                 28.11.2022       ALL 5 UNITS
6      Model                    05.12.2022       ALL 5 UNITS

64
Prescribed Text Books
& References

65
Text Books and References
TEXT BOOKS:

T1: Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, "Distributed and Cloud
Computing, From Parallel Processing to the Internet of Things", Morgan
Kaufmann Publishers, 2012.

T2: Rittinghouse, John W., and James F. Ransome, "Cloud Computing:
Implementation, Management and Security", CRC Press, 2017.

REFERENCES:

R1: Rajkumar Buyya, Christian Vecchiola, S. Thamarai Selvi, "Mastering
Cloud Computing", Tata McGraw Hill, 2013.

R2: Toby Velte, Anthony Velte, Robert Elsenpeter, "Cloud Computing - A
Practical Approach", Tata McGraw Hill, 2009.

R3: George Reese, "Cloud Application Architectures: Building Applications


and Infrastructure in the Cloud: Transactional Systems for EC2 and Beyond
(Theory in Practice)‖, O'Reilly, 2009.

66
Miniproject
Suggestions

67
Mini Project Ideas
1. Design a dynamic website about yourself, showcasing your talents, skill sets,
interests, your opinion on current affairs, and hobbies, and host it using the
AWS cloud.

2. Implement a collaborative workspace for communication among your class
students using any cloud technology (more like a Technical Forum / Virtual
Community).

3. Design a model secure health cloud for a hospital near your residence.

4. Implement a secure Framework for Digital Study Material sharing System


for your college.

5. Conduct a survey on Pre-emption-aware Energy Management in Virtualized


Cloud Centres.

6. Conduct a survey on the methods of Auditing in Cloud systems.

7. Showcase a data leakage detection implementation in any cloud application


of your choice.

8. Can you implement a simple application which integrates Big Data, Machine
Learning concepts and deploy it in cloud platform?

9. Design a simple cloud based application to implement Car Pooling.

10. Can you think of any solution for COVID-19 (data gathering, data
management) using cloud?

68
Thank you

Disclaimer:

This document is confidential and intended solely for the educational purpose of RMK Group
of Educational Institutions. If you have received this document through email in error,
please notify the system manager. This document contains proprietary information and is
intended only to the respective group / learning community as intended. If you are not the
addressee you should not disseminate, distribute or copy through e-mail. Please notify the
sender immediately by e-mail if you have received this document by mistake and delete this
document from your system. If you are not the intended recipient you are notified that
disclosing, copying, distributing or taking any action in reliance on the contents of this
information is strictly prohibited.

69
