CC Unit 1


TECHNOLOGY FOR NETWORK BASED SYSTEMS

Technologies for network-based systems in cloud computing refer to a comprehensive
suite of hardware, software, and networking tools and frameworks designed to enable and
optimize distributed computing environments. These technologies support the efficient
handling of massive parallelism, allowing multiple computing resources to work together
seamlessly to perform large-scale computations, store data, and manage network
communications. The goal is to create scalable, reliable, and high-performance distributed
operating systems and applications that can dynamically allocate resources to meet varying
demands.
1.2.1 Multicore CPUs and Multithreading Technologies
 The development of component and network technologies over the last 30 years has
been vital for high-performance computing (HPC) and high-throughput computing
(HTC) systems.
 The advancements in processor speed (measured in millions of instructions per second
or MIPS) and network bandwidth (measured in megabits per second or Mbps, and
gigabits per second or Gbps) have been significant.
 For example, 1 Gbps Ethernet bandwidth is denoted as 1 GE.

1.2.1.1 Advances in CPU Processors


o Modern CPUs, or microprocessor chips, now commonly feature a multicore
architecture, which means they have multiple processing cores. These can range
from dual-core (two cores) to quad-core (four cores), six cores, or even more.
This design allows processors to perform parallel processing, utilizing both
Instruction-Level Parallelism (ILP) and Thread-Level Parallelism (TLP).
o Processor Speed Growth: Over the years, processor speeds have dramatically
increased. For instance, the VAX 780 in 1978 could process 1 million
instructions per second (MIPS). By 2002, the Intel Pentium 4 reached 1,800
MIPS, and by 2008, the Sun Niagara 2 peaked at 22,000 MIPS. This
progression reflects Moore’s Law, which predicts the doubling of transistors on
a microchip approximately every two years, resulting in performance
improvements.
o Clock rates, which set how many cycles a CPU completes per second, have also
increased from 10 MHz with the Intel 286 to 4 GHz with the Pentium 4.
However, due to power and heat limitations, clock rates have mostly plateaued
around 5 GHz for CMOS-based chips. Higher frequencies generate excessive
heat, making further increases challenging without new technological
advancements.
o Modern CPUs leverage ILP techniques such as multiple-issue superscalar
architecture, dynamic branch prediction, and speculative execution to boost
performance. These techniques require both hardware and software (compiler)
support. In addition, Data Level Parallelism (DLP) and Thread Level
Parallelism (TLP) are extensively utilized in GPUs, which feature hundreds or
even thousands of simpler cores designed for parallel processing.
o Today's multicore CPUs and many-core GPUs can manage multiple instruction
threads concurrently. For example, a typical multicore processor includes
multiple cores, each with its own private L1 cache, and an L2 cache shared
among all cores. Future CPUs may integrate multiple chip multiprocessors
(CMPs) and even an L3 cache on a single chip.
o High-end processors, such as Intel's i7 and Xeon, AMD's Opteron, Sun's
Niagara, IBM's Power 6, and the X Cell processors, all embody multicore and
multithreaded capabilities. Each core can also support multithreading. For
instance, the Niagara II has eight cores, each capable of handling eight threads,
allowing for a maximum of 64 threads simultaneously (8 cores x 8 threads per
core). By 2011, the Intel Core i7 990x achieved a remarkable execution rate of
159,000 MIPS, illustrating the significant advancements in processor
technology.
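The growth figures and the thread count quoted above can be checked with a little arithmetic. The following is a minimal sketch, assuming steady exponential growth between the first and last data points; the function names are illustrative and do not come from the notes.

```python
import math

# MIPS figures quoted in the notes above: (year, processor, MIPS).
MIPS_HISTORY = [
    (1978, "VAX 780", 1),
    (2002, "Intel Pentium 4", 1_800),
    (2008, "Sun Niagara 2", 22_000),
    (2011, "Intel Core i7 990x", 159_000),
]


def doubling_time_years(year_a, mips_a, year_b, mips_b):
    """Years per doubling, assuming steady exponential growth between two points."""
    doublings = math.log2(mips_b / mips_a)
    return (year_b - year_a) / doublings


def hardware_threads(cores, threads_per_core):
    """Total hardware threads a multithreaded multicore chip can run at once."""
    return cores * threads_per_core


first_year, _, first_mips = MIPS_HISTORY[0]
last_year, _, last_mips = MIPS_HISTORY[-1]
overall_doubling = doubling_time_years(first_year, first_mips, last_year, last_mips)
niagara_threads = hardware_threads(cores=8, threads_per_core=8)

print(f"MIPS doubled roughly every {overall_doubling:.2f} years "
      f"from {first_year} to {last_year}")
print(f"Niagara II hardware threads: {niagara_threads}")
```

Under these assumptions the doubling period comes out a little under two years, which is in line with the Moore's Law pace mentioned above.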
1.2.1.2 Multicore CPU and Many-Core GPU Architectures
o Multicore CPUs are expected to continue evolving, potentially expanding from
the current dozens of cores to hundreds or even more. However, CPUs face
limitations in processing massive amounts of data simultaneously (massive
Data-Level Parallelism, or DLP) due to challenges like the memory wall
problem, where the speed of memory access cannot keep up with CPU speeds.
This has led to the development of many-core GPUs, which have hundreds or
even thousands of simpler cores designed to handle large-scale parallel
processing more efficiently.
o Commercial CPUs incorporate both 32-bit (IA-32) and 64-bit (IA-64)
instruction sets. Recently, x86 processors have been enhanced to support high-
performance computing (HPC) and high-throughput computing (HTC) in high-
end servers. Many RISC (Reduced Instruction Set Computing) processors are
being replaced by multicore x86 processors and many-core GPUs in the world's
top supercomputers. This trend suggests that x86 processors will continue to be
dominant in data centers and supercomputers.
o GPUs have also been increasingly used in large clusters to build supercomputers
based on massively parallel processing (MPP) architectures. In the future, the
industry is likely to focus on developing asymmetric or heterogeneous
multiprocessors that combine both powerful CPU cores and numerous GPU
cores on the same chip. This approach aims to leverage the strengths of both
types of cores for improved performance and efficiency.

(Some additional points related to the above topic follow; they are optional
reading.)
Additional Related Points
Memory Wall Problem: The term "memory wall" refers to the growing disparity
between CPU speeds and memory access times. While CPU processing power
has increased significantly, memory speed improvements have lagged behind,
creating a bottleneck that limits overall system performance.

Heterogeneous Computing: Heterogeneous computing involves using different
types of processors within a single system, such as combining CPUs and GPUs.
This allows systems to efficiently handle a wider range of tasks, leveraging
CPUs for complex sequential processing and GPUs for parallel processing
tasks.

High-Performance and High-Throughput Computing: High-Performance
Computing (HPC) focuses on achieving the highest possible performance for
individual tasks, often using supercomputers. High-Throughput Computing
(HTC), on the other hand, aims to process a large number of tasks over a longer
period, typically seen in data center environments.
Advancements in Instruction Sets: The x86 instruction set, originally designed
for personal computers, has been extended to meet the demands of HPC and
HTC environments. These advancements include support for larger memory
addresses, more complex instructions, and enhanced parallel processing
capabilities.

Supercomputer Clusters: Modern supercomputers often consist of large clusters
of interconnected processors. These clusters use GPUs to accelerate tasks that
require massive parallel processing, such as scientific simulations, big data
analysis, and artificial intelligence applications.

Future Trends: The trend towards heterogeneous computing and the integration
of various types of processors on a single chip is expected to continue. This
approach not only improves performance but also enhances energy efficiency,
making it suitable for both large-scale data centers and edge computing devices.

1.2.1.3 Multithreading Technology

o Consider Figure 1.6, which illustrates how five independent threads of
instructions are dispatched to four pipelined data paths (functional units)
in five different types of processors. These processors are:
- A four-issue superscalar processor
- A fine-grain multithreaded processor
- A coarse-grain multithreaded processor
- A two-core chip multiprocessor (CMP)
- A simultaneous multithreaded (SMT) processor
o Let's break down each type:
o Superscalar Processor: This processor can issue four instructions per
cycle but only from a single thread at a time. It's single-threaded with
four functional units. It processes instructions in parallel but only from
the same thread.
o Fine-Grain Multithreaded Processor: This processor switches between
threads every cycle. It has four functional data paths and can issue
instructions from different threads in quick succession, improving
resource utilization.
o Coarse-Grain Multithreaded Processor: This processor runs multiple
instructions from the same thread for several cycles before switching to
another thread. This means it executes a block of instructions from one
thread, then moves on to the next thread.
o Two-Core CMP (Chip Multiprocessor): This processor has two cores,
each functioning as a two-way superscalar processor. Each core can issue
instructions from its own thread independently, allowing two threads to
be processed simultaneously.
o Simultaneous Multithreaded (SMT) Processor: This advanced
processor can issue instructions from multiple threads at the same time
within the same cycle. It maximizes resource utilization by interleaving
instructions from different threads.

o In these diagrams, instructions from different threads are shown with
different shading patterns. Here’s what you need to understand:
o Superscalar Processor: Only processes instructions from the same
thread at any given time, hence it's limited by the instructions available
from that single thread.
o Fine-Grain Multithreading: Quickly switches between threads every
cycle, ensuring high utilization of the functional units by rapidly
interleaving instructions.
o Coarse-Grain Multithreading: Sticks with one thread for a while before
switching, leading to a block-wise execution of instructions from one
thread at a time.
o Multicore CMP: Executes instructions from different threads
independently across multiple cores, effectively handling more threads at
the same time.
o Simultaneous Multithreading (SMT): Can handle multiple threads
concurrently, issuing instructions from various threads simultaneously in
a single cycle, thus maximizing the use of available resources.
o The blank squares in the diagrams represent cycles where no instructions
are available for execution in a particular functional unit, indicating lower
scheduling efficiency. Achieving the maximum instruction-level
parallelism (ILP) or thread-level parallelism (TLP) in each cycle is
challenging. The key takeaway is to understand how different processor
architectures manage the scheduling and execution of instructions from
multiple threads to optimize performance and resource utilization.
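The five scheduling styles described above can be compared with a toy model of issue-slot usage. This is only a sketch under strong assumptions: the per-cycle "ready instruction" counts are made up, instructions that are not issued in a cycle are not carried over, and the policy functions are hypothetical simplifications rather than a model of any real processor.

```python
# Toy model of issue-slot usage; all workload numbers are invented for illustration.
ISSUE_WIDTH = 4  # four functional units, as in the figure described above

# READY[t][c] = instructions thread t could issue in cycle c (hypothetical data).
READY = [
    [2, 0, 3, 1, 0, 2],
    [1, 2, 0, 4, 1, 0],
    [0, 3, 1, 0, 2, 2],
    [3, 1, 0, 2, 0, 1],
    [1, 0, 2, 1, 3, 0],
]
NUM_THREADS = len(READY)
NUM_CYCLES = len(READY[0])


def superscalar(cycle):
    # Single-threaded: only thread 0 can fill the four slots.
    return min(READY[0][cycle], ISSUE_WIDTH)


def fine_grain(cycle):
    # Switch to the next thread every cycle (round-robin).
    return min(READY[cycle % NUM_THREADS][cycle], ISSUE_WIDTH)


def coarse_grain(cycle, block=2):
    # Stay on one thread for `block` cycles before switching.
    thread = (cycle // block) % NUM_THREADS
    return min(READY[thread][cycle], ISSUE_WIDTH)


def two_core_cmp(cycle):
    # Two cores, each a two-way superscalar bound to its own thread.
    return min(READY[0][cycle], 2) + min(READY[1][cycle], 2)


def smt(cycle):
    # Any thread may fill any slot within the same cycle.
    total_ready = sum(READY[t][cycle] for t in range(NUM_THREADS))
    return min(total_ready, ISSUE_WIDTH)


def utilization(policy):
    """Fraction of issue slots filled over the whole run."""
    issued = sum(policy(c) for c in range(NUM_CYCLES))
    return issued / (ISSUE_WIDTH * NUM_CYCLES)


POLICIES = {
    "superscalar": superscalar,
    "fine-grain": fine_grain,
    "coarse-grain": coarse_grain,
    "two-core CMP": two_core_cmp,
    "SMT": smt,
}
RESULTS = {name: utilization(policy) for name, policy in POLICIES.items()}

for name, value in RESULTS.items():
    print(f"{name:<14} {value:.0%} of issue slots used")
```

In this model SMT can never use fewer slots than the other policies, because it draws from every thread in every cycle. The unfilled slots of the other policies correspond to the blank squares in the diagrams.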

1.2.2 GPU Computing to Exascale and Beyond


 A GPU, or Graphics Processing Unit, is a specialized processor originally designed to
handle graphics and video tasks, freeing up the CPU from these intensive chores. It is
mounted on a computer’s graphics card or video card. NVIDIA introduced the first
GPU, the GeForce 256, in 1999. This GPU could process at least 10 million polygons
per second. GPUs are now used in almost all modern computers, and some
GPU features have even been integrated into certain CPUs over time.
 Unlike traditional CPUs, which typically have only a few cores (for example, the
six-core Xeon X5670), modern GPUs are equipped with hundreds of cores. This design
allows GPUs to handle many tasks simultaneously, making them much more efficient for
parallel processing. GPUs use a throughput architecture: they execute many operations
concurrently, each at a slower pace, instead of completing a single task very quickly
as a CPU does.
 Recently, the use of GPUs, especially in clusters, has gained popularity for tasks
requiring high parallelism, outperforming CPUs that struggle with such tasks. This has
led to the rise of general-purpose computing on GPUs (GPGPU). NVIDIA’s
CUDA model is a significant development in this area, tailored for high-performance
computing (HPC) using GPGPUs.

Additional Points:
 GPUs in Modern Applications: GPUs are not only used for graphics but also for
scientific simulations, artificial intelligence, and machine learning, where their ability
to perform many calculations at once is invaluable.
 Power Efficiency: GPUs are designed to deliver high performance with better power
efficiency compared to CPUs, making them ideal for large-scale data centers and
supercomputers.
 Programming Models: Apart from CUDA, other programming models like OpenCL
(Open Computing Language) also enable developers to utilize GPU power for a wide
range of applications.
 GPU Evolution: Over the years, GPUs have evolved from simple graphics
accelerators to powerful processors capable of handling diverse and complex
computational tasks, contributing significantly to advancements in various fields such
as deep learning and data analytics.
 Future Trends: As technology progresses, we can expect further improvements in
GPU architecture, increasing their capabilities and efficiency, pushing the boundaries
of what can be achieved in exascale computing.

1.2.2.1 How GPUs Work


o Initially, GPUs functioned as coprocessors that worked alongside the CPU
to manage graphics tasks. Modern GPUs, such as those from NVIDIA, have
evolved significantly and can now contain up to 128 cores on a single chip.
Each core can handle eight threads at once, allowing a single GPU to manage
up to 1,024 threads simultaneously. This represents a substantial level of parallel
processing power, especially when compared to a conventional CPU, which can
only manage a few threads at a time.
o GPUs are designed differently from CPUs. While CPUs are optimized for
minimizing latency (delays in processing), GPUs are optimized for high
throughput, meaning they can process a large number of operations in parallel.
This is achieved through explicit management of on-chip memory. Modern
GPUs are no longer limited to just graphics and video tasks; they are also used
in high-performance computing (HPC) systems to perform extensive parallel
processing, making them crucial for powering supercomputers.
o GPUs excel at handling large numbers of floating-point calculations
simultaneously, offloading these data-intensive tasks from the CPU. This frees
up the CPU to manage other tasks more efficiently. Conventional GPUs are
found in a wide range of devices, including mobile phones, game consoles,
embedded systems, personal computers, and servers.
o NVIDIA's CUDA-capable GPUs, such as Tesla or Fermi, are specifically
designed for HPC and are used in GPU clusters to handle massive amounts of
parallel processing, particularly for floating-point operations.
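The latency-versus-throughput trade-off described in this subsection can be illustrated with a rough timing model. This is a sketch only: the per-operation times and the CPU lane count are hypothetical values chosen for illustration, and only the 128 cores x 8 threads figure comes from the text above.

```python
import math


def batch_time(num_ops, lanes, seconds_per_op):
    """Time to finish num_ops independent operations on `lanes` parallel lanes."""
    rounds = math.ceil(num_ops / lanes)
    return rounds * seconds_per_op


# Hypothetical latency-optimized CPU: few lanes, each operation finishes quickly.
CPU_LANES, CPU_SECONDS_PER_OP = 6, 1e-9
# Throughput-optimized GPU: 128 cores x 8 threads (from the text), each slower.
GPU_LANES, GPU_SECONDS_PER_OP = 128 * 8, 10e-9

for num_ops in (1, 100, 1_000_000):
    cpu = batch_time(num_ops, CPU_LANES, CPU_SECONDS_PER_OP)
    gpu = batch_time(num_ops, GPU_LANES, GPU_SECONDS_PER_OP)
    faster = "CPU" if cpu < gpu else "GPU"
    print(f"{num_ops:>9} ops: CPU {cpu:.2e} s, GPU {gpu:.2e} s -> {faster} finishes first")
```

With these made-up numbers the CPU wins on a single operation, while the GPU wins once there is enough independent work to keep its many threads busy.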

Additional Points:
Parallel Processing: The ability of GPUs to handle many tasks at the same time
makes them ideal for applications that require high levels of parallelism, such as
machine learning, scientific simulations, and data analysis.
Efficiency: By offloading data-intensive calculations to the GPU, systems can
achieve higher efficiency and performance, as the CPU is relieved from these
demanding tasks.
Programming Models: To leverage the full power of GPUs, developers use
programming models like CUDA (Compute Unified Device Architecture) and
OpenCL (Open Computing Language) which provide tools for managing
parallel execution.
Integration in Diverse Devices: The versatility of GPUs allows them to be
integrated into various devices beyond traditional computers, enhancing the
processing capabilities of mobile phones, gaming consoles, and other embedded
systems.
Future Developments: As GPU technology continues to advance, we can expect
even greater improvements in their processing power, efficiency, and
applications, further pushing the capabilities of HPC and other computationally
intensive fields.
1.2.2.2 GPU Programming Model

o Figure 1.7 illustrates how a CPU and GPU work together to perform parallel
processing of floating-point operations. The CPU, which is a conventional
multicore processor, has limited parallel processing capabilities. In contrast, the
GPU has a many-core architecture with hundreds of simple processing cores
organized into multiprocessors, each capable of running one or more threads.
o In this setup, the CPU offloads most of its floating-point computation tasks to
the GPU. The CPU tells the GPU to handle massive data processing tasks,
leveraging the GPU's parallel processing power. To ensure efficient
performance, the data transfer speed between the main memory (on the
motherboard) and the GPU's on-chip memory must be well-matched. This
interaction is managed using NVIDIA's CUDA programming model,
particularly with GPUs like the GeForce 8800, Tesla, and Fermi.
o Looking ahead, GPUs with thousands of cores could be used in Exascale
computing systems, which are capable of performing 10^18 floating-point
operations per second (flops). This represents a move towards hybrid systems
that combine both CPUs and GPUs to maximize processing power. A DARPA
report from September 2008 identifies four major challenges for exascale
computing: energy and power efficiency, memory and storage management,
handling concurrency and locality (i.e., managing many operations at once and
ensuring data is processed close to where it's stored), and ensuring system
resiliency (i.e., reliability and fault tolerance). This highlights the progress being
made in both GPU and CPU technologies in terms of power efficiency,
performance, and programmability.
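The point about matching data transfer speed to GPU compute speed can be made concrete with a simple cost model of offloading. This is a minimal sketch, not CUDA code: the rates below are hypothetical placeholders, and the model assumes transfer and computation do not overlap.

```python
EXAFLOPS = 10**18  # floating-point operations per second for an exascale system

# Hypothetical device characteristics, chosen only for illustration.
CPU_FLOPS_PER_S = 50e9    # CPU floating-point rate
GPU_FLOPS_PER_S = 500e9   # GPU floating-point rate
LINK_BYTES_PER_S = 8e9    # main memory <-> GPU memory transfer rate


def cpu_only_seconds(flops):
    """Time if the CPU does all the floating-point work itself."""
    return flops / CPU_FLOPS_PER_S


def offload_seconds(flops, bytes_moved):
    """Time if the CPU ships the data to the GPU and the GPU computes."""
    transfer = bytes_moved / LINK_BYTES_PER_S
    compute = flops / GPU_FLOPS_PER_S
    return transfer + compute


def offload_pays_off(flops, bytes_moved):
    return offload_seconds(flops, bytes_moved) < cpu_only_seconds(flops)


# Compute-heavy job: many operations per byte moved.
print("compute-heavy:", offload_pays_off(flops=1e12, bytes_moved=1e9))
# Data-heavy job: few operations per byte moved, so the transfer dominates.
print("data-heavy:   ", offload_pays_off(flops=1e9, bytes_moved=1e9))
```

The design choice here is to charge the full transfer time to the offloaded job, which shows why a slow link between main memory and GPU memory can cancel out the GPU's compute advantage.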
Additional Points:
Parallel Execution: GPUs excel in parallel execution, handling many tasks at
once, which is ideal for applications like scientific simulations, machine
learning, and real-time data processing.
Hybrid Systems: The future of computing may involve hybrid systems that use
both CPUs and GPUs, leveraging the strengths of each to achieve optimal
performance.
Programming Tools: CUDA (Compute Unified Device Architecture) is a key
tool for programming GPUs, enabling developers to write software that can take
full advantage of GPU parallelism.
Challenges and Innovations: Addressing the challenges of energy consumption,
memory management, concurrency, and system reliability is crucial for the
development of future exascale computing systems.
Trend Towards Efficiency: Both CPUs and GPUs are becoming more power-
efficient and capable, reflecting ongoing advancements in technology to meet
the demands of high-performance computing.
1.2.2.3 Power Efficiency of the GPU
o Bill Dally from Stanford University highlights that the key advantages of GPUs
over CPUs for future computing are their power efficiency and massive
parallelism. According to estimates, running an exaflops (10^18 floating-point
operations per second) system would require achieving 60 Gflops (billion
floating-point operations per second) per watt per core. Power limitations
dictate what can be included in CPU or GPU chips.
o Dally's research indicates that a CPU chip consumes about 2 nanojoules per
instruction, while a GPU chip uses only 200 picojoules per instruction, making
GPUs ten times more efficient in terms of power. CPUs are designed to
minimize delays (latency) in accessing caches and memory, whereas GPUs are
designed to maximize data processing throughput by efficiently managing on-
chip memory.
o In 2010, the performance-to-power ratio for GPUs was 5 Gflops per watt per
core, compared to less than 1 Gflop per watt per CPU core. This significant
difference suggests that GPUs are more efficient, but scaling up future
supercomputers may still pose challenges. Nonetheless, GPUs could narrow the
performance gap with CPUs over time.
o A major factor in power consumption is data movement. Therefore, optimizing
the storage hierarchy and tailoring memory to specific applications is crucial.
Additionally, developing self-aware operating systems, runtime support, and
locality-aware compilers and auto-tuners for GPU-based massively parallel
processors (MPPs) is essential. This means that power efficiency and software
development are the primary challenges for future parallel and distributed
computing systems.
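The energy and efficiency figures in this subsection lend themselves to a quick calculation. The sketch below uses the numbers quoted above; as a simplifying assumption, it treats the Gflops-per-watt figures as if they applied to the whole system, which the notes do not state.

```python
CPU_JOULES_PER_INSTRUCTION = 2e-9      # 2 nanojoules, as quoted above
GPU_JOULES_PER_INSTRUCTION = 200e-12   # 200 picojoules, as quoted above
EXAFLOPS = 1e18                        # 10^18 floating-point operations per second


def energy_ratio():
    """How many times more energy a CPU instruction costs than a GPU one."""
    return CPU_JOULES_PER_INSTRUCTION / GPU_JOULES_PER_INSTRUCTION


def power_watts(target_flops, gflops_per_watt):
    """Power needed to sustain target_flops at a given efficiency.

    Assumption: the efficiency figure is applied system-wide.
    """
    return target_flops / (gflops_per_watt * 1e9)


print(f"CPU/GPU energy per instruction: {energy_ratio():.1f}x")
for label, efficiency in (("2010 GPU level", 5), ("target level", 60)):
    megawatts = power_watts(EXAFLOPS, efficiency) / 1e6
    print(f"Exaflops at {efficiency} Gflops/W ({label}): {megawatts:.1f} MW")
```

Under that assumption, the gap between the 2010 efficiency level and the target level is a factor of twelve in power, which is why power limits dominate exascale design.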

Additional Points:
Advantages of GPUs: GPUs are significantly more power-efficient than CPUs,
making them suitable for high-performance computing tasks that require
massive parallelism.
Power Consumption: Managing power consumption is critical, especially in
large-scale systems. Efficient data movement and memory management are key
to reducing power usage.
Optimizing Software: Developing advanced software tools, such as self-aware
operating systems and locality-aware compilers, is necessary to fully utilize the
power of GPUs in parallel computing environments.
Future Challenges: As we aim for exascale computing, overcoming the
challenges of power efficiency and software optimization will be crucial to
building more powerful and efficient supercomputers.
Trends and Innovations: Continuous advancements in GPU technology and
software development will play a pivotal role in the evolution of parallel and
distributed computing systems.
1.2.3 Memory, Storage, and Wide-Area Networking
1.2.3.1 Memory Technology
o Figure 1.10 illustrates the growth of DRAM chip capacity, which increased
from 16KB in 1976 to 64GB in 2011. This indicates that memory chips have
roughly quadrupled in capacity every three years. However, the speed at which
memory can be accessed hasn't improved significantly, leading to a worsening
"memory wall" problem, where the gap between fast processors and slower
memory access times hinders overall performance.
o For hard drives, capacity has seen substantial growth as well, increasing from
260MB in 1981 to 250GB in 2004, and reaching 3TB with the Seagate
Barracuda XT in 2011. This equates to a tenfold increase in capacity
approximately every eight years. Disk arrays, which combine multiple hard
drives, are expected to see even greater capacity increases in the future.
o The increasing speed of processors and the growing capacity of memory are
creating a larger disparity between processor performance and memory access
speed. This memory wall problem could become more severe, potentially
limiting CPU performance further in the future.
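The growth rates stated above follow from the endpoint figures. A minimal sketch, assuming smooth exponential growth between the first and last data points and binary units (1 KB = 1024 bytes); the helper name is illustrative.

```python
import math

KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40  # binary units (an assumption)


def years_per_factor(start_year, start_bytes, end_year, end_bytes, factor):
    """Average number of years it took capacity to grow by `factor`."""
    steps = math.log(end_bytes / start_bytes, factor)
    return (end_year - start_year) / steps


# DRAM chips: 16 KB in 1976 to 64 GB in 2011.
dram_years_per_4x = years_per_factor(1976, 16 * KB, 2011, 64 * GB, factor=4)
# Hard drives: 260 MB in 1981 to 3 TB in 2011.
disk_years_per_10x = years_per_factor(1981, 260 * MB, 2011, 3 * TB, factor=10)

print(f"DRAM capacity grew 4x about every {dram_years_per_4x:.1f} years")
print(f"Disk capacity grew 10x about every {disk_years_per_10x:.1f} years")
```

Both results land close to the "4x every three years" and "10x every eight years" rules of thumb given in the text.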
Additional Points:
Memory Capacity Growth: The rapid increase in DRAM capacity has
significantly expanded the amount of data that can be stored and accessed
quickly by computers.
Memory Access Time: Despite the increase in memory capacity, the speed at
which data can be accessed from memory has not kept pace with processor
speeds, leading to performance bottlenecks.
Hard Drive Capacity Growth: Hard drive capacities have grown exponentially,
allowing for vast amounts of data to be stored on a single drive. Disk arrays
further amplify this capacity.
Future Challenges: As processors continue to get faster, the gap between their
speed and the slower memory access times (memory wall) presents a significant
challenge for future computer performance.
Solutions and Innovations: Addressing the memory wall problem may require
new technologies or architectures, such as faster memory technologies,
improved data caching strategies, or alternative storage solutions like solid-state
drives (SSDs).
1.2.3.2 Disks and Storage Technology
o Since 2011, disk drives and storage systems have continued to grow in capacity,
surpassing 3TB. The growth of disk storage is illustrated by the lower curve in
Figure 1.10, showing a remarkable increase of seven orders of magnitude over
33 years. Additionally, the rapid development of flash memory and solid-state
drives (SSDs) has significant implications for both high-performance computing
(HPC) and high-throughput computing (HTC) systems.
o SSDs have become increasingly popular due to their durability. They can
withstand hundreds of thousands to millions of write cycles per block, ensuring
longevity even under heavy usage conditions. This durability, combined with
their impressive speed, makes SSDs suitable for many applications.
o However, challenges such as power consumption, cooling, and packaging
constraints may limit the development of large-scale systems in the future.
Power consumption increases linearly with clock frequency and quadratically
with supply voltage, which creates demand for lower-voltage supplies. Because
clock rates cannot be increased indefinitely, there is a growing need for more
efficient power management solutions.
o A quote from Jim Gray, given during a talk at the University of Southern
California, reflects the evolving landscape of storage technology: "Tape units
are dead, disks are tape units, flashes are disks, and memory are caches now."
This indicates a shift towards faster, more efficient storage solutions. However,
as of 2011, SSDs remain too costly to fully replace traditional disk arrays in the
storage market.
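The power relationship mentioned in this subsection (linear in clock frequency, quadratic in voltage) can be expressed as a small scaling function. This is a sketch of that proportionality only; it ignores static (leakage) power, and the scale factors are hypothetical.

```python
def relative_dynamic_power(frequency_scale, voltage_scale):
    """Power relative to a baseline, using P proportional to f * V^2.

    Both arguments are ratios against the baseline (1.0 means unchanged).
    """
    return frequency_scale * voltage_scale ** 2


# Hypothetical design points, each compared with the same baseline chip.
scenarios = {
    "double the clock, same voltage": (2.0, 1.0),
    "same clock, 20% lower voltage": (1.0, 0.8),
    "half the clock, 20% lower voltage": (0.5, 0.8),
}

for label, (f_scale, v_scale) in scenarios.items():
    power = relative_dynamic_power(f_scale, v_scale)
    print(f"{label}: {power:.2f}x baseline power")
```

Because voltage enters squared, a modest voltage reduction saves more power than the same relative reduction in clock frequency, which is the motivation for lower-voltage supplies.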

Additional Points:
Flash Memory and SSDs: The rise of flash memory and SSDs has
revolutionized storage technology, offering faster speeds and greater durability
compared to traditional hard disk drives.
Longevity of SSDs: SSDs can endure a high number of write cycles per block,
ensuring they remain operational for several years, even under heavy usage.
Challenges of Large-Scale Systems: Power consumption, cooling, and
packaging constraints pose challenges for the development of large-scale
computing systems using SSDs and other storage technologies.
Shift in Storage Paradigm: Jim Gray's quote highlights the transition from
traditional tape-based storage to modern flash-based solutions, indicating a shift
towards faster, more efficient storage technologies.
Cost Considerations: While SSDs offer numerous advantages, including speed
and durability, their higher cost relative to traditional hard drives may hinder
widespread adoption in certain markets.
1.2.3.3 System-Area Interconnects
o In small clusters, the nodes are usually connected using an Ethernet switch or a
local area network (LAN). As depicted in Figure 1.11, a LAN is commonly used
to link client hosts to large servers. A storage area network (SAN) connects
servers to network storage devices like disk arrays, while network-attached
storage (NAS) directly links client hosts to disk arrays. These three types of
networks are often found in large clusters constructed using off-the-shelf
network components. For smaller clusters where distributed storage isn't
needed, a multiport Gigabit Ethernet switch along with copper cables can
connect the end machines. All three network types are readily available for
commercial use.

1.2.3.4 Wide-Area Networking


o The graph in Figure 1.10 illustrates the swift expansion of Ethernet bandwidth,
growing from 10 Mbps in 1979 to 1 Gbps in 1999, and further to 40-100 Gbps
by 2011. At the time, network links capable of 1 Tbps speeds were expected to
become available by 2013. In 2006, various bandwidths ranging from 1 Gbps to 1 Tbps
were reported for different types of connections, such as international, national,
organizational, optical desktop, and copper desktop.
o Reports suggest that network performance has been doubling every year, a rate
faster than the commonly cited Moore's law pace of CPU speed doubling every 18
months. This implies that more computers will be used simultaneously in the future.
High-bandwidth networking enhances the capability to construct massively
distributed systems, accommodating the growing demand for interconnected
computing resources.
o According to the IDC 2010 report, both InfiniBand and Ethernet are expected to
be the primary choices for interconnects in the high-performance computing
(HPC) field. Currently, most data centers utilize Gigabit Ethernet as the
interconnect technology in their server clusters.
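The difference between doubling every year and doubling every 18 months compounds quickly. A minimal sketch of that comparison, assuming both rates hold steadily over the chosen period; the six-year horizon is an arbitrary illustration.

```python
def growth_factor(years, doubling_period_years):
    """Total growth after `years` if capacity doubles every doubling_period_years."""
    return 2 ** (years / doubling_period_years)


YEARS = 6  # arbitrary horizon chosen for illustration
network_growth = growth_factor(YEARS, doubling_period_years=1.0)
cpu_growth = growth_factor(YEARS, doubling_period_years=1.5)

print(f"After {YEARS} years: network x{network_growth:.0f}, CPU x{cpu_growth:.0f}")
print(f"Network gains a further x{network_growth / cpu_growth:.0f} over CPU speed")
```

If the two rates held, the network would pull ahead of a single CPU by a growing margin, which is the reasoning behind expecting more computers to be used together.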

1.2.4 Virtual Machines and Virtualization Middleware


 In a typical computer setup, there's only one operating system (OS) image, which makes
the system inflexible because it tightly links application software to specific hardware.
This means that software designed for one machine might not work on another with
different hardware or OS.
 Virtual machines (VMs) offer a solution to this problem by allowing better utilization of
resources, flexibility in running applications, easier management of software, and
improved security compared to traditional physical machines.
 To create large clusters, grids, or clouds, we often need to access a lot of computing
power, storage, and network resources in a virtualized way. This means pooling these
resources together and presenting them as a single unified system. In cloud computing
especially, resources are provisioned dynamically, and virtualization plays a crucial role
in managing processors, memory, and input/output (I/O) facilities efficiently.
 We'll delve deeper into virtualization in Chapter 3, but it's essential to introduce some
basic concepts first, such as virtual machines, virtual storage, and virtual networking,
along with the software or middleware used for virtualization. Figure 1.12 illustrates three
different configurations of virtual machine architectures.

Additional Points:
 Benefits of Virtualization: Virtual machines offer advantages such as improved
resource utilization, flexibility in running software, simplified software management,
and enhanced security compared to traditional physical machines.
 Cloud Computing: Virtualization is fundamental to cloud computing, where
resources are pooled together and provisioned dynamically as needed.
 Dynamic Resource Allocation: In cloud environments, virtualization enables the
dynamic allocation and management of processors, memory, and I/O resources based
on demand.
 Software and Middleware: Virtualization software or middleware facilitates the
creation and management of virtual machines, virtual storage, and virtual networking,
providing a layer of abstraction between the hardware and the software running on it.
 Scalability: Virtualization allows for easier scaling of computing resources, enabling
organizations to adapt to changing demands and workload fluctuations more
efficiently.

1.2.4.1 Virtual Machines


o In Figure 1.12, the host machine has the physical hardware, like an x86 desktop
running Windows OS, shown in part (a). A virtual machine (VM) can be set up
to work on any hardware system. It's created with virtual resources managed by
a guest OS to run a specific application. Between the VMs and the host, a
middleware layer called a virtual machine monitor (VMM) is needed.
o Figure 1.12(b) illustrates a native VM installed using a VMM called a
hypervisor in privileged mode. For instance, if the hardware is x86-based with
Windows, the guest OS might be Linux, and the hypervisor could be XEN from
Cambridge University. This is known as a bare-metal VM because the hypervisor
directly manages the hardware.
o Another setup is the host VM shown in Figure 1.12(c), where the VMM runs in
nonprivileged mode, requiring no modification to the host OS. There's also a
dual-mode VM, as in Figure 1.12(d), where part of the VMM operates at the
user level and another at the supervisor level, possibly needing some changes to
the host OS.
o Multiple VMs can be moved to a specific hardware system to facilitate
virtualization. The VM approach offers independence for both the OS and
applications from the hardware. Applications running on their dedicated OS can
be bundled as a virtual appliance, which can then run on any hardware platform.
The VM can even operate on an OS different from that of the host computer.

Additional Points:
Virtual Machine Monitor (VMM): This middleware layer manages VMs and
their interaction with the host hardware.
Bare-Metal VM: In this setup, the hypervisor directly handles hardware
resources, offering efficient management.
Host VM: Here, the VMM operates without modifying the host OS, simplifying
setup.
Dual-Mode VM: This setup utilizes both user-level and supervisor-level
components in the VMM, potentially requiring some host OS changes.
Hardware Independence: VMs provide a layer of abstraction, allowing
applications to run independently of the underlying hardware.
Virtual Appliance: Bundling applications with their dedicated OS into virtual
appliances allows for easy deployment across different hardware platforms.
Compatibility: A VM can run a guest OS different from the host OS, enhancing
flexibility and compatibility across diverse computing environments.

1.2.4.2 VM Primitive Operations

o The Virtual Machine Monitor (VMM) provides an abstraction layer to the guest
OS. With full virtualization, the VMM creates a VM abstraction that mimics the
physical machine closely, allowing standard OSes like Windows 2000 or Linux
to run as they would on actual hardware. Mendel Rosenblum outlined low-level
VMM operations, illustrated in Figure 1.13:
- Multiplexing VMs: Several VMs can be multiplexed across hardware
machines, as depicted in Figure 1.13(a).
- Suspending and Storing VMs: A VM can be paused and saved in
stable storage, shown in Figure 1.13(b).
- Resuming or Provisioning VMs: A suspended VM can be resumed or
moved to a new hardware platform, as depicted in Figure 1.13(c).
- Migrating VMs: VMs can be transferred from one hardware platform
to another, as shown in Figure 1.13(d).
o These operations allow VMs to be provisioned on any available hardware
platform, offering flexibility in executing distributed applications. Additionally,
the VM approach improves server resource utilization significantly. Various
server functions can be consolidated on the same hardware platform, increasing
system efficiency. This consolidation helps eliminate server sprawl by
deploying systems as VMs, which effectively utilizes shared hardware
resources. VMware claimed that this approach could raise server utilization
from a typical range of 5–15 percent to 60–80 percent.
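The four primitive operations above can be sketched as a toy model. This is only an illustration: the class and function names (VM, Host, suspend, resume, migrate) and the dictionary standing in for stable storage are hypothetical, and migration is modeled as a suspend followed by a resume on another host, which simplifies what a real VMM does.

```python
from dataclasses import dataclass, field
from enum import Enum


class VMState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


@dataclass
class VM:
    name: str
    state: VMState = VMState.RUNNING


@dataclass
class Host:
    name: str
    vms: dict = field(default_factory=dict)  # VMs multiplexed on this host

    def start(self, vm):
        vm.state = VMState.RUNNING
        self.vms[vm.name] = vm


# Hypothetical stand-in for stable storage that holds suspended VM images.
stable_storage = {}


def suspend(host, vm_name):
    """Pause a VM and move its image from the host to stable storage."""
    vm = host.vms.pop(vm_name)
    vm.state = VMState.SUSPENDED
    stable_storage[vm_name] = vm


def resume(host, vm_name):
    """Resume (or provision) a stored VM on the given host, which may be new."""
    vm = stable_storage.pop(vm_name)
    host.start(vm)


def migrate(src, dst, vm_name):
    """Move a VM between hosts, modeled here as suspend then resume."""
    suspend(src, vm_name)
    resume(dst, vm_name)


host_a, host_b = Host("host-a"), Host("host-b")
host_a.start(VM("web"))
host_a.start(VM("db"))          # two VMs multiplexed on host-a
suspend(host_a, "db")           # db now sits in stable storage
resume(host_b, "db")            # db provisioned on a different host
migrate(host_a, host_b, "web")  # web follows it to host-b
print(sorted(host_b.vms))       # ['db', 'web']
```

Because a VM is just state that can be stored and restarted, any host with spare capacity can take it over, which is what makes consolidation onto fewer machines possible.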
1.2.4.3 Virtual Infrastructures
o In Figure 1.14, the physical resources like computing power, storage, and
networking are depicted at the bottom, while various applications are
represented in virtual machines (VMs) at the top. This separation of hardware
and software defines the virtual infrastructure, which links resources to
distributed applications dynamically. Essentially, it's a flexible allocation of
system resources to specific applications.
o The outcome of virtual infrastructure is reduced costs and improved efficiency
and responsiveness. An example of this is server consolidation and containment
through virtualization. We'll delve deeper into VMs and virtualization support in
Chapter 3. Furthermore, we'll cover virtualization support for clusters, clouds,
and grids in Chapters 3, 4, and 7, respectively.
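As a rough illustration of server consolidation, the utilization figures quoted earlier (5–15 percent before, 60–80 percent after) can be turned into an estimate of how many hosts remain once workloads are packed as VMs. The function name and the example of 100 servers are assumptions made for this sketch, and the model assumes identical machines whose loads add up linearly.

```python
import math


def hosts_after_consolidation(n_servers, util_before_pct, util_target_pct):
    """Estimate hosts needed once workloads are packed as VMs.

    Assumes identical machines and that load adds up linearly,
    which ignores memory, I/O, and peak-demand effects.
    """
    total_load_pct = n_servers * util_before_pct  # summed load, in percent of one server
    return math.ceil(total_load_pct / util_target_pct)


# Hypothetical fleet: 100 servers at 10 percent utilization, 70 percent target.
print(hosts_after_consolidation(100, 10, 70))  # 15
```

Under these assumptions, a fleet of 100 lightly loaded servers shrinks to about 15 hosts, which is the kind of saving that drives the cost reduction described above.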
1.2.5 Data Center Virtualization for Cloud Computing
 In this section, the basic structure and design aspects of data centers are discussed,
specifically for cloud computing. Cloud architecture typically relies on inexpensive
hardware and network devices. Most cloud platforms opt for widely-used x86
processors. Data centers are constructed using cost-effective terabyte disks and
Gigabit Ethernet.
 When designing data centers, the focus is on achieving a balance between
performance and cost. This means prioritizing factors like storage capacity and energy
efficiency rather than just raw speed. Figure 1.13 illustrates the growth of servers and
the breakdown of costs in data centers over the past 15 years. As of 2010,
approximately 43 million servers were in operation worldwide. The cost of
utilities begins to exceed the cost of hardware after three years.
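The three-year crossover can be illustrated with a small break-even calculation. The prices below are hypothetical, chosen only so that the crossover lands at three years; the notes do not give actual hardware or utility costs.

```python
def months_until_utilities_exceed(hardware_cost, monthly_utility_cost):
    """Return the first month in which cumulative utility spending
    exceeds the one-time hardware price."""
    months = 0
    spent = 0
    while spent <= hardware_cost:
        months += 1
        spent += monthly_utility_cost
    return months


# Hypothetical server: 3600 currency units to buy, 100 per month for power and cooling.
print(months_until_utilities_exceed(3600, 100))  # 37, i.e. just after three years
```

The point of the sketch is that a recurring cost, however small per month, eventually dominates a one-time purchase, which is why energy efficiency weighs so heavily in data center design.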

Additional Points:
Commodity Hardware: Cloud data centers typically utilize standard, affordable
hardware components to keep costs down.
Processor Choice: x86 processors are commonly used in cloud architectures due to
their widespread availability and cost-effectiveness.
Storage and Energy Efficiency: Data center design prioritizes efficiency in storage and
energy usage rather than solely focusing on maximizing speed.
Server Growth: The number of servers in operation has been steadily increasing over
the years, reflecting the growing demand for computing resources.
Cost Considerations: Utilities costs become a significant factor in data center
operations, surpassing hardware costs after a certain period.
Long-Term Sustainability: Data center designs aim to balance performance and cost-
effectiveness to ensure sustainable operation over time.
1.2.5.1 Data Center Growth and Cost Breakdown
o Large data centers can be massive structures housing thousands of servers,
while smaller ones typically accommodate hundreds. Over the years, the
expenses associated with building and maintaining data center servers have
risen.
o According to a 2009 IDC report (refer to Figure 1.14), only around 30 percent
of the total data center costs are attributed to purchasing IT equipment like
servers and disks. The majority of costs are allocated to other components: 33
percent for chillers, 18 percent for uninterruptible power supplies (UPS), 9
percent for computer room air conditioning (CRAC), and the remaining 7
percent for power distribution, lighting, and transformer expenses. This
breakdown indicates that a significant portion, about 60 percent, of the overall
data center cost is directed towards management and maintenance.
o While the cost of server purchases has changed little over time, the
expenses related to electricity and cooling increased from 5 percent to
14 percent over a span of 15 years.
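The breakdown above can be checked with a few lines of arithmetic. The shares are the ones quoted in the text; the dictionary keys are just labels chosen for this sketch. Note that the listed shares add up to 97 percent rather than 100, presumably because the IT equipment share is given only as "around 30 percent", and that chillers, UPS, and CRAC together come to 60 percent, which appears to be where the 60 percent figure comes from.

```python
# Cost shares quoted above, in percent of total data center cost.
cost_shares = {
    "IT equipment (servers, disks)": 30,
    "chillers": 33,
    "UPS": 18,
    "CRAC": 9,
    "power distribution, lighting, transformers": 7,
}

total = sum(cost_shares.values())
power_and_cooling = cost_shares["chillers"] + cost_shares["UPS"] + cost_shares["CRAC"]

# Print the items from largest to smallest share, then the two sums.
for item, share in sorted(cost_shares.items(), key=lambda kv: -kv[1]):
    print(f"{item:45s} {share:3d}%")
print(f"{'listed total':45s} {total:3d}%")
print(f"{'chillers + UPS + CRAC':45s} {power_and_cooling:3d}%")
```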
Additional Points:
Cost Distribution: Data center costs are not solely about purchasing servers; a
substantial portion is allocated to infrastructure, maintenance, and utilities.
Electricity and Cooling Costs: With the increasing demand for computing
power, the expenses associated with electricity and cooling systems have also
risen significantly.
Efficiency Measures: To mitigate rising costs, data centers often implement
energy-efficient practices and technologies to reduce electricity consumption
and cooling needs.
Scalability Challenges: As data centers grow in size and capacity, managing
costs becomes increasingly challenging, requiring careful planning and
optimization strategies.
Sustainability: There's a growing emphasis on making data centers more
environmentally friendly by reducing energy consumption and adopting
renewable energy sources where possible.
1.2.5.2 Low-Cost Design Philosophy
o High-end switches and routers may be too costly for constructing data
centers. Consequently, utilizing high-bandwidth networks may not align with
the economic principles of cloud computing. When operating within a fixed
budget, it's more practical to opt for commodity switches and networks in
data center setups.
o Similarly, choosing commodity x86 servers over pricey mainframes is
preferable. The software layer manages tasks such as network traffic balancing,
fault tolerance, and scalability. Presently, the majority of cloud computing data
centers rely on Ethernet as their primary network technology.
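To illustrate how a software layer can take over traffic balancing, fault tolerance, and scalability on commodity servers, here is a minimal round-robin sketch. The class name, method names, and server labels are hypothetical, and round-robin is only one of many possible balancing policies; real cloud platforms use far more elaborate mechanisms.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Spread requests over commodity servers, skipping ones marked as failed."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.failed = set()
        self._ring = cycle(self.servers)

    def mark_failed(self, server):
        # Fault tolerance: a failed server is skipped, not waited on.
        self.failed.add(server)

    def add_server(self, server):
        # Scaling out: rebuild the ring so the new server receives traffic.
        self.servers.append(server)
        self._ring = cycle(self.servers)

    def pick(self):
        # Try each server at most once per call before giving up.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server not in self.failed:
                return server
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["s1", "s2", "s3"])
lb.mark_failed("s2")
print([lb.pick() for _ in range(4)])  # ['s1', 's3', 's1', 's3']
```

The design choice mirrors the low-cost philosophy: instead of buying hardware that rarely fails, the software assumes failures will happen and routes around them.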

Additional Points:
Cost Efficiency: Prioritizing cost-effective solutions helps maximize the value
of resources within data center budgets.
Scalability: Commodity hardware and networks offer greater scalability,
enabling data centers to expand their capacities as needed without incurring
substantial costs.
Simplified Management: Utilizing standard components simplifies the
management and maintenance of data center infrastructure.
Compatibility: Commodity hardware and Ethernet-based networks are widely
compatible with existing systems and technologies, enhancing interoperability
and ease of integration.
Flexibility: Low-cost design philosophies allow data centers to adapt more
readily to evolving technology trends and business requirements.
1.2.5.3 Convergence of Technologies
o Cloud computing arises from the merging of advancements in four key areas:
- Hardware Virtualization and Multi-core Chips: These technologies
facilitate dynamic configurations within the cloud.
- Utility and Grid Computing: They form the foundation for cloud
computing by providing necessary infrastructure.
- SOA, Web 2.0, and Mashups: Recent progress in these areas propels
cloud computing forward.
- Autonomic Computing and Data Center Automation: These
advancements contribute to the automation and efficiency of cloud
operations.
o The rise of cloud computing addresses the challenge of managing vast amounts
of data, which is crucial in various fields of science and society. To support
data-intensive computing, it's essential to address issues related to workflows,
databases, algorithms, and virtualization.
o Cloud computing transforms how we interact with information by offering on-
demand services at different levels: infrastructure, platform, and software.
MapReduce, a programming model, is instrumental in handling data parallelism
and fault tolerance, particularly in scientific applications.
o The convergence of data-intensive science, cloud computing, and multi-core
computing is revolutionizing computing by addressing architectural design and
programming challenges. This convergence facilitates the transformation of data
into knowledge, aligning with the principles of Service-Oriented Architecture
(SOA).
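The MapReduce model mentioned above can be sketched on a single machine with a word count, the usual introductory example. This is a toy illustration of the map, shuffle, and reduce phases, not a distributed implementation; the function names and input strings are made up for the sketch.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key, here by summing counts."""
    return {key: sum(values) for key, values in groups.items()}


splits = ["cloud data center", "data parallel cloud", "data"]
# Each split could be mapped on a different node (data parallelism); a failed
# map task can simply be rerun on its split, which is the basis of fault tolerance.
mapped = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Because each map call depends only on its own split, the work divides naturally across many nodes, and the programmer writes only the map and reduce functions.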
SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING

 Distributed and cloud computing systems are constructed using numerous independent
computer nodes. These nodes are linked together through networks like SANs, LANs,
or WANs in a hierarchical structure. With modern networking technology, just a few
LAN switches can connect hundreds of machines, forming a functional cluster. WANs
can interconnect many local clusters to create a vast network of clusters, essentially a
cluster of clusters.
 These systems are highly scalable, capable of achieving web-scale connectivity, either
physically or logically. Massive systems are typically classified into four groups:
clusters, P2P networks, computing grids, and Internet clouds housed in large data
centers. These classifications are based on the number of participating nodes, ranging
from hundreds to millions of computers.
 In these systems, nodes work together collaboratively or cooperatively at various
levels to perform tasks. The table provided in the text outlines the technical and
application aspects characterizing each of these system classes.
