CC Unit 1
(I have put below some additional points related to the above topic; you can read them if you want.)
Additional Related Points
Memory Wall Problem: The term "memory wall" refers to the growing disparity
between CPU speeds and memory access times. While CPU processing power
has increased significantly, memory speed improvements have lagged behind,
creating a bottleneck that limits overall system performance.
Future Trends: The trend towards heterogeneous computing and the integration
of various types of processors on a single chip is expected to continue. This
approach not only improves performance but also enhances energy efficiency,
making it suitable for both large-scale data centers and edge computing devices.
Additional Points:
GPUs in Modern Applications: GPUs are not only used for graphics but also for
scientific simulations, artificial intelligence, and machine learning, where their ability
to perform many calculations at once is invaluable.
Power Efficiency: GPUs are designed to deliver high performance with better power
efficiency compared to CPUs, making them ideal for large-scale data centers and
supercomputers.
Programming Models: Apart from CUDA, other programming models like OpenCL
(Open Computing Language) also enable developers to utilize GPU power for a wide
range of applications.
GPU Evolution: Over the years, GPUs have evolved from simple graphics
accelerators to powerful processors capable of handling diverse and complex
computational tasks, contributing significantly to advancements in various fields such
as deep learning and data analytics.
Future Trends: As technology progresses, we can expect further improvements in
GPU architecture, increasing their capabilities and efficiency, pushing the boundaries
of what can be achieved in exascale computing.
Additional Points:
Parallel Processing: The ability of GPUs to handle many tasks at the same time
makes them ideal for applications that require high levels of parallelism, such as
machine learning, scientific simulations, and data analysis.
Efficiency: By offloading data-intensive calculations to the GPU, systems can
achieve higher efficiency and performance, as the CPU is relieved from these
demanding tasks.
Programming Models: To leverage the full power of GPUs, developers use
programming models like CUDA (Compute Unified Device Architecture) and
OpenCL (Open Computing Language) which provide tools for managing
parallel execution.
Integration in Diverse Devices: The versatility of GPUs allows them to be
integrated into various devices beyond traditional computers, enhancing the
processing capabilities of mobile phones, gaming consoles, and other embedded
systems.
Future Developments: As GPU technology continues to advance, we can expect
even greater improvements in their processing power, efficiency, and
applications, further pushing the capabilities of HPC and other computationally
intensive fields.
1.2.2.2 GPU Programming Model
o Figure 1.7 illustrates how a CPU and GPU work together to perform parallel
processing of floating-point operations. The CPU, which is a conventional
multicore processor, has limited parallel processing capabilities. In contrast, the
GPU has a many-core architecture with hundreds of simple processing cores
organized into multiprocessors, each capable of running one or more threads.
o In this setup, the CPU offloads most of its floating-point computation to the
GPU. The CPU instructs the GPU to perform the massive data-processing tasks,
leveraging the GPU's parallel processing power. For efficient performance, the
data-transfer bandwidth between the main memory (on the motherboard) and the
GPU's on-chip memory must be well matched. This interaction is managed using
NVIDIA's CUDA programming model, particularly with GPUs such as the
GeForce 8800, Tesla, and Fermi.
o Looking ahead, GPUs with thousands of cores could be used in Exascale
computing systems, which are capable of performing 10^18 floating-point
operations per second (flops). This represents a move towards hybrid systems
that combine both CPUs and GPUs to maximize processing power. A DARPA
report from September 2008 identifies four major challenges for exascale
computing: energy and power efficiency, memory and storage management,
handling concurrency and locality (i.e., managing many operations at once and
ensuring data is processed close to where it's stored), and ensuring system
resiliency (i.e., reliability and fault tolerance). This highlights the progress being
made in both GPU and CPU technologies in terms of power efficiency,
performance, and programmability.
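To make this offloading pattern concrete, below is a minimal CUDA sketch (my own illustrative example; the kernel name vecAdd, the array size, and the launch configuration are arbitrary choices, not taken from the text). The host (CPU) allocates GPU memory, copies the input data across, launches a kernel over roughly a million threads, and copies the result back:

// Minimal CUDA sketch of CPU-to-GPU offloading (illustrative only).
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each GPU thread computes one element of c = a + b.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                  // about one million floats (illustrative size)
    const size_t bytes = n * sizeof(float);

    // Host (main-memory) buffers, filled by the CPU.
    float *ha = (float *)malloc(bytes), *hb = (float *)malloc(bytes), *hc = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    // Device (GPU) buffers.
    float *da, *db, *dc;
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);

    // CPU -> GPU transfers: this is the bandwidth that must be well matched.
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Launch many threads; the GPU schedules them across its multiprocessors.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(da, db, dc, n);

    // GPU -> CPU transfer of the result.
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", hc[0]);           // expect 3.000000

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}

Compiled with nvcc, each addition is handled by its own GPU thread, which is exactly the fine-grained data parallelism the CPU offloads to the GPU in the model above.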
Additional Points:
Parallel Execution: GPUs excel in parallel execution, handling many tasks at
once, which is ideal for applications like scientific simulations, machine
learning, and real-time data processing.
Hybrid Systems: The future of computing may involve hybrid systems that use
both CPUs and GPUs, leveraging the strengths of each to achieve optimal
performance.
Programming Tools: CUDA (Compute Unified Device Architecture) is a key
tool for programming GPUs, enabling developers to write software that can take
full advantage of GPU parallelism.
Challenges and Innovations: Addressing the challenges of energy consumption,
memory management, concurrency, and system reliability is crucial for the
development of future exascale computing systems.
Trend Towards Efficiency: Both CPUs and GPUs are becoming more power-
efficient and capable, reflecting ongoing advancements in technology to meet
the demands of high-performance computing.
1.2.2.3 Power Efficiency of the GPU
o Bill Dally from Stanford University highlights that the key advantages of GPUs
over CPUs for future computing are their power efficiency and massive
parallelism. According to estimates, running an exaflops (10^18 floating-point
operations per second) system would require achieving 60 Gflops (billion
floating-point operations per second) per watt per core. Power limitations
dictate what can be included in CPU or GPU chips.
o Dally's research indicates that a CPU chip consumes about 2 nanojoules per
instruction, while a GPU chip uses only 200 picojoules per instruction, making
GPUs ten times more efficient in terms of power. CPUs are designed to
minimize delays (latency) in accessing caches and memory, whereas GPUs are
designed to maximize data processing throughput by efficiently managing on-
chip memory.
o In 2010, the performance-to-power ratio for GPUs was 5 Gflops per watt per
core, compared to less than 1 Gflop per watt per CPU core. This significant
difference suggests that GPUs are more efficient, but scaling up future
supercomputers may still pose challenges. Nonetheless, GPUs could narrow the
performance gap with CPUs over time.
o A major factor in power consumption is data movement. Therefore, optimizing
the storage hierarchy and tailoring memory to specific applications is crucial.
Additionally, developing self-aware operating systems, runtime support, and
locality-aware compilers and auto-tuners for GPU-based massively parallel
processors (MPPs) is essential. This means that power efficiency and software
development are the two primary challenges for future parallel and distributed
computing systems.
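As a quick arithmetic check of Dally's figures (restating the numbers above, not adding new data):
\[
\frac{E_{\text{CPU}}}{E_{\text{GPU}}} = \frac{2\ \text{nJ/instruction}}{200\ \text{pJ/instruction}} = \frac{2000\ \text{pJ}}{200\ \text{pJ}} = 10,
\]
so the GPU is roughly ten times more power-efficient per instruction. Likewise, at the 2010 ratio of 5 Gflops per watt per GPU core, reaching the estimated 60 Gflops per watt per core for an exaflops system still requires about a
\[
\frac{60\ \text{Gflops/W}}{5\ \text{Gflops/W}} = 12\times
\]
improvement, which is why power efficiency is singled out as a primary challenge.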
Additional Points:
Advantages of GPUs: GPUs are significantly more power-efficient than CPUs,
making them suitable for high-performance computing tasks that require
massive parallelism.
Power Consumption: Managing power consumption is critical, especially in
large-scale systems. Efficient data movement and memory management are key
to reducing power usage.
Optimizing Software: Developing advanced software tools, such as self-aware
operating systems and locality-aware compilers, is necessary to fully utilize the
power of GPUs in parallel computing environments.
Future Challenges: As we aim for exascale computing, overcoming the
challenges of power efficiency and software optimization will be crucial to
building more powerful and efficient supercomputers.
Trends and Innovations: Continuous advancements in GPU technology and
software development will play a pivotal role in the evolution of parallel and
distributed computing systems.
1.2.3 Memory, Storage, and Wide-Area Networking
1.2.3.1 Memory Technology
o Figure 1.10 illustrates the growth of DRAM chip capacity, which increased
from 16KB in 1976 to 64GB in 2011. This indicates that memory chips have
roughly quadrupled in capacity every three years. However, the speed at which
memory can be accessed hasn't improved significantly, leading to a worsening
"memory wall" problem, where the gap between fast processors and slower
memory access times hinders overall performance.
o For hard drives, capacity has seen substantial growth as well, increasing from
260MB in 1981 to 250GB in 2004, and reaching 3TB with the Seagate
Barracuda XT in 2011. This equates to a tenfold increase in capacity
approximately every eight years. Disk arrays, which combine multiple hard
drives, are expected to see even greater capacity increases in the future.
o The increasing speed of processors and the growing capacity of memory are
creating a larger disparity between processor performance and memory access
speed. This memory wall problem could become more severe, potentially
limiting CPU performance further in the future.
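A quick check of these growth rates using the capacities quoted above (arithmetic only, no new data):
\[
\frac{64\ \text{GB}}{16\ \text{KB}} = \frac{64 \times 2^{30}}{16 \times 2^{10}} = 4 \times 2^{20} = 4^{11},
\]
i.e., eleven quadruplings over the 35 years from 1976 to 2011, or roughly one every three years; and
\[
\frac{3\ \text{TB}}{260\ \text{MB}} \approx 1.2 \times 10^{4},
\]
i.e., about four tenfold increases over the 30 years from 1981 to 2011, or roughly one every eight years.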
Additional Points:
Memory Capacity Growth: The rapid increase in DRAM capacity has
significantly expanded the amount of data that can be stored and accessed
quickly by computers.
Memory Access Time: Despite the increase in memory capacity, the speed at
which data can be accessed from memory has not kept pace with processor
speeds, leading to performance bottlenecks.
Hard Drive Capacity Growth: Hard drive capacities have grown exponentially,
allowing for vast amounts of data to be stored on a single drive. Disk arrays
further amplify this capacity.
Future Challenges: As processors continue to get faster, the gap between their
speed and the slower memory access times (memory wall) presents a significant
challenge for future computer performance.
Solutions and Innovations: Addressing the memory wall problem may require
new technologies or architectures, such as faster memory technologies,
improved data caching strategies, or alternative storage solutions like solid-state
drives (SSDs).
1.2.3.2 Disks and Storage Technology
o Since 2011, disk drives and storage systems have continued to grow in capacity,
surpassing 3TB. The growth of disk storage is illustrated by the lower curve in
Figure 1.10, showing a remarkable increase of seven orders of magnitude over
33 years. Additionally, the rapid development of flash memory and solid-state
drives (SSDs) has significant implications for both high-performance computing
(HPC) and high-throughput computing (HTC) systems.
o SSDs have become increasingly popular due to their durability. They can
withstand hundreds of thousands to millions of write cycles per block, ensuring
longevity even under heavy usage conditions. This durability, combined with
their impressive speed, makes SSDs suitable for many applications.
o However, challenges such as power consumption, cooling, and packaging
constraints may limit the development of large-scale systems in the future.
Power consumption increases linearly with clock frequency and quadratically
with supply voltage (written out below), which drives the demand for
lower-voltage supplies. Since clock rates cannot be increased indefinitely,
more efficient power management solutions are needed.
o A quote from Jim Gray, given during a talk at the University of Southern
California, reflects the evolving landscape of storage technology: "Tape units
are dead, disks are tape units, flashes are disks, and memory are caches now."
This indicates a shift towards faster, more efficient storage solutions. However,
as of 2011, SSDs remain too costly to fully replace traditional disk arrays in the
storage market.
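The statement that power grows linearly with clock frequency and quadratically with voltage corresponds to the standard CMOS dynamic-power relation (stated here for reference, not taken from the text):
\[
P_{\text{dynamic}} \;\approx\; C_{\text{eff}}\, V^{2} f,
\]
where \(C_{\text{eff}}\) is the effective switched capacitance, \(V\) the supply voltage, and \(f\) the clock frequency. Lowering the supply voltage therefore pays off quadratically, which is why lower-voltage supplies are in such demand.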
Additional Points:
Flash Memory and SSDs: The rise of flash memory and SSDs has
revolutionized storage technology, offering faster speeds and greater durability
compared to traditional hard disk drives.
Longevity of SSDs: SSDs can endure a high number of write cycles per block,
ensuring they remain operational for several years, even under heavy usage.
Challenges of Large-Scale Systems: Power consumption, cooling, and
packaging constraints pose challenges for the development of large-scale
computing systems using SSDs and other storage technologies.
Shift in Storage Paradigm: Jim Gray's quote highlights the transition from
traditional tape-based storage to modern flash-based solutions, indicating a shift
towards faster, more efficient storage technologies.
Cost Considerations: While SSDs offer numerous advantages, including speed
and durability, their higher cost relative to traditional hard drives may hinder
widespread adoption in certain markets.
1.2.3.3 System-Area Interconnects
o In small clusters, the nodes are usually connected using an Ethernet switch or a
local area network (LAN). As depicted in Figure 1.11, a LAN is commonly used
to link client hosts to large servers. A storage area network (SAN) connects
servers to network storage devices like disk arrays, while network-attached
storage (NAS) directly links client hosts to disk arrays. These three types of
networks are often found in large clusters constructed using off-the-shelf
network components. For smaller clusters where distributed storage isn't
needed, a multiport Gigabit Ethernet switch along with copper cables can
connect the end machines. All three network types are readily available for
commercial use.
Additional Points:
Benefits of Virtualization: Virtual machines offer advantages such as improved
resource utilization, flexibility in running software, simplified software management,
and enhanced security compared to traditional physical machines.
Cloud Computing: Virtualization is fundamental to cloud computing, where
resources are pooled together and provisioned dynamically as needed.
Dynamic Resource Allocation: In cloud environments, virtualization enables the
dynamic allocation and management of processors, memory, and I/O resources based
on demand.
Software and Middleware: Virtualization software or middleware facilitates the
creation and management of virtual machines, virtual storage, and virtual networking,
providing a layer of abstraction between the hardware and the software running on it.
Scalability: Virtualization allows for easier scaling of computing resources, enabling
organizations to adapt to changing demands and workload fluctuations more
efficiently.
Additional Points:
Virtual Machine Monitor (VMM): This middleware layer manages VMs and
their interaction with the host hardware.
Bare-Metal VM: In this setup, the hypervisor directly handles hardware
resources, offering efficient management.
Host VM: Here, the VMM operates without modifying the host OS, simplifying
setup.
Dual-Mode VM: This setup utilizes both user-level and supervisor-level
components in the VMM, potentially requiring some host OS changes.
Hardware Independence: VMs provide a layer of abstraction, allowing
applications to run independently of the underlying hardware.
Virtual Appliance: Bundling applications with their dedicated OS into virtual
appliances allows for easy deployment across different hardware platforms.
Compatibility: VMs can run on different OSes, enhancing flexibility and
compatibility across diverse computing environments.
o The Virtual Machine Monitor (VMM) provides an abstraction layer to the guest
OS. With full virtualization, the VMM creates a VM abstraction that mimics the
physical machine closely, allowing standard OSes like Windows 2000 or Linux
to run as they would on actual hardware. Mendel Rosenblum outlined low-level
VMM operations, illustrated in Figure 1.13:
- Multiplexing VMs: VMs can be shared among hardware machines, as
depicted in Figure 1.13(a).
- Suspending and Storing VMs: A VM can be paused and saved in
stable storage, shown in Figure 1.13(b).
- Resuming or Provisioning VMs: A suspended VM can be resumed or
moved to a new hardware platform, as depicted in Figure 1.13(c).
- Migrating VMs: VMs can be transferred from one hardware platform
to another, as shown in Figure 1.13(d).
o These operations allow VMs to be provisioned on any available hardware
platform, offering flexibility in executing distributed applications. Additionally,
the VM approach improves server resource utilization significantly. Various
server functions can be consolidated on the same hardware platform, increasing
system efficiency. This consolidation helps eliminate server sprawl by
deploying systems as VMs, which effectively utilizes shared hardware
resources. VMware claimed that this approach could boost server utilization
from its current range of 5–15 percent to 60–80 percent.
1.2.4.3 Virtual Infrastructures
o In Figure 1.14, the physical resources like computing power, storage, and
networking are depicted at the bottom, while various applications are
represented in virtual machines (VMs) at the top. This separation of hardware
and software defines the virtual infrastructure, which links resources to
distributed applications dynamically. Essentially, it's a flexible allocation of
system resources to specific applications.
o The outcome of virtual infrastructure is reduced costs and improved efficiency
and responsiveness. An example of this is server consolidation and containment
through virtualization. We'll delve deeper into VMs and virtualization support in
Chapter 3. Furthermore, we'll cover virtualization support for clusters, clouds,
and grids in Chapters 3, 4, and 7, respectively.
1.2.5 Data Center Virtualization for Cloud Computing
In this section, the basic structure and design aspects of data centers are discussed,
specifically for cloud computing. Cloud architecture typically relies on inexpensive
hardware and network devices. Most cloud platforms opt for widely-used x86
processors. Data centers are constructed using cost-effective terabyte disks and
Gigabit Ethernet.
When designing data centers, the focus is on achieving a balance between
performance and cost. This means prioritizing factors like storage capacity and energy
efficiency rather than just raw speed. Figure 1.14 illustrates the growth in the number
of servers and the breakdown of data center costs over the past 15 years. As of 2010,
approximately 43 million servers were in operation worldwide. Interestingly, the cost
of utilities begins to outweigh the cost of hardware after three years.
Additional Points:
Commodity Hardware: Cloud data centers typically utilize standard, affordable
hardware components to keep costs down.
Processor Choice: x86 processors are commonly used in cloud architectures due to
their widespread availability and cost-effectiveness.
Storage and Energy Efficiency: Data center design prioritizes efficiency in storage and
energy usage rather than solely focusing on maximizing speed.
Server Growth: The number of servers in operation has been steadily increasing over
the years, reflecting the growing demand for computing resources.
Cost Considerations: Utilities costs become a significant factor in data center
operations, surpassing hardware costs after a certain period.
Long-Term Sustainability: Data center designs aim to balance performance and cost-
effectiveness to ensure sustainable operation over time.
1.2.5.1 Data Center Growth and Cost Breakdown
o Large data centers can be massive structures housing thousands of servers,
while smaller ones typically accommodate hundreds. Over the years, the
expenses associated with building and maintaining data center servers have
risen.
o According to a 2009 IDC report (refer to Figure 1.14), only around 30 percent
of the total data center costs are attributed to purchasing IT equipment like
servers and disks. The majority of costs are allocated to other components: 33
percent for chillers, 18 percent for uninterruptible power supplies (UPS), 9
percent for computer room air conditioning (CRAC), and the remaining 7
percent for power distribution, lighting, and transformer expenses. This
breakdown indicates that a significant portion, about 60 percent, of the overall
data center cost is directed towards management and maintenance.
o Interestingly, while the cost of server purchases hasn't seen much change over
time, the expenses related to electricity and cooling have increased notably from
5 percent to 14 percent over a span of 15 years.
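One way to read the roughly 60 percent figure above (an arithmetic reading of the listed shares, not additional IDC data) is that the cooling and power-conditioning items alone account for it:
\[
33\% \ (\text{chillers}) + 18\% \ (\text{UPS}) + 9\% \ (\text{CRAC}) = 60\%,
\]
with about 30 percent going to IT equipment itself and the remaining 7 percent to power distribution, lighting, and transformers.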
Additional Points:
Cost Distribution: Data center costs are not solely about purchasing servers; a
substantial portion is allocated to infrastructure, maintenance, and utilities.
Electricity and Cooling Costs: With the increasing demand for computing
power, the expenses associated with electricity and cooling systems have also
risen significantly.
Efficiency Measures: To mitigate rising costs, data centers often implement
energy-efficient practices and technologies to reduce electricity consumption
and cooling needs.
Scalability Challenges: As data centers grow in size and capacity, managing
costs becomes increasingly challenging, requiring careful planning and
optimization strategies.
Sustainability: There's a growing emphasis on making data centers more
environmentally friendly by reducing energy consumption and adopting
renewable energy sources where possible.
1.2.5.2 Low-Cost Design Philosophy
o Expensive high-end switches or routers might not be feasible for constructing
data centers due to their high costs. Consequently, utilizing high-bandwidth
networks may not align with the economic principles of cloud computing. When
operating within a fixed budget, it's more practical to opt for commodity
switches and networks in data center setups.
o Similarly, choosing commodity x86 servers over pricey mainframes is
preferable. The software layer manages tasks such as network traffic balancing,
fault tolerance, and scalability. Presently, the majority of cloud computing data
centers rely on Ethernet as their primary network technology.
Additional Points:
Cost Efficiency: Prioritizing cost-effective solutions helps maximize the value
of resources within data center budgets.
Scalability: Commodity hardware and networks offer greater scalability,
enabling data centers to expand their capacities as needed without incurring
substantial costs.
Simplified Management: Utilizing standard components simplifies the
management and maintenance of data center infrastructure.
Compatibility: Commodity hardware and Ethernet-based networks are widely
compatible with existing systems and technologies, enhancing interoperability
and ease of integration.
Flexibility: Low-cost design philosophies allow data centers to adapt more
readily to evolving technology trends and business requirements.
1.2.5.3 Convergence of Technologies
o Cloud computing arises from the merging of advancements in four key areas:
- Hardware Virtualization and Multi-core Chips: These technologies
facilitate dynamic configurations within the cloud.
- Utility and Grid Computing: They form the foundation for cloud
computing by providing necessary infrastructure.
- SOA, Web 2.0, and Mashups: Recent progress in these areas propels
cloud computing forward.
- Autonomic Computing and Data Center Automation: These
advancements contribute to the automation and efficiency of cloud
operations.
o The rise of cloud computing addresses the challenge of managing vast amounts
of data, which is crucial in various fields of science and society. To support
data-intensive computing, it's essential to address issues related to workflows,
databases, algorithms, and virtualization.
o Cloud computing transforms how we interact with information by offering on-
demand services at different levels: infrastructure, platform, and software.
MapReduce, a programming model, is instrumental in handling data parallelism
and fault tolerance, particularly in scientific applications.
o The convergence of data-intensive science, cloud computing, and multi-core
computing is revolutionizing computing by addressing architectural design and
programming challenges. This convergence facilitates the transformation of data
into knowledge, aligning with the principles of Service-Oriented Architecture
(SOA).
SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING
Distributed and cloud computing systems are constructed using numerous independent
computer nodes. These nodes are linked together through networks like SANs, LANs,
or WANs in a hierarchical structure. With modern networking technology, just a few
LAN switches can connect hundreds of machines, forming a functional cluster. WANs
can interconnect many local clusters to create a vast network of clusters, essentially a
cluster of clusters.
These systems are highly scalable, capable of achieving web-scale connectivity, either
physically or logically. Massive systems are typically classified into four groups:
clusters, P2P networks, computing grids, and Internet clouds housed in large data
centers. These classifications are based on the number of participating nodes, ranging
from hundreds to millions of computers.
In these systems, nodes work together collaboratively or cooperatively at various
levels to perform tasks. The table provided in the text outlines the technical and
application aspects characterizing each of these system classes.