Data Parallel Architecture
ACKNOWLEDGEMENT
VIPAN BAGGA
• Parallel computing is the simultaneous use of multiple compute resources to
solve a computational problem, typically one that can be:
o Broken into discrete parts that can be solved concurrently
o Solved in less time with multiple compute resources than with a single
compute resource
• Parallel computing is an evolution of serial computing that attempts to
emulate what has always been the state of affairs in the natural world: many
complex, interrelated events happening at the same time, yet within a
sequence. Some examples:
o Planetary and galactic orbits
o Weather and ocean patterns
o Tectonic plate drift
o Rush hour traffic in LA
o Automobile assembly line
o Daily operations within a business
o Building a shopping mall
o Ordering a hamburger at the drive-through
• Traditionally, parallel computing has been considered to be "the high end of
computing" and has been motivated by numerical simulations of complex
systems and "Grand Challenge Problems" such as:
o weather and climate
o chemical and nuclear reactions
o biological, human genome
o geological, seismic activity
o mechanical devices - from prosthetics to spacecraft
o electronic circuits
o manufacturing processes
• Today, commercial applications are providing an equal or greater driving
force in the development of faster computers. These applications require the
processing of large amounts of data in sophisticated ways. Example
applications include:
o parallel databases, data mining
o oil exploration
o web search engines, web-based business services
o computer-aided diagnosis in medicine
o management of national and multi-national corporations
o advanced graphics and virtual reality, particularly in the entertainment
industry
o networked video and multi-media technologies
o collaborative work environments
• Ultimately, parallel computing is an attempt to maximize the infinite but
seemingly scarce commodity called time.
I. INTRODUCTION
• Basic design (the classic von Neumann stored-program architecture; a toy
sketch follows this list):
o Memory is used to store both program instructions and data
o Program instructions are coded data which tell the computer to do
something
o Data is simply information to be used by the program
o A central processing unit (CPU) gets instructions and/or data from
memory, decodes the instructions and then sequentially performs
them.
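To make the stored-program design concrete, here is a minimal C sketch of the fetch-decode-execute cycle. The four-instruction machine (LOAD, ADD, STORE, HALT) and its memory layout are invented purely for illustration; real instruction sets are far richer.

/* toy_vonneumann.c - a toy fetch-decode-execute loop.
   The 4-instruction machine is hypothetical, for illustration only. */
#include <stdio.h>

enum { HALT = 0, LOAD = 1, ADD = 2, STORE = 3 };

int main(void) {
    /* One memory holds both the program (cells 0..6) and data (cells 20..22). */
    int mem[32] = {
        LOAD,  20,   /* acc = mem[20]  */
        ADD,   21,   /* acc += mem[21] */
        STORE, 22,   /* mem[22] = acc  */
        HALT,
    };
    mem[20] = 2;
    mem[21] = 3;

    int pc = 0, acc = 0;                 /* program counter and accumulator */
    for (;;) {
        int op = mem[pc++];              /* fetch */
        switch (op) {                    /* decode, then execute sequentially */
        case LOAD:  acc = mem[mem[pc++]];  break;
        case ADD:   acc += mem[mem[pc++]]; break;
        case STORE: mem[mem[pc++]] = acc;  break;
        case HALT:  printf("result: %d\n", mem[22]); return 0;
        }
    }
}

Note how instructions and data live in the same memory, and the CPU consumes them strictly one at a time: this sequential bottleneck is exactly what parallel architectures set out to relieve.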
• There are different ways to classify parallel computers. One of the more
widely used classifications, in use since 1966, is called Flynn's Taxonomy.
• Flynn's taxonomy distinguishes multi-processor computer architectures
according to how they can be classified along the two independent
dimensions of Instruction and Data. Each of these dimensions can have
only one of two possible states: Single or Multiple.
The four classifications that result are:
o SISD: Single Instruction, Single Data
o SIMD: Single Instruction, Multiple Data
o MISD: Multiple Instruction, Single Data
o MIMD: Multiple Instruction, Multiple Data
Single Instruction, Multiple Data (SIMD):
• Single instruction: all processing units execute the same instruction at any
given clock cycle
• Multiple data: each processing unit can operate on a different data
element
• This type of machine typically has
an instruction dispatcher, a very
high-bandwidth internal
network, and a very large array of
very small-capacity instruction
units.
• Best suited for specialized
problems characterized by a high
degree of regularity, such as image
processing.
• Synchronous (lockstep) and
deterministic execution
• Two varieties: Processor Arrays
and Vector Pipelines
• Examples:
o Processor Arrays:
Connection Machine CM-2,
MasPar MP-1, MP-2
o Vector Pipelines:
IBM 9000, Cray C90, Fujitsu VP,
NEC SX-2, Hitachi S820
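As a small illustration of lockstep execution, the following C sketch applies one instruction to several data elements at once using x86 SSE intrinsics. It assumes an x86-64 CPU and a compiler that provides immintrin.h (e.g., gcc or clang).

/* simd_add.c - one instruction (a vector add) operating on multiple
   data elements in lockstep, via x86 SSE intrinsics.
   Compile: gcc -O2 simd_add.c -o simd_add  (assumes x86-64) */
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    float c[8];

    /* Each _mm_add_ps is a single instruction applied to four
       different data elements at once - the essence of SIMD. */
    for (int i = 0; i < 8; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(va, vb));
    }

    for (int i = 0; i < 8; i++)
        printf("%.0f ", c[i]);
    printf("\n");
    return 0;
}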
Multiple Instruction, Single Data (MISD):
• A single data stream is fed into multiple processing units.
• Each processing unit operates on the data independently via independent instruction
streams.
• Few actual examples of this class of parallel computer have ever existed. One is the
experimental Carnegie-Mellon C.mmp computer (1971).
• Some conceivable uses might be:
o multiple frequency filters operating on a single signal stream
o multiple cryptography algorithms attempting to crack a single coded
message.
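The following C sketch illustrates the MISD idea in software: several independent "instruction streams" (here, placeholder filter functions invented purely for illustration) each consume the same single data stream.

/* misd_filters.c - a software analogy of MISD: many instruction
   streams, one data stream. The filters are placeholders. */
#include <stdio.h>

typedef double (*filter_fn)(double);

static double low_pass(double x)  { return x * 0.5; }   /* placeholder */
static double high_pass(double x) { return x * 2.0; }   /* placeholder */
static double band_pass(double x) { return x + 1.0; }   /* placeholder */

int main(void) {
    double signal[] = {0.1, 0.4, 0.9};    /* the single data stream */
    filter_fn filters[] = {low_pass, high_pass, band_pass};
    const char *names[] = {"low", "high", "band"};

    /* Every filter sees every sample: one data stream, many
       independent instruction streams. */
    for (size_t s = 0; s < 3; s++)
        for (size_t f = 0; f < 3; f++)
            printf("%s(%g) = %g\n", names[f], signal[s],
                   filters[f](signal[s]));
    return 0;
}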
Multiple Instruction, Multiple Data (MIMD):
• Currently, the most common type of parallel computer
• Multiple instruction: every processor may be executing a different
instruction stream
• Multiple data: every processor may be working with a different data stream
Shared Memory
General Characteristics:
• Multiple processors can operate independently but share the same memory
resources.
• Changes in a memory location effected by one processor are visible to all
other processors.
• Shared memory machines can be divided into two main classes based upon
memory access times: UMA and NUMA.
Advantages:
• A global address space provides a user-friendly programming perspective
to memory.
• Data sharing between tasks is both fast and uniform due to the proximity
of memory to CPUs.
Disadvantages:
• Lack of scalability between memory and CPUs: adding more CPUs increases
traffic on the shared memory-CPU path.
• The programmer is responsible for the synchronization constructs that
ensure "correct" access of global memory.
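A minimal pthreads sketch of the shared memory model: several threads operate independently on the same memory location, and a mutex supplies the synchronization the programmer is responsible for. It assumes a POSIX system.

/* shared_counter.c - independent threads sharing one memory location.
   Compile: gcc shared_counter.c -o shared_counter -lpthread */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;                 /* shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);       /* programmer-supplied synchronization */
        counter++;                       /* this update is visible to all threads */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);  /* 400000: every update was seen */
    return 0;
}

Removing the mutex turns this into a data race with an unpredictable final count, which is precisely the "correct access of global memory" burden noted above.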
Distributed Memory
General Characteristics:
• Like shared memory systems, distributed memory systems vary widely but
share a common characteristic. Distributed memory systems require a
communication network to connect inter-processor memory.
Advantages:
• Memory is scalable with the number of processors: increase the number of
processors and the size of memory increases proportionately.
• Each processor can rapidly access its own memory without interference and
without the overhead of maintaining cache coherency.
• Cost effective: can use commodity, off-the-shelf processors and
networking.
Disadvantages:
• The programmer is responsible for many of the details associated with data
communication between processors.
• It may be difficult to map existing data structures, based on global memory,
to this memory organization.
• Non-uniform memory access (NUMA) times
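A minimal C sketch of the explicit inter-processor communication this model requires, using two MPI point-to-point calls (MPI_Send/MPI_Recv). It assumes an MPI implementation such as MPICH or Open MPI is installed.

/* send_recv.c - distributed memory: each process owns its memory, and
   data moves only via explicit messages over the network.
   Compile/run: mpicc send_recv.c -o send_recv && mpirun -np 2 ./send_recv */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value;                           /* lives in this process's private memory */
    if (rank == 0) {
        value = 42;
        /* the programmer explicitly ships the data to process 1 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }
    MPI_Finalize();
    return 0;
}

No memory is shared here: if rank 0 never sends, rank 1 can never see the value, which is why the programmer carries the communication burden listed above.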
Other Models
Single Program, Multiple Data (SPMD):
• SPMD is actually a "high level" programming model that can be built upon
any combination of the previously mentioned parallel programming models.
• A single program is executed by all tasks simultaneously.
• At any moment in time, tasks can be executing the same or different
instructions within the same program.
• SPMD programs usually have the necessary logic programmed into them to
allow different tasks to branch or conditionally execute only those parts of
the program they are designed to execute. That is, tasks do not necessarily
have to execute the entire program - perhaps only a portion of it.
• All tasks may use different data
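A minimal SPMD sketch using MPI: every task runs this same program but branches on its rank, so each task executes only the portion intended for it. The division of work shown is invented for illustration.

/* spmd_branch.c - SPMD: one program, many tasks, rank-based branching.
   Compile/run: mpicc spmd_branch.c -o spmd && mpirun -np 4 ./spmd */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* only task 0 executes this portion of the program */
        printf("task 0 of %d: coordinating\n", size);
    } else {
        /* the remaining tasks execute a different portion, on their own data */
        printf("task %d of %d: computing my slice\n", rank, size);
    }
    MPI_Finalize();
    return 0;
}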
Multiple Program, Multiple Data (MPMD):
• Like SPMD, MPMD is actually a "high level" programming model that can
be built upon any combination of the previously mentioned parallel
programming models.
• MPMD applications typically have multiple executables; each task can run
the same or a different program, and all tasks may use different data.
SUMMARY
This presentation covers the basics of parallel computing. Beginning with a brief
overview and some concepts and terminology associated with parallel computing,
the topics of parallel memory architectures and programming models are then
explored. These topics are followed by a discussion on a number of issues related
to designing parallel programs. The last portion of the presentation is spent
examining how to parallelize several different types of serial programs.