Grain Packing & Scheduling Ch2 Hwang - Copy
Grain Packing & Scheduling Ch2 Hwang - Copy
Grain Packing & Scheduling Ch2 Hwang - Copy
Partitioning
Parallel Algorithm
D A O M
e s r a
c s c p
o i h p
m g p0 p1 e p0 p1 i
p s P0 P1
n n
o m t g
s e r
i n a
t t t
P2 P3
i p2 p3 i p2 p3
o o
n n
4 steps:
Decomposition, Assignment, Orchestration, Mapping+ Scheduling
• Done by programmer or system software (compiler, runtime, ...)
• Issues are the same, so assume programmer does it all explicitly
1
Programming Level
Concurrent Processing
These can be attributed to issues such as:
• level of concurrency,
• computational granularity,
• time and space complexity,
• communication latencies, and
• scheduling and load balancing.
Programming Level
Concurrent Processing
Independence among segments of a program is
a necessary condition to execute them
concurrently.
In general, two independent segments should be
executed in any order without affecting each
other — a segment can be an instruction or a
sequence of instructions.
Programming Level
Concurrent Processing
Dependence graph is used to determine the
dependence relations among the program
segments.
Programming Level
Concurrent Processing
Dependence Graph — A dependence graph is a
directed graph G º G (N,A) in which the set of
nodes (N) represents the program segments and
the set of the directed arcs (A) shows the order
of dependence among the segments
Programming Level
Concurrent Processing
Dependence Graph
• Dependence comes in various forms and kinds:
− Data dependence
− Control dependence
− Resource dependence
Programming Level
Concurrent Processing
Data Dependence: If an instruction uses a
value produced by a previous instruction, then
the second instruction is data dependent to the
first instruction.
Data Dependence comes in different forms:
Programming Level
Data dependence
Flow dependence: At least one output of S1 is an input
of S2 (Read-After-Write: RAW). 1 ® 2
S S
Data dependence
I/O dependence: The same file is referred to by
both I/O statements.
S2 S 4
S3
Programming Level
Control dependence
The order of execution is determined during
run-time — Conditional statements.
Control dependence could also exist between
operations performed in successive iterations of
a loop:
Do I = 1, N
If (A(I-1) = 0) then A(I) = 0
End
Programming Level
Control dependence
Control dependence often does not allow
efficient exploitation of parallelism.
Programming Level
Resource dependence
Conflict in using shared resources such as
concurrent request for the same functional unit.
A resource conflict arises when two instructions
attempt to use the same resource at the same
time.
Programming Level
Resource dependence
Within the scope of the resource dependence
then we can talk about storage dependence,
ALU dependence,...
Programming Level
Question
What is “true Dependence”?
What is “False Dependence”?
Programming Level
Concurrent Processing
Bernstein's Conditions
• Let Ii and Oi be the input and output sets of process
Pi, respectively. The two processes P1 and P2 can be
executed in parallel (P1 || P2) iff:
I1 Ç O2 = 0 WAR
I2 Ç O1 = 0 RAW
O1 Ç O2 = 0 WAW
Programming Level
Concurrent Processing
Bernstein's Conditions
• In general, P1, P2,..., Pk can be executed in parallel if
Bernstein condition is held for every pair of
processes:
P1 || P2 || P3... || Pk iff
Pi || Pj " i¹j
Programming Level
Concurrent Processing
Bernstein's Conditions
· parallelism relation (||) is commutative:
Pi || Pj Þ Pj || Pi
· parallelism relation (||) is not transitive:
Pi || Pj and Pj || Pk
does not necessarily guarantee Pi || Pk
Fall 2004
Programming Level
Concurrent Processing
Bernstein's Conditions
· parallelism relation (||) is not equivalence relation:
· parallelism relation (||) is associative:
Fall 2004
Programming Level
Concurrent Processing
Bernstein's Conditions — Example
• Detect parallelism in the following program, assume
a uniform execution time:
P1: C=D*E
P2: M=G+C
P3: A= B+ C
P4: C=L+M
P5: F=G/E
Programming Level
Concurrent Processing — Bernstein's Condition
D E
E G
*
G /
+1
B F
+2
L
+3
C
Programming Level
Concurrent Processing — Bernstein's Condition
Example: If two adders are available then,
D E E G
* /
B
G
F
+1 +2
L
A
+3
C
Programming Level
Concurrent Processing
Bernstein's Conditions — Example
P
1
*
+1 +3 P4
P2
/
+2 P5
P3
N = (A * B) + (C * D)
M = (A * B) - (C * D)
Programming Level
Concurrent Processing
Hardware vs. software parallelism
• In machine code we have:
Load R1, A
Load R2, B
Load R3, C
Load R4, D
Mult Rx, R1, R2
Mult Ry, R3, R4
Add R1, Rx, Ry
Sub R2, Rx, Ry
Store N, R1
Programming Level
Concurrent Processing
Hardware vs. software parallelism
• A machine which allows parallel multiplications,
add/subtraction and simultaneous load/store
operations gives an average software parallelism of
(10/4) = 2.5 instructions.
Programming Level
Concurrent Processing
Hardware vs. software parallelism
Load A Load B Load C Load D
*1 *2
+ -
Store M
Store N
Programming Level
Concurrent Processing
Hardware vs. software parallelism
• For a machine which does not allow simultaneous
Load/Store and arithmetic operations, we have an
average software parallelism of (10/8) = 1.25
instructions.
Programming Level
Concurrent Processing Load A
*1 Load C
Load D
*2
- Store N
Store M
Programming Level
Concurrent Processing
Hardware vs. software parallelism
• Now assume a multiprocessor composing of two
processors:
Programming Level
Load A Load C
Concurrent Processing
Load B Load D
Hardware vs. software parallelism
*2
*1
Store K
}
Store L
added instructions
for inter processor
communication.
Load K Load L
+ -
Store N Store M
Programming Level
Concurrent Processing
Compilation support is one way to solve the
mismatch between software and hardware
parallelism.
14,2 m,3
15,2
16,2 n,4
17,2 o,3
p,3
Programming Level
j,4
B,4 D,6
C,4
g,4 i,4
h,4 n,4
E,6
Programming Level
P1
6 11 1 2 3 9 8 7
P2 4 5 10 12 13 14 15 16 17
6
0 12 8 10 14 16 19 21 24 26 30 32 35 37 40 42
P1
A C D E
0 14 18 22
P2
B 7
B,1 C,1
c,8 c,1
b,1
D,2 E,2
d,4 e,4
Programming Level
Task Duplication
0 4 6 13 21 23 27
P1
A B D
0 4 12 13 14 16 20
P2
E C 7E
c,1 c,1
b,1
E,2
D,2
e,4
d,4
Programming Level
Task Duplication
0 4 6 7 8 10 14
P1
A B C D
0 4 5 6 7 9 13
P2
A C EE 7