
Parallel Programming

Shared Memory: OpenMP


Environment and Synchronization
What is OpenMP?

• A standard for shared memory programming, aimed at
scientific applications.

• Has specific support for the needs of scientific applications.

• Widely accepted among vendors and application writers
(including for GPUs).

• See http://www.openmp.org for more info.
OpenMP API Overview


API is a set of compiler directives inserted in the
source program (in addition to some library
functions).

Ideally, compiler directives do not affect
sequential code.
– pragma’s in C / C++ .

(special) comments in Fortran code.
OpenMP API Example (1 of 2)

Sequential code:

statement1;
statement2;
statement3;

Assume we want to execute statement2 in parallel, and
statement1 and statement3 sequentially.
OpenMP API Example (2 of 2)

OpenMP parallel code:

statement1;
#pragma omp <specific directive>
statement2;
statement3;

Statement2 may be executed in parallel.
Statement1 and statement3 are executed sequentially.
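To make this concrete, here is a minimal, self-contained sketch (the loop and array are made up for illustration): if statement2 is a loop with independent iterations, the parallel for directive lets those iterations run in parallel while statement1 and statement3 stay sequential.

#include <stdio.h>

int main(void)
{
    int a[100];
    int i;

    printf("statement1 (sequential)\n");        /* statement1 */

    /* statement2: the iterations are independent, so they
       may be divided among several threads */
    #pragma omp parallel for
    for (i = 0; i < 100; i++)
        a[i] = i * i;

    printf("statement3 (sequential), a[10] = %d\n", a[10]);  /* statement3 */
    return 0;
}

Compiled without OpenMP support, the pragma is ignored and the same source runs sequentially.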
Important Note

• By giving a parallel directive, the user asserts that the
program will remain correct if the statement is executed
in parallel.

• The OpenMP compiler does not check correctness.

• Some tools exist to help with that, e.g. TotalView, a
good parallel debugger.
API Semantics

• Master thread executes sequential code.


• Master and workers execute parallel code.

Note: very similar to fork-join semantics of
Pthreads create/join primitives.
OpenMP Implementation Overview

• An OpenMP implementation consists of
– a compiler,
– a library.

• Unlike Pthreads, which is purely a library.
OpenMP Example Usage (1 of 2)

Sequential program → annotated source → compiler (with
OpenMP compiler switch) → parallel program.
OpenMP Example Usage (2 of 2)

• If you give the sequential switch,
– comments and pragmas are ignored.

• If you give the parallel switch,
– comments and/or pragmas are read, and
– cause translation into a parallel program.

• Ideally, one source serves for both the sequential and the
parallel program (a big maintenance plus).
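For example, with GCC the switch is -fopenmp (the file name prog.c is hypothetical):

gcc -o prog prog.c             # sequential: omp pragmas are ignored
gcc -fopenmp -o prog prog.c    # parallel: pragmas are translated

Note that calls to OpenMP library functions (e.g. omp_get_thread_num()) still need the OpenMP library, so a purely pragma-based program is the easiest to keep dual-use.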
OpenMP Directives

• Parallelization directives:
– parallel section
– parallel for

• Data environment directives:
– shared, private, threadprivate, reduction, etc.

• Synchronization directives:
– barrier, critical (a sketch of critical follows below).
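As a sketch of the synchronization directives (the summing loop is made up for illustration), critical ensures that only one thread at a time updates the shared variable:

#include <stdio.h>

int main(void)
{
    int sum = 0;
    int i;

    #pragma omp parallel for
    for (i = 0; i < 1000; i++) {
        /* without critical, concurrent updates to the
           shared variable sum would race */
        #pragma omp critical
        sum += i;
    }
    /* implicit barrier at the end of the parallel for */

    printf("sum = %d\n", sum);   /* always 499500 */
    return 0;
}

For this particular pattern a reduction(+:sum) clause would be the more idiomatic (and faster) choice; critical is shown here only to illustrate the directive.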
General Rules about Directives

• They always apply to the next statement, which must be
a structured block.

• Examples:
– #pragma omp …
  statement
– #pragma omp …
  { statement1; statement2; statement3; }
OpenMP Parallel Region

#pragma omp parallel

• A number of threads are spawned at entry.

• Each thread executes the same code.

• Each thread waits at the end.

• Very similar to a number of create/joins with the same
function in Pthreads.
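A minimal sketch of a parallel region (the printed text is arbitrary):

#include <stdio.h>

int main(void)
{
    /* threads are spawned here; each executes the same block */
    #pragma omp parallel
    {
        printf("hello from one of the threads\n");
    }
    /* implicit barrier: every thread waits here, then the
       master continues alone */

    printf("master thread only\n");
    return 0;
}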
Getting Threads to do Different Things

• Through explicit thread identification (as in Pthreads).

• Through work-sharing directives.
Thread Identification

int omp_get_thread_num()
int omp_get_num_threads()


Gets the thread id.

Gets the total number of threads.
Example

#pragma omp parallel
{
  if( !omp_get_thread_num() )
    master();
  else
    worker();
}
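A self-contained version of this example (the bodies of master() and worker() are hypothetical stand-ins):

#include <stdio.h>
#include <omp.h>

void master(void) { printf("master: thread 0\n"); }
void worker(void) { printf("worker: thread %d\n", omp_get_thread_num()); }

int main(void)
{
    #pragma omp parallel
    {
        /* omp_get_thread_num() returns 0 only in the master thread */
        if (!omp_get_thread_num())
            master();
        else
            worker();
    }
    return 0;
}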
Work Sharing Directives

• Always occur within a parallel region directive.

• The two principal ones are
– parallel for
– parallel section
OpenMP Parallel For

#pragma omp parallel
#pragma omp for
for( … ) { … }

• Each thread executes a subset of the iterations.

• All threads wait at the end of the parallel for.
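A minimal sketch, assuming a simple vector addition as the loop body:

#include <stdio.h>

#define N 8

int main(void)
{
    double a[N], b[N], c[N];
    int i;

    for (i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    /* the iterations 0..N-1 are divided among the threads;
       the loop variable i is implicitly private */
    #pragma omp parallel
    #pragma omp for
    for (i = 0; i < N; i++)
        c[i] = a[i] + b[i];
    /* implicit barrier: all threads wait here */

    for (i = 0; i < N; i++)
        printf("c[%d] = %g\n", i, c[i]);
    return 0;
}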
Multiple Work Sharing Directives

May occur within a single parallel region
#pragma omp parallel
{
#pragma omp for
for( ; ; ) { … }
#pragma omp for
for( ; ; ) { … }
}

All threads wait at the end of the first for.
The NoWait Qualifier

#pragma omp parallel
{
  #pragma omp for nowait
  for( ; ; ) { … }
  #pragma omp for
  for( ; ; ) { … }
}

Threads proceed to the second for without waiting.
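A sketch of nowait, assuming two independent loops over different arrays (which is what makes removing the barrier safe):

#include <stdio.h>

#define N 1000

int main(void)
{
    int a[N], b[N];

    #pragma omp parallel
    {
        int i;   /* declared inside the region: private per thread */

        #pragma omp for nowait
        for (i = 0; i < N; i++)
            a[i] = i;

        /* a thread that finishes the first loop starts here at
           once, without waiting for the others; this is safe
           only because the two loops touch different arrays */
        #pragma omp for
        for (i = 0; i < N; i++)
            b[i] = 2 * i;
    }

    printf("a[%d] = %d, b[%d] = %d\n", N-1, a[N-1], N-1, b[N-1]);
    return 0;
}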
