Lab Mpi


Introduction to MPI Programming

Rocks-A-Palooza II
Lab Session

© 2006 UC Regents 1
Modes of Parallel Computing
 SIMD - Single Instruction Multiple Data
processors are “lock-stepped”: each processor executes the same
instruction in synchrony on different data
 SPMD - Single Program Multiple Data
each processor asynchronously runs its own copy of the same program
 MIMD - Multiple Instruction Multiple Data
processors run asynchronously: each processor has its own data
and its own instructions
 MPMD - Multiple Program Multiple Data

MPI in Parallel Computing
 MPI addresses the message-passing model of parallel computation
 Processes have separate address spaces
 Processes communicate by sending and receiving messages
 MPI is designed mainly for SPMD/MIMD programs on distributed-memory
parallel supercomputers
 Each process runs on a separate node
 Communication is over a high-performance switch
 Paragon, IBM SP2, Meiko CS-2, Thinking Machines CM-5, NCube-2,
and Cray T3D
 MPI can support the shared-memory programming model
 Multiple processes can read/write the same memory location
 SGI Onyx, Challenge, Power Challenge, Power Challenge Array, IBM
SMP, Convex Exemplar, and the Sequent Symmetry
 MPI exploits heterogeneous Networks of Workstations
 Sun, DEC, Hewlett-Packard, SGI, IBM, Intel Pentium (various
Linux OS)
What is MPI?
 Message Passing Interface: an application programmer interface (API) for message passing
 Designed to provide access to parallel hardware
• Clusters
• Heterogeneous networks
• Parallel computers
 Provides for development of parallel libraries
 Message passing
• Point-to-point message passing operations
• Collective (global) operations
 Additional services
• Environmental inquiry
• Basic timing info for measuring application performance
• Profiling interface for external performance monitoring
MPI advantages
 Mature and well understood
 Backed by a widely supported formal standard (standardization began in 1992; MPI 1.0 was published in 1994)
 Porting is “easy”
 Efficiently matches the hardware
 Vendor and public implementations available
 User interface:
 Efficient and simple (vs. PVM)
 Buffer handling
 Allow high-level abstractions
 Performance
MPI disadvantages
 MPI 2.0 includes many features beyond message passing
 Learning curve
 Execution control environment depends on implementation

MPI features
 Thread safety
 Point-to-point communication
 Modes of communication: standard, synchronous, ready, buffered
 Structured buffers
 Derived datatypes
 Collective communication
 Native built-in and user-defined collective operations
 Data movement routines
 Profiling
 Users can intercept MPI calls and call their own tools

Communication modes
 standard
 send has no guarantee that the corresponding receive routine has
started
 synchronous
 send and receive may start in either order but complete
together
 ready
 used for accessing fast protocols
 user guarantees that matching receive was posted
 use with care!
 buffered
 send may start and return before matching receive
 buffer space must be provided
Communication modes (cont’d)
 All routines are either
 Blocking - return when they are locally complete
• Send does not complete until the send buffer is empty (safe to reuse)
• Receive does not complete until the receive buffer is full
• Completion depends on
  • size of message
  • amount of system buffering
 Non-blocking - return immediately and allow the next statement
to execute
• Use to overlap communication and computation when the time to
send data between processes is large
• Immediately return a “request handle” that can be queried
and waited on
• Completion is detected by MPI_Wait() or MPI_Test()

Point-to-point vs. collective
 point-to-point, blocking MPI_Send/MPI_Recv
MPI_Send(start, count, datatype, dest, tag, comm)
MPI_Recv(start, count, datatype, source, tag, comm, status)
 simple but inefficient
 most work is done by process 0:
• Get data and send it to other processes (they idle)
• Maybe compute
• Collect output from the processes
 collective operations to/from all
MPI_Bcast(start, count, datatype, root, comm)
MPI_Reduce(start, result, count, datatype, operation, root, comm)
 called by all processes
 simple, compact, more efficient
 all processes must pass the same “count” and “datatype”
 “result” is significant only on the root process (here, process 0)
MPI complexity
 MPI’s extensive functionality is provided by many
(125+) functions
 Do I need them all?
 No need to learn them all to use MPI
 Can use just 6 basic functions
MPI_Init
MPI_Comm_size
MPI_Comm_rank
MPI_Send and MPI_Recv, or MPI_Bcast and MPI_Reduce
MPI_Finalize
 Flexibility: use more functions as required
To be or not to be an MPI user
 Use if:
Your data do not fit the data-parallel model
You need a portable parallel program
You are writing a parallel library
 Don’t use if:
You don’t need any parallelism
You can use existing libraries
You can use Fortran

Writing MPI programs
 provide basic MPI definitions and types
#include "mpi.h"
 start MPI
MPI_Init( &argc, &argv );
 provide local non-MPI routines
 exit MPI
MPI_Finalize();

see /opt/mpich/gnu/examples
/opt/mpich/gnu/share/examples
Compiling MPI programs
 From a command line:
 mpicc -o prog prog.c
 Use profiling options (specific to mpich)
 -mpilog Generate log files of MPI calls
 -mpitrace Trace execution of MPI calls
 -mpianim Real-time animation of MPI (not available on all
systems)
 --help Find list of available options
 Use a Makefile!
 get Makefile.in template and create Makefile
mpireconfig Makefile
 compile
make progName
Running an MPI program
 Depends on your implementation of MPI
 For mpich:
• mpirun -np 2 foo # run MPI program
 For lam:
• lamboot -v lamhosts # starts LAM
• mpirun -v -np 2 foo # run MPI program
• lamclean -v # rm all user processes
• mpirun … # run another program
• lamclean …
• lamhalt # stop LAM

Common MPI flavors on Rocks

MPI flavors path
/opt + MPI flavor + interconnect + compiler + bin/ + executable

 MPICH + Ethernet + GNU: /opt/mpich/ethernet/gnu/bin/…
 MPICH + Myrinet + GNU: /opt/mpich/myrinet/gnu/bin/…
 MPICH + Ethernet + INTEL: /opt/mpich/ethernet/intel/bin/…
 MPICH + Myrinet + INTEL: /opt/mpich/myrinet/intel/bin/…
 LAM + Ethernet + GNU: /opt/lam/ethernet/gnu/bin/…
 LAM + Myrinet + GNU: /opt/lam/myrinet/gnu/bin/…
 LAM + Ethernet + INTEL: /opt/lam/ethernet/intel/bin/…
 LAM + Myrinet + INTEL: /opt/lam/myrinet/intel/bin/…

C: mpicc   C++: mpiCC   F77: mpif77   F90: mpif90

What provides MPI

Example 1: LAM hello
Execute all commands as a regular user

1. Start ssh agent for key management
$ ssh-agent $SHELL
2. Add your keys
$ ssh-add
(at the prompt, give your ssh passphrase)
3. Make sure you have the right mpicc:
$ which mpicc
(output must be /opt/lam/gnu/bin/mpicc)
4. Create program source hello.c (listed below)

hello.c
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int myrank;

    MPI_Init(&argc, &argv);                  /* start MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* rank of this process */
    fprintf(stdout, "Hello World, I am process %d\n", myrank);
    MPI_Finalize();                          /* exit MPI */
    return 0;
}

Example 1 (cont’d)
5. compile
$ mpicc -o hello hello.c
6. create a machines file with the IP addresses of two nodes (use your own addresses here!)
198.202.156.1
198.202.156.2
7. start LAM
$ lamboot -v machines
8. run your program
$ mpirun -np 2 -v hello
9. clean after the run
$ lamclean -v
10. stop LAM
$ lamhalt

Example 1 output
$ ssh-agent $SHELL
$ ssh-add
Enter passphrase for /home/nadya/.ssh/id_rsa:
Identity added: /home/nadya/.ssh/id_rsa (/home/nadya/.ssh/id_rsa)
$ which mpicc
/opt/lam/gnu/bin/mpicc
$ mpicc -o hello hello.c
$ lamboot -v machines
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
n-1<27213> ssi:boot:base:linear: booting n0 (rocks-155.sdsc.edu)
n-1<27213> ssi:boot:base:linear: booting n1 (10.255.255.254)
n-1<27213> ssi:boot:base:linear: finished
$ mpirun -np 2 -v hello
27245 hello running on n0 (o)
7791 hello running on n1
Hello World, I am process 0
Hello World, I am process 1
$ lamclean -v
killing processes, done
closing files, done
sweeping traces, done
cleaning up registered objects, done
sweeping messages, done
$ lamhalt
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
Example 2: mpich cpi
1. set your ssh keys as in example 1 (if not done already)
$ ssh-agent $SHELL
$ ssh-add
2. copy example files to your working directory
$ cp /opt/mpich/gnu/examples/*.c .
$ cp /opt/mpich/gnu/examples/Makefile.in .
3. create Makefile
$ mpireconfig Makefile
4. make sure you have the right mpicc
$ which mpicc
If the output lists a path under /opt/lam…, update the path:
$ export PATH=/opt/mpich/gnu/bin:$PATH
5. compile your program
$ make cpi
6. run
$ mpirun -np 2 -machinefile machines cpi
or $ mpirun -nolocal -np 2 -machinefile machines cpi
Example 2 details
 If the machines file lists both the frontend and compute nodes, use
mpirun -np 2 -machinefile machines cpi
 If the machines file lists only compute nodes, use
mpirun -nolocal -np 2 -machinefile machines cpi

 -nolocal - don’t start the job on the frontend
 -np 2 - start 2 processes
 -machinefile machines - nodes are listed in the file named machines
 cpi - the program to start

More examples
 See CPU benchmark lab
 how to run linpack

 Additional examples in
 /opt/mpich/gnu/examples
 /opt/mpich/gnu/share/examples

Cleanup when an MPI Program Crashes
 MPICH in Rocks uses shared memory segments to pass
messages between processes on the same node
 When an MPICH program crashes, it doesn’t properly clean up
these shared memory segments
 After a program crash, run:
$ cluster-fork sh /opt/mpich/gnu/sbin/cleanipcs

 NOTE: this removes all shared memory segments for your user id
 If you have other live MPI programs running, this will remove
their shared memory segments too and cause those programs to
fail

Online resources
MPI standard:
www-unix.mcs.anl.gov/mpi
Local Area Multicomputer MPI (LAM MPI):
www.osc.edu/lam.html
MPICH:
www.mcs.anl.gov/mpi/mpich
Aggregate Function MPI (AFMPI):
garage.ecn.purdue.edu/~papers
LAM tutorial:
www.lam-mpi.org/tutorials/one-step/lam.php

Glossary
MPI - message passing interface
PVM - parallel virtual machine
LAM - local area multicomputer
P4 - 3rd generation parallel programming library, includes
message-passing and shared-memory components
Chameleon - high-performance portability package for
message passing on parallel supercomputers
Zipcode - portable system for writing of scalable libraries
ADI - abstract device interface

Glossary (cont’d)
SIMD - Single Instruction Multiple Data
SPMD - Single Program Multiple Data
MIMD - Multiple Instruction Multiple Data
MPMD - Multiple Program Multiple Data
