Question

How do I strace all processes of an MPI parallel job started with mpiexec (MPICH2, Linux)?

Using -o with a single file name will mix up the output from the different processes.

PS To editors who may think that MPICH is the name of the library and MPICH2 is just a particular version: MPICH2 is actually an all-new implementation of MPI, and I have sometimes had to use both mpich and mpich2. So we can't replace mpich2 with mpich.


Solution

Create a wrapper around your program, which will be launched by mpiexec. Something like:

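#!/bin/sh
# One log file per process: host name plus the wrapper's PID
LOGFILE="strace-$(hostname).$$"
# "$@" forwards any command-line arguments to the program
exec strace -o "$LOGFILE" my_mpi_program "$@"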
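
The wrapper above hardcodes the program name. As a minimal sketch of a more reusable variant, the snippet below writes a generic wrapper that traces whatever command is passed to it. The file name strace-wrapper.sh, the process count, and the program arguments are placeholders chosen for illustration, not anything from the original answer.

```shell
#!/bin/sh
# Write a hypothetical generic wrapper; the name strace-wrapper.sh is an assumption.
cat > strace-wrapper.sh <<'EOF'
#!/bin/sh
# Log name combines the host name and this shell's PID,
# so each MPI process writes to its own file.
LOGFILE="strace-$(hostname).$$"
# Trace the command given on the command line, forwarding its arguments unchanged.
exec strace -o "$LOGFILE" "$@"
EOF
chmod +x strace-wrapper.sh

# Example launch (not executed here); -n 4 and ./my_mpi_program are placeholders:
#   mpiexec -n 4 ./strace-wrapper.sh ./my_mpi_program arg1 arg2
```

Because the wrapper uses exec, strace replaces the shell, so the PID in the log name is that of the strace process rather than of the traced program. On a shared file system the host name keeps logs from different nodes apart.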

OTHER TIPS

You may want to try STAT (Stack Trace Analysis Tool). Check out the STAT Homepage. It gives you a high-level overview of your processes' behavior, and works especially well when a process has hung.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow