Question

Which profiler do you use for a Fortran code base that uses MPI? gprof doesn't seem to be working correctly. Sun Studio Analyzer only returns the timings for the C/C++ system calls, and none of the Fortran functions appear.

Solution

There are a number of performance analysis tools specialized for parallel/MPI programs, such as:

  • Score-P, which works with a number of different Analysis tools, e.g. Cube, Vampir
  • HPCToolkit uses sampling only, so you do not have to recompile your application
  • Tau

At first they may not be as simple to use, but they provide much more help in investigating the performance of parallel applications.

OTHER TIPS

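When the questioner says "gprof doesn't seem to be working correctly", perhaps he's referring to the fact that all N MPI processes write to the same gmon.out file and clobber one another's data. In that case, the (undocumented) GMON_OUT_PREFIX environment variable might make gprof more useful, because each process then writes its own file named after the prefix with its process ID appended: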

$ export GMON_OUT_PREFIX=gmon.out
$ mpiexec -np 4 cpi

Allinea MAP is a profiler that is simple and straightforward but very powerful.

It is designed to show the performance problems in Fortran, C and C++ MPI applications, and requires very little effort to get started and get profiling.

It is graphical and has an integrated source code browser that shows performance against lines of code. It can analyse bad MPI behaviour, poor work balance and poor vectorization.

I am one of the team behind the product, so am a little biased. It is commercial - there are evaluation licences available from the website.

gprof is a good profiler for Fortran and other languages built with the GNU compilers.

You can use Intel Trace Analyzer to profile MPI communication and Intel VTune to obtain a profile of a single MPI task. Both tools are extensively documented on the Intel website.

I would like to add two more profilers: (1) mpiP is a lightweight profiler that produces textual output but measures only MPI functions. (2) Scalasca produces more sophisticated output that can also point to synchronisation imbalances (late sender / late receiver), unlike TAU, which does not.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow