Difference between omp_get_wtime() and mpi_wtime() when using both MPI and shared memory parallelization

StackOverflow https://stackoverflow.com/questions/22697001

Question

I am using both OpenMPI and OpenMP (shared memory) to parallelize a piece of code. I am trying to time that code for benchmarking and speedup purposes, and I don't understand the differences between omp_get_wtime() and mpi_wtime().

Here is an outline of what my code does along with the architecture it is running on:

I have 8 nodes, each of which has 16 processors. I have 128 tasks to run.

There are two scenarios I am testing, and I want to know which one is faster.

First scenario: I give each node 1/8 of the tasks (using OpenMPI), and for each task the node uses its 16 processors (using OpenMP) to do the task. So Node0 will do Task0, when Task0 completes it will do Task8, when Task8 completes it will do Task16; simultaneously Node1 will do Task1, then Task9, then Task17, etc. until every task is complete. I want to know how long that entire process takes, and it would be nice to know how long each task takes to complete (on average).

Second scenario: I put each of the 128 tasks on a single processor (using MPI). I just want to know how long this takes to run, and again it would be nice to know how long each processor takes to finish its job.

Basically what I am doing here is comparing how MPI+OpenMP compares to just using MPI for my code.

Which timer should I be using, and what are the differences between the two?

Thanks!

Solution

There is no difference in principle between omp_get_wtime() and MPI_Wtime(). Both are wall-clock timers with sub-second precision that return the time elapsed since some arbitrary point in the past. Both are also used the same way: call them twice and subtract the values, e.g.:

double tmr = omp_get_wtime();
...
// routine to be timed
...
tmr = omp_get_wtime() - tmr;
// tmr now holds the elapsed time in seconds

or:

double tmr = MPI_Wtime();
...
// routine to be timed
...
tmr = MPI_Wtime() - tmr;
// tmr now holds the elapsed time in seconds

Both functions are local, i.e. they measure the time on the node where the calling code executes. MPI_Wtime() is allowed to be synchronised across all nodes (the MPI_WTIME_IS_GLOBAL attribute on MPI_COMM_WORLD reports whether it is), but few implementations do so. In all other respects the two should be roughly equivalent and provide similar precision; on most platforms both are implemented using the same OS-specific timer routines. There is absolutely no guarantee that both timers have the same reference point in the past, so one should not mix them. For example, the following is not valid code:

double tmr = MPI_Wtime();
...
tmr = omp_get_wtime() - tmr;
// tmr now holds the elapsed time in seconds + possibly a constant difference

I would prefer MPI_Wtime() because omp_get_wtime() depends on OpenMP being enabled. If you want your program to compile both as pure MPI and as hybrid MPI+OpenMP, it is a good idea to have few (or no) calls to the OpenMP runtime library; otherwise you have to provide stub implementations for the case when OpenMP is not enabled. Of course, a stub omp_get_wtime() implementation for such codes takes just one preprocessor macro:

#define omp_get_wtime MPI_Wtime
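
A variation on the same idea is to hide the choice of timer behind a single wrapper, so the rest of the code never names either library. This is only a sketch under stated assumptions: wall_time() is an illustrative name, HAVE_MPI is a hypothetical macro that the build system would have to define for MPI builds, and the last branch swaps in the POSIX clock_gettime() routine for builds that have neither MPI nor OpenMP.

```c
/* Request the POSIX clock_gettime() declaration in strict C modes. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif

#if defined(HAVE_MPI)      /* hypothetical macro, assumed to be set by the build system */
#include <mpi.h>
#elif defined(_OPENMP)     /* defined by the compiler when OpenMP is enabled */
#include <omp.h>
#else
#include <time.h>
#endif

/* Returns wall-clock time in seconds since an arbitrary reference point.
 * Only differences between two calls on the same process are meaningful. */
static double wall_time(void)
{
#if defined(HAVE_MPI)
    return MPI_Wtime();
#elif defined(_OPENMP)
    return omp_get_wtime();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}
```

Because the timer is selected once at compile time, every call goes through the same clock and the mixing problem described above cannot occur. It is used exactly like the timers above: double tmr = wall_time(); then the routine to be timed; then tmr = wall_time() - tmr;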
Licensed under: CC-BY-SA with attribution