A bit too much code to fit in a comment:
I'd suggest just doing this as a single MPI_Allgatherv():

    // Displacement of each rank's block within the gathered array
    std::vector<int> disps(n_proc);
    disps[0] = 0;
    for (int i = 1; i < n_proc; i++)
        disps[i] = disps[i-1] + points_per_proc[i-1];

    // Total number of points across all ranks
    int totdata = disps[n_proc-1] + points_per_proc[n_proc-1];
    std::vector<double> temp(totdata);

    // Every rank sends its own points and receives everyone else's
    MPI_Allgatherv(&Xcoord_top[my_rank][0], (int)Xcoord_top[my_rank].size(),
                   MPI_DOUBLE, &temp[0], &points_per_proc[0], &disps[0],
                   MPI_DOUBLE, MPI_COMM_WORLD);
and now the data contributed by rank i is in temp[disps[i]] ... temp[disps[i] + points_per_proc[i] - 1].
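For example, to walk over just the points that rank i contributed (process_point() here is a hypothetical placeholder for whatever you do with each value):

    for (int j = disps[i]; j < disps[i] + points_per_proc[i]; j++)
        process_point(temp[j]);   // hypothetical per-point operation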
There are at least three problems with the code as originally posted:
- It could well deadlock: MPI_Send() is allowed to block until the message has been received. That can be fixed by using non-blocking sends, e.g. MPI_Isend() followed by MPI_Waitall(), rather than MPI_Send(); see the sketch after this list.
- It will almost certainly process the receives out of order: there is no guarantee that the ith iteration receives from the ith processor, so the message lengths may be wrong, producing an error that aborts the program. That can be fixed by specifying the source as rank i rather than MPI_ANY_SOURCE.
- It is inefficient, using linear point-to-point sends and receives instead of optimized collectives like broadcasts or gathers. That can be fixed by using a collective, such as the allgather above.