Question

The following routine communicates ghost points to the top/bottom and left/right neighbors. It is called inside the loop of an iterative method, several hundred times per run.

The problem is that, although it is written with non-blocking communication, it deadlocks. The strange thing is that it runs fine for several iterations and then suddenly freezes. If I comment out the second communication loop (top/bottom), it still freezes, only at a later iteration index. It behaves as if there were a maximum number of communications allowed, or something like that.

I thought no deadlock should occur with MPI_Isend and MPI_Irecv. I also took care, I think, not to touch the buffers before the call to MPI_Waitall().

Am I using the MPI routines incorrectly?

void fields_updateEghosts(Fields *this, Params *parameters)
{


double *ex_in[2], *ex_out[2], *ey_in[2], *ey_out[2];
int neighbors_in[2], neighbors_out[2], ineighb;
MPI_Request requests_lr[4], requests_tb[4];
int ix,iy,nx,ny,i;
YeeCell *yee;

yee = this->yeecell;
nx  = this->nx;
ny  = this->ny;

/* 0,1 = top/bottom for Ey and left/right for Ex*/
for (i=0; i < 2; i++)
{
    MEM_ALLOC(ex_in[i],  ny*sizeof *ex_in[0]);
    MEM_ALLOC(ex_out[i], ny*sizeof *ex_out[0]);
    MEM_ALLOC(ey_in[i],  nx*sizeof *ey_in[0]);
    MEM_ALLOC(ey_out[i], nx*sizeof *ey_out[0]);
}

/* we send the points just inside the boundary */
for (iy=1; iy < ny; iy++)
{
    ex_out[PART_LEFT][iy]  = ex(1   ,iy);
    ex_out[PART_RIGHT][iy] = ex(nx-2,iy);
}

neighbors_in[0]  = PART_LEFT;
neighbors_in[1]  = PART_RIGHT;
neighbors_out[0] = PART_RIGHT;
neighbors_out[1] = PART_LEFT;

for (ineighb=0; ineighb < 2; ineighb++)
{
    MPI_Irecv(ex_in[neighbors_in[ineighb]],
              ny, MPI_DOUBLE,
              parameters->para->neighbors[neighbors_in[ineighb]], /*src rank */
              neighbors_out[ineighb],                           /* tag */
              MPI_COMM_WORLD,
              &requests_lr[ineighb]);

    MPI_Isend(ex_out[neighbors_out[ineighb]],
              ny, MPI_DOUBLE,
              parameters->para->neighbors[neighbors_out[ineighb]],
              neighbors_out[ineighb],
              MPI_COMM_WORLD,
              &requests_lr[ineighb+2]);
}

/* fill the outgoing top and bottom buffers
   while left/right communications are done*/
for (ix=1; ix < nx; ix++)
{
    ey_out[PART_TOP][ix] = ey(ix,ny-2);
    ey_out[PART_BOT][ix] = ey(ix,1);
}


/* now communications for top/bottom */
neighbors_in[0]  = PART_TOP;
neighbors_in[1]  = PART_BOT;
neighbors_out[0] = PART_BOT;
neighbors_out[1] = PART_TOP;

for (ineighb=0; ineighb < 2; ineighb++)
{
    MPI_Irecv(ey_in[neighbors_in[ineighb]],
              nx, MPI_DOUBLE,
              parameters->para->neighbors[neighbors_in[ineighb]],
              neighbors_out[ineighb],
              MPI_COMM_WORLD,
              &requests_tb[ineighb]);

    MPI_Isend(ey_out[neighbors_out[ineighb]],
              nx, MPI_DOUBLE,
              parameters->para->neighbors[neighbors_out[ineighb]],
              neighbors_out[ineighb],
              MPI_COMM_WORLD,
              &requests_tb[ineighb+2]);
}

/* now wait for communications to be done
   before copying the data into the arrays */

/* MPI_Waitall takes an array of statuses, so the ignore
   constant is MPI_STATUSES_IGNORE, not MPI_STATUS_IGNORE */
MPI_Waitall(4, requests_lr, MPI_STATUSES_IGNORE);
MPI_Waitall(4, requests_tb, MPI_STATUSES_IGNORE);


for (iy=1; iy < ny; iy++)
{
    ex(0   ,iy) = ex_in[PART_LEFT][iy];
    ex(nx-1,iy) = ex_in[PART_RIGHT][iy];
}

for (ix=1; ix < nx; ix++)
{
    ey(ix,ny-1) = ey_in[PART_TOP][ix];
    ey(ix,0)    = ey_in[PART_BOT][ix];
}



for (i=0; i < 2; i++)
{
    MEM_FREE(ex_in[i]);
    MEM_FREE(ex_out[i]);
    MEM_FREE(ey_in[i]);
    MEM_FREE(ey_out[i]);
}
}

Solution

I found the answer, and I will explain it here in case someone runs into the same problem. First, the function above is fine; as far as I can tell there is no problem with it. The issue was in the iterative method that calls this routine to get the ghost node values. The iterative method has a convergence criterion, which I forgot to compute over the global domain. As a result, some processes satisfied the convergence test before the others and exited the loop, leaving the remaining processes waiting forever for their buddies. A small MPI_Allreduce() in the computation of the convergence criterion, and there is no blocking anymore.
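To illustrate the fix, here is a minimal sketch (not the poster's actual solver) of a global convergence test: each rank combines its local residual into a global one with MPI_Allreduce, so every rank evaluates the same test and leaves the loop at the same iteration. The names local_residual(), tol, and the loop structure are hypothetical stand-ins; in the real code the loop body would call fields_updateEghosts() and the relaxation sweep.

#include <mpi.h>
#include <stdio.h>

/* hypothetical placeholder: pretend ranks converge at different speeds */
static double local_residual(int rank, int it)
{
    return 1.0 / ((double)(it + 1) * (double)(rank + 1));
}

int main(int argc, char **argv)
{
    int rank, it;
    const double tol = 1e-3;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (it = 0; it < 10000; it++)
    {
        /* ...exchange ghost points, relax the fields, etc... */

        double local_res  = local_residual(rank, it);
        double global_res = 0.0;

        /* Without this reduction each rank would test only its own
           residual, possibly exit early, and leave its neighbors
           blocked in the ghost exchange of the next iteration. */
        MPI_Allreduce(&local_res, &global_res, 1, MPI_DOUBLE,
                      MPI_MAX, MPI_COMM_WORLD);

        if (global_res < tol)
            break;   /* every rank takes this branch at the same iteration */
    }

    printf("rank %d stopped at iteration %d\n", rank, it);
    MPI_Finalize();
    return 0;
}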

Licensed under: CC-BY-SA with attribution