Question

I am trying to implement a tournament barrier using MPI. Here is the code I have written. I am showing only the arrival phase and the wakeup phase.

//Arrival phase
while(1)
{
    if((!strcmp(round[my_id][round_num].role,"winner"))||(!strcmp(round[my_id][round_num].role,"champion")))
    {
        printf("%d is the winner of round %d\n",my_id,round_num);
        MPI_Recv(&reach_msg, sizeof(reach_msg), MPI_BYTE, round[my_id][round_num].opponent, tag, MPI_COMM_WORLD, &status);
        printf("%d received: %s\n",my_id,reach_msg);
    }
    else if(!strcmp(round[my_id][round_num].role,"loser"))
    {
        printf("%d is the loser of round %d\n",my_id,round_num);
        sprintf(reach_msg,"%d arrived at the barrier",my_id);
        MPI_Send(reach_msg,strlen(reach_msg+1),MPI_BYTE,round[my_id][round_num].opponent,tag,MPI_COMM_WORLD);
        MPI_Recv(wakeup_msg,sizeof(wakeup_msg),MPI_BYTE,round[my_id][round_num].opponent,tag,MPI_COMM_WORLD,&status);
        printf("%d received: %s\n",my_id,wakeup_msg);
    }

    if(round_num==num_rounds)
        break;
    else
        round_num++;
}

printf("%d is out of arrival tree\n",my_id);

//wakeup tree
while(1)
{
    printf("%d prints: round num is: %d\n",my_id,round_num);
    if(round_num==0)
        break;
    sprintf(wakeup_msg,"wakeup msg from %d of %d",my_id,P);

    if((!strcmp(round[my_id][round_num].role,"winner"))||(!strcmp(round[my_id][round_num].role,"champion")))
        MPI_Send(wakeup_msg,strlen(wakeup_msg+1),MPI_BYTE,round[my_id][round_num].opponent,tag,MPI_COMM_WORLD);
    round_num--;
}

MPI_Finalize();
return 0;
}

I don't understand why race conditions occur. I believe MPI_Send and MPI_Recv are blocking functions, but sometimes they don't behave that way.

EDIT: Here is a sample output where the race condition happens. As you can see, 0 received the message ("1 arrived at the barrier") from 1 even before 1 sent it.

0 is the winner of round 1
0 received: 1 arrived at the barrie
0 is the winner of round 2
1 is the loser of round 1
1 sending reach msg
2 is the winner of round 1
2 received: 3 arrived at the barrie
2 is the loser of round 2
2 sending reach msg
3 is the loser of round 1
3 sending reach msg
0 received: 2 arrived at the barrie
0 sending wakeup msg
0 sending wakeup msg
1 received: wakeup msg from 0 with tag at round 1
2 received: wakeup msg from 0 with tag at round 2
2 sending wakeup msg
3 received: wakeup msg from 2 with tag at round 1


Solution

For debugging MPI programs, print statements are generally not a very good tool. The output from the different nodes has to be sent over the network to the console and is buffered along the way, so lines can show up in a different order from the one in which they were printed.

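To debug it, have each process write to its own output file. Alternatively, let only one process print, and disable buffering for that output, either by writing to stderr (std::cerr in C++) or by turning off buffering for printf, for example with setvbuf on stdout or by calling fflush(stdout) after each printf.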
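As a rough sketch of the one-file-per-process idea: the helper names rank_log_name and open_rank_log and the file name pattern rank_<id>.log are made up for illustration, and the rank is passed in as a plain int (in a real program it would come from MPI_Comm_rank).

```c
#include <stdio.h>

/* Hypothetical helper: builds the name "rank_<id>.log" for a given rank.
   Returns whatever snprintf returns (the length of the full name). */
int rank_log_name(char *buf, size_t len, int rank)
{
    return snprintf(buf, len, "rank_%d.log", rank);
}

/* Hypothetical helper: opens this rank's private log file and switches
   off stdio buffering, so every fprintf is written out immediately.
   Returns NULL if the file cannot be opened. */
FILE *open_rank_log(int rank)
{
    char name[64];
    FILE *f;

    rank_log_name(name, sizeof name, rank);
    f = fopen(name, "w");
    if (f != NULL)
        setvbuf(f, NULL, _IONBF, 0);  /* must precede any other I/O on f */
    return f;
}
```

Each process would then call something like fprintf(log, "%d is the loser of round %d\n", my_id, round_num) on its own file instead of printf, and close the file before MPI_Finalize.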

OTHER TIPS

In my experience, you most likely have a problem of observation, not a problem in your underlying algorithm. In a setting like this the printf output usually just arrives out of order. You'd have to

  • put time stamps in your output
  • write to different files, one per MPI process
  • merge them back together by sorting according to your time stamp
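The merge step could look roughly like the sketch below. The log_entry struct and the merge_logs and format_entry functions are invented for this example; the time stamp is a plain double, which in a real run might be the value returned by MPI_Wtime. Keep in mind that time stamps taken on different nodes are only comparable if the clocks are synchronized, so treat the merged order as approximate.

```c
#include <stdio.h>
#include <stdlib.h>

/* One log record: when it was written, by which rank, and the text. */
typedef struct {
    double time;       /* assumed to come from e.g. MPI_Wtime() */
    int rank;
    const char *text;
} log_entry;

/* qsort comparator: order by time stamp, ties broken by rank. */
static int by_time(const void *a, const void *b)
{
    const log_entry *x = a;
    const log_entry *y = b;

    if (x->time < y->time) return -1;
    if (x->time > y->time) return 1;
    return (x->rank > y->rank) - (x->rank < y->rank);
}

/* Sorts the combined entries of all per-process logs into time order. */
void merge_logs(log_entry *entries, size_t n)
{
    qsort(entries, n, sizeof entries[0], by_time);
}

/* Formats one entry as "<time> [<rank>] <text>". */
int format_entry(char *buf, size_t len, const log_entry *e)
{
    return snprintf(buf, len, "%.6f [%d] %s", e->time, e->rank, e->text);
}
```

With the per-process files read into one array, a single merge_logs call followed by printing each entry gives one timeline, which is much easier to read than interleaved console output.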

I am not sure I fully understand your problem... adding the output of the code to your question might help.

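What I can say for sure is that MPI_Send and MPI_Recv are blocking functions. Note, though, that MPI_Send only blocks until the send buffer can safely be reused; for small messages it may return before the matching receive has been posted. Have you tried using the non-blocking functions instead (i.e. MPI_Isend and MPI_Irecv)? If so, did that solve your problem?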
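One more thing, separate from the ordering question: the truncated "barrie" in your sample output looks consistent with the count you pass to MPI_Send. In strlen(reach_msg+1) the +1 is applied to the pointer, not to the length. A small sketch of the difference, with two function names I made up for the comparison:

```c
#include <string.h>

/* Count as written in the question: the +1 moves the pointer past the
   first character, so the result is one less than the string length. */
size_t count_as_posted(const char *msg)
{
    return strlen(msg + 1);
}

/* Count that covers the whole string plus its terminating '\0'. */
size_t count_with_terminator(const char *msg)
{
    return strlen(msg) + 1;
}
```

For "1 arrived at the barrier" (24 characters) the first version gives 23 and the second 25, so with the first one the final 'r' and the terminator are never sent.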

Licensed under: CC-BY-SA with attribution