Question

I have the following code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

static int rank, size;

char msg[] = "This is a test message";

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 2) {
        fprintf(stderr, "This test requires exactly 2 tasks (has: %d).\n", size);
        MPI_Finalize();
        return -1;
    }

    int run = 1;
    if (argc > 1) {
        run = atoi(argv[1]);
    }

    int len = strlen(msg) + 1;
    if (argc > 2) {
        len = atoi(argv[2]);
    }

    char buf[len];

    strncpy(buf, msg, len);

    MPI_Status statusArray[run];

    MPI_Request reqArray[run];


    double start = MPI_Wtime();

    for (int i = 0; i < run; i++) {
        if (!rank) {
          MPI_Isend(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &reqArray[i]);
          printf("mpi_isend for run %d\n", i);
        } else {
          MPI_Irecv(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &reqArray[i]);
          printf("mpi_irecv for run %d\n", i);
        }
    }
    int buflen = 512;
    char name[buflen];
    gethostname(name, buflen);
    printf("host: %s has rank %d\n", name, rank);
    printf("Reached here! for host %s before MPI_Waitall \n", name);
    if(!rank) {
      printf("calling mpi_waitall for sending side which is %s\n", name);
      MPI_Waitall(run, &reqArray[0], &statusArray[0]);
    }
    else {
      printf("calling mpi_waitall for receiving side which is %s\n", name);
      MPI_Waitall(run, &reqArray[0], &statusArray[0]);
    }
    printf("finished waiting! for host %s\n", name);
    double end = MPI_Wtime();
    if (!rank) {
      printf("Throughput: %.4f Gbps\n", 1e-9 * len * 8 * run / (end - start));
    }

    MPI_Finalize();
}

I got a seg-fault on the sending side before MPI_Waitall. The error message is:

[host1:27679] *** Process received signal ***
[host1:27679] Signal: Segmentation fault (11)
[host1:27679] Signal code: Address not mapped (1)
[host1:27679] Failing at address: 0x8
[host1:27679] [ 0] /lib64/libpthread.so.0() [0x3ce7e0f500]
[host1:27679] [ 1] /usr/lib64/openmpi/mca_btl_openib.so(+0x21dc7) [0x7f46695c1dc7]
[host1:27679] [ 2] /usr/lib64/openmpi/mca_btl_openib.so(+0x1cbe1) [0x7f46695bcbe1]
[host1:27679] [ 3] /lib64/libpthread.so.0() [0x3ce7e07851]
[host1:27679] [ 4] /lib64/libc.so.6(clone+0x6d) [0x3ce76e811d]
[host1:27679] *** End of error message ***

I think there is something wrong with the array of MPI_Request. Could someone point it out? Thanks!

Solution

I ran your program without a problem (other than a compiler warning because unistd.h, which declares gethostname, is not included). The problem is probably related to your Open MPI setup: the backtrace shows the crash inside mca_btl_openib.so, which is the InfiniBand transport. Are you using a machine with an InfiniBand network? If not, you probably want to switch to the default tcp transport.

To restrict Open MPI to tcp only, run the program like this:

mpirun --mca btl tcp,self -n 2 <prog_name> <prog_args>

That will ensure that openib isn't accidentally detected and used when it shouldn't be.
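
A few related commands may help narrow this down. This is only a sketch: `./mpi_test` and its argument `10` are placeholder names for your own binary and run count, and the output of `ompi_info` depends on how your Open MPI was built.

```shell
# List the BTL components this Open MPI build provides (look for openib).
ompi_info | grep btl

# Exclude only openib and let Open MPI choose among the remaining transports.
mpirun --mca btl ^openib -n 2 ./mpi_test 10

# Same tcp,self selection as the command above, set through the environment.
export OMPI_MCA_btl=tcp,self
mpirun -n 2 ./mpi_test 10
```

If the seg-fault disappears with openib excluded, that points at the InfiniBand setup rather than at the request array in your code.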

If, on the other hand, you do mean to use InfiniBand, you may have hit a bug in Open MPI. I doubt that's the case, though, since you're not doing anything fancy.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow