Question

I am writing a very small program to understand MPI (MPICH implementation) and Fortran 90. Unfortunately the code is not running properly when executed with "-np 2".

This is the code:

PROGRAM main
    USE MPI
    IMPLICIT none

    INTEGER :: ierr, npe, mynpe
    INTEGER :: istatus(MPI_STATUS_SIZE)
    REAL :: aa

    CALL MPI_INIT(ierr)
    CALL MPI_Comm_size(MPI_COMM_WORLD, npe, ierr)
    CALL MPI_Comm_rank(MPI_COMM_WORLD, mynpe, ierr)

    IF (mynpe == 0) THEN
        READ(*,*) aa
        CALL MPI_Send(aa, 1, MPI_REAL, 1, 99, MPI_COMM_WORLD, ierr)
    ELSE IF (mynpe == 1) THEN
        CALL MPI_Recv(aa, 1, MPI_REAL, 0, 99, MPI_COMM_WORLD, istatus, ierr)
        WRITE(*,*) "Ho ricevuto il numero ", aa
    END IF

    CALL MPI_FINALIZE(ierr)
END PROGRAM

I am compiling it with mpif90 mpi_2.f90 -o output, and when I execute it with mpirun -np 2 output I get the following error:

At line 14 of file mpi_2.f90 (unit = 5, file = 'stdin')
Fortran runtime error: End of file

The shell still waits for input, and if I enter a number (e.g. 11) I get the following output:

11
Fatal error in MPI_Send: Invalid rank, error stack:
MPI_Send(173): MPI_Send(buf=0xbff4783c, count=1, MPI_REAL, dest=1, tag=99, MPI_COMM_WORLD) failed
MPI_Send(98).: Invalid rank has value 1 but must be nonnegative and less than 1
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

Thank you for all the help!

Solution

Two different MPI implementations are being mixed in your case. The run-time MPI environment comes from a different implementation than the one used to compile the program, and therefore both processes behave as MPI singletons, i.e. each of them forms its own separate MPI_COMM_WORLD communicator and becomes rank 0 in it. As a result, the first branch of the conditional executes in both processes. At the same time, mpirun redirects standard input to the first process only, while all the others get their standard input closed or connected to /dev/null, which is why the second process hits the end-of-file error at its READ statement. MPI_Send fails for the same underlying reason: in the singleton universe of each MPI process there is no rank 1, so dest=1 is out of range.
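
To make the singleton behaviour visible, a minimal diagnostic sketch like the one below can help (it is just the size/rank part of your program, compiled and launched with the same mpif90/mpirun pair; the program name check_world is only for illustration):

PROGRAM check_world
    USE MPI
    IMPLICIT none

    INTEGER :: ierr, npe, mynpe

    CALL MPI_INIT(ierr)
    CALL MPI_Comm_size(MPI_COMM_WORLD, npe, ierr)
    CALL MPI_Comm_rank(MPI_COMM_WORLD, mynpe, ierr)

    ! With a consistent installation and "mpirun -np 2" this prints
    ! size 2 with ranks 0 and 1; with mismatched mpif90/mpirun each
    ! process runs as a singleton and prints size 1, rank 0.
    WRITE(*,*) "size ", npe, " rank ", mynpe

    CALL MPI_FINALIZE(ierr)
END PROGRAM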

The most frequent cause of such behaviour is that mpirun and mpif90 come from different MPI libraries. In your case you have MPICH mixed with Open MPI. Indeed, the following error message:

MPI_Send(173): MPI_Send(buf=0xbff4783c, count=1, MPI_REAL, dest=1, tag=99, MPI_COMM_WORLD) failed
MPI_Send(98).: Invalid rank has value 1 but must be nonnegative and less than 1

is in the error format of MPICH. Therefore mpif90 comes from MPICH.

But the next error message:

--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

is in the error format used by the OpenRTE framework of Open MPI. Therefore mpirun comes from Open MPI and not from MPICH.

This can happen if you have installed a development package for MPICH, so that it provides mpicc, mpif90, and so on, but then installed a run-time package for Open MPI. Make sure that only packages from a single MPI implementation are installed. If you have compiled MPICH from source, make sure the directory with its binaries comes first in $PATH.
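
If you want to double-check from inside the program which library the executable is actually linked against, a small sketch along these lines can help, assuming an MPI-3 library (recent MPICH and Open MPI both provide MPI_GET_LIBRARY_VERSION; the program name which_mpi is arbitrary):

PROGRAM which_mpi
    USE MPI
    IMPLICIT none

    CHARACTER(LEN=MPI_MAX_LIBRARY_VERSION_STRING) :: version
    INTEGER :: ierr, reslen

    CALL MPI_INIT(ierr)
    ! Prints the library the binary was linked against, e.g. a string
    ! beginning with "MPICH Version: ..." or "Open MPI v...".
    CALL MPI_GET_LIBRARY_VERSION(version, reslen, ierr)
    WRITE(*,*) version(1:reslen)
    CALL MPI_FINALIZE(ierr)
END PROGRAM

If this reports MPICH while your mpirun identifies itself as Open MPI (mpirun --version), the mismatch is confirmed.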

Licensed under: CC-BY-SA with attribution