Question

I am using MVAPICH on a supercomputing cluster (PBS environment). I need MPI_THREAD_MULTIPLE support to run my program, but its output suggests that MPI_Init_thread failed to enable MPI_THREAD_MULTIPLE.

The PBS script is:

#!/bin/sh
APP_NAME=score
NP=2
NP_PER_NODE=1
RUN="RAW"

rm -f hosts.list
for i in `echo $LSB_HOSTS`; do
echo $i >>hosts.list
done

/home/compiler/mpi/mvapich/1.0/icc.ifort-9.1/bin/mpirun_rsh -np 2 -hostfile ./hosts.list MV2_ENABLE_AFFINITY=0 /home/users/simmykq/users/zhengyuan/mpi_parallel_framework/master_slave/exe_framework

(The last line is the command that launches the executable.)

My program looks like this:

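#include <mpi.h>
#include <pthread.h>
#include <stdio.h>

int main(int argc,char *argv[])
{
        int p,id;
        int t;
        int provided;   /* thread level actually granted by the library */
        pthread_t tid[4];

        /* Request full multithreading; the library reports what it can give. */
        MPI_Init_thread(&argc,&argv,MPI_THREAD_MULTIPLE,&provided);
        if(provided!=MPI_THREAD_MULTIPLE)
        {
                printf("MPI cannot support mutiple\n");
                /* Abort with a nonzero code so the failure is visible to the launcher. */
                MPI_Abort(MPI_COMM_WORLD,1);
        }
   //...........
}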

The output is:

Sender: LSF System <lsfadmin@a328>
Subject: Job 2958650: <t> Exited

Job <t> was submitted from host <inode01> by user <simmykq> in cluster <MagicCube_SC1>.
Job was executed on host(s) <1*a328>, in queue <score>, as user <simmykq> in cluster <MagicCube_SC1>.
                            <1*a215>
</home/users/simmykq> was used as the home directory.
</home/users/simmykq/users/zhengyuan/mpi_parallel_framework/master_slave> was used as the working directory.
Started at Sun Mar 16 12:51:09 2014
Results reported at Sun Mar 16 12:51:36 2014

Your job looked like:

------------------------------------------------------------
# LSBATCH: User input
./test2.lsf
------------------------------------------------------------

Exited with exit code 1.

Resource usage summary:

    CPU time   :      0.57 sec.

The output (if any) follows:

Exit code -3 signaled from a328
MPI cannot support mutiple
MPI cannot support mutiple
Killing remote processes...[0] [MPI Abort by user] Aborting Program!
[1] [MPI Abort by user] Aborting Program!
Abort signaled by rank 0: MPI Abort by user Aborting program !
Abort signaled by rank 1: MPI Abort by user Aborting program !
MPI process terminated unexpectedly
MPI process terminated unexpectedly
DONE
Signal 15 received.
Signal 15 received.

Thanks for any hints. :)

Solution

In your code you call MPI_Init_thread requesting MPI_THREAD_MULTIPLE, but the call returns a provided level that is not equal to MPI_THREAD_MULTIPLE:

 MPI_Init_thread(&argc,&argv,MPI_THREAD_MULTIPLE,&provided);
 if(provided!=MPI_THREAD_MULTIPLE)

This means that the MPI library you have installed does not support MPI_THREAD_MULTIPLE. You need to rebuild the library, or install a version of it, configured with MPI_THREAD_MULTIPLE support. In MPICH2, for example, only two data transport layers supported MPI_THREAD_MULTIPLE: nemesis and sock. I don't know about MVAPICH, but check its configure parameters and configure output.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow