Question

While learning MPI using MPICH on Windows (1.4.1p1), I found some sample code here. Originally, when I ran the server, I had to copy the generated port_name and start the client with it so that the client could connect to the server. I modified the server to call MPI_Publish_name() instead. After launching the server with a service name of aaaa, I launch the client, which fails in MPI_Lookup_name() with

Invalid service name (see MPI_Publish_name), error stack:
MPID_NS_Lookup(87): Lookup failed for service name aaaa

Here are the snipped bits of code:

server.c

MPI_Comm client; 
MPI_Status status; 
char port_name[MPI_MAX_PORT_NAME];
char serv_name[256];
double buf[MAX_DATA]; 
int size, again; 
int res = 0;

MPI_Init( &argc, &argv ); 
MPI_Comm_size(MPI_COMM_WORLD, &size); 
MPI_Open_port(MPI_INFO_NULL, port_name);
sprintf(serv_name, "aaaa");
MPI_Publish_name(serv_name, MPI_INFO_NULL, port_name);

while (1) 
{ 
    MPI_Comm_accept( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client );
    /*...snip...*/
}

client.c

MPI_Comm server; 
double buf[MAX_DATA]; 
char port_name[MPI_MAX_PORT_NAME]; 
memset(port_name,'\0',MPI_MAX_PORT_NAME);
char serv_name[256];
memset(serv_name,'\0',256);

strcpy(serv_name, argv[1]);
MPI_Lookup_name(serv_name, MPI_INFO_NULL, port_name);
MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server ); 
MPI_Send( buf, 0, MPI_DOUBLE, 0, tag, server ); 
MPI_Comm_disconnect( &server ); 
MPI_Finalize(); 
return 0; 

I cannot find any information about changing the visibility of published names, if that is even the problem. MPICH does not seem to have implemented anything for MPI_INFO. I would try Open MPI, but I am having trouble just building it. Any suggestions?

Solution 3

This approach of publishing names, looking them up, and connecting to them is unusual compared with typical MPI usage.

The standard pattern is to use mpirun to specify a set of nodes on which to launch a given number of processes. How common mpirun implementations operate is explained in another question.

Once all the processes have been launched as part of a single parallel job, the MPI library, during MPI_Init, reads whatever information the launcher provided and uses it to set up MPI_COMM_WORLD, a communicator over the group of all processes in the job.

Using that communicator, the parallel application can distribute work and exchange information among its processes. It does this with the common MPI_Send and MPI_Recv routines in all their variants, as well as with the collective operations.

Other tips

I uploaded a working client and server written in C, built with OpenMPI 1.6.5 on Ubuntu, that use the ompi-server name server. It is available here:

OpenMPI nameserver client server example in C

(digging up old stuff)
For MPICH, the code by @daemondave should actually work as well. It does, however, still require a running name server. For MPICH, this can be done with hydra_nameserver instead of ompi-server. The name server's host then has to be passed to every mpirun/mpiexec call using -nameserver HOSTNAME.

I created a working example over at GitHub, which also provides a shell script to build and run it.

P.S.: the ompi-server variant seems to be somewhat outdated (and includes a few bugs). For an updated, though still somewhat undocumented, alternative, see this comment.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow