Is it possible to distribute an MPI (C++) program across the internet rather than within a LAN cluster?

StackOverflow https://stackoverflow.com/questions/3062734

Question

I've written some MPI code which works flawlessly on large clusters. Each node in the cluster has the same CPU architecture and has access to a networked (i.e. 'common') file system (so that each node can execute the actual binary). But consider this scenario:

  • I have a machine in my office with a dual-core processor (Intel).
  • I have a machine at home with a dual-core processor (AMD).

Both machines run Linux, and both machines can successfully compile and run the MPI code locally (i.e. using 2 cores).

Now, is it possible to link the two machines together via MPI, so that I can utilise all 4 cores, bearing in mind the different architectures, and bearing in mind the fact that there are no shared (networked) filesystems?

If so, how?

Thanks, Ben.


Solution

It's possible to do this. Most MPI implementations allow you to specify the location of the binary to be run on each machine. Alternatively, make sure that the binary is in your PATH on both machines. Since both machines have the same byte order (Intel and AMD x86 processors are both little-endian), the different CPU vendors shouldn't be a problem. You will also have to make sure that any input data that the individual processes read is available in both locations.

There are lots of complications with doing this. You need to make sure that the firewalls between the systems allow process startup and communication. Communication between the machines is going to be much slower than within a single machine or a LAN cluster, so if your code is communication-heavy or latency-intolerant, it will probably be quite slow. Most likely your execution time running on all 4 cores will be longer than just running with 2 on a single machine.

OTHER TIPS

There is no geographical limitation on where the processes are located. And as KeithB said, there is no need to have a common path or even the same binary on both machines. Depending on which MPI implementation you are using, you don't even need the same endianness.

You can specify the exact path to the binary on each machine, and the two binaries can be built independently. However, you should note that the program will run slowly if the communication infrastructure between the two nodes is not fast enough.

Licensed under: CC-BY-SA with attribution