Running on multiple cores using MPI

https://stackoverflow.com/questions/6060836

15-11-2019

Question

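I currently submit MPI jobs with this command: mpirun -np <no. of processors> <filename>. My understanding is that this command lets me submit to 4 independent processors that communicate via MPI. However, in our setup each processor has 4 cores, which go unused. My questions are as follows: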

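1. Is it possible to submit a job that runs on multiple cores on the same node, or on several nodes, from the mpirun command line? If so, how?
2. Does the above require any special comments or setup in the code? I gather from some reading that communication time between cores can differ from communication time between processors, so it does take some thought about how the problem is distributed... but apart from that, what else needs to be considered?
3. Finally, is there a limit on the amount of data transferred? Is there a limit on how much data can be sent/received? Is the cache limited?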
Thanks!

Solution

So question 1 is about launching processes, and questions 2 and 3 are, basically, about performance tuning. Performance tuning can involve substantial work on the underlying code, but you won't need to modify a line of code just to run on multiple cores or nodes.

What I understand from your first question is that you want to change how the launched MPI processes are distributed. Doing this is necessarily outside the standard, because it's OS and platform dependent, so each MPI implementation has its own way to do it. Recent versions of Open MPI and MPICH2 let you specify where the processes end up, so you can ask for two processes per socket, etc.
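As a rough illustration of the launch-time placement described above, here is a minimal Python sketch that builds an Open MPI style hostfile and mpirun command line. It assumes Open MPI's `--hostfile` and `--map-by ppr:N:socket` options and the `slots=` hostfile syntax; other implementations spell these differently. The node names, the file name `hosts.txt`, and the program name `./my_program` are hypothetical placeholders, not taken from the question.

```python
import shlex


def hostfile_text(nodes, slots_per_node):
    """Return Open MPI style hostfile text: one line per node with its slot count."""
    return "".join(f"{node} slots={slots_per_node}\n" for node in nodes)


def build_mpirun_command(executable, num_procs, hostfile_path, procs_per_socket=None):
    """Build an mpirun argument list (Open MPI option names assumed)."""
    cmd = ["mpirun", "-np", str(num_procs), "--hostfile", hostfile_path]
    if procs_per_socket is not None:
        # "ppr" = processes per resource; here the resource is a socket.
        cmd += ["--map-by", f"ppr:{procs_per_socket}:socket"]
    cmd.append(executable)
    return cmd


# Hypothetical cluster matching the question: 4 nodes with 4 cores each.
nodes = ["node01", "node02", "node03", "node04"]
print(hostfile_text(nodes, slots_per_node=4), end="")

# Fill every core: 4 nodes x 4 slots = 16 processes.
cmd = build_mpirun_command("./my_program", num_procs=16, hostfile_path="hosts.txt")
print(shlex.join(cmd))
```

The program itself is unchanged; only the hostfile and the `-np` value differ from the original 4-process launch. Passing `procs_per_socket=2` would append `--map-by ppr:2:socket`, in which case `num_procs` has to be consistent with the number of sockets actually available.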

You do not need to modify the code for this to work, but performance can vary with how the processes are distributed across cores. It's hard to say much about this in general, because it depends on your communication patterns, but yes, the "closer" the processes are, the faster the communications will be, by and large.
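To make the "closer is faster" point concrete, here is a small sketch of the usual first-order cost model for one message: time = latency + size / bandwidth. The latency and bandwidth figures are purely illustrative placeholders, not measurements of any real system; real values would have to come from benchmarking your own cluster.

```python
def transfer_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """First-order estimate of one message's transfer time in seconds."""
    return latency_s + nbytes / bandwidth_bytes_per_s


# Illustrative placeholder numbers only: (latency in s, bandwidth in bytes/s).
LINKS = {
    "same node (shared memory)": (0.5e-6, 10e9),
    "different nodes (network)": (2.0e-6, 5e9),
}

for name, (latency, bandwidth) in LINKS.items():
    for nbytes in (8, 1_000_000):
        t = transfer_time(nbytes, latency, bandwidth)
        print(f"{name:28s} {nbytes:>9d} B  {t * 1e6:8.2f} us")
```

In this model small messages are dominated by latency and large ones by bandwidth, which is why the best placement depends on the communication pattern.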

There's no specified limit to the total volume of data that goes back and forth between MPI tasks, but yes, there are bandwidth limits (and there are limits per message). The cache size is whatever the hardware provides.
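One per-message limit worth knowing about: in the classic C bindings the element count passed to a send or receive is a C `int`, so a single call can describe at most 2^31 - 1 elements (MPI-4 added large-count variants). The sketch below shows one common workaround, splitting a large buffer into several messages. It is plain Python that only computes the chunk boundaries; the helper name `chunk_ranges` is hypothetical, and the actual send calls are left out.

```python
INT_MAX = 2**31 - 1  # largest value of a 32-bit signed C int


def chunk_ranges(total_count, max_count=INT_MAX):
    """Yield (offset, count) pairs covering total_count elements,
    with each count no larger than max_count."""
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    offset = 0
    while offset < total_count:
        count = min(max_count, total_count - offset)
        yield offset, count
        offset += count


# Example: 5 billion elements do not fit in one classic send call.
for offset, count in chunk_ranges(5_000_000_000):
    print(f"send elements [{offset}, {offset + count})")
```

Each (offset, count) pair would correspond to one send on the sender and one matching receive on the receiver.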

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow