Running on multiple cores using MPI

https://stackoverflow.com/questions/6060836

15-11-2019

Question

I currently submit MPI jobs with this command: mpirun -np <no. of processors> <filename>

My understanding is that the above command lets me submit a job to 4 independent processors that communicate via MPI. However, in our setup each processor has 4 cores, which go unused. The questions I have are the following:

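1. Is it possible to submit a job to run on multiple cores of the same node, or on several nodes, from the MPI run command line? If so, how?
2. Does the above require any special comments or setup within the code? I understand from reading some literature that the communication time between cores may differ from the time between processors, so some thought is needed about how the problem is distributed. Apart from that issue, what else does one need to account for, and why?
3. Finally, is there a limit on how much data can be transferred? Is there a limit on how much data the bus can send or receive? Is there a limit on the cache?

Thanks!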

Solution

So 1 is a question about launching processes, and 2+3 are questions about, basically, performance tuning. Performance tuning can involve substantial work on the underlying code, but you won't need to modify a line of code to do any of this.

What I understand from your first question is that you want to modify the distribution of the MPI processes launched. Doing this is necessarily outside the standard, because it's OS and platform dependent; so each MPI implementation will have a different way to do this. Recent versions of OpenMPI and MPICH2 allow you to specify where the processes end up, so you can specify two processes per socket, etc.
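To make the placement idea concrete, here is a minimal sketch that assembles a launch line. It assumes Open MPI 1.8 or later, where `--map-by ppr:N:socket`, `--bind-to core` and `--report-bindings` are available; MPICH's launcher uses different options for the same idea. The helper name `build_mpirun_command` and the program name `./my_app` are hypothetical, chosen only for illustration.

```python
import shlex


def build_mpirun_command(num_procs, procs_per_socket, program):
    """Assemble an Open MPI launch line (hypothetical helper).

    Assumes Open MPI >= 1.8 option names; other MPI implementations
    use different flags to request the same kind of placement.
    """
    return [
        "mpirun",
        "-np", str(num_procs),
        "--map-by", f"ppr:{procs_per_socket}:socket",  # N processes per socket
        "--bind-to", "core",       # pin each process to a single core
        "--report-bindings",       # ask mpirun to print where each rank landed
        program,
    ]


# "./my_app" is a placeholder for your own MPI executable.
cmd = build_mpirun_command(16, 2, "./my_app")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell or a batch script. Running it with `--report-bindings` is a cheap way to check that the ranks really ended up where you intended.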

You do not need to modify the code for this to work, but there are performance issues depending on core distributions. It's hard to say much about this in general, because it depends on your communication patterns, but yes, the "closer" the processors are, the faster the communications will be, by and large.
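A common back-of-the-envelope way to reason about this is the latency plus size-over-bandwidth model. The sketch below applies it to two hypothetical links; the latency and bandwidth numbers are made up for illustration and are not measurements of any real machine, so substitute values from your own benchmarks.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimate point-to-point time as latency + size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Illustrative, made-up link parameters (assumptions, not measurements).
SHARED_MEMORY = {"latency_s": 0.5e-6, "bandwidth_bytes_per_s": 10e9}
NETWORK = {"latency_s": 2.0e-6, "bandwidth_bytes_per_s": 1e9}

for size in (8, 1_000_000):
    intra = transfer_time(size, **SHARED_MEMORY)
    inter = transfer_time(size, **NETWORK)
    print(f"{size:>9} bytes: same node {intra:.2e} s, across nodes {inter:.2e} s")
```

Under this model small messages are dominated by latency and large ones by bandwidth, which is why the best placement depends on your communication pattern.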

There's no specified limit to the total volume of data that goes back and forth between MPI tasks, but yes, there are bandwidth limits (and there are limits per message). The cache size is whatever it is.
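One example of a per-message limit: in the classic C bindings the element count passed to a send or receive is an `int`, so on platforms with a 32-bit `int` a single call cannot describe more than 2^31 - 1 elements. A minimal sketch of working around that by splitting a large buffer into several messages follows; the helper name `split_into_messages` is hypothetical, and the code only computes offsets and counts rather than calling MPI.

```python
INT_MAX = 2**31 - 1  # largest count a 32-bit C int can hold


def split_into_messages(total_elements, max_per_message=INT_MAX):
    """Return (offset, count) pairs that together cover total_elements.

    Hypothetical helper: each pair describes one message that stays
    within the per-message element limit.
    """
    if total_elements < 0 or max_per_message <= 0:
        raise ValueError("total_elements must be >= 0 and max_per_message > 0")
    chunks = []
    offset = 0
    while offset < total_elements:
        count = min(max_per_message, total_elements - offset)
        chunks.append((offset, count))
        offset += count
    return chunks


# A buffer of 5 billion elements needs three messages under this limit.
print(split_into_messages(5_000_000_000))
```

Each pair would correspond to one send and one matching receive. Another option, not shown here, is to wrap many elements in a derived datatype so that the count stays small.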

License: CC-BY-SA with attribution

Not affiliated with StackOverflow