Uneven distribution of processes among nodes using OpenMPI
21-12-2019
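Question
I am running my executable with OpenMPI on a cluster managed by the SLURM resource manager. I would like to find a way to specify how many processes should be assigned to each node, where the number of processes may differ from node to node.
An example to clarify what I am looking for: suppose I want to run 7 processes on 3 nodes. Then I would like to be able to say something like this: node 1 should run the process with rank n, and nodes 2 and 3 should each run 3 of the remaining processes.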
aa slots=1
bb slots=3
cc slots=3
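To make the intended split concrete, here is a minimal sketch of the arithmetic, assuming the first node should get exactly one process and the rest are divided (with integer division) among the remaining nodes. The helper name `split_procs` is hypothetical and only prints hostfile-style lines; it does not talk to SLURM or OpenMPI.

```shell
#!/bin/bash
# Hypothetical helper: print "host slots=N" lines for an uneven split.
# The first host gets exactly one process; each later host gets the
# floor of (processes still unassigned) / (hosts still unassigned).
split_procs() {
    local numProcs="$1"
    shift
    local numNodes=$#
    local assigned=0
    local host share
    for host in "$@"; do
        if [ "$assigned" -eq 0 ]; then
            share=1
        else
            share=$(( (numProcs - assigned) / numNodes ))
        fi
        echo "$host slots=$share"
        assigned=$(( assigned + share ))
        numNodes=$(( numNodes - 1 ))
    done
}

split_procs 7 aa bb cc
```

For 7 processes on the hosts aa, bb and cc this prints the three lines shown above (1, 3 and 3 slots). With a single host the sketch places only one process, so it is meant for two or more nodes.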
Solution
Thanks to Hristo Iliev's comment, I found a solution for the example stated in the question:
#!/bin/bash
HOSTFILE=./myHostfile
RANKFILE=./myRankfile

# Write the names of the nodes allocated by SLURM to a file
scontrol show hostnames "${SLURM_NODELIST}" > "$HOSTFILE"

# Number of processes
numProcs=7
# Number of nodes
numNodes=${SLURM_JOB_NUM_NODES}
# Counts the number of processes already assigned
count=0

while read -r p; do
    # Write the node names to a rank file
    if [ "$count" -eq 0 ]; then
        # The first node gets exactly one process (rank 0)
        echo "rank $count=$p slot=0-7" > "$RANKFILE"
        count=$((count + 1))
        numNodes=$((numNodes - 1))  # Number of nodes that are still available
    else
        # Compute the number of processes that should be assigned to this node
        # by dividing the number of processes that still need to be assigned by
        # the number of nodes that are still available. (Integer division
        # automatically "floor"s the result.)
        numProcsNode=$(( (numProcs - count) / numNodes ))
        for ((i = 0; i < numProcsNode; i++)); do
            echo "rank $count=$p slot=0-7" >> "$RANKFILE"
            count=$((count + 1))
        done
        numNodes=$((numNodes - 1))  # Number of nodes that are still available
    fi
done < "$HOSTFILE"

mpirun --display-map -np "$numProcs" -rf "$RANKFILE" hostname
It is a bit ugly, though. Also, the "7" in "slot=0-7" probably should not be hard-coded.
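One possible way to avoid the hard-coded "7" is to derive the slot range from a core count. The sketch below assumes that all nodes in the job have the same number of cores and that `SLURM_CPUS_ON_NODE` (set by SLURM inside a job) reflects the cores usable on each node. Whether that matches the slot numbering your OpenMPI build expects is not verified here. The helper name `slot_range` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: turn a core count into a rankfile slot range,
# e.g. 8 cores -> "0-7".
slot_range() {
    local cores="$1"
    echo "0-$(( cores - 1 ))"
}

# Assumption: every node exposes the same number of cores.
# SLURM_CPUS_ON_NODE is set by SLURM inside a job; outside a job we fall
# back to the number of online processors on the local machine.
cores="${SLURM_CPUS_ON_NODE:-$(getconf _NPROCESSORS_ONLN)}"

# Example rankfile line for a placeholder host named "aa".
echo "rank 0=aa slot=$(slot_range "$cores")"
```

In the script above, the literal `0-7` in both `echo` lines could then be replaced by `$(slot_range "$cores")`.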