Question

I'm trying to distribute processes on an HPC cluster with 8 cores per node. My partition has 2 nodes.

I have written this csh test script:

#!/bin/tcsh
foreach i (`seq 30`)
    srun csh -c "echo 'running${i} into:'; hostname; sleep 10;echo 'end ${i}'" &
end
echo "waiting for jobs completion"
wait

And I want to run it with salloc:

salloc -N2 -p mypartition testsalloc.sh

The script launches all 30 processes simultaneously, whereas I expected 16 to be launched (2 nodes x 8 cores) and the other 14 to be queued.

Is this behaviour possible with salloc and srun?

Solution

You could have used sbatch rather than salloc; your original script would most probably have given the expected result.

OTHER TIPS

I finally found a solution: just add these parameters to the srun command:

srun --ntasks=1 --exclusive ....
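
For reference, here is a minimal sketch of the test loop with those two options applied. It is a bash port of the tcsh script from the question, not the original, and the function name run_steps is a hypothetical helper rather than anything provided by Slurm. As far as the srun documentation describes it, --exclusive on a job step asks for dedicated CPUs and defers the step until they are free, which should give the "16 running, 14 waiting" behaviour on 2 nodes with 8 cores each. The exact behaviour may depend on the Slurm version and site configuration.

```shell
#!/bin/bash
# Bash port of the question's loop (the original script is tcsh).
# run_steps is a hypothetical helper name, not part of Slurm.
run_steps() {
    local count=$1
    local i
    for i in $(seq "$count"); do
        # --ntasks=1   : each job step runs a single task
        # --exclusive  : the step gets dedicated CPUs, so it is deferred
        #                until CPUs in the allocation become free
        srun --ntasks=1 --exclusive sh -c "echo 'running ${i} on:'; hostname; sleep 10; echo 'end ${i}'" &
    done
    echo "waiting for job steps to complete"
    wait
}
```

To use it, call `run_steps 30` at the end of the script and launch the script inside the allocation the same way as before, with salloc -N2 -p mypartition.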
Licensed under: CC-BY-SA with attribution