Question

When I use sinfo I see the following:

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
[...]
RG3          up 28-00:00:0      1  drain rg3hpc4
[...]

What does the state 'drain' mean?


Solution

It means that no new jobs will be scheduled on that node, but jobs already running there will keep running until they finish. By contrast, setting the node to down kills all jobs running on it.

Nodes are often set to that state so that some maintenance operation can take place once all running jobs are finished.
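
For reference, this is roughly how an administrator would drain a node before maintenance and return it to service afterwards. It is a minimal sketch: the `drain_node` and `resume_node` helpers and the `SCONTROL` override variable are hypothetical names introduced here, and the `scontrol update` calls normally require administrator privileges.

```shell
# SCONTROL is a hypothetical override for this sketch: leave it unset to call the
# real scontrol, or set SCONTROL=echo to preview the command without running it.
SCONTROL="${SCONTROL:-scontrol}"

# Stop new jobs from starting on a node; jobs already running are left alone.
# usage: drain_node NODE REASON
drain_node() {
    $SCONTROL update NodeName="$1" State=DRAIN Reason="$2"
}

# Return the node to service once maintenance is done.
# usage: resume_node NODE
resume_node() {
    $SCONTROL update NodeName="$1" State=RESUME
}

# Example (typically requires administrator privileges):
#   drain_node rg3hpc4 "scheduled maintenance"
#   resume_node rg3hpc4
```

The text passed as `Reason` is what other users later see when they ask why the node is drained.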

From the manpage of the scontrol command:

If you want to remove a node from service, you typically want to set its state to "DRAIN".

Note that the system administrator most likely recorded a reason for draining the node, and you can see that reason with:

sinfo -R
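
If you only care about the drained nodes, you can filter the default sinfo output first. The sketch below assumes the default six-column layout shown in the question (state in column 5, node list in column 6); the `drained_nodes` helper is a hypothetical name, and it matches both `drain` and `drng` (draining, with jobs still running).

```shell
# Print the NODELIST entries whose STATE starts with "drain" or "drng".
# Reads sinfo-style output on stdin, so it can be tried on saved output too.
drained_nodes() {
    awk 'NR > 1 && $5 ~ /^(drain|drng)/ { print $6 }'
}

# On a real cluster:
#   sinfo | drained_nodes
# Demonstration using the line from the question:
printf '%s\n' \
    'PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST' \
    'RG3          up 28-00:00:0      1  drain rg3hpc4' | drained_nodes
# prints: rg3hpc4
```

Each name it prints can then be passed back to sinfo to show the reason for just that node, for example `sinfo -R -n rg3hpc4`.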
Licensed under: CC-BY-SA with attribution