Do all cluster schedulers take array jobs, and if they do, do they set SGE_TASK_ID array id?
14-06-2021
Question
When I use qsub to submit array jobs to a cluster, the environment variable SGE_TASK_ID is set to the index of each task within the array job. I use this in a shell script that I run on a cluster, where each task needs to do something different based on its SGE_TASK_ID. Is this a common way for cluster schedulers to handle array jobs, or does each one take a different approach?
Solution
Most schedulers have a way to do this, although the details differ between setups. In TORQUE the variable is called $PBS_ARRAYID, but it works the same way.
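If the same script has to run under more than one scheduler, one option is to look the index up in whichever variable happens to be set. The following is only a sketch: the function name array_task_id is made up for illustration, and the variable names beyond SGE_TASK_ID and PBS_ARRAYID (PBS_ARRAY_INDEX for PBS Professional, SLURM_ARRAY_TASK_ID for Slurm, LSB_JOBINDEX for LSF) are the ones I believe those schedulers use, so check them against your site's documentation. It also assumes that Grid Engine sets SGE_TASK_ID to the string "undefined" and LSF sets LSB_JOBINDEX to 0 for jobs that are not array jobs.

```shell
#!/bin/sh
# Hypothetical helper: print the array task index from whichever
# scheduler-specific variable is set. Returns non-zero when no
# variable holds a usable index.
array_task_id() {
    # Grid Engine: assumed to be "undefined" for non-array jobs.
    if [ -n "${SGE_TASK_ID:-}" ] && [ "$SGE_TASK_ID" != "undefined" ]; then
        echo "$SGE_TASK_ID"
    elif [ -n "${PBS_ARRAYID:-}" ]; then            # TORQUE
        echo "$PBS_ARRAYID"
    elif [ -n "${PBS_ARRAY_INDEX:-}" ]; then        # PBS Professional
        echo "$PBS_ARRAY_INDEX"
    elif [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then    # Slurm
        echo "$SLURM_ARRAY_TASK_ID"
    # LSF: assumed to be 0 for non-array jobs.
    elif [ -n "${LSB_JOBINDEX:-}" ] && [ "$LSB_JOBINDEX" != "0" ]; then
        echo "$LSB_JOBINDEX"
    else
        return 1
    fi
}
```

In a job script this could be used as `task_id=$(array_task_id) || exit 1`, after which the rest of the script only refers to `$task_id`. Note that the allowed index ranges differ between schedulers (some allow arrays that start at 0, others do not), so the submission command still needs to be adapted per system.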
OTHER TIPS
Do all cluster schedulers take array jobs
No. Many do, but not all.
and if they do, do they set SGE_TASK_ID array id?
Only Grid Engine sets SGE_TASK_ID, because that is simply what the variable is called in Grid Engine. Other cluster middleware uses a different name for it and may give it different semantics.
It's not entirely clear what you are aiming for with your question, but if you want to write a program or system that runs on many different cluster middlewares, load balancers, or schedulers, you should look into DRMAA. It abstracts scheduler-specific details such as SGE_TASK_ID.