Question

When using qsub to put array jobs on a cluster the global variable SGE_TASK_ID gets set to the array job ID. I use this in a shell script that I run on a cluster, where each array job needs to do something different based on the SGE_TASK_ID. Is this a common way for cluster schedulers to do this, or do they all have a different approach?

Was it helpful?

Solution

Most schedulers have a way to do this, although it can be slightly different in different setups. In TORQUE the variable is called $PBS_ARRAYID but it works the same.

OTHER TIPS

Do all cluster schedulers take array jobs

No. Many do, but not all.

and if they do, do they set SGE_TASK_ID array id?

Only Grid Engine will set SGE_TASK_ID because this is simply what the variable is called in Grid Engine. Other cluster middlewares have a different name for it, with different semantics.

It's a bit unclear where you are aiming with your question, but if you want to write a program/system that runs on many different cluster middlewares / load balancers / schedulers, you should look into DRMAA. This will abstract variables like SGE_TASK_ID.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top