Question

I am writing a Python script that submits multiple jobs with qsub, but I first need to determine the load on the qsub queue. If many jobs are already queued or the load is high, I need to inform the user and run the job in the local environment instead. I have checked the command's help page but could not find useful information.

qstat [options]
        [-ext]                            view additional attributes
        [-explain a|c|A|E]                show reason for c(onfiguration ambiguous), a(larm), suspend A(larm), E(rror) state
        [-f]                              full output
        [-fjc]                            full output grouped according to job class instances
        [-F [resource_attributes]]        full output and show (selected) resources of queue(s)
        [-g {c}]                          display cluster queue summary
        [-g {d}]                          display all job-array tasks (do not group)
        [-g {t}]                          display all parallel job tasks (do not group)
        [-help]                           print this help
        [-j job_identifier_list ]         show scheduler job information
        [-l resource_list]                request the given resources
        [-ne]                             hide empty queues
        [-ncb]                            suppress additional binding specific parameters
        [-pe pe_list]                     select only queues with one of these parallel environments
        [-nenv]                           do not request job environment
        [-njd]                            do not show details about foreign jobs
        [-q wc_queue_list]                print information on given queue
        [-qs {a|c|d|o|s|u|A|C|D|E|S}]     selects queues, which are in the given state(s)
        [-r]                              show requested resources of job(s)
        [-s {p|r|s|z|hu|ho|hs|hd|hj|ha|h|a}] show pending, running, suspended, zombie jobs

Solution

The ideal solution for this is to use a scheduler, such as Moab or Maui (I think Maui can do this), that can assign nodes to jobs intelligently, including avoiding nodes in the cluster that are already under high load. Schedulers generally offer policies for handling common HPC scenarios such as this one. (In the interest of full disclosure, I am currently an engineer at the company that sells Moab; Maui is free to use.)

If you wish to do this via scripts, pbsnodes -a reports the load average for the nodes in the cluster. It is inside a larger status string in this format:

status = attr=[val][,attr2=[val]...]

The attribute you're looking for is loadave. You could wrap qsub in a script that calls pbsnodes (or uses cached results from pbsnodes) to read this value and then either submits the job with qsub or runs it in your local environment. To me, it seems easier to use a scheduler.
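A minimal sketch of such a wrapper's decision logic, in Python since that is what the question uses. The function names, the threshold of 4.0, and the assumption that node names appear on unindented lines of the pbsnodes -a output are illustrative and not taken from the answer; adjust them to what your cluster actually prints.

```python
import re
import subprocess


def parse_loadave(pbsnodes_output):
    """Return {node_name: load average} parsed from `pbsnodes -a` text."""
    loads = {}
    node = None
    for line in pbsnodes_output.splitlines():
        if line and not line[0].isspace():
            # Assumption: unindented lines are node names.
            node = line.strip()
        elif node and line.strip().startswith("status ="):
            # Everything after the first "=" is the attr=val,attr2=val list.
            status = line.split("=", 1)[1].strip()
            match = re.search(r"(?:^|,)loadave=([0-9]+(?:\.[0-9]+)?)", status)
            if match:
                loads[node] = float(match.group(1))
    return loads


def cluster_is_busy(loads, threshold=4.0):
    """True when no node reports a load average below the threshold.

    An empty mapping (no node data) is treated as busy, so the caller
    falls back to running locally. The threshold is a hypothetical value.
    """
    return not any(load < threshold for load in loads.values())


def current_loads():
    """Run `pbsnodes -a` and parse its output (needs a PBS/Torque client)."""
    result = subprocess.run(
        ["pbsnodes", "-a"], capture_output=True, text=True, check=True
    )
    return parse_loadave(result.stdout)
```

A wrapper would then call cluster_is_busy(current_loads()) and either invoke qsub or warn the user and run the job locally.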

OTHER TIPS

That looks like a Sun Grid Engine (or derivative) qstat.

You can ask Grid Engine not to launch the job unless it can run (more or less) immediately with qsub -now y; with -now n, the qsub default, the job is placed in the pending queue instead. If you don't want it to run on a machine with a high load, you may be able to request load_avg, load_long, load_medium or load_short via the -l option to qsub, depending on how the cluster is configured.
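As a rough sketch of how a Python script could use this: the qsub man page describes -now y as "start immediately or not at all", with a non-zero exit status when the job cannot be scheduled. The function name, the run parameter (there only so the command runner can be swapped out), and the use of sh for the local fallback are assumptions for illustration.

```python
import subprocess


def submit_or_run_locally(script_path, run=subprocess.run):
    """Try `qsub -now y`; fall back to running the script locally.

    Returns "cluster" or "local" depending on where the job went.
    """
    # -now y: start the job immediately or not at all.
    submitted = run(["qsub", "-now", "y", script_path])
    if submitted.returncode == 0:
        return "cluster"
    # Inform the user, as the question requires, then run locally.
    print("Cluster is busy; running %s locally." % script_path)
    run(["sh", script_path], check=True)
    return "local"
```

A load requirement could be added to the qsub argument list through -l, if the cluster is configured to allow it.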

To list queued jobs: qstat -u '*' -g d -s p

You can optionally add -xml to get the output in XML format.
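The pending-job listing can be turned into a simple queue-depth check. The sketch below parses the plain-text output rather than XML, to avoid guessing at the XML layout; it assumes each job line starts with a numeric job ID, and the function names are hypothetical.

```python
import subprocess


def count_pending(qstat_output):
    """Count job lines in plain `qstat` output, skipping header lines."""
    count = 0
    for line in qstat_output.splitlines():
        fields = line.split()
        # Assumption: job lines start with a numeric job ID; the column
        # header and the dashed separator line do not.
        if fields and fields[0].isdigit():
            count += 1
    return count


def pending_jobs():
    """Run the qstat command from the answer and count pending jobs."""
    result = subprocess.run(
        ["qstat", "-u", "*", "-g", "d", "-s", "p"],
        capture_output=True, text=True, check=True,
    )
    return count_pending(result.stdout)
```

The script could compare pending_jobs() against a limit of its own choosing and run the job locally when the queue is deeper than that.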

Licensed under: CC-BY-SA with attribution