Question

I would like to submit jobs via qsub on Sun Grid Engine (now: Oracle Grid Engine?). I do not wish to use the -sync yes option or qrsh, because I want my controlling program to be single-threaded and able to launch many jobs at a time. These options would block my controlling program's thread.

However, I would like to receive the exit statuses of the processes that I launch. From the man pages, there seems to be no way to get this code without blocking my thread. Short of modifying the jobs that I'm launching to print their exit codes to stdout, is there any way to get this status?


Solution

The answer is 'qacct -j ' followed by the job ID. It prints a summary of the job's history to stdout, which can then be parsed for the exit status, start and end times, and a variety of other information.

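For this command to work, however, SGE must be configured properly: qacct reads the cluster's accounting file, so a job's record only shows up once the job has finished and been written to that file.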
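As a rough sketch of the parsing step, something like the following could work from a single-threaded controller. The function names parse_qacct and job_exit_statuses are made up for illustration. It assumes each record is a list of "key value" lines introduced by a row of '=' characters, and that qacct exits non-zero while no record exists for the job; check both against your installation.

```python
import subprocess


def parse_qacct(text):
    """Split `qacct -j` output into one dict per accounting record."""
    records = []
    current = {}
    for line in text.splitlines():
        if line.startswith("====="):
            # Assumed layout: a row of '=' starts each record
            # (one record per job, or per task of an array job).
            if current:
                records.append(current)
            current = {}
            continue
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    if current:
        records.append(current)
    return records


def job_exit_statuses(job_id):
    """Return a list of exit statuses, or None if no record is available yet."""
    result = subprocess.run(
        ["qacct", "-j", str(job_id)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Assumption: qacct fails while the job is unknown or unfinished.
        return None
    records = parse_qacct(result.stdout)
    statuses = [int(r["exit_status"]) for r in records if "exit_status" in r]
    return statuses or None
```

The controller can call job_exit_statuses for each outstanding job ID in its main loop and simply try again later whenever it gets None back, so nothing blocks for the lifetime of a job.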

OTHER TIPS

If you are submitting your jobs from within your application, the simplest and fastest way (faster than submitting with qsub) to submit them and get the exit status later is the DRMAA API. This simple API has been available for C and Java in Sun Grid Engine for a very long time. Univa Grid Engine (the commercial successor of Grid Engine) and the Sun Grid Engine forks also ship the necessary library. Since it is an open standard, you can even submit to completely different DRMSs such as Condor or SLURM without changing your program. Language bindings for Go, Python, Tcl, and others are available.

See: http://www.gridengine.eu/mangridengine/htmlman3/drmaa_wait.html
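For illustration, here is a minimal sketch of non-blocking polling with the Python binding (the drmaa package), assuming it is installed and can locate the cluster's DRMAA library. The names poll_finished and run_all are made up for this example. The "still running" exception and the no-wait timeout are passed in as parameters only so that the helper can be exercised without a cluster.

```python
import time


def poll_finished(session, pending, no_wait, still_running_error):
    """Check every pending job once without blocking.

    Returns {job_id: exit status, or None if the job did not exit normally}
    for the jobs that have finished, and removes them from `pending`.
    """
    finished = {}
    for job_id in list(pending):
        try:
            info = session.wait(job_id, no_wait)
        except still_running_error:
            continue  # not done yet; look again on the next pass
        # exitStatus is only meaningful when the job exited on its own.
        finished[job_id] = info.exitStatus if info.hasExited else None
        pending.remove(job_id)
    return finished


def run_all(commands, poll_seconds=5.0):
    """Submit every (path, args) pair, then poll until all jobs are done."""
    import drmaa  # imported lazily: needs the cluster's DRMAA library

    results = {}
    with drmaa.Session() as session:
        pending = set()
        for path, args in commands:
            jt = session.createJobTemplate()
            jt.remoteCommand = path
            jt.args = list(args)
            pending.add(session.runJob(jt))
            session.deleteJobTemplate(jt)  # does not affect the submitted job
        while pending:
            results.update(poll_finished(
                session, pending,
                drmaa.Session.TIMEOUT_NO_WAIT,
                drmaa.ExitTimeoutException,
            ))
            if pending:
                time.sleep(poll_seconds)  # the controller could do other work here
    return results
```

A call such as run_all([("/path/to/job.sh", ["arg1"])]) would return a dict mapping job IDs to exit statuses; the path and argument here are placeholders.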

You can find more information, along with the Go (#golang) DRMAA language binding and examples, here: http://www.gridengine.eu/programming-apis

Cheers

Daniel

www.gridengine.eu

Licensed under: CC-BY-SA with attribution