
Getting the exit code from a process submitted with qsub on Sun Grid Engine

I would like to submit jobs via qsub on Sun Grid Engine (now: Oracle Grid Engine?). I do not wish to use the -sync yes option or qrsh, because I want my controlling program to be single-threaded and able to launch many jobs at a time. These options would block my controlling program's thread.

However, I would like to receive the exit statuses of the processes that I launch. From the man pages, there seems to be no way to get this code without blocking my thread. Short of modifying the jobs that I'm launching to print their exit codes to stdout, is there any way to get this status?

Brian asked Jun 22 '10 07:06




2 Answers

The answer is 'qacct -j <job_id>'. A summary of the job's history is printed to stdout, which can then be parsed for the exit status, start and end times, and a variety of other information.

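SGE must be configured properly for this command to work, however: qacct reads the accounting file, so accounting records must be written and readable from the host where you run it.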
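As a rough illustration of parsing that summary, here is a minimal Python sketch. It assumes the usual qacct -j layout (one "key value" pair per line, with a row of '=' characters before each record) and the exit_status field; the helper names parse_qacct and exit_statuses are made up for this example, and the assumption that qacct returns a non-zero code when it has no record for the job should be checked on your installation.

```python
import subprocess


def parse_qacct(text):
    """Parse `qacct -j` output into a list of dicts, one per accounting record."""
    records = []
    current = {}
    for line in text.splitlines():
        if line.startswith("====="):
            # A row of '=' starts a new record (array jobs produce several).
            if current:
                records.append(current)
                current = {}
            continue
        parts = line.split(None, 1)  # key, then the rest of the line as value
        if len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    if current:
        records.append(current)
    return records


def exit_statuses(job_id):
    """Return a list of exit statuses for job_id, or None if none are recorded yet.

    Hypothetical helper: assumes `qacct` is on PATH and that a job without an
    accounting record yields a non-zero return code or no exit_status field.
    """
    result = subprocess.run(
        ["qacct", "-j", str(job_id)], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    statuses = [
        int(rec["exit_status"])
        for rec in parse_qacct(result.stdout)
        if "exit_status" in rec
    ]
    return statuses or None
```

A single-threaded controller could call exit_statuses(job_id) periodically for each job it submitted and treat None as "not finished or not yet written to the accounting file".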

Brian answered Oct 05 '22 14:10


If you are submitting jobs from within your application, the simplest and fastest way to submit them (faster than calling qsub) and to get the exit status later is the DRMAA API. This simple API has been available for C and Java in Sun Grid Engine for a very long time. Univa Grid Engine (the commercial successor of Grid Engine) and the Sun Grid Engine forks also ship the necessary library. Since DRMAA is an open standard, you can even submit to entirely different DRMSs such as Condor or SLURM without changing your program. Language bindings for Go, Python, Tcl, and other languages are available.

See: http://www.gridengine.eu/mangridengine/htmlman3/drmaa_wait.html

More information, along with the Go (#golang) DRMAA language binding and examples, can be found here: http://www.gridengine.eu/programming-apis
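To give an idea of how a single-threaded controller might use DRMAA without blocking, here is a minimal sketch using the Python binding (the drmaa package). It assumes that package and a libdrmaa library are installed on a submit host; the helper names submit, reap and main, and the /bin/sleep test command, are only illustrative. The key point is calling wait with the no-wait timeout, which raises a timeout exception instead of blocking when a job has not finished.

```python
import time


def submit(session, command, args=()):
    """Submit one job through a DRMAA session and return its job id."""
    jt = session.createJobTemplate()
    jt.remoteCommand = command
    jt.args = list(args)
    try:
        return session.runJob(jt)
    finally:
        session.deleteJobTemplate(jt)


def reap(session, pending, results, no_wait, timeout_error):
    """One non-blocking pass: move finished jobs from `pending` to `results`."""
    for job_id in list(pending):
        try:
            info = session.wait(job_id, no_wait)
        except timeout_error:
            continue  # job still queued or running
        pending.discard(job_id)
        # exitStatus is only meaningful if the job exited normally
        results[job_id] = info.exitStatus if info.hasExited else None


def main():
    # Imported here because the drmaa package needs libdrmaa at import time.
    import drmaa

    with drmaa.Session() as session:
        pending = {submit(session, "/bin/sleep", ["5"]) for _ in range(3)}
        results = {}
        while pending:
            reap(
                session,
                pending,
                results,
                drmaa.Session.TIMEOUT_NO_WAIT,
                drmaa.errors.ExitTimeoutException,
            )
            time.sleep(1)  # the controller can do other work here instead
        print(results)
```

Calling main() on a submit host runs the example; in a real controller the reap pass would simply be one step of the existing main loop.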

Cheers

Daniel

www.gridengine.eu

Daniel answered Oct 05 '22 13:10