I have a batch script which starts off a couple of qsub jobs, and I want to trap when they are all completed.
I don't want to use the -sync option, because I want them to be running simultaneously. Each job has a different set of command line parameters.
I want my script to wait until all the jobs have completed, and then do something after that. I don't want to use the sleep function, e.g. to check every 30 s whether certain files have been generated, because this is a drain on resources.
I believe Torque may have some options, but I am running SGE.
Any ideas on how I could implement this please?
Thanks.
P.S. I did find another thread (Link) which had this response:
You can use wait to stop execution until all your jobs are done. You can even collect all the exit statuses and other running statistics (time it took, count of jobs done at the time, whatever) if you cycle around waiting for specific ids.
but I am not sure how to use it without polling on some value. Can bash trap be used, and if so, how would I use it with qsub?
Launch your qsub jobs, using the -N option to give them arbitrary names (job1, job2, etc):
qsub -N job1 -cwd ./job1_script
qsub -N job2 -cwd ./job2_script
qsub -N job3 -cwd ./job3_script
Launch your script and tell it to wait until the jobs named job1, job2 and job3 are finished before it starts:
qsub -hold_jid job1,job2,job3 -cwd ./results_script
If all the job names share a common pattern, you can pass that pattern to -hold_jid instead of listing every name. https://linux.die.net/man/1/sge_types shows which patterns you can use. For example:
-hold_jid "job_name_pattern*"
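Putting the two steps together, here is a minimal sketch of a wrapper script. The function names submit_all and submit_after, the QSUB variable, and the "job" prefix are my own choices for illustration and not part of SGE; only -N, -cwd and -hold_jid come from the commands above.

```shell
#!/bin/bash
# Sketch only: submit several jobs that run in parallel, then one final job
# that is held until all of them have finished.
# QSUB is a hypothetical override hook: set QSUB="echo qsub" for a dry run
# that prints the commands instead of submitting them.
QSUB="${QSUB:-qsub}"

# submit_all PREFIX SCRIPT...
# Submits each script under the name PREFIX1, PREFIX2, ...
submit_all() {
    local prefix="$1"
    shift
    local i=1
    local script
    for script in "$@"; do
        $QSUB -N "${prefix}${i}" -cwd "$script"
        i=$((i + 1))
    done
}

# submit_after PREFIX SCRIPT
# Submits SCRIPT, held until every job whose name matches PREFIX* is done.
# The pattern is quoted so the shell does not expand the * itself.
submit_after() {
    local prefix="$1"
    local script="$2"
    $QSUB -hold_jid "${prefix}*" -cwd "$script"
}
```

You would then call it like this:

submit_all job ./job1_script ./job2_script ./job3_script
submit_after job ./results_script

The scheduler handles the waiting, so nothing polls. If the submitting script itself has to block until everything is done, adding -sync y to that final qsub only should make it wait there, while the earlier jobs still run simultaneously.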