I have a 64-node cluster, running PBS Pro. If I submit many hundreds of jobs, I can get 64 running at once. This is great, except when all 64 jobs happen to be nearly I/O bound, and are reading/writing to the same disk. In such cases, I'd like to be able to still submit all the jobs, but have a max of (say) 10 jobs running at a given time. Is there an incantation to qsub that will allow me to do such, without having administrative access to the cluster's PBS server?
Portable Batch System (or simply PBS) is the name of computer software that performs job scheduling. Its primary task is to allocate computational tasks, i.e., batch jobs, among the available computing resources. It is often used in conjunction with UNIX cluster environments.
Create a script file that includes the details of the PBS job that you want to run. It can include the name of the program, the memory, wall time, and processor requirements of the job, which queue it should run in, and how to notify you of the results of the job.
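As a rough sketch, such a script file might look like the following. The resource line uses PBS Pro's `select` syntax; the job name, queue name, resource amounts, and e-mail address are placeholders I made up, so check your site's documentation for the real values.

```shell
#!/bin/bash
# Hypothetical job name, queue, resources, and address: adjust for your site.
#PBS -N my_job
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=2gb
#PBS -l walltime=00:30:00
#PBS -m ae
#PBS -M user@example.com

# PBS sets PBS_O_WORKDIR to the directory qsub was run from.
# Fall back to the current directory so the script also runs by hand.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# Replace this line with the actual program to run.
echo "job running in $(pwd)"
```

The `#PBS` lines are ordinary comments to the shell, so the same file can be tested locally with `bash` before submitting it with `qsub`.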
In TORQUE you can do this by setting a slot limit on a job array, as long as you can arrange the jobs as an array:
qsub -t 0-99%10 script.sh
would let at most 10 of the 100 array jobs run at once. If PBS Pro has an equivalent option, you can use that.
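To arrange the jobs as an array, each sub-job needs a way to pick its own piece of work. A minimal sketch, assuming the inputs are named `input_<index>.dat` (a naming scheme invented here for illustration): TORQUE exports the sub-job index as `PBS_ARRAYID`, while PBS Pro uses `PBS_ARRAY_INDEX`, so the script checks both.

```shell
#!/bin/bash
#PBS -N io_task
#PBS -l walltime=01:00:00

# Return this sub-job's array index.
# TORQUE sets PBS_ARRAYID; PBS Pro sets PBS_ARRAY_INDEX.
# Default to 0 so the script can also be run outside the scheduler.
task_index() {
    echo "${PBS_ARRAYID:-${PBS_ARRAY_INDEX:-0}}"
}

# Hypothetical naming scheme: one input file per array index.
input_file="input_$(task_index).dat"
echo "processing ${input_file}"
```

Submitted as an array, every sub-job runs the same script but sees a different index, so one `qsub` call replaces hundreds of individual submissions.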