I have submitted my job on a Linux cluster that uses SLURM to schedule jobs, but the time limit of each partition is only 24 hours (this limit is set by the admin), and my guess is that my code needs to run for more than a week. I am new to SLURM scripts and understand very little about the interplay between the following:
#SBATCH --nodes=
#SBATCH --ntasks-per-node=
#SBATCH --ntasks=
#SBATCH --ntasks-per-core=
I am looking for a way to work around the time limit when submitting the job so that my complete job can run.
Suggestions are appreciated.
The time limit is set by the admin and is defined in slurm.conf at /etc/slurm/slurm.conf; each partition defined there carries its own limit.
I am afraid you cannot bypass that limit.
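If you want to confirm the limit yourself (assuming the standard SLURM client tools are available on the login node), you can query the partition settings, for example:

sinfo -o "%P %l"                # lists each partition and its maximum job time
scontrol show partition debug   # full settings for one partition; 'debug' is just a placeholder name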
So the only things you can do are:
1. Checkpoint your program and resubmit the job in chunks that each fit within the 24-hour limit.
2. Ask the administrators whether they can raise the limit for your job.
For 1 you need to modify the program so that it saves its state and can resume from it, which most programs intended to run for long durations should already provide; a sketch of how the resubmission can be automated is shown below.
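For illustration only, here is a minimal batch-script sketch of that idea. It assumes a hypothetical program my_sim that writes its state to checkpoint.dat, resumes from that file when passed --resume, and creates done.flag when it has finished; none of these names come from your setup, so adapt them to your own code (the script is assumed to be saved as job.sh):

#!/bin/bash
#SBATCH --job-name=long_run
#SBATCH --time=23:50:00      # stay just under the 24-hour partition limit
#SBATCH --nodes=1
#SBATCH --ntasks=1

# Resume from the last checkpoint if one exists, otherwise start fresh.
if [ -f checkpoint.dat ]; then
    srun ./my_sim --resume checkpoint.dat
else
    srun ./my_sim
fi

# If the program has not signalled completion, submit the next 24-hour chunk.
if [ ! -f done.flag ]; then
    sbatch job.sh
fi

The checkpointing itself has to happen inside your program; SLURM only restarts the script. If you prefer to submit all the chunks up front, you can instead chain them with sbatch --dependency=afterok:<jobid>.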
It seems you are from Nepal; if you happen to be running this on the Kathmandu University HPC, you can ask the administrators there and they should be able to help you.
Regarding your second question:
#SBATCH --nodes=
#SBATCH --ntasks-per-node=
#SBATCH --ntasks=
#SBATCH --ntasks-per-core=
--nodes means the number of physical nodes to allocate.
For the ntasks-related options I recommend you look at this question: What does the --ntasks or -n tasks does in SLURM?
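As a rough illustration (the numbers and program name here are made up, not taken from your cluster), the following asks for 8 tasks spread over 2 nodes:

#SBATCH --nodes=2               # 2 physical nodes
#SBATCH --ntasks-per-node=4     # 4 tasks on each node, 8 tasks in total
srun ./my_mpi_program           # srun launches one process per task

Writing --ntasks=8 together with --nodes=2 is a roughly equivalent request, and --ntasks-per-core limits how many of those tasks may be placed on a single core (normally 1, unless you deliberately want to oversubscribe cores).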