I would like Snakemake to enforce binding memory limits for individual rules. Based on the Snakemake documentation, the mem_mb resource seems like it should do this, but the job uses more memory than I've allocated.
Here's a simple rule that uses several GB of memory. I would like the rule to be stopped once it hits the memory limit, but it completes without issue.
rule:
    output:
        "a"
    threads: 1
    resources:
        mem_mb = 100
    shell:
        """
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)'
        touch a
        """
Is it possible to make this limit binding? I'd like a portable solution that works on both Windows and Linux. I'm running Snakemake locally, not with a batch scheduler or container setup.
I have absolutely no experience with this, so I can't really say if this is recommended or works well across platforms, but this seems to work on my computer (Ubuntu):
rule all:
    input:
        "a"

rule:
    output:
        "a"
    threads: 1
    resources:
        mem_mb = 100
    params:
        # ulimit -v takes KiB, so convert the mem_mb resource to KiB
        max_mem=lambda wildcards, resources: resources.mem_mb * 1024
    shell:
        """
        ulimit -v {params.max_mem}
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)'
        touch a
        """
See here for more info.
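Note that ulimit -v is not available on Windows, so this only covers the Linux half of the question. For what it's worth, the mechanism behind it is just an address-space limit on the process; here is a minimal standalone sketch of the same idea using Python's Unix-only resource module (my own illustration, not something the answer above or Snakemake itself provides):

import resource
import numpy

# RLIMIT_AS caps the total address space of this process, in bytes,
# which is the same limit that `ulimit -v` sets (in KiB) for the shell.
limit_bytes = 1 * 1024 * 1024 * 1024  # roughly 1 GiB
resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

try:
    # numpy.ones(1_000_000_000) asks for ~8 GB of float64s; with the cap
    # in place the allocation fails instead of filling up the machine.
    x = numpy.ones(1_000_000_000)
except MemoryError:
    print("allocation refused by the address-space limit")

In the rule above the effect is the same: once the limit is hit, python3 exits with a non-zero status, so with Snakemake's default bash strict mode touch a never runs and the job is reported as failed.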
I don't believe Snakemake offers an out-of-the-box solution for this.
I don't have much experience with this myself, but note that the resources directive should be accompanied by the command-line option --resources. In your case, you could execute:
snakemake -j 10 --resources mem_mb=500
and this will ensure that jobs running at the same time do not exceed mem_mb=500 in total (with at most 10 jobs running at once). However, a rule that requests more than mem_mb=500 will still run as a single job. To prevent that, I think Maarten-vd-Sande's solution is the best option.
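To make the scheduling behaviour concrete, here is a small made-up example (rule and file names are only for illustration):

rule all:
    input:
        "b1", "b2"

rule big1:
    output:
        "b1"
    resources:
        mem_mb = 300
    shell:
        "sleep 5 && touch b1"

rule big2:
    output:
        "b2"
    resources:
        mem_mb = 300
    shell:
        "sleep 5 && touch b2"

With snakemake -j 10 --resources mem_mb=500, big1 and big2 run one after the other, because 300 + 300 exceeds the 500 given on the command line; with --resources mem_mb=600 they can run in parallel. Either way the values only steer scheduling: a job that actually uses more memory than it declared is not killed.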