In snakemake, you can call external scripts like so:
rule NAME:
    input:
        "path/to/inputfile",
        "path/to/other/inputfile"
    output:
        "path/to/outputfile",
        "path/to/another/outputfile"
    script:
        "path/to/script.R"
This gives convenient access to an S4 object named snakemake inside the R script.
In my case, I am running snakemake on a SLURM cluster, and I need to load R with module load R/3.6.0 before Rscript can be executed; otherwise the job fails with:
/usr/bin/bash: Rscript: command not found
How can I tell snakemake to do that? If I run the rule with shell instead of script, my R script no longer has access to the snakemake object, so this is not the solution I want:
    shell:
        "module load R/3.6.0;"
        "Rscript path/to/script.R"
You cannot run a shell command from the script directive, so you have to use the shell directive instead. You can then pass your inputs and outputs to the R script as command-line arguments:
rule NAME:
    input:
        in1="path/to/inputfile",
        in2="path/to/other/inputfile"
    output:
        out1="path/to/outputfile",
        out2="path/to/another/outputfile"
    shell:
        """
        module load R/3.6.0
        Rscript path/to/script.R {input.in1} {input.in2} {output.out1} {output.out2}
        """
and get your arguments in the R script:
# Positional arguments, in the same order as on the Rscript command line
args <- commandArgs(trailingOnly = TRUE)
inFile1 <- args[1]
inFile2 <- args[2]
outFile1 <- args[3]
outFile2 <- args[4]
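To make the mapping between the shell command and the R arguments explicit, here is a minimal Python sketch of how the placeholders end up as positional arguments. It only imitates the substitution with str.format and is not Snakemake's own implementation; rule_input, rule_output, template and trailing_args are hypothetical names used for illustration.

```python
import shlex
from types import SimpleNamespace

# Hypothetical stand-ins for the rule's named inputs and outputs.
rule_input = SimpleNamespace(
    in1="path/to/inputfile",
    in2="path/to/other/inputfile",
)
rule_output = SimpleNamespace(
    out1="path/to/outputfile",
    out2="path/to/another/outputfile",
)

# Same command line as in the shell directive of the rule.
template = (
    "Rscript path/to/script.R "
    "{input.in1} {input.in2} {output.out1} {output.out2}"
)

# str.format resolves attribute lookups such as {input.in1}.
command = template.format(input=rule_input, output=rule_output)

# Drop "Rscript" and the script path; what remains corresponds to
# what commandArgs(trailingOnly=TRUE) returns inside the R script.
trailing_args = shlex.split(command)[2:]

for position, value in enumerate(trailing_args, start=1):
    print(f"args[{position}] = {value}")
```

The order of the placeholders in the command is the only thing that ties inFile1, inFile2, outFile1 and outFile2 to the right files, so the rule and the R script have to be kept in sync by hand.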
Use of conda environment:
You can specify a conda environment to use for a specific rule:
rule NAME:
    input:
        in1="path/to/inputfile",
        in2="path/to/other/inputfile"
    output:
        out1="path/to/outputfile",
        out2="path/to/another/outputfile"
    conda: "r.yml"
    script:
        "path/to/script.R"
and in your r.yml file:
name: rEnv
channels:
- r
dependencies:
- r-base=3.6
Then when you run snakemake:
snakemake .... --use-conda
Snakemake will create all environments before running any jobs, and each environment will be activated inside the job submitted to SLURM.