Snakemake - load cluster modules before an external script is called

In snakemake, you can call external scripts like so:

rule NAME:
    input:
        "path/to/inputfile",
        "path/to/other/inputfile"
    output:
        "path/to/outputfile",
        "path/to/another/outputfile"
    script:
        "path/to/script.R"

This gives convenient access to an S4 object named snakemake inside the R script. Now in my case, I am running snakemake on a SLURM cluster, and I need to load R with module load R/3.6.0 before an Rscript can be executed, otherwise the job will return:

/usr/bin/bash: Rscript: command not found

How can I tell snakemake to do that? If I run the rule with shell instead of script, the R script no longer has access to the snakemake object, so this is not the solution I want:

shell:
    "module load R/3.6.0;"
    "Rscript path/to/script.R"
bgbrink asked Oct 16 '25 02:10


1 Answer

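You cannot run a shell command such as module load from the script directive. To load the module inside the rule, you have to use the shell directive instead. The R script then loses the snakemake object, but you can pass your inputs and outputs as command-line arguments: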

rule NAME:
    input:
        in1="path/to/inputfile",
        in2="path/to/other/inputfile"
    output:
        out1="path/to/outputfile",
        out2="path/to/another/outputfile"
    shell:
        """
        module load R/3.6.0
        Rscript path/to/script.R {input.in1} {input.in2} {output.out1} {output.out2}
        """

Then read the arguments in the R script:

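# Positional arguments, in the order given on the Rscript command line
args <- commandArgs(trailingOnly = TRUE)
inFile1 <- args[1]   # {input.in1}
inFile2 <- args[2]   # {input.in2}
outFile1 <- args[3]  # {output.out1}
outFile2 <- args[4]  # {output.out2}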

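Using a conda environment:

Alternatively, if R can come from conda rather than from the cluster module, you can keep the script directive (and with it the snakemake object) by specifying a conda environment for the rule: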

rule NAME:
    input:
        in1="path/to/inputfile",
        in2="path/to/other/inputfile"
    output:
        out1="path/to/outputfile",
        out2="path/to/another/outputfile"
    conda: "r.yml"
    script:
        "path/to/script.R"

and in your r.yml file:

name: rEnv
channels:
  - r
dependencies:
  - r-base=3.6

Then when you run snakemake:

snakemake .... --use-conda

Snakemake will create all required conda environments before running any jobs, and the matching environment will be activated inside each job sent to SLURM.

Eric C. answered Oct 18 '25 14:10