
What would be an elegant way of preventing snakemake from failing upon shell/R error?

Tags:

snakemake

I would like to be able to have my snakemake workflows continue running even when certain rules fail.

For example, I'm using a variety of tools to perform peak-calling on ChIP-seq data. However, certain programs exit with an error when they cannot identify any peaks. In such cases I would prefer to create an empty output file (as some peak-callers already do) rather than have snakemake fail.

Is there a snakemake-like way of handling such cases, using the "shell" and "run" keywords?

Thanks

rioualen asked Aug 10 '17 12:08




1 Answer

For shell commands, you can always take advantage of the conditional "or", ||:

rule some_rule:
    output:
        "outfile"
    shell:
        """
        command_that_errors || true
        """

# or...

rule some_rule:
    output:
        "outfile"
    run:
        shell("command_that_errors || true")

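Usually an exit code of zero (0) means success, and anything non-zero indicates failure. Appending || true ensures the shell command as a whole exits successfully even when the command itself returns a non-zero exit code (true always returns 0). Note that Snakemake still checks that the declared output exists once the job finishes, so if the failing command may not write it, follow the command with something like touch {output} to create an empty file.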
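As a rough illustration of the exit-code behaviour described above, the following Python sketch runs a failing command through the shell with and without || true, then creates an empty output file if none was written. The helper name run_tolerant and the file name peaks.bed are made up for the example, and it assumes a POSIX shell where false and true are available.

```python
import subprocess
import tempfile
from pathlib import Path


def run_tolerant(command: str, output_path: Path) -> int:
    """Run a shell command, ignore its failure, and make sure the output exists.

    Hypothetical helper for illustration; returns the command's exit code.
    """
    result = subprocess.run(command, shell=True)
    if not output_path.exists():
        # Create an empty placeholder, mirroring what `touch` would do.
        output_path.touch()
    return result.returncode


# `false` exits with a non-zero status; appending `|| true` turns that into 0.
plain_code = subprocess.run("false", shell=True).returncode
masked_code = subprocess.run("false || true", shell=True).returncode

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "peaks.bed"  # hypothetical output name
    tolerant_code = run_tolerant("false", out)
    created_empty = out.exists() and out.stat().st_size == 0
```

Inside a Snakemake rule the same effect is usually simpler to get in the shell itself; the Python form is mainly useful when the surrounding logic already lives in a run block.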

If you need to allow a specific non-zero exit code, you can use shell or Python to check the code. For Python, it would be something like the following. The shlex.split() function is used so that shell commands do not need to be passed as arrays of arguments.

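import shlex
import subprocess

rule some_rule:
    output:
        "outfile"
    run:
        # run blocks are plain Python, so {output} has to be filled in explicitly
        cmd = "command_that_errors {output}".format(output=output)
        try:
            # pass the argument list without shell=True; with shell=True only the first list element would be run as the command
            proc_output = subprocess.check_output(shlex.split(cmd))
        # an exception is raised by check_output() for non-zero exit codes (usually returned to indicate failure)
        except subprocess.CalledProcessError as exc:
            if exc.returncode == 2: # 2 is an allowed exit code
                # this exit code is OK
                pass
            else:
                # for all others, re-raise the exception
                raise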
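
The same idea can be tried outside Snakemake. The sketch below is a stand-alone version under stated assumptions: the helper names run_allowing and exit_with and the default set of accepted codes (0 and 2) are illustrative, and the failing tool is simulated with the Python interpreter itself.

```python
import shlex
import subprocess
import sys


def run_allowing(command: str, allowed_codes=(0, 2)) -> int:
    """Run a command; return its exit code, raising only for codes not allowed."""
    # shlex.split turns the command string into an argument list,
    # so the command runs directly without going through a shell.
    result = subprocess.run(shlex.split(command))
    if result.returncode not in allowed_codes:
        raise subprocess.CalledProcessError(result.returncode, result.args)
    return result.returncode


def exit_with(code: int) -> str:
    """Build a command string that exits with the given status (simulated tool)."""
    script = f"import sys; sys.exit({code})"
    return f"{shlex.quote(sys.executable)} -c {shlex.quote(script)}"


tolerated = run_allowing(exit_with(2))  # 2 is in the allowed set

try:
    run_allowing(exit_with(3))  # 3 is not allowed, so this raises
    rejected = None
except subprocess.CalledProcessError as exc:
    rejected = exc.returncode
```

Checking returncode against an explicit set, rather than catching the exception and inspecting it, keeps the list of tolerated codes in one place if more than one needs to be accepted.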

The same check in a shell script:

rule some_rule:
    output:
        "outfile"
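    run:
        # rc starts at 0 so a successful run is handled too; any code other than 0 or 2 is passed through unchanged
        shell("rc=0; command_that_errors {output} || rc=$?; if [[ $rc -eq 0 || $rc -eq 2 ]]; then exit 0; else exit $rc; fi")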

tomkinsc answered Sep 25 '22 23:09