How to avoid running Snakemake rule after input or intermediary output file was updated

Even though the output files of my Snakemake build already exist, Snakemake wants to rerun the entire pipeline just because I modified one of the initial input files or an intermediary output file.

I found this out from a Snakemake dry run (-n), which reported the following for an updated input file:

Reason: Updated input files: input-data.csv

and this message for updated intermediary files:

reason: Input files updated by another job: intermediary-output.csv

How can I force Snakemake to ignore the file update?

Vincent Darbot asked Jun 28 '19 08:06


2 Answers

You can use the option --touch to mark them up to date:

--touch, -t
Touch output files (mark them up to date without really changing them) instead of running their commands. This is used to pretend that the rules were executed, in order to fool future invocations of snakemake. Fails if a file does not yet exist.

Beware that this touches all of the workflow's output files, so their modification times are rewritten to put them back in dependency order.
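As a rough sketch of how this might be used: a dry run first, then --touch, then another dry run to confirm nothing is scheduled any more. It assumes snakemake is on the PATH and that a Snakefile sits in the current directory; the guard and the variable name touched are only there for illustration.

```shell
# Sketch: preview the pending jobs, then mark existing outputs as up to date.
# Assumes snakemake is on PATH and a Snakefile is in the current directory.
if command -v snakemake >/dev/null 2>&1 && [ -f Snakefile ]; then
    # Dry run: list the jobs Snakemake would run and the reason for each.
    snakemake -n
    # Touch the existing outputs instead of running the rules.
    # --cores is required by newer Snakemake releases; older ones accept it too.
    snakemake --touch --cores 1
    # Dry run again: the touched outputs should no longer be scheduled.
    snakemake -n
    touched=yes
else
    echo "snakemake or Snakefile not found; nothing touched" >&2
    touched=no
fi
```

Since --touch fails if an output file does not exist yet, this only helps once the pipeline has been run to completion at least once.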

Eric C. answered Sep 19 '22 06:09


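In addition to Eric's answer, see also the ancient() flag, which can be wrapped around an input file in a rule so that Snakemake ignores that file's timestamp and treats it as older than the rule's outputs.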
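A minimal sketch of what that could look like, written from the shell as a small demo Snakefile. The CSV names are the ones from the question; the rule name process, the cp command and the file name Snakefile.demo are made up for illustration.

```shell
# Write a small demo Snakefile. The rule name "process", the cp command and the
# file name Snakefile.demo are hypothetical; the CSV names come from the question.
cat > Snakefile.demo <<'EOF'
rule process:
    input:
        # ancient(): ignore this input's timestamp when deciding whether to rerun
        ancient("input-data.csv")
    output:
        "intermediary-output.csv"
    shell:
        "cp {input} {output}"
EOF
```

With both CSV files present, a dry run such as snakemake -s Snakefile.demo -n should then not schedule this rule merely because input-data.csv has a newer modification time than intermediary-output.csv.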

Also note that the Unix command touch can be used to modify the timestamp of an existing file and make it appear older than it actually is:

touch --date='2004-12-31 12:00:00' foo.txt 
ls -l foo.txt 
-rw-rw-r-- 1 db291g db291g 0 Dec 31  2004 foo.txt 
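
Applied to the situation in the question, the idea might be sketched as follows: backdate the edited input so that it is older than the output built from it. The file names demo-input.csv and demo-output.csv and the dates are placeholders, and both --date and the -nt file test are GNU/bash-style extensions rather than strict POSIX.

```shell
# Hypothetical stand-ins for a pipeline input and an output built from it.
printf 'a,b\n1,2\n' > demo-input.csv
printf 'result\n' > demo-output.csv

# Simulate the problem: the output was built first, then the input was edited.
touch --date='2020-01-01 12:00:00' demo-output.csv
touch --date='2020-01-02 12:00:00' demo-input.csv
if [ demo-input.csv -nt demo-output.csv ]; then
    echo "input is newer than output: a timestamp check would trigger a rerun"
fi

# Backdate the input so that it appears older than the output again.
touch --date='2019-12-31 12:00:00' demo-input.csv
if [ demo-output.csv -nt demo-input.csv ]; then
    echo "output is newer than input: timestamps are back in order"
fi
```

This changes only the modification time, not the file contents.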
dariober answered Sep 18 '22 06:09