Even though the output files of my Snakemake build already exist, Snakemake wants to rerun my entire pipeline just because I modified one of the first input files or an intermediary output file.
I figured this out by doing a Snakemake dry run with -n, which gave the following report for the updated input file:
Reason: Updated input files: input-data.csv
and this message for the updated intermediary files:
reason: Input files updated by another job: intermediary-output.csv
How can I force Snakemake to ignore the file update?
-R forces the named rule to rerun (along with every rule that depends on its output); -n does a "dry run", which only prints what would be executed without actually running anything.
{sample} is a wildcard. Using the same wildcards in the input and output is what tells Snakemake how to match input files to output files. If two rules use a wildcard with the same name, Snakemake still treats the two wildcards as completely independent; rules in Snakemake are self-contained in this way.
A Snakemake workflow is defined by specifying rules in a Snakefile. Rules decompose the workflow into small steps (for example, the application of a single tool) by specifying how to create sets of output files from sets of input files.
You can use the option --touch to mark them up to date:
--touch, -t
Touch output files (mark them up to date without really changing them) instead of running their commands. This is used to pretend that the rules were executed, in order to fool future invocations of snakemake. Fails if a file does not yet exist.
Beware that this will touch all your files and thus modify the timestamps to put them back in order.
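In addition to Eric's answer, see also the ancient flag: wrapping an input file in ancient(...) inside a rule tells Snakemake to ignore that file's timestamp, so a newer modification time on it no longer triggers a rerun.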
Also note that the Unix command touch can be used to modify the timestamp of an existing file and make it appear older than it actually is:
touch --date='2004-12-31 12:00:00' foo.txt
ls -l foo.txt
-rw-rw-r-- 1 db291g db291g 0 Dec 31 2004 foo.txt
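To tie this back to the question, here is a minimal sketch of backdating an edited input so that it predates an existing output again. The file names and dates are hypothetical stand-ins, everything happens in a scratch directory, and it assumes GNU touch (the --date option) and a shell whose test builtin supports -nt, such as bash or dash.

```shell
#!/bin/sh
# Sketch only: hypothetical file names in a throwaway directory; assumes GNU touch.
workdir=$(mktemp -d)

input="$workdir/input-data.csv"
output="$workdir/final-output.csv"

# Simulate the situation from the question: the output exists, then the input is edited.
: > "$output"
touch --date='2020-01-02 12:00:00' "$output"
: > "$input"    # mtime is "now", so the input is newer than the output

# Backdate the input so that it predates the output again.
touch --date='2020-01-01 12:00:00' "$input"

# Compare modification times: -nt means "is newer than".
if [ "$output" -nt "$input" ]; then
    result="output is newer than input"
else
    result="input is still newer than output"
fi
echo "$result"
```

Once the input looks older than the output, a timestamp-based check should no longer flag that file as updated. Unlike --touch, this changes only the one file you name, so the rest of your timestamps stay as they were.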