My question: Within an R session, is there some way to use knitr's cached results to 'fast-forward' to the environment (i.e. the set of objects) available in a given code block, in the same sense that knit() itself does?
knitr's built-in caching of code chunks is one of its killer features.
It's especially helpful when some chunks contain time-consuming computations. Unless they (or a chunk they depend on) are altered, the computations only need to be carried out the first time the document is knitted: upon all subsequent calls to knit, the objects created by the chunk will just be loaded from the cache.
Here's a minimal-ish example, a file called "lotsOfComps.Rnw":
    \documentclass{article}
    \begin{document}

    The calculations in this chunk take a looooong time.

    <<slowChunk, cache=TRUE>>=
    Sys.sleep(30)  ## Stands in for some time-consuming computation
    x <- sample(1:10, size=2)
    @

    I wish I could `fast-forward' to this chunk, to view the cached value of
    \texttt{x}

    <<interestingChunk>>=
    y <- prod(x)^2
    y
    @

    \end{document}
Times needed to knit and TeXify "lotsOfComps.Rnw":
    ## First time
    system.time(knit2pdf("lotsOfComps.Rnw"))
    ##   user  system elapsed
    ##   0.07    0.02   31.81

    ## Second (and subsequent) runs
    system.time(knit2pdf("lotsOfComps.Rnw"))
    ##   user  system elapsed
    ##   0.03    0.02    1.28
Within an R session, is there some way to use knitr's cached results to 'fast-forward' to the environment (i.e. the set of objects) available in a given code block, in the same sense that knit() itself does?
Doing purl("lotsOfComps.Rnw") and then running the code in "lotsOfComps.R" doesn't work, because all of the objects along the way must be recomputed.
Ideally, it would be possible to do something like this to end up in the environment that exists at the beginning of <<interestingChunk>>=:
    spin("lotsOfComps.Rnw", chunk="interestingChunk")
    ls()
    # [1] "x"
    x
    # [1] 3 8
Since spin() is not (yet?) available, what's the best way to get the equivalent result?
The most appropriate use case for caching is to save and reload R objects that take too long to compute, in a code chunk whose code has no side effects such as changing global R options via options() (such changes will not be cached). If a code chunk has side effects, we recommend that you do not cache it.
You can add options to each code chunk. These options allow you to customize how or if you want code to be processed or appear on the rendered output (pdf document, html document, etc). Code chunk options are added on the first line of a code chunk after the name, within the curly brackets.
If you run into problems with cached output you can always clear the knitr cache by removing the folder named with a _cache suffix within your document's directory.
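Removing the cache folder can also be done from R. The following is a minimal sketch, not part of knitr: the helper name clear_knitr_cache is hypothetical, and the folder name you pass in is an assumption you need to check. R Markdown documents typically use a folder with a _cache suffix, while a plain .Rnw file knitted with knitr's default cache.path uses a folder called "cache".

```r
# Hypothetical helper (not a knitr function): delete a knitr cache
# directory so that every cached chunk is recomputed on the next knit.
clear_knitr_cache <- function(cache_dir) {
  # Nothing to do if the directory is not there
  if (!dir.exists(cache_dir)) return(invisible(FALSE))
  # recursive = TRUE is required to remove a directory and its contents
  unlink(cache_dir, recursive = TRUE)
  # Report whether the directory is really gone
  invisible(!dir.exists(cache_dir))
}

# Usage (folder names are assumptions; adjust to your document):
# clear_knitr_cache("cache")              # default for an .Rnw file
# clear_knitr_cache("lotsOfComps_cache")  # typical for an R Markdown file
```

The function returns FALSE invisibly when there was nothing to delete, so it is safe to call before every full rebuild.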
Here is one solution, which is still a little bit awkward but it works. The idea is to add a chunk option named mute which takes NULL by default, but it can also take an R expression, e.g. mute_later() below. When knitr evaluates the chunk options, mute_later() can be evaluated and NULL is returned; at the same time, there are side effects in opts_chunk (setting the global chunk options like eval = FALSE).
Now what you need to do is to put mute=mute_later() in the chunk after which you want to skip the rest of the chunks, e.g. you can move this option from example-a to example-b. Because mute_later() returns NULL, which happens to be the default value of the mute option, the cache will not be broken even if you move this option around.
    \documentclass{article}
    \begin{document}

    <<setup, include=FALSE, cache=FALSE>>=
    rm(list = ls(all.names = TRUE), envir = globalenv())
    opts_chunk$set(cache = TRUE) # enable cache to make it faster
    opts_chunk$set(eval = TRUE, echo = TRUE, include = TRUE)

    # set global options to mute later chunks
    mute_later = function() {
      opts_chunk$set(cache = FALSE, eval = FALSE, echo = FALSE, include = FALSE)
      NULL
    }
    # a global option mute=NULL so that using mute_later() will not break cache
    opts_chunk$set(mute = NULL)
    @

    <<example-a, mute=mute_later()>>=
    x = rnorm(4)
    Sys.sleep(5)
    @

    <<example-b>>=
    y = rpois(10,5)
    Sys.sleep(5)
    @

    <<example-c>>=
    z = 1:10
    Sys.sleep(3)
    @

    \end{document}
It is awkward in the sense that you have to cut-and-paste , mute=mute_later() around. Ideally you should just set the chunk label like the gist I wrote for Barry.
The reason my original gist did not work is that chunk hooks are ignored when a chunk is cached. The second time you knit() the file, the chunk hook checkpoint for example-a was skipped, so eval=TRUE stayed in effect for the rest of the chunks, and you saw that all chunks were evaluated. By comparison, chunk options are always dynamically evaluated.