
Caching knitr external code from multiple Rmd files

Tags: caching, r, knitr

I'm having difficulty getting knitr to utilize caching between two Rmd documents sharing common source code in an external R file. Although I can see in the file system that both documents are writing output to the same set of cache files, each time one Rmd document is knitted to HTML it overwrites the cache files created when the previous Rmd was knitted. Multiple knits of the same Rmd file successfully utilize the cache without re-executing the shared code. Have I missed something in configuring the cache options for support of multiple documents?

Sample code and sessionInfo() dump are below. Thanks in advance for any assistance you can offer.

test1.R

## @knitr source_chunk_1
x <- Sys.time()
x

test1a.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
```

test1b.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
```

sessionInfo

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] knitr_1.5

loaded via a namespace (and not attached):
[1] evaluate_0.5.3   formatR_0.10     rmarkdown_0.2.05 stringr_0.6.2    tools_3.1.0
Wil McKoy asked Nov 10 '22 07:11



1 Answer

After downloading the knitr source from GitHub and hacking around in it, I believe I've found the source of the problem. Code in block.R builds the cache hash by calling the digest() function on the contents and options of the code chunk being processed:

hash = paste(valid_path(params$cache.path, label), digest::digest(content), sep = '_')

I temporarily inserted code to write out the data stored in the content object for each of my sample Rmd scripts above. The default fig.path option value was the only component of the content that differed between them.

> content$fig.path
[1] "./test1a_files/figure-html/"

> content$fig.path
[1] "./test1b_files/figure-html/"

Setting a global fig.path in each Rmd file caused the content objects and resulting hash values to be identical. Now, when I knit the two Rmd files, the same cached value is used for both.
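
To illustrate why a single differing option is enough to change the cache key, here is a minimal sketch that calls digest() directly. The content lists are simplified stand-ins I made up for illustration (knitr's real content object holds many more options), and cache_key() is a hypothetical helper that mimics the paste() call quoted above, with paste0() standing in for knitr's internal valid_path().

```r
library(digest)

# Hypothetical helper mimicking the hash construction quoted above.
# paste0() is an assumed stand-in for knitr's internal valid_path().
cache_key <- function(cache_path, label, content) {
  paste(paste0(cache_path, label), digest(content), sep = "_")
}

# Simplified stand-in for the chunk content (assumed structure, not knitr's own).
make_content <- function(fig_path) {
  list(code = c("x <- Sys.time()", "x"), fig.path = fig_path)
}

# Per-document default fig.path values, as seen in the debug output.
key_a <- cache_key("knitrcache/test-", "local_chunk_1",
                   make_content("./test1a_files/figure-html/"))
key_b <- cache_key("knitrcache/test-", "local_chunk_1",
                   make_content("./test1b_files/figure-html/"))
print(identical(key_a, key_b))              # FALSE: the two documents get different keys

# A shared fig.path set in both documents.
key_a_fixed <- cache_key("knitrcache/test-", "local_chunk_1",
                         make_content("knitrfig/test-"))
key_b_fixed <- cache_key("knitrcache/test-", "local_chunk_1",
                         make_content("knitrfig/test-"))
print(identical(key_a_fixed, key_b_fixed))  # TRUE: both documents share one key
```

The sketch only shows that any difference in the hashed object changes the key. Which options knitr actually includes in that object may vary between versions.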

test1.R

## @knitr source_chunk_1
x <- Sys.time()
x

test1a.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-", fig.path = "knitrfig/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
``` 

test1b.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-", fig.path = "knitrfig/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
``` 
Wil McKoy answered Nov 15 '22 07:11
