
rmarkdown::render problem when called from a package

Tags: r, r-markdown

I've made a small package to reproduce the problem:

# example package
devtools::install_github("privefl/minipkg")

# example Rmd
rmd <- system.file("extdata", "Matrix.Rmd", package = "minipkg")
writeLines(readLines(rmd))  ## see content

# works fine
rmarkdown::render(
  rmd,
  "all",
  envir = new.env(),
  encoding = "UTF-8"
)

# !! does not work !!
minipkg::my_render(rmd)
minipkg::my_render  ## see source code

I don't understand why the behaviour is different and how to fix this.

Edit: I know I can use Matrix::t(). My question is more "why do I need to use it in this particular case and not in all the other cases (such as calling rmarkdown::render() outside of a package)?".


Error

Quitting from lines 10-13 (Matrix.Rmd) 
Error in t.default(mat) : argument is not a matrix

Matrix.Rmd File

---
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r}
library(Matrix)
mat <- rsparsematrix(10, 10, 0.1)
t(mat)
```

Console Output:

> # example package
> devtools::install_github("privefl/minipkg")
Downloading GitHub repo privefl/minipkg@master
✔  checking for file ‘/private/var/folders/md/03gdc4c14z18kbqwpfh4jdfc0000gr/T/RtmpKefs4h/remotes685793b9df4/privefl-minipkg-c02ae62/DESCRIPTION’ ...
─  preparing ‘minipkg’:
✔  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘minipkg_0.1.0.tar.gz’

* installing *source* package ‘minipkg’ ...
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (minipkg)
> # example Rmd
> rmd <- system.file("extdata", "Matrix.Rmd", package = "minipkg")
> writeLines(readLines(rmd))  ## see content
---
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r}
library(Matrix)
mat <- rsparsematrix(10, 10, 0.1)
t(mat)
```

> # works fine
> rmarkdown::render(
+   rmd,
+   "all",
+   envir = new.env(),
+   encoding = "UTF-8"
+ )


processing file: Matrix.Rmd
  |.............                                                    |  20%
  ordinary text without R code

  |..........................                                       |  40%
label: setup (with options) 
List of 1
 $ include: logi FALSE

  |.......................................                          |  60%
  ordinary text without R code

  |....................................................             |  80%
label: unnamed-chunk-1
  |.................................................................| 100%
  ordinary text without R code


output file: Matrix.knit.md

/usr/local/bin/pandoc +RTS -K512m -RTS Matrix.utf8.md --to html4 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash+smart --output Matrix.html --email-obfuscation none --self-contained --standalone --section-divs --template /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /var/folders/md/03gdc4c14z18kbqwpfh4jdfc0000gr/T//RtmpKefs4h/rmarkdown-str68525040df1.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --metadata pagetitle=Matrix.utf8.md 

Output created: Matrix.html
> # !! does not work !!
> minipkg::my_render(rmd)


processing file: Matrix.Rmd
  |.............                                                    |  20%
  ordinary text without R code

  |..........................                                       |  40%
label: setup (with options) 
List of 1
 $ include: logi FALSE

  |.......................................                          |  60%
  ordinary text without R code

  |....................................................             |  80%
label: unnamed-chunk-1
Quitting from lines 10-13 (Matrix.Rmd) 
Error in t.default(mat) : argument is not a matrix

> minipkg::my_render  ## see source code
function (rmd) 
{
    rmarkdown::render(rmd, "all", envir = new.env(), encoding = "UTF-8")
}
<bytecode: 0x7f89c416c2a8>
<environment: namespace:minipkg>
>
Asked Nov 21 '18 10:11 by F. Privé




1 Answer

How it works

The problem is envir = new.env(). What you need is envir = new.env(parent = globalenv()):

devtools::install_github("privefl/minipkg")
rmd <- system.file("extdata", "Matrix.Rmd", package = "minipkg")

minipkg::my_render(rmd)
# Fails

f <- minipkg::my_render
body(f) <- quote(rmarkdown::render(rmd, "all", envir = new.env(parent = globalenv()), encoding = "UTF-8"))

ns <- getNamespace("minipkg")
unlockBinding("my_render", ns)
assign("my_render", f, envir = ns)

minipkg::my_render(rmd)
# Patched one works :)

Why it works

Look at the default arguments of new.env(): the default parent environment is parent.frame(), i.e. the environment the call was made from. From the console this is globalenv(); inside my_render() it is the function's execution environment, whose enclosure is the package's namespace (not the same as the package environment!).
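A minimal sketch of that default, using base R only. The helper make_env() is hypothetical and not part of minipkg; it only stands in for any function that calls new.env() without arguments.

```r
# The default value of `parent` is the unevaluated call parent.frame()
default_parent <- deparse(formals(new.env)$parent)
print(default_parent)

# Hypothetical helper: creates an environment with new.env() and also
# returns the execution environment of the call that created it.
make_env <- function() {
  here <- environment()
  list(env = new.env(), frame = here)
}

res <- make_env()

# new.env() was called from inside make_env(), so the new environment's
# parent is that call's execution environment, not globalenv().
print(identical(parent.env(res$env), res$frame))
```

The same thing happens inside my_render(): the environment handed to rmarkdown::render() hangs off the function's own frame, and lookup continues from there into whatever encloses the function.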

You can get a package namespace with getNamespace("pkg"). It is the environment that contains all objects of the package, internal ones included. Its parent is the package's imports environment, followed by the base namespace, and only then globalenv() and the rest of the search path. Names are therefore looked up in base before any attached package is consulted: t resolves to base::t(), which ends up in t.default() and fails, even though library(Matrix) has attached Matrix's own t() to search().

Choosing new.env(parent = globalenv()) instead makes the global environment the parent, so lookup starts at the top of the search path and finds everything that has been attached.
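The difference between the two parents can be inspected by walking the chain of enclosing environments. This is only a sketch: env_chain() and in_pkg() are hypothetical names, and the stats namespace is used as a stand-in for the minipkg namespace so that the example runs without installing anything.

```r
# Collect the names of an environment and all of its ancestors.
# Unnamed environments (such as function frames) show up as "".
env_chain <- function(env) {
  out <- character()
  repeat {
    out <- c(out, environmentName(env))
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  out
}

# Stand-in for a packaged function: a closure re-homed in a namespace
# (assumption: stats plays the role of minipkg here).
in_pkg <- function() new.env()
environment(in_pkg) <- asNamespace("stats")

chain_default <- env_chain(in_pkg())
chain_global  <- env_chain(new.env(parent = globalenv()))

print(chain_default)
print(chain_global)

# With the default parent, the base namespace comes before the global
# environment, so base functions mask anything attached via library().
print(match("base", chain_default) < match("R_GlobalEnv", chain_default))
```

In chain_default the namespace and "base" appear before "R_GlobalEnv", while chain_global begins directly at "R_GlobalEnv" and then continues through the attached packages.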

Benchmarking different approaches

These three approaches all produce proper HTML files:

#' Render an Rmd file
#' @param rmd Path of the R Markdown file to render.
#' @export
my_render <- function(rmd) {
  rmarkdown::render(
    rmd,
    "all",
    envir = new.env(parent = globalenv()),
    encoding = "UTF-8"
  )
}

#' Render an Rmd file
#' @param rmd Path of the R Markdown file to render.
#' @export
my_render2 <- function(rmd) {
  cl <- parallel::makePSOCKcluster(1)
  on.exit(parallel::stopCluster(cl), add = TRUE)
  parallel::clusterExport(cl, "rmd", envir = environment())
  parallel::clusterEvalQ(cl, {
    rmarkdown::render(rmd, "all", encoding = "UTF-8")
  })[[1]]
}

#' Render an Rmd file
#' @param rmd Path of the R Markdown file to render.
#' @export
my_render3 <- function(rmd) {
    system2(
        command = "R",
        args = c("-e", shQuote(sprintf("rmarkdown::render('%s', 'all', encoding = 'UTF-8')", normalizePath(rmd, winslash = "/")))),
        wait = TRUE
    )
}

Now it's interesting to compare their speed:

> microbenchmark::microbenchmark(my_render("inst/extdata/Matrix.Rmd"), my_render2("inst/extdata/Matrix.Rmd"), my_render3("inst/extdata/Matrix.Rmd"), times = 10L)

[...]

Unit: milliseconds
                                  expr       min       lq      mean    median        uq      max neval
  my_render("inst/extdata/Matrix.Rmd")  352.7927  410.604  656.5211  460.0608  560.3386 1836.452    10
 my_render2("inst/extdata/Matrix.Rmd") 1981.8844 2015.541 2163.1875 2118.0030 2307.2812 2407.027    10
 my_render3("inst/extdata/Matrix.Rmd") 2061.7076 2079.574 2152.0351 2138.9546 2181.1284 2377.623    10

Conclusions

  • envir = new.env(parent = globalenv()) is by far the fastest (about 4.6x faster than the alternatives at the median).
    I expect the overhead of the alternatives to be constant, so it should be irrelevant for larger Rmd files.
  • There is no discernible difference between spawning a new process with system2 and using a parallel PSOCK cluster with 1 node.
Answered Oct 26 '22 22:10 by AlexR