
Rmarkdown: writing inline dplyr code if column names have spaces defined with backticks

Problem

My inline code breaks when I filter() or select() a column whose name contains white space, which I would normally quote with backticks in dplyr.

Example Data

    ```{r setup, include=FALSE}
    knitr::opts_chunk$set(echo = TRUE)
    library(dplyr)
    library(knitr)
    library(lazyeval)

    df <- structure(list(1:3, 2:4), .Names = c("a", "b"), row.names = c(NA, -3L), class = "data.frame")

    df <- df %>% select(`a a`=a, `b b`=b)
    ```

Inline code chunk

I'm trying something like `r df %>% filter(`a a` == 1) %>% select(`a a`) %>% as.numeric()`, but I get the following error:

    Error in base::parse(text = code, keep.source = FALSE) :
      <text>:2.0: unexpected end of input
    1: df %>% filter(
       ^
    Calls: <Anonymous> ... inline_exec -> withVisible -> eval -> parse_only -> <Anonymous>

...for pretty obvious reasons (the backticks end the inline code chunk). I could rename the columns in a code chunk after the in-text calculations (I'm formatting them for a table), but it would be frustrating to have to break it up.

Costly lazyeval solution

This solves the problem: `r df %>% filter_(interp(~ which_column == 1, which_column = as.name("a a"))) %>% select_(as.name("a a")) %>% as.numeric()`, but there has got to be a better way.

sullij asked Jan 18 '17 20:01


1 Answer

I am not sure how you are running things, so this answer is written with respect to knitr.

There is no easy solution for this case, and the work-around of moving some code inside the chunks (as suggested in one of the comments) is probably the way to go.
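A minimal sketch of that work-around, assuming the data frame from the question: the value is computed in a regular chunk, where backticks are ordinary R syntax, and stored under a name without spaces (a_value is a hypothetical name). The inline code then only needs to reference that name, as `r a_value`.

```r
library(dplyr)

# Same shape as the data in the question: column names containing spaces
df <- data.frame(`a a` = 1:3, `b b` = 2:4, check.names = FALSE)

# Inside a chunk, backticks are parsed as ordinary R syntax
a_value <- df %>%
  filter(`a a` == 1) %>%
  pull(`a a`)

# In the text, reference only the stored value: `r a_value`
print(a_value)
```

The chunk can sit directly above the paragraph that uses the value, so the renaming for the table does not have to move.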

For future reference and further insight, I'll still share the underlying problem and an alternative solution.

Note that knitr makes use of the following pattern for inline.code (given you are using Rmarkdown format):

    knitr::all_patterns$md$inline.code
    [1] "`r[ #]([^`]+)\\s*`"

The function knitr:::parse_inline applies this pattern through a call to stringr::str_match_all. A match starts at `r followed by a space or #, continues with one or more non-backtick characters ([^`]+) and zero or more whitespace characters (\\s*), and ends at a backtick.

The match therefore ends at the first backtick following `r, more or less no matter what, which is why your expression is cut off at df %>% filter(. This makes some sense: parse_inline collapses the lines of input into a single string, and that string may contain multiple inline-code statements as well as plain text containing backticks.
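The truncation can be reproduced outside knitr. The sketch below hard-codes the pattern quoted above (the installed knitr version may ship a different one) and uses base R's regexec with perl = TRUE as a stand-in for stringr::str_match_all; the variable names are only illustrative.

```r
# Pattern as quoted above (hard-coded; the installed knitr version may differ)
inline_pattern <- "`r[ #]([^`]+)\\s*`"

# The inline expression from the question, as it appears in the .Rmd source
src <- "`r df %>% filter(`a a` == 1) %>% select(`a a`) %>% as.numeric()`"

# Base R stand-in for stringr::str_match_all: full match plus capture group
m <- regmatches(src, regexec(inline_pattern, src, perl = TRUE))[[1]]
captured <- m[2]

# The capture stops at the backtick that opens `a a`
print(captured)
```

The captured text is the incomplete expression that appears in the parse error from the question.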

If, however, you restrict yourself to some conventions, you can modify the pattern to detect the end of an inline code piece differently. Below I assume that every piece of inline code is followed by a line break, so that, following your setup chunk, the document contains only the following:

    Hello there.

    `r df %>% filter(`a a` == 1) %>% select(`a a`) %>% as.numeric()`

    This should read 1 above here.

Then I can knit in the following way, modifying the pattern to take everything until a backtick followed by a new-line break:

    library(knitr)
    opts_knit$set('verbose' = TRUE)

    # Start from the default R Markdown patterns
    knit_patterns$set(all_patterns$md)

    # Capture everything up to a backtick that is followed by a line break
    inline.code.2 <- "`r[ #](.+)\\s*`\n"
    knit_patterns$set(inline.code = inline.code.2)

    knit2html("MyRmarkdownFile.rmd")
    browseURL("MyRmarkdownFile.html")

Finding a general rule for this pattern that works for everyone seems impossible though.
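To illustrate both sides, here is a small sketch that applies the modified pattern with base R (regexec with perl = TRUE, assumed to behave like stringr for this pattern). capture_inline is a hypothetical helper, not part of knitr.

```r
# Modified pattern from above: greedy capture up to a backtick that ends a line
inline_pattern_2 <- "`r[ #](.+)\\s*`\n"

capture_inline <- function(text) {
  # Hypothetical helper: returns the first capture group, or NA if no match
  m <- regmatches(text, regexec(inline_pattern_2, text, perl = TRUE))[[1]]
  if (length(m) < 2) NA_character_ else m[2]
}

# Convention respected: the inline code is alone on its line
ok <- "`r df %>% filter(`a a` == 1) %>% select(`a a`) %>% as.numeric()`\n"
print(capture_inline(ok))

# Convention broken: two inline pieces on one line are merged into one capture
merged <- "Values `r x` and `r y`\n"
print(capture_inline(merged))
```

In the first case the whole expression, inner backticks included, is captured. In the second, the greedy (.+) swallows the text between the two inline pieces, so any document that puts inline code mid-sentence would break under this pattern.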

RolandASc answered Sep 28 '22 11:09