Modularized R markdown structure

There are a few questions about this already, but they are either unclear or provide solutions that don't work, perhaps because they are outdated:

  • Proper R Markdown Code Organization
  • How to source R Markdown file like `source('myfile.r')`?
  • http://yihui.name/knitr/demo/externalization/

Modularized code structure for large projects

R Markdown/Notebook is nice, but as it is usually presented, a single file holds all the text and all the code chunks. I often have projects where such a single-file structure is not a good setup. Instead, I use a single master .R file that loads the other .R files in order. I'd like to replicate this structure with R Notebook, i.e. have a single .Rmd file that pulls in the code from multiple .R files.

The nice thing about working this way is that it keeps the normal RStudio workflow with .R files while still giving the neat output of R Notebook/Markdown, without duplicating any code.

Minimal example

This is simplified to make the example as small as possible. Two .R files and one master .Rmd file.

start.R

# libs --------------------------------------------------------------------
library(pacman)
p_load(dplyr, ggplot2)
#normally load a lot of packages here

# data --------------------------------------------------------------------
d = iris
#use iris for example, but normally would load data from file

# data manipulation tasks -------------------------------------------------
#some code here to extract useful info from the data
setosa = dplyr::filter(d, Species == "setosa")

plot.R

#setosa only
ggplot(setosa, aes(Sepal.Length)) +
  geom_density()

#all together
ggplot(d, aes(Sepal.Length, color = Species)) +
  geom_density()

And then the notebook file:

notebook.Rmd:

---
title: "R Notebook"
output:
  html_document: default
  html_notebook: default
---

First we load some packages and data and do slight transformation:

```{r start}
#a command here to load the code from start.R and display it
```

```{r plot}
#a command here to load the code from plot.R and display it
```

Desired output

The desired output is what one gets by manually copying the code from start.R and plot.R into the code chunks in notebook.Rmd. It looks like this (some of it is cut off for lack of screen space):

[Screenshot: rendered notebook showing the code and its output]

Things I've tried

source

This loads the code, but does not display it. It just displays the source command:

[Screenshot: rendered notebook showing only the source command]

knitr::read_chunk

This command was mentioned here, but actually it does the same as source as far as I can tell: it loads the code but displays nothing.

[Screenshot: rendered notebook showing only the knitr::read_chunk command]

How do I get the desired output?

CoderGuy123 asked Nov 10 '16 04:11


1 Answer

The solution is to use knitr's chunk option code. According to knitr docs:

code: (NULL; character) if provided, it will override the code in the current chunk; this allows us to programmatically insert code into the current chunk; e.g. a chunk option code = capture.output(dump('fivenum', '')) will use the source code of the function fivenum to replace the current chunk

No example is provided, however. It sounds like one has to feed it a character vector, so let's try readLines:

```{r start, code=readLines("start.R")}
```

```{r plot, code=readLines("plot.R")}
```

This produces the desired output and thus allows for a modularized project structure.

Feeding it a file directly does not work (i.e. code="start.R"), but would be a nice enhancement.
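
Putting the pieces together, a minimal sketch of the full notebook.Rmd might look like the following. It assumes start.R and plot.R sit in the same directory as the notebook, so that the relative paths passed to readLines resolve. The sentence between the two chunks and the HTML comment are only illustrative additions.

````markdown
---
title: "R Notebook"
output:
  html_document: default
  html_notebook: default
---

First we load some packages and data and do slight transformation:

<!-- Chunk bodies stay empty: the code option supplies their content -->
```{r start, code=readLines("start.R")}
```

Then we plot the densities from plot.R:

```{r plot, code=readLines("plot.R")}
```
````

Knitting this file should display the contents of each script followed by its output, as if the code had been pasted into the chunks by hand, while the .R files remain the only place where the code is edited.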

CoderGuy123 answered Oct 03 '22 18:10