Proper R Markdown Code Organization

Tags: r, r-markdown

I have been reading about R Markdown (here, here, and here) and using it to create solid reports. I would like to take the small amount of code I currently run for ad hoc analyses and turn it into more scalable data reports.

My question is rather broad: Is there a proper way to organize your code around an R Markdown project? Say, have one script that generates all of the data structures?

For example: Let's say that I have the cars data set and I have brought in commercial data on the manufacturer. What if I wanted to attach the manufacturer to the current cars data set, and then produce a separate summary table for each company using a manipulated data set cars.by.name as well as plot a certain sample using cars.import?

EDIT: Right now I have two files open. One is an R Script file that has all of the data manipulation: subsetting and re-categorizing values. And the other is the R Markdown file where I am building out text to accompany the various tables and plots of interest. When I call an object from the R Script file--like:

```{r}
table(cars.by.name$make)
```

I get an error saying: `Error in summary(cars.by.name$make) : object 'cars.by.name' not found`

EDIT 2: I found this older thread to be helpful. Link

---
title: "Untitled"
author: "Jeb"
date: "August 4, 2015"
output: html_document
---


This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

```{r}
table(cars.by.name$make)
```  

```{r}
summary(cars)
summary(cars.by.name)
```

```{r}
table(cars.by.name)
```   
You can also embed plots, for example:

```{r, echo=FALSE}
plot(cars)
plot(cars.import)
```

Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.

Jebediah15 asked Aug 05 '15 03:08


2 Answers

There is a solution for this sort of problem, explained here.

Basically, if you have an .R file containing your code, there is no need to repeat that code in the .Rmd file; you can include it from the .R file instead. For this to work, the chunks of code must be labelled in the .R file, and they can then be included by label in the .Rmd file.

test.R:

## ---- chunk-1 ----
table(cars.by.name$make)

test.Rmd:

Just once, at the top of the .Rmd file:

```{r echo=FALSE, cache=FALSE}
knitr::read_chunk('test.R')
```

For every chunk you're including (replace chunk-1 with the label of that specific chunk in your .R file):

```{r chunk-1}
```

Note that the chunk should be left empty (as shown). When the document is knitted, the code with that label in the .R file is brought into the chunk and run.
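
As a rough sketch of how a fuller test.R might be laid out, the script below splits the data preparation and the table into separately labelled chunks. The data values and the chunk labels `prepare-data` and `make-table` are made up for illustration; only the object names `cars.import` and `cars.by.name` come from the question.

```r
# test.R (illustrative sketch; the data values are hypothetical)

## ---- prepare-data ----
# Stand-in for the imported manufacturer data described in the question
cars.import <- data.frame(
  model = c("m1", "m2", "m3", "m4"),
  make  = c("Acme", "Zenith", "Acme", "Zenith"),
  speed = c(10, 15, 12, 20),
  stringsAsFactors = FALSE
)
# Sort by manufacturer so later summaries group naturally
cars.by.name <- cars.import[order(cars.import$make), ]

## ---- make-table ----
# Count of rows per manufacturer
make.counts <- table(cars.by.name$make)
make.counts
```

In the .Rmd, an empty chunk labelled `prepare-data` would need to come before the one labelled `make-table`, so that `cars.by.name` exists by the time the table is built.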

pbahr answered Sep 28 '22 05:09


Often I have many reports that need to run the same code with slightly different parameters. What I typically do is call all my "stats" functions separately, save the results, and then just reference those results in the report. The way to do this is as follows:

---
title: "Untitled"
author: "Author"
date: "August 4, 2015"
output: html_document
---

```{r, echo=FALSE, message=FALSE}
directoryPath <- "rawPath"  # something like /Users/userid/RDataFile
fullPath <- file.path(directoryPath, "myROutputFile.RData")
load(fullPath)
```

Some Text, headers whatever

```{r}
summary(myStructure$value1) #Where myStructure was saved to the .RData file
```  

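You can create the .RData file with the `save.image()` command, which saves every object in the current workspace (pass `file = ` to choose where it is written). To save only selected objects, use `save()` instead.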
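
A minimal sketch of the full round trip, assuming a separate preparation script: the script name `prepare_data.R`, the contents of `myStructure`, and the `report_env` environment are hypothetical, and `tempdir()` is used only to keep the example self-contained.

```r
# --- prepare_data.R (hypothetical script name) ---
# Build the object the report needs; the values are made up
myStructure <- list(value1 = c(1, 2, 3, 4))

# tempdir() keeps the sketch self-contained; a real project would
# point at its own data directory instead
fullPath <- file.path(tempdir(), "myROutputFile.RData")
save(myStructure, file = fullPath)  # writes only the named object

# --- inside a chunk of the .Rmd ---
# Loading into a separate environment makes it easy to see what the file held;
# load() returns the names of the restored objects
report_env <- new.env()
loaded <- load(fullPath, envir = report_env)
summary(report_env$myStructure$value1)
```

In an actual report, a plain `load(fullPath)` as in the chunk above is enough; the separate environment is only there to make the restored object names visible.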

Hope that helps!

user1357015 answered Sep 28 '22 05:09