
R Knitr PDF: Is there a possibility to automatically save PDF reports (generated from .Rmd) through a loop?

I would like to write a loop that automatically saves PDF reports generated from a .Rmd file. For instance, if the variable "ID" has 10 distinct values, R should save 10 reports to a specific directory, each one based on the selected ID.

A previous post (Using loops with knitr to produce multiple pdf reports... need a little help to get me over the hump) has dealt with the creation of multiple pdf reports generated from .Rnw files. I tried to apply the approach as follows:

#Data

```{r, include=FALSE}
set.seed(500)
Score <- rnorm(40, 100, 15)
Criteria1<-rnorm(40, 10, 5)
Criteria2<-rnorm(40, 20, 5)
ID <- sample(1:1000,8,replace=T)
df <- data.frame(ID,Score,Criteria1,Criteria2)

#instead of manually choosing the ID:

subgroup<- subset(df, ID==1) 

# I would like to subset the data through a loop. My approach was like this:

for (id in unique(df$ID)){
subgroup<- df[df$ID == id,]}

```

```{r, echo=FALSE}
#Report Analysis

summary(subgroup)
```
#Here will be some text about the summary.



# At the end the goal is to produce automatic pdf reports with the ID name as a filename:

library("rmarkdown")
render("Automated_Report.rmd",output_file = paste('report.', id, '.pdf', sep=''))
user3491036 asked May 24 '15 09:05



1 Answer

Adapting your example:

You need one .rmd "template" file. It could look something like this; save it as template.rmd.

This is a subgroup report.

```{r, echo=FALSE}
#Report Analysis
summary(subgroup)
```

Then you need an R script that loads the data you want, loops through the data subsets, and for each subset:

  1. Defines the subgroup object used inside the template
  2. Renders the template to the desired output

So, in this separate script:

# load data 
set.seed(500)
Score <- rnorm(40, 100, 15)
Criteria1 <- rnorm(40, 10, 5)
Criteria2 <- rnorm(40, 20, 5)
ID <- sample(1:1000, 8, replace = TRUE)
df <- data.frame(ID, Score, Criteria1, Criteria2)

library("rmarkdown")

# in a single for loop
#  1. define subgroup
#  2. render output
for (id in unique(df$ID)){
    subgroup <- df[df$ID == id,]
    render("template.rmd",output_file = paste0('report.', id, '.html'))    
}

This produced 8 HTML files in my working directory, each with a summary of a different subset of the data.
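The question asked for PDF files in a specific directory, whereas the loop above writes HTML to the working directory. Here is a minimal sketch of that variant, assuming a LaTeX installation is available for PDF output. The helper names `report_name` and `render_pdf_reports` and the `reports` folder are hypothetical, chosen only for illustration; `output_format`, `output_dir`, and `envir` are arguments of `rmarkdown::render`.

```r
library(rmarkdown)

# Build the output file name for one or more IDs, e.g. "report.7.pdf"
report_name <- function(id) paste0("report.", id, ".pdf")

# Hypothetical helper (not part of rmarkdown): render one PDF per unique ID.
# Assumes the template refers to an object called `subgroup`, as template.rmd does.
render_pdf_reports <- function(df, template = "template.rmd", out_dir = "reports") {
  # Create the target directory if it does not exist yet
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  for (id in unique(df$ID)) {
    # Give the template its own environment that contains only `subgroup`
    env <- new.env(parent = globalenv())
    env$subgroup <- df[df$ID == id, ]

    render(
      template,
      output_format = "pdf_document",
      output_file = report_name(id),
      output_dir = out_dir,
      envir = env
    )
  }

  # Return the paths of the files that were requested, invisibly
  invisible(file.path(out_dir, report_name(unique(df$ID))))
}
```

With the data frame from the script above, `render_pdf_reports(df)` should leave one `report.<ID>.pdf` per unique ID in the `reports` folder. Passing a dedicated environment through `envir` is one way to avoid depending on whatever happens to be in the global environment.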

Note that this will not work if you click the "Knit" button in RStudio, because that runs the R code in a separate R session. However, when you call render (or knit2pdf) explicitly from the console, the R code in the .rmd file still has access to the global environment.

Rather than relying on global variables, another option is to use parameterized reports: define the parameters in the YAML header and pass their values to rmarkdown::render through its params argument.
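A minimal sketch of the parameterized approach follows, assuming a template saved as `template_params.rmd` that declares an `id` parameter in its YAML header. The file name, the template text, and the helper `render_param_reports` are hypothetical. The template is written from R here only to keep the example self-contained; in practice it would be an ordinary file.

```r
library(rmarkdown)

# Three backticks, built programmatically so the chunk fence can be
# embedded in an R string without ambiguity
fence <- strrep("`", 3)

# Hypothetical parameterized template: render() supplies `params$id`
template_lines <- c(
  "---",
  "title: \"Subgroup report\"",
  "output: html_document",
  "params:",
  "  id: 1",
  "---",
  "",
  "This is the report for ID `r params$id`.",
  "",
  paste0(fence, "{r, echo=FALSE}"),
  "subgroup <- df[df$ID == params$id, ]",
  "summary(subgroup)",
  fence
)
writeLines(template_lines, "template_params.rmd")

# Hypothetical helper: render one report per unique ID, passing the ID
# through the `params` argument of render()
render_param_reports <- function(df, template = "template_params.rmd") {
  for (id in unique(df$ID)) {
    # render() evaluates the template in the calling frame by default,
    # so the template can see this function's `df` argument
    render(
      template,
      params = list(id = id),
      output_file = paste0("report.", id, ".html")
    )
  }
}
```

Calling `render_param_reports(df)` with the data frame from the script above should produce one HTML file per unique ID. In this sketch only the ID travels through `params`; the template still looks up `df` in the environment that `render` is called from.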

Gregor Thomas answered Nov 19 '22 03:11
