
Using loops with knitr to produce multiple pdf reports... need a little help to get me over the hump




First of all, I must admit that I'm very new to knitr and the concept of reproducible analysis, but I can see its potential in improving my current workflow (which includes much copy-pasting into Word docs).

I often have to produce multiple reports by group (Hospital in this example), and within each hospital there may be many different Wards that I'm reporting an outcome on. Previously I ran all of my plots and analysis in R using loops, then the copy/pasting work commenced. However, reading this post (Can Sweave produce many pdfs automatically?) gave me hope that I may actually be able to skip many steps and go straight from R to report through Rnw/knitr.

However, after giving it a try I see that there is something that isn't quite working out (as the R environment within the Rnw does not appear to recognize the looping variables I'm trying to pass to it??).

##  make my data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)

##  Here is my current work flow-- produce all plots, but export as png and cut/paste
for(hosp in unique(df$Hospital)){
  subgroup <- df[ df$Hospital == hosp,]
  for(ward in unique(subgroup$Ward)){
    subgroup2 <- subgroup[subgroup$Ward == ward,]
    savename <- paste(hosp, ward)
    plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
  }
}
# followed by much copy/pasting

##  Here is what I'm trying to go for using knitr 
for (hosp in unique(df$Hospital)){
  knit("C:file.path\\testing_loops.Rnw", output=paste('report_', Hospital, '.tex', sep=""))
}

## With the following *Rnw file
## start *.Rnw Code
\documentclass{article}
\usepackage[margin=1.15 in]{geometry}
\begin{document}

<<loaddata, echo=FALSE, message=FALSE>>=
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)
subgroup <- df[ df$Hospital == hosp,]
@

<<setup, echo=FALSE >>=
opts_chunk$set(fig.path = paste("test", hosp , sep=""))
@

Some informative text about hospital \Sexpr{hosp}

<<plots, echo=FALSE >>=
for(ward in unique(subgroup$Ward)){
  subgroup2 <- subgroup[subgroup$Ward == ward,]
  #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
  savename <- paste(hosp, ward)
  plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
}
@

\end{document}

##  To be then turned into pdf with this
tools::texi2pdf("C:file.path\\report_A.tex", clean = TRUE, quiet = TRUE)

After trying to run my knit() code chunk I get this error:

Error in file(con, "w") : invalid 'description' argument

And when I look into the directory where the *.tex file was to be created, I can see that the 2 pdf plots for hospital A were produced (none for B), but there is no hospital-specific *.tex file to turn into a pdf. Thanks in advance for any help you can offer!

Chris asked Mar 13 '13 21:03


2 Answers

You don't need to re-define the data in the .Rnw file, and I think the error comes from building the output name with Hospital (the full vector of hospitals) rather than hosp (the loop index). With the full vector, paste() returns one file name per row of the data instead of a single name, and knit() cannot open that as one output file.
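To make the difference concrete, here is a small base-R sketch (no knitr needed) comparing the two ways of building the output name. The names bad_name, good_name and msg are only illustrative and do not come from the question.

```r
# Same Hospital vector as in the question
Hospital <- c(rep("A", 20), rep("B", 20))
hosp <- "A"  # what the loop index holds on the first iteration

# Using the whole vector gives one name per element
bad_name <- paste('report_', Hospital, '.tex', sep = "")
length(bad_name)   # 40

# Using the loop index gives a single file name
good_name <- paste0('report_', hosp, '.tex')
length(good_name)  # 1
good_name          # "report_A.tex"

# file() expects a single string, so passing the vector fails;
# the error is caught here so the script keeps running
msg <- tryCatch(file(bad_name, "w"), error = function(e) conditionMessage(e))
msg
```

The last call is meant to show the same kind of failure as in the question: a connection needs exactly one description string.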

Following your example, testingloops.Rnw would be

\documentclass{article}
\usepackage[margin=1.15 in]{geometry}
\begin{document}

<<loaddata, echo=FALSE, message=FALSE>>=
subgroup <- df[ df$Hospital == hosp,]
@

<<setup, echo=FALSE >>=
opts_chunk$set(fig.path = paste("test", hosp , sep=""))
@

Some informative text about hospital \Sexpr{hosp}

<<plots, echo=FALSE >>=
for(ward in unique(subgroup$Ward)){
  subgroup2 <- subgroup[subgroup$Ward == ward,]
  #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
  savename <- paste(hosp, ward)
  plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
}
@

\end{document}

and the driver R file would be just

library(knitr)

##  make my data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)

## knitr loop: hosp and df are visible to the chunks in the .Rnw file
for (hosp in unique(df$Hospital)){
  knit2pdf("testingloops.Rnw", output=paste0('report_', hosp, '.tex'))
}
Brian Diggs answered Nov 07 '22 21:11

Great question! This works for me with the other bits you've supplied in your question. Note that I've replaced your hosp with just x. I've called your Rnw file test.rnw

# input data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)

# generate the tex files, one for each hospital in df
# (test.rnw is assumed to sit in the working directory)
library(knitr)
lapply(unique(df$Hospital), function(x)
       knit("test.rnw",
            output=paste('report_', x, '.tex', sep="")))

# generate PDFs from the tex files, one for each hospital in df
lapply(unique(df$Hospital), function(x)
       tools::texi2pdf(paste0("C:\\emacs\\", paste0('report_', x, '.tex')), 
                       clean = TRUE, quiet = TRUE))

I've replaced your loops with lapply and anonymous functions, which often seem to be considered more R-ish.
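The reason x is visible inside the Rnw file even though it only exists inside the anonymous function is that knit() has an envir argument that defaults to parent.frame(), so the chunks are evaluated where knit() was called. A rough base-R sketch of that lookup follows; render_stub is a hypothetical stand-in for knit() and not part of knitr.

```r
# render_stub is a hypothetical stand-in for knit(); it is not a knitr function.
# Like knit(), it evaluates code in the caller's frame by default.
render_stub <- function(envir = parent.frame()) {
  eval(quote(paste0("report_", x, ".tex")), envir = envir)
}

# x exists only inside the anonymous function, yet the stub can see it
outputs <- lapply(c("A", "B"), function(x) render_stub())
unlist(outputs)
```

The same mechanism is why a plain for loop at the top level works too: there the calling frame is the global environment, where the loop variable lives.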

Here you can see where I replaced the hosp with x in the rnw file:

\documentclass{article}
\usepackage[margin=1.15 in]{geometry}
\begin{document}

<<loaddata, echo=FALSE, message=FALSE>>=
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)
subgroup <- df[ df$Hospital == x,]
@

<<setup, echo=FALSE >>=
opts_chunk$set(fig.path = paste("test", x , sep=""))
@

Some informative text about hospital \Sexpr{x}

<<plots, echo=FALSE >>=
for(ward in unique(subgroup$Ward)){
  subgroup2 <- subgroup[subgroup$Ward == ward,]
  #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
  savename <- paste(x, ward)
  plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
}
@

\end{document}

The result is two tex files (report_A.tex, report_B.tex), four PDFs for the figures (A1, A2, B1, B2) and two PDFs for the reports (report_A.pdf, report_B.pdf), each with their figures in them. Is that what you were after?
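If you want to confirm that from R rather than by looking in the folder, a quick check along these lines could help. It is only a sketch: it assumes the reports were written to the current working directory, and the names expected and status are made up for the example.

```r
# Hospitals used in the example data
hospitals <- c("A", "B")

# File names the two lapply() calls are expected to leave behind
expected <- c(paste0("report_", hospitals, ".tex"),
              paste0("report_", hospitals, ".pdf"))

# TRUE/FALSE per file, relative to the current working directory
status <- data.frame(file = expected, exists = file.exists(expected))
print(status)
```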

Ben answered Nov 07 '22 23:11
