How to create a different report for each subset of a data frame with R markdown?

Tags:

r

r-markdown

I have a dataset that looks like

City       Score  Count  Returns
Dallas     2.9    61     21
Phoenix    2.6    52     14
Milwaukee  1.7    38     7
Chicago    1.2    95     16
Phoenix    5.9    96     16
Dallas     1.9    45     12
Dallas     2.7    75     45
Chicago    2.2    75     10
Milwaukee  2.6    12     2
Milwaukee  4.5    32     0
Dallas     1.9    65     12
Chicago    4.9    95     13
Chicago    5      45     5
Phoenix    5.2    43     5

I would like to build a report using R markdown; however, for each city I need to build a report. The reason for this is that one city cannot see the report for another city. How do I build a report and save a PDF of it for each city?

Each report would need the median Score, mean Count, and mean Returns. I know that using dplyr I could simply use

library(dplyr)

finaldat <- dat %>%
    group_by(City) %>%
    summarise(Score  = median(Score),
              Count  = mean(Count),
              Return = mean(Returns))

But the frustration comes from producing a report for each City. Also, this is a subset of the data, not the full data. That is, this report is extensive and is a report of the results, which is systematic, not different for each City.

asked Jul 25 '16 15:07

akash87

1 Answer

It looks like a parameterized report is what you need. See the R Markdown documentation on parameterized reports for details, but the basic idea is that you declare a parameter in the YAML header of your R Markdown report and use that parameter within the report to customize it (in your case, by filtering the data by City). Then, in a separate R script, you render the report multiple times, once for each value of City, passing that value as a parameter to the render function. Here's a basic example:

In your R Markdown report, declare the parameter in the YAML header. The listed value, Dallas in this case, is only the default, used when no other value is supplied at render time:

---
title: My Document
output: pdf_document
params:
   My_City: Dallas
---

Then, in the same R Markdown document, you would have your entire report: whatever calculations depend on City, plus the boilerplate that's the same for every City. You access the parameter with params$My_City. The code below filters the data frame to the current value of the My_City parameter:

```{r}
library(dplyr)

# Keep only the rows for the city passed in as a parameter, then summarise
dat %>%
    filter(City == params$My_City) %>%
    summarise(Score  = median(Score),
              Count  = mean(Count),
              Return = mean(Returns))
```
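To sanity-check what that chunk should print, here is a minimal sketch that rebuilds the sample rows from the question as a data frame and computes the same three statistics for a single city in base R (so it runs without dplyr). The city_summary() helper is a hypothetical name used only for illustration; it is not part of the report itself.

```r
# Sample rows from the question, rebuilt as a data frame for illustration.
dat <- data.frame(
  City = c("Dallas", "Phoenix", "Milwaukee", "Chicago", "Phoenix",
           "Dallas", "Dallas", "Chicago", "Milwaukee", "Milwaukee",
           "Dallas", "Chicago", "Chicago", "Phoenix"),
  Score   = c(2.9, 2.6, 1.7, 1.2, 5.9, 1.9, 2.7, 2.2, 2.6, 4.5, 1.9, 4.9, 5, 5.2),
  Count   = c(61, 52, 38, 95, 96, 45, 75, 75, 12, 32, 65, 95, 45, 43),
  Returns = c(21, 14, 7, 16, 16, 12, 45, 10, 2, 0, 12, 13, 5, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the same statistics as the dplyr chunk, for one city.
city_summary <- function(data, city) {
  rows <- data[data$City == city, ]
  data.frame(Score  = median(rows$Score),
             Count  = mean(rows$Count),
             Return = mean(rows$Returns))
}

city_summary(dat, "Dallas")
# Score = 2.3, Count = 61.5, Return = 22.5
```

If the rendered Dallas report shows different numbers, the filter on params$My_City is probably not picking up the value you passed in.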

Then, in a separate R script, you would do something like the following to produce a separate report for each City (I've assumed the R Markdown file above is called MyReport.Rmd):

for (i in unique(dat$City)) {
    rmarkdown::render("MyReport.Rmd", 
                      params = list(My_City = i),
                      output_file=paste0(i, ".pdf"))
}

The code above assumes that the dat data frame exists in the global environment of the R script that calls render. By default, render evaluates the report in the environment it is called from, so the report can see dat. You could also supply a vector of city names directly instead of taking them from unique(dat$City).
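A slightly more defensive variant of the loop above might look like the sketch below. The helper names report_file_name() and render_city_reports() and the "reports" output folder are assumptions made for illustration. The sketch also assumes that MyReport.Rmd exists and that rmarkdown, pandoc, and a LaTeX installation are available; output_dir, envir, and quiet are existing arguments of rmarkdown::render.

```r
# Hypothetical helper: turn a city name into a file name without spaces
# or other characters that can be awkward in file paths.
report_file_name <- function(city) {
  paste0(gsub("[^A-Za-z0-9]+", "_", city), ".pdf")
}

# Hypothetical wrapper around the render loop. Nothing is rendered until
# this function is called.
render_city_reports <- function(cities, rmd = "MyReport.Rmd", out_dir = "reports") {
  for (city in cities) {
    rmarkdown::render(
      rmd,
      params      = list(My_City = city),
      output_file = report_file_name(city),
      output_dir  = out_dir,
      # Fresh environment per report; objects in the global environment
      # (such as dat) are still visible through the parent.
      envir       = new.env(parent = globalenv()),
      quiet       = TRUE
    )
  }
  # Return the paths of the files that were requested, invisibly.
  invisible(file.path(out_dir, vapply(cities, report_file_name, character(1))))
}

report_file_name("Los Angeles")
# "Los_Angeles.pdf"
```

With dat loaded, the call would be render_city_reports(unique(dat$City)). Writing all PDFs to one folder and rendering each report in its own environment keeps one city's intermediate objects from leaking into the next city's report.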

To use a dynamic title (see question in comments):

You can use parameters in the title (and other YAML metadata, such as author). For example:

Rmd file (saved here as test1.Rmd):

---
title: "Data for `r params$city`"
output: pdf_document
params:
  city: Dallas
---

Body of report

Separate R script that renders the Rmd file for two cities:

for (i in c("New York", "Los Angeles")) {
  rmarkdown::render("test1.Rmd", 
                    params = list(city = i),
                    output_file=paste0(i, ".pdf"))
}
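If the report should not depend on objects that happen to exist in the rendering script, the template can load its own data. Below is a rough sketch that writes such a template from R, combining the dynamic title with the City filter. The data file name city_data.csv is a hypothetical placeholder, and the chunk inside the template assumes dplyr is installed.

```r
# Lines of a minimal parameterized report.
# "city_data.csv" is a hypothetical file name; replace it with your own data source.
rmd_lines <- c(
  "---",
  "title: \"Data for `r params$city`\"",
  "output: pdf_document",
  "params:",
  "  city: Dallas",
  "---",
  "",
  "```{r, message=FALSE}",
  "library(dplyr)",
  "dat <- read.csv(\"city_data.csv\")  # hypothetical data file",
  "dat %>%",
  "  filter(City == params$city) %>%",
  "  summarise(Score = median(Score), Count = mean(Count), Return = mean(Returns))",
  "```"
)

# Write the template to a temporary folder so the sketch has no side effects
# on the working directory.
rmd_path <- file.path(tempdir(), "test1.Rmd")
writeLines(rmd_lines, rmd_path)
```

Passing rmd_path as the first argument of rmarkdown::render, with params = list(city = i), would then work the same way as the loop above.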

See the R Markdown Cookbook for additional info.

answered Sep 25 '22 18:09

eipi10
