I have a dataset that looks like this:

    City       Score  Count  Returns
    Dallas       2.9     61       21
    Phoenix      2.6     52       14
    Milwaukee    1.7     38        7
    Chicago      1.2     95       16
    Phoenix      5.9     96       16
    Dallas       1.9     45       12
    Dallas       2.7     75       45
    Chicago      2.2     75       10
    Milwaukee    2.6     12        2
    Milwaukee    4.5     32        0
    Dallas       1.9     65       12
    Chicago      4.9     95       13
    Chicago      5       45        5
    Phoenix      5.2     43        5
I would like to build a report using R Markdown, but I need a separate report for each city, because one city must not be able to see the report for another city. How do I build the report and save a PDF of it for each city?
Each report would need the median Score, mean Count, and mean Returns. I know that using dplyr I could simply use:
    finaldat <- dat %>%
      group_by(City) %>%
      summarise(Score = median(Score),
                Count = mean(Count),
                Return = mean(Returns))
But the frustration comes from producing a report for each City. Also, this is a subset of the data, not the full data. The real report is an extensive write-up of the results, and its structure is systematic: it is the same for every City.
In RStudio, open the “File” menu, click “New File”, then “R Markdown”. Name your file and choose the default output format; you can always change the output format later. Once you click “OK”, you can start writing code and text in the R Markdown document.
Reports can be compiled to several output formats, including HTML, PDF, MS Word, and Markdown. With rmarkdown::render, the output format you request decides the result: html_document creates an HTML document, whereas pdf_document creates a PDF document. If you are using RStudio, you can also create a report using the Compile Report command (Ctrl+Shift+K).
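As a rough sketch of the two render calls described above: assuming a working pandoc installation and, for the PDF step, a LaTeX distribution, the format can be passed through the output_format argument of rmarkdown::render. The helper name render_html_and_pdf is hypothetical and only used for illustration.

```r
# Hypothetical helper: compile one R Markdown file to both HTML and PDF.
# Assumes the rmarkdown package, pandoc, and (for PDF) a LaTeX distribution.
render_html_and_pdf <- function(input) {
  html <- rmarkdown::render(input, output_format = "html_document")
  pdf  <- rmarkdown::render(input, output_format = "pdf_document")
  # render() returns the path of each output file
  c(html = html, pdf = pdf)
}
```

It could be called as render_html_and_pdf("MyReport.Rmd"), where MyReport.Rmd is a placeholder file name.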
One of the most popular ways to produce reproducible reports is R Markdown, which can be used to combine narrative and code to write academic papers, professional-quality reports, lab notebooks, web pages, blogs, presentations, and more.
It looks like a parameterized report might be what you need. See the link for details, but the basic idea is that you set a parameter in the yaml of your rmarkdown report and use that parameter within the report to customize it (for example, by filtering the data by City in your case). Then in a separate R script, you render the report multiple times, once for each value of City, which you pass as a parameter to the render function. Here's a basic example:
In your Rmarkdown report you would declare the parameter in the yaml. The listed value, Dallas in this case, is just the default value if no other value is input when you render the report:
    ---
    title: My Document
    output: pdf_document
    params:
      My_City: Dallas
    ---
Then, in the same Rmarkdown document you would have your entire report: whatever calculations depend on City, plus the boilerplate that's the same for any City. You access the parameter with params$My_City. The code below will filter the data frame to the current value of the My_City parameter:
```{r}
dat %>%
  filter(City == params$My_City) %>%
  summarise(Score = median(Score),
            Count = mean(Count),
            Return = mean(Returns))
```
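To check the filtering step outside of a rendered report, here is a small self-contained sketch. It rebuilds the sample data from the question and uses an ordinary list named params as a stand-in for the object that rmarkdown provides during rendering; that stand-in is an assumption made only for illustration.

```r
library(dplyr)

# Sample data from the question
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
City Score Count Returns
Dallas 2.9 61 21
Phoenix 2.6 52 14
Milwaukee 1.7 38 7
Chicago 1.2 95 16
Phoenix 5.9 96 16
Dallas 1.9 45 12
Dallas 2.7 75 45
Chicago 2.2 75 10
Milwaukee 2.6 12 2
Milwaukee 4.5 32 0
Dallas 1.9 65 12
Chicago 4.9 95 13
Chicago 5 45 5
Phoenix 5.2 43 5
")

# Stand-in for the params object that rmarkdown provides while rendering
params <- list(My_City = "Dallas")

# Same filter-and-summarise step as in the report chunk
city_summary <- dat %>%
  filter(City == params$My_City) %>%
  summarise(Score  = median(Score),
            Count  = mean(Count),
            Return = mean(Returns))

print(city_summary)
```

For Dallas this gives a median Score of 2.3, a mean Count of 61.5, and a mean Returns of 22.5.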
Then, in a separate R script, you would do something like the following to produce a separate report for each City (where I've assumed the Rmarkdown file above is called MyReport.Rmd):
    for (i in unique(dat$City)) {
      rmarkdown::render("MyReport.Rmd",
                        params = list(My_City = i),
                        output_file = paste0(i, ".pdf"))
    }
In the code above, I've assumed the dat data frame is in the global environment of this separate R script that renders MyReport.Rmd. However, you could also just provide a vector of city names instead of getting the names from unique(dat$City).
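If the city names come from a plain vector, or the PDFs should land in their own folder, the loop could be wrapped in a small function. The following is only a sketch: render_city_reports, its argument names, and the reports folder are hypothetical, while output_dir and envir are existing arguments of rmarkdown::render. Rendering each report in a fresh environment whose parent is the global environment keeps objects created by one city's report from leaking into the next, while dat in the global environment is still found.

```r
# Hypothetical wrapper around the render loop; all names are illustrative.
render_city_reports <- function(cities,
                                rmd = "MyReport.Rmd",
                                out_dir = "reports") {
  # Create the output folder if it does not exist yet
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  for (city in cities) {
    rmarkdown::render(
      rmd,
      params      = list(My_City = city),
      output_file = paste0(city, ".pdf"),
      output_dir  = out_dir,
      # Fresh environment per report; globals such as dat remain visible
      envir       = new.env(parent = globalenv())
    )
  }
  invisible(out_dir)
}
```

It could then be called with an explicit vector, for example render_city_reports(c("Dallas", "Phoenix")), or with unique(dat$City) as before.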
You can use parameters in the title (and other YAML metadata, such as author). For example:
Rmd file:

    ---
    title: "Data for `r params$city`"
    output: pdf_document
    params:
      city: Dallas
    ---

    Body of report
Separate R script to render the rmd file
Compile the rmd file for two cities:
    for (i in c("New York", "Los Angeles")) {
      rmarkdown::render("test1.Rmd",
                        params = list(city = i),
                        output_file = paste0(i, ".pdf"))
    }
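City names such as "New York" contain a space, which ends up in the PDF file name. If that is unwanted, one option is to clean the name before passing it to output_file. The helper below is a hypothetical sketch (safe_pdf_name is not part of rmarkdown) that replaces each run of characters other than letters and digits with an underscore:

```r
# Hypothetical helper: turn a city name into a PDF file name without spaces
safe_pdf_name <- function(city) {
  paste0(gsub("[^A-Za-z0-9]+", "_", city), ".pdf")
}

safe_pdf_name(c("New York", "Los Angeles"))
# returns "New_York.pdf" "Los_Angeles.pdf"
```

In the loop, output_file = safe_pdf_name(i) would then take the place of paste0(i, ".pdf").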
See the R Markdown Cookbook for additional info.