Sometimes, when performing exploratory analysis or producing reports, we want to plot univariate distributions for many variables. I could do this by reshaping the data to long format and faceting the plot, but some of the variables are ordered factors and I want to keep their order in the plots.
So, to do it more efficiently, I built a simple dplyr/ggplot2-based function. The example below uses the Arthritis dataset from the vcd package.
library(dplyr)
library(ggplot2)
data(Arthritis, package = "vcd")
head(Arthritis)
plotUniCat <- function(df, x) {
  x <- enquo(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y = prop, x = !!x)) +
    geom_bar(stat = "identity")
}
plotUniCat(Arthritis, Improved)
This lets me plot a formatted graph very concisely, which is cool, but only for one variable at a time.
I tried to plot more than one variable with a for loop, but it's not working. The code runs, but nothing happens.
variables <- c("Improved", "Sex", "Treatment")
for (i in variables) {
  plotUniCat(Arthritis, noquote(i))
}
I searched for this, but it's still not clear to me. Does anyone know what I am doing wrong, or how to make it work?
Thanks in advance.
You need to use rlang::sym instead of enquo, so that the strings are converted to symbols. I also replaced the for loop with purrr::map to iterate over the variables:
library(tidyverse)
data(Arthritis, package = "vcd")
head(Arthritis)
#> ID Treatment Sex Age Improved
#> 1 57 Treated Male 27 Some
#> 2 46 Treated Male 29 None
#> 3 77 Treated Male 30 None
#> 4 17 Treated Male 32 Marked
#> 5 36 Treated Male 46 Marked
#> 6 23 Treated Male 58 Marked
plotUniCat2 <- function(df, x) {
  x <- rlang::sym(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y = prop, x = !!x)) +
    geom_bar(stat = "identity")
}
variables <- c("Improved", "Sex", "Treatment")
variables %>% purrr::map(., ~ plotUniCat2(Arthritis, .x))
#> [[1]]

#>
#> [[2]]

#>
#> [[3]]

Created on 2018-06-13 by the reprex package (v0.2.0).
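To see why the original loop does not do what was intended, it can help to look at what enquo actually captures. The snippet below is only a minimal sketch: show_capture is a hypothetical helper introduced here for illustration (it is not part of the question's code), and it assumes a version of rlang that provides as_label.

```r
library(rlang)

# Hypothetical helper: returns the expression that enquo() captured, as text.
show_capture <- function(x) {
  x <- enquo(x)
  as_label(x)
}

i <- "Improved"

show_capture(Improved)      # captures the bare column name: "Improved"
show_capture(noquote(i))    # captures the call itself: "noquote(i)"
show_capture(!!sym(i))      # unquoting a symbol gives the column name: "Improved"
```

In other words, noquote(i) is never evaluated into a column name; enquo just records the expression noquote(i), which is why converting the string with sym is needed.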
Change the enquo in the function to sym, to convert the variable string to a symbol. That is,
plotUniCat <- function(df, x) {
  x <- sym(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y = prop, x = !!x)) +
    geom_bar(stat = "identity")
}
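or, more concisely (letting ggplot2 count the rows itself, so the bars show counts rather than proportions),
plotUniCat <- function(df, x) {
  x <- sym(x)
  df %>%
    filter(!is.na(!!x)) %>%
    ggplot(aes(x = as.factor(!!x))) +
    # geom_bar() uses stat = "count" by default
    geom_bar()
}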
and then
out <- lapply(variables, function(i) plotUniCat(Arthritis, i))
Finally, use grid.arrange to display the plots. E.g.
library(gridExtra)
do.call(grid.arrange, c(out, ncol = 2))

I guess the OP would like to use plotUniCat with both quoted and unquoted variable names. If we change the function to use sym, it will no longer work for plotUniCat(Arthritis, Improved).
Therefore, instead of changing the function, we can change the way we call plotUniCat:
for (i in variables) {
  plotUniCat(Arthritis, !!rlang::sym(i))
}
However, inside a for loop the plots are created but neither printed nor returned, so nothing is displayed. We can wrap the call in print to force the display, or use lapply to collect the generated plots:
lapply(variables, function(i) plotUniCat(Arthritis, !!rlang::sym(i)))
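As a minimal sketch of the print option mentioned above, the loop can be written as follows. It reuses the question's original enquo-based plotUniCat unchanged and assumes the vcd package is installed for the Arthritis data.

```r
library(dplyr)
library(ggplot2)
data(Arthritis, package = "vcd")

# The question's original function, unchanged (captures x with enquo).
plotUniCat <- function(df, x) {
  x <- enquo(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y = prop, x = !!x)) +
    geom_bar(stat = "identity")
}

variables <- c("Improved", "Sex", "Treatment")

# ggplot objects are not auto-printed inside a for loop, so print() is explicit.
for (i in variables) {
  print(plotUniCat(Arthritis, !!rlang::sym(i)))
}
```

Because the function itself is untouched, plotUniCat(Arthritis, Improved) keeps working with a bare column name as well.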