How to use dplyr's enquo and quo_name in a function with tidyr and ggplot2

library(dplyr) #Devel version, soon-to-be-released 0.6.0
library(tidyr)
library(ggplot2)
library(forcats) #for gss_cat data

I'm attempting to write a function that combines quosures from the soon-to-be-released dplyr devel version together with tidyr::gather and ggplot2. So far it seems to work with tidyr, but I'm having trouble with the plotting.

The function below seems to work with tidyr's gather:

GatherFun<-function(gath){
  gath<-enquo(gath)

  gss_cat%>%select(relig,marital,race,partyid)%>%
    gather(key,value,-!!gath)%>%
    count(!!gath,key,value)%>%
    mutate(perc=n/sum(n))
}

But I can't figure out how to make the plot work. I tried using !!gath inside ggplot2's aes(), but it didn't work.

GatherFun<-function(gath){
  gath<-enquo(gath)

  gss_cat%>%select(relig,marital,race,partyid)%>%
    gather(key,value,-!!gath)%>%
    count(!!gath,key,value)%>%
    mutate(perc=n/sum(n))%>%
    ggplot(aes(x=value,y=perc,fill=!!gath))+
       geom_col()+
       facet_wrap(~key, scales = "free") +
       geom_text(aes(x = "value", y = "perc", 
                     label = "perc", group = !!gath),
                 position = position_stack(vjust = .05))
}

Mike asked Apr 14 '17 05:04




2 Answers

To make this work I had to use dplyr::quo_name to convert the quosure into a string. I also had to use ggplot2::aes_string, which requires every input to be a string, so the fixed column names are quoted with "".

GatherFun <- function(gath){
  gath <- enquo(gath)
  gathN <- quo_name(gath)

  gss_cat %>% 
    select(relig, marital, race, partyid) %>%
    gather(key, value, -!!gath) %>%
    count(!!gath, key, value) %>%
    mutate(perc = round(n/sum(n), 2)) %>%
    ggplot() +
    geom_col(aes_string(x = "value", y = "perc", fill = gathN)) +
    facet_wrap(~key, scales = "free") +
    geom_text(aes_string(x = "value", y = "perc", label = "perc", group = gathN), 
              position = position_stack(vjust = .05))
}
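
For readers on ggplot2 3.0.0 or later (an assumption; the answer above predates that release), aes() understands !! itself, so the round trip through quo_name and aes_string should no longer be needed. A minimal sketch under that assumption, where gather_plot is a hypothetical name used only for illustration:

```r
library(dplyr)
library(tidyr)
library(ggplot2)  # assumes ggplot2 >= 3.0.0, where aes() supports !!
library(forcats)  # for gss_cat data

# gather_plot is a hypothetical name, not from the original answer
gather_plot <- function(gath) {
  gath <- enquo(gath)  # capture the bare column name as a quosure

  gss_cat %>%
    select(relig, marital, race, partyid) %>%
    gather(key, value, -!!gath) %>%
    count(!!gath, key, value) %>%
    mutate(perc = round(n / sum(n), 2)) %>%
    # unquote the quosure directly inside aes()
    ggplot(aes(x = value, y = perc, fill = !!gath)) +
    geom_col() +
    facet_wrap(~key, scales = "free") +
    # x and y are inherited from the plot-level mapping
    geom_text(aes(label = perc, group = !!gath),
              position = position_stack(vjust = .05))
}

p <- gather_plot(marital)
```

Printing p draws the plot. As in the original, gather() is expected to warn that attributes differ across the gathered factor columns.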

Mike answered Oct 17 '22 11:10


I think the main problem is that ggplot is greedy when it tries to evaluate !!gath and treats it as !(!gath), throwing an error because not(gath) has no meaning. I've had this issue crop up a lot when I've tried to use !!, so I'm kinda wary of using it in its sugar form.

If someone more precise could correctly identify the problem it would definitely be helpful.

gather_func = function(gath) {

  gath = enquo(gath)

  gss_cat %>%
    select(relig, marital, race, partyid) %>%
    gather(key, value, -!!gath) %>%
    count(!!gath, key, value) %>%
    mutate(perc = round(n/sum(n), 2)) %>%
    ggplot(aes(x = value, y = perc, fill = eval(rlang::`!!`(gath)))) +
    geom_col() + 
    facet_wrap(~key, scales = "free") +
    geom_text(
      aes(
        x = value, 
        y = perc, 
        label = perc, 
        group = eval(rlang::`!!`(gath))
      ),
      position = position_stack(vjust = .05)
    )
}

There seem to be a few mistakes in the function call you wrote in the question. Properly spacing your code will help avoid that.

You also don't have to use the rlang call; I just don't have the newest dplyr version installed.

EDIT Some thoughts using a simpler mtcars example:

Tbh I'm quite unsure of what's going on here, but I imagine it has to do with the fact that ggplot2 is relatively old now and has a slightly different design. Stepping into aes with debug, we find a structure similar to

structure(list(x = mpg, y = eval(rlang::UQE(var))), .Names = c("x", 
"y"), class = "uneval")

(This won't run through the interpreter, but it is roughly what the structure looks like.) I think this shows why the eval call is necessary here: otherwise ggplot tries to map rlang::UQE(var) to the y aesthetic and reports that it doesn't know what to do with something of class name. eval evaluates the name to, say, cyl, and then the aesthetic can be mapped as normal.

I imagine dplyr verbs don't have this extra mapping step where the arguments are manipulated into some intermediate structure in the same way, so we don't have this issue.
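
To make the "intermediate structure" idea concrete, here is a small mtcars sketch that inspects what aes() stores. It assumes ggplot2 3.0.0 or later (newer than the version discussed in this answer), where each aesthetic is held as an unevaluated quosure; map_y is a hypothetical helper name.

```r
library(ggplot2)  # assumes ggplot2 >= 3.0.0
library(rlang)

# map_y is a hypothetical helper that only builds the mapping
map_y <- function(var) {
  var <- enquo(var)          # capture the bare column name
  aes(x = mpg, y = !!var)    # aes() keeps it unevaluated
}

m <- map_y(cyl)

is_quosure(m$y)  # is the y aesthetic stored as a quosure?
as_label(m$y)    # text form of the captured expression

# The stored mapping is only evaluated against the data when the plot is built
p <- ggplot(mtcars, m) + geom_point()
```

Under that assumption, the mapping holds the expression cyl rather than the column's values, which is the step the dplyr verbs do not need.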

Also, when I said you don't have to use the rlang call, it was because I assumed this function was re-exported by the new dplyr version. Because of the !!(...) versus !(!(...)) issue I mentioned earlier, I prefer to use rlang::"!!" or rlang::UQE (which I believe are exactly equivalent).

Most of this is speculation, though, and if someone could correct anything I've got wrong, it would be appreciated.

Akhil Nair answered Oct 17 '22 12:10