library(dplyr) #Devel version, soon-to-be-released 0.6.0
library(tidyr)
library(ggplot2)
library(forcats) #for gss_cat data
I'm attempting to write a function that combines quosures from the soon-to-be-released dplyr devel version with tidyr::gather and ggplot2. So far it seems to work with tidyr, but I'm having trouble with the plotting.
The function below seems to work with tidyr's gather:
GatherFun<-function(gath){
gath<-enquo(gath)
gss_cat%>%select(relig,marital,race,partyid)%>%
gather(key,value,-!!gath)%>%
count(!!gath,key,value)%>%
mutate(perc=n/sum(n))
}
But I can't figure out how to make the plots work. I tried using !!gath with ggplot2, but it didn't work.
GatherFun<-function(gath){
gath<-enquo(gath)
gss_cat%>%select(relig,marital,race,partyid)%>%
gather(key,value,-!!gath)%>%
count(!!gath,key,value)%>%
mutate(perc=n/sum(n))%>%
ggplot(aes(x=value,y=perc,fill=!!gath))+
geom_col()+
facet_wrap(~key, scales = "free") +
geom_text(aes(x = "value", y = "perc",
label = "perc", group = !!gath),
position = position_stack(vjust = .05))
}
tidyr is the Tidyverse package for getting data frames into tidy form. Recall that in a tidy data frame, each row is a unit of observation and each column is a single piece of information.
ensym() and ensyms() are variants of enexpr() and enexprs() that check the captured expression is either a string (which they convert to symbol) or a symbol. If anything else is supplied they throw an error.
To make this work I had to use dplyr::quo_name to turn the quosure into a string. I also had to use ggplot2::aes_string, which requires all of its inputs to be strings, so the column names have to be quoted with "".
GatherFun <- function(gath){
gath <- enquo(gath)
gathN <- quo_name(gath)
gss_cat %>%
select(relig, marital, race, partyid) %>%
gather(key, value, -!!gath) %>%
count(!!gath, key, value) %>%
mutate(perc = round(n/sum(n), 2)) %>%
ggplot() +
geom_col(aes_string(x = "value", y = "perc", fill = gathN)) +
facet_wrap(~key, scales = "free") +
geom_text(aes_string(x = "value", y = "perc", label = "perc", group = gathN),
position = position_stack(vjust = .05))
}
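For anyone reading this on a later release: ggplot2 3.0.0 and newer accept !! directly inside aes(), so the string round-trip may no longer be needed. Below is a minimal sketch on mtcars rather than gss_cat, assuming ggplot2 >= 3.0.0 and rlang are installed. The names plot_by and fill_var are hypothetical and only there for illustration.

```r
library(ggplot2)

# Hypothetical helper (not from the original post): `fill_var` is a bare column name.
# Assumes ggplot2 >= 3.0.0, where aes() understands the !! unquote operator.
plot_by <- function(fill_var) {
  fill_var <- rlang::enquo(fill_var)  # capture the bare name as a quosure
  ggplot(mtcars, aes(x = factor(gear), y = mpg, fill = factor(!!fill_var))) +
    geom_col()
}

p <- plot_by(cyl)
```

Printing p draws the stacked columns. On the older ggplot2 discussed in this thread the same call is expected to fail, which is why the aes_string version above was needed.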
I feel like the main problem is that ggplot is greedy when it tries to evaluate !!gath and does !(!gath), throwing an error because not(gath) has no meaning. I've had this issue crop up a lot when I've tried to use !!, so I'm kinda wary about using it in its sugar form.
If someone more precise could correctly identify the problem it would definitely be helpful.
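To make the !(!gath) point concrete, here is a small base R sketch (no packages needed) that inspects how the parser represents !!gath. It only shows the parse tree and plain evaluation. It does not show how rlang or dplyr treat the operator, since those functions recognise the pattern themselves before evaluating.

```r
# Base R only: look at the call structure the parser builds for `!!gath`.
expr <- quote(!!gath)

outer_fn  <- as.character(expr[[1]])       # function of the outer call
inner_fn  <- as.character(expr[[2]][[1]])  # function of the inner call
inner_arg <- as.character(expr[[2]][[2]])  # the symbol being negated

print(c(outer_fn, inner_fn, inner_arg))

# A function with no tidy-eval support just evaluates this as double negation.
gath <- TRUE
double_negation <- eval(expr)
print(double_negation)
```

So to the parser there is no special !! operator, only two nested calls to !. A function has to look for that pattern explicitly to treat it as unquoting.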
gather_func = function(gath) {
gath = enquo(gath)
gss_cat %>%
select(relig, marital, race, partyid) %>%
gather(key, value, -!!gath) %>%
count(!!gath, key, value) %>%
mutate(perc = round(n/sum(n), 2)) %>%
ggplot(aes(x = value, y = perc, fill = eval(rlang::`!!`(gath)))) +
geom_col() +
facet_wrap(~key, scales = "free") +
geom_text(
aes(
x = value,
y = perc,
label = perc,
group = eval(rlang::`!!`(gath))
),
position = position_stack(vjust = .05)
)
}
There seem to be a few mistakes in the function call you wrote in the question. Properly spacing your code will help avoid that.
You also don't have to use the rlang call; I just don't have the newest dplyr version installed.
EDIT: Some thoughts using a simpler mtcars example:
Tbh I'm quite unsure of what's going on here, but I imagine it has to do with the fact that ggplot2 is relatively old now and has a slightly different design. Stepping into aes with debug, we find a structure similar to
structure(list(x = mpg, y = eval(rlang::UQE(var))), .Names = c("x",
"y"), class = "uneval")
(This won't run through the interpreter, but it is roughly what the structure looks like.) I think this shows why the eval call is necessary here; otherwise ggplot tries to map rlang::UQE(var) to the y aesthetic and reports that it doesn't know what to do with something of class name. eval evaluates the name to, say, cyl, and then the aesthetic can be mapped as normal.
I imagine dplyr verbs don't have this extra mapping step where the arguments are manipulated into some intermediate structure in the same way, so we don't have this issue.
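The intermediate structure described above can be imitated in base R. This is only a rough sketch: capture_mapping is a hypothetical stand-in, not ggplot2's actual aes, and it assumes nothing beyond the built-in mtcars data. It stores each argument unevaluated and then evaluates one of them against the data frame.

```r
# Hypothetical stand-in for an aes()-like function: keeps every argument unevaluated.
capture_mapping <- function(...) {
  as.list(match.call()[-1])  # drop the function name, keep the argument expressions
}

m <- capture_mapping(x = mpg, y = cyl, z = identity(cyl))

# Bare column names are stored as names (symbols); anything wrapped in a
# function is stored as a call that still has to be evaluated later.
kinds <- vapply(m, function(e) class(e)[1], character(1))
print(kinds)

# Evaluating a stored symbol against the data recovers the column.
y_values <- eval(m$y, mtcars)
print(head(y_values))
```

A mapping that holds a call such as rlang::UQE(var) is therefore just an unevaluated expression until something evaluates it, which would fit with the explicit eval making the earlier example work.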
Also, when I said you don't have to use the rlang call, it was because I assumed this function was re-exported by the new dplyr version. Because of the whole !!(...) or !(!(...)) thing I mentioned earlier, I prefer to use rlang::"!!" or rlang::UQE (which are exactly equivalent, I believe).
Most of this is speculation though and if someone could correct me on anything I've got wrong it would be appreciated.