Question: why can't I call sapply inside aes()?
Goal of the following figure: create a histogram showing the proportion that died/lived, so that the proportions for each combination of group/type sum to 1 (example inspired by a previous post).
I know you could make the figure by summarising outside of ggplot, but the question is really about why the function isn't working inside of aes.
## Data
library(ggplot2)
set.seed(999)
dat <- data.frame(group=factor(rep(1:2, 25)),
                  type=factor(sample(1:2, 50, replace=TRUE)),
                  died=factor(sample(0:1, 50, replace=TRUE)))
## Setup the figure
p <- ggplot(dat, aes(x=died, group=interaction(group, type), fill=group, alpha=type)) +
theme_bw() +
scale_alpha_discrete(range=c(0.5, 1)) +
ylab("Proportion")
## Proportions, all groups/types together sum to 1 (not wanted)
p + geom_histogram(aes(y=..count../sum(..count..)), position=position_dodge())
## Look at groups
stuff <- ggplot_build(p)
stuff$data[[1]]
## The long way works: proportions by group/type
p + geom_histogram(
aes(y=c(..count..[..group..==1] / sum(..count..[..group..==1]),
..count..[..group..==2] / sum(..count..[..group..==2]),
..count..[..group..==3] / sum(..count..[..group..==3]),
..count..[..group..==4] / sum(..count..[..group..==4]))),
position='dodge'
)
## Why can't I call sapply there?
p + geom_histogram(
aes(y=sapply(unique(..group..), function(g)
..count..[..group..==g] / sum(..count..[..group..==g]))),
position='dodge'
)
Error in get(as.character(FUN), mode = "function", envir = envir) : object 'expr' of mode 'function' was not found
So, the issue arises because of a recursive call to ggplot2:::strip_dots for any aesthetics that include 'calculated aesthetics'. There is some discussion around the calculated aesthetics in this SO question and answer. The relevant code in layer.r is here:
new <- strip_dots(aesthetics[is_calculated_aes(aesthetics)])
i.e. strip_dots is called only if there are calculated aesthetics, defined using the regex "\\.\\.([a-zA-z._]+)\\.\\.".
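To see what that pattern picks up, it can be tried on plain strings in base R. This is only a small sketch outside ggplot2: the pattern is copied from the text above, while the variable names and the two sample strings are made up for illustration.

```r
# Pattern quoted above for detecting calculated aesthetics
match_calculated_aes <- "\\.\\.([a-zA-z._]+)\\.\\."

# Illustrative inputs: one calculated aesthetic and one ordinary one, as strings
calculated <- "..count../sum(..count..)"
ordinary   <- "died"

# Which of the two strings contain a ..name.. token?
has_dots <- grepl(match_calculated_aes, c(calculated, ordinary))

# Replace every ..name.. token with the bare name
without_dots <- gsub(match_calculated_aes, "\\1", calculated)

print(has_dots)
print(without_dots)
```

Only the first string is flagged as calculated, and stripping the dots leaves count/sum(count), which is the form that gets evaluated against the computed statistics.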
strip_dots takes a recursive approach, working down through the nested calls and stripping out the dots. The code is like this:
function (expr)
{
if (is.atomic(expr)) {
expr
}
else if (is.name(expr)) {
as.name(gsub(match_calculated_aes, "\\1", as.character(expr)))
}
else if (is.call(expr)) {
expr[-1] <- lapply(expr[-1], strip_dots)
expr
}
else if (is.pairlist(expr)) {
as.pairlist(lapply(expr, expr))
}
else if (is.list(expr)) {
lapply(expr, strip_dots)
}
else {
stop("Unknown input:", class(expr)[1])
}
}
If we supply an anonymous function to this code as follows:
anon <- as.call(quote(function(g) mean(g)))
ggplot2:::strip_dots(anon)
we reproduce the error:
#Error in get(as.character(FUN), mode = "function", envir = envir) :
# object 'expr' of mode 'function' was not found
Working through this, we can see that anon is a call. For calls, strip_dots will use lapply to call strip_dots on the second and third elements of the call. For an anonymous function like this, the second element is the formals of the function. If we look at the formals of anon using dput(formals(eval(anon))) or dput(anon[[2]]) we see this:
#pairlist(g = )
For pairlists, strip_dots tries to lapply the pairlist to itself. I'm not sure why this code is there, but certainly in this circumstance it leads to the error:
expr <- anon[[2]]
lapply(expr, expr)
# Error in get(as.character(FUN), mode = "function", envir = envir) :
# object 'expr' of mode 'function' was not found
TL;DR: At this stage, ggplot2 doesn't support the use of anonymous functions within aes where a calculated aesthetic (such as ..count..) is used.
Anyway, the desired end result can be achieved using dplyr; in general I think it makes for more readable code to separate out the data summarisation from the plotting:
library(dplyr)
newDat <- dat %>%
group_by(died, type, group) %>%
summarise(count = n()) %>%
group_by(type, group) %>%
mutate(Proportion = count / sum(count))
p <- ggplot(newDat, aes(x = died, y = Proportion, group = interaction(group, type), fill=group, alpha=type)) +
theme_bw() +
scale_alpha_discrete(range=c(0.5, 1)) +
geom_bar(stat = "identity", position = "dodge")
I've forked ggplot2 and have made two changes to aes_calculated.r which fix the problem. The first was to correct the handling of pairlists to lapply strip_dots instead of expr, which I think must have been the intended behaviour. The second was that for formals with no default value (like in the examples provided here), as.character(as.name(expr)) throws an error because expr is an empty name, and while this is a valid construct, it's not possible to create one from an empty string.
Forked version of ggplot2 at https://github.com/NikNakk/ggplot2 and pull request just made.
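For illustration, the two changes could look roughly like the function below. This is a sketch reconstructed from the description, not the code from the fork: the name strip_dots_fixed is hypothetical, and the pattern is copied from the regex quoted earlier.

```r
# Pattern quoted earlier for calculated aesthetics
match_calculated_aes <- "\\.\\.([a-zA-z._]+)\\.\\."

# Hypothetical patched variant of strip_dots (a sketch, not the fork's code)
strip_dots_fixed <- function(expr) {
  if (is.atomic(expr)) {
    expr
  } else if (is.name(expr)) {
    # Change 2: an empty name (a formal with no default) cannot be rebuilt
    # from "", so it is returned untouched
    if (identical(as.character(expr), "")) {
      return(expr)
    }
    as.name(gsub(match_calculated_aes, "\\1", as.character(expr)))
  } else if (is.call(expr)) {
    expr[-1] <- lapply(expr[-1], strip_dots_fixed)
    expr
  } else if (is.pairlist(expr)) {
    # Change 1: recurse into the formals instead of calling lapply(expr, expr)
    as.pairlist(lapply(expr, strip_dots_fixed))
  } else if (is.list(expr)) {
    lapply(expr, strip_dots_fixed)
  } else {
    stop("Unknown input: ", class(expr)[1])
  }
}

# A calculated aesthetic without an anonymous function
simple <- strip_dots_fixed(quote(..count.. / sum(..count..)))

# A calculated aesthetic that contains an anonymous function
with_fun <- strip_dots_fixed(
  quote(sapply(..group.., function(g) ..count..[..group.. == g]))
)
```

With the pairlist branch recursing into strip_dots_fixed, the second call no longer hits the get(as.character(FUN), ...) error shown above.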
Finally, after all that, the sapply example given doesn't work because it returns a 2 row by 4 column matrix rather than an 8 length vector. The corrected version is like this:
p + geom_histogram(
aes(y=unlist(lapply(unique(..group..), function(g)
..count..[..group..==g] / sum(..count..[..group..==g])))),
position='dodge'
)
This gives the same output as the dplyr solution above.
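The difference in shape between sapply and unlist(lapply(...)) is easy to check outside ggplot2. In this sketch, count and group are made-up stand-ins for ..count.. and ..group.. (four groups with two bars each), not values taken from the plot.

```r
# Made-up stand-ins for ..count.. and ..group..: 4 groups, 2 bars each
count <- c(10, 5, 7, 3, 6, 6, 8, 5)
group <- rep(1:4, each = 2)

# Proportion of each bar within its own group
prop <- function(g) count[group == g] / sum(count[group == g])

as_matrix <- sapply(unique(group), prop)          # simplified to a matrix
as_vector <- unlist(lapply(unique(group), prop))  # flat vector, one value per bar

print(dim(as_matrix))
print(length(as_vector))
```

sapply simplifies the four length-2 results into a 2 x 4 matrix, whereas unlist(lapply(...)) keeps one value per bar in a length-8 vector, which is the shape a y aesthetic needs.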
One other thing to note is that this lapply code assumes that the data at that stage is sorted by group. I think this is always the case, but if for whatever reason it weren't you would end up with the y data out of order. An alternative which preserves the order of the rows in the calculated data would be:
p + geom_histogram(
aes(y={grp_total <- tapply(..count.., ..group.., sum);
..count.. / grp_total[as.character(..group..)]
}),
position='dodge'
)
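The ordering point can be seen with a small base R sketch. Here count and group are made-up stand-ins for ..count.. and ..group.., with the rows deliberately not sorted by group.

```r
# Made-up stand-ins for ..count.. and ..group.., not sorted by group
count <- c(10, 7, 5, 3)
group <- c(1, 2, 1, 2)

# tapply approach: look up each row's group total, so row order is preserved
grp_total <- tapply(count, group, sum)
by_row <- as.numeric(count / grp_total[as.character(group)])

# lapply approach: results come back group by group, not row by row
by_group <- unlist(lapply(unique(group), function(g)
  count[group == g] / sum(count[group == g])))

print(by_row)
print(by_group)
```

Both vectors hold the same proportions, but only by_row lines up with the original rows; by_group lists all of group 1 first and then group 2.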
It's also worth being aware that these expressions are evaluated in baseenv(), the namespace of the base package. This means that any functions from other packages, even standard ones like stats and utils, need to be used with the :: operator (e.g. stats::rnorm).
After playing around a little, the problem appears to be using anonymous functions with ..group.. or ..count.. inside aes:
xy <- data.frame(x=1:10,y=1:10) #data
ggplot(xy, aes(x = x, y = sapply(y, mean))) + geom_line() #sapply is fine
ggplot(xy, aes(x = x, group = y)) +
geom_bar(aes(y = sapply(..group.., mean))) #sapply with ..group.. is fine
ggplot(xy, aes(x = x, group = y)) +
geom_bar(aes(y = sapply(..group.., function(g) {mean(g)})))
#broken, with same error
ggplot(xy, aes(x = x, group = y)) +
geom_bar(aes(y = sapply(y, function(g) {mean(g)})), stat = "identity")
#sapply with anonymous functions works fine!
It seems like a really weird bug, unless I'm missing something stupid.