I am trying to make a QQ-plot in ggplot2, where a select few of the points should have a different shape. But when I map the shape to a variable in the aesthetics, stat_qq includes this variable to split the data (there are 2x3 factors involved).
Here is a reproducible example:
library(ggplot2)
set.seed(331)
df <- do.call(rbind, replicate(10, {expand.grid(method=factor(letters[1:3]), model=factor(LETTERS[1:2]))}, simplify=FALSE ))
df$x <- runif(nrow(df))
df$y <- rnorm(nrow(df), sd=0.2) + 1*as.integer(df$method)
df$top <- FALSE
df <- df[order(df$y, decreasing=TRUE),]
df$top[which(df$method=='a')[1:10]] <- TRUE
So far, I have managed to make a simple QQ-plot:
ggplot(df, aes(sample=y, colour=method)) + stat_qq() + facet_grid(.~model)

This is basically what I want, except that a handful of the points in method 'a' should have a different shape, as indicated by the variable 'top'. From the code, we know that these correspond to the 10 largest values in method 'a' (roughly five per model); i.e. the highest few of the red dots in each facet should have a different shape. Here I have attempted to add it as an aesthetic:
ggplot(df, aes(sample=y, colour=method, shape=top)) + stat_qq() + facet_grid(.~model)

Now it is quite clear that stat_qq has used the variable 'top' to split the data set, as the top data points are plotted parallel to the non-top points.
This is not as intended.
How can I instruct stat_qq how to group the data?
I could try the group-aesthetic:
ggplot(df, aes(sample=y, colour=method, shape=top, group=method)) + stat_qq() + facet_grid(.~model)
Warning messages:
1: Removed 10 rows containing missing values (geom_point).
2: Removed 10 rows containing missing values (geom_point).

But for some reason, this entirely removes all data points connected to the model.
Any ideas how to overcome this?
Since you want to violate one of the fundamental concepts of ggplot2, it would be easier to do the calculations outside of ggplot:
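library(plyr)
# Compute theoretical normal quantiles separately for each model/method group
df <- ddply(df, .(model, method),
            transform, theo = qqnorm(y, plot.it = FALSE)[["x"]])
# Plot the sample values against the precomputed quantiles, so shape no longer splits the data
ggplot(df, aes(x = theo, y = y, colour = method, shape = top)) +
  geom_point() + facet_grid(. ~ model)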
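If you would rather not load plyr, the same per-group calculation can probably be done in base R. The following is a minimal sketch, not a tested drop-in: it rebuilds the example data from the question, then uses ave() to apply qqnorm() within each model/method combination. The column name theo is kept from the code above. It assumes every model/method combination has at least one row, since qqnorm() fails on an empty vector.

```r
library(ggplot2)

# Recreate the example data from the question
set.seed(331)
df <- do.call(rbind, replicate(10, expand.grid(method = factor(letters[1:3]),
                                               model = factor(LETTERS[1:2])),
                               simplify = FALSE))
df$x <- runif(nrow(df))
df$y <- rnorm(nrow(df), sd = 0.2) + 1 * as.integer(df$method)
df$top <- FALSE
df <- df[order(df$y, decreasing = TRUE), ]
df$top[which(df$method == "a")[1:10]] <- TRUE

# Theoretical normal quantiles per model/method group.
# ave() puts each group's result back in the original row positions,
# and qqnorm() returns x in the same order as its input.
# Assumption: no model/method combination is empty.
df$theo <- ave(df$y, df$model, df$method,
               FUN = function(v) qqnorm(v, plot.it = FALSE)$x)

# shape = top is now only a display aesthetic; it no longer affects the quantiles
p <- ggplot(df, aes(x = theo, y = y, colour = method, shape = top)) +
  geom_point() + facet_grid(. ~ model)
print(p)
```

Because the quantiles are computed before plotting, any other aesthetic (size, alpha, labels) can be mapped to 'top' in the same way without changing the positions of the points.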
