
ggplot2: plot the mean of each facet subset instead of the global mean

Tags:

r

ggplot2

facet

I would like to draw the mean of each facet's subset (on both the x and y axes) with ggplot2. However, the lines I get show the mean of the whole data set, not the mean of each subset. I don't know how to solve this issue.

library(ggplot2)

hsb2 <- read.table("http://www.ats.ucla.edu/stat/data/hsb2.csv", sep = ",", header = TRUE)
head(hsb2)
hsb2$gender <- as.factor(hsb2$female)

ggplot() +
  geom_point(aes(y = read,x = write,colour = gender),data=hsb2,size = 2.2,alpha = 0.9) +
  scale_colour_brewer(guide = guide_legend(),palette = 'Set1') +
  stat_smooth(aes(x = write,y = read),data=hsb2,colour = '#000000',size = 0.8,method = lm,formula = 'y ~ x') +
  geom_vline(aes(xintercept = mean(write)),data=hsb2,linetype = 3) +
  geom_hline(aes(yintercept = mean(read)),data=hsb2,linetype = 3) +
  facet_wrap(facets = ~gender)

S12000 asked Jan 28 '14 17:01


1 Answer

One way to do it is to explicitly calculate the x and y means for each gender and store them as new columns in the original data frame. Then, when faceting splits the data by gender, each panel only sees its own gender's means, so the lines get drawn where you want them.

Using tapply

#compute the read and write means for each gender 
read_means <- tapply(hsb2$read, hsb2$gender, mean)
write_means <- tapply(hsb2$write, hsb2$gender, mean)

#store it in the data frame
hsb2$read_mean <- ifelse(hsb2$gender==0, read_means[1], read_means[2])
hsb2$write_mean <- ifelse(hsb2$gender==0, write_means[1], write_means[2])
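If there are more than two groups, the ifelse() lookup above gets awkward. A minimal base R sketch of the same idea uses ave(), which returns the group mean for every row. The toy data frame below is made up for illustration and only stands in for hsb2:

```r
# Toy stand-in for hsb2 (made-up values, not the real data set)
toy <- data.frame(
  gender = factor(c(0, 0, 1, 1)),
  read   = c(40, 60, 50, 70),
  write  = c(45, 55, 65, 75)
)

# ave() returns a vector as long as its input,
# where each element is the mean of that row's group
toy$read_mean  <- ave(toy$read,  toy$gender, FUN = mean)
toy$write_mean <- ave(toy$write, toy$gender, FUN = mean)

print(toy)
```

The same two ave() calls should work on hsb2 by swapping toy for hsb2, assuming the gender, read and write columns exist as in the question.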

An alternative to the lines above is to use ddply.

Using ddply from the plyr package

The new columns can be created in a single call. Note that ddply returns a new data frame, so the result has to be assigned back to hsb2.

library(plyr)
hsb2 <- ddply(hsb2, "gender", transform,
              read_mean  = mean(read),
              write_mean = mean(write))

Now pass the two new mean columns to the geom_vline and geom_hline calls in ggplot.

ggplot() +
  geom_point(aes(y = read,x = write,colour = gender),data=hsb2,size = 2.2,alpha = 0.9) +
  scale_colour_brewer(guide = guide_legend(),palette = 'Set1') +
  stat_smooth(aes(x = write,y = read),data=hsb2,colour = '#000000',
              size = 0.8,method = lm,formula = 'y ~ x') +
  geom_vline(aes(xintercept = write_mean),data=hsb2,linetype = 3) +
  geom_hline(aes(yintercept = read_mean),data=hsb2,linetype = 3) +
  facet_wrap(facets = ~gender)
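Because every row of hsb2 now carries its group mean, the vline and hline layers above draw one overlapping line per row. A possible variation, sketched here on a made-up toy data frame rather than the real hsb2, is to keep the original data untouched and pass a small summary data frame (one row per gender) to the line layers. The names toy and means are only illustrative:

```r
library(ggplot2)

# Toy stand-in for hsb2 (made-up values, not the real data set)
toy <- data.frame(
  gender = factor(rep(c(0, 1), each = 4)),
  read   = c(40, 50, 60, 70, 45, 55, 65, 75),
  write  = c(42, 48, 58, 68, 50, 60, 66, 80)
)

# One row per gender, holding the mean read and write score
means <- aggregate(cbind(read, write) ~ gender, data = toy, FUN = mean)

p <- ggplot() +
  geom_point(aes(x = write, y = read, colour = gender), data = toy) +
  # The line layers use the summary data, so each facet gets exactly one line
  geom_vline(aes(xintercept = write), data = means, linetype = 3) +
  geom_hline(aes(yintercept = read), data = means, linetype = 3) +
  facet_wrap(facets = ~gender)

print(p)
```

Since means also has a gender column, facet_wrap assigns each summary row to the matching panel.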

Ram Narasimhan answered Oct 11 '22 07:10
