ggplot2 free the margins constrain the rest

Tags: r, ggplot2

I like facet_grid quite a bit and often find myself using it with geom_histogram. I like the margins=foo argument as well, in that you can see the overall distributions in comparison to the aggregated groups. The problem is that when you include a margins argument, it stretches the y scale of every group to meet the usually much wider range needed by the margins, since the margins contain all the data. This means the y scale for the aggregated groups is much wider than they need, and it's harder to spot differences among groups. So I says to myself, "self, you can fix this with the scales argument and allow y to vary". The problem with this solution is that now it's hard to compare aggregated groups because they all have different scales. What I'd like is to have the margins be free but everything else constrained, as though scales weren't free. Is this possible?

I present code here and pics to demonstrate what I mean. If this isn't clear please ask.

library(ggplot2)

#create some data
set.seed(10)
dat <- data.frame(var1=rpois(1000, 20), 
   var2=as.factor(sample(LETTERS[1:4], 1000, replace=TRUE)),
   var3=as.factor(sample(month.abb[1:5], 1000, replace=TRUE)))

ggplot(dat, aes(var1)) + 
  geom_histogram() + 
  facet_grid(var2~var3)

Here's the plot from that. I like it in that I can easily compare all aggregated scores, since their y scales are the same. But wouldn't it be nice to have the margins, or the dis-aggregated histograms, as well for comparison?

enter image description here

ggplot(dat, aes(var1)) + 
geom_histogram() + 
facet_grid(var2~var3, margins='var2')

Alright, so we put the margins argument in and now we can compare, but the y axis of all the aggregated group histograms is stretched to 20, which makes comparing them difficult (see image below). This example isn't horrible, as the data is pretty evenly distributed by the sampling method I used, but in real life some cells only have a few counts and others have lots, and the comparisons are even worse. OK, let's set scales to be free then.

enter image description here

ggplot(dat, aes(var1)) + 
geom_histogram() + 
facet_grid(var2~var3, margins='var2', scales="free_y")

So here's the plot with the scales free. Problem is they are indeed free for the aggregated scores as well and comparing them is problematic (one is 14ish, one 8ish, one 7ish).

enter image description here
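
To put rough numbers on the mismatch, here is a small base-R sketch. It is only an illustration: it counts one bin per integer value of var1, which is an assumption and not the 30 bins geom_histogram() uses by default, and the names max_count, cell_max and margin_max are made up for this example.

```r
# Recreate the example data from the question
set.seed(10)
dat <- data.frame(var1 = rpois(1000, 20),
                  var2 = as.factor(sample(LETTERS[1:4], 1000, replace = TRUE)),
                  var3 = as.factor(sample(month.abb[1:5], 1000, replace = TRUE)))

# Height of the tallest bar for a vector of values, using one bin per
# integer value (an assumption made for illustration only)
max_count <- function(x) max(table(x))

# Tallest bar in each var2 x var3 cell (the inner panels)
cell_max <- tapply(dat$var1, list(dat$var2, dat$var3), max_count)

# Tallest bar in each var3 column once var2 is pooled (the margin row)
margin_max <- tapply(dat$var1, dat$var3, max_count)

range(cell_max)    # y range the inner panels need
range(margin_max)  # y range the margin row needs
```

Because each margin panel pools the four cells above it, its tallest bar can never be shorter than theirs, which is why a shared y scale flattens the inner panels.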

So is there a way to allow just the margins to be free? Basically what I want is to take the first figure created and splice the margins on from the second figure.

Tyler Rinker asked Nov 14 '22 05:11


1 Answer

Would this workaround work in the meantime? You have the repeated headings, but the x scale and label can be dropped.

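library(ggplot2)
library(gridExtra)
set.seed(10)
dat <- data.frame(var1=rpois(1000, 20), 
                  var2=as.factor(sample(LETTERS[1:4], 1000, replace=TRUE)),
                  var3=as.factor(sample(month.abb[1:5], 1000, replace=TRUE)))
dat$var4 <- "All"

# open an 8 x 8 inch device (portable stand-in for windows(width=8, height=8))
dev.new(width=8, height=8)

# main grid: all panels share one y scale
p1 <- ggplot(dat, aes(var1)) + 
  geom_histogram() + 
  facet_grid(var2~var3)

# margin row: var2 pooled, with its own y scale
p2 <- ggplot(dat, aes(var1)) + 
  geom_histogram() + 
  facet_grid(~var3)

# stack the two plots, giving the main grid most of the height
grid.arrange(p1, p2, nrow=2, heights=c(4,1.5))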

enter image description here

You probably already know how to drop the x-scale and label from the first plot with scale_x_continuous('', breaks = NULL). (Older ggplot2 releases accepted breaks = NA here; current versions reject it and ask for NULL.)
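
For completeness, a minimal sketch of the whole workaround with the x axis dropped from the top plot. Note that this swaps in theme() with element_blank() instead of the scale_x_continuous() call mentioned above, so it assumes a ggplot2 version that has theme() (0.9.2 or later); the repeated var3 headings on the bottom row are still there.

```r
library(ggplot2)
library(gridExtra)

# Same example data as in the question
set.seed(10)
dat <- data.frame(var1 = rpois(1000, 20),
                  var2 = as.factor(sample(LETTERS[1:4], 1000, replace = TRUE)),
                  var3 = as.factor(sample(month.abb[1:5], 1000, replace = TRUE)))

# Top block: all panels share one y scale; x-axis text, ticks and
# title are blanked so only the bottom row carries the x axis
p1 <- ggplot(dat, aes(var1)) +
  geom_histogram() +
  facet_grid(var2 ~ var3) +
  theme(axis.text.x  = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.x = element_blank())

# Bottom block: var2 pooled, so this row plays the role of the margin
# and gets its own y scale because it is a separate plot
p2 <- ggplot(dat, aes(var1)) +
  geom_histogram() +
  facet_grid(. ~ var3)

# Stack the two plots; the heights are a starting point to tune by eye
grid.arrange(p1, p2, nrow = 2, heights = c(4, 1.5))
```

Since the two plots are laid out independently, the panel columns may not line up exactly if the y-axis labels differ in width; that is a limitation of this sketch, not something grid.arrange() corrects.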

Tom answered Jan 08 '23 20:01