I like facet_grid quite a bit and often find myself using it with geom_histogram. I also like the margins=foo argument, because it lets you see the overall distributions in comparison to the aggregated groups. The problem is that when you include a margins argument, all the groups are stretched to meet the usually much wider y scale of the margins, which now contain all the data. This means the y scale of every aggregated group is much wider, and it's harder to spot differences among groups. So I said to myself, "Self, you can fix this with the scales argument and allow y to vary." The problem with this solution is that now it's hard to compare the aggregated groups, because they all have different scales. What I'd like is for the margins to be free but everything else constrained, as though scales weren't free. Is this possible?
I present code here and pics to demonstrate what I mean. If this isn't clear please ask.
library(ggplot2)
#create some data
set.seed(10)
dat <- data.frame(var1=rpois(1000, 20),
                  var2=as.factor(sample(LETTERS[1:4], 1000, replace=TRUE)),
                  var3=as.factor(sample(month.abb[1:5], 1000, replace=TRUE)))
#one histogram per var2 x var3 cell, all on a common y scale
ggplot(dat, aes(var1)) +
    geom_histogram() +
    facet_grid(var2~var3)
Here's the plot from that. I like it in that I can easily compare all the aggregated scores, since their y scales are the same. But wouldn't it be nice to have the margins (the dis-aggregated histograms) as well for comparison?
ggplot(dat, aes(var1)) +
    geom_histogram() +
    facet_grid(var2~var3, margins='var2')
Alright, so we put the margins argument in and now we can compare, but all the aggregated group histograms are stretched to 20, which makes comparing them difficult (see image below). OK, let's set scales to be free then. This example isn't horrible, as the data is pretty evenly distributed by the sampling method I used, but in real life some cells have only a few counts and others have lots, and the comparisons are even worse.
ggplot(dat, aes(var1)) +
    geom_histogram() +
    facet_grid(var2~var3, margins='var2', scales="free_y")
So here's the plot with the scales free. Problem is they are indeed free for the aggregated scores as well and comparing them is problematic (one is 14ish, one 8ish, one 7ish).
So is there a way to allow just the margins to be free? Basically, what I want is to take the first figure created and splice the margins on from the second figure.
Would this workaround work in the meantime? You have the repeated headings, but the x scale and label can be dropped.
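library(ggplot2)
library(gridExtra)  # provides grid.arrange()
set.seed(10)
dat <- data.frame(var1=rpois(1000, 20),
                  var2=as.factor(sample(LETTERS[1:4], 1000, replace=TRUE)),
                  var3=as.factor(sample(month.abb[1:5], 1000, replace=TRUE)))
dat$var4 <- "All"
dev.new(width=8, height=8)  # portable replacement for windows()
# p1: the original grid, all cells on a common y scale
p1 <- ggplot(dat, aes(var1)) +
    geom_histogram() +
    facet_grid(var2~var3)
# p2: the margin row, with its own y scale
p2 <- ggplot(dat, aes(var1)) +
    geom_histogram() +
    facet_grid(~var3)
# stack the two plots, giving the grid most of the height
grid.arrange(p1, p2, nrow=2, heights=c(4,1.5))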
You probably already know how to drop the x-scale and label from the first plot with scale_x_continuous('', breaks = NULL) (older ggplot2 releases accepted breaks = NA, which current versions reject).
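Putting the pieces together, here is a rough sketch of the whole workaround with the x scale dropped from the top plot. A few things in it are my assumptions rather than part of the code above: binwidth = 1 is picked only to silence the default-bins message, and using the otherwise unused var4 column as the facet row of the bottom plot is a guess at what it was for (it gives the margin row an "All" strip on the right, so the panels line up more closely with the grid above). The alignment is only approximate, since the two plots are still laid out independently.

```r
library(ggplot2)
library(gridExtra)

set.seed(10)
dat <- data.frame(
  var1 = rpois(1000, 20),
  var2 = factor(sample(LETTERS[1:4], 1000, replace = TRUE)),
  var3 = factor(sample(month.abb[1:5], 1000, replace = TRUE))
)
dat$var4 <- "All"  # constant column, used only to label the margin row

# Top plot: every var2 x var3 cell shares one y scale.
# The x title and breaks are dropped because the bottom plot shows them.
p1 <- ggplot(dat, aes(var1)) +
  geom_histogram(binwidth = 1) +   # binwidth is an assumption
  facet_grid(var2 ~ var3) +
  scale_x_continuous("", breaks = NULL)

# Bottom plot: the margin row, which gets its own y scale
# simply because it is a separate plot.
p2 <- ggplot(dat, aes(var1)) +
  geom_histogram(binwidth = 1) +
  facet_grid(var4 ~ var3)          # assumption: var4 as the row facet

# Stack them, giving the four-row grid most of the height.
grid.arrange(p1, p2, nrow = 2, heights = c(4, 1.5))
```

Note that breaks = NULL also removes the vertical grid lines from the top plot. If you want to keep them, leave the breaks alone and blank the tick labels through theme() instead.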