What is the required incantation to achieve an overlapping, faceted lattice::histogram
with common break points (across groups, but potentially varying across panels)?
For example, assume I want the total range of the data (groups combined) for each panel to be split into 30 bins.
library(lattice)
set.seed(1)
d <- data.frame(v1=rep(c('A', 'B'), each=1000),
v2=rep(c(0.5, 1), each=2000),
mean=rep(c(0, 10, 2, 12), each=1000))
d$x <- rnorm(nrow(d), d$mean, d$v2)
nint=30
?p1 <- histogram(~x|v1, d, groups=v2, nint=30,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
p1
Above, the bins are consistent across groups, but (1) the x-axis limits are shared across panels (problematic when the x-axis range varies substantially across panels - I really want the 30 bins to be calculated individually for each panel), and (2) the y-axis is cramped when using type='percent'
(it should extend further).
breaks=30
?p2 <- histogram(~x|v1, d, groups=v2, breaks=30,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
p2
Now the axis limits look good, but the bins width varies across groups.
Using lattice
, how can I achieve overlapping, faceted histograms that have constant bin width across groups within panels, but have axis limits that fit the data for each panel?
(I realise that ggplot is an option, but I want the figure style to be consistent with my other lattice plots.)
This works, but I'm afraid it's rather pedestrian. At least it only requires the trellis object itself; it will assume the number of bins you want in each panel is equal to the nint parameter.
It works like this: check whether the panels ranges overlap. If they don't, split each (slightly extended) range into nint
bins, then concatenate them with a few empty bins in between. We also need to work out the y range, which we do by scaling according to the maximum number of counts.
fix_facets <- function(p1)
{
n_bins <- p1$panel.args.common$nint
xvals1 <- p1$panel.args[[1]]$x
xvals2 <- p1$panel.args[[2]]$x
if(min(xvals2) > max(xvals1) | min(xvals1) > max(xvals2)){
left_range <- range(xvals1)
left_range <- left_range + (diff(left_range) * c(-0.1, 0.1))
left_bins <- seq(left_range[1], left_range[2], diff(left_range)/n_bins)
right_range <- range(xvals2)
right_range <- right_range + (diff(right_range) * c(-0.1, 0.1))
right_bins <- seq(right_range[1], right_range[2], diff(right_range)/n_bins)
if(max(left_range) < min(right_range)){
mid_bins <- seq(max(left_bins), min(right_bins), diff(left_bins[1:2]))
all_bins <- c(left_bins, mid_bins, right_bins)
} else {
mid_bins <- seq(max(right_bins), min(left_bins), diff(right_bins[1:2]))
all_bins <- c(right_bins, mid_bins, left_bins)
}
p1$panel.args.common$breaks <- all_bins
p1$x.limits[[1]] <- left_range
p1$x.limits[[2]] <- right_range
histleft <- hist(xvals1, breaks = left_bins)
histright <- hist(xvals2, breaks = right_bins)
group_factor <- 100 * length(p1$condlevels[[1]])
p1$y.limits[[1]][2] <- group_factor * max(histleft$counts) / length(xvals1)
p1$y.limits[[2]][2] <- group_factor * max(histright$counts) / length(xvals2)
}
return(p1)
}
So with your example, we can do this:
p1 <- histogram(~x|v1, d, groups=v2, nint=30,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
fix_facets(p1)
and to show it works with other numbers of bins...
p1 <- histogram(~x|v1, d, groups=v2, nint=10,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
fix_facets(p1)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With