I am using the great plotting library bayesplot
to visualize posterior probability intervals from models I am estimating with rstanarm
. I want to graphically compare draws from different models by getting the posterior intervals for coefficients onto the same plot.
Imagine, for instance, that I have 1000 draws from the posterior for three parameters beta1, beta2, beta3
for two different models:
# load the plotting library
library(bayesplot)
#> This is bayesplot version 1.6.0
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#> * Does _not_ affect other ggplot2 plots
#> * See ?bayesplot_theme_set for details on theme setting
library(ggplot2)
# generate fake posterior draws from model1
fdata <- matrix(rnorm(1000 * 3), ncol = 3)
colnames(fdata) <- c('beta1', 'beta2', 'beta3')
# fake posterior draws from model 2
fdata2 <- matrix(rnorm(1000 * 3, 1, 2), ncol = 3)
colnames(fdata2) <- c('beta1', 'beta2', 'beta3')
Bayesplot makes fantastic visualizations for individual model draws, and it is ggplot2 'under the hood' so I can customize as I please:
# a nice plot of 1
color_scheme_set("orange")
mcmc_intervals(fdata) + theme_minimal() + ggtitle("Model 1")
# a nice plot of 2
color_scheme_set("blue")
mcmc_intervals(fdata2) + ggtitle("Model 2")
But what I would like to achieve is to plot these two models together on the same plot, such that for each coefficient I have two intervals and can distinguish which interval is which by mapping color to the model. However I can't figure out how to do this. Some things that don't work:
# doesnt work
mcmc_intervals(fdata) + mcmc_intervals(fdata2)
#> Error: Don't know how to add mcmc_intervals(fdata2) to a plot
# appears to pool
mcmc_intervals(list(fdata, fdata2))
Any ideas on how I could do this? Or how to do it manually given the matrices of posterior draws?
Created on 2018-10-18 by the reprex package (v0.2.1)
Just so the answer is also posted here, I have expanded on the code at the link from @Manny T (https://github.com/stan-dev/bayesplot/issues/232)
# simulate having posteriors for two different models each with parameters beta[1],..., beta[4]
posterior_1 <- matrix(rnorm(4000), 1000, 4)
posterior_2 <- matrix(rnorm(4000), 1000, 4)
colnames(posterior_1) <- colnames(posterior_2) <- paste0("beta[", 1:4, "]")
# use bayesplot::mcmc_intervals_data() function to get intervals data in format easy to pass to ggplot
library(bayesplot)
combined <- rbind(mcmc_intervals_data(posterior_1), mcmc_intervals_data(posterior_2))
combined$model <- rep(c("Model 1", "Model 2"), each = ncol(posterior_1))
# make the plot using ggplot
library(ggplot2)
theme_set(bayesplot::theme_default())
pos <- position_nudge(y = ifelse(combined$model == "Model 2", 0, 0.1))
ggplot(combined, aes(x = m, y = parameter, color = model)) +
geom_linerange(aes(xmin = l, xmax = h), position = pos, size=2)+
geom_linerange(aes(xmin = ll, xmax = hh), position = pos)+
geom_point(position = pos, color="black")
If you are like me, you will want 80% and 90% credible intervals (instead of 50% being the inner ones) and might want the coordinates flipped, and let's add a dashed line at 0 (model estimates no change). You can do that like this:
# use bayesplot::mcmc_intervals_data() function to get intervals data in format easy to pass to ggplot
library(bayesplot)
combined <- rbind(mcmc_intervals_data(posterior_1,prob=0.8,prob_outer = 0.9), mcmc_intervals_data(posterior_2,prob=0.8,prob_outer = 0.9))
combined$model <- rep(c("Model 1", "Model 2"), each = ncol(posterior_1))
# make the plot using ggplot
library(ggplot2)
theme_set(bayesplot::theme_default())
pos <- position_nudge(y = ifelse(combined$model == "Model 2", 0, 0.1))
ggplot(combined, aes(x = m, y = parameter, color = model)) +
geom_linerange(aes(xmin = l, xmax = h), position = pos, size=2)+
geom_linerange(aes(xmin = ll, xmax = hh), position = pos)+
geom_point(position = pos, color="black")+
coord_flip()+
geom_vline(xintercept=0,linetype="dashed")
A few things to note on this last one. I added prob_outer = 0.9
even though that is the default, just to show how you might change the outer credible intervals. The dashed line is created with geom_vline
and xintercept =
here instead of geom_hline
and yintercept =
because of the coord_flip
(everything is reversed). So if you don't flip axes, you will need to do the opposite.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With