I'll use violin plots here as an example, but the question extends to many other ggplot types.
I know how to subset my data along the x-axis by a factor:
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
geom_violin() +
geom_point(position = "jitter")
And I know how to plot only the full dataset:
ggplot(iris, aes(x = 1, y = Sepal.Length)) +
geom_violin() +
geom_point(position = "jitter")
My question is: is there a way to plot the full data AND a subset-by-factor side-by-side in the same plot? In other words, for the iris data, could I make a violin plot that has both "full data" and "setosa" along the x-axis?
This would enable a comparison of the distribution of a full dataset and a subset of that dataset. If this isn't possible, any recommendations on better way to visualise this would also be welcome :)
Thanks for any ideas!
Using:
ggplot(iris, aes(x = "All", y = Sepal.Length)) +
geom_violin() +
geom_point(aes(color="All"), position = "jitter") +
geom_violin(data=iris, aes(x = Species, y = Sepal.Length)) +
geom_point(data=iris, aes(x = Species, y = Sepal.Length, color = Species),
position = "jitter") +
scale_color_manual(values = c("black","#F8766D","#00BA38","#619CFF")) +
theme_minimal(base_size = 16) +
theme(axis.title.x = element_blank(), legend.title = element_blank())
gives:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With