
Plot multiple series of data into a single bagplot with R

Let's consider the bagplot example included in the aplpack package for R. A bagplot is a bivariate generalisation of a boxplot and therefore gives insight into the distribution of data points along both axes.

Example of a bagplot: [figure: car data bagplot]

Code for the example:

  library(aplpack)  # provides bagplot()
  # example of Rousseeuw et al., see R-package rpart
  cardata <- structure(as.integer( c(2560,2345,1845,2260,2440,
   2285, 2275, 2350, 2295, 1900, 2390, 2075, 2330, 3320, 2885,
   3310, 2695, 2170, 2710, 2775, 2840, 2485, 2670, 2640, 2655,
   3065, 2750, 2920, 2780, 2745, 3110, 2920, 2645, 2575, 2935,
   2920, 2985, 3265, 2880, 2975, 3450, 3145, 3190, 3610, 2885,
   3480, 3200, 2765, 3220, 3480, 3325, 3855, 3850, 3195, 3735,
   3665, 3735, 3415, 3185, 3690, 97, 114, 81, 91, 113, 97, 97,
   98, 109, 73, 97, 89, 109, 305, 153, 302, 133, 97, 125, 146,
   107, 109, 121, 151, 133, 181, 141, 132, 133, 122, 181, 146,
   151, 116, 135, 122, 141, 163, 151, 153, 202, 180, 182, 232,
   143, 180, 180, 151, 189, 180, 231, 305, 302, 151, 202, 182,
   181, 143, 146, 146)), .Dim = as.integer(c(60, 2)), 
   .Dimnames = list(NULL, c("Weight", "Disp.")))
  bagplot(cardata,factor=3,show.baghull=TRUE,
    show.loophull=TRUE,precision=1,dkmethod=2)
  title("car data Chambers/Hastie 1992")
  # points of y=x*x
  bagplot(x=1:30,y=(1:30)^2,verbose=FALSE,dkmethod=2)

The bagplot function in aplpack seems to support plotting a "bag" for only a single data series. It would be even more interesting to plot two (or three) data series within a single bagplot, so that visually comparing the "bags" gives insight into the differences between the distributions of the series. Does anyone know if (and if so, how) this can be done in R?

Niek Tax asked Apr 07 '15 21:04


1 Answer

If we modify some of the aplpack::bagplot code we can make a new geom for ggplot2. Then we can compare groups within a dataset in the usual ggplot2 ways. Here's one example:

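# geom_bag() is not part of ggplot2 itself; define it first by
# running the code from the gist linked at the end of this answer
library(ggplot2)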
ggplot(iris, aes(Sepal.Length, Sepal.Width, 
                 colour = Species, fill = Species)) +
       geom_bag() +
       theme_minimal()


We can also overlay the points on the bagplot:

ggplot(iris, aes(Sepal.Length, Sepal.Width, 
                 colour = Species, fill = Species)) +
       geom_bag() +
       geom_point() + 
       theme_minimal()


Here's the code for the geom_bag and modified aplpack::bagplot function: https://gist.github.com/benmarwick/00772ccea2dd0b0f1745
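
If you would rather stay in base graphics, a rough alternative is to draw one aplpack bagplot per group onto a shared, empty plot. The following is only a minimal sketch: it assumes the installed aplpack version accepts the `add`, `col.baghull` and `col.loophull` arguments of `bagplot()` (check `?bagplot`), and the colours are arbitrary illustrative choices rather than anything from the gist.

```r
# Sketch: overlay one aplpack bagplot per group on shared axes.
# Assumption: bagplot() accepts `add` and the col.* arguments in
# your aplpack version; see ?bagplot.
library(aplpack)

xvar <- "Sepal.Length"
yvar <- "Sepal.Width"

# One data frame of (x, y) per species
groups <- split(iris[, c(xvar, yvar)], iris$Species)

# Illustrative colours (arbitrary choice) as #RRGGBBAA,
# more opaque for the bag and fainter for the loop
bag_cols  <- c("#1B9E7799", "#D95F0299", "#7570B399")
loop_cols <- c("#1B9E7733", "#D95F0233", "#7570B333")

# Empty canvas spanning all groups so that every bag fits
plot(iris[[xvar]], iris[[yvar]], type = "n", xlab = xvar, ylab = yvar)

# Add one bagplot per group to the existing plot
for (i in seq_along(groups)) {
  g <- groups[[i]]
  bagplot(g[[xvar]], g[[yvar]], add = TRUE,
          col.baghull = bag_cols[i], col.loophull = loop_cols[i],
          show.whiskers = FALSE)
}

legend("topright", legend = names(groups), fill = bag_cols, bty = "n")
```

Drawing the empty plot first keeps the axis limits independent of whichever group happens to be plotted first. The semi-transparent fills are what make overlapping bags comparable.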

Ben answered Oct 05 '22 23:10