I am still new to ggplot2. I want to draw a box plot, but instead of the raw data I only have summary statistics.
Page_Type ID Count min 5% 25% 50% 75% 95% Max Avg
3 24559 173 408 479.45 615.25 800.5 1547.25 4436.8 7068 1350.138462
3 24560 101 0 480 631 871 1762 5183 65177 2702.245902
6 24559 69 490 664 1181 1807 3221 4845.5 6397 2287.45098
6 24560 10 1086 1254.4 1928 1970 2007 5236.6 6044 2607
46 24559 49 217 252.45 438.75 595 1198 2647.15 4316 939.6666667
46 24560 31 266 337 467 640 1123 2531.6 5232 989.2758621
69 24559 424 644 761.8 957 1292 2212 4938.6 11246 1881.785467
69 24560 216 601 848.85 1060.25 1488.5 2465 5314.7 7981 2094.007692
82 24559 62 922 1018.2 1305 1534 1966 3313.8 22461 2325.810811
82 24560 137 630 926.6 1156 1468 2281 3764.6 11364 1922.252632
The dput output is as follows:
structure(list(Page_Type = c(3L, 3L, 6L, 6L, 46L, 46L, 69L, 69L,
82L, 82L), ID = c(24559L, 24560L, 24559L, 24560L, 24559L, 24560L,
24559L, 24560L, 24559L, 24560L), Count = c(173L, 101L, 69L, 10L,
49L, 31L, 424L, 216L, 62L, 137L), min = c(408L, 0L, 490L, 1086L,
217L, 266L, 644L, 601L, 922L, 630L), X5. = c(479.45, 480, 664,
1254.4, 252.45, 337, 761.8, 848.85, 1018.2, 926.6), X25. = c(615.25,
631, 1181, 1928, 438.75, 467, 957, 1060.25, 1305, 1156), X50. = c(800.5,
871, 1807, 1970, 595, 640, 1292, 1488.5, 1534, 1468), X75. = c(1547.25,
1762, 3221, 2007, 1198, 1123, 2212, 2465, 1966, 2281), X95. = c(4436.8,
5183, 4845.5, 5236.6, 2647.15, 2531.6, 4938.6, 5314.7, 3313.8,
3764.6), Max = c(7068L, 65177L, 6397L, 6044L, 4316L, 5232L, 11246L,
7981L, 22461L, 11364L), Avg = c(1350.138462, 2702.245902, 2287.45098,
2607, 939.6666667, 989.2758621, 1881.785467, 2094.007692, 2325.810811,
1922.252632)), .Names = c("Page_Type", "ID", "Count", "min",
"X5.", "X25.", "X50.", "X75.", "X95.", "Max", "Avg"), class = "data.frame", row.names = c(NA,
-10L))
There are 5 page types and each page type has 2 IDs. I want to show the various summary metrics (min, 5%, 25%, ...) as a box plot. I am OK with skipping the 5% and 95% data points to fit the more traditional look. How do I create a box plot from this data?
There is also a Count column which shows how many points were used to compute each summary. If this can be overlaid on the same plot, great; otherwise it can go in a separate plot.
You can make a box plot with geom_boxplot() by providing your own ymin, lower, middle, upper and ymax values; in this case you need to add stat="identity" inside geom_boxplot().
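library(ggplot2)
# df is the data frame from the dput output above; the box spans the 25% to 75% columns
ggplot(df, aes(x=as.factor(Page_Type),
    ymin=min, lower=X25., middle=X50., upper=X75., ymax=Max, fill=as.factor(ID))) +
  geom_boxplot(stat="identity")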
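The question also asks about showing Count on the same plot. One possible way, offered as a sketch rather than the only approach, is to add a geom_text() layer that prints the count above each box. The example below uses only the first four rows of the question's data so that it is self-contained, takes the 25% column as the lower hinge, and dodges both layers by the same width so the labels should line up with their boxes. The "n=" label format, the dodge width of 0.9 and the axis titles are illustrative choices, not something from the original post.

```r
library(ggplot2)

# First four rows of the question's data (Page_Type 3 and 6 only),
# keeping just the columns the plot needs.
df <- data.frame(
  Page_Type = c(3L, 3L, 6L, 6L),
  ID        = c(24559L, 24560L, 24559L, 24560L),
  Count     = c(173L, 101L, 69L, 10L),
  min       = c(408, 0, 490, 1086),
  X25.      = c(615.25, 631, 1181, 1928),
  X50.      = c(800.5, 871, 1807, 1970),
  X75.      = c(1547.25, 1762, 3221, 2007),
  Max       = c(7068, 65177, 6397, 6044)
)

# Same dodge for both layers so each label sits over its own box.
dodge <- position_dodge(width = 0.9)

p <- ggplot(df, aes(x = as.factor(Page_Type), fill = as.factor(ID))) +
  # Boxes drawn from the precomputed summaries (no statistics recomputed).
  geom_boxplot(
    aes(ymin = min, lower = X25., middle = X50., upper = X75., ymax = Max),
    stat = "identity", position = dodge
  ) +
  # Count label placed just above the top whisker of each box.
  geom_text(
    aes(y = Max, label = paste0("n=", Count), group = as.factor(ID)),
    position = dodge, vjust = -0.5
  ) +
  labs(x = "Page_Type", y = "Value", fill = "ID")

print(p)
```

Because a few Max values are far larger than the rest (65177 for Page_Type 3, ID 24560), the boxes may look compressed; using the 95% column for ymax instead of Max is one way to keep the scale readable.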