
How to Plot a Pre-Binned Histogram In R

I have a pre-binned frequency table for a rather large dataset: a single column vector of bins and a single column vector of counts associated with those bins. I'd like R to plot a histogram of this data by binning further and summing the existing counts. For example, if the pre-binned data looks like [(0.01, 5000), (0.02, 231), (0.03, 948)], where the first number is the bin and the second is the count, and I choose 0.04 as the new bin width, I'd expect to get [(0.04, 6179)]. What's the fastest and/or easiest way to do this in R?

Jacob asked Sep 24 '10 16:09


2 Answers

Looks like ggplot2 has the answer.

 
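library(ggplot2)
# qplot() needs a data frame (cbind() returns a matrix), and the variable is bins, not bin.
# weight=counts makes each bar sum the existing counts; pass binwidth= to set the new bin width.
qplot(bins, data=data.frame(bins, counts), weight=counts, geom="histogram")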
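
For comparison, the same re-binning can be sketched in base R without ggplot2. This is only a minimal illustration using the numbers from the question; it assumes each bin value labels the upper edge of its interval, which the question does not state, and the variable names (`width`, `breaks`, `group`, `merged`) are made up for the example.

```r
# Pre-binned data from the question: bin labels and their counts.
bins   <- c(0.01, 0.02, 0.03)
counts <- c(5000, 231, 948)

# Assumption: each bin is labelled by its upper edge, so the new
# intervals are right-closed: (0, 0.04], (0.04, 0.08], ...
width  <- 0.04
breaks <- seq(0, ceiling(max(bins) / width) * width, by = width)

# Assign every old bin to a new interval, then sum the counts per interval.
group  <- cut(bins, breaks = breaks, right = TRUE)
merged <- tapply(counts, group, sum)
merged[is.na(merged)] <- 0  # new intervals containing no old bins get zero

# Label each bar with the upper edge of its interval, as in the question.
barplot(merged, names.arg = breaks[-1], space = 0,
        xlab = "bin (upper edge)", ylab = "count")
```

With the question's data this produces a single bar at 0.04 holding all three counts. If the bin values are lower edges or midpoints instead, the `right = TRUE` choice and the start of `breaks` would need adjusting.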
Jacob answered Nov 10 '22 01:11


The new HistogramTools package on CRAN has a number of useful functions for doing exactly this. In your example, if you want to merge every three adjacent buckets of the histogram to produce a new histogram with one third as many buckets, you could use the MergeBuckets function.

install.packages("HistogramTools")
library(HistogramTools)
h <- hist(rexp(1000), breaks=60)
plot(MergeBuckets(h, adj.buckets=3))

Alternatively, you can specify the new breakpoints explicitly, rather than telling MergeBuckets() to always merge the same number of adjacent buckets.
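
MergeBuckets() works on a histogram object, whereas the question starts from a table of bins and counts. The sketch below is one way to bridge that gap in base R alone: it wraps the table in an object of class "histogram" and then merges buckets at explicitly chosen breakpoints by summing counts. It does not use HistogramTools; the helper `as_histogram` and the breakpoints in `new_breaks` are hypothetical, and it assumes each bin value is the upper edge of a 0.01-wide interval.

```r
# Hypothetical helper: wrap pre-computed breaks and counts in an object of
# class "histogram" so that plot() draws it like the output of hist().
as_histogram <- function(breaks, counts, xname = "x") {
  widths <- diff(breaks)
  structure(list(
    breaks   = breaks,
    counts   = counts,
    density  = counts / (sum(counts) * widths),
    mids     = breaks[-1] - widths / 2,
    xname    = xname,
    equidist = isTRUE(all.equal(min(widths), max(widths)))
  ), class = "histogram")
}

# Data from the question; assumes each bin value is the upper edge of a
# 0.01-wide interval, so the breaks are 0, 0.01, 0.02, 0.03.
bins   <- c(0.01, 0.02, 0.03)
counts <- c(5000, 231, 948)
h <- as_histogram(breaks = c(0, bins), counts = counts)

# Explicit (here unequal) breakpoints; they must span the original range.
new_breaks <- c(0, 0.02, 0.03)

# Find the new interval containing each old bucket's midpoint, then sum.
idx <- findInterval(h$mids, new_breaks, rightmost.closed = TRUE)
idx <- factor(idx, levels = seq_len(length(new_breaks) - 1))
new_counts <- as.vector(tapply(h$counts, idx, sum))
new_counts[is.na(new_counts)] <- 0  # new buckets with no old buckets inside

h2 <- as_histogram(new_breaks, new_counts)
plot(h2)  # widths are unequal, so plot() draws densities rather than counts
```

The merged object keeps the total count unchanged, and because the density is recomputed from the new widths the bar areas still sum to one.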

MurrayStokely answered Nov 09 '22 23:11