Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R: ggplot2: Adding count labels to histogram with density overlay

I have a time-series that I'm examining for data heterogeneity, and wish to explain some important facets of this to some data analysts. I have a density histogram overlayed by a KDE plot (in order to see both plots obviously). However the original data are counts, and I want to place the count values as labels above the histogram bars.

Here is some code:

$tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) 
             + geom_histogram(aes(y = ..density..), colour="black", fill="orange", binwidth=50) 
             + xlab("Bin") + ylab("Density") + geom_density(aes(y = ..density..),fill=NA, colour="blue") 
             + scale_x_continuous(breaks=seq(1,1700,by=100))

    tix_hist + opts(
       title = "Ticket Density To-Date",
       plot.title = theme_text(face="bold", size=18), 
       axis.title.x = theme_text(face="bold", size=16),
       axis.title.y = theme_text(face="bold", size=14, angle=90),
       axis.text.x = theme_text(face="bold", size=14),
       axis.text.y = theme_text(face="bold", size=14)
           )

I thought about extrapolating count values using KDE bandwidth, etc, . Is it possible to data frame the numeric output of a ggplot frequency histogram and add this as a 'layer'. I'm not savvy on the layer() function yet, but any ideas would be helpful. Many thanks!

like image 438
pmiddlet Avatar asked Jul 09 '12 23:07

pmiddlet


1 Answers

if you want the y-axis to show the bin_count number, at the same time, adding a density curve on this histogram,

you might use geom_histogram() first and record the binwidth value! (this is very important!), next add a layer of geom_density() to show the fitting curve.

if you don't know how to choose the binwidth value, you can just calculate:

my_binwidth = (max(Tix_Cnt)-min(Tix_Cnt))/30;

(this is exactly what geom_histogram does in default.)

The code is given below:

(suppose the binwith value you just calculated is 0.001)

tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) ;

tix_hist<- tix_hist + geom_histogram(aes(y=..count..),colour="blue",fill="white",binwidth=0.001);

tix_hist<- tix_hist + geom_density(aes(y=0.001*..count..),alpha=0.2,fill="#FF6666",adjust=4);

print(tix_hist);
like image 182
crystal_ym Avatar answered Oct 20 '22 10:10

crystal_ym