
Histogram with Logarithmic Scale and custom breaks

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:

hist(mydata$V3, breaks=c(0,1,2,3,4,5,25)) 

This gives me a histogram, but the density between 0 and 1 is so much greater (a difference of about a million values) that you can barely make out any of the other bars.

Then I've tried doing:

mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(mydata_hist$counts, log="xy", pch=20, col="blue")

It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.

Weegee asked Aug 07 '09 15:08



2 Answers

A histogram is a poor man's density estimate. Note that hist() plots frequencies by default only when the breaks are equally spaced; with unequal breaks like yours, it plots densities. Pass freq=TRUE or prob=TRUE to the call to choose explicitly.
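To make the frequency/density distinction concrete, here is a minimal sketch using the breaks from the question. The vector v3 is synthetic and only stands in for mydata$V3, which is not shown in the question.

```r
# Synthetic stand-in for mydata$V3 (an assumption, not the asker's data)
set.seed(1)
v3 <- c(runif(1000, 0, 1), runif(50, 1, 25))

brks <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(v3, breaks = brks, plot = FALSE)

h$equidist   # FALSE here, because the last bin is wider than the others
h$counts     # number of observations per bin
h$density    # counts scaled by sample size and bin width

# Density times bin width gives each bar's area; the areas sum to 1
total_area <- sum(h$density * diff(h$breaks))
```

The returned object carries both counts and density, so either can be passed to a later plotting call regardless of what hist() would have drawn by default.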

As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:

plot(mydata_hist$counts, log="y", type='h', lwd=10, lend=2)

gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.
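If the bottom axis should show the bins rather than the positions 1 to 6, one possible tweak is barplot() with names.arg. This is only a sketch: v3 is synthetic data standing in for mydata$V3, and the "0-1" style of label is an illustrative choice.

```r
# Synthetic stand-in for mydata$V3 (an assumption, not the asker's data)
set.seed(1)
v3 <- c(runif(100000, 0, 1), runif(500, 1, 5), runif(20, 5, 25))

brks <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(v3, breaks = brks, plot = FALSE)

# Label each bar with the bin it covers, e.g. "0-1" ... "5-25"
bin_labels <- paste(head(brks, -1), tail(brks, -1), sep = "-")

# A log axis cannot show zero, so only non-empty bins are drawn
keep <- h$counts > 0
barplot(h$counts[keep], names.arg = bin_labels[keep], log = "y",
        col = "blue", xlab = "V3", ylab = "Count (log scale)")
```

The bars are drawn with equal widths, so the wider 5-25 bin is not shown to scale on the x-axis; the labels carry that information instead.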

Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
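A rough sketch of that last suggestion follows. The data is synthetic, and the choice of log10 over the natural log is just for readable axis values. Zeros have no finite logarithm, so the sketch drops them before transforming.

```r
# Synthetic, heavily skewed data with two zeros added (an assumption)
set.seed(1)
x <- c(0, 0, rlnorm(1000, meanlog = 0, sdlog = 2))

# log10(0) is -Inf, so keep only strictly positive values
x_pos <- x[x > 0]

h <- hist(log10(x_pos), xlab = "log10(x)",
          main = "Histogram of log-transformed values")
```

Note that this transforms the values being binned, not the count axis, so it answers a slightly different question than a log-y plot of the original bins.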

Dirk Eddelbuettel answered Sep 28 '22 11:09


Another option would be to use the package ggplot2.

library(ggplot2)
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()
Thierry answered Sep 28 '22 10:09