Exact number of bins in Histogram in R

I'm having trouble making a histogram in R. The problem is that I tell it to make 5 bins but it makes 4, and I tell it to make 5 and it makes 8 of them.

data <- c(5.28, 14.64, 37.25, 78.9, 44.92, 8.96, 19.22, 34.81, 33.89, 24.28, 6.5, 4.32, 2.77, 17.6, 33.26, 52.78, 5.98, 22.48, 20.11, 65.74, 35.73, 56.95, 30.61, 29.82)

hist(data, nclass = 5, freq = FALSE, col = "orange", main = "Histogram", xlab = "x", ylab = "f(x)", yaxs = "i", xaxs = "i")

Any ideas on how to fix it?

Eduardo asked Jun 05 '13 05:06


2 Answers

Use the breaks argument:

hist(data, breaks = seq(0, 80, length.out = 6),
     freq = FALSE, col = "orange", main = "Histogram",
     xlab = "x", ylab = "f(x)", yaxs = "i", xaxs = "i")

Rob Hyndman answered Sep 21 '22 20:09


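The integer passed as nclass is treated only as a suggestion. As the documentation (?hist) says of a single number given for breaks:

"the number is a suggestion only"

hist() passes that number to pretty(), which picks round breakpoints, so the number of bins you get can differ from the number you asked for.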
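You can see the suggestion being overridden by checking what `pretty()` returns for this data. A small sketch, using the data from the question; the variable names are mine:

```r
data <- c(5.28, 14.64, 37.25, 78.9, 44.92, 8.96, 19.22, 34.81, 33.89, 24.28,
          6.5, 4.32, 2.77, 17.6, 33.26, 52.78, 5.98, 22.48, 20.11, 65.74,
          35.73, 56.95, 30.61, 29.82)

# Round breakpoints chosen for a request of about 5 intervals.
suggested <- pretty(range(data), n = 5)
suggested

# Count the bins hist() actually produces, without drawing anything.
h <- hist(data, breaks = 5, plot = FALSE)
n_bins <- length(h$counts)
n_bins
```

For this data the round breakpoints land on multiples of 20, which gives 4 bins and matches what the question reports.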

An alternative solution is to cut your vector into a specified number of groups and plot the result:

plot(cut(data, breaks = 4))
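To check how many groups `cut()` really makes, you can inspect the factor it returns. The sketch below uses 5 groups to match the question; that value is my choice, the call above uses 4.

```r
data <- c(5.28, 14.64, 37.25, 78.9, 44.92, 8.96, 19.22, 34.81, 33.89, 24.28,
          6.5, 4.32, 2.77, 17.6, 33.26, 52.78, 5.98, 22.48, 20.11, 65.74,
          35.73, 56.95, 30.61, 29.82)

# A single number for 'breaks' splits the data range into that many
# equal-width intervals and returns a factor.
groups <- cut(data, breaks = 5)
nlevels(groups)

# Observations per interval; plot(groups) draws these counts as bars.
counts <- table(groups)
counts
```

Note that the result is a bar chart of a factor rather than a true histogram: the x axis shows interval labels, not a numeric scale, and `cut()` nudges the outer limits slightly beyond the data range so the extreme values are included.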

Sven Hohenstein answered Sep 21 '22 20:09