Make dotplot scale y axis as for histogram

Tags: r, ggplot2

We use dotplots in a classroom setting to introduce the histogram, because many students find the binning concept confusing. We start with the dotplot, which is similar but more intuitive:

library(ggplot2)

x <- rnorm(100)
qplot(x, geom = "bar")                          # binned bars
qplot(x, geom = "dotplot", method = "histodot") # same bins, drawn as stacked dots


Because students do this on their own data, the code needs to work without manual fiddling. However, geom_dotplot seems to use different scaling defaults than geom_bar. The y axis does not adjust to the data; it seems to depend only on the size of the dots. For example:

x <- runif(1000)
qplot(x, geom = "bar")
qplot(x, geom = "dotplot", method="histodot")
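To put rough numbers on the problem, here is a minimal base R sketch. It assumes each dot is one binwidth in diameter (the documented default dotsize of 1), that dots stack without overlapping, and a hypothetical panel aspect ratio of 2/3; the bin edges are computed by hand and may not match ggplot2's exactly.

```r
set.seed(1)
x <- runif(1000)

nbins <- 30                                   # default: binwidth is 1/30 of the range
binwidth <- diff(range(x)) / nbins
breaks <- seq(min(x), max(x), length.out = nbins + 1)

# Count observations per bin; all.inside = TRUE keeps the maximum in the last bin
bin_id <- findInterval(x, breaks, all.inside = TRUE)
counts <- tabulate(bin_id, nbins = nbins)

tallest <- max(counts)                        # dots in the tallest stack
aspect_ratio <- 2/3                           # assumed panel height / width
dots_that_fit <- aspect_ratio * nbins         # stack height that still fits

cat("tallest stack:", tallest, "dots; room for about", dots_that_fit, "\n")
```

With 1000 points in 30 bins, at least one bin must hold 34 or more dots, while only about 20 fit under these assumptions, so the stacks run off the top of the plot.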


How can I make geom_dotplot with stat_histodot scale the y axis exactly as it would do for the histogram, either by using smaller or overlapping dots?

asked Oct 02 '14 09:10 by Jeroen Ooms


1 Answer

I came up with the following workaround that shrinks the binwidth until things fit on the page:

# Calculate a default binwidth that works better for dotplots with
# large n than the ggplot2 default.
calculate_smart_binwidth <- function(x, aspect_ratio = 2/3) {
  x <- as.numeric(x)
  # Start from at least 30 bins and grow with sqrt(n)
  nbins <- max(30, round(sqrt(length(x)) / aspect_ratio))
  x_range <- range(x, na.rm = TRUE, finite = TRUE)
  # Constant data has no meaningful binwidth
  if (diff(x_range) == 0) return(NULL)
  repeat {
    message("trying nbins: ", nbins)
    binwidth <- diff(x_range) / nbins
    # ggplot2:::bin is an unexported internal, so it may differ between versions
    highest_bin <- max(ggplot2:::bin(x, binwidth = binwidth)$count)
    # Stop once the tallest stack of dots fits within the aspect ratio
    if (highest_bin < aspect_ratio * nbins) return(binwidth)
    # Otherwise try about 3% more (and therefore narrower) bins
    nbins <- ceiling(nbins * 1.03)
  }
}
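The function above relies on ggplot2:::bin, an unexported internal that may not be available in every ggplot2 version. As a rough alternative, the same search can be sketched with base R only. The names count_tallest_bin and calculate_binwidth_base are hypothetical, and the hand-rolled bin edges are an assumption that may not match ggplot2's binning exactly, so treat the result as an approximation.

```r
# Hypothetical helper: size of the fullest bin when the range of x
# is split into nbins equal-width bins.
count_tallest_bin <- function(x, nbins) {
  rng <- range(x)
  breaks <- seq(rng[1], rng[2], length.out = nbins + 1)
  bin_id <- findInterval(x, breaks, all.inside = TRUE)
  max(tabulate(bin_id, nbins = nbins))
}

# Hypothetical base R variant of the binwidth search.
calculate_binwidth_base <- function(x, aspect_ratio = 2/3) {
  x <- as.numeric(x)
  x <- x[is.finite(x)]                  # drop NA, NaN and infinite values
  if (length(x) == 0) return(NULL)
  rng <- range(x)
  if (diff(rng) == 0) return(NULL)      # constant data: no meaningful binwidth
  nbins <- max(30, round(sqrt(length(x)) / aspect_ratio))
  repeat {
    # A stack of k dots is k binwidths tall, so it fits when
    # k * binwidth < aspect_ratio * range, i.e. k < aspect_ratio * nbins
    if (count_tallest_bin(x, nbins) < aspect_ratio * nbins) {
      return(diff(rng) / nbins)
    }
    nbins <- ceiling(nbins * 1.03)      # try about 3% more bins
  }
}
```

The result can be passed as the binwidth argument in the same way as calculate_smart_binwidth(x). The loop always ends, because the allowed stack height grows with nbins while the tallest bin can never exceed the number of observations.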

Examples:

x <- runif(1e4)
qplot(x, geom="dotplot", binwidth=calculate_smart_binwidth(x))


x <- rnorm(1e4)
qplot(x, geom="dotplot", binwidth=calculate_smart_binwidth(x))


answered Nov 01 '22 00:11 by Jeroen Ooms