We are using dotplots in a classroom setting to introduce the histogram, because the binning concept is confusing to many students. So we start with the dotplot which is similar but more intuitive:
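library(ggplot2)
x <- rnorm(100)
# geom = "histogram" bins the data; in current ggplot2, geom = "bar" counts unique values instead
qplot(x, geom = "histogram")
qplot(x, geom = "dotplot", method = "histodot")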
Because students do this on their own data, the code needs to work without manual fiddling. However, geom_dotplot seems to use different scaling defaults than geom_bar. The y axis does not adjust with the data, but seems to depend only on the size of the dots. For example:
x <- runif(1000)
qplot(x, geom = "histogram")
qplot(x, geom = "dotplot", method = "histodot")
How can I make geom_dotplot with stat_histodot scale the y axis exactly as it would for the histogram, either by using smaller or overlapping dots?
To change the axis scales on a plot in base R, pass the xlim and ylim arguments to plot(). In ggplot2, xlim() and ylim() are convenience functions that set the limits of the x-axis and y-axis respectively.
A dot plot displays individual data values along the x-axis and uses dots to represent the frequencies of each individual value. A histogram displays data ranges along the x-axis and uses rectangular bars to represent the frequencies of values that fall into each range.
Dot plots work well for small sets of data, but become difficult to construct for large data sets. A histogram or box plot will deal more efficiently with large data sets. Dot plots show all values in the set.
I came up with the following workaround that shrinks the binwidth until things fit on the page:
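# This function calculates a default binwidth that will work better
# for the dotplot with large n than the ggplot2 default.
calculate_smart_binwidth <- function(x, aspect_ratio = 2/3) {
  x <- as.numeric(x)
  x <- x[is.finite(x)]
  if (length(x) < 2) return(NULL)
  nbins <- max(30, round(sqrt(length(x)) / aspect_ratio))
  x_range <- range(x)
  if (diff(x_range) == 0) return(NULL)
  repeat {
    message("trying nbins: ", nbins)
    binwidth <- diff(x_range) / nbins
    # Count of the fullest bin. The original version called the internal
    # ggplot2:::bin(), which is not available in current ggplot2, so the
    # counts are computed with base R here.
    breaks <- seq(x_range[1], x_range[2], length.out = nbins + 1)
    bin_index <- findInterval(x, breaks, all.inside = TRUE)
    highest_bin <- max(tabulate(bin_index, nbins = nbins))
    # Stop once the tallest stack of dots fits within the plot height.
    if (highest_bin < aspect_ratio * nbins) return(binwidth)
    nbins <- ceiling(nbins * 1.03)
  }
}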
Examples:
x <- runif(1e4)
qplot(x, geom="dotplot", binwidth=calculate_smart_binwidth(x))
x <- rnorm(1e4)
qplot(x, geom="dotplot", binwidth=calculate_smart_binwidth(x))
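The stopping rule in the loop can be checked on its own, without drawing anything. The sketch below is only an illustration and uses base R alone. It assumes that each dot is one binwidth in diameter (the geom_dotplot default, as far as I can tell) and that aspect_ratio stands for the panel's height divided by its width. Under those assumptions a stack of k dots is k binwidths tall, so it fits when k / nbins stays below aspect_ratio. The helper name stack_fits is made up for this example.

```r
# Hypothetical helper, for illustration only: does the tallest stack fit?
stack_fits <- function(x, nbins, aspect_ratio = 2/3) {
  x <- x[is.finite(x)]
  # Equal-width bins spanning the data range
  breaks <- seq(min(x), max(x), length.out = nbins + 1)
  counts <- tabulate(findInterval(x, breaks, all.inside = TRUE), nbins = nbins)
  # Each dot is assumed to be one binwidth wide, so a stack of k dots is
  # k binwidths tall. It fits when k / nbins is below height / width.
  max(counts) / nbins < aspect_ratio
}

set.seed(1)
x <- runif(1e4)
fits_30  <- stack_fits(x, nbins = 30)   # few wide bins: stacks overflow
fits_200 <- stack_fits(x, nbins = 200)  # many narrow bins: stacks fit
print(c(fits_30, fits_200))
```

With 10,000 uniform values, 30 bins hold over 300 dots each, far more than the roughly 20 that fit, which is why the function keeps increasing nbins until the check passes.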