I want to achieve the following outcomes:
The best I have been able to do is as follows, but I need a more general solution where the value of maxSize is computed rather than hard-coded. If I were using traditional R plots, I would use par("pin") to work out the size of the plot area and work backwards, but I cannot figure out how to access this information with ggplot2. Any suggestions?
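library(ggplot2)
# Frequency of each class/drv combination in the mpg data set.
# as.numeric() flattens the table column by column, so class varies fastest.
agData = data.frame(
  class = rep(1:7, 3),
  drv = rep(1:3, rep(7, 3)),
  freq = as.numeric(xtabs(~class+drv, data = mpg))
)
agData = agData[agData$freq != 0,]  # drop empty combinations
rng = range(agData$freq)
mn = rng[1]
mx = rng[2]
minimumArea = mx - mn  # span of the frequencies, used to space the legend breaks
maxSize = 20  # hard-coded; this is the value I would like to compute
minSize = max(1, maxSize * sqrt(mn/mx))
# Note: scale_area() comes from older ggplot2 releases and was later deprecated.
qplot(class, drv, data = agData, size = freq) + theme_bw() +
  scale_area(range = c(minSize, maxSize),
             breaks = seq(mn, mx, minimumArea/4), limits = rng)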
Here is what it looks like so far:
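For reference, the par("pin") idea mentioned above can be sketched for base graphics as follows. This is only a rough illustration, not a ggplot2 solution: the helper name max_radius_inches and its spacing argument are hypothetical, and it assumes circles are centred one data unit apart and should just touch rather than overlap.

```r
# Hypothetical helper: largest circle radius (in inches) that keeps circles
# centred `spacing` data units apart from overlapping.
#   pin: plot-region size in inches, in the form returned by par("pin")
#   usr: axis extents c(x1, x2, y1, y2), in the form returned by par("usr")
max_radius_inches <- function(pin, usr, spacing = 1) {
  # Inches covered by one data unit along x and along y
  inches_per_unit <- pin / c(usr[2] - usr[1], usr[4] - usr[3])
  # Two touching circles share the gap between their centres, hence / 2
  spacing * min(inches_per_unit) / 2
}

# Made-up example: a 6 x 3 inch plot region spanning x in [0, 8], y in [0, 4]
example_radius <- max_radius_inches(pin = c(6, 3), usr = c(0, 8, 0, 4))
print(example_radius)
```

In a live base-graphics session the inputs would come from par("pin") and par("usr") once the axes have been set up. Nothing here reads layout information from ggplot2.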
Bubble charts show two sets of numbers as a series of XY coordinates. A third set of numbers determines the size of each data point, or bubble.
Z: The Z-axis determines the size of the bubble. The higher the value a record has for the field defined as the Z-axis, the larger the bubble appears.
When no ggplot, lattice, or other high-level package seems to do the job without hours of fine-tuning, I always revert to base graphics. The following code gets you what you want, and after it I give another example based on how I would have plotted it.
Note, however, that I have set the maximum radius to 1 cm; use size.range/2 instead to make that the maximum diameter. I just thought the radius gave me nicer plots, and you'll probably want to adjust things anyway.
size.range <- c(.1, 1) # Min and max radius of circles, in cm
# Calculate the relative radius of each circle
radii <- sqrt(agData$freq)
radii <- diff(size.range)*(radii - min(radii))/diff(range(radii)) + size.range[1]
# Plot in two panels
mar0 <- par("mar")
layout(t(1:2), widths=c(4,1))
# Panel 1: The circles
par(mar=c(mar0[1:3],.5))
symbols(agData$class, agData$drv, radii, inches=size.range[2]/cm(1), bg="black")
# Panel 2: The legend
par(mar=c(mar0[1],.5,mar0[3:4]))
symbols(c(0,0), 1:2, size.range, xlim=c(-4, 4), ylim=c(-2,4),
        inches=1/cm(1), bg="black", axes=FALSE, xlab="", ylab="")
text(0, 3, "Freq")
text(c(2,0), 1:2, range(agData$freq), col=c("black", "white"))
# Reset par settings
par(mar=mar0)
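To make the rescaling step in the code above easier to inspect on its own, here is a minimal sketch that wraps it in a function. The name rescale_radii is hypothetical and the input frequencies are made up; the arithmetic is the same as in the two radii lines above.

```r
# Hypothetical helper wrapping the rescaling step:
# take sqrt(freq), then map it linearly onto [size.range[1], size.range[2]].
rescale_radii <- function(freq, size.range = c(0.1, 1)) {
  r <- sqrt(freq)
  diff(size.range) * (r - min(r)) / diff(range(r)) + size.range[1]
}

# Made-up frequencies whose square roots are 1, 2, 3, 4, 5
example_radii <- rescale_radii(c(1, 4, 9, 16, 25))
print(example_radii)
```

Because the smallest radius is pinned to size.range[1], this mapping includes an offset, so the circle areas are no longer exactly proportional to the frequencies.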
Now follows my suggestion. The largest circle has a radius of 1 cm and the areas of the circles are proportional to agData$freq, without forcing a size on the smallest circle. Personally, I think this is easier to read (both code and figure) and looks nicer.
with(agData, symbols(class, drv, sqrt(freq),
inches=size.range[2]/cm(1), bg="black"))
with(agData, text(class, drv, freq, col="white"))
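The proportionality claim can be checked numerically with a small sketch. It assumes that symbols() scales all radii by one common factor so that the largest matches the inches value, as its documentation describes. The frequencies and variable names below are made up for illustration.

```r
# Made-up frequencies; the largest circle is given a radius of 1 cm
freq <- c(2, 8, 18)
max_radius_cm <- 1

# Radii proportional to sqrt(freq), scaled so the largest equals max_radius_cm
radius_cm <- max_radius_cm * sqrt(freq) / max(sqrt(freq))
area_cm2 <- pi * radius_cm^2

# Area per unit of frequency; identical for every circle if areas are proportional
area_per_freq <- area_cm2 / freq
print(area_per_freq)
```

Every entry of area_per_freq is the same, which is what "area proportional to frequency" means here.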