
Bubble chart for integer variables where the largest bubble has a diameter of 1 (on the x or y axis scale)?

Tags: r, ggplot2

I want to achieve the following outcomes:

  1. Rescale the bubbles so that the largest bubble has a diameter of 1 unit on whichever of the x and y axes has the more compressed scale.
  2. Rescale the bubbles so that the smallest bubble has a diameter of 1 mm.
  3. Have a legend whose first and last entries are the minimum non-zero frequency and the maximum frequency.

The best I have been able to do is as follows, but I need a more general solution where the value of maxSize is computed rather than hard-coded. If I were doing this with traditional R plots, I would use par("pin") to work out the size of the plot area and work backwards, but I cannot figure out how to access this information with ggplot2. Any suggestions?

library(ggplot2)
agData = data.frame(
  class=rep(1:7,3),
  drv = rep(1:3,rep(7,3)),
  freq = as.numeric(xtabs(~class+drv,data = mpg))
)

agData = agData[agData$freq != 0,]
rng = range(agData$freq)
mn = rng[1]
mx = rng[2]
minimumArea = mx - mn
maxSize = 20
minSize = max(1,maxSize * sqrt(mn/mx))
qplot(class,drv,data = agData, size = freq) + theme_bw() + 
  # scale_size() scales point area; it replaces scale_area(), which was removed from ggplot2
  scale_size(range = c(minSize,maxSize), 
             breaks = seq(mn,mx,minimumArea/4), limits = rng) 

Here is what it looks like so far: [image]
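For reference, the base-graphics calculation mentioned above could be sketched roughly as follows. This is only a minimal sketch: inches_per_unit and max_radius_inches are hypothetical helper names (not part of any package), and it assumes par("pin") and par("usr") are read after the plot region has been set up.

```r
# Hypothetical helper: inches of plot region per data unit on each axis.
# pin = c(width, height) of the plot region in inches, as from par("pin")
# usr = c(x1, x2, y1, y2) axis extremes in data units, as from par("usr")
inches_per_unit <- function(pin, usr) {
  c(x = pin[1] / abs(usr[2] - usr[1]),
    y = pin[2] / abs(usr[4] - usr[3]))
}

# Radius (in inches) of a circle whose diameter is 1 data unit
# on the more compressed of the two axes
max_radius_inches <- function(pin, usr) {
  min(inches_per_unit(pin, usr)) / 2
}

# Usage with a base-graphics device
pdf(NULL)                                   # off-screen device, no file written
plot(1:7, rep(1:3, length.out = 7), type = "n")
r_in <- max_radius_inches(par("pin"), par("usr"))
invisible(dev.off())
```

The value r_in could then be passed as the inches argument of symbols(). What I am missing is the equivalent of par("pin") for a ggplot2 panel.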

Asked by Tim, Jul 25 '12 20:07




1 Answer

When no ggplot2, lattice or other high-level package seems to do the job without hours of fine-tuning, I always revert to base graphics. The following code gets you what you want, and after it I have another example based on how I would have plotted it.

Note, however, that I have set the maximum radius (not diameter) to 1 cm. To work with diameters instead, just halve the values (size.range/2). I thought the radius version gave nicer plots, and you'll probably want to adjust things anyway.

size.range <- c(.1, 1) # Min and max radius of circles, in cm

# Calculate the relative radius of each circle
radii <- sqrt(agData$freq)
radii <- diff(size.range)*(radii - min(radii))/diff(range(radii)) + size.range[1]

# Plot in two panels
mar0 <- par("mar")
layout(t(1:2), widths=c(4,1))

# Panel 1: The circles
par(mar=c(mar0[1:3],.5))
symbols(agData$class, agData$drv, radii, inches=size.range[2]/cm(1), bg="black")

# Panel 2: The legend
par(mar=c(mar0[1],.5,mar0[3:4]))
symbols(c(0,0), 1:2, size.range, xlim=c(-4, 4), ylim=c(-2,4),
        inches=1/cm(1), bg="black", axes=FALSE, xlab="", ylab="")
text(0, 3, "Freq")
text(c(2,0), 1:2, range(agData$freq), col=c("black", "white"))

# Reset par settings
par(mar=mar0)
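The rescaling step in the code above can be pulled out and checked on its own. The following is a minimal sketch: rescale_radii is a hypothetical helper name, and freq_demo holds made-up frequencies rather than the counts from mpg.

```r
# Hypothetical helper mirroring the rescaling step above:
# take sqrt(freq), then map it linearly onto size.range (radii in cm)
rescale_radii <- function(freq, size.range = c(0.1, 1)) {
  r <- sqrt(freq)
  diff(size.range) * (r - min(r)) / diff(range(r)) + size.range[1]
}

freq_demo  <- c(1, 4, 9, 25)          # made-up frequencies, for illustration only
radii_demo <- rescale_radii(freq_demo)
range(radii_demo)                      # smallest and largest radius, in cm
```

Because the linear map adds an offset so that the smallest circle lands exactly on size.range[1], the circle areas are no longer strictly proportional to the frequencies.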

[Figure: Plot suggestion 1]

Now follows my suggestion. The largest circle has a radius of 1 cm and the areas of the circles are proportional to agData$freq, without forcing a size on the smallest circle. Personally, I think this is easier to read (both code and figure) and looks nicer.

with(agData, symbols(class, drv, sqrt(freq),
     inches=size.range[2]/cm(1), bg="black"))
with(agData, text(class, drv, freq, col="white"))
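To see why the areas stay proportional in this version, the radii can be worked out by hand. This is a small sketch with made-up frequencies (freq_demo is not the mpg data), and it assumes, as described above, that the inches argument fixes the radius of the largest circle.

```r
max_radius_cm <- 1                     # same as size.range[2] above
freq_demo <- c(1, 4, 9, 25)            # made-up frequencies, for illustration only

# symbols() scales all radii by a common factor, so the largest becomes 1 cm
radius_cm <- max_radius_cm * sqrt(freq_demo) / max(sqrt(freq_demo))
area_cm2  <- pi * radius_cm^2

# Area per unit of frequency is the same for every circle
area_cm2 / freq_demo
```

Since only a common scale factor is applied and no offset is added, the ratio of area to frequency is constant across circles.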

[Figure: Plot suggestion 2]

Answered by Backlin, Nov 15 '22 20:11