
Bubble Chart with ggplot2

I want to draw a bubble chart in R. The problem I run into is that both my x and my y axes are discrete, so many data points (bubbles) end up on the same coordinate. I would rather have them scattered around that coordinate, but still within a quadrant that makes clear which x/y combination each bubble belongs to.

I think it is best demonstrated by a little example. The following code should highlight the problem:

# Example data
library(ggplot2)
zz <- textConnection("Row PowerSource ProductSegment Price Model ManufacturingLocation Quantity
1 High SegmentA Low ModA LocationA 5000
2 Low SegmentB Low ModB LocationB 25000
3 High SegmentC Low ModC LocationC 15000
4 Low SegmentD High ModD LocationD 30000
5 High SegmentE High ModE LocationA 2500
6 Low SegmentA Low ModF LocationB 110000
7 High SegmentB Low ModG LocationC 20000
8 Low SegmentC Low ModH LocationD 3500
9 High SegmentD Low ModI LocationA 65500
10 Low SegmentE Low ModJ LocationB 145000
11 High SegmentA Low ModK LocationC 15000
12 Low SegmentB Low ModL LocationD 5000
13 High SegmentC Low ModM LocationA 26000
14 Low SegmentD Low ModN LocationB 14000
15 High SegmentE Mid ModO LocationC 75000
16 Low SegmentA High ModP LocationD 33000
17 High SegmentB Low ModQ LocationA 14000
18 Low SegmentC Mid ModR LocationB 33000
19 High SegmentD High ModS LocationC 95000
20 Low SegmentE Low ModT LocationD 4000
 ")
df2 <- read.table(zz, header= TRUE)
close(zz)
df2


ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_point(aes(size = Quantity, colour = Price)) + 
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

How can I scatter the bubbles a little to show the different products in each category and their quantities?

(Apologies, I can't add an image at the moment because I don't have enough reputation.)


Asked by Roman on Nov 05 '14 at 12:11




1 Answer

As Tom Martens pointed out, adjusting alpha can reveal overlapping bubbles. Setting alpha as follows:

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_point(aes(size = Quantity, colour = Price, alpha=.02)) + 
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

results in:

[image]
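To quantify the overlap before tuning alpha, it can help to count how many rows fall on each x/y combination. This is a minimal base R sketch. The two axis columns are rebuilt here from the question's table (they repeat in a fixed cycle), and the name overlap is just an illustrative choice:

```r
# Rebuild only the two axis columns of the question's df2;
# in that table the locations cycle A-D and the power source alternates High/Low.
df2 <- data.frame(
  ManufacturingLocation = rep(paste0("Location", LETTERS[1:4]), times = 5),
  PowerSource           = rep(c("High", "Low"), times = 10)
)

# Cross-tabulate: each cell is the number of bubbles drawn on that coordinate
overlap <- table(PowerSource = df2$PowerSource,
                 ManufacturingLocation = df2$ManufacturingLocation)
print(overlap)
```

For this data only four of the eight combinations are occupied, each by five models, which is why the unjittered plot shows so few distinct bubbles.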

Using geom_jitter instead of geom_point, combined with alpha:

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_jitter(aes(size = Quantity, colour = Price, alpha=.02)) + 
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

produces this:

[image]
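Note that geom_jitter moves the bubbles, but geom_text still draws the labels at the unjittered positions. One possible way around this, assuming a ggplot2 version in which position_jitter() accepts a seed argument (3.0.0 or later, as far as I know), is to give both layers the same seeded position object. The width, height and seed values below are arbitrary choices, and the data frame is rebuilt from the question's table so the snippet runs on its own:

```r
library(ggplot2)

# Columns rebuilt from the question's table (Price omitted to keep this short)
df <- data.frame(
  ManufacturingLocation = rep(paste0("Location", LETTERS[1:4]), times = 5),
  PowerSource = rep(c("High", "Low"), times = 10),
  Model = paste0("Mod", LETTERS[1:20]),
  Quantity = c(5000, 25000, 15000, 30000, 2500, 110000, 20000, 3500, 65500,
               145000, 15000, 5000, 26000, 14000, 75000, 33000, 14000, 33000,
               95000, 4000)
)

# One position object with a fixed seed, reused by both layers,
# so bubbles and labels receive the same random offsets.
pos <- position_jitter(width = 0.3, height = 0.3, seed = 42)

p <- ggplot(df, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
  geom_point(aes(size = Quantity), alpha = .2, position = pos) +
  geom_text(size = 2, position = pos) +
  scale_size(range = c(1, 15)) +
  theme_bw()
print(p)
```

This keeps the axes discrete, so no relabelling is needed, at the cost of less control over the individual offsets.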

EDIT: To avoid the alpha artefact in the legend, alpha should be set outside aes():

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_point(aes(size = Quantity, colour = Price),alpha=.2) +
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

resulting in:

[image]

and:

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_jitter(aes(size = Quantity, colour = Price),alpha=.2) +
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

resulting in:

[image]
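The reason the placement matters: anything inside aes() is treated as data and gets a scale and a legend entry, while an argument outside aes() is a fixed setting. The sketch below inspects the two layer objects to show the difference. It relies on the mapping and aes_params fields of the layer object, which are implementation details rather than a documented interface, so treat it as illustration only:

```r
library(ggplot2)

# alpha inside aes(): a mapped aesthetic, so it is scaled and shown in the legend
mapped <- geom_point(aes(alpha = .2))

# alpha outside aes(): a constant setting, applied as-is with no legend entry
fixed <- geom_point(alpha = .2)

names(mapped$mapping)    # the mapped aesthetics include "alpha"
fixed$aes_params$alpha   # the constant is stored as a layer parameter
```

In the mapped case the alpha scale, not the literal number, decides the drawn opacity.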

EDIT 2: So, this took a while to figure out.

I followed the example I linked to in my comment. I adjusted the code to suit your needs. First of all I created the jitter values outside of the plot:

df2$JitCoOr <- jitter(as.numeric(factor(df2$ManufacturingLocation)))
df2$JitCoOrPow <- jitter(as.numeric(factor(df2$PowerSource)))
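A quick base R check of what those two lines produce may be useful. It assumes the default behaviour of jitter(), where the offset is at most one fifth of the smallest gap between distinct values (0.2 for integer positions). The seed and the wider amount value are arbitrary choices for illustration:

```r
set.seed(1)  # arbitrary seed, only to make the random offsets reproducible

location <- rep(paste0("Location", LETTERS[1:4]), times = 5)
base_pos <- as.numeric(factor(location))  # integer codes 1..4
jit_pos  <- jitter(base_pos)              # default: uniform offsets within +/- 0.2

# Largest offset from the integer position; every point stays in its own cell
max(abs(jit_pos - base_pos))

# A wider spread can be requested explicitly through the amount argument
wide_pos <- jitter(base_pos, amount = 0.35)

# factor() sorts levels alphabetically, so High becomes 1 and Low becomes 2
power_levels <- levels(factor(c("Low", "High")))
power_levels
```

The alphabetical level order matters when relabelling the axes by hand: position 1 on the y axis is High, not Low.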

I then used those values as the x and y coordinates inside aes() for both geom_point and geom_text, so each label is jittered together with its bubble. Because the coordinates are now numeric, the axes lose their category labels, so I relabelled them with scale_x_continuous and scale_y_continuous. Note that factor() sorts levels alphabetically, so position 1 on the y axis is "High" and position 2 is "Low". Here is the plot code:

ggplot(df2, aes(x = JitCoOr, y = JitCoOrPow)) +
    geom_point(aes(size = Quantity, colour = Price), alpha = .5) +
    geom_text(aes(label = Model)) +
    scale_size(range = c(1, 50)) +
    # factor() sorts levels alphabetically: 1 = High, 2 = Low
    scale_y_continuous(breaks = 1:2, labels = c("High", "Low"), limits = c(0.5, 2.5)) +
    scale_x_continuous(breaks = 1:4, labels = c("Location A", "Location B", "Location C", "Location D"), limits = c(0.5, 4.5)) +
    labs(x = "ManufacturingLocation", y = "PowerSource") +
    theme_bw()

Which gives this output:

[image]

You can adjust the size of the bubbles via the range argument of scale_size above. I exported this image with dimensions of 1000*800.

Regarding your request to add borders, I think it is unnecessary. It is very clear in this plot where the bubbles belong, and I think borders would make it look a bit ugly. However, if you still want borders I'll have a look and see what I can do.
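If borders are still wanted, here is one possible sketch. It assumes that "borders" means lines separating the x/y cells (not outlines around the bubbles), and it draws them with geom_vline and geom_hline at the half-integer positions between the jittered categories. The data frame, seed and grey colour are illustrative choices, with the quantities taken from the question's table:

```r
library(ggplot2)
set.seed(1)  # arbitrary seed so the jitter is reproducible

# Jittered numeric positions: locations cycle 1..4, power source alternates 1..2
df <- data.frame(
  x = jitter(rep(1:4, times = 5)),
  y = jitter(rep(1:2, times = 10)),
  Quantity = c(5000, 25000, 15000, 30000, 2500, 110000, 20000, 3500, 65500,
               145000, 15000, 5000, 26000, 14000, 75000, 33000, 14000, 33000,
               95000, 4000)
)

p <- ggplot(df, aes(x, y)) +
  # cell borders halfway between neighbouring categories
  geom_vline(xintercept = seq(0.5, 4.5, by = 1), colour = "grey60") +
  geom_hline(yintercept = seq(0.5, 2.5, by = 1), colour = "grey60") +
  geom_point(aes(size = Quantity), alpha = .5) +
  scale_x_continuous(breaks = 1:4, labels = paste0("Location", LETTERS[1:4]),
                     limits = c(0.5, 4.5), expand = c(0, 0)) +
  # factor() would sort the levels alphabetically: 1 = High, 2 = Low
  scale_y_continuous(breaks = 1:2, labels = c("High", "Low"),
                     limits = c(0.5, 2.5), expand = c(0, 0)) +
  theme_bw() +
  theme(panel.grid = element_blank())
print(p)
```

The default grid lines are switched off here because they would run through the middle of each cell instead of between the cells.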

Answered by Docconcoct on Oct 12 '22 at 22:10