I am trying to build a stacked bar chart with varying widths, so that the width indicates the mean amount of an allocation and the height indicates the number of allocations.
Below is my reproducible data:
procedure = c("method1","method2", "method3", "method4","method1","method2", "method3", "method4","method1","method2", "method3","method4")
sector =c("construction","construction","construction","construction","delivery","delivery","delivery","delivery","service","service","service","service") 
number = c(100,20,10,80,75,80,50,20,20,25,10,4)
amount_mean = c(1,1.2,0.2,0.5,1.3,0.8,1.5,1,0.8,0.6,0.2,0.9) 
data0 = data.frame(procedure, sector, number, amount_mean)
When I use geom_bar and include width within aes, I get the following message, and the bars are no longer stacked:
position_stack requires non-overlapping x intervals
bar<-ggplot(data=data0,aes(x=sector,y=number,fill=procedure, width = amount_mean)) + 
geom_bar(stat="identity") 
I also looked at the mekko package, but it seems to be only for bar charts.
Here is what I'd like to have in the end (not based on the data above):

Any idea how to solve my problem?
I tried the same with geom_col() and ran into the same problem: with position = "stack" it seems that we can't assign a width without unstacking the bars.
But it turned out that the solution is quite simple: we can use geom_rect() to build such a plot "by hand".
Here is your data:
df <- data.frame(
  procedure   = rep(paste("method", 1:4), times = 3),
  sector      = rep(c("construction", "delivery", "service"), each = 4),
  amount      = c(100, 20, 10, 80, 75, 80, 50, 20, 20, 25, 10, 4),
  amount_mean = c(1, 1.2, 0.2, 0.5, 1.3, 0.8, 1.5, 1, 0.8, 0.6, 0.2, 0.9)
)
First I transformed your data set:
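library(dplyr)

df <- df |>
  mutate(
    amount_mean = amount_mean / max(amount_mean),  # scale widths to at most 1
    sector_num  = as.numeric(factor(sector))       # sector is character, so go through factor() to get 1, 2, 3
  ) |>
  arrange(desc(amount_mean)) |>                    # widest rectangles first, so they end up at the bottom
  group_by(sector) |>
  mutate(
    xmin = sector_num - amount_mean / 2,
    xmax = sector_num + amount_mean / 2,
    ymin = cumsum(lag(amount, default = 0)),       # top of the previous rectangle, 0 for the first one
    ymax = cumsum(amount)
  ) |>
  ungroup()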
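The ymin/ymax step above can be checked by hand. Here is a minimal base R sketch for the delivery sector alone. The vectors are copied from the data above; `ord` is just a helper name for this illustration, and `c(0, head(amount, -1))` stands in for dplyr's `lag(amount, default = 0)`.

```r
# Delivery sector only, values taken from the data above
amount      <- c(75, 80, 50, 20)
amount_mean <- c(1.3, 0.8, 1.5, 1)

# Widest rectangle first, as arrange(desc(amount_mean)) does
ord    <- order(amount_mean, decreasing = TRUE)
amount <- amount[ord]                      # 50 75 20 80

# Top edge: running total of the heights
ymax <- cumsum(amount)                     # 50 125 145 225

# Bottom edge: running total shifted by one position, starting at 0
ymin <- cumsum(c(0, head(amount, -1)))     # 0 50 125 145

print(data.frame(amount, ymin, ymax))
```

Each rectangle starts exactly where the previous one ends, which is what stacks them without gaps or overlaps.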
What I do here:
- normalize amount_mean so that 0 <= amount_mean <= 1 (better for plotting; anyway we don't have another scale to show the real values of amount_mean);
- convert the sector variable to numeric (for plotting, see below);
- sort by amount_mean in descending order (heavy means at the bottom, light means on top);
- calculate xmin, xmax to represent amount_mean, and ymin, ymax for amount. The latter two are a bit trickier. ymax is obvious: you just take the cumulative sum of amount, starting from the first one. You need a cumulative sum to calculate ymin as well, but starting from 0. So the first rectangle is plotted with ymin = 0, the second with ymin = ymax of the previous rectangle, etc. All of this is performed within each separate group of sectors.

Plot the data:
library(ggplot2)

df |>
  ggplot(aes(xmin = xmin, xmax = xmax,
             ymin = ymin, ymax = ymax, 
             fill = procedure
             )
         ) +
  geom_rect() +
  scale_x_continuous(breaks = df$sector_num, labels = df$sector) +
  #ggthemes::theme_tufte() +
  theme_bw() +
  labs(title = "Question 51136471", x = "Sector", y = "Amount") +
  theme(
    axis.ticks.x = element_blank()
  )
Result:

Another option is to keep the procedure variable from being reordered, so that all the, let's say, "reds" are at the bottom, the "greens" above them, etc. But it looks ugly:
df <- df |>
  mutate(
      amount_mean = amount_mean / max(amount_mean),
      sector_num = as.numeric(factor(sector))  # sector is character, so convert via factor()
  ) |>
  arrange(procedure, desc(amount), desc(amount_mean)) |>
  group_by(sector) |>
  mutate(
    xmin = sector_num - amount_mean / 2,
    xmax = sector_num + amount_mean / 2,
    ymin = cumsum(lag(amount, default = 0)), 
    ymax = cumsum(amount)
  ) |>
  ungroup()
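The block above stops before plotting. As a rough, self-contained sketch of the same fixed-order variant, here is a base R version of the transformation, with the plot guarded so that it only runs if ggplot2 is installed. The names `d`, `width` and `p` are mine, and `ave()` stands in for the grouped `cumsum()`; treat it as an illustration under those assumptions rather than the only way to do it.

```r
d <- data.frame(
  procedure   = rep(paste("method", 1:4), times = 3),
  sector      = rep(c("construction", "delivery", "service"), each = 4),
  amount      = c(100, 20, 10, 80, 75, 80, 50, 20, 20, 25, 10, 4),
  amount_mean = c(1, 1.2, 0.2, 0.5, 1.3, 0.8, 1.5, 1, 0.8, 0.6, 0.2, 0.9)
)

# Same procedure order in every sector: method 1 at the bottom, method 4 on top
d <- d[order(d$sector, d$procedure), ]

width        <- d$amount_mean / max(d$amount_mean)  # scale widths to at most 1
d$sector_num <- as.numeric(factor(d$sector))        # 1, 2, 3 in alphabetical order
d$xmin <- d$sector_num - width / 2
d$xmax <- d$sector_num + width / 2
d$ymax <- ave(d$amount, d$sector, FUN = cumsum)     # running total within each sector
d$ymin <- d$ymax - d$amount                         # bottom = top minus own height

# Plot only when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                     fill = procedure)) +
    geom_rect() +
    scale_x_continuous(breaks = unique(d$sector_num), labels = unique(d$sector)) +
    theme_bw()
  print(p)
}
```

Because the stacking order no longer follows the widths, narrow rectangles can sit underneath wide ones, which is why this variant looks less tidy.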
