I try to build a stacked bar chart with varying widths, so that the width indicates the mean amount of an allocation, whereas the height indicates the numbers of allocations.
Following, you'll find my reproducible data:
procedure = c("method1","method2", "method3", "method4","method1","method2", "method3", "method4","method1","method2", "method3","method4")
sector =c("construction","construction","construction","construction","delivery","delivery","delivery","delivery","service","service","service","service")
number = c(100,20,10,80,75,80,50,20,20,25,10,4)
amount_mean = c(1,1.2,0.2,0.5,1.3,0.8,1.5,1,0.8,0.6,0.2,0.9)
data0 = data.frame(procedure, sector, number, amount_mean)
When using geom_bar and including widths within aes, I get the following error message:
position_stack requires non-overlapping x intervals. Furthermore, the bars are no longer stacked.
bar<-ggplot(data=data0,aes(x=sector,y=number,fill=procedure, width = amount_mean)) +
geom_bar(stat="identity")
I also looked at the mekko-package, but it seems that this is only for bar charts.
Here is, what I'd like to have in the end (not based on above data):
Any idea how to solve my problem?
I have tried the same, geom_col()
as well but I've run to the same problem - with position = "stack"
it seems that we can't assign a width
parameter without unstacking.
But it turned up, that solution is quite simple - we can use geom_rect()
to build such plot "by hand".
There are your data:
df <- data.frame(
procedure = rep(paste("method", 1:4), times = 3),
sector = rep(c("construction", "delivery", "service"), each = 4),
amount = c(100, 20, 10, 80, 75, 80, 50, 20, 20, 25, 10, 4),
amount_mean = c(1, 1.2, 0.2, 0.5, 1.3, 0.8, 1.5, 1, 0.8, 0.6, 0.2, 0.9)
)
At first I have transformed your data set:
df <- df |>
mutate(
amount_mean = amount_mean / max(amount_mean),
sector_num = as.numeric(sector)
) |>
arrange(desc(amount_mean)) |>
group_by(sector) |>
mutate(
xmin = sector_num - amount_mean / 2,
xmax = sector_num + amount_mean / 2,
ymin = cumsum(lag(amount, default = 0)),
ymax = cumsum(amount)
) |>
ungroup()
What I do here:
amount_mean
, so the 0 >= amount_mean <= 1
(better for plotting, anyway we don't have another scale to show the real values of amount_mean
);sector
variable into numerical (for plotting, see below);amount_mean
(heavy means - at the bottom, light means on the top);xmin
, xmax
to represent the amount_mean
, and ymin
, ymax
for amount. The former two are a bit trickier. ymax
is obviouse - you just take a cumulative sum for all amount
starting from the first one. You need cumulative sum to calculate ymin
as well, but starting from 0. So the first rectangle plotted with ymin = 0
, second - with ymin = ymax
of previouse triangle etc. All of this is performed withing each separate group of sector
s.Plot the data:
df |>
ggplot(aes(xmin = xmin, xmax = xmax,
ymin = ymin, ymax = ymax,
fill = procedure
)
) +
geom_rect() +
scale_x_continuous(breaks = df$sector_num, labels = df$sector) +
#ggthemes::theme_tufte() +
theme_bw() +
labs(title = "Question 51136471", x = "Sector", y = "Amount") +
theme(
axis.ticks.x = element_blank()
)
Result:
Another option to prevent to procedure
variable to be reordered. So all let say "reds" are down, "greens" above etc. But it looks ugly:
df <- df |>
mutate(
amount_mean = amount_mean / max(amount_mean),
sector_num = as.numeric(sector)
) |>
arrange(procedure, desc(amount), desc(amount_mean)) |>
group_by(sector) |>
mutate(
xmin = sector_num - amount_mean / 2,
xmax = sector_num + amount_mean / 2,
ymin = cumsum(lag(amount, default = 0)),
ymax = cumsum(amount)
) |>
ungroup()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With