I'm trying to make a bumps chart (like parallel coordinates but with an ordinal x-axis) to show ranking over time. I can make a straight-line chart very easily:
library(ggplot2)
set.seed(47)
df <- as.data.frame(as.table(replicate(8, sample(4))), responseName = 'rank')
df$Var2 <- as.integer(df$Var2)
head(df)
#> Var1 Var2 rank
#> 1 A 1 4
#> 2 B 1 2
#> 3 C 1 3
#> 4 D 1 1
#> 5 A 2 3
#> 6 B 2 4
ggplot(df, aes(Var2, rank, color = Var1)) + geom_line() + geom_point()
Wonderful. Now, though, I want to make the connecting lines curved. Despite never having more than one y per x, geom_smooth offers some possibilities. loess seems like it should work, as it can ignore points except the closest. However, even with tweaking, the best I can get still misses lots of points and overshoots others where it should be flat:
ggplot(df, aes(Var2, rank, color = Var1)) +
  geom_smooth(method = 'loess', span = .7, se = FALSE) +
  geom_point()
I've tried a number of other splines, like ggalt::geom_xspline, but they all still overshoot or miss the points:
ggplot(df, aes(Var2, rank, color = Var1)) + ggalt::geom_xspline() + geom_point()
Is there an easy way to curve these lines? Do I need to build my own sigmoidal spline? To clarify, I'm looking for something like D3.js's d3.curveMonotoneX, which hits every point and whose local maxima and minima do not exceed the y values.
Ideally it would probably have a slope of 0 at each point, too, but that's not absolutely necessary.
Using signal::pchip with a grid of X-values works, at least in your example with numeric axes. A proper geom_ would be nice, but hey...
library(tidyverse)
library(signal)
set.seed(47)
df <- as.data.frame(as.table(replicate(8, sample(4))), responseName = 'rank')
df$Var2 <- as.integer(df$Var2)
head(df)
#> Var1 Var2 rank
#> 1 A 1 4
#> 2 B 1 2
#> 3 C 1 3
#> 4 D 1 1
#> 5 A 2 3
#> 6 B 2 4
ggplot(df, aes(Var2, rank, color = Var1)) +
  geom_line(data = df %>%
              group_by(Var1) %>%
              do({
                tibble(Var2 = seq(min(.$Var2), max(.$Var2), length.out = 100),
                       rank = pchip(.$Var2, .$rank, Var2))
              })) +
  geom_point()
Result:
Building on Henrik's answer, this wraps up pchip (I'm using the one from pracma here but the result is the same) so it can be used alongside existing smooth methods more easily:
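# Fit: wrap pracma's pchip interpolating function in an S3 class that geom_smooth can use
ggpchip = function(formula, data, weights) structure(pracma::pchipfun(data$x, data$y), class='ggpchip')
# Predict: evaluate the interpolant; there is no standard error, so the band collapses onto the fit
predict.ggpchip = function(object, newdata, se.fit=FALSE, ...) {
  fit = unclass(object)(newdata$x)
  if (se.fit) list(fit=data.frame(fit, lwr=fit, upr=fit), se.fit=fit * 0) else fit
}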
Then the actual ggplot call is straightforward:
ggplot(df, aes(Var2, rank, color=Var1)) + geom_smooth(method='ggpchip', se=F) + geom_point()
You can then use pchip to smooth other geoms, e.g. area plots:
ggplot(df, aes(Var2, rank, fill=Var1)) + stat_smooth(method='ggpchip', geom='area', position='fill')
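The question also asks about a slope of 0 at every point, which pchip only gives at local extrema (inside a monotone run the slope is nonzero). Below is a minimal base R sketch of that variant, assuming a cubic "smoothstep" easing between consecutive points is acceptable. Note that smoothstep_interp is a hypothetical helper, not part of signal, pracma, or ggplot2, and the rank values are made up for illustration:

```r
# Hypothetical helper: cubic Hermite segments with slope 0 at both ends of
# every segment, so the curve passes through each point and never overshoots.
smoothstep_interp <- function(x, y, xout) {
  stopifnot(length(x) == length(y), length(x) >= 2,
            !is.unsorted(x, strictly = TRUE),
            all(xout >= min(x) & xout <= max(x)))
  # Index of the segment [x[i], x[i + 1]] that contains each xout value
  i <- findInterval(xout, x, rightmost.closed = TRUE)
  # Relative position inside the segment, between 0 and 1
  t <- (xout - x[i]) / (x[i + 1] - x[i])
  # Smoothstep: s(0) = 0, s(1) = 1, s'(0) = s'(1) = 0
  s <- 3 * t^2 - 2 * t^3
  y[i] + (y[i + 1] - y[i]) * s
}

# Made-up ranks for one series over eight periods
x <- 1:8
y <- c(4, 3, 1, 2, 2, 4, 1, 3)
grid <- seq(min(x), max(x), length.out = 100)
curve_df <- data.frame(Var2 = grid, rank = smoothstep_interp(x, y, grid))
```

Because the argument order mirrors pchip(x, y, xi), it should be possible to swap this in for pchip in the per-group grid approach from the first answer, though I have only checked it on a single series like the one above.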