Use curved lines in bumps chart

Tags: r, ggplot2, spline

I'm trying to make a bumps chart (like parallel coordinates but with an ordinal x-axis) to show ranking over time. I can make a straight-line chart very easily:

library(ggplot2)
set.seed(47)

df <- as.data.frame(as.table(replicate(8, sample(4))), responseName = 'rank')
df$Var2 <- as.integer(df$Var2)

head(df)
#>   Var1 Var2 rank
#> 1    A    1    4
#> 2    B    1    2
#> 3    C    1    3
#> 4    D    1    1
#> 5    A    2    3
#> 6    B    2    4

ggplot(df, aes(Var2, rank, color = Var1)) + geom_line() + geom_point()

Wonderful. Now, though, I want to make the connecting lines curved. Despite never having more than one y per x, geom_smooth offers some possibilities. loess seems like it should work, as it can ignore points except the closest. However, even with tweaking, the best I can get still misses lots of points and overshoots others where it should be flat:

ggplot(df, aes(Var2, rank, color = Var1)) + 
    geom_smooth(method = 'loess', span = .7, se = FALSE) + 
    geom_point()

I've tried a number of other splines, like ggalt::geom_xspline, but they all still overshoot or miss the points:

ggplot(df, aes(Var2, rank, color = Var1)) + ggalt::geom_xspline() + geom_point()

Is there an easy way to curve these lines? Do I need to build my own sigmoidal spline? To clarify, I'm looking for something like D3.js's d3.curveMonotoneX which hits every point and whose local maxima and minima do not exceed the y values:

[Image: d3.curveMonotoneX example]

Ideally it would probably have a slope of 0 at each point, too, but that's not absolutely necessary.

647

asked May 04 '17 00:05

alistaire

2 Answers

Using signal::pchip with a grid of X-values works, at least in your example with numeric axes. A proper geom_ would be nice, but hey...

library(tidyverse)
library(signal)
set.seed(47)

df <- as.data.frame(as.table(replicate(8, sample(4))), responseName = 'rank')
df$Var2 <- as.integer(df$Var2)

head(df)
#>   Var1 Var2 rank
#> 1    A    1    4
#> 2    B    1    2
#> 3    C    1    3
#> 4    D    1    1
#> 5    A    2    3
#> 6    B    2    4

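# Interpolate each competitor's ranks on a dense grid of 100 x-values with
# pchip, then draw the interpolated curve; the points come from the raw data.
ggplot(df, aes(Var2, rank, color = Var1)) +
  geom_line(data = df %>%
              group_by(Var1) %>%
              do({
                tibble(Var2 = seq(min(.$Var2), max(.$Var2), length.out = 100),
                       rank = signal::pchip(.$Var2, .$rank, Var2))
              })) +
  geom_point()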

Result: [image of the resulting plot]
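The question also mentions wanting a slope of 0 at each point, which pchip does not promise. Here is a minimal base-R sketch of that variant, using the same grid approach. It assumes a cubic Hermite spline with every knot slope forced to zero, built with stats::splinefunH, is an acceptable substitute for pchip. The helper name flat_curve and the object curves are made up for illustration.

```r
# Rebuild the example data from the question.
set.seed(47)
df <- as.data.frame(as.table(replicate(8, sample(4))), responseName = "rank")
df$Var2 <- as.integer(df$Var2)

# Hypothetical helper: Hermite spline through (x, y) with slope 0 at every
# knot, evaluated on a dense grid of n x-values.
flat_curve <- function(x, y, n = 100) {
  f <- stats::splinefunH(x, y, m = rep(0, length(x)))
  grid <- seq(min(x), max(x), length.out = n)
  data.frame(Var2 = grid, rank = f(grid))
}

# One curve per competitor, stacked into a single data frame.
curves <- do.call(rbind, lapply(split(df, df$Var1), function(d) {
  data.frame(Var1 = d$Var1[1], flat_curve(d$Var2, d$rank))
}))
```

Because the columns keep the names Var1, Var2 and rank, curves should be usable as geom_line(data = curves) in place of the pchip data in the plot above. With zero slopes each segment is an S-shaped step between neighbouring ranks, so the curve cannot leave the range of the two points it connects.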

191

answered Sep 17 '22 01:09

Henrik Lindberg

Building on Henrik's answer, this wraps up pchip (I'm using the one from pracma here but the result is the same) so it can be used alongside existing smooth methods more easily:

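# Fit step: geom_smooth() calls method(formula, data, weights); only data$x and data$y are used.
ggpchip <- function(formula, data, weights) {
  structure(pracma::pchipfun(data$x, data$y), class = 'ggpchip')
}
# Predict step: evaluate the interpolant; the "standard error" is a zero placeholder.
predict.ggpchip <- function(object, newdata, se.fit = FALSE, ...) {
  fit <- unclass(object)(newdata$x)
  if (se.fit) list(fit = data.frame(fit, lwr = fit, upr = fit), se.fit = fit * 0) else fit
}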

Then the actual ggplot call is straightforward:

ggplot(df, aes(Var2, rank, color = Var1)) + geom_smooth(method = 'ggpchip', se = FALSE) + geom_point()

You can then use pchip to smooth other geoms, e.g. area plots:

ggplot(df, aes(Var2, rank, fill=Var1)) + stat_smooth(method='ggpchip', geom='area', position='fill')
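The same wrapper pattern should carry over to any interpolator that returns a function. As a sketch only, the version below swaps pchip for base R's stats::splinefunH with all slopes set to zero, on the assumption that a flat tangent at every point is wanted, as the question suggests. The name ggflat is hypothetical, and the functions are called by hand on toy data so the fit/predict contract can be checked without plotting.

```r
# Hypothetical wrapper, mirroring ggpchip above but with slope 0 at every knot.
ggflat <- function(formula, data, weights) {
  # formula and weights are accepted for interface compatibility but ignored.
  structure(stats::splinefunH(data$x, data$y, m = rep(0, length(data$x))),
            class = "ggflat")
}

predict.ggflat <- function(object, newdata, se.fit = FALSE, ...) {
  fit <- unclass(object)(newdata$x)
  if (se.fit) {
    # An interpolant has no uncertainty estimate; return a zero-width band.
    list(fit = data.frame(fit, lwr = fit, upr = fit), se.fit = fit * 0)
  } else {
    fit
  }
}

# Exercise the contract by hand on toy data.
toy <- data.frame(x = 1:4, y = c(2, 4, 1, 3))
model <- ggflat(y ~ x, toy)
predict(model, data.frame(x = c(1, 1.5, 2)))
#> [1] 2 3 4
```

If this behaves like ggpchip inside ggplot2, geom_smooth(method = ggflat, se = FALSE) would be the corresponding call. Passing the function itself rather than its name as a string avoids relying on how the name is looked up.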

answered Sep 21 '22 01:09

Charles
