
Connecting across missing values with geom_line

Tags: r, ggplot2

I'm trying to figure out if it's possible to connect across missing values using geom_line. For example, in the link below there are missing values at time 3 in facet F. I'd like a line to connect time 2 and 4 in that case. Is there a way to achieve this?

https://farm8.staticflickr.com/7061/6964089563_b150e0c2a6.jpg

I have a data frame of cumulative values like so:

    head(cumulative)
      individual series Time     Value
    1          A      x    1 -1.008821
    2          A      x    2 -2.273712
    3          A      x    3 -3.430610
    4          A      x    4 -4.618860
    5          A      x    5 -4.893075
    6          A      x    6 -5.836532

Which I'm plotting with:

    ggplot(cumulative, aes(x = Time, y = Value, shape = series)) +
      geom_point() +
      geom_line(aes(linetype = series)) +
      facet_wrap(~ individual, ncol = 3)

asked Mar 08 '12 12:03 by stuwest



2 Answers

Richie's answer is very thorough, but I wanted to show something simpler. Since lines are not drawn to NA points, another approach is to drop these points when drawing the lines. This implicitly makes a linear interpolation between the remaining points (as straight lines do).

Using dfr from Richie's answer, and skipping the step that calculates z:

    ggplot(dfr, aes(x, y)) +
      geom_point() +
      geom_line(data = dfr[!is.na(dfr$y), ])

For that matter, in this case the subsetting could be applied to the data for the whole plot.

    ggplot(dfr[!is.na(dfr$y), ], aes(x, y)) +
      geom_point() +
      geom_line()
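
Applied to the faceted plot in the question, the same idea might look like the sketch below. The cumulative data frame here is a made-up stand-in (random cumulative sums, with Time 3 set to NA for individual F), not the asker's real data, and complete is just an illustrative name. Only the plotting calls mirror the question.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's `cumulative` data frame
set.seed(1)
cumulative <- expand.grid(
  Time = 1:6,
  series = c("x", "y"),
  individual = LETTERS[1:6]
)
# Running total within each individual/series combination
cumulative$Value <- ave(
  rnorm(nrow(cumulative)),
  cumulative$individual, cumulative$series,
  FUN = cumsum
)
# Knock out Time 3 in facet F, as in the question's screenshot
cumulative$Value[cumulative$individual == "F" & cumulative$Time == 3] <- NA

# Rows with an observed Value; only these are handed to geom_line()
complete <- cumulative[!is.na(cumulative$Value), ]

p <- ggplot(cumulative, aes(x = Time, y = Value, shape = series)) +
  geom_point(na.rm = TRUE) +                           # skip NA points quietly
  geom_line(data = complete, aes(linetype = series)) + # line spans Time 2 to 4
  facet_wrap(~ individual, ncol = 3)

print(p)
```

Because the line layer never sees the NA rows, each series in facet F should get one unbroken line from Time 2 straight to Time 4.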

answered Sep 24 '22 18:09 by Brian Diggs


Lines aren't drawn if a value is NA. You need to replace these values by interpolating across the missing points. There are many different interpolation algorithms; experiment with several and see which one suits your data best. This example uses linear interpolation via interp1 in the pracma package.

Sample data:

    dfr <- data.frame(
      x = 1:10,
      y = runif(10)
    )
    dfr[c(3, 6, 7), "y"] <- NA

Interpolation step:

    library(pracma)  # provides interp1()
    dfr$z <- with(dfr, interp1(x, y, x, "linear"))
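
If you'd rather not add a dependency, base R's approx() and spline() (both in the stats package) can do a similar job, and putting them side by side is one easy way to compare algorithms. This is a minimal sketch, not part of the original answer: it uses fixed, made-up y values instead of runif() so the result is reproducible, and z_linear and z_spline are illustrative column names.

```r
# Made-up sample data with gaps at x = 3, 6 and 7
dfr <- data.frame(
  x = 1:10,
  y = c(0.2, 0.5, NA, 0.9, 0.4, NA, NA, 0.7, 0.3, 0.6)
)

# Interpolate from the observed points only
ok <- !is.na(dfr$y)

# Linear interpolation: straight segments between neighbouring observations
dfr$z_linear <- approx(dfr$x[ok], dfr$y[ok], xout = dfr$x)$y

# Cubic spline interpolation: a smooth curve through the same observations
dfr$z_spline <- spline(dfr$x[ok], dfr$y[ok], xout = dfr$x)$y

print(dfr)
```

Both columns reproduce the observed y values and differ only at the gaps, so plotting them against each other shows how much the choice of method matters for your data.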

Compare plots:

    ggplot(dfr, aes(x, y)) + geom_line()
    ggplot(dfr, aes(x, z)) + geom_line()

If you are showing this graph to other people, make sure that you clearly mark the places where you've synthesised data by interpolating (maybe using dotted lines).


Update based on comment:
You can specify different aesthetics for different geoms.

    ggplot(dfr, aes(x)) +
      geom_point(aes(y = y)) +
      geom_line(aes(y = z))

To incorporate different line types for missing/non-missing y, you can do something like

    ggplot(dfr, aes(x)) +
      geom_point(aes(y = y)) +
      geom_line(aes(y = y)) +
      geom_line(aes(y = z), linetype = "dotted")
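
With several individuals and series, as in the question, the interpolation has to happen within each group so that values from one facet never leak into another. Here is a rough sketch of that, assuming base R's approx() in place of interp1. The cumulative data is a made-up stand-in for the asker's, and Filled is a hypothetical column name.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: two individuals, two series
set.seed(42)
cumulative <- expand.grid(
  Time = 1:6,
  series = c("x", "y"),
  individual = c("E", "F")
)
cumulative$Value <- ave(
  rnorm(nrow(cumulative)),
  cumulative$individual, cumulative$series,
  FUN = cumsum
)
cumulative$Value[cumulative$individual == "F" & cumulative$Time == 3] <- NA

# Interpolate separately for every individual/series combination
cumulative$Filled <- cumulative$Value
groups <- split(
  seq_len(nrow(cumulative)),
  list(cumulative$individual, cumulative$series)
)
for (idx in groups) {
  ok <- !is.na(cumulative$Value[idx])
  cumulative$Filled[idx] <- approx(
    cumulative$Time[idx][ok],
    cumulative$Value[idx][ok],
    xout = cumulative$Time[idx]
  )$y
}

# Lines follow the filled-in values; points show only what was observed
p <- ggplot(cumulative, aes(x = Time, shape = series)) +
  geom_line(aes(y = Filled, linetype = series)) +
  geom_point(aes(y = Value), na.rm = TRUE) +
  facet_wrap(~ individual, ncol = 3)

print(p)
```

Leaving the points on the original Value column means the interpolated positions show up as a line with no marker, which is one way of flagging the synthesised values.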

answered Sep 22 '22 18:09 by Richie Cotton