 

ggplot2 time series plotting: how to omit periods when there are no data points?

Tags:

r

ggplot2

I have a time series with multiple days of data. Between consecutive days there is a period with no data points. How can I omit these periods when plotting the time series with ggplot2?

An artificial example is shown below. How can I get rid of the two periods where there is no data?

code:

library(ggplot2)

# Three days of data, 100 one-minute observations per day
Time = Sys.time()+(seq(1,100)*60+c(rep(1,100)*3600*24, rep(2, 100)*3600*24, rep(3, 100)*3600*24))
Value = rnorm(length(Time))
g <- ggplot() 
g <- g + geom_line (aes(x=Time, y=Value))
g


billlee1231 asked Jan 03 '13 10:01




2 Answers

First, create a grouping variable. Here, a new group starts whenever the time difference between consecutive observations is larger than 1 minute:

Group <- c(0, cumsum(diff(Time) > 1))
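
The comparison `diff(Time) > 1` works here because `diff()` on a `POSIXct` vector picks its unit automatically from the smallest difference, which is minutes for this data. A minimal sketch that makes the unit explicit instead, assuming one-minute sampling as in the question (the name `gap_secs` and the fixed start time are illustrative assumptions, not part of the original answer):

```r
# Rebuild the question's timestamps from a fixed start so the example is
# reproducible (the start time is an arbitrary assumption)
start <- as.POSIXct("2013-01-03 10:00:00", tz = "UTC")
Time <- start + (seq(1, 100) * 60 + rep(1:3, each = 100) * 3600 * 24)

# Gaps between consecutive observations, explicitly in seconds
gap_secs <- as.numeric(diff(Time), units = "secs")

# A new group starts whenever the gap exceeds one minute
Group <- c(0, cumsum(gap_secs > 60))

# Number of observations per group
table(Group)
```

With data sampled at a different interval, the threshold of 60 seconds would need to be adjusted accordingly.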

Now three distinct panels can be created using facet_grid with the argument scales = "free_x", so that each panel covers only the time range of its own group:

library(ggplot2)
g <- ggplot(data.frame(Time, Value, Group)) + 
  geom_line (aes(x=Time, y=Value)) +
  facet_grid(~ Group, scales = "free_x")
# Display the plot
g

Sven Hohenstein answered Oct 06 '22 12:10


The problem is: how would ggplot2 know that you have missing values? I see two options:

  1. Pad out your time series with NA values
  2. Add an additional variable representing a "group". For example,

    dd = data.frame(Time, Value)
    ##type contains three distinct values
    dd$type = factor(cumsum(c(0, as.numeric(diff(dd$Time) - 1))))
    
    ##Plot, but use the group aesthetic
    ggplot(dd, aes(x=Time, y=Value)) +
          geom_line (aes(group=type))
    

    This draws one line per group, so no line is drawn across the gaps. The empty periods still take up space on the time axis.
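
Option 1 is not shown in code above. A minimal sketch of it, assuming one-minute sampling as in the question: add one row with an `NA` value inside each gap, so that `geom_line()` breaks the line there. The names `gap_after`, `pad` and `padded`, the fixed start time, and the seed are illustrative assumptions.

```r
# Reproducible version of the question's data (start time and seed are
# arbitrary assumptions)
start <- as.POSIXct("2013-01-03 10:00:00", tz = "UTC")
Time <- start + (seq(1, 100) * 60 + rep(1:3, each = 100) * 3600 * 24)
set.seed(1)
dd <- data.frame(Time = Time, Value = rnorm(length(Time)))

# Index of the last observation before each gap longer than one minute
gap_after <- which(as.numeric(diff(dd$Time), units = "secs") > 60)

# One padding row per gap: a timestamp one minute into the gap, value missing
pad <- data.frame(Time = dd$Time[gap_after] + 60, Value = NA_real_)

# Combine and restore time order so the NA rows sit between the days
padded <- rbind(dd, pad)
padded <- padded[order(padded$Time), ]

# geom_line() breaks the line wherever Value is NA.
# The guard only keeps the data preparation runnable without ggplot2 installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(padded, aes(x = Time, y = Value)) + geom_line()
  print(p)
}
```

A single `NA` row per gap is enough to interrupt the line; padding every missing minute would give the same picture with a larger data frame.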

csgillespie answered Oct 06 '22 14:10