I have a time series with multiple days of data. Between each pair of days there is one period with no data points. How can I omit these periods when plotting the time series using ggplot2?
An artificial example is shown below. How can I get rid of the two periods where there is no data?
code:
library(ggplot2)
# Three days of data, 100 one-minute samples per day
Time = Sys.time()+(seq(1,100)*60+c(rep(1,100)*3600*24, rep(2, 100)*3600*24, rep(3, 100)*3600*24))
Value = rnorm(length(Time))
g <- ggplot()
g <- g + geom_line(aes(x=Time, y=Value))
g
ggplot2 is somewhat more accommodating of missing values than R generally. For those stats which require complete data, missing values are automatically removed with a warning.
ggplot2 works with data frames, so matrix data must first be converted into data frame form, with one measurement in each row. The melt function in the reshape2 package converts a matrix to this “long” form. Note that ggplot2 can use either numerical or categorical (factor) data as x and y coordinates.
First, create a grouping variable. Here, a new group starts whenever the time difference between consecutive points is larger than 1 minute:
Group <- c(0, cumsum(diff(Time) > 1))
Now three distinct panels can be created using facet_grid and the argument scales = "free_x":
library(ggplot2)
g <- ggplot(data.frame(Time, Value, Group)) +
geom_line (aes(x=Time, y=Value)) +
facet_grid(~ Group, scales = "free_x")
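The grouping step above compares diff(Time) with 1 and so depends on the unit that diff() picks for the result. The following is a minimal sketch of the same idea with the unit made explicit. The 5-minute threshold and the names gap_mins, max_gap_mins and dd are assumptions chosen for illustration, not part of the original answer.

```r
library(ggplot2)

# Same artificial data as in the question: 3 days x 100 one-minute samples.
Time  <- Sys.time() + (seq(1, 100) * 60 + rep(1:3, each = 100) * 3600 * 24)
Value <- rnorm(length(Time))

# Gaps between consecutive samples, explicitly converted to minutes.
gap_mins <- as.numeric(diff(Time), units = "mins")

# Assumed threshold: a gap longer than 5 minutes starts a new group.
max_gap_mins <- 5
Group <- c(0, cumsum(gap_mins > max_gap_mins))

dd <- data.frame(Time, Value, Group)

# One panel per group; scales = "free_x" lets each panel keep its own x range,
# so the empty periods between days are not drawn.
g <- ggplot(dd, aes(x = Time, y = Value)) +
  geom_line() +
  facet_grid(~ Group, scales = "free_x")
print(g)
```

For real data the threshold would need to be set somewhere between the normal sampling interval and the shortest gap that should count as a break.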
The problem is: how does ggplot2 know that you have missing values? I see two options:
1. Pad out the gaps in the time series with NA values.
2. Add an additional variable representing a "group". For example,
dd = data.frame(Time, Value)
##type contains three distinct values
dd$type = factor(cumsum(c(0, as.numeric(diff(dd$Time) - 1))))
##Plot, but use the group aesthetic
ggplot(dd, aes(x=Time, y=Value)) +
geom_line (aes(group=type))
gives a plot in which each day is drawn as its own line, with no segment joining one day to the next.
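The other option mentioned above, padding the gaps with NA values, could be sketched as follows. This is only an illustration under stated assumptions: the 5-minute gap threshold, the one-second offset for the padding rows, and the names gap_mins, before_gap, pad and padded are all hypothetical. It relies on geom_line breaking a line where an NA occurs in the middle of it.

```r
library(ggplot2)

# Same artificial data as in the question: 3 days x 100 one-minute samples.
Time  <- Sys.time() + (seq(1, 100) * 60 + rep(1:3, each = 100) * 3600 * 24)
Value <- rnorm(length(Time))
dd <- data.frame(Time, Value)

# Index of the last sample before each gap (assumed 5-minute threshold).
gap_mins   <- as.numeric(diff(dd$Time), units = "mins")
before_gap <- which(gap_mins > 5)

# One NA-valued row placed one second after the last sample before each gap.
pad <- data.frame(Time = dd$Time[before_gap] + 1, Value = NA_real_)

# Append the padding rows and restore time order.
padded <- rbind(dd, pad)
padded <- padded[order(padded$Time), ]

# The NA rows split the line into three pieces, one per day.
p <- ggplot(padded, aes(x = Time, y = Value)) + geom_line()
print(p)
```

As with the group aesthetic, this only stops the line from being drawn across the gaps; the empty periods still take up space on the x axis.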