I have the following data and I would like to apply diff() only across consecutive days.
diff(data$ch, differences = 1, lag = 1) returns the differences between all adjacent values of ch
(23-12, 4-23, 78-4, 120-78, 94-120, ...). I would like diff() to return NA
wherever the dates are not consecutive. The output I am trying to obtain from the data below is:
11, -19, 74, NA, -26, NA, -34, 39, NA
Does anyone know how I can do that?
Date ch
2013-01-01 12
2013-01-02 23
2013-01-03 4
2013-01-04 78
2013-01-10 120
2013-01-11 94
2013-02-26 36
2013-02-27 2
2013-02-28 41
2003-03-05 22
You can do this in base R, without installing any external packages.
Assuming that the 'Date' column is of class Date, we take the diff of 'Date' and, based on whether the absolute difference between adjacent elements is greater than 1, create a grouping index ('indx') by taking the cumulative sum (cumsum) of the resulting logical vector.
indx <- cumsum(c(TRUE,abs(diff(df1$Date))>1))
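To see what the grouping index looks like, here is a small sketch that rebuilds the dates from the question and prints the intermediate values. The names 'dates' and 'gaps' are only for illustration, and as.numeric() is added to work with plain day counts.

```r
# Dates from the question, including the out-of-order 2003 entry
dates <- as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04",
                   "2013-01-10", "2013-01-11", "2013-02-26", "2013-02-27",
                   "2013-02-28", "2003-03-05"))

# TRUE wherever the next date is more than one day away (in either direction)
gaps <- abs(as.numeric(diff(dates))) > 1
print(gaps)
# FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE TRUE

# Every TRUE starts a new group; the leading TRUE opens group 1
indx <- cumsum(c(TRUE, gaps))
print(indx)
# 1 1 1 1 2 2 3 3 3 4
```

Each run of consecutive days gets its own group number, which is what lets the next step compute differences within a run only.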
In the second step, we use ave with 'indx' as the grouping vector and take the diff of 'ch' within each group. Within a group, diff returns one element fewer than it was given, so we append NA to keep the lengths the same. The result therefore has one value per row, ending with an NA for the last row.
ave(df1$ch, indx, FUN=function(x) c(diff(x),NA))
#[1] 11 -19 74 NA -26 NA -34 39 NA NA
df1 <- structure(list(Date = structure(c(15706, 15707, 15708, 15709,
15715, 15716, 15762, 15763, 15764, 12116), class = "Date"), ch = c(12L,
23L, 4L, 78L, 120L, 94L, 36L, 2L, 41L, 22L)), .Names = c("Date",
"ch"), row.names = c(NA, -10L), class = "data.frame")
The following simply "...returns NA when the dates are not consecutive", unless there are tricky cases that it doesn't account for:
replace(diff(df1$ch), abs(diff(df1$Date)) > 1, NA)
#[1] 11 -19 74 NA -26 NA -34 39 NA
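One possible tricky case is a repeated date (a gap of 0 days), which the "> 1" test would let through. As a rough sketch, the same idea can be wrapped in a small helper that keeps a difference only when the gap is exactly one day. The name diff_consecutive is hypothetical and not part of base R.

```r
# Hypothetical helper: diff of `x`, with NA wherever neighbouring dates
# are not exactly one day apart (covers gaps, repeats and out-of-order dates)
diff_consecutive <- function(x, dates) {
  stopifnot(length(x) == length(dates))
  gap <- as.numeric(diff(as.Date(dates)))  # day gaps between neighbouring rows
  d <- diff(x)
  d[is.na(gap) | gap != 1] <- NA
  d
}

# Data from the question
dates <- as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04",
                   "2013-01-10", "2013-01-11", "2013-02-26", "2013-02-27",
                   "2013-02-28", "2003-03-05"))
ch <- c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22)

res <- diff_consecutive(ch, dates)
print(res)
# 11 -19 74 NA -26 NA -34 39 NA
```

For the question's data this gives the same nine values as the replace() call; the two only differ when a date is repeated.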
Try this with the lubridate and dplyr packages.
If you don't have them, run this once: install.packages("dplyr"); install.packages("lubridate")
Code
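library(lubridate)
library(dplyr)
# Parse the character dates (see "Data" below) into Date objects
data$Date <- ymd(data$Date)
# Difference from the previous row, or NA when the previous row is not the day before
data2 <- data %>% mutate(diff = ifelse(Date == lag(Date) + days(1), ch - lag(ch), NA))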
Data
data <-
data.frame(Date=c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04", "2013-01-10",
"2013-01-11", "2013-01-26", "2013-01-27", "2013-01-28", "2013-03-05"),
ch=c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22))
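Note that this version puts each difference on the later of the two rows, so the result has one value per row and starts with NA. If installing packages is not an option, roughly the same row-aligned logic can be sketched in base R. The helper names prev_date and prev_ch are only for illustration.

```r
# Same sample data as in the "Data" block above, with the dates parsed up front
data <- data.frame(
  Date = as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04",
                   "2013-01-10", "2013-01-11", "2013-01-26", "2013-01-27",
                   "2013-01-28", "2013-03-05")),
  ch = c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22)
)

# Shift both columns down by one row, padding the first row with NA
prev_date <- c(as.Date(NA), head(data$Date, -1))
prev_ch   <- c(NA, head(data$ch, -1))

# Keep the difference only when the previous row is exactly one day earlier
data$diff <- ifelse(as.numeric(data$Date - prev_date) == 1, data$ch - prev_ch, NA)
print(data$diff)
# NA 11 -19 74 NA -26 NA -34 39 NA
```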