I would like to calculate the consecutive differences between the rows of a variable in an R data frame, just as the diff() function does, but only between rows that share the same id number.
Dummy data:
id <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2)
t1 <- c(4,3,4,4,5,8,9,11,3,8,9,7,9,10,3)
df <- data.frame(id, t1)
Using diff(df$t1) I get:
-1 1 0 1 3 1 2 -8 5 1 -2 2 1 -7
I would like:
-1 1 0 1 3 1 2 NA 5 1 -2 2 1 -7
I tried also:
df%>%
group_by(id)%>%
diff(df$t1)
But I get the error:
Error in diff.default(., df$t1) : 'lag' and 'differences' must be integers >= 1
Any ideas?
A base R one-liner could be:
ave(df$t1, df$id, FUN = function(x) c(x[-1], NA) - x)
#[1] -1 1 0 1 3 1 2 NA 5 1 -2 2 1 -7 NA
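Note that the ave() result has one value per row (15 here), with an NA closing each group, whereas the question asks for a vector the same length as diff() (14 here) with NA only where the id changes. A minimal sketch of that variant follows. It assumes the rows are already sorted so that each id forms one contiguous block, and the name d is only illustrative.

```r
id <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2)
t1 <- c(4,3,4,4,5,8,9,11,3,8,9,7,9,10,3)
df <- data.frame(id, t1)

# Plain consecutive differences across the whole column
d <- diff(df$t1)

# diff(df$id) is non-zero exactly where two neighbouring rows
# belong to different ids, so blank out those positions
d[diff(df$id) != 0] <- NA
d
#[1] -1  1  0  1  3  1  2 NA  5  1 -2  2  1 -7
```

If the data are not sorted by id, ordering them first (for example with df[order(df$id), ]) would be needed for this to make sense.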
If you are using dplyr, diff() needs to go inside a mutate() call rather than being piped to directly. However, diff() returns a vector that is one element shorter than its input, which makes it awkward to keep the same number of rows. An alternative is to use dplyr's lead() function to grab the "next" value within the group:
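library(dplyr)
df %>%
  group_by(id) %>%
  # next value within the same id minus the current value; NA on each group's last row
  mutate(diff = lead(t1) - t1)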
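If you would rather keep diff() itself, one possible sketch is to pad its result with an NA inside mutate() so that each group keeps its original number of rows. The column name t1_diff and the object res are assumptions made for illustration, not part of the original question.

```r
library(dplyr)

id <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2)
t1 <- c(4,3,4,4,5,8,9,11,3,8,9,7,9,10,3)
df <- data.frame(id, t1)

res <- df %>%
  group_by(id) %>%
  # diff() yields n - 1 values per group; the trailing NA restores length n
  mutate(t1_diff = c(diff(t1), NA)) %>%
  ungroup()

res$t1_diff
```

This should give the same values as the lead() version; the padding also covers a group with a single row, where diff() returns an empty vector and the result is just NA.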