
R: Conditional diff function

Tags: r, diff, group-by

I would like to calculate the consecutive differences between the rows of a variable in an R data frame, exactly as the diff() function does, but only between rows with the same id number.

Dummy data:

id <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2)

t1 <- c(4,3,4,4,5,8,9,11,3,8,9,7,9,10,3)

df <- data.frame(id, t1)

Using diff(df$t1) I get:

-1 1 0 1 3 1 2 -8 5 1 -2 2 1 -7

I would like (NA where the difference would cross from id 1 to id 2):

-1 1 0 1 3 1 2 NA 5 1 -2 2 1 -7

I tried also:

df%>%
  group_by(id)%>%
  diff(df$t1)

But I get the error:

Error in diff.default(., df$t1) : 'lag' and 'differences' must be integers >= 1

Any ideas?

Anna asked Dec 13 '25 03:12


2 Answers

A base R one-liner could be:

ave(df$t1, df$id, FUN = function(x) c(x[-1], NA) - x)
#[1] -1  1  0  1  3  1  2 NA  5  1 -2  2  1 -7 NA
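The same per-group idea can also be written with diff() itself, padding each group's result with NA. This is only a sketch on the dummy data from the question; the names d_next and d_prev are made up for illustration, and whether the NA should trail or lead depends on which row the difference is meant to belong to.

```r
# Dummy data from the question
id <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2)
t1 <- c(4,3,4,4,5,8,9,11,3,8,9,7,9,10,3)
df <- data.frame(id, t1)

# Trailing NA: each row holds (next value - current value) within its id
d_next <- ave(df$t1, df$id, FUN = function(x) c(diff(x), NA))

# Leading NA: each row holds (current value - previous value) within its id
d_prev <- ave(df$t1, df$id, FUN = function(x) c(NA, diff(x)))

print(d_next)
print(d_prev)
```

Both vectors have one entry per row of df, so either can be assigned back as a new column.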
Rui Barradas answered Dec 14 '25 16:12


If you are using dplyr, diff() has to go inside a mutate() call. However, diff() returns a vector one element shorter than its input, which makes it awkward to keep the same number of rows. An alternative is to use dplyr's lead() function to grab the "next" value within each group:

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(diff = lead(t1) - t1)  # NA on the last row of each id
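If diff() itself is preferred, one possible sketch is to pad its result with NA inside mutate() so that every group keeps its original row count. This assumes the dplyr package is installed; the column name diff_pad and the object res are made up for illustration.

```r
library(dplyr)

# Dummy data from the question
id <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2)
t1 <- c(4,3,4,4,5,8,9,11,3,8,9,7,9,10,3)
df <- data.frame(id, t1)

res <- df %>%
  group_by(id) %>%
  # diff() yields n - 1 values per group; the trailing NA restores length n
  mutate(diff_pad = c(diff(t1), NA)) %>%
  ungroup()

print(res$diff_pad)
```

On this data the padded column should match the lead(t1) - t1 version, since both subtract the current value from the next one within each id.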
MrFlick answered Dec 14 '25 16:12


