dplyr lag of different group

Question

I am trying to use dplyr to mutate both a column containing the samegroup lag of a variable as well as the lag of (one of) the other group(s). Edit: Sorry, in the first edition, I messed up the order a bit by rearranging by date at the last second.

Original df

This is what my desired result would look like:

Desired Outcome df Here is a minimal code example:

library(tidyverse)

set.seed(2)
df <-
  data.frame(
    x =  sample(seq(as.Date('2000/01/01'), as.Date('2015/01/01'), by="day"), 10),
    group = sample(c("A","B"),10,replace = T),
    value = sample(1:10,size=10)
  ) %>% arrange(x)

df <- df %>%
  group_by(group) %>%
  mutate(own_lag = lag(value))


df %>% data.frame(other_lag = c(NA,1,2,7,7,9,10,10,8,6))

Thank you very much!

Cath · Accepted Answer

A solution with data.table:

library(data.table)

# to create own lag: 
setDT(df)[, own_lag:=c(NA, head(value, -1)), by=group]

# to create other group lag: (the function works actually outside of data.table, in base R, see N.B. below)
df[, other_lag:=sapply(1:.N, 
                       function(ind) {
                          gp_cur <- group[ind]
                          if(any(group[1:ind]!=gp_cur)) tail(value[1:ind][group[1:ind]!=gp_cur], 1) else NA
                       })]

df
 #            x group value own_lag other_lag
 #1: 2001-12-08     B     1      NA        NA
 #2: 2002-07-09     A     2      NA         1
 #3: 2002-10-10     B     7       1         2
 #4: 2007-01-04     A     5       2         7
 #5: 2008-03-27     A     9       5         7
 #6: 2008-08-06     B    10       7         9
 #7: 2010-07-15     A     4       9        10
 #8: 2012-06-27     A     8       4        10
 #9: 2014-02-21     B     6      10         8
#10: 2014-02-24     A     3       8         6

Explanation of other_lag determination: The idea is, for each observation, to look at the group value, if there is any group value different from current one, previous to current one, then take the last value, else, put NA.

N.B.: other_lag can be created without the need of data.table:

df$other_lag <- with(df, sapply(1:nrow(df), 
                                function(ind) {
                                 gp_cur <- group[ind]
                                 if(any(group[1:ind]!=gp_cur)) tail(value[1:ind][group[1:ind]!=gp_cur], 1) else NA
                               }))

dplyr lag of different group

Tags:

r

dplyr

Marcel Schliebs

1 Answers

Cath

Recent Activity

Donate For Us

dplyr lag of different group

Tags:

r

dplyr

Marcel Schliebs

1 Answers

Cath

Related questions

Recent Activity

Donate For Us