I've the following data:
ID GROUP DATE
A GR1 12/01/2013
A GR1 09/04/2014
A GR1 01/03/2015
A GR2 04/04/2015
A GR2 08/21/2015
A GR1 01/05/2016
A GR1 06/28/2016
B GR2 11/01/2013
B GR2 06/04/2014
B GR2 04/15/2015
B GR3 11/04/2015
B GR2 03/21/2016
B GR2 07/05/2016
B GR1 06/28/2016
C GR2 01/16/2014
C GR2 06/04/2014
C GR2 04/15/2015
C GR3 11/04/2015
C GR2 03/21/2016
C GR2 06/05/2016
C GR1 06/28/2016
I want to get the difference the person remained in each group. So the new table will look like the following:
ID GROUP DATE Diff
A GR1 12/01/2013
A GR1 09/04/2014
A GR1 01/03/2015 398
A GR2 04/04/2015
A GR2 08/21/2015 139
A GR1 01/05/2016
A GR1 06/28/2016 175
B GR2 11/01/2013
B GR2 06/04/2014
B GR2 04/15/2015 530
B GR3 11/04/2015
B GR2 03/21/2016
B GR2 07/05/2016 106
B GR1 06/28/2016
C GR2 01/16/2014
C GR2 06/04/2014
C GR2 04/15/2015 454
C GR3 11/04/2015
C GR2 03/21/2016
C GR2 01/05/2016 76
C GR1 06/28/2016
THe value in column "Diff" 398 is coming by taking difference '01/03/2015' - '12/1/2013'. Similarly all other difference.
Now my question is how to get this difference? I can't take max(date)-min(date) at each group, because group are repeating at different period. Similarly I can't take first dot and last dot as in SAS.
I'll be extremely grateful if someone help me with the solution. I would prefer the solution in SAS as the data size is very large. SO will not hold in memory.
Regards,
library(dplyr)
library(data.table)
df$xxx = rleidv(df[, c("ID","GROUP"),with = FALSE ])
df$DATE = as.Date(df$DATE, format = "%m/%d/%Y")
df %>% group_by(xxx) %>% mutate(diff = max(DATE) - min(DATE)) %>%
ungroup(xxx) %>% mutate(xxx = NULL)
# ID GROUP DATE diff
# <chr> <chr> <date> <time>
#1 A GR1 2013-12-01 398 days
#2 A GR1 2014-09-04 398 days
#3 A GR1 2015-01-03 398 days
#4 A GR2 2015-04-04 139 days
#5 A GR2 2015-08-21 139 days
#6 A GR1 2016-01-05 175 days
#7 A GR1 2016-06-28 175 days
#8 B GR2 2013-11-01 530 days
#9 B GR2 2014-06-04 530 days
#10 B GR2 2015-04-15 530 days
using only data.table
:
library(data.table)
df[, diff := max(DATE)-min(DATE),by = c("xxx")][,xxx:=NULL]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With