
How to check variables on a certain date +/- 2 days in R

Tags:

r

I really hope someone can help me with this question, because I've been struggling for some time. My data looks like this:

ID DATE        VAR1       VAR2  
01 2018-07-27      0         0  
01 2018-07-28      0         0  
01 2018-07-29      0         1  
01 2018-07-30      0         1  
01 2018-07-31      0         1  
01 2018-08-01      0         0
02 2018-09-30      1         0  
02 2018-10-01      0         0  
02 2018-10-02      0         1  
02 2018-10-03      1         1  
02 2018-10-04      1         1  
02 2018-10-05      0         1  
02 2018-10-06      0         0  
02 2018-10-07      0         0  
02 2018-10-08      0         0  
02 2018-10-10      0         0  
02 2018-10-12      0         0  
02 2018-10-13      0         0 
02 2018-10-14      0         0  
02 2018-10-15      1         0  
02 2018-10-18      1         0  
02 2018-10-19      0         0  
02 2018-10-20      0         0 
02 2018-10-26      0         0  
02 2018-10-28      0         0  
02 2018-11-02      0         1 

I want to know, for each ID, whether VAR1 was present within +/- 2 days of the first day VAR2 was present. I would like to store the answers in a new data frame, like this:

ID PRESENT
01 0  
02 1   

Does anyone know how to do this? VAR2 is the menstrual cycle. For some IDs I have data on multiple menstruations. If VAR1 was present within +/- 2 days of the first day of any one of those menstruations, I want that ID to come out positive.

Thanks in advance!

Iris asked Dec 07 '25 10:12


1 Answer

One way of going about it, though there is probably a neater approach:

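library(dplyr)

df %>%
  group_by(ID) %>%
  # lag() looks at the previous row, so rows must be in date order within each ID
  mutate(
    DATE = as.Date(DATE),
    # Keep only the first day of each run of VAR2 == 1;
    # default = 0L avoids an NA when a group's first row has VAR2 == 1
    VAR2 = ifelse(VAR2 == 1 & lag(VAR2, default = 0L) == 1, 0, VAR2),
    # For each row: is VAR1 == 1 anywhere within +/- 2 days? Keep it only on first days
    PRESENT = sapply(DATE,
                     function(x) any(VAR1[between(DATE, x - 2, x + 2)] == 1)) & VAR2 == 1
  ) %>%
  summarise(PRESENT = +any(PRESENT))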

Output:

# A tibble: 2 x 2
     ID PRESENT
  <int>   <int>
1     1       0
2     2       1
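
For anyone who prefers to see the steps spelled out, here is a minimal base R sketch of the same idea. It assumes that a "first day" is any row where VAR2 is 1 and the previous row for that ID is not, and it uses only a subset of the question's data (all of ID 1, the first seven rows of ID 2). The helper name var1_near_onset and its window argument are made up for illustration; they are not part of dplyr or of the code above.

```r
# Hypothetical helper (illustration only): for one ID's rows, returns 1L if
# VAR1 == 1 within `window` days of any first day of VAR2, otherwise 0L.
var1_near_onset <- function(d, window = 2) {
  d <- d[order(d$DATE), ]                     # previous-row logic needs date order
  prev <- c(0, head(d$VAR2, -1))              # VAR2 on the previous row; 0 before the first row
  onsets <- d$DATE[d$VAR2 == 1 & prev == 0]   # first day of each run of VAR2 == 1
  hits <- vapply(
    seq_along(onsets),
    function(i) {
      # rows whose date lies within `window` days of this onset
      in_window <- abs(as.numeric(d$DATE) - as.numeric(onsets[i])) <= window
      any(d$VAR1[in_window] == 1)
    },
    logical(1)
  )
  as.integer(any(hits))
}

# Subset of the question's data: ID 1 in full, ID 2 up to 2018-10-06
dat <- data.frame(
  ID   = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
  DATE = as.Date(c("2018-07-27", "2018-07-28", "2018-07-29", "2018-07-30",
                   "2018-07-31", "2018-08-01", "2018-09-30", "2018-10-01",
                   "2018-10-02", "2018-10-03", "2018-10-04", "2018-10-05",
                   "2018-10-06")),
  VAR1 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L),
  VAR2 = c(0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L)
)

# Apply the helper to each ID and collect one row per ID
groups <- split(dat, dat$ID)
result <- data.frame(
  ID = as.integer(names(groups)),
  PRESENT = vapply(groups, var1_near_onset, integer(1), USE.NAMES = FALSE)
)
print(result)
```

On this subset, PRESENT is 0 for ID 1 and 1 for ID 2: the only first day for ID 2 is 2018-10-02, and VAR1 is 1 on 2018-09-30, which falls inside the window. Because the helper checks every onset, an ID with several menstruations comes out positive if any one of them has VAR1 nearby.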

Data used:

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), DATE = structure(1:26, .Label = c("2018-07-27", "2018-07-28", 
"2018-07-29", "2018-07-30", "2018-07-31", "2018-08-01", "2018-09-30", 
"2018-10-01", "2018-10-02", "2018-10-03", "2018-10-04", "2018-10-05", 
"2018-10-06", "2018-10-07", "2018-10-08", "2018-10-10", "2018-10-12", 
"2018-10-13", "2018-10-14", "2018-10-15", "2018-10-18", "2018-10-19", 
"2018-10-20", "2018-10-26", "2018-10-28", "2018-11-02"), class = "factor"), 
    VAR1 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L), 
    VAR2 = c(0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L)), class = "data.frame", row.names = c(NA, 
-26L))
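
Note that this dput() output stores DATE as a factor, which is why the code converts it with as.Date() before doing any date arithmetic. A small sketch of that conversion, plus a check that the dates are in order (the previous-row logic depends on it); the variable names here are illustrative only.

```r
# A factor of date strings, like DATE in the dput() output
date_factor <- factor(c("2018-07-27", "2018-07-28", "2018-07-29"))

# as.Date() has a factor method, so this converts via the labels
dates <- as.Date(date_factor)

# Date arithmetic now works in whole days
window <- c(dates[2] - 2, dates[2] + 2)

# TRUE if the dates are already in non-decreasing order
in_order <- all(diff(as.numeric(dates)) >= 0)
```

If in_order were FALSE for an ID, sorting with arrange(ID, DATE) before group_by() would be one way to restore the order the first-day logic relies on.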
arg0naut91 answered Dec 09 '25 23:12


