Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting an event only every X days per subject (in an irregular time series)

Tags:

r

I've got data where I'm counting episodes of care (like ER visits). The trick is, I can't count every single visit, because sometimes a 2nd or 3rd visit is actually a follow-up for a previous problem. So I've been given direction to count visits by using a 30 day "clean period" or "black out period", such that, I look for the first event (VISIT 1) by patient (min date), I count that event, then apply rules so as NOT to count any visits that occur in the 30 days following the first event. After that 30 day window has elapsed, I can begin looking for the 2nd visit (VISIT 2), count that one, then apply the 30 day black out again (NOT counting any visits that occur in the 30 days after visit #2)... wash, rinse, repeat...

I have rigged together a very sloppy solution that requires a lot of babysitting and manual checking of steps(see below). I have to believe that there is a better way. HELP!

data1 <- structure(list(ID = structure(c(2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 
3L, 4L, 4L, 4L, 4L, 4L), .Label = c("", "patient1", "patient2", 
"patient3"), class = "factor"), Date = structure(c(14610, 14610, 
14627, 14680, 14652, 14660, 14725, 15085, 15086, 14642, 14669, 
14732, 14747, 14749), class = "Date"), test = c(1L, 1L, 1L, 2L, 
1L, 1L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 2L)), .Names = c("ID", "Date", 
"test"), class = "data.frame", row.names = c(NA, 14L))

library(doBy) 
##     create a table of first events 
step1 <- summaryBy(Date~ID, data = data1, FUN=min) 
step1$Date30 <- step1$Date.min+30                                     
step2 <- merge(data1, step1, by.x="ID", by.y="ID") 
##     use an ifelse to essentially remove any events that shouldn't be counted 
step2$event <- ifelse(as.numeric(step2$Date) >= step2$Date.min & as.numeric(step2$Date) <= step2$Date30, 0, 1)
##     basically repeat steps above until I dont capture any more events
##  there just has to be a better way
data3 <- step2[step2$event==1,] 
data3<- data3[,1:3] 
step3 <- summaryBy(Date~ID, data = data3, FUN=min) 
step3$Date30 <- step3$Date.min+30 
step4 <- merge(data3, step3, by.x="ID", by.y="ID") 
step4$event <- ifelse(as.numeric(step4$Date) >= step4$Date.min & as.numeric(step4$Date) <= step4$Date30, 0, 1)
data4 <- step4[step4$event==1,]
data4<- data4[,1:3]
step5 <- summaryBy(Date~ID, data = data4, FUN=min)
step5$Date30 <- step5$Date.min+30
##     then I rbind the "keepers" 
##     in this case steps 1 and 3 above 
final <- rbind(step1,step3, step5) 
##     then reformat 
final <- final[,1:2] 
final$Date.min <- as.Date(final$Date.min,origin="1970-01-01") 
##     again, extremely clumsy, but it works...  HELP! :)
like image 755
Chris Avatar asked Jan 18 '23 03:01

Chris


1 Answers

This solution is loop-free and uses only base R. It produces a logical vector ok which selects the acceptable rows of data1.

ave runs the indicated anonymous function over each patient separately.

We define a state vector consisting of the current date and the start of the period for which no other dates are considered. Each date is represented by as.numeric(x) where x is the date. step takes the state vector and the current date and updates the state vector. Reduce runs it over the data and then we take only results for which the minimum and current date are the same and for which the current date is not a duplicate.

step <- function(init, curdate) {
    c(curdate, if (curdate > init[2] + 30) curdate else init[2])
}

ok <- !!ave(as.numeric(data1$Date), paste(data1$ID), FUN = function(d) {
    x <- do.call("rbind", Reduce(step, d, c(-Inf, 0), acc = TRUE))
    x[-1,1] == x[-1,2] & !duplicated(x[-1,1])
})

data1[ok, ]
like image 102
G. Grothendieck Avatar answered Feb 09 '23 02:02

G. Grothendieck