
Finding the vectorized way to perform for-loops with calculation between rows

Tags:

r

I'm trying to find a vectorized procedure that can replace the following code (which takes a long time to run):

for (i in 2:nrow(z)) {
  if (z$customerID[i] == z$customerID[i - 1]) {
    z$timeDelta[i] <- z$time[i] - z$time[i - 1]
  } else {
    z$timeDelta[i] <- NA
  }
}

I tried looking for different apply snippets, but haven't found anything useful.

Here's some sample data:

customerID    time
    1         2013-04-17 15:30:00 IDT
    1         2013-05-19 11:32:00 IDT
    1         2013-05-20 10:14:00 IDT
    2         2013-03-14 18:41:00 IST
    2         2013-04-24 09:52:00 IDT
    2         2013-04-24 17:08:00 IDT

And I want to get the following output:

customerID    time                        timeDelta*
    1         2013-04-17 15:30:00 IDT     NA
    1         2013-05-19 11:32:00 IDT     31.83 
    1         2013-05-20 10:14:00 IDT     0.94 
    2         2013-03-14 18:41:00 IST     NA
    2         2013-04-24 09:52:00 IDT     40.59
    2         2013-04-24 17:08:00 IDT     0.3 

 *I would prefer the time difference to be in days
Guest3290 asked Dec 07 '22 06:12


1 Answer

z$timeDelta <- NA
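# diff() on POSIXct picks its units (secs, mins, hours or days) from the data,
# so convert to days explicitly instead of dividing by 24
z$timeDelta[-1] <- ifelse(tail(z$customerID,-1) == head(z$customerID,-1), as.numeric(diff(z$time), units = "days"), NA)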

or a shorter version (this only works when customerID is numeric, because diff() is applied to it)

z$timeDelta <- NA
z$timeDelta[-1] <- ifelse(!diff(z$customerID), as.numeric(diff(z$time), units = "days"), NA)
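
For comparison, here is a possible grouped alternative in base R using ave(), which is not part of the answer above. It is only a sketch: the sample data is rebuilt from the question, the "Asia/Jerusalem" time zone is an assumption made to match the IST/IDT labels, and the rows are assumed to be sorted by time within each customer. The names secs and pairwise are made up for illustration.

```r
# Sample data from the question; the time zone name is an assumption
z <- data.frame(
  customerID = c(1, 1, 1, 2, 2, 2),
  time = as.POSIXct(
    c("2013-04-17 15:30:00", "2013-05-19 11:32:00", "2013-05-20 10:14:00",
      "2013-03-14 18:41:00", "2013-04-24 09:52:00", "2013-04-24 17:08:00"),
    tz = "Asia/Jerusalem"
  )
)

# Work in seconds since the epoch so the units are never ambiguous
secs <- as.numeric(z$time)

# Within each customer: NA for the first row, then the gap to the previous row,
# converted from seconds to days
z$timeDelta <- ave(secs, z$customerID,
                   FUN = function(s) c(NA, diff(s))) / 86400

# Cross-check against the adjacent-row comparison
same <- tail(z$customerID, -1) == head(z$customerID, -1)
pairwise <- c(NA, ifelse(same, as.numeric(diff(z$time), units = "days"), NA))
stopifnot(isTRUE(all.equal(z$timeDelta, pairwise)))

print(round(z$timeDelta, 2))
```

The two approaches agree only while each customer's rows are contiguous. ave() groups every row of a customer together regardless of position, whereas the adjacent-row comparison yields NA whenever the previous row belongs to someone else.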
jdharrison answered Mar 16 '23 01:03