
Calculating time difference by ID

Tags: r, datediff

I have data like this:

Incident.ID.. = c(rep("INCFI0000029582",4), rep("INCFI0000029587",4))
date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48", "2014-09-25 08:40:44", "2014-10-10 23:04:00", "2014-09-25 08:33:32", "2014-09-25 08:34:41", "2014-09-25 08:35:24", "2014-10-10 23:04:00")
df = data.frame(Incident.ID..,date, stringsAsFactors = FALSE)

df

   Incident.ID..                date
1  INCFI0000029582 2014-09-25 08:39:45
2  INCFI0000029582 2014-09-25 08:39:48
3  INCFI0000029582 2014-09-25 08:40:44
4  INCFI0000029582 2014-10-10 23:04:00
5  INCFI0000029587 2014-09-25 08:33:32
6  INCFI0000029587 2014-09-25 08:34:41
7  INCFI0000029587 2014-09-25 08:35:24
8  INCFI0000029587 2014-10-10 23:04:00

I use this function to calculate the time difference in seconds between consecutive rows:

padded.diff = function(x) c(0L, diff(x)) 

df2=within(df, {
  date        = strptime(date, format="%Y-%m-%d %H:%M:%S")
  date.diff   = padded.diff(as.numeric(date)) 
})

df2

     Incident.ID..                date date.diff
1  INCFI0000029582 2014-09-25 08:39:45         0
2  INCFI0000029582 2014-09-25 08:39:48         3
3  INCFI0000029582 2014-09-25 08:40:44        56
4  INCFI0000029582 2014-10-10 23:04:00   1347796
5  INCFI0000029587 2014-09-25 08:33:32  -1348228
6  INCFI0000029587 2014-09-25 08:34:41        69
7  INCFI0000029587 2014-09-25 08:35:24        43
8  INCFI0000029587 2014-10-10 23:04:00   1348116

But how can I calculate the difference so that it restarts from zero for every "Incident.ID.."? The desired output is:

     Incident.ID..                date date.diff
1  INCFI0000029582 2014-09-25 08:39:45         0
2  INCFI0000029582 2014-09-25 08:39:48         3
3  INCFI0000029582 2014-09-25 08:40:44        56
4  INCFI0000029582 2014-10-10 23:04:00   1347796
5  INCFI0000029587 2014-09-25 08:33:32         0
6  INCFI0000029587 2014-09-25 08:34:41        69
7  INCFI0000029587 2014-09-25 08:35:24        43
8  INCFI0000029587 2014-10-10 23:04:00   1348116

ElinaJ asked Jan 09 '23 08:01


2 Answers

With base R you can simply wrap it in ave(), which applies the function separately within each ID group:

ave(as.numeric(as.POSIXct(date)), Incident.ID.., FUN = padded.diff) 
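The call above relies on date and Incident.ID.. existing as free-standing vectors, as they do in the question. To store the result as a column of df instead, something like the following should work. This is a minimal sketch: the data is rebuilt from the question so it runs on its own, and tz = "UTC" is an assumption, since the question does not state a time zone.

```r
# Rebuild the question's data so the sketch is self-contained
df <- data.frame(
  Incident.ID.. = c(rep("INCFI0000029582", 4), rep("INCFI0000029587", 4)),
  date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48", "2014-09-25 08:40:44",
           "2014-10-10 23:04:00", "2014-09-25 08:33:32", "2014-09-25 08:34:41",
           "2014-09-25 08:35:24", "2014-10-10 23:04:00"),
  stringsAsFactors = FALSE
)

# Same helper as in the question: a leading 0 followed by consecutive differences
padded.diff <- function(x) c(0, diff(x))

# Parse to seconds since the epoch (tz = "UTC" is an assumption, not from the question)
secs <- as.numeric(as.POSIXct(df$date, tz = "UTC"))

# ave() applies padded.diff within each Incident.ID.. group and
# returns a vector in the original row order
df$date.diff <- ave(secs, df$Incident.ID.., FUN = padded.diff)

df
```

Because ave() returns its result in the original row order, the rows do not need to be sorted by ID first, although the dates within each ID should be in chronological order for the differences to be meaningful.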

Or using data.table (as per @akrun's comment):

library(data.table)
# as.numeric() keeps every group in seconds
setDT(df)[, date.diff := padded.diff(as.numeric(as.POSIXct(date))), by = Incident.ID..]
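One thing to watch when differencing date-times directly: diff() on a POSIXct vector returns a difftime whose units are chosen automatically from the smallest gap, so a group whose gaps are all at least a minute apart would come back in minutes rather than seconds. The sketch below illustrates this with two made-up timestamp vectors; padded.diff.secs is a hypothetical helper name and tz = "UTC" is an assumption.

```r
# Two illustrative vectors: a 3-second gap and a 10-minute gap
t_short <- as.POSIXct(c("2014-09-25 08:39:45", "2014-09-25 08:39:48"), tz = "UTC")
t_long  <- as.POSIXct(c("2014-09-25 08:39:45", "2014-09-25 08:49:45"), tz = "UTC")

units(diff(t_short))  # "secs"
units(diff(t_long))   # "mins"

# Hypothetical variant of padded.diff that always reports seconds,
# whatever units diff() happened to pick
padded.diff.secs <- function(x) c(0, as.numeric(diff(x), units = "secs"))

padded.diff.secs(t_short)  # 0 3
padded.diff.secs(t_long)   # 0 600
```

Converting the date-times with as.numeric() before calling diff() has the same effect, since the differences are then plain numbers of seconds.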

David Arenburg answered Feb 07 '23 22:02


Here is an example using dplyr and lubridate:

library(dplyr)
library(lubridate)
df %>%
    group_by(Incident.ID..) %>%
    # as.numeric() keeps the differences in seconds for every group
    mutate(diff = c(0, diff(as.numeric(ymd_hms(date)))))

Source: local data frame [8 x 3]
Groups: Incident.ID..

    Incident.ID..                date    diff
1 INCFI0000029582 2014-09-25 08:39:45       0
2 INCFI0000029582 2014-09-25 08:39:48       3
3 INCFI0000029582 2014-09-25 08:40:44      56
4 INCFI0000029582 2014-10-10 23:04:00 1347796
5 INCFI0000029587 2014-09-25 08:33:32       0
6 INCFI0000029587 2014-09-25 08:34:41      69
7 INCFI0000029587 2014-09-25 08:35:24      43
8 INCFI0000029587 2014-10-10 23:04:00 1348116

cdeterman answered Feb 07 '23 23:02