I have a dataframe as below
tmpdf <- data.frame(licensePlate = c("Y80901", "Y80901", "Y80901", "AMG-999", "AMG-999", "W3188", "W3188"),
starttime= c("2015-09-18 09:55", "2015-09-18 23:00", "2015-09-20 15:00", "2015-09-17 15:42", "2015-09-21 09:22", "2015-09-17 09:00", "2015-09-21 14:00"),
endtime = c("2015-09-18 17:55", "2015-09-20 11:00", "2015-09-21 12:00", "2015-09-18 13:00", "2015-09-21 14:22", "2015-09-21 12:00", "2015-09-21 16:00"))
tmpdf
licensePlate starttime endtime
1 Y80901 2015-09-18 09:55 2015-09-18 17:55
2 Y80901 2015-09-18 23:00 2015-09-20 11:00
3 Y80901 2015-09-20 15:00 2015-09-21 12:00
4 AMG-999 2015-09-17 15:42 2015-09-18 13:00
5 AMG-999 2015-09-21 09:22 2015-09-21 14:22
6 W3188 2015-09-17 09:00 2015-09-21 12:00
7 W3188 2015-09-21 14:00 2015-09-21 16:00
I want to count the last n days (for example, last 5 days from 9/17 to 9/21) in time used per day of each licensePlate, My expected result is as below:
Period LicensePlate Used Time
1 2015-09-17 Y80901 0
2 2015-09-17 AMG-999 8.3
3 2015-09-17 W3188 15
4 2015-09-18 Y80901 9
5 2015-09-18 AMG-999 13
6 2015-09-18 W3188 24
7 2015-09-19 Y80901 24
8 2015-09-19 AMG-999 0
9 2015-09-19 W3188 24
10 2015-09-20 Y80901 20
11 2015-09-20 AMG-999 0
12 2015-09-20 W3188 24
13 2015-09-21 Y80901 12
14 2015-09-21 AMG-999 5
15 2015-09-21 W3188 14
I think dplyr/data.table and lubridate can be used to obtain my result, and I may need to measure time period in days, but I don't know how to cut within a start/end interval when start/end will differ per row.
Here's something to get you started. This is almost your desired output, as it doesn't show you the missing licensePlate
per period.
The first step would be to convert your date to valid POSIXct
class, then expand the data to a per minute level (probably the most costly part in this solution) and the aggregate by licensePlate
and by Period
while summing up the results (I'm not using as.Date
here because it handles badly POSIX
values with values between 00 and 1 am).
library(data.table)
setDT(tmpdf)[, `:=`(starttime = as.POSIXct(starttime), endtime = as.POSIXct(endtime))]
res <- tmpdf[, .(licensePlate, Period = seq(starttime, endtime, by = "1 min")), by = 1:nrow(tmpdf)]
res[, .(Used_Time = round(.N/60L, 1L)), keyby = .(Period = substr(Period, 1L, 10L), licensePlate)]
# Period licensePlate Used_Time
# 1: 2015-09-17 AMG-999 8.3
# 2: 2015-09-17 W3188 15.0
# 3: 2015-09-18 AMG-999 13.0
# 4: 2015-09-18 W3188 24.0
# 5: 2015-09-18 Y80901 9.0
# 6: 2015-09-19 W3188 24.0
# 7: 2015-09-19 Y80901 24.0
# 8: 2015-09-20 W3188 24.0
# 9: 2015-09-20 Y80901 20.0
# 10: 2015-09-21 AMG-999 5.0
# 11: 2015-09-21 W3188 14.0
# 12: 2015-09-21 Y80901 12.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With