Trying to figure this out. Suppose you have a data.table:
dt <- data.table (person=c('bob', 'bob', 'bob'),
door=c('front door', 'front door', 'front door'),
type=c('timeIn', 'timeIn', 'timeOut'),
time=c(
as.POSIXct('2016 12 02 06 05 01', format = '%Y %m %d %H %M %S'),
as.POSIXct('2016 12 02 06 05 02', format = '%Y %m %d %H %M %S'),
as.POSIXct('2016 12 02 06 05 03', format = '%Y %m %d %H %M %S') )
)
I want to pivot it to look like this
person door timeIn timeOut
bob front door min(<date/time>) max(<date/time>)
I can't seem to get the right syntax for dcast.data.table. I tried
dcast.data.table(
dt, person + door ~ type,
value.var = 'time',
fun.aggregate = function(x) ifelse(type == 'timeIn', min(x), max(x))
)
which throws an error:
Aggregating function(s) should take vector inputs and return a single value (length=1).
I also tried:
dcast.data.table(dt, person + door ~ type, value.var = 'time')
But the result throws away my dates
person door timeIn timeOut
1: bob front door 2 1
Any suggestions would be appreciated. TIA
There are several ways to achieve the desired result using dcast. jazzurro's solution does the aggregation before reshaping the result. The approaches here use dcast directly but may require some post-processing. We are using jazzurro's data which are tweaked to obey the UTC time zone and CRAN version 1.10.0 of data.table.
ifelse to workAs reported in the Q,
dcast(
dt, person + door ~ type,
value.var = 'time',
fun.aggregate = function(x) ifelse(type == 'timeIn', min(x), max(x))
)
returns an error message. The full text of the error message includes the hint to use the fill parameter. Unfortunately, ifelse() doesn't respect the POSIXct class (for details see ?ifelse) so this needs to be enforced.
With
dcast(
dt, person + door ~ type,
value.var = 'time',
fun.aggregate = function(x)
lubridate::as_datetime(ifelse(type == 'timeIn', min(x), max(x))),
fill = 0
)
we do get
# person door timeIn timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:05
ifelse
ifelse's help page suggests
(tmp <- yes; tmp[!test] <- no[!test]; tmp)
as alternative. Following this advice,
dcast(
dt, person + door ~ type,
value.var = 'time',
fun.aggregate = function(x) {
test <- type == "timeIn"; tmp <- min(x); tmp[!test] = max(x)[!test]; tmp
}
)
returns
# person door timeIn timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:05
Note that neither the fill parameter nor the coercion to POSIXct is needed.
dcast
With the latest versions of dcast.data.table we can provide a list of functions to fun.aggregate:
dcast(dt, person + door ~ type, value.var = 'time', fun = list(min, max))
returns
# person door time_min_timeIn time_min_timeOut time_max_timeIn time_max_timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:03 2016-12-02 07:06:02 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:03 2016-12-02 06:05:02 2016-12-02 06:05:05
We can remove the unwanted columns and rename the others by
dcast(dt, person + door ~ type, value.var = 'time', fun = list(min, max))[
, .(person, door, timeIn = time_min_timeIn, timeOut = time_max_timeOut)]
which gets us
# person door timeIn timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:05
As mentioned above, we are using jazzurro's data
dt <- structure(list(person = c("bob", "bob", "bob", "bob", "ana",
"ana", "ana", "ana"), door = c("front door", "front door", "front door",
"front door", "front door", "front door", "front door", "front door"
), type = c("timeIn", "timeIn", "timeOut", "timeOut", "timeIn",
"timeIn", "timeOut", "timeOut"), time = structure(c(1480658701,
1480658702, 1480658703, 1480658705, 1480662361, 1480662362, 1480662363,
1480662365), class = c("POSIXct", "POSIXt"))), .Names = c("person",
"door", "type", "time"), row.names = c(NA, -8L), class = c("data.table",
"data.frame"))
but coerce the time zone to UTC.
With
dt[, time := lubridate::with_tz(time, "UTC")]
we have
dt
# person door type time
#1: bob front door timeIn 2016-12-02 06:05:01
#2: bob front door timeIn 2016-12-02 06:05:02
#3: bob front door timeOut 2016-12-02 06:05:03
#4: bob front door timeOut 2016-12-02 06:05:05
#5: ana front door timeIn 2016-12-02 07:06:01
#6: ana front door timeIn 2016-12-02 07:06:02
#7: ana front door timeOut 2016-12-02 07:06:03
#8: ana front door timeOut 2016-12-02 07:06:05
independent of local time zone.
This would be one way to achieve your goal. I modified your dt and created the following data set. For each person, I looked for the minimum time of timeIn and the maximum time of timeOut. Then, I applied dcast() to the result.
# person door type time
#1: bob front door timeIn 2016-12-02 06:05:01
#2: bob front door timeIn 2016-12-02 06:05:02
#3: bob front door timeOut 2016-12-02 06:05:03
#4: bob front door timeOut 2016-12-02 06:05:05
#5: ana front door timeIn 2016-12-02 07:06:01
#6: ana front door timeIn 2016-12-02 07:06:02
#7: ana front door timeOut 2016-12-02 07:06:03
#8: ana front door timeOut 2016-12-02 07:06:05
library(data.table)
dcast(
dt[, .SD[(type == "timeIn" & time == min(time))|(type == "timeOut" & time == max(time))], by = person],
person + door ~ type)
# person door timeIn timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:05
DATA
dt <- structure(list(person = c("bob", "bob", "bob", "bob", "ana",
"ana", "ana", "ana"), door = c("front door", "front door", "front door",
"front door", "front door", "front door", "front door", "front door"
), type = c("timeIn", "timeIn", "timeOut", "timeOut", "timeIn",
"timeIn", "timeOut", "timeOut"), time = structure(c(1480658701,
1480658702, 1480658703, 1480658705, 1480662361, 1480662362, 1480662363,
1480662365), class = c("POSIXct", "POSIXt"))), .Names = c("person",
"door", "type", "time"), row.names = c(NA, -8L), class = c("data.table",
"data.frame"))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With