In each household I want to define an indicator whose determined if a driver is available for a passenger or not. the driver is available if his/her trip start at most 1 hour after passengers's trip.
example:
household person mode start
1 1 car 7:20
1 1 car 8:00
1 1 car 8:30
1 2 non-car 7:30
1 3 non-car 7:15
1 4 car 7:00
2 1 car 7:00
2 2 non-car 9:00
in the first household driver can give a ride since his trip is 30 minutes after than second person also he can give a ride to third person . in the second household no.
output
household person mode start indicator
1 1 car 8:00 1
1 2 non-car 7:30 1
1 3 non-car 7:15 1
2 1 car 7:00 0
2 2 non-car 9:00 0
then I want to put these matched rows (indicator is 1) next to each other
output
household person mode start indicator household person mode start indicator
1 1 car 8:00 1 2 2 non-car 7:30 1
1 1 car 8:00 1 3 2 non-car 7:15 1
We convert the 'start' to datetime class with as.POSIXct, grouped by 'household', check whether the differnece of 'start' is less than or equal to 1, coerce the logical to binary with as.integer
library(dplyr)
df1 %>%
mutate(start = as.POSIXct(start, format = '%H:%M')) %>%
group_by(household) %>%
mutate(indicator = as.integer(any(diff(start) <= 1)))
# A tibble: 4 x 5
# Groups: household [2]
# household person mode start indicator
# <int> <int> <chr> <dttm> <int>
#1 1 1 car 2019-09-03 08:00:00 1
#2 1 2 non-car 2019-09-03 07:30:00 1
#3 2 1 car 2019-09-03 07:00:00 0
#4 2 2 non-car 2019-09-03 09:00:00 0
To get the second output, we can make use of the pivot_wider from the devel version of tidyr
df1 %>%
mutate(startn = as.POSIXct(start, format = '%H:%M')) %>%
group_by(household) %>%
mutate(indicator = as.integer(any(diff(startn) <= 1))) %>%
filter(indicator == 1) %>%
select(-startn) %>%
group_by(household) %>%
mutate(n = row_number()) %>%
pivot_wider(names_from = n, values_from = c(household, person, mode, start, indicator))
# A tibble: 1 x 10
# household_1 household_2 person_1 person_2 mode_1 mode_2 start_1 start_2 indicator_1 indicator_2
# <int> <int> <int> <int> <chr> <chr> <chr> <chr> <int> <int>
#1 1 1 1 2 car non-car 8:00 7:30 1 1
df1 <- structure(list(household = c(1L, 1L, 2L, 2L), person = c(1L,
2L, 1L, 2L), mode = c("car", "non-car", "car", "non-car"), start = c("8:00",
"7:30", "7:00", "9:00")), class = "data.frame", row.names = c(NA,
-4L))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With