I have nested data that looks like this:
ID Date Behavior
1 1 FALSE
1 2 TRUE
1 3 TRUE
2 1 TRUE
2 2 FALSE
3 1 TRUE
3 2 TRUE
I'd like to return all rows for every ID that has at least one occurrence of FALSE. I am expecting ID 1 and ID 2 to be returned, with every row present for each (3 rows for ID 1 and 2 rows for ID 2).
EDIT: here is what I am expecting:
ID Date Behavior
1 1 FALSE
1 2 TRUE
1 3 TRUE
2 1 TRUE
2 2 FALSE
I'm wondering whether this needs a for loop or a while loop; any and all help is appreciated.
Extra points for python code that mimics the R code!
Here's a possible data.table approach (assuming df is your data set):
library(data.table)
setDT(df)[, .SD[any(!Behavior)], ID] # you can also replace any(!Behavior) with !all(Behavior)
# ID Date Behavior
# 1: 1 1 FALSE
# 2: 1 2 TRUE
# 3: 1 3 TRUE
# 4: 2 1 TRUE
# 5: 2 2 FALSE
Edit: a slightly more efficient solution, suggested by @Arun:
setDT(df)[, if (any(!Behavior)) .SD, ID]
Or a similar dplyr approach:
library(dplyr)
df %>%
group_by(ID) %>%
filter(any(!Behavior))
# Source: local data table [5 x 3]
# Groups: ID
#
# ID Date Behavior
# 1 1 1 FALSE
# 2 1 2 TRUE
# 3 1 3 TRUE
# 4 2 1 TRUE
# 5 2 2 FALSE
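Since the question also asks for Python, here is a minimal sketch of the same group-wise test using only the standard library. It assumes the data is held as a list of (ID, Date, Behavior) tuples. The function name keep_groups_with_false is made up for illustration and is not part of any library.

```python
from collections import defaultdict

# Sample rows copied from the question: (ID, Date, Behavior)
rows = [
    (1, 1, False),
    (1, 2, True),
    (1, 3, True),
    (2, 1, True),
    (2, 2, False),
    (3, 1, True),
    (3, 2, True),
]


def keep_groups_with_false(rows):
    """Return every row whose ID has at least one Behavior == False."""
    # First pass: collect the Behavior values seen for each ID.
    behaviors_by_id = defaultdict(list)
    for id_, _date, behavior in rows:
        behaviors_by_id[id_].append(behavior)

    # An ID qualifies when not all of its values are True, which is the
    # same test as any(!Behavior) or !all(Behavior) in the R code.
    ids_to_keep = {
        id_ for id_, values in behaviors_by_id.items() if not all(values)
    }

    # Second pass: filter while preserving the original row order.
    return [row for row in rows if row[0] in ids_to_keep]


result = keep_groups_with_false(rows)
for row in result:
    print(*row)
```

No explicit for or while loop over groups is needed beyond the two passes: the first pass decides which IDs qualify, and the second keeps their rows. If the data lives in a pandas DataFrame instead, the same idea is usually written as a grouped filter, something like df.groupby("ID").filter(lambda g: not g["Behavior"].all()), though that line is untested against your data.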