I have a data frame:
test_df <- data.frame(
x = c(rep("a", 5), rep("b", 5)),
y = c(1, 2, NA, 2, 3, NA, 1, 2, 3, 1)
)
Within each group defined by column x, I would like to remove all rows after the first row where y == 2. Is there a way to do this in dplyr?
My desired result is to go from:
x y
1 a 1
2 a 2
3 a NA
4 a 2
5 a 3
6 b NA
7 b 1
8 b 2
9 b 3
10 b 1
to:
x y
1 a 1
2 a 2
6 b NA
7 b 1
8 b 2
What about this way?
group_by(test_df, x) %>% slice(seq_len(min(which(y == 2))))
Source: local data frame [5 x 2]
Groups: x [2]
x y
(fctr) (dbl)
1 a 1
2 a 2
3 b NA
4 b 1
5 b 2
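# first2: position of the first 2 in each group, or the last row if the group has no 2
group_by(df, x) %>%
  mutate(first2 = min(which(y == 2 | row_number() == n()))) %>%
  # keep rows up to and including that position, then drop the helper column
  filter(row_number() <= first2) %>%
  select(-first2)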
# Source: local data frame [6 x 2]
# Groups: x [3]
#
# x y
# (fctr) (int)
# 1 a 1
# 2 a 2
# 3 b NA
# 4 b 1
# 5 b 2
# 6 c 1
This uses the following data, which adds a group c that contains no 2:
df = structure(list(x = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 3L), .Label = c("a", "b", "c"), class = "factor"), y = c(1L, 2L,
NA, 2L, 3L, NA, 1L, 2L, 3L, 1L, 1L)), .Names = c("x", "y"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))
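As another option, here is a minimal sketch that is not taken from either answer above: it swaps in a running count of 2s per group and keeps a row only while no earlier row in that group was a 2. It assumes dplyr is installed; the names `is_two` and `result` are made up for illustration, and the data is rebuilt with `data.frame()` so the snippet runs on its own.

```r
library(dplyr)

# Same values as the data above, including group "c", which never contains a 2
df <- data.frame(
  x = c(rep("a", 5), rep("b", 5), "c"),
  y = c(1, 2, NA, 2, 3, NA, 1, 2, 3, 1, 1)
)

result <- df %>%
  group_by(x) %>%
  # is_two (hypothetical helper column): TRUE where y is exactly 2; NA counts as "not 2"
  mutate(is_two = !is.na(y) & y == 2) %>%
  # cumsum() counts the 2s seen so far; lag() shifts that count down one row,
  # so the row holding the first 2 is still kept and everything after it is dropped
  filter(lag(cumsum(is_two), default = 0L) == 0L) %>%
  select(-is_two) %>%
  ungroup()

print(result)
```

Because the count simply stays at zero for a group with no 2, such a group is kept whole without needing a special case.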