How to remove all rows after a certain point by group using dplyr?

Question

I have a data frame:

test_df <- data.frame(
  x = c(rep("a", 5), rep("b", 5)), 
  y = c(1, 2, NA, 2, 3, NA, 1, 2, 3, 1)
)

Within each group of column x, I would like to remove all rows after the first row where y == 2. Is there a way to do this in dplyr?

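My desired result keeps, in each group, only the rows up to and including the first y == 2: the first two rows of group a and the first three rows of group b.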

DatamineR · Accepted Answer

What about this way?

library(dplyr)

group_by(test_df, x) %>%
    slice(seq_len(min(which(y == 2))))
# Source: local data frame [5 x 2]
# Groups: x [2]
# 
#        x     y
#   (fctr) (dbl)
# 1      a     1
# 2      a     2
# 3      b    NA
# 4      b     1
# 5      b     2
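
To see why this works, the expression inside slice() can be traced by hand on a single group. The sketch below uses base R only; y_a, hits, first_hit and keep are illustrative names that do not appear in the answer.

```r
# y values of group "a" from test_df
y_a <- c(1, 2, NA, 2, 3)

# which() returns the positions where the comparison is TRUE;
# the NA comparison at position 3 is silently dropped
hits <- which(y_a == 2)

# the earliest match marks the last row to keep
first_hit <- min(hits)

# seq_len() expands that position into the row indices 1..first_hit,
# which is what slice() receives for this group
keep <- seq_len(first_hit)

y_a[keep]
```

Note that this assumes every group contains at least one y == 2. For a group with no match, which() returns an empty vector and min() yields Inf with a warning, so the slice() call cannot be relied on there.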

Gregor Thomas · Answer

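group_by(df, x) %>%
    # first2 is the position of the first y == 2 in the group; the
    # row_number() == n() term falls back to the last row when there is no 2
    mutate(first2 = min(which(y == 2 | row_number() == n()))) %>%
    filter(row_number() <= first2) %>%
    select(-first2)
# Source: local data frame [6 x 2]
# Groups: x [3]
# 
#        x     y
#   (fctr) (int)
# 1      a     1
# 2      a     2
# 3      b    NA
# 4      b     1
# 5      b     2
# 6      c     1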

Using this data, which adds a group c that contains no y == 2:

df = structure(list(x = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 3L), .Label = c("a", "b", "c"), class = "factor"), y = c(1L, 2L, 
NA, 2L, 3L, NA, 1L, 2L, 3L, 1L, 1L)), .Names = c("x", "y"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))
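
As a possible alternative that is not taken from either answer, the same result can be sketched with a running count of matches instead of a helper column. The names df2 and result are hypothetical, and df2 is just the data above rebuilt with data.frame() so the example is self-contained.

```r
library(dplyr)

# Same x and y values as df above (x is character here rather than factor)
df2 <- data.frame(
  x = c(rep("a", 5), rep("b", 5), "c"),
  y = c(1L, 2L, NA, 2L, 3L, NA, 1L, 2L, 3L, 1L, 1L)
)

result <- df2 %>%
  group_by(x) %>%
  # coalesce() treats NA comparisons as "no match";
  # cumsum() counts matches seen so far, and lag() shifts that count by one
  # row, so a row is kept while no match has occurred strictly before it
  filter(lag(cumsum(coalesce(y == 2L, FALSE)), default = 0L) == 0L) %>%
  ungroup()

result
```

Because a group without any y == 2 never accumulates a match, all of its rows are kept, so group c needs no special handling in this version.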

Tags: r, dplyr

Hao
