Delete rows after a certain sequence of values in a certain column

Question

a <- c("A","A","A","B","B","B","C","C","C","C","D","D","D","D","D")
b <- c("x","y","z","x","x","z","y","z","z","z","y","z","z","z","x")
df = data.frame(a,b)


    a   b
1   A   x
2   A   y
3   A   z
4   B   x
5   B   x
6   B   z
7   C   y
8   C   z
9   C   z
10  C   z
11  D   y
12  D   z
13  D   z
14  D   z
15  D   x

For every group A, B, C, D in column a, I'd like to delete the rows where b is z whenever the group ends with the combination y,z.

If we have the case of a=="C", where the b-values are y,z,z,z, I'd like to delete all z's. However, in a=="D", nothing has to change as x is the last value.

The result should look like this:

    a   b
1   A   x
2   A   y
4   B   x
5   B   x
6   B   z
7   C   y
11  D   y
12  D   z
13  D   z
14  D   z
15  D   x

By grouping in dplyr, I can identify the last occurrence of each value within each group of a, so the basic case shown in a=="A" is not a problem. I have trouble finding a solution for the case of a=="C", where I could have one occurrence of y followed by 20 occurrences of z.

Sven Hohenstein · Accepted Answer

You can use by and cummin in base R:

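df[unlist(by(df$b, interaction(df$a), FUN = function(x) {
  # tmp is 1 for the trailing run of "z" in this group and 0 elsewhere
  tmp <- rev(cummin(rev(x == "z")))
  # drop the trailing z's only if the value just before them is "y";
  # isTRUE() also covers a group that consists of z's only
  if (isTRUE(tail(x[!tmp], 1) == "y")) !tmp else rep(TRUE, length(x))
})), ]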

The result:

   a b
1  A x
2  A y
4  B x
5  B x
6  B z
7  C y
11 D y
12 D z
13 D z
14 D z
15 D x
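
Note that unlist(by(...)) returns the groups in the order of the factor levels, so the code above relies on df already being sorted by a, as it is in the example. If that is not guaranteed, the same idea can be sketched with split and unsplit, which put the per-group results back in the original row order. This is only a minimal sketch, not part of the answer above: the helper name keep_rows is made up for illustration, and stringsAsFactors = FALSE is an assumption so that b is a plain character vector.

```r
a <- c("A","A","A","B","B","B","C","C","C","C","D","D","D","D","D")
b <- c("x","y","z","x","x","z","y","z","z","z","y","z","z","z","x")
df <- data.frame(a, b, stringsAsFactors = FALSE)

# Hypothetical helper: for the b-values of one group, return TRUE for
# every row that should be kept.
keep_rows <- function(x) {
  # TRUE for the run of "z" at the very end of the group, FALSE elsewhere
  trailing_z <- rev(cumprod(rev(x == "z"))) == 1
  rest <- x[!trailing_z]
  # Drop the trailing z's only if the value right before them is "y"
  if (length(rest) > 0 && rest[length(rest)] == "y") {
    !trailing_z
  } else {
    rep(TRUE, length(x))
  }
}

# Apply the helper per group and restore the original row order
keep <- unsplit(lapply(split(df$b, df$a), keep_rows), df$a)
result <- df[keep, ]
# For the example data this keeps rows 1, 2, 4, 5, 6, 7 and 11 to 15
print(result)
```

cumprod is used here in place of cummin; on a 0/1 vector both give the same trailing-run indicator.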
