
Increment count over rows with conditional restarting

Tags: dataframe, r

I would like to increment a count that restarts from 1 when a condition in an existing column is met.

For example I have the following data frame:

df <- data.frame(x1 = c(10, 100, 200, 300, 87, 90, 45, 80), 
                 x2 = c("start", "a", "b", "c", "start", "k", "l", "o"))

I would like to create x3 that starts counting from 1 each time that x2 == "start".

The resulting data frame should look like this:

   x1    x2 x3
1  10 start  1
2 100     a  2
3 200     b  3
4 300     c  4
5  87 start  1
6  90     k  2
7  45     l  3
8  80     o  4

I'm guessing there are existing functions in R that give a general solution. Can anyone point me in the right direction?

asked May 18 '26 19:05 by spies006


1 Answer

Using base R:

df$x3 <- with(df, ave(x1, cumsum(x2 == 'start'), FUN = seq_along))

gives:

> df
   x1    x2 x3
1  10 start  1
2 100     a  2
3 200     b  3
4 300     c  4
5  87 start  1
6  90     k  2
7  45     l  3
8  80     o  4
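
To see why this works, it may help to look at the grouping key on its own. The following is a minimal sketch using the question's data; the intermediate name grp is only for illustration, and it passes seq_along(grp) rather than x1 as the first argument to ave(), so the counter comes back as an integer vector instead of a double.

```r
df <- data.frame(x1 = c(10, 100, 200, 300, 87, 90, 45, 80),
                 x2 = c("start", "a", "b", "c", "start", "k", "l", "o"))

# The comparison is TRUE on every "start" row. cumsum() counts TRUE as 1,
# so the running total steps up at each restart and is constant in between.
grp <- cumsum(df$x2 == "start")
grp
# [1] 1 1 1 1 2 2 2 2

# ave() applies FUN within each group; seq_along() numbers the rows 1, 2, 3, ...
x3 <- ave(seq_along(grp), grp, FUN = seq_along)
x3
# [1] 1 2 3 4 1 2 3 4
```

If the first row is not "start", those leading rows fall into group 0 and are still counted from 1.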

Or with the dplyr or data.table packages:

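library(dplyr)
df %>% 
  # grp increases by one at every 'start' row
  group_by(grp = cumsum(x2 == 'start')) %>% 
  mutate(x3 = row_number()) %>% 
  # drop the grouping so later operations are not applied per group
  ungroup()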

library(data.table)
# option 1
setDT(df)[, x3 := rowid(cumsum(x2 == 'start'))][]
# option 2
setDT(df)[, x3 := 1:.N, by = cumsum(x2 == 'start')][]

answered May 20 '26 07:05 by Jaap


