
Only Use The First Match For Every N Rows

Tags: r

I have a data.frame that looks like this.

Date  Number
1      1
2      0
3      1
4      0
5      0
6      1
7      0
8      0
9      1

I would like to create a new column that holds a 1 at the first 1 in every block of 3 rows and a 0 everywhere else. For example, this is how I would like the new data.frame to look:

Date  Number  New
1      1       1
2      0       0
3      1       0
4      0       0
5      0       0
6      1       1
7      0       0
8      0       0
9      1       1

Within each block of three rows, the first 1 is marked with a 1 in the new column; every other row gets a 0. Thank you.

Hmm, at first glance I thought akrun's answer gave me the solution. However, it is not exactly what I am looking for. Here is what @akrun's solution produces.

df1 = data.frame(Number = c(1,0,1,0,1,1,1,0,1,0,0,0))
head(df1,9)

  Number
1      1
2      0
3      1
4      0
5      1
6      1
7      1
8      0
9      1

Attempt at solution:

library(dplyr)

df1 %>%
  group_by(grp = as.integer(gl(n(), 3, n()))) %>%
  mutate(New = +(Number == row_number()))



  Number   grp   New
   <dbl> <int> <int>
1      1     1     1
2      0     1     0
3      1     1     0
4      0     2     0
5      1     2     0 #should be a 1
6      1     2     0
7      1     3     1
8      0     3     0
9      1     3     0

As you can see, the code misses the 1 on row 5. I am looking for the first 1 in every chunk of three rows; everything else should be 0. Sorry if I was unclear, akrun.
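
To see why the attempt misses row 5, it may help to look at a single chunk in isolation. The sketch below uses base R only, with seq_along(x) standing in for dplyr's row_number() inside a group; the vector name `chunk` is made up for illustration and holds rows 4 to 6 of df1.

```r
# Second chunk of df1 (rows 4 to 6)
chunk <- c(0, 1, 1)

# Attempted logic: compares each value with its own position in the chunk,
# so it can only be TRUE when a 1 sits in position 1
attempt <- +(chunk == seq_along(chunk))

# Intended logic: compares each position with the index of the first 1
first_one <- +(seq_along(chunk) == match(1, chunk, nomatch = 0))

print(attempt)    # [1] 0 0 0
print(first_one)  # [1] 0 1 0
```

The attempt only works when the first 1 happens to be the first row of its chunk, which is why rows 1 and 7 come out right and row 5 does not.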

Edit: akrun's new answer is exactly what I am looking for. Thank you very much.

Jordan Wrong asked Dec 10 '25 17:12


2 Answers

Here is an option: create a grouping column with gl, then compare row_number() with the index of the first 1 in each group. match returns only the index of the first match, and nomatch = 0 means a group with no 1 gets all 0s instead of NA.

library(dplyr)
df1 %>% 
   group_by(grp = as.integer(gl(n(), 3, n()))) %>% 
   mutate(New = +(row_number() == match(1, Number, nomatch = 0)))
# A tibble: 12 x 3
# Groups:   grp [4]
#   Number   grp   New
#    <dbl> <int> <int>
# 1      1     1     1
# 2      0     1     0
# 3      1     1     0
# 4      0     2     0
# 5      1     2     1
# 6      1     2     0
# 7      1     3     1
# 8      0     3     0
# 9      1     3     0
#10      0     4     0
#11      0     4     0
#12      0     4     0
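
For readers without dplyr, the same idea can be sketched in base R. This is not part of the original answer; it assumes chunks of three consecutive rows, and the variable name `grp` simply mirrors the grouping column above. ave() applies the function within each chunk and returns the results in the original row order.

```r
df1 <- data.frame(Number = c(1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0))

# Chunk id: rows 1-3 get 0, rows 4-6 get 1, and so on
grp <- (seq_len(nrow(df1)) - 1) %/% 3

# Within each chunk, flag only the position of the first 1;
# nomatch = 0 leaves chunks without any 1 as all zeros
df1$New <- ave(df1$Number, grp, FUN = function(x) {
  as.numeric(seq_along(x) == match(1, x, nomatch = 0))
})

print(df1$New)
# [1] 1 0 0 0 1 0 1 0 0 0 0 0
```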
akrun answered Dec 13 '25 08:12


Looking at the logic, perhaps you want to check whether Number == 1 and the prior two values were both 0. If that is not correct, please let me know.

library(dplyr)

df %>%
  mutate(New = ifelse(Number == 1 &
                        lag(Number, n = 1L, default = 0) == 0 &
                        lag(Number, n = 2L, default = 0) == 0, 1, 0))

Output

  Date Number New
1    1      1   1
2    2      0   0
3    3      1   0
4    4      0   0
5    5      0   0
6    6      1   1
7    7      0   0
8    8      0   0
9    9      1   1
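
Note that this rolling rule and the fixed-chunk rule agree on the question's first example but not on the df1 data from the question's follow-up. The base R sketch below compares them side by side; `shift0` is a hypothetical helper written here as a stand-in for dplyr::lag(x, n = k, default = 0), not a function from any package.

```r
Number <- c(1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0)

# Hypothetical helper: shift x down by k positions, padding the start with 0
shift0 <- function(x, k) c(rep(0, k), head(x, -k))

# Rolling rule: a 1 whose two preceding values are both 0
rolling <- +(Number == 1 & shift0(Number, 1) == 0 & shift0(Number, 2) == 0)

# Chunk rule: the first 1 inside each fixed block of three rows
grp <- (seq_along(Number) - 1) %/% 3
chunked <- ave(Number, grp, FUN = function(x) {
  as.numeric(seq_along(x) == match(1, x, nomatch = 0))
})

print(rolling)  # [1] 1 0 0 0 0 0 0 0 0 0 0 0
print(chunked)  # [1] 1 0 0 0 1 0 1 0 0 0 0 0
```

Rows 5 and 7 differ because each is preceded by a 1 within the previous two rows, yet each is still the first 1 of its own block of three.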
Ben answered Dec 13 '25 10:12


