Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

row numbers for explicit rows in r

I need to get row numbers for explicit rows grouped over id. Let's say dataframe (df) looks like this:

 id  a   b 
 3   2   NA
 3   3   2
 3   10  NA
 3   21  0
 3   2   NA
 4   1   5
 4   1   0
 4   5   NA

I need to create one more column that would give row number sequence excluding the case where b == 0.

desired output:

 id  a   b   row
 3   2   NA   1
 3   3   2    2
 3   10  NA   3
 3   21  0    -
 3   2   NA   4
 4   1   5    1
 4   1   0    -
 4   5   NA   2

I used dplyr but not able to achieve the same, My code:

df <- df %>%
       group_by(id) %>%
       mutate(row = row_number(id[b != 0]))

Please suggest some better way to do this.

like image 978
Dheeraj Singh Avatar asked Nov 20 '25 14:11

Dheeraj Singh


1 Answers

I would propose using the data.table package for its nice capability in operating on subsets and thus avoiding inefficient operations such as ifelse or evaluation the whole data set. Also, it is better to keep you vector in numeric class (for future operations), thus NA will be probably preferable to - (character), here's a possible solution

library(data.table)
setDT(df)[is.na(b) | b != 0, row := seq_len(.N), by = id]
#    id  a  b row
# 1:  3  2 NA   1
# 2:  3  3  2   2
# 3:  3 10 NA   3
# 4:  3 21  0  NA
# 5:  3  2 NA   4
# 6:  4  1  5   1
# 7:  4  1  0  NA
# 8:  4  5 NA   2

The idea here is to operate only on the rows where is.na(b) | b != 0 and generate a sequence of each group size (.N) while updating row in place (using :=). All the rest of the rows will be assigned with NAs by default.

like image 167
David Arenburg Avatar answered Nov 22 '25 04:11

David Arenburg



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!