 

R: creating a new vector based on a count of values up to the first instance of a value in an existing vector

Tags: r, count

How might I create a new variable "CountWk" that holds a count of the values in "WK" that occur up to and including the first instance of 1 in "Performance", grouped by "ID"?

ID <- c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C')
WK <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5)
Performance <- c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1)
Data <- data.frame(ID, WK, Performance)

So "CountWk" would be 2 for ID "A", 2 for "B", and 2 for "C". Every row other than the one containing the first instance of 1 in "Performance" should have NA in "CountWk".

user3594490 asked Jan 09 '23 14:01



2 Answers

Here's how I would approach this using the data.table package.

First, find the row index of the first 1 in Performance within each ID, using .I and match:

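library(data.table)
# setDT() converts Data to a data.table by reference; .I holds the row numbers
# in the full table, so this yields the row of the first Performance == 1 per ID
indx <- setDT(Data)[, .I[match(1L, Performance)], by = ID]$V1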

Then assign WK to CountWk at those rows only (every other row stays NA):

Data[indx, CountWk := WK][]
#     ID WK Performance CountWk
#  1:  A  1           0      NA
#  2:  A  2           1       2
#  3:  A  3           1      NA
#  4:  B  1           0      NA
#  5:  B  2           1       2
#  6:  B  3           0      NA
#  7:  C  1           0      NA
#  8:  C  2           1       2
#  9:  C  3           0      NA
# 10:  C  4           1      NA
# 11:  C  5           1      NA
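
For readers who want to see what the .I and match step computes without loading data.table, here is a minimal base R sketch of the same idea. The names df, rows, and first_row are made up for this illustration and are not part of the answer above. The sketch also drops groups that contain no 1, which is an assumption about the desired behaviour rather than something the question states.

```r
# Rebuild the example data under a separate (hypothetical) name
ID <- c("A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C")
WK <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5)
Performance <- c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1)
df <- data.frame(ID, WK, Performance)

# Row numbers of the whole table, playing the role of data.table's .I
rows <- seq_len(nrow(df))

# For each ID, match() gives the position of the first 1 inside the group;
# indexing the group's row numbers with it gives the row in the full table
first_row <- tapply(rows, df$ID, function(i) i[match(1, df$Performance[i])])
first_row <- as.vector(first_row)

# Assumption: a group with no 1 yields NA here and is simply skipped
first_row <- first_row[!is.na(first_row)]

# Fill CountWk only on those rows; everything else stays NA
df$CountWk <- NA_real_
df$CountWk[first_row] <- df$WK[first_row]

first_row   # 2 5 8
```

Rows 2, 5 and 8 are the ones that receive a CountWk value in the printed result above.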
David Arenburg answered Feb 09 '23 12:02



An option using dplyr:

library(dplyr)
Data %>%
  group_by(ID) %>%
  # the condition is TRUE only on the row holding the first 1 within each ID
  mutate(CountWk = ifelse(cumsum(Performance == 1) == 1 & Performance != 0,
                          WK, NA_real_))
#    ID WK Performance CountWk
#1   A  1           0      NA
#2   A  2           1       2
#3   A  3           1      NA
#4   B  1           0      NA
#5   B  2           1       2
#6   B  3           0      NA
#7   C  1           0      NA
#8   C  2           1       2
#9   C  3           0      NA
#10  C  4           1      NA
#11  C  5           1      NA
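
To see why the condition picks out only the first 1, it may help to evaluate its pieces on a single group. This is a small sketch using the Performance values of ID "C" from the question; the intermediate names running, up_to_second and first_one are invented for the illustration.

```r
# Performance values for ID "C" in the example data
Performance <- c(0, 1, 0, 1, 1)

# Running count of 1s seen so far
running <- cumsum(Performance == 1)          # 0 1 1 2 3

# TRUE from the first 1 until just before the second 1
up_to_second <- running == 1                 # FALSE TRUE TRUE FALSE FALSE

# Requiring a non-zero Performance removes the trailing 0 rows,
# leaving only the row of the first 1
first_one <- up_to_second & Performance != 0 # FALSE TRUE FALSE FALSE FALSE
```

Inside group_by(ID), mutate evaluates the same expression separately for each ID, so the running count restarts in every group.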

Or without the ifelse:

Data %>%
  group_by(ID) %>%
  mutate(CountWk = (NA^!(cumsum(Performance == 1) == 1 & Performance != 0)) * WK)
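
The NA^! idiom in the version above relies on two arithmetic facts in R: NA^0 is 1 and NA^1 is NA. Negating the condition turns the row to keep into FALSE (0) and all others into TRUE (1), so the power yields a mask of 1 and NA that can be multiplied by WK. A minimal sketch, with keep, mask and CountWk as illustrative names and the negation parenthesised explicitly for readability:

```r
# TRUE marks the single row whose WK should be kept
keep <- c(FALSE, TRUE, FALSE, FALSE, FALSE)
WK   <- c(1, 2, 3, 4, 5)

# !keep is TRUE (1) where the value should be dropped:
#   NA^1 is NA, NA^0 is 1
mask <- NA^(!keep)        # NA  1 NA NA NA

# Multiplying by the mask keeps WK on the chosen row and gives NA elsewhere
CountWk <- mask * WK      # NA  2 NA NA NA
```

The ifelse version states the intent more directly; the mask version is mainly a compact alternative.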

Or using base R:

Data$CountWk <- with(Data, (NA^!(ave(Performance == 1, ID, FUN = cumsum) == 1 &
                                 Performance != 0)) * WK)
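
In the base R version, ave plays the role that group_by plus mutate plays in dplyr: it applies cumsum within each ID and returns a vector as long as the input. A short sketch on the question's data, with running as an illustrative name:

```r
ID <- c("A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C")
Performance <- c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1)

# Running count of 1s, restarted for every ID
running <- ave(Performance == 1, ID, FUN = cumsum)
running   # 0 1 2 0 1 1 0 1 1 2 3
```

Comparing this vector with 1 and combining it with Performance != 0 gives the same per-group condition as in the dplyr answers.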
akrun answered Feb 09 '23 11:02
