Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Flag all rows in a group using data table in R if some rows meet a condition

Tags:

r

data.table

I've got a data table (works much quicker to handle all the records I've got) and I want to flag rows depending on a condition met by some rows for a group. I'll try to illustrate. My data looks like the following:

ID  DATE    EVENT
1   01-Jan-97   A
1   01-Jan-97   K
2   03-Jan-03   A
3   04-Jan-03   K

So there is an ID, a date and an event. The ultimate objective is to delete the rows that have EVENT=A if there is already an EVENT=K for the same date. So I thought about setting two flags, one if the group (ID,DATE) has A and other if they have K, so then, if it has both flags=1, I would delete the row containing A cause I wouldn't need it.

So, getting the first two flags it's ok. This is how the code looks like:

mydata.table[EVENT=="K", k.flag:=1, by=.(ID, DATE)] 

I do the same for the A.flag. Then I don't know exactly how to get a third flag to mark that if it has both then I can delete the record with A. So I want a table something like:

ID  DATE    EVENT   K.FLAG  A.FLAG  BOTH
1   01-Jan-97   A      0       1    1
1   01-Jan-97   K      1       0    1
2   03-Jan-03   A      0       1    0
3   04-Jan-03   K      1       0    0

Then I would just delete for both=1 and EVENT=A. I know it's not a very elegant solution but it's the best one I can come up with but I'm having problems with the statement for the third flag. Any suggestions? I assume that it would be like having a CASE statement with the data.table, but I can't figure it out...

Thank you!

like image 527
Fabiola Fernández Avatar asked May 28 '15 15:05

Fabiola Fernández


1 Answers

dt[, flag := all(c('A', 'K') %in% EVENT), by = DATE]
dt
#   ID      DATE EVENT  flag
#1:  1 01-Jan-97     A  TRUE
#2:  1 01-Jan-97     K  TRUE
#3:  2 03-Jan-03     A FALSE
#4:  3 04-Jan-03     K FALSE
like image 182
eddi Avatar answered Sep 28 '22 23:09

eddi