I've got a data table (works much quicker to handle all the records I've got) and I want to flag rows depending on a condition met by some rows for a group. I'll try to illustrate. My data looks like the following:
ID DATE EVENT
1 01-Jan-97 A
1 01-Jan-97 K
2 03-Jan-03 A
3 04-Jan-03 K
So there is an ID, a date and an event. The ultimate objective is to delete the rows that have EVENT=A if there is already an EVENT=K for the same date. So I thought about setting two flags, one if the group (ID,DATE) has A and other if they have K, so then, if it has both flags=1, I would delete the row containing A cause I wouldn't need it.
So, getting the first two flags it's ok. This is how the code looks like:
mydata.table[EVENT=="K", k.flag:=1, by=.(ID, DATE)]
I do the same for the A.flag. Then I don't know exactly how to get a third flag to mark that if it has both then I can delete the record with A. So I want a table something like:
ID DATE EVENT K.FLAG A.FLAG BOTH
1 01-Jan-97 A 0 1 1
1 01-Jan-97 K 1 0 1
2 03-Jan-03 A 0 1 0
3 04-Jan-03 K 1 0 0
Then I would just delete for both=1 and EVENT=A. I know it's not a very elegant solution but it's the best one I can come up with but I'm having problems with the statement for the third flag. Any suggestions? I assume that it would be like having a CASE statement with the data.table, but I can't figure it out...
Thank you!
dt[, flag := all(c('A', 'K') %in% EVENT), by = DATE]
dt
# ID DATE EVENT flag
#1: 1 01-Jan-97 A TRUE
#2: 1 01-Jan-97 K TRUE
#3: 2 03-Jan-03 A FALSE
#4: 3 04-Jan-03 K FALSE
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With