Remove rows conditionally from a data.table in R

Question

I have a data.table with fields {id, menuitem, amount}.

This is transaction data - so, ids are unique, but menuitem repeats. Now, I want to remove all entries where menuitem == 'coffee'.

Also, want to delete all rows where amount <= 0;

What is the right way to do this in data.table?

I can use data$menuitem!='coffee' and then index int into data[] - but that is not necessarily efficient and does not take advantage of data.table.

Any pointers in the right direction are appreciated.

Michele · Accepted Answer

In this scenario it is not so different than data.frame

data <- data[ menuitem != 'coffee' | amount > 0]

Delete/add row by reference it is to be implemented. You find more info in this question

Regarding speed:

1 You can benefit from keys by doing something like:

setkey(data, menuitem) data <- data[!"coffee"]

which will be faster than data <- data[ menuitem != 'coffee']. However to apply the same filters you asked in the question you'll need a rolling join (I've finished my lunch break I can add something later :-)).

2 Even without key data.table is much faster for relatively big table (similar speed for handful amount of rows)

dt<-data.table(id=sample(letters,1000000,T),var=rnorm(1000000)) df<-data.frame(id=sample(letters,1000000,T),var=rnorm(1000000)) library(microbenchmark) > microbenchmark(dt[ id == "a"], df[ df$id == "a",]) Unit: milliseconds                expr       min        lq    median        uq       max neval       dt[id == "a"]  24.42193  25.74296  26.00996  26.35778  27.36355   100  df[df$id == "a", ] 138.17500 146.46729 147.38646 149.06766 154.10051   100

song.xiao · Answer

try this:

data <- data[ !(menuitem == 'coffee' | amount <= 0),]

Generally:

dt <- data.table(a=c(1,1,1,2,2,2,3,3,3),b=c(4,2,3,1,5,3,4,7,6)) dt #>    a b #> 1: 1 4 #> 2: 1 2 #> 3: 1 3 #> 4: 2 1 #> 5: 2 5 #> 6: 2 3 #> 7: 3 4 #> 8: 3 7 #> 9: 3 6 dt[a!=1,] #>    a b #> 1: 2 1 #> 2: 2 5 #> 3: 2 3 #> 4: 3 4 #> 5: 3 7 #> 6: 3 6

Remove rows conditionally from a data.table in R

Tags:

r

data.table

Gopalakrishna Palem

2 Answers

Michele

song.xiao

Recent Activity

Donate For Us

Remove rows conditionally from a data.table in R

Tags:

r

data.table

Gopalakrishna Palem

2 Answers

Michele

song.xiao

Related questions

Recent Activity

Donate For Us