
Fast dropping rows in data.table [duplicate]

Tags: r, data.table

R's data.table package offers fast subsetting of values based on keys.

So, for example:

library(data.table)
set.seed(1342)

df1 <- data.table(group = gl(10, 10, labels = letters[1:10]),
                  value = sample(1:100))
setkey(df1, group)

df1["a"]

will return all rows in df1 where group == "a".

What if I want all rows in df1 where group != "a"? Is there a concise syntax for that using data.table?

asked Dec 31 '25 23:12 by Erik Iverson


1 Answer

I think you answered your own question:

> nrow(df1[group != "a"])
[1] 90
> table(df1[group != "a", group])

 a  b  c  d  e  f  g  h  i  j 
 0 10 10 10 10 10 10 10 10 10 

Seems pretty concise to me?

EDIT FROM MATTHEW: As per the comments, this is a vector scan. There is a not-join idiom here and here, and feature request #1384 to make it easier.

EDIT: feature request #1384 is implemented in data.table 1.8.3

df1[!'a']

# and to avoid the character-to-factor coercion warning in this example (where
# the key column happens to be a factor) :
df1[!J(factor('a'))]
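
For comparison, here is a minimal sketch that puts the vector scan and the not-join side by side. It assumes a reasonably recent data.table (the on= argument is newer than 1.8.3), and it stores group as a character column rather than a factor, which is my own simplification to sidestep the coercion warning mentioned above. The names dt, scan_rows, notjoin_rows and on_rows are just illustrative.

```r
library(data.table)

set.seed(1342)

# Same shape as df1, but group is character instead of factor (assumption).
dt <- data.table(group = rep(letters[1:10], each = 10),
                 value = sample(1:100))
setkey(dt, group)

# Vector scan: evaluates group != "a" against every row.
scan_rows <- dt[group != "a"]

# Not-join on the key: drops the rows that a join on "a" would match.
notjoin_rows <- dt[!"a"]

# Same not-join, naming the column explicitly with on= instead of
# relying on the key.
on_rows <- dt[!.("a"), on = "group"]

nrow(scan_rows)
nrow(notjoin_rows)
nrow(on_rows)
```

All three should select the same rows; the difference is only in how the rows to drop are located.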
answered Jan 03 '26 20:01 by Chase


