Speed decrease in subsetting `data.table` when adding a bracket

Question

I recently noticed in some old code that I had been including an extra pair of square brackets when subsetting a data.table and applying a function repeatedly (in my case, calculating correlation matrices). That is:

# Slow way
rcorr(DT[subgroup][, !'Group', with=FALSE])

# Faster way
rcorr(DT[subgroup, !'Group', with=FALSE])

(The only difference is what follows subgroup: `][,` in the slow version versus `,` in the fast one.) Just out of curiosity, why does this occur? With the extra brackets, does data.table have to perform some extra computations?

Rich Scriven · Accepted Answer

Here's a simple interpretation:

# Slow way
rcorr(DT[subgroup][, !'Group'])

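The second set of brackets is a second, separate subsetting operation. It does not act on DT itself but on the result of the first: DT[subgroup] creates a new data.table from DT (the selected rows, with every column), and then [, !'Group'] operates on that intermediate table, creating yet another new data.table. Two subsets and two allocations instead of one, hence the decline in speed.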

# Faster way
rcorr(DT[subgroup, !'Group'])

This way operates only on DT, all in one go: rows and columns are selected in a single call, so no intermediate data.table is built.
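To see the difference for yourself, here is a minimal sketch. The data, the column names x, y, z, and the logical vector standing in for subgroup are all made up for illustration, and rcorr is left out so that only the subsetting is timed. It checks that both forms return the same table and then times each over repeated calls.

```r
library(data.table)

# Hypothetical data: a 'Group' column plus numeric columns (names are illustrative).
set.seed(1)
n <- 1e5
DT <- data.table(
  Group = sample(c("a", "b"), n, replace = TRUE),
  x = rnorm(n), y = rnorm(n), z = rnorm(n)
)

# Assumption: 'subgroup' is a logical row filter, standing in for the question's variable.
subgroup <- DT$Group == "a"

# Chained form: two calls to `[`, with an intermediate table in between.
chained <- DT[subgroup][, !"Group", with = FALSE]

# Single form: one call to `[` selecting rows and columns together.
single <- DT[subgroup, !"Group", with = FALSE]

# Both forms give the same result; only the amount of work differs.
stopifnot(isTRUE(all.equal(chained, single)))

# Rough timing over repeated calls. Absolute numbers depend on the machine,
# the data, and the data.table version.
system.time(for (i in 1:50) DT[subgroup][, !"Group", with = FALSE])
system.time(for (i in 1:50) DT[subgroup, !"Group", with = FALSE])
```

The gap should grow with the number of rows selected and the number of columns in DT, since the chained form copies all of them into the intermediate table before dropping Group.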
