How do I exclude columns from a data.table?

Tags:

r

data.table

I have a data.table, and want to exclude some set of columns. For example,

library(data.table)
dt <- data.table(a = 1:2, b = 2:3, c = 3:4, d = 4:5)
dt[ , .(b, c)]

This gives me the second and third columns, b and c. How do I instead EXCLUDE columns b and c? Coming from the data.frame world, I would expect something like the following:

dt[ , -.(b, c)]

or, maybe

dt[ , !.(b, c)]

But neither of these works. I know I can use

dt[ , -c(2:3), with = FALSE]

but this just (as I understand it) casts the data.table as a data.frame and then uses the standard operations. I would like to avoid this, since it a) is kind of cheating, and b) gives up the speed boosts available in data.table. I reviewed the data.table FAQ and the vignette, and cannot seem to find anything.

(I know this is all very simplistic, and I could just select the other two columns. However, this is a microcosm of a much, MUCH bigger data.table I am working with.)


asked May 13 '16 12:05

lukehawk

5 Answers

Also, if you do not want to change the data.table but only return all columns except some, you can do:

dt[,.SD, .SDcols = !c('b', 'c')]

which returns the required result of:

   a d
1: 1 4
2: 2 5

while dt remains unchanged:

> dt
   a b c d
1: 1 2 3 4
2: 2 3 4 5
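
In case the columns to drop are held in a variable rather than typed inline, here is a minimal sketch of the same idea. The names `drop_cols` and `kept` are made up for illustration, and it assumes a reasonably recent data.table version in which `.SDcols` accepts a negated character vector:

```r
library(data.table)

dt <- data.table(a = 1:2, b = 2:3, c = 3:4, d = 4:5)

# Hypothetical variable: the columns to leave out
drop_cols <- c("b", "c")

# .SD contains the columns chosen by .SDcols; the leading ! keeps everything else
kept <- dt[, .SD, .SDcols = !drop_cols]

names(kept)  # columns of the result
names(dt)    # dt itself is not modified
```

Because the result is a new data.table, `dt` can still be used afterwards with all four columns.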

answered Oct 12 '22 00:10

ira

We can use setdiff

dt[, setdiff(names(dt), c("b", "c")), with = FALSE]

or we can assign to NULL (as in the other answer) but in a single step

dt[, c("b", "c") := NULL][]
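
The two forms above differ in one way that is easy to miss: the `setdiff` version returns a new object, while `:= NULL` deletes the columns from `dt` itself. A small sketch of keeping the original intact with `copy()` (the names `kept` and `trimmed` are only illustrative):

```r
library(data.table)

dt <- data.table(a = 1:2, b = 2:3, c = 3:4, d = 4:5)

# setdiff() builds the names to keep; with = FALSE treats them as column names.
# This returns a new data.table and leaves dt alone.
kept <- dt[, setdiff(names(dt), c("b", "c")), with = FALSE]

# := NULL removes columns by reference, so operate on a copy() to preserve dt
trimmed <- copy(dt)[, c("b", "c") := NULL]

names(kept)
names(trimmed)
names(dt)
```

Without the `copy()`, the second statement would remove `b` and `c` from `dt` as well.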

answered Oct 11 '22 23:10

akrun

You can do:

  dt[ , b := NULL][ , c := NULL]

or you can use a character vector of the columns to be removed:

xx <- c("b", "c")  # vector of columns you DON'T want

# subset
dt <- dt[, !xx, with = FALSE]

answered Oct 11 '22 23:10

rafa.pereira

You can always just do:

dt[ , -c("b", "c")]

although this uses the data.frame syntax and has the problems you describe; in particular, it seems to be much slower on large data sets.
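
For what it is worth, here is a quick way to check that the negated forms select the same columns on the toy table. This is only a sketch and assumes a recent data.table version in which a negated character vector is accepted directly in `j`. It says nothing about speed, and the variable names are made up:

```r
library(data.table)

dt <- data.table(a = 1:2, b = 2:3, c = 3:4, d = 4:5)

# Three non-mutating ways of leaving out b and c
res_minus <- dt[, -c("b", "c")]
res_bang  <- dt[, !c("b", "c")]
res_sd    <- dt[, .SD, .SDcols = !c("b", "c")]

# Compare the selected column names
identical(names(res_minus), names(res_bang))
identical(names(res_minus), names(res_sd))
```

If the speed difference matters for your data, it would need to be timed on a table of realistic size.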

answered Oct 12 '22 01:10

cach dies

Another way using set:

set(dt,, c("b", "c"), NULL)
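
Like `:=`, `set()` works by reference, so `dt` itself loses the columns. A minimal sketch with named arguments, plus a guard for names that may not exist in the table (the `drop_cols` and `present` names and the `"z"` column are invented for the example):

```r
library(data.table)

dt <- data.table(a = 1:2, b = 2:3, c = 3:4, d = 4:5)

# Hypothetical drop list; "z" is deliberately not a column of dt
drop_cols <- c("b", "c", "z")

# Keep only the names that are actually present, so set() never sees "z"
present <- intersect(drop_cols, names(dt))

# Assigning NULL to the selected columns deletes them in place
if (length(present) > 0L) {
  set(dt, j = present, value = NULL)
}

names(dt)
```

The `intersect()` step is a design choice rather than a requirement: it avoids depending on how a missing column name is handled.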

answered Oct 12 '22 01:10

Deb
