How do you delete a column by name in data.table?

Any of the following will remove column foo from the data.table df3:

# Method 1 (and preferred as it takes 0.00s even on a 20GB data.table)
df3[,foo:=NULL]

df3[, c("foo","bar"):=NULL]  # remove two columns

myVar = "foo"
df3[, (myVar):=NULL]   # lookup myVar contents

# Method 2a -- A safe idiom for excluding (possibly multiple)
# columns matching a regex
df3[, grep("^foo$", colnames(df3)):=NULL]

# Method 2b -- An alternative to 2a, also "safe" in the sense described below
df3[, which(grepl("^foo$", colnames(df3))):=NULL]

data.table also supports the following syntax:

## Method 3 (returns a copy without "foo"; you could then assign the result back to df3)
df3[, !"foo"]

though if you were actually wanting to remove column "foo" from df3 (as opposed to just printing a view of df3 minus column "foo") you'd really want to use Method 1 instead.
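A minimal sketch of the difference between Method 3 and Method 1, assuming a recent data.table version and a small made-up table (the column values are hypothetical):

```r
library(data.table)

# Hypothetical two-column table for illustration
df3 <- data.table(foo = 1:3, bar = 4:6)

# Method 3 returns a new data.table; df3 itself is untouched
without_foo <- df3[, !"foo"]
names_after_method3 <- names(df3)

# Method 1 removes the column from df3 by reference
df3[, foo := NULL]
names_after_method1 <- names(df3)
```

After Method 3 only without_foo lacks the column; after Method 1 df3 itself has lost it.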

(Do note that if you use a method relying on grep() or grepl(), you need to set pattern="^foo$" rather than "foo", if you don't want columns with names like "fool" and "buffoon" (i.e. those containing foo as a substring) to also be matched and removed.)
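To illustrate the anchoring point, here is a small sketch on a made-up table whose column names (fool, buffoon) are chosen only to show the substring problem:

```r
library(data.table)

# Hypothetical table with several names containing "foo"
df3 <- data.table(foo = 1:3, fool = 4:6, buffoon = 7:9, bar = 10:12)

# Unanchored pattern: matches every name containing "foo"
loose <- grep("foo", names(df3), value = TRUE)

# Anchored pattern: matches only the column named exactly "foo"
exact <- grep("^foo$", names(df3), value = TRUE)

# Method 2a with the anchored pattern removes just that one column
df3[, grep("^foo$", names(df3)) := NULL]
remaining <- names(df3)
```

With the unanchored pattern, all three of foo, fool and buffoon would have been dropped.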

Less safe options, fine for interactive use:

The idioms below will also work -- if df3 contains a column matching "foo" -- but will fail in a probably unexpected way if it does not. If, for instance, you use any of them to search for the non-existent column "bar", you'll end up with a zero-row data.table.

As a consequence, they are really best suited for interactive use where one might, e.g., want to display a data.table minus any columns with names containing the substring "foo". For programming purposes (or if you are wanting to actually remove the column(s) from df3 rather than from a copy of it), Methods 1, 2a, and 2b are really the best options.

# Method 4:
df3[, .SD, .SDcols = !patterns("^foo$")]

Lastly there are approaches using with=FALSE, though data.table is gradually moving away from using this argument so it's now discouraged where you can avoid it; showing here so you know the option exists in case you really do need it:

# Method 5a (like Method 3)
df3[, !"foo", with=FALSE]
# Method 5b (like Method 4)
df3[, !grep("^foo$", names(df3)), with=FALSE]
# Method 5c (another like Method 4)
df3[, !grepl("^foo$", names(df3)), with=FALSE]

You can also use set for this, which avoids the overhead of [.data.table in loops:

dt <- data.table(a=letters, b=LETTERS, c=seq(26), d=letters, e=letters)
set(dt, j=c(1L,3L,5L), value=NULL)
dt[1:5]
#    b d
# 1: A a
# 2: B b
# 3: C c
# 4: D d
# 5: E e

If you want to do it by column name, which(colnames(dt) %in% c("a","c","e")) should work for j.
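A short sketch of the by-name variant. It shows the which() lookup suggested above and, as an alternative, passing the names to j directly, which set() also accepts as far as I know from its documentation. The drop_cols vector is a name introduced here for illustration.

```r
library(data.table)

# Names of the columns to drop (hypothetical helper vector)
drop_cols <- c("a", "c", "e")

# Variant 1: translate names to positions, then pass them to set()
dt1 <- data.table(a = letters, b = LETTERS, c = seq(26), d = letters, e = letters)
positions <- which(colnames(dt1) %in% drop_cols)
set(dt1, j = positions, value = NULL)

# Variant 2: pass the character vector of names straight to j
dt2 <- data.table(a = letters, b = LETTERS, c = seq(26), d = letters, e = letters)
set(dt2, j = drop_cols, value = NULL)
```

Both variants modify the table in place and leave only columns b and d.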

I simply do it in the data frame kind of way:

DT$col = NULL

Works fast and as far as I could see doesn't cause any problems.

UPDATE: this is not the best method if your DT is very large, because the $<- operator copies the object. It is better to use:

DT[, col:=NULL]
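One way to see the copy is to keep a second name bound to the same table. The sketch below uses made-up data and assumes current data.table behaviour, where $<- works on a copy and := modifies in place:

```r
library(data.table)

# Case 1: $<- builds a modified copy and rebinds DT1 to it
DT1 <- data.table(id = 1:3, col = c("x", "y", "z"))
alias1 <- DT1            # second name for the same object (no copy yet)
DT1$col <- NULL
alias1_names <- names(alias1)   # the original object still has "col"

# Case 2: := removes the column from the shared object itself
DT2 <- data.table(id = 1:3, col = c("x", "y", "z"))
alias2 <- DT2
DT2[, col := NULL]
alias2_names <- names(alias2)   # "col" is gone here too
```

The second case is also a reminder that := affects every name bound to the same data.table; use copy() first if that is not wanted.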

A very simple option, in case you have many individual columns to delete from a data.table and want to avoid typing out all the column names (use with care):

dt <- dt[, -c(1,4,6,17,83,104)]

This will remove columns based on column number instead.

It's obviously not as efficient, because it creates a copy instead of using data.table's by-reference removal, but if you're working with fewer than, say, 500,000 rows it works fine.

Suppose your DT has columns col1, col2, col3, col4, col5, coln.

To delete a subset of them:

vx <- as.character(bquote(c(col1, col2, col3, coln)))[-1]
DT[, paste0(vx):=NULL]
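A runnable sketch of the same idea on a made-up one-row table. It also uses the (vx) form from Method 1 in place of paste0(vx); both simply hand := a character vector of names.

```r
library(data.table)

# Hypothetical table with the column names from the example above
DT <- data.table(col1 = 1, col2 = 2, col3 = 3, col4 = 4, col5 = 5, coln = 6)

# bquote() captures the unevaluated call c(col1, ...); as.character()
# turns it into c("c", "col1", ...), and [-1] drops the leading "c"
vx <- as.character(bquote(c(col1, col2, col3, coln)))[-1]

# Parentheses tell data.table to look up the contents of vx
DT[, (vx) := NULL]
remaining <- names(DT)
```

This saves typing quotes around each name; a plain character vector such as c("col1", "col2") works the same way.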
