
Drop data frame columns by name

Tags: dataframe, r, r-faq

I have a number of columns that I would like to remove from a data frame. I know that we can delete them individually using something like:

df$x <- NULL

But I was hoping to do this with fewer commands.

Also, I know that I could drop columns using integer indexing like this:

df <- df[ -c(1, 3:6, 12) ]

But I am concerned that the relative position of my variables may change.

Given how powerful R is, I figured there might be a better way than dropping each column one by one.

Btibert3 asked Jan 05 '11 14:01




3 Answers

You can use a simple character vector of names:

DF <- data.frame(
  x=1:10,
  y=10:1,
  z=rep(5,10),
  a=11:20
)
drops <- c("x","z")
DF[ , !(names(DF) %in% drops)]

Or, alternatively, you can make a vector of the columns to keep and refer to them by name:

keeps <- c("y", "a")
DF[keeps]

EDIT: For those still not acquainted with the drop argument of the indexing function, if you want to keep a single column as a data frame, you do:

keeps <- "y"
DF[ , keeps, drop = FALSE]

drop=TRUE (or not mentioning it) will drop unnecessary dimensions, and hence return a vector with the values of column y.
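
As a minimal sketch of that behaviour (the three-element `drops` vector and the result names `as_vector`, `as_frame` and `as_frame2` are made up for illustration), the same caveat applies to the `%in%` form when only one column survives:

```r
# Example data frame from the answer above.
DF <- data.frame(x = 1:10, y = 10:1, z = rep(5, 10), a = 11:20)

# Dropping three of the four columns leaves a single column, y.
drops <- c("x", "z", "a")

# Matrix-style indexing (with a comma) collapses one column to a plain vector.
as_vector <- DF[, !(names(DF) %in% drops)]

# Adding drop = FALSE keeps the result as a one-column data frame.
as_frame <- DF[, !(names(DF) %in% drops), drop = FALSE]

# List-style indexing (no comma) returns a data frame without needing drop.
as_frame2 <- DF[!(names(DF) %in% drops)]

is.data.frame(as_vector)   # FALSE
is.data.frame(as_frame)    # TRUE
is.data.frame(as_frame2)   # TRUE
```

If the code that follows expects a data frame, the no-comma form or an explicit `drop = FALSE` is probably the safer habit.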

Joris Meys answered Oct 28 '22 12:10


There's also the subset command, useful if you know which columns you want:

df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
df <- subset(df, select = c(a, c))

UPDATED after a comment by @hadley: to drop columns a and c, you could do:

df <- subset(df, select = -c(a, c))
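
A small sketch of both calls side by side, each starting from a fresh copy of the data (run one after the other on the same `df`, the two lines above would leave no columns at all). The `drops` vector and the result names are hypothetical, and the `setdiff()` line is a plain-indexing alternative rather than part of `subset()`:

```r
# Fresh copy of the example data so each call sees all three columns.
df <- data.frame(a = 1:10, b = 2:11, c = 3:12)

kept    <- subset(df, select = c(a, c))    # keep a and c
dropped <- subset(df, select = -c(a, c))   # drop a and c, leaving only b

# The minus sign in `select` works on unquoted names, not on a character
# vector. For names held in a character vector, indexing with setdiff()
# gives the same result.
drops    <- c("a", "c")
dropped2 <- df[setdiff(names(df), drops)]

names(kept)       # "a" "c"
names(dropped)    # "b"
names(dropped2)   # "b"
```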

Prasad Chalasani answered Oct 28 '22 12:10


within(df, rm(x))

is probably easiest, or for multiple variables:

within(df, rm(x, y))

Or if you're dealing with data.tables (per How do you delete a column by name in data.table?):

dt[, x := NULL]   # Deletes column x by reference instantly.

dt[, !"x"]   # Selects all but x into a new data.table.

Or, for multiple variables:

dt[, c("x","y") := NULL]

dt[, !c("x", "y")]
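
One point about the base R `within()` form at the start of this answer: unlike data.table's `:=`, it returns a modified copy, so the result has to be assigned to take effect. A minimal sketch, using a made-up three-column data frame and a hypothetical result name `trimmed`:

```r
# Small example data frame (illustrative only).
df <- data.frame(x = 1:3, y = 4:6, z = 7:9)

# within() evaluates rm() against a copy of the columns and returns a new
# data frame; df itself is untouched until the result is assigned.
trimmed <- within(df, rm(x, y))

names(df)        # "x" "y" "z"
names(trimmed)   # "z"
```

To drop the columns from `df` itself, assign the result back: `df <- within(df, rm(x, y))`.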

Max Ghenis answered Oct 28 '22 12:10