Deleting columns of a data.table inside a function

I have the following example:

library(data.table)

irisDT <- as.data.table(iris)

mod <- function(dat) {
  dat[, index:=(1:nrow(dat))]
  setkey(dat, index)

  dat <- dat[2:10]

  dat[, index:=NULL]
  invisible()
}

mod(irisDT)
names(irisDT) # it contains index

To my surprise, the index column still exists after calling the mod() function. This is not the case when I delete the line dat <- dat[2:10]. I guess that, since rows cannot be deleted by reference yet, another data.table is created. However, I would like to delete the index column in the original data.table.

Martijn Tennekes asked Jun 20 '12 11:06


1 Answer

Great question. A data.table is copied on change by <-, in the usual R way.

It isn't copied on change by := or by the set* functions (setkey, setnames, setattr) provided by the data.table package.

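So it's not anything special about the data.table object itself that decides whether a copy is made, and it's passed as an argument to functions in exactly the same way as a data.frame. It's what you do to it inside the function that counts. The <- operator copies on change, and that's no different when used on a data.table. The := operator, on the other hand, assigns by reference. In your example, dat <- dat[2:10] binds the local name dat to a new data.table, so the dat[, index:=NULL] that follows removes the column from that new object and not from irisDT.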
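To make the difference concrete, here is a minimal sketch. The table DT and the function names add_by_ref and rebind_then_modify are hypothetical and chosen only for illustration; they are not from the question.

```r
library(data.table)

DT <- data.table(a = 1:3)

# Hypothetical helper: := adds the column by reference,
# so the caller's object is modified.
add_by_ref <- function(dat) {
  dat[, b := a * 2L]
  invisible(NULL)
}

# Hypothetical helper: the row subset creates a new data.table and
# <- binds the local name `dat` to it, so the later := only touches
# that local object.
rebind_then_modify <- function(dat) {
  dat <- dat[2:3]
  dat[, c := 0L]
  invisible(NULL)
}

add_by_ref(DT)
names(DT)   # "a" "b"

rebind_then_modify(DT)
names(DT)   # still "a" "b": column c was added to the local copy only
```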

As you say, there is no way (yet) to delete rows by reference, so until then you'll need to use standard R syntax to assign the copy back to the symbol in calling scope.
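One way to apply this to the example in the question is sketched below. The name mod2 is hypothetical, and taking a copy() at the top is an assumption made here so that the function has no side effects on its argument; the point is only that the function returns its result and the caller assigns it back.

```r
library(data.table)

irisDT <- as.data.table(iris)

# Hypothetical rewrite of mod(): return the result rather than
# relying on by-reference side effects.
mod2 <- function(dat) {
  dat <- copy(dat)              # explicit copy: the caller's table is left untouched
  dat[, index := seq_len(.N)]   # temporary index column on the copy
  setkey(dat, index)
  dat <- dat[2:10]              # row subset creates a new data.table
  dat[, index := NULL]          # drop the temporary column from the result
  dat
}

# Assign the returned copy back to the symbol in the calling scope.
irisDT <- mod2(irisDT)
names(irisDT)   # no index column
nrow(irisDT)    # 9
```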

As it happens, there was a slide on this at last night's LondonR talk which is now on the homepage under the presentation section (see slide with title copy()).

Matt Dowle answered Oct 29 '22 08:10