
Assign value to specific data.table columns and rows

I'm still getting to grips with this great package. Could anyone please explain the reason for this error? Thanks!

library(data.table)

DT <- data.table(id   = LETTERS,
                 var1 = rnorm(26),
                 var2 = rnorm(26))

> DT[2, list(var1, var2)]
            var1          var2
1: -0.8628479332 -0.2367492928
> DT[2, c(var1, var2)]
[1] -0.8628479332 -0.2367492928
>
> DT[2, list(var1, var2)] <- DT[8, list(var1, var2)]
Error in `[<-.data.table`(`*tmp*`, 2, list(var1, var2), value = list(var1 = -0.394006912428776,  :
  object 'var1' not found
> DT[2, c(var1, var2)] <- DT[8, c(var1, var2)]
Error in `[<-.data.table`(`*tmp*`, 2, c(var1, var2), value = c(-0.394006912428776,  :
  object 'var1' not found
Michele asked May 12 '13 10:05



1 Answer

First, prefer := over [<- for efficiency. The [<- method is provided mostly for backward compatibility. So I'll first show how to use := to get what you're after. := assigns by reference: it updates the data.table in place without copying the data, which makes it extremely fast.

require(data.table)
DT <- data.table(x = 1:5, y = 6:10, z = 11:15)

Suppose you want to set the 2nd row of "y" to the value in the 5th row of "y":

DT[2, y := DT[5, y]]  

or equivalently

DT[2, `:=`(y = DT[5, y])] 

Suppose you want to set the 2nd row of both "y" and "z" to the corresponding values in row 5:

DT[2, c("y", "z") := as.list(DT[5, c(y, z)])] 

or equivalently

DT[2, `:=`(y = DT[5, y], z = DT[5, z])] 
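Applied to the table from the question, the same pattern might look like the sketch below. The `set.seed()` call is an assumption added only to make the run repeatable; it is not part of the original question.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the random columns are repeatable
DT <- data.table(id   = LETTERS,
                 var1 = rnorm(26),
                 var2 = rnorm(26))

# Copy var1 and var2 from row 8 into row 2, by reference.
# The LHS is a character vector of column names; the RHS is a list.
DT[2, c("var1", "var2") := as.list(DT[8, c(var1, var2)])]

# Rows 2 and 8 should now hold the same var1 and var2 values.
print(DT[c(2, 8)])
```

Note that `c(var1, var2)` on the right-hand side is evaluated inside the data.table, so the bare column names are found there.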

Now, just to show how to assign using [<- (although it is not recommended), it can be done as follows:

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
DT[1, c("y", "z")] <- as.list(DT[5, c(y, z)])

or equivalently, you can pass the column numbers instead:

DT[1, 2:3] <- as.list(DT[5, c(y, z)]) 

Hope this helps.


Edit 1

As to why you get the error:

First, for `[<-.data.table` the RHS has to be a list if more than one column is being assigned to.

Second, the j argument on the left of <- is not evaluated within the environment of your data.table, so R has to find the values for j somewhere else. Since you provide var1 and var2 without quotes (quotes would make them a character vector), they are treated as variables. The columns of your data.table are not visible as variables here (unlike on the RHS of <-, where j is evaluated within the data.table), so R looks for var1 and var2 in the calling environment, which is the global environment in this case. It doesn't find them there, and so you get the error. For example, try this:

y <- "y"
z <- "z"
# And now try your second case:
DT[2, c(y, z)] <- as.list(DT[5, c(y, z)])
# the left side takes its values from the assignments you made above
# the right side y and z are evaluated within the environment of your data.table,
# so it sees the columns y and z as variables and their values are picked accordingly

Third, `[<-.data.table` accepts only atomic (vector) types for the j argument. So your first assignment, DT[2, list(var1, var2)] <- DT[8, list(var1, var2)], will still give an error even if you define the variables as above, that is:

y <- "y"
z <- "z"
DT[2, list(y, z)] <- as.list(DT[5, c(y, z)])
# Error in `[<-.data.table`(`*tmp*`, 2, list(y, z), value = list(10L, 15L)) :
#   j must be atomic vector, see ?is.atomic
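Putting the three points together for the table from the question, a minimal sketch could look like this. The `set.seed()` call and the `err` variable are assumptions introduced for illustration; the failing call is wrapped in `tryCatch()` only so the script keeps running.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed, not part of the original question
DT <- data.table(id = LETTERS, var1 = rnorm(26), var2 = rnorm(26))

# Bare names in j on the left of `<-` are looked up outside DT, so this fails
# as long as no variables called var1 and var2 exist in the calling environment.
err <- tryCatch({
  DT[2, c(var1, var2)] <- as.list(DT[8, c(var1, var2)])
  NA_character_
}, error = function(e) conditionMessage(e))
print(err)

# Quoted names make j an atomic character vector, and the RHS is a list,
# so the assignment goes through (at the cost of a copy).
DT[2, c("var1", "var2")] <- as.list(DT[8, c(var1, var2)])
print(DT[c(2, 8)])
```

The `:=` form shown earlier remains the preferred way to do this; the `[<-` version is here only to show which pieces of the original call had to change.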

hope this helps.


Edit 2

Just to illustrate that a copy of your data.table is made when you use [<- but not when you use :=:

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
tracemem(DT)
# [1] "<0x7fbefb89b580>"

DT[1, c("y", "z") := list(100L, 110L)]
tracemem(DT)
# [1] "<0x7fbefb89b580>"

DT[2, c("y", "z")] <- list(200L, 201L)
# tracemem[0x7fbefacc4fa0 -> 0x7fbefd297838]:
# copied, inefficient
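Another way to see the difference, sketched here with a second name bound to the same table, is to check whether a change made through one name is visible through the other. `DT2`, `same_after_ref` and `same_after_copy` are hypothetical names used only for this illustration; `address()` is the helper exported by data.table.

```r
library(data.table)

DT  <- data.table(x = 1:5, y = 6:10, z = 11:15)
DT2 <- DT   # a second name for the same object, not a copy

# := updates in place, so the change is also visible through DT2.
DT[1, c("y", "z") := list(100L, 110L)]
print(DT2[1])
same_after_ref <- identical(address(DT), address(DT2))
print(same_after_ref)

# `[<-` is expected to rebind DT to a modified copy, as the tracemem output
# above indicates, so DT and DT2 should no longer share an address.
DT[2, c("y", "z")] <- list(200L, 201L)
same_after_copy <- identical(address(DT), address(DT2))
print(same_after_copy)
```

If you do want an independent table before modifying by reference, `copy(DT)` is the explicit way to get one.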
Arun answered Oct 19 '22 18:10