I would like to assign a matrix to a multi-column subset of a data.table
but the matrix ends up getting treated as a column vector. For example,
library(data.table)
dt1 <- data.table(a1=rnorm(5), a2=rnorm(5), a3=rnorm(5))
m1 <- matrix(rnorm(10), ncol=2)
dt1[,c("a1","a2")] <- m1
Warning messages:
1: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
2 column matrix RHS of := will be treated as one vector
2: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
Supplied 10 items to be assigned to 5 items of column 'a1' (5 unused)
3: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
2 column matrix RHS of := will be treated as one vector
4: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
Supplied 10 items to be assigned to 5 items of column 'a2' (5 unused)
The problem can be solved by first converting m1 to a data.table object, but I'm curious what the reasoning is for this behavior. The above syntax would work if dt1 were a data.frame; what is the architectural rationale for not having it work with data.table?
dt1[,c("a1","a2")] <- as.data.table(m1)
gives a simple solution but does make a copy.
@Simon O'Hanlon provides a solution in the data.table way:
dt1[ , `:=`( a1 = m1[,1] , a2 = m1[,2] ) ]
and, in my opinion, an even better data.table solution is provided by @Frank:
dt1[,c("a1","a2") := as.data.table(m1)]
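As a quick check of the by-reference behavior, here is a minimal sketch, assuming the data.table package is installed. The name `addr_before` is introduced here for illustration only. It applies the `:=` form above and compares the object's address, as reported by data.table's `address()`, before and after the assignment.

```r
library(data.table)

set.seed(1)  # reproducible illustration data
dt1 <- data.table(a1 = rnorm(5), a2 = rnorm(5), a3 = rnorm(5))
m1  <- matrix(rnorm(10), ncol = 2)

addr_before <- address(dt1)  # memory address of the data.table object

# as.data.table() turns the matrix into a list of columns, which := then
# assigns column by column into the existing table.
dt1[, c("a1", "a2") := as.data.table(m1)]

all.equal(dt1$a1, m1[, 1])            # a1 now holds the first matrix column
all.equal(dt1$a2, m1[, 2])            # a2 now holds the second matrix column
identical(addr_before, address(dt1))  # same object, updated in place
```

The conversion of m1 still creates new column vectors, but the untouched column a3 and the table itself are not duplicated.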
A data.frame is not a matrix, nor is a data.table a matrix. Both data.frame and data.table objects are lists. Lists and matrices are stored very differently; although the indexing syntax can look similar, it is handled differently under the hood.
[<-.data.frame splits a matrix-valued value into a list with one element for each column. (The line is value <- split(value, col(value)).)
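The effect of that splitting step can be seen in base R alone. The following is a small sketch with illustrative data (the objects `cols` and `df1` are names made up for this example), not a reproduction of the internals of `[<-.data.frame`.

```r
# A small matrix standing in for the "value" argument (illustrative data).
value <- matrix(1:6, ncol = 2)

# col(value) labels every cell with its column index, so split() groups
# the cells into one vector per column.
cols <- split(value, col(value))
str(cols)

# With the matrix broken into per-column vectors, assignment into a plain
# data.frame works column by column.
df1 <- data.frame(a1 = 0L, a2 = 0L, a3 = 7:9)
df1[, c("a1", "a2")] <- value
df1
```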
Note also that [<-.data.frame will copy the entire data.frame in the process of assigning something to a subset of columns.
data.table attempts to avoid this copying. As such, [<-.data.table should be avoided, since all <- replacement methods in R make copies.
Within [<-.data.table, [<-.data.frame will be called if i is a matrix, but not if only value is.
data.table usually likes you to be explicit in ensuring that the types of data match when assigning. This helps avoid any coercion and related copying.
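One way to be explicit is to check shape and storage type yourself and then assign each matrix column with data.table's `set()`. This is a minimal sketch under the assumption that the matrix columns line up one-to-one with the target columns. The names `target_cols` and `k` are hypothetical and not part of the original answers.

```r
library(data.table)

dt1 <- data.table(a1 = c(1, 2, 3), a2 = c(4, 5, 6), a3 = c(7, 8, 9))
m1  <- matrix(c(10, 20, 30, 40, 50, 60), ncol = 2)

target_cols <- c("a1", "a2")  # hypothetical name: columns to overwrite

# Be explicit about shape and storage type before assigning.
stopifnot(
  ncol(m1) == length(target_cols),
  nrow(m1) == nrow(dt1),
  all(vapply(target_cols, function(nm) typeof(dt1[[nm]]), character(1)) == typeof(m1))
)

# set() assigns one column at a time by reference.
for (k in seq_along(target_cols)) {
  set(dt1, j = target_cols[k], value = m1[, k])
}

dt1
```

The up-front checks fail fast on a mismatch instead of relying on recycling or coercion during the assignment.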
You could perhaps put in a feature request to ensure compatibility, but given that your usage is far outside what is recommended, the package authors might simply ask you to use the data.table conventions and approaches.