 

R: List of indices to binary matrix

Tags: list, r, matrix

Say I have a list of indices, like:

l <- list(c(1,2,3), c(1), c(1,5), c(2, 3, 5))

These specify the non-zero elements in each row of a matrix, like:

(m <- matrix(c(1,1,1,0,0, 1,0,0,0,0, 1,0,0,0,1, 0,1,1,0,1), nrow=4, byrow=TRUE))

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    0    0
[2,]    1    0    0    0    0
[3,]    1    0    0    0    1
[4,]    0    1    1    0    1

What is the fastest way in R to build m from l, given that the matrix is very large, say 50,000 rows and 2,000 columns?

Misconstruction asked Jul 06 '15 12:07



2 Answers

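Try stack() together with sparseMatrix() from the Matrix package:

# One row per non-zero entry: 'values' is the column index, 'ind' the list position (row)
d1 <- stack(setNames(l, seq_along(l)))
library(Matrix)
# 'ind' is a factor, so convert its labels (not its level codes) back to row numbers
m1 <- sparseMatrix(as.numeric(as.character(d1[,2])), d1[,1], x=1)
as.matrix(m1)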
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    1    1    0    0
#[2,]    1    0    0    0    0
#[3,]    1    0    0    0    1
#[4,]    0    1    1    0    1

Or instead of stack, we could use melt

library(reshape2)
d2 <- melt(l)
sparseMatrix(d2[,2], d2[,1],x=1)

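Or using only base R (reusing d1 from the stack() call above)

Un1 <- unlist(l)
m1 <- matrix(0, nrow=length(l), ncol=max(Un1))
# Index with a two-column (row, column) matrix to set all non-zero cells at once
m1[cbind(as.numeric(as.character(d1$ind)), d1$values)] <- 1
m1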
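
As a variation on the approaches above, the row indices can also be generated without stack() or melt(), and passing dims explicitly keeps trailing all-zero columns from being dropped. This is only a minimal sketch, assuming the Matrix package is installed; n_col is a hypothetical name for a column count that is assumed to be known in advance.

```r
library(Matrix)

l <- list(c(1, 2, 3), c(1), c(1, 5), c(2, 3, 5))
n_col <- 5  # assumed known number of columns (hypothetical name)

# Row index for every entry: element k of l contributes lengths(l)[k] entries
i <- rep(seq_along(l), lengths(l))
j <- unlist(l)

# Build the matrix in sparse form with explicit dimensions
m_sparse <- sparseMatrix(i = i, j = j, x = 1, dims = c(length(l), n_col))

# Convert only if a dense matrix is really needed
m_dense <- as.matrix(m_sparse)
```

At the size in the question, a dense numeric matrix holds 5e4 x 2e3 = 1e8 doubles, roughly 800 MB, so staying with m_sparse and skipping the as.matrix() step may be worth considering if the downstream code accepts sparse matrices.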
akrun answered Nov 09 '22 22:11



For me, the following is at least 3 times faster than the suggestions above, on data of the size specified in the question (5e4 x 2e3):

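unlist_l <- unlist(l)
# Pre-allocate a dense matrix of zeros
M <- matrix(0, nrow = length(l), ncol = max(unlist_l))
# Two-column (row, column) index matrix: row k is repeated once per index in l[[k]]
ij <- cbind(rep(seq_along(l), lengths(l)), unlist_l)
M[ij] <- 1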

Performance might depend on data size and degree of sparsity.
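
To check this on your own data, something like the following rough timing sketch could be used. The sizes, the number of non-zero entries per row, and the helper names dense_fill and sparse_fill are all assumptions made for illustration; the dimensions are scaled down from the question so it runs quickly, and it assumes the Matrix package is installed.

```r
library(Matrix)

set.seed(1)
n_row <- 5000    # hypothetical; the question's size would be 5e4
n_col <- 200     # hypothetical; the question's size would be 2e3
n_per_row <- 10  # hypothetical number of non-zero entries per row

# Simulated index list: each row gets n_per_row distinct column indices
l <- replicate(n_row, sample.int(n_col, n_per_row), simplify = FALSE)

# Dense approach: pre-allocate zeros, then assign via a (row, column) index matrix
dense_fill <- function(l, n_col) {
  M <- matrix(0, nrow = length(l), ncol = n_col)
  M[cbind(rep(seq_along(l), lengths(l)), unlist(l))] <- 1
  M
}

# Sparse approach: build a sparse matrix first, then convert to dense
sparse_fill <- function(l, n_col) {
  m <- sparseMatrix(i = rep(seq_along(l), lengths(l)), j = unlist(l),
                    x = 1, dims = c(length(l), n_col))
  as.matrix(m)
}

t_dense <- system.time(M1 <- dense_fill(l, n_col))
t_sparse <- system.time(M2 <- sparse_fill(l, n_col))

# Timings in seconds (user, system, elapsed) for each approach
print(rbind(dense = t_dense[1:3], sparse = t_sparse[1:3]))
```

Changing n_row, n_col and n_per_row gives a feel for how the ranking shifts with size and sparsity. A single system.time() call is noisy, so repeating the runs a few times is advisable before drawing conclusions.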

Johann de Jong answered Nov 09 '22 23:11
