2D matrix to 3D stacked array in R

Tags: arrays, r, matrix

I have a data frame, data, in R with 120000 rows and 5 columns.

Every 300 rows form one frame, and each frame was measured at a different time point (i.e. 400 frames in total).

Action

I tried using array(data, c(300, 5, 400))

Expected

Turn this data frame into a 3D array by splitting the data every 300 rows and stacking the resulting 400 matrices behind each other.

Actual

array() reads the values down the first column of data and uses them to fill the first slice of the array, rather than taking the first 300 rows as the first slice.
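A small reproduction of that behavior may help. This is only a sketch: it assumes the data have already been converted to a numeric matrix, and the 12 x 5 matrix mat (4 frames of 3 rows) is a made-up stand-in for the real data.

```r
# Hypothetical stand-in for the real data: 12 rows = 4 frames of 3 rows each.
mat <- matrix(1:60, ncol = 5)

# array() copies the values in storage order, and R stores matrices
# column by column, so the first slice is filled from column 1 of mat.
wrong <- array(mat, c(3, 5, 4))

wrong[, , 1]   # values 1..15, i.e. the top of mat's columns 1 and 2
mat[1:3, ]     # the first frame that was actually wanted
```

The two printed matrices differ, which matches the behavior described above.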

T_stats_3 asked Dec 20 '22 10:12


2 Answers

Here's an approach using dim<- and aperm:

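Sample data (the values shown come from an R version before 3.6.0; newer versions use a different default sampling method and may print different numbers for the same seed):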

set.seed(1)
mat <- matrix(sample(100, 12 * 5, TRUE), ncol = 5)
mat
#       [,1] [,2] [,3] [,4] [,5]
#  [1,]   27   69   27   80   74
#  [2,]   38   39   39   11   70
#  [3,]   58   77    2   73   48
#  [4,]   91   50   39   42   87
#  [5,]   21   72   87   83   44
#  [6,]   90  100   35   65   25
#  [7,]   95   39   49   79    8
#  [8,]   67   78   60   56   10
#  [9,]   63   94   50   53   32
# [10,]    7   22   19   79   52
# [11,]   21   66   83    3   67
# [12,]   18   13   67   48   41

Slicing and dicing:

Sliced <- aperm(`dim<-`(t(mat), list(5, 3, 4)), c(2, 1, 3))

Sliced
# , , 1
# 
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   27   69   27   80   74
# [2,]   38   39   39   11   70
# [3,]   58   77    2   73   48
# 
# , , 2
# 
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   91   50   39   42   87
# [2,]   21   72   87   83   44
# [3,]   90  100   35   65   25
# 
# , , 3
# 
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   95   39   49   79    8
# [2,]   67   78   60   56   10
# [3,]   63   94   50   53   32
# 
# , , 4
# 
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    7   22   19   79   52
# [2,]   21   66   83    3   67
# [3,]   18   13   67   48   41

Adjust the dimensions to match your data: for 120000 rows in frames of 300, use list(5, 300, 400).
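To avoid hard-coding the dimensions, the same idea can be wrapped in a small helper. This is a sketch, not part of the original answer: the name stack_frames and its frame_rows argument are hypothetical, and it assumes every frame has the same number of rows.

```r
# Hypothetical helper: stack consecutive blocks of `frame_rows` rows
# of a matrix or data frame into a 3D array (rows x columns x frames).
stack_frames <- function(x, frame_rows) {
  x <- as.matrix(x)
  # Assumption: the row count is an exact multiple of the frame size.
  stopifnot(nrow(x) %% frame_rows == 0)
  n_frames <- nrow(x) %/% frame_rows
  # Same transpose / reshape / permute sequence as in the answer.
  aperm(array(t(x), dim = c(ncol(x), frame_rows, n_frames)), c(2, 1, 3))
}

# Usage with a made-up data frame of the size given in the question.
data <- as.data.frame(matrix(seq_len(120000 * 5), ncol = 5))
arr <- stack_frames(data, 300)
dim(arr)
```

Note that the result carries no dimnames, so column names from the data frame would need to be reattached separately if they matter.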


Breaking things apart, we get:

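  • t(mat): transposes your matrix (so we now have 5 x 12). Because R stores matrices column by column, the values of each original row now sit next to each other.
  • dim<-(..., list(...)): converts this to an array, in this case, 5 (row) x 3 (col) x 4 (third dimension).
  • aperm(..., c(2, 1, 3)): after the previous step each slice is 5 x 3, with the original columns running along the rows, so we swap the first two dimensions to get 3 x 5 slices. This is like t, but for arrays with more than two dimensions.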
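The three steps can also be run one at a time to inspect the intermediate shapes. The sketch below uses a deterministic 12 x 5 matrix (an assumption made here so the result does not depend on the random seed) and the intermediate name tm is only for illustration.

```r
# Deterministic stand-in for the sample data: 12 rows, 5 columns.
mat <- matrix(1:60, ncol = 5)

# Step 1: transpose, giving 5 x 12 (each original row becomes a column).
tm <- t(mat)
dim(tm)

# Step 2: reshape to 5 x 3 x 4 without moving any values.
dim(tm) <- c(5, 3, 4)

# Step 3: swap the first two dimensions to get 3 x 5 x 4.
res <- aperm(tm, c(2, 1, 3))
dim(res)

# Each slice is now a block of three consecutive rows of mat.
identical(res[, , 2], mat[4:6, ])
```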

These operations are also efficient. In the comparison below against @akrun's approach, the median time is roughly 18 times lower (about 31 ms versus 550 ms):

m1 <- matrix(1:(300*400*5), nrow=300*400, ncol=5)

am <- function() {
  aperm(`dim<-`(t(m1), list(5, 300, 400)), c(2, 1, 3))
}

ak <- function() {
  lst <- lapply(split(seq_len(nrow(m1)),(seq_len(nrow(m1))-1) %/%300 +1),
                function(i) m1[i,])

  arr1 <- array(0, dim=c(300,5,400))
  for(i in 1:400){
    arr1[,,i] <- lst[[i]]
  }
  arr1
}

library(microbenchmark)
microbenchmark(am(), ak(), times = 20)
# Unit: milliseconds
#  expr       min        lq    median        uq      max neval
#  am()  19.09133  27.63269  31.18292  67.12434 146.2673    20
#  ak() 496.11494 518.71223 550.02215 591.27266 699.9834    20
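
A timing comparison is only meaningful if both functions return the same values. The check below is a sketch that repeats the definitions so it runs on its own. It uses all.equal() rather than identical() because ak() starts from array(0, ...) and therefore returns doubles, while am() keeps the integer type of m1.

```r
m1 <- matrix(1:(300 * 400 * 5), nrow = 300 * 400, ncol = 5)

# Transpose / reshape / permute approach.
am <- function() {
  aperm(`dim<-`(t(m1), c(5, 300, 400)), c(2, 1, 3))
}

# Split-and-fill approach.
ak <- function() {
  idx <- split(seq_len(nrow(m1)), (seq_len(nrow(m1)) - 1) %/% 300 + 1)
  lst <- lapply(idx, function(i) m1[i, ])
  arr1 <- array(0, dim = c(300, 5, 400))
  for (i in 1:400) {
    arr1[, , i] <- lst[[i]]
  }
  arr1
}

# Same values and dimensions; only the storage type differs.
same <- isTRUE(all.equal(am(), ak()))
same
```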
A5C1D2H2I1M1N2O1R2T1 answered Dec 31 '22 08:12


Another option is to split the rows into groups of 300 and fill the array one frame at a time:

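m1 <- matrix(1:(300*400*5), nrow=300*400, ncol=5)

# Split the row indices into 400 groups of 300 and extract each group as a matrix
lst <- lapply(split(seq_len(nrow(m1)), (seq_len(nrow(m1))-1) %/% 300 + 1),
              function(i) m1[i,])

# Fill a preallocated array one frame at a time
arr1 <- array(0, dim=c(300,5,400))
for(i in 1:400){
  arr1[,,i] <- lst[[i]]
}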

m1[297:300,]
#     [,1]   [,2]   [,3]   [,4]   [,5]
#[1,]  297 120297 240297 360297 480297
#[2,]  298 120298 240298 360298 480298
#[3,]  299 120299 240299 360299 480299
#[4,]  300 120300 240300 360300 480300

 tail(arr1[,,1],4)
 #      [,1]   [,2]   [,3]   [,4]   [,5]
 #[297,]  297 120297 240297 360297 480297
 #[298,]  298 120298 240298 360298 480298
 #[299,]  299 120299 240299 360299 480299
 #[300,]  300 120300 240300 360300 480300
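
The grouping expression passed to split() is the part that decides which rows go into which frame. A scaled-down sketch, assuming 12 rows and frames of 3 rows instead of 300 (the names n_rows, frame_rows and grp are only for illustration):

```r
# Scaled-down illustration: 12 rows, 3 rows per frame.
n_rows <- 12
frame_rows <- 3

# Integer division of the zero-based row number gives the frame number.
grp <- (seq_len(n_rows) - 1) %/% frame_rows + 1
grp
# 1 1 1 2 2 2 3 3 3 4 4 4

# split() then returns one vector of row indices per frame.
idx <- split(seq_len(n_rows), grp)
idx[["2"]]
# 4 5 6
```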

Or, as suggested by @Ananda Mahto, using abind:

library(abind)
arr2 <-  abind(lapply(split(seq_len(nrow(m1)), 
           (seq_len(nrow(m1))-1) %/% 300 + 1), function(x) m1[x, ]), along = 3)
akrun answered Dec 31 '22 08:12