I have a data frame, data, in R with 120000 rows and 5 columns.
Every 300 rows form one frame, measured at a different time point (i.e. 400 frames in total).
Action
I tried using array(data, c(300, 5, 400))
Expected
Turn this data frame into a 3d array by splitting data every 300 rows and stacking the resulting 400 matrices behind each other.
Actual
It reads the values down the first column of data and puts these into the first part of the array.
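The behaviour described under Actual is consistent with how array() fills its result: it takes the values in column-major order, so it walks all the way down the first column before moving on to the second. A minimal sketch on a small stand-in (the 12 x 5 matrix of 1:60 and the name naive are assumptions for illustration, not the asker's data):

```r
# Stand-in for the real data: 12 rows x 5 columns, i.e. 4 frames of 3 rows.
mat <- matrix(1:60, ncol = 5)

# array() consumes the values column by column (column-major order).
naive <- array(mat, c(3, 5, 4))

# Column 2 of the first slice holds rows 4-6 of column 1 of `mat`,
# not rows 1-3 of column 2 as the question expects.
naive[, 2, 1]
mat[4:6, 1]

# The frame that was actually wanted as the first slice:
mat[1:3, ]
```

Comparing naive[, , 1] with mat[1:3, ] shows the mismatch directly.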
Here's an approach using dim<- and aperm:
Sample data:
set.seed(1)
mat <- matrix(sample(100, 12 * 5, TRUE), ncol = 5)
mat
#       [,1] [,2] [,3] [,4] [,5]
#  [1,]   27   69   27   80   74
#  [2,]   38   39   39   11   70
#  [3,]   58   77    2   73   48
#  [4,]   91   50   39   42   87
#  [5,]   21   72   87   83   44
#  [6,]   90  100   35   65   25
#  [7,]   95   39   49   79    8
#  [8,]   67   78   60   56   10
#  [9,]   63   94   50   53   32
# [10,]    7   22   19   79   52
# [11,]   21   66   83    3   67
# [12,]   18   13   67   48   41
Slicing and dicing:
Sliced <- aperm(`dim<-`(t(mat), list(5, 3, 4)), c(2, 1, 3))
Sliced
# , , 1
#
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   27   69   27   80   74
# [2,]   38   39   39   11   70
# [3,]   58   77    2   73   48
#
# , , 2
#
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   91   50   39   42   87
# [2,]   21   72   87   83   44
# [3,]   90  100   35   65   25
#
# , , 3
#
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   95   39   49   79    8
# [2,]   67   78   60   56   10
# [3,]   63   94   50   53   32
#
# , , 4
#
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    7   22   19   79   52
# [2,]   21   66   83    3   67
# [3,]   18   13   67   48   41
Adjust the numbers to match your data.
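As a rough sketch of what "adjusting the numbers" could look like for the data in the question, the call can be wrapped in a small helper. The name stack_frames, its arguments, and the stand-in data frame are hypothetical and not part of the original answer; the sketch also assumes every column is numeric, so that as.matrix() gives a numeric matrix.

```r
# Hypothetical helper: stack every `rows_per_frame` rows of a matrix
# (or all-numeric data frame) into one slice of a 3d array.
stack_frames <- function(x, rows_per_frame) {
  m <- as.matrix(x)  # a data frame has to become a matrix first
  stopifnot(nrow(m) %% rows_per_frame == 0)
  n_frames <- nrow(m) %/% rows_per_frame
  # Same transpose / re-dimension / permute sequence as above.
  aperm(`dim<-`(t(m), c(ncol(m), rows_per_frame, n_frames)), c(2, 1, 3))
}

# Stand-in for the 120000 x 5 data frame described in the question.
data <- as.data.frame(matrix(seq_len(120000 * 5), ncol = 5))

arr <- stack_frames(data, 300)
dim(arr)

# The second slice should match rows 301-600 of the data frame.
all(arr[, , 2] == as.matrix(data[301:600, ]))
```

The stopifnot() guard is a design choice of this sketch: it fails early if the row count is not a multiple of the frame length, rather than letting dim<- raise a less obvious error.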
Breaking things apart, we get:
t(mat): transposes your matrix (so we now have 5 x 12).
dim<-(..., list(...)): converts this to an array, in this case, 5 (row) x 3 (col) x 4 (third dimension).
aperm: the result of the last step is by-row, so we need to convert it to by-column. This is like t, but with multiple dimensions involved.
These are also very efficient operations. Here's a comparison of this approach with @akrun's:
m1 <- matrix(1:(300*400*5), nrow=300*400, ncol=5)
am <- function() {
  aperm(`dim<-`(t(m1), list(5, 300, 400)), c(2, 1, 3))
}
ak <- function() {
  lst <- lapply(split(seq_len(nrow(m1)), (seq_len(nrow(m1)) - 1) %/% 300 + 1),
                function(i) m1[i, ])
  arr1 <- array(0, dim = c(300, 5, 400))
  for (i in 1:400) {
    arr1[, , i] <- lst[[i]]
  }
  arr1
}
library(microbenchmark)
microbenchmark(am(), ak(), times = 20)
# Unit: milliseconds
#  expr       min        lq    median        uq      max neval
#  am()  19.09133  27.63269  31.18292  67.12434 146.2673    20
#  ak() 496.11494 518.71223 550.02215 591.27266 699.9834    20
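The timings do not show that the two approaches agree, nor what each step does to the shape of the data. The sketch below traces the three steps on a small matrix and compares the result with a plain loop. The names step1, step2, step3 and ref are illustrative only, and the 12 x 5 matrix of 1:60 is an assumed stand-in.

```r
m1 <- matrix(1:60, nrow = 12, ncol = 5)  # 4 frames of 3 rows each

step1 <- t(m1)                       # 5 x 12: each original row is now a column
step2 <- `dim<-`(step1, c(5, 3, 4))  # 5 x 3 x 4: columns grouped in threes
step3 <- aperm(step2, c(2, 1, 3))    # 3 x 5 x 4: swap the first two dimensions

dim(step1)
dim(step2)
dim(step3)

# Loop-based reference: copy each block of 3 rows into its own slice.
ref <- array(0L, dim = c(3, 5, 4))
for (k in 1:4) {
  ref[, , k] <- m1[(k - 1) * 3 + 1:3, ]
}

# Both routes should give the same array.
all.equal(step3, ref)
```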
Another option would be:
m1 <- matrix(1:(300*400*5), nrow = 300*400, ncol = 5)
lst <- lapply(split(seq_len(nrow(m1)), (seq_len(nrow(m1)) - 1) %/% 300 + 1),
              function(i) m1[i, ])
arr1 <- array(0, dim = c(300, 5, 400))
for (i in 1:400) {
  arr1[, , i] <- lst[[i]]
}
m1[297:300, ]
#     [,1]   [,2]   [,3]   [,4]   [,5]
#[1,]  297 120297 240297 360297 480297
#[2,]  298 120298 240298 360298 480298
#[3,]  299 120299 240299 360299 480299
#[4,]  300 120300 240300 360300 480300
tail(arr1[, , 1], 4)
#       [,1]   [,2]   [,3]   [,4]   [,5]
#[297,]  297 120297 240297 360297 480297
#[298,]  298 120298 240298 360298 480298
#[299,]  299 120299 240299 360299 480299
#[300,]  300 120300 240300 360300 480300
Or, as suggested by @Ananda Mahto, using abind:
library(abind)
arr2 <- abind(lapply(split(seq_len(nrow(m1)),
(seq_len(nrow(m1))-1) %/% 300 + 1), function(x) m1[x, ]), along = 3)
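If installing abind is not an option, a similar result can probably be had in base R with vapply(), which returns an array when FUN.VALUE is a matrix. This is a swapped-in variant rather than anything proposed in the answers above; the name arr3 is illustrative, and the integer template matches the integer m1 used here.

```r
m1 <- matrix(1:(300 * 400 * 5), nrow = 300 * 400, ncol = 5)

# FUN.VALUE is a 300 x 5 template, so vapply() binds the 400 results
# along a third dimension, giving a 300 x 5 x 400 array.
arr3 <- vapply(
  seq_len(400),
  function(k) m1[(k - 1) * 300 + seq_len(300), ],
  FUN.VALUE = matrix(0L, nrow = 300, ncol = 5)
)

dim(arr3)

# The first slice should be the first 300 rows of m1.
all(arr3[, , 1] == m1[1:300, ])
```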