Efficiently compute the row sums of a 3d array in R

Tags:

arrays

r

rowsum

Consider the array a:

> a <- array(c(1:9, 1:9), c(3,3,2))
> a
, , 1

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

, , 2

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

How do we efficiently compute the row sums of the matrices indexed by the third dimension, such that the result is:

     [,1] [,2]
[1,]   12   12
[2,]   15   15
[3,]   18   18

The column sums are easy via the 'dims' argument of colSums():

> colSums(a, dims = 1)

but I cannot find a way to use rowSums() on the array to achieve the desired result, because it interprets 'dims' differently from colSums().
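To make the difference in 'dims' concrete, here is a minimal sketch using the array a from above. As documented for base R, colSums() sums over dimensions 1:dims, whereas rowSums() sums over dimensions dims+1 onwards; the variable names cs, rs1 and rs2 are illustrative only.

```r
a <- array(c(1:9, 1:9), c(3, 3, 2))

# colSums(): 'dims' counts the leading dimensions that are summed over.
cs <- colSums(a, dims = 1)   # sums over dimension 1; dim(cs) is c(3, 2)

# rowSums(): 'dims' counts the leading dimensions that are kept.
rs1 <- rowSums(a, dims = 1)  # sums over dimensions 2 and 3; length-3 vector
rs2 <- rowSums(a, dims = 2)  # sums over dimension 3 only; 3 x 3 matrix

dim(cs)      # 3 2
length(rs1)  # 3
dim(rs2)     # 3 3
```

Neither value of 'dims' makes rowSums() sum over dimension 2 alone, which is what the desired 3 x 2 result requires.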

It is simple to compute the desired row sums using:

> apply(a, 3, rowSums)
     [,1] [,2]
[1,]   12   12
[2,]   15   15
[3,]   18   18

but that is just hiding the loop. Are there other efficient, truly vectorised, ways of computing the required row sums?


asked Feb 27 '11 19:02

Gavin Simpson

2 Answers

@Fojtasek's answer, which mentioned splitting up the array, reminded me of the aperm() function, which permutes the dimensions of an array. As colSums() works, we can swap the first two dimensions using aperm() and run colSums() on the output.

> colSums(aperm(a, c(2,1,3)))
     [,1] [,2]
[1,]   12   12
[2,]   15   15
[3,]   18   18
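As a quick sanity check of why this works, the sketch below uses a small array whose slices differ (the array x is an illustrative assumption, not part of the original example). Swapping the first two dimensions transposes each slice, so summing down the columns of the permuted array is the same as summing along the rows of the original.

```r
# Test array with distinct values in each slice (illustrative choice).
x <- array(1:24, c(3, 4, 2))

# After the permutation, xt[j, i, k] equals x[i, j, k]; dim(xt) is c(4, 3, 2).
xt <- aperm(x, c(2, 1, 3))

via_aperm <- colSums(xt)            # sums over the original column index
via_apply <- apply(x, 3, rowSums)   # reference result, one slice at a time

all.equal(via_aperm, via_apply)     # TRUE
```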

Some comparison timings of this and the other suggested R-based answers:

> b <- array(c(1:250000, 1:250000),c(5000,5000,2))
> system.time(rs1 <- apply(b, 3, rowSums))
   user  system elapsed 
  1.831   0.394   2.232 
> system.time(rs2 <- rowSums3d(b))
   user  system elapsed 
  1.134   0.183   1.320 
> system.time(rs3 <- sapply(1:dim(b)[3], function(i) rowSums(b[,,i])))
   user  system elapsed 
  1.556   0.073   1.636
> system.time(rs4 <- colSums(aperm(b, c(2,1,3))))
   user  system elapsed 
  0.860   0.103   0.966

So on my system the aperm() solution is the fastest of the four, though the margin over rowSums3d() is modest (0.966 s against 1.320 s elapsed). Session details:

> sessionInfo()
R version 2.12.1 Patched (2011-02-06 r54249)
Platform: x86_64-unknown-linux-gnu (64-bit)

However, rowSums3d() doesn't give the same answers as the other solutions:

> all.equal(rs1, rs2)
[1] "Mean relative difference: 0.01999992"
> all.equal(rs1, rs3)
[1] TRUE
> all.equal(rs1, rs4)
[1] TRUE
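One way to see where the difference might come from is to run rowSums3d() (copied here from the other answer) on a small array whose slices differ. The array x is an illustrative assumption, not from the original posts.

```r
# rowSums3d() as posted in the other answer.
rowSums3d <- function(a){
    m <- matrix(a, ncol = ncol(a))
    rs <- rowSums(m)
    matrix(rs, ncol = 2)
}

x <- array(1:24, c(3, 4, 2))

# The intermediate matrix is 6 x 4; inspect which elements land in its first row.
m <- matrix(x, ncol = ncol(x))
m[1, ]                        # 1 7 13 19: x[1,1,1], x[1,3,1], x[1,1,2], x[1,3,2]

rowSums3d(x)[1, 1]            # 40
apply(x, 3, rowSums)[1, 1]    # 22
```

Here the first row of m collects columns 1 and 3 of both slices, so its sum is not a per-slice row sum. The array a in the question has identical slices and only three columns, which is why the same function happens to reproduce the expected result there.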


answered Sep 27 '22 19:09

Gavin Simpson

You could chop up the array into two dimensions, compute row sums on that, and then put the output back together the way you want it. Like so:

rowSums3d <- function(a){
    m <- matrix(a,ncol=ncol(a))
    rs <- rowSums(m)
    matrix(rs,ncol=2)
}

> a <- array(c(1:250000, 1:250000),c(5000,5000,2))
> system.time(rowSums3d(a))
   user  system elapsed 
   1.73    0.17    1.96 
> system.time(apply(a, 3, rowSums))
   user  system elapsed 
   3.09    0.46    3.74
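For arrays whose slices differ, or that have more than two slices, a reshaping approach needs the column dimension moved out of the way first. The following is a sketch of one possible variant, not the code that was timed above; the name rowSums3d_fixed is hypothetical. It relies on the documented behaviour of rowSums(), which with dims = 2 keeps the first two dimensions and sums over the rest.

```r
# Hypothetical variant; rowSums3d_fixed is an illustrative name.
rowSums3d_fixed <- function(a) {
    # Move the column dimension last: dim(p) is c(nrow, nslices, ncol).
    p <- aperm(a, c(1, 3, 2))
    # Keep the first two dimensions and sum over the third (the old columns).
    rowSums(p, dims = 2)
}

# Small check against the apply() version on an array with distinct slices.
x <- array(1:24, c(3, 4, 2))
all.equal(rowSums3d_fixed(x), apply(x, 3, rowSums))   # TRUE
```

Because it still calls aperm(), its running time is likely to be close to that of the colSums(aperm()) approach rather than an improvement on it.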

answered Sep 27 '22 21:09

Fojtasek
