Fastest way to take the weighted sum of the columns of a matrix in R

Tags: r, sum, weighted

I need the weighted sum of each column of a matrix.

data <- matrix(1:2e7,1e7,2) # warning large number, will eat up >100 megs of memory
weights <- 1:1e7/1e5
system.time(colSums(data*weights))
system.time(apply(data,2,function(x) sum(x*weights)))
all.equal(colSums(data*weights), apply(data,2,function(x) sum(x*weights)))

Typically colSums(data*weights) is faster than the apply call.

I do this operation often (on a large matrix), so I am looking for advice on the most efficient implementation. Ideally, it would be great if we could pass weights directly to colSums (or rowSums).

Thanks, appreciate any insights!

asked Nov 08 '12 02:11 by Anirban


1 Answer

colSums and * are both internal or primitive functions, so they will be much faster than the apply approach.

Another approach you could try is some basic matrix algebra, since what you are looking for is the vector-matrix product

 weights %*% data

The matrix multiplication method does not appear to be faster, but it avoids creating a temporary object the size of data.
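To make the equivalence concrete, here is a minimal sketch on a tiny matrix. The names m and w are illustrative stand-ins, not the data and weights from the question. It also shows that %*% returns a 1 x ncol matrix rather than a plain vector, so drop() may be wanted.

```r
# Small illustrative inputs (hypothetical, not the data from the question)
m <- matrix(1:6, nrow = 3, ncol = 2)
w <- c(0.5, 1, 2)

# Element-wise approach: w lines up with each column because
# length(w) == nrow(m) and matrices are stored column by column
by_colsums <- colSums(m * w)

# Matrix-algebra approach: w is treated as a 1 x 3 row vector,
# so the result is a 1 x 2 matrix
by_matmul <- w %*% m

# drop() removes the length-one dimension so both results are plain vectors
stopifnot(isTRUE(all.equal(by_colsums, drop(by_matmul))))
print(by_colsums)
```

The element-wise version allocates the full product m * w before summing, which is the temporary object mentioned above.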

system.time({.y <- colSums(data * weights)})
##  user  system elapsed 
##  0.12    0.03    0.16 


system.time({.x <- weights %*% data})
##   user  system elapsed 
##   0.20    0.05    0.25 
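If you want something closer to "passing weights to colSums", one option is a small wrapper around crossprod(), which computes t(x) %*% w. The function name weighted_col_sums is hypothetical, not part of base R, and this is only a sketch. Whether it beats colSums(x * w) depends on the machine and the BLAS that R is linked against, so it is worth timing on your own data.

```r
# Hypothetical helper (not a base R function): weighted sum of each column of x
weighted_col_sums <- function(x, w) {
  # One weight per row is assumed
  stopifnot(is.matrix(x), length(w) == nrow(x))
  # crossprod(x, w) is t(x) %*% w, an ncol(x) x 1 matrix; drop() makes it a vector
  drop(crossprod(x, w))
}

# Illustrative inputs only
m <- matrix(1:6, nrow = 3, ncol = 2)
w <- c(0.5, 1, 2)

res <- weighted_col_sums(m, w)
stopifnot(isTRUE(all.equal(res, colSums(m * w))))
print(res)
```

Like weights %*% data, this avoids building a temporary copy the size of the input matrix.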
answered Oct 24 '22 14:10 by mnel