Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to multiply matrix columns with vector elements in R

Tags:

r

I have a matrix m and a vector v. I would like to multiply first column of matrix m by the first element of vector v, and multiply the second column of matrix m by the second element of vector v, and so on. I can do it with the following code, but I am looking for a way which does not require the two transpose calls. How can I do this faster in R?

m <- matrix(rnorm(120000), ncol=6) v <- c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)  system.time(t(t(m) * v))  #   user  system elapsed  #   0.02    0.00    0.02  
like image 604
rose Avatar asked Jun 13 '13 05:06

rose


People also ask

How do you multiply a matrix by a vector in R?

When we multiply a matrix with a vector the output is a vector. Suppose we have a matrix M and vector V then they can be multiplied as M%*%V. To understand the step-by-step multiplication, we can multiply each value in the vector with the row values in matrix and find out the sum of that multiplication.

How do I multiply a column of a matrix in R?

To multiply a rows or columns of a matrix, we need to use %*% symbol that perform the multiplication for matrices in R. If we have a matrix M with 5 rows and 5 columns then row 1 of M can be multiplied with column 1 of M using M[1,]%*%M[,1], similarly, we can multiply other rows and columns.

Which algorithm is faster for matrix multiplication?

In linear algebra, the Strassen algorithm, named after Volker Strassen, is an algorithm for matrix multiplication. It is faster than the standard matrix multiplication algorithm for large matrices, with a better asymptotic complexity, although the naive algorithm is often better for smaller matrices.


2 Answers

Use some linear algebra and perform matrix multiplication, which is quite fast in R.

eg

m %*% diag(v)

some benchmarking

m = matrix(rnorm(1200000), ncol=6)   v=c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5) library(microbenchmark) microbenchmark(m %*% diag(v), t(t(m) * v)) ##    Unit: milliseconds ##            expr      min       lq   median       uq      max neval ##   m %*% diag(v) 16.57174 16.78104 16.86427 23.13121 109.9006   100 ##     t(t(m) * v) 26.21470 26.59049 32.40829 35.38097 122.9351   100 
like image 66
mnel Avatar answered Sep 22 '22 22:09

mnel


If you have a larger number of columns your t(t(m) * v) solution outperforms the matrix multiplication solution by a wide margin. However, there is a faster solution, but it comes with a high cost in in memory usage. You create a matrix as large as m using rep() and multiply elementwise. Here's the comparison, modifying mnel's example:

m = matrix(rnorm(1200000), ncol=600) v = rep(c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5), length = ncol(m)) library(microbenchmark)  microbenchmark(t(t(m) * v),    m %*% diag(v),   m * rep(v, rep.int(nrow(m),length(v))),    m * rep(v, rep(nrow(m),length(v))),    m * rep(v, each = nrow(m)))  # Unit: milliseconds #                                   expr        min         lq       mean     median         uq       max neval #                            t(t(m) * v)  17.682257  18.807218  20.574513  19.239350  19.818331  62.63947   100 #                          m %*% diag(v) 415.573110 417.835574 421.226179 419.061019 420.601778 465.43276   100 #  m * rep(v, rep.int(nrow(m), ncol(m)))   2.597411   2.794915   5.947318   3.276216   3.873842  48.95579   100 #      m * rep(v, rep(nrow(m), ncol(m)))   2.601701   2.785839   3.707153   2.918994   3.855361  47.48697   100 #             m * rep(v, each = nrow(m))  21.766636  21.901935  23.791504  22.351227  23.049006  66.68491   100 

As you can see, using "each" in rep() sacrifices speed for clarity. The difference between rep.int and rep seems to be neglible, both implementations swap places on repeated runs of microbenchmark(). Keep in mind, that ncol(m) == length(v).

autoplot

like image 22
bluegrue Avatar answered Sep 22 '22 22:09

bluegrue