I have a matrix m
and a vector v
. I would like to multiply first column of matrix m
by the first element of vector v
, and multiply the second column of matrix m
by the second element of vector v
, and so on. I can do it with the following code, but I am looking for a way which does not require the two transpose calls. How can I do this faster in R?
m <- matrix(rnorm(120000), ncol=6) v <- c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5) system.time(t(t(m) * v)) # user system elapsed # 0.02 0.00 0.02
When we multiply a matrix with a vector the output is a vector. Suppose we have a matrix M and vector V then they can be multiplied as M%*%V. To understand the step-by-step multiplication, we can multiply each value in the vector with the row values in matrix and find out the sum of that multiplication.
To multiply a rows or columns of a matrix, we need to use %*% symbol that perform the multiplication for matrices in R. If we have a matrix M with 5 rows and 5 columns then row 1 of M can be multiplied with column 1 of M using M[1,]%*%M[,1], similarly, we can multiply other rows and columns.
In linear algebra, the Strassen algorithm, named after Volker Strassen, is an algorithm for matrix multiplication. It is faster than the standard matrix multiplication algorithm for large matrices, with a better asymptotic complexity, although the naive algorithm is often better for smaller matrices.
Use some linear algebra and perform matrix multiplication, which is quite fast in R
.
eg
m %*% diag(v)
some benchmarking
m = matrix(rnorm(1200000), ncol=6) v=c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5) library(microbenchmark) microbenchmark(m %*% diag(v), t(t(m) * v)) ## Unit: milliseconds ## expr min lq median uq max neval ## m %*% diag(v) 16.57174 16.78104 16.86427 23.13121 109.9006 100 ## t(t(m) * v) 26.21470 26.59049 32.40829 35.38097 122.9351 100
If you have a larger number of columns your t(t(m) * v) solution outperforms the matrix multiplication solution by a wide margin. However, there is a faster solution, but it comes with a high cost in in memory usage. You create a matrix as large as m using rep() and multiply elementwise. Here's the comparison, modifying mnel's example:
m = matrix(rnorm(1200000), ncol=600) v = rep(c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5), length = ncol(m)) library(microbenchmark) microbenchmark(t(t(m) * v), m %*% diag(v), m * rep(v, rep.int(nrow(m),length(v))), m * rep(v, rep(nrow(m),length(v))), m * rep(v, each = nrow(m))) # Unit: milliseconds # expr min lq mean median uq max neval # t(t(m) * v) 17.682257 18.807218 20.574513 19.239350 19.818331 62.63947 100 # m %*% diag(v) 415.573110 417.835574 421.226179 419.061019 420.601778 465.43276 100 # m * rep(v, rep.int(nrow(m), ncol(m))) 2.597411 2.794915 5.947318 3.276216 3.873842 48.95579 100 # m * rep(v, rep(nrow(m), ncol(m))) 2.601701 2.785839 3.707153 2.918994 3.855361 47.48697 100 # m * rep(v, each = nrow(m)) 21.766636 21.901935 23.791504 22.351227 23.049006 66.68491 100
As you can see, using "each" in rep() sacrifices speed for clarity. The difference between rep.int and rep seems to be neglible, both implementations swap places on repeated runs of microbenchmark(). Keep in mind, that ncol(m) == length(v).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With