I am trying to take the dot product from a 331x23152 and 23152x23152 matrix.
In Python and Octave this is a trivial operation, but in R this seems to be incredibly slow.
N <- 331
M <- 23152
mat_1 = matrix( rnorm(N*M,mean=0,sd=1), N, M)
mat_2 = matrix( rnorm(N*M,mean=0,sd=1), M, M)
tm3 <- system.time({
mat_3 = mat_1%*%mat_2
})
print(tm3)
The output is
user system elapsed
101.95 0.04 101.99
In other words, this dot product takes over 100 seconds to execute.
I am running R-3.4.0 64-bit, with RStudio v1.0.143 on a i7-4790 with 16 GB RAM. As such, I did not expect this operation to take so long.
Am I overlooking something? I have started looking into the packages bigmemory and bigalgebra, but I can't help but think there's a solution without having to resort to packages.
EDIT
To give you an idea of time difference, here's a script for Octave:
n = 331;
m = 23152;
mat_1 = rand(n,m);
mat_2 = rand(m,m);
tic
mat_3 = mat_1*mat_2;
toc
The output is
Elapsed time is 3.81038 seconds.
And in Python:
import numpy as np
import time
n = 331
m = 23152
mat_1 = np.random.random((n,m))
mat_2 = np.random.random((m,m))
tm_1 = time.time()
mat_3 = np.dot(mat_1,mat_2)
tm_2 = time.time()
tm_3 = tm_2 - tm_1
print(tm_3)
The output is
2.781277894973755
As you can see, these numbers are not even in the same ballpark.
EDIT 2
At Zheyuan Li's request, here are toy examples for dot products.
In R:
mat_1 = matrix(c(1,2,1,2,1,2), nrow = 2, ncol = 3)
mat_2 = matrix(c(1,1,1,2,2,2,3,3,3), nrow = 3, ncol = 3)
mat_3 = mat_1 %*% mat_2
print(mat_3)
The output is:
[,1] [,2] [,3]
[1,] 3 6 9
[2,] 6 12 18
In Octave:
mat_1 = [1,1,1;2,2,2];
mat_2 = [1,2,3;1,2,3;1,2,3];
mat_3 = mat_1*mat_2
The output is:
mat_3 =
3 6 9
6 12 18
In Python:
import numpy as np
mat_1 = np.array([[1,1,1],[2,2,2]])
mat_2 = np.array([[1,2,3],[1,2,3],[1,2,3]])
mat_3 = np.dot(mat_1, mat_2)
print(mat_3)
The output is:
[[ 3 6 9]
[ 6 12 18]]
For more information on matrix dot products: https://en.wikipedia.org/wiki/Matrix_multiplication
EDIT 3
The output for sessionInfo()
is:
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=Dutch_Netherlands.1252 LC_CTYPE=Dutch_Netherlands.1252 LC_MONETARY=Dutch_Netherlands.1252
[4] LC_NUMERIC=C LC_TIME=Dutch_Netherlands.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0
EDIT 4
I tried the bigalgebra
package but this did not seem to speed things up:
library('bigalgebra')
N <- 331
M <- 23152
mat_1 = matrix( rnorm(N*M,mean=0,sd=1), N, M)
mat_1 <- as.big.matrix(mat_1)
mat_2 = matrix( rnorm(N*M,mean=0,sd=1), M, M)
tm3 <- system.time({
mat_3 = mat_1%*%mat_2
})
print(tm3)
The output is:
user system elapsed
101.79 0.00 101.81
EDIT 5
James suggested to alter my randomly generated matrix:
N <- 331
M <- 23152
mat_1 = matrix( runif(N*M), N, M)
mat_2 = matrix( runif(M*M), M, M)
tm3 <- system.time({
mat_3 = mat_1%*%mat_2
})
print(tm3)
The output is:
user system elapsed
102.46 0.05 103.00
This is a trivial operation?? Matrix multiplication is always an expensive operation in linear algebra computations.
Actually I think it is quite fast. A matrix multiplication at this size has
2 * 23.152 * 23.152 * 0.331 = 354.8 GFLOP
With 100 seconds your performance is 3.5 GFLOPs. Note that on most machines, the performance is at most 0.8 GLOPs - 2 GFLOPs, unless you have an optimized BLAS library.
If you think implementation elsewhere is faster, check the possibility of usage of optimized BLAS, or parallel computing. R is doing this with a standard BLAS and no parallelism.
Important
From R-3.4.0, more tools are available with BLAS.
First of all, sessionInfo()
now returns the full path of the linked BLAS library. Yes, this does not point to the symbolic link, but the final shared object! The other answer here just shows this: it has OpenBLAS.
The timing result (in the other answer) implies that parallel computing (via multi-threading in OpenBLAS) is in place. It is hard for me to tell the number of threads used, but looks like hyperthreading is on, as the slot for "system" is quite big!
Second, options
can now set matrix multiplications methods, via matprod
. Although this was introduced to deal with NA / NaN, it offers testing of performance, too!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With