I create two matrices A and B of the same dimension. A contains larger values than B. The matrix multiplication A %*% A is about 10 times faster than B %*% B.
Why is this?
## disable openMP
library(RhpcBLASctl); blas_set_num_threads(1); omp_set_num_threads(1)
A <- exp(-as.matrix(dist(expand.grid(1:60, 1:60))))
summary(c(A))
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 0.000000 0.000000 0.000000 0.001738 0.000000 1.000000
B <- exp(-as.matrix(dist(expand.grid(1:60, 1:60)))*10)
summary(c(B))
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 0.0000000 0.0000000 0.0000000 0.0002778 0.0000000 1.0000000
identical(dim(A), dim(B))
## [1] TRUE
system.time(A %*% A)
# user system elapsed
# 2.387 0.001 2.389
system.time(B %*% B)
# user system elapsed
# 21.285 0.020 21.310
sessionInfo()
# R version 3.6.1 (2019-07-05)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Linux Mint 19.2
# Matrix products: default
# BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
# LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
The question could be related to "base::chol() slows down when matrix contains many small entries".
Edit: Some small numbers seem to slow down the computation, while others do not.
slow <- 6.41135533887904e-164
fast1 <- 6.41135533887904e-150
fast2 <- 6.41135533887904e-170
Mslow <- array(slow, c(1000, 1000)); system.time(Mslow %*% Mslow)
# user system elapsed
# 10.165 0.000 10.168
Mfast1 <- array(fast1, c(1000, 1000)); system.time(Mfast1 %*% Mfast1)
# user system elapsed
# 0.058 0.000 0.057
Mfast2 <- array(fast2, c(1000, 1000)); system.time(Mfast2 %*% Mfast2)
# user system elapsed
# 0.056 0.000 0.055
You most likely want to use .Machine$double.xmin instead of double.eps. This sets far fewer numbers to zero and has the same effect. To avoid subnormal numbers, you might have to recompile BLAS using compiler flags that set those numbers to zero instead of raising an FP trap.
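The thresholding suggested above could be sketched roughly as follows. This is only an illustration: `flush_subnormals` is a hypothetical helper name, not something from the original post or from any package, and the small vector `x` is made-up demo data.

```r
# Hypothetical helper (name is an assumption): replace subnormal entries,
# i.e. nonzero values with magnitude below .Machine$double.xmin, by exact
# zeros. Normal doubles are left untouched.
flush_subnormals <- function(M) {
  M[abs(M) < .Machine$double.xmin] <- 0
  M
}

# Made-up demo data: two normal values, two subnormal values, and a zero.
x <- c(1, 1e-300, 1e-310, -1e-310, 0)

# Flag which entries are subnormal before flushing.
is_subnormal <- x != 0 & abs(x) < .Machine$double.xmin

# Works the same way on a matrix, e.g. B <- flush_subnormals(B).
flushed <- flush_subnormals(x)
```

Note that this only cleans the inputs. Products of small but normal entries can still underflow into the subnormal range during the multiplication itself, which is presumably why the BLAS-level option is mentioned as well.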