I am struggling to understand why this bit of code (adapted from the R Benchmark 2.5) becomes slower and slower (on average) as the number of iterations increases.
require(Matrix)
c <- 0;
for (i in 1:100) {
  a <- new("dgeMatrix", x = rnorm(3250 * 3250), Dim = as.integer(c(3250, 3250)))
  b <- as.double(1:3250)
  invisible(gc())
  timing <- system.time({
    c <- solve(crossprod(a), crossprod(a, b))
  })
  print(timing)
  rm(a, b, c)
}
Here is a sample output, which varies slightly from one run to the next.
As I understand it, nothing should be saved from one iteration to the next, yet the timing slowly increases from about 1 second in the first few loops to more than 4 seconds in the later loops. Do you have any idea what is causing this, and how I could fix it?
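One way to put a number on the drift, rather than reading it off the printed timings, is to store the elapsed time of each iteration and fit a line through it. This is only a minimal sketch: `time_solve` is a hypothetical helper name, and it uses plain base R matrices and a much smaller problem size (both assumptions, chosen so it runs quickly) instead of the 3250 x 3250 `dgeMatrix` above.

```r
# Hypothetical helper: run the benchmark body n_iter times on an n x n
# problem and return the elapsed time of each iteration.
# Assumption: base R matrices and a small n, not the original dgeMatrix setup.
time_solve <- function(n_iter = 10, n = 200) {
  elapsed <- numeric(n_iter)
  for (i in seq_len(n_iter)) {
    a <- matrix(rnorm(n * n), nrow = n)
    b <- as.double(seq_len(n))
    invisible(gc())
    elapsed[i] <- system.time(
      x <- solve(crossprod(a), crossprod(a, b))
    )[["elapsed"]]
    rm(a, b, x)
  }
  elapsed
}

elapsed <- time_solve()

# Slope of elapsed time against iteration index: a value clearly above zero
# over many iterations would match the slowdown described above.
trend <- coef(lm(elapsed ~ seq_along(elapsed)))[[2]]

print(elapsed)
print(trend)
```

To reproduce the original conditions, the matrix construction inside the loop would be swapped back to the `dgeMatrix` call and `n` raised to 3250.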
Switching the for loop to an *apply seems to yield similar results.
I know the code is not optimised, but it comes from a widely used benchmark (which iterates only 3 times by default), and depending on what causes this behaviour, it could indicate a serious bias in the benchmark's results.
I'm running R version 3.0.1 (x86_64) on Mac OS 10.8.4 with 16 GB RAM (plenty of which is free). The BLAS is OpenBLAS.
One option would be to use the compiler package to compile your code into byte code. This may eliminate the odd timing issues, as every iteration will then call the same compiled code, and it may also make your code somewhat faster. To enable the JIT compiler for your code, add the two lines below before the loop:
library(compiler)
enableJIT(3)
If compiling the code does not eliminate the issue, you will at least have narrowed down the set of possible causes.
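As a rough way to check whether byte-compilation changes anything, the same computation can be timed with the JIT switched off and then as an explicitly compiled function. This is a sketch under stated assumptions: `run_once` is a hypothetical helper, and it uses base R matrices with a small size so it finishes quickly. Note that from R 3.4.0 onwards the JIT is enabled by default, so on a recent R the two sets of timings would be expected to look much alike.

```r
library(compiler)

# Hypothetical helper: one iteration of the benchmark body, returning the
# elapsed time. Assumption: base R matrices and a small n for a quick run.
run_once <- function(n = 200) {
  a <- matrix(rnorm(n * n), nrow = n)
  b <- as.double(seq_len(n))
  system.time(solve(crossprod(a), crossprod(a, b)))[["elapsed"]]
}

# Explicitly byte-compile the helper.
run_once_cmp <- cmpfun(run_once)

# enableJIT() returns the previous JIT level, so it can be restored afterwards.
old_level <- enableJIT(0)                 # JIT off
t_plain <- vapply(1:5, function(i) run_once(), numeric(1))

enableJIT(3)                              # JIT at the highest level
t_jit <- vapply(1:5, function(i) run_once_cmp(), numeric(1))

enableJIT(old_level)                      # restore the original setting

print(rbind(plain = t_plain, compiled = t_jit))
```

Since nearly all of the work here happens inside compiled BLAS/LAPACK routines, similar timings in both rows would suggest the slowdown lies outside the R interpreter.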