EDIT: Updated thanks to @daroczig's lovely answer below. However, test 2 still feels like it takes longer than test 1 which is what I'm wondering about.
UPDATE: On second reading, @daroczig's answer does explain away my confusion -- the issue was due to me not properly thinking out the system.time(expr) line of code.
I wanted to make a version of the system.time
function which would be a bit more informative for myself in terms of understanding run-to-run time fluctuations:
system.time.summary <- function(N, expr) {
t.mat <- replicate(N, system.time(expr))
as.data.frame(apply(t.mat[1:3,], 1, summary))
}
However the problem is, in the self contained code below, test.2
feels like it is taking longer to do than test.1
(and I've run them several times to check), even though the code is pretty much exactly the same (test.1
uses a wrapper function, whereas test.2
is just the raw code)
# set up number of runs
N <- 100
# test 1
system.time.summary(N, (1:1e8)^2 + 1)
user.self sys.self elapsed
Min. 0.000 0.000 0.000
1st Qu. 0.000 0.000 0.000
Median 0.000 0.000 0.000
Mean 0.058 0.031 0.089
3rd Qu. 0.000 0.000 0.000
Max. 0.580 0.310 0.890
# test 2
t.mat = replicate(N, system.time((1:1e8)^2 + 1))
as.data.frame(apply(t.mat[1:3,], 1, summary))
user.self sys.self elapsed
Min. 0.630 0.120 0.860
1st Qu. 0.665 0.170 0.880
Median 0.695 0.195 0.880
Mean 0.692 0.196 0.882
3rd Qu. 0.715 0.225 0.890
Max. 0.760 0.260 0.900
I hope I explained that OK! It could be that it's currently Monday morning, but this is confusing me...
My system:
# Windows Server 2008 R2
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-pc-mingw32/x64 (64-bit)
You are running system.time(system.time())
in the fist test, and also having system.time(1:1e8)^2 + 1)
as an expression in the function which not a good idea, see:
> expr <- system.time((1:1e8)^2 + 1)
> system.time(expr)
user system elapsed
0 0 0
But anyway: use the microbenchmark package from CRAN for such purposes, you will not regret. Set up your distinct functions and you can easily roll out your simulations with 100, 1000 or any runs. You can get a neat summary and boxplots at the end of benchmarking.
For example:
> test1 <- function() (1:1e8)^2 + 1
> (results <- microbenchmark(test1(), times=10))
Unit: nanoeconds
min lq median uq max
test1() 3565386356 3703142531 3856450582 3931033077 3986309085
> boxplot(results)
As daroczig said, you have an extra system.time. But there's something else :
If you put a browser()
in your function, you'd see what happens. In fact, the expression you make, is evaluated only once and then kept in memory. This is how R optimizes internally. So if you do :
system.time.summary(N,(1:1e8)^2 +1)
t.mat is internally :
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
user.self 0.61 0 0 0 0 0 0 0 0 0
sys.self 0.36 0 0 0 0 0 0 0 0 0
elapsed 0.97 0 0 0 0 0 0 0 0 0
user.child NA NA NA NA NA NA NA NA NA NA
sys.child NA NA NA NA NA NA NA NA NA NA
and expr is :
Browse[2]> str(expr)
num [1:100000000] 2 5 10 17 26 37 50 65 82 101 ...
It's a bit difficult to change this, as R will evaluate any static expression only once and then retrieve the result another 99 times from the memory. If you want this not to happen, you have to pass an expression explicitly, and add the eval()
function. :
system.time.summary <- function(N, expr) {
t.mat <- replicate(N, system.time(eval(expr)))
as.data.frame(apply(t.mat[1:3,], 1, summary))
}
system.time.summary(N, expression((1:1e8)^2 + 1))
Now expr gets evaluated every time, and remains an expression in the function :
Browse[2]> expr
expression((1:1e+08)^2 + 1)
This gives you the correct timings.
user.self sys.self elapsed
Min. 0.6400 0.2000 0.970
1st Qu. 0.6850 0.2375 0.980
Median 0.7150 0.2700 0.985
Mean 0.7130 0.2700 0.985
3rd Qu. 0.7425 0.2975 0.990
Max. 0.7800 0.3500 1.000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With