Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculation time !=

I was wondering how much faster a!=0 is than !a==0 and used the R package microbenchmark. Here's the code (reduce 3e6 and 100 if your pc is slow):

library("microbenchmark")
a <- sample(0:1, size=3e6, replace=TRUE)
speed <- microbenchmark(a != 0, ! a == 0, times=100)
boxplot(speed, notch=TRUE, unit="ms", log=F)

Everytime, I get a plot like the one below. As expected, the first version is faster (median 26 milliseconds) than the second (33 ms).

But where do these few very high values (outliers) come from? Is that some memory management effect? If I set times to 10, there are no outliers...

Edit: sessionInfo(): R version 3.1.2 (2014-10-31) Platform: x86_64-w64-mingw32/x64 (64-bit)

computation time unequal and not_equal

like image 261
Berry Boessenkool Avatar asked Dec 08 '14 11:12

Berry Boessenkool


1 Answers

You say that you don't have outliers when times=10, but run microbenchmark with times=10 several times and you are likely to see the odd outlier. Here is a comparison of one run of times=100 with ten runs of times=10, which shows that outliers occur in both situations.

Depending on the size of the objects involved in the expression, I imagine outliers could arise when your machine is struggling with memory limitations, but they might also occur due to CPU strain e.g. due to non-R processes.

a <- sample(0:1, size=3e6, replace=TRUE)
speed1 <- microbenchmark(a != 0, ! a == 0, times=100)
speed1 <- as.data.frame(speed1)

speed2 <- replicate(10, microbenchmark(a != 0, ! a == 0, times=10), simplify=FALSE)
speed2 <- do.call(rbind, lapply(speed2, cbind))

times <- cbind(rbind(speed1, speed2), method=rep(1:2, each=200))
boxplot(time ~ expr + method, data=times, 
        names=c('!=; 1x100', '!==; 1x100', '!=; 10x10', '!==; 10x10'))

enter image description here

like image 80
jbaums Avatar answered Oct 05 '22 14:10

jbaums