
Benchmarking: using `expression`, `quote`, or neither

Generally, when I run benchmarks, I wrap my statements in expression. Recently, it was suggested to either (a) not do so or (b) use quote instead of expression.

I find two advantages to wrapping the statements:

  • Wrapped statements are easier to swap out than entire statements.
  • I can lapply over a list of inputs and compare the results.
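
As a minimal sketch of the second point, the snippet below evaluates one wrapped expression against a list of inputs. It uses base R's system.time rather than benchmark, and the input names (small, large) and their sizes are illustrative assumptions, not my actual data.

```r
# One wrapped statement, reused for every input
expr <- expression(apply(mat, 2, mean))

# Hypothetical inputs of different sizes (names and sizes are illustrative)
inputs <- list(
  small = matrix(runif(4e3), ncol = 40),
  large = matrix(runif(4e5), ncol = 400)
)

# Evaluate the expression with `mat` bound to each input
# and keep only the elapsed time in seconds
timings <- lapply(inputs, function(m) {
  system.time(eval(expr, list(mat = m)))[["elapsed"]]
})
```

Swapping in a different statement only requires changing expr; the lapply call stays the same.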

However, in exploring the different methods, I noticed a discrepancy between the three (wrapping in expression, wrapping in quote, or not wrapping at all).

The question is:
Why the discrepancy?
(It appears that wrapping in quote does not actually evaluate the call.)

EXAMPLE:

# SETUP (benchmark() comes from the rbenchmark package)
  library(rbenchmark)

# SAMPLE DATA
  mat <-  matrix(sample(seq(1e6), 4^2*1e4, TRUE), ncol=400) 

# RAW EXPRESSION TO BENCHMARK IS: 
  # apply(mat, 2, mean)

# WRAPPED EXPRESSION: 
  expr <- expression(apply(mat, 2, mean))
  quot <- quote(apply(mat, 2, mean))

# BENCHMARKS
  benchmark(raw=apply(mat, 2, mean), expr, quot)[, -(7:8)]
  #    test replications elapsed relative user.self sys.self
  #  2 expr          100   1.269       NA     1.256    0.019
  #  3 quot          100   0.000       NA     0.001    0.000
  #  1  raw          100   1.494       NA     1.286    0.021


# BENCHMARKED INDIVIDUALLY 
  benchmark(raw=apply(mat, 2, mean))[, -(7:8)]
  benchmark(expr)[, -(7:8)]
  benchmark(quot)[, -(7:8)]

  # results
  #    test replications elapsed relative user.self sys.self
  #  1  raw          100   1.274        1      1.26    0.018
  #    test replications elapsed relative user.self sys.self
  #  1 expr          100   1.476        1     1.342    0.021
  #    test replications elapsed relative user.self sys.self
  #  1 quot          100   0.006        1     0.006    0.001
Ricardo Saporta asked Dec 04 '12 22:12



1 Answer

Your issue is that quote produces a call, not an expression, so within the call to benchmark there is no expression to evaluate. This is consistent with the near-zero timings you see for quot.

If you eval the call explicitly, it does get evaluated, and the timings are reasonable.

> class(quot)
[1] "call"
> class(expr)
[1] "expression"


 benchmark(raw=apply(mat, 2, mean), expr, eval(quot))[, -(7:8)]
        test replications elapsed relative user.self sys.self
3 eval(quot)          100    0.76    1.000      0.77        0
2       expr          100    0.83    1.092      0.83        0
1        raw          100    0.78    1.026      0.78        0
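
To make the distinction concrete, here is a rough sketch in base R. The helper run_arg is hypothetical and is not rbenchmark's actual code; it only assumes that a benchmarking function captures its argument unevaluated and unwraps it when the value turns out to be an expression object.

```r
x <- c(1, 2, 3)
expr <- expression(sum(x))
quot <- quote(sum(x))

# Hypothetical helper (an assumption for illustration, not rbenchmark's code):
# capture the argument unevaluated, unwrap it if it is an expression object,
# then evaluate whatever was captured in the caller's environment.
run_arg <- function(arg) {
  captured <- substitute(arg)
  value <- arg
  if (is.expression(value)) captured <- value[[1]]
  eval(captured, parent.frame())
}

res_raw  <- run_arg(sum(x))      # the call itself is captured and run
res_expr <- run_arg(expr)        # expression object is unwrapped and run
res_quot <- run_arg(quot)        # only the symbol `quot` is looked up
res_eval <- run_arg(eval(quot))  # explicit eval forces the call to run

is.call(res_quot)  # TRUE: the call came back unevaluated
```

Under that assumption, only the quot case skips the real work, which would explain a timing close to zero.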

In general, I tend to create a function that contains the call / process I wish to benchmark. Note that it is good practice to include steps such as assigning the result to a variable.

e.g.

 raw <- function() {x <- apply(mat, 2, mean)}

In that case, it looks like there is a slight improvement with eval(quote(...)).

benchmark(raw(), eval(quote(raw())))

                test replications elapsed relative user.self sys.self 
2 eval(quote(raw()))          100    0.76    1.000      0.75     0.01        
1              raw()          100    0.80    1.053      0.80     0.00        

But these small differences are often due to function-call overhead and may not reflect how performance scales to larger problems. As in the many questions that benchmark data.table solutions, using a small number of replications on big data may better reflect performance.
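
A minimal sketch of that idea, using base R's system.time instead of benchmark. The names mat_big and time_reps, the data size, and the replication count are all assumptions chosen for illustration.

```r
set.seed(1)

# Larger input than in the question (size is an illustrative assumption)
mat_big <- matrix(runif(2e6), ncol = 400)

# Wrap the process in a function and assign the result, as suggested above
raw <- function() { x <- apply(mat_big, 2, mean) }

# Hypothetical helper: total elapsed seconds for a few replications of f
time_reps <- function(f, replications) {
  system.time(for (i in seq_len(replications)) f())[["elapsed"]]
}

elapsed <- time_reps(raw, 5)
elapsed / 5  # rough per-replication time in seconds
```

With only a handful of replications, per-call overhead matters less and the size of the data dominates the timing.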

mnel answered Oct 21 '22 14:10
