My S4 class has a method that is called many times. I noticed that it runs much more slowly than a comparable plain function called directly. So I added a slot of type "function" to my class and called that function instead of the method. The example below shows two ways of doing this, and both run much faster than the corresponding method. The example also suggests that the method's slowness is not due to the method having to retrieve data from the object, since the functions are faster even when they do that too.
Of course, this way of doing things is not ideal. Is there a way to speed up method dispatch? Any suggestions?
setClass(Class = "SpeedTest",
  representation = representation(
    x = "numeric",
    foo1 = "function",
    foo2 = "function"
  )
)
speedTest <- function(n) {
  new("SpeedTest",
    x = rnorm(n),
    foo1 = function(z) sqrt(abs(z)),
    foo2 = function() {}
  )
}
setGeneric(
  name = "method.foo",
  def = function(object) {standardGeneric("method.foo")}
)
setMethod(
  f = "method.foo",
  signature = "SpeedTest",
  definition = function(object) {
    sqrt(abs(object@x))
  }
)
setGeneric(
  name = "create.foo2",
  def = function(object) {standardGeneric("create.foo2")}
)
setMethod(
  f = "create.foo2",
  signature = "SpeedTest",
  definition = function(object) {
    z <- object@x
    object@foo2 <- function() sqrt(abs(z))
    object
  }
)
> st <- speedTest(1000)
> st <- create.foo2(st)
>
> iters <- 100000
>
> system.time(for (i in seq(iters)) method.foo(st)) # slowest by far
   user  system elapsed
   3.26    0.00    3.27
> # much faster
> system.time({foo1 <- st@foo1; x <- st@x; for (i in seq(iters)) foo1(x)})
   user  system elapsed
   1.47    0.00    1.46
> # retrieving st@x instead of x does not affect speed
> system.time({foo1 <- st@foo1; for (i in seq(iters)) foo1(st@x)})
   user  system elapsed
   1.47    0.00    1.49
> # same speed as foo1 although no explicit argument
> system.time({foo2 <- st@foo2; for (i in seq(iters)) foo2()})
   user  system elapsed
   1.44    0.00    1.45
> # Cannot increase speed by using a lambda to "eliminate" the argument of method.foo
> system.time({foo <- function() method.foo(st); for (i in seq(iters)) foo()})
   user  system elapsed
   3.28    0.00    3.29
The cost is in method look-up, which starts from scratch in each iteration of your timing loop. This can be short-circuited by resolving method dispatch once, before the loop:
METHOD <- selectMethod(method.foo, class(st))
for (i in seq(iters)) METHOD(st)
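For readers who want to try this in isolation, here is a minimal self-contained sketch of the same idea. The class `Demo` and the generic `rootAbs` are hypothetical names introduced only for illustration; they are not part of the question's code. The sketch assumes the object's class does not change inside the loop, otherwise the method would have to be selected again.

```r
library(methods)

# Hypothetical class and generic, used only for this illustration.
setClass("Demo", representation(x = "numeric"))
setGeneric("rootAbs", function(object) standardGeneric("rootAbs"))
setMethod("rootAbs", "Demo", function(object) sqrt(abs(object@x)))

obj <- new("Demo", x = c(-4, 9, -16))

# Resolve dispatch once, outside any loop.
cached <- selectMethod("rootAbs", class(obj))

# Calling the selected method directly skips the generic's dispatch step.
viaCache <- cached(obj)

# The ordinary call through the generic, for comparison.
viaGeneric <- rootAbs(obj)
```

Both calls should return the same values; only the route to the method body differs.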
Better method look-up would be a very interesting and worthwhile project; there are valuable lessons to be learned from other dynamic languages, e.g., the inline caching mentioned on Wikipedia's dynamic dispatch page.
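As a rough user-level illustration of the caching idea (not how R or any other language implements inline caching internally), the sketch below remembers the method selected for each class it has seen. The helper `makeCachedCaller`, the class `Shape`, and the generic `area` are all hypothetical. It assumes single-argument dispatch and that methods are not redefined after being cached; whether it is actually faster than plain dispatch would need to be measured.

```r
library(methods)

# Hypothetical class and generic, used only for this illustration.
setClass("Shape", representation(r = "numeric"))
setGeneric("area", function(object) standardGeneric("area"))
setMethod("area", "Shape", function(object) pi * object@r^2)

# Hypothetical helper: returns a caller that caches the selected
# method per class name in a private environment.
makeCachedCaller <- function(generic) {
  cache <- new.env(parent = emptyenv())
  function(object) {
    key <- class(object)[1L]
    method <- cache[[key]]          # NULL if this class was not seen yet
    if (is.null(method)) {
      method <- selectMethod(generic, key)
      assign(key, method, envir = cache)
    }
    method(object)
  }
}

fastArea <- makeCachedCaller("area")
res <- fastArea(new("Shape", r = 2))
```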
I wonder whether you are making so many method calls because your data representation and methods are not fully vectorized?
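To make the vectorization point concrete, here is a small sketch with a hypothetical class `Obs` and generic `magnitude` (assumed names, not from the question). It contrasts one object per value, where dispatch happens once per element, with a single object holding the whole vector, where dispatch happens once.

```r
library(methods)

# Hypothetical class and generic, used only for this illustration.
setClass("Obs", representation(x = "numeric"))
setGeneric("magnitude", function(object) standardGeneric("magnitude"))
setMethod("magnitude", "Obs", function(object) sqrt(abs(object@x)))

values <- c(-1, 4, -9, 16)

# One object per value: the generic is dispatched length(values) times.
perElement <- vapply(
  values,
  function(v) magnitude(new("Obs", x = v)),
  numeric(1)
)

# One object holding all values: the generic is dispatched once.
vectorized <- magnitude(new("Obs", x = values))
```

The results are the same, but the second form pays the dispatch cost only once regardless of how many values there are.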
This doesn't help you directly with your problem, but it's much easier to benchmark this sort of stuff with the microbenchmark package:
f <- function(x) NULL
s3 <- function(x) UseMethod("s3")
s3.integer <- function(x) NULL
A <- setClass("A", representation(a = "list"))
setGeneric("s4", function(x) standardGeneric("s4"))
setMethod(s4, "A", function(x) NULL)
B <- setRefClass("B")
B$methods(r5 = function(x) NULL)
a <- A()
b <- B$new()
library(microbenchmark)
options(digits = 3)
microbenchmark(
bare = NULL,
fun = f(),
s3 = s3(1L),
s4 = s4(a),
r5 = b$r5()
)
# Unit: nanoseconds
#  expr   min    lq median    uq   max neval
#  bare    13    20     22    29    36   100
#   fun   171   236    270   310   805   100
#    s3  2025  2478   2651  2869  8603   100
#    s4 10017 11029  11528 11905 36149   100
#    r5  9080 10003  10390 10804 61864   100
On my computer, the bare call takes about 20 ns. Wrapping it in a function adds roughly another 200 ns; this is the cost of creating the environment where the function execution happens. S3 method dispatch adds around 3 µs, and S4 and reference classes (r5 in the table) around 10 to 12 µs.