I would like to use memoization to cache the results of certain expensive operations so that they are not computed over and over again.
Both memoise and R.cache fit my needs. However, I am finding that caching is not robust across calls.
Here is an example that demonstrates the problem I'm seeing:
library(memoise)
library(proto)
# Memoisation works: a() is evaluated only once, however often b() is called
a <- function(x) runif(1)
replicate(5, a())
b <- memoise(a)
replicate(5, b())
# Memoisation fails: fn() is re-evaluated on every call to calc()
ProtoTester <- proto(
calc = function(.) {
fn <- function() print(runif(1))
mfn <- memoise(fn)
invisible(mfn())
}
)
replicate(5, ProtoTester$calc())
Updated based on answer
This question can have different answers depending on whether persistent or non-persistent caching is used. Non-persistent caching (such as memoise) may require single assignment, in which case the answer below is a nice way to go. Persistent caching (such as R.cache) works across sessions and should be robust with respect to multiple assignments. The approach above works with R.cache. Despite the multiple assignments, fn is only called once with R.cache. It would be called twice with memoise.
> ProtoTester <- proto(
+ calc = function(.) {
+ fn <- function() print(runif(1))
+ invisible(memoizedCall(fn))
+ }
+ )
> replicate(5, ProtoTester$calc())
[1] 0.977563
[1] 0.1279641
[1] 0.01358866
[1] 0.9993092
[1] 0.3114813
[1] 0.97756303 0.12796408 0.01358866 0.99930922 0.31148128
> ProtoTester <- proto(
+ calc = function(.) {
+ fn <- function() print(runif(1))
+ invisible(memoizedCall(fn))
+ }
+ )
> replicate(5, ProtoTester$calc())
[1] 0.97756303 0.12796408 0.01358866 0.99930922 0.31148128
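The difference between the two kinds of cache can be sketched in base R. This is only an illustration of the idea, not how R.cache works internally: disk_memo_call, cache_dir and the caller-supplied key are hypothetical names introduced here, whereas R.cache derives its own key for you.

```r
# Hypothetical helper (not part of R.cache): store the result of f() on disk
# under a caller-supplied key and reuse it on later calls.
disk_memo_call <- function(f, key, dir) {
  path <- file.path(dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))   # cache hit: f() is not evaluated
  }
  value <- f()
  saveRDS(value, path)      # cache miss: compute once and persist
  value
}

# A fresh cache directory so the example does not pick up stale files.
cache_dir <- tempfile("cache-")
dir.create(cache_dir)

calc <- function() {
  fn <- function() runif(1)              # fn is redefined on every call
  disk_memo_call(fn, "calc-result", cache_dir)
}

vals <- replicate(5, calc())
print(vals)   # five copies of the same number
```

Because the cached value lives outside the function object, redefining fn on each call does not discard it. An in-memory wrapper such as the one memoise() returns keeps its cache with the wrapper, so rebuilding the wrapper starts from an empty cache.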
The reason why I thought I had a problem with R.cache is that I was passing a proto method as the function to memoizedCall. proto methods are bound to environments in ways that R.cache has a hard time with. What you have to do in this case is unbind the function (get from an instantiated method to a simple function) and then pass the object manually as the first argument. The following example shows how this works (both Report and Report$loader are proto objects):
# This will not memoize the call
memoizedCall(Report$loader$download_report)
# This works as intended
memoizedCall(with(Report$loader, download_report), Report$loader)
I'd love to know why R.cache works with normal functions bound to environments but fails with proto instantiated methods.
In your code, the function is memoized anew each time it is called. The following should work: it is only memoized once, when it is defined.
ProtoTester <- proto(
calc = {
fn <- function() print(runif(1))
mfn <- memoise(fn)
function(.) mfn()
}
)
replicate(5, ProtoTester$calc())
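To see why the wrapper has to be created only once, here is a minimal base R sketch. simple_memo is a hypothetical stand-in written for this illustration (it is not how memoise is implemented), and the calls counter is only there to show how often fn really runs.

```r
# Hypothetical minimal memoiser for a zero-argument function.
# The cache lives in the closure of the returned wrapper.
simple_memo <- function(f) {
  cache <- NULL
  computed <- FALSE
  function() {
    if (!computed) {
      cache <<- f()
      computed <<- TRUE
    }
    cache
  }
}

calls <- 0
fn <- function() {
  calls <<- calls + 1   # count real evaluations
  runif(1)
}

# Wrapper rebuilt inside the function: every call starts with an empty cache.
rewrap <- function() {
  mfn <- simple_memo(fn)
  mfn()
}
invisible(replicate(5, rewrap()))
calls_rewrap <- calls

# Wrapper built once and reused: the cache survives between calls.
calls <- 0
mfn_once <- simple_memo(fn)
vals_once <- replicate(5, mfn_once())
calls_once <- calls

print(c(rewrap = calls_rewrap, once = calls_once))
```

In the first case fn runs on every call, because each new wrapper brings its own empty cache. In the second it runs a single time, which is what defining mfn once at proto definition time achieves.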
An alternative solution would be to use evals from (my) pander package for evaluation. It has an internal caching engine, which is either temporary (kept in an environment for the current R session) or persistent (with disk storage). Short example based on your code:
library(pander)
ProtoTester <- proto(
calc = function(.) {
fn <- function() runif(1)
mfn <- evals('fn()')[[1]]$result
invisible(mfn)
}
)
And running evals with cache off and on would result in:
> evals.option('cache', FALSE)
> replicate(5, ProtoTester$calc())
[1] 0.7152186 0.4529955 0.4160411 0.1166872 0.8776698
> evals.option('cache', TRUE)
> evals.option('cache.time', 0)
> replicate(5, ProtoTester$calc())
[1] 0.7716874 0.7716874 0.7716874 0.7716874 0.7716874
Please note that the evals.option function is to be renamed to evalsOption soon to mitigate R CMD check warnings about S3 methods.