I am writing a class that has some computation-heavy methods, plus some parameters that the user will want to tweak iteratively and that are independent of the computation.
The actual use is for visualization, but here's a cartoon example:
    class MyClass(object):
        def __init__(self, x, name, mem=None):
            self.x = x
            self.name = name
            if mem is not None:
                self.square = mem.cache(self.square)

        def square(self, x):
            """This is the 'computation heavy' method."""
            return x ** 2

        def report(self):
            """Use the results of the computation and a tweakable parameter."""
            print("Here you go, %s" % self.name)
            return self.square(self.x)
The basic idea is that the user might want to create many instances of this class with the same x but different name parameters. I want to allow the user to provide a joblib.Memory object that will cache the computation part, so they can "report" to lots of different names without recomputing the squared array each time.
(This is a little weird, I know. The reason the user needs a different class instance for each name is that they'll actually be interacting with an interface function that looks like this:
    def myfunc(x, name, mem=None):
        theclass = MyClass(x, name, mem)
        theclass.report()
But let's ignore that for now.)
Following the joblib docs I am caching the square function with the line self.square = mem.cache(self.square). The problem is that, because self will be different for different instances, the array gets recomputed every time even when the argument is the same.
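To make the cache miss concrete, here is a minimal toy sketch. It does not use joblib: toy_cached, arg_state and Report are hypothetical names, and the sketch simply assumes the cache key is built from the state of every argument, the instance included. Under that assumption, two instances that differ only in name produce two different keys, while a pure function of x produces one.

```python
# Toy illustration only: toy_cached is a hypothetical stand-in, not joblib.
class Report(object):
    def __init__(self, x, name):
        self.x = x
        self.name = name


def arg_state(arg):
    """Describe an argument by its attributes if it has any, else by value."""
    if hasattr(arg, "__dict__"):
        return tuple(sorted(vars(arg).items()))
    return arg


def toy_cached(func, store, executions):
    """Wrap func so results are stored under a key built from all arguments."""
    def wrapper(*args):
        key = tuple(arg_state(a) for a in args)
        if key not in store:
            executions.append(key)  # record a real (uncached) run
            store[key] = func(*args)
        return store[key]
    return wrapper


def square_method_like(self, x):
    """Mimics a method: the instance is part of the arguments."""
    return x ** 2


def square_pure(x):
    """Depends on x only."""
    return x ** 2


runs_with_self, runs_pure = [], []
with_self = toy_cached(square_method_like, {}, runs_with_self)
pure = toy_cached(square_pure, {}, runs_pure)

alice, bob = Report(42, "Alice"), Report(42, "Bob")
results_with_self = [with_self(alice, 42), with_self(bob, 42)]
results_pure = [pure(42), pure(42)]

print(len(runs_with_self))  # real runs when the instance is part of the key
print(len(runs_pure))       # real runs when the key is just x
```

In this toy version the method-like function runs once per instance, while the pure function runs once in total, which mirrors the recomputation described above.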
I am guessing that the correct way to handle this is changing the line to:
    self.square = mem.cache(self.square, ignore=["self"])
However, are there any drawbacks to this approach? Is there a better way to accomplish the caching?
From the docs:
"If you want to use cache inside a class the recommended pattern is to cache a pure function and use the cached function inside your class."
Since you want memory caching to be optional, I recommend something like this:
    from tempfile import mkdtemp

    from joblib import Memory


    def square_function(x):
        """This is the 'computation heavy' function."""
        print(' square_function is executing, not using cached result.')
        return x ** 2


    class MyClass(object):
        def __init__(self, x, name, mem=None):
            self.x = x
            self.name = name
            # Cache the pure module-level function, not a bound method,
            # so the cache key depends only on x.
            if mem is not None:
                self._square_function = mem.cache(square_function)
            else:
                self._square_function = square_function

        def square(self, x):
            return self._square_function(x)

        def report(self):
            print("Here you go, %s" % self.name)
            return self.square(self.x)


    cachedir = mkdtemp()
    # Passed positionally: the keyword was `cachedir` in older joblib
    # releases and is `location` in newer ones.
    memory = Memory(cachedir, verbose=0)

    objects = [
        MyClass(42, 'Alice (cache)', memory),
        MyClass(42, 'Bob (cache)', memory),
        MyClass(42, 'Charlie (no cache)'),
    ]

    for obj in objects:
        print(obj.report())
Execution yields:
    Here you go, Alice (cache)
     square_function is executing, not using cached result.
    1764
    Here you go, Bob (cache)
    1764
    Here you go, Charlie (no cache)
     square_function is executing, not using cached result.
    1764
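If you want to check this pattern without writing to disk, one option is a small in-memory stand-in for the Memory object. The sketch below is only an illustration: InMemoryCache and the executions list are hypothetical names, the stand-in implements nothing beyond a cache(func) method, and myfunc is the interface function from the question, adapted here to return the report line instead of printing it.

```python
class InMemoryCache(object):
    """Hypothetical stand-in exposing cache(func), the one method MyClass uses."""

    def __init__(self):
        self._results = {}

    def cache(self, func):
        def wrapper(*args):
            key = (func.__name__, args)
            if key not in self._results:
                self._results[key] = func(*args)
            return self._results[key]
        return wrapper


executions = []  # one entry per real run of the expensive step


def square_function(x):
    executions.append(x)
    return x ** 2


class MyClass(object):
    def __init__(self, x, name, mem=None):
        self.x = x
        self.name = name
        # Wrap the module-level function, never a bound method.
        self._square_function = (
            mem.cache(square_function) if mem is not None else square_function
        )

    def report(self):
        return "%s: %d" % (self.name, self._square_function(self.x))


def myfunc(x, name, mem=None):
    """Interface function from the question: one new instance per call."""
    return MyClass(x, name, mem).report()


mem = InMemoryCache()
lines = [myfunc(42, "Alice", mem), myfunc(42, "Bob", mem), myfunc(42, "Charlie")]
print(lines)
print(len(executions))  # counts the real runs of square_function
```

Because every instance wraps the same module-level function and shares the same cache object, the second cached call finds the stored result even though it comes from a brand-new instance. Only the first cached call and the uncached call actually execute square_function.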