
Clearing lru_cache of certain methods when an attribute of the class is updated?

I have an object with a method/property multiplier. This method is called many times in my program, so I've decided to use lru_cache() on it to improve the execution speed. As expected, it is much faster.

The following code shows the problem:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

CF = MyClass()
assert CF.multiplier == 1000

CF.current_contract = 201712
assert CF.multiplier == 25

The 2nd assert fails, because the cached value is 1000 as lru_cache() is unaware that the underlying attribute current_contract was changed.
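One way to see what is happening: drop the failing assert and inspect the cache through the property's underlying function with the standard cache_info() method - the second read is a cache hit that serves the stale value:

CF = MyClass()
assert CF.multiplier == 1000          # first access: cache miss, value computed

CF.current_contract = 201712
print(CF.multiplier)                  # prints 1000, not 25 - served from the cache
print(vars(MyClass)['multiplier'].fget.cache_info())
# Run on its own with only the class definition above, this prints:
# CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)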

Is there a way to clear the cache when self.current_contract is updated?

Thanks!

asked Jul 24 '17 by agiap


People also ask

What does Functools lru_cache do?

Python's functools module comes with the @lru_cache decorator, which gives you the ability to cache the result of your functions using the Least Recently Used (LRU) strategy. This is a simple yet powerful technique that you can use to leverage the power of caching in your code.
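For example, memoizing a naive recursive function (a toy example, unrelated to the question's code):

from functools import lru_cache

@lru_cache(maxsize=None)        # cache every distinct argument
def fib(n):
    # Naive recursive Fibonacci; memoization makes repeated work cheap.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))                 # fast, because each fib(k) is computed only once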

How do you clear a cache in Python?

Every function decorated with lru_cache() exposes a cache_clear() method that clears (invalidates) its cache. The limitation is that each cache is per-function: cache_clear() must be called separately on every decorated function whose cache you want to empty.
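A small sketch of that per-function behaviour:

from functools import lru_cache

@lru_cache()
def square(x):
    return x * x

square(3)
square(3)
print(square.cache_info().hits)   # 1 - the second call was served from the cache
square.cache_clear()              # empties this function's cache only
print(square.cache_info())        # CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)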

What is lru_cache in Python?

lru_cache() is a decorator that avoids re-executing a function for inputs it has already seen, using the memoization technique. The wrapped function gains a cache_info() method that returns a named tuple containing hits, misses, maxsize, and currsize, which helps you assess the cache's effectiveness and tune the maxsize parameter.
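For example:

from functools import lru_cache

@lru_cache(maxsize=2)
def double(x):
    return 2 * x

double(1)
double(1)
double(2)
info = double.cache_info()
print(info)                    # CacheInfo(hits=1, misses=2, maxsize=2, currsize=2)
print(info.hits, info.misses)  # the fields are ordinary attributes of the named tuple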

How does LRU cache work?

A Least Recently Used (LRU) Cache organizes items in order of use, allowing you to quickly identify which item hasn't been used for the longest amount of time. Picture a clothes rack, where clothes are always hung up on one side. To find the least-recently used item, look at the item on the other end of the rack.
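With a tiny maxsize the eviction order is easy to observe (a toy sketch):

from functools import lru_cache

@lru_cache(maxsize=2)
def fetch(key):
    print('computing', key)
    return key

fetch('a')   # computing a
fetch('b')   # computing b
fetch('a')   # cache hit: 'a' becomes the most recently used entry
fetch('c')   # computing c - evicts 'b', the least recently used entry
fetch('b')   # computing b - it had been evicted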


2 Answers

Yes, quite simply: make current_contract a read/write property and clear the cache in the property's setter:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}
        self.current_contract = 201706

    @property
    def current_contract(self):
        return self._current_contract

    @current_contract.setter
    def current_contract(self, value):
        self._current_contract = value
        # The lru_cache-wrapped function is the property's fget;
        # clearing it invalidates the cached multiplier.
        type(self).multiplier.fget.cache_clear()

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

NB: I assume your real use case involves costly computations rather than a mere dict lookup - otherwise lru_cache might be a bit overkill ;)
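A quick check of the behaviour (assuming the class definition above):

CF = MyClass()
assert CF.multiplier == 1000

CF.current_contract = 201712   # the setter clears the cached value
assert CF.multiplier == 25     # recomputed with the new contract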

answered Sep 22 '22 by bruno desthuilliers


Short Answer

Don't clear the cache when self.current_contract is updated. That is working against the cache and throws away information.

Instead, just add methods for __eq__ and __hash__. That will teach the cache (or any other mapping) which attributes are important for influencing the result.

Worked out example

Here we add __eq__ and __hash__ to your code. That tells the cache (or any other mapping) that current_contract is the relevant independent variable:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    def __hash__(self):
        return hash(self.current_contract)

    def __eq__(self, other):
        return self.current_contract == other.current_contract

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

An immediate advantage is that as you switch between contract numbers, previous results are kept in the cache. Try switching between 201706 and 201712 a hundred times and you will get 98 cache hits and 2 cache misses:

cf = MyClass()
for i in range(50):
    cf.current_contract = 201712
    assert cf.multiplier == 25
    cf.current_contract = 201706 
    assert cf.multiplier == 1000
print(vars(MyClass)['multiplier'].fget.cache_info())

This prints:

CacheInfo(hits=98, misses=2, maxsize=128, currsize=2)
answered Sep 22 '22 by Raymond Hettinger