Cache decorator for numpy arrays

I am trying to make a cache decorator for functions that take numpy arrays as input parameters:

from functools import lru_cache
import numpy as np
from time import sleep

a = np.array([1,2,3,4])

@lru_cache()
def square(array):
    sleep(1)
    return array * array

square(a)

But numpy arrays are not hashable:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-559f69d0dec3> in <module>()
----> 1 square(a)

TypeError: unhashable type: 'numpy.ndarray'

So they need to be converted to tuples. I have this working and caching correctly:

@lru_cache()
def square(array_hashable):
    sleep(1)
    array = np.array(array_hashable)
    return array * array

square(tuple(a))

But I wanted to wrap it all up in a decorator. So far I have tried:

def np_cache(function):
    def outter(array):
        array_hashable = tuple(array)

        @lru_cache()
        def inner(array_hashable_inner):
            array_inner = np.array(array_hashable_inner)
            return function(array_inner)

        return inner(array_hashable)

    return outter

@np_cache
def square(array):
    sleep(1)
    return array * array

But caching is not working. The computation is performed but not cached, as I am always waiting 1 second.

What am I missing here? I'm guessing lru_cache isn't getting the context right and it's being instantiated on each call, but I don't know how to fix it.

I have tried blindly throwing the functools.wraps decorator here and there with no luck.

asked Sep 14 '18 12:09 by Susensio


1 Answer

Your wrapper function creates a new inner() function each time you call it. That new function object is decorated at that point, so every call to outter() creates a new, empty lru_cache(). An empty cache always has to recalculate the value.
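To make the problem visible without timing anything, here is a minimal sketch that counts how often the decorated body really runs. The names np_cache_broken and calls are made up for this illustration; the decorator itself mirrors the one from the question.

```python
from functools import lru_cache
import numpy as np

calls = []  # hypothetical helper: one entry per real execution of the body


def np_cache_broken(function):
    def outter(array):
        # A brand-new inner() with a brand-new, empty cache is built
        # on every single call to outter().
        @lru_cache()
        def inner(hashable):
            return function(np.array(hashable))

        return inner(tuple(array))

    return outter


@np_cache_broken
def square(array):
    calls.append(1)
    return array * array


a = np.array([1, 2, 3, 4])
square(a)
square(a)
print(len(calls))  # the body ran on both calls; nothing came from a cache
```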

You need to create a decorator that attaches the cache to a function created just once per decorated target. If you are going to convert to a tuple before calling the cache, then you'll have to create two functions:

from functools import lru_cache, wraps
import numpy as np

def np_cache(function):
    # Created once per decorated function, so the cache persists across calls
    @lru_cache()
    def cached_wrapper(hashable_array):
        array = np.array(hashable_array)
        return function(array)

    @wraps(function)
    def wrapper(array):
        # tuple() gives a hashable key for 1-D arrays such as `a` above
        return cached_wrapper(tuple(array))

    # copy lru_cache attributes over too
    wrapper.cache_info = cached_wrapper.cache_info
    wrapper.cache_clear = cached_wrapper.cache_clear

    return wrapper

The cached_wrapper() function is created just once per call to np_cache() and is available to the wrapper() function as a closure. So wrapper() calls cached_wrapper(), which has a @lru_cache() attached to it, caching your tuples.

I also copied across the two function references that lru_cache puts on a decorated function, so they are accessible via the returned wrapper as well.
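As a quick check, the sketch below repeats the decorator and uses the copied cache_info() and cache_clear() references to confirm that the second call is served from the cache. It assumes a 1-D array, as in the question; the variable names first, second and info are only for illustration.

```python
from functools import lru_cache, wraps
import numpy as np


def np_cache(function):
    @lru_cache()
    def cached_wrapper(hashable_array):
        return function(np.array(hashable_array))

    @wraps(function)
    def wrapper(array):
        return cached_wrapper(tuple(array))

    wrapper.cache_info = cached_wrapper.cache_info
    wrapper.cache_clear = cached_wrapper.cache_clear
    return wrapper


@np_cache
def square(array):
    return array * array


a = np.array([1, 2, 3, 4])
first = square(a)    # computed and stored
second = square(a)   # looked up by the tuple key

info = square.cache_info()
print(info.hits, info.misses)  # one miss for the first call, one hit for the second

square.cache_clear()           # empties the cache again
```

Note that a cache hit returns the very same array object that was stored, so modifying the result in place would also change what later callers get back. Copying the result before mutating it avoids that.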

I also used the @functools.wraps() decorator to copy metadata from the original function object to the wrapper, such as the name, annotations and documentation string. This is always a good idea, because it means your decorated function is clearly identified in tracebacks and while debugging, and its documentation and annotations stay accessible. The decorator also adds a __wrapped__ attribute pointing back to the original function, which lets you unwrap the decorator again if need be.
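The effect of @wraps is easiest to see on a decorator that does nothing else. In this small sketch, passthrough is a hypothetical decorator invented for the example; only functools.wraps and inspect.unwrap come from the standard library.

```python
import inspect
from functools import wraps


def passthrough(function):
    # Hypothetical do-nothing decorator, used only to show what wraps() copies
    @wraps(function)
    def wrapper(*args, **kwargs):
        return function(*args, **kwargs)

    return wrapper


@passthrough
def square(values):
    """Return the element-wise square of values."""
    return [v * v for v in values]


print(square.__name__)  # name of the original function, not "wrapper"
print(square.__doc__)   # docstring of the original function

# __wrapped__ points at the undecorated function; inspect.unwrap follows it
undecorated = inspect.unwrap(square)
print(undecorated is square.__wrapped__)
```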

answered Sep 20 '22 17:09 by Martijn Pieters