What is the unit of maxsize in Python's lru_cache?

According to the documentation, the default value of maxsize for lru_cache from functools is 128, but no unit is defined:

Decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls. It can save time when an expensive or I/O bound function is periodically called with the same arguments.

Since a dictionary is used to cache results, the positional and keyword arguments to the function must be hashable.

Distinct argument patterns may be considered to be distinct calls with separate cache entries. For example, f(a=1, b=2) and f(b=2, a=1) differ in their keyword argument order and may have two separate cache entries.

If user_function is specified, it must be a callable. This allows the lru_cache decorator to be applied directly to a user function, leaving the maxsize at its default value of 128.

My question: is there any unit like bits, bytes, or megabytes attached to this value, or is it an arbitrary number that has no simple relationship to the memory used?
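For context, the behaviour described in the quoted docs can be checked empirically. This sketch shows that reordered keyword arguments produce separate cache entries (as observed in current CPython) and that unhashable arguments are rejected:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def f(a=0, b=0):
    return a + b

f(a=1, b=2)
f(b=2, a=1)  # same values, different keyword order

info = f.cache_info()
print(info.misses)    # 2 -- two separate cache entries
print(info.currsize)  # 2

try:
    f([1, 2])  # lists are unhashable, so they cannot serve as cache keys
except TypeError:
    print("unhashable arguments raise TypeError")
```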

Soren asked Jun 03 '20 22:06

People also ask

How to implement an LRU cache in Python?

Since version 3.2, Python has included the @lru_cache decorator for implementing the LRU strategy. You can use this decorator to wrap functions and cache their results up to a maximum number of entries.

What is maxsize in LRU_cache in Python?

Python's @lru_cache decorator offers a maxsize parameter that defines the maximum number of entries before the cache starts evicting old items. By default, maxsize is set to 128. If you set maxsize to None, the cache will grow indefinitely, and no entries will ever be evicted.
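A minimal sketch of the maxsize=None case, showing that the cache keeps one entry per distinct call with no upper bound:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded: nothing is ever evicted
def identity(x):
    return x

for n in range(1000):
    identity(n)

info = identity.cache_info()
print(info.currsize)  # 1000 -- one entry per distinct argument
print(info.maxsize)   # None
```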

What is LRU_cache in functools?

lru_cache() is a function in the functools module which helps reduce the execution time of a function by using the memoization technique. If typed is set to True, function arguments of different types will be cached separately.
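The typed parameter can be demonstrated with a small sketch: with typed=True, the int 3 and the float 3.0 get separate entries even though they compare equal:

```python
from functools import lru_cache

@lru_cache(maxsize=128, typed=True)
def double(x):
    return x * 2

double(3)    # cached under the int key 3
double(3.0)  # cached separately under the float key 3.0

print(double.cache_info().currsize)  # 2 entries
```

With typed=False (the default), 3 and 3.0 hash equal and would share a single entry.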


2 Answers

Short answer: it is the number of entries (distinct calls) that are stored in the cache.
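We can confirm this empirically before diving into the source: cache_info() reports currsize and maxsize, and both are plain entry counts, not byte counts (a minimal sketch):

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def square(x):
    return x * x

for n in range(5):
    square(n)

info = square.cache_info()
print(info.currsize)  # 5 -- one entry per distinct call, not bytes
print(info.maxsize)   # 128
```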

We can look up the source code of the lru_cache [GitHub]. The code is rather complicated, but in a nutshell, line 619 already gives a clue:

                    full = (cache_len() >= maxsize)

This marks the cache as full once cache_len() is greater than or equal to maxsize.

The cache_len is a function that returns the number of records in the dictionary, as we can see in the source code:

    cache = {}
    hits = misses = 0
    full = False
    cache_get = cache.get    # bound method to lookup a key or return None
    cache_len = cache.__len__  # get cache size without calling len()

The logic also branches each time it adds a new record: if the cache is full, it will "kick out" the least recently used element:

                if key in cache:
                    # Getting here means that this same key was added to the
                    # cache while the lock was released.  Since the link
                    # update is already done, we need only return the
                    # computed result and update the count of misses.
                    pass
                elif full:
                    # Use the old root to store the new key and result.
                    oldroot = root
                    oldroot[KEY] = key
                    oldroot[RESULT] = result
                    # Empty the oldest link and make it the new root.
                    # Keep a reference to the old key and old result to
                    # prevent their ref counts from going to zero during the
                    # update. That will prevent potentially arbitrary object
                    # clean-up code (i.e. __del__) from running while we're
                    # still adjusting the links.
                    root = oldroot[NEXT]
                    oldkey = root[KEY]
                    oldresult = root[RESULT]
                    root[KEY] = root[RESULT] = None
                    # Now update the cache dictionary.
                    del cache[oldkey]
                    # Save the potentially reentrant cache[key] assignment
                    # for last, after the root and links have been put in
                    # a consistent state.
                    cache[key] = oldroot
                else:
                    # Put result in a new link at the front of the queue.
                    last = root[PREV]
                    link = [last, root, key, result]
                    last[NEXT] = root[PREV] = cache[key] = link
                    # Use the cache_len bound method instead of the len() function
                    # which could potentially be wrapped in an lru_cache itself.
                    full = (cache_len() >= maxsize)
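The eviction path above can be exercised with a tiny cache (a sketch; maxsize=2 is chosen purely for illustration):

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def f(x):
    return x * x

f(1)  # miss: cache holds {1}
f(2)  # miss: cache holds {1, 2}
f(1)  # hit:  1 becomes most recently used
f(3)  # miss: cache is full, so the LRU entry (2) is evicted
f(2)  # miss again: 2 was kicked out

info = f.cache_info()
print(info.hits, info.misses, info.currsize)  # 1 4 2
```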
Willem Van Onsem answered Oct 16 '22 23:10


Decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls

The unit in that phrase is "calls", i.e. each distinct argument pattern together with its corresponding result. The size of each cached call depends on the arguments and the returned value; if the function returns an object that takes up gigabytes of memory, then you might need to reduce maxsize.
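A sketch of that point: two entries of wildly different memory footprints each count as exactly one cached call (sys.getsizeof measures only the list object itself, which is enough for the comparison):

```python
import sys
from functools import lru_cache

@lru_cache(maxsize=128)
def big_result(n):
    return list(range(n))  # an entry can hold an arbitrarily large object

small = big_result(10)
large = big_result(1_000_000)

# Both count as one cache entry each, despite very different sizes:
print(big_result.cache_info().currsize)             # 2
print(sys.getsizeof(small) < sys.getsizeof(large))  # True
```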

Samwise answered Oct 17 '22 01:10