What is the unit of maxsize in Python's lru_cache?

According to the documentation, the default value of maxsize for lru_cache from functools is 128, but no unit is defined:

Decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls. It can save time when an expensive or I/O bound function is periodically called with the same arguments.

Since a dictionary is used to cache results, the positional and keyword arguments to the function must be hashable.

Distinct argument patterns may be considered to be distinct calls with separate cache entries. For example, f(a=1, b=2) and f(b=2, a=1) differ in their keyword argument order and may have two separate cache entries.

If user_function is specified, it must be a callable. This allows the lru_cache decorator to be applied directly to a user function, leaving the maxsize at its default value of 128.

My question: is there any unit like bits, bytes, or megabytes attached to this value, or is it an arbitrary number that has no simple relationship to the memory used?
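For context, the behaviour described in the quoted docs can be checked empirically. This sketch shows that reordered keyword arguments produce separate cache entries (as observed in current CPython) and that unhashable arguments are rejected:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def f(a=0, b=0):
    return a + b

f(a=1, b=2)
f(b=2, a=1)  # same values, different keyword order

info = f.cache_info()
print(info.misses)    # 2 -- two separate cache entries
print(info.currsize)  # 2

try:
    f([1, 2])  # lists are unhashable, so they cannot serve as cache keys
except TypeError:
    print("unhashable arguments raise TypeError")
```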

Soren asked Jun 03 '20 22:06

People also ask

How to implement an LRU cache in Python?

Since version 3.2, Python has included the @lru_cache decorator for implementing the LRU strategy. You can use this decorator to wrap functions and cache their results up to a maximum number of entries.

What is maxsize in LRU_cache in Python?

Python's @lru_cache decorator offers a maxsize parameter that defines the maximum number of entries before the cache starts evicting old items. By default, maxsize is set to 128. If you set maxsize to None, the cache will grow indefinitely, and no entries will ever be evicted.
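A minimal sketch of the maxsize=None case, showing that the cache keeps one entry per distinct call with no upper bound:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded: nothing is ever evicted
def identity(x):
    return x

for n in range(1000):
    identity(n)

info = identity.cache_info()
print(info.currsize)  # 1000 -- one entry per distinct argument
print(info.maxsize)   # None
```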

What is LRU_cache in functools?

lru_cache() is a function in the functools module which helps reduce the execution time of a function by using the memoization technique. If typed is set to True, function arguments of different types will be cached separately.
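The typed parameter can be demonstrated with a small sketch: with typed=True, the int 3 and the float 3.0 get separate entries even though they compare equal:

```python
from functools import lru_cache

@lru_cache(maxsize=128, typed=True)
def double(x):
    return x * 2

double(3)    # cached under the int key 3
double(3.0)  # cached separately under the float key 3.0

print(double.cache_info().currsize)  # 2 entries
```

With typed=False (the default), 3 and 3.0 hash equal and would share a single entry.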


2 Answers

Short answer: it is the number of entries (distinct calls) that are stored in the cache.
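We can confirm this empirically before diving into the source: cache_info() reports currsize and maxsize, and both are plain entry counts, not byte counts (a minimal sketch):

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def square(x):
    return x * x

for n in range(5):
    square(n)

info = square.cache_info()
print(info.currsize)  # 5 -- one entry per distinct call, not bytes
print(info.maxsize)   # 128
```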

We can look up the source code of the lru_cache [GitHub]. The code is rather complicated, but in a nutshell, line 619 already gives a clue:

                    full = (cache_len() >= maxsize)

This marks the cache as full once cache_len() is greater than or equal to maxsize.

The cache_len is a function that returns the number of records in the dictionary, as we can see in the source code:

    cache = {}
    hits = misses = 0
    full = False
    cache_get = cache.get    # bound method to lookup a key or return None
    cache_len = cache.__len__  # get cache size without calling len()

The logic also branches each time it adds a new record: if the cache is full, it will "kick out" the least recently used element:

                if key in cache:
                    # Getting here means that this same key was added to the
                    # cache while the lock was released.  Since the link
                    # update is already done, we need only return the
                    # computed result and update the count of misses.
                    pass
                elif full:
                    # Use the old root to store the new key and result.
                    oldroot = root
                    oldroot[KEY] = key
                    oldroot[RESULT] = result
                    # Empty the oldest link and make it the new root.
                    # Keep a reference to the old key and old result to
                    # prevent their ref counts from going to zero during the
                    # update. That will prevent potentially arbitrary object
                    # clean-up code (i.e. __del__) from running while we're
                    # still adjusting the links.
                    root = oldroot[NEXT]
                    oldkey = root[KEY]
                    oldresult = root[RESULT]
                    root[KEY] = root[RESULT] = None
                    # Now update the cache dictionary.
                    del cache[oldkey]
                    # Save the potentially reentrant cache[key] assignment
                    # for last, after the root and links have been put in
                    # a consistent state.
                    cache[key] = oldroot
                else:
                    # Put result in a new link at the front of the queue.
                    last = root[PREV]
                    link = [last, root, key, result]
                    last[NEXT] = root[PREV] = cache[key] = link
                    # Use the cache_len bound method instead of the len() function
                    # which could potentially be wrapped in an lru_cache itself.
                    full = (cache_len() >= maxsize)
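The eviction path above can be exercised with a tiny cache (a sketch; maxsize=2 is chosen purely for illustration):

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def f(x):
    return x * x

f(1)  # miss: cache holds {1}
f(2)  # miss: cache holds {1, 2}
f(1)  # hit:  1 becomes most recently used
f(3)  # miss: cache is full, so the LRU entry (2) is evicted
f(2)  # miss again: 2 was kicked out

info = f.cache_info()
print(info.hits, info.misses, info.currsize)  # 1 4 2
```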
Willem Van Onsem answered Oct 16 '22 23:10


Decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls

The unit in that phrase is "calls", i.e. each distinct argument pattern together with its corresponding result. The size of each cached call depends on the arguments and the returned value; if the function returns an object that takes up gigabytes of memory, then you might need to reduce maxsize.
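A sketch of that point: two entries of wildly different memory footprints each count as exactly one cached call (sys.getsizeof measures only the list object itself, which is enough for the comparison):

```python
import sys
from functools import lru_cache

@lru_cache(maxsize=128)
def big_result(n):
    return list(range(n))  # an entry can hold an arbitrarily large object

small = big_result(10)
large = big_result(1_000_000)

# Both count as one cache entry each, despite very different sizes:
print(big_result.cache_info().currsize)             # 2
print(sys.getsizeof(small) < sys.getsizeof(large))  # True
```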

Samwise answered Oct 17 '22 01:10