I'm using Python 3's built-in functools.lru_cache decorator to memoize some expensive functions. I would like to memoize as many calls as possible without using too much memory, since caching too many values causes thrashing.
Is there a preferred technique or library for accomplishing this in Python?
For example, this question led me to a Go library for system-memory-aware LRU caching. Something similar for Python would be ideal.
Note: I can't just estimate the memory used per value and set maxsize accordingly, since several processes will be calling the decorated function in parallel; a solution would need to dynamically check how much memory is actually free.
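For context, what I have now looks like the following (expensive_func is just a placeholder); the problem is that maxsize caps the number of entries and says nothing about bytes:

    import functools

    @functools.lru_cache(maxsize=1024)  # bounds entry count, not memory use
    def expensive_func(arg):
        ...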
Caching is a concept the software world borrowed from the hardware world. A cache is a temporary storage area that keeps recently used items close at hand; in layman's terms, it's the chair we all throw our clothes on for easy access instead of the closet. Retrieving data from main memory is more expensive than retrieving it from a cache, but a cache is limited in size, so there has to be a policy for deciding which data to evict in order to store new data. That's where the LRU cache comes in.

A Least Recently Used (LRU) cache organizes items in order of use, allowing you to quickly identify the item that hasn't been used for the longest time. Picture a clothes rack where clothes are always hung up on one side: to find the least recently used item, look at the item on the other end of the rack.

One way to implement an LRU cache in Python is to combine a doubly linked list with a hash map. The head of the doubly linked list points to the most recently used entry and the tail to the least recently used one, while the hash map gives O(1) access to any entry's node.
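Here's a minimal sketch of that design (class and method names are illustrative, not from any particular library):

    class Node:
        """Doubly linked list node holding one cache entry."""
        __slots__ = ("key", "value", "prev", "next")

        def __init__(self, key=None, value=None):
            self.key, self.value = key, value
            self.prev = self.next = None


    class LRUCache:
        """Hash map for O(1) lookup + doubly linked list for recency order."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.map = {}                          # key -> Node
            self.head, self.tail = Node(), Node()  # sentinel nodes
            self.head.next, self.tail.prev = self.tail, self.head

        def _unlink(self, node):
            node.prev.next, node.next.prev = node.next, node.prev

        def _push_front(self, node):
            # Insert right after head: the head side is most recently used.
            node.prev, node.next = self.head, self.head.next
            self.head.next.prev = node
            self.head.next = node

        def get(self, key):
            if key not in self.map:
                return None
            node = self.map[key]
            self._unlink(node)       # move the hit to the front
            self._push_front(node)
            return node.value

        def put(self, key, value):
            if key in self.map:
                self._unlink(self.map.pop(key))
            node = Node(key, value)
            self.map[key] = node
            self._push_front(node)
            if len(self.map) > self.capacity:
                lru = self.tail.prev  # node just before the tail sentinel
                self._unlink(lru)
                del self.map[lru.key]

Usage:

    cache = LRUCache(capacity=2)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")      # "a" is now most recently used
    cache.put("c", 3)   # evicts "b", the least recently used
    assert cache.get("b") is None

This is essentially the layout functools.lru_cache uses internally as well: a dict plus a circular doubly linked list.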
I ended up modifying the built-in lru_cache to use psutil.
The modified decorator takes an additional optional argument, use_memory_up_to. If set, the cache is considered full if there are fewer than use_memory_up_to bytes of memory available (according to psutil.virtual_memory().available). For example:
    from .lru_cache import lru_cache

    GB = 1024**3

    @lru_cache(use_memory_up_to=(1 * GB))
    def expensive_func(args):
        ...
Note: setting use_memory_up_to will cause maxsize to have no effect.
Here's the code: lru_cache.py
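To show the shape of the approach, here is a minimal self-contained sketch of the same idea (a simplified illustration, not the actual lru_cache.py above): an OrderedDict-based cache that evicts least recently used entries whenever psutil.virtual_memory().available reports fewer free bytes than use_memory_up_to.

    import functools
    from collections import OrderedDict

    import psutil

    def lru_cache(use_memory_up_to):
        """Sketch of a memory-aware LRU decorator (illustration only).

        Entries are evicted, least recently used first, whenever fewer
        than `use_memory_up_to` bytes of system memory are available.
        Keyword arguments are omitted for brevity; args must be hashable.
        """
        def decorator(func):
            cache = OrderedDict()  # insertion order tracks recency of use

            @functools.wraps(func)
            def wrapper(*args):
                if args in cache:
                    cache.move_to_end(args)  # mark as most recently used
                    return cache[args]
                result = cache[args] = func(*args)
                # Treat the cache as full while free memory is below the limit.
                while cache and psutil.virtual_memory().available < use_memory_up_to:
                    cache.popitem(last=False)  # drop least recently used entry
                return result

            return wrapper
        return decorator

Because psutil.virtual_memory().available is a system-wide figure, parallel processes all see the same number and stop growing their caches together once memory runs low. Note that evicting an entry only frees memory once nothing else references the value.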