 

Cachetools for subsequent runs in python

Tags:

python

caching

I'm a beginner, so sorry if my question is basic. Does cachetools in Python work across subsequent runs?

import cachetools
import time


@cachetools.cached({})
def find_sum(n):
    time.sleep(5)
    return n+2


print(find_sum(2))
print(find_sum(3))
print(find_sum(2))

So during the first run the third call is faster, but the next time I run the file I want the first call to be fast too and take its result from the cache. Can cachetools do this?

Savitha Suresh asked Jun 21 '18


1 Answer

cachetools cannot do this out of the box. But it's very easy to add.

You can pass any mutable mapping you want to the memoizing decorators. You're using a plain old dict, and dicts are trivial to pickle. And even if you use one of the fancy cache implementations provided by the library, they're all easy to pickle as well.[1]
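For instance, here's a quick sketch (assuming a recent cachetools release) showing that one of the fancier cache classes survives a pickle round trip just like a plain dict:

```python
import pickle

import cachetools

# fill a bounded LRU cache with an entry
cache = cachetools.LRUCache(maxsize=5)
cache['a'] = 1

# serialize and restore it; the stored entry comes back intact
restored = pickle.loads(pickle.dumps(cache))
print(restored['a'])
```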

So:

import cachetools
import pickle
import time

try:
    with open('mycache.pickle', 'rb') as f:
        cache = pickle.load(f)
except FileNotFoundError:
    cache = {} # or cachetools.LRUCache(maxsize=5) or whatever you want

@cachetools.cached(cache)
def find_sum(n):
    time.sleep(5)
    return n+2

print(find_sum(2))
print(find_sum(3))
print(find_sum(2))

with open('mycache.pickle', 'wb') as f:
    pickle.dump(cache, f)

Of course you can add:

  • A finally or context manager or atexit to make sure you save your files at shutdown even if you hit an exception or ^C.
  • A timer that saves them every so often.
  • A hook on the cache object to save on every update, or every Nth update. (Just override the __setitem__ method—or see Extending cache classes for other things you can do, if you use one of the fancier classes instead of a dict.)
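As a sketch of the first option, the standard atexit module can register the save so it runs at normal interpreter shutdown (the cache name and file path here are just placeholders):

```python
import atexit
import pickle
import time

import cachetools

cache = {}  # any mutable mapping works here

def save_cache(path='mycache.pickle'):
    # dump the whole cache mapping to disk
    with open(path, 'wb') as f:
        pickle.dump(cache, f)

# run save_cache automatically at interpreter shutdown,
# including after an unhandled exception or ^C
atexit.register(save_cache)

@cachetools.cached(cache)
def find_sum(n):
    time.sleep(5)
    return n + 2
```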

Or you can even use an on-disk key-value database as a cache.

The simplest such database is dbm. It's limited to str/bytes for both keys and values. (You can use shelve if you want non-string values; if you want non-string keys, you probably want a different solution.) So, that doesn't quite work for our example. But for a similar example, it's almost magic:

import dbm
import time
import cachetools

cache = dbm.open('mycache.dbm', 'c')
@cachetools.cached(cache, key=lambda s:s)
def find_string_sum(s):
    time.sleep(5)
    return s + '!'

print(find_string_sum('a'))
print(find_string_sum('b'))
print(find_string_sum('a'))

The only tricky bit was that I had to override the key function. (The default key function handles *args, **kw, so for argument 'a' you end up with something like (('a',), ()), which is obviously not a string.)
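In current cachetools releases the default key function is cachetools.keys.hashkey, which wraps the positional arguments in a hashable tuple rather than passing them through unchanged; a quick check:

```python
import cachetools.keys

# the default key for a call like find_string_sum('a') is a
# tuple wrapping the arguments, not the string 'a' itself
print(cachetools.keys.hashkey('a'))
```

That tuple is fine as a dict key, but dbm only accepts str/bytes keys, which is why the example above overrides key to pass the string through unchanged.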


[1] As you can see from the source, there are bug fixes to make sure all of the classes are picklable in all supported Python versions, so this is clearly intentional.

abarnert answered Sep 20 '22