How to share a cache between multiple processes?

I'm using an LRU cache to speed up some rather heavy-duty processing. It works well and speeds things up considerably. However...

When I multiprocess, each process creates its own separate cache, so there are 8 copies of the same thing. That isn't a problem until the box runs out of memory, and then bad things happen...

Ideally I only need a cache size of around 300 items for the application: 1*300 will fit in the 7GB I have to work with, but 8*300 just doesn't.

How do I get all the processes to share the same cache?

asked Dec 03 '12 by John Mee

1 Answer

I believe you can use a multiprocessing.Manager to share a dict between processes. That should in theory let every process use the same cache.
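
For illustration, here's a minimal sketch of that approach (slow_work, the key type, and the pool size are placeholders for your own code). Two caveats: a plain Manager dict is unbounded rather than LRU, and the check-then-set below isn't atomic, so two processes may occasionally compute the same entry (harmless for a cache):

    import multiprocessing

    def slow_work(key):
        return key * key  # placeholder for the heavy computation

    def lookup(cache, key):
        # the Manager proxy makes this single dict visible to every process
        if key not in cache:
            cache[key] = slow_work(key)
        return cache[key]

    if __name__ == '__main__':
        with multiprocessing.Manager() as manager:
            cache = manager.dict()  # one shared dict, proxied to all workers
            with multiprocessing.Pool(processes=8) as pool:
                results = pool.starmap(lookup, [(cache, i % 300) for i in range(2400)])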

However, I think a saner design would be to have one process that responds to queries by looking them up in the cache and, if they are not present, delegating the work to a subprocess and caching the result before returning it. You can do that easily with

import concurrent.futures
import functools

with concurrent.futures.ProcessPoolExecutor() as e:
    @functools.lru_cache(maxsize=300)  # bound the cache; ~300 items per the question
    def work(*args, **kwargs):
        # a miss submits slow_work to the pool; a hit returns the cached Future
        return e.submit(slow_work, *args, **kwargs)

Note that work will return Future objects, which the consumer will have to wait on. The lru_cache will cache the Future objects so they will be returned automatically; and since a Future's result() can be called more than once, repeated lookups of the same item are cheap.
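
As a hypothetical usage sketch (assuming slow_work takes a single argument, and remembering that work must be called while the executor's with block is still open):

    f = work(42)              # miss: submits slow_work(42) and caches the Future
    print(f.result())         # blocks until the computation finishes
    print(work(42).result())  # hit: the same Future; result() can be read again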

If you're not using Python 3, you'll have to install backported versions of concurrent.futures and functools.lru_cache (the futures and functools32 packages on PyPI, respectively).

answered Sep 25 '22 by Katriel