
Using python dictionary as a temporary in-memory key-value database?

I need something like a temporary in-memory key-value store. I know there are solutions like Redis, but I wonder if a Python dictionary could work, and potentially be even faster? Think of a Tornado (or similar) server running, holding a Python dictionary in memory, and returning the appropriate value based on the HTTP request.

Why do I need this? As part of a service, key-value pairs are stored, and they have this property: the more recent they are, the more likely they are to be accessed. So I want to keep, say, the last 100 key-value pairs in memory (as well as writing them to disk) for faster retrieval.

If the server dies the dictionary can be restored again from disk.
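The "keep the last 100 keys" behavior described above is exactly an LRU cache, which a plain dictionary handles well. A minimal sketch using `collections.OrderedDict` (the class name `LRUStore` is illustrative, not from the question):

```python
from collections import OrderedDict


class LRUStore:
    """Keep the most recently used items in memory, capped at max_size."""

    def __init__(self, max_size=100):
        self.max_size = max_size
        self.data = OrderedDict()

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)          # mark as most recently used
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)   # evict the least recently used item

    def get(self, key, default=None):
        if key in self.data:
            self.data.move_to_end(key)      # a read also counts as "recent"
            return self.data[key]
        return default
```

Pickling `self.data` to disk on writes (and loading it at startup) would cover the restore-after-crash case mentioned below.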

Has anyone done something like this? Am I totally missing something here?

PS: I think this isn't possible with a WSGI server, right? Because, as far as I know, you can't keep anything in memory between individual requests.

user1191575 asked Mar 21 '12


People also ask

Can you use a Python dictionary as a database?

Dictionaries behave like a database in that instead of calling an integer to get a particular index value as you would with a list, you assign a value to a key and can call that key to get its related value.
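To make that contrast concrete, here is a small sketch (the keys and values are made up for illustration):

```python
# A list is indexed by integer position.
records = ['alice', 'bob']
first = records[0]                # 'alice'

# A dict is indexed by key, like a tiny key-value database.
users = {'user_17': 'alice'}
name = users['user_17']           # 'alice'

users['user_42'] = 'bob'          # insert a new "row" under a key
```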

Are dictionaries memory efficient Python?

Python Dictionaries are fast but their memory consumption can also be high at the same time.
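One way to see this is with `sys.getsizeof`, which reports the size of the dict object itself (not the keys and values it references); the underlying hash table grows in jumps as items are added:

```python
import sys

small = {}
big = {i: None for i in range(1000)}

print(sys.getsizeof(small))   # bytes used by the empty dict
print(sys.getsizeof(big))     # a much larger hash table
```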

How dictionaries are stored in memory in Python?

Python dictionaries are implemented as hash tables: each key is hashed to pick a slot (bucket), and the key-value pair is stored in the first free slot found.

Is redis just a dictionary?

Redis is an in-memory key/value store. Think of it as a dictionary with any number of keys, each of which has a value that can be set or retrieved. However, Redis goes beyond a simple key/value store, as it is actually a data structures server, supporting different kinds of values.


1 Answer

I'd definitely work with memcached. Once it has been set up, you can easily decorate your functions/methods, as in this example:

#!/usr/bin/env python

import hashlib
import time

import memcache  # from the python-memcached package


def memoize(f):

    def newfn(*args, **kwargs):
        mc = memcache.Client(['127.0.0.1:11211'], debug=0)
        # build an MD5 cache key out of the function name and its arguments
        m = hashlib.md5()
        parts = [repr(a) for a in args] + [repr(kv) for kv in sorted(kwargs.items())]
        for part in parts:
            m.update(part.encode('utf-8'))
        m.update(f.__name__.encode('utf-8'))
        key = m.hexdigest()

        value = mc.get(key)
        if value is None:  # "if value:" would recompute falsy results like 0 or ''
            value = f(*args, **kwargs)
            mc.set(key, value, 60)  # cache for 60 seconds
        return value

    return newfn


@memoize
def expensive_function(x):
    time.sleep(5)
    return x


if __name__ == '__main__':
    print(expensive_function('abc'))
    print(expensive_function('abc'))

Don't worry about the network latency to memcached; that kind of micro-optimization would be a waste of your time.
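If running a separate memcached daemon is more than you need, the same memoization pattern works with an in-process dictionary via `functools.lru_cache` from the standard library, which also gives the "keep the last 100 entries" behavior from the question for free (the sleep is just a stand-in for an expensive computation):

```python
import time
from functools import lru_cache


@lru_cache(maxsize=100)        # keeps the 100 most recently used results
def expensive_function(x):
    time.sleep(0.1)            # simulate an expensive computation
    return x


if __name__ == '__main__':
    print(expensive_function('abc'))   # slow: computed
    print(expensive_function('abc'))   # fast: served from the in-memory cache
```

Unlike memcached, this cache lives inside one process, so it fits the Tornado single-process setup from the question but not a multi-process WSGI deployment.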

Mathias answered Oct 23 '22