
Storing cookielib cookies in a database

I'm using the cookielib module to handle HTTP cookies when using the urllib2 module in Python 2.6 in a way similar to this snippet:

import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")

I'd like to store the cookies in a database. I don't know which is better: serializing the CookieJar object and storing that, or extracting the cookies from the CookieJar and storing them individually. I also don't know how to implement either approach. Either way, I should be able to recreate the CookieJar object afterwards.

Could someone help me out with the above?

Thanks in advance.

Mridang Agarwalla asked Dec 12 '22 21:12


2 Answers

cookielib.Cookie, to quote its docstring (in its sources),

is deliberately a very simple class. It just holds attributes.

so pickle (or any other serialization approach) is just fine for saving and restoring each Cookie instance.

As for CookieJar, set_cookie adds a single Cookie instance, and __iter__ (used implicitly whenever you run a for loop over the jar instance) yields every cookie the jar holds, one after the other.
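As a rough sketch of that per-cookie approach: the snippet below pickles each Cookie found by iterating a jar, then rebuilds a fresh jar with set_cookie. The helper names (make_cookie, dump_cookies, load_cookies) and the sample cookie values are made up for illustration, and the import fallback is only there so the same code also runs on Python 3, where cookielib is called http.cookiejar.

```python
try:
    import cookielib                      # Python 2, as in the question
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module
import pickle


def make_cookie(name, value, domain):
    """Build a Cookie by hand (hypothetical helper, for demonstration only)."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def dump_cookies(jar):
    """Return one pickled string per cookie; iterating a jar yields its cookies."""
    return [pickle.dumps(cookie, 2) for cookie in jar]


def load_cookies(blobs, jar=None):
    """Unpickle each cookie and add it to a jar with set_cookie."""
    if jar is None:  # an empty jar is falsy, so test against None explicitly
        jar = cookielib.CookieJar()
    for blob in blobs:
        jar.set_cookie(pickle.loads(blob))
    return jar


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("session", "abc123", "example.com"))
jar.set_cookie(make_cookie("theme", "dark", "example.com"))

blobs = dump_cookies(jar)        # e.g. one database row per cookie
restored = load_cookies(blobs)   # a new jar holding the same cookies
restored_pairs = sorted((c.name, c.value) for c in restored)
```

Storing one blob per cookie makes it possible to keep a row per cookie in the database, at the cost of a little more bookkeeping than a single blob.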

A subclass you can study to see how to make a "cookie jar on a database" is BSDDBCookieJar (part of mechanize, though I pointed specifically to the jar's source code file). It doesn't load all cookies into memory; instead it keeps them in self._db, which is a bsddb instance (a mostly-on-disk, dict-lookalike hash table constrained to having only strings as keys and values), and it uses pickle for serialization.

If you are OK with keeping every cookie in memory during operations, simply pickling the jar is the simplest option (and, of course, you put the blob in the DB and get it back from there when you restart). s = cPickle.dumps(myJar, -1) gives you one big byte string for the whole jar (including its policy, not just the cookies), and theJar = cPickle.loads(s) rebuilds it once you've reloaded s as a blob from the DB.
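Here is a minimal sketch of the "one blob in the DB" idea, using an in-memory SQLite database as a stand-in for whatever database you actually use. The table name cookie_jars, its columns, and the save_jar/load_jar helpers are all assumptions made for this example. One caveat: a CookieJar holds a threading lock internally, and lock objects generally cannot be pickled, so this sketch pickles the list of cookies rather than the jar object itself (which means the policy is not stored).

```python
try:
    import cookielib                      # Python 2, as in the question
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module
import pickle
import sqlite3


def save_jar(conn, key, jar):
    """Pickle all cookies in the jar into one blob and store it under key."""
    blob = pickle.dumps(list(jar), 2)
    conn.execute(
        "INSERT OR REPLACE INTO cookie_jars (name, data) VALUES (?, ?)",
        (key, sqlite3.Binary(blob)),
    )
    conn.commit()


def load_jar(conn, key):
    """Rebuild a CookieJar from the blob stored under key (empty jar if absent)."""
    jar = cookielib.CookieJar()
    row = conn.execute(
        "SELECT data FROM cookie_jars WHERE name = ?", (key,)
    ).fetchone()
    if row is not None:
        for cookie in pickle.loads(bytes(row[0])):
            jar.set_cookie(cookie)
    return jar


# Hypothetical schema: one row per saved jar.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cookie_jars (name TEXT PRIMARY KEY, data BLOB)")

jar = cookielib.CookieJar()
jar.set_cookie(cookielib.Cookie(
    0, "session", "abc123", None, False,
    "example.com", True, False, "/", True,
    False, None, True, None, None, {},
))

save_jar(conn, "example", jar)
restored = load_jar(conn, "example")
```

The restored jar can be passed to urllib2.HTTPCookieProcessor just like the original. Since unpickling executes whatever the blob describes, this only makes sense when nobody untrusted can write to that table.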

Alex Martelli answered Dec 28 '22 10:12


Here's a very simple class I implemented that can load cookies from, and dump them to, a string, based on Alex's suggestion of using pickle.

from cookielib import CookieJar
try:
    import cPickle as pickle
except ImportError:
    import pickle

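class StringCookieJar(CookieJar):
    """A CookieJar that can be restored from, and dumped to, a pickled string."""

    def __init__(self, string=None, policy=None):
        CookieJar.__init__(self, policy)
        if string:
            # Restore the nested domain -> path -> name mapping of cookies.
            self._cookies = pickle.loads(string)

    def dump(self):
        # Pickle only the cookie mapping, not the jar itself (the jar also holds a lock).
        return pickle.dumps(self._cookies)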

jbochi answered Dec 28 '22 10:12