I have the following cookie saved by curl (in test.txt, tab-separated, this editor doesn't preserve tabs):
# Netscape HTTP Cookie File
# http://curlm.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_my-example.com FALSE / FALSE 0 _rails-root_session test
I'm trying to read it with the following code:
import sys
if sys.version_info < (3,):
    from cookielib import Cookie, MozillaCookieJar
else:
    from http.cookiejar import Cookie, MozillaCookieJar

def load_cookies_from_mozilla(filename):
    ns_cookiejar = MozillaCookieJar()
    ns_cookiejar.load(filename, ignore_discard=True)
    return ns_cookiejar

cookies = load_cookies_from_mozilla("test.txt")
print(len(cookies))
It outputs 0 (the cookie is not read). If I manually change my cookie to the following line (removing the #HttpOnly_ prefix and replacing the expiration time 0 with the empty string; again, tab-separated):
my-example.com FALSE / FALSE _rails-root_session test
then it outputs 1 (successfully read the cookie).
What do I need to change in my Python code to read the original cookie line? Preferably, I would also like to save it in the same format (with the HttpOnly flag, and with 0 instead of the empty string for a never-expiring cookie).
Thanks.
curl has a full cookie "engine" built in. If you just activate it, curl can receive and send cookies as mandated in the specs. The -b switch tells curl which file to read cookies from and starts the cookie engine; if the argument is not a file, curl passes the given string on as the cookie data.
The -c switch tells curl to store received cookies in a file, for example /tmp/cookies. To both send and store cookies, supply both the -b and -c switches. You can optionally use the -j switch to tell curl to discard any cookies with "Session" expiry.
curl can send and store the cookies encountered during HTTP operations. The --cookie COOKIE_IDENTIFIER option specifies which cookies to provide. Cookies are defined as name=value. Multiple cookies should be delimited with a semicolon (;):
$ curl http://example.com --cookie "user=username;pass=hack"
This appears to be an open bug: https://bugs.python.org/issue2190.
This bug report contains a link to a workaround: https://gerrit.googlesource.com/git-repo/+/master/subcmds/sync.py#995
In that linked code, the developer creates a temporary cookies file, removes the "#HttpOnly_" prefixes, and then creates a cookiejar with that temporary file.
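import cookielib  # Python 2; the Python 3 module is http.cookiejar
import tempfile

# cookiefile is the path to the cookie file written by curl.
tmpcookiefile = tempfile.NamedTemporaryFile()
tmpcookiefile.write("# HTTP Cookie File\n")
try:
    with open(cookiefile) as f:
        for line in f:
            # Strip the prefix so the line is no longer skipped as a comment.
            if line.startswith("#HttpOnly_"):
                line = line[len("#HttpOnly_"):]
            tmpcookiefile.write(line)
    tmpcookiefile.flush()
    cookiejar = cookielib.MozillaCookieJar(tmpcookiefile.name)
    try:
        cookiejar.load()
    except cookielib.LoadError:
        # Fall back to an empty jar if the file cannot be parsed.
        cookiejar = cookielib.CookieJar()
finally:
    tmpcookiefile.close()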
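For Python 3, the same idea could be sketched roughly as below. This is only an illustration, not the code from the linked project: the names load_curl_cookies, save_curl_cookies and HTTPONLY_ATTR are made up for this example. It also rests on two assumptions: that curl writes 0 as the expiry of session cookies, and that the parser would otherwise treat an expiry of 0 as already expired and skip the cookie, which is why the 0 is turned into the empty string before loading. The save function writes its own minimal header rather than reproducing curl's comment lines.

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar

HTTPONLY_PREFIX = "#HttpOnly_"
HTTPONLY_ATTR = "HttpOnly"  # hypothetical marker name chosen for this sketch


def load_curl_cookies(filename):
    """Load a curl-written cookie file into a MozillaCookieJar."""
    httponly_keys = set()
    cleaned = []
    with open(filename) as src:
        for line in src:
            is_httponly = line.startswith(HTTPONLY_PREFIX)
            if is_httponly:
                # Drop the prefix so the parser no longer sees a comment line.
                line = line[len(HTTPONLY_PREFIX):]
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 7 and not line.startswith("#"):
                if fields[4] == "0":
                    # Assumption: 0 marks a session cookie; the parser wants "".
                    fields[4] = ""
                    line = "\t".join(fields) + "\n"
                if is_httponly:
                    httponly_keys.add((fields[0], fields[2], fields[5]))
            cleaned.append(line)

    # delete=False so the file can be reopened by name on every platform.
    tmp = tempfile.NamedTemporaryFile(mode="w", delete=False)
    try:
        with tmp:
            tmp.writelines(cleaned)
        jar = MozillaCookieJar()
        jar.load(tmp.name, ignore_discard=True)
    finally:
        os.remove(tmp.name)

    # Remember which cookies carried the prefix so it can be written back.
    for cookie in jar:
        if (cookie.domain, cookie.path, cookie.name) in httponly_keys:
            cookie.set_nonstandard_attr(HTTPONLY_ATTR, "")
    return jar


def save_curl_cookies(jar, filename):
    """Write the jar back in curl's layout (prefix and 0 expiry restored)."""
    with open(filename, "w") as dst:
        dst.write("# Netscape HTTP Cookie File\n")
        for cookie in jar:
            httponly = cookie.has_nonstandard_attr(HTTPONLY_ATTR)
            fields = [
                (HTTPONLY_PREFIX if httponly else "") + cookie.domain,
                "TRUE" if cookie.domain.startswith(".") else "FALSE",
                cookie.path,
                "TRUE" if cookie.secure else "FALSE",
                "0" if cookie.expires is None else str(cookie.expires),
                cookie.name,
                cookie.value or "",
            ]
            dst.write("\t".join(fields) + "\n")
```

With the file from the question, load_curl_cookies("test.txt") should return a jar containing one cookie, and save_curl_cookies(jar, "out.txt") should write that cookie back with the #HttpOnly_ prefix and a 0 expiry. Cookies with an empty name field are an edge case this sketch does not try to handle.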