I'm trying to access an authenticated site using a cookies.txt
file (generated with a Chrome extension) with Python Requests:
    import requests, cookielib

    cj = cookielib.MozillaCookieJar('cookies.txt')
    cj.load()
    r = requests.get(url, cookies=cj)
It doesn't throw any error or exception, but yields the login screen, incorrectly. However, I know that my cookie file is valid, because I can successfully retrieve my content using it with wget. Any idea what I'm doing wrong?
Edit:
I'm tracing cookielib.MozillaCookieJar._really_load and can verify that the cookies are correctly parsed (i.e. they have the correct values for the domain, path, secure, etc. tokens). But as the transaction is still resulting in the login form, it seems that wget must be doing something additional (as the exact same cookies.txt file works for it).
MozillaCookieJar inherits from FileCookieJar, which has the following docstring in its constructor:

    Cookies are NOT loaded from the named file until either the .load() or .revert() method is called.

So you need to call the .load() method.
Also, as Jermaine Xu noted, the first line of the file needs to contain either the string # Netscape HTTP Cookie File or # HTTP Cookie File. Files generated by the plugin you use do not contain such a string, so you have to insert it yourself. I filed a bug about this at http://code.google.com/p/cookie-txt-export/issues/detail?id=5
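If you'd rather not edit the file by hand, the header can be added with a few lines of code. This is only a rough sketch: it uses Python 3, where cookielib is named http.cookiejar, and ensure_magic_header is a made-up helper name, not something from the library. The regular expression is my own approximation of the two header variants mentioned above.

```python
import re

# Header line to insert, and a pattern covering both accepted variants.
# Both names are illustrative, not part of http.cookiejar.
MAGIC_LINE = "# Netscape HTTP Cookie File"
MAGIC_RE = re.compile(r"#( Netscape)? HTTP Cookie File")


def ensure_magic_header(path):
    """Prepend the header line if the file does not already start with it.

    Returns True if the file was rewritten, False if it was left alone.
    """
    with open(path, "r", encoding="utf-8") as fh:
        content = fh.read()
    if MAGIC_RE.match(content):
        return False
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(MAGIC_LINE + "\n" + content)
    return True
```

Call ensure_magic_header('cookies.txt') once before cj.load(); calling it again on an already fixed file should leave it untouched.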
EDIT
Session cookies are saved with 0 in the 5th column. If you don't pass ignore_expires=True to the load() method, all such cookies are discarded when loading from a file.
File session_cookie.txt (fields are tab-separated):

    # Netscape HTTP Cookie File
    .domain.com	TRUE	/	FALSE	0	name	value
Python script:
    import cookielib

    cj = cookielib.MozillaCookieJar('session_cookie.txt')
    cj.load()
    print len(cj)
Output: 0
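For comparison, here is a sketch of the same experiment with the lenient flags switched on. It assumes Python 3 (http.cookiejar instead of cookielib) and writes the sample file to a temporary location; count_loaded and demo are hypothetical helper names. I pass ignore_discard=True together with ignore_expires=True so the result does not depend on whether a given Python version treats 0 as "expired" or as "session cookie".

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar  # Python 3 name for cookielib

# Same content as session_cookie.txt above, with explicit tab separators.
SESSION_COOKIE_FILE = (
    "# Netscape HTTP Cookie File\n"
    ".domain.com\tTRUE\t/\tFALSE\t0\tname\tvalue\n"
)


def count_loaded(path, **load_flags):
    """Load the cookie file with the given flags and return the jar size."""
    cj = MozillaCookieJar(path)
    cj.load(**load_flags)
    return len(cj)


def demo():
    """Return (cookies loaded by default, cookies loaded with lenient flags)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as fh:
        fh.write(SESSION_COOKIE_FILE)
    try:
        default_count = count_loaded(path)
        lenient_count = count_loaded(
            path, ignore_discard=True, ignore_expires=True
        )
    finally:
        os.remove(path)
    return default_count, lenient_count


if __name__ == "__main__":
    print(demo())
```

The first number should match the 0 shown above, and the second should be 1, i.e. the session cookie makes it into the jar once the flags are passed.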
EDIT 2
Although we managed to get cookies into the jar above, they are subsequently discarded by cookielib because they still have a 0 value in the expires attribute. To prevent this, we have to set the expiry time to some future time, like so:
    import time

    for cookie in cj:
        # set cookie expire date to 14 days from now
        cookie.expires = time.time() + 14 * 24 * 3600
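Putting the pieces together, something along these lines should work. Treat it as a sketch under a few assumptions: it is written for Python 3 (http.cookiejar), load_session_cookies and cookie_header_for are names I made up, and the second helper only builds a urllib request so you can inspect the Cookie header the jar would send without touching the network.

```python
import time
import urllib.request
from http.cookiejar import MozillaCookieJar  # Python 3 name for cookielib


def load_session_cookies(path, lifetime_days=14):
    """Load a cookies.txt file and keep its session cookies usable.

    Cookies whose expiry is 0 (or missing) get an expiry lifetime_days
    in the future, so the jar does not drop them when building requests.
    """
    cj = MozillaCookieJar(path)
    cj.load(ignore_discard=True, ignore_expires=True)
    future = int(time.time()) + lifetime_days * 24 * 3600
    for cookie in cj:
        if not cookie.expires:  # 0 or None
            cookie.expires = future
    return cj


def cookie_header_for(cj, url):
    """Return the Cookie header the jar would attach to a request for url."""
    req = urllib.request.Request(url)
    cj.add_cookie_header(req)
    return req.get_header("Cookie")
```

With the jar returned by load_session_cookies('cookies.txt'), the call from the question, requests.get(url, cookies=cj), should then send the session cookies as well. cookie_header_for(cj, url) is just a quick way to check that before making the real request.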
EDIT 3
I checked both wget and curl, and both use a 0 expiry time to denote session cookies, which means it's the de facto standard. However, Python's implementation uses an empty string for the same purpose, hence the problem raised in the question. I think Python's behavior in this regard should be in line with what wget and curl do, which is why I raised the bug at http://bugs.python.org/issue17164
I'll note that replacing 0s with empty strings in the 5th column of the input file and passing ignore_discard=True to load() is an alternative way of solving the problem (no need to change the expiry time in this case).
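That alternative could look roughly like this. Again a sketch, assuming Python 3 (http.cookiejar) and tab-separated fields; blank_zero_expiry is a hypothetical helper name, and it writes to a second file rather than modifying the original in place.

```python
from http.cookiejar import MozillaCookieJar  # Python 3 name for cookielib


def blank_zero_expiry(src_path, dst_path):
    """Copy a cookies.txt file, blanking the 5th column where it is 0."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            # Cookie lines have exactly 7 tab-separated fields;
            # header and comment lines are copied through unchanged.
            if len(fields) == 7 and fields[4] == "0":
                fields[4] = ""
                line = "\t".join(fields) + "\n"
            dst.write(line)


def load_converted(dst_path):
    """Load the converted file, keeping cookies marked as session cookies."""
    cj = MozillaCookieJar(dst_path)
    cj.load(ignore_discard=True)
    return cj
```

After the conversion the cookies come out with no expiry at all and flagged as session cookies, so there is nothing left for the jar to treat as expired.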