I'm learning how to login to an example website using python requests module. This Video Tutorial got me started. From all the cookies that I see in GoogleChrome>Inspect Element>NetworkTab, I'm not able to retrieve all of them using the following code:
import requests
with requests.Session() as s:
url = 'http://www.noobmovies.com/accounts/login/?next=/'
s.get(url)
allcookies = s.cookies.get_dict()
print allcookies
Using this I only get csrftoken like below:
{'csrftoken': 'ePE8zGxV4yHJ5j1NoGbXnhLK1FQ4jwqO'}
But in google chrome, I see all these other cookies apart from csrftoken (sessionid, _gat, _ga etc):
I even tried the following code from here, but the result was the same:
from urllib2 import Request, build_opener, HTTPCookieProcessor, HTTPHandler
import cookielib
#Create a CookieJar object to hold the cookies
cj = cookielib.CookieJar()
#Create an opener to open pages using the http protocol and to process cookies.
opener = build_opener(HTTPCookieProcessor(cj), HTTPHandler())
#create a request object to be used to get the page.
req = Request("http://www.noobmovies.com/accounts/login/?next=/")
f = opener.open(req)
#see the first few lines of the page
html = f.read()
print html[:50]
#Check out the cookies
print "the cookies are: "
for cookie in cj:
print cookie
Output:
<!DOCTYPE html>
<html xmlns="http://www.w3.org
the cookies are:
<Cookie csrftoken=ePE8zGxV4yHJ5j1NoGbXnhLK1FQ4jwqO for www.noobmovies.com/>
So, how can I get all the cookies ? Thanks.
To send a request with a Cookie, you need to add the "Cookie: name=value" header to your request. To send multiple cookies in a single Cookie header, separate them with semicolons or add multiple "Cookie: name=value" request headers.
To send cookies to the server, you need to add the "Cookie: name=value" header to your request. To send multiple Cookies in one cookie header, you can separate them with semicolons. In this Send Cookies example, we are sending HTTP cookies to the ReqBin echo URL.
The reason why request might be blocked is that, for example in Python requests library, default user-agent is python-requests and websites understands that it's a bot and might block a request in order to protect the website from overload, if there's a lot of requests being sent.
The requests module allows you to send HTTP requests using Python. The HTTP request returns a Response Object with all the response data (content, encoding, status, etc).
The cookies being set are from other pages/resources, probably loaded by JavaScript code. You can check it making the request to the page only (without running the JS code), using tools such as wget, curl or httpie.
The only cookie this server set is csrftoken
, as you can see in:
$ wget --server-response 'http://www.noobmovies.com/accounts/login/?next=/'
--2016-02-01 22:51:55-- http://www.noobmovies.com/accounts/login/?next=/
Resolving www.noobmovies.com (www.noobmovies.com)... 69.164.217.90
Connecting to www.noobmovies.com (www.noobmovies.com)|69.164.217.90|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx/1.4.6 (Ubuntu)
Date: Tue, 02 Feb 2016 00:51:58 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Expires: Tue, 02 Feb 2016 00:51:58 GMT
Vary: Cookie,Accept-Encoding
Cache-Control: max-age=0
Set-Cookie: csrftoken=XJ07sWhMpT1hqv4K96lXkyDWAYIFt1W5; expires=Tue, 31-Jan-2017 00:51:58 GMT; Max-Age=31449600; Path=/
Last-Modified: Tue, 02 Feb 2016 00:51:58 GMT
Length: unspecified [text/html]
Saving to: ‘index.html?next=%2F’
index.html?next=%2F [ <=> ] 10,83K 2,93KB/s in 3,7s
2016-02-01 22:52:03 (2,93 KB/s) - ‘index.html?next=%2F’ saved [11085]
Note the Set-Cookie
line.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With