
How to use cookies in python 3?

I want to use cookies copied from my Chrome browser, but my code produces many errors.

import urllib.request
import re


def open_url(url):
    header = {"User-Agent": r'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'}
    Cookies = {'Cookie': r"xxxxx"}
    Request = urllib.request.Request(url=url, headers=Cookies)
    response = urllib.request.urlopen(Request, timeout=100)
    return response.read().decode("utf-8")

Where does my code go wrong? Is it the headers=Cookies argument?

Yong Deng asked Oct 29 '22 11:10



1 Answer

The correct way when using urllib.request is to use an OpenerDirector populated with an HTTPCookieProcessor:

cookieProcessor = urllib.request.HTTPCookieProcessor()
opener = urllib.request.build_opener(cookieProcessor)

Then use the opener, and it will automatically process the cookies:

response = opener.open(request,timeout=100)
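Put together, the pieces above might look like the following. This is only a minimal sketch: the shortened User-Agent string and the module-level cookie_jar and opener names are my own choices for illustration, not something the question requires.

```python
import http.cookiejar
import urllib.request

# One jar and one opener shared by every request, so that cookies set by
# one response are sent back with the next request.
cookie_jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookie_jar))


def open_url(url, timeout=100):
    """Fetch url through the cookie-aware opener and return the decoded body."""
    # The User-Agent value is an assumption; use whatever the site expects.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener.open(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

After a call to open_url, iterating over cookie_jar lists the cookies the server has set so far.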

By default, the CookieJar (http.cookiejar.CookieJar) is a simple in-memory store, but you can use a FileCookieJar subclass if you need long-term storage of persistent cookies, for example http.cookiejar.MozillaCookieJar if you want to use persistent cookies stored in the now-legacy Mozilla cookies.txt format.
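As a rough sketch of the persistent variant, a MozillaCookieJar can be loaded at start-up and saved when you are done. The helper name make_persistent_opener, the temporary file location, and the hand-built "sid" cookie are assumptions made purely so the example runs without network access.

```python
import http.cookiejar
import os
import tempfile
import time
import urllib.request


def make_persistent_opener(cookie_file):
    """Build an opener whose cookies are backed by a cookies.txt file."""
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    if os.path.exists(cookie_file):
        # ignore_discard=True also loads session cookies (those without an expiry).
        jar.load(ignore_discard=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


# Demonstration without network access: a hand-built cookie stands in for
# one that a server would normally set through a Set-Cookie header.
cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.txt")  # hypothetical location
opener, jar = make_persistent_opener(cookie_file)
jar.set_cookie(http.cookiejar.Cookie(
    version=0, name="sid", value="abc", port=None, port_specified=False,
    domain="example.com", domain_specified=False, domain_initial_dot=False,
    path="/", path_specified=True, secure=False,
    expires=int(time.time()) + 3600, discard=False,
    comment=None, comment_url=None, rest={}))
jar.save(ignore_discard=True)  # writes the file in cookies.txt format

# A later run picks the cookie up again from disk.
_, reloaded = make_persistent_opener(cookie_file)
print([c.name for c in reloaded])
```

In a real script you would call jar.save() after your requests rather than adding cookies by hand.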


If you want to use cookies that already exist in your web browser, you must first store them in a cookies.txt-compatible file and load them into a MozillaCookieJar. For Mozilla, you can find an add-on called Cookie Exporter. For other browsers, you must manually create a cookies.txt file by reading the content of the cookies you need in your browser. The format can be found in The Unofficial Cookie FAQ. Extracts:

... each line contains one name-value pair. An example cookies.txt file may have an entry that looks like this:

.netscape.com TRUE / FALSE 946684799 NETSCAPE_ID 100103

Each line represents a single piece of stored information. A tab is inserted between each of the fields.

From left-to-right, here is what each field represents:

  • domain - The domain that created AND that can read the variable.
  • flag - A TRUE/FALSE value indicating if all machines within a given domain can access the variable. This value is set automatically by the browser, depending on the value you set for domain.
  • path - The path within the domain that the variable is valid for.
  • secure - A TRUE/FALSE value indicating if a secure connection with the domain is needed to access the variable.
  • expiration - The UNIX time that the variable will expire on. UNIX time is defined as the number of seconds since Jan 1, 1970 00:00:00 GMT.
  • name - The name of the variable.
  • value - The value of the variable.
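To make the seven fields concrete, here is a small sketch that writes the sample entry from the FAQ to a file by hand and loads it with MozillaCookieJar. The helper names write_cookies_txt and load_cookies_txt are hypothetical. Two details I am relying on: the loader expects the "# Netscape HTTP Cookie File" header as the first line, and the sample cookie expired in 1999, so it is only kept when ignore_expires=True is passed.

```python
import http.cookiejar
import os
import tempfile

# Seven tab-separated fields: domain, flag, path, secure, expiration, name, value.
fields = [".netscape.com", "TRUE", "/", "FALSE", "946684799", "NETSCAPE_ID", "100103"]


def write_cookies_txt(path, rows):
    """Write rows of seven fields in cookies.txt format (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        # MozillaCookieJar.load() checks for this header line.
        f.write("# Netscape HTTP Cookie File\n")
        for row in rows:
            f.write("\t".join(row) + "\n")


def load_cookies_txt(path):
    """Load a cookies.txt file, keeping session and already-expired cookies."""
    jar = http.cookiejar.MozillaCookieJar()
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return jar


tmp_path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
write_cookies_txt(tmp_path, [fields])
jar = load_cookies_txt(tmp_path)
for cookie in jar:
    print(cookie.domain, cookie.path, cookie.secure, cookie.expires,
          cookie.name, cookie.value)
```

For cookies copied from a real browser, use a future expiration time and drop ignore_expires, otherwise the jar will hold cookies that are never sent.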

But the normal way is to mimic a full session and let the cookies be extracted automatically from the responses.
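Mimicking a session could look roughly like this: post the login form through a cookie-aware opener, then request the page you want with the same opener. Everything specific here is an assumption for illustration (the helper names, the form field names, and the URLs in the usage note); a real site may also need hidden form fields or extra headers.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener and the in-memory jar it fills from Set-Cookie headers."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_form(fields):
    """Encode a dict as an application/x-www-form-urlencoded request body."""
    return urllib.parse.urlencode(fields).encode("ascii")


def login_and_fetch(opener, login_url, form_fields, page_url, timeout=100):
    """Log in with a POST, then fetch page_url using the cookies received."""
    # Passing data makes urllib send a POST; cookies set by the response
    # are stored in the opener's jar.
    with opener.open(login_url, data=encode_form(form_fields), timeout=timeout):
        pass
    # The same opener sends those cookies back automatically.
    with opener.open(page_url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Usage would be along the lines of opener, jar = make_session() followed by login_and_fetch(opener, "https://example.com/login", {"user": "me", "password": "secret"}, "https://example.com/private"), where both URLs and the field names are placeholders.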

Serge Ballesta answered Nov 15 '22 06:11
