I am trying to open the following website, retrieve the initial cookie, and reuse it for the second request. However, when I run the code below it prints two different cookies. How do I use the initial cookie for the second call to opener.open?
    import cookielib, urllib2

    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

    home = opener.open('https://www.idcourts.us/repository/start.do')
    print cj

    search = opener.open('https://www.idcourts.us/repository/partySearch.do')
    print cj
The output shows two different cookies on every run:
    <cookielib.CookieJar[<Cookie JSESSIONID=0DEEE8331DE7D0DFDC22E860E065085F for www.idcourts.us/repository>]>
    <cookielib.CookieJar[<Cookie JSESSIONID=E01C2BE8323632A32DA467F8A9B22A51 for www.idcourts.us/repository>]>
urllib2 is a Python 2 module for fetching URLs. It defines functions and classes that help with URL actions (basic and digest authentication, redirections, cookies, etc.). To use it, import the urllib2 module.
The Python error "ModuleNotFoundError: No module named 'urllib2'" occurs because the urllib2 module was split into urllib.request and urllib.error in Python 3. To solve the error, import from the urllib package instead.
NOTE: urllib2 is no longer available in Python 3.x; use the urllib package instead.
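For anyone reading this on Python 3, the opener setup from the question might look roughly like the sketch below. It assumes the standard-library replacements http.cookiejar (for cookielib) and urllib.request (for urllib2); the helper name build_cookie_opener is made up for illustration and is not part of either module. No request is sent here, so it only shows the wiring.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Hypothetical helper: return (opener, jar) with the jar attached.

    Every response opened through `opener` can store cookies in `jar`,
    and later requests through the same opener send them back.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


opener, jar = build_cookie_opener()

# No request has been made yet, so the jar starts out empty.
print(len(jar))
```

Calling opener.open(url) twice with this opener reuses the same jar, which is the Python 3 counterpart of the question's two opener.open calls.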
This is not a problem with urllib. That site does some funky stuff. You need to request a couple of stylesheets for it to validate your session id:
    import cookielib, urllib2

    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

    # default User-Agent ('Python-urllib/2.6') will *not* work
    opener.addheaders = [
        ('User-Agent', 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11'),
    ]

    stylesheets = [
        'https://www.idcourts.us/repository/css/id_style.css',
        'https://www.idcourts.us/repository/css/id_print.css',
    ]

    home = opener.open('https://www.idcourts.us/repository/start.do')
    print cj

    sessid = cj._cookies['www.idcourts.us']['/repository']['JSESSIONID'].value

    # Note the +=
    opener.addheaders += [
        ('Referer', 'https://www.idcourts.us/repository/start.do'),
    ]

    for st in stylesheets:
        # da trick
        opener.open(st+';jsessionid='+sessid)

    search = opener.open('https://www.idcourts.us/repository/partySearch.do')
    print cj

    # perhaps need to keep updating the referer...
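The snippet above reads the session id through cj._cookies, which is a private attribute. A possible alternative, sketched here for Python 3, is to iterate over the jar, since CookieJar yields Cookie objects when iterated. The helper names get_cookie_value and with_jsessionid are hypothetical, and 'ABC123' is a placeholder rather than a real session id.

```python
import http.cookiejar


def get_cookie_value(jar, name, domain=None):
    """Return the value of the first cookie called `name`, or None.

    If `domain` is given, only cookies for exactly that domain match.
    """
    for cookie in jar:  # iterating a CookieJar yields Cookie objects
        if cookie.name == name and (domain is None or cookie.domain == domain):
            return cookie.value
    return None


def with_jsessionid(url, session_id):
    """Append the session id as a ';jsessionid=' path parameter,
    the same string concatenation the answer does by hand."""
    return url + ';jsessionid=' + session_id


# 'ABC123' is a placeholder, not a real session id.
print(with_jsessionid(
    'https://www.idcourts.us/repository/css/id_style.css', 'ABC123'
))
```

After the first opener.open call, something like get_cookie_value(cj, 'JSESSIONID', 'www.idcourts.us') would stand in for the cj._cookies lookup, and a None result signals that the site did not set the cookie.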