Does urllib2 in Python 2.6.1 support proxy via https?
I've found the following at http://www.voidspace.org.uk/python/articles/urllib2.shtml:
NOTE
Currently urllib2 does not support fetching of https locations through a proxy. This can be a problem.
I'm trying automate login in to web site and downloading document, I have valid username/password.
proxy_info = {
'host':"axxx", # commented out the real data
'port':"1234" # commented out the real data
}
proxy_handler = urllib2.ProxyHandler(
{"http" : "http://%(host)s:%(port)s" % proxy_info})
opener = urllib2.build_opener(proxy_handler,
urllib2.HTTPHandler(debuglevel=1),urllib2.HTTPCookieProcessor())
urllib2.install_opener(opener)
fullurl = 'https://correct.url.to.login.page.com/user=a&pswd=b' # example
req1 = urllib2.Request(url=fullurl, headers=headers)
response = urllib2.urlopen(req1)
I've had it working for similar pages but not using HTTPS and I suspect it does not get through proxy - it just gets stuck in the same way as when I did not specify proxy. I need to go out through proxy.
I need to authenticate but not using basic authentication, will urllib2 figure out authentication when going via https site (I supply username/password to site via url)?
EDIT: Nope, I tested with
proxies = {
"http" : "http://%(host)s:%(port)s" % proxy_info,
"https" : "https://%(host)s:%(port)s" % proxy_info
}
proxy_handler = urllib2.ProxyHandler(proxies)
And I get error:
urllib2.URLError: urlopen error [Errno 8] _ssl.c:480: EOF occurred in violation of protocol
Fixed in Python 2.6.3 and several other branches:
http://www.python.org/download/releases/2.6.3/NEWS.txt
Issue #1424152: Fix for httplib, urllib2 to support SSL while working through proxy. Original patch by Christopher Li, changes made by Senthil Kumaran.
I'm not sure Michael Foord's article, that you quote, is updated to Python 2.6.1 -- why not give it a try? Instead of telling ProxyHandler that the proxy is only good for http, as you're doing now, register it for https, too (of course you should format it into a variable just once before you call ProxyHandler and just repeatedly use that variable in the dict): that may or may not work, but, you're not even trying, and that's sure not to work!-)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With