Python - How to handle HTTPS requests with (urllib2 + SSL) through an HTTP proxy

I am trying to test a proxy connection using urllib2.ProxyHandler. However, in some situations I will need to request an HTTPS website (e.g. https://www.whatismyip.com/).

urllib2.urlopen() throws an error when I request an HTTPS site, so I tried to use a helper function that wraps urlopen().

Here is the helper function:

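import ssl
import urllib2

def urlopen(url, timeout):
    # ssl.SSLContext exists on Python 2.7.9+; older versions take the else branch
    if hasattr(ssl, 'SSLContext'):
        # Build a context that skips hostname and certificate checks
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return urllib2.urlopen(url, timeout=timeout, context=context)
    else:
        return urllib2.urlopen(url, timeout=timeout)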

This helper function is based on another answer.

Then I use:

urllib2.install_opener(
     urllib2.build_opener(
         urllib2.ProxyHandler({'http': '127.0.0.1:8080'})
     )
)

to set up an HTTP proxy for the global urllib2 opener.

Ideally, this should work when I request a website with urlopen('http://whatismyip.com', 30), and all traffic should pass through the HTTP proxy.

However, urlopen() always takes the if hasattr(ssl, 'SSLContext') branch, even for an HTTP site. In addition, requests to HTTPS sites do not use the HTTP proxy either. As a result the proxy is ignored and all traffic goes out over the unproxied network.

I also tried the suggestion from another answer to change the proxy key from 'http' to 'https', i.e. urllib2.ProxyHandler({'https': '127.0.0.1:8080'}), but it still does not work.

My proxy itself is working. If I use urllib2.urlopen() instead of my own urlopen() wrapper, it works for HTTP sites.

But I do need to handle the situation where urlopen() is used on an HTTPS-only site.

How to do that?

Thanks

UPDATE 1: I cannot get this to work with Python 2.7.11, while some of my servers running Python 2.7.5 work properly. I assume it is a Python version issue.

urllib2 does not go through the HTTPS proxy, so every HTTPS address fails to use the proxy.

asked Mar 18 '16 16:03 by SharkIng


3 Answers

The problem is that when you pass the context argument to urllib2.urlopen(), urllib2 creates its own opener instead of using the global one, which is the one set by urllib2.install_opener(). As a result, the ProxyHandler instance you meant to use is never used.

The solution is not to install an opener but to use the opener directly. When building your opener, you have to pass both an instance of ProxyHandler (to set the proxies for the http and https protocols) and an instance of HTTPSHandler (to set the HTTPS context).

I created https://bugs.python.org/issue29379 for this issue.
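The approach described above can be sketched roughly as follows. This is only an illustration, not a tested recipe: build_proxy_opener and PROXY_URL are hypothetical names, the proxy address is taken from the question, and the try/except import is there only so the same sketch also loads on Python 3, where these names live in urllib.request.

```python
import ssl

try:
    import urllib2  # Python 2.7.9+ (HTTPSHandler accepts a context there)
except ImportError:
    import urllib.request as urllib2  # Python 3 exposes the same names here

# Assumption: the proxy address used in the question.
PROXY_URL = 'http://127.0.0.1:8080'


def build_proxy_opener(proxy_url, verify=True):
    """Hypothetical helper: return an opener that routes HTTP and HTTPS
    requests through proxy_url and carries its own SSL context."""
    context = ssl.create_default_context()
    if not verify:
        # Mirrors the question's helper: turns certificate checks off.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    proxy_handler = urllib2.ProxyHandler({'http': proxy_url,
                                          'https': proxy_url})
    https_handler = urllib2.HTTPSHandler(context=context)
    return urllib2.build_opener(proxy_handler, https_handler)


opener = build_proxy_opener(PROXY_URL, verify=False)

# Call the opener directly instead of urllib2.urlopen(url, context=...),
# so that the ProxyHandler is actually consulted:
# response = opener.open('https://www.whatismyip.com/', timeout=30)
```

Because the context lives inside the HTTPSHandler, there is no need to pass context= on each call, which is what caused the global opener to be bypassed in the first place.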

answered Nov 14 '22 23:11 by Piotr Dobrogost


Personally, I would suggest using something like python-requests, as it avoids many of the issues with setting up a proxy through urllib2 directly. To use requests with a proxy, you do the following (from their documentation):

import requests

proxies = {
  'http': 'http://10.10.1.10:3128',
  'https': 'http://10.10.1.10:1080',
}

requests.get('http://example.org', proxies=proxies)

Disabling SSL certificate verification is as simple as passing verify=False to the requests.get call above. However, this should be used sparingly, and the underlying problem with the SSL certificate verification should be resolved.

answered Nov 14 '22 22:11 by Cory Shay


One more solution is to pass the context into an HTTPSHandler and pass this handler to build_opener together with the ProxyHandler:

import ssl
import urllib2

# Send HTTPS requests through the local HTTP proxy
proxies = {'https': 'http://localhost:8080'}
proxy = urllib2.ProxyHandler(proxies)
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
handler = urllib2.HTTPSHandler(context=context)
# Install an opener that carries both the proxy and the SSL context
opener = urllib2.build_opener(proxy, handler)
urllib2.install_opener(opener)

Now you can view all your HTTPS requests/responses in your proxy.
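If the proxy is an intercepting one that re-signs HTTPS traffic with its own CA certificate (an assumption, the answer does not say which proxy is used), a verifying context can be kept by trusting that CA rather than switching verification off. A minimal sketch, where make_verifying_context, build_inspecting_opener and the certificate path are hypothetical:

```python
import ssl

try:
    import urllib2  # Python 2.7.9+
except ImportError:
    import urllib.request as urllib2  # same names on Python 3


def make_verifying_context(proxy_ca_file=None):
    """Hypothetical helper: default verifying context, optionally also
    trusting the CA certificate exported from the proxy."""
    context = ssl.create_default_context()
    if proxy_ca_file is not None:
        # proxy_ca_file is a path to a PEM file; the path is an assumption.
        context.load_verify_locations(cafile=proxy_ca_file)
    return context


def build_inspecting_opener(proxy_url, proxy_ca_file=None):
    """Hypothetical helper: opener that proxies HTTPS and verifies certs."""
    context = make_verifying_context(proxy_ca_file)
    return urllib2.build_opener(
        urllib2.ProxyHandler({'https': proxy_url}),
        urllib2.HTTPSHandler(context=context),
    )


context = make_verifying_context()
opener = build_inspecting_opener('http://localhost:8080')
# response = opener.open('https://www.whatismyip.com/', timeout=30)
```

With this variant, certificate and hostname checks stay enabled, so a request fails unless the certificate presented through the proxy chains to a trusted CA.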

answered Nov 15 '22 00:11 by avdyushin