I manage a lot of HTTPS proxies (that is, proxies which have an SSL connection of their own). I'm building a diagnostic tool in Python that attempts to connect to a page through each proxy and emails me if it can't connect through one of them.
The way I've set out to do this is to use urllib to connect through each proxy and return a page which should say "success", with the code below.
import urllib  # Python 2

def fetch(url):
    # 'server' is defined elsewhere as the proxy host
    connection = urllib.urlopen(
        url,
        proxies={'http': "https://" + server + ':443'}
    )
    return connection.read()

print fetch(testURL)
This fetches the page I want perfectly. The problem is that it will still fetch the page even if the proxy server information is incorrect or the proxy server is inactive. So either it never uses the proxy server, or it tries the proxy and silently connects without it when that fails.
How can I correct this?
Edit: No one seems to know how to do this. I'm going to start reading through other languages' libraries to see if they handle it better. Does anyone know if it's easier in another language, like Go?
Edit: I just wrote this in a comment below, but I think there might be a misunderstanding going around: "The proxy has its own SSL connection. So if I go to google.com, I first do a key exchange with foo.com and then another with the destination address bar.com or the destination address baz.com. The destination doesn't have to be HTTPS; the proxy is HTTPS."
A simple Python script to check whether a proxy is working: simply put proxy:port entries in an array. If you want to check whether the internet connection itself is working, leave the array empty.
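A minimal sketch of such a checker, assuming the requests module; the proxy entries, test URL, and timeout below are placeholders:

import requests

# Hypothetical proxy:port entries; leave the list empty to test
# the plain internet connection instead.
proxies_to_check = ['10.0.0.1:8080', '10.0.0.2:3128']
test_url = 'http://example.com/'  # assumed test page

def check(proxy=None):
    # Return True if test_url is reachable (through proxy, if given).
    try:
        mapping = None
        if proxy:
            mapping = {'http': 'http://' + proxy,
                       'https': 'http://' + proxy}
        requests.get(test_url, proxies=mapping, timeout=5)
        return True
    except requests.RequestException:
        return False

if proxies_to_check:
    for p in proxies_to_check:
        print(p, 'OK' if check(p) else 'FAILED')
else:
    print('internet', 'OK' if check() else 'FAILED')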
To use a proxy in Python, first import the requests package. Next, create a proxies dictionary that defines the HTTP and HTTPS connections; this variable should be a dictionary mapping each protocol to the proxy URL. Additionally, make a url variable set to the webpage you're scraping.
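For example, a sketch along those lines (the proxy addresses and URL are placeholders):

import requests

# Hypothetical proxy endpoints; replace with your own server.
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
url = 'http://example.com/'  # the page you want to fetch

response = requests.get(url, proxies=proxies, timeout=5)
print(response.status_code)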
Modern proxy servers can be used as gateways for requests that access both HTTP and HTTPS resources.
Most people understand an HTTPS proxy to be a proxy that understands the CONNECT request. My example creates a direct SSL connection instead.
try:
    import http.client as httplib  # for python 3.2+
except ImportError:
    import httplib  # for python 2.7

con = httplib.HTTPSConnection('proxy', 443)  # create proxy connection

# download http://example.com/ through proxy
con.putrequest('GET', 'http://example.com/', skip_host=True)
con.putheader('Host', 'example.com')
con.endheaders()
res = con.getresponse()
print(res.read())
If your proxy is a reverse proxy, then change

con.putrequest('GET', 'http://example.com/', skip_host=True)

to

con.putrequest('GET', '/', skip_host=True)
I assume it's not working for HTTPS requests. Is this correct? If so, the code in the question defines a proxy only for HTTP. Try adding one for HTTPS:
proxies={'https':"https://"+server+':443'}
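Put together with the question's fetch(), that might look like the sketch below (server is passed in here for self-containment; whether Python 2's urllib actually honors an https:// proxy scheme is the open issue in the question):

import urllib  # Python 2

def fetch(url, server):
    # Route both protocols through the HTTPS proxy at server:443.
    connection = urllib.urlopen(
        url,
        proxies={
            'http': 'https://' + server + ':443',
            'https': 'https://' + server + ':443',
        }
    )
    return connection.read()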
Another option is to use the requests Python module instead of urllib. Have a look at http://docs.python-requests.org/en/latest/user/advanced/#proxies
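For instance, a sketch of the question's fetch() using requests (an https:// proxy scheme, i.e. TLS to the proxy itself, requires a sufficiently recent urllib3 underneath, so treat that part as an assumption):

import requests

def fetch(url, server):
    # Route both plain and TLS traffic through the proxy.
    proxies = {
        'http': 'https://' + server + ':443',
        'https': 'https://' + server + ':443',
    }
    try:
        return requests.get(url, proxies=proxies, timeout=10).text
    except requests.exceptions.ProxyError:
        # Unlike the urllib code in the question, requests fails
        # loudly when the proxy itself is unreachable.
        return None

The relevant difference is that a dead or misconfigured proxy raises ProxyError here rather than silently falling back to a direct connection.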