This is the script:
import requests
import json
import urlparse
from requests.adapters import HTTPAdapter
s = requests.Session()
s.mount('http://', HTTPAdapter(max_retries=1))
with open('proxies.txt') as proxies:
for line in proxies:
proxy=json.loads(line)
with open('urls.txt') as urls:
for line in urls:
url=line.rstrip()
data=requests.get(url, proxies=proxy)
data1=data.content
print data1
print {'http': line}
as you can see, its trying to access a list of urls through a list of proxies. Here is the urls.txt file:
http://api.exip.org/?call=ip
here is the proxies.txt file:
{"http":"http://107.17.92.18:8080"}
I got this proxy at www.hidemyass.com. Could it be a bad proxy? I have tried several and this is the result. Note: if you are trying to replicate this, you may have to update the proxy to a recent one at hidemyass.com. They seem to stop working eventually. here is the full error and traceback:
Traceback (most recent call last):
File "test.py", line 17, in <module>
data=requests.get(url, proxies=proxy)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 335, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 454, in send
history = [resp for resp in gen] if allow_redirects else []
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 144, in resolve_redirects
allow_redirects=False,
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 438, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 327, in send
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host=u'219.231.143.96', port=18186): Max retries exceeded with url: http://www.google.com/ (Caused by <class 'httplib.BadStatusLine'>: '')
To fix max retries exceeded with URL in Python requests, we can set the retries when making a request with requests . to create a HTTPAdapter with the max_retries set to the Retry object. We set the Retry to max 3 retries and backoff_factor is the delay between retries in seconds. Then we call session.
“Max retries exceeded with URL” debugging ConnectionError , which tell us that there is something bad happened when requests was trying to connect. Sometimes, the exception is requests. exceptions. SSLError which is obviously a SSL-related problem.
Looking at stack trace you've provided your error is caused by httplib.BadStatusLine
exception, which, according to docs, is:
Raised if a server responds with a HTTP status code that we don’t understand.
In other words something that is returned (if returned at all) by proxy server cannot be parsed by httplib that does actual request.
From my experience with (writing) http proxies I can say that some implementations may not follow specs too strictly (rfc specs on http aren't easy reading actually) or use hacks to fix old browsers that have flaws in their implementation.
So, answering this:
Could it be a bad proxy?
... I'd say - that this is possible. The only real way to be sure is to see what is returned by proxy server.
Try to debug it with debugger or grab packet sniffer (something like Wireshark or Network Monitor) to analyze what happens in the network. Having info about what exactly is returned by proxy server should give you a key to solve this issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With