What is the correct way to update the user agent information in urllib3?
How can I check that the user agent information was indeed changed and is being used?
For example:
import urllib3
user_agent = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0'}
http = urllib3.PoolManager(10, headers=user_agent)
r1 = http.request('GET', 'http://example.com/')
# compare the status by value (==), not by identity (is)
if r1.status == 200:
    # r1.data is bytes, so open the file in binary mode
    with open('somefile', 'wb') as f:
        f.write(r1.data)
When I create a PoolManager as http, I inspected it with dir(http) and saw that http.headers was empty by default and was updated to the user agent info I specified. But is it actually being used? Is there any way to check without having to look at the Apache logs?
Here is what /var/log/apache2/access.log shows after trying to update the user agent:
>>> import urllib3
>>> user_agent = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0'}
>>> http = urllib3.PoolManager(2, headers=user_agent)
>>> r = http.request('GET','localhost')
>>> with open('/var/log/apache2/access.log','r') as f:
...     last_line = f.readlines()[-1]
...
>>> last_line
'127.0.0.1 - - [08/Dec/2014:20:42:04 -0500] "GET / HTTP/1.1" 200 461 "-" "-"\n'
The keyword argument should be headers, not header:
http = urllib3.PoolManager(10, headers=user_agent)
You can confirm that the headers were set correctly using a site like httpbin.org:
>>> import urllib3
>>> user_agent = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) ..'}
>>> http = urllib3.PoolManager(10, headers=user_agent)
>>> r1 = http.urlopen('GET', 'http://httpbin.org/headers')
>>> print(r1.data)
{
  "headers": {
    "Accept-Encoding": "identity",
    "Connect-Time": "1",
    "Connection": "close",
    "Host": "httpbin.org",
    "Total-Route-Time": "0",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0",
    "Via": "1.1 vegur",
    "X-Request-Id": "5ef53f21-6caf-4e45-8123-98e417cd05ba"
  }
}
Alternatively, you can use a packet analyzer (e.g. Wireshark).
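If you would rather not depend on an external site or on reading server logs, another option is to check against a throwaway local server. The following is only a minimal sketch, assuming Python 3 and an installed urllib3; the names EchoUserAgent and observed_user_agent are made up for this illustration and are not part of urllib3. The handler simply replies with whatever User-Agent header it received.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3


class EchoUserAgent(BaseHTTPRequestHandler):
    """Hypothetical handler: replies with the User-Agent header it received."""

    def do_GET(self):
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default per-request logging


def observed_user_agent(user_agent):
    """Send one GET to a local server and return the User-Agent it saw."""
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoUserAgent)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        http = urllib3.PoolManager(headers={"user-agent": user_agent})
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        response = http.request("GET", url)
        return response.data.decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(observed_user_agent("my-test-agent/1.0"))
```

If the headers argument is picked up, the string returned by observed_user_agent should match the one you passed in; any other value means the header you set did not reach the server.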