
Changing User Agent in Python 3 for urllib.request.urlopen

I want to open a url using urllib.request.urlopen('someurl'):

with urllib.request.urlopen('someurl') as url:
    b = url.read()

I keep getting the following error:

urllib.error.HTTPError: HTTP Error 403: Forbidden 

I understand the error is caused by the site refusing requests from Python, to stop bots from wasting its network resources, which is understandable. I searched and found that you need to change the user agent for urllib. However, all the guides and solutions I have found for changing the user agent use urllib2, and I am using Python 3, so none of them work.

How can I fix this problem with Python 3?

user3662991 asked Jun 15 '14 05:06




1 Answer

From the Python docs:

import urllib.request

req = urllib.request.Request(
    url,
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
    }
)

f = urllib.request.urlopen(req)
print(f.read().decode('utf-8'))
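If you make many requests, building a Request with headers each time gets repetitive. One possible alternative, sketched below, is to create an opener with urllib.request.build_opener() and replace its addheaders list, which by default holds a single 'Python-urllib/x.y' User-agent entry. The names USER_AGENT, make_opener and fetch, and the User-Agent string itself, are made up for this illustration and do not come from the Python docs.

```python
import urllib.request

# Hypothetical User-Agent string for illustration; substitute your own.
USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) my-script/0.1'


def make_opener(user_agent=USER_AGENT):
    """Return an opener that sends user_agent instead of the default one."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples; assigning a new list
    # replaces the default ('User-agent', 'Python-urllib/x.y') entry.
    opener.addheaders = [('User-Agent', user_agent)]
    return opener


def fetch(url, opener=None):
    """Fetch url with the given opener (or a fresh one) and return bytes."""
    opener = opener or make_opener()
    with opener.open(url) as response:
        return response.read()
```

Calling urllib.request.install_opener(make_opener()) once should make later plain urllib.request.urlopen(...) calls use the same header. A site may still answer 403 for reasons other than the user agent, so treat this as a starting point rather than a guaranteed fix.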
14 revs, 12 users 16% answered Oct 03 '22 03:10