I'm using urllib2 to open a url. Now I need the html file as a string. How do I do this?
urllib2 is deprecated in python 3. x. use urllib instaed.
Despite the similar name, they are unrelated: they have a different design and a different implementation. urllib was the original Python HTTP client, added to the standard library in Python 1.2.
Urllib package is the URL handling module for python. It is used to fetch URLs (Uniform Resource Locators). It uses the urlopen function and is able to fetch URLs using a variety of different protocols.
The data returned by urlopen() or urlretrieve() is the raw data returned by the server. This may be binary data (such as an image), plain text or (for example) HTML. The HTTP protocol provides type information in the reply header, which can be inspected by looking at the Content-Type header.
In python3, it should be changed to urllib.request.openurl('http://www.example.com/').read().decode('utf-8')
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With