
Python 2 vs. Python 3 - urllib formats


I'm getting really tired of trying to figure out why this code works in Python 2 and not in Python 3. I'm just trying to grab a page of JSON and then parse it. Here's the code in Python 2:

    import urllib, json

    response = urllib.urlopen("http://reddit.com/.json")
    content = response.read()
    data = json.loads(content)

I thought the equivalent code in Python 3 would be this:

    import urllib.request, json

    response = urllib.request.urlopen("http://reddit.com/.json")
    content = response.read()
    data = json.loads(content)

But it blows up in my face, because the data returned by read() is a "bytes" type. However, I cannot for the life of me get it to convert to something that json will be able to parse. I know from the headers that reddit is trying to send UTF-8 back to me, but I can't seem to get the bytes to decode from UTF-8 into a string:

    import urllib.request, json

    response = urllib.request.urlopen("http://reddit.com/.json")
    content = response.read()
    data = json.loads(content.decode("utf8"))

What am I doing wrong?

Edit: the problem is that I cannot get the data into a usable state; even though json loads the data, part of it is undisplayable, and I want to be able to print the data to the screen.

Second edit: The problem has more to do with print than parsing, it seems. Alex's answer provides a way for the script to work in Python 3, by setting the IO to utf8. But a question still remains: why is it that the code worked in Python 2, but not Python 3?
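To illustrate the printing side of this, here is a minimal sketch, assuming the failure comes from the output stream's encoding rather than from decoding or parsing. The helper name write_with_encoding and the sample text are made up for the example; an in-memory stream stands in for stdout so the snippet runs the same way on any terminal.

```python
import io

# Sample text with non-ASCII characters (hypothetical stand-in for reddit data).
text = "caf\u00e9 \u2603"

def write_with_encoding(s, encoding):
    """Write s through a text stream with the given encoding; return the bytes produced."""
    buf = io.BytesIO()
    stream = io.TextIOWrapper(buf, encoding=encoding)
    stream.write(s)
    stream.flush()
    stream.detach()  # release buf so it is not closed along with the wrapper
    return buf.getvalue()

# A UTF-8 stream can represent every character in the text.
utf8_bytes = write_with_encoding(text, "utf-8")

# An ASCII-only stream cannot, which is the kind of failure print() can hit.
try:
    write_with_encoding(text, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# ascii() escapes non-ASCII characters, so its result is safe on any stream.
escaped = ascii(text)
```

If this is what is happening, wrapping the real output the same way, for example io.TextIOWrapper(sys.stdout.buffer, encoding="utf-8"), should let print handle the data. One likely reason the Python 2 version appeared to work is that printing a parsed structure there shows unicode strings with \u escapes, much like ascii() does here, so nothing non-ASCII ever reaches the terminal.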

Dan Lew asked Jun 27 '10 23:06





1 Answer

The code you posted presumably suffered from cut-and-paste errors, because it's clearly wrong in both versions (f.read() fails because there's no f barename defined).

In Py3, ur = response.decode('utf8') works perfectly well for me, as does the following json.loads(ur). Maybe the faulty copy-and-pastes affected your 2-to-3 conversion attempts.
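As a rough sketch of the decode-then-parse step, assuming the body really is UTF-8: the sample bytes below stand in for what response.read() would return (no network call is made), and parse_json_bytes is a hypothetical helper name, not part of urllib or json.

```python
import json

def parse_json_bytes(raw, encoding="utf-8"):
    """Decode a bytes payload to str, then parse the result as JSON."""
    text = raw.decode(encoding)
    return json.loads(text)

# Stand-in for response.read(); a real body would come from urlopen().
sample = '{"kind": "Listing", "title": "caf\u00e9"}'.encode("utf-8")

data = parse_json_bytes(sample)
print(data["kind"])
```

Against a live response, the same helper would be called as parse_json_bytes(response.read()). The encoding is hard-coded here as an assumption; the response headers could be consulted instead.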

Alex Martelli answered Sep 17 '22 23:09
