
Can't print character '\u2019' in Python from JSON object

As a project to help me learn Python, I'm making a CMD viewer of Reddit using the json data (for example www.reddit.com/all/.json). When certain posts show up and I attempt to print them (that's what I assume is causing the error), I get this error:

Traceback (most recent call last):
  File "C:\Users\nsaba\Desktop\reddit_viewer.py", line 33, in <module>
    print ( "%d. (%d) %s\n" % (i+1, obj['data']['score'], obj['data']['title']))
  File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2019' in position 32: character maps to <undefined>

Here is where I handle the data:

import json
import urllib.request

# url holds the address of the Reddit JSON feed mentioned above
request = urllib.request.urlopen(url)
content = request.read().decode('utf-8')
jstuff = json.loads(content)

The line I use to print the data as listed in the error above:

print ( "%d. (%d) %s\n" % (i+1, obj['data']['score'], obj['data']['title']))

Can anyone suggest where I might be going wrong?

asked Aug 27 '13 19:08 by N-Saba




1 Answer

It's almost certain that your problem has nothing to do with the code you've shown, and can be reproduced in one line:

print(u'\u2019')

If your terminal's character set can't handle U+2019 (or if Python is confused about what character set your terminal uses), there's no way to print it out. It doesn't matter whether it comes from JSON or anywhere else.
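As a rough illustration of that point, the following sketch checks whether a string containing U+2019 can be encoded in a few codecs, with no JSON or terminal involved. The helper name can_encode and the sample string are made up for this example; cp437 is the code page named in the traceback in the question.

```python
# Sample text containing U+2019 (RIGHT SINGLE QUOTATION MARK),
# the character reported in the traceback.
text = "It\u2019s"


def can_encode(s, encoding):
    """Return True if every character of s can be encoded with the codec."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp437 is the code page from the traceback; ascii and utf-8 are for comparison.
results = {enc: can_encode(text, enc) for enc in ("cp437", "ascii", "utf-8")}
print(results)
```

Only utf-8 reports True here, which is why the same print call works on a UTF-8 terminal and fails in a cmd window set to cp437.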

The Windows terminal (aka "DOS prompt" or "cmd window") is usually configured for a legacy code page such as cp437 (the one in your traceback), which only knows about 256 of the roughly 110,000 Unicode characters, and there's nothing Python can do about this without a major change to the language implementation.*

See PrintFails on the Python Wiki for details, workarounds, and links to more information. There are also a few hundred dups of this problem on SO (although many of them will be specific to Python 2.x, without mentioning it).
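One common style of workaround, sketched here under assumptions rather than quoted from the wiki page, is to round-trip the text through the terminal's encoding with an error handler, so that unencodable characters are escaped instead of raising. The helper name to_printable is hypothetical, and the fallback to "ascii" when sys.stdout.encoding is unset is an assumption of this sketch.

```python
import sys


def to_printable(text, encoding=None, errors="backslashreplace"):
    """Return text with characters the target encoding lacks replaced.

    Hypothetical helper: encodes with an error handler, then decodes again,
    so the result contains only characters the terminal can show.
    """
    # Assumption: fall back to ASCII if stdout has no known encoding.
    encoding = encoding or sys.stdout.encoding or "ascii"
    return text.encode(encoding, errors=errors).decode(encoding)


title = "It\u2019s a post title"
# Force cp437 here to mimic the terminal from the traceback.
print(to_printable(title, encoding="cp437"))
```

With the default "backslashreplace" handler the apostrophe is shown as the escape \u2019; passing errors="replace" prints a question mark in its place instead. Either way the output is lossy, so this only hides the symptom; the terminal still cannot display the character.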


* Windows has a whole separate set of APIs for printing UTF-16 to the terminal, so Python could detect that stdout is a Windows terminal, and if so encode to UTF-16 and use the special APIs instead of encoding to the terminal's charset and using the standard ones. But this raises a bunch of different problems (e.g., different ways of printing to stdout getting out of sync). There's been discussion about making these changes, but even if everyone were to agree and the patch were written tomorrow, it still wouldn't help you until you upgrade to whatever future version of Python it's added to…

answered Oct 04 '22 01:10 by abarnert