This doesn't appear to be possible for me using the standard library json module. When using json.dumps, it automatically escapes all non-ASCII characters and then encodes the string to ASCII. I can tell it not to escape non-ASCII characters, but then it crashes when it tries to convert the output to ASCII.
The problem is - I don't want ASCII! I just want my JSON string back as a unicode (or UTF-8) string. Are there any convenient ways to do that?
Here's an example to demonstrate what I want:
d = {'navn': 'Åge', 'stilling': 'Lærling'}
json.dumps(d, output_encoding='utf8')
# => '{"stilling": "Lærling", "navn": "Åge"}'
But of course, there is no such option as output_encoding, so here's the actual output:
d = {'navn': 'Åge', 'stilling': 'Lærling'}
json.dumps(d)
# => '{"stilling": "L\\u00e6rling", "navn": "\\u00c5ge"}'
So to summarize: I want to convert a Python dict to a UTF-8 JSON string without any escapes. How can I do that?
I'll accept solutions that post-process the output of json.dumps, as long as they achieve the desired effect.
Make sure your Python files are encoded in UTF-8, or else your non-ASCII characters will turn into question marks (?). Notepad++ has excellent encoding options for this.
Make sure that you have the appropriate fonts included. If you want to display Japanese characters then you need to install Japanese fonts.
Make sure that your IDE supports displaying Unicode characters; otherwise a UnicodeEncodeError may be thrown.
Example:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 22-23: character maps to <undefined>
PyScripter works for me. It's included with "Portable Python" at http://portablepython.com/wiki/PortablePython3.2.1.1
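For illustration, here is a minimal sketch (assuming a console whose codec, e.g. cp1252, cannot represent the characters) of how that error can appear and how writing UTF-8 bytes yourself sidesteps it:
import json
import sys

data = {"Japan": "日本"}
text = json.dumps(data, ensure_ascii=False)
# print(text) can raise UnicodeEncodeError on a console with a limited codec;
# writing UTF-8 bytes to the underlying binary stream bypasses the console codec.
sys.stdout.buffer.write(text.encode('utf-8') + b'\n')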
json.dumps() escapes unicode characters.
Read the update at the bottom. Or...
Replace each escaped character with its parsed Unicode character.
I created a simple lambda function called getStringWithDecodedUnicode
that does just that.
import re
# Substitute every \uXXXX escape with the corresponding character.
getStringWithDecodedUnicode = lambda value: re.sub(r'\\u([\da-f]{4})', lambda m: chr(int(m.group(1), 16)), value)
Here's getStringWithDecodedUnicode
as a regular function.
def getStringWithDecodedUnicode(value):
    findUnicodeRE = re.compile(r'\\u([\da-f]{4})')
    def getParsedUnicode(x):
        return chr(int(x.group(1), 16))
    return findUnicodeRE.sub(getParsedUnicode, str(value))
import re
import json

getStringWithDecodedUnicode = lambda value: re.sub(r'\\u([\da-f]{4})', lambda m: chr(int(m.group(1), 16)), value)

data = {"Japan": "日本"}
jsonString = json.dumps(data)
print("json.dumps({0}) = {1}".format(data, jsonString))
jsonString = getStringWithDecodedUnicode(jsonString)
print("Decoded Unicode: %s" % jsonString)
Output:
json.dumps({'Japan': '日本'}) = {"Japan": "\u65e5\u672c"}
Decoded Unicode: {"Japan": "日本"}
Or... just pass ensure_ascii=False
as an option for json.dumps.
Note: You need to meet the requirements that I outlined at the beginning or else this isn't going to work.
import json
data = {'navn': 'Åge', 'stilling': 'Lærling'}
result = json.dumps(data, ensure_ascii=False)
print( result ) # prints '{"stilling": "Lærling", "navn": "Åge"}'
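If the goal is a UTF-8 file rather than an in-memory string, the same flag works with json.dump; here is a small sketch (the filename data.json is just an example):
import json

data = {'navn': 'Åge', 'stilling': 'Lærling'}
# Open the file with an explicit UTF-8 encoding and keep non-ASCII characters unescaped.
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False)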
ensure_ascii=False is the best solution IMHO.
If you are using Python 2.7, here is an example Python file:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# example.py
from __future__ import unicode_literals
from json import dumps as json_dumps
d = {'navn': 'Åge', 'stilling': 'Lærling'}
print json_dumps(d, ensure_ascii=False).encode('utf-8')
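To write that Python 2.7 result to a UTF-8 file instead of printing it, one sketch (the filename out.json is just an example) is to open the file with io.open and an explicit encoding; with ensure_ascii=False and unicode input, dumps returns a unicode object, which is exactly what a text-mode io stream expects:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import io
import json

d = {'navn': 'Åge', 'stilling': 'Lærling'}
# io.open's text mode takes unicode and encodes it to UTF-8 on write.
with io.open('out.json', 'w', encoding='utf-8') as f:
    f.write(json.dumps(d, ensure_ascii=False))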