This question is based on a side-effect of that one.
My .py
files are all have # -*- coding: utf-8 -*-
encoding definer on the first line, like my api.py
As I mention on the related question, I use HttpResponse
to return the api documentation. Since I defined encoding by:
HttpResponse(cy_content, content_type='text/plain; charset=utf-8')
Everything is ok, and when I call my API service, there are no encoding problems except the string formed from a dictionary by pprint
Since I am using Turkish characters in some values in my dict, pprint converts them to unichr
equivalents, like:
API_STATUS = {
1: 'müşteri',
2: 'some other status message'
}
my_str = 'Here is the documentation part that contains Turkish chars like işüğçö'
my_str += pprint.pformat(API_STATUS, indent=4, width=1)
return HttpRespopnse(my_str, content_type='text/plain; charset=utf-8')
And my plain text output is like:
Here is the documentation part that contains Turkish chars like işüğçö
{
1: 'm\xc3\xbc\xc5\x9fteri',
2: 'some other status message'
}
I try to decode or encode pprint output to different encodings, with no success... What is the best practice to overcome this problem
pprint
appears to use repr
by default, you can work around this by overriding PrettyPrinter.format
:
# coding=utf8
import pprint
class MyPrettyPrinter(pprint.PrettyPrinter):
def format(self, object, context, maxlevels, level):
if isinstance(object, unicode):
return (object.encode('utf8'), True, False)
return pprint.PrettyPrinter.format(self, object, context, maxlevels, level)
d = {'foo': u'işüğçö'}
pprint.pprint(d) # {'foo': u'i\u015f\xfc\u011f\xe7\xf6'}
MyPrettyPrinter().pprint(d) # {'foo': işüğçö}
You should use unicode strings instead of 8-bit ones:
API_STATUS = {
1: u'müşteri',
2: u'some other status message'
}
my_str = u'Here is the documentation part that contains Turkish chars like işüğçö'
my_str += pprint.pformat(API_STATUS, indent=4, width=1)
The pprint
module is designed to print out all possible kind of nested structure in a readable way. To do that it will print the objects representation rather then convert it to a string, so you'll end up with the escape syntax wheather you use unicode strings or not. But if you're using unicode in your document, then you really should be using unicode literals!
Anyway, thg435 has given you a solution how to change this behaviour of pformat.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With