Decoding double encoded utf8 in Python

I've got a problem with strings that I receive from one of my clients over XML-RPC. He sends me UTF-8 strings that are encoded twice :( so when I get them in Python I have a unicode object that has to be decoded one more time, but obviously Python doesn't allow that. I've notified my client, but I need a quick workaround for now, until he fixes it.

Raw string from tcp dump:

<string>Rafa\xc3\x85\xc2\x82</string>

This is converted into:

u'Rafa\xc5\x82'

The best workaround we have so far is:

eval(repr(u'Rafa\xc5\x82')[1:]).decode("utf8")

This results in the correct string, which is:

u'Rafa\u0142'

This works, but it is ugly as hell and cannot be used in production code. If anyone knows how to fix this problem in a more suitable way, please write. Thanks, Chris

asked Jul 24 '09 12:07

Chris Ciesielski

1 Answer

>>> s = u'Rafa\xc5\x82'
>>> s.encode('raw_unicode_escape').decode('utf-8')
u'Rafa\u0142'
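
Why this works: for code points below U+0100, `raw_unicode_escape` writes each character out as the single byte with the same value, which turns the mis-decoded text back into the original UTF-8 bytes. The `latin-1` codec does the same for that range. Below is a minimal Python 3 sketch of the same idea. It assumes the doubled string contains only characters below U+0100, and the helper name `fix_double_utf8` is hypothetical, not part of any library:

```python
def fix_double_utf8(text):
    """Undo one extra layer of UTF-8 encoding (hypothetical helper)."""
    try:
        # Map each code point U+0000..U+00FF back to the byte of the same
        # value, then decode those bytes as the UTF-8 they originally were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not (cleanly) double-encoded: leave the string untouched.
        return text


wire = b"Rafa\xc3\x85\xc2\x82"       # bytes from the tcp dump in the question
once = wire.decode("utf-8")          # what the XML-RPC layer hands over
print(ascii(once))                   # 'Rafa\xc5\x82'
print(ascii(fix_double_utf8(once)))  # 'Rafa\u0142'
```

The `try`/`except` is a design choice for a stopgap: strings that were never double-encoded pass through unchanged. A short string could in principle be valid both ways, so this is probably best kept as a temporary workaround until the sender is fixed.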

answered Nov 06 '22 11:11

Ivan Baldin
