
How can I get unicode characters from a URL parameter?

I need to send JSON to my server in a GET request from a JavaScript client, so I started echoing responses back to make sure nothing is lost in translation. Plain ASCII text comes back unchanged, but as soon as I include a non-ASCII character (e.g. "ç"), it comes back escaped (e.g. "\u00e7"), so the returned value differs from the request value. My primary concerns are that A) my Python code correctly saves to the database what the client intended to send, and B) I echo back to the client the same values that were sent (when testing).

Perhaps this means I can't use base64, or have to do something different along the way. I'm ok with that. My implementation is just an attempt at a means to an end.

Current steps (any step can be changed, if needed):

Raw JSON string which I want to send to the server:

'{"weird-chars": "°ç"}'

JavaScript Base64 encoded version of the string passed to server via GET param (on a side note, will the equals sign at the end of the encoded string cause any issues?):

http://www.myserver.com/?json=eyJ3ZWlyZC1jaGFycyI6ICLCsMOnIn0=

Python str result from b64decode of param:

'{"weird-chars": "\xc2\xb0\xc3\xa7"}'

Python dict from json.loads of decoded param:

{'weird-chars': u'\xb0\xe7'}

Python str from json.dumps of that dict (and subsequent output to the browser):

'{"weird-chars": "\u00b0\u00e7"}'
orokusaki asked Dec 31 '25 01:12


2 Answers

Everything looks fine to me.

>>> hex(ord(u'°'))
'0xb0'
>>> hex(ord(u'ç'))
'0xe7'

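u'\xb0\xe7' is exactly the two characters you sent, and "\u00b0\u00e7" is the standard JSON escape for the same two characters. Perhaps you should decode the JSON on the client before attempting to use it.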
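As a minimal sketch of why the escaped output is equivalent (written for Python 3, where str is already Unicode; the variable names are only illustrative), the \uXXXX escapes that json.dumps emits by default decode back to the original characters:

```python
import json

# The two characters from the question: ° (U+00B0) and ç (U+00E7).
original = {"weird-chars": "\u00b0\u00e7"}

# By default (ensure_ascii=True) json.dumps escapes every non-ASCII character.
escaped = json.dumps(original)

# Parsing the escaped JSON gives back the very same characters.
round_tripped = json.loads(escaped)

print(escaped)
print(round_tripped == original)
```

A JSON parser on the client (for example JSON.parse in the browser) should treat the escaped and unescaped forms as the same string, so nothing is lost as long as the response is actually parsed as JSON.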

Ignacio Vazquez-Abrams answered Jan 02 '26 15:01


Your procedure is fine; you just need one more step: encoding from unicode to UTF-8 (or any other encoding that supports the 'weird characters').

Think of decoding as what you do to go from a regular string to unicode, and encoding as what you do to get back from unicode. In other words:

You decode a str to produce a unicode string

and encode a unicode string to produce a str.

So:

params = {'weird-chars': u'\xb0\xe7'}

encodedchars = params['weird-chars'].encode('utf-8')

encodedchars will contain your characters as a byte string in the selected encoding (in this case UTF-8, giving '\xc2\xb0\xc3\xa7').
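Putting the whole round trip together, here is a hedged sketch in Python 3 (where the Python 2 str/unicode pair becomes bytes/str). The names param, raw_bytes and body are illustrative, and passing ensure_ascii=False to json.dumps is one possible way to echo the characters unescaped, not something the question's code necessarily does:

```python
import base64
import json

# Base64 value taken from the ?json= parameter in the question.
param = "eyJ3ZWlyZC1jaGFycyI6ICLCsMOnIn0="

raw_bytes = base64.b64decode(param)   # UTF-8 encoded bytes
text = raw_bytes.decode("utf-8")      # decode: bytes -> Unicode text
data = json.loads(text)               # parse the JSON into a dict

# ensure_ascii=False keeps ° and ç as literal characters instead of \uXXXX
# escapes; encoding to UTF-8 then yields the bytes to send in the response.
body = json.dumps(data, ensure_ascii=False).encode("utf-8")

print(body == raw_bytes)
```

If the echoed body is sent this way, the response should also declare its charset as UTF-8 so the browser decodes the bytes correctly.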

Aphex answered Jan 02 '26 15:01


