
How do I decode escaped unicode javascript code in Python?

I have this string:

V posledn\u00edch m\u011bs\u00edc\u00edch se bezpe\u010dnostn\u00ed situace v Libyi zna\u010dn\u011b zhor\u0161ila, o \u010dem\u017e sv\u011bd\u010d\u00ed i ned\u00e1vn\u00e9 n\u00e1hl\u00e9 opu\u0161t\u011bn\u00ed zem\u011b nejen \u010desk\u00fdmi diplomaty. Libyi hroz\u00ed nekontrolovan\u00fd rozpad a nekone\u010d

Which should read "V posledních měsících se ..." so \u00ed is í and \u011b is ě.

Any idea how to decode this in Python? The string comes from JavaScript code that I am parsing in Python. I could write my own ad-hoc solution, since only a few characters are escaped (Czech has only twelve or so accented characters), but that seems ugly.

sup asked Dec 19 '22 11:12



2 Answers

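Decode it using the 'unicode-escape' codec. If x is your string, use x.decode('unicode-escape') in Python 2. In Python 3, str has no decode method, so encode the text to bytes first: x.encode('ascii').decode('unicode-escape').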
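
A minimal Python 3 sketch of this approach, using a shortened copy of the string from the question. The variable names raw and decoded are illustrative, and the sketch assumes the input is pure ASCII apart from the \uXXXX escapes:

```python
# Shortened sample from the question; the raw string keeps the backslashes literal.
raw = r"V posledn\u00edch m\u011bs\u00edc\u00edch"

# Python 3: go through bytes, because str has no .decode() method.
# encode("ascii") raises UnicodeEncodeError if the input already contains
# non-ASCII characters, which avoids silently producing mangled text.
decoded = raw.encode("ascii").decode("unicode_escape")

print(decoded)
```

Note that this codec also interprets other backslash escapes such as \n and \t, so it may not be appropriate if the text contains backslashes that should stay literal.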

BrenBarn answered Jan 19 '23 01:01



If it is JavaScript code, then perhaps it's actually JSON, and you can use json.loads to decode it.
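
A small sketch of the JSON route, under the assumption that the fragment is the body of a string literal whose surrounding double quotes were stripped and that it contains no unescaped double quotes. The variable names are illustrative. If the original source still has its quotes, or is a complete JSON document, it can be passed to json.loads directly:

```python
import json

# Shortened sample from the question; the raw string keeps the backslashes literal.
raw = r"V posledn\u00edch m\u011bs\u00edc\u00edch"

# Wrap the fragment in double quotes so it forms a valid JSON string literal,
# then let the JSON parser resolve the \uXXXX escapes.
decoded = json.loads('"' + raw + '"')

print(decoded)
```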

Ned Batchelder answered Jan 19 '23 01:01
