I am trying to take a chunk of JSON that has strings which contain the literal characters \u009e
and I would like to convert those characters to its associated single unicode character, in this case é
.
I use curl or wget to download the json which looks like:
{ "name": "Kitsun\u00e9" }
And need to translate this in Vim to:
{ "name": "Kitsuné" }
My first thought was to use Vim's iconv, but it does not evaluate the string as a single character and just returns the input.
let code = '\u00e9'
echo iconv(code, "UTF-8", "UTF-8")
" Prints \u00e9
I want to eventually use something like
%s;\\u[0-9abcdef]*;\=iconv(submatch(0),"UTF-8", "UTF-8");g
Use a decoder that interprets the '\u00c3' escapes as unicode code point U+00C3 (LATIN CAPITAL LETTER A WITH TILDE, 'Ã').
Unicode code converter. Type or paste text in the green box and click on the Convert button above it. Alternative representations will appear in all the other boxes. You can also do the same in any grey box, if you want to target only certain types of escaped text.
unichr() is named chr() in Python 3 (conversion to a Unicode character).
A unicode escape sequence is a backslash followed by the letter 'u' followed by four hexadecimal digits (0-9a-fA-F). It matches a character in the target sequence with the value specified by the four digits. For example, ”\u0041“ matches the target sequence ”A“ when the ASCII character encoding is used.
this line works for your example:
s#\\u[0-9a-f]*#\=eval('"'.submatch(0).'"')#
or
s#\v\\u([0-9a-f]{4})#\=nr2char(str2nr(submatch(1),16))#
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With