I'm trying to analyse Data in a Postgresql-Database which is a Json-Text but stored in a bytea-column. Some of the rows can be converted, but others not. Json without CodePoints inside the text works fine when casting the encoded bytea
select encode(myByteaColumn, 'escape')::json -> aJsonProperty as myProp from myTable
But for some rows this fails, because there are some Encoded CodePoints inside the string like the German Umlauts (Ä,Ö, etc.),
German words like Zuständigkeit
are shown as Zust\303\244ndigkeit
when encoding the bytea like this
select encode(myByteaColumn, 'escape') from myTable
The Database is set to UTF-8.
demo: db<>fiddle
convert_from()
works for me:
SELECT convert_from(decode('Zuständigkeit', 'escape'),'UTF8')
SELECT convert_from(decode('{"Zuständigkeit":"ABC"}', 'escape'),'UTF8')::jsonb -> 'Zuständigkeit'
As a side note, you were getting \303\244
, because according to this section in the PostgreSQL documentation, the encode(data bytea, 'escape')
function:
converts zero bytes and high-bit-set bytes to octal sequences (
\nnn
) and doubles backslashes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With