Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert escaped Unicode character back to actual character in PostgreSQL

Is there a way how I can convert the following string back to the human-readable value? I have some external data where all non-ascii characters are escaped.

Example strings:

16 StringProvider_111=Telefon\u00ED kontakty
17 StringProvider_116=Odpov\u011Bdn\u00E1 osoba

Required Result:

16 StringProvider_111=Telefoní kontakty
17 StringProvider_116=Odpovědná osoba

SQLFiddle

The database has UTF8 encoding and collation cs_CZ.UTF-8

like image 299
Tomas Greif Avatar asked Mar 21 '23 13:03

Tomas Greif


1 Answers

One old trick is using parser for this purpose:

postgres=# select e'Telefon\u00ED kontakty';
     ?column?      
-------------------
 Telefoní kontakty
(1 row)

CREATE OR REPLACE FUNCTION public.unescape(text)
RETURNS text
LANGUAGE plpgsql
AS $function$
DECLARE result text;
BEGIN
  EXECUTE format('SELECT e''%s''', $1) INTO result;
  RETURN result;
END;
$function$

It works, but it is SQL injection vulnerable - so you should to sanitize input text first!

Here is less readable, but safe version - but you have to manually specify one char as escape symbol:

CREATE OR REPLACE FUNCTION public.unescape(text, text) 
 RETURNS text
 LANGUAGE plpgsql
 AS $function$
 DECLARE result text;
 BEGIN
   EXECUTE format('SELECT U&%s UESCAPE %s', 
                         quote_literal(replace($1, '\u','^')),
                         quote_literal($2)) INTO result;
   RETURN result;
 END;
 $function$

Result

postgres=# select unescape('Odpov\u011Bdn\u00E1 osoba','^');
    unescape     
-----------------
 Odpovědná osoba
(1 row)
like image 160
Pavel Stehule Avatar answered Apr 06 '23 00:04

Pavel Stehule