
PL/SQL converting special characters

Tags: oracle, plsql

I'm creating a simple XML 1.0 file with the help of a short PL/SQL function filled with data from a table.

The table data also contains HTML special characters such as <, > and &. To handle them I've built a short search-and-replace routine that looks like this:

 newXmlString := REPLACE(xmlString,    '&',  '&amp;' );
 newXmlString := REPLACE(newXmlString, '\',  '' );
 newXmlString := REPLACE(newXmlString, '<',  '&lt;' );
 newXmlString := REPLACE(newXmlString, '>',  '&gt;' );
 newXmlString := REPLACE(newXmlString, '"',  '&quot;' );
 newXmlString := REPLACE(newXmlString, '''', '&apos;' );

Now the table contains more data, and the XML file no longer validates because of special control characters (https://en.wikipedia.org/wiki/Control_character) such as:

  • ETX (End of Text)
  • SYN (Synchronous Idle)

Note: Not every control character breaks the validation of an XML file! Line feeds and carriage returns are still allowed.

Of course I could search for and replace these as well, for example with:

newXmlString := REPLACE(newXmlString, chr(3), '' ); -- ETX end of text

But is there a built-in function or a library I can use from PL/SQL, so that I don't have to list every character and replace each one individually?

UPDATE 1

I also tried the function dbms_xmlgen.getxml, but it throws an error: 'special char to escaped char conversion failed.'

UPDATE 2

I tried REGEXP_REPLACE(STRING_VALUE,'[[:cntrl:]]'), which works, but it also deletes line breaks. We want to keep those, and they have no effect on the validation of an XML file.

frgtv10 asked Nov 08 '22 23:11


1 Answer

TRANSLATE is indeed the way to go. Build the search string with the CHR function and apply it in a single call. Here is an example for ETX (3), EOT (4) and SYN (22). You can append others when needed.

Notice the 'a' at the start of the search string, which is also the only character in the second string. TRANSLATE returns NULL when the second string is empty, so the function needs one character that is not eliminated.
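A small sketch of why the placeholder is needed, assuming server output is enabled (for example SET SERVEROUTPUT ON in SQL*Plus). The variable name v_text and the sample string are made up for illustration:

```sql
DECLARE
   -- Hypothetical sample value containing one ETX character.
   v_text  VARCHAR2( 20 ) := 'abc' || CHR( 3 ) || 'def';
BEGIN
   -- Oracle treats the empty string as NULL, so TRANSLATE returns NULL here.
   IF TRANSLATE( v_text, CHR( 3 ), '' ) IS NULL THEN
      DBMS_OUTPUT.PUT_LINE( 'no placeholder: NULL' );
   END IF;

   -- With the placeholder, 'a' maps to itself and CHR(3) is dropped.
   DBMS_OUTPUT.PUT_LINE( 'placeholder: ' || TRANSLATE( v_text, 'a' || CHR( 3 ), 'a' ) );
END;
/
```

Any character can serve as the placeholder, since it is simply mapped to itself.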

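FUNCTION clean_control( in_text IN VARCHAR2 )
   RETURN VARCHAR2
IS
   -- 'a' is a placeholder that maps to itself; ETX, EOT and SYN follow it.
   v_search  VARCHAR2( 30 ) := 'a' || CHR( 3 ) || CHR( 4 ) || CHR( 22 );
BEGIN
   -- Characters in v_search that have no counterpart in 'a' are removed.
   RETURN TRANSLATE( in_text, v_search, 'a' );
END;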
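If listing the characters one by one becomes tedious, the same idea can be extended to every control character that XML 1.0 rejects. XML 1.0 allows only TAB (9), LF (10) and CR (13) below code 32, so the sketch below builds the search string in a loop and skips those three. The function name clean_xml_control is hypothetical, and the standalone CREATE OR REPLACE form is an assumption; inside a package you would declare it like the function above.

```sql
CREATE OR REPLACE FUNCTION clean_xml_control( in_text IN VARCHAR2 )
   RETURN VARCHAR2
IS
   -- 'a' is the placeholder that maps to itself, as in clean_control.
   v_search  VARCHAR2( 40 ) := 'a';
BEGIN
   -- Collect codes 0..31 except TAB (9), LF (10) and CR (13),
   -- which XML 1.0 permits.
   FOR i IN 0 .. 31 LOOP
      IF i NOT IN ( 9, 10, 13 ) THEN
         v_search := v_search || CHR( i );
      END IF;
   END LOOP;

   -- Every collected character has no counterpart in 'a' and is removed.
   RETURN TRANSLATE( in_text, v_search, 'a' );
END clean_xml_control;
/

-- Usage: ETX is stripped, the line feed is kept.
BEGIN
   DBMS_OUTPUT.PUT_LINE( clean_xml_control( 'ab' || CHR( 3 ) || 'c' || CHR( 10 ) || 'd' ) );
END;
/
```

This only removes control characters. The entity replacements for &, <, > and the quotes from the question are still needed.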
hendrik_at_geislersoftware answered Nov 15 '22 06:11