We are currently migrating one of our Oracle databases to UTF8 and we have found a few records that are near the 4000 byte varchar limit. When we try to migrate these records they fail, as they contain characters that become multibyte UTF8 characters. What I want to do within PL/SQL is locate these characters to see what they are and then either change them or remove them.
I would like to do:
SELECT REGEXP_REPLACE(COLUMN,'[^[:ascii:]]','')
but Oracle does not implement the [:ascii:] character class.
Is there a simple way of doing what I want to do?
Alternatively, you can use regular expressions to find unwanted characters. Note that the class [A-Za-z0-9] covers only the ASCII letters and digits, not the whole ASCII set (spaces and punctuation are excluded). To find rows that contain any character outside that class, negate it inside the brackets, as in this MySQL query: mysql> SELECT * FROM data WHERE full_name REGEXP '[^A-Za-z0-9]';
How can I use regexp_like and regexp_replace to find non-printable characters in a string? I would like to remove # and € and all similar special characters that cannot be typed on the keyboard. select * from table1 where REGEXP_LIKE(column1, '[^A-Z^a-z^0-9^[^. ^{^}]' ,'x');
The Oracle REGEXP_REPLACE() function replaces a sequence of characters that matches a regular expression pattern with another string. The REGEXP_REPLACE() function is an advanced version of the REPLACE() function.
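As a small illustration of the description above, here is a sketch run against DUAL. The sample string is made up for the example; the third argument is the replacement text, and leaving it out (or passing an empty string) simply deletes the matches.

```sql
-- Replace every run of digits with a single dash.
-- The input literal is an invented sample, not data from the thread.
SELECT REGEXP_REPLACE('abc123def', '[0-9]+', '-') AS replaced
FROM   dual;

-- Same pattern with an empty replacement: the digits are removed.
SELECT REGEXP_REPLACE('abc123def', '[0-9]+', '') AS removed
FROM   dual;
```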
I think this will do the trick:
SELECT REGEXP_REPLACE(COLUMN, '[^[:print:]]', '')
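One caveat worth checking before relying on this: [:print:] tests whether a character is printable, not whether it is ASCII, so accented letters and similar characters may well survive. A cautious way to find out is to preview the effect on your own data first. The sketch below assumes a hypothetical table my_table with columns id and col; substitute your own names.

```sql
-- my_table, id and col are placeholder names, not from the original post.
-- Lists only the rows the pattern would change, with before and after values.
SELECT id,
       col,
       REGEXP_REPLACE(col, '[^[:print:]]', '') AS cleaned
FROM   my_table
WHERE  REGEXP_LIKE(col, '[^[:print:]]');
```

If the problem rows do not show up here, the offending characters are probably being treated as printable and a different approach is needed.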
If you use the ASCIISTR function to convert the Unicode to literals of the form \nnnn, you can then use REGEXP_REPLACE to strip those literals out, like so...
UPDATE table SET field = REGEXP_REPLACE(ASCIISTR(field), '\\[[:xdigit:]]{4}', '')
...where field and table are your field and table names respectively.
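Since the original goal was to see what the characters are before changing anything, it may be worth running a read-only query ahead of the update. The following is a sketch under the assumption of a hypothetical table my_table with columns id and col. It relies on the fact that ASCIISTR leaves pure-ASCII text unchanged, so any row where the two values differ contains something that was escaped.

```sql
-- my_table, id and col are placeholder names, not from the original post.
SELECT id,
       LENGTH(col)     AS char_len,   -- length in characters
       LENGTHB(col)    AS byte_len,   -- length in bytes
       ASCIISTR(col)   AS escaped,    -- non-ASCII shown as \XXXX escapes
       REGEXP_REPLACE(ASCIISTR(col), '\\[[:xdigit:]]{4}', '') AS stripped,
       DUMP(col, 1016) AS raw_bytes   -- hex bytes plus character set name
FROM   my_table
WHERE  col <> ASCIISTR(col);
```

Two things to keep in mind. ASCIISTR also escapes a literal backslash as \005C, so the update would remove backslashes along with the non-ASCII characters. The escaped string is also longer than the original, which matters for values already close to the 4000 byte limit; DUMP accepts optional start position and length arguments if you only want to inspect part of a long value.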