I'm working on a routine that compares strings, but for better efficiency I need to remove all characters that are not letters or numbers.
I'm using multiple REPLACE
functions now, but maybe there is a faster and nicer solution ?
A common solution to remove all non-alphanumeric characters from a String is with regular expressions. The idea is to use the regular expression [^A-Za-z0-9] to retain only alphanumeric characters in the string. You can also use [^\w] regular expression, which is equivalent to [^a-zA-Z_0-9] .
To remove all non-alphanumeric characters from a string, call the replace() method, passing it a regular expression that matches all non-alphanumeric characters as the first parameter and an empty string as the second. The replace method returns a new string with all matches replaced.
SELECT REGEXP_REPLACE(columnName, '[^\\x20-\\x7E]', '') from tableName; Count to do a safety check ... Then update This update is a catch all after the mapping update. Change the limit number to the count value above ...
Courtesy of michal.jakubeczy's answer below, replacing by Regex is now supported by MySQL:
UPDATE {table} SET {column} = REGEXP_REPLACE({column}, '[^0-9a-zA-Z ]', '')
Regex isn't supported here. I had to create my own function called alphanum which stripped the chars for me:
DROP FUNCTION IF EXISTS alphanum; DELIMITER | CREATE FUNCTION alphanum( str CHAR(255) ) RETURNS CHAR(255) DETERMINISTIC BEGIN DECLARE i, len SMALLINT DEFAULT 1; DECLARE ret CHAR(255) DEFAULT ''; DECLARE c CHAR(1); IF str IS NOT NULL THEN SET len = CHAR_LENGTH( str ); REPEAT BEGIN SET c = MID( str, i, 1 ); IF c REGEXP '[[:alnum:]]' THEN SET ret=CONCAT(ret,c); END IF; SET i = i + 1; END; UNTIL i > len END REPEAT; ELSE SET ret=''; END IF; RETURN ret; END | DELIMITER ;
Now I can do:
select 'This works finally!', alphanum('This works finally!');
and I get:
+---------------------+---------------------------------+ | This works finally! | alphanum('This works finally!') | +---------------------+---------------------------------+ | This works finally! | Thisworksfinally | +---------------------+---------------------------------+ 1 row in set (0.00 sec)
Hurray!
From a performance point of view, (and on the assumption that you read more than you write)
I think the best way would be to pre calculate and store a stripped version of the column, This way you do the transform less.
You can then put an index on the new column and get the database to do the work for you.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With