Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replace all non alphanumeric characters, new lines, and multiple white space with one space

People also ask

How do you replace non-alphanumeric characters?

replaceAll() method. A common solution to remove all non-alphanumeric characters from a String is with regular expressions. The idea is to use the regular expression [^A-Za-z0-9] to retain only alphanumeric characters in the string. You can also use [^\w] regular expression, which is equivalent to [^a-zA-Z_0-9] .

What is alphanumeric characters & white spaces?

Alphanumeric characters by definition only comprise the letters A to Z and the digits 0 to 9. Spaces and underscores are usually considered punctuation characters, so no, they shouldn't be allowed. If a field specifically says "alphanumeric characters, space and underscore", then they're included.

How do you replace a non-alphanumeric character in Python?

Short example re. sub(r'\W+', '_', 'bla: bla**(bla)') replaces one or more consecutive non-alphanumeric characters by an underscore.


Be aware, that \W leaves the underscore. A short equivalent for [^a-zA-Z0-9] would be [\W_]

text.replace(/[\W_]+/g," ");

\W is the negation of shorthand \w for [A-Za-z0-9_] word characters (including the underscore)

Example at regex101.com


Jonny 5 beat me to it. I was going to suggest using the \W+ without the \s as in text.replace(/\W+/g, " "). This covers white space as well.


Since [^a-z0-9] character class contains all that is not alnum, it contains white characters too!

 text.replace(/[^a-z0-9]+/gi, " ");

Well I think you just need to add a quantifier to each pattern. Also the carriage-return thing is a little funny:

text.replace(/[^a-z0-9]+|\s+/gmi, " ");

edit The \s thing matches \r and \n too.