I want to remove any non-alphanumeric character from a string, except for certain ones.
StringUtils.replacePattern(input, "\\p{Alnum}", "");
How can I also exclude those certain characters, like .-; ?
Use the not operator ^ at the start of a character class:
[^a-zA-Z0-9.\-;]+
This means "match what is not these characters". So:
StringUtils.replacePattern(input, "[^a-zA-Z0-9.\\-;]+", "");
Don't forget to properly escape the characters that need escaping: you need to use two backslashes (\\) because your regex is written as a Java string literal.
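As a quick check of the pattern above, here is a minimal sketch that uses the JDK's own String.replaceAll instead of StringUtils.replacePattern, so it runs without the Apache dependency. The class name, the clean helper and the sample input are made up for illustration.

```java
// Hypothetical demo class; only JDK classes are used.
class KeepSomePunctuation {

    // Removes every run of characters that is NOT an ASCII letter,
    // a digit, '.', '-' or ';'.
    static String clean(String input) {
        // "\\-" in the Java literal becomes "\-" in the regex: a literal hyphen.
        return input.replaceAll("[^a-zA-Z0-9.\\-;]+", "");
    }

    public static void main(String[] args) {
        System.out.println(clean("a_b c.d-e;f!")); // prints abc.d-e;f
    }
}
```

The underscore, the space and the exclamation mark are dropped, while the dot, the hyphen and the semicolon survive.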
You could negate your expression,
\p{Alnum}
by placing it in a negated character class:
[^\p{Alnum}]
That will match any non-alphanumeric character; you could then replace those with "". If you wanted to allow additional characters, you can just append them to the character class, e.g.:
[^\p{Alnum}\s]
will not match whitespace characters (\s).
If you were to replace
[^\p{Alnum}.;-]
with "", these characters will also be kept: '.', ';' and '-'.
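To see the negated \p{Alnum} class in action, here is a small sketch, again using String.replaceAll from the JDK rather than StringUtils. The class and method names are invented for the example. One thing worth knowing: in Java, \p{Alnum} is ASCII-only by default, so accented letters are removed as well unless the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS.

```java
// Hypothetical demo class; only JDK classes are used.
class NegatedAlnumDemo {

    // Keeps ASCII letters and digits plus '.', ';' and '-'.
    // The hyphen is last in the class, so it needs no escaping.
    static String keepAlnumAndPunctuation(String input) {
        return input.replaceAll("[^\\p{Alnum}.;-]", "");
    }

    // Keeps ASCII letters and digits plus whitespace.
    static String keepAlnumAndWhitespace(String input) {
        return input.replaceAll("[^\\p{Alnum}\\s]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepAlnumAndPunctuation("foo bar; baz-1.0 (x)!")); // prints foobar;baz-1.0x
        System.out.println(keepAlnumAndWhitespace("a b,c"));                  // prints a bc
        // \u00e9 is an accented 'e'; it is not in the ASCII-only \p{Alnum} class.
        System.out.println(keepAlnumAndPunctuation("caf\u00e9"));             // prints caf
    }
}
```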
StringUtils uses Java's standard Pattern class under the hood. If you don't want to import Apache's library and want it to run quicker (since it doesn't have to compile the regex each time it's used), you could do:
private static final Pattern NO_ODD_CHARACTERS = Pattern.compile("[^a-zA-Z0-9.\\-;]+");
...
String cleaned = NO_ODD_CHARACTERS.matcher(input).replaceAll("");
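For completeness, the snippet above could be wrapped into a self-contained class along these lines. This is only a sketch: the class name InputCleaner, the clean method and the sample inputs are assumptions, not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class around the precompiled pattern.
class InputCleaner {

    // Compiled once when the class is loaded and reused for every call.
    private static final Pattern NO_ODD_CHARACTERS =
            Pattern.compile("[^a-zA-Z0-9.\\-;]+");

    // Strips everything except ASCII letters, digits, '.', '-' and ';'.
    static String clean(String input) {
        return NO_ODD_CHARACTERS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(clean("user name@example.com;v1-2")); // prints usernameexample.com;v1-2
    }
}
```

Pattern instances are immutable and safe to share between threads, so a static final field is a reasonable home for it; each call creates its own Matcher.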