I want to remove all the special characters from a string except numbers and normal a-z characters.
I am doing it like this:
text = text.replaceAll("[^a-zA-Z0-9 ]+", "");
The problem with this approach is that it also removes all non-Latin characters, such as è, é, ê, ë and many others.
By non-special characters (the ones I want to keep) I mean all the numbers and all the alphabetical characters for all the languages or at least as many as possible.
How do I only remove the special characters?
You can try \p{L} for all letters and \p{N} for all numbers:
text = text.replaceAll("[^\\p{L}\\p{N} ]+", "");
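For illustration, here is a minimal, self-contained sketch of that replacement. The class name `StripSpecialChars`, the helper `stripSpecial`, and the sample string are made up for this example; only the pattern comes from the line above. The accented letters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
class StripSpecialChars {
    // Hypothetical helper: removes every run of characters that is not
    // a Unicode letter (\p{L}), a Unicode number (\p{N}) or a plain space.
    static String stripSpecial(String text) {
        return text.replaceAll("[^\\p{L}\\p{N} ]+", "");
    }

    public static void main(String[] args) {
        // Input is "Crème brûlée #1 (50% off!)"
        String input = "Cr\u00e8me br\u00fbl\u00e9e #1 (50% off!)";
        // The letters (including è, û, é), digits and spaces survive;
        // '#', '(', '%', '!' and ')' are dropped.
        System.out.println(stripSpecial(input));
    }
}
```

One caveat: if the input is in decomposed form (a base letter followed by a combining accent), the accent is a mark (\p{M}) rather than a letter and would be stripped. Normalizing first with `java.text.Normalizer.normalize(text, Normalizer.Form.NFC)`, or adding `\\p{M}` to the character class, should avoid that.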
I know you said regex, but if Guava is an option:
CharMatcher.JAVA_LETTER_OR_DIGIT.retainFrom("èêAAAGRt123")
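If adding Guava is not an option, roughly the same filtering can be sketched with the standard library alone. This is not the Guava API, just an assumed equivalent built on `Character.isLetterOrDigit`; the class name `RetainLettersAndDigits` and the method `retain` are hypothetical. Note that, like the Guava call, it also drops spaces.

```java
class RetainLettersAndDigits {
    // Hypothetical helper: keeps only code points that Java classifies
    // as a letter or a digit, dropping everything else (spaces included).
    static String retain(String text) {
        return text.codePoints()
                .filter(Character::isLetterOrDigit)
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        // Input is "èêAAAGRt 1-2_3!"
        System.out.println(retain("\u00e8\u00eaAAAGRt 1-2_3!"));
    }
}
```

Working on code points rather than `char` values means letters outside the Basic Multilingual Plane are tested as whole characters instead of as separate surrogates.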