Removing all characters but letters in a string

If I have a string "ja.v_,a", how can I remove all non-letter characters so that the result is "java"? I have tried str = str.replaceAll("\\W", ""); but to no avail.

Eragon20 asked Jan 04 '23 05:01


2 Answers

Could you try this one?

System.out.println("ja.v_,a".replaceAll("[^a-zA-Z]", "")); // java
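
For context on why the attempt in the question falls short: in Java's default regex mode \w is [a-zA-Z_0-9], so \W leaves underscores (and digits) in place. A minimal sketch comparing the two patterns; the class name NonWordDemo and the helper strip are hypothetical names used only for illustration:

```java
class NonWordDemo {
    // Removes every character matched by the given regex (hypothetical helper).
    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String input = "ja.v_,a";
        // \W does not match '_', because '_' counts as a word character.
        System.out.println(strip(input, "\\W"));       // jav_a
        // The negated ASCII letter class removes the underscore as well.
        System.out.println(strip(input, "[^a-zA-Z]")); // java
    }
}
```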
Roma Khomyshyn answered Jan 14 '23 06:01

I would like to refer to this article and quote it:

Regex examples and tutorials always give you the [a-zA-Z0-9]+ regex to "validate alphanumeric input". It is built-in in many validation frameworks. And it is so utterly wrong. This is a regex that must never appear anywhere in your code, unless you have a pretty good explanation. Yet, the example is ubiquitous. Instead, the right regex is [\p{L}0-9]+

So in your case it would be:

str = str.replaceAll("[^\\p{L}]", ""); // strings are immutable, so assign the result
System.out.println("ja.v_,a".replaceAll("[^\\p{L}]", "")); // java
System.out.println("сл-=о-_=во!".replaceAll("[^\\p{L}]", "")); // слово

Here \p{L} matches any character in the Unicode "letter" category, so letters outside the ASCII range are kept as well.
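
As a rough sketch of the difference, the following compares the ASCII-only class with \p{L} on a word containing an accented letter. The class name LetterFilter and the method lettersOnly are hypothetical names chosen for this example, and the accented letter is written as the escape \u00e9 so the source stays ASCII-only:

```java
import java.util.regex.Pattern;

class LetterFilter {
    // Matches any single character that is not a Unicode letter.
    private static final Pattern NON_LETTER = Pattern.compile("[^\\p{L}]");

    // Hypothetical helper: keeps only characters in the Unicode letter category.
    static String lettersOnly(String input) {
        return NON_LETTER.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String word = "caf\u00e9!"; // "cafe" with an acute accent on the e, plus '!'

        System.out.println(lettersOnly("ja.v_,a"));           // java
        System.out.println(lettersOnly(word));                // keeps the accented letter
        System.out.println(word.replaceAll("[^a-zA-Z]", "")); // caf (accented letter dropped)
    }
}
```

Compiling the pattern once is only a convenience for repeated calls; str.replaceAll("[^\\p{L}]", "") behaves the same way. Note that \p{L} does not match combining marks (category \p{M}), so text stored in decomposed form would lose its accents with this pattern.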

Mikhail Antonov answered Jan 14 '23 06:01

