
Regex to remove only special characters and not other language letters

I used a regular expression to remove special characters from a name. However, the expression removes every character except English letters and whitespace, so letters from other languages are lost too.

public class Main {
    public static void main(String[] args) {
        String name = "Özcan Sevim.";
        // Replace everything except ASCII letters and whitespace with a space
        name = name.replaceAll("[^a-zA-Z\\s]", " ").trim();
        System.out.println(name);
    }
}

Output:

zcan Sevim

Expected Output:

Özcan Sevim 

This gives the wrong result. I think the right approach is to remove special characters based on their ASCII codes, so that letters from other languages are not removed. Can someone help me with a regex that removes only special characters?

Ashok Kumar asked Jan 27 '23 21:01

1 Answer

You can use \p{IsLatin} (Latin-script characters) or \p{IsAlphabetic} (alphabetic characters of any script):

name = name.replaceAll("[^\\p{IsLatin}]", " ").trim();

Or, to remove only punctuation, use \p{Punct}:

name = name.replaceAll("\\p{Punct}", " ").trim();

Output:

Özcan Sevim
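
To compare the question's pattern with the two patterns above, here is a minimal sketch. The class name NameCleanerDemo and the helper clean are made up for illustration; only String.replaceAll and the regex constructs come from the Java standard library.

```java
class NameCleanerDemo {
    // Hypothetical helper: replaces every match of the regex with a space,
    // then trims leading and trailing whitespace.
    static String clean(String name, String regex) {
        return name.replaceAll(regex, " ").trim();
    }

    public static void main(String[] args) {
        // "Özcan Sevim." written with a Unicode escape to avoid source-encoding issues
        String name = "\u00D6zcan Sevim.";

        // ASCII letters and whitespace only: the Ö is replaced and trimmed away
        System.out.println(clean(name, "[^a-zA-Z\\s]"));
        // Everything that is not a Latin-script character becomes a space
        System.out.println(clean(name, "[^\\p{IsLatin}]"));
        // Only punctuation becomes a space
        System.out.println(clean(name, "\\p{Punct}"));
    }
}
```

Two things to keep in mind. The [^\p{IsLatin}] pattern also matches digits and the space itself; the words stay separated only because each space is replaced with another space. And without the UNICODE_CHARACTER_CLASS flag, \p{Punct} in Java covers only ASCII punctuation, so characters such as typographic quotes are left in place.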

Take a look at the full list under "Summary of regular-expression constructs" in the java.util.regex.Pattern documentation and use whichever construct fits your case.
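
As one example of another construct from that summary, \p{L} matches a letter in any script, which may fit better if the names are not all Latin. This is a sketch under that assumption, not part of the answer above; LetterFilterDemo and keepLetters are hypothetical names.

```java
class LetterFilterDemo {
    // Hypothetical helper: keeps letters of any script plus whitespace,
    // replaces every other character with a space, collapses runs of
    // whitespace into one space, and trims the ends.
    static String keepLetters(String text) {
        return text.replaceAll("[^\\p{L}\\s]", " ")
                   .replaceAll("\\s+", " ")
                   .trim();
    }

    public static void main(String[] args) {
        // "Özcan Sevim." with a Unicode escape for the first letter
        System.out.println(keepLetters("\u00D6zcan Sevim."));
        // A Cyrillic name followed by an exclamation mark
        System.out.println(keepLetters("\u0418\u0432\u0430\u043D!"));
    }
}
```

Unlike [^\p{IsLatin}], this variant keeps Cyrillic, Greek, and other letters, and it keeps whitespace explicitly instead of relying on the replacement character being a space.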

YCF_L answered Jan 31 '23 09:01
