
How to filter a Java String to get only alphabet characters?

Tags:

java

string

regex

I'm generating an XML file to make payments, and I have a constraint on users' full names. That parameter only accepts alphabetic characters (a-zA-Z) plus whitespace to separate names and surnames.

I can't find an easy way to filter this. How can I build a regular expression or a filter to get the desired output?

Example:

'Carmen López-Delina Santos' must be 'Carmen LopezDelina Santos'

I need to transform accented vowels into plain vowels as follows: á > a, à > a, â > a, and so on; and also remove special characters such as dots, hyphens, etc.

Thanks!

EnriMR asked Jun 11 '15 11:06



1 Answer

You can first use a Normalizer and then remove the undesired characters:

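import java.text.Normalizer;

String input = "Carmen López-Delina Santos";
// NFD splits each accented letter into a base letter plus combining marks
String withoutAccent = Normalizer.normalize(input, Normalizer.Form.NFD);
// Keep only ASCII letters and spaces; this also drops the combining marks
String output = withoutAccent.replaceAll("[^a-zA-Z ]", "");
System.out.println(output); // prints Carmen LopezDelina Santos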

Note that this does not work for every non-ASCII letter in every language: a letter that NFD cannot decompose into a base letter plus combining marks is simply deleted. One such example is the Turkish dotless ı.
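To make that limitation concrete, here is a minimal sketch that runs the same two steps on a name containing the dotless i. The class name NormalizerLimits and the method stripToAscii are made up for this illustration, and the non-ASCII letters are written as Unicode escapes only so the file compiles regardless of source encoding.

```java
import java.text.Normalizer;

public class NormalizerLimits {
    // Same two steps as above: decompose with NFD, then keep only ASCII letters and spaces.
    static String stripToAscii(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("[^a-zA-Z ]", "");
    }

    public static void main(String[] args) {
        // \u00f3 is o with acute: it decomposes to 'o' + a combining accent, so the 'o' survives.
        System.out.println(stripToAscii("L\u00f3pez"));         // prints Lopez
        // \u0131 is the dotless i: it has no decomposition, so the whole letter is removed.
        System.out.println(stripToAscii("Y\u0131ld\u0131z"));   // prints Yldz
    }
}
```

The second result silently loses two letters, which is probably not acceptable for a payment file, so it may be worth checking names from the languages you expect before relying on this approach alone.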

The alternative in that situation is probably to list all the possible letters and their replacement...
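One way this could look, as a rough sketch rather than a complete solution: apply an explicit replacement table first, then fall back to the Normalizer approach for everything else. The class NameSanitizer, the method sanitize and the table contents are all hypothetical, and the table is deliberately incomplete; it would need to be extended for whatever letters actually occur in your data.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

public class NameSanitizer {
    // Hypothetical, incomplete table of letters that NFD cannot decompose.
    private static final Map<Character, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put('\u0131', "i");   // dotless i
        REPLACEMENTS.put('\u00f8', "o");   // o with stroke
        REPLACEMENTS.put('\u00d8', "O");   // O with stroke
        REPLACEMENTS.put('\u00df', "ss");  // sharp s
        REPLACEMENTS.put('\u0142', "l");   // l with stroke
        REPLACEMENTS.put('\u0141', "L");   // L with stroke
    }

    static String sanitize(String input) {
        // Step 1: substitute the listed letters explicitly.
        StringBuilder mapped = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            String replacement = REPLACEMENTS.get(c);
            mapped.append(replacement != null ? replacement : String.valueOf(c));
        }
        // Step 2: decompose the remaining accented letters and drop everything
        // that is not an ASCII letter or a space.
        String decomposed = Normalizer.normalize(mapped.toString(), Normalizer.Form.NFD);
        return decomposed.replaceAll("[^a-zA-Z ]", "");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("Carmen L\u00f3pez-Delina Santos")); // prints Carmen LopezDelina Santos
        System.out.println(sanitize("Y\u0131ld\u0131z"));                // prints Yildiz
    }
}
```

Any letter that is neither in the table nor decomposable is still deleted, so this only narrows the gap; it does not close it.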

assylias answered Sep 19 '22 17:09
