
PHP regex for German full names with umlauts and some internationalisation

Dear Stackoverflowianers, Dear RegEx-Gurus,

I was searching the web for a regex pattern that checks the plausibility of a full name in German. I found many posts with patterns that do not handle German umlauts and so on. From all these posts and my own understanding, I have put together this pattern so far:

^([A-ZÖÄÜ]{0,1})([-a-zäöüß\.']{2,30})( {1}|-{1})([A-ZÄÖÜ]{0,1})([a-zäöüß']{0,30})( {1}|-{1})?([A-ZÖÄÜ]{0,1})([a-zäöüß']{0,30})(( {0,1}|-{1})([A-ZÖÄÜ]{0,1})([a-zäöüß']{0,30}))+$

It should match the following possible variations (status now)(expected):

  • "Hans Spitzer" (match)(yes)
  • "hans spitzer" (match)(yes)
  • "Hans-peter Österreicher" (match)(yes)
  • "Dr. Anna-Marie Pelzer-Hahnenkamp" (match)(yes)
  • "Dipl-Ing. Gerhard Meyer" (no-match)(no)
  • "Lisa-Maria Brandner-Kapeller" (match)(yes)
  • "John Mc'Connor" (match)(yes)
  • "John" (no-match)(yes)
  • "Johann " (match)(no)
  • "Osama Al Sawarri" (match)(yes)
  • "Frank F." (no-match)(yes)
  • "Johann F. Kerner" (no-match)(yes)
  • "Johann F Kerner" (match)(no)
  • "li xian" (match)(yes)
  • "Li Xian" (no-match)(no)
  • "Li Fu" (no-match)(no)
  • "li fu" (match)(yes)

(where "status now" means whether the name matches at the moment, and "expected" means whether it should match)

I need to use this pattern for preg_match in PHP.
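A list like the one above could be checked automatically against any candidate pattern. The following is only a minimal sketch: the helper name findMismatches and the placeholder pattern are assumptions made for illustration, and only a few of the cases from the list (with their expected values) are included.

```php
<?php
// Hypothetical helper: runs a candidate pattern against [name => expected]
// pairs and returns the names whose actual result differs from the expectation.
function findMismatches(string $pattern, array $cases): array
{
    $mismatches = [];
    foreach ($cases as $name => $shouldMatch) {
        // preg_match returns 1 on a match, 0 on no match, false on error.
        $matched = preg_match($pattern, $name) === 1;
        if ($matched !== $shouldMatch) {
            $mismatches[] = $name;
        }
    }
    return $mismatches;
}

// A few cases from the list above, using the "expected" column only.
$cases = [
    'Hans Spitzer'            => true,
    'John'                    => true,
    'Johann '                 => false,
    'Johann F. Kerner'        => true,
    'Dipl-Ing. Gerhard Meyer' => false,
];

// Placeholder pattern for illustration only, not a proposed solution.
// The u modifier makes PCRE treat the subject as UTF-8, which matters for umlauts.
$candidate = '/^\p{L}{2,}/u';

print_r(findMismatches($candidate, $cases));
```

An empty result would mean that the candidate agrees with every expectation in the list.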

I'd be so thankful if somebody could help me refine this pattern. As soon as it is (nearly) perfect, I will add it to http://gskinner.com/RegExr/ for public use (they have 2 or 3 full-name checks, but they work poorly or not at all).

Thx. in advance for your help...

Best regards, Ingmar

Ingmar Erdös asked Jul 19 '13 06:07


1 Answer

Given the vast range of perfectly valid names in use around the world, you should do the absolute minimum of validation on them. People with hyphens and apostrophes in their names get rightfully annoyed when they're told that their name is invalid.

Even trying to force initials to have a dot after them may be wrong, as there are plenty of people in the world with single-character names.

My advice would therefore be to not validate it at all.

However if you must do some kind of validation, then the best advice I can give is to stick to filtering out symbols that you definitely want to exclude, and avoid doing anything more complex than that.

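So a simple pattern might look like this (anchored so that every character of the name is checked, and with the u modifier so that UTF-8 input such as £ is handled as whole characters):

/^[^\$%\^\*£=~@]+$/u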

That will prevent the user from including symbols like $ or @ in their name, because yes, those are pretty implausible for a valid name. But make sure you do allow quote marks and hyphens, commas, and even brackets, because real people do have these characters in their names.
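One way to apply this kind of exclusion check with preg_match is to search for a forbidden symbol and reject the name if one is found. This is a minimal sketch, assuming UTF-8 input. The function name isPlausibleName and the rule that an empty name is rejected are assumptions added for illustration. They are not part of the advice above.

```php
<?php
// Hypothetical helper: returns true when $name is non-empty (assumption)
// and contains none of the excluded symbols.
function isPlausibleName(string $name): bool
{
    $name = trim($name);
    if ($name === '') {
        return false; // assumption: a blank name is not accepted
    }
    // preg_match returns 1 if a forbidden symbol is found, 0 if none is found,
    // and false on error (for example, invalid UTF-8 with the u modifier).
    // Only an explicit 0 counts as "no forbidden symbol present".
    return preg_match('/[$%^*£=~@]/u', $name) === 0;
}

var_dump(isPlausibleName("Anna-Marie O'Neill")); // bool(true)
var_dump(isPlausibleName('J@hn'));               // bool(false)
```

Searching for the forbidden characters and negating the result keeps the character list in one place, and everything not listed (hyphens, apostrophes, commas, brackets, umlauts) passes through untouched.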

Hope that helps.

Spudley answered Sep 29 '22 00:09