Kindergarten 101 teaches some of us that: "The letters in your name should be lowercase, with uppercase first letters." Yet in this post-literate era, how people enter their names in web forms seems to depend on their mood, or solar flares or whatnot: All uppercase, all lowercase, mixed, upside down...
Philosophically, I say whatever! Occupy your name, who cares. But I have OCD clients that prefer to see data normalized, standardized, predictable. So I'm asking you guys if you've seen any well-thought-out PHP functions for case-fixing names, that take into consideration the various exceptions that ucwords() would totally butcher, such as:
Any functions out there that attempt to accommodate these alphabet rebels?
UPDATE
From Robin v. G.'s point of van-tage, there can be no script to rule them all. But I've decided that names entered entirely in lower or uppercase are likely candidates for a good scrubbing. So for these, I will do ...
if ($name === strtoupper($name) || $name === strtolower($name)) {
    $name = ucwords(strtolower($name));
}
It would be easy enough to modify this to fix a few likely exceptions: dashes, apostrophes, 'McD', etc. Mistakes will be made, but who will complain? Not the meek bastard who entered their name in lowercase.
Oh wait, my name is in lowercase...
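A minimal sketch of how that check might be extended to cover dashes, apostrophes and 'Mc'. The function name scrubName and the "Mc" rule are assumptions made up for illustration, and it assumes ASCII names, since strtolower() and strtoupper() only deal reliably with ASCII letters.

```php
<?php
// Hypothetical helper: only touches names typed entirely in one case.
function scrubName(string $name): string
{
    // Mixed-case input is assumed to be deliberate, so leave it alone.
    if ($name !== strtoupper($name) && $name !== strtolower($name)) {
        return $name;
    }

    // Capitalise after spaces, hyphens and apostrophes.
    // The second argument to ucwords() needs PHP 5.4.32 / 5.5.16 or later.
    $name = ucwords(strtolower($name), " -'");

    // Assumption: a word starting with "Mc" gets its next letter uppercased.
    return preg_replace_callback(
        '/\bMc([a-z])/',
        function (array $m): string {
            return 'Mc' . strtoupper($m[1]);
        },
        $name
    );
}
```

With this, scrubName('RONALDO MCDONALDO') gives 'Ronaldo McDonaldo', while a mixed-case entry such as 'RONALDO McDonalDO' is returned untouched. The "Mc" rule will still get some names wrong, which is the trade-off accepted above.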
This is simply impossible.
Spelling of names varies from country to country, as you show in your question. The easiest way to go is to find the most common way of spelling, and that would be to capitalise the first letter of every 'word', i.e. every string preceded by a space, hyphen, dot or apostrophe.
This doesn't fix all your problems (YungCheng, McDonaldo) and leaves you with other issues as well, but that's as close as you're gonna get.
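The rule described above could be sketched roughly like this. The function name capitaliseWords is made up for illustration, and the sketch assumes plain ASCII names; accented letters would need the mb_* string functions instead.

```php
<?php
// Hypothetical helper for the "capitalise every word" rule.
// Assumes ASCII input: strtolower()/strtoupper() and [a-z] ignore accented letters.
function capitaliseWords(string $name): string
{
    $lower = strtolower($name);

    // Uppercase any letter at the start of the string or directly after
    // a space, dot, apostrophe or hyphen.
    return preg_replace_callback(
        '/(^|[ .\'-])([a-z])/',
        function (array $m): string {
            return $m[1] . strtoupper($m[2]);
        },
        $lower
    );
}
```

For example, capitaliseWords('j.r.r. tolkien') returns 'J.R.R. Tolkien', but capitaliseWords('RONALDO McDonalDO') returns 'Ronaldo Mcdonaldo', which is exactly the kind of case the rule cannot get right.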
Compare:
There's no algorithm fixing this.
This article illustrates the problem with Dutch names very well, and that's just one language. There's probably an article like this for every language in the world. ;)
Here is a try
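$names = array();
$names[] = "sven-alex crumpet";
$names[] = "RONALDO McDonalDO";
$names[] = "Boopsie o'Brien";
$names[] = "j.r. BOB DOBBS";
$names[] = "francesca DE LOS gatOS";
$names[] = "yungcheng LI";
$names[] = "mr hankey";
$names[] = "santas little helper";
$names[] = "j.r.r. tolkien";
$splitters = array(' ', '.', "'", '-'); // more to come
$fixedNames = array();
foreach ($names as $name) {
    $fixed = '';
    // Replace every splitter with '?' so the name can be cut into words
    // (assumes the name itself contains no '?').
    $blank = str_replace($splitters, '?', $name);
    $n = explode('?', $blank);
    // Capitalise each word; the appended space stands in for the splitter.
    foreach ($n as $f) $fixed .= ucfirst(strtolower($f)) . ' ';
    // Put the original splitter characters back in place.
    // Loop over $blank, because $fixed is one character longer (trailing space).
    for ($i = 0; $i < strlen($blank); $i++) {
        if ($fixed[$i] == ' ' && $blank[$i] == '?') {
            $fixed[$i] = $name[$i];
        }
    }
    // Drop the trailing space.
    $fixedNames[] = substr_replace($fixed, '', -1);
}
echo '<pre>';
print_r($fixedNames);
echo '</pre>';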
outputs
Array
(
    [0] => Sven-Alex Crumpet
    [1] => Ronaldo Mcdonaldo
    [2] => Boopsie O'Brien
    [3] => J.R. Bob Dobbs
    [4] => Francesca De Los Gatos
    [5] => Yungcheng Li
    [6] => Mr Hankey
    [7] => Santas Little Helper
    [8] => J.R.R. Tolkien
)
It is impossible to "correct" a name like YungCheng without algorithms taking care of regional / cultural conventions and a huge name database to compare with.
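To illustrate the database idea on a tiny scale, here is a sketch of an exception lookup applied after the generic capitalisation. The constant NAME_EXCEPTIONS, the function applyKnownExceptions and the two entries are hypothetical; a real list would be far larger and would depend on regional conventions.

```php
<?php
// Hypothetical exception list, keyed by the lowercased word.
const NAME_EXCEPTIONS = [
    'yungcheng' => 'YungCheng',
    'mcdonaldo' => 'McDonaldo',
];

// Swap any space-separated word found in the list for its preferred spelling.
function applyKnownExceptions(string $name, array $exceptions = NAME_EXCEPTIONS): string
{
    $words = explode(' ', $name);
    foreach ($words as $i => $word) {
        $key = strtolower($word);
        if (isset($exceptions[$key])) {
            $words[$i] = $exceptions[$key];
        }
    }
    return implode(' ', $words);
}
```

So applyKnownExceptions('Yungcheng Li') returns 'YungCheng Li', and anything not in the list passes through unchanged. It only knows what it has been told, which is why a real solution would need that huge database.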