Regex for names

Tags: regex, php

Just starting to explore the 'wonders' of regex. Being someone who learns from trial and error, I'm really struggling because my trials are throwing up a disproportionate amount of errors... My experiments are in PHP using ereg().

Anyway. I work with first and last names separately but for now using the same regex. So far I have:

^[A-Z][a-zA-Z]+$

Any length string that starts with a capital and has only letters (capital or not) for the rest. But where I fall apart is dealing with the special situations that can pretty much occur anywhere.

Hyphenated Names (Worthington-Smythe)
Names with Apostrophes (D'Angelo)
Names with Spaces (Van der Humpton) - whether the capitals in the middle are required is way beyond my interest at this stage.
Joint Names (Ben & Jerry)

Maybe there's some other form a name can take that I'm not thinking of, but I suspect if I can get my head around this, I can add to it. I'm pretty sure there will be instances where more than one of these situations comes up in one name.

So, I think the bottom line is to have my regex also accept a space, hyphens, ampersands and apostrophes - but not at the start or end of the name to be technically correct.


asked Nov 08 '08 20:11

Humpton

2 Answers

This regex is perfect for me.

^([ \u00c0-\u01ffa-zA-Z'\-])+$

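Note that \uXXXX is JavaScript-style escape syntax, so the pattern doesn't work everywhere as written. For PHP's preg_match() (PCRE), write the range as \x{00c0}-\x{01ff} and add the u modifier.

It matches Jérémie O'Co-nor, so it handles accented Latin names. The range U+00C0 to U+01FF covers Latin letters with diacritics (plus × and ÷), not other scripts such as Cyrillic or Greek, so it does not match every UTF-8 name.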
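Here's a minimal sketch of how that pattern could be spelled for preg_match(). The helper name looks_like_latin_name is made up for illustration, and the snippet assumes the source file and the input are both UTF-8.

```php
<?php
// Hypothetical helper: true if $name contains only spaces, hyphens,
// apostrophes, ASCII letters, or code points U+00C0..U+01FF.
function looks_like_latin_name(string $name): bool
{
    // PCRE has no \uXXXX escape; \x{...} plus the "u" modifier is the PHP spelling.
    // Single quotes keep the backslashes intact; only \' is unescaped by PHP.
    $pattern = '/^[ \x{00c0}-\x{01ff}a-zA-Z\'\-]+$/u';
    return preg_match($pattern, $name) === 1;
}

var_dump(looks_like_latin_name("Jérémie O'Co-nor")); // accented Latin letters are in range
var_dump(looks_like_latin_name("Иван"));             // Cyrillic is outside U+00C0..U+01FF
```

Note this class still accepts a name made only of spaces or hyphens, so it is a character whitelist rather than a shape check.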


answered Oct 05 '22 19:10

Daan

Hyphenated Names (Worthington-Smythe)

Add a - to the second character class. The easiest way is to put it first, so that it can't possibly be interpreted as part of a range (as in a-z).

^[A-Z][-a-zA-Z]+$
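A quick way to try this out, sketched with preg_match() rather than ereg() (which was removed in PHP 7). The function names are invented for the example, and the second, stricter pattern is my own assumption about how to keep hyphens off the end of the name; it is not part of the pattern above.

```php
<?php
// Pattern from above: the hyphen comes first in the class, so it is a literal.
function hyphen_loose(string $name): bool
{
    return preg_match('/^[A-Z][-a-zA-Z]+$/', $name) === 1;
}

// Assumed stricter variant: hyphens are only allowed between runs of letters.
function hyphen_strict(string $name): bool
{
    return preg_match('/^[A-Z][a-zA-Z]+(-[a-zA-Z]+)*$/', $name) === 1;
}

var_dump(hyphen_loose("Worthington-Smythe"));  // hyphen in the middle
var_dump(hyphen_loose("Smythe-"));             // trailing hyphen also passes the loose class
var_dump(hyphen_strict("Smythe-"));            // strict variant rejects it
```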

Names with Apostrophes (D'Angelo)

A naive way of doing this would be as above, giving:

^[A-Z][-'a-zA-Z]+$

Don't forget you may need to escape the apostrophe if the pattern sits inside a single-quoted string! A 'better' way, given your example, might be:

^[A-Z]'?[-a-zA-Z]+$

Which will allow a possible single apostrophe in the second position.
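To see the difference between the two apostrophe patterns, here is a rough comparison using preg_match(). The function names are placeholders; the patterns are the two shown above, with the apostrophe escaped because the PHP strings are single-quoted.

```php
<?php
// Naive version: an apostrophe may appear anywhere after the first letter.
function apostrophe_naive(string $name): bool
{
    return preg_match('/^[A-Z][-\'a-zA-Z]+$/', $name) === 1;
}

// Second-position version: at most one apostrophe, right after the capital.
function apostrophe_second_position(string $name): bool
{
    return preg_match('/^[A-Z]\'?[-a-zA-Z]+$/', $name) === 1;
}

var_dump(apostrophe_naive("D'Angelo"));
var_dump(apostrophe_second_position("D'Angelo"));
var_dump(apostrophe_naive("O''Brien'"));            // stray apostrophes slip through
var_dump(apostrophe_second_position("O''Brien'"));  // rejected
```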

Names with Spaces (Van der Humpton) - capitals in the middle which may or may not be required is way beyond my interest at this stage.

Here I'd be tempted to just do our naive way again:

^[A-Z]'?[- a-zA-Z]+$

A potentially better way might be:

^[A-Z]'?[-a-zA-Z]+( [a-zA-Z]+)*$

Which looks for extra words at the end. This probably isn't a good idea if you're trying to match names in a body of extra text, but then again, the original wouldn't have done that well either.
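A small check of the word-by-word idea, written with a + after each letter class so that every word can be longer than one letter, and with the space kept out of the first class so spaces only appear between words. The helper name is_spaced_name is just for illustration.

```php
<?php
// Hypothetical helper: a capitalised first word (optional apostrophe in the
// second position, hyphens allowed), then zero or more space-separated words.
function is_spaced_name(string $name): bool
{
    return preg_match('/^[A-Z]\'?[-a-zA-Z]+( [a-zA-Z]+)*$/', $name) === 1;
}

var_dump(is_spaced_name("Van der Humpton"));  // three words, single spaces
var_dump(is_spaced_name("Van  der"));         // double space is rejected
var_dump(is_spaced_name("Van "));             // trailing space is rejected
```

Because each extra word has to start with exactly one space and contain at least one letter, leading, trailing and doubled spaces all fail without any extra checks.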

Joint Names (Ben & Jerry)

At this point you're not looking at single names anymore?
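If you do want to accept them, one possible approach (an assumption on my part, not something the single-name regex needs to grow for) is to split on the ampersand first and validate each half as an ordinary name. Both function names below are hypothetical.

```php
<?php
// Hypothetical single-name check, reusing the space-aware pattern idea.
function is_single_name(string $name): bool
{
    return preg_match('/^[A-Z]\'?[-a-zA-Z]+( [a-zA-Z]+)*$/', $name) === 1;
}

// Split on "&" (with optional surrounding whitespace) and check every part.
function is_joint_name(string $input): bool
{
    $parts = preg_split('/\s*&\s*/', $input);
    foreach ($parts as $part) {
        if (!is_single_name($part)) {
            return false; // an empty part means "&" was at the start or end
        }
    }
    return true;
}

var_dump(is_joint_name("Ben & Jerry"));
var_dump(is_joint_name("& Jerry"));
```

This keeps the ampersand out of the character classes entirely, so it can never appear at the start or end of an accepted name.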

Anyway, as you can see, regexes have a habit of growing very quickly...

answered Oct 05 '22 17:10

Matthew Scharley
