A lot of contact management programs do this - you type in a name (e.g., "John W. Smith") and it automatically breaks it up internally into:
First name: John
Middle name: W.
Last name: Smith
Likewise, it figures out things like "Mrs. Jane W. Smith" and "Dr. John Doe, Jr." correctly as well (assuming you allow for fields like "prefix" and "suffix" in names).
I assume this is a fairly common things that people would want to do... so the question is... how would you do it? Is there a simple algorithm for this? Maybe a regular expression?
I'm after a .NET solution, but I'm not picky.
Update: I appreciate that there is no simple solution for this that covers ALL edge cases and cultures... but let's say for the sake of argument that you need the name in pieces (filling out forms - as in, say, tax or other government forms - is one case where you are bound to enter the name into fixed fields, whether you like it or not), but you don't necessarily want to force the user to enter their name into discrete fields (less typing = easier for novice users).
You'd want to have the program "guess" (as best it can) on what's first, middle, last, etc. If you can, look at how Microsoft Outlook does this for contacts - it lets you type in the name, but if you need to clarify, there's an extra little window you can open. I'd do the same thing - give the user the window in case they want to enter the name in discrete pieces - but allow for entering the name in one box and doing a "best guess" that covers most common names.
Name parsing consists of separating names into their given name and surname components and identifying titles and qualifiers, such as Mr. and Jr. You parse names as one of the first steps to scoring names to increase the likelihood that each name component is analyzed correctly.
To parse, in computer science, is where a string of commands – usually a program – is separated into more easily processed components, which are analyzed for correct syntax and then attached to tags that define each component.
A parser is a Java class that extracts attributes from a local file and stores the information in the repository. More specifically, in the case of a document, a parser: Takes in an InputStream or Reader object. Processes the character input, extracting attributes as it goes.
If you must do this parsing, I'm sure you'll get lots of good suggestions here.
My suggestion is - don't do this parsing.
Instead, create your input fields so that the information is already separated out. Have separate fields for title, first name, middle initial, last name, suffix, etc.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With