
Simple way to parse a person's name into its component parts? [closed]

A lot of contact management programs do this - you type in a name (e.g., "John W. Smith") and it automatically breaks it up internally into:

First name: John
Middle name: W.
Last name: Smith

Likewise, it figures out things like "Mrs. Jane W. Smith" and "Dr. John Doe, Jr." correctly as well (assuming you allow for fields like "prefix" and "suffix" in names).

I assume this is a fairly common thing that people would want to do, so the question is: how would you do it? Is there a simple algorithm for this? Maybe a regular expression?

I'm after a .NET solution, but I'm not picky.

Update: I appreciate that no simple solution covers ALL edge cases and cultures. But let's say, for the sake of argument, that you need the name in pieces (filling out tax or other government forms is one case where you are bound to enter the name into fixed fields, whether you like it or not), and you don't want to force the user to enter their name into discrete fields (less typing = easier for novice users).

You'd want to have the program "guess" (as best it can) on what's first, middle, last, etc. If you can, look at how Microsoft Outlook does this for contacts - it lets you type in the name, but if you need to clarify, there's an extra little window you can open. I'd do the same thing - give the user the window in case they want to enter the name in discrete pieces - but allow for entering the name in one box and doing a "best guess" that covers most common names.
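For illustration, here is a minimal sketch of the kind of "best guess" I have in mind, written in Python rather than .NET. The names `ParsedName`, `parse_name`, `PREFIXES` and `SUFFIXES` are hypothetical, the lookup tables are deliberately tiny, and it assumes Western "given name first" order, so it is a starting point rather than a complete solution.

```python
from dataclasses import dataclass

# Hypothetical lookup tables (assumptions, not exhaustive); extend as needed.
PREFIXES = {"mr", "mrs", "ms", "miss", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd", "md"}


@dataclass
class ParsedName:
    prefix: str = ""
    first: str = ""
    middle: str = ""
    last: str = ""
    suffix: str = ""


def _key(token: str) -> str:
    """Normalize a token for table lookup: drop trailing dots, lowercase."""
    return token.rstrip(".").lower()


def parse_name(full_name: str) -> ParsedName:
    """Best-guess split of a name typed into a single box."""
    # Treat commas as whitespace so "Doe, Jr." becomes "Doe" and "Jr."
    tokens = full_name.replace(",", " ").split()
    result = ParsedName()

    # A leading token found in PREFIXES is taken as the prefix.
    if tokens and _key(tokens[0]) in PREFIXES:
        result.prefix = tokens.pop(0)

    # A trailing token found in SUFFIXES is taken as the suffix,
    # unless it is the only token left.
    if len(tokens) > 1 and _key(tokens[-1]) in SUFFIXES:
        result.suffix = tokens.pop()

    # First remaining token is the first name, last remaining token is
    # the last name, and anything in between is the middle name.
    if tokens:
        result.first = tokens.pop(0)
    if tokens:
        result.last = tokens.pop()
    result.middle = " ".join(tokens)
    return result
```

With these tables, `parse_name("Dr. John Doe, Jr.")` yields prefix "Dr.", first "John", last "Doe" and suffix "Jr.". Input such as "Smith, John" or multi-word surnames like "van der Berg" would be guessed wrong, which is exactly where the extra clarification window would come in.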

Keithius asked Sep 19 '08 16:09




1 Answer

If you must do this parsing, I'm sure you'll get lots of good suggestions here.

My suggestion is - don't do this parsing.

Instead, create your input fields so that the information is already separated out. Have separate fields for title, first name, middle initial, last name, suffix, etc.
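As a rough sketch of what that looks like on the storage side (Python here, with a hypothetical `ContactName` class and field names chosen only for illustration), each input field maps to its own attribute, and the full name is assembled from the parts instead of being parsed out of a string:

```python
from dataclasses import dataclass


@dataclass
class ContactName:
    # One attribute per input field; all names here are illustrative.
    title: str = ""
    first_name: str = ""
    middle_initial: str = ""
    last_name: str = ""
    suffix: str = ""

    def display_name(self) -> str:
        """Join the non-empty parts; the suffix is set off with a comma."""
        parts = [self.title, self.first_name, self.middle_initial, self.last_name]
        name = " ".join(p for p in parts if p)
        return f"{name}, {self.suffix}" if self.suffix else name
```

Going from separate fields to a display string is trivial and unambiguous; going the other way is where the guessing starts.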

shadit answered Sep 22 '22 13:09
