
Name comparison algorithm

Tags: algorithm, php

I need to check whether a name appears on an anti-terrorism list.

In addition to the given name, I also want to search for similar names (possible aliases).

Example:
given name => Bin Laden: alert!
given name => Ben Larden: hmm, suspicious name, matches Bin Laden at xx%

How can I do this?

  • using PHP
  • the names on the list are 100% correct, since they come from official sources
  • I'm Italian, but I don't think this will be a problem, since the names are international
  • names can be composed of several words, e.g. Najmiddin Kamolitdinovich JALOLOV
  • I'm looking for both companies and people

I looked at different algorithms. Do you think Levenshtein can do the job?
Thank you in advance!

PS: I had some problems formatting this text, sorry :-)

tampe125 asked Oct 24 '10 11:10



1 Answer

I'd say your best bet for getting this working with PHP's native functions is one of:

  • soundex() - Calculate the soundex key of a string
  • levenshtein() - Calculate Levenshtein distance between two strings
  • metaphone() - Calculate the metaphone key of a string
  • similar_text() - Calculate the similarity between two strings
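
As a rough sketch of how these four functions could be combined, the snippet below compares the two names from the question. The helper name nameSimilarity() and the shape of the array it returns are made up for illustration; only the four built-in functions come from PHP itself. Keep in mind that all four work on bytes, so names containing non-ASCII characters may need to be transliterated first.

```php
<?php
// Hypothetical helper (not part of PHP): compares a candidate name with a
// listed name using PHP's four native string-similarity functions.
function nameSimilarity(string $candidate, string $listed): array
{
    // Normalise case and whitespace so "JALOLOV" and "Jalolov" compare equal.
    $a = strtolower(trim(preg_replace('/\s+/', ' ', $candidate)));
    $b = strtolower(trim(preg_replace('/\s+/', ' ', $listed)));

    // Edit distance, expressed as a percentage of the longer string.
    $distance = levenshtein($a, $b);
    $maxLen = max(strlen($a), strlen($b), 1);
    $levPercent = (1.0 - $distance / $maxLen) * 100.0;

    // similar_text() reports its own percentage through the third argument.
    $simPercent = 0.0;
    similar_text($a, $b, $simPercent);

    return [
        'levenshtein_distance' => $distance,
        'levenshtein_percent'  => round($levPercent, 1),
        'similar_text_percent' => round($simPercent, 1),
        // Phonetic keys either match or they do not; there is no percentage.
        'same_soundex'         => soundex($a) === soundex($b),
        'same_metaphone'       => metaphone($a) === metaphone($b),
    ];
}

print_r(nameSimilarity('Ben Larden', 'Bin Laden'));
```

For "Ben Larden" against "Bin Laden" the Levenshtein distance is 2 (one substitution and one deletion), which works out to 80% of the longer, 10-character name. What percentage should count as suspicious is something you would have to tune against your own list.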

Since you are presumably matching the names against a database, you might also want to check whether your database provides any name matching functions.

A Google search also turned up a PDF with a nice overview of name matching algorithms:

  • http://homepages.cs.ncl.ac.uk/brian.randell/Genealogy/NameMatching.pdf
Gordon answered Sep 29 '22 11:09
