How to recognize similar words that differ in spelling

Tags: c#, sql, pattern-matching, linq-to-sql

I want to filter out duplicate customer names from a database. A single customer may have more than one entry in the system, each with the same name spelled slightly differently. For example, a customer named Brook may have three entries with these variations:

Brook Berta
Bruck Berta
Biruk Berta

Let's assume the name is stored in a single database column. I would like to know the different mechanisms for identifying such duplicates among, say, 100,000 records. We could use regular expressions in C# to iterate through all the records, use some other pattern-matching technique, or export the records to whatever best fits such queries (for example, SQL with regular-expression capabilities).

This is what I had in mind as a solution:

Write C# code to iterate through each record.
Keep only the consonants, in order (in the example above: BrKBrt).
Search the other records for the same consonant pattern, treating similar-sounding letters such as (C, K), (C, S) and (F, PH) as equivalent.
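
The three steps above could be sketched roughly as follows. This is only an illustration: the class and method names (ConsonantKey, Of, FindLikelyDuplicates) are hypothetical, and the folding rules used here (PH becomes F, C becomes K, Y is dropped along with the vowels, doubled letters are collapsed) are assumptions chosen to make the example names match, not a complete phonetic rule set. In particular, the (C, S) pair is not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ConsonantKey
{
    // Builds a consonant "skeleton" for a name, e.g. "Brook Berta" -> "BRKBRT".
    // Assumed folding rules: PH -> F, C -> K, vowels and Y dropped,
    // directly adjacent identical letters collapsed (so "CK" -> "K").
    public static string Of(string name)
    {
        string s = (name ?? "").ToUpperInvariant().Replace("PH", "F");
        var key = new StringBuilder();
        char prev = '\0';
        foreach (char raw in s)
        {
            if (raw < 'A' || raw > 'Z')
            {
                prev = '\0';            // spaces and punctuation break a run
                continue;
            }
            char c = raw == 'C' ? 'K' : raw;
            bool repeated = c == prev;  // same letter as the one just before it
            prev = c;
            if (repeated) continue;
            if ("AEIOUY".IndexOf(c) >= 0) continue; // drop vowels (and Y)
            key.Append(c);
        }
        return key.ToString();
    }

    // Groups names by their key and keeps only groups with more than one member.
    public static List<List<string>> FindLikelyDuplicates(IEnumerable<string> names)
    {
        return names.GroupBy(n => Of(n))
                    .Where(g => g.Count() > 1)
                    .Select(g => g.ToList())
                    .ToList();
    }
}
```

Because every record is reduced to a key once and then grouped by that key, the records never have to be compared pair by pair, which matters at 100,000 rows. The groups are only candidates for duplicates and would still need a manual or stricter check.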

Any ideas would be appreciated.


asked Jun 22 '10 07:06

Elias Haileselassie

2 Answers

The Double Metaphone algorithm, published in 2000, is a newer, improved alternative to the Soundex algorithm, which was patented in 1918.

The article has links to Double Metaphone implementations in many languages.

answered Nov 14 '22 23:11

Ray Burns

Have a look at Soundex.

There is a SOUNDEX function in Transact-SQL (see http://msdn.microsoft.com/en-us/library/ms187384.aspx):

SELECT
    SOUNDEX('brook berta'),
    SOUNDEX('Bruck Berta'),
    SOUNDEX('Biruk Berta');

This returns the same value, B620, for each of the example values.
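
If the comparison has to happen in C# rather than in the database, a basic American Soundex can be written in a few lines. The sketch below is not the T-SQL implementation: the class name SimpleSoundex is hypothetical, and stopping at the first non-letter (so that only "Brook" in "Brook Berta" is encoded) is an assumption made here because it is consistent with the B620 result quoted above.

```csharp
using System;
using System.Text;

public static class SimpleSoundex
{
    // Soundex digit for each letter A..Z; '0' marks letters that carry
    // no code of their own (vowels, H, W, Y).
    private const string Codes = "01230120022455012623010202";

    public static string Encode(string name)
    {
        string s = (name ?? "").Trim().ToUpperInvariant();
        var result = new StringBuilder(4);
        char lastCode = '0';
        foreach (char c in s)
        {
            if (c < 'A' || c > 'Z') break;       // assumption: stop at first non-letter
            char code = Codes[c - 'A'];
            if (result.Length == 0)
            {
                result.Append(c);                // the first letter is kept as-is
                lastCode = code;
                continue;
            }
            if (c == 'H' || c == 'W') continue;  // H and W do not separate equal codes
            if (code != '0' && code != lastCode) result.Append(code);
            lastCode = code;                     // a vowel resets the run of equal codes
            if (result.Length == 4) break;
        }
        return result.ToString().PadRight(4, '0'); // pad short codes with zeros
    }
}
```

With this sketch, SimpleSoundex.Encode gives B620 for "brook berta", "Bruck Berta" and "Biruk Berta". Since only the first word is encoded, two customers with similar first names but different surnames would also collide, so encoding each word separately may be a better fit for full names.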

answered Nov 14 '22 21:11

Mario Menger
