Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server : fix misspelled business names

I'm looking for advice on how to tackle the issue of different spelling for the same name. I have a SQL Server database with company names, and there are some companies that are the same but the spelling is different.

For example:

Building Supplies pty
Buidings Supplies pty
Building Supplied l/d 

The problem is that there are no clear consistencies in the variation. Sometimes it's an extra 's', other times its an extra space.

Unfortunately I don't have a lookup list, so I can't use Fuzzy LookUp. I need to create the clean list.

Is there a method that people use to deal with this problem?

p.s I tried searching for this problem but can't seem to find a similar thread

Thanks

like image 273
jmich738 Avatar asked Oct 20 '22 23:10

jmich738


1 Answers

You can use SOUNDEX() DIFFERENCE() for this purpose.

DECLARE @SampleData TABLE(ID INT, BLD VARCHAR(50), SUP VARCHAR(50))

INSERT INTO @SampleData 
SELECT 1, 'Building','Supplies'
UNION
SELECT 2, 'Buidings','Supplies'
UNION
SELECT 3, 'Biulding','Supplied'
UNION
SELECT 4, 'Road','Contractor'
UNION
SELECT 5, 'Raod','Consractor'
UNION
SELECT 6, 'Highway','Supplies'

SELECT *, DIFFERENCE('Building', BLD) AS DIF  
FROM @SampleData
WHERE DIFFERENCE('Building', BLD) >= 3

Result

ID  BLD         SUP        DIF
1   Building    Supplies    4
2   Buidings    Supplies    3
3   Biulding    Supplied    4

If this serves your purpose you can write an update query to update selected record accordingly.

like image 100
CoderHawk Avatar answered Oct 23 '22 03:10

CoderHawk