Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to group by column having spelling mistakes

While working with some legacy data, I want to group the data on a column ignoring spelling mistakes. I think SOUNDEX() could do the job to achieve the desired result. Here is what I tried:

SELECT soundex(AREA)
FROM MASTER
GROUP BY soundex(AREA)
ORDER BY soundex(AREA)

But (obviously) the SOUNDEX returned 4-character code in result rows like this, loosing actual strings:

A131
A200
A236

How could I include at least one occurrence from the group into the query result instead of 4-character code.

like image 954
Khadim Ali Avatar asked May 08 '13 14:05

Khadim Ali


1 Answers

SELECT soundex(AREA) as snd_AREA, min(AREA) as AREA_EXAMPLE_1, max(AREA) as AREA_EXAMPLE_2
from MASTER
group by soundex(AREA)
order by AREA_EXAMPLE_1
;

In MySQL you could select group_concat(distinct AREA) as list_area to get all the versions, and I don't know about that in SQL-Server, but min and max give two examples of the areas, and you wanted to discard the diffs anyway.

like image 141
flaschenpost Avatar answered Dec 01 '22 23:12

flaschenpost