Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL: How to merge case-insensitive duplicates

What would be the best way to remove duplicates while merging their records into one?

I have a situation where the table keeps track of player names and their records like this:

stats
-------------------------------
nick     totalgames     wins   ...
John     100            40
john     200            97
Whistle  50             47
wHiStLe  75             72
...

I would need to merge the rows where nick is duplicated (when ignoring case) and merge the records into one, like this:

    stats
    -------------------------------
    nick     totalgames     wins   ...
    john     300            137
    whistle  125            119
    ...

I'm doing this in Postgres. What would be the best way to do this?

I know that I can get the names where duplicates exist by doing this:

select lower(nick) as nick, totalgames, count(*) 
from stats 
group by lower(nick), totalgames
having count(*) > 1;

I thought of something like this:

update stats
set totalgames = totalgames + s.totalgames
from (that query up there) s
where lower(nick) = s.nick

Except this doesn't work properly. And I still can't seem to be able to delete the other duplicate rows containing the duplicate names. What can I do? Any suggestions?

like image 696
Joe Avatar asked Aug 17 '13 04:08

Joe


People also ask

Is SQL merge case sensitive?

In SQL Server, joins are case-insensitive. Case-sensitive collations are not supported with ArcGIS.

How do I ignore duplicates in SQL query?

Use the INSERT IGNORE command rather than the INSERT command. If a record doesn't duplicate an existing record, then MySQL inserts it as usual. If the record is a duplicate, then the IGNORE keyword tells MySQL to discard it silently without generating an error.

Does SQL intersect remove duplicates?

UNION, EXCEPT, and INTERSECT are operators that operate similarly and can come between two queries to form Boolean combinations between the results of the two queries. Tables returned by UNION, EXCEPT and INTERSECT by default will have duplicate records eliminated. Use ALL to retain duplicates in the results table.


1 Answers

I think easiest way to do it in one query would be using common table expressions:

with cte as (
    delete from stats
    where lower(nick) in (
      select lower(nick) from stats group by lower(nick) having count(*) > 1
    )
    returning *
)
insert into stats(nick, totalgames, wins)
select lower(nick), sum(totalgames), sum(wins)
from cte
group by lower(nick);

As you see, inside the cte I'm deleting duplicates and returning deleted rows, after that inserting grouped deleted data back into table.

see sql fiddle demo

like image 70
Roman Pekar Avatar answered Sep 20 '22 03:09

Roman Pekar