What would be the best way to remove duplicates while merging their records into one?
I have a situation where the table keeps track of player names and their records like this:
stats
-------------------------------
nick totalgames wins ...
John 100 40
john 200 97
Whistle 50 47
wHiStLe 75 72
...
I would need to merge the rows where nick is duplicated (when ignoring case) and merge the records into one, like this:
stats
-------------------------------
nick totalgames wins ...
john 300 137
whistle 125 119
...
I'm doing this in Postgres. What would be the best way to do this?
I know that I can get the names where duplicates exist by doing this:
select lower(nick) as nick, totalgames, count(*)
from stats
group by lower(nick), totalgames
having count(*) > 1;
I thought of something like this:
update stats
set totalgames = totalgames + s.totalgames
from (that query up there) s
where lower(nick) = s.nick
Except this doesn't work properly. And I still can't seem to be able to delete the other duplicate rows containing the duplicate names. What can I do? Any suggestions?
In SQL Server, joins are case-insensitive. Case-sensitive collations are not supported with ArcGIS.
Use the INSERT IGNORE command rather than the INSERT command. If a record doesn't duplicate an existing record, then MySQL inserts it as usual. If the record is a duplicate, then the IGNORE keyword tells MySQL to discard it silently without generating an error.
UNION, EXCEPT, and INTERSECT are operators that operate similarly and can come between two queries to form Boolean combinations between the results of the two queries. Tables returned by UNION, EXCEPT and INTERSECT by default will have duplicate records eliminated. Use ALL to retain duplicates in the results table.
I think easiest way to do it in one query would be using common table expressions:
with cte as (
delete from stats
where lower(nick) in (
select lower(nick) from stats group by lower(nick) having count(*) > 1
)
returning *
)
insert into stats(nick, totalgames, wins)
select lower(nick), sum(totalgames), sum(wins)
from cte
group by lower(nick);
As you see, inside the cte I'm deleting duplicates and returning deleted rows, after that inserting grouped deleted data back into table.
see sql fiddle demo
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With