
Self join vs group by when counting duplicates

I'm trying to count duplicates based on a column of a table in an Oracle Database. This query using group by:

select count(dockey), sum(total)
from
(
select doc1.xdockeyphx dockey, count(doc1.xdockeyphx) total
from ecm_ocs.docmeta doc1
where doc1.xdockeyphx is not null
group by doc1.xdockeyphx
having count(doc1.xdockeyphx) > 1
)

Returns count = 94408 and sum(total) = 219330. I think this is the correct value.

Now, trying this other query using a self join:

select count(distinct doc1.xdockeyphx)
from ecm_ocs.docmeta doc1, ecm_ocs.docmeta doc2
where doc1.did > doc2.did
and doc1.xdockeyphx = doc2.xdockeyphx
and doc1.xdockeyphx is not null
and doc2.xdockeyphx is not null

The result is also 94408 but this one:

select count(*)
from ecm_ocs.docmeta doc1, ecm_ocs.docmeta doc2
where doc1.did > doc2.did
and doc1.xdockeyphx = doc2.xdockeyphx
and doc1.xdockeyphx is not null
and doc2.xdockeyphx is not null

Is returning 1567466, which I think is wrong.

The column I'm using to find duplicates is XDOCKEYPHX and the DID is the primary key of the table.

Why is the value sum(total) different from the result of the last query? I can't see why the last query is returning more duplicate rows than expected.

carcaret asked Dec 01 '25 02:12


1 Answer

You don't need the complexity of your last where clause:

where doc1.did > doc2.did
and doc1.xdockeyphx = doc2.xdockeyphx
and doc1.xdockeyphx is not null
and doc2.xdockeyphx is not null

If you think about it, doc2.xdockeyphx cannot be null if doc1.xdockeyphx is not null, because the equality condition never matches a null on either side. It is perhaps better expressed as an explicit join:

select count(*)
from ecm_ocs.docmeta doc1
join ecm_ocs.docmeta doc2
on doc1.xdockeyphx = doc2.xdockeyphx
where doc1.xdockeyphx is not null and doc1.did > doc2.did
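The point about nulls can be checked on a toy table. The sketch below is only an illustration: it uses Python's sqlite3 as a stand-in for Oracle, and the five sample rows are invented. It runs the self join with and without the null checks and compares the counts.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table docmeta (did integer primary key, xdockeyphx text)")

# Invented sample rows: one duplicated key, two null keys, one unique key.
conn.executemany(
    "insert into docmeta values (?, ?)",
    [(1, "A"), (2, "A"), (3, None), (4, None), (5, "B")],
)

join_sql = """
    select count(*)
    from docmeta doc1
    join docmeta doc2 on doc1.xdockeyphx = doc2.xdockeyphx
    where doc1.did > doc2.did
"""
null_checks = " and doc1.xdockeyphx is not null and doc2.xdockeyphx is not null"

# The two null rows never pair up, because null = null is not true.
with_checks = conn.execute(join_sql + null_checks).fetchone()[0]
without_checks = conn.execute(join_sql).fetchone()[0]
print(with_checks, without_checks)
```

Both counts come out the same here, since only the two "A" rows form a pair; the null checks change nothing once the equality condition is in place.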

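Your first two queries report grouped or distinct results, whereas your last one counts every matching pair of rows, which is why the counts differ. A key that appears n times adds n to sum(total), but with doc1.did > doc2.did it adds n(n-1)/2 pairs to the self join, so every key with more than three copies pushes count(*) above sum(total).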
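To see the difference in numbers, here is a small sketch under the same assumptions as before (sqlite3 instead of Oracle, invented data): keys A, B and C appear 2, 3 and 4 times. The last query, count(distinct doc1.did), is not from the question; it is one possible way to make the self join count only the extra copies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table docmeta (did integer primary key, xdockeyphx text)")

# Invented data: A x2, B x3, C x4, one unique key and two nulls.
keys = ["A"] * 2 + ["B"] * 3 + ["C"] * 4 + ["D"] + [None] * 2
conn.executemany("insert into docmeta (xdockeyphx) values (?)", [(k,) for k in keys])

# Group-by version from the question: duplicated keys and rows involved.
groups, total_rows = conn.execute("""
    select count(dockey), sum(total)
    from (
        select xdockeyphx dockey, count(xdockeyphx) total
        from docmeta
        where xdockeyphx is not null
        group by xdockeyphx
        having count(xdockeyphx) > 1
    )
""").fetchone()

self_join = """
    from docmeta doc1
    join docmeta doc2 on doc1.xdockeyphx = doc2.xdockeyphx
    where doc1.did > doc2.did
"""
# count(*) counts pairs: n * (n - 1) / 2 per key with n copies.
pairs = conn.execute("select count(*)" + self_join).fetchone()[0]
# Distinct rows that have at least one earlier copy (the extra copies).
extra_copies = conn.execute("select count(distinct doc1.did)" + self_join).fetchone()[0]

print(groups, total_rows, pairs, extra_copies)
```

On this data the group-by query sees 3 keys and 9 rows, while the self join returns 1 + 3 + 6 = 10 pairs. The distinct-did count gives 6, which equals sum(total) minus count(dockey), so that is the figure to compare against the group-by result.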

vogomatix answered Dec 02 '25 17:12


