Distinct of two columns grouping on another column

Tags: sql, sql-server, tsql

I am trying to count the duplicate values across two columns, grouped by a third column, in SQL Server.

Below is a sample scenario I am working on.

    DECLARE @mytable TABLE (CampName varchar(10),ID VARCHAR(10),ListName varchar(10))
    INSERT INTO @mytable
            ( CampName, ID, ListName )
    VALUES  ( 'A',   'X',   'Y' ), ( 'A',   'X',   'Y' ), 
            ( 'A',   'Y',   'Z' ), ( 'A',   'Y',   'Z' ),
            ( 'A',   'Y',   'Z' ), ( 'A',   'P',   'Q' ),
            ( 'B',   'X',   'Y' ), ( 'B',   'X',   'Y' ), 
            ( 'B',   'Y',   'Z' ), ( 'B',   'Y',   'Z' ),
            ( 'B',   'Y',   'Z' ), ( 'B',   'P',   'Q' ),
            ( 'B',   'R',   'S' ), ( 'B',   'R',   'S' )

This would result in the following table.

 CampName   ID  ListName
-------------------------------------
      A     X     Y
      A     X     Y -- Duplicate Record
      A     Y     Z
      A     Y     Z -- Duplicate Record
      A     Y     Z -- Duplicate Record
      A     P     Q
      B     X     Y 
      B     X     Y -- Duplicate Record
      B     Y     Z
      B     Y     Z -- Duplicate Record
      B     Y     Z -- Duplicate Record
      B     P     Q
      B     R     S
      B     R     S -- Duplicate Record

I need the output as follows:

CampName   dupcount
-------------------
A            3
B            4

Basically, I need to find the number of duplicate (ID, ListName) pairs for each CampName, irrespective of what the duplicated values are.

Let me know if I can clarify something else in this regard. Any help would be greatly appreciated.

asked Sep 08 '16 14:09

Kashyap MNVL

2 Answers

You can use the following query:

    SELECT CampName, SUM(cnt) AS dupcount
    FROM (
      SELECT CampName, COUNT(*) - 1 AS cnt
      FROM @mytable
      GROUP BY CampName, ID, ListName
      HAVING COUNT(*) > 1) AS t
    GROUP BY CampName

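The inner query groups by CampName, ID, and ListName, and uses a HAVING clause to filter out combinations that occur only once. For each remaining (ID, ListName) pair it computes COUNT(*) - 1, which is the number of duplicate records for that pair. The outer query simply sums these counts per CampName. Note that a CampName with no duplicates at all is filtered out entirely and does not appear in the result.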
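If campaigns without any duplicates should still be listed with a count of 0, one possible variation is to drop the HAVING clause and subtract 1 from every group instead. The following is only a sketch under that assumption; the sample rows, including campaign 'C', are hypothetical and not part of the question's data.

    -- Hypothetical sample: 'A' has one duplicated pair, 'C' has none.
    DECLARE @mytable TABLE (CampName varchar(10), ID varchar(10), ListName varchar(10));
    INSERT INTO @mytable (CampName, ID, ListName)
    VALUES ('A', 'X', 'Y'), ('A', 'X', 'Y'), ('C', 'P', 'Q');

    -- Without HAVING, every (CampName, ID, ListName) group is kept.
    -- A group that occurs once contributes cnt - 1 = 0 to the sum.
    SELECT CampName, SUM(cnt - 1) AS dupcount
    FROM (
      SELECT CampName, COUNT(*) AS cnt
      FROM @mytable
      GROUP BY CampName, ID, ListName) AS t
    GROUP BY CampName;

With these rows, 'A' gets a dupcount of 1 and 'C' gets 0, whereas the HAVING-based version would omit 'C'.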

answered Oct 20 '22 18:10

Giorgos Betsos

I believe that the number of distinct combinations of ID and ListName needs to be subtracted from the total row count for each CampName group to get the correct result.

    SELECT t.CampName,
           COUNT(*) - COUNT(DISTINCT 'ColOne' + ID + 'ColTwo' + ListName) AS dupcount
    FROM yourTable t
    GROUP BY CampName

This query relies on a trick: it concatenates the ID and ListName columns, which are both text, into a single value that identifies each pair. This is needed because COUNT(DISTINCT ...) in SQL Server accepts only a single expression, but here two columns need to be considered together.
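The concatenation approach has two caveats: a NULL in either column makes the whole concatenated value NULL, which COUNT(DISTINCT ...) ignores, and two different pairs can produce the same string if a value happens to contain the separator text (for example ID 'X' with ListName 'ColTwoY' versus ID 'XColTwo' with ListName 'Y'). A sketch of one way to avoid concatenation, assuming the table variable from the question, is to count the distinct pairs in a derived table using SELECT DISTINCT:

    DECLARE @mytable TABLE (CampName varchar(10), ID varchar(10), ListName varchar(10));
    INSERT INTO @mytable (CampName, ID, ListName)
    VALUES ('A', 'X', 'Y'), ('A', 'X', 'Y'),
           ('A', 'Y', 'Z'), ('A', 'Y', 'Z'),
           ('A', 'Y', 'Z'), ('A', 'P', 'Q'),
           ('B', 'X', 'Y'), ('B', 'X', 'Y'),
           ('B', 'Y', 'Z'), ('B', 'Y', 'Z'),
           ('B', 'Y', 'Z'), ('B', 'P', 'Q'),
           ('B', 'R', 'S'), ('B', 'R', 'S');

    -- a: total rows per CampName
    -- d: distinct (ID, ListName) pairs per CampName
    SELECT a.CampName, a.total - d.distinct_pairs AS dupcount
    FROM (
      SELECT CampName, COUNT(*) AS total
      FROM @mytable
      GROUP BY CampName) AS a
    JOIN (
      SELECT CampName, COUNT(*) AS distinct_pairs
      FROM (SELECT DISTINCT CampName, ID, ListName FROM @mytable) AS p
      GROUP BY CampName) AS d
      ON d.CampName = a.CampName;

For the question's sample data this gives 6 - 3 = 3 for 'A' and 8 - 4 = 4 for 'B', matching the requested output. The aliases a, d, p, total, and distinct_pairs are illustrative names only.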

Reference: Quora: In SQL, how do I count DISTINCT over multiple columns?

answered Oct 20 '22 19:10

Tim Biegeleisen
