I have a table that contains a field which is used for grouping and another field which holds data. I want a good way to find any GroupColumn value where every DataColumn value contains a specific value.
Example
+-------------+------------+
| GroupColumn | DataColumn |
+-------------+------------+
| GroupA | Data1 |
| GroupA | Data2 |
| GroupA | Data3 |
| GroupB | Data1 |<---These two values are the same
| GroupB | Data1 |<---for the same group
| GroupC | Data1 |
| GroupC | Data2 |
| GroupC | Data2 |
| GroupC | Data3 |
+-------------+------------+
Desired Output
Group B
In the example above the DataColumn changes for GroupA and GroupC, but for GroupB both values in the DataColumn are the same so I would want this result returned.
Current Solution
I have 2 current solutions based around the same theme, but I feel that this is something that SQL should be able to do in a easier fashion.
Group everything in the table, count the times GroupColumn appears and put this into a table. Do the same, but apply a condition. Join the 2 tables and see where the 2 counts do not match.
SELECT GROUPCOLUMN, COUNT(*) [TOTAL] INTO #ALL
FROM #TABLE
GROUP BY GROUPCOLUMN
SELECT GROUPCOLUMN, COUNT(*) [TOTAL] INTO #SOME
FROM #TABLE
WHERE DATACOLUMN = 'DATA1'
GROUP BY GROUPCOLUMN
SELECT * FROM #ALL A
INNER JOIN #SOME S ON A.GROUPCOLUMN = S.GROUPCOLUMN
WHERE S.TOTAL = A.TOTAL
Use a SUM and a CASE to check for the specific value and count everything and check in a sub-query.
SELECT * FROM
(SELECT GROUPCOLUMN, SUM(CASE WHEN DATACOLUMN = 'DATA1' THEN 1 ELSE 0 END) [VALUE], COUNT(*) [TOTAL] FROM #TABLE (NOLOCK)
GROUP BY GROUPCOLUMN) A
WHERE A.VALUE = A.TOTAL
Is there a better way to do this in SQL?
Thanks in advance.
Ninja
Because the COALESCE is an expression, you can use it in any clause that accepts an expression such as SELECT , WHERE , GROUP BY , and HAVING .
The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country". The GROUP BY statement is often used with aggregate functions ( COUNT() , MAX() , MIN() , SUM() , AVG() ) to group the result-set by one or more columns.
Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause. The original idea was to create the table in beginning of the query, so the (SELECT * FROM #TBL) could be used on the query itself, instead of defining the names on each GROUP BY.
You are looking for HAVING
clause
SELECT GROUPCOLUMN
FROM #TABLE (NOLOCK)
GROUP BY GROUPCOLUMN
HAVING Count(*) = Count(case when DATACOLUMN = 'DATA1' then 1 end)
It sounds like you are looking for each group that has a single distinct value in DATACOLUMN
:
SELECT GROUPCOLUMN
FROM #TABLE
GROUP BY GROUPCOLUMN
HAVING COUNT(DISTINCT DATACOLUMN) = 1
Note that COUNT(DISTINCT ...)
does not count NULL
as a distinct value.
You should be able to compare COUNT(*)
with COUNT(DISTINCT DATACOLUMN)
to do this. Like this:
SELECT GROUPCOLUMN
FROM #TABLE
GROUP BY GROUPCOLUMN
HAVING Count(*) = Count(DISTINCT DATACOLUMN)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With