SELECT DISTINCT for data groups

I have following table:

ID  Data
1   A
2   A
2   B
3   A
3   B
4   C
5   D 
6   A
6   B

etc. In other words, I have groups of data per ID. You will notice that the data group (A, B) occurs multiple times. I want a query that can identify the distinct data groups and number them, such as:

DataID     Data
101        A
102        A
102        B
103        C
104        D

So DataID 102 would resemble data (A,B), DataID 103 would resemble data (C), etc. In order to be able to rewrite my original table in this form:

ID   DataID
1    101
2    102
3    102
4    103
5    104
6    102

How can I do that?

PS. Code to generate the first table:

CREATE TABLE #t1 (id INT, data VARCHAR(10))
INSERT INTO #t1
SELECT 1, 'A'
UNION ALL SELECT 2, 'A'
UNION ALL SELECT 2, 'B'
UNION ALL SELECT 3, 'A'
UNION ALL SELECT 3, 'B'
UNION ALL SELECT 4, 'C'
UNION ALL SELECT 5, 'D'
UNION ALL SELECT 6, 'A'
UNION ALL SELECT 6, 'B'

Can you use SELECT distinct and GROUP BY?

Well, GROUP BY and DISTINCT have their own use. GROUP BY cannot replace DISTINCT in some situations and DISTINCT cannot take place of GROUP BY. It is as per your choice and situation how you are optimizing both of them and choosing where to use GROUP BY and DISTINCT.

Can you use SELECT distinct with multiple columns?

In SQL multiple fields may also be added with DISTINCT clause. DISTINCT will eliminate those rows where all the selected fields are identical.

How do you SELECT distinct data?

The distinct keyword is used with select keyword in conjunction. It is helpful when we avoid duplicate values present in the specific columns/tables. The unique values are fetched when we use the distinct keyword. SELECT DISTINCT returns only distinct (different) values.

Can we use distinct with GROUP BY in MySQL?

DISTINCT combined with ORDER BY needs a temporary table in many cases. Because DISTINCT may use GROUP BY , learn how MySQL works with columns in ORDER BY or HAVING clauses that are not part of the selected columns. See Section 12.20.

In my opinion You have to create a custom aggregate that concatenates data (in case of strings CLR approach is recommended for perf reasons). Then I would group by ID and select distinct from the grouping, adding a row_number()function or add a dense_rank() your choice. Anyway it should look like this

with groupings as (
select concat(data) groups
from Table1
group by ID
)
select groups, rownumber() over () from groupings

The following query using CASE will give you the result shown below.

From there on, getting the distinct datagroups and proceeding further should not really be a problem.

SELECT     
    id, 
     MAX(CASE data WHEN 'A' THEN data ELSE '' END) + 
     MAX(CASE data WHEN 'B' THEN data ELSE '' END) + 
     MAX(CASE data WHEN 'C' THEN data ELSE '' END) + 
     MAX(CASE data WHEN 'D' THEN data ELSE '' END) AS DataGroups
FROM  t1
GROUP BY id

ID  DataGroups
1   A
2   AB
3   AB
4   C
5   D
6   AB

However, this kind of logic will only work in case you the "Data" values are both fixed and known before hand.

In your case, you do say that is the case. However, considering that you also say that they are 1000 of them, this will be frankly, a ridiculous looking query for sure :-)

LuckyLuke's suggestion above would, frankly, be the more generic way and probably saner way to go about implementing the solution though in your case.

SELECT DISTINCT for data groups

Tags:

sql

tsql

sql-server-2008

distinct-values

littlegreen

People also ask

2 Answers

luckyluke

Jagmag

Recent Activity

Donate For Us

SELECT DISTINCT for data groups

Tags:

sql

tsql

sql-server-2008

distinct-values

littlegreen

People also ask

2 Answers

luckyluke

Jagmag

Related questions

Recent Activity

Donate For Us