This question explains a way of getting distinct combinations of multiple columns. But I want to know the difference between the DISTINCT, UNION, and GROUP BY methods for this purpose. I am getting different results when using them. My queries are as follows.
Query 1.
select 
column1,
column2,
column3
from table
group by 1,2,3
Query 2.
select distinct 
column1,
column2,
column3
from table
Query 3.
SELECT DISTINCT(ans) FROM (
    SELECT column1 AS ans FROM sametable
    UNION
    SELECT column2 AS ans FROM sametable
    UNION
    SELECT column3 AS ans FROM sametable
) AS Temp
I am getting a different number of rows for the above queries (Edit: the first two return an equal number of rows, but the last one returns a different number). Can anybody explain what the above queries are doing? Especially the third one?
EDIT: Note that I am doing the UNION on the same table. What will happen in that case?
Starting with what I think is the simplest: DISTINCT really is just that. It returns the distinct combinations of the selected column values, so each unique row appears once. Think of this dataset:
COL1      COL2      COL3
A         B         C
D         E         F
G         H         I
A         B         C   <- duplicate of row 1
A SELECT DISTINCT over these three columns will return 3 rows because the 4th row in the dataset exactly matches the first row. Result:
COL1      COL2      COL3
A         B         C
D         E         F
G         H         I
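To try the DISTINCT example yourself, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name t and the lowercase column names are placeholders I made up for illustration; other databases should behave the same way for this query.

```python
import sqlite3

# In-memory database; the table name "t" is a hypothetical placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "B", "C"),
        ("D", "E", "F"),
        ("G", "H", "I"),
        ("A", "B", "C"),  # duplicate of row 1
    ],
)

# DISTINCT compares whole rows (all selected columns together).
distinct_rows = conn.execute(
    "SELECT DISTINCT col1, col2, col3 FROM t ORDER BY col1"
).fetchall()

print(len(distinct_rows))
for row in distinct_rows:
    print(row)
```

The duplicate row is collapsed because DISTINCT looks at the combination of all three columns, not at each column separately.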
GROUP BY is frequently used for summaries and other calculations:
select COL1, SUM(COL2) from table group by COL1;
For this dataset:
COL1      COL2
A         5
A         6
B         2
C         3
C         4
C         5
would return
COL1     SUM(COL2)
A        11
B        2
C        12
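The same summary can be reproduced with a small sketch, again assuming Python's sqlite3 module and a made-up table name t:

```python
import sqlite3

# In-memory database; the table name "t" is a hypothetical placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 5), ("A", 6), ("B", 2), ("C", 3), ("C", 4), ("C", 5)],
)

# One output row per distinct col1 value, with col2 summed inside each group.
sums = conn.execute(
    "SELECT col1, SUM(col2) FROM t GROUP BY col1 ORDER BY col1"
).fetchall()

for row in sums:
    print(row)
```

Each group collapses to one row, which is why a GROUP BY without any aggregate function ends up looking like DISTINCT.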
A UNION just takes results from different queries and presents them as one result set, removing duplicate rows along the way (UNION ALL keeps the duplicates):
Table1
COL1
A
Table2
COLX
B
Table3
WHATEVER_COLUMN_NAME
Giddyup
select COL1 from Table1
UNION
select COLX from Table2
UNION 
select WHATEVER_COLUMN_NAME from Table3;
Result Set:
A
B
Giddyup
When performing a UNION, each query must return the same number of columns, and the corresponding column datatypes must be compatible. In many databases you can't UNION a number column with a char column unless you explicitly perform a data conversion.
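Here is a rough sketch of the three-table UNION above, plus what happens when you UNION a table with itself. It assumes Python's sqlite3 module; the lowercase table and column names mirror the example but are otherwise placeholders. Note that SQLite is loosely typed, so it will not reproduce the datatype errors a stricter database may raise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Three single-column tables mirroring the example above.
conn.execute("CREATE TABLE table1 (col1 TEXT)")
conn.execute("CREATE TABLE table2 (colx TEXT)")
conn.execute("CREATE TABLE table3 (whatever_column_name TEXT)")
conn.execute("INSERT INTO table1 VALUES ('A')")
conn.execute("INSERT INTO table2 VALUES ('B')")
conn.execute("INSERT INTO table3 VALUES ('Giddyup')")

# UNION stacks the results of the three queries into one result set.
union_rows = conn.execute(
    "SELECT col1 FROM table1 "
    "UNION SELECT colx FROM table2 "
    "UNION SELECT whatever_column_name FROM table3 "
    "ORDER BY 1"
).fetchall()

# UNION on the same table: duplicates are removed ...
same_table_union = conn.execute(
    "SELECT col1 FROM table1 UNION SELECT col1 FROM table1"
).fetchall()

# ... while UNION ALL keeps every row.
same_table_union_all = conn.execute(
    "SELECT col1 FROM table1 UNION ALL SELECT col1 FROM table1"
).fetchall()

print(union_rows)
print(same_table_union)
print(same_table_union_all)
```

So a UNION over the same table is perfectly legal; it just pools the selected values and drops repeats.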
Let's assume this is your db data:
column1 | column2 | column3
1       | 2       | 1
1       | 2       | 2
1       | 2       | 1
3       | 1       | 2
1       | 2       | 2
1       | 2       | 2
1       | 2       | 2
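In the first example, GROUP BY 1,2,3 groups by the first, second, and third selected columns (in databases that treat these numbers as column positions, such as MySQL and PostgreSQL). With no aggregate functions, each group collapses to a single row, so you get each distinct combination once:
1       | 2       | 1
1       | 2       | 2
3       | 1       | 2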
The second example returns each unique tuple of column values once, so you end up with
1       | 2       | 1
1       | 2       | 2
3       | 1       | 2
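The last query takes all values from the three columns, stacks them into a single column, and then removes duplicates from that set. So you will get every value that appears in any of the columns. (UNION already removes duplicates, so the outer DISTINCT is redundant.) In the end this will return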
1
2
3
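If you want to check this against the sample data, here is a minimal sketch that runs all three queries with Python's sqlite3 module. The table name sample and the subquery alias u are placeholders of mine, and the GROUP BY spells out the column names instead of using 1,2,3 because positional GROUP BY is not handled the same way in every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "sample" is a hypothetical table name holding the data shown above.
conn.execute(
    "CREATE TABLE sample (column1 INTEGER, column2 INTEGER, column3 INTEGER)"
)
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1, 2, 1), (1, 2, 2), (1, 2, 1), (3, 1, 2), (1, 2, 2), (1, 2, 2), (1, 2, 2)],
)

# Query 1: GROUP BY on all three columns (names spelled out for portability).
group_by_rows = conn.execute(
    "SELECT column1, column2, column3 FROM sample "
    "GROUP BY column1, column2, column3 "
    "ORDER BY column1, column2, column3"
).fetchall()

# Query 2: DISTINCT on the same three columns.
distinct_rows = conn.execute(
    "SELECT DISTINCT column1, column2, column3 FROM sample "
    "ORDER BY column1, column2, column3"
).fetchall()

# Query 3: pool the values of all three columns into one column.
union_rows = conn.execute(
    "SELECT DISTINCT ans FROM ("
    "  SELECT column1 AS ans FROM sample"
    "  UNION SELECT column2 AS ans FROM sample"
    "  UNION SELECT column3 AS ans FROM sample"
    ") AS u ORDER BY ans"
).fetchall()

print(group_by_rows)
print(distinct_rows)
print(union_rows)
```

With this particular data all three queries happen to return three rows, but the first two return three-column rows (distinct combinations) while the third returns a single column of distinct values, so in general the row counts will differ.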
Does this make it clear?