Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between group by, distinct, Union for selecting distinct values for multiple columns?

This question explained about a way of getting distinct combination of multiple columns. But I want to know the difference between the methods of DISTINCT, UNION, GROUP BY keyword method for this purpose. I am getting different results when using them. My queries are like this Query 1.

select 
column1,
column2,
column3
from table
group by 1,2,3

Query 2.

select distinct 
column1,
column2,
column3
from table

Query 3.

SELECT DISTINCT(ans) FROM (
    SELECT column1 AS ans FROM sametable
    UNION
    SELECT column2 AS ans FROM sametable
    UNION
    SELECT column3 AS ans FROM sametable
) AS Temp

I am getting different number of rows for above queries(Edit: The first two are giving equal number of rows but last one is giving differnetly). Can any body explain what the above queries are doing? Especially the third one?

EDIT: Note that I am doing UNION on same table. In that case what will happen?

like image 717
Enjoy coding Avatar asked Feb 03 '23 10:02

Enjoy coding


2 Answers

Starting with what I think is the simplest, DISTINCT, really is just that. It returns the distinct combinations of rows. Think of this dataset:

COL1      COL2      COL3
A         B         C
D         E         F
G         H         I
A         B         C   <- duplicate of row 1

This will return 3 rows because the 4th row in the dataset exactly matches the first row. Result:

COL1      COL2      COL3
A         B         C
D         E         F
G         H         I

The GROUP BY is frequently used for summaries and other calculations select COL1, SUM(COL2) from table group by column1;

For this dataset:

COL1      COL2
A         5
A         6
B         2
C         3
C         4
C         5

would return

COL1     SUM(COL2)
A        11
B        2
C        12

a UNION just takes results from different queries and presents them as 1 result set:

Table1
COL1
A

Table2
COLX
B

Table3
WHATEVER_COLUMN_NAME
Giddyup

select COL1 from Table1
UNION
select COLX from Table2
UNION 
select WHATEVER_COLUMN_NAME from Table3;

Result Set:

A
B
Giddyup

When performing a union, the column datatypes must match up. You can't UNION a number column with a char column (unless you explicitly perform a data conversion)

like image 51
Brett McCann Avatar answered Feb 06 '23 01:02

Brett McCann


Lets assume this is your db data:

column1 | column2 | column3
1       | 2       | 1
1       | 2       | 2
1       | 2       | 1
3       | 1       | 2
1       | 2       | 2
1       | 2       | 2
1       | 2       | 2

First query

In the first example you will get all column combinations from the db (as GROUP BY 1,2,3 does nothing) including duplicates, so it will return:

1       | 2       | 1
1       | 2       | 2
1       | 2       | 1
3       | 1       | 2
1       | 2       | 2
1       | 2       | 2
1       | 2       | 2

2nd query

Second example takes unique values for column tuples so you will end with

1       | 2       | 1
1       | 2       | 2
3       | 1       | 2

3rd query

Last query takes all values from three columns and then it removes duplicates from that set. So you will get all values from any of the tables. In the end this will return

1
2
3

Does this makes it clear?

like image 40
RaYell Avatar answered Feb 06 '23 00:02

RaYell