
How does SQL's DISTINCT clause work?

I'm trying to understand how the DISTINCT clause works in SQL (SQL Server 2008, if that makes a difference) in a query that joins multiple tables.

Specifically, how does the SQL engine handle a query that contains a DISTINCT clause?

The reason I'm asking is that a far more experienced colleague told me that SQL applies DISTINCT to every field of every table. That seems unlikely to me, but I want to make sure.

For example, given these three tables:

CREATE TABLE users
(
u_id INT PRIMARY KEY,
u_name VARCHAR(30),
u_password VARCHAR(30)
)

CREATE TABLE roles
(
r_id INT PRIMARY KEY,
r_name VARCHAR(30)
)

CREATE TABLE users_l_roles
(
u_id INT FOREIGN KEY REFERENCES users(u_id) ,
r_id INT FOREIGN KEY REFERENCES roles(r_id) 
)

And this query:

SELECT          u_name
FROM            users 
INNER JOIN      users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN      roles ON users_l_roles.r_id = roles.r_id 

Assuming there is a user with two roles, the query above will return two records with the same user name.

But this query with DISTINCT:

SELECT DISTINCT u_name
FROM            users 
INNER JOIN      users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN      roles ON users_l_roles.r_id = roles.r_id 

will return only one user name.

The question is: will SQL compare all the fields from all the joined tables (u_id, u_name, u_password, r_id, r_name), or will it compare only the fields named in the SELECT list (u_name) and de-duplicate on those?

korzeniow asked Jan 24 '12 19:01



2 Answers

DISTINCT filters out duplicate rows, comparing only the fields you return in the SELECT list.

A really simplified way to look at it is:

  • It builds your overall result set (including duplicates) based on your FROM and WHERE clauses
  • It sorts that result set based on the fields you want to return
  • It removes any duplicate values in those fields
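The three steps above can be sketched in plain Python. This is only a model of the simplified mental picture, not a description of how SQL Server actually executes the query (the optimizer is free to pick other strategies, such as hashing). The sample rows are hypothetical.

```python
from itertools import groupby

# Hypothetical sample data: one user who holds two roles.
users = [(1, "alice", "secret")]          # (u_id, u_name, u_password)
roles = [(10, "admin"), (20, "editor")]   # (r_id, r_name)
users_l_roles = [(1, 10), (1, 20)]        # (u_id, r_id)

# Step 1: build the full joined result set, duplicates included.
joined = [
    (u, r)
    for u in users
    for (lu, lr) in users_l_roles if lu == u[0]
    for r in roles if r[0] == lr
]

# Keep only the returned field (u_name); the other columns are gone
# before any comparison happens.
projected = [(u[1],) for (u, r) in joined]

# Step 2: sort by the returned fields so equal rows become adjacent.
ordered = sorted(projected)

# Step 3: collapse each run of equal rows into a single row.
distinct_rows = [row for row, _ in groupby(ordered)]

print(projected)       # rows without DISTINCT
print(distinct_rows)   # rows with DISTINCT
```

Note that u_id, u_password, r_id and r_name never take part in the comparison, because they are not in the projected rows.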

It's semantically equivalent to a GROUP BY where all returned fields are in the GROUP BY clause.
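A quick way to check the GROUP BY equivalence is to run both forms against the same data. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made so the example is self-contained), with hypothetical sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (u_id INTEGER PRIMARY KEY, u_name VARCHAR(30), u_password VARCHAR(30));
    CREATE TABLE roles (r_id INTEGER PRIMARY KEY, r_name VARCHAR(30));
    CREATE TABLE users_l_roles (
        u_id INT REFERENCES users(u_id),
        r_id INT REFERENCES roles(r_id)
    );
    -- Hypothetical data: alice has two roles, bob has one.
    INSERT INTO users VALUES (1, 'alice', 'pw1'), (2, 'bob', 'pw2');
    INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
    INSERT INTO users_l_roles VALUES (1, 10), (1, 20), (2, 20);
""")

# The joins from the question, shared by both queries.
JOINS = """
    FROM users
    INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
    INNER JOIN roles ON users_l_roles.r_id = roles.r_id
"""

with_distinct = conn.execute("SELECT DISTINCT u_name" + JOINS).fetchall()
with_group_by = conn.execute("SELECT u_name" + JOINS + "GROUP BY u_name").fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted lists.
print(sorted(with_distinct))
print(sorted(with_group_by))
```

Both queries return one row per user name, even though the join produces two rows for alice.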

JNK answered Oct 24 '22 01:10


DISTINCT simply de-duplicates the resultant recordset after all other query operations have been performed. This article has more detail.
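To see that the de-duplication looks only at the columns in the final result set, here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server (an assumption for portability; the sample rows are hypothetical). Two different users share the name 'alice' but differ in u_id, u_password and roles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (u_id INTEGER PRIMARY KEY, u_name VARCHAR(30), u_password VARCHAR(30));
    CREATE TABLE roles (r_id INTEGER PRIMARY KEY, r_name VARCHAR(30));
    CREATE TABLE users_l_roles (
        u_id INT REFERENCES users(u_id),
        r_id INT REFERENCES roles(r_id)
    );
    -- Two distinct users with the same u_name.
    INSERT INTO users VALUES (1, 'alice', 'pw1'), (2, 'alice', 'pw2');
    INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
    INSERT INTO users_l_roles VALUES (1, 10), (1, 20), (2, 10);
""")

JOINS = """
    FROM users
    INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
    INNER JOIN roles ON users_l_roles.r_id = roles.r_id
"""

# Without DISTINCT: one row per (user, role) pairing.
all_rows = conn.execute("SELECT u_name" + JOINS).fetchall()

# DISTINCT on u_name alone: u_id, u_password, r_id, r_name are ignored.
names = conn.execute("SELECT DISTINCT u_name" + JOINS).fetchall()

# Adding r_name to the SELECT list changes what counts as a duplicate.
name_roles = conn.execute("SELECT DISTINCT u_name, r_name" + JOINS).fetchall()

print(len(all_rows))
print(names)
print(sorted(name_roles))
```

If DISTINCT compared every field of every joined table, the first DISTINCT query would return three rows rather than one.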

mwigdahl answered Oct 24 '22 01:10