I'm looking for the answer on how DISTINCT clause works in SQL (SQL Server 2008 if that makes a difference) on a query with multiple tables joined?
I mean how the SQL engine handles the query with DISTINCT clause?
The reason I'm asking is that I was told by my far more experienced colleague that SQL applies DISTINCT to every field of every table. It seems unlikely for me, but I want to make sure....
For example having two tables:
CREATE TABLE users
(
u_id INT PRIMARY KEY,
u_name VARCHAR(30),
u_password VARCHAR(30)
)
CREATE TABLE roles
(
r_id INT PRIMARY KEY,
r_name VARCHAR(30)
)
CREATE TABLE users_l_roles
(
u_id INT FOREIGN KEY REFERENCES users(u_id) ,
r_id INT FOREIGN KEY REFERENCES roles(r_id)
)
And then having this query:
SELECT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
Assuming there was user with two roles then the above query will return two records with the same user name.
But this query with distinct:
SELECT DISTINCT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
will return only one user name.
The question is whether SQL will compare all the fields from all the joined tables (u_id, u_name, u_password, r_id, r_name) or it will compare only named fields in the query (u_name) and distinct the results?
The SELECT DISTINCT statement is used to return only distinct (different) values. Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different (distinct) values.
Adding the DISTINCT keyword to a SELECT query causes it to return only unique values for the specified column list so that duplicate rows are removed from the result set. Since DISTINCT operates on all of the fields in SELECT's column list, it can't be applied to an individual field that are part of a larger group.
The DISTINCT keyword in SQL is used to fetch unique records and eliminates the duplicate records in the table.
Answer. Yes, the DISTINCT clause can be applied to any valid SELECT query. It is important to note that DISTINCT will filter out all rows that are not unique in terms of all selected columns.
DISTINCT
filters out duplicate values of your returned fields.
A really simplified way to look at it is:
FROM
and WHERE
clausesIt's semantically equivalent to a GROUP BY
where all returned fields are in the GROUP BY
clause.
DISTINCT
simply de-duplicates the resultant recordset after all other query operations have been performed. This article has more detail.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With