Adding the DISTINCT keyword to a SELECT query causes it to return only unique values for the specified column list so that duplicate rows are removed from the result set. Since DISTINCT operates on all of the fields in SELECT's column list, it can't be applied to an individual field that are part of a larger group.
Yes, DISTINCT works on all combinations of column values for all columns in the SELECT clause.
Select with distinct on all columns of the first query. Select with distinct on multiple columns and order by clause. Count() function and select with distinct on multiple columns.
Using code SELECT DISTINCT(id,col_a) FROM Table will result: "(2,2)" "(3,3)" "(4,3)" "(5,4)" as you can see, the second column has duplicates.
try this query
SELECT ID, FirstName, LastName FROM table GROUP BY(FirstName)
To avoid potentially unexpected results when using GROUP BY
without an aggregate function, as is used in the accepted answer, because MySQL is free to retrieve ANY value within the data set being grouped when not using an aggregate function [sic] and issues with ONLY_FULL_GROUP_BY
. Please consider using an exclusion join.
Assuming the firstname and lastname are uniquely indexed (unambiguous), an alternative to GROUP BY
is to sort using a LEFT JOIN
to filter the result set, otherwise known as an exclusion JOIN.
See Demonstration
Ascending order (A-Z)
To retrieve the distinct firstname ordered by lastname from A-Z
Query
SELECT t1.*
FROM table_name AS t1
LEFT JOIN table_name AS t2
ON t1.firstname = t2.firstname
AND t1.lastname > t2.lastname
WHERE t2.id IS NULL;
Result
| id | firstname | lastname |
|----|-----------|----------|
| 2 | Bugs | Bunny |
| 1 | John | Doe |
Descending order (Z-A)
To retrieve the distinct firstname ordered by lastname from Z-A
Query
SELECT t1.*
FROM table_name AS t1
LEFT JOIN table_name AS t2
ON t1.firstname = t2.firstname
AND t1.lastname < t2.lastname
WHERE t2.id IS NULL;
Result
| id | firstname | lastname |
|----|-----------|----------|
| 2 | Bugs | Bunny |
| 3 | John | Johnson |
You can then order the resulting data as desired.
If the first and last name combination are not unique (ambiguous) and you have multiple rows of the same values, you can filter the result set by including an OR condition on the JOIN criteria to also filter by id.
See Demonstration
table_name data
(1, 'John', 'Doe'),
(2, 'Bugs', 'Bunny'),
(3, 'John', 'Johnson'),
(4, 'John', 'Doe'),
(5, 'John', 'Johnson')
Query
SELECT t1.*
FROM table_name AS t1
LEFT JOIN table_name AS t2
ON t1.firstname = t2.firstname
AND (t1.lastname > t2.lastname
OR (t1.firstname = t1.firstname AND t1.lastname = t2.lastname AND t1.id > t2.id))
WHERE t2.id IS NULL;
Result
| id | firstname | lastname |
|----|-----------|----------|
| 1 | John | Doe |
| 2 | Bugs | Bunny |
IN()
Subquery - Unambiguous EntitiesFor larger datasets, using an exclusion join can be very slow.
If you have unambiguous entries, an alternative is to use a Composite IN()
criteria against a MIN/MAX
aggregate subquery.
Demonstration
Ascending Order (A-Z)
Query
SELECT t1.*
FROM table_name AS t1
WHERE (t1.firstname, t1.lastname) IN(
SELECT firstname, MIN(lastname)
FROM table_name
GROUP BY firstname
)
Result
| id | firstname | lastname |
|----|-----------|----------|
| 2 | Bugs | Bunny |
| 1 | John | Doe |
Descending Order (Z-A)
Query
SELECT t1.*
FROM table_name AS t1
WHERE (t1.firstname, t1.lastname) IN(
SELECT firstname, MAX(lastname)
FROM table_name
GROUP BY firstname
)
Result
| id | firstname | lastname |
|----|-----------|----------|
| 2 | Bugs | Bunny |
| 3 | John | Johnson |
IN()
Dependent Subquery - Ambiguous EntitiesThe same theory can be applied from the Ambiguous Exclusion join to the composite IN()
subquery method, by adding a dependent subquery with MIN/MAX
on the id column.
Demonstration
Query
SELECT t1.*
FROM table_name AS t1
WHERE t1.id IN(
SELECT MIN(id)
FROM table_name
WHERE (t1.firstname, t1.lastname) IN(
SELECT firstname, MIN(lastname)
FROM table_name
GROUP BY firstname
)
GROUP BY firstname, lastname
);
Result
| id | firstname | lastname |
|----|-----------|----------|
| 1 | John | Doe |
| 2 | Bugs | Bunny |
EDIT
My original answer using an ordered subquery, was written prior to MySQL 5.7.5, which is no longer applicable, due to the changes with ONLY_FULL_GROUP_BY
. Please use the one of the examples above instead.
It is also important to note; when ONLY_FULL_GROUP_BY
is disabled (original behavior prior to MySQL 5.7.5), the use of GROUP BY
without an aggregate function may yield unexpected results, because MySQL is free to choose ANY value within the data set being grouped [sic].
Meaning an ID
or lastname
value may be retrieved that is not associated with the retrieved firstname
row.
WARNING
With MySQL GROUP BY
may not yield the expected results when used with ORDER BY
See Test Case Example
The best method of implementation, to ensure expected results, is to filter the result set scope using an ordered subquery.
table_name data
(1, 'John', 'Doe'),
(2, 'Bugs', 'Bunny'),
(3, 'John', 'Johnson')
Query
SELECT * FROM (
SELECT * FROM table_name ORDER BY ID DESC
) AS t1
GROUP BY FirstName
Result (MySQL 5.6)
| ID | first | last |
|----|-------|---------|
| 2 | Bugs | Bunny |
| 3 | John | Johnson |
Comparison
To demonstrate the unexpected results when using GROUP BY
in combination with ORDER BY
Query
SELECT * FROM table_name GROUP BY FirstName ORDER BY ID DESC
Result (MySQL 5.6)
| ID | first | last |
|----|-------|-------|
| 2 | Bugs | Bunny |
| 1 | John | Doe |
The DISTINCT
keyword doesn't really work the way you're expecting it to. When you use SELECT DISTINCT col1, col2, col3
you are in fact selecting all unique {col1, col2, col3} tuples.
SELECT ID,LastName
From TABLE_NAME
GROUP BY FirstName
HAVING COUNT(*) >=1
How about
`SELECT
my_distinct_column,
max(col1),
max(col2),
max(col3)
...
FROM
my_table
GROUP BY
my_distinct_column`
SELECT firstName, ID, LastName from tableName GROUP BY firstName
As pointed out by fyrye, the accepted answer pertains to older versions of MySQL in which ONLY_FULL_GROUP_BY
had not yet been introduced. With MySQL 8.0.17 (used in this example), unless you disable ONLY_FULL_GROUP_BY
you would get the following error message:
mysql> SELECT id, firstName, lastName FROM table_name GROUP BY firstName;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'mydatabase.table_name.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
One way to work around this not mentioned by fyrye, but described in https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html, is to apply the ANY_VALUE()
function to the columns which are not in the GROUP BY
clause (id
and lastName
in this example):
mysql> SELECT ANY_VALUE(id) as id, firstName, ANY_VALUE(lastName) as lastName FROM table_name GROUP BY firstName;
+----+-----------+----------+
| id | firstName | lastName |
+----+-----------+----------+
| 1 | John | Doe |
| 2 | Bugs | Bunny |
+----+-----------+----------+
2 rows in set (0.01 sec)
As written in the aforementioned docs,
In this case, MySQL ignores the nondeterminism of address values within each name group and accepts the query. This may be useful if you simply do not care which value of a nonaggregated column is chosen for each group.
ANY_VALUE()
is not an aggregate function, unlike functions such asSUM()
orCOUNT()
. It simply acts to suppress the test for nondeterminism.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With