I know that in simple queries DISTINCT and GROUP BY have almost the same performance and execution plans.
e.g.
SELECT Name FROM NamesTable GROUP BY Name
SELECT DISTINCT Name FROM NamesTable
But I've read that in some scenarios, e.g. in subqueries, their performance differs.
So, could you give some examples or explain some scenarios where their performance is different?
Many thanks
DISTINCT is used to filter unique records out of the records that satisfy the query criteria. The "GROUP BY" clause is used when you need to group the data and it should be used to apply aggregate operators to each group.
DISTINCT removes duplicate rows from the result set. For a plain de-duplication query, SELECT DISTINCT will usually perform the same as, or faster than, the equivalent GROUP BY.
I personally wouldn't choose based on performance, but on what is semantically correct. DISTINCT implies you want the distinct combinations of the selected columns. GROUP BY implies you want to compute some sort of aggregate value, which you are not doing here.
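To make the semantic point concrete, here is a minimal sketch using Python's built-in sqlite3 module and the NamesTable from the question. SQLite is only a stand-in engine here, and the helper name compare_distinct_and_group_by is made up for illustration. It shows that both forms return the same names, and that GROUP BY is the one you actually need once an aggregate such as COUNT(*) is involved.

```python
import sqlite3


def compare_distinct_and_group_by(names):
    """Load names into an in-memory NamesTable and run both query styles.

    Returns (distinct_rows, group_by_rows, counts_per_name).
    Hypothetical helper for illustration; SQLite stands in for your real engine.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE NamesTable (Name TEXT)")
        conn.executemany(
            "INSERT INTO NamesTable (Name) VALUES (?)",
            [(n,) for n in names],
        )

        # Plain de-duplication: the two forms are interchangeable.
        distinct_rows = sorted(
            row[0] for row in conn.execute("SELECT DISTINCT Name FROM NamesTable")
        )
        group_by_rows = sorted(
            row[0]
            for row in conn.execute("SELECT Name FROM NamesTable GROUP BY Name")
        )

        # An aggregate per group: this is what GROUP BY is meant for.
        counts = dict(
            conn.execute("SELECT Name, COUNT(*) FROM NamesTable GROUP BY Name")
        )
        return distinct_rows, group_by_rows, counts
    finally:
        conn.close()
```

Calling it with a list that contains duplicates gives two identical name lists plus a per-name count, which only the GROUP BY form can produce.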
In MySQL, DISTINCT seems a bit faster than GROUP BY if the field is not indexed. DISTINCT only eliminates duplicate rows, whereas GROUP BY seems to sort them as well (older MySQL versions sorted GROUP BY results implicitly; MySQL 8.0 no longer does).
If you include a calculated value in the field list you will see a difference in the execution plan.
select Value,
       getdate()
from YourTable
group by Value

select distinct
       Value,
       getdate()
from YourTable
The group by query aggregates before it computes the scalar value. The distinct query computes the scalar value before the aggregate.
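If you want to observe the same ordering effect without SQL Server at hand, a rough sketch with Python's sqlite3 module is below. Note the assumptions: SQLite is swapped in for SQL Server, so the plans are not the same ones the answer describes, and tag is a hypothetical user-defined scalar function that stands in for the calculated column and simply counts how often it is invoked. Exact call counts may vary by engine and version, so treat the numbers as an indication rather than a guarantee.

```python
import sqlite3


def count_scalar_calls(values):
    """Count calls to a scalar function under DISTINCT vs GROUP BY.

    Returns {"distinct": (rows, calls), "group_by": (rows, calls)}.
    Assumption: SQLite stands in for SQL Server; `tag` is a made-up function.
    """
    conn = sqlite3.connect(":memory:")
    calls = {"n": 0}

    def tag(v):
        # Hypothetical scalar computation; the counter is the interesting part.
        calls["n"] += 1
        return v * 10

    conn.create_function("tag", 1, tag)
    conn.execute("CREATE TABLE YourTable (Value INTEGER)")
    conn.executemany(
        "INSERT INTO YourTable (Value) VALUES (?)", [(v,) for v in values]
    )

    queries = (
        ("distinct", "SELECT DISTINCT Value, tag(Value) FROM YourTable"),
        ("group_by", "SELECT Value, tag(Value) FROM YourTable GROUP BY Value"),
    )
    results = {}
    for label, sql in queries:
        calls["n"] = 0  # reset the counter before each query
        rows = sorted(conn.execute(sql).fetchall())
        results[label] = (rows, calls["n"])

    conn.close()
    return results
```

Both queries return the same rows. The call counter is what hints at where the scalar is evaluated: if the GROUP BY form reports fewer calls than the DISTINCT form, the scalar ran after the rows were collapsed into groups rather than once per input row.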