I was wondering which is better practice and/or faster. In the below, customer_id is the primary key. Group by the primary key, along with 5 other fields or just group by the primary key and then do a join to get the other fields. Or is this a 'it depends' question?
select customer_id, customer_name, customer_address, 5 other description fields..,
sum(amount) as SalesAmount
from customer c
inner join orders o
on o.customer_id = c.customer_id
group by customer_id, customer_name, customer_address, 5 other description fields..
vs
select customer_id, customer_name, customer_address, 5 other description fields..,
SalesAmount
from
(
select customer_id,
sum(amount) as SalesAmount
from customer c
inner join orders o
on o.customer_id = c.customer_id
group by customer_id
) t
inner join customer c
on c.customer_id = t.customer_id
This is a depend question. The latter variant will do one more join, which is potentially slower. But if the additional data you are getting from this join is big, it can be faster, because the grouping operation does not need to shuffle all the data.
So you see, there are scenarios for both variants in which they are faster than the other. You need to measure.
My guess is that for real data and queries the first variant is usually a lot faster.
I think that the first code is easier to read, which often is a good starting point when selecting how to do things.
You are guessing it yourself: This is an "it depends" question that cannot be answered theoretically. You have to investigate it a bit yourself, on your actual data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With