This is more of an academic question, because in my particular case I can create an easy workaround, but I would like to understand the reason behind this anyway.
Using an InnoDB table (MariaDB 10.0.31) with (among others) columns customer and uri, I wanted to select the distinct uris for a specific customer. Now, the table is quite large (around 50M entries), so there is a composite index on customer and uri.
Basically what I don't understand is why the order of the columns in the group by clause matters.
explain select customer, uri from `tableName` group by customer,uri;
tells me it will use the existing index for group by, but
explain select customer, uri from `tableName` group by uri,customer;
won't do so.
Could someone explain why this is the case? I always thought of the group by clause as declarative.
Maybe it's because it's Friday, but I can't think of a case, where the order of the group by columns would affect the result.
Your observation is correct. Results would be different as the "prefix" order of columns mentioned in the composite index declaration is used for decision making by the Cost based optimizer. This behavior is due to the usage of B-TREE index
GROUP BY clause is used for ordering the result and hence if
More on this and topic of Loose/Tight Index Scan can be found here https://dev.mysql.com/doc/refman/5.7/en/group-by-optimization.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With