I've noticed some strange behavior in the query optimizer for Redshift, and I'm wondering if anyone can explain it or point out a workaround.
For large group by
queries, it's pretty essential to get the optimizer to plan a GroupAggregate rather than a HashAggregate, so it doesn't try to fit the temporary results in memory. This works fine for me in general. But when I try to use that group by
as a subquery, it switches to HashAggregate.
For example, consider the following query.
select install_app_version, user_id, max(platform) as plat
from dailies
group by install_app_version, user_id;
The table dailies has sortkeys (install_app_version, user_id) and distkey (user_id). Hence a GroupAggregate is possible, and the query plan looks like this, as it should.
XN GroupAggregate (cost=0.00..184375.32 rows=1038735 width=51)
-> XN Seq Scan on daily_players (cost=0.00..103873.42 rows=10387342 width=51)
In contrast, if I use the above in a subquery of any other query, I get a HashAggregate. For example, even something as simple as
select count(1) from
( select install_app_version, user_id, max(platform) as plat
from daily_players
group by install_app_version, user_id
);
has the query plan
XN Aggregate (cost=168794.32..168794.32 rows=1 width=0)
-> XN Subquery Scan derived_table1 (cost=155810.13..166197.48 rows=1038735 width=0)
-> XN HashAggregate (cost=155810.13..155810.13 rows=1038735 width=39)
-> XN Seq Scan on daily_players (cost=0.00..103873.42 rows=10387342 width=39)
The same pattern persists no matter what I do in the outer query. I can group by install_app_version and user_id, I can take aggregates, I can do no grouping at all externally. Even sorting the inner query does nothing.
In the cases I've shown it's not such a big deal, but I'm joining several subqueries with their own group by
, doing aggregates over that - it quickly gets out of hand and very slow without GroupAggregate.
If anyone has wisdom about the query optimizer and can answer this, it'd be much appreciated! Thanks!
don't know if your question is still opened, but I put this here because I think others could be interested.
Redshift seems to perform GROUP BY aggregation with HashAggregate by default (even when conditions for GroupAggregate are right), and switch only to GroupAggregate when there is at least one computation made by aggregation THAT NEED TO BE RESOLVED FOR THE QUERY TO RETURN. I mean by this that, in your previous example, the "max(platform) as plat" is of no use for the final "COUNT(1)" result of the query. I believe that, in such case, the aggregate computation of MAX() function is not computed at all.
The workaround I use is to add an useless HAVING clause that does nothing but still need to be computed (for exemple "HAVING COUNT(1)"). This always return true (because each group has COUNT(1) equals to at least 1 and so is true), but enables the query plan to use GroupAggregate.
Example :
EXPLAIN SELECT COUNT(*) FROM (SELECT mycol FROM mytable GROUP BY 1);
XN Aggregate (cost=143754365.00..143754365.00 rows=1 width=0)
-> XN Subquery Scan derived_table1 (cost=141398732.80..143283238.56 rows=188450576 width=0)
-> XN HashAggregate (cost=141398732.80..141398732.80 rows=188450576 width=40)
-> XN Seq Scan on mytable (cost=0.00..113118986.24 rows=11311898624 width=40)
EXPLAIN SELECT COUNT(*) FROM (SELECT mycol FROM mytable GROUP BY 1 HAVING COUNT(1));
XN Aggregate (cost=171091871.18..171091871.18 rows=1 width=0)
-> XN Subquery Scan derived_table1 (cost=0.00..171091868.68 rows=1000 width=0)
-> XN GroupAggregate (cost=0.00..171091858.68 rows=1000 width=40)
Filter: ((count(1))::boolean = true)
-> XN Seq Scan on mytable (cost=0.00..113118986.24 rows=11311898624 width=40)
This works because 'mycol' is both the distkey and the sortkey of 'mytable'.
As you can see, the query plan estimate than the query with GroupAggregate is more costly than the one with HashAggregate (which must be the thing which make the query plan choose HashAggregate). Do not trust that, in my example the second query runs up to 7 times faster than the first one ! The cool thing is that the GroupAggregate do not need much memory to be computed, and so will almost never perform 'Disk Based Aggregate'.
In fact, I realised it's even a much better option to perform COUNT(DISTINCT x) with a subquery GroupAggregate than with the standard COUNT(DISTINCT x) (in my example, 'mycol' is a NOT NULL column) :
EXPLAIN SELECT COUNT(DISTINCT mycol) FROM mytable ;
XN Aggregate (cost=143754365.00..143754365.00 rows=1 width=72)
-> XN Subquery Scan volt_dt_0 (cost=141398732.80..143283238.56 rows=188450576 width=72)
-> XN HashAggregate (cost=141398732.80..141398732.80 rows=188450576 width=40)
-> XN Seq Scan on mytable (cost=0.00..113118986.24 rows=11311898624 width=40)
3 minutes 46 s
EXPLAIN SELECT COUNT(*) FROM (SELECT mycol FROM mytable GROUP BY 1 HAVING COUNT(1));
XN Aggregate (cost=171091871.18..171091871.18 rows=1 width=0)
-> XN Subquery Scan derived_table1 (cost=0.00..171091868.68 rows=1000 width=0)
-> XN GroupAggregate (cost=0.00..171091858.68 rows=1000 width=40)
Filter: ((count(1))::boolean = true)
-> XN Seq Scan on mytable (cost=0.00..113118986.24 rows=11311898624 width=40)
40 seconds
Hopes that helps !
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With