I have a query that runs a lot slower (~5 minutes) with the default enable_nestloop=true than with enable_nestloop=false (~10 secs).
EXPLAIN ANALYZE results for both cases:
Machine A nestloop=true - http://explain.depesz.com/s/nkj0 (~5 minutes)
Machine A nestloop=false - http://explain.depesz.com/s/wBM (~10 secs)
On a different, slightly slower machine, after copying the database over and leaving the default enable_nestloop=true, the query takes ~20 secs.
Machine B nestloop=true - (~20 secs)
For all the cases above I ensured that I did an ANALYZE before running the queries. There were no other queries running in parallel.
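Roughly how each run was done (the actual reporting query goes in place of the placeholder below):

ANALYZE;                      -- refresh planner statistics first
SET enable_nestloop = off;    -- omitted for the runs with the default setting (on)
EXPLAIN ANALYZE SELECT 1;     -- placeholder for the reporting query
RESET enable_nestloop;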
Both machines are running Postgres 8.4. Machine A is running Ubuntu 10.04 32 bit while Machine B is running Ubuntu 8.04 32 bit.
The actual query is available here. It is a reporting query with many joins, as the database is mainly used for transaction processing.
Without resorting to something like materialized views, what can I do to make the planner do what I achieved by setting enable_nestloop=false?
From the research I have done, it seems the planner is choosing the seemingly suboptimal plan because of the huge difference between the estimated and actual rows. How can I get these estimates closer to the actual values?
If I should rewrite the query, what should I change?
Why does the planner seem to be doing the right thing for Machine B? What should I be comparing between the two machines?
If the query planner chooses suboptimal query plans, then chances are it has incomplete or misleading information to work with.
See this PostgreSQL Wiki page on server tuning. Especially pay attention to the chapters on random_page_cost and default_statistics_target.
Also read the corresponding chapters in the manual on Statistics Used by the Planner and Planner Cost Constants.
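For example, both settings can be changed per session to experiment before touching postgresql.conf; the values below are only illustrative starting points, not recommendations:

SHOW random_page_cost;                 -- 4.0 by default
SET random_page_cost = 2.0;            -- try lower if most of the data is cached or on fast storage

SHOW default_statistics_target;        -- 100 by default in 8.4
SET default_statistics_target = 500;   -- affects subsequent ANALYZE runs in this session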
More specifically, it might help to increase the statistics target for the following columns:
ALTER TABLE postgres.products ALTER COLUMN id SET STATISTICS 1000;
ALTER TABLE postgres.sales_orders ALTER COLUMN retailer_id SET STATISTICS 1000;
ALTER TABLE postgres.sales_orders ALTER COLUMN company_id SET STATISTICS 1000;
ALTER TABLE goods_return_notes ALTER COLUMN retailer_id SET STATISTICS 1000;
ALTER TABLE goods_return_notes ALTER COLUMN company_id SET STATISTICS 1000;
ALTER TABLE retailer_category_leaf_nodes ALTER COLUMN tree_left SET STATISTICS 1000;
ALTER TABLE channels ALTER COLUMN principal_id SET STATISTICS 1000;
These columns are involved in the filters that show the huge difference between estimated and actual rows. There are more; check every column where the planner's estimate deviates a lot from the actual row count. The default statistics target is just 100, and raising it only makes sense for tables with well over 1000 rows. Experiment with the setting, and run ANALYZE on the tables afterwards for the changes to take effect.
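A minimal example, covering the tables altered above:

ANALYZE postgres.products;
ANALYZE postgres.sales_orders;
ANALYZE goods_return_notes;
ANALYZE retailer_category_leaf_nodes;
ANALYZE channels;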
It might also help to create a partial index on postgres.sales_orders (retailer_id) WHERE retailer_id IS NOT NULL (depending on how common NULL values are).
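Something along these lines (the index name is just an example):

CREATE INDEX sales_orders_retailer_id_idx
    ON postgres.sales_orders (retailer_id)
    WHERE retailer_id IS NOT NULL;     -- only non-NULL values are indexed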
Another thing that may help you is to upgrade to the latest version 9.1. There have been a number of substantial improvements in this area.
Turns out rewriting the query was the best fix. The query relied heavily on left joins and had many joins overall. I flattened it out and reduced the number of left joins by using my knowledge of the join nature of the data in the tables being joined. I guess the rule of thumb is: if the planner is coming up with really poor estimates, there might be a better way of writing the query.
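Purely as an illustration (the table and column names here are made up, not taken from the actual query): where every row is known to have a match on the other side, a LEFT JOIN can be rewritten as a plain JOIN, which gives the planner more freedom to reorder joins.

-- Hypothetical example, not the real query.
-- Before: LEFT JOIN even though every order has a retailer.
SELECT o.id, r.name
FROM   sales_orders o
LEFT   JOIN retailers r ON r.id = o.retailer_id;

-- After: an inner join returns the same rows when retailer_id is never
-- NULL and always matches, and it is easier for the planner to reorder.
SELECT o.id, r.name
FROM   sales_orders o
JOIN   retailers r ON r.id = o.retailer_id;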