
Optimizing PostgreSQL query with multiple joins, order by and small limit

I need some tips for optimizing queries fetching from large tables.

In this example I have 5 tables:

Brands
- id_brand
- b_name

Products
- id_product
- p_name
- ean
...
- fk_brand

Prod_attributes
- id_prod_att
- size_basic
...
- fk_product

Stores
- id_store
- s_name
...

Stocks
- id_stock
- stock_amount
- fk_prod_att
- fk_store
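
For reference, here is the schema above expressed as DDL. The table, column, and foreign-key names come from the question; the column types are assumptions:

```sql
-- Schema sketch; column types are assumptions, names are from the question.
CREATE TABLE brands (
    id_brand bigint PRIMARY KEY,
    b_name   text
);

CREATE TABLE products (
    id_product bigint PRIMARY KEY,
    p_name     text,
    ean        text,
    fk_brand   bigint REFERENCES brands (id_brand)
);

CREATE TABLE prod_attributes (
    id_prod_att bigint PRIMARY KEY,
    size_basic  text,
    fk_product  bigint REFERENCES products (id_product)
);

CREATE TABLE stores (
    id_store bigint PRIMARY KEY,
    s_name   text
);

CREATE TABLE stocks (
    id_stock     bigint PRIMARY KEY,
    stock_amount integer,
    fk_prod_att  bigint REFERENCES prod_attributes (id_prod_att),
    fk_store     bigint REFERENCES stores (id_store)
);
```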

I need a query returning an ordered, limited list of stocks, so this is the general approach I used:

SELECT stores.s_name, stocks.stock_amount, prod_attributes.size_basic,
       products.p_name, products.ean, brands.b_name
FROM stocks
    INNER JOIN stores
        ON stocks.fk_store = stores.id_store
    INNER JOIN prod_attributes
        ON stocks.fk_prod_att = prod_attributes.id_prod_att
    INNER JOIN products
        ON prod_attributes.fk_product = products.id_product
    INNER JOIN brands
        ON products.fk_brand = brands.id_brand
ORDER BY s_name, p_name, size_basic
LIMIT 25 OFFSET 0;

This runs fast on small tables, but as the tables grow the query becomes very expensive. With 3.5M rows in Stocks, 300K in Prod_attributes, and 25K in Products, it executes in over 8,800 ms, which is not acceptable for me.

All foreign keys have indexes, and the DB has been vacuum-analyzed recently.

I know the issue lies in the ORDER BY clause: because of it, the query does not use indexes and falls back to sequential scans. If I remove the ordering, the query is very fast.

I know I could remove the ORDER BY, but that is not a feasible option for me. Denormalizing the DB or a materialized view could also help here; again, I would like to avoid that if possible.

What else can I do to speed this query up?

EXPLAIN ANALYZE:
- slow with order by: http://explain.depesz.com/s/AHO
- fast without order by: http://explain.depesz.com/s/NRxr

Lucius asked Nov 12 '22


1 Answer

A possible way to go is to remove stores from the join. Instead, you could:

  • Loop through stores (ordered by s_name) in a stored procedure or in application code and, for each store, execute the join filtered on stocks.fk_store. You can break out of the loop as soon as you have collected enough records (25 in your case).

  • If possible, partition stocks on the fk_store key, to heavily reduce the number of tuples involved in the join.
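The first suggestion could be sketched as a PL/pgSQL function like the one below. The function name, limit parameter, and column types are illustrative, not from the question:

```sql
-- Sketch of the per-store loop (names and types are assumptions).
-- Walks stores in s_name order, joins only one store's stock rows at a
-- time (so the fk_store index applies), and stops after 25 rows total.
CREATE OR REPLACE FUNCTION first_stocks(p_limit integer DEFAULT 25)
RETURNS TABLE (s_name text, stock_amount integer, size_basic text,
               p_name text, ean text, b_name text)
LANGUAGE plpgsql AS $$
DECLARE
    store     record;
    remaining integer := p_limit;
    fetched   integer;
BEGIN
    FOR store IN
        SELECT stores.id_store, stores.s_name AS name
        FROM stores
        ORDER BY stores.s_name
    LOOP
        RETURN QUERY
            SELECT store.name, stocks.stock_amount, prod_attributes.size_basic,
                   products.p_name, products.ean, brands.b_name
            FROM stocks
                JOIN prod_attributes ON stocks.fk_prod_att = prod_attributes.id_prod_att
                JOIN products        ON prod_attributes.fk_product = products.id_product
                JOIN brands          ON products.fk_brand = brands.id_brand
            WHERE stocks.fk_store = store.id_store
            ORDER BY products.p_name, prod_attributes.size_basic
            LIMIT remaining;
        -- ROW_COUNT reflects the rows just emitted by RETURN QUERY.
        GET DIAGNOSTICS fetched = ROW_COUNT;
        remaining := remaining - fetched;
        EXIT WHEN remaining <= 0;
    END LOOP;
END;
$$;
```

Each per-store query sorts only that store's rows, so the big cross-table sort that defeats the indexes never happens.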

Either way, you should see a significant benefit.
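The partitioning suggestion could look like this with declarative hash partitioning (available in PostgreSQL 11+); the table name, four-way split, and column types are illustrative:

```sql
-- Illustrative only: stocks rebuilt as a table hash-partitioned on fk_store.
CREATE TABLE stocks_partitioned (
    id_stock     bigint,
    stock_amount integer,
    fk_prod_att  bigint,
    fk_store     bigint
) PARTITION BY HASH (fk_store);

CREATE TABLE stocks_p0 PARTITION OF stocks_partitioned FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE stocks_p1 PARTITION OF stocks_partitioned FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE stocks_p2 PARTITION OF stocks_partitioned FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE stocks_p3 PARTITION OF stocks_partitioned FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- An index on the parent cascades to every partition in PostgreSQL 11+.
CREATE INDEX ON stocks_partitioned (fk_store, fk_prod_att);
```

A query filtering on a single fk_store can then be pruned to one partition, shrinking the tuple set fed into the join.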

Emyl answered Nov 15 '22