
Execute subquery factoring first before any other SQL

I have a very complex view of the below form:

create or replace view loan_vw as
select *
  from (with loan_info as (select loan_table.*, commission_table.*
                             from loan_table, commission_table
                            where contract_id = commission_id)
        select /*complex transformations */ from loan_info
         where type <> 'PRINCIPAL'
        union all
        select /*complex transformations */ from loan_info
         where type = 'PRINCIPAL')

Now, if I do the below select, the query hangs:

         select * from loan_vw where contract_id='HA001234TY56';

But if I hardcode the contract ID inside the subquery factoring clause, or use a package-level variable set in the same session, the query returns in a second:

create or replace view loan_vw as
select *
  from (with loan_info as (select loan_table.*, commission_table.*
                             from loan_table, commission_table
                            where contract_id = commission_id
                              and contract_id = 'HA001234TY56')
        select /*complex transformations */ from loan_info
         where type <> 'PRINCIPAL'
        union all
        select /*complex transformations */ from loan_info
         where type = 'PRINCIPAL')
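
For reference, the package-level variable approach mentioned above could be sketched as follows (package and procedure names are hypothetical, not from the original view):

    create or replace package loan_ctx as
      procedure set_contract_id(p_id in varchar2);
      function get_contract_id return varchar2;
    end loan_ctx;
    /
    create or replace package body loan_ctx as
      g_contract_id varchar2(30);

      procedure set_contract_id(p_id in varchar2) is
      begin
        g_contract_id := p_id;
      end;

      function get_contract_id return varchar2 is
      begin
        return g_contract_id;
      end;
    end loan_ctx;
    /
    -- inside the CTE, instead of the hardcoded literal:
    --   and contract_id = loan_ctx.get_contract_id

The caller would run loan_ctx.set_contract_id('HA001234TY56') in the same session before selecting from the view, which is exactly what a reporting tool like Business Objects cannot easily do.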

Since I use Business Objects, I cannot use a package-level variable.

So my question: is there a hint in Oracle to tell the optimizer to apply the contract_id predicate on loan_vw inside the subquery factoring clause first?
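
For what it's worth, two hints are commonly tried for this kind of problem; whether they help depends on the actual plan, so treat this as a sketch. PUSH_PRED asks the optimizer to push the outer predicate into a non-merged view, and the (undocumented but widely used) INLINE hint asks it not to materialize the CTE:

    -- on the outer query: push the predicate into the view
    select /*+ PUSH_PRED(v) */ *
      from loan_vw v
     where v.contract_id = 'HA001234TY56';

    -- or inside the view: inline the CTE instead of materializing it
    with loan_info as (select /*+ INLINE */ loan_table.*, commission_table.*
                         from loan_table, commission_table
                        where contract_id = commission_id)
    ...

If the CTE is being materialized, Oracle builds the full loan_info result set before the outer contract_id filter is ever seen, which matches the symptom described here.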

As requested, the analytic function used is the below:

select value_date, item, credit_entry, item_paid
from (
  select value_date, item, credit_entry, debit_entry,
    greatest(0, least(credit_entry, nvl(sum(debit_entry) over (), 0)
      - nvl(sum(credit_entry) over (order by value_date
          rows between unbounded preceding and 1 preceding), 0))) as item_paid
  from your_table
)
where item is not null;

After following the advice given by Boneist and MarcinJ, I removed the subquery factoring clause (CTE) and wrote one long query like the below, which improved the performance from 3 minutes to 0.156 seconds:

  create or replace view loan_vw as
  select /*complex transformations */
    from loan_table, commission_table
   where contract_id = commission_id
     and loan_table.type <> 'PRINCIPAL'
  union all
  select /*complex transformations */
    from loan_table, commission_table
   where contract_id = commission_id
     and loan_table.type = 'PRINCIPAL'
psaraj12 asked Apr 11 '19 09:04

1 Answer

Are these transformations really so complex that you have to use UNION ALL? It's really hard to optimize something you can't see, but have you tried getting rid of the CTE and implementing your calculations inline?

CREATE OR REPLACE VIEW loan_vw AS
SELECT loan.contract_id
     , CASE commission.type -- or wherever this comes from
         WHEN 'PRINCIPAL'
         THEN SUM(whatever) OVER (PARTITION BY loan.contract_id, loan.type) -- total_whatever

         ELSE SUM(something_else) OVER (PARTITION BY loan.contract_id, loan.type) -- total_something_else
      END AS whatever_something
  FROM loan_table loan 
 INNER 
  JOIN commission_table commission
    ON loan.contract_id = commission.commission_id

Note that if your analytic functions don't have PARTITION BY contract_id you won't be able to use an index on that contract_id column at all.
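
For example (a sketch using the column names from the question), only the first form lets Oracle restrict the window to a single contract's rows via an index on contract_id; the second form's window spans every row of the result:

    -- index on contract_id can help: each window covers one contract
    sum(credit_entry) over (partition by contract_id
                            order by value_date
                            rows between unbounded preceding and 1 preceding)

    -- index cannot help: the window covers the whole result set
    sum(credit_entry) over (order by value_date
                            rows between unbounded preceding and 1 preceding)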

Take a look at this db fiddle (you'll have to click on ... on the last result table to expand the results). Here, the loan table has an indexed (PK) contract_id column, but also a some_other_id column that is also unique but not indexed, and the predicate on the outer query is still on contract_id. If you compare the plans for partition by contract and partition by other id, you'll see that the index is not used at all in the partition by other id plan: there's a TABLE ACCESS with FULL options on the loan table, as compared to INDEX - UNIQUE SCAN in partition by contract. That's because the optimizer cannot resolve the relation between contract_id and some_other_id on its own, and so it needs to run SUM or AVG over the entire window instead of limiting the window row counts through index usage.

What you can also try - if you have a dimension table with those contracts - is to join it to your results and expose the contract_id from the dimension table instead of the most likely huge loan fact table. Sometimes this can lead to an improvement in cardinality estimates through the usage of a unique index on the dimension table.
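
A sketch of that idea, where contract_dim is a hypothetical dimension table with a unique index on contract_id:

    select d.contract_id, f.*
      from contract_dim d
      join loan_vw f
        on f.contract_id = d.contract_id
     where d.contract_id = 'HA001234TY56';

Filtering on the small, uniquely indexed dimension table first can give the optimizer a much better cardinality estimate than filtering the large fact table directly.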

Again, it's really hard to optimize a black box without a query or even a plan, so we don't know what's going on. A CTE or a subquery can, for example, get materialized unnecessarily.

MarcinJ answered Nov 09 '22 06:11