Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Query statement consume quite long time on execution on oracle

I got task to improve existing code / query from my company,

Database version

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
"CORE   10.2.0.4.0  Production"
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production

Here's the problem- when below code is executed, the time taken to finish the job is more than 4 hours, something around 7 to 8 hours.

enter image description here

395 row data within 3 hours 37 minutes

  SELECT DISTINCT GROUP_DIST_NUMBER, BEGIN_DATE, PRICE_DROP_DATE
    FROM (SELECT DISTINCT
                 G.GROUP_DIST_NUMBER,
                 TO_DATE (:B2, 'DD-MON-YYYY') BEGIN_DATE,
                 TO_DATE (:B2, 'DD-MON-YYYY') PRICE_DROP_DATE
            FROM POS_DISTI_GROUP G,
                 POS_CUST_XREF M,
                 S_CPT_SEQ_NO C,
                 PP_STD_PRICE P,
                 S_CPT_AUDIT A,
                 RPT_PRODUCT_VALUE_LEVEL L
           WHERE     G.END_DATE > TO_DATE (:B2, 'DD-MON-YYYY')
                 AND G.GROUP_DIST_NUMBER = M.DIST_NUMBER
                 AND M.SG_BILL_TO_CUST_NO = A.BILL_TO_CUST_NO
                 AND A.START_DATE <= TO_DATE (:B2, 'DD-MON-YYYY')
                 AND A.END_DATE >= TO_DATE (:B2, 'DD-MON-YYYY')
                 AND L.PROD_VALUE = P.PROD_VALUE
                 AND L.PROD_LEVEL = P.PROD_LEVEL
                 AND C.CPT_PRICE_CODE IN
                        (SELECT /*+ PRECOMPUTE_SUBQUERY */
                                DISTINCT C1.CPT_PRICE_CODE
                           FROM PP_STD_PRICE P1,
                                S_CPT_PRICE_CODE C1,
                                S_CPT_SEQ_NO S1
                          WHERE     P1.STDP_ID = :B1
                                AND C1.CPT_PRICE_CAT LIKE 'NB%'
                                AND C1.CPT_PRICE_CODE = S1.CPT_PRICE_CODE
                                AND S1.PRICE_PROTECTABLE = 'Y')
                 AND C.CPT_PRICE_CODE = P.CUST_PRICE_TYPE
                 AND P.STDP_ID = :B1
                 AND A.CUST_PRICE_TYPE = C.CPT_BILL_CODE
                 AND M.ACTIVE_IND != 'N'
                 AND (M.CATEGORY_TYPE LIKE 'DIRECT%' OR M.INDIRECT_DISTI = 'Y')
                 AND TRUNC (M.ARCHIVE_DATE) > TRUNC (SYSDATE)
          UNION
            SELECT G.GROUP_DIST_NUMBER,
                   P.BEGIN_DATE,
                   MIN (INVT.PRICE_DROP_DATE) PRICE_DROP_DATE
              FROM POS_DISTI_GROUP G,
                   POS_CUST_XREF M,
                   PP_DEBIT_AUTHORIZATION P,
                   RPT_PRODUCT_VALUE_LEVEL L,
                   POS_PP_INVENTORY INVT
             WHERE     G.END_DATE > TO_DATE (:B2, 'DD-MON-YYYY')
                   AND G.GROUP_DIST_NUMBER = M.DIST_NUMBER
                   AND M.ACTIVE_IND != 'N'
                   AND (M.CATEGORY_TYPE LIKE 'DIRECT%' OR M.INDIRECT_DISTI = 'Y')
                   AND G.DIST_NUMBER = P.DIST_NUMBER
                   AND L.PROD_VALUE = P.PROD_VALUE
                   AND L.PROD_LEVEL = P.PROD_LEVEL
                   AND P.BEGIN_DATE >= TO_DATE (:B2, 'DD-MON-YYYY') - 6
                   AND P.BEGIN_DATE <= TO_DATE (:B2, 'DD-MON-YYYY')
                   AND INVT.DIST_NUMBER = G.GROUP_DIST_NUMBER
                   AND INVT.STMODEL = L.MOD_DESC
                   AND INVT.PPCF_SHOW_DATE = P.BEGIN_DATE
                   AND P.PRICE_TYPE = 'I'
                   AND (   P.POS_PROCESSED_FLAG IS NULL
                        OR P.POS_PROCESSED_FLAG != 'C')
                   AND P.POS_PP_FLAG = 'Y'
                   AND TRUNC (M.ARCHIVE_DATE) > TRUNC (SYSDATE)
          GROUP BY G.GROUP_DIST_NUMBER, P.BEGIN_DATE)
ORDER BY GROUP_DIST_NUMBER;

I have no idea how to tune this query statement to improve the performance and make it execute faster

here the EXPLAIN PLAN

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                        | Rows  | Bytes |TempSpc| Cost (%CPU)| Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                             |   101 |  2525 |       | 24156  (10)|       |       |
|   1 |  SORT ORDER BY                            |                             |   101 |  2525 |       | 24156  (10)|       |       |
|   2 |   VIEW                                    |                             |   101 |  2525 |       | 24155  (10)|       |       |
|   3 |    SORT UNIQUE                            |                             |   101 | 17691 |       | 24155  (75)|       |       |
|   4 |     UNION-ALL                             |                             |       |       |       |            |       |       |
|*  5 |      HASH JOIN                            |                             |    10M|  1680M|       |  6446   (5)|       |       |
|*  6 |       TABLE ACCESS FULL                   | S_CPT_SEQ_NO                |   651 |  5208 |       |     5   (0)|       |       |
|*  7 |       HASH JOIN                           |                             |  2383K|   379M|       |  6318   (3)|       |       |
|*  8 |        TABLE ACCESS FULL                  | POS_DISTI_GROUP             |   100 |  1800 |       |     5   (0)|       |       |
|*  9 |        HASH JOIN                          |                             |  2396K|   340M|  4320K|  6283   (3)|       |       |
|  10 |         VIEW                              | RPT_PRODUCT_VALUE_LEVEL     |   138K|  2697K|       |  1905   (3)|       |       |
|  11 |          UNION-ALL                        |                             |       |       |       |            |       |       |
|* 12 |           HASH JOIN RIGHT OUTER           |                             | 13965 |   627K|       |    91   (5)|       |       |
|  13 |            INDEX FULL SCAN                | PK_SEAEGO_PRODUCT_HIERARCHY |   298 |  4172 |       |     1   (0)|       |       |
|* 14 |            HASH JOIN RIGHT OUTER          |                             | 13965 |   436K|       |    89   (4)|       |       |
|  15 |             INDEX FULL SCAN               | PK_S_CAP_GROUP              |     2 |     8 |       |     1   (0)|       |       |
|  16 |             TABLE ACCESS FULL             | SMA_STMODEL                 | 13965 |   381K|       |    87   (3)|       |       |
|* 17 |           HASH JOIN RIGHT OUTER           |                             | 14175 |  1065K|       |   158   (5)|       |       |
|  18 |            INDEX FAST FULL SCAN           | PK_S_FAMILY                 |  1366 |  5464 |       |     2   (0)|       |       |
|* 19 |            HASH JOIN RIGHT OUTER          |                             | 14175 |  1010K|       |   156   (5)|       |       |
|  20 |             INDEX FULL SCAN               | PK_F_MODPRODMGR             |    22 |    88 |       |     1   (0)|       |       |
|* 21 |             HASH JOIN                     |                             | 14175 |   955K|       |   154   (4)|       |       |
|  22 |              TABLE ACCESS FULL            | SMA_PRODUCTMODEL            | 14132 |   317K|       |    62   (2)|       |       |
|* 23 |              HASH JOIN RIGHT OUTER        |                             | 13965 |   627K|       |    91   (5)|       |       |
|  24 |               INDEX FULL SCAN             | PK_SEAEGO_PRODUCT_HIERARCHY |   298 |  4172 |       |     1   (0)|       |       |
|* 25 |               HASH JOIN RIGHT OUTER       |                             | 13965 |   436K|       |    89   (4)|       |       |
|  26 |                INDEX FULL SCAN            | PK_S_CAP_GROUP              |     2 |     8 |       |     1   (0)|       |       |
|  27 |                TABLE ACCESS FULL          | SMA_STMODEL                 | 13965 |   381K|       |    87   (3)|       |       |
|  28 |           MAT_VIEW ACCESS FULL            | RPT_PROD_MV                 |   109K|  1288K|       |  1656   (3)|       |       |
|* 29 |         HASH JOIN                         |                             |   141K|    17M|       |  3191   (3)|       |       |
|* 30 |          INDEX RANGE SCAN                 | UK_PP_STD_PRICE_STDP_ID     |  4128 |   108K|       |    23   (0)|       |       |
|* 31 |          HASH JOIN                        |                             |  5341 |   532K|       |  3165   (3)|       |       |
|* 32 |           TABLE ACCESS FULL               | POS_CUST_XREF               |    54 |  2268 |       |    25   (4)|       |       |
|* 33 |           HASH JOIN                       |                             |   193K|    11M|       |  3137   (3)|       |       |
|* 34 |            TABLE ACCESS FULL              | S_CPT_AUDIT                 |    68 |  2108 |       |    76   (4)|       |       |
|* 35 |            HASH JOIN                      |                             |   745K|    20M|       |  3052   (2)|       |       |
|  36 |             TABLE ACCESS FULL             | S_CPT_SEQ_NO                |  1301 | 16913 |       |     5   (0)|       |       |
|  37 |             MERGE JOIN CARTESIAN          |                             | 88205 |  1378K|       |  3037   (2)|       |       |
|* 38 |              INDEX RANGE SCAN             | UK_PP_STD_PRICE_STDP_ID     |  4128 | 20640 |       |    23   (0)|       |       |
|  39 |              BUFFER SORT                  |                             |    21 |   231 |       |  3014   (2)|       |       |
|* 40 |               TABLE ACCESS FULL           | S_CPT_PRICE_CODE            |    21 |   231 |       |     1   (0)|       |       |
|  41 |      HASH GROUP BY                        |                             |     1 |   191 |       | 16421   (5)|       |       |
|* 42 |       FILTER                              |                             |       |       |       |            |       |       |
|  43 |        NESTED LOOPS                       |                             |     1 |   191 |       | 16419   (5)|       |       |
|* 44 |         HASH JOIN                         |                             |     7 |  1176 |       | 16370   (5)|       |       |
|* 45 |          HASH JOIN                        |                             |    74 |  8584 |       |  4790   (3)|       |       |
|* 46 |           HASH JOIN                       |                             |    60 |  3780 |       |    31   (7)|       |       |
|* 47 |            TABLE ACCESS FULL              | POS_CUST_XREF               |    60 |  2100 |       |    25   (4)|       |       |
|* 48 |            TABLE ACCESS FULL              | POS_DISTI_GROUP             |   100 |  2800 |       |     5   (0)|       |       |
|* 49 |           TABLE ACCESS FULL               | PP_DEBIT_AUTHORIZATION      |   345 | 18285 |       |  4759   (3)|       |       |
|  50 |          PARTITION RANGE ALL              |                             | 18192 |   923K|       | 11579   (6)|     1 |    33 |
|* 51 |           INDEX FAST FULL SCAN            | POS_PP_INVENTORY_PK         | 18192 |   923K|       | 11579   (6)|     1 |    33 |
|* 52 |         VIEW                              | RPT_PRODUCT_VALUE_LEVEL     |     1 |    23 |       |     7   (0)|       |       |
|  53 |          UNION ALL PUSHED PREDICATE       |                             |       |       |       |            |       |       |
|* 54 |           FILTER                          |                             |       |       |       |            |       |       |
|  55 |            NESTED LOOPS OUTER             |                             |     1 |    46 |       |     2   (0)|       |       |
|  56 |             NESTED LOOPS OUTER            |                             |     1 |    42 |       |     2   (0)|       |       |
|  57 |              TABLE ACCESS BY INDEX ROWID  | SMA_STMODEL                 |     1 |    28 |       |     2   (0)|       |       |
|* 58 |               INDEX UNIQUE SCAN           | PK_SMA_STMODEL              |     1 |       |       |     1   (0)|       |       |
|* 59 |              INDEX UNIQUE SCAN            | PK_SEAEGO_PRODUCT_HIERARCHY |   298 |  4172 |       |     0   (0)|       |       |
|* 60 |             INDEX UNIQUE SCAN             | PK_S_CAP_GROUP              |     2 |     8 |       |     0   (0)|       |       |
|  61 |           NESTED LOOPS OUTER              |                             |     1 |    77 |       |     3   (0)|       |       |
|  62 |            NESTED LOOPS OUTER             |                             |     1 |    73 |       |     3   (0)|       |       |
|  63 |             NESTED LOOPS OUTER            |                             |     1 |    69 |       |     3   (0)|       |       |
|  64 |              NESTED LOOPS OUTER           |                             |     1 |    65 |       |     3   (0)|       |       |
|  65 |               NESTED LOOPS                |                             |     1 |    51 |       |     3   (0)|       |       |
|* 66 |                TABLE ACCESS BY INDEX ROWID| SMA_PRODUCTMODEL            |     1 |    23 |       |     2   (0)|       |       |
|* 67 |                 INDEX UNIQUE SCAN         | PK_SMA_PRODUCTMODEL         |     1 |       |       |     1   (0)|       |       |
|  68 |                TABLE ACCESS BY INDEX ROWID| SMA_STMODEL                 |     1 |    28 |       |     1   (0)|       |       |
|* 69 |                 INDEX UNIQUE SCAN         | PK_SMA_STMODEL              |     1 |       |       |     0   (0)|       |       |
|* 70 |               INDEX UNIQUE SCAN           | PK_SEAEGO_PRODUCT_HIERARCHY |   298 |  4172 |       |     0   (0)|       |       |
|* 71 |              INDEX UNIQUE SCAN            | PK_S_FAMILY                 |  1366 |  5464 |       |     0   (0)|       |       |
|* 72 |             INDEX UNIQUE SCAN             | PK_S_CAP_GROUP              |     2 |     8 |       |     0   (0)|       |       |
|* 73 |            INDEX UNIQUE SCAN              | PK_F_MODPRODMGR             |    22 |    88 |       |     0   (0)|       |       |
|* 74 |           MAT_VIEW ACCESS BY INDEX ROWID  | RPT_PROD_MV                 |     1 |    24 |       |     2   (0)|       |       |
|* 75 |            INDEX UNIQUE SCAN              | IDX_RPT_PROD_MV_PROD_NO     |     1 |       |       |     1   (0)|       |       |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("C1"."CPT_PRICE_CODE"="S1"."CPT_PRICE_CODE")
   6 - filter("S1"."PRICE_PROTECTABLE"='Y')
   7 - access("G"."GROUP_DIST_NUMBER"="M"."DIST_NUMBER")
   8 - filter("G"."END_DATE">TO_DATE(:B2,'DD-MON-YYYY'))
   9 - access("L"."PROD_VALUE"="P"."PROD_VALUE" AND "L"."PROD_LEVEL"="P"."PROD_LEVEL")
  12 - access("ST"."MARKETING_NAME"="PH"."MARKETING_NAME"(+))
  14 - access("ST"."MOD_CAPACITY_FORMATTED"="SCG"."MOD_CAPACITY_FORMATTED"(+))
  17 - access("SF"."FAMILY"(+)=SUBSTRB("PM"."MODEL",1,3))
  19 - access("PM"."DESIGN_APPLICATION"="DA"."DESIGN_APPLICATION"(+))
  21 - access("PM"."MOD_DESC"="ST"."MOD_DESC")
  23 - access("ST"."MARKETING_NAME"="PH"."MARKETING_NAME"(+))
  25 - access("ST"."MOD_CAPACITY_FORMATTED"="SCG"."MOD_CAPACITY_FORMATTED"(+))
  29 - access("C"."CPT_PRICE_CODE"="P"."CUST_PRICE_TYPE")
  30 - access("P"."STDP_ID"=TO_NUMBER(:B1))
  31 - access("M"."SG_BILL_TO_CUST_NO"="A"."BILL_TO_CUST_NO")
  32 - filter("M"."SG_BILL_TO_CUST_NO" IS NOT NULL AND ("M"."INDIRECT_DISTI"='Y' OR "M"."CATEGORY_TYPE" LIKE 'DIRECT%') AND 
              "M"."ACTIVE_IND"<>'N' AND TRUNC(INTERNAL_FUNCTION("M"."ARCHIVE_DATE"))>TRUNC(SYSDATE@!))
  33 - access("A"."CUST_PRICE_TYPE"="C"."CPT_BILL_CODE")
  34 - filter("A"."START_DATE"<=TO_DATE(:B2,'DD-MON-YYYY') AND "A"."END_DATE">=TO_DATE(:B2,'DD-MON-YYYY'))
  35 - access("C"."CPT_PRICE_CODE"="C1"."CPT_PRICE_CODE")
  38 - access("P1"."STDP_ID"=TO_NUMBER(:B1))
  40 - filter("C1"."CPT_PRICE_CAT" LIKE 'NB%')
  42 - filter(TO_DATE(:B2,'DD-MON-YYYY')-6<=TO_DATE(:B2,'DD-MON-YYYY'))
  44 - access("INVT"."DIST_NUMBER"="G"."GROUP_DIST_NUMBER" AND "INVT"."PPCF_SHOW_DATE"="P"."BEGIN_DATE")
  45 - access("G"."DIST_NUMBER"="P"."DIST_NUMBER")
  46 - access("G"."GROUP_DIST_NUMBER"="M"."DIST_NUMBER")
  47 - filter(("M"."INDIRECT_DISTI"='Y' OR "M"."CATEGORY_TYPE" LIKE 'DIRECT%') AND "M"."ACTIVE_IND"<>'N' AND 
              TRUNC(INTERNAL_FUNCTION("M"."ARCHIVE_DATE"))>TRUNC(SYSDATE@!))
  48 - filter("G"."END_DATE">TO_DATE(:B2,'DD-MON-YYYY'))
  49 - filter("P"."PRICE_TYPE"='I' AND "P"."POS_PP_FLAG"='Y' AND ("P"."POS_PROCESSED_FLAG"<>'C' OR "P"."POS_PROCESSED_FLAG" 
              IS NULL) AND "P"."BEGIN_DATE"<=TO_DATE(:B2,'DD-MON-YYYY') AND "P"."BEGIN_DATE">=TO_DATE(:B2,'DD-MON-YYYY')-6)
  51 - filter("INVT"."PPCF_SHOW_DATE"<=TO_DATE(:B2,'DD-MON-YYYY') AND "INVT"."PPCF_SHOW_DATE">=TO_DATE(:B2,'DD-MON-YYYY')-6)
  52 - filter("L"."PROD_LEVEL"="P"."PROD_LEVEL")
  54 - filter("P"."PROD_VALUE"="INVT"."STMODEL")
  58 - access("ST"."MOD_DESC"="P"."PROD_VALUE")
  59 - access("ST"."MARKETING_NAME"="PH"."MARKETING_NAME"(+))
  60 - access("ST"."MOD_CAPACITY_FORMATTED"="SCG"."MOD_CAPACITY_FORMATTED"(+))
  66 - filter("PM"."MOD_DESC"="INVT"."STMODEL")
  67 - access("PM"."MODEL"="P"."PROD_VALUE")
  69 - access("ST"."MOD_DESC"="INVT"."STMODEL")
  70 - access("ST"."MARKETING_NAME"="PH"."MARKETING_NAME"(+))
  71 - access("SF"."FAMILY"(+)=SUBSTRB("PM"."MODEL",1,3))
  72 - access("ST"."MOD_CAPACITY_FORMATTED"="SCG"."MOD_CAPACITY_FORMATTED"(+))
  73 - access("PM"."DESIGN_APPLICATION"="DA"."DESIGN_APPLICATION"(+))
  74 - filter("MOD_DESC"="INVT"."STMODEL")
  75 - access("PROD_NO"="P"."PROD_VALUE")

Note
-----
   - 'PLAN_TABLE' is old version

and the statistic of rows count for table

    TABLE_Name              NUM_ROWS
    -----------             ---------
    POS_DISTI_GROUP          2009 
    POS_CUST_XREF            2801
    S_CPT_SEQ_NO             1301
    PP_STD_PRICE             2658450
    S_CPT_AUDIT            27200
    PP_DEBIT_AUTHORIZATION   1199420
    POS_PP_INVENTORY     7276850
    PP_STD_PRICE             2658450
    S_CPT_PRICE_CODE     192
    S_CPT_SEQ_NO             1301
    SMA_STMODEL          13965
    RPT_PROD_MV          109980

create table statement. CLICK HERE

Table Description. CLICK HERE

Retrieve EXPLAIN PLAN with rerun gather_plan_statistics as @jonearles suggest. CLICK HERE

*link from google doc

like image 592
user2982040 Avatar asked Oct 02 '22 11:10

user2982040


1 Answers

The problem

Aggregation is happening too late in the execution plan. Plan IDs 4 and 5 generate 13 billion rows and account for 95% of the execution time. Oracle incorrectly believes the number of rows will be smaller, and that earlier aggregations should be merged together.

Plan IDs 6 through 40 represent the first half of the inline view, before the UNION. That part of the query has two DISTINCTs, yet there are no types of aggregation operations for that part of the execution plan. Oracle incorrectly thinks it's better to join everything first and perform one SORT UNIQUE, instead of performing multiple SORT UNIQUE or HASH GROUP BY and combining the results.

Reproduce the problem

Fully reproducing this problem without a full export is almost impossible. Even though it's only a moderately complicated SQL statement there are thousands of variables involved. The code below only demonstrates how Oracle can incorrectly merge aggregation operations.

First, create two simple tables. Each has 100K rows. TEST1 has numbers from 1 to 100000. TEST2 contains 100000 rows, but only one distinct number. To artificially make a bad plan, statistics are gathered too soon on TEST2. The optimizer thinks that TEST2 only has one row but it really has 100000.

drop table test1 purge;
drop table test2 purge;

create table test1(a number);
create table test2(a number);

insert into test1 select level from dual connect by level <= 100000;
insert into test2 values (1);
commit;

begin
    dbms_stats.gather_table_stats(user, 'test1');
    dbms_stats.gather_table_stats(user, 'test2');
end;
/

insert into test2 select 1 from dual connect by level <= 100000;
commit;

The sample query below retrieves all distinct TEST1.A where A is also in distinct TEST2.A.

By default, using the artificially bad statistics, Oracle joins the tables first and then performs the HASH GROUP BY and HASH UNIQUE. This is a bad plan, it joins all 100K values from TEST2. It would be better to perform the HASH GROUP BY first and then only join 1 row from that table.

explain plan for
select distinct a from test1 where a in (select a from test2 group by a);

select * from table(dbms_xplan.display(format => 'outline'));

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |     8 |    79   (2)| 00:00:01 |
|   1 |  HASH UNIQUE         |       |     1 |     8 |    79   (2)| 00:00:01 |
|   2 |   HASH GROUP BY      |       |     1 |     8 |    79   (2)| 00:00:01 |
|*  3 |    HASH JOIN         |       |     1 |     8 |    79   (2)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| TEST2 |     1 |     3 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| TEST1 |   100K|   488K|    76   (2)| 00:00:01 |
------------------------------------------------------------------------------

Potential Solution #1: Hints

Unfortunately there are no official hints to control when and where sorting and grouping happen. By playing around with the outline format option I was able to find a few potentially helpful hints: USE_HASH_AGGREGATION, OUTLINE_LEAF, and PLACE_DISTINCT. (These hints are really tricky - the reason I used a group by instead of another distinct in my sample is because I had so much trouble with the PLACE_DISTINCT hint!)

Using these undocumented hints can build a better plan. The results from TEST2 go through a HASH GROUP BY right away, as they should. This is similar to the plan that would be produced if the statistics were accurate.

explain plan for
select /*+ USE_HASH_AGGREGATION(@"SEL$5DA710D3") OUTLINE_LEAF(@"SEL$683B0107") */
distinct a from test1 where a in (select a from test2 group by a);

select * from table(dbms_xplan.display(format => 'outline alias'));

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |     8 |    79   (2)| 00:00:01 |
|   1 |  HASH UNIQUE          |          |     1 |     8 |    79   (2)| 00:00:01 |
|*  2 |   HASH JOIN SEMI      |          |     1 |     8 |    79   (2)| 00:00:01 |
|   3 |    VIEW               | VW_NSO_1 |     1 |     3 |     3   (0)| 00:00:01 |
|   4 |     HASH GROUP BY     |          |     1 |     3 |     3   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| TEST2    |     1 |     3 |     3   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL  | TEST1    |   100K|   488K|    76   (2)| 00:00:01 |
----------------------------------------------------------------------------------

Potential Solution #2: Force a plan with ROWNUM.

A much simpler and safer version of the above is to use ROWNUM. ROWNUM is a pseudocolumn that represents the order of the rows returned. When there is a ROWNUM Oracle cannot move the distinct and group by because it would affect that order.

Unfortunately, this trick requires extra code and generates extra steps in the plan. Those extra steps are mostly just passing data through and shouldn't slow things done much.

explain plan for
select distinct a from test1 where a in
(
    --Extra level only because we only want to project one column.
    --It's syntactically required, but the optimizer throws out this inline view.
    select a
    from
    (
        --The ROWNUM forces everything in this inline view to happen separately.
        select a, rownum
        from
        (
            select a from test2 group by a
        )
    )
);

select * from table(dbms_xplan.display(format => 'outline alias'));

---------------------------------------------------------------------------------
| Id  | Operation               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |       |     1 |     8 |    79   (2)| 00:00:01 |
|   1 |  HASH UNIQUE            |       |     1 |     8 |    79   (2)| 00:00:01 |
|*  2 |   HASH JOIN SEMI        |       |     1 |     8 |    79   (2)| 00:00:01 |
|   3 |    VIEW                 |       |     1 |     3 |     3   (0)| 00:00:01 |
|   4 |     COUNT               |       |       |       |            |          |
|   5 |      VIEW               |       |     1 |     3 |     3   (0)| 00:00:01 |
|   6 |       HASH GROUP BY     |       |     1 |     3 |     3   (0)| 00:00:01 |
|   7 |        TABLE ACCESS FULL| TEST2 |     1 |     3 |     3   (0)| 00:00:01 |
|   8 |    TABLE ACCESS FULL    | TEST1 |   100K|   488K|    76   (2)| 00:00:01 |
---------------------------------------------------------------------------------

Potential Solution #3: Fix cardinality estimates and hope for the best.

If the estimated number of rows is accurate the plan is almost always good. When the row estimates are far off, find the first part of the execution plan where the cardinality is wrong. For this plan, it appears to be plan ID 36. E-Rows and A-Rows are off by an order of magnitude:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
|* 36 |             TABLE ACCESS FULL             | POS_CUST_XREF               |      1 |     54 |    579 |00:00:00.01 |     131 |      0 |       |       |          |

Step 36 has a complex predicate that involves SYSDATE.

  36 - filter(("M"."SG_BILL_TO_CUST_NO" IS NOT NULL AND ("M"."INDIRECT_DISTI"='Y' OR "M"."CATEGORY_TYPE" LIKE 'DIRECT%') AND "M"."ACTIVE_IND"<>'N' AND
              TRUNC(INTERNAL_FUNCTION("M"."ARCHIVE_DATE"))>TRUNC(SYSDATE@!)))

Even with up-to-date statistics that condition is going to be difficult to predict. Dynamic sampling may help. Try re-running the query with a top-level hint like this:

SELECT /*+ dynamic_sampling(6) */ ...

Fixing those early discrepancies will usually fix other problems later in the plan. This example is only one possible source of cardinality mismatches. Other tricks may be necessary to improve other cardinality estimates. This can be a very difficult method but it can pay off in multiple ways.

Red Herrings

There are many potential improvements to any moderately complicated SQL statement. There are several good ideas in the comments and answers. But when tuning it is always imperative to focus on what is slowest, not what is easiest to fix. It sounds obvious, but it's a very easy trap to fall into. That's why I asked you to use /*+ gather_plan_statistics*/, and that's why my answer only focuses on the parts of the plan with a large actual time.

For example, in my earlier comment I suggested looking at the NESTED LOOPS where ROWS=1. Now that we have the actual time we know that suggestion is not helpful. (Although in general you should still be skeptical of a plan with large tables but ROWS=1.)

like image 117
Jon Heller Avatar answered Oct 13 '22 12:10

Jon Heller