Optimizing SELECT query performance

Tags:

I have a SELECT statement that runs really slow, it's holding back our night process.

The query is: (Please don't comment about the implicit join syntax, this is automatically generated by Informatica that runs this code) :

SELECT *
  FROM STG_DIM_CRM_CASES,V_CRM_CASE_ID_EXISTS_IN_DWH,stg_scd_customers_key
 WHERE STG_DIM_CRM_CASES.CRM_CASE_ID = V_CRM_CASE_ID_EXISTS_IN_DWH.CASE_ID(+)
   AND STG_DIM_CRM_CASES.account_number = stg_scd_customers_key.account_number(+)
   and STG_DIM_CRM_CASES.Case_Create_Date between  stg_scd_customers_key.start_date(+) and  stg_scd_customers_key.end_date(+)

edit: The actual query selects only account_number,start_date,end_date and one other column which is not indexed.

Tables info :

STG_DIM_CRM_CASES

Index - (Account_Number,Case_Create_Date)
size - 270k records.

stg_scd_customers_key

Index - Account_Number,Start_Date,End_Date
Partitioned - End_Date
Size - 500 million records.

V_CRM_CASE_ID_EXISTS_IN_DWH(View) -

select  t.case_id
from crm_ps_rc_case t, dim_crm_cases x
where t.case_id=x.crm_case_id;

dim_crm_cases -

Indexed - (crm_case_id)
Size - 100 million .

crm_ps_rc_case -

Size - 270k records

Edit - If it wasn't clear, the view returns 270k records .

The query without the join to stg_scd is taking seconds, looks like it is the part that causing the performance issues, the view runs in seconds too although it is being joined to a 100 Million records table. Right now the query is taking somewhere between 12 to 30 minutes, depends how busy our sources are.

Here is the EXECUTION PLAN :

6   |   0 | SELECT STATEMENT                |                             |  3278K|  1297M|   559K  (4)| 02:10:37 |       |       |        |      |            |
7   |   1 |  PX COORDINATOR                 |                             |       |       |            |          |       |       |        |      |            |
8   |   2 |   PX SEND QC (RANDOM)           | :TQ10003                    |  3278K|  1297M|   559K  (4)| 02:10:37 |       |       |  Q1,03 | P->S | QC (RAND)  |
9   |*  3 |    HASH JOIN OUTER              |                             |  3278K|  1297M|   559K  (4)| 02:10:37 |       |       |  Q1,03 | PCWP |            |
10  |   4 |     PX RECEIVE                  |                             | 29188 |    10M| 50662   (5)| 00:11:50 |       |       |  Q1,03 | PCWP |            |
11  |   5 |      PX SEND HASH               | :TQ10002                    | 29188 |    10M| 50662   (5)| 00:11:50 |       |       |  Q1,02 | P->P | HASH       |
12  |*  6 |       HASH JOIN RIGHT OUTER     |                             | 29188 |    10M| 50662   (5)| 00:11:50 |       |       |  Q1,02 | PCWP |            |
13  |   7 |        BUFFER SORT              |                             |       |       |            |          |       |       |  Q1,02 | PCWC |            |
14  |   8 |         PX RECEIVE              |                             | 29188 |   370K| 50575   (5)| 00:11:49 |       |       |  Q1,02 | PCWP |            |
15  |   9 |          PX SEND BROADCAST      | :TQ10000                    | 29188 |   370K| 50575   (5)| 00:11:49 |       |       |        | S->P | BROADCAST  |
16  |  10 |           VIEW                  | V_CRM_CASE_ID_EXISTS_IN_DWH | 29188 |   370K| 50575   (5)| 00:11:49 |       |       |        |      |            |
17  |* 11 |            HASH JOIN            |                             | 29188 |   399K| 50575   (5)| 00:11:49 |       |       |        |      |            |
18  |  12 |             TABLE ACCESS FULL   | CRM_PS_RC_CASE              | 29188 |   199K|   570   (1)| 00:00:08 |       |       |        |      |            |
19  |  13 |             INDEX FAST FULL SCAN| DIM_CRM_CASES$1PK           |   103M|   692M| 48894   (3)| 00:11:25 |       |       |        |      |            |
20  |  14 |        PX BLOCK ITERATOR        |                             | 29188 |    10M|    87   (2)| 00:00:02 |       |       |  Q1,02 | PCWC |            |
21  |  15 |         TABLE ACCESS FULL       | STG_DIM_CRM_CASES           | 29188 |    10M|    87   (2)| 00:00:02 |       |       |  Q1,02 | PCWP |            |
22  |  16 |     BUFFER SORT                 |                             |       |       |            |          |       |       |  Q1,03 | PCWC |            |
23  |  17 |      PX RECEIVE                 |                             |   515M|    14G|   507K  (3)| 01:58:28 |       |       |  Q1,03 | PCWP |            |
24  |  18 |       PX SEND HASH              | :TQ10001                    |   515M|    14G|   507K  (3)| 01:58:28 |       |       |        | S->P | HASH       |
25  |  19 |        PARTITION RANGE ALL      |                             |   515M|    14G|   507K  (3)| 01:58:28 |     1 |  2982 |        |      |            |
26  |  20 |         TABLE ACCESS FULL       | STG_SCD_CUSTOMERS_KEY       |   515M|    14G|   507K  (3)| 01:58:28 |     1 |  2982 |        |      |            |
27  ------------------------------------------------------------------------------------------------------------------------------------------------------------
28   
29  Predicate Information (identified by operation id):
30  ---------------------------------------------------
31   
32     3 - access("STG_DIM_CRM_CASES"."ACCOUNT_NUMBER"="STG_SCD_CUSTOMERS_KEY"."ACCOUNT_NUMBER"(+))
33         filter("STG_DIM_CRM_CASES"."CASE_CREATE_DATE">="STG_SCD_CUSTOMERS_KEY"."START_DATE"(+) AND 
34                "STG_DIM_CRM_CASES"."CASE_CREATE_DATE"<="STG_SCD_CUSTOMERS_KEY"."END_DATE"(+))
35     6 - access("STG_DIM_CRM_CASES"."CRM_CASE_ID"="V_CRM_CASE_ID_EXISTS_IN_DWH"."CASE_ID"(+))
36    11 - access("T"."CASE_ID"="X"."CRM_CASE_ID")

Notes: Adding indexes may be an issue, depends on the index. This is not the only place this tables are being used, so indexes may interfere with other commands(Inserts mostly) on these tables.

I've also tried adding a filter on stg_scd and excluding all the dates smaller than the minimum date in Table_Cases, but that didn't help because it filtered only 1 year of records.

Thanks in advance.

962

asked Jul 27 '16 12:07

sagi

2 Answers

What I believe to be happening is the engine is having to resolve the 100m+ records from view join to 500m records BEFORE it applies limiting criteria (thus it creates a cross join and even if it can use indexes that's a lot of records to generate then parse. So even though you wrote it as an outer join, the engine isn't able to processes it that way (I don't know why)

So at a minimum 100m*500m = 50,000m that's a lot of data to generate and then parse/limit.

By eliminating the view, the engine may be better able to optimize and use the indexes thus eliminating the need for the 50,000m record join.

Areas where I would focus my time in troubleshooting:

Eliminate the view just to remove it as a potential overhead issue.
Recognize no tie between stg_scd_customers_key and V_CRM_CASE_ID_EXISTS_IN_DWH exists. This means the engine may be doing a cross join BEFORE the results of STG_DIM_CRM_CASES to stg_scd_customers_key have been resolved.

CONSIDER eliminating the view, or using an inline view

Eliminating the view:

SELECT *
  FROM STG_DIM_CRM_CASES 
      ,crm_ps_rc_case t
      ,dim_crm_cases x 
      ,stg_scd_customers_key
 WHERE t.case_id=x.crm_case_id
   AND STG_DIM_CRM_CASES.CRM_CASE_ID = t.CASE_ID(+)
   AND STG_DIM_CRM_CASES.account_number = stg_scd_customers_key.account_number(+)
   AND STG_DIM_CRM_CASES.Case_Create_Date 
       between  stg_scd_customers_key.start_date(+) and stg_scd_customers_key.end_date(+)

using an inline view:

SELECT *
  FROM STG_DIM_CRM_CASES 
  (select  t.case_id
   from crm_ps_rc_case t, dim_crm_cases x
   where t.case_id=x.crm_case_id) V_CRM_CASE_ID_EXISTS_IN_DWH
      ,stg_scd_customers_key
 WHERE STG_DIM_CRM_CASES.CRM_CASE_ID = V_CRM_CASE_ID_EXISTS_IN_DWH.CASE_ID(+)
   AND STG_DIM_CRM_CASES.account_number = stg_scd_customers_key.account_number(+)
   AND STG_DIM_CRM_CASES.Case_Create_Date 
       between  stg_scd_customers_key.start_date(+) and stg_scd_customers_key.end_date(+)

As to why: - http://www.dba-oracle.com/art_hints_views.htm

While order of the where clause SHOULDN'T matter consider: On the off chase the engine is executing in the order listed, limiting the 500m down and then adding the supplemental data from the view would logically be faster.

SELECT *
  FROM STG_DIM_CRM_CASES,stg_scd_customers_key,V_CRM_CASE_ID_EXISTS_IN_DWH
 WHERE STG_DIM_CRM_CASES.account_number = stg_scd_customers_key.account_number(+)
   and STG_DIM_CRM_CASES.Case_Create_Date between  stg_scd_customers_key.start_date(+) and  stg_scd_customers_key.end_date(+)
   and STG_DIM_CRM_CASES.CRM_CASE_ID = V_CRM_CASE_ID_EXISTS_IN_DWH.CASE_ID(+)

answered Oct 16 '22 22:10

xQbert

The problem is in scanning all partitions:

18 | PX SEND HASH | :TQ10001 | 515M| 14G| 507K (3)| 01:58:28 | | | | S->P | HASH | 25 | 19 | PARTITION RANGE ALL |
| 515M| 14G| 507K (3)| 01:58:28 | 1 | 2982 | |
| | 26 | 20 | TABLE ACCESS FULL | STG_SCD_CUSTOMERS_KEY | 515M| 14G|

It happens because you are using left join to this table. Can you select 1 partition using bind variable? What is partition key? I don't see hint for parallel but according to you plan it uses parallel. Is there parallel degree on any object level? Can you remove parallel and post explain plan without parallel please?

answered Oct 16 '22 22:10

Rusty

Related questions
                            
                                Select numbers with more than 4 decimal places
                            
                                Oracle: how to get percent of total by a query?
                            
                                Concatenate a Varchar and int [duplicate]
                            
                                LINQ to Entities - Where IN clause in query [duplicate]
                            
                                How do I generate CRUD stored procedures from a table in SQL Server Management Studio
                            
                                UPDATE syntax in SQLite
                            
                                Laravel Eloquent, group by month/year
                            
                                Sql select rows containing part of string
                            
                                Oracle Convert Seconds to Hours:Minutes:Seconds
                            
                                MySQL subquery returns more than one row
                            
                                SQL Subtract exactly a year
                            
                                Find overlapping date ranges in PostgreSQL
                            
                                Quickest/Easiest way to use Search/Replace through all stored procedures
                            
                                How long would it take a non-programmer to learn C#, the .NET Framework, and SQL? [closed]
                            
                                How to parse / tokenize an SQL statement in Node.js [closed]
                            
                                Hibernate @SQLDelete sql not adding schema
                            
                                How do I get a SUM to calculate properly with a join?
                            
                                Optimizing Slick generated SQL query
                            
                                Data source name not found, and no default driver specified

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Optimizing SELECT query performance

Tags:

performance

sql

optimization

oracle

view

sagi

People also ask

2 Answers

xQbert

Rusty

Recent Activity

Donate For Us