Optimizing Oracle query by removing 'exists' and 'not exists'

Question

I recently moved a piece of code into production on a oracle database where one of the more experienced developer who reviewed it mentioned I had way too many exists and not exists statements and that there should be a way to remove them, but it had been too long since he had to use it and didn't remember much on how it worked. Currently, I'm going back and making the piece of code more maintainable as it is a piece likely to be changed multiple times in future years as business logic/requirements change, and I wanted to go ahead and optimize it while making it more maintainable.

I've tried looking it up, but all I can find is recommendations on replacing not in with not exists and to not return actual results.

As such, I'm wondering what can be done to optimize out exists/not exists or if there is a way to write exists/not exists so that oracle will optimize it internally (likely at a better degree than I can).

For example, how can the following be optimized?

UPDATE
    SCOTT.TABLE_N N
SET
    N.VALUE_1 = 'Data!'
WHERE
    N.VALUE_2 = 'Y'
    AND
    EXISTS
    (
        SELECT
            1
        FROM
            SCOTT.TABLE_Q Q
        WHERE
            N.ID = Q.N_ID
    )
    AND
    NOT EXISTS
    (
        SELECT
            1
        FROM
            SCOTT.TABLE_W W
        WHERE
            N.ID = W.N_ID
    )

Kirill Leontev · Accepted Answer

Your statement seems perfectly fine to me.

In any optimizing task, don't think patterns. Don't think like, "(not) exists is bad and slow, (not) in is super cool and fast".

Think like, how much work does database do on each step and how can you measure it?

A simple example:

-- NOT IN:

23:59:41 HR@sandbox> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.03
23:59:43 HR@sandbox> set autotrace traceonly explain statistics
23:59:49 HR@sandbox> select country_id from countries where country_id not in (select country_id from locations);

11 rows selected.

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 1748518851

------------------------------------------------------------------------------------------
| Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                 |     1 |     6 |     4   (0)| 00:00:01 |
|*  1 |  FILTER                |                 |       |       |            |          |
|   2 |   NESTED LOOPS ANTI SNA|                 |    11 |    66 |     4  (75)| 00:00:01 |
|   3 |    INDEX FULL SCAN     | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN    | LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS FULL    | LOCATIONS       |     1 |     3 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LOCATIONS" "LOCATIONS" WHERE
              "COUNTRY_ID" IS NULL))
   4 - access("COUNTRY_ID"="COUNTRY_ID")
   5 - filter("COUNTRY_ID" IS NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         11  consistent gets
          8  physical reads
          0  redo size
        446  bytes sent via SQL*Net to client
        363  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

-- NOT EXISTS

23:59:57 HR@sandbox> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.17
00:00:02 HR@sandbox> select country_id from countries c where not exists (select 1 from locations l where l.country_id = c.country_id );

11 rows selected.

Elapsed: 00:00:00.30

Execution Plan
----------------------------------------------------------
Plan hash value: 840074837

-------------------------------------------------------------------------------------
| Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                 |    11 |    66 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI|                 |    11 |    66 |     1   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN| LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("L"."COUNTRY_ID"="C"."COUNTRY_ID")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          2  physical reads
          0  redo size
        446  bytes sent via SQL*Net to client
        363  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

NOT IN in this example reads twice as much database blocks and performs more complicated filtering - ask yourself, why would you chose it over NOT EXISTS?

Optimizing Oracle query by removing 'exists' and 'not exists'

Tags:

sql

oracle

Lawtonfogle

1 Answers

Kirill Leontev

Recent Activity

Donate For Us

Optimizing Oracle query by removing 'exists' and 'not exists'

Tags:

sql

oracle

Lawtonfogle

1 Answers

Kirill Leontev

Related questions

Recent Activity

Donate For Us