UPDATE table1
SET col1 = 'Y'
WHERE col2 in (select col2 from table2)
In the above query, imagine the inner query returns 10000 rows. Does this query with IN clause affect performance?
If so, what can be done for faster execution?
The Oracle IN condition is used to help reduce the need to use multiple OR conditions in a SELECT, INSERT, UPDATE, or DELETE statement.
The Oracle "with" clause will help performance in situations where your query contains several identical sub queries. Instead of re-computing the repeating sub queries, it will query or aggregate once, assign a name to the resulting data and refer to it.
The with clause, aka subquery factoring, allows you to tell us "hey, reuse this result over and over in the query". We can factor out a subquery that is used more then once and reuse it -- resulting in a perhaps "better" plan. It can also make the query overall "more readable".
In Oracle we can only put up to 1000 values into an IN clause.
if the subquery returns a large number of rows compared to the number of rows in TABLE1, the optimizer will likely produce a plan like this:
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 300K| 24M| | 1581 (1)| 00:0
| 1 | UPDATE | TABLE1 | | | | |
|* 2 | HASH JOIN SEMI | | 300K| 24M| 9384K| 1581 (1)| 00:0
| 3 | TABLE ACCESS FULL| TABLE1 | 300K| 5860K| | 355 (2)| 00:0
| 4 | TABLE ACCESS FULL| TABLE2 | 168K| 10M| | 144 (2)| 00:0
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COL2"="COL2")
It will scan both tables once and update only the rows in TABLE1 common to both tables. This is a highly efficient plan if you need to update lots of rows.
Sometimes the inner query will have few rows compared to the number of rows in TABLE1. If you have an index on TABLE1(col2)
, you could then obtain a plan similar to this one:
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 93 | 4557 | 247 (1)| 00:00:03 |
| 1 | UPDATE | TABLE1 | | | | |
| 2 | NESTED LOOPS | | 93 | 4557 | 247 (1)| 00:00:03 |
| 3 | SORT UNIQUE | | 51 | 1326 | 142 (0)| 00:00:02 |
| 4 | TABLE ACCESS FULL| TABLE2 | 51 | 1326 | 142 (0)| 00:00:02 |
|* 5 | INDEX RANGE SCAN | IDX1 | 2 | 46 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T1"."COL2"="T2"."COL2")
In that case Oracle will read the rows from TABLE2 and for each (unique) row, perform an index access on TABLE1.
Which access is faster depend upon the selectivity of the inner query and the clustering of the index on TABLE1 (are the rows with similar value of col2
in TABLE1 next to each other or randomly spread?). In any case, performance wise, if you need to perform this update this query is one of the fastest way to do it.
UPDATE table1 outer
SET col1 = 'Y'
WHERE EXISTS (select null
from table2
WHERE col2 = outer.col2)
This could be better
To get the idea which is better - look at the execution plan.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With