I have a table (50M rows) which has indexes on column_a and column_b
when I do select count(*) from table where column_a in (list_a)
, I get in no time my results.
Same with select count(*) from table where column_b in (list_b)
.
But when I do
select count(*) from table where column_a in (list_a) or column_b in (list_b)
My queries become insanely slow and last half hour before outputting the right number... Am I doing something wrong? How can I optimize the actual behavior of this query?
Thanks!
Plan for query 1:
Plan hash value: 2471097773
-------------------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
| 2 | NESTED LOOPS | |
| 3 | SORT UNIQUE | |
| 4 | TABLE ACCESS FULL | LIST_A |
| 5 | BITMAP CONVERSION COUNT | |
| 6 | BITMAP INDEX SINGLE VALUE| MY_TABLE_IX02 |
-------------------------------------------------------------
Plan for query 2
Plan hash value: 1870911518
-------------------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
| 2 | NESTED LOOPS | |
| 3 | SORT UNIQUE | |
| 4 | TABLE ACCESS FULL | LIST_B |
| 5 | BITMAP CONVERSION COUNT | |
| 6 | BITMAP INDEX SINGLE VALUE| MY_TABLE_IX05 |
-------------------------------------------------------------
Plan for query 3:
Plan hash value: 1821967683
----------------------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
| 2 | FILTER | |
| 3 | VIEW | index$_join$_001 |
| 4 | HASH JOIN | |
| 5 | BITMAP CONVERSION TO ROWIDS| |
| 6 | BITMAP INDEX FULL SCAN | MY_TABLE_IX02 |
| 7 | BITMAP CONVERSION TO ROWIDS| |
| 8 | BITMAP INDEX FULL SCAN | MY_TABLE_IX05 |
| 9 | TABLE ACCESS FULL | LIST_A |
| 10 | TABLE ACCESS FULL | LIST_B |
----------------------------------------------------------------
In my experience, OR tends to introduce a negative impact on queries (like ignoring indices and triggering full table scans). Sometimes this isn't so bad, but I have had queries that went from lightening fast to taking minutes because of it.
One possible solution is to change the OR
into a UNION
or even a UNION ALL
. I have had success with this in the past to improve the performance of queries, but you will have to compare them to one another to see whether this will work for you.
You can try out the three options below and see if any one of them offers a significant improvement over the others.
Original query (edited to return rows since you mentioned returning data instead of doing a count):
select * from table where column_a in (list_a) or column_b in (list_b)
Query that avoids the OR
:
select * from table where column_a in (list_a)
UNION
select * from table where column_b in (list_b)
And since a UNION
triggers a DISTINCT
, this might be worth trying out as well:
select * from table where column_a in (list_a) and not column_b in (list_b)
UNION ALL
select * from table where column_b in (list_b) and not column_a in (list_a)
UNION ALL
select * from table where column_a in (list_a) and column_b in (list_b)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With