I'm using an indexed column used as a filter by putting it 'between' two literal values. (The column is in the second position of the index and actually makes execution slower; I will deal with that later).
What's confusing me is that Oracle (11.2.0.3.0) uses or ignores said index depending on the format of the value and format strings supplied to to_date:
This ignores the index:
SQL> SELECT *
2 FROM gprs_history_import gh
3 WHERE start_call_date_time BETWEEN
4 to_date('20140610 000000','yyyymmdd hh24miss') AND
5 to_date('20140610 235959','yyyymmdd hh24miss')
6 /
Execution Plan
----------------------------------------------------------
Plan hash value: 990804809
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 350 | 219K| 242K (1)| 00:56:42 | | |
| 1 | PARTITION RANGE SINGLE| | 350 | 219K| 242K (1)| 00:56:42 | 74 | 74 |
| 2 | PARTITION LIST ALL | | 350 | 219K| 242K (1)| 00:56:42 | 1 | 3 |
|* 3 | TABLE ACCESS FULL | GPRS_HISTORY_IMPORT | 350 | 219K| 242K (1)| 00:56:42 | 220 | 222 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
This one does use the index (notice the space after the date part in line 4):
SQL> SELECT *
2 FROM gprs_history_import gh
3 WHERE start_call_date_time BETWEEN
4 to_date('20140610 ','yyyymmdd ') AND
5 to_date('20140610 235959','yyyymmdd hh24miss')
6 /
Execution Plan
----------------------------------------------------------
Plan hash value: 464458373
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 350 | 219K| 2795K (1)| 10:52:15 | | |
|* 1 | FILTER | | | | | | | |
| 2 | PARTITION RANGE ITERATOR | | 350 | 219K| 2795K (1)| 10:52:15 | KEY | 74 |
| 3 | PARTITION LIST ALL | | 350 | 219K| 2795K (1)| 10:52:15 | 1 | 3 |
| 4 | TABLE ACCESS BY LOCAL INDEX ROWID| GPRS_HISTORY_IMPORT | 350 | 219K| 2795K (1)| 10:52:15 | KEY | 222 |
|* 5 | INDEX SKIP SCAN | GPRS_HISTORY_IMPORT_IDX1 | 1 | | 2795K (1)| 10:52:15 | KEY | 222 |
---------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_DATE('20140610 ','yyyymmdd ')<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
5 - access("START_CALL_DATE_TIME">=TO_DATE('20140610 ','yyyymmdd ') AND "START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10
23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
filter("START_CALL_DATE_TIME">=TO_DATE('20140610 ','yyyymmdd ') AND "START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10
23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
(The filter in (1) seems a bit silly, as if Oracle didn't understand the expression)
Again, this one doesn't (I removed the trailing space):
SQL> SELECT *
2 FROM gprs_history_import gh
3 WHERE start_call_date_time BETWEEN
4 to_date('20140610','yyyymmdd') AND
5 to_date('20140610 235959','yyyymmdd hh24miss')
6 /
Execution Plan
----------------------------------------------------------
Plan hash value: 990804809
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 350 | 219K| 242K (1)| 00:56:42 | | |
| 1 | PARTITION RANGE SINGLE| | 350 | 219K| 242K (1)| 00:56:42 | 74 | 74 |
| 2 | PARTITION LIST ALL | | 350 | 219K| 242K (1)| 00:56:42 | 1 | 3 |
|* 3 | TABLE ACCESS FULL | GPRS_HISTORY_IMPORT | 350 | 219K| 242K (1)| 00:56:42 | 220 | 222 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
Putting quotes around the space precludes the index from ever being used.
What gives?
Indexes are used in Oracle to provide quick access to rows in a table. Indexes provide faster access to data for operations that return a small portion of a table's rows. Although Oracle allows an unlimited number of indexes on a table, the indexes only help if they are used to speed up queries.
An index is a schema object that contains an entry for each value that appears in the indexed column(s) of the table or cluster and provides direct, fast access to rows. Oracle Database supports several types of index: Normal indexes. (By default, Oracle Database creates B-tree indexes.)
In Oracle SQL Developer, when you have SQL in the worksheet, there is a button "Explain Plan", you can also hit F10. After you execute Explain plan, it will show in the bottom view of SQL Developer. There is a column "OBJECT_NAME", it will tell you what index is being used.
In general, you should create an index on a column in any of the following situations: The column is queried frequently. A referential integrity constraint exists on the column. A UNIQUE key integrity constraint exists on the column.
Ok - I'll give it a try, this is mostly deduction from the availabe Information:
Why does Oracle choose a different execution plan?
It seems in your second query with the unusual Date-Format, the optimizer has no idea what the value of the resulting date is. You see the Filter Predicate:
1 - filter(TO_DATE('20140610 ','yyyymmdd ')<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
Which means the optimizer is not even sure that the first date is smaller than the second! That means the optimizer has no idea about the number of returned rows and will just use a generic plan without taking specific statistics into account. It would be the same if you had a user-defined function xyt() which would return a date for the range. The optimizer has no way to know what date-value will result - This means you get a generall all purpose plan, which should be pretty decent for any date-range specified.
In the first and third case, the optimizer seems to understand the date directly and can guess the number of rows which are in the date range by using statistics. So while the second Query was to the Optimizer like BETWEEN X AND 3
this Query is like BETWEEN 1 AND 3
So he optimizes the query plan for the predicted number of returned rows!
The Strange thing seems to be, that the query optimizer has such problems with a strange date format, could be filed as a bug/request for improvement...
But an important Point:
Basically if you return a high number of rows from a table a full table scan without index access will in many cases be much faster, especially when operating on certain partitions! - The Table scan will only access the pertition for the matching date range - so only for the date in question and returns all rows from this partition. This is much faster than queriyng the index for each single row and then extracting the row by index access... Try to profile the querys - the full table scan on partition should be 3 times as fast with much less IO
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With