Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle uses or ignores indexed column depending on the format of to_date(literal)

I'm using an indexed column used as a filter by putting it 'between' two literal values. (The column is in the second position of the index and actually makes execution slower; I will deal with that later).

What's confusing me is that Oracle (11.2.0.3.0) uses or ignores said index depending on the format of the value and format strings supplied to to_date:

This ignores the index:

SQL> SELECT *
  2  FROM gprs_history_import  gh
  3  WHERE start_call_date_time BETWEEN
  4      to_date('20140610 000000','yyyymmdd hh24miss') AND
  5      to_date('20140610 235959','yyyymmdd hh24miss')
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 990804809

--------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |   350 |   219K|   242K  (1)| 00:56:42 |       |       |
|   1 |  PARTITION RANGE SINGLE|                     |   350 |   219K|   242K  (1)| 00:56:42 |    74 |    74 |
|   2 |   PARTITION LIST ALL   |                     |   350 |   219K|   242K  (1)| 00:56:42 |     1 |     3 |
|*  3 |    TABLE ACCESS FULL   | GPRS_HISTORY_IMPORT |   350 |   219K|   242K  (1)| 00:56:42 |   220 |   222 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

This one does use the index (notice the space after the date part in line 4):

SQL> SELECT *
  2  FROM gprs_history_import  gh
  3  WHERE start_call_date_time BETWEEN
  4      to_date('20140610 ','yyyymmdd ') AND
  5      to_date('20140610 235959','yyyymmdd hh24miss')
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 464458373

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                          |   350 |   219K|  2795K  (1)| 10:52:15 |       |       |
|*  1 |  FILTER                              |                          |       |       |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR           |                          |   350 |   219K|  2795K  (1)| 10:52:15 |   KEY |    74 |
|   3 |    PARTITION LIST ALL                |                          |   350 |   219K|  2795K  (1)| 10:52:15 |     1 |     3 |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| GPRS_HISTORY_IMPORT      |   350 |   219K|  2795K  (1)| 10:52:15 |   KEY |   222 |
|*  5 |      INDEX SKIP SCAN                 | GPRS_HISTORY_IMPORT_IDX1 |     1 |       |  2795K  (1)| 10:52:15 |   KEY |   222 |
---------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_DATE('20140610 ','yyyymmdd ')<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
   5 - access("START_CALL_DATE_TIME">=TO_DATE('20140610 ','yyyymmdd ') AND "START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10
              23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
       filter("START_CALL_DATE_TIME">=TO_DATE('20140610 ','yyyymmdd ') AND "START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10
              23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

(The filter in (1) seems a bit silly, as if Oracle didn't understand the expression)

Again, this one doesn't (I removed the trailing space):

SQL> SELECT *
  2  FROM gprs_history_import  gh
  3  WHERE start_call_date_time BETWEEN
  4      to_date('20140610','yyyymmdd') AND
  5      to_date('20140610 235959','yyyymmdd hh24miss')
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 990804809

--------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |   350 |   219K|   242K  (1)| 00:56:42 |       |       |
|   1 |  PARTITION RANGE SINGLE|                     |   350 |   219K|   242K  (1)| 00:56:42 |    74 |    74 |
|   2 |   PARTITION LIST ALL   |                     |   350 |   219K|   242K  (1)| 00:56:42 |     1 |     3 |
|*  3 |    TABLE ACCESS FULL   | GPRS_HISTORY_IMPORT |   350 |   219K|   242K  (1)| 00:56:42 |   220 |   222 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

Putting quotes around the space precludes the index from ever being used.

What gives?

like image 649
arielCo Avatar asked Jun 16 '14 23:06

arielCo


People also ask

What is the use of index in Oracle?

Indexes are used in Oracle to provide quick access to rows in a table. Indexes provide faster access to data for operations that return a small portion of a table's rows. Although Oracle allows an unlimited number of indexes on a table, the indexes only help if they are used to speed up queries.

What is index in Oracle and types of index with example?

An index is a schema object that contains an entry for each value that appears in the indexed column(s) of the table or cluster and provides direct, fast access to rows. Oracle Database supports several types of index: Normal indexes. (By default, Oracle Database creates B-tree indexes.)

How you can identify that index is not working properly on a table in Oracle?

In Oracle SQL Developer, when you have SQL in the worksheet, there is a button "Explain Plan", you can also hit F10. After you execute Explain plan, it will show in the bottom view of SQL Developer. There is a column "OBJECT_NAME", it will tell you what index is being used.

On which columns you should create indexes in Oracle?

In general, you should create an index on a column in any of the following situations: The column is queried frequently. A referential integrity constraint exists on the column. A UNIQUE key integrity constraint exists on the column.


1 Answers

Ok - I'll give it a try, this is mostly deduction from the availabe Information:

Why does Oracle choose a different execution plan?

It seems in your second query with the unusual Date-Format, the optimizer has no idea what the value of the resulting date is. You see the Filter Predicate:

1 - filter(TO_DATE('20140610 ','yyyymmdd ')<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

Which means the optimizer is not even sure that the first date is smaller than the second! That means the optimizer has no idea about the number of returned rows and will just use a generic plan without taking specific statistics into account. It would be the same if you had a user-defined function xyt() which would return a date for the range. The optimizer has no way to know what date-value will result - This means you get a generall all purpose plan, which should be pretty decent for any date-range specified.

In the first and third case, the optimizer seems to understand the date directly and can guess the number of rows which are in the date range by using statistics. So while the second Query was to the Optimizer like BETWEEN X AND 3 this Query is like BETWEEN 1 AND 3 So he optimizes the query plan for the predicted number of returned rows!

The Strange thing seems to be, that the query optimizer has such problems with a strange date format, could be filed as a bug/request for improvement...

But an important Point:

  1. A full table scan does not have to be a BAD plan... As well as using an index is not always faster!
  2. The cost in the query plan is in no way directly related to the actual execution time or performance - it is an internal measurement to compare different plans for the SAME QUERY (So you cannot compare the cost of different querys like your queries 1,2 and 3)

Basically if you return a high number of rows from a table a full table scan without index access will in many cases be much faster, especially when operating on certain partitions! - The Table scan will only access the pertition for the matching date range - so only for the date in question and returns all rows from this partition. This is much faster than queriyng the index for each single row and then extracting the row by index access... Try to profile the querys - the full table scan on partition should be 3 times as fast with much less IO

like image 116
Falco Avatar answered Nov 15 '22 11:11

Falco