Oracle 11g: Index not used in "select distinct"-query

Tags:

database

indexing

oracle

My question concerns Oracle 11g and the use of indexes in SQL queries.

In my database, there is a table that is structured as follows:

Table tab (
  rowid NUMBER(11),
  unique_id_string VARCHAR2(2000),
  year NUMBER(4),
  dynamic_col_1 NUMBER(11),
  dynamic_col_1_text NVARCHAR2(2000)
 ) TABLESPACE tabspace_data;

I have created two indexes:

CREATE INDEX Index_dyn_col1 ON tab (dynamic_col_1, dynamic_col_1_text) TABLESPACE tabspace_index;
CREATE INDEX Index_unique_id_year ON tab (unique_id_string, year) TABLESPACE tabspace_index;

The table contains around 1 to 2 million records. I extract the data from it by executing the following SQL command:

SELECT distinct
 "sub_select"."dynamic_col_1" "AS_dynamic_col_1","sub_select"."dynamic_col_1_text" "AS_dynamic_col_1_text"
FROM 
(
    SELECT "tab".*  FROM "tab"
    where "tab".year = 2011
) "sub_select"

Unfortunately, the query takes around 1 hour to execute, even though I created both indexes described above. The explain plan shows that Oracle uses a "Table Full Access", i.e. a full table scan. Why is the index not used?

As an experiment, I tested the following SQL command:

SELECT DISTINCT
 "dynamic_col_1" "AS_dynamic_col_1", "dynamic_col_1_text" "AS_dynamic_col_1_text"
 FROM "tab"

Even in this case, the index is not used and a full table scan is performed.

In my real database, the table contains more indexed columns like "dynamic_col_1" and "dynamic_col_1_text". The whole index file has a size of about 50 GB.

Some more information:

The database is Oracle 11g installed on my local computer.
I use Windows 7 Enterprise 64bit.
The whole index is split over 3 dbf files with about 50GB size.

I would be really glad if someone could tell me how to make Oracle use the index in the first query. Because the first query is issued by another program to extract the data from the database, it can hardly be changed. So it would be better to tweak the table instead.

Thanks in advance.

[01.10.2011: UPDATE]

I think I've found the solution to the problem. Both columns dynamic_col_1 and dynamic_col_1_text are nullable. After altering the table to prohibit NULL values in both columns and adding a new index solely for the column year, Oracle performs a Fast Index Scan. The query now takes about 5 seconds to execute instead of 1 hour.
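
For readers who want to try the same change, a minimal sketch of the table alteration described in the update might look like the following. It assumes the table and column names from the question, created with unquoted identifiers, and that no existing row holds a NULL in either column (otherwise the ALTER fails with ORA-02296). The likely reason the change matters: Oracle B-tree indexes do not store an entry for a row whose indexed columns are all NULL, so an index on nullable columns alone is not guaranteed to contain every row.

```sql
-- Check first that no NULLs exist; the ALTER below fails if any do.
SELECT COUNT(*) AS rows_with_nulls
  FROM tab
 WHERE dynamic_col_1 IS NULL
    OR dynamic_col_1_text IS NULL;

-- Declare both columns NOT NULL so the optimizer knows that every row
-- has an entry in Index_dyn_col1.
ALTER TABLE tab MODIFY (
  dynamic_col_1      NOT NULL,
  dynamic_col_1_text NOT NULL
);
```

The single-column index on year mentioned in the update is the same statement that the first answer below suggests.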

asked Sep 24 '11 13:09

oracle_user54

3 Answers

Your index should be:

CREATE INDEX Index_year 
ON tab (year) 
TABLESPACE tabspace_index;

Also, your query could just be:

SELECT DISTINCT
       dynamic_col_1 "AS_dynamic_col_1",
       dynamic_col_1_text "AS_dynamic_col_1_text"
  FROM tab
 WHERE year = 2011;

If the index is being created solely for this query, though, you could also include the two fetched columns in it. The optimiser would then not have to visit the table at all; it could answer the query directly from the index, making the query more efficient still.
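
A sketch of such a covering index, assuming the column names from the question; the index name Index_year_dyn_col1 is made up for illustration:

```sql
-- Hypothetical index name. year leads so the WHERE clause can use it;
-- the two selected columns follow so the table need not be read.
CREATE INDEX Index_year_dyn_col1
  ON tab (year, dynamic_col_1, dynamic_col_1_text)
  TABLESPACE tabspace_index;
```

Because the predicate year = 2011 never matches a NULL year, every qualifying row has an entry in this index even if the other two columns are nullable. Whether the optimiser actually picks it still depends on its cost estimate.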

Hope it helps...

answered Sep 22 '22 06:09

Ollie

I don't have an Oracle instance on hand so this is somewhat guesswork, but my inclination is to say it's because you have the compound index in the wrong order. If you had year as the first column in the index it might use it.
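
To illustrate the ordering point, a compound index with year as the leading column could be sketched like this. The index name is hypothetical, and the existing Index_unique_id_year is left in place on the assumption that other queries may still need it:

```sql
-- Hypothetical index name. With year first, a predicate such as
-- year = 2011 can drive an ordinary index range scan.
CREATE INDEX Index_year_unique_id
  ON tab (year, unique_id_string)
  TABLESPACE tabspace_index;
```

With year in second position, Oracle can at best fall back on an index skip scan, which tends to pay off only when the leading column has few distinct values.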

answered Sep 20 '22 06:09

Dan

Are you sure that index access would be faster than a full table scan? As a very rough estimate, a full table scan reads data about 20 times faster than access through an index, because it uses large multiblock reads while index access fetches one block at a time. So if more than about 5% of the rows in tab are from 2011, it's not surprising that Oracle chooses a full table scan. And as @Dan and @Ollie mentioned, having year as the second column makes the index even slower to use.
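
One way to check which side of that rough 5% threshold the data falls on is to count rows per year. This is only a sketch, assuming the table and column names from the question:

```sql
-- Row count and share of the whole table for each year.
SELECT year,
       COUNT(*) AS row_count,
       ROUND(100 * RATIO_TO_REPORT(COUNT(*)) OVER (), 1) AS pct_of_table
  FROM tab
 GROUP BY year
 ORDER BY year;
```

If the 2011 row shows a large percentage, a full table scan may well be the cheaper plan for the original query.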

If the index really is faster, then the issue is probably bad statistics. There are hundreds of ways the statistics could be bad. Very briefly, here's what I'd look at first:

1. Run an explain plan with and without an index hint. Are the cardinalities off by 10x or more? Are the times off by 10x or more?
2. If the cardinality is off, make sure there are up-to-date stats on the table and index and that you're using a reasonable ESTIMATE_PERCENT (DBMS_STATS.AUTO_SAMPLE_SIZE is almost always the best for 11g).
3. If the time is off, check your workload statistics.
4. Are you using parallelism? Oracle always assumes a near-linear improvement for parallelism, but on a desktop with one hard drive you probably won't see any improvement at all.
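
The first three checks could be sketched roughly as follows. This assumes the table was created with unquoted identifiers (so it is stored as TAB) and lives in the current schema; the query on sys.aux_stats$ additionally needs suitable privileges.

```sql
-- 1. Plan the optimizer picks on its own.
EXPLAIN PLAN FOR
SELECT DISTINCT dynamic_col_1, dynamic_col_1_text
  FROM tab
 WHERE year = 2011;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

--    Same query, asking the optimizer to use some index on tab.
EXPLAIN PLAN FOR
SELECT /*+ INDEX(tab) */ DISTINCT dynamic_col_1, dynamic_col_1_text
  FROM tab
 WHERE year = 2011;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- 2. Refresh statistics on the table and, via cascade, its indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'TAB',   -- assumption: unquoted table name
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE
  );
END;
/

-- 3. Inspect the workload (system) statistics the optimizer is using.
SELECT pname, pval1
  FROM sys.aux_stats$
 WHERE sname = 'SYSSTATS_MAIN';
```

Comparing the Rows and Time columns of the two plans against what the query really returns and how long it really runs shows whether the estimates are off by the order of magnitude mentioned above.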

Also, this isn't really relevant to your problem, but you may want to avoid using quoted identifiers. Once you use them you have to use them everywhere, and it generally makes your tables and queries painful to work with.
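
A small illustration of the difference, using throwaway table names that are not part of the question:

```sql
-- Unquoted: names are stored in upper case and matched case-insensitively.
CREATE TABLE demo_unquoted (year NUMBER(4));
SELECT year FROM demo_unquoted;
SELECT YEAR FROM Demo_Unquoted;   -- same table, same column

-- Quoted: names keep their exact case and must be quoted from then on.
CREATE TABLE "demo_quoted" ("year" NUMBER(4));
SELECT "year" FROM "demo_quoted";
-- SELECT year FROM demo_quoted;  -- would fail with ORA-00942 (table or view does not exist)

-- Clean up the demo tables.
DROP TABLE demo_unquoted;
DROP TABLE "demo_quoted";
```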

answered Sep 21 '22 06:09

Jon Heller
