We have Oracle 10g and we need to query 1 table (no joins) and filter out rows where 1 of the columns is null. When we do this - WHERE OurColumn IS NOT NULL - we get a full table scan on a very large table - BAD BAD BAD. The column has an index on it but it gets ignored in this instance. Are there any solutions to this?
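In outline it looks something like this (table, column and index names changed for the example, not our real schema):

CREATE INDEX our_column_idx ON our_table (our_column);

EXPLAIN PLAN FOR
SELECT *
FROM   our_table
WHERE  our_column IS NOT NULL;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);   -- the plan shows TABLE ACCESS FULL, not the index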
Thanks
The performance difference between NOT NULL and nullable columns is negligible, and as per this article from 2016 (about SQL Server), performance shouldn't be the deciding factor when choosing NOT NULL vs NULL. Also note that even though the field defaults to 'N', a statement can still set it to NULL as long as NULLs are allowed.
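Just to illustrate that last point (made-up table and column names): a DEFAULT only fills in a value when the column is omitted; it does not stop an explicit NULL unless a NOT NULL constraint is declared.

-- Hypothetical example: DEFAULT 'N' without NOT NULL still admits NULLs
CREATE TABLE flag_demo (
  id       NUMBER PRIMARY KEY,
  flag_col CHAR(1) DEFAULT 'N'       -- nullable
);

INSERT INTO flag_demo (id) VALUES (1);              -- flag_col gets 'N' from the default
UPDATE flag_demo SET flag_col = NULL WHERE id = 1;  -- accepted, because NULLs are allowed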
It's indeed a simple query that is performed using a Full Table Scan operation (Operation Id 4, the TABLE ACCESS FULL in the execution plan). This FTS cannot be removed merely by having a simple index on the CALLE column.
The optimizer thinks that the full table scan will be better.

If there are just a few NULL rows, the optimizer is right: almost every row then satisfies col1 IS NOT NULL, so reading the whole table is the cheaper plan.

If you are absolutely sure that the index access will be faster (that is, more than 75% of the rows have col1 IS NULL), then hint your query:
SELECT /*+ INDEX (t index_name_on_col1) */
*
FROM mytable t
WHERE col1 IS NOT NULL
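To verify that the hint is actually taken, check the plan (the index name below is just the placeholder used above):

EXPLAIN PLAN FOR
SELECT /*+ INDEX (t index_name_on_col1) */ *
FROM   mytable t
WHERE  col1 IS NOT NULL;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
-- if the hint is honoured you should see an index access plus
-- TABLE ACCESS BY INDEX ROWID instead of TABLE ACCESS FULL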
Why 75%? Because using an INDEX SCAN to retrieve values not covered by the index implies a hidden join on ROWID, which costs about 4 times as much as a table scan. If the index range includes more than 25% of the rows, the table scan is usually faster.

As mentioned by Tony Andrews, the clustering factor is a more accurate way to measure this, but 25% is still a good rule of thumb.
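To see which side of that rule you are on before adding the hint, the optimizer statistics already record how many values are NULL (placeholder names from the example above; assumes statistics are reasonably fresh):

SELECT t.num_rows,
       c.num_nulls,
       ROUND(100 * c.num_nulls / NULLIF(t.num_rows, 0), 1) AS pct_null
FROM   user_tables      t
JOIN   user_tab_columns c ON c.table_name = t.table_name
WHERE  t.table_name  = 'MYTABLE'
AND    c.column_name = 'COL1';
-- a pct_null well above 75 suggests the index hint is worth trying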
The optimiser will make its decision based on the relative costs of the full table scan and of using the index. This mainly comes down to how many blocks will have to be read to satisfy the query. The 25%/75% rule of thumb mentioned in another answer is simplistic: in some cases a full table scan will make sense even to get 1% of the rows - i.e. if those rows happen to be spread around many blocks.
For example, consider this table:
SQL> create table t1 as select object_id, object_name from all_objects;
Table created.
SQL> alter table t1 modify object_id null;
Table altered.
SQL> update t1 set object_id = null
2 where mod(object_id,100) != 0
3 /
84558 rows updated.
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> select count(*) from t1 where object_id is not null;
COUNT(*)
----------
861
As you can see, only approximately 1% of the rows in T1 have a non-null object_id. But due to the way I built the table, these 861 rows will be spread more or less evenly around the table. Therefore, the query:
select * from t1 where object_id is not null;
is likely to visit almost every block in T1 to get data, even if the optimiser used the index. It makes sense then to dispense with the index and go for a full table scan!
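(One step is missing from the transcript: the T1_IDX index queried below has to exist on OBJECT_ID. Presumably it was created, and its statistics gathered, with something like the following before the clustering factor was checked:)

-- assumed step, not shown in the original transcript
create index t1_idx on t1(object_id);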
A key statistic to help identify this situation is the index clustering factor:
SQL> select clustering_factor from user_indexes where index_name='T1_IDX';
CLUSTERING_FACTOR
-----------------
460
This value 460 is quite high (compared to the 861 rows in the index), and suggests that a full table scan will be used. See this DBAZine article on clustering factors.
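A rough way to read that number: compare the clustering factor with the number of blocks in the table and with the number of rows in the index; a value close to the block count means the indexed rows are well clustered, while a value approaching the row count (as here, 460 vs 861) means they are scattered. Using the T1/T1_IDX names from the example:

SELECT i.clustering_factor,
       i.num_rows AS index_rows,    -- non-null OBJECT_IDs, i.e. entries in the index
       t.blocks   AS table_blocks
FROM   user_indexes i
JOIN   user_tables  t ON t.table_name = i.table_name
WHERE  i.index_name = 'T1_IDX';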