Index over a column with only 5 distinct values - Worth it?

I have a table that could grow to as many as 5,000,000 rows. One of its columns is used on its own in queries, but that column has only 5 possible values. I currently have 10,000 rows, and according to the explain plan it makes no sense to use my index on that column.

Will it ever make sense, or should I not bother with an index?

Edit: These are the two explain plans at the moment. Without index: http://img706.imageshack.us/img706/1903/noindex.png With the index forced via a hint: http://img692.imageshack.us/img692/8205/indexp.png

asked Dec 10 '09 08:12

svrist

1 Answer

It depends on a couple of things.

Firstly, the distribution of values. If you only have five distinct values but one of them accounts for 99.9999% of rows in the table then obviously you would not want the optimiser to use the index for that value but you might want it to use it for the others. In some cases like this it's worth using a function-based index to ensure that you only index the values of interest and not the ones that are just taking up space.

Secondly, are there queries that can be answered using that index without accessing the table?

Note that it's not just the percentage of rows that will be accessed that matters, but the number of table blocks that will need to be accessed. For example, if you have a table of 1000 blocks with 30 rows per block on average, and one column has 30 distinct values (each one present in 1000 rows), then the number of blocks that need to be visited to read every row for a single value varies between about 1000/30, i.e. 34 (worth using an index), and 1000 (not worth using an index), depending on how the rows are distributed. This is expressed by the clustering factor of the index: if its value is close to the number of rows in the table then the index is less likely to be used, and if it's close to the number of blocks then it's more likely to be used.
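To make the arithmetic above concrete, here is a small Python sketch of the same table (1000 blocks, 30 rows per block, 30 distinct values). It is only a toy model, not how Oracle stores data: the two layouts, the function names, and the simplified clustering-factor count (walk the index entries in key order and count each change of table block) are assumptions made for illustration.

```python
BLOCKS = 1000          # table blocks, as in the example above
ROWS_PER_BLOCK = 30    # average rows per block
DISTINCT_VALUES = 30   # each value ends up in 1000 rows

def build_table(layout):
    """Return the table as a list of blocks; each block is a list of column values."""
    total_rows = BLOCKS * ROWS_PER_BLOCK
    if layout == "clustered":
        # All rows for a value are stored next to each other.
        rows_per_value = total_rows // DISTINCT_VALUES
        values = [i // rows_per_value for i in range(total_rows)]
    else:
        # Worst case: consecutive rows cycle through all the values,
        # so every block holds exactly one row of every value.
        values = [i % DISTINCT_VALUES for i in range(total_rows)]
    return [values[i:i + ROWS_PER_BLOCK]
            for i in range(0, total_rows, ROWS_PER_BLOCK)]

def blocks_visited(table, value):
    """Number of blocks that hold at least one row with the given value."""
    return sum(1 for block in table if value in block)

def clustering_factor(table):
    """Walk index entries in (value, block) order and count block changes."""
    entries = sorted((value, block_no)
                     for block_no, block in enumerate(table)
                     for value in block)
    factor = 0
    previous_block = None
    for _, block_no in entries:
        if block_no != previous_block:
            factor += 1
            previous_block = block_no
    return factor

clustered = build_table("clustered")
scattered = build_table("scattered")

print(blocks_visited(clustered, 0), clustering_factor(clustered))  # 34 1000
print(blocks_visited(scattered, 0), clustering_factor(scattered))  # 1000 30000
```

In the clustered layout the clustering factor equals the number of blocks (1000) and one value touches 34 blocks; in the scattered layout it equals the number of rows (30000) and one value touches every block.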

Also, you might look at index compression to see whether that saves you space.

Be careful with bitmap indexes: they are not friendly to systems where they are modified by multiple sessions at the same time (e.g. two people inserting rows into the indexed table at the same time).

A more effective strategy, if you do want to improve the efficiency of queries with predicates on these five values, is to use partitioning. This is partly because of partition pruning in the query, but also because of the improvement in statistics available to the optimiser: when it knows that only one partition will be accessed, it can use partition-level statistics instead of global statistics.
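The statistics point can be illustrated with a toy Python model. This is not Oracle's actual costing: the status names and the skewed row counts are made up, and the global estimate assumes there is no histogram on the column, so the optimiser can only divide the row count by the number of distinct values.

```python
from collections import Counter

# Hypothetical skewed data: 10,000 rows spread over five made-up status values.
rows = (["DONE"] * 9_000 + ["NEW"] * 600 + ["RUNNING"] * 300
        + ["ERROR"] * 90 + ["HOLD"] * 10)

# Global statistics only: total row count and number of distinct values.
global_num_rows = len(rows)
global_num_distinct = len(set(rows))

def estimate_from_global_stats():
    """Uniform assumption: every value is expected to match num_rows / num_distinct rows."""
    return global_num_rows / global_num_distinct

# One partition per value (list partitioning), each with its own row count.
partitions = Counter(rows)

def estimate_from_partition_stats(value):
    """After pruning to a single partition, that partition's row count is the estimate."""
    return partitions[value]

for status in ("DONE", "ERROR"):
    print(status, estimate_from_global_stats(), estimate_from_partition_stats(status))
```

With only global figures, every value gets the same estimate of 2000 rows; with one partition per value, the estimate for a rare value such as the hypothetical "ERROR" is its real count, and only that partition has to be scanned.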

answered Nov 16 '22 03:11

David Aldridge

Tags: indexing, oracle
