I have created script to find selectivity of each column for all tables. In some tables with less than 100 rows, selectivity of a column is more than 50%. Where Selectivity = Distinct Values / Total Number Rows. Are those columns eligible for an index? Or, can you tell me the minimum requirement for number of rows to create an index?
You can index on any column - the question is whether it makes any sense and whether that index will be used....
Typically, a selectivity of less than 1-5% might work - the smaller that percentage, the better. The best is single values out of a large population, e.g. a single customer ID out of hundreds of thousands - those indices will definitely be used.
Things like gender (only 2 values) or other things that only have a very limited number of possible values typically don't work well on an index. At least on their own - these columns might be ok to be included into another index as a second or third column.
But really, the only way to find out whether or not an index makes sense is to
There's no golden rule as to when an index will be used (or ignored) - too many variables play into that decision.
For some expert advice on how to deal with indices, and how to find out which indices might not get used, and when it makes sense to create an index, see Kimberly Tripp's blog posts:
Most DBMS use a cache for data and code (stored procedure, execution plan, etc.). In SQL Server I think it's called the data and procedure cache, and in Oracle, it's called the buffer cache and the SGA. Table data and/or index can be in the cache.
Small table which are frequently accessed will most likely fit in the cache. But the table can be evicted from the cache, say, if a query load fresh data from the disk. There are options to indicate that you want a table to be permanently in the cache (See PINTABLE). That's maybe a better strategy that using an index if your table is very small (which is your case). Adding an index (which would also always be in the cache) could help further, but I don't know what would be the gain.
The big different in performance is disk access vs. memory access. Purpose of index is to reduce the amount of data to read from the disk, but if it's already in memory, gain is probably small.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With