In a former company I worked at, the rule of thumb was that a table should have no more than one index (allowing the odd exception, and certain parent-tables holding references to nearly all other tables and thus are updated very frequently).
The idea being that often, indexes cost the same or more to uphold than they gain. Note that this question is different to indexed-view-vs-indexes-on-table as the motivation is not only reporting.
Is this true? Is this index-purism worth it?
In your career do you generally avoid using indexes?
What are the general large-scale recommendations regarding indexes?
Currently and at the last company we use SQL Server, so any product specific guidelines are welcome too.
Standard indexes on a column can lead to substantial decreases in query execution times as shown in this article on optimizing queries. Multi-column indexes can achieve even greater decreases in query time due to its ability to move through the data quicker.
It is possible for an index to have two or more columns. Multi column indexes are also known as compound or concatenated indexes. Let us look at a query that could use two different indexes on the table based on the WHERE clause restrictions. We first create these indexes.
Too many indexes create additional overhead associated with the extra amount of data pages that the Query Optimizer needs to go through. Also, too many indexes require too much space and add to the time it takes to accomplish maintenance tasks.
A good exercise is to create multiple granular indexes (single column index) on a table, build queries using different columns in the where clause, and then view the query plan. Interestingly, SQL Server will often use the indexes as tables to reduce the work load.
You need to create exactly as many indexes as you need to create. No more, no less. It is as simple as that.
Everybody "knows" that an index will slow down DML statements on a table. But for some reason very few people actually bother to test just how "slow" it becomes in their context. Sometimes I get the impression that people think that adding another index will add several seconds to each inserted row, making it a game changing business tradeoff that some fictive hotshot user should decide in a board room.
I'd like to share an example that I just created on my 2 year old pc, using a standard MySQL installation. I know you tagged the question SQL Server, but the example should be easily converted. I insert 1,000,000 rows into three tables. One table without indexes, one table with one index and one table with nine indexes.
drop table numbers;
drop table one_million_rows;
drop table one_million_one_index;
drop table one_million_nine_index;
/*
|| Create a dummy table to assist in generating rows
*/
create table numbers(n int);
insert into numbers(n) values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
/*
|| Create a table consisting of 1,000,000 consecutive integers
*/
create table one_million_rows as
select d1.n + (d2.n * 10)
+ (d3.n * 100)
+ (d4.n * 1000)
+ (d5.n * 10000)
+ (d6.n * 100000) as n
from numbers d1
,numbers d2
,numbers d3
,numbers d4
,numbers d5
,numbers d6;
/*
|| Create an empty table with 9 integer columns.
|| One column will be indexed
*/
create table one_million_one_index(
c1 int, c2 int, c3 int
,c4 int, c5 int, c6 int
,c7 int, c8 int, c9 int
,index(c1)
);
/*
|| Create an empty table with 9 integer columns.
|| All nine columns will be indexed
*/
create table one_million_nine_index(
c1 int, c2 int, c3 int
,c4 int, c5 int, c6 int
,c7 int, c8 int, c9 int
,index(c1), index(c2), index(c3)
,index(c4), index(c5), index(c6)
,index(c7), index(c8), index(c9)
);
/*
|| Insert 1,000,000 rows in the table with one index
*/
insert into one_million_one_index(c1,c2,c3,c4,c5,c6,c7,c8,c9)
select n, n, n, n, n, n, n, n, n
from one_million_rows;
/*
|| Insert 1,000,000 rows in the table with nine indexes
*/
insert into one_million_nine_index(c1,c2,c3,c4,c5,c6,c7,c8,c9)
select n, n, n, n, n, n, n, n, n
from one_million_rows;
My timings are:
I'm better with SQL than statistics and math, but I'd like to think that: Adding 8 indexes to my table, added (6,98-1,5) 5,48 seconds in total. Each index would then have contributed 0,685 seconds (5,48 / 8) for all 1,000,000 rows. That would mean that the added overhead per row per index would have been 0,000000685 seconds. SOMEBODY CALL THE BOARD OF DIRECTORS!
In conclusion, I'd like to say that the above test case doesn't prove a shit. It just shows that tonight, I was able to insert 1,000,000 consecutive integers into in a table in a single user environment. Your results will be different.
That is utterly ridiculous. First, you need multiple indexes in order to perfom correctly. For instance, if you have a primary key, you automatically have an index. that means you can't index anything else with the rule you described. So if you don't index foreign keys, joins will be slow and if you don't index fields used in the where clause, queries will still be slow. Yes you can have too many indexes as they do take extra time to insert and update and delete records, but no more than one is not dangerous, it is a requirement to have a system that performs well. And I have found that users tolerate a longer time to insert better than they tolerate a longer time to query.
Now the exception might be for a system that takes thousands of readings per second from some automated equipment. This is a database that generally doesn't have indexes to speed inserts. But usually these types of databases are also not used for reading, the data is transferred instead daily to a reporting database which is indexed.
Yes, definitely - too many indexes on a table can be worse than no indexes at all. However, I don't think there's any good in having the "at most one index per table" rule.
For SQL Server, my rule is:
Finding the right mix of indices - weighing the pros of speeding up queries vs. the cons of additional overhead on INSERT, UPDATE, DELETE - is not an exact science - it's more about know-how, experience, measuring, measuring, and measuring again.
Any fixed rule is bound to be more contraproductive than anything else.....
The best content on indexing comes from Kimberly Tripp - the Queen of Indexing - see her blog posts here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With