I have Log and LogItem tables; I'm writing a query to grab some data from both. There are thousands of Logs
and each Log
can have up to 125 LogItems
The query in question is complicated so I'm skipping it (if someone thinks it's important I can probably post it), but when I ran SSMS Estimated Query plan, it told me a new Non-Clustered index would improve performance up to 100%.
Existing Index: Non-clustered
Key Colums (LogItem): ParentLogID, DateModified, Name, DatabaseModified
Query Plan Recommendation
CREATE NONCLUSTERED INDEX [LogReportIndex]
ON [dbo].[LogItem] ([ParentLogID],[DatabaseModified])
Just for fun, I created this new index and ran the query and much to my surprise, it now takes ~1 second for my query to run, when before it was 10+ seconds.
I assumed that my existing index would cover this new query, so my question is why did creating a new index on the only columns used in my new query improve performance? Should I have an index for each unique combination of columns used in my where
clauses?
note: I don't think this is because the SQL Server is caching my results, I ran the query about 25-30 times before I created the index and it consistantly took 10-15 seconds, after the index it is now consistantly ~1 or less.
Order of columns in an index is important. If filtering requires column 1 and 4 from index, the index is not going to help. It's only useful when filtering by the first N consecutive columns.
This is because index is a tree. You can't efficiently select all nodes of the tree where column3 = something
, because they are scattered all other the place, belonging to different values of column1
and column2
. But if you know column1
and column2
as well, locating the right branch in the tree is a no brainer.
The leading edge of an index is what matters.
As long as your query is "covered" by a leading edge of an index, it will be efficient. Database indexes are typically implemented as B-Trees and the structure of the B-Tree dictates that the search must be done in a certain order, which is why the order of fields in the composite index matters.
If you have "holes", e.g. if you search on ParentLogID
and DatabaseModified
, but only have index on {ParentLogID, DateModified, Name, DatabaseModified}
, then only the {ParentLogID}
portion of the index can be utilized efficiently.
(NOTE: Some DBMSes can utilize the {DatabaseModified}
portion through "skip scan", but even if your DBMS does that it is much less efficient than the regular index access).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With