I've heard them talked about since I started working in tech about 18 months ago. I know that they potentially improve performance, and they seem to be column specific -- ("We index the User table on the date_of_birth column").
Just looking for a quick overview of what exactly they are, what they are used for, and how they work.
Duplicate indexes are those that exactly match the Key and Included columns. That's easy. Possible duplicate indexes are those that very closely match Key/Included columns.
Yes, you can create a clustered index on key columns that contain duplicate values.
Duplicate indexes take up double the room in SQL Server– and even if indexes are COMPLETELY identical, SQL Server may choose to use both of them. Duplicate indexes essentially cost you extra IO, CPU, and Memory, just the things you were trying to SAVE by adding nonclustered indexes!
Duplicate data is any record that inadvertently shares data with another record in a Database. Duplicate data is easy to spot and it mostly occurs when transferring data between systems. The most popular occurrence of duplicate data is a complete carbon copy of a record.
I wrote a complete book about it! It's also available for free on the web: http://use-the-index-luke.com/
I try to answer your questions shortly—which is not exactly what I'm good at. The last time I tried, I ended up writing a book...
Like tables, indexes consist of rows and columns but store the data in a logically sorted manner to improve search performance. Think of it like a telephone book (a printed one). They are usually sorted last_name
, first_name
and potentially other criteria (e.g. zip code). This sorting makes it possible to find all entries for a specific last name quickly. If you know the first name too, you can even find the entries for the combination last name/first name very quickly.
If you just know the first name, however, the telephone book does not really help you. The very same is true for multi-column database indexes. So yes, an index can potentially improve search performance. If you have the wrong index for your question (e.g. a phonebook when searching by first name) they might be useless.
You can have many indexes on the same table but on different columns. So, an index on last_name
,first_name
is different from an index on first_name
only (which you would need to optimize searches by first name).
Indexes hold redundant data (ex: clustered indexes = telephone book). They have the same information as stored in the table (ex: function based indexes), but in a sorted manner. This redundancy is automatically maintained by the database for each write operation you perform (insert/update/delete). Consequently, indexed decrease write performance.
Besides finding data quickly, indexes can also be used to optimize sort operations (order by
) and physically arrange related data closely together (clustering).
To get a better idea, look at the full table of contents of my book: http://use-the-index-luke.com/sql/table-of-contents
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With