What columns generally make good indexes?

Tags:

As a follow up to "What are indexes and how can I use them to optimise queries in my database?" where I am attempting to learn about indexes, what columns are good index candidates? Specifically for an MS SQL database?

After some googling, everything I have read suggests that columns that are generally increasing and unique make a good index (things like MySQL's auto_increment), I understand this, but I am using MS SQL and I am using GUIDs for primary keys, so it seems that indexes would not benefit GUID columns...

363

asked Sep 20 '08 04:09

mmattax

10 Answers

Indexes can play an important role in query optimization and searching the results speedily from tables. The most important step is to select which columns are to be indexed. There are two major places where we can consider indexing: columns referenced in the WHERE clause and columns used in JOIN clauses. In short, such columns should be indexed against which you are required to search particular records. Suppose, we have a table named buyers where the SELECT query uses indexes like below:

SELECT
 buyer_id /* no need to index */
FROM buyers
WHERE first_name='Tariq' /* consider indexing */
AND last_name='Iqbal'   /* consider indexing */

Since "buyer_id" is referenced in the SELECT portion, MySQL will not use it to limit the chosen rows. Hence, there is no great need to index it. The below is another example little different from the above one:

SELECT
 buyers.buyer_id, /* no need to index */
 country.name    /* no need to index */
FROM buyers LEFT JOIN country
ON buyers.country_id=country.country_id /* consider indexing */
WHERE
 first_name='Tariq' /* consider indexing */
AND
 last_name='Iqbal' /* consider indexing */

According to the above queries first_name, last_name columns can be indexed as they are located in the WHERE clause. Also an additional field, country_id from country table, can be considered for indexing because it is in a JOIN clause. So indexing can be considered on every field in the WHERE clause or a JOIN clause.

The following list also offers a few tips that you should always keep in mind when intend to create indexes into your tables:

Only index those columns that are required in WHERE and ORDER BY clauses. Indexing columns in abundance will result in some disadvantages.
Try to take benefit of "index prefix" or "multi-columns index" feature of MySQL. If you create an index such as INDEX(first_name, last_name), don’t create INDEX(first_name). However, "index prefix" or "multi-columns index" is not recommended in all search cases.
Use the NOT NULL attribute for those columns in which you consider the indexing, so that NULL values will never be stored.
Use the --log-long-format option to log queries that aren’t using indexes. In this way, you can examine this log file and adjust your queries accordingly.
The EXPLAIN statement helps you to reveal that how MySQL will execute a query. It shows how and in what order tables are joined. This can be much useful for determining how to write optimized queries, and whether the columns are needed to be indexed.

Update (23 Feb'15):

Any index (good/bad) increases insert and update time.

Depending on your indexes (number of indexes and type), result is searched. If your search time is gonna increase because of index then that's bad index.

Likely in any book, "Index Page" could have chapter start page, topic page number starts, also sub topic page starts. Some clarification in Index page helps but more detailed index might confuse you or scare you. Indexes are also having memory.

Index selection should be wise. Keep in mind not all columns would require index.

163

answered Oct 12 '22 10:10

Somnath Muluk

Some folks answered a similar question here: How do you know what a good index is?

Basically, it really depends on how you will be querying your data. You want an index that quickly identifies a small subset of your dataset that is relevant to a query. If you never query by datestamp, you don't need an index on it, even if it's mostly unique. If all you do is get events that happened in a certain date range, you definitely want one. In most cases, an index on gender is pointless -- but if all you do is get stats about all males, and separately, about all females, it might be worth your while to create one. Figure out what your query patterns will be, and access to which parameter narrows the search space the most, and that's your best index.

Also consider the kind of index you make -- B-trees are good for most things and allow range queries, but hash indexes get you straight to the point (but don't allow ranges). Other types of indexes have other pros and cons.

Good luck!

answered Oct 12 '22 10:10

SquareCog

It all depends on what queries you expect to ask about the tables. If you ask for all rows with a certain value for column X, you will have to do a full table scan if an index can't be used.

Indexes will be useful if:

The column or columns have a high degree of uniqueness
You frequently need to look for a certain value or range of values for the column.

They will not be useful if:

You are selecting a large % (>10-20%) of the rows in the table
The additional space usage is an issue
You want to maximize insert performance. Every index on a table reduces insert and update performance because they must be updated each time the data changes.

Primary key columns are typically great for indexing because they are unique and are often used to lookup rows.

answered Oct 12 '22 09:10

Plasmer

Any column that is going to be regularly used to extract data from the table should be indexed.

This includes: foreign keys -

select * from tblOrder where status_id=:v_outstanding

descriptive fields -

select * from tblCust where Surname like "O'Brian%"

The columns do not need to be unique. In fact you can get really good performance from a binary index when searching for exceptions.

select * from tblOrder where paidYN='N'

answered Oct 12 '22 10:10

pappes

In general (I don't use mssql so can't comment specifically), primary keys make good indexes. They are unique and must have a value specified. (Also, primary keys make such good indexes that they normally have an index created automatically.)

An index is effectively a copy of the column which has been sorted to allow binary search (which is much faster than linear search). Database systems may use various tricks to speed up search even more, particularly if the data is more complex than a simple number.

My suggestion would be to not use any indexes initially and profile your queries. If a particular query (such as searching for people by surname, for example) is run very often, try creating an index over the relevate attributes and profile again. If there is a noticeable speed-up on queries and a negligible slow-down on insertions and updates, keep the index.

(Apologies if I'm repeating stuff mentioned in your other question, I hadn't come across it previously.)

answered Oct 12 '22 09:10

Zooba

It really depends on your queries. For example, if you almost only write to a table then it is best not to have any indexes, they just slow down the writes and never get used. Any column you are using to join with another table is a good candidate for an index.

Also, read about the Missing Indexes feature. It monitors the actual queries being used against your database and can tell you what indexes would have improved the performace.

answered Oct 12 '22 09:10

jwanagel

A GUID column is not the best candidate for indexing. Indexes are best suited to columns with a data type that can be given some meaningful order, ie sorted (integer, date etc).

It does not matter if the data in a column is generally increasing. If you create an index on the column, the index will create it's own data structure that will simply reference the actual items in your table without concern for stored order (a non-clustered index). Then for example a binary search can be performed over your index data structure to provide fast retrieval.

It is also possible to create a "clustered index" that will physically reorder your data. However you can only have one of these per table, whereas you can have multiple non-clustered indexes.

answered Oct 12 '22 10:10

Ash

Your primary key should always be an index. (I'd be surprised if it weren't automatically indexed by MS SQL, in fact.) You should also index columns you SELECT or ORDER by frequently; their purpose is both quick lookup of a single value and faster sorting.

The only real danger in indexing too many columns is slowing down changes to rows in large tables, as the indexes all need updating too. If you're really not sure what to index, just time your slowest queries, look at what columns are being used most often, and index them. Then see how much faster they are.

answered Oct 12 '22 11:10

Eevee

Numeric data types which are ordered in ascending or descending order are good indexes for multiple reasons. First, numbers are generally faster to evaluate than strings (varchar, char, nvarchar, etc). Second, if your values aren't ordered, rows and/or pages may need to be shuffled about to update your index. That's additional overhead.

If you're using SQL Server 2005 and set on using uniqueidentifiers (guids), and do NOT need them to be of a random nature, check out the sequential uniqueidentifier type.

Lastly, if you're talking about clustered indexes, you're talking about the sort of the physical data. If you have a string as your clustered index, that could get ugly.

answered Oct 12 '22 09:10

Ian Suttle

The ol' rule of thumb was columns that are used a lot in WHERE, ORDER BY, and GROUP BY clauses, or any that seemed to be used in joins frequently. Keep in mind I'm referring to indexes, NOT Primary Key

Not to give a 'vanilla-ish' answer, but it truly depends on how you are accessing the data

answered Oct 12 '22 10:10

curtisk

Related questions
                            
                                Return rows in random order [duplicate]
                            
                                How to kill/stop a long SQL query immediately?
                            
                                Errors: "INSERT EXEC statement cannot be nested." and "Cannot use the ROLLBACK statement within an INSERT-EXEC statement." How to solve this?
                            
                                How to add Active Directory user group as login in SQL Server
                            
                                How can I alter a primary key constraint using SQL syntax?
                            
                                When do I need to use Begin / End Blocks and the Go keyword in SQL Server?
                            
                                How to force a SQL Server 2008 database to go Offline
                            
                                SQL function as default parameter value?
                            
                                SQL Server: Get data for only the past year
                            
                                How do I enable MSDTC on SQL Server?
                            
                                Search of table names
                            
                                Backup a single table with its data from a database in sql server 2008
                            
                                Should I index a bit field in SQL Server?
                            
                                alternatives to REPLACE on a text or ntext datatype
                            
                                Optimal way to concatenate/aggregate strings
                            
                                Subtract one day from datetime
                            
                                Connection timeout for SQL server
                            
                                How do I find duplicates across multiple columns?
                            
                                varbinary to string on SQL Server
                            
                                Is having an 'OR' in an INNER JOIN condition a bad idea?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

What columns generally make good indexes?

Tags:

database

optimization

sql-server

indexing

database-design

mmattax

People also ask

10 Answers

Somnath Muluk

SquareCog

Plasmer

pappes

Zooba

jwanagel

Ash

Eevee

Ian Suttle

curtisk

Recent Activity

Donate For Us