Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I know when to index a column, and with what?

Tags:

In docs for various ORMs they always provide a way to create indexes, etc. They always mention to be sure to create the appropriate indexes for efficiency, as if that is inherent knowledge to a non-hand-written-SQLer who needs to use an ORM. My understanding of indexes (outside of PK) is basically: If you plan to do LIKE queries (ie, search) based on the contents of a column, you should use a full text index for that column. What else should I know regarding indexes (mostly pertaining to efficiency)? I feel like there is a world of knowledge at my door step, but there's a huge folded mouse pad jammed up under it, so I can't get through (I don't know why I felt like I needed to say that, but thanks for providing the couch).

like image 429
orokusaki Avatar asked Nov 04 '10 03:11

orokusaki


People also ask

How do you know which columns need indexing?

Columns with one or more of the following characteristics are good candidates for indexing: Values are unique in the column, or there are few duplicates. There is a wide range of values (good for regular indexes). There is a small range of values (good for bitmap indexes).

When should I index a column SQL?

Indexes will be useful if: The column or columns have a high degree of uniqueness. You frequently need to look for a certain value or range of values for the column.

How do I determine which columns need indexing in SQL Server?

Creating a Index on table depends on which columns you want to put WHERE clause. Once you are able to know columns to use for WHERE clause you can create Index over them. Typically numeric columns are good choice for Indexing.

When should you use indexing?

An index in a database is very similar to an index in the back of a book. It means when we want to jump to a specific topic from a book then we can first see the index of the book and then find the page number for that topic and then open that topic by going to that page number.


1 Answers

Think of an index very roughly like the index in the back of a book. It's a totally separate area from the content of the book, where if you are seeking some specific value, you can go to the index and look it up (indexes are ordered, so finding things there is much quicker than scanning every page of the book).

The index entry has a page number, so you can then quickly go to the page seeking your topic. A database index is very similar; it is an ordered list of the relevant information in your database (the field(s) included in the index), with information for the database to find the records which match.

So... you would create an index when you have information that you need to search on frequently. Normal indexes don't help you for 'partial' seeks like LIKE queries, but any time you need to get a set of results where field X has certain value(s), they keep the DBMS from needing to 'scan' the whole table, looking for matching values.

They also help when you need to sort on a column.

Another thing to keep in mind; If the DBMS allows you to create single indexes that have multiple fields, be sure to investigate the effects of doing so, specific to your DBMS. An index that includes multiple fields is likely only to be fully (or at all) useful if all those fields are being used in a query. Conversely, having multiple indexes for a single table, with one field per index, may not be of much (or any) help for queries that are filtering/sorting by multiple fields.


You mentioned Full Text indexes and PKs (Primary Keys). These are different than regular indexes, though they often serve similar purposes.

First, note that a Primary Key is usually an index (in MSSQL, a 'Clustered Index', in fact), but this does not need to be the case specifically. As an example, an MSSQL PK is a Clustered Index by default; clustered indexes are special in that they are not a separate bit of data stored elsewhere, but the data itself is arranged in the table in order by the Clustered Index. This is why a popular PK is an int value that is auto-generated with sequential, increasing values. So, a Clustered Index sorts the data in the table specifically by the field's value. Compare this to a traditional dictionary; the entries themselves are ordered by the 'key', which is the word being defined.

But in MSSQL (check your DBMS documentation for your information), you can change the Clustered Index to be a different field, if you like. Sometimes this is done on datetime based fields.


Full Text indexes are different kinds of beasts entirely. They use some of the same principles, but what they are doing isn't exactly the same as normal indexes, which I am describing. Also: in some DBMS's, LIKE queries do not use the full text index; special query operators are required.

These indexes are different because their intent is not to find/sort on the whole value of the column (a number, a date, a short bit of char data), but instead to find individual words/phrases within the text field(s) being indexed.

They can also often enable searching for similar words, different tenses, common misspellings and the like, and typically ignore noise words. The different way in which they work is why they also may need different operators to use them. (again, check your local documentation for your DBMS!)

like image 159
Andrew Barber Avatar answered Sep 23 '22 06:09

Andrew Barber