Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compound index required to speed up join-ed query?

A colleague asked me to explain how indexes (indices?) boost up performance; I tried to do so, but got confused myself.
I used the model below for explanation (an error/diagnostics logging database). It consists of three tables:

  • List of business systems, table "System" containing their names
  • List of different types of traces, table "TraceTypes", defining what kinds of error messages can be logged
  • Actual trace messages, having foreign keys from System and TraceTypes tables

I used MySQL for the demo, however I don't recall the table types I used. I think it was InnoDB.

 System                                TraceTypes
-----------------------------         ------------------------------------------
| ID          | Name        |         | ID    | Code   | Description           |
-----------------------------         ------------------------------------------
| 1           | billing     |         | 1     | Info   | Informational mesage  |
| 2           | hr          |         | 2     | Warning| Warning only          |
-----------------------------         | 3     | Error  | Failure               |
           |                          ------------------------------------------
           |                ------------|
 Traces    |                |            
 --------------------------------------------------
 | ID | System_ID | TraceTypes_ID | Message       |
 --------------------------------------------------
 | 1  |  1        |  1            | Job starting  |
 | 2  |  1        |  3            | System.nullr..|
 --------------------------------------------------

First, i added some records to all of the tables and demonstrated that the query below executes in 0.005 seconds:

select count(*) from Traces 
  inner join System on Traces.System_ID = System.ID
  inner join TraceTypes on Traces.TraceTypes_ID = TraceTypes.ID
where 
  System.Name='billing' and TraceTypes.Code = 'Info'

Then I generated more data (no indexes yet)

  • "System" contained about 100 entries
  • "TraceTypes" contained about 50 entries
  • "Traces" contained ~10 million records.

Now the previous query took 8-10 seconds.

I created indexes on Traces.System_ID column and Traces.TraceTypes_ID column. Now this query executed in milliseconds:

select count(*) from Traces where System_id=1 and TraceTypes_ID=1;

This was also fast:

select count(*) from Traces 
  inner join System on Traces.System_ID = System.ID
where System.Name='billing' and TraceTypes_ID=1;

but the previous query which joined all the three tables still took 8-10 seconds to complete.

Only when I created a compound index (both System_ID and TraceTypes_ID columns included in index), the speed went down to milliseconds.

The basic statement I was taught earlier is "all the columns you use for join-ing, must be indexed".
However, in my scenario I had indexes on both System_ID and TraceTypes_ID, however MySQL didn't use them. The question is - why? My bets is - the item count ratio 100:10,000,000:50 makes the single-column indexes too large to be used. But is it true?

like image 287
naivists Avatar asked Jan 12 '10 09:01

naivists


2 Answers

First, the correct, and the easiest, way to analyze a slow SQL statement is to do EXPLAIN. Find out how the optimizer chose its plan and ponder on why and how to improve that. I'd suggest to study the EXPLAIN results with only 2 separate indexes to see how mysql execute your statement.

I'm not very familiar with MySQL, but it seems that there's restriction of MySQL 4 of using only one index per table involved in a query. There seems to be improvements on this since MySQL 5 (index merge), but I'm not sure whether it applies to your case. Again, EXPLAIN should tell you the truth.

Even with using 2 indexes per table allowed (MySQL 5), using 2 separate indexes is generally slower than compound index. Using 2 separate indexes requires index merge step, compared to the single pass of using a compound index.

Multi Column indexes vs Index Merge might be helpful, which uses MySQL 5.4.2.

like image 145
bryantsai Avatar answered Oct 07 '22 08:10

bryantsai


It's not the size of the indexes so much as the selectivity that determines whether the optimizer will use them.

like image 38
Mitch Wheat Avatar answered Oct 07 '22 09:10

Mitch Wheat