Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL Query Performance - Query/Schema/Indexes?

Basically having some performance issues with queries, mainly to my largest table which holds call data.

The main query contains quite a few left joins & sub-selects, but in a scenario where I'm running a query where I expect back 1.3M calls to be returned, the query is just not doing it. Having to stop it at 7 minutes means there's definately a problem somewhere.

I've narrowed down the main query and tested the simplest sub-select join which is

SELECT
    DateStart,
    ID,
    NumbID,
    EffectiveFlag,
    OrigNumber
FROM calls
WHERE 
    DateStart <= '2013-12-31'
AND DateStart >= '2013-01-01'
AND CallLength >= '00:00:00' 
AND Direction = '1'
AND CustID IN (474,482,250,268,197,604,132,359,279,441,118,448,152,133,380,162,249,679,226,259,2450,2408,2451,2453,2439,2454,2444,2445,2452)

And even that query takes 4.5s - so when it's a sub-select in a query with other joins & sub-selected, I can imagine why the query as a whole is unusable.

The explain statement for the above query is

+----+-------------+-------+-------+-------------------------------------------------------------------------------------------------------+----------------------+---------+------+---------+-------------+
| id | select_type | table | type  | possible_keys                                                                                         | key                  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+-------+-------------------------------------------------------------------------------------------------------+----------------------+---------+------+---------+-------------+
|  1 | SIMPLE      | calls | range | idx_CustID,idx_DateStart,idx_CustID_DateStart,idx_CustID_TermNumber,idx_Direction                     | idx_CustID_DateStart | 7       | NULL | 1660009 | Using where |
+----+-------------+-------+-------+-------------------------------------------------------------------------------------------------------+----------------------+---------+------+---------+-------------+

The database schema of the calls table is

+-------------------+-------------+------+-----+---------------------+----------------+
| Field             | Type        | Null | Key | Default             | Extra          |
+-------------------+-------------+------+-----+---------------------+----------------+
| ID                | int(11)     | NO   | PRI | NULL                | auto_increment |
| CustID            | int(11)     | NO   | MUL | 0                   |                |
| CarrID            | int(11)     | NO   | MUL | NULL                |                |
| TariID            | int(11)     | NO   | MUL | 0                   |                |
| CarrierRef        | varchar(30) | NO   | MUL |                     |                |
| NumbID            | int(11)     | NO   | MUL | 0                   |                |
| VlviID            | int(11)     | NO   | MUL | NULL                |                |
| VcamID            | int(11)     | NO   | MUL | NULL                |                |
| SomeID            | int(11)     | NO   | MUL | NULL                |                |
| VlnsID            | int(11)     | NO   | MUL | NULL                |                |
| NGNumber          | varchar(12) | NO   |     |                     |                |
| OrigNumber        | varchar(16) | NO   | MUL | NULL                |                |
| CLIRestrictedFlag | int(2)      | NO   |     | NULL                |                |
| OrigLocality      | varchar(11) | NO   | MUL |                     |                |
| OrigAreaCode      | varchar(11) | NO   | MUL |                     |                |
| TermNumber        | varchar(16) | NO   | MUL | NULL                |                |
| BatchNumber       | varchar(10) | NO   | MUL |                     |                |
| DateStart         | date        | NO   | MUL | 0000-00-00          |                |
| DateClear         | date        | NO   |     | 0000-00-00          |                |
| TimeStart         | time        | NO   |     | 00:00:00            |                |
| TimeClear         | time        | NO   |     | 00:00:00            |                |
| CallLength        | time        | NO   |     | 00:00:00            |                |
| RingLength        | time        | NO   |     | 00:00:00            |                |
| EffectiveFlag     | smallint(1) | NO   | MUL | NULL                |                |
| UnansweredFlag    | smallint(1) | NO   | MUL | NULL                |                |
| EngagedFlag       | smallint(1) | NO   |     | NULL                |                |
| RecID             | int(11)     | NO   | MUL | NULL                |                |
| CreatedUserID     | int(11)     | NO   |     | 0                   |                |
| CreatedDatetime   | datetime    | NO   | MUL | 0000-00-00 00:00:00 |                |
| Direction         | int(1)      | NO   | MUL | NULL                |                |
+-------------------+-------------+------+-----+---------------------+----------------+

The indexes on the calls table are

+-------+------------+---------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name                  | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+---------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| calls |          0 | PRIMARY                   |            1 | ID              | A         |    23905312 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CustID                |            1 | CustID          | A         |        1685 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_NumbID                |            1 | NumbID          | A         |       37765 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_OrigNumber            |            1 | OrigNumber      | A         |     5976328 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_OrigLocality          |            1 | OrigLocality    | A         |       45019 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_OrigAreaCode          |            1 | OrigAreaCode    | A         |         846 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_TermNumber            |            1 | TermNumber      | A         |      232090 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_DateStart             |            1 | DateStart       | A         |        4596 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_EffectiveFlag         |            1 | EffectiveFlag   | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_UnansweredFlag        |            1 | UnansweredFlag  | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_EngagedFlag           |            1 | UnansweredFlag  | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_TariID                |            1 | TariID          | A         |         110 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CustID_DateStart      |            1 | CustID          | A         |        1685 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CustID_DateStart      |            2 | DateStart       | A         |      919435 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_NumbID_DateStart      |            1 | NumbID          | A         |       37765 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_NumbID_DateStart      |            2 | DateStart       | A         |     5976328 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_RecID                 |            1 | RecID           | A         |      288015 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CarrierRef            |            1 | CarrierRef      | A         |     7968437 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CustID_CallTermNumber |            1 | CustID          | A         |        1685 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CustID_CallTermNumber |            2 | TermNumber      | A         |      246446 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CreatedDatetime       |            1 | CreatedDatetime | A         |      771139 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_Direction             |            1 | Direction       | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_VlviID                |            1 | VlviID          | A         |       50539 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_SomeID                |            1 | SomeID          | A         |          30 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_VcamID                |            1 | VcamID          | A         |          64 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_VlnsID                |            1 | VlnsID          | A         |         191 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_CarrID                |            1 | CarrID          | A         |           4 |     NULL | NULL   |      | BTREE      |         |
| calls |          1 | idx_BatchNumber           |            1 | BatchNumber     | A         |      271651 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+---------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+

Something which I understand may be causing the performance, is the indexes on columns with a low cardinality. I know columns such as Direction which has a cardinality of 2 is actually probably worse of with an index in terms of performance, but that alone shouldn't be making the statement so slow.

In terms of the cardinality requirements to have a worthwhile index, is there a general cardinality percentage compared to total table records at which an index increases performance and when it reduces performance?

I understand that no one is going to be able to throw an answer at me that will change the query time from 4.5s to 0.01s, but any advice on either the query itself, the table schema, the indexes, or the hardware would be greatly appreciated.

Update:

@Sebas "could you please rerun the query AND explain plan without the part: AND CallLength >= '00:00:00' AND Direction = '1' please?"

+----+-------------+-------+-------+---------------------------------------------------------------------+----------------------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys                                                       | key                  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------------------------------------------------------------+----------------------+---------+------+--------+-------------+
|  1 | SIMPLE      | calls | range | idx_CustID,idx_DateStart,idx_CustID_DateStart,idx_CustID_TermNumber | idx_CustID_DateStart | 7       | NULL | 724813 | Using where |
+----+-------------+-------+-------+---------------------------------------------------------------------+----------------------+---------+------+--------+-------------+
like image 294
Ryan Avatar asked Oct 16 '13 15:10

Ryan


People also ask

What is performance schema in MySQL?

The MySQL Performance Schema is a feature for monitoring MySQL Server execution at a low level. The Performance Schema has these characteristics: The Performance Schema provides a way to inspect internal execution of the server at runtime.

How performance of select query is improved if index is created on table?

Indexing makes columns faster to query by creating pointers to where data is stored within a database. Imagine you want to find a piece of information that is within a large database. To get this information out of the database the computer will look through every row until it finds it.

How do I know if my performance schema is enabled?

The performance schema is disabled by default for performance reasons. You can check its current status by looking at the value of the performance_schema system variable. The performance schema cannot be activated at runtime - it must be set when the server starts by adding the following line in your my.


1 Answers

Is your "DateStart" a truncated datetime -- keep date only? If not, you may want to build one with truncated value (by day, or hour), and use int datatype, which will make the index much smaller for faster query.


Or, another way to optimize (golden rule #1 don't do it, #2 do't do it now).

If and only if your date and PK are sync in sequence, you can build a external index of Range of StartDate <=> ID (PK).

and using below pattern

SELECT @start:=ID_START FROM ANOTHER_TABLE WHERE StartDate='2013-01-01'
SELECT @end:=ID_END     FROM ANOTHER_TABLE WHERE StartDate='2013-12-31'
SELECT * FROM calls WHERE ID BETWEEN @start and @end AND CustId in (xxxxx) ....

By using above pattern, Mysql will know if has to scan only a segment of table.

like image 172
Dennis C Avatar answered Nov 04 '22 05:11

Dennis C