Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Complex MySQL query still using filesort although indexes exist

I have a Joomla table with thousands of rows of content (approx 3million). I'm having a bit of trouble rewriting the database queries to be as fast as possible when querying the tables.

Here is my full query:

SELECT cc.title AS category, a.id, a.title, a.alias, a.title_alias, a.introtext, a.fulltext, a.sectionid, a.state, a.catid, a.created, a.created_by, a.created_by_alias, a.modified, a.modified_by, a.checked_out, a.checked_out_time, a.publish_up, a.publish_down, a.attribs, a.hits, a.images, a.urls, a.ordering, a.metakey, a.metadesc, a.access, CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug, CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore, u.name AS author, u.usertype, g.name AS groups, u.email AS author_email
FROM j15_content AS a
LEFT JOIN j15_categories AS cc
ON a.catid = cc.id
LEFT JOIN j15_users AS u
ON u.id = a.created_by
LEFT JOIN j15_groups AS g
ON a.access = g.id
WHERE 1
AND a.access <= 0
AND a.catid = 108
AND a.state = 1
AND ( publish_up = '0000-00-00 00:00:00' OR publish_up <= '2012-02-08 00:16:26' )
AND ( publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-02-08 00:16:26' )
ORDER BY a.title, a.created DESC
LIMIT 0, 10

Here is the output from an EXPLAIN:

 +----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+
| id | select_type | table | type   | possible_keys                                         | key       | key_len | ref                       | rows    | Extra                       |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+
|  1 | SIMPLE      | a     | ref    | idx_access,idx_state,idx_catid,idx_access_state_catid | idx_catid | 4       | const                     | 3108187 | Using where; Using filesort |
|  1 | SIMPLE      | cc    | const  | PRIMARY                                               | PRIMARY   | 4       | const                     |       1 |                             |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                               | PRIMARY   | 4       | database.a.created_by     |       1 |                             |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                               | PRIMARY   | 1       | database.a.access         |       1 |                             |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+

And to show what indexes exist, SHOW INDEX FROM j15_content:

+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table       | Non_unique | Key_name               | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| j15_content |          0 | PRIMARY                |            1 | id          | A         |     3228356 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_section            |            1 | sectionid   | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access             |            1 | access      | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_checkout           |            1 | checked_out | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_state              |            1 | state       | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_catid              |            1 | catid       | A         |           6 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_createdby          |            1 | created_by  | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | title                  |            1 | title       | A         |      201772 |        4 | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            1 | access      | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            2 | state       | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            3 | catid       | A         |           7 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_title_created      |            1 | title       | A         |     3228356 |        8 | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_title_created      |            2 | created     | A         |     3228356 |     NULL | NULL   |      | BTREE      |         |
+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

As you can see there are a few pieces of data being taken from the database. Now I have tested by simplifying the query that the real issue lies with the ORDER BY clause. Without ordering the results, the query is quite responsive, here is an explanation:

+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+
| id | select_type | table | type   | possible_keys                                         | key       | key_len | ref                       | rows    | Extra       |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+
|  1 | SIMPLE      | a     | ref    | idx_access,idx_state,idx_catid,idx_access_state_catid | idx_catid | 4       | const                     | 3108187 | Using where |
|  1 | SIMPLE      | cc    | const  | PRIMARY                                               | PRIMARY   | 4       | const                     |       1 |             |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                               | PRIMARY   | 4       | database.a.created_by     |       1 |             |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                               | PRIMARY   | 1       | database.a.access         |       1 |             |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+

As you can see it's the fatal filesort that's killing the server. With this many rows, I'm doing my best to optimize everything through indexes but something still isn't right with this. Any input would be greatly appreciated.

Tried using FORCE INDEX to no avail:

explain     SELECT cc.title AS category, a.id, a.title, a.alias, a.title_alias, a.introtext, a.fulltext, a.sectionid, a.state, a.catid, a.created, a.created_by, a.created_by_alias, a.modified, a.modified_by, a.checked_out, a.checked_out_time, a.publish_up, a.publish_down, a.attribs, a.hits, a.images, a.urls, a.ordering, a.metakey, a.metadesc, a.access, CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug, CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore, u.name AS author, u.usertype, g.name AS groups, u.email AS author_email
    ->     FROM bak_content AS a
    ->     FORCE INDEX (idx_title_created)
    ->     LEFT JOIN bak_categories AS cc
    ->     ON a.catid = cc.id
    ->     LEFT JOIN bak_users AS u
    ->     ON u.id = a.created_by
    ->     LEFT JOIN bak_groups AS g
    ->     ON a.access = g.id
    ->     WHERE 1
    ->     AND a.access <= 0
    ->     AND a.catid = 108
    ->     AND a.state = 1
    ->     AND ( publish_up = '0000-00-00 00:00:00' OR publish_up <= '2012-02-08
    ->     AND ( publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-0
    ->     ORDER BY a.title, a.created DESC
    ->     LIMIT 0, 10;

Produces:

+----+-------------+-------+--------+---------------+---------+---------+-------
| id | select_type | table | type   | possible_keys | key     | key_len | ref
+----+-------------+-------+--------+---------------+---------+---------+-------
|  1 | SIMPLE      | a     | ALL    | NULL          | NULL    | NULL    | NULL
|  1 | SIMPLE      | cc    | const  | PRIMARY       | PRIMARY | 4       | const
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | database
|  1 | SIMPLE      | g     | eq_ref | PRIMARY       | PRIMARY | 1       | database
+----+-------------+-------+--------+---------------+---------+---------+-------
like image 728
user1199057 Avatar asked Feb 09 '12 07:02

user1199057


People also ask

Does MySQL use index for ORDER BY?

Yes, MySQL uses your index to sort the information when the order is by the sorted column. Also, if you have indexes in all columns that you have added to the SELECT clause, MySQL will not load the data from the table itself, but from the index (which is faster).

What is using temporary in MySQL?

A temporary table in MySQL will be dropped automatically when the user closes the session or terminates the connection manually. A temporary table can be created by the user with the same name as a normal table in a database.

Why index is not being used in MySQL?

The Benefits and Drawbacks of Using Indexes in MySQLIndexes consume disk space. Indexes degrade the performance of INSERT, UPDATE and DELETE queries – when data is updated, the index needs to be updated together with it. MySQL does not protect you from using multiple types of indexes at the same time.

What is Filesort in MySQL?

In MySQL, filesort is the catch-all algorithm for producing sorted results for ORDER-BY or GROUP-BY queries. MySQL has two algorithms for filesort, both the original and the modified algorithms are described in the user manual.


1 Answers

AFAIK this can't be reasonably solved using an index, hints or restructuring of the query itself.

The reason this is slow is the fact that it requires a filesort of 2M rows which does actually take a long time. If you zoom in on the order by it's specified as ORDER BY a.title, a.created DESC. The problem is the combination of sorting on more than 1 column and having a DESC part. Mysql does not support descending indexes (the keyword DESC is supported in the CREATE INDEX statement but only for future use).

The suggested workaround is to create an extra column 'reverse_created' that gets automatically populated in such a way that your query can use ORDER BY a.title, a.reverse_created. So you fill it with max_time - created_time. Then create an index on that combination and (if needed) specify that index as a hint.

There are a couple of really good blog articles about this topic that explain this a lot better and with examples:

  • http://www.mysqlperformanceblog.com/2006/05/09/descending-indexing-and-loose-index-scan/
  • http://www.mysqlperformanceblog.com/2007/02/16/using-index-for-order-by-vs-restricting-number-of-rows/

-Update- You should be able to do a quick test on this by removing the "DESC" part from the order by in your query. The results will be functionally wrong but it should use the existing index you have (or otherwise the force should work).

like image 113
Eddy Avatar answered Sep 22 '22 14:09

Eddy