Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fix Using where; Using temporary; Using filesort

I have two simple tables:

CREATE TABLE cat_urls (
  Id int(11) NOT NULL AUTO_INCREMENT,
  SIL_Id int(11) NOT NULL,
  SiteId int(11) NOT NULL,
  AsCatId int(11) DEFAULT NULL,
  Href varchar(2048) NOT NULL,
  ReferrerHref varchar(2048) NOT NULL DEFAULT '',
  AddedOn datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  GroupId int(11) DEFAULT NULL,
  PRIMARY KEY (Id),
  INDEX SIL (SIL_Id, AsCatId)
)

CREATE TABLE products (
  Id int(11) NOT NULL AUTO_INCREMENT,
  CatUrlId int(11) NOT NULL,
  Href varchar(2048) NOT NULL,
  SiteIdentity varchar(2048) NOT NULL,
  Price decimal(12, 2) NOT NULL,
  IsAvailable bit(1) NOT NULL,
  ClientCode varchar(256) NOT NULL,
  PRIMARY KEY (Id),
  INDEX CatUrl (CatUrlId)
)

And I have pretty simple query:

SELECT cu.Href, COUNT(p.CatUrlId) FROM cat_urls cu
       JOIN products p ON p.CatUrlId=cu.Id
       WHERE sil_id=4601038
GROUP by cu.Id

EXPLAIN says:

id  select_type table   type    possible_keys   key     key_len ref                     rows    Extra
1   SIMPLE      cu      ref     PRIMARY,SIL     SIL     4       const                   303     Using where; Using temporary; Using filesort
1   SIMPLE      p       ref     CatUrl          CatUrl  4       blue_collar_logs.cu.Id  6       Using index

Please tell me is there any way to fix "Using where; Using temporary; Using filesort" and improve perfomance of this query?

like image 793
Pavel Avatar asked Dec 25 '14 08:12

Pavel


People also ask

What is using temporary in MySQL?

In MySQL, a temporary table is a special type of table that allows you to store a temporary result set, which you can reuse several times in a single session. A temporary table is very handy when it is impossible or expensive to query data that requires a single SELECT statement with the JOIN clauses.

What is Filesort in MySQL?

In MySQL, filesort is the catch-all algorithm for producing sorted results for ORDER-BY or GROUP-BY queries. MySQL has two algorithms for filesort, both the original and the modified algorithms are described in the user manual.

How do you use MySQL explain?

In MySQL, EXPLAIN can be used in front of a query beginning with SELECT , INSERT , DELETE , REPLACE , and UPDATE . For a simple query, it would look like the following: EXPLAIN SELECT * FROM foo WHERE foo. bar = 'infrastructure as a service' OR foo.


1 Answers

It looks that, for some reason, MySQL chooses to use the index SIL on the first table and it uses it both for lookup (WHERE sil_id = 4601038) and grouping (GROUP BY cu.Id).

You can tell it to use the PK of the table

SELECT cu.Href, COUNT(p.CatUrlId) FROM cat_urls cu
    USE INDEX FOR JOIN (PRIMARY)
JOIN products p ON p.CatUrlId=cu.Id
WHERE sil_id=4601038
GROUP by cu.Id

and it will produce this execution plan:

id | select_type | table | type  | possible_keys | key     | key_len | ref              | rows | Extra
---+-------------+-------+-------+---------------+---------+---------+------------------+------+-------------
1  | SIMPLE      | cu    | index | PRIMARY       | PRIMARY | 4       | NULL             | 1    | Using where
1  | SIMPLE      | p     | ref   | CatUrl        | CatUrl  | 4       | cbs-test-1.cu.Id | 1    | Using index

Ignore the values reported in column rows; they are not correct because my tables are empty.

Notice the Extra column now contains only Using where but also notice that the join type column changed from ref (very good) to index (full index scan, not quite good).

A better solution is to add an index on column SIL_Id. I know, SIL_Id is a prefix of index SIL(SIL_Id, AsCatId) and in theory another index on column SIL_Id is completely useless. But it seems it solves the issue on this case.

ALTER TABLE cat_urls
  ADD INDEX (SIL_Id)
;

Now use it in the query:

SELECT cu.Href, COUNT(p.CatUrlId) FROM cat_urls cu
    USE INDEX FOR JOIN (SIL_Id)
JOIN products p ON p.CatUrlId=cu.Id
WHERE sil_id=4601038
GROUP by cu.Id

The query execution plan looks much better now:

id | select_type | table | type | possible_keys | key    | key_len | ref              | rows | Extra
---+-------------+-------+------+---------------+--------+---------+------------------+------+-------------
1  | SIMPLE      | cu    | ref  | SIL_Id        | SIL_Id | 4       | const            | 1    | Using where
1  | SIMPLE      | p     | ref  | CatUrl        | CatUrl | 4       | cbs-test-1.cu.Id | 1    | Using index

The drawback is that we have an extra index that is (theoretically) useless. It occupies storage space and it consumes processor cycles every time a row is added, deleted or have its SIL_Id field modified.

like image 184
axiac Avatar answered Sep 22 '22 16:09

axiac