Mysql Explain Query with type "ALL" when an index is used

Tags:

sql

indexing

mysql

query-optimization

explain

I ran a query in MySQL like the one below:

EXPLAIN
SELECT *
FROM(
        SELECT *  # Select Number 2
        FROM post
        WHERE   parentid = 13
        ORDER BY time, id
        LIMIT 1, 10
    ) post13_childs
JOIN post post13_childs_childs
ON post13_childs_childs.parentid = post13_childs.id

and the result was:

id |select_type  |table               |type |possible_keys  |key      |key_len  |ref              |rows    |Extra
1  |PRIMARY      |<derived2>          |ALL  | NULL          | NULL    |NULL     |NULL             |10      |
1  |PRIMARY      |post13_childs_childs|ref  |parentid       |parentid |9        |post13_childs.id |10      |Using where
2  |DERIVED      |post                |ALL  |parentid       |parentid |9        |                 |153153  |Using where; Using filesort

This means it used the index parentid but still scanned all rows, as shown by type ALL and the row estimate of 153153. Why couldn't the index prevent a full scan?

However, if I run the derived query (Select #2) on its own, like below:

Explain
SELECT * FROM post  
WHERE parentid=13
ORDER BY time , id
LIMIT 1,10

the result is what I want:

id |select_type  |table  |type |possible_keys  |key      |key_len  |ref  |rows    |Extra
1  |SIMPLE       |post   |ref  |parentid       |parentid |9        |const|41      |Using where; Using filesort

Edit:

The table post has these indexes:

id (PRIMARY)
parentid
time, id (timeid)

count of total rows --> 141280.
count of children of 13 (parentid=13) --> 41
count of children of 11523 --> 10119

When I add an index on (parentid, time, id), the problem with the first query is solved: the EXPLAIN output for 13 shows 40 rows with type ref.
But for 11523 it shows 19538 rows with type ref! This means all child rows of 11523 are examined even though I limited the result to the first 10 rows.

asked Dec 20 '13 09:12

ahoo

2 Answers

Your subquery:

    SELECT *  # Select Number 2
    FROM post
    WHERE   parentid = 13
    ORDER BY time, id
    LIMIT 1, 10;

This references three columns explicitly (parentid, time, id), plus every other column through SELECT *. You have three indexes. Here is how each can be used:

id (PRIMARY) -- This index is useless here. Although id appears in the order by clause, it is only the second sort key.
parentid -- This index can be used for satisfying the where clause. However, after the matching rows are pulled, they would still need to be sorted explicitly.
time, id (timeid) -- This index can be used for the sort, with a big BUT. MySQL can scan the index to get everything in the right order. But it will have to check, row-by-row, whether the condition on parentid is met.
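
One way to compare these access paths is to ask MySQL for the plan under each of them using index hints. This is only a sketch, assuming the secondary indexes are really named parentid and timeid, as the EXPLAIN output and the index list in the question suggest. The plans you get depend on your data and MySQL version.

```sql
-- Push the optimizer toward the parentid index: filter first, then sort.
EXPLAIN
SELECT * FROM post FORCE INDEX (parentid)
WHERE parentid = 13
ORDER BY time, id
LIMIT 1, 10;

-- Push the optimizer toward the (time, id) index: ordered scan, filter row by row.
EXPLAIN
SELECT * FROM post FORCE INDEX (timeid)
WHERE parentid = 13
ORDER BY time, id
LIMIT 1, 10;

-- Hide both secondary indexes, so a table scan followed by a sort is the likely plan.
EXPLAIN
SELECT * FROM post IGNORE INDEX (parentid, timeid)
WHERE parentid = 13
ORDER BY time, id
LIMIT 1, 10;
```

Comparing the type, rows and Extra columns across the three outputs shows the trade-off that the optimizer is weighing.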

This illustrates why optimization is hard. If you have a small amount of data (say the table fits on one or two pages), then a full table scan followed by a sort is probably fine. If most of the parentid values are 13, then the second index could be a worst case. If the table does not fit into memory, then the third would be incredibly slow (something called page thrashing).

The correct index for this subquery is one that satisfies the where clause and allows ordering. That index is parentid, time, id. This is not a covering index (unless these are all the columns in the table). But it should reduce the number of hits to actual rows to 10 because of the limit clause.
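
As a rough sketch, the composite index could be created as follows. The index name parentid_time_id is made up for illustration. What matters is the column order: the filter column first, then the sort columns.

```sql
-- Hypothetical index name; filter column first, then the ORDER BY columns.
ALTER TABLE post ADD INDEX parentid_time_id (parentid, time, id);

-- Re-check the plan of the subquery on its own.
EXPLAIN
SELECT *
FROM post
WHERE parentid = 13
ORDER BY time, id
LIMIT 1, 10;
```

If the optimizer picks the new index for both the filter and the ordering, I would expect "Using filesort" to disappear from the Extra column, though that is worth confirming on your own data.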

Note that for the complete query, you want an index on parentid. And, happily, an index on parentid, time, id counts as such an index, because parentid is its leading column. So, you can remove the single-column parentid index. The time, id index is probably not necessary, unless you need it for other queries.
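
A cautious way to do that cleanup might look like the following. It assumes the single-column index is named parentid (the name shown in the key column of the EXPLAIN output) and that a composite index starting with parentid already exists. Check the real names before dropping anything.

```sql
-- List the existing indexes and their names before changing anything.
SHOW INDEX FROM post;

-- Drop the single-column index only once a composite index led by parentid is in place.
ALTER TABLE post DROP INDEX parentid;

-- Optional, and only if no other query relies on it (name taken from the question):
-- ALTER TABLE post DROP INDEX timeid;
```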

Your query also keeps only those "children" that have "children" themselves, because it uses an inner join. It is quite possible that no rows will be returned. Did you really intend a left outer join instead?
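
If the intent is to keep every child of 13, whether or not it has children of its own, the query could be sketched with a left outer join like this. It uses the same table and column names as the question; only the join type changes.

```sql
SELECT *
FROM (
        SELECT *
        FROM post
        WHERE parentid = 13
        ORDER BY time, id
        LIMIT 1, 10
    ) post13_childs
-- LEFT JOIN keeps children without children; their grandchild columns come back as NULL.
LEFT JOIN post post13_childs_childs
    ON post13_childs_childs.parentid = post13_childs.id;
```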

As a final comment, I assume that this query is a simplification of your real query. The query pulls all columns from two table references, and both refer to the same table. That is, you will be getting duplicate column names from identical tables. You should use column aliases to better define the columns.
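
For example, the outer select could name its columns explicitly. This sketch uses only the columns that appear in the question (id, parentid, time). The alias names are made up, and your real table presumably has more columns to list.

```sql
SELECT
    c.id    AS child_id,
    c.time  AS child_time,
    gc.id   AS grandchild_id,
    gc.time AS grandchild_time
FROM (
        SELECT id, time
        FROM post
        WHERE parentid = 13
        ORDER BY time, id
        LIMIT 1, 10
    ) AS c
JOIN post AS gc
    ON gc.parentid = c.id;
```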

answered Oct 19 '22 23:10

Gordon Linoff

Doing an ORDER BY that is not helped by any index can regularly kill performance. For the inner query, I would have a composite index on (parentid, time, id) so that both the WHERE and ORDER BY clauses can use the index. Since parentid is also the basis of the join afterwards, the index should serve there too and be quite fast.

answered Oct 19 '22 23:10

DRapp
