I ran a query in Mysql like below:
EXPLAIN
SELECT *
FROM(
SELECT * # Select Number 2
FROM post
WHERE parentid = 13
ORDER BY time, id
LIMIT 1, 10
) post13_childs
JOIN post post13_childs_childs
ON post13_childs_childs.parentid = post13_childs.id
and the result was:
id |select_type |table |type |possible_keys |key |key_len |ref |rows |Extra
1 |PRIMARY |<derived2> |ALL | NULL | NULL |NULL |NULL |10 |
1 |PRIMARY |post13_childs_childs|ref |parentid |parentid |9 |post13_childs.id |10 |Using where
2 |DERIVED |post |ALL |parentid |parentid |9 | |153153 |Using where; Using filesort
This means it used the index parentid
but scaned all rows due to ALL
and 153153
.
Why could not the index help to not Full Scannig
?
Although if i run the derived query (Select #2) alone like below:
Explain
SELECT * FROM post
WHERE parentid=13
ORDER BY time , id
LIMIT 1,10
the result would be desired:
id |select_type |table |type |possible_keys |key |key_len |ref |rows |Extra
1 |SIMPLE |post |ref |parentid |parentid |9 |const|41 |Using where; Using filesort
The table post
has these indexes:
count of total rows --> 141280.
count of children of 13
(parentid=13
) --> 41
count of children of 11523
--> 10119
When i add index of (parent,time,id)
, problem of first query would be solved by the explin output for 13
--> 40 rows, type:ref
and for 11523
--> 19538 rows, type:ref!!! this Means all children rows of 11423
is examined while i limited first 10 rows.
In MongoDB, you can use the cursor. explain() method or the db. collection. explain() method to determine whether or not a query uses an index.
Optimizing Queries with EXPLAIN It displays information from a built-in MySQL optimizer regarding the statement execution plan and the number of rows scanned in each table. Thus we can determine the cost of the query. The query below shows how EXPLAIN works with the SELECT statement.
The key_len column indicates the length of the key that MySQL decided to use. The value of key_len enables you to determine how many parts of a multiple-part key MySQL actually uses. If the key column says NULL , the key_len column also says NULL .
Extra – contains additional information regarding the query execution plan. Values such as “Using temporary”, “Using filesort”, etc. in this column may indicate a troublesome query. For a complete list of possible values and their meaning, check out the MySQL documentation.
Your subquery:
SELECT * # Select Number 2
FROM post
WHERE parentid = 13
ORDER BY time, id
LIMIT 1, 10;
This mentions three columns explicitly, plus all the rest of the columns You have three indexes. Here is how they can be used:
order by
clause, it is the second conditionwhere
clause. However, after the correct data is pulled, it then would need to be sorted explicitly.parentid
is met. Just to introduce why optimization is hard. If you have a small amount of data (say the table fits on one or two pages), then a full table scan followed by a sort is probably fine. If most of the parentid
values are 13
, then the second index could be a worst case. If the table does not fit into memory, then the third would be incredibly slow (something called page thrashing).
The correct index for this subquery is one that satisfies the where
clause and allows ordering. That index is parentid, time, id
. This is not a covering index (unless these are all the columns in the table). But it should reduce the number of hits to actual rows to 10 because of the limit
clause.
Note that for the complete query, you want an index on parentid
. And, happily, an index on parentid, time, id
counts as such an index. So, you can remove that index. The time, id
index is probably not necessary, unless you need that for other queries.
Your query is also filtering only those "children" that have "children" themselves. It is quite possible that no rows will be returned. Do you really intend a left outer join
?
As a final comment. I assume that this query is a simplification of your real query. The query is pulling all columns from two tables -- and those two tables are the same. That is, you will be getting duplicate column names from identical tables. You should have column aliases to better define the columns.
Doing an ORDER BY that is not helped by any index can regularly kill performance. For the inner query, I would have a covering index on (parentID, time, id ) so that both the WHERE and ORDER BY clauses can utilize the index. Since the parentID is also the basis of the join afterwords, it should be good to go there to and be quite fast.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With