I've got a basic MySQL performance question related to explain. I have two queries that return the same result and I am trying to understand how to make sense of the EXPLAIN
of the execution plans.
The table has 50000 records in it and I am performing a record comparison. My first query takes 18.625 secs to run. The explain plan is as follows.
id select_type table type possible_keys key key_len ref rows filtered Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 SIMPLE a ALL NULL NULL NULL NULL 49520 100.00
1 SIMPLE b ref scoreEvent,eventScore eventScore 4 olympics.a.eventId 413 100.00 Using where; Using index; Not exists
1 SIMPLE c ref PRIMARY,scoreEvent,eventScore scoreEvent 8 olympics.a.score,olympics.a.eventId 4 100.00 Using where; Using index; Not exists
My next query has takes 0.106 secs to run...
id select_type table type possible_keys key key_len ref rows filtered Extra
-----------------------------------------------------------------------------------------------------------------------------------
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 50000 100.00 Using temporary; Using filesort
2 DERIVED results ALL NULL NULL NULL NULL 49520 100.00 Using filesort
In the documentation it says that ALL
requires a full table scan and this is very bad. It also says that filesort
requires an extra pass to sort the records, it also says that Not exists
means MySQL was able to do a LEFT JOIN
optimization. It is also clear that the first method is using indexes whereas the second method doesn't.
I'm trying to work out what is going on here and what maths is involved. I am running RESET QUERY CACHE
between tests to ensure one isn't given any sort of unfair advantage. 49520 x 413 x 4 is a lot smaller than 50000 x 49520.
Is it to do with the id
in the explain plan?
When I'm testing these and other queries it seems that my observations are that the query complexity can be approximated by multiplying items with the same id and adding the result of each id together... Is this a valid assumption?
As requested in the comments the schema and the queries just in case it helps, but I am not looking for better queries... Merely an explanation of the EXPLAIN
. The table in question...
CREATE TABLE results (
resultId INT NOT NULL auto_increment KEY,
athleteId INT NOT NULL,
eventId INT NOT NULL,
score INT NOT NULL,
CONSTRAINT FOREIGN KEY (athleteId) REFERENCES athletes(athleteId),
CONSTRAINT FOREIGN KEY (eventId) REFERENCES events(eventId),
INDEX eventScore (eventId, score),
INDEX scoreEvent (score, eventId)
) ENGINE=innodb;
The first query...
SELECT a.resultId, a.eventId, a.athleteId, a.score
FROM results a
-- Find records with matching eventIds and greater scores
LEFT JOIN results b
ON b.eventId = a.eventId
AND b.score > a.score
-- Find records with matching scores and lesser testIds
LEFT JOIN results c
ON c.eventId = a.eventId
AND c.score = a.score
AND c.resultId < a.resultId
-- Filter out all records where there were joins
WHERE c.resultId IS NULL
AND b.resultId IS NULL;
The second query...
SELECT resultId, athleteId, eventId, score
FROM (
SELECT resultId, athleteId, eventId, score
FROM results
ORDER BY eventId, score DESC, resultId
) AS a
GROUP BY eventId;
I also noticed that if I drop the index eventScore
that the query drops down to 2.531 secs and the execution plan doesn't so much change but the order of the possible_keys changes and it's not Using index
for table b
(ignore the slight changes in row counts I am generating data each time I change the schema)...
id select_type table type possible_keys key key_len ref rows filtered Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 SIMPLE a ALL NULL NULL NULL NULL 47457 100.00
1 SIMPLE b ref eventId,scoreEvent eventId 4 olympics.a.eventId 659 100.00 Using where; Not exists
1 SIMPLE c ref PRIMARY,eventId,scoreEvent scoreEvent 8 olympics.a.score,olympics.a.eventId 5 100.00 Using where; Using index; Not exists
A query on a huge table can be performed without reading all the rows; a join involving several tables can be performed without comparing every combination of rows. The set of operations that the optimizer chooses to perform the most efficient query is called the “query execution plan”, also known as the EXPLAIN plan.
The EXPLAIN PLAN statement displays execution plans chosen by the Oracle optimizer for SELECT , UPDATE , INSERT , and DELETE statements. A statement's execution plan is the sequence of operations Oracle performs to run the statement. The row source tree is the core of the execution plan.
The EXPLAIN statement provides information about how MySQL executes statements: EXPLAIN works with SELECT , DELETE , INSERT , REPLACE , and UPDATE statements. When EXPLAIN is used with an explainable statement, MySQL displays information from the optimizer about the statement execution plan.
A query plan (or query execution plan) is a sequence of steps used to access data in a SQL relational database management system. This is a specific case of the relational model concept of access plans.
In fact when you see you should not to multiply, but sum this numbers. In you case compare (49520 x 413 x 4) and (50000 + 49520).
Gereral rule is simple: summarize all segments (DERIVED, PRIMARY) and multiply rows within each segment.
id select_type ... rows
1 PRIMARY 1
1 PRIMARY 2
2 DERIVED 3
2 DERIVED 4
3 DERIVED 5
3 DERIVED 6
Complexity is: 1*2 + 3*4 + 5*6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With