I am reading High Performance MySQL and I am a little confused about deferred joins.
The book says that the following query cannot be optimized by INDEX(sex, rating), because the high offset forces the server to spend most of its time scanning rows that it will then throw away.
mysql> SELECT <cols> FROM profiles WHERE sex='M' ORDER BY rating LIMIT 100000, 10;
While a deferred join helps minimize the amount of work MySQL must do gathering data that it will only throw away.
SELECT <cols> FROM profiles INNER JOIN (
SELECT <primary key cols> FROM profiles
WHERE sex='M' ORDER BY rating LIMIT 100000, 10
) AS x USING(<primary key cols>);
Why does a deferred join minimize the amount of data gathered?
The example you presented assumes that InnoDB is used. Let's say that the PRIMARY KEY is just id.

INDEX(sex, rating) is a "secondary key". Every secondary key (in InnoDB) includes the PK implicitly, so it is really an ordered list of (sex, rating, id) values. To get to the "data" (<cols>), it uses id to drill down the PK BTree (which contains the data, too) to find the record.
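For concreteness, here is a minimal sketch of the kind of schema being assumed (the bio column and the exact types are made up for illustration):

CREATE TABLE profiles (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    sex     ENUM('M','F') NOT NULL,
    rating  INT NOT NULL,
    bio     TEXT,                      -- stands in for the bulky <cols>
    PRIMARY KEY (id),                  -- clustered: leaf nodes hold the full rows
    INDEX sex_rating (sex, rating)     -- secondary: leaf nodes hold (sex, rating, id)
) ENGINE=InnoDB;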
Fast Case: Hence,

SELECT id FROM profiles
WHERE sex='M' ORDER BY rating LIMIT 100000, 10

will do a "range scan" of 100010 'rows' in the index. This will be quite efficient for I/O, since all the information is consecutive, and nothing is wasted. (No, it is not smart enough to jump over 100000 rows; that would be quite messy, especially when you factor in the transaction isolation mode.) Those 100010 rows probably fit in about 1000 blocks of the index. Then it gets the 10 values of id.
With those 10 ids, it can do 10 joins ("NLJ" = "Nested Loop Join"). It is rather likely that the 10 rows are scattered around the table, possibly requiring 10 hits to the disk.
Let's "count the disk hits" (ignoring non-leaf nodes in the BTrees, which are likely to be cached anyway): 1000 + 10 = 1010. On ordinary disks, this might take 10 seconds.
Slow Case: Now let's look at the original query (SELECT <cols> FROM profiles WHERE sex='M' ORDER BY rating LIMIT 100000, 10;). Let's continue to assume INDEX(sex, rating) plus the implicit id on the end.
As before, it will index scan through the 100010 rows (est. 1000 disk hits). But as it goes, it is too dumb to do what was done above. It will reach over into the data to get the <cols>. This often (depending on caching) requires a random disk hit. This could be upwards of 100010 disk hits (if the table is huge and caching is not very useful).
Again, 100000 are tossed and 10 are delivered. Total 'cost': 100010 disk hits (worst case), which might take 17 minutes.
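(Same rule of thumb: 100010 hits × 10 ms ≈ 1000 seconds, or roughly 17 minutes.)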
Keep in mind that there are 3 editions of High Performance MySQL; they were written over the past 13 or so years. You are probably using a much newer version of MySQL than they covered. I do not happen to know if the optimizer has gotten any smarter in this area. These, if available to you, may give clues:
EXPLAIN FORMAT=JSON SELECT ...;
OPTIMIZER TRACE...
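For example, the optimizer trace can be captured like this (a sketch; what the JSON shows, and how useful it is here, varies by MySQL version):

SET optimizer_trace = 'enabled=on';
SELECT <cols> FROM profiles WHERE sex='M' ORDER BY rating LIMIT 100000, 10;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;   -- trace for the statement just run
SET optimizer_trace = 'enabled=off';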
My favorite "Handler" trick for studying how things work may be helpful:
FLUSH STATUS;
SELECT ...
SHOW SESSION STATUS LIKE 'Handler%';
You are likely to see numbers like 100000 and 10, or small multiples of such. But, keep in mind that a fast range scan of the index counts as 1 per row, and so does a slow random disk hit for a big set of <cols>.
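To compare the two formulations side by side, you could run the trick once for each (a sketch; the exact counter names and counts depend on version, engine, and data):

-- original query
FLUSH STATUS;
SELECT <cols> FROM profiles WHERE sex='M' ORDER BY rating LIMIT 100000, 10;
SHOW SESSION STATUS LIKE 'Handler%';   -- roughly 100010 reads, each dragging the wide row along

-- deferred join
FLUSH STATUS;
SELECT <cols> FROM profiles
INNER JOIN ( SELECT id FROM profiles
             WHERE sex='M' ORDER BY rating LIMIT 100000, 10 ) AS x USING(id);
SHOW SESSION STATUS LIKE 'Handler%';   -- still ~100010 index reads, but only ~10 touch the full rows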
Overview: To make this technique work, the subquery needs a "covering" index, with the columns correctly ordered.
"Covering" means that (sex, rating, id)
contains all the columns touched. (We are assuming that <cols>
contains other columns, perhaps bulky ones that won't work in an INDEX
.)
"Correct" ordering of the columns: The columns are in just the right order to get all the way through the query. (See also my cookbook.)
WHERE
columns compared with =
to constants. (sex
)ORDER BY
, in order. (rating
)id
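Putting the pieces together against the hypothetical schema sketched earlier (bio standing in for the bulky <cols>), the deferred join would look something like this:

SELECT p.id, p.sex, p.rating, p.bio
FROM profiles AS p
INNER JOIN (
    SELECT id                   -- only the PK; (sex, rating, id) makes this subquery "covering"
    FROM profiles
    WHERE sex = 'M'             -- '=' against the first indexed column
    ORDER BY rating             -- next indexed column, so no extra sort is needed
    LIMIT 100000, 10
) AS x USING(id);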