I have a table of books:
CREATE TABLE `books` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`nameOfBook` VARCHAR(32),
`releaseDate` DATETIME NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `Index 2` (`releaseDate`, `id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=33029692;
I compared two SQL queries that paginate with a sort on releaseDate. Both queries return the same result.
(simple one)
select SQL_NO_CACHE id, nameOfBook, releaseDate
from books
where releaseDate <= '2016-11-07'
AND (releaseDate<'2016-11-07' OR id < 3338191)
ORDER by releaseDate DESC, id DESC limit 50;
and
(tuple comparison, also called row comparison)
select SQL_NO_CACHE id, nameOfBook, releaseDate
from books
where (releaseDate ,id) < ('2016-11-07',3338191)
ORDER by releaseDate DESC, id DESC limit 50;
When I run EXPLAIN on the two queries, I get the following.
Simple one:
"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"books";"range";"PRIMARY,Index 2";"Index 2";"9";"";"1015876";"Using where; Using index"
We can see that it estimates 1015876 rows to examine.
The EXPLAIN for the tuple comparison:
"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"books";"index";"";"Index 2";"13";"";"50";"Using where; Using index"
We can see that it estimates only 50 rows to examine.
But when I check the execution time, the simple one gives:
/* Affected rows: 0 Found rows: 50 Warnings: 0 Duration for 1 query: 0.031 sec. */
and the tuple one gives:
/* Affected rows: 0 Found rows: 50 Warnings: 0 Duration for 1 query: 3.682 sec. */
I don't understand why the tuple comparison looks better according to EXPLAIN, yet its execution time is so much worse.
I've been irritated by this for years. WHERE (a,b) > (1,2) has never been optimized, even though it is easily transformed into the other formulation. Even the other formulation was poorly optimized until a few years ago.
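As a rough sketch of the transformation mentioned above: row comparison is lexicographic, so (a,b) < (x,y) means a < x OR (a = x AND b < y), and likewise for >. The literal values and column aliases below are made up purely for illustration.

```sql
-- Row (tuple) comparison is lexicographic: the first column decides,
-- and the second column only breaks ties.
SELECT
  (1, 5) < (2, 0) AS first_column_decides,  -- 1, because 1 < 2
  (2, 0) < (2, 1) AS second_breaks_tie,     -- 1, because 2 = 2 and 0 < 1
  (2, 1) < (2, 1) AS equal_rows;            -- 0, equal rows are not "less than"
```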
Using EXPLAIN FORMAT=JSON SELECT ... might give you some better clues.
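For the tuple query from the question, that would look something like the sketch below. It assumes the books table and `Index 2` exactly as defined in the question; the select list is trimmed to the two indexed columns for brevity.

```sql
-- Ask the optimizer for its plan in JSON form instead of the tabular form.
EXPLAIN FORMAT=JSON
SELECT id, releaseDate
FROM books
WHERE (releaseDate, id) < ('2016-11-07', 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;
```

In the JSON output, fields such as access_type and used_key_parts are probably the first things to look at: they should show whether the index is being used for a range or merely scanned from one end.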
Meanwhile, for the simple query, EXPLAIN ignored the LIMIT and suggested 1015876. In many cases, EXPLAIN provides a "decent" row estimate, but not in either of these.
Feel free to file a bug report: http://bugs.mysql.com (and post the link here).
Another formulation was recently optimized, in spite of OR being historically un-optimizable:
where releaseDate < '2016-11-07'
OR (releaseDate = '2016-11-07' AND id < 3338191)
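Dropped into the query from the question, that WHERE clause would look roughly like this. It is only a sketch: the column names come from the CREATE TABLE in the question, and whether the optimizer turns it into a range scan depends on the MySQL version.

```sql
-- Keyset pagination: fetch the 50 rows that sort just before
-- (releaseDate = '2016-11-07', id = 3338191).
SELECT id, nameOfBook, releaseDate
FROM books
WHERE releaseDate < '2016-11-07'
   OR (releaseDate = '2016-11-07' AND id < 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;
```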
For measuring query optimizations, I like to do:
FLUSH STATUS;
SELECT ...
SHOW SESSION STATUS LIKE 'Handler%';
Small values, such as 50 in your case, indicate good optimization; large values (around 1M) indicate a scan. The Handler numbers are exact, unlike the estimates in EXPLAIN.
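Applied to the slow tuple query, the whole measurement might look like this sketch (again assuming the table from the question, with the select list trimmed to the indexed columns). Run all three statements in the same session, since the counters are per session.

```sql
-- Reset this session's status counters so only the next query is measured.
FLUSH STATUS;

SELECT id, releaseDate
FROM books
WHERE (releaseDate, id) < ('2016-11-07', 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;

-- Exact counts of the row-level operations the storage engine performed.
SHOW SESSION STATUS LIKE 'Handler%';
```

Running the same sequence for the other formulation and comparing the two sets of counters side by side should make the difference obvious. For a descending index scan I would expect most of the activity to show up in Handler_read_prev, but treat that as an assumption rather than a promise.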
Update: 5.7.3 has improved handling of tuples, aka "row constructors".
Update: MySQL Bug#104128 covers this.