Almost everywhere I read the use of FORCE INDEX
is highly discouraged and I perfectly understand and know why - there are immensely huge chances that MySQL knows better what indexes to choose than the (average) developer.
However, recently I have found a case where FORCE INDEX
improved my execution times in the range of hundred of times:
JOIN
on 4 tablesINNER JOIN
ed table have more than 1 milion recordspublished_date
, stored as varchar
in YMD format (could not changed to datetime
)published_date
of at most 5 000 recordsGROUP BY
and ORDER BY
clauses on the first table on different fields than published_date
were needed for this queryAlthough I've rewritten the query in many ways, I was not able to get execution times smaller than 130 seconds (with highest being over 700 ). After using FORCE INDEX
with published_date
, the execution time dropped below 5 seconds.
It took me a few days to remember about the infamous FORCE INDEX
option.
Questions:
FORCE INDEX
saved you?FORCE INDEX
?Edit - Obsevations: I created this blog post with the question here also. All the answer you provide will appear there too - whith credits and all the stuff you want.
Edit 2
I applied the suggestions I received in your comments (ANALYZE TABLE
and OPTIMIZE TABLE
), below is the output of EXPLAIN
applied on query - unfortunately the index selection is not at all better:
FORCE INDEX
on table with alias a
:id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE am2 range PRIMARY,idx_meta_article idx_meta_article 4 NULL 275228 Using where; Using index; Using temporary; Using f...
1 SIMPLE a eq_ref PRIMARY,serial_issue_date_productid,pub_date,idx_d... PRIMARY 4 mydb_toto.am2.ArticleID 1 Using where
1 SIMPLE ai ref PRIMARY,idx_iso_article PRIMARY 4 mydb_toto.a.serial 11523 Using where; Using index
1 SIMPLE m range PRIMARY,meta_articles_type meta_articles_type 4 NULL 96 Using where
1 SIMPLE am eq_ref PRIMARY,idx_meta_article PRIMARY 8 mydb_toto.a.serial,mydb_toto.m.meta_id 1 Using where; Using index
FORCE INDEX
on table with alias a
:id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a range pub_date pub_date 11 NULL 17679 Using where; Using temporary; Using filesort
1 SIMPLE am2 ref PRIMARY,idx_meta_article PRIMARY 4 mydb_toto.a.serial 21930 Using where; Using index
1 SIMPLE ai ref PRIMARY,idx_iso_article PRIMARY 4 mydb_toto.a.serial 11523 Using where; Using index
1 SIMPLE m range PRIMARY,meta_articles_type meta_articles_type 4 NULL 96 Using where
1 SIMPLE am eq_ref PRIMARY,idx_meta_article PRIMARY 8 mydb_toto.am2.ArticleID,mydb_toto.m.meta_id 1 Using where; Using index
ANALYZE TABLE
, without FORCE INDEX
:id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE am2 range PRIMARY,idx_meta_article idx_meta_article 4 NULL 275228 Using where; Using index; Using temporary; Using f...
1 SIMPLE a eq_ref PRIMARY,serial_issue_date_productid,pub_date,idx_d... PRIMARY 4 mydb_toto.am2.ArticleID 1 Using where
1 SIMPLE ai ref PRIMARY,idx_iso_article PRIMARY 4 mydb_toto.a.serial 11523 Using where; Using index
1 SIMPLE m range PRIMARY,meta_articles_type meta_articles_type 4 NULL 96 Using where
1 SIMPLE am eq_ref PRIMARY,idx_meta_article PRIMARY 8 mydb_toto.a.serial,mydb_toto.m.meta_id 1 Using where; Using index
OPTIMIZE TABLE
, without FORCE INDEX
:id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE am2 range PRIMARY,idx_meta_article idx_meta_article 4 NULL 275228 Using where; Using index; Using temporary; Using f...
1 SIMPLE a eq_ref PRIMARY,serial_issue_date_productid,pub_date,idx_d... PRIMARY 4 mydb_toto.am2.ArticleID 1 Using where
1 SIMPLE ai ref PRIMARY,idx_iso_article PRIMARY 4 mydb_toto.a.serial 11523 Using where; Using index
1 SIMPLE m range PRIMARY,meta_articles_type meta_articles_type 4 NULL 96 Using where
1 SIMPLE am eq_ref PRIMARY,idx_meta_article PRIMARY 8 mydb_toto.a.serial,mydb_toto.m.meta_id 1 Using where; Using index
OPTIMIZE TABLE
and ANALYZE TABLE
, with FORCE INDEX
:id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a range pub_date pub_date 11 NULL 17679 Using where; Using temporary; Using filesort
1 SIMPLE am2 ref PRIMARY,idx_meta_article PRIMARY 4 mydb_toto.a.serial 21930 Using where; Using index
1 SIMPLE ai ref PRIMARY,idx_iso_article PRIMARY 4 mydb_toto.a.serial 11523 Using where; Using index
1 SIMPLE m range PRIMARY,meta_articles_type meta_articles_type 4 NULL 96 Using where
1 SIMPLE am eq_ref PRIMARY,idx_meta_article PRIMARY 8 mydb_toto.am2.ArticleID,mydb_toto.m.meta_id 1 Using where; Using index
The particular value in a table field can be searched quickly by using an index. If no index is defined in the table, the search will start from the first record and stop searching where the searching value exists.
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs.
I have noticed that FORCE INDEX helps when you have multiple joins and sub-queries on VARCHAR fields where both the FK and the referenced value are not the primary key, while at the same time having where clause on a DATE field.
Something like:
SELECT NAME, a.reference_no, i.value, p.value FROM customers AS c
INNER JOIN accounts AS a ON c.id = a.customer_id
INNER JOIN invoices AS i ON i.reference_no = a.reference_no
INNER JOIN payments AS p ON p.invoice_no = i.invoice_no
WHERE payments.date >= '2011-09-01' AND DATE < '2011-10-01';
mysql will always use the PKs and FKs, where you would it to use the payment_date index on the payments table first as it is the largest one. So a FORCE INDEX(payment_date)
on the payments table join would help a lot.
This an example from the third-party billing database that we use at work. We had huge issues with optimisation, and FORCE INDEX did the job most of the times. Usually we found the slow quires with mysqladmin, tested them with FORCE INDEX and send them to the vendors to rewrite them in the source code of the app.
Here are the four tables to get a better grip on the example:
CREATE TABLE `customers` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1;
CREATE TABLE `accounts` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`customer_id` int(11) NOT NULL,
`reference_no` varchar(10) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `reference_no_uniq` (`reference_no`),
KEY `FK_accounts` (`customer_id`),
CONSTRAINT `FK_accounts` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;
CREATE TABLE `invoices` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`reference_no` varchar(10) NOT NULL,
`invoice_no` varchar(10) NOT NULL,
`value` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `invoice_no_uniq` (`invoice_no`),
KEY `FK_invoices` (`reference_no`),
CONSTRAINT `FK_invoices` FOREIGN KEY (`reference_no`) REFERENCES `accounts` (`reference_no`)
) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=latin1;
CREATE TABLE `payments` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`invoice_no` varchar(10) NOT NULL,
`value` int(11) NOT NULL,
`date` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `FK_payments` (`invoice_no`),
KEY `payment_date` (`date`),
CONSTRAINT `FK_payments` FOREIGN KEY (`invoice_no`) REFERENCES `invoices` (`invoice_no`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
I notice through your EXPLAIN
plans that the table sequence has changed, with the first two tables reversed, which may well be where your performance improvements are coming from, in addition to the use of the date index.
Have you looked into the use of STRAIGHT_JOIN
within the queries to force the sequence of the tables?
I have worked on a large database schema, where the optimal join configuration used STRAIGHT_JOIN
s all the way through the query, and the performance was 100 fold improved over the INNER JOIN
equivalents.
Unfortunately, I don't have access to the system any more to get some example EXPLAIN
plans, but the optimal table sequence went something like this;
Table 1 10 rows 1 analysed
Table 2 500 rows 50 analysed
Table 3 1,000,000 rows 300,000 analysed
Table 4 500,000,000 rows 4,000,000 analysed
Using STRAIGHT_JOIN
s to keep this sequence resulted in query performance far greater than the INNER JOIN
equivalent, which essentially just reversed the sequence of the tables.
Go back to your original query, remove the force index, and replace the INNER JOIN
s with STRAIGHT_JOIN
s and see what the explain plan gives you.
You may also want to create a composite index on the a
table using pub_date
and serial
, which I think will improve the query even further.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With