I have a table that has a foreign key to a table that stores some blob data. When I do an inner join on the tables with a condition on the main table the join type goes from 'index' to 'ALL'. I would like to avoid this as my blob table is on the order of tens of gigabytes. How can I avoid it?
Here is the basic inner join:
EXPLAIN SELECT m.id, b.id, b.data
FROM metadata m, blobstore b
WHERE m.fkBlob = b.id;
id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra
1, 'SIMPLE', 'm', 'index', 'fk_blob', 'fk_blob', '4', '', 1, 'Using index'
1, 'SIMPLE', 'b', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'blob_index.m.fkBlob', 1, ''
Here I add a condition on the main table:
EXPLAIN SELECT m.id, b.id, b.data
FROM metadata m, blobstore b
WHERE m.fkBlob = b.id AND m.start < '2009-01-01';
id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra
1, 'SIMPLE', 'b', 'ALL', 'PRIMARY', '', '', '', 1, ''
1, 'SIMPLE', 'm', 'ref', 'fk_blob,index_start', 'fk_blob', '4', 'blob_index.b.id', 1, 'Using where'
Notice that the order in which the tables are listed has changed. It is now doing a full table scan on the blob table because of a condition I've added regarding the main table.
Here is the schema:
DROP TABLE IF EXISTS `blob_index`.`metadata`;
CREATE TABLE `blob_index`.`metadata` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`fkBlob` int(10) unsigned NOT NULL,
`start` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `fk_blob` (`fkBlob`),
KEY `index_start` (`start`),
CONSTRAINT `fk_blob` FOREIGN KEY (`fkBlob`) REFERENCES `blobstore` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `blob_index`.`blobstore`;
CREATE TABLE `blob_index`.`blobstore` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`data` mediumblob NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
A full table scan occurs when there is no suitable index or when the query does not use an available one. A full table scan is usually slower than an index lookup, and the larger the table, the longer the data takes to return.
Make sure that full table scans are the bottleneck before you spend a lot of time doing something that may only improve performance by 1%.
Parallelism:
SELECT /*+ PARALLEL */ * FROM Table1;
Parallelism can easily improve full table scan performance by an order of magnitude on many systems. Note that the PARALLEL hint shown here is Oracle syntax; MySQL has no hint of that name.
A table scan is the reading of every row in a table and is caused by queries that don't properly use indexes. Table scans on large tables take an excessive amount of time and cause performance problems.
I guess you are trying this on an empty table (MySQL estimates that the full table scan only has to read one row), which might influence the optimizer's choice of plan. When you run it on a real table, the EXPLAIN results might vary (and actually did vary in my test).
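If the plan still starts with the blob table once the tables hold realistic data, one thing to try is refreshing the statistics and then, if needed, pinning the join order. The statements below are only a sketch against the schema from the question. They assume that reading metadata first through index_start really is the cheaper plan, which is worth confirming with EXPLAIN on your own data.

```sql
-- Refresh the index statistics so the optimizer works from current row counts.
ANALYZE TABLE metadata, blobstore;

-- STRAIGHT_JOIN asks MySQL to join the tables in the order they are listed,
-- so metadata is read first and blobstore is probed by primary key.
-- Assumption: the date filter on metadata is selective enough to make this cheaper.
EXPLAIN SELECT STRAIGHT_JOIN m.id, b.id, b.data
FROM metadata m
INNER JOIN blobstore b ON b.id = m.fkBlob
WHERE m.start < '2009-01-01';

-- Alternative: leave the join order to the optimizer but insist on the date index.
-- This may or may not change the join order; check the EXPLAIN output.
EXPLAIN SELECT m.id, b.id, b.data
FROM metadata m FORCE INDEX (index_start)
INNER JOIN blobstore b ON b.id = m.fkBlob
WHERE m.start < '2009-01-01';
```

Both hints override the optimizer, so they are best kept only if EXPLAIN and real timings show an improvement on a populated table.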