Can someone explain why these two queries (one using IN
and one using EXISTS
) return different results in MySQL 5.6 but not in MySQL 5.5?
Using EXPLAIN
, I can see different execution plans for each, but I need help understanding what's going on, and why would this IN
logic be broken in 5.6 but not 5.5?
Fiddle illustrating the problem: http://sqlfiddle.com/#!9/da52b/95
Members can have two addresses: a home address and a firm address. The desired result is to provide a region X and get a list of all members with a mailing address in that region. The mailing address is the firm address if it exists, otherwise it is the home address. Cities can belong to one or more regions.
Simplified database structure and data:
CREATE TABLE `city` (
`c_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`c_name` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`c_id`)
);
INSERT INTO `city`
VALUES
('1', 'Hillsdale'),
('2', 'Smallville'),
('3', 'Oakside'),
('4', 'Lakeview');
CREATE TABLE `city_region` (
`cr_city` int(11) unsigned NOT NULL,
`cr_region` int(11) NOT NULL,
PRIMARY KEY (`cr_city`,`cr_region`)
);
INSERT INTO `city_region`
VALUES
('1', '3'),
('2', '1'),
('3', '1'),
('3', '2'),
('4', '1'),
('4', '3');
CREATE TABLE `firm_address` (
`fa_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`fa_member` int(11) NOT NULL,
`fa_city` int(11) NOT NULL,
PRIMARY KEY (`fa_id`)
);
INSERT INTO `firm_address`
VALUES
('1', '1', '3'),
('2', '2', '2'),
('3', '3', '1'),
('4', '6', '2'),
('5', '7', '1');
CREATE TABLE `home_address` (
`ha_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`ha_member` int(11) NOT NULL,
`ha_city` int(11) NOT NULL,
PRIMARY KEY (`ha_id`)
);
INSERT INTO `home_address`
VALUES
('1', '1', '2'),
('2', '2', '3'),
('3', '3', '1'),
('4', '4', '1'),
('5', '5', '2'),
('6', '6', '2');
CREATE TABLE `member` (
`m_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`m_name` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`m_id`)
);
INSERT INTO `member`
VALUES
('1', 'John'),
('2', 'Bob'),
('3', 'Dave'),
('4', 'Jane'),
('5', 'Mary'),
('6', 'Karen'),
('7', 'Christie');
CREATE TABLE `region` (
`r_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`r_name` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`r_id`)
);
INSERT INTO `region`
VALUES
('1', 'Central'),
('2', 'Lake District'),
('3', 'Westside');
Query 1 (wrong, missing a member):
SELECT * FROM member
LEFT OUTER JOIN home_address ON m_id = ha_member
LEFT OUTER JOIN city home_city ON ha_city = home_city.c_id
LEFT OUTER JOIN firm_address ON m_id = fa_member
LEFT OUTER JOIN city firm_city ON fa_city = firm_city.c_id
WHERE 1 IN (
SELECT r_id
FROM region
INNER JOIN city_region ON r_id = cr_region
WHERE cr_city = IF(fa_city IS NULL, ha_city, fa_city)
)
Query 2 (returning the correct results):
SELECT * FROM member
LEFT OUTER JOIN home_address ON m_id = ha_member
LEFT OUTER JOIN city home_city ON ha_city = home_city.c_id
LEFT OUTER JOIN firm_address ON m_id = fa_member
LEFT OUTER JOIN city firm_city ON fa_city = firm_city.c_id
WHERE EXISTS (
SELECT r_id
FROM region
INNER JOIN city_region ON r_id = cr_region
WHERE cr_city = IF(fa_city IS NULL, ha_city, fa_city)
AND r_id = 1
)
Any help understanding this inconsistency would be appreciated.
Thank you.
I spent some time looking at this today, and it appears to be a bug in MySQL 5.6. (I also tested MySQL 5.6.15 and got the same result.)
MySQL 5.6 uses some new optimizations in executing this query, but they do not seem to be responsible for the difference, as it does not help to set e.g.:
set session optimizer_switch="block_nested_loop=off";
Using IFNULL(fa_city, ha_city)
instead of IF(fa_city IS NULL, ha_city, fa_city)
does yield a correct result, so the bug appears to be somewhere in the processing of IF()
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With