The problem exists in MySQL's newest version, so I even doubt that that may be a bug.
Here are two tables:
t1(id int), values (10),(2)
t2(id int), values (0),(null),(1)
Execute:
select id from t1 where id > all (select id from t2);
Return result set:
+------+
| id |
+------+
| 10 |
| 2 |
+------+
According to my knowledge and the page http://dev.mysql.com/doc/refman/5.5/en/all-subqueries.html
The statement should return empty result! Because each judgement in "where" leads to null, like this:
select id > all (select id from t2) as c1 from t1;
returns:
+------+
| c1 |
+------+
| NULL |
| NULL |
+------+
and actually select id from t1 where null;
returns nothing!
Finally, I tried this:
explain extended select id from t1 where id > all (select id from t2);
show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | select `test`.`t1`.`id` AS `id` from `test`.`t1` where <not>((`test`.`t1`.`id` <= (select max(`test`.`t2`.`id`) from `test`.`t2`))) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
We can see that MySQL optimizes the original SQL to this one, which actually fits the result set.
but I not think the optimized SQL equals the original one .
Am I wrong?
Update: Upon further analysis and unfolding of MySQL's > ALL
odd implementation. This answer should be considered as MySQL-specific. So for further disclaimer, explanation on answer here regarding > ALL
is not applicable to other RDBMSes (unless there are other RDBMSes that copied MySQL implementation). Internal translation from > ALL
to a MAX
construct, applies to MySQL only.
This:
select id from t1 where id > all (select id from t2);
is semantically equivalent to:
select id from t1 where id > (select max(id) from t2);
Since select max(id) from t2
returns 1, the second query materializes to this:
select id from t1 where id > 1
That's why it returns both 10
and 2
from table t1
One of the instances where NULL rules is being applied is when you use NOT IN
, an example:
DDL:
create table t1(id int);
insert into t1 values (10),(2);
create table t2(id int);
insert into t2 values (0),(null),(1);
Query:
select * from t1 where id not in (select id from t2);
-- above is evaluated same as the following query, so the rules about null applies,
-- hence the above and following query will not return any record.
select * from t1 where id <> 0 and id <> null and id <> 1;
-- to eliminate null side-effect, do this:
select * from t1 where id not in (select id from t2 where id is not null);
-- which is equivalent to this:
select * from t1 where id <> 0 and id <> 1;
The last two queries returns 10
and 2
, whereas the first two queries returns empty set
Live test: http://www.sqlfiddle.com/#!2/82865/1
Hope these examples erases your confusion with NULL rules.
Regarding
but I not think the optimized sql equals the original one .
Optimized sql being this:
select `test`.`t1`.`id` AS `id` from `test`.`t1` where <not>((`
test`.`t1`.`id` <= (select max(`test`.`t2`.`id`) from `test`.`t2`)))
That is really equivalent to your original query: select id from t1 where id > all (select id from t2);
The construct t1.field > all (select t2.field from t2)
is just a syntactic sugar for:
t1.field > (select max(t2.field) from t2)
If you will apply DeMorgan theorem on the optimized SQL by MySql:
not (t1.id <= (select max(t2.id) from t2))
That is equivalent to:
t1.id > (select max(t2.id) from t2)
Which in turn is equivalent to the syntactic sugar ALL
:
t1.id > ALL(select t2.id from t2)
This is a bug in MySQL (Reported and verified here).
The fix will be available in 5.6.7 (next 5.6x version) as well as in the next major tree (5.7x)
It differs from the stated behaviour in the MySQL docs and that prescribed in the ANSI standard.
Moreover it is not even consistent in MySQL and you get different results when the sub query references a table compared to when the sub query contains (the same) literal values.
CREATE TABLE t2
(
id INT
);
INSERT INTO t2
VALUES (0),
(NULL),
(1);
/*Returns row with 10*/
SELECT *
FROM (SELECT 10 AS id) T
WHERE id > ALL (SELECT id
FROM t2);
/*Returns no rows. Explain Plan says "Impossible Where"*/
SELECT *
FROM (SELECT 10 AS id) T
WHERE id > ALL (SELECT 0
UNION ALL
SELECT NULL
UNION ALL
SELECT 1);
The second behaviour is correct per the spec. How 10 > ALL( (0),(null),(1) )
ought to be logically evaluated as follows
10 > 0 = TRUE
10 > NULL = UNKNOWN
10 > 1 = TRUE
Under the rules of three valued logic
TRUE AND UNKNOWN AND TRUE = UNKNOWN
So this row should not be returned. See the ANSI specification which clearly states
The result of "
R <comp op> <quantifier> T
" is derived by the application of the implied<comparison predicate>
"R <comp op> RT
" to every rowRT
inT
:
Therefore this is not a semantically valid optimisation when T
is Nullable. The full section of the spec is reproduced below.
8.7
Function Specify a quantified comparison. Format <quantified comparison predicate> ::= <row value constructor> <comp op> <quantifier> <table subquery> <quantifier> ::= <all> | <some> <all> ::= ALL <some> ::= SOME | ANY Syntax Rules 1) The <row value constructor> shall be of the same degree as the result of the <table subquery>. 2) The data types of the values of the <row value constructor> shall be respectively comparable to those of the columns of the <table subquery>. 3) The collating sequence for each pair of respective values in the <quantified comparison predicate> is determined in the same manner as described in Subclause 8.2, "<comparison predicate>". Access Rules None. General Rules 1) Let R be the result of the <row value constructor> and let T be the result of the <table subquery>. 2) The result of "R <comp op> <quantifier> T" is derived by the application of the implied <comparison predicate> "R <comp op> RT" to every row RT in T: Case: a) If T is empty or if the implied <comparison predicate> is true for every row RT in T, then "R <comp op> <all> T" is true. b) If the implied <comparison predicate> is false for at least one row RT in T, then "R <comp op> <all> T" is false. c) If the implied <comparison predicate> is true for at least one row RT in T, then "R <comp op> <some> T" is true. d) If T is empty or if the implied <comparison predicate> is false for every row RT in T, then "R <comp op> <some> T" is false. e) If "R <comp op> <quantifier> T" is neither true nor false, then it is unknown.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With