Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strange behaviour with NOT IN

Tags:

sql

mysql

I have this table :

select count(distinct clean_deep_link) from tbl_1;
+---------------------------------+
| count(distinct clean_deep_link) |
+---------------------------------+
|                          121211 |
+---------------------------------+

I have this query :

select count(1) from tbl_1 where clean_deep_link IN
  (select clean_deep_link from tbl_2);

+----------+
| count(1) |
+----------+
|    66360 |
+----------+

But when I change the query to not in it returns an empty set :

select count(1) from tbl_1
where clean_deep_link not in (select clean_deep_link from tbl_2);
+----------+
| count(1) |
+----------+
|        0 |
+----------+

How is this possible? if the subquery contains about half of the records, shouldn't the not of the subquery contain the other half? What am I missing here?

Thanks

like image 235
WeaselFox Avatar asked May 16 '12 07:05

WeaselFox


2 Answers

I would assume that tbl_1.clean_deep_link is NULL for the rest of the rows.

These values are neither IN nor NOT IN your sub-query.


The other reason could be, that you have NULL in tbl_2.clean_deep_link.

Please try the following:

select count(1) from tbl_1
where clean_deep_link not in (select clean_deep_link
                              from tbl_2 WHERE clean_deep_link IS NOT NULL);

The problem with NULL is that it is neither =, nor <> any other value (including NULL).

When checking for NOT IN, MySQL needs to check for each value in tbl_1 that it is not contained in tbl_2 and thus checks if they are <>.

Your values were not <> NULL, so they were not NOT IN.

See also: Using NOT IN operator with null values


Check example in SQL Fiddle.

like image 52
Peter Lang Avatar answered Oct 05 '22 18:10

Peter Lang


The NULL columns in MySQL are counted as being distinct, so three NULL values are all considered distinct.

As mentioned elsewhere, you can't compare NULL against other values using conventional comparison operators, amongst which IN and NOT IN.

The following operators can handle NULL values

x <=> y - returns 1 if x == y (even if both are NULL)
x IS NULL - returns 1 if x is null
x IS NOT NULL - returns 0 if x is null
like image 39
Ja͢ck Avatar answered Oct 05 '22 18:10

Ja͢ck