I am facing the following conundrum in sqlite, and wonder if there's something I'm missing about the sqlite dialect.
Basically, when I ask for total unique values in one table I get one number. When I ask for the subset of those values present in a second table I get smaller number. But when I ask for the compliment of that subset I get 0.
Consider two tables, A and B, with different but overlapping populations of widgets denoted by a unique id we can call Widget_ID.
I am getting the following behavior:
SELECT COUNT(DISTINCT WIDGET_ID)
FROM A
Output: 100
SELECT COUNT(DISTINCT WIDGET_ID)
FROM A
WHERE WIDGET_ID IN (SELECT WIDGET_ID FROM B)
Output: 75
SELECT COUNT(DISTINCT WIDGET_ID)
FROM A
WHERE WIDGET_ID NOT IN (SELECT WIDGET_ID FROM B)
Output: 0
Shouldn't this be impossible?
The normal reason for problems with not in
is the presence of NULL
values. NOT IN
always fails when the list has NULL
because NULL
comparisons cannot be true.
You can fix this by filtering them outer:
SELECT COUNT(DISTINCT WIDGET_ID)
FROM A
WHERE WIDGET_ID NOT IN (SELECT WIDGET_ID FROM B WHERE WIDGET_ID IS NOT NULL);
I prefer to use NOT EXISTS
, because it has the semantics that you expect:
SELECT COUNT(DISTINCT WIDGET_ID)
FROM A
WHERE NOT EXISTS (SELECT WIDGET_ID FROM B WHERE B.WIDGET_ID = A.WIDGET_ID);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With