Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NOT IN subquery fails when there are NULL-valued results

Sorry guys, I had no idea how to phrase this one, but I have the following in a where clause:

person_id not in (
    SELECT distinct person_id
    FROM protocol_application_log_devl pal
    WHERE pal.set_id = @set_id
)

When the subquery returns no results, my whole select fails to return anything. To work around this, I replaced person_id in the subquery with isnull(person_id, '00000000-0000-0000-0000-000000000000').

It seems to work, but is there a better way to solve this?

like image 341
jreed121 Avatar asked Dec 20 '22 22:12

jreed121


1 Answers

It is better to use NOT EXISTS anyway:

WHERE NOT EXISTS(
    SELECT 1 FROM protocol_application_log_devl pal
    WHERE pal.person_id = person_id
     AND  pal.set_id = @set_id
)

Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS?

A pattern I see quite a bit, and wish that I didn't, is NOT IN. When I see this pattern, I cringe. But not for performance reasons – after all, it creates a decent enough plan in this case:

The main problem is that the results can be surprising if the target column is NULLable (SQL Server processes this as a left anti semi join, but can't reliably tell you if a NULL on the right side is equal to – or not equal to – the reference on the left side). Also, optimization can behave differently if the column is NULLable, even if it doesn't actually contain any NULL values

Instead of NOT IN, use a correlated NOT EXISTS for this query pattern. Always. Other methods may rival it in terms of performance, when all other variables are the same, but all of the other methods introduce either performance problems or other challenges.

like image 71
Tim Schmelter Avatar answered Jan 17 '23 22:01

Tim Schmelter