I came up against a strange problem in Postgres yesterday when trying to filter out user ids from a stats table. When we did, for example, user_id != 24, Postgres excluded the rows where user_id is NULL as well.
I created the following test code which shows the same results.
    CREATE TEMPORARY TABLE test1 (
        id int DEFAULT NULL
    );

    INSERT INTO test1 (id) VALUES
        (1), (2), (3), (4), (5), (2), (4), (6), (4),
        (7), (5), (9), (5), (3), (6), (4), (3), (7),
        (NULL), (NULL), (NULL), (NULL), (NULL), (NULL), (NULL);

    SELECT COUNT(*) FROM test1;

    SELECT id, COUNT(*) AS count FROM test1 GROUP BY id;

    SELECT id, COUNT(*) AS count FROM test1 WHERE id != 1 GROUP BY id;

    SELECT id, COUNT(*) AS count FROM test1 WHERE (id != 1 OR id IS NULL) GROUP BY id;
The first query just counts all the rows. The second counts the occurrences of each value, including nulls. The third excludes the value 1 but also drops all the nulls. The fourth is a workaround that excludes the value 1 and still includes the nulls.
For my use case, null values should always be included.
Is the workaround the only way to do this? Is this expected Postgres behaviour?
In PostgreSQL, NULL means no value: a column that is NULL does not hold any value at all. NULL is not equal to 0, an empty string, or spaces. NULL cannot be tested with ordinary comparison operators such as "=" or "!=".
An integer column can be null, but '' is an empty string, not null.
NULLIF is often used together with COALESCE to handle null values. The PostgreSQL NULLIF function takes two expressions and returns NULL if they are equal; otherwise it returns the first expression.
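As a rough illustration of the two functions described above, here is a minimal sketch. The literal values, the 'n/a' default, and the sentinel 0 in the last query are made up for the example; the sentinel only works if 0 can never be a real id, which is an assumption and not something the question states.

```sql
-- NULLIF returns NULL when both arguments are equal,
-- otherwise it returns the first argument.
SELECT NULLIF(5, 5) AS equal_args,      -- NULL
       NULLIF(5, 7) AS different_args;  -- 5

-- COALESCE returns its first non-null argument, so the pair can
-- turn a placeholder (here an empty string) into a default.
SELECT COALESCE(NULLIF('', ''), 'n/a') AS label;  -- 'n/a'

-- Hypothetical sentinel approach to the original filter:
-- map NULL to 0 before comparing (assumes 0 is never a real id).
SELECT id
FROM (VALUES (1), (2), (NULL)) AS t(id)
WHERE COALESCE(id, 0) <> 1;             -- keeps 2 and the NULL row
```

The explicit `id <> 1 OR id IS NULL` form from the question avoids having to pick a sentinel value.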
<> is the standard SQL operator meaning "not equal". Many databases, including PostgreSQL, support != as a synonym for <>. They are exactly the same in PostgreSQL.
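To see that the two spellings behave identically, including how they treat nulls, one can count matching rows with each operator. This is only a sketch: the three-row inline VALUES list and the alias names are invented for the example.

```sql
-- Both operators skip the NULL row, because the comparison
-- with NULL is unknown rather than true.
SELECT count(*)                        AS total_rows,   -- 3
       count(*) FILTER (WHERE id <> 1) AS with_ltgt,    -- 1
       count(*) FILTER (WHERE id != 1) AS with_bangeq   -- 1
FROM (VALUES (1), (2), (NULL)) AS t(id);
```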
Your "workaround" is the usual way to do it. Everything is behaving as expected.
The reason is simple: nulls are neither equal, nor not equal, to anything. This makes sense when you consider that null means "unknown", and the truth of a comparison to an unknown value is also unknown.
The corollary is that:
    null = null is not true
    null = some_value is not true
    null != some_value is not true
The two special comparisons IS NULL and IS NOT NULL exist to deal with testing if a column is, or is not, null. No other comparisons to null can be true.
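The rules above can be checked directly in psql. This is a minimal sketch that assumes default server settings (in particular, transform_null_equals left off); the inline VALUES list and the alias names are made up for illustration. The last query uses IS DISTINCT FROM, a null-safe comparison that PostgreSQL provides, as another way to write the workaround from the question.

```sql
-- Ordinary comparisons involving NULL yield NULL (unknown), never true.
SELECT (NULL::int = NULL::int) AS null_eq_null,   -- NULL
       (NULL::int = 1)         AS null_eq_value,  -- NULL
       (NULL::int <> 1)        AS null_ne_value,  -- NULL
       (NULL::int IS NULL)     AS is_null,        -- true
       (1 IS NOT NULL)         AS is_not_null;    -- true

-- IS DISTINCT FROM treats NULL as a comparable value, so the
-- NULL row survives the filter without an extra OR clause.
SELECT id
FROM (VALUES (1), (2), (NULL)) AS t(id)
WHERE id IS DISTINCT FROM 1;                      -- keeps 2 and the NULL row
```

A WHERE clause keeps only rows whose condition is true, which is why rows where the condition is unknown are filtered out along with the rows where it is false.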