I was a little surprised to find out that the WHERE clause in Google BigQuery ignores NULLs. Does anyone know of a better way to handle this?
I have the following data set:
Name Score
Allan 20
Brian NULL
Clare 30
Say I want to select all records where the Score is not equal to 20. I run the following query in BigQuery:
SELECT * FROM [....]
where
Score <> 20
The following is the result:
Name Score
Clare 30
The problem is that Brian's Score, which is NULL, is also not equal to 20, and therefore his record should be in my results.
Other than checking specifically for NULLs, is there a better way to do this?
Thanks Ria
SQL (and thus BigQuery, which is SQL-like) uses three-valued logic. What that boils down to is that a condition is not limited to TRUE or FALSE; it can also be NULL. In this case, the expression NULL <> 20 is neither TRUE nor FALSE, it is itself NULL. It might be helpful to think of NULL values as unknown. Since we don't know Brian's score, we don't know whether it is equal to 20. But the query only returns rows for which the WHERE clause evaluates to TRUE, and therefore the row with Brian is excluded.
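To see the three possible outcomes directly, here is a minimal sketch that evaluates the comparison on its own. It assumes Python's built-in sqlite3 module as a stand-in for BigQuery (an assumption made only so the example runs locally); SQLite follows the same NULL comparison rule described above and reports TRUE and FALSE as 1 and 0.

```python
import sqlite3

# In-memory SQLite database, used here only as a local stand-in for BigQuery.
conn = sqlite3.connect(":memory:")

# Evaluate each comparison by itself; Python's None represents SQL NULL.
null_vs_20 = conn.execute("SELECT NULL <> 20").fetchone()[0]   # unknown
thirty_vs_20 = conn.execute("SELECT 30 <> 20").fetchone()[0]   # true
twenty_vs_20 = conn.execute("SELECT 20 <> 20").fetchone()[0]   # false

print(null_vs_20, thirty_vs_20, twenty_vs_20)  # prints: None 1 0
```

Only the middle case is TRUE, which is why a WHERE clause built on this comparison keeps Clare and drops both Allan and Brian.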
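If you want to include NULL values, you have to ask for them explicitly:
select * from [...]
where (Score <> 20 or Score is null)
Alternatively, replace NULL with a substitute value before comparing. The substitute must differ from the value you are comparing against (here 0 is safe because it is not 20):
select * from [...]
where coalesce(Score, 0) <> 20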
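The behaviour can be checked against the sample data from the question. This is a rough sketch, again assuming Python's sqlite3 module as a local stand-in for BigQuery; the table name `scores` and the helper `names` are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (Name TEXT, Score INTEGER)")  # hypothetical table name
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Allan", 20), ("Brian", None), ("Clare", 30)],  # None is stored as NULL
)

def names(where_clause):
    """Hypothetical helper: return the names matching a WHERE clause, sorted."""
    rows = conn.execute(f"SELECT Name FROM scores WHERE {where_clause} ORDER BY Name")
    return [row[0] for row in rows]

plain = names("Score <> 20")                        # NULL row is dropped
explicit = names("(Score <> 20 OR Score IS NULL)")  # NULL row kept explicitly
coalesced = names("COALESCE(Score, 0) <> 20")       # NULL treated as 0 before comparing

print(plain)      # ['Clare']
print(explicit)   # ['Brian', 'Clare']
print(coalesced)  # ['Brian', 'Clare']
```

The explicit `IS NULL` form states the intent directly and works whatever value is being compared, whereas the `COALESCE` form depends on choosing a substitute that can never equal the comparison value.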