In Postgres 9.6, on a users table with around 12 million rows, where the active
boolean column has a btree index on it.
EXPLAIN ANALYZE SELECT * FROM users WHERE active = 'f' LIMIT 1;
Limit (cost=0.00..0.14 rows=1 width=982) (actual time=0.039..0.040 rows=1 loops=1)
-> Seq Scan on users (cost=0.00..3190979.68 rows=23264168 width=982) (actual time=0.036..0.036 rows=1 loops=1)
Filter: (NOT active)
Rows Removed by Filter: 115
Planning time: 0.161 ms
Execution time: 0.067 ms
But, using IS FALSE
seems to use the index.
EXPLAIN ANALYZE SELECT * FROM users WHERE active IS FALSE LIMIT 1;
Limit (cost=0.44..0.59 rows=1 width=982) (actual time=0.054..0.056 rows=1 loops=1)
-> Index Scan using index_users_on_active on users (cost=0.44..2059.14 rows=13183 width=982) (actual time=0.051..0.051 rows=1 loops=1)
Index Cond: (active = false)
Filter: (active IS FALSE)
Planning time: 0.170 ms
Execution time: 0.094 ms
The vast majority of the records, the active value is true
, and I understand an index isn't always faster.
It seems that Rails prefers the active = 'f'
syntax, as that's what it outputs when you build a query.
Why are these different? How are they different? Should the other ever be used?
In PostgreSQL, NULL means no value. In other words, the NULL column does not have any value. It does not equal 0, empty string, or spaces.
<> is the standard SQL operator meaning "not equal". Many databases, including postgresql, supports != as a synonym for <> . They're exactly the same in postgresql.
The difference is in how NULL values are handled. Per the docs:
IS FALSE
will always return a boolean value, even if the argument is null.= 'f'
will return null if the argument is null.So to illustrate,
rnubel=# SELECT NULL = 'f' as equals_false, NULL IS FALSE as is_false;
equals_false | is_false
--------------+----------
| f
Generally, most things in SQL will operate on a ternary-logic basis, where null values produce null results; so it's not surprising that Rails defaults to using the operator consistent with that basis. However, it's different from what you'd expect in Ruby (where !nil == true
), so it's likely been the cause of much unexpected behavior.
I would like to oppose my answer to the one from Robert Nubel. Despite the fact that comparison null = false
will return null, it will not affect the resulting set:
t=> with b(v) as (values(true),(false),(null))
mon-> select v,v::text from b;
v | v
---+-------
t | true
f | false
|
(3 rows)
t=> with b(v) as (values(true),(false),(null))
select v,v::text from b where v is false;
v | v
---+-------
f | false
(1 row)
t=> with b(v) as (values(true),(false),(null))
select v,v::text from b where not v;;
v | v
---+-------
f | false
(1 row)
t=> with b(v) as (values(true),(false),(null))
select v,v::text from b where v = 'f';
v | v
---+-------
f | false
(1 row)
The difference in execution plans was due to index condition hit (Index Cond: (active = false)
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With