I have a ~90 MB database consisting mostly of message attachments, including a BLOB column content
that stores the binary attachment data.
I assume it is not wise to create an index over BLOBs, so no indexes are involved apart from the autoindex.
To find empty attachments, I compared the following queries:
SELECT message_id FROM attachments WHERE content IS NULL;
and
SELECT message_id FROM attachments WHERE length(content) = 0;
which return the same rows in my use case.
Why does the first one take 250ms and the second one only 1-2ms (both on an SSD)? Is there a hidden length index or something? Any insight appreciated.
Additional info
The EXPLAIN QUERY PLAN
in both cases is
0|0|0|SCAN TABLE attachments
The negation IS NOT NULL
vs. length() != 0
results in the same performance difference: 250ms vs. 2ms.
WHERE content IS NULL AND length(content) = 0;
takes 250ms and WHERE length(content) = 0 AND content IS NULL;
takes 2ms.

These are simply different queries: LENGTH
is a scalar function which, according to the SQLite documentation, returns
(i) NULL
if the input is NULL
(ii) 0
if the input is a string of zero length or a BLOB of zero bytes (for a BLOB, length() returns the number of bytes).
Therefore the condition length(content)=0
is true when content is an empty string or an empty BLOB, and is not satisfied when content is NULL
(because a comparison with NULL
evaluates to NULL, which a WHERE clause treats as not true).
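The difference between the two conditions can be checked on a tiny in-memory database. This is only a minimal sketch using Python's sqlite3 module; the three sample rows (one NULL, one zero-byte BLOB, one non-empty BLOB) and the helper ids() are made up for illustration and do not come from the original database.

```python
import sqlite3

# In-memory stand-in for the real table; the rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachments (message_id INTEGER PRIMARY KEY, content BLOB)"
)
conn.executemany(
    "INSERT INTO attachments (message_id, content) VALUES (?, ?)",
    [
        (1, None),             # NULL content
        (2, b""),              # zero-byte BLOB
        (3, b"\x00\x01\x02"),  # non-empty BLOB
    ],
)


def ids(where):
    """Hypothetical helper: message_ids matching a WHERE condition."""
    sql = f"SELECT message_id FROM attachments WHERE {where} ORDER BY message_id"
    return [row[0] for row in conn.execute(sql)]


null_ids = ids("content IS NULL")          # only the NULL row
empty_ids = ids("length(content) = 0")     # only the zero-byte BLOB
nonempty_ids = ids("length(content) != 0") # NULL row is excluded here too

# length() per row: NULL for NULL input, otherwise the byte count of the BLOB.
lengths = [
    row[0]
    for row in conn.execute(
        "SELECT length(content) FROM attachments ORDER BY message_id"
    )
]

print(null_ids, empty_ids, nonempty_ids, lengths)
```

The NULL row matches neither length(content) = 0 nor length(content) != 0, because both comparisons evaluate to NULL for it.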
Based on this, I guess that your table contains many NULL
fields and only a few which actually contain a value. This is also supported by your second piece of additional info, where you say that IS NOT NULL
shows a comparable performance.
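One way to check this guess is to count how many rows are NULL and how many hold a zero-length value. The following is a rough sketch, assuming the table and column names from the question; the function content_stats() and the sample rows are hypothetical, and on the real database you would pass a connection to the actual file instead.

```python
import sqlite3


def content_stats(conn):
    """Hypothetical helper: count total, NULL and zero-length content rows."""
    row = conn.execute(
        "SELECT count(*), "
        "       coalesce(sum(content IS NULL), 0), "
        "       coalesce(sum(length(content) = 0), 0) "
        "FROM attachments"
    ).fetchone()
    return {"total": row[0], "null": row[1], "empty": row[2]}


# Illustrative data only; replace with sqlite3.connect(<your database file>).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachments (message_id INTEGER PRIMARY KEY, content BLOB)"
)
conn.executemany(
    "INSERT INTO attachments (message_id, content) VALUES (?, ?)",
    [(1, None), (2, None), (3, b""), (4, b"abc")],
)

stats = content_stats(conn)
print(stats)
```

sum() skips NULL inputs, so rows with NULL content do not contribute to the "empty" count; coalesce() only guards against an empty table or an all-NULL column.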