Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generic bit string comparison against zero in Postgres

Is there a way to do a non-zero bit string test without hard-coding the bit string width of 0?

For example, suppose I have two tables, Users and Features, each with masks, I want to test this:

SELECT u.name FROM Users u, Features f
  WHERE u.mask & f.mask;

matching implicit non-zero results. However, SQL requires an explicit boolean result for WHERE as opposed to an implicit cast, such as this:

SELECT u.name FROM Users u, Features f
  WHERE (u.mask & f.mask) != 0::BIT(2048);

I don't want to hardcode 2048 (or whatever) in this query for a number of reasons.

Testing expr = 0 or expr > 0 results in a type error. Oddly, I can test expr = 0::BIT(1), but that gives the wrong answer because Postgres does not consider all all-zero bit strings to be equal.

select 0::BIT(2) > 0::BIT(1);
 ?column? 
----------
 t
(1 row)

I can create a calculated zero by doing this:

SELECT u.name FROM Users u, Features f
  WHERE (u.mask & f.mask) != (u.mask & ~u.mask);

Which works but feels like an awful hack.

Any suggestions or insights?

RESULTS

I benchmarked several options provided below. Thanks for the suggestions, Erwin!

Based on a very large data set and 100,000 queries, I found the following constructs resulted in the associated queries per second. Hopefully someone from the Postgres team sees this and provides a generic 0 to speed things up! Unfortunately most generic approaches seem to incur a string conversion which is quite expensive.

Constructs                              |  Queries / s
----------------------------------------+--------------
(u.mask & f.mask) <> 0::BIT(2048)       |  158
(u.mask & f.mask) <> (u.mask # u.mask)  |  135
(u.mask & f.mask) <> (u.mask & ~u.mask) |  125
position('1' IN (u.mask & f.mask)) > 0  |   37
(u.mask & f.mask)::TEXT !~ '^0+$'       |   27
like image 489
Joseph Avatar asked Nov 28 '13 00:11

Joseph


1 Answers

Short bitstring

To exclude cases where the bitwise AND (&) returns a bitstring consisting of nothing but zeros, but the length might change (B'000...'), you can use a cast to integer (up to bit(32)) or bigint (up to bit(64)):

SELECT u.name
FROM   users u
JOIN   features f ON (u.mask & f.mask)::int <> 0;

When cast to integer, all of them result in 0.
This also excludes cases where either of the columns is NULL. In other words, the result has to include at least one 1.

Long bitstring

If your values can be longer than 64 bit, you could cast to text and check with a regular expression:

ON (u.mask & f.mask)::text !~ '^0+$'

Pattern explained:

^ .. beginning of string
0+ .. one or more '0'
$ .. end of string

Or, as the manual informs:

The following SQL-standard functions work on bit strings as well as character strings: length, bit_length, octet_length, position, substring, overlay.

Ergo:

ON position('1' IN (u.mask & f.mask)) > 0
like image 165
Erwin Brandstetter Avatar answered Sep 30 '22 19:09

Erwin Brandstetter