Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the easiest way to represent a bytea as a single integer in PostgreSQL?

I have a bytea column that contains 14 bytes of data. The last 3 bytes of the 14 contain the CRC code of the data. I would like to extract the CRC as a single integer to be stored in a new column.

How would I go about doing this?

To clarify, here's one way of doing it in Java:

int crc = ((rawData[len - 3] & 0xff) << 16 |
            (rawData[len - 2] & 0xff) << 8 |
            (rawData[len - 1] & 0xff)) & 0xffffff;

I'm hoping to find a solution without bit shifting, i.e. something like a method that accepts 4 bytes and converts them into an integer.

like image 407
Zoltán Avatar asked Jun 20 '13 08:06

Zoltán


2 Answers

Another way is to extract the last 6 characters in hex representation, prepend an x and cast directly:

db=# SELECT ('x' || right('\x00000000000001'::bytea::text, 6))::bit(24)::int;
 int4
------
    1

.. which is a bit shorter than the get_byte() route, but is also an undocumented feature of PostgreSQL. However, I quote Tom Lane here:

This is relying on some undocumented behavior of the bit-type input converter, but I see no reason to expect that would break. A possibly bigger issue is that it requires PG >= 8.3 since there wasn't a text to bit cast before that.

Details in this related answer:

  • Convert hex in text representation to decimal number

This assumes that your setting of bytea_output is hex, which is the default since version 9.0. To be sure, you can test / set it for your session:

SET bytea_output = 'hex';

More here:

  • PostgreSQL 9.X bytea representation in 'hex' or 'escape' for thumbnail images

Performance

I ran a test (best of 10) on a table with 10k rows. get_byte() is actually a bit faster in Postgres 9.1:

CREATE TEMP TABLE t (a bytea);
INSERT INTO t
SELECT (12345670000000 + generate_series(1,10000))::text::bytea;

Bit shifting is about as fast as multiplying / adding:

SELECT 
 ('x' || right(a::text, 6))::bit(24)::int                           -- 34.9 ms
,(get_byte(a, 11) << 16) + (get_byte(a, 12) << 8) + get_byte(a, 13) -- 27.0 ms
,(get_byte(a, 11) << 16) | (get_byte(a, 12) << 8) | get_byte(a, 13) -- 27.1 ms
, get_byte(a, 11) * 65536 + get_byte(a, 12) * 256 + get_byte(a, 13) -- 27.1 ms
FROM t
like image 85
Erwin Brandstetter Avatar answered Nov 11 '22 14:11

Erwin Brandstetter


select get_byte(b, 11) * 65536 + get_byte(b, 12) * 256 + get_byte(b, 13)
from (values ('12345678901234'::bytea)) s(b);
 ?column? 
----------
  3289908
like image 30
Clodoaldo Neto Avatar answered Nov 11 '22 14:11

Clodoaldo Neto