One of our software-projects uses a PostgreSQL-table with a column 'guid' of type bytea.
This is used with hibernate 3.3.2.GA with PostgreSQL 8.4, which serializes the java UUID type using java object serialization. The result is a value like the following escape
format bytea literal:
'\254\355\000\005sr\000\016java.util.UUID\274\231\003\367\230m\205/\002\000\002J\000\014leastSigBitsJ\000\013mostSigBitsxp\273\222)\360*r\322\262u\274\310\020\342\004M '
... which we cannot easily use in a query as select or condition to retrieve relevant rows.
Does anyone have a way to read or use the bytea-column in the select- or where-parts of a query (e.g. via psql or pgadmin3), without setting up some hibernate-query?
Update: See edit to question, this answer applies to the commonplace 16-byte serializations of uuid; the question was amended to reflect java serialization.
Interesting problem. I landed up writing a simple C extension to do it efficiently, but it's probably more sensible to use the PL/Python version below.
Because uuid
is a fixed sized type and bytea
is varlena
you can't just create cast ... as implicit
to binary-coerce them, because the variable length field header would get in the way.
There's no built-in function for bytea input to return a uuid. It'd be a handy thing to have, but I don't think anyone's done it yet.
Update: There's actually a simple way to do this. bytea
in hex form is actually a valid uuid literal once the \x
is stripped off, because uuid_in
accepts plain undecorated hex without -
or {}
. So just:
regress=> SET bytea_output = 'hex';
SET
regress=> SELECT CAST( substring(CAST (BYTEA '\x0FCC6350118D11E4A5597DE5338EB025' AS text) from 3) AS uuid);
substring
--------------------------------------
0fcc6350-118d-11e4-a559-7de5338eb025
(1 row)
It involves a couple of string copies and a hex encode/decode cycle, but it'll be tons faster than any of the PL answers I suggested earlier, though slower than C.
Personally I recommend using PL/Perl or pl/pythonu. I'll follow up with an example.
Assuming your uuid is the hex-format bytea literal:
'\x0FCC6350118D11E4A5597DE5338EB025'
you could turn it into a uuid
type with:
create language plperlu;
create or replace function to_uuid(bytea) returns uuid language plperlu immutable as $$
use Data::UUID;
my $ug = new Data::UUID;
my $uuid = $ug->from_hexstring(substr($_[0],2));
return $ug->to_string($uuid);
$$
SET bytea_output = hex;
SELECT to_uuid(BYTEA '\x0FCC6350118D11E4A5597DE5338EB025');
It's probably faster and cleaner in Python because the PL/Python interface passes bytea
as raw bytes not as hex strings:
CREATE LANGUAGE plpythonu;
CREATE or replace function to_uuid(uuidbytes bytea)
RETURNS uuid LANGUAGE plpythonu IMMUTABLE
AS $$
import uuid
return uuid.UUID(bytes=uuidbytes)
$$;
SELECT to_uuid(BYTEA '\x0FCC6350118D11E4A5597DE5338EB025');
You can see the C extension module here.
But really, I mean it about it being ugly. If you want it done properly in C, it's best to actually patch PostgreSQL rather than use an extension.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With