Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Function min(uuid) does not exist in postgresql

I have imported tables from Postgres to hdfs by using sqoop. My table have uuid field as primary key and my command sqoop as below:

sqoop import --connect 'jdbc:postgresql://localhost:5432/mydb' --username postgreuser --password 123456abcA --driver org.postgresql.Driver --table users --map-column-java id=String --target-dir /hdfs/postgre/users --as-avrodatafile --compress -m 2

But I got the error:

Import failed: java.io.IOException: org.postgresql.util.PSQLException: ERROR: function min(uuid) does not exist

I tried executed the sql command: SELECT min(id) from users and got the same error. How could I fix it ? I use Postgres 9.4, hadoop 2.9.0 and sqoop 1.4.7

like image 700
hazzy Avatar asked Mar 12 '18 03:03

hazzy


People also ask

Does Postgres have UUID?

UUID is an abbreviation for Universal Unique Identifier defined by RFC 4122 and has a size of 128-bit. It is created using internal algorithms that always generate a unique value. PostgreSQL has its own UUID data type and provides modules to generate them.

Does Postgres automatically generate UUID?

Unfortunately, while PostgreSQL is great for storing and comparing UUID data, it lacks capabilities for creating UUID values in its core. Instead, it relies on third-party modules to create UUIDs using specified techniques.

How many bytes is a UUID Postgres?

The GUID/UUID is not a text in either PostgreSQL or SQL Server system. They are native typs in both systems and are stored using 16 bytes. PostgreSQL does although have auto casts built in for converting uuids to text where as SQL Server requires an explicit cast to varchar.

What is GUID in PostgreSQL?

(Some systems refer to this data type as a globally unique identifier, or GUID, instead.) This identifier is a 128-bit quantity that is generated by an algorithm chosen to make it very unlikely that the same identifier will be generated by anyone else in the known universe using the same algorithm.


2 Answers

I'd like to credit @robin-salih 's answer, I've used it and implementation of min for int, to build following code:

CREATE OR REPLACE FUNCTION min(uuid, uuid)
RETURNS uuid AS $$
BEGIN
    IF $2 IS NULL OR $1 > $2 THEN
        RETURN $2;
    END IF;

    RETURN $1;
END;
$$ LANGUAGE plpgsql;


create aggregate min(uuid) (
  sfunc = min,
  stype = uuid,
  combinefunc = min,
  parallel = safe,
  sortop = operator (<)
);

It almost the same, but takes advantages of B-tree index, so select min(id) from tbl works in few millis.

P.S. I'm not pgsql expert, perhaps my code is somehow wrong, double check before use in production, but I hope it uses indexes and parallel execution correctly. I've made it just from sample code, not digging into theory behind aggregates in PG.

like image 191
Bogdan Mart Avatar answered Sep 20 '22 08:09

Bogdan Mart


Postgres doesn't have built-in function for min/max uuid, but you can create your own using the following code:

CREATE OR REPLACE FUNCTION min(uuid, uuid)
RETURNS uuid AS $$
BEGIN
    IF $2 IS NULL OR $1 > $2 THEN
        RETURN $2;
    END IF;

    RETURN $1;
END;
$$ LANGUAGE plpgsql;


CREATE AGGREGATE min(uuid)
(
    sfunc = min,
    stype = uuid
);
like image 39
Robin Salih Avatar answered Sep 22 '22 08:09

Robin Salih