Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to exclude null values in array_agg like in string_agg using postgres?

With postgresql-9.3 one can do this;

SELECT g.id,
   array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
   array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g 
GROUP BY g.id;

Update: with postgresql-9.4;

SELECT g.id,
   array_agg(g.users) FILTER (WHERE g.canonical = 'Y') canonical_users,
   array_agg(g.users) FILTER (WHERE g.canonical = 'N') non_canonical_users
FROM groups g 
GROUP BY g.id;

select
    id,
    (select array_agg(a) from unnest(canonical_users) a where a is not null) canonical_users,
    (select array_agg(a) from unnest(non_canonical_users) a where a is not null) non_canonical_users
from (
    SELECT g.id,
           array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END) canonical_users,
           array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END) non_canonical_users
    FROM groups g
    GROUP BY g.id
) s

Or, simpler and may be cheaper, using array_to_string which eliminates nulls:

SELECT
    g.id,
    array_to_string(
        array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END)
        , ','
    ) canonical_users,
    array_to_string(
        array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END)
        , ','
    ) non_canonical_users
FROM groups g
GROUP BY g.id

If you are looking for a modern answer to the general question of how to remove a NULL from an array, it is:

array_remove(your_array, NULL)

I was specifically curious about performance and wanted to compare this to the best possible alternative:

CREATE OR REPLACE FUNCTION strip_nulls(
    IN array_in ANYARRAY
)
RETURNS anyarray AS
'
SELECT
    array_agg(a)
FROM unnest(array_in) a
WHERE
    a IS NOT NULL
;
'
LANGUAGE sql
;

Doing a pgbench test proved (with high confidence) that array_remove() is a little more than twice as fast. I did my test on double precision numbers with a variety of array sizes (10, 100 and 1000 elements) and random NULLs in between.


It's also worth noting that this can be used to remove blanks ('' != NULL). But the second parameter accepts anyelement, and since it is most likely they you'd be indicating a blank with a string literal, make sure to cast it to the form you want, usually a non-array.

For example:

select array_remove(array['abc', ''], ''::text);

If you try:

select array_remove(array['abc', ''], '');

it will assume that the '' is TEXT[] (array) and will throw this error:

ERROR: malformed array literal: ""


In solving the general question of removing nulls from array aggregates there are two main ways of attacking the problem: either doing array_agg(unnest(array_agg(x)) or creating a custom aggregate.

The first is of the form shown above:

SELECT 
    array_agg(u) 
FROM (
    SELECT 
        unnest(
            array_agg(v)
        ) as u 
    FROM 
        x
    ) un
WHERE 
    u IS NOT NULL;

The second:

/*
With reference to
http://ejrh.wordpress.com/2011/09/27/denormalisation-aggregate-function-for-postgresql/
*/
CREATE OR REPLACE FUNCTION fn_array_agg_notnull (
    a anyarray
    , b anyelement
) RETURNS ANYARRAY
AS $$
BEGIN

    IF b IS NOT NULL THEN
        a := array_append(a, b);
    END IF;

    RETURN a;

END;
$$ IMMUTABLE LANGUAGE 'plpgsql';

CREATE AGGREGATE array_agg_notnull(ANYELEMENT) (
    SFUNC = fn_array_agg_notnull,
    STYPE = ANYARRAY,
    INITCOND = '{}'
);

Calling the second is (naturally) a little nicer looking than the first:

select array_agg_notnull(v) from x;


I am adding this even though this this thread is quite old, but I ran into this neat trick that works quite well on small arrays. It runs on Postgres 8.4+ without additional libraries or functions.

string_to_array(array_to_string(array_agg(my_column)))::int[]

The array_to_string() method actually gets rid of the nulls.


You should wrap your array_agg with array_remove.

SELECT g.id,
       array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
       array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g
GROUP BY g.id;