
Vector (array) addition in Postgres

I have a column with numeric[] values which all have the same size. I'd like to take their element-wise average. By this I mean that the average of

{1, 2, 3}, {-1, -2, -3}, and {3, 3, 3}

should be {1, 1, 1}. Also of interest is how to sum these element-wise, although I expect that any solution for one will be a solution for the other.

(NB: The length of the arrays is fixed within a single table, but may vary between tables. So I need a solution which doesn't assume a certain length.)

My initial guess is that I should be using unnest somehow, since unnest applied to a numeric[] column flattens out all the arrays. So I'd like to think that there's a nice way to use this with some sort of windowing function + group by to pick out the individual components of each array and sum them.

-- EXAMPLE DATA
CREATE TABLE A
  (vector numeric[])
;

INSERT INTO A
  VALUES
    ('{1, 2, 3}'::numeric[])
    ,('{-1, -2, -3}'::numeric[])
    ,('{3, 3, 3}'::numeric[])
;
Brandon Humpert asked Oct 15 '14 17:10


2 Answers

I've written an extension to do vector addition (and subtraction, multiplication, division, and powers) with fast C functions. You can find it on GitHub or PGXN.

Given two arrays a and b you can say vec_add(a, b). You can also add either side to a scalar, e.g. vec_add(a, 5).

If you want a SUM aggregate function instead you can find that in aggs_for_vecs, also on PGXN.

Finally if you want to sum up all the elements of a single array, you can use aggs_for_arrays (PGXN).

Paul A Jungwirth answered Sep 22 '22 19:09


I discovered a solution on my own which is probably the one I will use.

First, we can define a function for adding two vectors:

-- Element-wise sum of two arrays; assumes both have the same length.
-- The two UNNEST calls advance in lockstep, and ix keeps the original order.
CREATE OR REPLACE FUNCTION vec_add(arr1 numeric[], arr2 numeric[])
RETURNS numeric[] AS
$$
SELECT array_agg(result)
FROM (SELECT tuple.val1 + tuple.val2 AS result
      FROM (SELECT UNNEST($1) AS val1
                   ,UNNEST($2) AS val2
                   ,generate_subscripts($1, 1) AS ix) tuple
      ORDER BY ix) inn;
$$ LANGUAGE SQL IMMUTABLE STRICT;

and a function for multiplying by a constant:

-- Multiply every element of arr by the scalar mul.
CREATE OR REPLACE FUNCTION vec_mult(arr numeric[], mul numeric)
RETURNS numeric[] AS
$$
SELECT array_agg(result)
FROM (SELECT val * $2 AS result
      FROM (SELECT UNNEST($1) AS val
                   ,generate_subscripts($1, 1) AS ix) t
      ORDER BY ix) inn;
$$ LANGUAGE SQL IMMUTABLE STRICT;
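
A quick sanity check of the two helpers, assuming both definitions above have been created. The literal arrays are illustrative only and are not part of the table:

```sql
-- Element-wise sum of two equal-length arrays.
SELECT vec_add('{1, 2, 3}'::numeric[], '{-1, -2, -3}'::numeric[]);  -- {0,0,0}

-- Scale every element by a constant; the integer 2 is cast to numeric implicitly.
SELECT vec_mult('{1, 2, 3}'::numeric[], 2);                         -- {2,4,6}
```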

Then we can use the PostgreSQL statement CREATE AGGREGATE to create the vec_sum function directly:

CREATE AGGREGATE vec_sum(numeric[]) (
    SFUNC = vec_add
    ,STYPE = numeric[]
);
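
As a small usage sketch, assuming the table A from the question and the vec_add function are in place, the aggregate can be called like any built-in one. Because vec_add is STRICT and no initial condition is given, the first non-null array becomes the starting state:

```sql
-- Element-wise sum over all rows of A (example data from the question).
SELECT vec_sum(vector) FROM A;
-- → {3,3,3}
```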

And finally, we can find the average as:

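-- Use 1.0 so the division is numeric; 1 / count(vector) is integer division and yields 0 here.
-- Note that the reciprocal is rounded, so the result can differ from the exact average in the last digits.
SELECT vec_mult(vec_sum(vector), 1.0 / count(vector)) FROM A;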
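
For comparison, here is a sketch of the unnest idea from the question that needs no helper functions. It assumes PostgreSQL 9.4 or later (for WITH ORDINALITY); the aliases u, val, ix and per_position are names chosen for this example. Each array is expanded into (value, position) pairs, the built-in sum and avg are applied per position, and the results are reassembled in position order:

```sql
SELECT array_agg(pos_sum ORDER BY ix) AS vec_sum,
       array_agg(pos_avg ORDER BY ix) AS vec_avg
FROM (
    -- One output row per array position, aggregated across all rows of A.
    SELECT u.ix,
           sum(u.val) AS pos_sum,
           avg(u.val) AS pos_avg
    FROM A, unnest(A.vector) WITH ORDINALITY AS u(val, ix)
    GROUP BY u.ix
) per_position;
```

Since avg divides the per-position sum directly, this variant does not go through a rounded reciprocal. It also makes no assumption about the array length, which fits the requirement that the length may vary between tables.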
Brandon Humpert answered Sep 22 '22 19:09