I have a column with numeric[]
values which all have the same size. I'd like to take their element-wise average. By this I mean that the average of
{1, 2, 3}, {-1, -2, -3}, and {3, 3, 3}
should be {1, 1, 1}
. Also of interest is how to sum these element-wise, although I expect that any solution for one will be a solution for the other.
(NB: The length of the arrays is fixed within a single table, but may vary between tables. So I need a solution which doesn't assume a certain length.)
My initial guess is that I should be using unnest
somehow, since unnest
applied to a numeric[]
column flattens out all the arrays. So I'd like to think that there's a nice way to use this with some sort of windowing function + group by
to pick out the individual components of each array and sum them.
-- EXAMPLE DATA
CREATE TABLE A
(vector numeric[])
;
INSERT INTO A
VALUES
('{1, 2, 3}'::numeric[])
,('{-1, -2, -3}'::numeric[])
,('{3, 3, 3}'::numeric[])
;
I've written an extension to do vector addition (and subtraction, multiplication, division, and powers) with fast C functions. You can find it on Github or PGXN.
Given two arrays a
and b
you can say vec_add(a, b)
. You can also add either side to a scalar, e.g. vec_add(a, 5)
.
If you want a SUM
aggregate function instead you can find that in aggs_for_vecs, also on PGXN.
Finally if you want to sum up all the elements of a single array, you can use aggs_for_arrays (PGXN).
I discovered a solution on my own which is probably the one I will use.
First, we can define a function for adding two vectors:
CREATE OR REPLACE FUNCTION vec_add(arr1 numeric[], arr2 numeric[])
RETURNS numeric[] AS
$$
SELECT array_agg(result)
FROM (SELECT tuple.val1 + tuple.val2 AS result
FROM (SELECT UNNEST($1) AS val1
,UNNEST($2) AS val2
,generate_subscripts($1, 1) AS ix) tuple
ORDER BY ix) inn;
$$ LANGUAGE SQL IMMUTABLE STRICT;
and a function for multiplying by a constant:
CREATE OR REPLACE FUNCTION vec_mult(arr numeric[], mul numeric)
RETURNS numeric[] AS
$$
SELECT array_agg(result)
FROM (SELECT val * $2 AS result
FROM (SELECT UNNEST($1) AS val
,generate_subscripts($1, 1) as ix) t
ORDER BY ix) inn;
$$ LANGUAGE SQL IMMUTABLE STRICT;
Then we can use the PostgreSQL statement CREATE AGGREGATE
to create the vec_sum
function directly:
CREATE AGGREGATE vec_sum(numeric[]) (
SFUNC = vec_add
,STYPE = numeric[]
);
And finally, we can find the average as:
SELECT vec_mult(vec_sum(vector), 1 / count(vector)) FROM A;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With