Aggregate functions over arrays

Tags:

I have a table like this:

+-----+----------------+
| ID  |  array300      |
+-----+----------------+
| 100 | {110,25,53,..} |
| 101 | {56,75,59,...} |
| 102 | {65,93,82,...} |
| 103 | {75,70,80,...} |
+-----+----------------+

array300 column is an array of 300 elements. I need to have arrays of 100 elements with every element representing the average of 3 elements of array300. For this example the answer will be like:
array100
{62.66,...}
{63.33,...}
{80,...}
{78.33,...}

313

asked Dec 10 '12 15:12

f.ashouri

2 Answers

Try something like this:

SELECT id, unnest(array300) as val, ntile(100) OVER (PARTITION BY id) as bucket_num
FROM your_table

This SELECT will give you 300 records per array300 with same id and assing them the bucket_num (1 for firs 3 elements, 2 for next 3, and so on).

Then use this select to get the avg of elements in the bucket:

SELECT id, avg(val) as avg_val
FROM (...previous select here...)
GROUP BY id, bucket_num

Next - just aggregate the avg_val into array:

SELECT id, array_agg(avg_val) as array100
FROM (...previous select here...)
GROUP BY id

Details: unnest , ntile , array_agg , OVER (PARTITION BY )

UPD: Try this function:

CREATE OR REPLACE FUNCTION public.array300_to_100 (
  p_array300 numeric []
)
RETURNS numeric [] AS
$body$
DECLARE
  dim_start int = array_length(p_array300, 1); --size of input array
  dim_end int = 100; -- size of output array
  dim_step int = dim_start / dim_end; --avg batch size
  tmp_sum NUMERIC; --sum of the batch
  result_array NUMERIC[100]; -- resulting array
BEGIN

  FOR i IN 1..dim_end LOOP --from 1 to 100.
    tmp_sum = 0;

    FOR j IN (1+(i-1)*dim_step)..i*dim_step LOOP --from 1 to 3, 4 to 6, ...
      tmp_sum = tmp_sum + p_array300[j];  
    END LOOP; 

    result_array[i] = tmp_sum / dim_step;
  END LOOP; 

  RETURN result_array;
END;
$body$
LANGUAGE 'plpgsql'
IMMUTABLE
RETURNS NULL ON NULL INPUT;

It takes one array300 and outputs one array100. To use it:

SELECT id, array300_to_100(array300)
FROM table1;

If you have any problems understanding it - just ask me.

174

answered Sep 30 '22 17:09

Ihor Romanchenko

Putting the pieces of Igor into another form:

 select id, array300, (
    select array_agg(z) from
    (
        select avg(x) from 
        (
            select x, ntile(array_length(array300,1)/3) over() from unnest(array300) x
        ) y 
        group by ntile
    ) z
) array100
from your_table

For a small example table like this

 id |       array300        
----+-----------------------
  1 | {110,25,53,110,25,53}
  2 | {56,75,59,110,25,53}
  3 | {65,93,82,110,25,53}
  4 | {75,70,80,110,25,53}

the result is:

 id |       array300        |                   array100                    
----+-----------------------+-----------------------------------------------
  1 | {110,25,53,110,25,53} | {(62.6666666666666667),(62.6666666666666667)}
  2 | {56,75,59,110,25,53}  | {(63.3333333333333333),(62.6666666666666667)}
  3 | {65,93,82,110,25,53}  | {(80.0000000000000000),(62.6666666666666667)}
  4 | {75,70,80,110,25,53}  | {(75.0000000000000000),(62.6666666666666667)}
(4 rows)

Edit My first version used a fixes ntile(2). This only worked for source arrays of size 6. I've fixed that by using array_length(array300,1)/3 instead.

answered Sep 30 '22 17:09

A.H.

Related questions
                            
                                C#: Convert array to use in params with additional parameters
                            
                                In numpy, what does indexing an array with the empty tuple vs. ellipsis do?
                            
                                Error C2100 - Illegal Indirection
                            
                                Iterate over static array in java without array variable
                            
                                How to align std::array contained data?
                            
                                What is the correct type to use for an array index?
                            
                                How does a vector as a key works internally in C++?
                            
                                adding to json property that may or may not exist yet
                            
                                Using an array's SetValue method vs. the [] indexers
                            
                                Programmatically add column names to numpy ndarray
                            
                                Write a mode method in Java to find the most frequently occurring element in an array
                            
                                Duplicate an array an arbitrary number of times (javascript)
                            
                                Javascript map method on array of string elements
                            
                                What is the difference between any and any[ ]?
                            
                                How do fix the "Cannot find 'NSFetchRequest' in scope error
                            
                                How to shift an array of bytes by 12-bits
                            
                                Elegant way to init and extend a javascript array
                            
                                PostgreSQL convert array returned from function to columns
                            
                                Difference between .MakeArrayType() and .MakeArrayType(1)
                            
                                ruby array internals

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Aggregate functions over arrays

Tags:

arrays

postgresql

aggregate-functions

f.ashouri

People also ask

2 Answers

Ihor Romanchenko

A.H.

Recent Activity

Donate For Us