
Selecting positive aggregate value and ignoring negative in Postgres SQL

I need to apply a certain transformation fn(argument). Normally argument equals value, except when value is negative. When you hit the first negative value, you "wait": it is summed with the consecutive values that follow, and argument stays 0 until this sum becomes positive. Then the sum becomes the argument and you apply fn(argument). This is the table I want to get:

value      argument 
---------------------
  2           2      
  3           3      
 -10          0      
  4           0
  3           0
  10          7
  1           1

I could have summed all the values and applied fn to the sum, but fn can differ from row to row, and I need to know the row number to choose a concrete fn.

As I want a Postgres SQL solution, it looks like window functions fit, but I am not yet experienced enough to write an expression that does this. In fact, I am new to "thinking in SQL", unfortunately. I guess this could easily be done in an imperative way, but I do not want to write a stored procedure yet.

zamza asked Aug 13 '11 09:08


2 Answers

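I suppose I'm late, but this may help someone. It clamps each negative value to zero row by row; it does not carry a negative running sum over to later rows: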

select
    value,
    greatest(0, value) as argument
from your_table;
Nuno Rafael Figueiredo answered Nov 07 '22 01:11


This doesn't really fit any of the predefined aggregate functions, so you probably need to write your own. Note that in PostgreSQL, aggregate functions can be used as window functions, and in fact, as of 9.0, that is the only way to write a window function in anything other than C.

You can write a function that tracks the state of "summing" the values: it returns the input value as-is if the current "sum" is not negative, and keeps adding to the sum while it is negative. Then you simply take the greater of this sum and zero. To wit:

-- accumulator function: first arg is state, second arg is input
create or replace function ouraggfunc(int, int)
 returns int immutable language plpgsql as $$
begin
  raise info 'ouraggfunc: %, %', $1, $2; -- to help you see what's going on
  -- get started by returning the first value ($1 is null - no state - first row)
  if $1 is null then
    return $2;
  end if;
  -- if our state is negative, we're summing until it becomes positive
  -- otherwise, we're just returning the input
  if $1 < 0 then
    return $1 + $2;
  else
    return $2;
  end if;
end;
$$;
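
To check the transition logic on its own before wiring it into an aggregate, the function can be called directly. This is only a small sketch; the column aliases are made up for illustration, and each call should also emit the INFO line from the function body.

```sql
select ouraggfunc(null, 2) as first_row,       -- no state yet: returns the input, 2
       ouraggfunc(-10, 4)  as still_negative,  -- negative state: -10 + 4 = -6
       ouraggfunc(7, 1)    as fresh_start;     -- non-negative state: returns the input, 1
```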

You need to create an aggregate function to invoke this accumulator:

create aggregate ouragg(basetype = int, sfunc = ouraggfunc, stype = int);

This declares that the aggregate takes an integer as input and keeps its state as an integer.
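
The basetype form is the older CREATE AGGREGATE syntax. On current PostgreSQL versions the same aggregate can also be declared with the argument type in the signature. A sketch, meant to be run instead of the statement above rather than in addition to it (hence the drop):

```sql
-- Remove the aggregate first if the basetype form was already run.
drop aggregate if exists ouragg(int);

create aggregate ouragg(int) (
  sfunc = ouraggfunc,  -- state transition function defined earlier
  stype = int          -- the running state is a single integer
  -- no initcond: the state starts as NULL, which ouraggfunc treats as "first row"
);
```

Leaving out initcond matters here: the accumulator relies on a NULL state to recognise the first row.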

I copied your example into a table:

steve@steve@[local] =# create table t(id serial primary key, value int not null, argument int not null);
NOTICE:  CREATE TABLE will create implicit sequence "t_id_seq" for serial column "t.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t_pkey" for table "t"
CREATE TABLE
steve@steve@[local] =# copy t(value, argument) from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 2    2
>> 3    3
>> -10  0
>> 4    0
>> 3    0
>> 10   7
>> 1    1
>> \.
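
Outside an interactive psql session, the same rows can be loaded with a plain INSERT instead of copy ... from stdin. A sketch, assuming the table t created above is still empty; the rows are listed in the order from the question so that, in practice, the serial id follows that order.

```sql
-- Same sample data as the copy block: (value, expected argument).
insert into t (value, argument) values
  (2, 2), (3, 3), (-10, 0), (4, 0), (3, 0), (10, 7), (1, 1);
```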

You can now produce those values by using the aggregate function with a window clause:

steve@steve@[local] =# select value, argument, ouragg(value) over(order by id) from t;
INFO:  ouraggfunc: <NULL>, 2
INFO:  ouraggfunc: 2, 3
INFO:  ouraggfunc: 3, -10
INFO:  ouraggfunc: -10, 4
INFO:  ouraggfunc: -6, 3
INFO:  ouraggfunc: -3, 10
INFO:  ouraggfunc: 7, 1
 value | argument | ouragg
-------+----------+--------
     2 |        2 |      2
     3 |        3 |      3
   -10 |        0 |    -10
     4 |        0 |     -6
     3 |        0 |     -3
    10 |        7 |      7
     1 |        1 |      1
(7 rows)

As you can see, the final step is to take the output of the aggregate when it is positive, and zero otherwise. This can be done by wrapping the query, or by writing a function for it:

create function positive(int) returns int immutable strict language sql as
$$ select case when $1 > 0 then $1 else 0 end $$;

and now:

select value, argument, positive(ouragg(value) over(order by id)) as computed_argument from t;

This produces the arguments for the function that you specified in the question.
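
The other option mentioned above, wrapping the query, could look like the sketch below. It uses the built-in greatest() in place of the positive() helper; the subquery alias s and the column names running, computed and matches are made up for illustration.

```sql
-- Inner query: raw aggregate state per row, in id order.
-- Outer query: clamp negative states to zero and compare with the expected column.
select value,
       argument,
       greatest(running, 0)            as computed,
       greatest(running, 0) = argument as matches
from (
  select id, value, argument,
         ouragg(value) over (order by id) as running
  from t
) s
order by id;
```

With the sample data, matches should be true on every row, since the raw states 2, 3, -10, -6, -3, 7, 1 clamp to 2, 3, 0, 0, 0, 7, 1.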

araqnid answered Nov 07 '22 01:11