Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

First and last value aggregate functions in postgresql that work correctly with NULL values

I know there are aggregate functions for getting last and first value of rows in postgresql

My problem is, that they do not work as i need. And i could use the help of one postgresql wizard. I'm using postgresql 9.2 - in case the version makes offering solution easyer.

Query

select v.id, v.active, v.reg_no, p.install_date, p.remove_date 
from vehicle v 
    left join period p on (v.id = p.car_id) 
where v.id = 1 
order by v.id, p.install_date asc

Returns 6 rows:

id, active, reg_no, install_date, remove_date
1, TRUE, something, 2008-08-02 11:13:39, 2009-02-09 10:32:32
....
1, TRUE, something, 2010-08-15 21:16:40, 2012-08-25 07:44:30
1, TRUE, something, 2012-09-10 17:05:12, NULL

But when i use aggregating query:

select max(id) as id, last(active) as active, first(install_date) as install_date, last(remove_date) as remove_date 
from (
    select v.id, v.active, v.reg_no, p.install_date, p.remove_date 
    from vehicle v 
      left join period p on (v.id = p.car_id) 
    where v.id = 1 
    order by v.id, p.install_date asc
) as bar 
group by id

Then i get

id, active, install_date, remove_date
1, TRUE, 2008-08-02 11:13:39, 2012-08-25 07:44:30

not

id, active, install_date, remove_date
1, TRUE, 2008-08-02 11:13:39, NULL

as i expected

Is it possible to change the aggregate functions somehow to yield NULL if the value of last row is null, not last existing value?

EDIT1

Roman Pekar offered alternative solution to my problem, but that does not fit my needs. The reason is - i simplified the original query. But the query i run is more complex. I realise that there might be alternative solutions to my problem - this why is update the post to include the original, more complex, query. Which is:

select partner_id, sum(active) as active, sum(installed) as installed, sum(removed) as removed 
from (
    select 
    pc.partner_id as partner_id, 
    v.id, 
    CASE WHEN v.active = TRUE THEN 1 ELSE 0 END as active, 
    CASE WHEN first(p.install_date) BETWEEN '2013-12-01' AND '2014-01-01' THEN 1 ELSE 0 END as installed,
    CASE WHEN last(p.remove_date) BETWEEN '2013-12-01' AND '2014-01-01' THEN 1 ELSE 0 END as removed 
    from vehicle v 
        left join period p on (v.id = p.car_id) 
        left join partner_clients pc on (pc.account_id = v.client_id) 
    group by pc.partner_id, v.id, v.active
) as foo group by partner_id

As you can see, i actually need to get first and last value of several vehicles not one and in the end aggregate the counts of those vehicles by the owners of those vehicles.

/EDIT1

like image 591
Odif Yltsaeb Avatar asked Dec 03 '13 08:12

Odif Yltsaeb


People also ask

Can aggregate functions work with null values?

An aggregate function performs a calculation on a set of values, and returns a single value. Except for COUNT(*) , aggregate functions ignore null values.

Which aggregate function accepts null values?

All aggregate functions except COUNT (*), GROUPING , and GROUPING_ID ignore nulls. You can use the NVL function in the argument to an aggregate function to substitute a value for a null.

How do nulls affect the aggregate functions?

Nulls and Aggregate Functions. If an aggregate function against a column that contains nulls is executed, the function ignores the nulls. This prevents unknown or inapplicable values from affecting the result of the aggregate.

How do you sum null values in PostgreSQL?

MySQL and PostgreSQL cannot sum up NULL values with the + value. The sum value will be NULL . If you want to do additions in the database: use SUM if it's an option to sum up a column of a result set instead of expressions ( SUM ignores NULL values)


2 Answers

You could use window functions lead() and lag() to check first and last record, for example:

select
    max(a.id) as id,
    max(a.first) as first,
    max(a.last) as last
from (
    select
         v.id,
         case when lag(v.id) over(order by v.id, p.install_date) is null then p.install_date end as first,
         case when lead(v.id) over(order by v.id, p.install_date) is null then p.remove_date end as last
    from vehicle v 
       left join period p on (v.id = p.car_id) 
    where v.id = 1 
) as a

sql fiddle demo

like image 180
Roman Pekar Avatar answered Sep 22 '22 05:09

Roman Pekar


Thanks to Damien i went reading postgresql documentation about creating functions (source) and fiddled with the function changing it from:

CREATE OR REPLACE FUNCTION public.last_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS $$
        SELECT $2;
$$;

CREATE AGGREGATE public.last (
        sfunc    = public.last_agg,
        basetype = anyelement,
        stype    = anyelement
);

to:

CREATE OR REPLACE FUNCTION public.last_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE sql IMMUTABLE CALLED ON NULL INPUT AS $$
        SELECT $2;
$$;

CREATE AGGREGATE public.last (
        sfunc    = public.last_agg,
        basetype = anyelement,
        stype    = anyelement
);

and it seems to have fixed my troubles.

Thanks for reading.

like image 44
Odif Yltsaeb Avatar answered Sep 19 '22 05:09

Odif Yltsaeb