In had a rather large error in my previous question
select earliest date from multiple rows
The answer by horse_with_no_name returns a perfect result, and I am hugely appreciative, however I got my own initial question wrong so I really apologise; if you look at the table below;
circuit_uid |customer_name |rack_location |reading_date | reading_time | amps | volts | kw | kwh | kva | pf | key -------------------------------------------------------------------------------------------------------------------------------------- cu1.cb1.r1 | Customer 1 | 12.01.a1 | 2012-01-02 | 00:01:01 | 4.51 | 229.32 | 1.03 | 87 | 1.03 | 0.85 | 15 cu1.cb1.r1 | Customer 1 | 12.01.a1 | 2012-01-02 | 01:01:01 | 4.18 | 230.3 | 0.96 | 90 | 0.96 | 0.84 | 16 cu1.cb1.r2 | Customer 1 | 12.01.a1 | 2012-01-02 | 00:01:01 | 4.51 | 229.32 | 1.03 | 21 | 1.03 | 0.85 | 15 cu1.cb1.r2 | Customer 1 | 12.01.a1 | 2012-01-02 | 01:01:01 | 4.18 | 230.3 | 0.96 | 23 | 0.96 | 0.84 | 16 cu1.cb1.s2 | Customer 2 | 10.01.a1 | 2012-01-02 | 00:01:01 | 7.34 | 228.14 | 1.67 | 179 | 1.67 | 0.88 | 24009 cu1.cb1.s2 | Customer 2 | 10.01.a1 | 2012-01-02 | 01:01:01 | 9.07 | 228.4 | 2.07 | 182 | 2.07 | 0.85 | 24010 cu1.cb1.s3 | Customer 2 | 10.01.a1 | 2012-01-02 | 00:01:01 | 7.34 | 228.14 | 1.67 | 121 | 1.67 | 0.88 | 24009 cu1.cb1.s3 | Customer 2 | 10.01.a1 | 2012-01-02 | 01:01:01 | 9.07 | 228.4 | 2.07 | 124 | 2.07 | 0.85 | 24010 cu1.cb1.r1 | Customer 3 | 01.01.a1 | 2012-01-02 | 00:01:01 | 7.32 | 229.01 | 1.68 | 223 | 1.68 | 0.89 | 48003 cu1.cb1.r1 | Customer 3 | 01.01.a1 | 2012-01-02 | 01:01:01 | 6.61 | 228.29 | 1.51 | 226 | 1.51 | 0.88 | 48004 cu1.cb1.r4 | Customer 3 | 01.01.a1 | 2012-01-02 | 00:01:01 | 7.32 | 229.01 | 1.68 | 215 | 1.68 | 0.89 | 48003 cu1.cb1.r4 | Customer 3 | 01.01.a1 | 2012-01-02 | 01:01:01 | 6.61 | 228.29 | 1.51 | 217 | 1.51 | 0.88 | 48004
As you can see each customer now has multiple circuits. So the result would now be the sum of each of the earliest kwh readings for each circuit per customer, so the result in this table would be;
customer_name | kwh(sum)
--------------+-----------
customer 1 | 108 (the result of 87 + 21)
customer 2 | 300 (the result of 179 + 121)
customer 3 | 438 (the result of 223 + 215)
There will be more than 2 circuits per customer and the readings can happen at varying times, hence the need for the 'earliest' reading.
Would anybody have any suggestions for the revised question?
PostgreSQL 8.4 on CentOs/Redhat.
SELECT customer_name, sum(kwh) AS kwh_total
FROM (
SELECT DISTINCT ON (customer_name, circuit_uid)
customer_name, circuit_uid, kwh
FROM readings
WHERE reading_date = '2012-01-02'::date
ORDER BY customer_name, circuit_uid, reading_time
) x
GROUP BY 1
Same as before, just pick the earliest per (customer_name, circuit_uid).
Then sum per customer_name.
A multi-column index like the following will make this very fast:
CREATE INDEX readings_multi_idx
ON readings(reading_date, customer_name, circuit_uid, reading_time);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With