
Moving average in PostgreSQL

I have the following table in my PostgreSQL 9.1 database:

select * from ro;

date       |  shop_id | amount
-----------+----------+--------
2013-02-07 |     1001 |      3
2013-01-31 |     1001 |      2
2013-01-24 |     1001 |      1
2013-01-17 |     1001 |      5
2013-02-10 |     1001 |     10
2013-02-03 |     1001 |      4
2012-12-27 |     1001 |      6
2012-12-20 |     1001 |      8
2012-12-13 |     1001 |      4
2012-12-06 |     1001 |      3
2012-10-29 |     1001 |      3

I am trying to compute a moving average that compares each Thursday against the previous 3 Thursdays, without including the current Thursday. Here's my query:

select date, shop_id, amount, extract(dow from date),
       avg(amount) OVER (PARTITION BY extract(dow from date)
                         ORDER BY date DESC
                         ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING)
from ro
where extract(dow from date) = 4

This is the result I get:

date       |  shop_id | amount | date_part |        avg
-----------+----------+--------+-----------+--------------------
2013-02-07 |     1001 |      3 |         4 | 2.0000000000000000
2013-01-31 |     1001 |      2 |         4 | 2.6666666666666667
2013-01-24 |     1001 |      1 |         4 | 4.0000000000000000
2013-01-17 |     1001 |      5 |         4 | 6.3333333333333333
2012-12-27 |     1001 |      6 |         4 | 6.0000000000000000
2012-12-20 |     1001 |      8 |         4 | 5.0000000000000000
2012-12-13 |     1001 |      4 |         4 | 3.5000000000000000
2012-12-06 |     1001 |      3 |         4 | 3.0000000000000000

I expect:

date       |  shop_id | amount | date_part |        avg
-----------+----------+--------+-----------+--------------------
2013-02-07 |     1001 |      3 |         4 | 2.6666666666666667
2013-01-31 |     1001 |      2 |         4 | 4.0000000000000000
2013-01-24 |     1001 |      1 |         4 | 6.3333333333333333
2013-01-17 |     1001 |      5 |         4 | 6.0000000000000000
2012-12-27 |     1001 |      6 |         4 | 5.0000000000000000
2012-12-20 |     1001 |      8 |         4 |
2012-12-13 |     1001 |      4 |         4 |
2012-12-06 |     1001 |      3 |         4 |
Glicious asked Feb 07 '13 10:02




1 Answer


select
    "date",
    shop_id,
    amount,
    extract(dow from date),
    case when
        row_number() over (order by date) > 3
        then
            avg(amount) OVER (
                ORDER BY date DESC
                ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING
            )
        else null end
from (
    select *
    from ro
    where extract(dow from date) = 4
) s

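What is wrong with the OP's query is the frame specification:

ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING

This frame starts at the current row, so the current Thursday is included in the average. With ORDER BY date DESC the previous Thursdays are the following rows, so the frame has to be ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING. The row_number() test returns null for Thursdays that have fewer than three earlier Thursdays.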
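To see the effect of the frame on its own, here is a sketch of the question's query with only the frame changed. The alias avg_prev_3 and the final ORDER BY are additions for illustration and are not part of the original query.

```sql
-- Sketch: the question's query with only the frame changed.
-- With ORDER BY date DESC, "following" rows are the earlier Thursdays,
-- so 1 FOLLOWING .. 3 FOLLOWING skips the current row.
select
    date,
    shop_id,
    amount,
    extract(dow from date),
    avg(amount) over (
        partition by extract(dow from date)
        order by date desc
        rows between 1 following and 3 following
    ) as avg_prev_3   -- hypothetical alias
from ro
where extract(dow from date) = 4
order by date desc;
```

On the sample data this gives the expected averages for the five most recent Thursdays, but the frame shrinks near the end of the data: 2012-12-20 is averaged over only two earlier rows (3.5), 2012-12-13 over one (3.0), and 2012-12-06 has an empty frame and yields null. The row_number() check in the answer's query is what turns those partial averages into nulls.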

Other than that, my query avoids unneeded computation by filtering for Thursdays in a subquery first, so the window functions no longer need to partition by day of week.

If you need to partition by shop_id, add PARTITION BY shop_id to both window functions, avg and row_number.
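A sketch of what the per-shop variant might look like, assuming the same ro table. The alias avg_prev_3 and the outer ORDER BY are illustrative additions.

```sql
-- Sketch: per-shop moving average over the previous 3 Thursdays.
-- Both window functions get the same PARTITION BY shop_id so that
-- row counting and averaging never cross shop boundaries.
select
    "date",
    shop_id,
    amount,
    case when
        row_number() over (partition by shop_id order by "date") > 3
        then
            avg(amount) over (
                partition by shop_id
                order by "date" desc
                rows between 1 following and 3 following
            )
        else null end as avg_prev_3   -- hypothetical alias
from (
    select *
    from ro
    where extract(dow from "date") = 4
) s
order by shop_id, "date" desc;
```

With the sample data, which contains only shop 1001, the averages should be the same as in the unpartitioned query.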

Clodoaldo Neto answered Sep 30 '22 04:09