
Count median grouped by day

I have a query that calculates the median price over all rows in a table:

SELECT avg(t1.price) as median_val FROM (
SELECT @rownum:=@rownum+1 as `row_number`, d.price
  FROM mediana d,  (SELECT @rownum:=0) r
  WHERE 1
  ORDER BY d.price
) as t1, 
(
  SELECT count(*) as total_rows
  FROM mediana d
  WHERE 1
) as t2
WHERE t1.row_number>=total_rows/2 and t1.row_number<=total_rows/2+1;

Now I need the median not over the whole table, but grouped by date. Is that possible? Example: http://sqlfiddle.com/#!2/7cf27. The expected result is 2013-03-06 - 1.5 and 2013-03-05 - 3.5.

Alex asked Mar 13 '13 13:03



1 Answer

I hope I didn't get lost and overcomplicate things, but here's what I came up with:

SELECT sq.created_at, avg(sq.price) as median_val FROM (
  SELECT t1.row_number, t1.price, t1.created_at FROM (
    -- number the rows by price, restarting at 1 whenever the date changes
    SELECT IF(@prev!=d.created_at, @rownum:=1, @rownum:=@rownum+1) as `row_number`, d.price, @prev:=d.created_at AS created_at
    FROM mediana d, (SELECT @rownum:=0, @prev:=NULL) r
    ORDER BY created_at, price
  ) as t1 INNER JOIN
  (
    -- row count per date
    SELECT count(*) as total_rows, created_at
    FROM mediana d
    GROUP BY created_at
  ) as t2
  ON t1.created_at = t2.created_at
  -- keep the middle row (odd count) or the two middle rows (even count)
  WHERE t1.row_number>=t2.total_rows/2 and t1.row_number<=t2.total_rows/2+1
) sq
GROUP BY sq.created_at;

The main change is that the row number resets to 1 whenever the date changes (this is why ordering by created_at first matters), and the date is carried through so the outer query can group by it. The subquery that counts total rows also groups by created_at, so the two subqueries can be joined on the date.
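To sanity-check the row-selection rule outside the database, here is a minimal Python sketch of the same steps: number the rows per date in price order, keep the rows whose number lies between total_rows/2 and total_rows/2+1, and average them. The sample rows and the function name median_by_day are made up for illustration; they are not the data from the linked fiddle, and this is only a rough mirror of the query, not a replacement for it.

```python
from collections import defaultdict

# Hypothetical sample rows (created_at, price); NOT the data from the linked fiddle.
rows = [
    ("2013-03-05", 4), ("2013-03-06", 1),
    ("2013-03-05", 3), ("2013-03-06", 2),
    ("2013-03-07", 5), ("2013-03-07", 9), ("2013-03-07", 7),
]


def median_by_day(rows):
    """Number rows per day by price, keep the middle ones, average them."""
    prices_by_day = defaultdict(list)
    for created_at, price in rows:
        prices_by_day[created_at].append(price)

    result = {}
    for created_at, prices in prices_by_day.items():
        prices.sort()             # like ORDER BY created_at, price
        total_rows = len(prices)  # like count(*) ... GROUP BY created_at
        middle = [
            price
            # enumerate restarts at 1 for every day, like the @rownum reset
            for row_number, price in enumerate(prices, start=1)
            # true division, so an odd count such as 3 gives the range 1.5..2.5
            if total_rows / 2 <= row_number <= total_rows / 2 + 1
        ]
        result[created_at] = sum(middle) / len(middle)  # like avg(price)
    return result


medians = median_by_day(rows)
print(medians)
```

With an even number of rows per date, the range covers the two middle rows and their average is the median. With an odd number, only the single middle row falls inside the range. For the sample rows above this gives 3.5 for 2013-03-05, 1.5 for 2013-03-06 and 7.0 for 2013-03-07.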

fancyPants answered Sep 19 '22 10:09