
How to get average of the 'middle' values in a group?

Tags: sql, postgresql

I have a table of values and group IDs (simplified example below). For each group, I need the average of the middle 3 values. With 1, 2, or 3 values, that's just the plain average. With 4 values it would exclude the highest, with 5 values the highest and lowest, and so on. I was thinking of some sort of window function, but I'm not sure it's possible.

http://www.sqlfiddle.com/#!11/af5e0/1

For this data:

TEST_ID TEST_VALUE  GROUP_ID
1       5           1
2       10          1
3       15          1
4       25          2
5       35          2
6       5           2
7       15          2
8       25          3
9       45          3
10      55          3
11      15          3
12      5           3
13      25          3
14      45          4

I'd like this result:

GROUP_ID    AVG
1           10
2           15
3           21.6
4           45
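
To pin down the rule, here is a small reference sketch in Python (not part of the original question). It assumes that when the "middle 3" cannot be centred exactly, the window leans towards the lower values, which is what the expected output implies: group 3 keeps 15, 25, 25, so the 21.6 above is 21.67 truncated. The names `middle_three_avg` and `averages_by_group` are made up for illustration.

```python
from collections import defaultdict

# (test_id, test_value, group_id) rows copied from the sample data above.
ROWS = [
    (1, 5, 1), (2, 10, 1), (3, 15, 1),
    (4, 25, 2), (5, 35, 2), (6, 5, 2), (7, 15, 2),
    (8, 25, 3), (9, 45, 3), (10, 55, 3), (11, 15, 3), (12, 5, 3), (13, 25, 3),
    (14, 45, 4),
]


def middle_three_avg(values):
    """Average the middle three values; with an even count the window leans low."""
    ordered = sorted(values)
    n = len(ordered)
    if n <= 3:
        return sum(ordered) / n
    # Drop (n - 3) // 2 values from the bottom; the rest of the surplus comes off the top.
    start = (n - 3) // 2
    window = ordered[start:start + 3]
    return sum(window) / 3


def averages_by_group(rows):
    """Group the values by group_id and apply middle_three_avg to each group."""
    groups = defaultdict(list)
    for _test_id, value, group_id in rows:
        groups[group_id].append(value)
    return {g: middle_three_avg(v) for g, v in sorted(groups.items())}


if __name__ == "__main__":
    for group_id, avg in averages_by_group(ROWS).items():
        print(group_id, round(avg, 2))
```

Any SQL answer can be checked against this: odd counts above 3 (which the sample data never exercises) should keep rows 2 to 4 of 5, rows 3 to 5 of 7, and so on.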
Barbara Laird asked Sep 30 '13 18:09


2 Answers

Another option, using analytic (window) functions:

SELECT group_id,
       avg( test_value )
FROM (
  select t.*,
         row_number() over (partition by group_id order by test_value ) rn,
         count(*) over (partition by group_id  ) cnt
  from test t
) alias 
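where 
   cnt <= 3
   or 
   -- cnt is an integer, so cnt / 2 truncates and floor()/ceil() around it do nothing;
   -- that shifts the window one row too low for odd counts above 3 (rows 1-3 of 5).
   -- (cnt + 1) / 2 is the middle row, or the lower of the two middle rows when cnt is even.
   rn between (cnt + 1) / 2 - 1 and (cnt + 1) / 2 + 1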
group by group_id
;

Demo --> http://www.sqlfiddle.com/#!11/af5e0/59

krokodilko answered Nov 01 '22 17:11


I'm not familiar with the Postgres syntax for window functions, but I was able to solve your problem in SQL Server with this SQL Fiddle. You should be able to migrate it to Postgres-compatible code without much trouble. Hope it helps!

A quick primer on how it works:

  1. Order the test scores for each group
  2. Get a count of items in each group
  3. Use that as a subquery and select only the middle 3 items (that's the where clause in the outer query)
  4. Get the average for each group

--

select  
  group_id,
  avg(test_value)
from (
  select 
    t.group_id, 
    convert(decimal,t.test_value) as test_value, 
    row_number() over (
      partition by t.group_id
      order by t.test_value
    ) as ord,
    g.gc
  from
    test t
    inner join (
      select group_id, count(*) as gc
      from test
      group by group_id
    ) g
      on t.group_id = g.group_id
  ) a
where
  ord >= case when gc <= 3 then 1 when gc % 2 = 1 then gc / 2 else (gc - 1) / 2 end
  and ord <= case when gc <= 3 then 3 when gc % 2 = 1 then (gc / 2) + 2 else ((gc - 1) / 2) + 2 end
group by
  group_id
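
For anyone without a SQL Server instance, the four steps above can be tried in a more portable form. The sketch below is an adaptation under stated assumptions, not the answer's own code: it swaps the join for a `count(*) over (...)` window, drops the SQL Server-specific `convert`, and runs against an in-memory SQLite database (3.25 or later, needed for window functions) through Python's `sqlite3` module, purely because that is easy to run. The table layout is guessed from the question's sample data, and `run_query` is a made-up helper name.

```python
import sqlite3

# Same row bounds as the CASE expressions in the answer, minus convert().
QUERY = """
select group_id, avg(test_value) as avg_value
from (
  select
    group_id,
    test_value,
    row_number() over (partition by group_id order by test_value) as ord,
    count(*) over (partition by group_id) as gc
  from test
) a
where
  ord >= case when gc <= 3 then 1 when gc % 2 = 1 then gc / 2 else (gc - 1) / 2 end
  and ord <= case when gc <= 3 then 3 when gc % 2 = 1 then gc / 2 + 2 else (gc - 1) / 2 + 2 end
group by group_id
order by group_id
"""

# (test_id, test_value, group_id) rows copied from the question.
SAMPLE = [
    (1, 5, 1), (2, 10, 1), (3, 15, 1),
    (4, 25, 2), (5, 35, 2), (6, 5, 2), (7, 15, 2),
    (8, 25, 3), (9, 45, 3), (10, 55, 3), (11, 15, 3), (12, 5, 3), (13, 25, 3),
    (14, 45, 4),
]


def run_query(rows=SAMPLE):
    """Load the rows into an in-memory SQLite table and return (group_id, avg) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table test (test_id integer, test_value integer, group_id integer)"
    )
    conn.executemany("insert into test values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for group_id, avg_value in run_query():
        print(group_id, round(avg_value, 2))
```

The query text itself uses only `row_number`, `count(*) over`, `case`, `%` and integer division, so it should carry over to Postgres largely unchanged; there, `avg` over an integer column already returns `numeric`, so no cast is needed.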
Derek Kromm answered Nov 01 '22 19:11