Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using QUARTILE in an Excel pivot table to summarise data by sub-populations

I've got a large table of data in an Excel spreadsheet that, essentially, can be considered to be a collection of values for individuals identified as belonging to various subpopulations:

  IndivID   SubPopID  Value
     1          A       33.56
     2          E       42.31
     3          D       16.35
     4          A       50.59
     5          F       80.63
     ...

There are over 10,000 individuals in this table, and over 50 subpopulations.

I would like to calculate the 5-number summary (Min, Q1, Median, Q3, Max) for each of the subpopulations.

Min and Max are easy in a pivot table. But Q1, Q2 (median), and Q3 don't seem to be possible.

Question

Is there any way to calculate the quartiles for each subpopulation in a pivot table in Excel?

The answer should look like this:

SubPopID      Min     Q1     Q2        Q3      Max
   A         3.23    12.06   20.35   28.29     50.59
   B 
   C
   ...
like image 723
Assad Ebrahim Avatar asked Nov 18 '12 14:11

Assad Ebrahim


People also ask

How do you summarize data in a PivotTable?

In the PivotTable, right-click the value field you want to change, and then click Summarize Values By. Click the summary function you want. Note: Summary functions aren't available in PivotTables that are based on Online Analytical Processing (OLAP) source data. The sum of the values.


3 Answers

My experience is that you can't do percentiles in a pivot table. I just did a web search and don't see anything to contradict that. I think the best you can do is have your source data in a table so you can refer to it dynamically, and then use the percentile function in another table.

I've seen one article about using PowerPivot, but that's only available in Excel 2010, and it looks hard. I have 2010, and when faced with a similar problem, I still opted to go the route I'm suggesting here.

EDIT: Explanation of my proposed alternative:

In answer to your question in the comments, here's how I'd calculate percentiles from a table of data:

Table of percentiles

It relies on an single formula, which is the same for every cell from F2 to H6. It's an array formula, meaning it's entered with Ctrl-Shift-Enter. It uses an IF statement to filter on the various possible SubPopIDs. Here's the formula for F2:

=PERCENTILE(IF(Table1[SubPopID]=F$1,Table1[Value],""),$E2/100)
like image 89
Doug Glancy Avatar answered Oct 19 '22 20:10

Doug Glancy


If you need to check a condition for more than one column (a=1 AND b=2) you can expand the array using the * character

=PERCENTILE((IF((Table1[SubPopID]=[condition1]) * (Table1[SubPopID2]=[condition2]),Table1[Value]),""),$E2/100)
like image 37
zer0links Avatar answered Oct 19 '22 21:10

zer0links


I found this answer where a solution is described.

It uses the same aproach as described by Doug Glancy but associate it with a calculated field in the pivot table. The answer gives an example excel file.

like image 40
Alban Avatar answered Oct 19 '22 21:10

Alban