I'd like to use the percentile_cont
function to get median values in T-SQL. However, I also need to get mean values as well. I'd like to do something like the following:
SELECT CustomerID ,
AVG(Expenditure) AS MeanSpend , percentile_cont
( .5) WITHIN GROUP(ORDER BY Expenditure) OVER( ) AS MedianSpend
FROM Customers
GROUP BY CustomerID
Can this be accomplished? I know I can use the OVER clause to group the percentile_cont
results...
but then I'm stuck using two queries, am I not?
The definition from Microsoft for PERCENTILE_CONT is: “Calculates a percentile based on a continuous distribution of the column value in SQL Server. The result is interpolated and might not be equal to any of the specific values in the column.”
PERCENTILE_CONT returns a computed result after doing linear interpolation. PERCENTILE_DISC simply returns a value from the set of values that are aggregated over.
The PERCENT_RANK function computes the rank of the employee's salary within a department as a percentage. The PARTITION BY clause is specified to partition the rows in the result set by department. The ORDER BY clause in the OVER clause orders the rows in each partition.
Just figured it out... gotta drop the group by and give both aggregation functions a over statement.
SELECT CustomerID,
AVG(Expenditure) OVER(PARTITION BY CustomerID) AS MeanSpend,
percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure) OVER(PARTITION BY CustomerID) AS MedianSpend
FROM Customers
You can't use "group by" with window functions. These functions return the aggregated values for every row. One way is to use "select distinct" to get rid of the duplicate rows. Just make sure you partition each window function by the non-aggregated columns (groupId in this example).
--Generate test data
SELECT TOP(10)
value.number%3 AS groupId
, value.number AS number
INTO #data
FROM master.dbo.spt_values AS value
WHERE value."type" = 'P'
ORDER BY NEWID()
;
--View test data
SELECT * FROM #data ORDER BY groupId,number;
--CALCULATE MEDIAN
SELECT DISTINCT
groupId
, AVG(number) OVER(PARTITION BY groupId) AS mean
, percentile_cont(.5) WITHIN GROUP(ORDER BY number) OVER(PARTITION BY groupId) AS median
FROM #data
;
--Clean up
DROP TABLE #data;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With