I've been working with SQL Server for the better part of a decade, and this grouping (or partitioning, or ranking...I'm not sure what the answer is!) one has me stumped. Feels like it should be an easy one, too. I'll generalize my problem:
Let's say I have 3 employees (don't worry about them quitting or anything...there's always 3), and I keep up with how I distribute their salaries on a monthly basis.
Month Employee PercentOfTotal
--------------------------------
1 Alice 25%
1 Barbara 65%
1 Claire 10%
2 Alice 25%
2 Barbara 50%
2 Claire 25%
3 Alice 25%
3 Barbara 65%
3 Claire 10%
As you can see, I've paid them the same percent in Months 1 and 3, but in Month 2, I've given Alice the same 25%, but Barbara got 50% and Claire got 25%.
What I want to know is all the distinct distributions I've ever given. In this case there would be two -- one for months 1 and 3, and one for month 2.
I'd expect the results to look something like this (NOTE: the ID, or sequencer, or whatever, doesn't matter)
ID Employee PercentOfTotal
--------------------------------
X Alice 25%
X Barbara 65%
X Claire 10%
Y Alice 25%
Y Barbara 50%
Y Claire 25%
Seems easy, right? I'm stumped! Anyone have an elegant solution? I just put together this solution while writing this question, which seems to work, but I'm wondering if there's a better way. Or maybe a different way from which I'll learn something.
WITH temp_ids (Month)
AS
(
SELECT DISTINCT MIN(Month)
FROM employees_paid
GROUP BY PercentOfTotal
)
SELECT EMP.Month, EMP.Employee, EMP.PercentOfTotal
FROM employees_paid EMP
JOIN temp_ids IDS ON EMP.Month = IDS.Month
GROUP BY EMP.Month, EMP.Employee, EMP.PercentOfTotal
Thanks y'all! -Ricky
This gives you an answer in a slightly different format than you requested:
SELECT DISTINCT
T1.PercentOfTotal AS Alice,
T2.PercentOfTotal AS Barbara,
T3.PercentOfTotal AS Claire
FROM employees_paid T1
JOIN employees_paid T2
ON T1.Month = T2.Month AND T1.Employee = 'Alice' AND T2.Employee = 'Barbara'
JOIN employees_paid T3
ON T2.Month = T3.Month AND T3.Employee = 'Claire'
Result:
Alice Barbara Claire
25% 50% 25%
25% 65% 10%
If you want to, you can use UNPIVOT to turn this result set into the form you asked for.
SELECT rn AS ID, Employee, PercentOfTotal
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY Alice) AS rn
FROM (
SELECT DISTINCT
T1.PercentOfTotal AS Alice,
T2.PercentOfTotal AS Barbara,
T3.PercentOfTotal AS Claire
FROM employees_paid T1
JOIN employees_paid T2 ON T1.Month = T2.Month AND T1.Employee = 'Alice'
AND T2.Employee = 'Barbara'
JOIN employees_paid T3 ON T2.Month = T3.Month AND T3.Employee = 'Claire'
) T1
) p UNPIVOT (PercentOfTotal FOR Employee IN (Alice, Barbara, Claire)) AS unpvt
Result:
ID Employee PercentOfTotal
1 Alice 25%
1 Barbara 50%
1 Claire 25%
2 Alice 25%
2 Barbara 65%
2 Claire 10%
What you want is for each month's distribution to act as a signature or pattern of values which you would then want to find in other months. What is not clear is whether the employee to whom the value went is as important as the break down of percentages. For example, would Alice=65%, Barbara=25%, Claire=10% be the same as the Month 3 in your example? In my example, I presumed that it would not be the same. Similar to Martin Smith's solution, I find the signatures by multiplying each percentage by 10. This presumes that all percentage values are less than one. If someone could have a percentage of 110% for example, that would create problems for this solution.
With Employees As
(
Select 1 As Month, 'Alice' As Employee, .25 As PercentOfTotal
Union All Select 1, 'Barbara', .65
Union All Select 1, 'Claire', .10
Union All Select 2, 'Alice', .25
Union All Select 2, 'Barbara', .50
Union All Select 2, 'Claire', .25
Union All Select 3, 'Alice', .25
Union All Select 3, 'Barbara', .65
Union All Select 3, 'Claire', .10
)
, EmployeeRanks As
(
Select Month, Employee, PercentOfTotal
, Row_Number() Over ( Partition By Month Order By Employee, PercentOfTotal ) As ItemRank
From Employees
)
, Signatures As
(
Select Month
, Sum( PercentOfTotal * Cast( Power( 10, ItemRank ) As bigint) ) As SignatureValue
From EmployeeRanks
Group By Month
)
, DistinctSignatures As
(
Select Min(Month) As MinMonth, SignatureValue
From Signatures
Group By SignatureValue
)
Select E.Month, E.Employee, E.PercentOfTotal
From Employees As E
Join DistinctSignatures As D
On D.MinMonth = E.Month
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With