I have a collection of objects containing transactional data. This dataset has a 3 level parent-child relationship with the levels being: Project > Stage > Transaction. For each project and stage there will be multiple transactions.
class CostElement
{
string ProjectNumber { get; set; }
string Stage { get; set; }
string DerivedCostElement { get; set; }
DateTime? Date { get; set; }
decimal? Amount { get; set; }
}
What I want to find is a unique list of cost elements (DerivedCostElement
), with a count of the maximum number of times it is used in any given single project. I.e. Count the number of stages it appears in for each project and then select the maximum value.
What I have tried so far is:
//count all cost elements per project
var aggregateQuery = from ce in _costElements
group ce by new { ce.ProjectNumber, ce.Stage, ce.DerivedCostElement }
into g
select new
{
g.Key.DerivedCostElement,
g.Key.ProjectNumber,
g.Key.Stage
};
//select cost elements having max count
var countQuery = from g in aggregateQuery
group g by new { g.DerivedCostElement, g.ProjectNumber }
into grp
select new CostElementCount
{
CostElement = grp.Key.DerivedCostElement,
ProjectNumber = grp.Key.ProjectNumber,
Count = grp.Count()
};
return countQuery.ToList();
This for the most part works, I end up with a list of cost elements per project and the maximum number of occurrences per project. However the results are still per project, what I want is a unique list with only the maximum value.
Can someone please help me obtain the correct result? Also any suggestions on how to write this more efficiently or succinctly would be greatly appreciated.
I'm not sure about the writing the first two queries you have more efficiently, but I think adding the following line before your return statement would get the results you want.
countQuery = countQuery.OrderByDescending(x => x.Count).GroupBy(x => x.CostElement).Select(g => g.First());
Or to keep the syntax more consistent
countQuery = from cq in countQuery
orderby cq.Count descending
group cq by cq.CostElement
into grp
select grp.First();
Here is the test data I used...
_costElements = new List<CostElement>();
_costElements.Add(new CostElement() { ProjectNumber = "a", Stage = "a", DerivedCostElement = "ce6" });
_costElements.Add(new CostElement() { ProjectNumber = "a", Stage = "b", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "a", Stage = "x", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "a", Stage = "b", DerivedCostElement = "ce2" });
_costElements.Add(new CostElement() { ProjectNumber = "a", Stage = "b", DerivedCostElement = "ce3" });
_costElements.Add(new CostElement() { ProjectNumber = "a", Stage = "c", DerivedCostElement = "ce3" });
_costElements.Add(new CostElement() { ProjectNumber = "a", Stage = "d", DerivedCostElement = "ce3" });
_costElements.Add(new CostElement() { ProjectNumber = "b", Stage = "e", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "b", Stage = "f", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "c", Stage = "g", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "c", Stage = "h", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "c", Stage = "h", DerivedCostElement = "ce2" });
_costElements.Add(new CostElement() { ProjectNumber = "c", Stage = "i", DerivedCostElement = "ce2" });
_costElements.Add(new CostElement() { ProjectNumber = "d", Stage = "j", DerivedCostElement = "ce2" });
_costElements.Add(new CostElement() { ProjectNumber = "d", Stage = "k", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "d", Stage = "l", DerivedCostElement = "ce1" });
_costElements.Add(new CostElement() { ProjectNumber = "d", Stage = "m", DerivedCostElement = "ce1" });
And the results I came up with (with a bit of reordering of the countQuery results) were...
CostElement: ce1 ProjectNumber: d Count: 3
CostElement: ce2 ProjectNumber: c Count: 2
CostElement: ce3 ProjectNumber: a Count: 3
CostElement: ce6 ProjectNumber: a Count: 1
Which I think is a list of all cost elements, along with the project and count where each of them appears most. That output came from a ToString() I added to CostElementCount
public string ToString()
{
return string.Format("CostElement: {0} ProjectNumber: {1} Count: {2}", CostElement, ProjectNumber, Count);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With