I am trying to figure out how to refactor this LINQ code nicely. This code and other similar code repeats within the same file as well as in other files. Sometime the data being manipulated is identical and sometimes the data changes and the logic remains the same.
Here is an example of duplicated logic operating on different fields of different objects.
public IEnumerable<FooDataItem> GetDataItemsByColor(IEnumerable<BarDto> dtos)
{
double totalNumber = dtos.Where(x => x.Color != null).Sum(p => p.Number);
return from stat in dtos
where stat.Color != null
group stat by stat.Color into gr
orderby gr.Sum(p => p.Number) descending
select new FooDataItem
{
Color = gr.Key,
NumberTotal = gr.Sum(p => p.Number),
NumberPercentage = gr.Sum(p => p.Number) / totalNumber
};
}
public IEnumerable<FooDataItem> GetDataItemsByName(IEnumerable<BarDto> dtos)
{
double totalData = dtos.Where(x => x.Name != null).Sum(v => v.Data);
return from stat in dtos
where stat.Name != null
group stat by stat.Name into gr
orderby gr.Sum(v => v.Data) descending
select new FooDataItem
{
Name = gr.Key,
DataTotal = gr.Sum(v => v.Data),
DataPercentage = gr.Sum(v => v.Data) / totalData
};
}
Anyone have a good way of refactoring this?
Something like this:
public IEnumerable<FooDataItem> GetDataItems<T>(IEnumerable<BarDto> dtos,
Func<BarDto, T> groupCriteria,
Func<BarDto, double> dataSelector,
Func<T, double, double, FooDataItem> resultFactory)
{
var validDtos = dtos.Where(d => groupCriteria(d) != null);
double totalNumber = validDtos.Sum(dataSelector);
return validDtos
.GroupBy(groupCriteria)
.OrderBy(g => g.Sum(dataSelector))
.Select(gr => resultFactory(gr.Key,
gr.Sum(dataSelector),
gr.Sum(dataSelector) / totalNumber));
}
In your example, you might call it like this:
GetDataItems(
x => x.Color, // the grouping criterion
x => x.Number, // the value criterion
(key, total, pct) =>
new FooDataItem {
Color = key, NumberTotal = total, NumberPercentage = pct });
If you changed FooDataItem
to be more generic, it would be easier.
I wouldn't use the query syntax for this, use the method chain.
public IEnumerable<FooDataItem> GetDataItems(IEnumerable<BarDto> dtos, Func<BarDto, object> key, Func<BarDto, object> data)
{
double totalData = dtos.Where(d => key(d) != null).Sum(data);
return dtos.Where(d => key(d) != null)
.GroupBy(key)
.OrderBy(d => d.Sum(data))
.Select(
o => new FooDataItem()
{
Key = o.Key,
Total = o.Sum(data),
Percentage = o.sum(data) / totalData
});
}
(written without compiler etc).
Personally, I wouldn't refactor it though, as this would make the code less read- and understandable.
You would need to switch from query expression and convert all of your where, group by, order by, and select clauses into lambdas. You could then create a function that accepts each of them as parameters. Here's an example:
private static IEnumerable<FooDataItem> GetData<T>(IEnumerable<Foo> foos, Func<Foo, bool> where, Func<Foo, T> groupby, Func<IGrouping<T, Foo>, T> orderby, Func<IGrouping<T, Foo>, FooDataItem> select)
{
var query = foos.Where(where).GroupBy(groupby).OrderBy(orderby).Select(select);
return query;
}
Based on this code
class Foo
{
public int Id { get; set; }
public int Bar { get; set; }
}
...
List<Foo> foos = new List<Foo>(); // populate somewhere
Func<Foo, bool> where = f => f.Id > 0;
Func<Foo, int> groupby = f => f.Id;
Func<IGrouping<int, Foo>, int> orderby = g => g.Sum(f => f.Bar);
Func<IGrouping<int, Foo>, FooDataItem> select = g => new FooDataItem { Key = g.Key, BarTotal = g.Sum(f => f.Bar) };
var query = GetData(foos, where, groupby, orderby, select);
I think that if you refactored this, it would be more difficult to read than what you have already. Anything I can think of either involves dynamic Linq, or modifying or encapsulating BarDto to have some sort of a specialized item to be used for nothing but grouping.
Here is an extension method which factors out the similar portions of each query:
public static IEnumerable<TDataItem> GetDataItems<TData, TDataItem>(
this IEnumerable<BarDto> dtos,
Func<BarDto, TData> dataSelector,
Func<BarDto, double> numberSelector,
Func<TData, double, double, TDataItem> createDataItem)
where TData : class
{
var eligibleDtos = dtos.Where(dto => dataSelector(dto) != null);
var totalNumber = eligibleDtos.Sum(numberSelector);
return
from dto in eligibleDtos
group dto by dataSelector(dto) into dtoGroup
let groupNumber = dtoGroup.Sum(numberSelector)
orderby groupNumber descending
select createDataItem(dtoGroup.Key, groupNumber, groupNumber / totalNumber);
}
You would use it like this:
var itemsByName = dtos.GetDataItems(
dto => dto.Name,
dto => dto.Data,
(name, groupTotal, groupPercentage) => new FooDataItem
{
Name = name,
NumberTotal = groupTotal,
NumberPercentage = groupPercentage
});
var itemsByColor = dtos.GetDataItems(
dto => dto.Color,
dto => dto.Number,
(color, groupTotal, groupPercentage) => new FooDataItem
{
Color = color,
DataTotal = groupTotal,
DataPercentage = groupPercentage
});
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With