
Split List to Categories using Linq

Tags:

c#

linq

I have a class that is used to read data from and write data to the database. Example data:

Id  Amount  ProductName      RecordDateTime
1   2       Fresh Apples     23/2/2021
2   3       Sweet Bananas    13/6/2021
3   1       Yellow Bananas   12/7/2021
4   7       Green Apples     31/5/2021
5   9       Juicy Apples     12/9/2021
6   4       Young Potato's   5/2/2021
7   5       Orange Carrots   4/6/2021

Class:

public class LogModel
{
    public int Id { get; set; }

    public int Amount { get; set; }

    public string ProductName { get; set; }

    public DateTime RecordDateTime { get; set; }
}

Then I have another class that is used for user customization, so that users can build their own category structure. It is shown as a treeview with a user-built structure like:

   - Categories
       - Fruits
         - Apples | Bananas
       - Vegetables
         - Potato's | Carrots

Here is the class:

  public class CategoryModel : BaseViewModel
  {
    private ObservableCollection<CategoryModel> categoryItems;
    public ObservableCollection<CategoryModel> CategoryItems
    {
      get => this.categoryItems;
      set
      {
        this.categoryItems = value;
        this.OnPropertyChanged();
      }
    }

    private string itemName;
    public string ItemName
    {
      get => this.itemName;
      set
      {
        this.itemName = value;
        this.OnPropertyChanged();
      }
    }
   }

I need to create a new list that assigns each record from the database to its own category, based on string.Contains after splitting by |. I have managed to write a Linq method that fetches only the needed data from the DB. How do I actually assign the data to each category? There is no need to match my structure; a basic Linq example would be enough.

public List<LogModel> GetCategorySplit(DateTime startDate, DateTime endDate)
{
  using (var db = new SQLDBContext.SQLDBContext())
  {
    List<LogModel> result = db.LogModel
    .Where(w => w.RecordDateTime >= startDate.Date && w.RecordDateTime.Date <= endDate.Date && w.Amount != 0)
    .Select(
      s => new LogModel
      {
        ProductName = s.ProductName,
        Amount = s.Amount,
      })
    .ToList();

    // I have tried some foreach loop after that, but it is not so clear in my head how this process 
    // should go overall

    //foreach (var item in result)
    //{
    //  if (ExtensionMethods.StringContains(item.ProductName, value))
    //  {

    //  }
    //}

    return result;
  }
}

Expected Output:

Fruits          22
Vegetables      9

P.S. Please comment if something is unclear, I am quite new here.

10101 asked Feb 16 '26 00:02


2 Answers

In order to be able to test my solution I've used in-memory data structures.

var dataSource =  new List<LogModel>
{
    new LogModel { Id = 1, Amount =  2, ProductName = "Fresh Apples", RecordDateTime = new DateTime(2021, 02, 23)},
    new LogModel { Id = 2, Amount =  3, ProductName = "Sweet Bananas", RecordDateTime = new DateTime(2021, 06, 13)},
    new LogModel { Id = 3, Amount =  1, ProductName = "Yellow Bananas", RecordDateTime = new DateTime(2021, 07, 12)},
    new LogModel { Id = 4, Amount =  7, ProductName = "Green Apples", RecordDateTime = new DateTime(2021, 05, 31)},
    new LogModel { Id = 5, Amount =  9, ProductName = "Juicy Apples", RecordDateTime = new DateTime(2021, 09, 12)},
    new LogModel { Id = 6, Amount =  4, ProductName = "Young Potato's", RecordDateTime = new DateTime(2021, 02, 05)},
    new LogModel { Id = 7, Amount =  5, ProductName = "Orange Carrots", RecordDateTime = new DateTime(2021, 06, 04)}
};

var categories = new List<CategoryModel>
{
    new CategoryModel
    {
        ItemName = "Fruits",
        CategoryItems = new ObservableCollection<CategoryModel>
        {
            new CategoryModel {ItemName = "Apples"},
            new CategoryModel {ItemName = "Bananas"}
        }
    },
    new CategoryModel
    {
        ItemName = "Vegetables",
        CategoryItems = new ObservableCollection<CategoryModel>
        {
            new CategoryModel {ItemName = "Potato's"},
            new CategoryModel {ItemName = "Carrots"}
        }
    }
};

The grouping logic can be written like this:

var topLevelQuantities = new Dictionary<string, int>();
foreach (var topLevelCategory in categories)
{
    var filters = topLevelCategory.CategoryItems.Select(leafLevel => leafLevel.ItemName);
    var count = dataSource.Where(log => filters.Any(filter => log.ProductName.Contains(filter)))
        .Sum(log => log.Amount);
    topLevelQuantities.Add(topLevelCategory.ItemName, count);
}

  • Here I iterate through the top-level categories: foreach (var topLevelCategory in categories)
    • I assumed that the graph depth is 2 (so there are only top-level and leaf-level entities)
  • Then I gather the leaves' ItemNames into filters
  • I filter the data based on those filters: dataSource.Where( ... filters.Any(...))
    • And finally I calculate the Sum of the filtered LogModels' Amount
  • To be able to pass the accumulated data to another function or layer, I've used a Dictionary<string, int>

I've used the following command to examine the result:

foreach (var (categoryName, categoryQuantity) in topLevelQuantities.Select(item => (item.Key, item.Value)))
{
    Console.WriteLine($"{categoryName}: {categoryQuantity}");
}

NOTES

  • Please bear in mind that this solution was designed against in-memory data, that is, after the data has been fetched from the database. (If you want to perform this at the database level, that requires a different type of solution.)
  • Please also bear in mind that this solution iterates over the dataSource once per top-level category. So if the dataSource is very large (or there are lots of top-level categories), then the solution might not perform well.
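
For the second note, a possible single-pass variant is sketched below. It is only an illustration, not part of the solution above: it uses value tuples as simplified stand-ins for LogModel and for the two-level category tree, and it assumes that a log should count only for the first leaf filter it matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the fetched LogModel rows (assumption).
var logs = new List<(string ProductName, int Amount)>
{
    ("Fresh Apples", 2), ("Sweet Bananas", 3), ("Yellow Bananas", 1),
    ("Green Apples", 7), ("Juicy Apples", 9), ("Young Potato's", 4),
    ("Orange Carrots", 5)
};

// The category tree flattened once into (leaf filter, top-level category) pairs.
var filters = new List<(string Leaf, string Category)>
{
    ("Apples", "Fruits"), ("Bananas", "Fruits"),
    ("Potato's", "Vegetables"), ("Carrots", "Vegetables")
};

// The logs are enumerated once. Each log is paired with the first filter
// it matches (Take(1)); logs that match no filter produce no pair.
Dictionary<string, int> totals = logs
    .SelectMany(log => filters
        .Where(f => log.ProductName.Contains(f.Leaf))
        .Take(1)
        .Select(f => (f.Category, log.Amount)))
    .GroupBy(x => x.Category, x => x.Amount)
    .ToDictionary(g => g.Key, g => g.Sum());

foreach (var pair in totals)
{
    Console.WriteLine($"{pair.Key}: {pair.Value}");
}
```

With the sample data this yields 22 for Fruits and 9 for Vegetables. Each log is still compared against the filters, so the cost is roughly (number of logs) x (number of leaf filters), but the log list itself is enumerated only once.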

UPDATE: When the depth of the category hierarchy is 3.

If we can assume that there can be only 1 top-level entity then we should not need to change too much:

var rootCategory = new CategoryModel
{
    ItemName = "Categories",
    CategoryItems = new ObservableCollection<CategoryModel>
    {
        new CategoryModel
        {
            ItemName = "Fruits",
            CategoryItems = new ObservableCollection<CategoryModel>
            {
                new CategoryModel {ItemName = "Apples"},
                new CategoryModel {ItemName = "Bananas"}
            }
        },
        new CategoryModel
        {
            ItemName = "Vegetables",
            CategoryItems = new ObservableCollection<CategoryModel>
            {
                new CategoryModel {ItemName = "Potato's"},
                new CategoryModel {ItemName = "Carrots"}
            }
        }
    }
};

var midLevelQuantities = new Dictionary<string, int>();
foreach (var midLevelCategory in rootCategory.CategoryItems)
{
  ...
}

If there can be multiple top level categories:

var categories = new List<CategoryModel>
{
    new CategoryModel
    {
        ItemName = "Categories",
        CategoryItems = new ObservableCollection<CategoryModel>
        {
            new CategoryModel
            {
                ItemName = "Fruits",
                CategoryItems = new ObservableCollection<CategoryModel>
                {
                    new CategoryModel {ItemName = "Apples"},
                    new CategoryModel {ItemName = "Bananas"}
                }
            },
            new CategoryModel
            {
                ItemName = "Vegetables",
                CategoryItems = new ObservableCollection<CategoryModel>
                {
                    new CategoryModel {ItemName = "Potato's"},
                    new CategoryModel {ItemName = "Carrots"}
                }
            }
        }
    }
};

then we need to use recursive graph traversal.
I've introduced the following helper class to store the calculations' result:

public class Report
{
    public string Name { get; }
    public string Parent { get; }
    public double Quantity { get; }

    public Report(string name, string parent, double quantity)
    {
        Name = name;
        Parent = parent;
        Quantity = quantity;
    }
}

The traversal can be implemented like this:

private static List<Report> GetReports(CategoryModel category, string parent, List<Report> summary)
{
    if (category.CategoryItems == null || category.CategoryItems.Count == 0)
    {
        var count = dataSource.Where(log => log.ProductName.Contains(category.ItemName)).Sum(log => log.Amount);
        summary.Add(new Report(category.ItemName, parent, count));
        return summary;
    }

    foreach (var subCategory in category.CategoryItems)
    {
        summary = GetReports(subCategory, category.ItemName, summary);
    }

    var subTotal = summary.Where(s => s.Parent == category.ItemName).Sum(s => s.Quantity);
    summary.Add(new Report(category.ItemName, parent, subTotal));
    return summary;
}

  • In the if block we handle the case where we are at the leaf level
    • That's where we perform the queries
  • In the foreach block we iterate through all of the category's children and call the same function recursively
  • After the foreach loop we calculate the current category's subTotal by aggregating its children's Quantities
Please note: This design assumes that the category names are unique.

To display the results we need yet another recursive function:

private static void DisplayResult(int depth, IEnumerable<CategoryModel> categories, List<Report> report)
{
    foreach (var category in categories)
    {
        var indentation = new string('\t', depth);
        var data = report.Single(r => r.Name == category.ItemName);
        Console.WriteLine($"{indentation}{data.Name}: {data.Quantity}");

        if (category.CategoryItems == null || category.CategoryItems.Count == 0)
            continue;

        DisplayResult(depth + 1, category.CategoryItems, report);
    }
}

  • Based on the category's level (depth) we calculate the indentation
  • We look up the corresponding report based on the ItemName property of the category
    • We print out the report
  • If the current category does not have subcategories, then we move on to the next item
    • Otherwise we call the same function recursively for the subcategories

Let's put everything together:

var report = new List<Report>();
foreach (var category in categories)
{
    var subReport = GetReports(category, "-", new List<Report>());
    report.AddRange(subReport);
}

DisplayResult(0, categories, report);

The output will be:

Categories: 31
        Fruits: 22
                Apples: 18
                Bananas: 4
        Vegetables: 9
                Potato's: 4
                Carrots: 5
Peter Csala answered Feb 18 '26 14:02


So you have a displayed sequence of CategoryModels, with ItemNames like "Fruits", "Vegetables", etc.

Every CategoryModel has some sub-CategoryModels in property CategoryItems. In your example, CategoryModel "Fruits" has CategoryItems with ItemNames like "Apples" and "Bananas". "Vegetables" has sub-category ItemNames like "Potato's" and "Carrots".

You forgot to tell us, can a CategoryItem with ItemName "Apples" also have sub-categories, so you have sub-sub-categories? And can these sub-sub-categories have more sub-sub-sub-categories?

I need to create a new list that will assign each LogModel from the database to a CategoryModel.

How do you decide which LogModels belong to "Apples"? Is it if LogModel.ProductName contains "Apples"?

Apparently you need to extend class LogModel with a method that says: "Yes, a Juicy Apple is an Apple", or "No, a Tasty Pomodoro is not an Apple", and neither is a "Granny Smith".

By the way: do you see the flaw in your requirement: why is a "Granny Smith" not an apple, like a "Juicy Apple"?

But let's design for change: if you later want to change how you match a LogModel with a CategoryModel, changes will be small.

Luckily, you already fetched the LogModels that you want to process from the database, so we can work "AsEnumerable", instead of "AsQueryable". This gives us more freedom in the functions that we can use.

public static bool IsMatch(LogModel log, CategoryModel category)
{
    // TODO: compare log with category, and decide whether it is a match
    throw new NotImplementedException();
}

Usage will be as follows (add this before the first parameter if you want to call it as if IsMatch were a method of LogModel):

LogModel log = ...
CategoryModel category = ...
bool isMatch = IsMatch(log, category);

Your current requirement seems to be that a log matches a category if the ProductName of the log contains the ItemName of the category. So "Juicy Apples" matches because it contains "Apples" (I'm not sure whether you want case sensitivity):

public static bool IsMatch(LogModel log, CategoryModel category)
{
    // TODO: decide what to do if log == null, category == null. 
    // TODO: decide what to do if log.ProductName == null or category.ItemName == null

    return log.ProductName.Contains(category.ItemName, StringComparison.CurrentCulture);
}

If you later decide not to match on names but, for instance, on a FruitType, then the changes will only have to be made in this method, nowhere else.
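
If the match should ignore case, the comparison could look like the sketch below. The helper name ContainsIgnoreCase is hypothetical, and treating null or empty input as "no match" is an assumption about what you want, not a requirement from the question.

```csharp
using System;

// Hypothetical helper: true when productName contains itemName, ignoring case.
// Null or empty input is treated as "no match" (an assumption).
static bool ContainsIgnoreCase(string productName, string itemName) =>
    !string.IsNullOrEmpty(productName)
    && !string.IsNullOrEmpty(itemName)
    && productName.IndexOf(itemName, StringComparison.OrdinalIgnoreCase) >= 0;

Console.WriteLine(ContainsIgnoreCase("Juicy apples", "Apples"));  // True
Console.WriteLine(ContainsIgnoreCase("Granny Smith", "Apples"));  // False
```

IndexOf with a StringComparison argument is used here because it is available on both .NET Framework and newer runtimes. The second call also shows the limitation mentioned earlier: a purely name-based match cannot know that a "Granny Smith" is an apple.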

Why is fruits 22, and vegetables 9?

Well, fruits has apples and bananas, and the sum of all matching apples (according to the method defined above) and all matching bananas is 22. Similarly: vegetables is potato's and carrots and the sum of all matching potato's (4) and matching carrots (5) is 9

So for every Category, we take all SubCategories, and find the LogModels that match. We sum all Amounts of the matching LogModels.

So as a matter of fact, you would like to extend class CategoryModel with a method that takes all fetched LogModels and returns the Amount of items. Something like this:

class CategoryModel
{
    ...

    public int GetAmount(IEnumerable<LogModel> logModels)
    {
        ...
    }
}

If you don't want or if you cannot add this method to CategoryModel, you can always create an extension method. For those who are not familiar with extension methods, see Extension Methods Demystified

public static int GetAmount(this CategoryModel category, IEnumerable<LogModel> logModels)
{
    // TODO: exception if items null or empty

    // Sum the amounts of all logs that match this category directly.
    int amount = logModels.Where(log => IsMatch(log, category))
                          .Select(log => log.Amount)
                          .Sum();

    // Add all amounts of the subCategories:
    if (category.CategoryItems != null && category.CategoryItems.Count != 0)
    {
        amount += category.CategoryItems
                  .Select(categoryItem => categoryItem.GetAmount(logModels))
                  .Sum();
    }
    return amount;
}

Nota bene: this method uses recursion, so it works even if you have sub-sub-sub-... categories.

Usage:

var fetchedLogModels = db.LogModels.Where(...).Select(...).ToList();
IEnumerable<CategoryModel> categories = ... 

The categories are "Fruits", "Vegetables", etc.

var result = categories.Select(category => new
{
    CategoryName = category.ItemName,
    Amount = category.GetAmount(fetchedLogModels),
});

Well, doesn't that seem like a nice piece of code? Every difficulty is hidden somewhere deep inside, and the code is easy to change and to unit test. If you want to change how you determine a match, or make GetAmount non-recursive, or turn it into a method of CategoryModel, none of the callers have to change.

Harald Coppoolse answered Feb 18 '26 14:02


