I have classes like this:

public class Category
{
    public int CategoryID { get; set; }
    public List<Product> ProductList { get; set; }
}

public class Product
{
    public int CategoryID { get; set; }
}

My List<Category> has around 15k rows and my List<Product> around 40k rows. I'm matching them up with LINQ like this:
CategoryList.ForEach(i =>
{
i.ProductList = ProductList.Where(x => x.CategoryID == i.CategoryID).ToList();
});
I want to know whether there is a better-performing way to implement this. The amount of data could increase or decrease in other cases.
I think it's silly to use parallelism when a simple change to the algorithm can speed it up a thousandfold. Assuming CategoryID has a decent implementation of GetHashCode(), a dictionary/lookup has nearly O(1) lookup time, compared with the O(n) time of scanning through a list.
Two possible approaches:
Assuming categories have unique IDs, you can use:
var categoryById = categories.ToDictionary(c => c.CategoryID);
foreach(var product in products)
{
var category = categoryById[product.CategoryID];
category.ProductList.Add(product);
}
This has runtime O(products.Count + categories.Count) and uses O(categories.Count) memory.
If the categories don't start out with an empty ProductList, you might need to create it first.
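A minimal sketch of that initialization step, reusing the `categories` list from the snippet above:

```csharp
// Sketch: give each category an empty ProductList before filling it,
// so the Add calls in the loop above don't hit a null reference.
foreach (var category in categories)
{
    category.ProductList = new List<Product>();
}
```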
var productsByCategory = products.ToLookup(product => product.CategoryID);
foreach(var category in categories)
{
category.ProductList = productsByCategory[category.CategoryID].ToList();
}
This has runtime O(products.Count + categories.Count) and uses O(products.Count) memory.
Since there are typically more products than categories, this approach requires more memory. On the other hand, the lookup might remove the need to embed a list of products in the category object at all.
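Here is a self-contained sketch of the lookup approach (the class shapes and sample data are hypothetical). A useful property of ILookup is that indexing a missing key returns an empty sequence, so categories without products end up with an empty list instead of throwing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product { public int CategoryID; public string Name; }
public class Category { public int CategoryID; public List<Product> ProductList; }

public static class Demo
{
    public static List<Category> Build()
    {
        var products = new List<Product>
        {
            new Product { CategoryID = 1, Name = "A" },
            new Product { CategoryID = 1, Name = "B" },
            new Product { CategoryID = 2, Name = "C" },
        };
        var categories = new List<Category>
        {
            new Category { CategoryID = 1 },
            new Category { CategoryID = 2 },
            new Category { CategoryID = 3 }, // no products for this one
        };

        // Group the products once, then assign each category its slice.
        var productsByCategory = products.ToLookup(p => p.CategoryID);
        foreach (var category in categories)
        {
            // Missing keys yield an empty sequence, not an exception.
            category.ProductList = productsByCategory[category.CategoryID].ToList();
        }
        return categories;
    }

    public static void Main()
    {
        var categories = Build();
        Console.WriteLine(categories[0].ProductList.Count); // 2
        Console.WriteLine(categories[2].ProductList.Count); // 0
    }
}
```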
You can use the parallel implementation of LINQ, PLINQ. PLINQ usually speeds up a LINQ query because it uses all available cores more efficiently.
CategoryList.AsParallel().ForAll(i =>
{
i.ProductList = ProductList.AsParallel().Where(x => x.CategoryID == i.CategoryID).ToList();
});
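If you do go parallel, the nested AsParallel() on the inner Where often adds overhead for little gain, since the outer ForAll already saturates the cores. A sketch that parallelizes only the outer loop, using the same CategoryList/ProductList names as the question, with WithDegreeOfParallelism (a real PLINQ option) to cap core usage:

```csharp
// Sketch: parallelize only the outer loop; the inner scan stays sequential.
// WithDegreeOfParallelism caps how many cores PLINQ may use at once.
CategoryList.AsParallel()
    .WithDegreeOfParallelism(Environment.ProcessorCount)
    .ForAll(i =>
    {
        i.ProductList = ProductList.Where(x => x.CategoryID == i.CategoryID).ToList();
    });
```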