I've got the following situation:
There are two related types. For this question, I'll use the following simple types:
public class Person
{
    public Guid Id { get; set; }
    public int status { get; set; }
}

public class Account
{
    public Guid AccountId { get; set; }
    public decimal Amount { get; set; }
    public Guid PersonId { get; set; }
}
So one Person might have multiple Accounts (i.e., multiple Accounts would reference the same PersonId).
In our database, there are tens of thousands of persons, and each has 5-10 accounts on average.
I need to retrieve each person's accounts, assuming they fulfill certain requirements. Afterwards, I need to see if all of this person's accounts, together, fulfill another condition.
In this example, let's say I need every account with Amount < 100, and that after retrieving one person's accounts, I need to check if their sum is larger than 1000.
Using a LINQ query is desirable, but it can't be done with the group-by-into keywords, because the LINQ provider (LINQ-to-CRM) doesn't support them.
In addition, the following simple LINQ query, which would implement the requirements above, is also not possible (please read the inline comments):
var query = from p in personList
            join a in accountList on p.Id equals a.PersonId
            where a.Amount < 100
            select a;

var groups = query.GroupBy(a => a.PersonId);
// and now, process the groups in batches of x
// (let x be a number of groups that won't cause an out-of-memory exception)
It is not possible for 2 reasons:
a. The LINQ provider forces a call to ToList() before using GroupBy().
b. Actually calling ToList() before GroupBy() results in an out-of-memory exception, since there are tens of thousands of accounts.
For efficiency reasons, I don't want to do the following, since it means tens of thousands of retrievals:
a. Retrieve all persons.
b. Loop through them and retrieve each person's accounts on each iteration.
I'd be glad to hear any efficient ideas.
I would suggest ordering the query by PersonId, switching to LINQ to Objects via AsEnumerable() (thus executing it, but without materializing the whole result set in memory as a ToList() call would), and then using the GroupAdjacent method from the MoreLINQ package:
This method is implemented by using deferred execution and streams the groupings. The grouping elements, however, are buffered. Each grouping is therefore yielded as soon as it is complete and before the next grouping occurs.
var query = from p in personList
            join a in accountList on p.Id equals a.PersonId
            where a.Amount < 100
            orderby a.PersonId
            select a;

var groups = query.AsEnumerable()
    .GroupAdjacent(a => a.PersonId)
    .Where(g => g.Sum(a => a.Amount) > 1000);
The AsEnumerable() trick definitely works well with the EF query provider. Whether it works with the LINQ to CRM provider depends on how the provider implements the GetEnumerator() method: if it buffers the whole query result anyway, then you are out of luck.
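If adding the MoreLINQ package is not an option, the same streaming idea can be sketched by hand. The following is only an illustration, not MoreLINQ's actual implementation: GroupConsecutive is a hypothetical helper name, and the in-memory sequence stands in for the ordered, pre-filtered result that the provider would stream. It assumes the input is already sorted by PersonId, so that each person's accounts arrive next to each other.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Account
{
    public Guid AccountId { get; set; }
    public decimal Amount { get; set; }
    public Guid PersonId { get; set; }
}

public static class AdjacentGrouping
{
    // Hypothetical helper (not part of MoreLINQ): yields one group per run of
    // consecutive elements with the same key. Only the current run is buffered.
    public static IEnumerable<KeyValuePair<TKey, List<T>>> GroupConsecutive<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> current = null;
        TKey currentKey = default(TKey);

        foreach (var item in source)
        {
            var key = keySelector(item);
            if (current != null && !comparer.Equals(key, currentKey))
            {
                // The key changed, so the previous run is complete.
                yield return new KeyValuePair<TKey, List<T>>(currentKey, current);
                current = null;
            }
            if (current == null)
            {
                current = new List<T>();
                currentKey = key;
            }
            current.Add(item);
        }

        // Emit the last run, if the source was not empty.
        if (current != null)
            yield return new KeyValuePair<TKey, List<T>>(currentKey, current);
    }
}

public static class Program
{
    public static void Main()
    {
        var alice = Guid.NewGuid();
        var bob = Guid.NewGuid();

        // Stand-in for the provider result: already filtered (Amount < 100)
        // and ordered so that each person's accounts are adjacent.
        var accounts = Enumerable.Repeat(alice, 11)
            .Concat(Enumerable.Repeat(bob, 3))
            .Select(id => new Account { AccountId = Guid.NewGuid(), PersonId = id, Amount = 99m });

        var qualifying = accounts
            .GroupConsecutive(a => a.PersonId)
            .Where(g => g.Value.Sum(a => a.Amount) > 1000m)
            .Select(g => g.Key)
            .ToList();

        Console.WriteLine(qualifying.Count);        // 1 (11 * 99 = 1089 > 1000; 3 * 99 = 297)
        Console.WriteLine(qualifying[0] == alice);  // True
    }
}
```

Because each group is yielded as soon as the key changes, memory use is bounded by the largest single person's accounts rather than by the whole result set, provided the underlying enumerator itself streams.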