I have an issue building a fairly hefty linq query. Basically I have a situation whereby I need to execute a subquery in a loop to filter down the number of matches that are returned from the database. Example code is in this loop below:
foreach (Guid parent in parentAttributes)
{
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
where a.RelatedGUID == parent && userId == pc.CPSGUID
select sc.CPSGUID;
query = query.Where(x => subQuery.Contains(x.Id));
}
When I subsequently call the ToList() on the query variable it appears that only a single one of the subqueries has been performed and I'm left with a bucketful of data I don't require. However this approach works:
IList<Guid> temp = query.Select(x => x.Id).ToList();
foreach (Guid parent in parentAttributes)
{
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
where a.RelatedGUID == parent && userId == pc.CPSGUID
select sc.CPSGUID;
temp = temp.Intersect(subQuery).ToList();
}
query = query.Where(x => temp.Contains(x.Id));
Unfortunately this approach is nasty as it results in multiple queries to the remote database whereby the initial approach if I could get it working would only result in a single hit. Any ideas?
Well, you can just put multiple "where" clauses in directly, but I don't think you want to. Multiple "where" clauses ends up with a more restrictive filter - I think you want a less restrictive one.
There are the following two ways to write LINQ queries using the Standard Query operators, in other words Select, From, Where, Orderby, Join, Groupby and many more. Using lambda expressions. Using SQL like query expressions.
In a LINQ query, the first step is to specify the data source. In C# as in most programming languages a variable must be declared before it can be used. In a LINQ query, the from clause comes first in order to introduce the data source ( customers ) and the range variable ( cust ).
I think you are hitting a special case of capturing the loop variable in the lambda expression used to filter. Also known as an access to modified closure error.
Try this:
foreach (Guid parentLoop in parentAttributes)
{
var parent = parentLoop;
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
where a.RelatedGUID == parent && userId == pc.CPSGUID
select sc.CPSGUID;
query = query.Where(x => subQuery.Contains(x.Id));
}
The problem is capturing the parent
variable in the closure (that the LINQ syntax is converted to), which causes all the subQuery
es to be run with the same parent id.
What happens is the compiler generating a class to hold the delegate and the local variables the delegate accesses. The compiler re-uses the same instance of that class for each loop; and therefore, once the query executes, all of the Where
s executes with the same parent
Guid, namely the last to execute.
Declaring the parent
inside the loop scope causes the compiler to essentially make a copy of the variable, with the correct value, to be captured.
This can be a bit hard to grasp at first, so if this is the first time it has hit you; I'd recommend these two articles for background and a thorough explanation:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With