Basically what I'm trying to do is take a list of objects and filter it based on a number of criteria where one of the criteria is that the key doesn't exist in another list. Here's an example:
My two classes are similar to this:
public class Test
{
public string name;
public string instructor_name;
public string course;
}
public class Appointment
{
public string site;
public DateTime forWhen;
public string testName;
}
I want to sort through a List<Test> by looking at the course and making sure that test doesn't exist in the List<Appointment>. In SQL I'd do it something like this:
SELECT new Group<Test>(c.Key, c)
FROM tests in testList
WHERE tests.Course != "Science"
AND tests.name NOT IN (SELECT testName FROM appotList)
However, I can't figure out how I would do this in LINQ. Any ideas?
If you're talking about performing client-side filtering, that's pretty easy with LINQ to Objects. Something like this:
List<Test> tests = ...;
List<Appointment> appts = ...;
var query = tests.Except(
tests.Join(appts, t => t.name, a => a.testName, (t, a) => t));
The following is slightly simpler to read:
var query = tests.Where(t => !appts.Any(a => a.testName == t.name));
But the first version will be faster, since the Join
function will compute a hash table of the matches rather than doing a linear search of the appts
list for every element in tests
.
http://introducinglinq.com/blogs/marcorusso/archive/2008/01/14/the-not-in-clause-in-linq-to-sql.aspx
Consider this code that returns all the customers who don't have an order in the Orders table. This is one SQL query that returns that value.
SELECT *
FROM [dbo].[Customers] AS [t0]
WHERE [t0].[CustomerID] NOT IN (
SELECT [t1].[CustomerID]
FROM [dbo].[Orders] AS [t1]
)
This is not the faster way to get the desired result (using a NOT EXISTS is the favorite way - more on this shortly). LINQ offers a Contains extension method that allows writing the following code.
NorthwindDataContext dc = new NorthwindDataContext();
dc.Log = Console.Out;
var query =
from c in dc.Customers
where !(from o in dc.Orders
select o.CustomerID)
.Contains(c.CustomerID)
select c;
foreach (var c in query) Console.WriteLine( c );
In LINQ to SQL the query is translated into this SQL code:
SELECT [t0].[CustomerID], [t0].[CompanyName], [t0].[ContactName],
[t0].[ContactTitle], [t0].[Address], [t0].[City],
[t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
WHERE NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[Orders] AS [t1]
WHERE [t1].[CustomerID] = [t0].[CustomerID]
))
This approach is not only semantically equivalent, but also faster in execution. The following is the result with SET STATISTICS IO ON. The first result is for the hand-written query that use the NOT IN clause. The second result is for the LINQ to SQL generated query.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With