Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recommend a fitting LINQ provider to me (SQL server, complex queries)

I've been using LINQ to SQL & to entities for a while and am overall very happy with them. However i know of their limitations and one in particular is becoming a large issue for me. When you do a complex nested query in the form of

MyContext.SomeTable
.Select(item=>new{
    item.SomeProperty1,
    item.SomeProperty2,
    item.NavigationProperty1
        .Select(nav1=> new {// retrieve some properties}), // This triggers a single query as long as don't have more than one subquery
    item.NavigationProperty2
        .Select(nav2=> new {// retrieve some properties}) // This triggers one query PER ROW in the original query
});

The providers i have tested are LINQ TO SQL / LINQ TO entities (and even worse, devart LINQConnect that fares worse and generates 1 per row on the first navigation property)

What i get now that is generated(pseudocode):

select t1.a,t1.b,t2.c,t2.d from mytable as t1
join navproperty1table as t2

and 1 millions (if there is 1 million results in the first set) of queries like this: select t3.e,t3.f from navproperty2table as t3 where id = X (X changing on X query to next element returned by first query)

What i want:

select t1.a,t1.b,t2.c,t2.d,t3.e,t3.f from mytable as t1 
join navproperty1table as t2
join navproperty2table as t3

Now of course if there were 3 rows in the original table it wouldn't be an issue, but i have 10s of thousands to millions of rows in my tables "and" i need a much much much more complex query in a single select (i want to get a complex graph at once). Think 20 + tables with 3-6 levels of nesting accessing an additional 2-5 tables each.

My SQL server can perfectly cope with it, i don't care for the bandwidth either, it's on an instance linked by a gigabit connection, i can't get that data in deferred manner, i actually "use" all of it immediately so it's not just laziness. Right now for performance reasons i had to split the query in many small queries and join them manually on the LINQ to object size, which gives some really nasty code for whoever maintains it but was the only actual solution i had, so overall including all the small queries and final joining, I'm at over 600 lines of unsplitable code in a single method that is totally unmaintainable.

Are there actually "any" LINQ providers production ready today before i go and evaluated them all that work in such a mindset or am i better off coding and commercializing my own? (I'm very surprised that they don't all work that way actually, i can't see a single instance where you'd be better off with the foreach case and the ones i've tried that claim to get rid of n+1 with loadwith, don't get rid of it as they still do n+1 queries but just batch it in a single call, 1 round trip & n+1 queries isn't satisfying when 1 is 10 000 then 10 000 000 and then 10 000 000 000)

  • (note that I'm speculating on what exactly triggers this, but it isn't the question, no matter what triggers this "exactly" I'm sure to hit it in my current context)

PS: Note that I'm running .NET 4.0 full profile on a windows server 2008 or higher and on SQL server 2008 or higher, a provider that doesn't support anything else would be fine, i have zero requirements for migration, portability, lower .net versions, lower sql server support etc. Migrating to even more recent versions is an option if required. I also don't have any prerequisites for modeling or advanced features, the DB is already there, i only want to query tables, so something with no modeling / views / DML / stored procedure / functions support is fine, my one and only requirement is sensible SQL generation on complex queries and object graphs

EDIT: for clarification here is an actual example of the issue on a DB everyone can get, adventureworks

Querying employees for each contact

Contacts
.Select(cont=>new 
{
    cont.EmailAddress,
    cont.EmailPromotion,
    Employees = cont.Employees
        .Select(emp=>new
        {
            emp.Gender,
            emp.HireDate
        }).ToList()
}).ToList()

Generates

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]

FROM [Person].[Contact] AS [t0] LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID] ORDER BY [t0].[ContactID], [t1].[EmployeeID]

Now querying just vendors for each Contact Contacts .Select(cont=>new { cont.EmailAddress, cont.EmailPromotion, Vendors = cont.VendorContacts.Select(vend=>new { vend.ContactTypeID, vend.ModifiedDate }).ToList() }).ToList()

still ok:

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[ContactTypeID], [t1].[ModifiedDate], (
SELECT COUNT(*)
FROM [Purchasing].[VendorContact] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]

FROM [Person].[Contact] AS [t0] LEFT OUTER JOIN [Purchasing].[VendorContact] AS [t1] ON [t1].[ContactID] = [t0].[ContactID] ORDER BY [t0].[ContactID], [t1].[VendorID]

Now querying both at once (triggers X row query)

Contacts
.Select(cont=>new 
{
    cont.EmailAddress,
    cont.EmailPromotion,
    Employees = cont.Employees
        .Select(emp=>new
        {
            emp.Gender,
            emp.HireDate
        }).ToList(),
    Vendors = cont.VendorContacts.Select(vend=>new
    {
        vend.ContactTypeID,
        vend.ModifiedDate
    }).ToList()
}).ToList()

Generates the ugly and slow (not pasting it all for obvious reasons but you get the point):

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value], [t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
ORDER BY [t0].[ContactID], [t1].[EmployeeID]
GO

-- Region Parameters
DECLARE @x1 Int = 1
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

 -- Region Parameters
DECLARE @x1 Int = 2
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 3
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 4
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 5
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 6
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 7
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 8
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 9
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 10
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

What i expect / would like to see generated:

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], [t2].[ContactTypeID], [t2].[ModifiedDate] ,[t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
LEFT OUTER JOIN [Purchasing].[VendorContact] AS [t2] ON [t2].[ContactID] = [t0].[ContactID]
GO
like image 770
Ronan Thibaudau Avatar asked Nov 13 '22 10:11

Ronan Thibaudau


1 Answers

A workaround would be to create a view

from your definition

select t1.a,t1.b,t2.c,t2.d,t3.e,t3.f from mytable as t1 join navproperty1table as t2 join navproperty2table as t3

and use linq-2-sql to query that view.

Not sure if I understand your query completely, but uou might just do

from x in MyContext.Sometable
Select new { x.a, x.b, x.t2.c, x.t2.d, x.t3.f } 

and so on.. I cannot test it right now but I am pretty sure yhis will create the select (and only one) you want.

like image 76
Pleun Avatar answered Mar 14 '23 18:03

Pleun