Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ TO SQL GROUP BY HAVING generated only as subquery

according to all samples, the SQL conversion to LINQ for HAVING clause as in this example:

SELECT NAME
FROM TABLES
GROUP BY NAME
HAVING COUNT(*) > 1

is: (vb.net)

from t in tables 
group t by t.NAME into g = Group 
where g.count > 1 
select g

BUT the above LINQ statement is translated to the following SQL:


SELECT [t1].[NAME] AS [Key]
FROM (
    SELECT COUNT(*) AS [value], [t0].[NAME]
    FROM [tables] AS [t0]
    WHERE [t0].[NAME] <> @p0
    GROUP BY [t0].[NAME]
    ) AS [t1]
WHERE [t1].[value] > @p1

I never succeed in generating HAVING clause form LINQ. Let's say HAVING and subquery group by with WHERE are equivalent in results, but is there a difference in performance? what about keeping my original SQL queries at least SIMILAR to the ones LINQ generates underlying?

like image 886
holy Avatar asked Oct 09 '22 16:10

holy


1 Answers

You can demonstrate whether or not the two queries in question are executed in the same fashion by SQL Server by viewing the Execution Plans. Take the following test code (I used SQL Server 2008):

CREATE TABLE #TABLES ([ID] INT IDENTITY, [Name] VARCHAR(30))
INSERT INTO #TABLES VALUES('A')
INSERT INTO #TABLES VALUES('A')
INSERT INTO #TABLES VALUES('B')
INSERT INTO #TABLES VALUES('C')
INSERT INTO #TABLES VALUES('D')

SELECT NAME
FROM #TABLES
WHERE [Name] <> 'D'
GROUP BY NAME
HAVING COUNT(*) > 1

SELECT [t1].[NAME]
FROM (
    SELECT COUNT(*) AS [value], [t0].[NAME]
    FROM [#TABLES] AS [t0]
    WHERE [t0].[NAME] <> 'D'
    GROUP BY [t0].[NAME]
    ) AS [t1]
WHERE [t1].[value] > 1

DROP TABLE #TABLES

Executing these queries from SQL Query Analyzer with "Include Actual Execution Plan" selected under the "Query" menu will yield the following output:

execution plans

In this case, since the generated query plans are exactly the same, it certainly appears that there should be no performance difference between your SQL and the SQL generated from your LINQ statement.

As a side note, unfortunately, I couldn't find any documentation on why LinqToSql doesn't make use of HAVING, or if using HAVING vs. using a sub select yields a performance gain one way or the other. If I had to guess I'd say the query optimizer in SQL Server internally turns these statements into the same query before they are executed, which is why the Execution Plans are identical for both statements. Whether or not my previous statement is true, I would say that if you're ever in doubt, just check out the Execution Plans for your SQL version versus the LinqToSql version. If they are the same then you've got nothing to worry about performance-wise. If your SQL version is much more efficient, then you could always write a stored procedure and just call the stored procedure using LinqToSql.

like image 124
rsbarro Avatar answered Oct 12 '22 22:10

rsbarro