LINQ TO SQL GROUP BY HAVING generated only as subquery

Question

according to all samples, the SQL conversion to LINQ for HAVING clause as in this example:

SELECT NAME
FROM TABLES
GROUP BY NAME
HAVING COUNT(*) > 1

is: (vb.net)

from t in tables 
group t by t.NAME into g = Group 
where g.count > 1 
select g

BUT the above LINQ statement is translated to the following SQL:

SELECT [t1].[NAME] AS [Key]
FROM (
    SELECT COUNT(*) AS [value], [t0].[NAME]
    FROM [tables] AS [t0]
    WHERE [t0].[NAME] <> @p0
    GROUP BY [t0].[NAME]
    ) AS [t1]
WHERE [t1].[value] > @p1

I never succeed in generating HAVING clause form LINQ. Let's say HAVING and subquery group by with WHERE are equivalent in results, but is there a difference in performance? what about keeping my original SQL queries at least SIMILAR to the ones LINQ generates underlying?

rsbarro · Accepted Answer

You can demonstrate whether or not the two queries in question are executed in the same fashion by SQL Server by viewing the Execution Plans. Take the following test code (I used SQL Server 2008):

CREATE TABLE #TABLES ([ID] INT IDENTITY, [Name] VARCHAR(30))
INSERT INTO #TABLES VALUES('A')
INSERT INTO #TABLES VALUES('A')
INSERT INTO #TABLES VALUES('B')
INSERT INTO #TABLES VALUES('C')
INSERT INTO #TABLES VALUES('D')

SELECT NAME
FROM #TABLES
WHERE [Name] <> 'D'
GROUP BY NAME
HAVING COUNT(*) > 1

SELECT [t1].[NAME]
FROM (
    SELECT COUNT(*) AS [value], [t0].[NAME]
    FROM [#TABLES] AS [t0]
    WHERE [t0].[NAME] <> 'D'
    GROUP BY [t0].[NAME]
    ) AS [t1]
WHERE [t1].[value] > 1

DROP TABLE #TABLES

Executing these queries from SQL Query Analyzer with "Include Actual Execution Plan" selected under the "Query" menu will yield the following output:

execution plans

In this case, since the generated query plans are exactly the same, it certainly appears that there should be no performance difference between your SQL and the SQL generated from your LINQ statement.

As a side note, unfortunately, I couldn't find any documentation on why LinqToSql doesn't make use of HAVING, or if using HAVING vs. using a sub select yields a performance gain one way or the other. If I had to guess I'd say the query optimizer in SQL Server internally turns these statements into the same query before they are executed, which is why the Execution Plans are identical for both statements. Whether or not my previous statement is true, I would say that if you're ever in doubt, just check out the Execution Plans for your SQL version versus the LinqToSql version. If they are the same then you've got nothing to worry about performance-wise. If your SQL version is much more efficient, then you could always write a stored procedure and just call the stored procedure using LinqToSql.

LINQ TO SQL GROUP BY HAVING generated only as subquery

Tags:

sql

vb.net

linq-to-sql

having-clause

holy

1 Answers

rsbarro

Recent Activity

Donate For Us

LINQ TO SQL GROUP BY HAVING generated only as subquery

Tags:

sql

vb.net

linq-to-sql

having-clause

holy

1 Answers

rsbarro

Related questions

Recent Activity

Donate For Us