I was given the following query in SQL (variable names obfuscated) which is trying to get the values (Ch, Wa, Bu, Hi) resulting in the greatest number (cnt) of Pi entries.
select top 1 Pi.Ch, Pi.Wa, Pi.Bu, Pi.Hi, COUNT(1) as cnt
from Product, Si, Pi
where Product.Id = Si.ProductId
and Si.Id = Pi.SiId
and Product.Code = @CodeParameter
group by Pi.Ch, Pi.Wa, Pi.Bu, Pi.Hi
order by cnt desc
which runs instantly in SQL management studio on our production database. I've successfully written the code a few ways in C# LINQ and Entity Framework, but every way the code runs in 8 - 10 seconds. One attempt is the following code (doing it without the print as one call gives the same performance results):
using(var context = new MyEntities()){
var query = context.Products
.Where(p => p.Code == codeFromFunctionArgument)
.Join(context.Sis, p => p.Id, s => s.ProductId, (p, s) => new { sId = s.Id })
.Join(context.Pis, ps => ps.sId, pi => pi.SiId, (ps, pi) => new {pi.Ch, pic.Wa, pic.Bu, pic.Hi})
.GroupBy(
pi => pi,
(k, g) => new MostPisResult()
{
Ch = k.Ch,
Wa = k.Wa,
Bu = k.Bu,
Hi = k.Hi,
Count = g.Count()
}
)
.OrderByDescending(x => x.Count);
Console.WriteLine(query.ToString());
return query.First();
}
}
which outputs the following SQL statements:
SELECT
[Project1].[C2] AS [C1],
[Project1].[Ch] AS [Ch],
[Project1].[Wa] AS [Wa],
[Project1].[Bu] AS [Bu],
[Project1].[Hi] AS [Hi],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [Ch],
[GroupBy1].[K2] AS [Wa],
[GroupBy1].[K3] AS [Bu],
[GroupBy1].[K4] AS [Hi],
1 AS [C2]
FROM ( SELECT
[Extent3].[Ch] AS [K1],
[Extent3].[Wa] AS [K2],
[Extent3].[Bu] AS [K3],
[Extent3].[Hi] AS [K4],
COUNT(1) AS [A1]
FROM [dbo].[Product] AS [Extent1]
INNER JOIN [dbo].[Si] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ProductId]
INNER JOIN [dbo].[Pi] AS [Extent3] ON [Extent2].[Id] = [Extent3].[SiId]
WHERE ([Extent1].[Code] = @p__linq__0) AND (@p__linq__0 IS NOT NULL)
GROUP BY [Extent3].[Ch], [Extent3].[Wa], [Extent3].[Bu], [Extent3].[Hi]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC
I've also tried in query syntax with about the same result. I also tried (but not for very long) executing the original SQL query directly with EF, but couldn't quickly get it working.
Is there some mistake I'm doing in translating the query to LINQ? Is there an obvious way I'm missing to improve the query? Is it possible to write the query in EF / LINQ with the same performance as the SQL statements?
====== Update ======
In SQL profiler the output for the original query is exactly the same. For the LINQ query it is very similar to what I posted above.
exec sp_executesql N'SELECT TOP (1)
[Project1].[C2] AS [C1],
[Project1].[Ch] AS [Ch],
[Project1].[Wa] AS [Wa],
[Project1].[Bu] AS [Bu],
[Project1].[Hi] AS [Hi],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [Ch],
[GroupBy1].[K2] AS [Wa],
[GroupBy1].[K3] AS [Bu],
[GroupBy1].[K4] AS [Hi],
1 AS [C2]
FROM ( SELECT
[Extent3].[Ch] AS [K1],
[Extent3].[Wa] AS [K2],
[Extent3].[Bu] AS [K3],
[Extent3].[Hi] AS [K4],
COUNT(1) AS [A1]
FROM [dbo].[Product] AS [Extent1]
INNER JOIN [dbo].[Si] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ProductId]
INNER JOIN [dbo].[Pi] AS [Extent3] ON [Extent2].[Id] = [Extent3].[SiId]
WHERE ([Extent1].[Code] = @p__linq__0) AND (@p__linq__0 IS NOT NULL)
GROUP BY [Extent3].[Ch], [Extent3].[Wa], [Extent3].[Bu], [Extent3].[Hi]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'109579'
====== Update 2 ======
Here's the obfuscated XML output of the query execution plan on Snipt.org. Note the variable in question here is named "MagicalCode" in the output and both values "109579" and "2449-268-550" are valid (strings in C#) as in the final line of the XML output.
<ParameterList>
<ColumnReference
Column="@p__linq__0"
ParameterCompiledValue="N'109579'"
ParameterRuntimeValue="N'2449-268-550'" />
</ParameterList>
Plan image with actual row counts displayed
====== Update 3 ======
(hidden in a comment) I ran the EF generated SQL from entity framework in SSMS and it ran instantly. So I might be suffering from some form of parameter sniffing as hinted by this question. I'm not sure how to deal with it in the context of entity framework.
====== Update 4 ======
Updated Entity Framework SQL Execution Plan and SSMS SQL Query Execution Plan that can be opened with Plan Explorer.
====== Update 5 ======
Some workaround attempts
context.Database.SqlQuery<ReturnObject>(...)
ran in ~4-5 seconds.SqlCommand
and the connection string obtained from EF context took about 3 seconds (context initialization overhead).SqlCommand
took with hardcoded connection string takes about 1.5 seconds.
So I ended up using the last one for now. The last thing I can think of is writing a stored procedure to get closer to the "instant" performance of running the query in SSMS.Entity framework is a great ORM shipped by Microsoft which allow you to query and modify RDBMS like SQL Server, Oracle, DB2 and MySQL etc. by using LINQ syntax.
LINQ queries are always executed when the query variable is iterated over, not when the query variable is created. This is called deferred execution. You can also force a query to execute immediately, which is useful for caching query results. This is described later in this topic.
You could try using IQueryable.AsNoTracking() see http://msdn.microsoft.com/en-us/library/gg679352(v=vs.103).aspx. It is safe to use AsNoTracking() in cases where you are not going to edit the results and save them back to the database again. Usually it makes a big difference when a Query returns a large number of rows. Make sure you put System.Data.Entity in your uses if you want to use .AsNoTracking()
It could be a problem of cached execution plan. Try execute stored procedure to clean query execution plans:
DBCC FREEPROCCACHE
Also this thread might be helpful: Entity Framework cached query plan performance degrades with different parameters
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With