Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

EF 4.1: Why does turning a constant into a variable result in extra sub query?

Today I discovered that Entity Framework was adding an unnecessary sub query to the SQL it generates. I started digging my code trying to narrow down where it might come from. A (long) while later I pin-pointed what's causing it. But now I'm more confused than when I started, as I have no clue why it causes it.

Basically what I discovered is that on certain scenarios, simply converting a constant into a variable can alter the SQL that Entity Framework generates. I've shrunk everything to the bare minimum and packed it in a little console app:

using System;
using System.Data.Entity;
using System.Linq;

class Program
{
    private static readonly BlogContext _db = new BlogContext();

    static void Main(string[] args)
    {
        const string email = "[email protected]";

        var comments = from c in _db.Comments
                       where c.Email == email
                       select c;

        var result = (from p in _db.Posts
                      join c in comments on p.PostId equals c.PostId
                      orderby p.Title
                      select new { p.Title, c.Content });

        Console.WriteLine(result);
    }
}

public class BlogContext : DbContext
{
    public DbSet<Post> Posts { get; set; }
    public DbSet<Comment> Comments { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
}

public class Comment
{
    public int CommentId { get; set; }
    public int PostId { get; set; }
    public string Email { get; set; }
    public string Content { get; set; }
}

This shows the following output, which is perfect:

SELECT
[Extent1].[PostId] AS [PostId],
[Extent1].[Title] AS [Title],
[Extent2].[Content] AS [Content]
FROM  [dbo].[Posts] AS [Extent1]
INNER JOIN [dbo].[Comments] AS [Extent2] ON [Extent1].[PostId] = [Extent2].[PostId]
WHERE N'[email protected]' = [Extent2].[Email]
ORDER BY [Extent1].[Title] ASC

Now if I make email a variable:

/*const*/ string email = "[email protected]";

The output changes radically:

SELECT
[Project1].[PostId] AS [PostId],
[Project1].[Title] AS [Title],
[Project1].[Content] AS [Content]
FROM ( SELECT
        [Extent1].[PostId] AS [PostId],
        [Extent1].[Title] AS [Title],
        [Extent2].[Content] AS [Content]
        FROM  [dbo].[Posts] AS [Extent1]
        INNER JOIN [dbo].[Comments] AS [Extent2] ON [Extent1].[PostId] = [Extent2].[PostId]
        WHERE [Extent2].[Email] = @p__linq__0
)  AS [Project1]
ORDER BY [Project1].[Title] ASC

As a side note, LINQ to SQL does not seem to do this. I know it's probably okay to ignore this, as both commands return the same data. But I'm extremely curious as to why this happens. Up until today I always had the (perhaps false?) impression that it is always safe to turn a constant into a variable, providing that the value remains the same (which in this case does). So I have to ask...

Why a seemingly insignificant change causes such a big difference in the generated SQL?

Update:

Just to be clear, my question isn't about the value of email being a hard-coded value in the first query and a variable in the second (which makes all the sense in the world). My question is about why the variable version results in the extra sub query.

Thanks!

like image 214
Daniel Liuzzi Avatar asked May 07 '11 15:05

Daniel Liuzzi


1 Answers

The answer is fairly simple. Your LINQ query is expressed with expression trees. The difference with the const variable vs non const one are lies in ConstantExpression and ParameterExpression.

When you use const your LINQ query uses ConstExpression for this variable, and when you use non const it uses ParameterExpression which are interpreted differently by the EF Runtime.

Constant actually means that the value will never change and the value can be inlined into the query.

like image 190
Ivan Zlatanov Avatar answered Sep 30 '22 11:09

Ivan Zlatanov