Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiple CASE WHEN in Entity Framework with TPH and enumeration

I have a very strange behavior when using TPH on EF 6.1.3. Here is a basic example to reproduce :

public class BaseType
{
    public int Id { get; set; }
}
public class TypeA : BaseType
{
    public string PropA { get; set; }
}
public class TypeB : BaseType
{
    public decimal PropB { get; set; }
    public OneEnum PropEnum { get; set; }
}
public class TypeC : TypeB
{
    public int PropC { get; set; }
}

public enum OneEnum
{
    Foo,
    Bar
}

public partial class EnumTestContext : DbContext
{
    public EnumTestContext()
    {
        this.Database.Log = s => { Debug.WriteLine(s); };
    }
    public DbSet<BaseType> BaseTypes { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        Database.SetInitializer(new DropCreateDatabaseAlways<EnumTestContext>());
        using (var context = new EnumTestContext())
        {
            context.BaseTypes.Add(new TypeA() { Id = 1, PropA = "propA" });
            context.BaseTypes.Add(new TypeB() { Id = 2, PropB = 4.5M, /*PropEnum = OneEnum.Bar*/ });
            context.BaseTypes.Add(new TypeC() { Id = 3, PropB = 4.5M, /*PropEnum = OneEnum.Foo,*/ PropC = 123 });
            context.SaveChanges();

            var onetype = context.BaseTypes.Where(b => b.Id == 1).FirstOrDefault();

            Console.WriteLine("typeof {0} with {1}", onetype.GetType().Name, onetype.Id);
        }

        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}

This code works perfectly, but the generated query is extrememly weird and complex, especialy there are a lot of CASE WHEN

SELECT 
    [Limit1].[C1] AS [C1], 
    [Limit1].[Id] AS [Id], 
    [Limit1].[C2] AS [C2], 
    [Limit1].[C3] AS [C3], 
    [Limit1].[C4] AS [C4], 
    [Limit1].[C5] AS [C5]
    FROM ( SELECT TOP (1) 
        [Extent1].[Id] AS [Id], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN '0X' WHEN ([Extent1].[Discriminator] = N'TypeA') THEN '0X0X' WHEN ([Extent1].[Discriminator] = N'TypeB') THEN '0X1X' ELSE '0X1X0X' END AS [C1], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS varchar(1)) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN [Extent1].[PropA] WHEN ([Extent1].[Discriminator] = N'TypeB') THEN CAST(NULL AS varchar(1)) END AS [C2], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS decimal(18,2)) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN CAST(NULL AS decimal(18,2)) WHEN ([Extent1].[Discriminator] = N'TypeB') THEN [Extent1].[PropB] ELSE [Extent1].[PropB] END AS [C3], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeB') THEN [Extent1].[PropEnum] ELSE [Extent1].[PropEnum] END AS [C4], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeB') THEN CAST(NULL AS int) ELSE [Extent1].[PropC] END AS [C5]
        FROM [dbo].[BaseTypes] AS [Extent1]
        WHERE ([Extent1].[Discriminator] IN (N'TypeA',N'TypeB',N'TypeC',N'BaseType')) AND (1 = [Extent1].[Id])
    )  AS [Limit1]

Except the cost of multiple and useless THEN CAST(NULL as X), the query is large (> 50 KBs) in my project because I have a lot of derived classes, containing a lot a properties. As you can expect, my DBA team is not happy to see this kind of queries to our databases.

If I remove the enumeration property on TypeB, the request is much more cleaner. Same thing if I have only two hierarchy levels, aka class TypeC : BaseType (compared to 3 in the example because class TypeC : TypeB).

Is there any settings or model configuration or workaround to avoid this strange behavior ?

Update

Here is the generated query if I remove TypeB.PropEnum

SELECT TOP (1) 
    [Extent1].[Discriminator] AS [Discriminator], 
    [Extent1].[Id] AS [Id], 
    [Extent1].[PropA] AS [PropA], 
    [Extent1].[PropB] AS [PropB], 
    [Extent1].[PropC] AS [PropC]
    FROM [dbo].[BaseTypes] AS [Extent1]
    WHERE ([Extent1].[Discriminator] IN (N'TypeA',N'TypeB',N'TypeC',N'BaseType')) AND (1 = [Extent1].[Id])

Update 2

A common solution is to create a separate property the integer value and ignore the enum property. This works, but it's quite confusing to have 2 properties for the same purpose.

public class TypeB : BaseType
{
    public decimal PropB { get; set; }

    public int PropEnumValue { get; set; }

    [NotMapped]
    public OneEnum PropEnum
    {
        get { return (OneEnum)PropEnumValue; }
        set { PropEnumValue = (int)value; }
    }
}

Update 3

I've found a bug on codeplex : https://entityframework.codeplex.com/workitem/2117. It doesn't seems to be solved.

like image 361
Cybermaxs Avatar asked Nov 24 '15 17:11

Cybermaxs


1 Answers

About using EF/Large queries

I've done some work with EF6 and semi-large hierarchies. There are a few things you should consider. First of all why isn't your DBA team not happy with these kind of queries. Of course these arn't the queries they would write but assuming management doesn't want you to spend the time to write every single query from scratch they'll have to live with the fact that you use an ORM framework and that ORM framework might cause queries that are a bit larger.

Now if they have specific performance concerns you SHOULD address those.

What you can do

now what can you do to clean up your queries.

1) Make all classes that could be abstract abstract.

2) Make all other classes sealed.

3) In your linq queries cast to concrete types where possible(using OfType() ). This might even work better than an .Select(x => x as SomethingHere). If you have a particular nasty query it might take some experimentation what tunes your query from linq best.

explanation what i've found through experimentation

As you notice with your queries it's checking the discriminator. If you queries get a bit more complex (and i expect those 50k queries to be one of those) you'll see that it adds in code for string concatenation to check every possible combination. you see that happening a bit in the

THEN '0X' WHEN ([Extent1].[Discriminator] = N'TypeA') THEN '0X0X' 

part. I've did some POCs trying to figure out this behaviour and what seems to be happening is that entity framework is translating properties to 'aspects' (my term). For example an class will have a "PropertyA" if the translated string contains either '0X' or '0X0X'. PropertyB it might translate to "R2D2" and PropertyC to "C3P0". that way if a classname is translated to "R2D2C3P0". it knows it has both PropertyB & PropertyC. It has to take in account some hidden derived types and all supertypes. Now if enity framework can be more sure about your class hierarchy (by making classes sealed) it can simplify the logic here. And in my experience the string building logic EF generates can be even more complex than then ones you're showing here. That is why making classes abstract/sealed EF can be smarter about this and reduce your queries.

Another performance tip

Now also make sure you have proper indexes on the discriminator column. (You could do this from your DbMigration script inside entity framework).

'Desparate' performance measure

Now if all else fails make your discriminator an int. This will hurt the readability of your database/queries a LOT, but it helps performance. (and you could even have all your classes automatically emit a property that contains the class name so you keep some readability of types inside your database).

UPDATE:

after some more research after the comment from RX_DID_RX it turns out you can only seal/make poco's abstract if you don't use dynamic proxy generation. (lazy loading & change tracking). In my particular app we didn't use this so it worked well for us but i have to revert my earlier recommendation.

For more detail an EF6 specific link http://www.entityframeworktutorial.net/Types-of-Entities.aspx

adding indexes, and playing with casting in linq queries can still help though.

like image 97
Batavia Avatar answered Oct 19 '22 03:10

Batavia