Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

EF 6 vs EF 5 relative performance issue when deploying to IIS8

I have an MVC 4 application with EF 6. After upgrading from EF 5 to EF 6 I noticed a performance issue with one of my linq-entities queries. At first I was excited because on my development box I noticed a 50% improvement from EF 5 to EF 6. This query returns about 73,000 records. The SQL being run on production server was intercepted with Activity Monitor, Recent Expensive Queries, this timing is also included in the following tables. The following numbers are once DB is is warmed up:

Development: 64 bit OS, SS 2012, 2 cores, 6 GB RAM, IIS Express.

EF 5 ~30 sec
EF 6 ~15 sec
SQL ~26 sec

Production: 64 bit OS, SS 2012, 32 cores, 32 GB RAM, IIS8.

EF 5 ~8 sec
EF 6 ~4 minutes
SQL ~6 sec.

I have included the specs just to give an idea of what relative performance should be. So it appears that when I use EF 6 in my development environment I get performance improvement, when I publish to my production server a huge performance problem. The databases are similar if not exactly the same. All indexes have been rebuilt, the SQL query also seems to indicate that there is no reason to suspect database is at fault. Application pool is .Net 4.0 in production. Both development and production server have .Net 4.5 installed. I don't know what to check next or how to debug this problem, any ideas on what to do or how to debug further?

Update: Using SQL Server Profiler found that EF5 and EF6 produce slightly different TSQL. The TSQL difference is as follows:

EF5: LEFT OUTER JOIN [dbo].[Pins] AS [Extent9] ON [Extent1].[PinId] = [Extent9].[PinID]
EF6: INNER JOIN [dbo].[Pins] AS [Extent9] ON [Extent1].[PinId] = [Extent9].[PinID]

This same TSQL from EF6 also performs differently depending on the server/database the TSQL is executed on. After inspecting query plan for EF6 & slow database (production server SS build 11.0.3000 Enterprise Edition) this plan does all scans and no seeks when compared to an identical instance (test server SS build 11.0.3128 Developers Edition) which has a few seeks that make the difference. Wall clock time is > 4 min for production and 12 sec for small test server. EF places these queries into sp_executesql proc, the intercepted sp_executesql proc was used for the timing mentioned above. I do NOT get slow time (bad query plan) with EF5 or EF6 generated code when executed on development server. Also strange, if I remove TSQL from sp_executesql and run it on production server the query is executed quickly (6 sec). In summary three things need to happen for slow execution plan:

1. Execute on production server build 11.0.3000
2. Use Inner Join with Pins table (EF6 generated code).
3. Execute TSQL inside of sp_executesql.

The test environment was created with a backup of my production data, data on both servers is identical. Could creating a backup and restoring database have fixed some problem with the data? I haven't tried deleting instance and restoring on production server because I would like to know for sure what the problem is before I delete and restore the instance, just in case it does fix the problem. I did try and flush cache with the following TSQL

select DB_ID() 
DBCC Flushprocindb(database_Id)
and 
DBCC FREEPROCCACHE(plan_handle)

Flushing with above did not effect the query plan. Any suggestions what to try next?

Following is the linq query:

    result =
    (
    from p1 in context.CookSales

    join l2 in context.CookSaleStatus on new { ID = p1.PinId, YEAR = year1 } equals new { ID = l2.PinId, YEAR = l2.StatusYear } into list2
    from p3 in list2.DefaultIfEmpty()
    join l3 in context.CookSaleStatus on new { ID = p1.PinId, YEAR = year2 } equals new { ID = l3.PinId, YEAR = l3.StatusYear } into list3
    from p4 in list3.DefaultIfEmpty()
    join l4 in context.CookSaleStatus on new { ID = p1.PinId, YEAR = year3 } equals new { ID = l4.PinId, YEAR = l4.StatusYear } into list4
    from p5 in list4.DefaultIfEmpty()
    join l10 in context.CookSaleStatus on new { ID = p1.PinId, YEAR = year4 } equals new { ID = l10.PinId, YEAR = l10.StatusYear } into list10
    from p11 in list10.DefaultIfEmpty()

    join l5 in context.ILCookAssessors on p1.PinId equals l5.PinID into list5
    from p6 in list5.DefaultIfEmpty()
    join l7 in context.ILCookPropertyTaxes on new { ID = p1.PinId } equals new { ID = l7.PinID } into list7
    from p8 in list7.DefaultIfEmpty()

    join l13 in context.WatchLists on p1.PinId equals l13.PinId into list13
    from p14 in list13.DefaultIfEmpty()

    join l14 in context.Pins on p1.PinId equals l14.PinID into list14
    from p15 in list14.DefaultIfEmpty()
    orderby p1.Volume, p1.PIN
    where p1.SaleYear == userSettings.SaleYear 
    where ((p1.PinId == pinId) || (pinId == null))
    select new SaleView
    {
        id = p1.id,
        PinId = p1.PinId,
        Paid = p1.Paid == "P" ? "Paid" : p1.Paid,
        Volume = p1.Volume,
        PinText = p15.PinText,
        PinTextF = p15.PinTextF,
        ImageFile = p15.FnaImage.TaxBodyImageFile,
        SaleYear = p1.SaleYear,
        YearForSale = p1.YearForSale,
        Unpaid = p1.DelinquentAmount,
        Taxes = p1.TotalTaxAmount,
        TroubleTicket = p1.TroubleTicket,
        Tag1 = p1.Tag1,
        Tag2 = p1.Tag2,
        HasBuildingPermit = p1.Pin1.BuildingPermitGeos.Any(p => p.PinId == p1.PinId),
        BidRate = p1.BidRate,
        WinningBid = p1.WinningBid,
        WinningBidderNumber = p1.BidderNumber,
        WinningBidderName = p1.BidderName,
        TaxpayerName = p1.TaxpayerName,
        PropertyAddress = SqlFunctions.StringConvert((double?)p1.TaxpayerPropertyHouse) + " " + p1.TaxpayerPropertyDirection + " "
                        + p1.TaxpayerPropertyStreet
                        + " " + p1.TaxpayerPropertySuffix +
                        System.Environment.NewLine + (p1.TaxpayerPropertyCity ?? "") + ", " + (p1.TaxpayerPropertyState ?? "") +
                        " " + (p1.TaxpayerPropertyZip ?? ""),
        MailingAddress = (p1.TaxpayerName ?? "") + System.Environment.NewLine + (p1.TaxpayerMailingAddress ?? "") +
                        System.Environment.NewLine + (p1.TaxpayerMailingCity ?? "") + ", " + (p1.TaxpayerMailingState ?? "") +
                        " " + (p1.TaxpayerMailingZip ?? ""),
        Status1 = p3.Status.Equals("Clear") ? null : p3.Status,
        Status2 = p4.Status.Equals("Clear") ? null : p4.Status,
        Status3 = p5.Status.Equals("Clear") ? null : p5.Status,
        Status4 = p11.Status.Equals("Clear") ? null : p11.Status,
        Township = p6.Township,
        AssessorLastUpdate = p6.LastUpdate,
        Age = p6.Age,
        LandSquareFootage = p6.LandSquareFootage,
        BuildingSquareFootage = p6.BuildingSquareFootage,
        CurrLand = p6.CurrLand,
        CurrBldg = p6.CurrBldg,
        CurrTotal = p6.CurrTotal,
        PriorLand = p6.PriorLand,
        PriorBldg = p6.PriorBldg,
        PriorTotal = p6.PriorTotal,
        ClassDescription = p6.ClassDescription,
        Class = p1.Classification == null ? p6.Class.Trim() : p1.Classification,
        TaxCode = p6.TaxCode,
        Usage = p6.Usage,

        Status0 = (p8.CurrentTaxYear != null && p8.CurrentTaxYearPaidAmount == 0) ? "Paid" : null, 
        LastTaxYearPaidAmount = p8.LastTaxYearPaidAmount,
        NoteStatus = p15.PinNotes.Any(p => p.PinId == p15.PinID),
        EntryComment = p1.EntryComment,
        IsInScavenger = p14.IsInScavenger ?? false,
        IsInTbs = p14.IsInTbs ?? false,
        RedeemVts = (p3.Redeemed == "VTS" || p4.Redeemed == "VTS" || p5.Redeemed == "VTS" || p11.Redeemed == "VTS") ? true : false,
        FivePercenter = (p3.FivePercenter || p4.FivePercenter || p5.FivePercenter || p11.FivePercenter) ? true : false,
    }
    ).ToList();

The SQL that is generated with this query seems reasonable. (I have not included it because when I paste it in it's not formatted and difficult to read.)

like image 273
ricardo Avatar asked Mar 17 '14 18:03

ricardo


1 Answers

While researching this problem I have unearthed several things about SQL Server that I didn't know. This may be common knowledge for some, but for me it wasn't. Here are my overall highlights.

  1. EF uses dynamic sql for all queries specifically sp_exectutesql(). sp_executesql() executes dynamic SQL, if you remove this SQL and execute as adhoc query in SSMS don't expect to get the same performance results. This is documented well here and references this document which I highly recommend reading if you are having these types of issues.
  2. EF5 produces different dynamic SQL than EF6 under certain conditions.
  3. It's difficult to optimize linq to entities because you may get different results depending on hardware, this is explained in the references. My original goal was to optimize the linq query when I upgraded to EF6. I noticed that not using navigation properties improved performance in my dev and test servers, but killed it in production.
  4. Final result with acceptable performance, in all environments, was a combination of join and navigation properties. In the end If I had used all navigation properties it would have worked better from the start. The join keys used were from the wrong tables, when you write ad hoc SQL it doesn't matter but it must for dynamic SQL. Had I used the navigation there would not have been any keys to get wrong. However, the best performance was with one join and the remaining navigation properties. The dynamic SQL generated is remarkably similar for all scenarios, but the SQL Server query plan optimizer gets better clues when navigation properties are used (this is a guess).

The key part of linq changed was this:

                from p1 in context.CookSales
                join p15 in context.Pins on p1.PinId equals p15.PinID
                where p1.SaleYear == userSettings.SaleYear
                where ((p1.PinId == pinId) || (pinId == null))
                orderby p1.Volume, p1.PIN
                select new SaleView bla bla

The Pins table contains the primary key for PinId while all the other tables have PinId as foreign key. Keeping Pins as join and not a navigation property improved performance.

like image 52
ricardo Avatar answered Oct 12 '22 13:10

ricardo