Trying to upgrade from EF5 to EF6, I encounter significant performance gap searching table by nullable column. Here is a sample:
public class Customer
{
public int Id { get; set; }
public int? ManagerId { get; set; }
//public virtual Manager Manager { get; set; }
}
public class MyContext : DbContext
{
public MyContext(string connstring): base(connstring){}
public DbSet<Customer> Customers { get; set; }
}
class Program
{
static void Main(string[] args)
{
var db = new MyContext("CONNSTRING");
var managerId = 1234;
var q = from b in db.Customers
where b.ManagerId == managerId
select b.Id;
var s = q.ToString();
}
}
When EF6 generates SQL, it adds some logic for null handling:
SELECT
[Extent1].[Id] AS [Id]
FROM [dbo].[Customers] AS [Extent1]
WHERE (([Extent1].[ManagerId] = @p__linq__0)
AND ( NOT ([Extent1].[ManagerId] IS NULL OR @p__linq__0 IS NULL)))
OR (([Extent1].[ManagerId] IS NULL) AND (@p__linq__0 IS NULL))
Notice same linq produced more simple SQL under EF5:
SELECT
[Extent1].[Id] AS [Id]
FROM [dbo].[Customers] AS [Extent1]
WHERE [Extent1].[ManagerId] = @p__linq__0
I can understand the point developers tried to achieve: if you supply null as parameter, the query where managerId = null will not select any lines. I appreciate the care, but 99.9% times search logic is separated: one usecase looks for where ManagerId == null
, another searches for specific id where ManagerId == managerId
The problem is in big performance impact: MS SQL does not use index on ManagerId and table scan occurs. My project has hundreds of similar searches and on database size about 100GB overall performance after upgrade to EF6 cut by 10 approximately.
The question is does anybody know some kind of configuration or convention to disable this roadblock in EF6 and generate simple sql?
EDIT:
I checked a dozen similar selects in my project and found that:
EF6 even checks for null when constant is exactly specified as not null, for example:
from p in db.PtnActivations
where p.Carrier != "ALLTEL"
where p.Carrier != "ATT"
where p.Carrier != "VERIZON"
generates SQL
WHERE ( NOT (('ALLTEL' = [Extent1].[Carrier]) AND ([Extent1].[Carrier] IS NOT NULL))) AND ( NOT (('ATT' = [Extent1].[Carrier]) AND ([Extent1].[Carrier] IS NOT NULL))) AND ( NOT (('VERIZON' = [Extent1].[Carrier]) AND ([Extent1].[Carrier] IS NOT NULL)))
that did not utilize my index on carrier. EF5 version had
( NOT (('ALLTEL' = [Extent1].[Carrier]))) AND ( NOT (('ATT' = [Extent1].[Carrier]))) AND ( NOT (('VERIZON' = [Extent1].[Carrier]) ))
that utilized it.
Notice the condition ('ALLTEL' = [Extent1].[Carrier]) AND ([Extent1].[Carrier] IS NOT NULL)
. Second part is always false, but adding this part gives up the index.
My routine import of about 1.7M records (that usually had taken about 30 minutes) is up for 3 hours and progress is about 30%.
Set
db.Configuration.UseDatabaseNullSemantics = true;
to get the behavior you had in EF5. This workitem describes what the difference between true
and false
is and should help you decide whether you are OK with the old behavior or not.
Very different answer.
If you are using a varchar(xxx) the LNQ to SQL spits out nvarchar(4000), which breaks indexes and conversions massively blowing out your sql plan. In my case I found this question due to the odd null behavior, but that was not the issue. Answer below solves both null and nvarchar issue. SQL plan went from ~11 to .006.
public class InterestingRow
{
[Key]
public int interesting_row_id { get; set; }
[StringLength(255), Required, Column(TypeName = "varchar")]
public string public_guid { get; set; }
}
(Yes there are many reasons to use varchar, like you are storing a publicly exposed guid)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With