I searched a lot on my performance problem and tried all sorts of different things, but I just can't seem to get it to work fast enough. Here's my problem to it's simplest form:
I'm using entity framework 5 and I want to be able to lazy load child instances of a parent when the user selects that parent, so I don't have to pull the entire database. However I have been having performance problems with lazy loading the children. I think the problem is the wire up of the navigation properties between the Parent and the children. I'm also thinking it must be something I did wrong because I believe this is a simple case.
So I put up a program to test a single lazy load to isolate the problem.
Here's the Test:
I created a POCO Parent class and a Child POCO Class. Parent has n Children and Child has 1 Parent. There's only 1 parent in the SQL Server database and 25 000 children for that single parent. I tried different methods to load this data. Whenever I load either the children and the parent in the same DbContext, it takes a really long time. But if I load them in different DbContexts, it loads really fast. However, I want those instances to be in the same DbContext.
Here is my test setup and everything you need to replicate it:
POCOs:
public class Parent
{
public int ParentId { get; set; }
public string Name { get; set; }
public virtual List<Child> Childs { get; set; }
}
public class Child
{
public int ChildId { get; set; }
public int ParentId { get; set; }
public string Name { get; set; }
public virtual Parent Parent { get; set; }
}
DbContext:
public class Entities : DbContext
{
public DbSet<Parent> Parents { get; set; }
public DbSet<Child> Childs { get; set; }
}
TSQL Script to create the Database and data:
USE [master]
GO
IF EXISTS(SELECT name FROM sys.databases
WHERE name = 'PerformanceParentChild')
alter database [PerformanceParentChild] set single_user with rollback immediate
DROP DATABASE [PerformanceParentChild]
GO
CREATE DATABASE [PerformanceParentChild]
GO
USE [PerformanceParentChild]
GO
BEGIN TRAN T1;
SET NOCOUNT ON
CREATE TABLE [dbo].[Parents]
(
[ParentId] [int] CONSTRAINT PK_Parents PRIMARY KEY,
[Name] [nvarchar](200) NULL
)
GO
CREATE TABLE [dbo].[Children]
(
[ChildId] [int] CONSTRAINT PK_Children PRIMARY KEY,
[ParentId] [int] NOT NULL,
[Name] [nvarchar](200) NULL
)
GO
INSERT INTO Parents (ParentId, Name)
VALUES (1, 'Parent')
DECLARE @nbChildren int;
DECLARE @childId int;
SET @nbChildren = 25000;
SET @childId = 0;
WHILE @childId < @nbChildren
BEGIN
SET @childId = @childId + 1;
INSERT INTO [dbo].[Children] (ChildId, ParentId, Name)
VALUES (@childId, 1, 'Child #' + convert(nvarchar(5), @childId))
END
CREATE NONCLUSTERED INDEX [IX_ParentId] ON [dbo].[Children]
(
[ParentId] ASC
)
GO
ALTER TABLE [dbo].[Children] ADD CONSTRAINT [FK_Children.Parents_ParentId] FOREIGN KEY([ParentId])
REFERENCES [dbo].[Parents] ([ParentId])
GO
COMMIT TRAN T1;
App.config containing the connection string:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<connectionStrings>
<add
name="Entities"
providerName="System.Data.SqlClient"
connectionString="Server=localhost;Database=PerformanceParentChild;Trusted_Connection=true;"/>
</connectionStrings>
</configuration>
Test console class:
class Program
{
static void Main(string[] args)
{
List<Parent> parents;
List<Child> children;
Entities entities;
DateTime before;
TimeSpan childrenLoadElapsed;
TimeSpan parentLoadElapsed;
using (entities = new Entities())
{
before = DateTime.Now;
parents = entities.Parents.ToList();
parentLoadElapsed = DateTime.Now - before;
System.Diagnostics.Debug.WriteLine("Load only the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds");
}
using (entities = new Entities())
{
before = DateTime.Now;
children = entities.Childs.ToList();
childrenLoadElapsed = DateTime.Now - before;
System.Diagnostics.Debug.WriteLine("Load only the children from DbSet:" + childrenLoadElapsed.TotalSeconds + " seconds");
}
using (entities = new Entities())
{
before = DateTime.Now;
parents = entities.Parents.ToList();
parentLoadElapsed = DateTime.Now - before;
before = DateTime.Now;
children = entities.Childs.ToList();
childrenLoadElapsed = DateTime.Now - before;
System.Diagnostics.Debug.WriteLine("Load the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds" +
", then load the children from DbSet:" + childrenLoadElapsed.TotalSeconds + " seconds");
}
using (entities = new Entities())
{
before = DateTime.Now;
children = entities.Childs.ToList();
childrenLoadElapsed = DateTime.Now - before;
before = DateTime.Now;
parents = entities.Parents.ToList();
parentLoadElapsed = DateTime.Now - before;
System.Diagnostics.Debug.WriteLine("Load the children from DbSet:" + childrenLoadElapsed.TotalSeconds + " seconds" +
", then load the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds");
}
using (entities = new Entities())
{
before = DateTime.Now;
parents = entities.Parents.ToList();
parentLoadElapsed = DateTime.Now - before;
before = DateTime.Now;
children = parents[0].Childs;
childrenLoadElapsed = DateTime.Now - before;
System.Diagnostics.Debug.WriteLine("Load the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds" +
", then load the children from Parent's lazy loaded navigation property:" + childrenLoadElapsed.TotalSeconds + " seconds");
}
using (entities = new Entities())
{
before = DateTime.Now;
parents = entities.Parents.Include(p => p.Childs).ToList();
parentLoadElapsed = DateTime.Now - before;
System.Diagnostics.Debug.WriteLine("Load the parent from DbSet and children from include:" + parentLoadElapsed.TotalSeconds + " seconds");
}
using (entities = new Entities())
{
entities.Configuration.ProxyCreationEnabled = false;
entities.Configuration.AutoDetectChangesEnabled = false;
entities.Configuration.LazyLoadingEnabled = false;
entities.Configuration.ValidateOnSaveEnabled = false;
before = DateTime.Now;
parents = entities.Parents.Include(p => p.Childs).ToList();
parentLoadElapsed = DateTime.Now - before;
System.Diagnostics.Debug.WriteLine("Load the parent from DbSet and children from include:" + parentLoadElapsed.TotalSeconds + " seconds with everything turned off");
}
}
}
Here are the results of those test:
Load only the parent from DbSet:0,972 seconds
Load only the children from DbSet:0,714 seconds
Load the parent from DbSet:0,001 seconds, then load the children from DbSet:8,6026 seconds
Load the children from DbSet:0,6864 seconds, then load the parent from DbSet:7,5816159 seconds
Load the parent from DbSet:0 seconds, then load the children from Parent's lazy loaded navigation property:8,5644549 seconds
Load the parent from DbSet and children from include:8,6428788 seconds
Load the parent from DbSet and children from include:9,1416586 seconds with everything turned off
Analysis
Whenever the parent and the children are in the same DbContext, it takes a long time (9 seconds) to wire everything up. I even tried turning off everything from proxy creation to lazy loading, but to no avail. Can someone please help me ?
I answered similar question previously. My previous answer contains theory answering this issue but with your detailed question I can directly point where the issue is. First lets run one of the problematic cases with performance profiler. This is result from DotTrace when using tracing mode:
Fixing relations runs in loop. It means that for 25.000 records you have 25.000 iterations but each of these iterations internally calls CheckIfNavigationPropertyContainsEntity
on EntityCollection
:
internal override bool CheckIfNavigationPropertyContainsEntity(IEntityWrapper wrapper)
{
if (base.TargetAccessor.HasProperty)
{
object navigationPropertyValue = base.WrappedOwner.GetNavigationPropertyValue(this);
if (navigationPropertyValue != null)
{
if (!(navigationPropertyValue is IEnumerable))
{
throw new EntityException(Strings.ObjectStateEntry_UnableToEnumerateCollection(base.TargetAccessor.PropertyName, base.WrappedOwner.Entity.GetType().FullName));
}
foreach (object obj3 in navigationPropertyValue as IEnumerable)
{
if (object.Equals(obj3, wrapper.Entity))
{
return true;
}
}
}
}
return false;
}
Number of iterations of inner loop grows as items are added to the navigation property. The math is in my previous answer - it is arithmetic series where the total number of iterations of inner loop is 1/2 * (n^2 - n) => n^2 complexity. The inner loop inside the outer loop results in 312.487.500 iterations in your case as also the performance tracing shows.
I created work item on EF CodePlex for this issue.
This is not answer as I don't have a solution to improve the performance, but the comment section doesn't have enough space for the following. I just want to add a few additional tests and observations.
First, I could reproduce your measured times almost exactly for all seven tests. I have used EF 4.1 for the test.
Some interesting things to note:
From (the fast) test 2 I would conclude that object materialization (converting the rows and columns returned from the database server into objects) isn't slow.
This is also confirmed by loading the entities in test 3 without change tracking:
parents = entities.Parents.AsNoTracking().ToList();
// ...
children = entities.Childs.AsNoTracking().ToList();
This code runs fast although 25001 objects have to be materialized as well (but no relationships between the navigation properties will be established!).
Also from (the fast) test 2 I would conclude that creating entity snapshots for change tracking isn't slow.
In tests 3 and 4 the relationships between the parent and the 25000 children get fixed up when the entities are loaded from the database, i.e. EF adds all the Child
entities to the parent's Childs
collection and sets the Parent
in each child to the loaded parent. Apparently this step is slow, as you already guessed:
I think the problem is the wire up of the navigation properties between the Parent and the children.
Especially the collection side of the relationship seems to be the problem: If you comment out the Childs
navigation property in the Parent
class (the relationship is still a required one-to-many relationship then) tests 3 and 4 are fast, although EF still sets the Parent
property for all 25000 Child
entities.
I don't know why filling up the navigation collection during relationship fixup is so slow. If you simulate it manually in a naive manner, like so...
entities.Configuration.ProxyCreationEnabled = false;
children = entities.Childs.AsNoTracking().ToList();
parents = entities.Parents.AsNoTracking().ToList();
parents[0].Childs = new List<Child>();
foreach (var c in children)
{
if (c.ParentId == parents[0].ParentId)
{
c.Parent = parents[0];
parents[0].Childs.Add(c);
}
}
... it goes fast. Obviously relationship fixup internally doesn't work this simple way. Maybe it needs to be checked if the collection already contains the child to be tested:
foreach (var c in children)
{
if (c.ParentId == parents[0].ParentId)
{
c.Parent = parents[0];
if (!parents[0].Childs.Contains(c))
parents[0].Childs.Add(c);
}
}
This is significantly slower (around 4 seconds).
Anyway, relationship fixup seems to be the performance botteneck. I don't know how to improve it if you need change tracking and correct relations between your attached entities.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With