Parallelism and the Entity Framework

It's very common in our web applications to need data from a variety of tables in our database. Today you might find 5 or 6 database queries being executed serially for a single request. None of these queries depends on data from the others, so they are perfect candidates for being executed in parallel. The problem is the well-known exception that is thrown when multiple queries are executed concurrently against the same context (in EF6 this is a NotSupportedException saying that a second operation started on the context before a previous asynchronous operation completed).

We typically use a single context per request and then have a repository class so we can reuse queries across various projects. We then dispose of the context at the end of the request when the controller is disposed.

Below is an example which uses parallelism, but there's still a problem!

var fileTask = new Repository().GetFile(id);
var filesTask = new Repository().GetAllFiles();
var productsTask = AllProducts();
var versionsTask = new Repository().GetVersions();
var termsTask = new Repository().GetTerms();

await Task.WhenAll(fileTask, filesTask, productsTask, versionsTask, termsTask);

Each repository is internally creating its own context, but as it is now, they aren't being disposed. That's a problem. I know I could call Dispose on each repository that I create, but that starts to clutter the code quickly. I could create a wrapper function for each query which uses its own context, but that feels messy and isn't a great long term solution for the problem.

What would be the best way to address this problem? I'd like the client/consumer to not have to worry about disposing each repository/context in the case of having multiple queries executed in parallel.

The only idea I have right now is to follow an approach similar to a factory pattern, except my factory would keep track of all the objects it created. I could then dispose of the factory once I know my queries are finished and the factory could internally dispose of each repository/context.

I'm surprised to see so little discussion around parallelism and the Entity Framework, so hopefully some more ideas from the community will come in.

Edit

Here is a simple example of what our repository looks like:

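public class Repository : IDisposable {
    // Each repository owns exactly one context for its whole lifetime.
    private readonly Context context;

    public Repository() {
        this.context = new Context();
        this.context.Configuration.LazyLoadingEnabled = false;
    }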

    public async Task<File> GetFile(int id) {
        return await this.context.Files.FirstOrDefaultAsync(f => f.Id == id);
    }

    private bool disposed = false;

    protected virtual void Dispose(bool disposing) {
        if (!this.disposed) {
            if (disposing) {
                context.Dispose();
            }
        }
        this.disposed = true;
    }

    public void Dispose() {
        Dispose(true);
        GC.SuppressFinalize(this);
    }
}

As you can see, each repository gets its own context. This means that each repository needs to be disposed of. In the example I gave above, that means I would need 4 calls to Dispose().

My idea for a factory approach to the problem was something like the following:

// Assumes Repository implements IRepository and that IRepository extends IDisposable.
public class RepositoryFactory : IDisposable {
    private List<IRepository> repositories;

    public RepositoryFactory() {
        this.repositories = new List<IRepository>();
    }

    public IRepository CreateRepository() {
        var repo = new Repository();
        this.repositories.Add(repo);
        return repo;            
    }

    #region Dispose
    private bool disposed = false;

    protected virtual void Dispose(bool disposing) {
        if (!this.disposed) {
            if (disposing) {
                foreach (var repo in repositories) {
                    repo.Dispose();
                }
            }
        }
        this.disposed = true;
    }

    public void Dispose() {
        Dispose(true);
        GC.SuppressFinalize(this);
    }
    #endregion
}

This factory would be responsible for creating instances of my repository, but it would also keep track of all the instances it has created. Once this single factory class is disposed of it would internally be responsible for disposing of each repository that it created.
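From the consumer's side, the idea above might look roughly like the sketch below. It is not the real EF-backed code: `StubRepository`, `TrackingFactory` and `FactoryDemo` are hypothetical stand-ins that only record disposal, so the pattern can be tried without a database. The point is that a single `using` around the factory replaces one `Dispose()` call per repository.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the repository interface (not the real one).
public interface IRepository : IDisposable
{
    Task<string> GetFile(int id);
}

// Hypothetical repository that only records whether it was disposed.
public class StubRepository : IRepository
{
    public bool IsDisposed { get; private set; }

    public async Task<string> GetFile(int id)
    {
        await Task.Yield(); // simulate asynchronous database work
        return "file-" + id;
    }

    public void Dispose() { IsDisposed = true; }
}

// Same shape as the factory above: it remembers what it created.
public class TrackingFactory : IDisposable
{
    private readonly List<StubRepository> repositories = new List<StubRepository>();

    public IReadOnlyList<StubRepository> Created { get { return repositories; } }

    public IRepository CreateRepository()
    {
        var repo = new StubRepository();
        repositories.Add(repo);
        return repo;
    }

    public void Dispose()
    {
        foreach (var repo in repositories) repo.Dispose();
    }
}

public static class FactoryDemo
{
    public static async Task<TrackingFactory> RunAsync()
    {
        var factory = new TrackingFactory();
        using (factory)
        {
            // One repository (and so one context) per parallel query.
            var first = factory.CreateRepository().GetFile(1);
            var second = factory.CreateRepository().GetFile(2);
            await Task.WhenAll(first, second);
        } // every repository the factory created is disposed here
        return factory;
    }
}
```

Note that the backing `List<T>` is not thread-safe, so this sketch assumes `CreateRepository()` is only called from the requesting code path, before the queries are awaited, as it is here.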

Justin Helgerson asked Apr 15 '15 15:04

1 Answer

You could allow clients to configure the disposal behavior of Repository by passing some sort of optional (false by default) autodispose bit to the constructor. An implementation would look something like this:

public class Repository : IDisposable
{
    private readonly bool _autodispose = false;
    private readonly Lazy<Context> _context = new Lazy<Context>(CreateContext);

    public Repository(bool autodispose = false) {
        _autodispose = autodispose;
    }

    public Task<File> GetFile(int id) {
        // public query methods are still one-liners
        return WithContext(c => c.Files.FirstOrDefaultAsync(f => f.Id == id));
    }

    private async Task<T> WithContext<T>(Func<Context, Task<T>> func) {
        if (_autodispose) {
            using (var c = CreateContext()) {
                return await func(c);
            }
        }
        else {
            return await func(_context.Value);
        }
    }

    private static Context CreateContext() {
        var c = new Context();
        c.Configuration.LazyLoadingEnabled = false;
        return c;
    }

    public void Dispose() {
        if (_context.IsValueCreated)
            _context.Value.Dispose();
    }
}

Note: I kept the disposal logic simple for illustration; you may need to work your disposed bits back in.

Your query methods are still simple one-liners, and the client can very easily configure the disposal behavior as needed, and even re-use a Repository instance in auto-disposal situations:

var repo = new Repository(autodispose: true);
var fileTask = repo.GetFile(id);
var filesTask = repo.GetAllFiles();
var productsTask = AllProducts();
var versionsTask = repo.GetVersions();
var termsTask = repo.GetTerms();

await Task.WhenAll(fileTask, filesTask, productsTask, versionsTask, termsTask);
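To check that the auto-dispose path really gives each query its own short-lived context, here is a minimal sketch with the Entity Framework parts swapped out. `FakeContext`, `AutoDisposeRepository` and `AutoDisposeDemo` are hypothetical names introduced only for this illustration; `FakeContext` just counts how often it is created and disposed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the EF Context; counts creations and disposals.
public class FakeContext : IDisposable
{
    public static int Created;
    public static int Disposed;

    public FakeContext() { Interlocked.Increment(ref Created); }

    public void Dispose() { Interlocked.Increment(ref Disposed); }

    public async Task<string> FindAsync(int id)
    {
        await Task.Delay(10); // simulate a database round trip
        return "file-" + id;
    }
}

// Only the auto-dispose branch of WithContext is reproduced here.
public class AutoDisposeRepository
{
    public Task<string> GetFile(int id)
    {
        return WithContext(c => c.FindAsync(id));
    }

    private static async Task<T> WithContext<T>(Func<FakeContext, Task<T>> func)
    {
        using (var c = new FakeContext())
        {
            // Awaiting inside the using block keeps the context alive
            // until the query has finished.
            return await func(c);
        }
    }
}

public static class AutoDisposeDemo
{
    public static async Task<string[]> RunAsync()
    {
        var repo = new AutoDisposeRepository();
        // Three queries in flight at once on one repository instance.
        return await Task.WhenAll(repo.GetFile(1), repo.GetFile(2), repo.GetFile(3));
    }
}
```

One caveat with this approach is that the context is gone by the time the caller sees the result, so anything that relies on a live context afterwards (lazy loading, change tracking followed by a save) would not work in auto-dispose mode.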
Todd Menier answered Sep 28 '22 23:09