What is the correct way of repository implementation in EF Core?
public IAsyncEnumerable<Order> GetOrder(int orderId)
{
    return blablabla.AsAsyncEnumerable();
}
or
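public async Task<IEnumerable<Order>> GetOrder(int orderId)
{
    // Task<List<Order>> does not convert to Task<IEnumerable<Order>>, so await the list here.
    return await blablabla.ToListAsync();
}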
Is it worthwhile, performance-wise, to call AsAsyncEnumerable()? Is this approach safe?
On the one hand, it doesn't create a List<T> object, so it should be slightly faster. On the other hand, the query is not materialized, so we defer the SQL execution and the result can change in the meantime.
According to the source, .ToListAsync uses IAsyncEnumerable internally anyway, so there is not much of a performance benefit either way.
But one important feature of .ToListAsync and .ToArrayAsync is that they accept a CancellationToken directly.
public static async Task<List<TSource>> ToListAsync<TSource>(
    this IQueryable<TSource> source,
    CancellationToken cancellationToken = default)
{
    var list = new List<TSource>();
    await foreach (var element in source.AsAsyncEnumerable().WithCancellation(cancellationToken))
    {
        list.Add(element);
    }
    return list;
}
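The loop above also shows that cancellation is not exclusive to the buffering methods: WithCancellation can be applied to a streamed IAsyncEnumerable as well. The following is a minimal sketch of that idea without EF Core. The names StreamingCancellationDemo, GetOrderIdsAsStream and CountUntilCancelledAsync are hypothetical, and the endless counter is only a stand-in for a real query.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class StreamingCancellationDemo
{
    // Hypothetical stand-in for a streaming repository method. A real one would
    // enumerate an EF Core query instead of counting upwards forever.
    public static async IAsyncEnumerable<int> GetOrderIdsAsStream(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        for (var id = 1; ; id++)
        {
            // The token passed via WithCancellation arrives here thanks to
            // the [EnumeratorCancellation] attribute.
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Yield();
            yield return id;
        }
    }

    // Consumes the stream, cancels after the given number of items,
    // and returns how many items were actually received.
    public static async Task<int> CountUntilCancelledAsync(int cancelAfterItems)
    {
        using var cts = new CancellationTokenSource();
        var seen = 0;
        try
        {
            await foreach (var id in GetOrderIdsAsStream().WithCancellation(cts.Token))
            {
                seen++;
                if (seen == cancelAfterItems)
                {
                    cts.Cancel();
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Expected: the iterator observed the cancelled token on its next step.
        }
        return seen;
    }
}
```

Calling CountUntilCancelledAsync(3) stops the otherwise endless stream after three items, so a streaming repository method does not have to give up cancellation.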
A List holds everything in memory, but that is a serious performance concern only if the list is really big. In that case, consider paging the response.
public Task<List<Order>> GetOrders(int orderId, int offset, int limit)
{
    // Assumes the elided query has a stable OrderBy; otherwise pages may overlap.
    return blablabla.Skip(offset).Take(limit).ToListAsync();
}
The decision really comes down to whether you wish to buffer or stream.
If you want to buffer the results, use ToList() or ToListAsync().
If you want to stream the results, use AsEnumerable() or AsAsyncEnumerable().
From the docs:
Buffering refers to loading all your query results into memory, whereas streaming means that EF hands the application a single result each time, never containing the entire resultset in memory. In principle, the memory requirements of a streaming query are fixed - they are the same whether the query returns 1 row or 1000; a buffering query, on the other hand, requires more memory the more rows are returned. For queries that result large resultsets, this can be an important performance factor.
In general, it's best to stream, unless you need to buffer.
When you stream, once the data is read, you can't read it again without hitting the DB again. So if you need to read the same data more than once, you'll need to buffer.
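As a rough illustration of the point about reading the same data twice, the sketch below counts how often a simulated source is opened. ReEnumerationDemo, GetOrderNamesAsStream and SourceReads are hypothetical names, and the in-memory iterator only stands in for a database query.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ReEnumerationDemo
{
    // Counts how many times the simulated data source was opened.
    public static int SourceReads;

    // Hypothetical streaming source. Every enumeration runs the body again,
    // much as enumerating a streamed query again goes back to the database.
    public static async IAsyncEnumerable<string> GetOrderNamesAsStream()
    {
        SourceReads++;
        foreach (var name in new[] { "A", "B", "C" })
        {
            await Task.Yield();
            yield return name;
        }
    }

    public static async Task<(int StreamedReads, int BufferedReads)> RunAsync()
    {
        // Streaming: reading the same stream twice opens the source twice.
        SourceReads = 0;
        var stream = GetOrderNamesAsStream();
        await foreach (var name in stream) { }
        await foreach (var name in stream) { }
        var streamedReads = SourceReads;

        // Buffering: the source is opened once, the list is read as often as needed.
        SourceReads = 0;
        var buffer = new List<string>();
        await foreach (var name in GetOrderNamesAsStream())
        {
            buffer.Add(name);
        }
        foreach (var name in buffer) { }
        foreach (var name in buffer) { }
        var bufferedReads = SourceReads;

        return (streamedReads, bufferedReads);
    }
}
```

RunAsync reports two source reads for the streamed case and one for the buffered case, which is the trade-off described above.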
If a repository streams an IEnumerable, the caller can choose to buffer it by calling ToList() (or ToListAsync() on an IAsyncEnumerable). We lose this flexibility if the repository chooses to return an IList.
So, to answer your question, you're better off having the repository stream the result and letting the caller decide whether to buffer.
If the team working on the project is not comfortable with stream semantics, or if most of the code already buffers, it might make sense to suffix the methods that stream with something like AsStream (e.g. GetOrdersAsStream()) so that callers know they shouldn't enumerate the result more than once.
So a repository could have:
async Task<List<Order>> GetOrders() => await GetOrdersAsStream().ToListAsync();
IAsyncEnumerable<Order> GetOrdersAsStream() => ...
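For reference, here is one way such a pair of methods might look when written out in full. This is only a sketch under assumptions: the Order record and the in-memory loop are hypothetical stand-ins for the real entity and query, and the buffering method uses a plain await foreach loop because ToListAsync() on IAsyncEnumerable<T> is not part of older base class libraries and typically comes from the System.Linq.Async package.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical entity used only for this sketch.
public record Order(int Id);

public class OrderRepository
{
    // Streaming variant: callers should enumerate the result only once.
    // The loop is an in-memory stand-in for a query exposed via AsAsyncEnumerable().
    public async IAsyncEnumerable<Order> GetOrdersAsStream()
    {
        for (var id = 1; id <= 3; id++)
        {
            await Task.Yield();
            yield return new Order(id);
        }
    }

    // Buffering variant: built on top of the stream, safe to read repeatedly.
    public async Task<List<Order>> GetOrders(CancellationToken cancellationToken = default)
    {
        var orders = new List<Order>();
        await foreach (var order in GetOrdersAsStream().WithCancellation(cancellationToken))
        {
            orders.Add(order);
        }
        return orders;
    }
}
```

Code that needs the data more than once calls GetOrders(), while code that processes each order a single time can use await foreach over GetOrdersAsStream() and never hold the whole result in memory.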