How to avoid memory overflow when querying large datasets with Entity Framework and LINQ

I have a class that handles all database methods, including Entity Framework related stuff. When data is needed, other classes may invoke a method in this class such as

public List<LocalDataObject> GetData(int start, int end);

The database is queried using LINQ to EF, and the calling class can then iterate over the data. But since other classes have no access to the entities in EF, I need to call "ToList()" on the query, which fetches the full dataset into memory.

What will happen if this set is VERY large (10s-100s of GB)?

Is there a more efficient way of iterating that still maintains loose coupling?

573

asked May 08 '11 12:05

Saul

2 Answers

The correct way to work with large datasets in Entity Framework is:

Use EFv4 and POCO objects. This allows sharing objects with the upper layer without introducing a dependency on Entity Framework.
Turn off proxy creation and lazy loading to fully detach the POCO entities from the object context.
Expose IQueryable<EntityType> so the upper layer can specify the query more precisely and limit the number of records loaded from the database.
When exposing IQueryable, set MergeOption.NoTracking on the ObjectQuery in your data access method. Combined with disabled proxy creation, this should prevent entities from being cached, so iterating over the query result should materialize only one entity at a time.
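As a rough illustration of the points above, here is a minimal sketch against the EF4 ObjectContext API. The DataAccess class, the Query method and the shape of LocalDataObject are my own placeholders, and it assumes LocalDataObject is mapped as a POCO entity in your model. It needs a reference to System.Data.Entity and a real context to run.

```csharp
using System.Data.Objects;   // EF4 ObjectContext API (System.Data.Entity.dll)
using System.Linq;

// Hypothetical POCO entity; assumed to be mapped in the EF model.
public class LocalDataObject
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical data access class that hides EF behind IQueryable<T>.
public class DataAccess
{
    private readonly ObjectContext _context;

    public DataAccess(ObjectContext context)
    {
        _context = context;
        // No dynamic proxies and no lazy loading, so callers get plain POCOs.
        _context.ContextOptions.ProxyCreationEnabled = false;
        _context.ContextOptions.LazyLoadingEnabled = false;
    }

    public IQueryable<LocalDataObject> Query()
    {
        ObjectSet<LocalDataObject> set = _context.CreateObjectSet<LocalDataObject>();
        // Do not attach materialized entities to the context's state manager.
        set.MergeOption = MergeOption.NoTracking;
        return set;
    }
}
```

A caller could then write something like foreach (var item in dataAccess.Query().Where(x => x.Id >= start && x.Id <= end)) and process each item as it arrives, instead of calling ToList() first.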

In your simple scenario you can always check that the client doesn't ask for too many records, and either throw an exception or return only the maximum allowed number of records.
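Both variants of that check could look roughly like this. QueryLimits, MaxRecords and the method names are made up for the example, and it assumes start and end are inclusive record positions, which the question does not actually state.

```csharp
using System;
using System.Linq;

public static class QueryLimits
{
    // Hypothetical ceiling; pick a value that suits the application.
    public const int MaxRecords = 1000;

    // Strict variant: reject a range that would return more than MaxRecords rows.
    // Assumes start and end are inclusive positions.
    public static void EnsureWithinLimit(int start, int end)
    {
        long requested = (long)end - start + 1;
        if (requested > MaxRecords)
        {
            throw new ArgumentOutOfRangeException(
                "end",
                "Requested " + requested + " records; the maximum is " + MaxRecords + ".");
        }
    }

    // Lenient variant: return at most MaxRecords rows from the query.
    public static IQueryable<T> Cap<T>(IQueryable<T> query)
    {
        return query.Take(MaxRecords);
    }
}
```

GetData could call EnsureWithinLimit(start, end) before building the query, or wrap the query in Cap(...) before calling ToList().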

135

answered Sep 19 '22 05:09

Ladislav Mrnka

As much as I like EF for quick/simple data access, I probably wouldn't use it for such a scenario. When dealing with data of that size I'd opt for stored procedures that return exactly what you need, and nothing extra. Then use a lightweight DataReader to populate your objects.

The DataReader provides an unbuffered stream of data that allows procedural logic to efficiently process results from a data source sequentially. The DataReader is a good choice when retrieving large amounts of data because the data is not cached in memory.

Additionally, as far as memory management goes, make sure you wrap anything that holds unmanaged resources (connections, commands, readers) in a using block, so it is disposed deterministically instead of waiting for the garbage collector.
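To show what streaming from a DataReader might look like, here is a small sketch that maps rows to objects one at a time. The StreamingReader class and the "Id" and "Name" column names are assumptions for the example, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical target type; the columns "Id" and "Name" are assumed.
public class LocalDataObject
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class StreamingReader
{
    // Yields one object per row, so only the current row is held in memory.
    // The reader is disposed when enumeration finishes or is abandoned.
    public static IEnumerable<LocalDataObject> ReadAll(IDataReader reader)
    {
        using (reader)
        {
            int idCol = reader.GetOrdinal("Id");
            int nameCol = reader.GetOrdinal("Name");
            while (reader.Read())
            {
                yield return new LocalDataObject
                {
                    Id = reader.GetInt32(idCol),
                    Name = reader.IsDBNull(nameCol) ? null : reader.GetString(nameCol)
                };
            }
        }
    }
}
```

In a real application the reader would typically come from SqlCommand.ExecuteReader() on a command whose CommandType is CommandType.StoredProcedure, with the connection and command in their own using blocks. Callers only see IEnumerable<LocalDataObject>, so the loose coupling is kept.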

You also may want to consider implementing paging.
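A paging helper could be sketched like this. Paging and InPages are hypothetical names. Note that LINQ to Entities requires an OrderBy before Skip, which is why the helper takes a sort key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class Paging
{
    // Yields the query result page by page; only one page is in memory at a time.
    public static IEnumerable<List<T>> InPages<T, TKey>(
        IQueryable<T> source,
        Expression<Func<T, TKey>> orderBy,
        int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");

        // A stable ordering is needed so pages do not overlap or skip rows.
        IOrderedQueryable<T> ordered = source.OrderBy(orderBy);

        for (int page = 0; ; page++)
        {
            int skip = page * pageSize; // may overflow int for extremely large sets
            List<T> items = ordered.Skip(skip).Take(pageSize).ToList();
            if (items.Count == 0) yield break;
            yield return items;
            if (items.Count < pageSize) yield break; // last, partial page
        }
    }
}
```

Each page is a separate database query, so this trades one huge result set for many small ones. If rows are inserted or deleted between queries, pages can shift, so this is only a starting point.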

answered Sep 19 '22 05:09

Kon


Tags:

c#

linq

entity-framework
