Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Entity framework remove object from context, but not from database

I am working on a batch process which dumps ~800,000 records from a slow legacy database (1.4-2ms per record fetch time...it adds up) into MySQL which can perform a little faster. To optimize this, I have been loading all of the MySQL records into memory which puts usage to about 200MB. Then, I start dumping from the legacy database and updating the records.

Originally, when this would complete updating the records I would then call SaveContext which would then make my memory jump from ~500MB-800MB to 1.5GB. Very soon, I would get out of memory exceptions (the virtual machine this is running on has 2GB of RAM) and even if I were to give it more RAM, 1.5-2GB is still a little excessive and that would be just putting a band-aid on the problem. To remedy this, I started calling SaveContext every 10,000 records which helped things along a bit and since I was using delegates to fetch the data from the legacy database and update it in MySQL I didn't receive too horrible a hit in performance since after the 5 second or so wait while it was saving it would then run through the update in memory for the 3000 or so records that had backed up. However, the memory usage still keeps going up.

Here are my potential issues:

  • The data comes out of the legacy database in any order, so I can't chunk the updates and periodically release the ObjectContext.
  • If I don't grab all of the data out of MySQL beforehand and instead look it up during the update process by record, it is incredibly slow. I instead grab it all beforehand, cast it to a dictionary indexed by the primary key, and as I update the data I remove the records from the dictionary.

One possible solution I thought of is to somehow free the memory being used by entities that I know I will never touch again since they have already been updated (like clearing the cache, but only for a specific item), but I don't know if that is even possible with Entity Framework.

Does anyone have any thoughts?

like image 262
Los Frijoles Avatar asked Jun 29 '12 00:06

Los Frijoles


2 Answers

You can call the Detach method on the context passing it the object you no longer need: http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.detach%28v=vs.90%29.aspx

like image 184
fyjham Avatar answered Oct 20 '22 09:10

fyjham


I'm wondering if your best bet isn't another tool as previously suggested or just forgoing the use of Entity Framework. If you instead do the code without an ORM, you can:

  1. Tune the SQL statements to improve performance
  2. Easily control, and change, the scope of transactions to get the best performance.
  3. You can batch the updates so that you aren't calling the server to accomplish multiple updates instead of them being performed one at a time.
like image 39
Jeff Siver Avatar answered Oct 20 '22 09:10

Jeff Siver