
Improve speed for updating existing records (~11.000) in Core Data

I'm parsing a large amount of XML data, which I initially insert into a Core Data store.

At a later point, I parse the same XML again, though some of it may have been updated. I then check for an existing record with the same tag and, if one already exists, update that record with the new data.

However, while my initial parsing (about 11.000 records) takes 8 seconds or so, updating seems expensive and takes 144 seconds (these are Simulator runs, so significantly longer on actual devices).

While the first time is fine (I'm showing a progress bar), the second is unacceptably long, and I would like to do something to improve the speed (even though it happens in the background on a separate thread).

Unfortunately it's not a matter of find-or-create as the data in the XML may have changed for individual records, so each could essentially need an update.

I've indexed the attributes, which sped up both the initial parsing and the updating, but it's still slow (the numbers above are with indexing). What I have noticed is that the parsing/updating seems to slow down gradually: it starts out fast, then gets slower and slower as more records are processed.

So finally, my question is whether anyone has suggestions for how I could improve the speed at which I update my dataset. I am using MagicalRecord to fetch the record. Here's the code:

Record *record;
if (!isUpdate) {
    record = [NSEntityDescription insertNewObjectForEntityForName:@"Record" inManagedObjectContext:backgroundContext];
} else {
    NSPredicate *recordPredicate = [NSPredicate predicateWithFormat:@"SELF.tag == %@", [[node attributeForName:@"tag"] stringValue]];
    record = [Record findFirstWithPredicate:recordPredicate];
}
runmad asked May 01 '12 17:05



2 Answers

Instead of doing tons of fetches, do one query for each entity type and store the results in a dictionary keyed by tag, then just check the dictionary for an object with that key. You should be able to set propertiesToFetch to include only the tag, which should reduce overhead.
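
A minimal sketch of that idea, assuming the "Record" entity and string attribute "tag" from the question. The helper name RecordsByTag is hypothetical, and the function is meant to be called on the queue or thread that owns the context:

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: fetch every Record once and index the results by tag.
// Assumes an entity named "Record" with a string attribute "tag".
static NSDictionary *RecordsByTag(NSManagedObjectContext *context) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Record"];
    // Ask only for the tag up front; other attributes are loaded on demand.
    request.propertiesToFetch = @[@"tag"];

    NSError *error = nil;
    NSArray *records = [context executeFetchRequest:request error:&error];
    if (records == nil) {
        NSLog(@"Fetch failed: %@", error);
        return @{};
    }

    NSMutableDictionary *byTag =
        [NSMutableDictionary dictionaryWithCapacity:records.count];
    for (NSManagedObject *record in records) {
        NSString *tag = [record valueForKey:@"tag"];
        if (tag != nil) {
            byTag[tag] = record;
        }
    }
    return byTag;
}
```

In the parsing loop, the per-node fetch would then become a dictionary lookup such as `Record *record = recordsByTag[tagString];`, inserting a new object only when the lookup returns nil.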

Senior answered Nov 07 '22 22:11



You could also try a combination of Senior's answer with hashing of the properties.

On insert, hash the properties and store that hash as a sort of checksum property on the Record.
On update, set the properties to fetch to tag and checksum and do one fetch of all the items. Then, as you iterate over your data set, if the checksum computed from the parsed data differs from the one that was fetched, fetch that Record and update it.
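
The two steps above could be sketched roughly as follows. The attribute name "checksum", both helper names, and the choice of SHA-256 are assumptions made for illustration; any stable hash over the parsed values would serve the same purpose:

```objc
#import <CoreData/CoreData.h>
#import <CommonCrypto/CommonDigest.h>

// Hypothetical helper: SHA-256 over the parsed attribute values (strings),
// joined in a fixed order so the same data always yields the same checksum.
static NSString *ChecksumForValues(NSArray *values) {
    // The unit separator (0x1F) keeps ["ab", "c"] distinct from ["a", "bc"].
    NSString *joined = [values componentsJoinedByString:@"\x1f"];
    NSData *data = [joined dataUsingEncoding:NSUTF8StringEncoding];

    unsigned char digest[CC_SHA256_DIGEST_LENGTH];
    CC_SHA256(data.bytes, (CC_LONG)data.length, digest);

    NSMutableString *hex =
        [NSMutableString stringWithCapacity:CC_SHA256_DIGEST_LENGTH * 2];
    for (int i = 0; i < CC_SHA256_DIGEST_LENGTH; i++) {
        [hex appendFormat:@"%02x", digest[i]];
    }
    return hex;
}

// Hypothetical helper: one fetch that returns only tag and checksum as plain
// dictionaries, so no managed objects are materialized for unchanged records.
// Assumes "Record" has string attributes "tag" and "checksum", and that
// earlier changes have been saved (dictionary results come from the store).
static NSDictionary *ChecksumsByTag(NSManagedObjectContext *context) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Record"];
    request.resultType = NSDictionaryResultType;
    request.propertiesToFetch = @[@"tag", @"checksum"];

    NSError *error = nil;
    NSArray *rows = [context executeFetchRequest:request error:&error];
    if (rows == nil) {
        NSLog(@"Fetch failed: %@", error);
        return @{};
    }

    NSMutableDictionary *checksums =
        [NSMutableDictionary dictionaryWithCapacity:rows.count];
    for (NSDictionary *row in rows) {
        NSString *tag = row[@"tag"];
        NSString *checksum = row[@"checksum"];
        if (tag != nil && checksum != nil) {
            checksums[tag] = checksum;
        }
    }
    return checksums;
}
```

While iterating over the parsed nodes, a record would only need to be fetched and updated when `ChecksumForValues(parsedValues)` differs from `checksums[tag]`; a tag missing from the dictionary indicates a new record to insert.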

auibrian answered Nov 08 '22 00:11
