
What is the most efficient way to delete a large number (10,000+) of objects in Core Data?

Tags:

ios

core-data

The way I'm trying to delete multiple sets of 10,000+ NSManagedObjects is too memory intensive (around 20MB of live bytes), and my app is being jettisoned. Here is the implementation of the delete method:

+ (void)deleteRelatedEntitiesInManagedObjectContext:(NSManagedObjectContext *)context 
{
    NSFetchRequest *fetch = [[NSFetchRequest alloc] init];
    [context setUndoManager:nil];

    [fetch setEntity:[NSEntityDescription entityForName:NSStringFromClass(self) inManagedObjectContext:context]];
    [fetch setIncludesPropertyValues:NO];

    NSError *error = nil;
    NSArray *entities = [context executeFetchRequest:fetch error:&error];

    NSInteger deletedCount = 0;
    for (NSManagedObject *item in entities) {
        [context deleteObject:item];
        deletedCount++;

        if (deletedCount == 500) {
            [context save:&error];
            deletedCount = 0;
        }
    }

    if (deletedCount != 0) {
        [context save:&error];
    }
}

I've tried -setFetchBatchSize:, but that used even more memory.

What would be a more memory-efficient way to do this?

Victor Bogdan asked May 14 '12 10:05



2 Answers

EDIT: I just watched the WWDC 2015 session "What's New in Core Data" (it's always the first video I watch, but I've been very busy this year), and they announced a new API, NSBatchDeleteRequest, that should be much more efficient than any previous solution.
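As a minimal sketch of how NSBatchDeleteRequest might be used (not taken from the session itself): the helper name BatchDeleteAllObjects and its parameters are hypothetical. It assumes iOS 9 / OS X 10.11 or later, an SQLite persistent store, and that it is called on the context's own queue. A batch delete works at the store level, so objects already loaded in a context are not updated automatically; the sketch merges the deleted object IDs back into the context for that reason.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: deletes every object of `entityName` with one batch
// delete. Assumes an SQLite store and iOS 9 / OS X 10.11 or later.
static BOOL BatchDeleteAllObjects(NSString *entityName,
                                  NSManagedObjectContext *context,
                                  NSError **error)
{
    NSFetchRequest *fetch = [NSFetchRequest fetchRequestWithEntityName:entityName];
    NSBatchDeleteRequest *request =
        [[NSBatchDeleteRequest alloc] initWithFetchRequest:fetch];
    // Ask for the deleted object IDs so in-memory contexts can be updated.
    request.resultType = NSBatchDeleteResultTypeObjectIDs;

    NSBatchDeleteResult *result =
        (NSBatchDeleteResult *)[context executeRequest:request error:error];
    if (result == nil) {
        return NO;
    }

    // The batch delete bypasses the context, so merge the deletions back in.
    NSArray<NSManagedObjectID *> *deletedIDs = result.result;
    if (deletedIDs.count > 0) {
        [NSManagedObjectContext
            mergeChangesFromRemoteContextSave:@{NSDeletedObjectsKey : deletedIDs}
                                 intoContexts:@[context]];
    }
    return YES;
}
```

Because the managed objects are never loaded into memory, this avoids the fetch-delete-save loop entirely, at the cost of skipping the context's normal change tracking.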


Efficient has multiple meanings, and most often means some sort of trade-off. Here, I assume you just want to contain memory while deleting.

Core Data has lots of performance options, beyond the scope of any single SO question.

How memory is managed depends on the settings for your managedObjectContext and fetchRequest. Look at the docs to see all the options. In particular, though, you should keep these things in mind.

Also, keep in mind the performance aspect. This type of operation should be performed on a separate thread.

Also, note that the rest of your object graph will come into play, because of how Core Data handles deletion of related objects.

Regarding memory consumption, there are two properties on MOC in particular to pay attention to. While there is a lot here, it is by no means close to comprehensive. If you want to actually see what is happening, NSLog your MOC just before and after each save operation. In particular, log registeredObjects and deletedObjects.

  1. The MOC has a list of registered objects. By default, it does not retain registered objects. However, if retainsRegisteredObjects is YES, it will retain all registered objects.

  2. For deletes in particular, propagatesDeletesAtEndOfEvent tells the MOC when to handle related objects. If you want them handled with the save, you need to set that value to NO. Otherwise, it will wait until the current event is done.

  3. If you have really large object sets, consider using fetchLimit. While faults do not take a lot of memory, they still take some, and many thousands at a time are not insignificant. It means more fetching, but you will limit the amount of memory in use at any one time.

  4. Also consider, any time you have large internal loops, you should be using your own autorelease pool.

  5. If this MOC has a parent, saving only moves those changes to the parent. In this case, if you have a parent MOC, you are just making that one grow.
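The two context properties and the logging suggested above can be sketched roughly as follows. The helper names ConfigureContextForBulkDelete and SaveAndLogCounts are hypothetical; only the NSManagedObjectContext properties and methods they call are real API. Both helpers are assumed to run on the context's own queue.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: applies the settings discussed in items 1 and 2.
static void ConfigureContextForBulkDelete(NSManagedObjectContext *context)
{
    context.undoManager = nil;                   // no undo bookkeeping
    context.retainsRegisteredObjects = NO;       // the default, stated explicitly
    context.propagatesDeletesAtEndOfEvent = NO;  // handle related objects at save
}

// Hypothetical helper: saves `context` and logs how many objects it is
// tracking just before and just after the save.
static BOOL SaveAndLogCounts(NSManagedObjectContext *context, NSError **error)
{
    NSLog(@"before save: registered=%lu deleted=%lu",
          (unsigned long)context.registeredObjects.count,
          (unsigned long)context.deletedObjects.count);

    BOOL ok = [context save:error];

    NSLog(@"after save: registered=%lu deleted=%lu",
          (unsigned long)context.registeredObjects.count,
          (unsigned long)context.deletedObjects.count);
    return ok;
}
```

Comparing the two log lines for each batch should show whether the context is actually letting go of objects after a save or is accumulating them.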

For restricting memory, consider this. It is not necessarily best for your case: there are lots of Core Data options, and only you know what is best for your situation, based on all the options you are using elsewhere.

I wrote a category on NSManagedObjectContext that I use for saving when I want to make sure the save goes to the backing store, very similar to this. If you do not use a MOC hierarchy, you don't need it, but... there is really no reason NOT to use a hierarchy (unless you are bound to old iOS).

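- (BOOL)cascadeSave:(NSError **)error {
    // Save this context first, then walk up the parent chain so the
    // changes eventually reach the backing store.
    __block BOOL saveResult = YES;
    // Keep the error in a strong local: an autoreleasing out-parameter
    // written inside the block may not survive until the caller reads it.
    __block NSError *localError = nil;
    if ([self hasChanges]) {
        saveResult = [self save:&localError];
    }
    NSManagedObjectContext *parent = self.parentContext;
    if (saveResult && parent) {
        [parent performBlockAndWait:^{
            NSError *parentError = nil;
            saveResult = [parent cascadeSave:&parentError];
            localError = parentError;
        }];
    }
    if (!saveResult && error) {
        *error = localError;
    }
    return saveResult;
}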

I modified your code a little bit...

+ (void)deleteRelatedEntitiesInManagedObjectContext:(NSManagedObjectContext *)context 
{
    NSFetchRequest *fetch = [[NSFetchRequest alloc] init];
    [context setUndoManager:nil];

    [fetch setEntity:[NSEntityDescription entityForName:NSStringFromClass(self) inManagedObjectContext:context]];
    [fetch setIncludesPropertyValues:NO];
    // Fetch at most 500 objects per pass so only one batch is registered at a time.
    [fetch setFetchLimit:500];

    NSError *error = nil;
    NSArray *entities = [context executeFetchRequest:fetch error:&error];
    while ([entities count] > 0) {
        // Release the temporary objects created by each batch before the next one.
        @autoreleasepool {
            for (NSManagedObject *item in entities) {
                [context deleteObject:item];
            }
            if (![context cascadeSave:&error]) {
                // Handle error appropriately; stop so a failing save cannot loop forever.
                break;
            }
        }
        // Fetch the next batch; a nil result (fetch error) also ends the loop.
        entities = [context executeFetchRequest:fetch error:&error];
    }
}
Jody Hagins answered Nov 15 '22 07:11



In a moment of inspiration, I removed [fetch setIncludesPropertyValues:NO]; and it was good. From the docs:

During a normal fetch (includesPropertyValues is YES), Core Data fetches the object ID and property data for the matching records, fills the row cache with the information, and returns managed object as faults (see returnsObjectsAsFaults). These faults are managed objects, but all of their property data still resides in the row cache until the fault is fired. When the fault is fired, Core Data retrieves the data from the row cache—there is no need to go back to the database.

I managed to reduce allocated live bytes to ~13MB, which is better.

Victor Bogdan answered Nov 15 '22 05:11
