
Core Data find-or-create: most efficient way

I have around 10000 objects of entity 'Message'. When I add a new 'Message' I first want to see whether it already exists. If it does, I just update its data; if it doesn't, I create it.

Right now the "find-or-create" algorithm works by saving the 'objectID' of every Message object in one array, then filtering through them and getting the messages with existingObjectWithID:error:

This works fine, but in my case, when I fetch a 'Message' using existingObjectWithID:, set one of its properties, and call save: on its context, the change isn't saved properly. Has anyone come across a problem like this?

Is there a more efficient way to implement a find-or-create algorithm?

Devfly asked Mar 07 '14 10:03


3 Answers

First, Message is a "bad" name for a Core Data entity, as Apple uses it internally and it can cause problems later in development.
You can read a little more about it HERE

I've noticed that all the solutions suggested here use an array or a fetch request.
You might want to consider a dictionary-based solution ...

In a single-threaded, single-context application this can be done without much overhead: pre-populate a cache (a dictionary) that maps the key of each existing object to its objectID, and add newly inserted objects (of type Message) to it.

Consider this interface:

@interface UniquenessEnforcer : NSObject

@property (readonly,nonatomic,strong) NSPersistentStoreCoordinator* coordinator;
@property (readonly,nonatomic,strong) NSEntityDescription* entity;
@property (readonly,nonatomic,strong) NSString* keyProperty;
@property (nonatomic,readonly,strong) NSError* error;

- (instancetype) initWithEntity:(NSEntityDescription *)entity
                    keyProperty:(NSString*)keyProperty
                    coordinator:(NSPersistentStoreCoordinator*)coordinator;

- (NSArray*) existingObjectIDsForKeys:(NSArray*)keys;
- (void) unregisterKeys:(NSArray*)keys;
- (void) registerObjects:(NSArray*)objects;//objects must have permanent objectIDs
- (NSArray*) findOrCreate:(NSArray*)keys
                  context:(NSManagedObjectContext*)context
                    error:(NSError* __autoreleasing*)error;
@end

flow:

1) on application start, allocate a "uniqueness enforcer" and populate your cache:

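//private method of uniqueness enforcer
- (void) populateCache
{
    // Assumes _cache is an NSMutableDictionary ivar; (re)create it before filling.
    _cache = [NSMutableDictionary new];

    NSManagedObjectContext* context = [[NSManagedObjectContext alloc] init];
    context.persistentStoreCoordinator = self.coordinator;

    // Fetch plain dictionaries (key + objectID) instead of full managed objects.
    NSFetchRequest* r = [NSFetchRequest fetchRequestWithEntityName:self.entity.name];
    [r setResultType:NSDictionaryResultType];

    // Expression that asks the store to return each row's objectID.
    NSExpressionDescription* objectIdDesc = [NSExpressionDescription new];
    objectIdDesc.name = @"objectID";
    objectIdDesc.expression = [NSExpression expressionForEvaluatedObject];
    objectIdDesc.expressionResultType = NSObjectIDAttributeType;

    r.propertiesToFetch = @[self.keyProperty,objectIdDesc];

    NSError* error = nil;

    NSArray* results = [context executeFetchRequest:r error:&error];
    _error = error; // the property is declared readonly, so set the backing ivar
    if (results) {
        for (NSDictionary* dict in results) {
            id key = dict[self.keyProperty];
            if (key) { // rows with a nil key are absent from the dictionary; skip them
                _cache[key] = dict[@"objectID"];
            }
        }
    } else {
        _cache = nil; // fetch failed; see self.error
    }
}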

2) when you need to test existence simply use:

- (NSArray*) existingObjectIDsForKeys:(NSArray *)keys
{
    return [_cache objectsForKeys:keys notFoundMarker:[NSNull null]];
}

3) when you want to actually get the objects and create the missing ones:

- (NSArray*) findOrCreate:(NSArray*)keys
                  context:(NSManagedObjectContext*)context
                    error:(NSError* __autoreleasing*)error
{
    NSMutableArray* fullList = [[NSMutableArray alloc] initWithCapacity:[keys count]];
    NSMutableArray* needFetch = [[NSMutableArray alloc] initWithCapacity:[keys count]];

    NSManagedObject* object = nil;
    for (id<NSCopying> key in keys) {
        NSManagedObjectID* oID = _cache[key];
        if (oID) {
            object = [context objectWithID:oID];
            if ([object isFault]) {
                [needFetch addObject:oID];
            }
        } else {
            object = [NSEntityDescription insertNewObjectForEntityForName:self.entity.name
                                                   inManagedObjectContext:context];
            [object setValue:key forKey:self.keyProperty];
        }
        [fullList addObject:object];
    }

    if ([needFetch count]) {
        NSFetchRequest* r = [NSFetchRequest fetchRequestWithEntityName:self.entity.name];
        r.predicate = [NSPredicate predicateWithFormat:@"SELF IN %@",needFetch];
        if([context executeFetchRequest:r error:error] == nil) {//load the missing faults from store
            fullList = nil;
        }
    }

    return fullList;
}

In this implementation you need to keep track of object deletion/creation yourself.
You can use the register/unregister methods (trivial implementation) for this after a successful save.
You could make this a bit more automatic by hooking into the context "save" notification and updating the cache with relevant changes.
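
As a rough sketch of that notification hook, assuming the cache is an NSMutableDictionary mapping key values to NSManagedObjectIDs as in populateCache, and that a single context does all the saving. The function names UECacheApplySave and UECacheObserveSaves are hypothetical and not part of the interface above:

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: applies the inserts and deletes reported by a
// did-save notification to a key -> objectID cache.
static void UECacheApplySave(NSMutableDictionary *cache, NSNotification *note,
                             NSString *entityName, NSString *keyProperty)
{
    // After a successful save, inserted objects carry permanent objectIDs.
    for (NSManagedObject *object in note.userInfo[NSInsertedObjectsKey]) {
        if (![object.entity.name isEqualToString:entityName]) continue;
        id key = [object valueForKey:keyProperty];
        if (key) cache[key] = object.objectID;
    }
    // Deleted objects are looked up by objectID, since their attribute
    // values may no longer be readable after the save.
    for (NSManagedObject *object in note.userInfo[NSDeletedObjectsKey]) {
        if (![object.entity.name isEqualToString:entityName]) continue;
        [cache removeObjectsForKeys:[cache allKeysForObject:object.objectID]];
    }
}

// Hypothetical helper: starts observing saves of `context`. Keep the returned
// token and pass it to -[NSNotificationCenter removeObserver:] when done.
static id UECacheObserveSaves(NSMutableDictionary *cache,
                              NSManagedObjectContext *context,
                              NSString *entityName, NSString *keyProperty)
{
    return [[NSNotificationCenter defaultCenter]
        addObserverForName:NSManagedObjectContextDidSaveNotification
                    object:context
                     queue:nil // block runs synchronously on the saving thread
                usingBlock:^(NSNotification *note) {
                    UECacheApplySave(cache, note, entityName, keyProperty);
                }];
}
```

This sketch does not handle updates that change an object's key value; those would need NSUpdatedObjectsKey and a way to find the old key.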

The multi-threaded case is much more complex (same interface but different implementation altogether when taking performance into account).
For instance, the enforcer must save new items to the store before returning them to the requesting context, because otherwise they don't have permanent IDs; and even if you call "obtain permanent IDs", the requesting context might never save.
You will also need a dispatch queue of some sort (parallel or serial) to guard access to your cache dictionary.
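
One possible shape for the queue-guarded cache is sketched below. UEThreadSafeCache is a hypothetical class, not part of the interface above, and it covers only the dictionary access, not the save-before-return problem just described. It uses a concurrent queue so that reads can run in parallel while writes go through a barrier:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical wrapper: a mutable dictionary guarded by a concurrent queue.
@interface UEThreadSafeCache : NSObject
- (id)objectForKey:(id)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
- (void)removeObjectsForKeys:(NSArray *)keys;
@end

@implementation UEThreadSafeCache {
    NSMutableDictionary *_storage;
    dispatch_queue_t _queue;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _storage = [NSMutableDictionary new];
        _queue = dispatch_queue_create("uniqueness.enforcer.cache",
                                       DISPATCH_QUEUE_CONCURRENT);
    }
    return self;
}

// Reads may overlap with other reads, but never with a write.
- (id)objectForKey:(id)key
{
    __block id result = nil;
    dispatch_sync(_queue, ^{ result = [self->_storage objectForKey:key]; });
    return result;
}

// Writes use a barrier, so each one runs alone on the queue.
- (void)setObject:(id)object forKey:(id<NSCopying>)key
{
    dispatch_barrier_async(_queue, ^{
        [self->_storage setObject:object forKey:key];
    });
}

- (void)removeObjectsForKeys:(NSArray *)keys
{
    dispatch_barrier_async(_queue, ^{
        [self->_storage removeObjectsForKeys:keys];
    });
}
@end
```

A serial queue with dispatch_sync for every access would be simpler and is often fast enough; the barrier variant only pays off when reads heavily outnumber writes.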

Some math:

Given:
10K (10*1024 = 10,240) unique key objects
average key length of 256 bytes
objectID length of 128 bytes
we are looking at:
10,240 * (256 + 128) = 3,932,160 bytes (about 3.75 MiB, so roughly 4 MB) of memory

This might be a high estimate, but you should take this into account ...

Dan Shelly answered Oct 28 '22 05:10


OK, many things can go wrong here. This is how to do it:

  1. Create NSManagedObjectContext -> MOC
  2. Create NSFetchRequest with the right entity
  3. Create the NSPredicate and attach it to the fetch request
  4. Execute the fetch request on the newly created context
  5. The fetch request will return an array of objects matching the predicate (you should have only one object in that array if your ids are distinct)
  6. Cast the first element of the array to NSManagedObject
  7. Change its property
  8. Save the context

The most important thing of all is that you use the same context for fetching and saving, and you must do both on the same thread, because a MOC is not thread safe. Breaking that rule is the most common mistake people make.
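
The steps above can be sketched roughly as follows. This is an illustration rather than code from this answer: the function name and parameters are hypothetical, and it assumes the context was created with NSPrivateQueueConcurrencyType or NSMainQueueConcurrencyType, so that performBlockAndWait: keeps the fetch and the save on the context's own queue:

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: fetches the object whose keyAttribute equals keyValue,
// changes one attribute, and saves, all on the context's own queue.
// Returns NO if the fetch or save fails, or if no object matches
// (in that last case *outError is left nil).
static BOOL UpdateObjectMatching(NSManagedObjectContext *context,
                                 NSString *entityName,
                                 NSString *keyAttribute, id keyValue,
                                 NSString *attributeToChange, id newValue,
                                 NSError **outError)
{
    __block BOOL ok = NO;
    __block NSError *localError = nil;

    [context performBlockAndWait:^{
        // Steps 2-3: fetch request plus predicate (%K substitutes a key path).
        NSFetchRequest *request =
            [NSFetchRequest fetchRequestWithEntityName:entityName];
        request.predicate = [NSPredicate predicateWithFormat:@"%K == %@",
                                                             keyAttribute, keyValue];
        request.fetchLimit = 1;

        // Steps 4-6: execute the fetch and take the first match, if any.
        NSError *error = nil;
        NSArray *results = [context executeFetchRequest:request error:&error];
        NSManagedObject *object = results.firstObject;

        // Steps 7-8: change the property and save on the same context.
        if (object) {
            [object setValue:newValue forKey:attributeToChange];
            ok = [context save:&error];
        }
        localError = error;
    }];

    if (!ok && outError) *outError = localError;
    return ok;
}
```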

AntonijoDev answered Oct 28 '22 07:10


Currently you say you maintain an array of objectIDs. When you need to, you:

"filter through them and get the messages with existingObjectWithID:error:"

and after this you need to check if the message you got back:

  1. exists
  2. matches the one you want

This is very inefficient. It is inefficient because you are always fetching objects back from the data store into memory. You are also doing it individually (not batching). This is basically the slowest way you could possibly do it.

Why changes to that object aren't saved properly isn't clear. You should get an error of some kind. But, you should really change your search approach:

Instead of looping and loading, use a single fetch request with a predicate:

NSFetchRequest *request = ...;
NSPredicate *filterPredicate = [NSPredicate predicateWithFormat:@"XXX == %@", YYY];

[request setPredicate:filterPredicate];
[request setFetchLimit:1];

where XXX is the name of the attribute in the message to test, and YYY is the value to test it against.

When you execute this fetch on the MOC you should get one or zero responses. If you get zero, create + insert a new message and save the MOC. If you get one, update it and save the MOC.
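
Put together, the fetch-then-branch logic might look like the sketch below. FindOrCreateObject and its parameters are hypothetical names used for illustration; the caller is assumed to invoke it on the context's own thread or queue and to save the context afterwards:

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: returns the existing object whose keyAttribute equals
// keyValue, or inserts a new one with that key. Returns nil only if the fetch
// itself fails. *outCreated (optional) reports which branch was taken.
static NSManagedObject *FindOrCreateObject(NSManagedObjectContext *context,
                                           NSString *entityName,
                                           NSString *keyAttribute,
                                           id keyValue,
                                           BOOL *outCreated,
                                           NSError **outError)
{
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.predicate = [NSPredicate predicateWithFormat:@"%K == %@",
                                                         keyAttribute, keyValue];
    request.fetchLimit = 1; // zero or one result expected

    NSArray *results = [context executeFetchRequest:request error:outError];
    if (results == nil) return nil; // fetch failed; *outError describes why

    NSManagedObject *object = results.firstObject;
    BOOL created = (object == nil);
    if (created) {
        // Zero results: create and insert a new object carrying the key.
        object = [NSEntityDescription insertNewObjectForEntityForName:entityName
                                               inManagedObjectContext:context];
        [object setValue:keyValue forKey:keyAttribute];
    }
    if (outCreated) *outCreated = created;
    return object;
}
```

The caller then sets the remaining attributes on the returned object and calls save: on the same context. For many keys at once, one fetch per key adds up, so a single fetch with an IN predicate over all the keys is worth considering.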

Wain answered Oct 28 '22 05:10