
Objective-C: Problems with blocks and NSEnumerationConcurrent

I have a dictionary containing a second dictionary with 1000 entries. The entries are all NSStrings of the form key = key XXX and value = element XXX, where XXX is a number from 0 to the number of elements minus 1. (Several days ago, I asked about Objective-C dictionaries containing a dictionary. Please refer to that question if you want the code that creates the dictionary.)

The total length of all the strings in the subdictionary is 28,670 characters, i.e.:

strlen("key 0")+strlen("element 0")+
//and so on up through 
strlen("key 999")+strlen("element 999") == 28670. 

Consider this a very simple hash value that indicates whether a method has enumerated every key+value pair once and only once.

I have one subroutine that works perfectly (using blocks) to access the individual dictionary key and values:

NSUInteger KVC_access3(NSMutableDictionary *dict){
    __block NSUInteger ll=0;
    NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];

    [subDict 
        enumerateKeysAndObjectsUsingBlock:
            ^(id key, id object, BOOL *stop) {
                ll+=[object length];
                ll+=[key length];
    }];
    return ll;
}
// will correctly return the expected length...

If I try the same using concurrent blocks (on a multiprocessor machine), I get a number close to, but not exactly, the expected 28670:

NSUInteger KVC_access4(NSMutableDictionary *dict){
    __block NSUInteger ll=0;
    NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];

    [subDict 
        enumerateKeysAndObjectsWithOptions:
            NSEnumerationConcurrent
        usingBlock:
            ^(id key, id object, BOOL *stop) {
                ll+=[object length];
                ll+=[key length]; 
    }];
    return ll;
}
// will return correct value sometimes; a shortfall value most of the time...

The Apple docs for NSEnumerationConcurrent state:

 "the code of the Block must be safe against concurrent invocation."

I think that is probably the issue, but what is the issue with my code or the block in KVC_access4 that is NOT safe for concurrent invocation?

Edit & Conclusion

Thanks to BJ Homer's excellent solution, I got NSEnumerationConcurrent working. I timed both methods extensively. The code I have above in KVC_access3 is faster and easier for small and medium-sized dictionaries. It is much faster when there are lots of dictionaries. However, if you have a really big dictionary (millions or tens of millions of key/value pairs), then this code:

[subDict 
    enumerateKeysAndObjectsWithOptions:
        NSEnumerationConcurrent
    usingBlock:
        ^(id key, id object, BOOL *stop) {
        NSUInteger workingLength = [object length];
        workingLength += [key length];

        OSAtomicAdd64Barrier(workingLength, &ll); 
 }];

is up to 4x faster. The crossover point is about one dictionary of 100,000 of my test elements. With more dictionaries the crossover point is higher, presumably because of set-up time.

dawg asked May 01 '11 20:05



1 Answer

With concurrent enumeration, you'll have the block being run simultaneously on multiple threads. This means that multiple threads are accessing ll at the same time. Since you have no synchronization, you're prone to race conditions.

This is a problem because the += operation is not an atomic operation. Remember, ll += x is the same thing as ll = ll + x. This involves reading ll, adding x to that value, and then storing the new value back in ll. Between the time that ll is read on Thread X and when it is stored, any changes caused by other threads will be lost when Thread X gets back to storing its calculation.
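The lost update described above can be reproduced deterministically, without any threads, by spelling out the read and the store as separate steps and interleaving them by hand. This is only an illustrative sketch in plain C (which also compiles as Objective-C); the names pending_add, step_read, step_add_and_store, interleaved_total and serial_total are hypothetical and are not part of the question's code.

```c
/* One in-flight "ll += x", split into the steps described above.
 * All names here are hypothetical, for illustration only. */
typedef struct {
    unsigned long loaded; /* this thread's private copy of ll */
    unsigned long x;      /* the amount this thread wants to add */
} pending_add;

/* Step 1: read the shared value. */
static void step_read(pending_add *t, const unsigned long *ll) {
    t->loaded = *ll;
}

/* Steps 2 and 3: add to the private copy and store the result back. */
static void step_add_and_store(const pending_add *t, unsigned long *ll) {
    *ll = t->loaded + t->x;
}

/* Both "threads" read ll before either one stores: one update is lost. */
unsigned long interleaved_total(unsigned long x1, unsigned long x2) {
    unsigned long ll = 0;
    pending_add a = {0, x1};
    pending_add b = {0, x2};
    step_read(&a, &ll);
    step_read(&b, &ll);          /* b still sees the old value of ll */
    step_add_and_store(&a, &ll);
    step_add_and_store(&b, &ll); /* overwrites a's contribution */
    return ll;
}

/* Each "thread" finishes its read-add-store before the next one starts. */
unsigned long serial_total(unsigned long x1, unsigned long x2) {
    unsigned long ll = 0;
    pending_add a = {0, x1};
    pending_add b = {0, x2};
    step_read(&a, &ll);
    step_add_and_store(&a, &ll);
    step_read(&b, &ll);
    step_add_and_store(&b, &ll);
    return ll;
}
```

With x1 = 5 and x2 = 9, serial_total returns 14 while interleaved_total returns 9: the first addition is silently dropped, which is the kind of shortfall seen in KVC_access4.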

You need to add synchronization such that multiple threads can't be modifying the value at the same time. The naive solution is this:

__block NSUInteger ll=0;
NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];

[subDict 
    enumerateKeysAndObjectsWithOptions:NSEnumerationConcurrent
    usingBlock:
        ^(id key, id object, BOOL *stop) {
            @synchronized(subDict) { // <-- Only one thread can be in this block at a time.
                ll+=[object length];
                ll+=[key length];
            }
}];
return ll;

However, this discards all the benefits you get from concurrent enumeration, since the entire body of the block is now enclosed in a synchronized block. In effect, only one instance of this block would actually be running at a time.

If concurrency is actually a significant performance requirement here, I'd suggest the following:

__block int64_t ll = 0; // Note the change in type here; OSAtomicAdd64Barrier operates on a 64-bit signed integer.

^(id key, id object, BOOL *stop) {
    NSUInteger workingLength = [object length];
    workingLength += [key length];

    OSAtomicAdd64Barrier(workingLength, &ll); 
}

Note that I'm using OSAtomicAdd64Barrier, which is a fairly low-level function that is guaranteed to increment a value atomically. You could also use @synchronized to control the access, but if this operation is actually a significant performance bottleneck, then you're probably going to want the most performant option, even at the cost of a bit of clarity. If this feels like overkill, then I suspect enabling concurrent enumeration isn't really going to affect your performance all that much.
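On newer Apple SDKs the OSAtomic functions are marked deprecated in favor of the C11 atomics in <stdatomic.h>, so the same idea can also be written with atomic_fetch_add. The following is a minimal sketch in plain C, assuming POSIX threads stand in for the concurrent block invocations and that the per-entry lengths have already been collected into an array. The names worker_args, worker, concurrent_sum and NUM_WORKERS are hypothetical, not part of the answer above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_WORKERS 4 /* hypothetical thread count for this sketch */

typedef struct {
    const size_t *lengths;  /* per-entry key+value lengths (assumed input) */
    size_t begin, end;      /* half-open slice handled by this worker */
    _Atomic int64_t *total; /* shared accumulator, plays the role of ll */
} worker_args;

static void *worker(void *p) {
    worker_args *w = p;
    for (size_t i = w->begin; i < w->end; i++) {
        /* Atomic read-modify-write: the load, add and store happen as
         * one indivisible step, so no other thread's update is lost. */
        atomic_fetch_add(w->total, (int64_t)w->lengths[i]);
    }
    return NULL;
}

/* Sums lengths[0..count) using NUM_WORKERS threads and one atomic total. */
int64_t concurrent_sum(const size_t *lengths, size_t count) {
    _Atomic int64_t total = 0;
    pthread_t threads[NUM_WORKERS];
    worker_args args[NUM_WORKERS];

    for (int t = 0; t < NUM_WORKERS; t++) {
        args[t].lengths = lengths;
        args[t].begin = count * (size_t)t / NUM_WORKERS;
        args[t].end = count * (size_t)(t + 1) / NUM_WORKERS;
        args[t].total = &total;
        int rc = pthread_create(&threads[t], NULL, worker, &args[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < NUM_WORKERS; t++) {
        pthread_join(threads[t], NULL);
    }
    return atomic_load(&total);
}
```

The result matches a plain serial loop over the same array regardless of how the threads interleave. As with the OSAtomicAdd64Barrier version, every entry still pays for one atomic operation, so this only helps when the collection is large enough to amortize that cost.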

BJ Homer answered Oct 16 '22 01:10
