Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NSArray vs C Array performance comparison

I've been recently running a little research project on the performance of sequential or random access to an NSArray in relation to a C Array. Most of the test cases are showing up as i'd expect however a few don't work how I thought they would and i'm hoping somebody might be able to explain why.

Fundamentally the test consists of filling a C Array with 50k objects, iterating over each one and calling a method (which internally just increments a float in the object), the second part of the test involves creating a loop that completes 50k iterations but accesses a random object in the array. Basically it's pretty simple.

To do the comparison i'm initializing the NSArray with the C Array. Each test is then run via a block passed in to a method which tracks the time it take to execute the block. The code i'm using is contained below but i'd like to cover the results and queries I have first.

These tests were run on an iPhone 4 and wrapped in a dispatch_after to mitigate any remaining threading or non atomic operations as a result of starting the app. The results of a single run are as follows, each run is essentially the same with minor variations:

===SEQUENCE===
NSARRAY FAST ENUMERATION: 12ms
NSARRAY FAST ENUMERATION WEAK: 186ms
NSARRAY BLOCK ENUMERATION: 31ms (258.3%)
C ARRAY DIRECT: 7ms (58.3%)
C ARRAY VARIABLE ASSIGN: 33ms (275.0%)
C ARRAY VARIABLE ASSIGN WEAK: 200ms (1666.7%)

===RANDOM===
NSARRAY RANDOM: 102ms (850.0%) *Relative to fast enumeration
C ARRAY DIRECT RANDOM: 39ms (38.2%) *Relative to NSArray Random
C ARRAY VARIABLE ASSIGN RANDOM: 82ms (80.4%)

The fastest approach seems to be directly accessing the items in the C Array using "*(carray + idx)", what is most puzzling though is that assigning the pointer from the C Array to an objective c variable "id object = *(carry + idx)" causes a massive performance hit.

I figured initially it was maybe arc doing something with reference counting as the variable was strong so at this point I changed it to weak expecting the performance to increase "__weak id object = *(carry + idx)". To my surprise it was actually a lot slower.

The random access results went pretty well how i'd expected based on the sequence results, so no surprises there luckily enough.

As a result of this there are a number of questions:

  1. Why does assigning to a variable take so long?
  2. Why does assigning to a weak variable take even longer? (Maybe there is something I don't understand going on here)
  3. Considering the above how have Apple got the standard fast enumeration to perform so well?

And for completeness here is the code. So i'm creating the arrays as follows:

__block id __strong *cArrayData = (id __strong *)malloc(sizeof(id) * ITEM_COUNT);

for (NSUInteger idx = 0; idx < ITEM_COUNT; idx ++) {
    NSTestObject *object = [[NSTestObject alloc] init];
    cArrayData[idx] = object;
}

__block NSArray *arrayData = [NSArray arrayWithObjects:cArrayData count:ITEM_COUNT];

And NSTestObject is defined like this:

@interface NSTestObject : NSObject

- (void)doSomething;

@end

@implementation NSTestObject
{
    float f;
}

- (void)doSomething
{
    f++;
}

And the method used to profile the code:

int machTimeToMS(uint64_t machTime)
{
    const int64_t kOneMillion = 1000 * 1000;
    static mach_timebase_info_data_t s_timebase_info;

    if (s_timebase_info.denom == 0) {
        (void) mach_timebase_info(&s_timebase_info);
    }
    return (int)((machTime * s_timebase_info.numer) / (kOneMillion * s_timebase_info.denom));
}

- (int)profile:(dispatch_block_t)call name:(NSString *)name benchmark:(int)benchmark
{

    uint64_t startTime, stopTime;
    startTime = mach_absolute_time();

    call();

    stopTime = mach_absolute_time();

    int duration = machTimeToMS(stopTime - startTime);

    if (benchmark > 0) {
        NSLog(@"%@: %i (%0.1f%%)", name, duration, ((float)duration / (float)benchmark) * 100.0f);
    } else {
        NSLog(@"%@: %i", name, duration);
    }

    return duration;

}

Finally this is how i'm performing the actual tests:

int benchmark = [self profile:^ {
    for (NSTestObject *view in arrayData) {
        [view doSomething];
    }
} name:@"NSARRAY FAST ENUMERATION" benchmark:0];

[self profile:^ {
    for (NSTestObject __weak *view in arrayData) {
        [view doSomething];
    }
} name:@"NSARRAY FAST ENUMERATION WEAK" benchmark:0];

[self profile:^ {
    [arrayData enumerateObjectsUsingBlock:^(NSTestObject *view, NSUInteger idx, BOOL *stop) {
        [view doSomething];
    }];
} name:@"NSARRAY BLOCK ENUMERATION" benchmark:benchmark];

[self profile:^ {
    for (NSUInteger idx = 0; idx < ITEM_COUNT; idx ++) {
        [*(cArrayData + idx) doSomething];
    }
} name:@"C ARRAY DIRECT" benchmark:benchmark];

[self profile:^ {
    id object = nil;
    NSUInteger idx = 0;
    while (idx < ITEM_COUNT) {
        object = (id)*(cArrayData + idx);
        [object doSomething];
        object = nil;
        idx++;
    }
} name:@"C ARRAY VARIABLE ASSIGN" benchmark:benchmark];

[self profile:^ {
    __weak id object = nil;
    NSUInteger idx = 0;
    while (idx < ITEM_COUNT) {
        object = (id)*(cArrayData + idx);
        [object doSomething];
        object = nil;
        idx++;
    }
} name:@"C ARRAY VARIABLE ASSIGN WEAK" benchmark:benchmark];

NSLog(@"\n===RANDOM===\n");

benchmark = [self profile:^ {
    id object = nil;
    for (NSUInteger idx = 0; idx < ITEM_COUNT; idx ++) {
        object = arrayData[arc4random()%ITEM_COUNT];
        [object doSomething];
    }
} name:@"NSARRAY RANDOM" benchmark:benchmark];

[self profile:^ {
    NSUInteger idx = 1;
    while (idx < ITEM_COUNT) {
        [*(cArrayData + arc4random()%ITEM_COUNT) doSomething];
        idx++;
    }
} name:@"C ARRAY DIRECT RANDOM" benchmark:benchmark];

[self profile:^ {
    id object = nil;
    NSUInteger idx = 0;
    while (idx < ITEM_COUNT) {
        object = (id)*(cArrayData + arc4random()%ITEM_COUNT);
        [object doSomething];
        idx++;
    }
} name:@"C ARRAY VARIABLE ASSIGN RANDOM" benchmark:benchmark];
like image 609
Andy Avatar asked Jul 10 '13 09:07

Andy


People also ask

What is the difference between NSArray and array?

Array is a struct, therefore it is a value type in Swift. NSArray is an immutable Objective C class, therefore it is a reference type in Swift and it is bridged to Array<AnyObject> . NSMutableArray is the mutable subclass of NSArray . Because foo changes the local value of a and bar changes the reference.

Can we use NSArray in Swift?

In Swift, the NSArray class conforms to the ArrayLiteralConvertible protocol, which allows it to be initialized with array literals. For more information about object literals in Swift, see Literal Expression in The Swift Programming Language (Swift 4.1).

What is NSArray?

NSArray is an immutable Objective C class, therefore it is a reference type in Swift and it is bridged to Array<AnyObject> . NSMutableArray is the mutable subclass of NSArray .

Is NSArray ordered?

The answer is yes, the order of the elements of an array will be maintained - because an array is an ordered collection of items, just like a string is an ordered sequence of characters...


1 Answers

Why does assigning to a variable take so long?

Your guess was correct: ARC calls retain when you assign, and release when you reassign, or when id goes out of scope.

Why does assigning to a weak variable take even longer? (Maybe there is something I don't understand going on here)

Recall that ARC promises to clear your weak reference when the last strong reference is gone. That is why weak references are more expensive: in order to nil out the __weak id, ARC registers id's address with the runtime to get a notification of the object being released. This registration requires writing into a hash table - far slower than just retaining and releasing.

Considering the above how have Apple got the standard fast enumeration to perform so well?

Fast enumeration uses blocks of the array that backs NSArray directly. Essentially, they grab a block of 30 elements or so, and treat access to it as a plain C array. Then they grab the next block, iterate over it as if it were a C array, and so on. There is some small overhead, but it is per block, not per element, so you get a pretty impressive performance.

like image 77
Sergey Kalinichenko Avatar answered Oct 19 '22 04:10

Sergey Kalinichenko