
+ing Swift arrays of AnyObject much faster than +ing arrays of T

Given the following three simple functions:

func twice_Array_of_Int(a: [Int]) -> [Int] {
    return a + a
}

func twice_Array_of_T<T>(a: [T]) -> [T] {
    return a + a
}

func twice_Array_of_Any(a: [AnyObject]) -> [AnyObject] {
    return a + a
}

Assuming release build (-Os), how would you expect their performance to compare?

My expectation was that [Int] -> [Int] would be much faster than [AnyObject] -> [AnyObject]... and it is... orders of magnitude faster.

However, I also expected [T] -> [T] to perform much better than [AnyObject] -> [AnyObject] and nearly as fast as [Int] -> [Int]... right?

Here I turned out to be wrong: [AnyObject] -> [AnyObject] (even including the cast back to [Int]) is 5 times faster than [T] -> [T]! This is disappointing, not least because generics are one of the most promising features of Swift.

In one of their WWDC videos, Apple engineers mentioned that generics are implemented natively, i.e. that using them does not lead to code bloat. Does this explain the poor performance of [T] -> [T]? If generic functions were simply expanded at compile time, the performance of [T] -> [T] and [Int] -> [Int] should have been the same, right?

Here is the test code:

func testPerformance_twice_Array_of_Int() {
    let a = Array(1...100_000)
    self.measureBlock {
        let twice_a = twice_Array_of_Int(a)
    }
    // average: 0.000, relative standard deviation: 76.227%
}

func testPerformance_twice_Array_of_T() {
    let a = Array(1...100_000)
    self.measureBlock {
        let twice_a = twice_Array_of_T(a)
    }
    // measured [Time, seconds] average: 0.554, relative standard deviation: 7.846%
}

func testPerformance_twice_Array_of_Any() {
    let a = Array(1...100_000)
    self.measureBlock {
        let twice_a = twice_Array_of_Any(a) as [Int]
    }
    // average: 0.115, relative standard deviation: 8.303%

    // without the cast to [Int] = average: 0.039, relative standard deviation: 2.931%
}
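For what it's worth, the same comparison can be reproduced outside XCTest with a plain timing loop. This is a sketch in current Swift syntax (the function names are hypothetical, and Foundation's Date is used for wall-clock timing); absolute numbers depend on the machine and only mean anything in a release build (-O):

```swift
import Foundation

// Concrete version: the compiler knows the element type is Int.
func twiceConcrete(_ a: [Int]) -> [Int] { return a + a }

// Generic version: one compiled body serves every element type.
func twiceGeneric<T>(_ a: [T]) -> [T] { return a + a }

// Runs a closure once and reports elapsed wall-clock seconds.
func measure(_ label: String, _ body: () -> Void) {
    let start = Date()
    body()
    print("\(label): \(Date().timeIntervalSince(start))s")
}

let a = Array(1...100_000)
measure("concrete [Int]") { _ = twiceConcrete(a) }
measure("generic  [T]  ") { _ = twiceGeneric(a) }
```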

I'd love to hear your opinion and how you plan to factor this into your code design.

EDIT

I've just made an even simpler measurement with an even more startling result:

func ==(lhs: (Int, Int), rhs: (Int, Int)) -> Bool {
    return lhs.0 == rhs.0 && lhs.1 == rhs.1
}

compared to:

func ==<T: Equatable>(lhs: (T, T), rhs: (T, T)) -> Bool {
    return lhs.0 == rhs.0 && lhs.1 == rhs.1
}

result:

func testPerformance_Equals_Tuple_Int() {
    let a = (2, 3)
    let b = (3, 2)
    XCTAssertFalse(a == b)
    let i = 1_000_000
    self.measureBlock() {
        for _ in 1...i {
            let c = a == b
        }
        // average: 0.002, relative standard deviation: 9.781%
    }
}

compared to:

func testPerformance_Equals_Tuple_T() {
    let a = (2, 3)
    let b = (3, 2)
    XCTAssertFalse(a == b)
    let i = 1_000_000
    self.measureBlock() {
        for _ in 1...i {
            let c = a == b
        }
        // average: 2.080, relative standard deviation: 5.118%
    }
}

The generic version of the infix function is more than 1000 times slower!
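One workaround is to keep the generic version for arbitrary Equatable types but add a non-generic overload, which overload resolution prefers when both operands are statically (Int, Int). The sketch below uses a hypothetical named function, tupleEquals, rather than redefining ==, since current Swift's standard library already provides tuple equality:

```swift
// Generic version: a single compiled body, dispatching == through
// the Equatable conformance of T.
func tupleEquals<T: Equatable>(_ lhs: (T, T), _ rhs: (T, T)) -> Bool {
    return lhs.0 == rhs.0 && lhs.1 == rhs.1
}

// Non-generic overload: chosen when both operands are statically
// (Int, Int), so the body compiles to direct integer comparisons.
func tupleEquals(_ lhs: (Int, Int), _ rhs: (Int, Int)) -> Bool {
    return lhs.0 == rhs.0 && lhs.1 == rhs.1
}
```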

EDIT 2

On 21 Aug, Austin Zheng gave a talk about "Enums, Pattern Matching & Generics" at a Swift Language User Group meetup (with Chris Lattner as the special guest). He said that Swift emits code optimised for common types, but falls back to a native generic version of the function for other types as needed at runtime. See: http://realm.io/news/swift-enums-pattern-matching-generics/ (starting at 32:00).

EDIT 3

With Swift 2 out, this is well overdue for an update (just as soon as I get a breather)...

Milos asked Sep 21 '14

1 Answer

I'd love to hear your opinion and how you plan to factor this into your code design.

You should not factor this into your code design. The Swift compiler is rapidly evolving and the optimizer is under continual and radical development. Changing your coding practices based on micro-benchmarks on early versions of the optimizer is the worst possible form of "premature optimization."

Code for clarity. Code for correctness. When you see performance problems, investigate them. There is no program slower than the program that crashes. Both [Int] and [T] are far safer, clearer, and easier to work with than [AnyObject] (which you constantly must cast and validate). The choice shouldn't be difficult. When you have some live code that demonstrates a problem with [T] in Instruments, then you should investigate other options (though I would still put [AnyObject] at the bottom; the obvious solution in your above code is to write a special-case overload that handles [Int] if that were really faster).
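The special-case overload mentioned above could look something like this (a sketch in current Swift syntax; the name twice is hypothetical). The generic version stays as the general case, and call sites whose argument is statically typed [Int] resolve to the concrete overload:

```swift
// Generic version: one compiled body serves every element type.
func twice<T>(_ a: [T]) -> [T] {
    return a + a
}

// Special-case overload: at call sites where the argument is
// statically [Int], overload resolution picks this version, which
// the compiler can optimize without any generic indirection.
func twice(_ a: [Int]) -> [Int] {
    return a + a
}

let ints = twice([1, 2, 3])      // resolves to the [Int] overload
let strs = twice(["a", "b"])     // resolves to the generic version
```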

Since you have an interesting test case demonstrating a surprising difference between the generic and native, it would be appropriate to open a radar (bugreport.apple.com). That way, when the issue is resolved, your clear, correct code will get a free speed boost.


EDIT: I haven't looked at the assembler output yet (and you should), but I do have several theories on why this could be true (if it actually is true; I haven't reproduced it, either). [AnyObject] may be replaced here with NSArray, which has radically different performance characteristics from Array. That's the key reason you should never think "[AnyObject] is faster" based on some microbenchmark that isn't your real code. The performance of a+a may be completely unrelated to the performance of some other operation.

Regarding [Int] vs [T], you may misunderstand how Swift deals with generics. Swift does not create a completely new version of every function for every type. It creates a generic version. That version may not optimize everything the same way that a specific-type version would. In this case, for instance, it's possible that the [T] version does memory management that the [Int] version doesn't (I'm totally guessing here). The optimizer can make an optimized version (which is why you shouldn't try to second-guess it), but it may not (which is why you might sometimes have to help it with a special overload). Swift Yeti has a nice article explaining further.

Again, you should never assume you know what the optimizer is going to do without testing on live code that is at least similar to what you care about (and shouldn't really even consider it until you have some reason to believe this is a performance bottleneck). It is very easy to write "crazy code because it's faster" that is in fact much slower, but still crazy.

Optimizer knowledge is power, but it is a dangerous power if you don't know exactly when to use it.

Rob Napier answered Oct 04 '22