Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In swift which loop is faster `for` or `for-in`? Why?

Tags:

swift

Which loop should I use when have to be extremely aware of the time it takes to iterate over a large array.

like image 726
Krishna Avatar asked Nov 29 '22 23:11

Krishna


1 Answers

Short answer

Don’t micro-optimize like this – any difference there is could be far outweighed by the speed of the operation you are performing inside the loop. If you truly think this loop is a performance bottleneck, perhaps you would be better served by using something like the accelerate framework – but only if profiling shows you that effort is truly worth it.

And don’t fight the language. Use for…in unless what you want to achieve cannot be expressed with for…in. These cases are rare. The benefit of for…in is that it’s incredibly hard to get it wrong. That is much more important. Prioritize correctness over speed. Clarity is important. You might even want to skip a for loop entirely and use map or reduce.

Longer Answer

For arrays, if you try them without the fastest compiler optimization, they perform identically, because they essentially do the same thing.

Presumably your for ;; loop looks something like this:

var sum = 0
for var i = 0; i < a.count; ++i {
    sum += a[i]
}

and your for…in loop something like this:

for x in a {
    sum += x
}

Let’s rewrite the for…in to show what is really going on under the covers:

var g = a.generate()
while let x = g.next() {
    sum += x
}

And then let’s rewrite that for what a.generate() returns, and something like what the let is doing:

 var g = IndexingGenerator<[Int]>(a)
 var wrapped_x = g.next()
 while wrapped_x != nil {
     let x = wrapped_x!
     sum += x
     wrapped_x = g.next()
 }

Here is what the implementation for IndexingGenerator<[Int]> might look like:

struct IndexingGeneratorArrayOfInt {
    private let _seq: [Int]
    var _idx: Int = 0

    init(_ seq: [Int]) {
        _seq = seq
    }

    mutating func generate() -> Int? {
        if _idx != _seq.endIndex {
            return _seq[_idx++]
        }
        else {
            return nil
        }
    }
}

Wow, that’s a lot of code, surely it performs way slower than the regular for ;; loop!

Nope. Because while that might be what it is logically doing, the compiler has a lot of latitude to optimize. For example, note that IndexingGeneratorArrayOfInt is a struct not a class. This means it has no overhead over declaring the two member variables directly. It also means the compiler might be able to inline the code in generate – there is no indirection going on here, no overloaded methods and vtables or objc_MsgSend. Just some simple pointer arithmetic and deferencing. If you strip away all the syntax for the structs and method calls, you’ll find that what the for…in code ends up being is almost exactly the same as what the for ;; loop is doing.

for…in helps avoid performance errors

If, on the other hand, for the code given at the beginning, you switch compiler optimization to the faster setting, for…in appears to blow for ;; away. In some non-scientific tests I ran using XCTestCase.measureBlock, summing a large array of random numbers, it was an order of magnitude faster.

Why? Because of the use of count:

for var i = 0; i < a.count; ++i {
                // ^-- calling a.count every time...
    sum += a[i]
}

Maybe the optimizer could have fixed this for you, but in this case it hasn’t. If you pull the invariant out, it goes back to being the same as for…in in terms of speed:

let count = a.count
for var i = 0; i < count; ++i {
    sum += a[i]
}

“Oh, I would definitely do that every time, so it doesn’t matter”. To which I say, really? Are you sure? Bet you forget sometimes.

But you want the even better news? Doing the same summation with reduce was (in my, again not very scientific, tests) even faster than the for loops:

let sum = a.reduce(0,+)

But it is also so much more expressive and readable (IMO), and allows you to use let to declare your result. Given that this should be your primary goal anyway, the speed is an added bonus. But hopefully the performance will give you an incentive to do it regardless.

This is just for arrays, but what about other collections? Of course this depends on the implementation but there’s a good reason to believe it would be faster for other collections like dictionaries, custom user-defined collections.

My reason for this would be that the author of the collection can implement an optimized version of generate, because they know exactly how the collection is being used. Suppose subscript lookup involves some calculation (such as pointer arithmetic in the case of an array - you have to add multiple the index by the value size then add that to the base pointer). In the case of generate, you know what is being done is to sequentially walk the collection, and therefore you could optimize for this (for example, in the case of an array, hold a pointer to the next element which you increment each time next is called). Same goes for specialized member versions of reduce or map.

This might even be why reduce is performing so well on arrays – who knows (you could stick a breakpoint on the function passed in if you wanted to try and find out). But it’s just another justification for using the language construct you should probably be using regardless.

like image 163
Airspeed Velocity Avatar answered Dec 20 '22 23:12

Airspeed Velocity