I read the famous Why is it faster to process a sorted array than an unsorted array? and I decided to play around and experiment with other languages such as Swift. I was surprised by the run time differences between 2 very similar snippets of code.
In Swift one can access elements in an array either in a direct way or with a subscript while in a for-in loop. For instance this code:
for i in 0..<size {
sum += data[i]
}
Could be written:
for element in data {
sum += element
}
With size
the data
length and data
an array of summable elements.
So, I just implemented in Swift (code bellow) the same algorithm as in the question I mentioned in the first paragraph and what surprised me is that the first method is roughly 5 times faster than the second method.
I don't really know the backstage subscript implementation but I thought that accessing directly the elements in a Swift for-in loop was just syntactic sugar.
My question is what is the difference between the two for-in
syntaxes and why it is faster to use subscript?
here is the detail of timers. I'm using Xcode 9.4.1 with Swift 4.1 on an early 2015 MacBook Air with a Commande Line Project.
// Using Direct Element Access
Elapsed Time: 8.506288427
Sum: 1051901000
vs
// Using Subscript
Elapsed Time: 1.483967902
Sum: 1070388000
Bonus question: why the execution is 100 times slower in Swift than in C++ (both executed on the same Mac in a n Xcode project)? For instance 100,000 repetitions in C++ take nearly the same time as 1,000 repetitions in Swift. My first guess is that Swift is a higher level language than C++ and that Swift operates more safety checks for instance.
Here is the Swift code I used, I only modified the second nested loop:
import Foundation
import GameplayKit
let size = 32_768
var data = [Int]()
var sum = 0
var rand = GKRandomDistribution(lowestValue: 0, highestValue: 255)
for _ in 0..<size {
data.append(rand.nextInt())
}
// data.sort()
let start = DispatchTime.now()
for _ in 0..<1_000 {
// Only the following for-in loop changes
for i in 0..<size {
if data[i] <= 128 {
sum += data[i]
}
}
}
let stop = DispatchTime.now()
let nanoTime = stop.uptimeNanoseconds - start.uptimeNanoseconds
let elapsed = Double(nanoTime) / 1_000_000_000
print("Elapsed Time: \(elapsed)")
print("Sum: \(sum)")
Subscripts enable you to query instances of a type by writing one or more values in square brackets after the instance name. Their syntax is similar to both instance method syntax and computed property syntax.
The for (with a counter) is just incrementing a counter. Very fast. The for-in uses an iterator (call object to pass the next element). This is much slower.
In Swift4, subscripts are the shortcuts which are used to access elements of a list, sequence or a collection. Subscript is used to set or retrieve a value using index instead of writing functions. For example: Array[Index], Dictionary[Key]
The overall performance output highly depends on the optimizations done by the compiler. If you compile your code with optimizations enabled, you will see the difference between both solutions is minimal.
To demonstrate this, I updated your code adding two methods, one with subscripting
and the other using for-in
.
import Foundation
import GameplayKit
let size = 32_768
var data = [Int]()
var sum = 0
var rand = GKRandomDistribution(lowestValue: 0, highestValue: 255)
for _ in 0..<size {
data.append(rand.nextInt())
}
// data.sort()
func withSubscript() {
let start = DispatchTime.now()
for _ in 0..<1_000 {
for i in 0..<size {
if data[i] <= 128 {
sum += data[i]
}
}
}
let stop = DispatchTime.now()
let elapsed = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000
print("With subscript:")
print("- Elapsed Time: \(elapsed)")
print("- Sum: \(sum)")
}
func withForIn() {
let start = DispatchTime.now()
for _ in 0..<1_000 {
for element in data {
if element <= 128 {
sum += element
}
}
}
let stop = DispatchTime.now()
let elapsed = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000
print("With for-in:")
print("- Elapsed Time: \(elapsed)")
print("- Sum: \(sum)")
}
withSubscript()
withForIn()
I saved that code into a file called array-subscripting.swift
.
Then, from the command line, we can run it without any optimizations, like this:
$ swift array-subscripting.swift
With subscript:
- Elapsed Time: 0.924554249
- Sum: 1057062000
With for-in:
- Elapsed Time: 5.796038213
- Sum: 2114124000
As you mentioned in the post, there is a big difference in performance.
This difference is pretty negligible when the code is compiled with optimizations:
$ swiftc array-subscripting.swift -O
$ ./array-subscripting
With subscript:
- Elapsed Time: 0.110622556
- Sum: 1054578000
With for-in:
- Elapsed Time: 0.11670454
- Sum: 2109156000
As you can see, both solutions are way faster than before, and very similar on time execution.
Back to your original question, subscripting
provides direct access to memory, which is pretty efficient in the case of contiguous arrays, where elements are stored next to each other in memory.
for-in
loops, in the other hand, create an immutable copy of each element from the array, which incurs in a performance hit.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With