I'm developing a game with Swift and I have a static array of positional data that I use for processing in the game loop. I originally was using an array of Structs to hold this data but I decided to switch to classes so I could use referencing. However after making the change and profiling I noticed that the CPU was spending much more time on the method that handles this data than it did when I was using Structs.
So I decided to create a simple test to see what was going on.
final class SomeClass {}
struct SomeStruct {}
let classes = [
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
]
let structs = [
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
]
func test1() {
for i in 0...10000000 {
for s in classes {}
}
}
func test2() {
for i in 0...10000000 {
for s in structs {}
}
}
Test1 takes 15.4722579717636 s while Test2 takes only 0.276068031787872 s. Iterating continuously through the structs array was 56X faster. So my question is, why is this? I'm looking for a detailed answer. If I'd have to take a guess I would say that the structs themselves are stored sequentially in memory while the classes are only stored as addresses. So they would need to be dereferenced each time. But then again, don't the structs need to be copied each time?
Side Note: Both arrays are small, but I'm iterating them continuously. If I change the code to iterate once but make the arrays very large like so:
for i in 0...10000000 {
structs.append(SomeStruct())
classes.append(SomeClass())
}
func test1() {
for s in classes {}
}
func test2() {
for s in structs {}
}
Then I end up with the following: Test1 takes 0.841085016727448 s while Test2 takes 0.00960797071456909 s. The structs take 88X faster.
I'm using an OS X release build and the optimization level is set for Fastest,Smallest [-Os]
As requested, I have edited this question to include a test where the structs and classes are no longer empty. They use the same properties that I use in my game. Still did not make a difference. Structs are still so much faster, and I don't know why. Hopefully someone can provide the answer.
import Foundation
final class StructTest {
let surfaceFrames = [
SurfaceFrame(a: SurfacePoint(x: 0, y: 410), b: SurfacePoint(x: 0, y: 400), c: SurfacePoint(x: 875, y: 410), surfaceID: 0, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 880, y: 304), b: SurfacePoint(x: 880, y: 294), c: SurfacePoint(x: 962, y: 304), surfaceID: 1, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 787, y: 138), b: SurfacePoint(x: 791, y: 129), c: SurfacePoint(x: 1031, y: 248), surfaceID: 2, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 523, y: 138), b: SurfacePoint(x: 523, y: 128), c: SurfacePoint(x: 806, y: 144), surfaceID: 3, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1020, y: 243), b: SurfacePoint(x: 1020, y: 233), c: SurfacePoint(x: 1607, y: 241), surfaceID: 4, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1649, y: 304), b: SurfacePoint(x: 1649, y: 294), c: SurfacePoint(x: 1731, y: 305), surfaceID: 5, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1599, y: 240), b: SurfacePoint(x: 1595, y: 231), c: SurfacePoint(x: 1852, y: 128), surfaceID: 6, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1807, y: 141), b: SurfacePoint(x: 1807, y: 131), c: SurfacePoint(x: 2082, y: 138), surfaceID: 7, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 976, y: 413), b: SurfacePoint(x: 976, y: 403), c: SurfacePoint(x: 1643, y: 411), surfaceID: 8, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1732, y: 410), b: SurfacePoint(x: 1732, y: 400), c: SurfacePoint(x: 2557, y: 410), surfaceID: 9, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 2130, y: 490), b: SurfacePoint(x: 2138, y: 498), c: SurfacePoint(x: 2109, y: 512), surfaceID: 10, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1598, y: 828), b: SurfacePoint(x: 1597, y: 818), c: SurfacePoint(x: 1826, y: 823), surfaceID: 11, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 715, y: 826), b: SurfacePoint(x: 715, y: 816), c: SurfacePoint(x: 953, y: 826), surfaceID: 12, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 840, y: 943), b: SurfacePoint(x: 840, y: 933), c: SurfacePoint(x: 920, y: 943), surfaceID: 13, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1005, y: 1011), b: SurfacePoint(x: 1005, y: 1001), c: SurfacePoint(x: 1558, y: 1011), surfaceID: 14, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1639, y: 943), b: SurfacePoint(x: 1639, y: 933), c: SurfacePoint(x: 1722, y: 942), surfaceID: 15, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1589, y: 825), b: SurfacePoint(x: 1589, y: 815), c: SurfacePoint(x: 1829, y: 825), surfaceID: 16, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 0, y: 0), b: SurfacePoint(x: 1, y: 1), c: SurfacePoint(x: 2, y: 2), surfaceID: 17, dynamic:true)
]
func run() {
let startTime = CFAbsoluteTimeGetCurrent()
for i in 0 ... 10000000 {
for s in surfaceFrames {
}
}
let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
println("Time elapsed \(timeElapsed) s")
}
}
struct SurfacePoint {
var x,y: Int
}
struct SurfaceFrame {
let a,b,c :SurfacePoint
let surfaceID: Int
let dynamic: Bool
}
import Foundation
final class ClassTest {
let surfaceFrames = [
SurfaceFrame(a: SurfacePoint(x: 0, y: 410), b: SurfacePoint(x: 0, y: 400), c: SurfacePoint(x: 875, y: 410), surfaceID: 0, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 880, y: 304), b: SurfacePoint(x: 880, y: 294), c: SurfacePoint(x: 962, y: 304), surfaceID: 1, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 787, y: 138), b: SurfacePoint(x: 791, y: 129), c: SurfacePoint(x: 1031, y: 248), surfaceID: 2, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 523, y: 138), b: SurfacePoint(x: 523, y: 128), c: SurfacePoint(x: 806, y: 144), surfaceID: 3, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1020, y: 243), b: SurfacePoint(x: 1020, y: 233), c: SurfacePoint(x: 1607, y: 241), surfaceID: 4, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1649, y: 304), b: SurfacePoint(x: 1649, y: 294), c: SurfacePoint(x: 1731, y: 305), surfaceID: 5, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1599, y: 240), b: SurfacePoint(x: 1595, y: 231), c: SurfacePoint(x: 1852, y: 128), surfaceID: 6, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1807, y: 141), b: SurfacePoint(x: 1807, y: 131), c: SurfacePoint(x: 2082, y: 138), surfaceID: 7, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 976, y: 413), b: SurfacePoint(x: 976, y: 403), c: SurfacePoint(x: 1643, y: 411), surfaceID: 8, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1732, y: 410), b: SurfacePoint(x: 1732, y: 400), c: SurfacePoint(x: 2557, y: 410), surfaceID: 9, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 2130, y: 490), b: SurfacePoint(x: 2138, y: 498), c: SurfacePoint(x: 2109, y: 512), surfaceID: 10, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1598, y: 828), b: SurfacePoint(x: 1597, y: 818), c: SurfacePoint(x: 1826, y: 823), surfaceID: 11, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 715, y: 826), b: SurfacePoint(x: 715, y: 816), c: SurfacePoint(x: 953, y: 826), surfaceID: 12, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 840, y: 943), b: SurfacePoint(x: 840, y: 933), c: SurfacePoint(x: 920, y: 943), surfaceID: 13, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1005, y: 1011), b: SurfacePoint(x: 1005, y: 1001), c: SurfacePoint(x: 1558, y: 1011), surfaceID: 14, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1639, y: 943), b: SurfacePoint(x: 1639, y: 933), c: SurfacePoint(x: 1722, y: 942), surfaceID: 15, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 1589, y: 825), b: SurfacePoint(x: 1589, y: 815), c: SurfacePoint(x: 1829, y: 825), surfaceID: 16, dynamic:false),
SurfaceFrame(a: SurfacePoint(x: 0, y: 0), b: SurfacePoint(x: 1, y: 1), c: SurfacePoint(x: 2, y: 2), surfaceID: 17, dynamic:true)
]
func run() {
let startTime = CFAbsoluteTimeGetCurrent()
for i in 0 ... 10000000 {
for s in surfaceFrames {
}
}
let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
println("Time elapsed \(timeElapsed) s")
}
}
struct SurfacePoint {
var x,y: Int
}
final class SurfaceFrame {
let a,b,c :SurfacePoint
let surfaceID: Int
let dynamic: Bool
init(a: SurfacePoint, b: SurfacePoint, c: SurfacePoint, surfaceID: Int, dynamic: Bool) {
self.a = a
self.b = b
self.c = c
self.surfaceID = surfaceID
self.dynamic = dynamic
}
}
In this test, classes took 14.5261079668999 s while the test with structs took only 0.310304999351501 s. Structs were 47X faster.
As Martin R recommended, I profiled both tests and indeed the retain/release calls is what makes iterating through an array of classes much slower than iterating through the array of structs. Just to be clear, here are the tests I ran.
import Foundation
final class SomeClass {}
struct SomeStruct {}
var classes = [
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
SomeClass(),
]
var structs = [
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
SomeStruct(),
]
let startTime = CFAbsoluteTimeGetCurrent()
/*
structTest()
classTest()
*/
let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
println("Time elapsed \(timeElapsed) s")
func structTest() {
for i in 0 ... 1000000 {
for e in structs {}
}
}
func classTest() {
for i in 0 ... 1000000 {
for e in classes {}
}
}
Here are pictures of the profiling of both tests using instruments. You can see just by adding up the running time that the Classes test spends nearly all of its time with retain/releases during each iteration. I'd be interested to see how Swift 2.0 handles this.
Structs
Classes
So just out of curiosity, I thought what would happen if I could by-pass the retain/release calls by doing pointer arithmetic directly on the array (side note: I recommend you never do this in a real app). So I created one last test. However in this test, instead of iterating the array multiple times I would just create one large array and iterate it once because this is where most of the overhead occurs anyway. I've also decided to access properties in this test to reduce the ambiguity in optimizations.
So here are the results from the final test:
Below is the code for the test.
final class SomeClass {
var a: Int
init(a: Int) {
self.a = a
}
}
struct SomeStruct {
var a: Int
init(a: Int) {
self.a = a
}
}
var classes: [SomeClass] = []
var structs: [SomeStruct] = []
var total: Int = 0
for i in 0 ... 100000000 {
classes.append(SomeClass(a:i))
structs.append(SomeStruct(a:i))
}
let startTime = CFAbsoluteTimeGetCurrent()
/*structTest()
classTest()
structTestPointer()
classTestPointer()*/
let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
println("Time elapsed \(timeElapsed) s")
func structTest() {
for someStruct in structs {
let a = someStruct.a
total = total &+ a
}
}
func structTestPointer() {
var pointer = UnsafePointer<SomeStruct>(structs)
for j in 0 ..< structs.count {
let someStruct = pointer.memory
let a = someStruct.a
total = total &+ a
pointer++
}
}
func classTest() {
for someClass in classes {
let a = someClass.a
total = total &+ a
}
}
func classTestPointer() {
var pointer = UnsafePointer<SomeClass>(classes)
for j in 0 ..< classes.count {
let someClass = pointer.memory
let a = someClass.a
total = total &+ a
pointer++
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With