Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is iterating continuously through an array of Structs MUCH faster than an array of Classes?

I'm developing a game with Swift and I have a static array of positional data that I use for processing in the game loop. I originally was using an array of Structs to hold this data but I decided to switch to classes so I could use referencing. However after making the change and profiling I noticed that the CPU was spending much more time on the method that handles this data than it did when I was using Structs.

So I decided to create a simple test to see what was going on.

final class SomeClass {}
struct SomeStruct {}

let classes = [
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
]

let structs = [
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
]

func test1() {
    for i in 0...10000000 {
        for s in classes {}
    }
}

func test2() {
    for i in 0...10000000 {
        for s in structs {}
    }
}

Test1 takes 15.4722579717636 s while Test2 takes only 0.276068031787872 s. Iterating continuously through the structs array was 56X faster. So my question is, why is this? I'm looking for a detailed answer. If I'd have to take a guess I would say that the structs themselves are stored sequentially in memory while the classes are only stored as addresses. So they would need to be dereferenced each time. But then again, don't the structs need to be copied each time?

Side Note: Both arrays are small, but I'm iterating them continuously. If I change the code to iterate once but make the arrays very large like so:

for i in 0...10000000 {
   structs.append(SomeStruct())
   classes.append(SomeClass())
}
func test1() {
    for s in classes {}
}

func test2() {
    for s in structs {}
}

Then I end up with the following: Test1 takes 0.841085016727448 s while Test2 takes 0.00960797071456909 s. The structs take 88X faster.

I'm using an OS X release build and the optimization level is set for Fastest,Smallest [-Os]


Edit

As requested, I have edited this question to include a test where the structs and classes are no longer empty. They use the same properties that I use in my game. Still did not make a difference. Structs are still so much faster, and I don't know why. Hopefully someone can provide the answer.

import Foundation

final class StructTest {
    let surfaceFrames = [
        SurfaceFrame(a: SurfacePoint(x: 0, y: 410), b: SurfacePoint(x: 0, y: 400), c: SurfacePoint(x: 875, y: 410), surfaceID: 0, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 880, y: 304), b: SurfacePoint(x: 880, y: 294), c: SurfacePoint(x: 962, y: 304), surfaceID: 1, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 787, y: 138), b: SurfacePoint(x: 791, y: 129), c: SurfacePoint(x: 1031, y: 248), surfaceID: 2, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 523, y: 138), b: SurfacePoint(x: 523, y: 128), c: SurfacePoint(x: 806, y: 144), surfaceID: 3, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1020, y: 243), b: SurfacePoint(x: 1020, y: 233), c: SurfacePoint(x: 1607, y: 241), surfaceID: 4, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1649, y: 304), b: SurfacePoint(x: 1649, y: 294), c: SurfacePoint(x: 1731, y: 305), surfaceID: 5, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1599, y: 240), b: SurfacePoint(x: 1595, y: 231), c: SurfacePoint(x: 1852, y: 128), surfaceID: 6, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1807, y: 141), b: SurfacePoint(x: 1807, y: 131), c: SurfacePoint(x: 2082, y: 138), surfaceID: 7, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 976, y: 413), b: SurfacePoint(x: 976, y: 403), c: SurfacePoint(x: 1643, y: 411), surfaceID: 8, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1732, y: 410), b: SurfacePoint(x: 1732, y: 400), c: SurfacePoint(x: 2557, y: 410), surfaceID: 9, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 2130, y: 490), b: SurfacePoint(x: 2138, y: 498), c: SurfacePoint(x: 2109, y: 512), surfaceID: 10, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1598, y: 828), b: SurfacePoint(x: 1597, y: 818), c: SurfacePoint(x: 1826, y: 823), surfaceID: 11, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 715, y: 826), b: SurfacePoint(x: 715, y: 816), c: SurfacePoint(x: 953, y: 826), surfaceID: 12, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 840, y: 943), b: SurfacePoint(x: 840, y: 933), c: SurfacePoint(x: 920, y: 943), surfaceID: 13, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1005, y: 1011), b: SurfacePoint(x: 1005, y: 1001), c: SurfacePoint(x: 1558, y: 1011), surfaceID: 14, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1639, y: 943), b: SurfacePoint(x: 1639, y: 933), c: SurfacePoint(x: 1722, y: 942), surfaceID: 15, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1589, y: 825), b: SurfacePoint(x: 1589, y: 815), c: SurfacePoint(x: 1829, y: 825), surfaceID: 16, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 0, y: 0), b: SurfacePoint(x: 1, y: 1), c: SurfacePoint(x: 2, y: 2), surfaceID: 17, dynamic:true)
    ]

    func run() {
        let startTime = CFAbsoluteTimeGetCurrent()
        for  i in 0 ... 10000000 {
            for s in surfaceFrames {
                
            }
        }
        let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
        println("Time elapsed \(timeElapsed) s")
    }
}



struct SurfacePoint {
    var x,y: Int
}
struct SurfaceFrame {
    let a,b,c :SurfacePoint
    let surfaceID: Int
    let dynamic: Bool
}

import Foundation

final class ClassTest {
    let surfaceFrames = [
        SurfaceFrame(a: SurfacePoint(x: 0, y: 410), b: SurfacePoint(x: 0, y: 400), c: SurfacePoint(x: 875, y: 410), surfaceID: 0, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 880, y: 304), b: SurfacePoint(x: 880, y: 294), c: SurfacePoint(x: 962, y: 304), surfaceID: 1, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 787, y: 138), b: SurfacePoint(x: 791, y: 129), c: SurfacePoint(x: 1031, y: 248), surfaceID: 2, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 523, y: 138), b: SurfacePoint(x: 523, y: 128), c: SurfacePoint(x: 806, y: 144), surfaceID: 3, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1020, y: 243), b: SurfacePoint(x: 1020, y: 233), c: SurfacePoint(x: 1607, y: 241), surfaceID: 4, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1649, y: 304), b: SurfacePoint(x: 1649, y: 294), c: SurfacePoint(x: 1731, y: 305), surfaceID: 5, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1599, y: 240), b: SurfacePoint(x: 1595, y: 231), c: SurfacePoint(x: 1852, y: 128), surfaceID: 6, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1807, y: 141), b: SurfacePoint(x: 1807, y: 131), c: SurfacePoint(x: 2082, y: 138), surfaceID: 7, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 976, y: 413), b: SurfacePoint(x: 976, y: 403), c: SurfacePoint(x: 1643, y: 411), surfaceID: 8, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1732, y: 410), b: SurfacePoint(x: 1732, y: 400), c: SurfacePoint(x: 2557, y: 410), surfaceID: 9, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 2130, y: 490), b: SurfacePoint(x: 2138, y: 498), c: SurfacePoint(x: 2109, y: 512), surfaceID: 10, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1598, y: 828), b: SurfacePoint(x: 1597, y: 818), c: SurfacePoint(x: 1826, y: 823), surfaceID: 11, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 715, y: 826), b: SurfacePoint(x: 715, y: 816), c: SurfacePoint(x: 953, y: 826), surfaceID: 12, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 840, y: 943), b: SurfacePoint(x: 840, y: 933), c: SurfacePoint(x: 920, y: 943), surfaceID: 13, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1005, y: 1011), b: SurfacePoint(x: 1005, y: 1001), c: SurfacePoint(x: 1558, y: 1011), surfaceID: 14, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1639, y: 943), b: SurfacePoint(x: 1639, y: 933), c: SurfacePoint(x: 1722, y: 942), surfaceID: 15, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 1589, y: 825), b: SurfacePoint(x: 1589, y: 815), c: SurfacePoint(x: 1829, y: 825), surfaceID: 16, dynamic:false),
        SurfaceFrame(a: SurfacePoint(x: 0, y: 0), b: SurfacePoint(x: 1, y: 1), c: SurfacePoint(x: 2, y: 2), surfaceID: 17, dynamic:true)
    ]

    func run() {
        let startTime = CFAbsoluteTimeGetCurrent()
        for  i in 0 ... 10000000 {
            for s in surfaceFrames {
                
            }
        }
        let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
        println("Time elapsed \(timeElapsed) s")
    }
}



struct SurfacePoint {
    var x,y: Int
}
final class SurfaceFrame {
    let a,b,c :SurfacePoint
    let surfaceID: Int
    let dynamic: Bool

    init(a: SurfacePoint, b: SurfacePoint, c: SurfacePoint, surfaceID: Int, dynamic: Bool) {
        self.a = a
        self.b = b
        self.c = c
        self.surfaceID = surfaceID
        self.dynamic = dynamic
    }
}

In this test, classes took 14.5261079668999 s while the test with structs took only 0.310304999351501 s. Structs were 47X faster.

like image 257
Epic Byte Avatar asked Nov 01 '22 01:11

Epic Byte


1 Answers

As Martin R recommended, I profiled both tests and indeed the retain/release calls is what makes iterating through an array of classes much slower than iterating through the array of structs. Just to be clear, here are the tests I ran.

import Foundation

final class SomeClass {}

struct SomeStruct {}

var classes = [
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
    SomeClass(),
]
var structs = [
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
    SomeStruct(),
]

let startTime = CFAbsoluteTimeGetCurrent()
/*
structTest()
classTest()
*/
let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
println("Time elapsed \(timeElapsed) s")


func structTest() {
    for i in 0 ... 1000000 {
        for e in structs {}
    }
}


func classTest() {
    for i in 0 ... 1000000 {
        for e in classes {}
    }
}

Here are pictures of the profiling of both tests using instruments. You can see just by adding up the running time that the Classes test spends nearly all of its time with retain/releases during each iteration. I'd be interested to see how Swift 2.0 handles this.

Structs enter image description here

Classes enter image description here

So just out of curiosity, I thought what would happen if I could by-pass the retain/release calls by doing pointer arithmetic directly on the array (side note: I recommend you never do this in a real app). So I created one last test. However in this test, instead of iterating the array multiple times I would just create one large array and iterate it once because this is where most of the overhead occurs anyway. I've also decided to access properties in this test to reduce the ambiguity in optimizations.

So here are the results from the final test:

  • One iteration over large Struct array: 1.00037097930908 s
  • One iteration over large Class array: 11.3165299892426 s
  • One iteration over large Struct array using pointer arithmetic: 0.773443996906281 s
  • One iteration over large Class array using pointer arithmetic: 2.81995397806168 s

Below is the code for the test.

final class SomeClass {
    var a: Int
    init(a: Int) {
        self.a = a
    }
}
struct SomeStruct {
    var a: Int
    init(a: Int) {
        self.a = a
    }
}

var classes: [SomeClass] = []
var structs: [SomeStruct] = []

var total: Int = 0

for i in 0 ... 100000000 {
    classes.append(SomeClass(a:i))
    structs.append(SomeStruct(a:i))
}

let startTime = CFAbsoluteTimeGetCurrent()
/*structTest()
classTest()
structTestPointer()
classTestPointer()*/
let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
println("Time elapsed \(timeElapsed) s")

func structTest() {
    for someStruct in structs {
        let a = someStruct.a
        total = total &+ a
    }
}

func structTestPointer() {
    var pointer = UnsafePointer<SomeStruct>(structs)
    for j in 0 ..< structs.count {
        let someStruct = pointer.memory
        let a = someStruct.a
        total = total &+ a
        pointer++
    }
}

func classTest() {
    for someClass in classes {
        let a = someClass.a
        total = total &+ a
    }
}

func classTestPointer() {
    var pointer = UnsafePointer<SomeClass>(classes)
    for j in 0 ..< classes.count {
        let someClass = pointer.memory
        let a = someClass.a
        total = total &+ a
        pointer++
    }
}
like image 122
Epic Byte Avatar answered Nov 15 '22 06:11

Epic Byte