Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Process Array in parallel using GCD

I have a large array that I would like to process by handing slices of it to a few asynchronous tasks. As a proof of concept, I have the written the following code:

class TestParallelArrayProcessing {
    let array: [Int]
    var summary: [Int]

    init() {
        array = Array<Int>(count: 500000, repeatedValue: 0)
        for i in 0 ..< 500000 {
            array[i] = Int(arc4random_uniform(10))
        }
        summary = Array<Int>(count: 10, repeatedValue: 0)
    }

    func calcSummary() {
        let group = dispatch_group_create()
        let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)

        for i in 0 ..< 10 {
            dispatch_group_async(group, queue, {
                let base = i * 50000
                for x in base ..< base + 50000 {
                    self.summary[i] += self.array[x]
                }
            })
        }
        dispatch_group_notify(group, queue, {
            println(self.summary)
        })
    }
}

After init(), array will be initialized with random integers between 0 and 9.

The calcSummary function dispatches 10 tasks that take disjoint chunks of 50000 items from array and add them up, using their respective slot in summary as an accummulator.

This program crashes at the self.summary[i] += self.array[x] line. The error is:

 EXC_BAD_INSTRUCTION (code = EXC_I386_INVOP).

I can see, in the debugger, that it has managed to iterate a few times before crashing, and that the variables, at the time of the crash, have values within correct bounds.

I have read that EXC_I386_INVOP can happen when trying to access an object that has already been released. I wonder if this has anything to do with Swift making a copy of the array if it is modified, and, if so, how to avoid it.

like image 939
Eduardo Avatar asked Nov 01 '14 22:11

Eduardo


1 Answers

This is a slightly different take on the approach in @Eduardo's answer, using the Array type's withUnsafeMutableBufferPointer<R>(body: (inout UnsafeMutableBufferPointer<T>) -> R) -> R method. That method's documentation states:

Call body(p), where p is a pointer to the Array's mutable contiguous storage. If no such storage exists, it is first created.

Often, the optimizer can eliminate bounds- and uniqueness-checks within an array algorithm, but when that fails, invoking the same algorithm on body's argument lets you trade safety for speed.

That second paragraph seems to be exactly what's happening here, so using this method might be more "idiomatic" in Swift, whatever that means:

func calcSummary() {
    let group = dispatch_group_create()
    let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)
    
    self.summary.withUnsafeMutableBufferPointer {
        summaryMem -> Void in
        for i in 0 ..< 10 {
            dispatch_group_async(group, queue, {
                let base = i * 50000
                for x in base ..< base + 50000 {
                    summaryMem[i] += self.array[x]
                }
            })
        }
    }

    dispatch_group_notify(group, queue, {
        println(self.summary)
    })
}
like image 68
Nate Cook Avatar answered Oct 05 '22 12:10

Nate Cook