Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Adding to array in parallel

I'm using Grand Central Dispatch to transforms elements of one array into another. I call dispatch_apply on the source array, transform it into zero or more items, then add them to the destination array. This is a simplified example:

let src = Array(0..<1000)
var dst = [UInt32]()

let queue = dispatch_queue_create("myqueue", DISPATCH_QUEUE_CONCURRENT)
dispatch_apply(src.count, queue) { i in
    dst.append(arc4random_uniform(UInt32(i))) // <-- potential error here
}

print(dst)

I sometimes get an error on the append line. The error is always one of:

1. malloc: *** error for object 0x107508f00: pointer being freed was not allocated
2. fatal error: UnsafeMutablePointer.destroy with negative count
3. fatal error: Can't form Range with end < start

I guess this is due to append not being thread-safe. What did I do wrong and how to fix it?

like image 883
Jenny Avatar asked Nov 04 '15 02:11

Jenny


People also ask

How do you use an array in parallel?

Parallel arrays use two or more arrays to represent a collection of data where each corresponding array index is a matching field for a given record. For example, if there are two arrays, one for names and one for ages, the array elements at names[2] and ages[2] would describe the name and age of the third person.

What is the problem with parallel arrays?

Disadvantages of Parallel Array Because the various arrays can be stored randomly far apart, they have significantly worse locality of reference when trying to visit the records non-sequentially and examining multiple fields of each record.

What are the limitations of parallel arrays?

Parallel arrays may be arbitrarily composed with indexed arrays and records, but they cannot be nested inside of other parallel arrays, i.e., the elements of a parallel array cannot be a parallel array. The rank of a parallel array is its number of dimensions.

What is a parallel array?

Last Updated : 22 Jun, 2021 Parallel Array: Also known as structure an array (SoA), multiple arrays of the same size such that i-th element of each array is closely related and all i-th elements together represent an object or entity. An example parallel array is two arrays that represent x and y co-ordinates of n points.

What is a group of parallel arrays in SQL?

A group of parallel arrays is a form of implicit data structure that uses multiple arrays to represent a singular array of records. It keeps a separate, homogeneous data array for each field of the record, each having the same number of elements.

How to add values to ArrayList in Java?

ArrayList allows using Add () method to add the values to the array list. We can add the output here because the ArrayList is not the fixed size. Adding values to the array. In this example, we are adding one more animal, “Cow”, to the Array $Animals.

How to add and remove an array in PowerShell?

PowerShell arrays are of a fixed size, so when adding or removing the array to the list, it destroys the existing array and creates a new one, and array with the fixed size doesn’t allow the Add () or Remove () method; instead, we can use the += (Plus Equal) for add and -= (Minus Equal) operators for add and the remove operation.


2 Answers

As you have discovered, Swift may relocate the array for memory efficiency. When you run multiple append, it's bound to happen. My guess is that appending to the array is a very low cost operation compared to your transformation. You can create a serial queue just to extend the dst array:

let src = Array(0..<1000)
var dst = [UInt32]()

let queue = dispatch_queue_create("myqueue", DISPATCH_QUEUE_CONCURRENT)
let serialQueue = dispatch_queue_create("mySerialQueue", DISPATCH_QUEUE_SERIAL)
let serialGroup = dispatch_group_create()

dispatch_apply(src.count, queue) { i in
    let result = arc4random_uniform(UInt32(i)) // Your long-running transformation here
    dispatch_group_async(serialGroup, serialQueue) {
        dst.append(result)
    }
}

// Wait until all append operations are complete
dispatch_group_wait(serialGroup, DISPATCH_TIME_FOREVER)
like image 186
Code Different Avatar answered Nov 10 '22 15:11

Code Different


In case anyone is interested in Swift 4 (I suppose Swift 3 onwards) solution (slightly different approach):

let source = Array(0..<1000)

// If inside class/struct, make this lazy or execute the block inside init
var destination: [UInt32] = {
    var copy = Array<UInt32>.init(repeating: 0, count: source.count)

    DispatchQueue.concurrentPerform(iterations: source.count) { i in
        // Do stuff to get the i'th element 
        copy[i] = UInt32.random(in: 0..<i) 
    }

    return copy
}()

If you choose to avoid the tie between your array size and the number of concurrent operations:

(e.g. you may choose to do it in two iterations 0..<500 in parallel with 500..<1000)

let iterations = 2  

DispatchQueue.concurrentPerform(iterations: iterations) { index in

    // First iteration would start at 0 whereas second (index = 1) starts at 500
    let start = index * source.count / iterations

    // First and second iterations will have 500 and 1000 upper bounds respectively 
    let end = (index + 1) * source.count / iterations

    for i in start..<end {
        // Do stuff to get the i'th element
        copy[i] = UInt32.random(in: 0..<source.count)
    }
}
like image 43
Lukas Avatar answered Nov 10 '22 14:11

Lukas