Multithreaded Functional Programming in Swift

Tags: arrays, multithreading, swift

I've been manipulating byte arrays in Swift 2.1 lately, and I often find myself writing code like this:

// code to add functions to a [UInt8] object
extension CollectionType where Generator.Element == UInt8 {

    func xor(with byte: UInt8) -> [UInt8] {
        return map { $0 ^ byte }
    }
}

// example usage: [67, 108].xor(with: 0) == [67, 108]

Is there an easy way to parallelize this map call, so that multiple threads can operate on non-overlapping areas of the array at the same time?

I could write code to manually divide the array into sub-arrays and call map on each sub-array in distinct threads. But I wonder if some framework exists in Swift to do the division automatically, since map is a functional call that can work in a thread-safe environment without side-effects.

Clarifying notes:

The code only needs to work on a [UInt8] object, not necessarily every CollectionType.

417

asked Dec 09 '15 01:12

m81

2 Answers

The easiest way to perform a loop of calculations in parallel is concurrentPerform (previously called dispatch_apply; see Performing Loop Iterations Concurrently in the Concurrency Programming Guide). But, no, there is no map rendition that will do this for you. You have to do this yourself.

For example, you could write an extension to perform the concurrent tasks:

import Foundation

extension Array {
    public func concurrentMap<T>(_ transform: (Element) -> T) -> [T] {
        var results = [Int: T](minimumCapacity: count)

        let lock = NSLock()

        DispatchQueue.concurrentPerform(iterations: count) { index in
            let result = transform(self[index])
            lock.synchronized {
                results[index] = result
            }
        }

        return (0 ..< results.count).compactMap { results[$0] }
    }
}

Here, synchronized is a small convenience extension on NSLocking:

extension NSLocking {
    func synchronized<T>(block: () throws -> T) rethrows -> T {
        lock()
        defer { unlock() }
        return try block()
    }
}

You can use whatever synchronization mechanism you want (locks, serial queues, reader-writer), but the idea is to perform transform concurrently and then synchronize the update of the collection.
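As one illustration of swapping the synchronization mechanism, here is a minimal sketch that guards the results dictionary with a serial dispatch queue instead of an NSLock. The method name concurrentMapUsingQueue and the queue label are made up for this example; the structure otherwise mirrors the extension above.

```swift
import Foundation

extension Array {
    /// Hypothetical variant of `concurrentMap` that serializes writes to
    /// `results` through a serial queue rather than an NSLock.
    func concurrentMapUsingQueue<T>(_ transform: (Element) -> T) -> [T] {
        var results = [Int: T](minimumCapacity: count)
        // Dispatch queues are serial by default; the label is arbitrary.
        let queue = DispatchQueue(label: "concurrentMap.results")

        DispatchQueue.concurrentPerform(iterations: count) { index in
            let result = transform(self[index])     // runs concurrently
            queue.sync { results[index] = result }  // one writer at a time
        }

        return (0 ..< results.count).compactMap { results[$0] }
    }
}
```

The transform still runs in parallel; only the dictionary update is serialized, so the choice between a lock and a queue is mostly a matter of taste and measured overhead.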

Note:

This will block the thread you call it from (just like the non-concurrent map will), so make sure to dispatch this to a background queue.
One needs to ensure that there is enough work on each thread to justify the inherent overhead of managing all of these threads. (E.g. a single xor per iteration is not sufficient, and you'll find that it's actually slower than the non-concurrent rendition.) In these cases, make sure you stride so that each concurrent block gets a balanced, substantial amount of work (see Improving Loop Code). For example, rather than doing 5000 iterations of one extremely simple operation, do 10 iterations of 500 operations each. You may have to experiment to find suitable striding values.
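To make the striding advice concrete for the xor case from the question, here is a rough sketch that hands each concurrent iteration a whole chunk of bytes rather than a single byte. The function name concurrentXor and the default chunkSize of 64 KiB are assumptions for illustration, not tuned values, and whether this beats a plain map on a given device would have to be measured.

```swift
import Foundation

/// Hypothetical helper: XORs every byte with `key`, processing the array in
/// `chunkSize`-byte strides so each concurrent iteration has real work to do.
func concurrentXor(_ bytes: [UInt8], with key: UInt8, chunkSize: Int = 65_536) -> [UInt8] {
    precondition(chunkSize > 0, "chunkSize must be positive")
    var output = bytes
    let count = output.count
    let chunkCount = (count + chunkSize - 1) / chunkSize   // ceiling division

    output.withUnsafeMutableBufferPointer { buffer in
        // An empty array may have no base address; there is nothing to do then.
        guard let base = buffer.baseAddress else { return }
        DispatchQueue.concurrentPerform(iterations: chunkCount) { chunk in
            let start = chunk * chunkSize
            let end = min(start + chunkSize, count)
            // Chunks never overlap, so these writes need no lock.
            for i in start ..< end {
                base[i] ^= key
            }
        }
    }
    return output
}
```

Because every chunk writes to its own region of the buffer, the lock from the concurrentMap extension is unnecessary here; the price is dropping down to an unsafe buffer pointer.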

While I suspect you don't need this discussion, for readers unfamiliar with concurrentPerform (formerly known as dispatch_apply), I'll illustrate its use below. For a more complete discussion on the topic, refer to the links above.

For example, let's consider something far more complicated than a simple xor (because with something that simple, the overhead outweighs any performance gained), such as a naive Fibonacci implementation:

func fibonacci(_ n: Int) -> Int {
    if n == 0 || n == 1 {
        return n
    }
    return fibonacci(n - 1) + fibonacci(n - 2)
}

If you had an array of Int values for which you wanted to calculate Fibonacci numbers, rather than:

let results = array.map { fibonacci($0) }

You could:

var results = [Int](repeating: 0, count: array.count)
let lock = NSLock()
DispatchQueue.concurrentPerform(iterations: array.count) { index in
    let result = fibonacci(array[index])                // the expensive part runs concurrently
    lock.synchronized { results[index] = result }       // or use whatever synchronization mechanism you want
}

Or, if you want a functional rendition, you can use that extension I defined above:

let results = array.concurrentMap { fibonacci($0) }
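Since concurrentMap blocks the calling thread, as noted earlier, the call is normally wrapped in a dispatch to a background queue. The helper below is a minimal sketch of that pattern; runInBackground and its callbackQueue parameter are hypothetical names introduced for this example.

```swift
import Foundation

/// Hypothetical helper: runs `work` on a global background queue and hands
/// the result to `completion` on `callbackQueue` (the main queue by default).
func runInBackground<T>(
    _ work: @escaping () -> T,
    callbackQueue: DispatchQueue = .main,
    completion: @escaping (T) -> Void
) {
    DispatchQueue.global(qos: .userInitiated).async {
        let value = work()   // the blocking call happens off the caller's thread
        callbackQueue.async { completion(value) }
    }
}

// Usage sketch (assumes `array`, `fibonacci`, and `concurrentMap` from above):
// runInBackground({ array.concurrentMap { fibonacci($0) } }) { results in
//     print(results)
// }
```

Delivering the result on the main queue by default suits UI code; pass a different callbackQueue when the caller is not running a main run loop.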

For a Swift 2 rendition, see the previous revision of this answer.

177

answered Nov 04 '22 02:11

Rob

My implementation seems to be correct and performs well by comparison with all the others I've seen. Tests and benchmarks are here

extension RandomAccessCollection {
    /// Returns `self.map(transform)`, computed in parallel.
    ///
    /// - Requires: `transform` is safe to call from multiple threads.
    func concurrentMap<B>(_ transform: (Element) -> B) -> [B] {
        let batchSize = 4096 // Tune this
        let n = self.count
        let batchCount = (n + batchSize - 1) / batchSize
        if batchCount < 2 { return self.map(transform) }

        return Array(unsafeUninitializedCapacity: n) {
            uninitializedMemory, resultCount in
            resultCount = n
            let baseAddress = uninitializedMemory.baseAddress!

            DispatchQueue.concurrentPerform(iterations: batchCount) { b in
                let startOffset = b * n / batchCount
                let endOffset = (b + 1) * n / batchCount
                var sourceIndex = index(self.startIndex, offsetBy: startOffset)
                for p in baseAddress+startOffset..<baseAddress+endOffset {
                    p.initialize(to: transform(self[sourceIndex]))
                    formIndex(after: &sourceIndex)
                }
            }
        }
    }
}
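The offset arithmetic in the extension above can be examined in isolation. The sketch below uses a made-up helper, batchRanges, that reproduces the same startOffset and endOffset formulas and returns the index range each batch would cover.

```swift
/// Hypothetical helper mirroring the offset arithmetic above: returns the
/// half-open index range handled by each batch.
func batchRanges(count n: Int, batchSize: Int) -> [Range<Int>] {
    precondition(n >= 0 && batchSize > 0)
    let batchCount = (n + batchSize - 1) / batchSize   // ceiling division
    return (0 ..< batchCount).map { b in
        (b * n / batchCount) ..< ((b + 1) * n / batchCount)
    }
}

// With 10_000 elements and a batch size of 4096 there are 3 batches:
// 0..<3333, 3333..<6666, 6666..<10000
```

Dividing by batchCount rather than stepping by batchSize spreads the elements so that batch lengths differ by at most one, instead of leaving one short batch at the end.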

Hope this helps,

-Dave

answered Nov 04 '22 01:11

Dave Abrahams
