Why does my version of filter perform so differently than Swift's?

Tags: swift, filter, higher-order-functions, instruments

As an exercise I've rewritten a few of Swift's higher-order functions, one being .filter. I decided to measure my version of .filter against Swift's using Instruments, and I'm rather confused by the results.

Here's what my version of filter looks like, which I admit may be incorrect.

extension Array {
    func myFilter(predicate: Element -> Bool) -> [Element] {
        var filteredArray = [Element]()
        for x in self where predicate(x) {
            filteredArray.append(x)
        }

        return filteredArray
    }
}

What Happened

My Filter

Overall CPU consumption: 85.7%
My Filter's consumption: 67.9%

Swift's Filter

Overall CPU consumption: 57.7%
Swift's filter's consumption: 70.9%

What I expected

I expected similar performance. I'm confused why my filter function call itself would consume less CPU while my overall application CPU usage is nearly 30 percentage points higher.

My Question

If I've written filter wrong, please help me understand my error(s). Otherwise, please explain why Swift's filter reduces overall CPU load by nearly 30 percentage points compared to mine.

asked Dec 31 '15 04:12

Dan Beaulieu

1 Answer

Ok, so after reading all posted comments, I decided to also benchmark, and here are my results. Oddly enough, the built-in filter seems to perform worse than a custom implementation.

TL;DR: Because your function is short and the compiler has access to its source code, the compiler inlines the function call, which enables more optimisations.

Another consideration is that your myFilter declaration doesn't support closures that throw, which the built-in filter does.

Add @inline(never), throws and rethrows to your myFilter declaration and you'll get results similar to those of the built-in filter.

Research results

I used mach_absolute_time() to obtain accurate times. I didn't convert the results to seconds, as I was only interested in the comparison. Tests were run on Yosemite 10.10.5 with Xcode 7.2.

import Darwin

extension Array {
    func myFilter(@noescape predicate: Element -> Bool) -> [Element] {
        var filteredArray = [Element]()
        for x in self where predicate(x) {
            filteredArray.append(x)
        }

        return filteredArray
    }
}

let arr = [Int](1...1000000)

var start = mach_absolute_time()
let _ = arr.filter{ $0 % 2 == 0}
var end = mach_absolute_time()
print("filter:         \(end-start)")

start = mach_absolute_time()
let _ = arr.myFilter{ $0 % 2 == 0}
end = mach_absolute_time()
print("myFilter:       \(end-start)")
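
The listing above times a single run of each call and prints raw mach_absolute_time() ticks. As a rough alternative, here is a minimal sketch in current Swift syntax that repeats each measurement and reports the fastest run in nanoseconds. It assumes DispatchTime.now().uptimeNanoseconds is an acceptable clock for this purpose; the helper name fastestRun and the run count of 5 are made up for illustration and are not what produced the numbers in this answer.

```swift
import Dispatch

// Hypothetical helper, not part of the original answer: runs `body`
// `runs` times and returns the fastest run in nanoseconds.
func fastestRun(runs: Int = 5, _ body: () -> Void) -> UInt64 {
    var best = UInt64.max
    for _ in 0..<runs {
        let start = DispatchTime.now().uptimeNanoseconds
        body()
        let end = DispatchTime.now().uptimeNanoseconds
        best = min(best, end - start)
    }
    return best
}

let numbers = [Int](1...1_000_000)

// Store the result in a variable that is read afterwards, which makes it
// less likely that the optimizer discards the filtering work.
var evens = [Int]()
let nanoseconds = fastestRun { evens = numbers.filter { $0 % 2 == 0 } }
print("filter: \(nanoseconds) ns for \(evens.count) matches")
```

Taking the minimum of several runs is one common way to reduce noise from other processes; an average or median would be an equally reasonable choice.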

In debug mode, filter is faster than myFilter:

filter:         370930078
myFilter:       479532958

In release, however, myFilter is much better than filter:

filter:         15966626
myFilter:       4013645

Even stranger, an exact copy of the built-in filter (taken from Marc's comment) performs better than the built-in one.

extension Array {
    func originalFilter(
        @noescape includeElement: (Generator.Element) throws -> Bool
    ) rethrows -> [Generator.Element] {
        var result = ContiguousArray<Generator.Element>()

        var generator = generate()

        while let element = generator.next() {
            if try includeElement(element) {
                result.append(element)
            }
        }

        return Array(result)
    }
}

start = mach_absolute_time()
let _ = arr.originalFilter{ $0 % 2 == 0}
end = mach_absolute_time()
print("originalFilter: \(end-start)")

With the above code, my benchmark app gives the following output:

filter:         13255199
myFilter:       3285821
originalFilter: 3309898

Back to debug mode, the 3 flavours of filter give this output:

filter:         343038057
myFilter:       429109866
originalFilter: 345482809

filter and originalFilter give very close results, which makes me think that Xcode is linking against the debug version of Swift's stdlib. However, when built in release, Swift's stdlib performs 3 times better than in debug, and this confused me.

So the next step was profiling. I hit Cmd+I, set the sample interval to 40 µs, and profiled the app twice: once with only the filter call enabled, and once with only myFilter enabled. I removed the printing code in order to have a stack trace as clean as possible.

Built-in filter profiling: [image: built-in filter time profiling] (source: cristik-test.info)

myFilter: [image: myFilter time profiling]

Eureka! I found the answer. There's no trace of the myFilter call, meaning that the compiler inlined the function call, thus enabling extra optimizations that give a performance boost.

I added the @inline(never) attribute to myFilter, and its performance degraded.

Next, to bring it closer to the built-in filter, I added the throws and rethrows declarations, as the built-in filter accepts closures that throw errors.

And surprise (or not), this is what I got:

filter: 11489238
myFilter: 6923719
myFilter not inlined: 9275967
my filter not inlined, with throws: 11956755
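
To try the inlined versus non-inlined comparison yourself, a sketch along these lines may help. It uses current Swift syntax (parenthesized closure parameter types, no @noescape) rather than the Swift 2 syntax of this answer, and the name myFilterNotInlined is made up here so that both variants can live side by side. The two bodies are identical; only the @inline(never) attribute differs.

```swift
extension Array {
    // Default: the optimizer is free to inline calls to this method.
    func myFilter(_ predicate: (Element) -> Bool) -> [Element] {
        var filteredArray = [Element]()
        for x in self where predicate(x) {
            filteredArray.append(x)
        }
        return filteredArray
    }

    // Identical body, but the optimizer is told not to inline calls to it.
    // `myFilterNotInlined` is a hypothetical name used only in this sketch.
    @inline(never)
    func myFilterNotInlined(_ predicate: (Element) -> Bool) -> [Element] {
        var filteredArray = [Element]()
        for x in self where predicate(x) {
            filteredArray.append(x)
        }
        return filteredArray
    }
}

let sample = [Int](1...10)
let inlinable = sample.myFilter { $0 % 2 == 0 }
let notInlined = sample.myFilterNotInlined { $0 % 2 == 0 }
print(inlinable == notInlined)
```

Both variants return the same elements, so any timing difference between them in a release build should come from how the call is compiled rather than from what it computes.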

Final conclusion: the fact that the compiler can inline the function call, combined with the lack of support for throwing closures, was responsible for the better performance of your custom filtering method.

The following code gives results very similar to the built-in filter:

extension Array {
    @inline(never)
    func myFilter(predicate: Element throws -> Bool) rethrows -> [Element] {
        var filteredArray = [Element]()
        for x in self where try predicate(x) {
            filteredArray.append(x)
        }

        return filteredArray
    }
}
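
To illustrate what the throws and rethrows pair buys the caller, here is a small sketch in current Swift syntax. The names myThrowingFilter, ParseError and isEvenNumber are made up for this example. A rethrows method only throws when the closure passed to it throws, so call sites with a non-throwing closure need no try.

```swift
// Hypothetical error type used only in this sketch.
enum ParseError: Error {
    case notANumber(String)
}

extension Array {
    // `rethrows`: this method can throw only if `predicate` throws.
    func myThrowingFilter(_ predicate: (Element) throws -> Bool) rethrows -> [Element] {
        var filteredArray = [Element]()
        for x in self {
            if try predicate(x) {
                filteredArray.append(x)
            }
        }
        return filteredArray
    }
}

// A predicate that throws when the text is not an integer.
func isEvenNumber(_ text: String) throws -> Bool {
    guard let value = Int(text) else { throw ParseError.notANumber(text) }
    return value % 2 == 0
}

// Non-throwing closure: no `try` is needed at the call site.
let evenInts = [1, 2, 3, 4].myThrowingFilter { $0 % 2 == 0 }

// Throwing predicate: the call must be marked with `try`.
var caughtError = false
do {
    let evenStrings = try ["1", "2", "x"].myThrowingFilter(isEvenNumber)
    print(evenStrings)
} catch {
    caughtError = true
    print("predicate threw: \(error)")
}
```

The first error thrown by the predicate aborts the filtering and propagates to the caller, so the do block above never reaches its print call.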

Original answer:

Swift's filter should perform better, because:

1. it has access to the internal state of the array and is not forced to go through the enumeration, which means at least one less function call
2. it might optimize the way it builds the result array

#1 might not give much difference, as function calls are not very expensive.

#2 on the other hand might make a big difference for large arrays. Appending a new element to the array might result in the array needing to increase its capacity, which implies allocating new memory and copying the contents of the current state.
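
The capacity growth described in #2 can be observed directly, and one possible mitigation is to reserve space up front. The sketch below, in current Swift syntax, assumes that reserving room for the worst case (every element passing the predicate) is an acceptable memory trade-off; myFilterReserving is a made-up name, and whether it is actually faster was not measured here. Note that the copy of the standard library's filter quoted earlier does not reserve capacity either.

```swift
extension Array {
    // Hypothetical variant: reserves room for the worst case up front,
    // so appends never need to grow the buffer.
    func myFilterReserving(_ predicate: (Element) -> Bool) -> [Element] {
        var filteredArray = [Element]()
        filteredArray.reserveCapacity(count)
        for x in self where predicate(x) {
            filteredArray.append(x)
        }
        return filteredArray
    }
}

// Count how often the capacity changes while appending without reserving.
var growing = [Int]()
var capacityChanges = 0
var lastCapacity = growing.capacity
for i in 0..<1_000 {
    growing.append(i)
    if growing.capacity != lastCapacity {
        capacityChanges += 1
        lastCapacity = growing.capacity
    }
}
print("capacity changed \(capacityChanges) times over \(growing.count) appends")

let reservedEvens = [1, 2, 3, 4].myFilterReserving { $0 % 2 == 0 }
print(reservedEvens)
```

Each capacity change corresponds to a new allocation plus a copy of the existing elements, which is the cost the paragraph above refers to.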

answered Oct 19 '22 22:10

Cristik
