
Swift Array extension for standard deviation

I frequently need to calculate the mean and standard deviation of numeric arrays, so I've written a small protocol and extensions for numeric types that seem to work. I would like feedback on whether there is anything wrong with how I have done this. Specifically, I am wondering whether there is a better way to check if the type can be converted to a Double, to avoid the need for the asDouble variable and the init(_: Double) initializer.

I know there are issues with protocols that allow for arithmetic, but this seems to work OK and saves me from putting the standard deviation function into every class that needs it.

protocol Numeric {
    var asDouble: Double { get }
    init(_: Double)
}

extension Int: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Float: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Double: Numeric {var asDouble: Double { get {return Double(self)}}}
extension CGFloat: Numeric {var asDouble: Double { get {return Double(self)}}}

extension Array where Element: Numeric {

    var mean : Element { get { return Element(self.reduce(0, combine: {$0.asDouble + $1.asDouble}) / Double(self.count))}}

    var sd : Element { get {
        let mu = self.reduce(0, combine: {$0.asDouble + $1.asDouble}) / Double(self.count)
        let variances = self.map{pow(($0.asDouble - mu), 2)}
        return Element(sqrt(variances.mean))
    }}
}

Edit: I know it's kind of pointless to get [Int].mean and sd, but I might use Numeric elsewhere, so it's there for consistency.

Edit: as @Severin Pappadeux pointed out, the variance can be expressed in a way that avoids the triple pass over the array (mean, then map, then mean). Here is the final standard deviation extension:

extension Array where Element: Numeric {

    var sd : Element { get {
        let sss = self.reduce((0.0, 0.0)){ return ($0.0 + $1.asDouble, $0.1 + ($1.asDouble * $1.asDouble))}
        let n = Double(self.count)
        return Element(sqrt(sss.1/n - (sss.0/n * sss.0/n)))
    }}
}
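
For readers on current Swift, where the standard library has its own Numeric protocol and reduce no longer takes a combine: label, the same single-pass idea can be sketched on FloatingPoint. It relies on the identity that the population variance equals the mean of the squares minus the square of the mean. This is only a sketch: the property name populationSD, the nil result for an empty array, and the clamp to zero are assumptions added here, not part of the original code.

```swift
extension Array where Element: FloatingPoint {
    /// Population standard deviation computed in one pass, or nil for an empty array.
    /// `populationSD` is a hypothetical name, not taken from the question.
    var populationSD: Element? {
        guard !isEmpty else { return nil }
        // Accumulate (sum of x, sum of x squared) in a single reduce.
        let sums = reduce((Element.zero, Element.zero)) { acc, x in
            (acc.0 + x, acc.1 + x * x)
        }
        let n = Element(count)
        let mean = sums.0 / n
        // Rounding can push the difference slightly below zero,
        // so clamp it before taking the square root.
        let variance = Swift.max(sums.1 / n - mean * mean, 0)
        return variance.squareRoot()
    }
}

let sample: [Double] = [2, 4, 4, 4, 5, 5, 7, 9]
let spread = sample.populationSD   // 2.0: mean is 5, mean of squares is 29
```

One caveat with this formula: when the mean is large compared with the spread of the values, the subtraction can lose precision, so a two-pass version may be preferable for such data.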

twiz_ asked Jul 17 '16 14:07


4 Answers

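Swift 4 Array extension with FloatingPoint elements. Note that std() divides by n - 1, so it returns the sample standard deviation rather than the population value computed in the question: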

extension Array where Element: FloatingPoint {

    func sum() -> Element {
        return self.reduce(0, +)
    }

    func avg() -> Element {
        return self.sum() / Element(self.count)
    }

    func std() -> Element {
        let mean = self.avg()
        let v = self.reduce(0, { $0 + ($1-mean)*($1-mean) })
        return sqrt(v / (Element(self.count) - 1))
    }

}
David Thorsrud answered Oct 17 '22 02:10


Foundation already has a class that provides this functionality: NSExpression. You could reduce your code size and complexity by using it instead. There is quite a lot to this class, but a simple implementation of what you want is as follows.

let expression = NSExpression(forFunction: "stddev:", arguments: [NSExpression(forConstantValue: [1, 2, 3, 4, 5])])
// Swift 3 and later spell this method expressionValue(with:context:); it returns Any?.
let standardDeviation = expression.expressionValue(with: nil, context: nil)

You can calculate mean too, and much more. Info here: http://nshipster.com/nsexpression/

Jordan Smith answered Oct 17 '22 01:10


In Swift 3 you might (or might not) be able to save yourself some duplication with the FloatingPoint protocol, but otherwise what you're doing is exactly right.

matt answered Oct 17 '22 00:10


To follow up on matt's observation, I'd implement the main algorithm on FloatingPoint, which takes care of Double, Float, CGFloat, etc. I would then add a variant of it on BinaryInteger to take care of all of the integer types.

E.g. on FloatingPoint:

extension Array where Element: FloatingPoint {
    
    /// The mean average of the items in the collection.
    
    var mean: Element { return reduce(Element(0), +) / Element(count) }
    
    /// The unbiased sample standard deviation. Is `nil` if the collection has fewer than two items.
    
    var stdev: Element? {
        guard count > 1 else { return nil }
        
        return sqrt(sumSquaredDeviations() / Element(count - 1))
    }
    
    /// The population standard deviation. Is `nil` if the collection is empty.
    
    var stdevp: Element? {
        guard count > 0 else { return nil }
        
        return sqrt(sumSquaredDeviations() / Element(count))
    }
    
    /// Calculate the sum of the squares of the differences of the values from the mean
    ///
    /// A calculation common for both sample and population standard deviations.
    ///
    /// - calculate mean
    /// - calculate deviation of each value from that mean
    /// - square that
    /// - sum all of those squares
    
    private func sumSquaredDeviations() -> Element {
        let average = mean
        return map {
            let difference = $0 - average
            return difference * difference
        }.reduce(Element(0), +)
    }
}

But then on BinaryInteger:

extension Array where Element: BinaryInteger {
    var mean: Double { return map { Double(exactly: $0)! }.mean }
    var stdev: Double? { return map { Double(exactly: $0)! }.stdev }
    var stdevp: Double? { return map { Double(exactly: $0)! }.stdevp }
}

Note that in my scenario, even when dealing with integer input data, I generally want floating-point means and standard deviations, so I arbitrarily chose Double. You might also want safer unwrapping of Double(exactly:), which returns nil when the integer cannot be represented exactly. You can handle that case any way you want; this just illustrates the idea.
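
One possible way to avoid the force unwrap, assuming that rounding very large integers to the nearest representable Double is acceptable, is the non-failable Double(_:) initializer. The sketch below is self-contained and uses hypothetical property names (roundedMean, roundedStdev) so that it does not clash with the extensions above; it is one option among several, not the approach this answer prescribes.

```swift
extension Array where Element: BinaryInteger {
    /// Elements converted with the rounding `Double(_:)` initializer, which never fails.
    private var asDoubles: [Double] { return map { Double($0) } }

    /// Mean as a Double, or nil for an empty array. (Hypothetical name.)
    var roundedMean: Double? {
        guard !isEmpty else { return nil }
        return asDoubles.reduce(0.0, +) / Double(count)
    }

    /// Unbiased sample standard deviation, or nil with fewer than two items.
    /// (Hypothetical name.)
    var roundedStdev: Double? {
        guard count > 1, let mu = roundedMean else { return nil }
        // Sum of squared deviations from the mean.
        let sumSquares = asDoubles.reduce(0.0) { $0 + ($1 - mu) * ($1 - mu) }
        return (sumSquares / Double(count - 1)).squareRoot()
    }
}

let counts: [Int] = [1, 2, 3, 4, 5]
let countsMean = counts.roundedMean     // Optional(3.0)
let countsStdev = counts.roundedStdev   // square root of 2.5
```

The trade-off is silent rounding: on 64-bit platforms Double(exactly: Int.max) is nil, whereas Double(Int.max) quietly rounds up to 2^63, so this variant never crashes but can be slightly off for values beyond 2^53.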

Rob answered Oct 17 '22 01:10