Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is reduce faster than sum or sumBy?

Tags:

f#

My coworker and I were comparing the speed of C# functions when passing in lambdas to do work compared to inlined functions with regards to time spent doing work. We found that you incurred a cost when passing in a lambda projection to a C# select function (for example) and wanted to see if F# had the same issues, or if it did something different.

Regardless of our original intent, we stumbled onto something that we can't figure out. In the following example we sum a list 3 different ways

  1. Reduce
  2. Sum
  3. SumBy

module fs

open NUnit.Framework
open FsUnit
open System
open System.Diagnostics;

[<Test>]
let sumTest() = 
    let nums = [0..1000]

    let repeat = 100000

    let stopWatch = new Stopwatch()

    stopWatch.Start()

    let sumsReduce = 
        [
            for i in [0..repeat] do
                yield List.reduce (+) nums
        ]

    Console.WriteLine("reduce = {0} - Time = {1}", List.head sumsReduce, stopWatch.Elapsed.TotalSeconds); 
    stopWatch.Restart()

    let sumsSum = 
        [
            for i in [0..repeat] do
                yield List.sum nums
        ]

    Console.WriteLine("sum = {0} - Time = {1}", List.head sumsSum, stopWatch.Elapsed.TotalSeconds); 
    stopWatch.Restart()


    let sumsSumBy = 
        [
            for i in [0..repeat] do
                yield List.sumBy id nums
        ]

    Console.WriteLine("sumBy = {0} - Time = {1}", List.head sumsSumBy, stopWatch.Elapsed.TotalSeconds); 
    stopWatch.Restart()

The output to this looks like:

reduce = 500500 - Time = 0.2725156
sum = 500500 - Time = 1.1183165
sumBy = 500500 - Time = 1.1126781

So clearly reduce is the big winner here. In the decompilation, I can see that reduce gets boiled down

[Serializable]
internal class sumsReduce\u004021\u002D1 : OptimizedClosures.FSharpFunc<int, int, int>
{
  internal sumsReduce\u004021\u002D1()
  {
    base.\u002Ector();
  }

  public override int Invoke(int x, int y)
  {
    return x + y;
  }
}

But I'm having a hard time figuring out what sum and sumBy are doing. Where is the timing discrepancy from?


The current answer suggested that reduce is 5 times faster because originally I was giving reduce an unchecked operator. However, updating the test to use a checked operator (from the Checked module) and I still get the same result

let sumsReduce = 
        [
            for i in [0..repeat] do
                yield List.reduce (Checked.(+)) nums
        ]

Notice the time discrepancy still exists

reduce = 500500 - Time = 0.274697
sum = 500500 - Time = 1.1126796
sumBy = 500500 - Time = 1.1370642
like image 211
devshorts Avatar asked Aug 21 '13 20:08

devshorts


2 Answers

Sum and SumBy use an enumerator:

    while e.MoveNext() do
        acc <- Checked.(+) acc e.Current
    acc

whereas reduce uses an recursive loop with an optimised closure: (reduce uses fold under the cover's - fold f head tail)

    let fold<'T,'State> f (s:'State) (list: 'T list) = 
        match list with 
        | [] -> s
        | _ -> 
            let f = OptimizedClosures.FSharpFunc<_,_,_>.Adapt(f)
            let rec loop s xs = 
                match xs with 
                | [] -> s
                | h::t -> loop (f.Invoke(s,h)) t
            loop s list

Using optimised closures can often yield a performance boost.

like image 79
7sharp9 Avatar answered Oct 21 '22 02:10

7sharp9


sum and sumBy use checked arithmetic, but you're passing unchecked operator + to reduce – not exactly apples to apples.

like image 22
ildjarn Avatar answered Oct 21 '22 01:10

ildjarn