Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Profiling F# for performance of recursive functions

Tags:

profiling

f#

I decided to use F# to solve the second problem for the first day of Advent of Code 2018 (performing a cyclic summation and finding the first repeated sum) in an expressive manner, but the performance is lacking and I can't find the cause of the slowdown.

Problem solved as in Python 3

For a given input with ~140,000 summations, this code executes in a few seconds.

data = list(map(int, '''
+1
-1
'''.strip().splitlines()))
from itertools import cycle, accumulate
class superset(set):
    def add(self, other):
        super().add(other)
        return other

def mapwhile(func, pred, iterable):
    for i in iterable:
        if not pred(i):
            yield func(i)
            return
        yield func(i)

def last(iterable):
    return list(iterable)[-1]

s = superset([0])
print(last(mapwhile(s.add, lambda x: x not in s, accumulate(cycle(data)))))

Problem solved as in F#

I added a conditional breakpoint on the match expression to time every thousandth i, it seems this code performs ~100 sums/sec and would not come to a solution even after an hour. A dramatic slowdown by a ridiculous order in magnitude.

let input = @"
+1
-1
"
let cycle xs = seq { while true do yield! xs }
let accumusum xs = Seq.scan(fun acc elem -> acc + elem) 0 xs

let rec findfreqcycle i (s:int Set) (data:int seq) = 
    let head, tail = Seq.head data, Seq.tail data
    match s.Contains(head) with
    | false -> findfreqcycle (i+1) (s.Add(head)) (tail)
    | true ->  head


let data = input.Trim().Split('\n') |> Seq.map int |> Seq.toList |> cycle
accumusum data |> findfreqcycle 0 Set.empty

As far as I can tell, the core ideas behind each code sample are more or less the same. The input is eagerly parsed only once, with a generator function/sequence to lazily repeat each number.

The only difference is that the function to actually find the first repeating summation is recursive in the F# example. Memory profiling indicates almost constant memory usage, and tail recursion is active.

What could I be doing wrong, and how can I better profile these recursive and generative functions for performance?

like image 963
I'll Eat My Hat Avatar asked Dec 10 '22 04:12

I'll Eat My Hat


1 Answers

As mentioned in the comments, Seq.tail is horribly inefficient, especially if you use it in a loop in the way you do. The reason is that it creates a new sequence that iterates over the original sequence and skips the first element (so, after 1000 iterations, you have to go over 1000 sequences, each skipping one element).

The pattern with head and tail works much better if you use a list, because functional lists have been designed for this kind of processing. In your case, you could do something like this (which follows the same pattern as your original function):

let rec findfreqcycle sum (s:int Set) input data = 
    match data with 
    | x::xs when s.Contains (sum + x) -> (sum + x)
    | x::xs -> findfreqcycle (sum + x) (s.Add (sum + x)) input xs
    | [] ->  findfreqcycle sum s input input

let data = input.Trim().Split('\n') |> Seq.map int |> Seq.toList 
findfreqcycle 0 Set.empty data data

I changed it so that it uses pattern matching (on lists). I also changed the code so that it takes a finite list and, when it gets to the end, it starts again. As a consequence, it also sums the numbers on the fly (rather than using Seq.scan - that would not work here, because I'm not using infinite lists).

On the input from Pastebin, I get the result 448 in about 0.17 seconds.

like image 187
Tomas Petricek Avatar answered Dec 18 '22 01:12

Tomas Petricek