
Recursive Sequences in F#

Let's say I want to calculate the factorial of an integer. A simple approach to this in F# would be:

let rec fact (n: bigint) =
    match n with
    | x when x = 0I -> 1I
    | _ -> n * fact (n-1I)

But, if my program needs dynamic programming, how could I sustain functional programming whilst using memoization?

One idea I had for this was making a sequence of lazy elements, but I ran into a problem. Assume that the following code were acceptable in F# (it is not):

let rec facts = 
    seq {
        yield 1I
        for i in 1I..900I do 
            yield lazy (i * (facts |> Seq.item ((i-1I) |> int)))
    }

Is there anything similar to this idea in F#? (Note: I understand that I could use a .NET Dictionary but isn't invoking the ".Add()" method imperative style?)

Also, is there any way I could generalize this with a function? For example, could I create a sequence of the lengths given by the Collatz function defined below:

let rec collatz n i = 
    if n = 0 || n = 1 then (i+1)
    elif n % 2 = 0 then collatz (n/2) (i+1) 
    else collatz (3*n+1) (i+1)
asked Feb 08 '16 by ljeabmreosn

2 Answers

If you want to do it lazily, this is a nice approach:

let factorials =
    // 1I, 2I, 3I, ... as an infinite sequence
    Seq.initInfinite (fun n -> bigint n + 1I)
    // running product starting from 1I: 1I, 1I, 2I, 6I, 24I, ...
    |> Seq.scan ((*)) 1I
    // remember already-computed elements so they are not re-evaluated
    |> Seq.cache

The Seq.cache means you won't repeatedly evaluate elements you've already enumerated.

You can then take a particular number of factorials using e.g. Seq.take n, or get a particular factorial using Seq.item n.
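For example (a small usage sketch of mine; the specific values are not from the answer):

// index n of the cached sequence is n!, since Seq.scan yields the initial 1I first
factorials |> Seq.item 5                  // 120I, i.e. 5!
factorials |> Seq.take 6 |> List.ofSeq    // [1I; 1I; 2I; 6I; 24I; 120I]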

answered by TheInnerLight


First of all, I don't see what you mean by "dynamic programming" in your example.


Using memoization doesn't mean something is not "functional" or breaks immutability. The important point is not how something is implemented, but how it behaves. A function that uses a mutable cache internally is still considered pure as long as it behaves like a pure function. Using mutable variables in a limited scope that is not visible to the caller is still considered pure. If the implementation mattered, we would also have to consider tail recursion impure, because the compiler transforms it into a loop with mutable variables under the hood. There are also several List functions that use mutation internally just for speed; they are still considered pure/immutable because they still behave like pure functions.
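
A small illustration of that point (my own example, not from the answer): the mutation below is confined to the function body, so the caller can never observe it and the function behaves like a pure one.

let sumTo (n: int) =
    let mutable acc = 0
    for i in 1 .. n do
        acc <- acc + i
    acc   // same result for the same input, no observable side effects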


A sequence is already lazy: it only computes its elements when you ask for them. So it doesn't make much sense to me to create a sequence that returns lazy elements.
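
A quick sketch (my own example) of that laziness: the printfn only fires for the elements that are actually requested.

let squares =
    seq {
        for i in 1 .. 5 do
            printfn "computing %d" i   // side effect only to make evaluation visible
            yield i * i
    }

// prints "computing 1" and "computing 2" only
let firstTwo = squares |> Seq.take 2 |> List.ofSeq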


If you want to speed up the computation, there are multiple ways to do it. Even in the recursive version you can use an accumulator that is passed to the next function call, instead of doing deep recursion.

// tail-recursive version with an accumulator (assumes n >= 1I)
let fact n =
    let rec loop acc x =
        if   x = n 
        then acc * x
        else loop (acc*x) (x+1I)
    loop 1I 1I

That overall is the same as

let fact' n =
    let mutable acc = 1I
    let mutable x   = 1I
    while x <= n do
        acc <- acc * x
        x   <- x + 1I
    acc
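
A quick check of both versions (the values are mine, not from the answer):

printfn "%A" (fact  5I)   // 120
printfn "%A" (fact' 5I)   // 120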

As long as you are learning functional programming, it is a good idea to get accustomed to the first version and to understand how looping and recursion relate to each other. But beyond learning, there is no reason to force yourself to always write the first version. In the end you should use whatever you consider more readable and easier to understand, regardless of whether the implementation uses a mutable variable or not.

In the end nobody really cares about the exact implementation. We should view functions as black boxes. As long as a function behaves like a pure function, everything is fine.


The above uses an accumulator, so you don't need to repeatedly call a function again to get a value, and you also don't need an internal mutable cache. If you really have a slow recursive version and want to speed it up with caching, you can use something like this:

let fact x =
    let rec fact x =
        match x with
        | x when x = 1I -> 1I
        | x             -> (fact (x-1I)) * x

    // note: this cache is created anew on every call to the outer fact,
    // so it only lives for a single call (see below for an external cache)
    let cache = System.Collections.Generic.Dictionary<bigint,bigint>()
    match cache.TryGetValue x with
    | false,_ -> 
        let value = fact x
        cache.Add(x,value)
        value
    | true,value ->
        value

But that would probably be slower than the version with an accumulator. If you want to cache calls to fact across multiple calls in your whole application, then you need an external cache. You could create a Dictionary outside of fact and use a private variable for this. But you can also use a function with a closure and make the whole process generic.

let memoize (f:'a -> 'b) =
    let cache = System.Collections.Generic.Dictionary<'a,'b>()
    fun x ->
        match cache.TryGetValue x with
        | false,_ ->
            let value = f x
            cache.Add(x,value)
            value
        | true,value ->
            value

let rec fact x =
    match x with
    | x when x = 1I -> 1I
    | x             -> (fact (x-1I)) * x

So now you can use it like this:

let fact = memoize fact
printfn "%A" (fact 100I)
printfn "%A" (fact 100I)

and in the same way you can create a memoized version of any other function that takes one parameter.
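
For example (a hedged sketch of mine, assuming the collatz function from the question is in scope; collatzLength is my own name), a function of more than one argument can be wrapped so that it takes a single parameter and then memoized:

// wrap collatz so it takes one argument, then memoize the wrapper
let collatzLength = memoize (fun n -> collatz n 0)

printfn "%d" (collatzLength 27)   // computed on the first call
printfn "%d" (collatzLength 27)   // served from the cache on the second call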


Note that memoization doesn't automatically speed everything up. If you use the memoize function on fact, nothing gets faster; it will even be slower than without the memoization. You can add a printfn "Cache Hit" to the | true,value -> branch inside the memoize function. Calling fact 100I twice in a row will only yield a single "Cache Hit" line.

The problem is how the algorithm works. It starts at 100I and recurses down to 1I. On the first call the cache is empty, so no lookup ever finds a result and every value still has to be calculated; you never get a "Cache Hit" and you pay the additional cost of consulting the cache. On top of that, with let fact = memoize fact the recursive calls inside fact still go to the original, un-memoized fact, so the intermediate results are never stored. To really benefit from the cache you would need to change fact itself, so that it builds up from 1I to 100I. The current version even throws a StackOverflowException for big inputs, even with the memoize function.

Only the second call benefits from the cache; that is why calling fact 100I twice will only ever print "Cache Hit" once.

This is just an example of how easy it is to get the behaviour wrong with caching/memoization. In general you should try to write functions that are tail-recursive and use accumulators instead; don't write functions that depend on memoization to perform properly.
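
As a hedged sketch of what "changing fact itself" could look like (my own code and names, not from the answer): fill the cache bottom-up from 1I, so intermediate results are cached, repeated calls benefit, and there is no deep recursion.

let factCached =
    let cache = System.Collections.Generic.Dictionary<bigint, bigint>()
    cache.[0I] <- 1I
    // highest input computed so far; a ref cell because closures cannot capture mutable lets
    let highest = ref 0I
    fun (n: bigint) ->
        if n > highest.Value then
            let mutable x = highest.Value + 1I
            while x <= n do
                cache.[x] <- cache.[x - 1I] * x   // n! = (n-1)! * n
                x <- x + 1I
            highest.Value <- n
        cache.[n]

printfn "%A" (factCached 100I)   // fills the cache up to 100I
printfn "%A" (factCached 50I)    // answered from the cache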


I would pick a solution with an accumulator. If you profile your application and find that it is still too slow, that fact is a bottleneck, and that caching would help, then you can also just cache the results of fact directly, using either dict or a Map. Something like this:

let factCache = [1I..100I] |> List.map (fun x -> x,fact x) |> dict          // read-only IDictionary
let factCache = [1I..100I] |> List.map (fun x -> x,fact x) |> Map.ofList    // immutable Map
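
Looking up a cached value is then just an indexer access (an assumed usage, not from the answer):

printfn "%A" factCache.[10I]   // 3628800, works with both the dict and the Map version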
answered by David Raab