I have this "learning code" I wrote for the Morris sequence in F# that suffers from a stack overflow that I don't know how to avoid. "morris" returns an infinite sequence of "see and say" sequences (i.e., {{1}, {1,1}, {2,1}, {1,2,1,1}, {1,1,1,2,2,1}, {3,1,2,2,1,1},...}).
    let printList l =
        Seq.iter (fun n -> printf "%i" n) l
        printfn ""

    let rec morris s =
        let next str = seq {
            let cnt = ref 1  // Stack overflow is below when enumerating
            for cur in [|0|] |> Seq.append str |> Seq.windowed 2 do
                if cur.[0] <> cur.[1] then
                    yield!( [!cnt ; cur.[0]] )
                    cnt := 0
                incr cnt
        }
        seq {
            yield s
            yield! morris (next s) // tail recursion, no stack overflow
        }

    // "main"
    // Print the nth iteration
    let _ = [1] |> morris |> Seq.nth 3125 |> printList
You can pick off the nth iteration using Seq.nth, but you can only get so far before you hit a stack overflow. The one bit of recursion I have is tail recursion, and in essence it builds a linked set of enumerators. That's not where the problem is. The overflow happens when "enum" is called on, say, the 4000th sequence. (Note that's with F# 1.9.6.16; the previous version topped out above 14000.) It's because of the way the linked sequences are resolved. The sequences are lazy, and so the "recursion" is lazy. That is, seq n calls seq n-1, which calls seq n-2, and so forth to get the first item (the very first # is the worst case).
I understand that [|0|] |> Seq.append str |> Seq.windowed 2 is making my problem worse, and I could triple the number I could generate if I eliminated it. Practically speaking, the code works well enough: the 3125th iteration of morris would be over 10^359 characters in length.
The problem I'm really trying to solve is how to retain the lazy eval and have no limit based on stack size for the iteration I can pick off. I'm looking for the proper F# idiom to make the limit based on memory size.
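For readers unfamiliar with the sequence, a single "see and say" step can be sketched eagerly on plain lists. This is only an illustration for checking small cases: the names `nextEager` and `firstSix` are hypothetical and not part of the code above, and the sketch deliberately gives up the laziness the question is about.

```fsharp
// Hypothetical helper (not from the post): one eager "see and say" step.
let nextEager (xs: int list) : int list =
    // Accumulate runs as (count, value) pairs, most recent run first.
    let step acc x =
        match acc with
        | (n, v) :: rest when v = x -> (n + 1, v) :: rest
        | _ -> (1, x) :: acc
    List.fold step [] xs
    |> List.rev                                 // restore left-to-right order
    |> List.collect (fun (n, v) -> [n; v])      // emit "count, value" per run

// The first six terms, starting from [1].
let firstSix =
    Seq.unfold (fun s -> Some (s, nextEager s)) [1]
    |> Seq.take 6
    |> Seq.toList
```

Because every term is fully materialized, this version is bounded by memory long before the 3125th iteration, which is exactly why the lazy formulation matters.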
Update Oct '10
After learning F# a bit better, a tiny bit of Haskell, and thinking about and investigating this problem for over a year, I can finally answer my own question. But as always with difficult problems, the problem starts with it being the wrong question. The problem isn't sequences of sequences; it's really because of a recursively defined sequence. My functional programming skills are a little better now, so it's easier to see what's going on with the version below, which still gets a stack overflow:
    let next str =
        Seq.append str [0]
        |> Seq.pairwise
        |> Seq.scan (fun (n,_) (c,v) ->
                if (c = v) then (n+1,Seq.empty)
                else (1,Seq.ofList [n;c]) ) (1,Seq.empty)
        |> Seq.collect snd

    let morris = Seq.unfold(fun sq -> Some(sq,next sq))
That basically creates a really long chain of Seq processing function calls to generate the sequences. The Seq module that comes with F# is what can't follow the chain without using the stack. There's an optimization it uses for append and recursively defined sequences, but that optimization only works if the recursion is implementing an append.
So this will work
    let rec ints n = seq { yield n; yield! ints (n+1) }
    printf "%A" (ints 0 |> Seq.nth 100000);;
And this one will get a stackoverflow.
    let rec ints n = seq { yield n; yield! (ints (n+1)|> Seq.map id) }
    printf "%A" (ints 0 |> Seq.nth 100000);;
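The same effect can be reproduced without any recursive definition, which suggests that what matters is the depth of the chain of wrapped enumerators. The sketch below is an illustration only: `wrap` is a hypothetical helper, and the depth of 1000 is an arbitrary value chosen to stay well within a typical stack.

```fsharp
// Hypothetical helper: wrap a sequence in `depth` lazy Seq.map layers.
// Nothing is evaluated here; each layer only remembers the one below it.
let wrap (depth: int) (s: seq<int>) : seq<int> =
    let mutable acc = s
    for _ in 1 .. depth do
        acc <- Seq.map id acc
    acc

// Pulling one element makes the outermost enumerator call the next one down,
// and so on: roughly one nested MoveNext call per layer.
let shallow = wrap 1000 (Seq.singleton 42) |> Seq.head
```

With a much larger depth the nested calls would be expected to exhaust the stack in the same way; the exact threshold depends on the runtime and stack size, so none is claimed here.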
To prove the F# library was the issue, I wrote my own Seq module that implemented append, pairwise, scan and collect using continuations, and now I can begin generating and printing out the 50,000th sequence without a problem (it'll never finish, since it's over 10^5697 digits long).
You should definitely check out http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/manual/FSharp.PowerPack/Microsoft.FSharp.Collections.LazyList.html but I will try to post a more comprehensive answer later.
UPDATE
Ok, a solution is below. It represents the Morris sequence as a LazyList of LazyLists of int, since I presume you want it to be lazy in 'both directions'.
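The F# LazyList (in the FSharp.PowerPack.dll) has three useful properties: it is lazy (the nth element is not evaluated until it is first demanded), it does not recompute (once an element has been computed it is cached), and you can 'forget' prefixes (as you 'tail' into the list, the no-longer-referenced prefix becomes available for garbage collection).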
The first property is common with seq (IEnumerable), but the other two are unique to LazyList and very useful for computational problems such as the one posed in this question.
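To make the laziness and the caching of tails concrete, here is a minimal sketch of a lazy list built on `Lazy<'T>`. It is not the PowerPack implementation: `LazyStream`, `intsFrom`, `nth` and the `evaluations` counter are hypothetical names introduced only for this illustration.

```fsharp
// Hypothetical minimal lazy list: a head plus a cached, deferred tail.
type LazyStream<'T> =
    | Nil
    | Cons of 'T * Lazy<LazyStream<'T>>

// Counts how many tails have actually been computed.
let evaluations = ref 0

// Infinite stream n, n+1, n+2, ...; each tail is only built when forced.
let rec intsFrom (n: int) : LazyStream<int> =
    Cons (n, lazy (evaluations.Value <- evaluations.Value + 1; intsFrom (n + 1)))

// Tail-recursive lookup: forces one tail per step, so stack use stays constant.
let rec nth (i: int) (s: LazyStream<'T>) : 'T =
    match s with
    | Nil -> failwith "stream too short"
    | Cons (x, rest) -> if i = 0 then x else nth (i - 1) rest.Value

let naturals = intsFrom 0
let firstPass = nth 5 naturals    // forces five tails
let secondPass = nth 5 naturals   // reuses the cached tails, forces none
```

After both passes the counter is still 5, because `Lazy<'T>` remembers its value once forced. Holding on to `naturals` keeps the whole evaluated prefix alive, so forgetting prefixes only helps when no reference to the head is retained.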
Without further ado, the code:
    // print a lazy list up to some max depth
    let rec PrintList n ll =
        match n with
        | 0 -> printfn ""
        | _ -> match ll with
               | LazyList.Nil -> printfn ""
               | LazyList.Cons(x,xs) ->
                   printf "%d" x
                   PrintList (n-1) xs

    // NextMorris : LazyList<int> -> LazyList<int>
    let rec NextMorris (LazyList.Cons(cur,rest)) =
        let count = ref 1
        let ll = ref rest
        while LazyList.nonempty !ll && (LazyList.hd !ll) = cur do
            ll := LazyList.tl !ll
            incr count
        LazyList.cons !count
            (LazyList.consf cur (fun() ->
                if LazyList.nonempty !ll then
                    NextMorris !ll
                else
                    LazyList.empty()))

    // Morris : LazyList<int> -> LazyList<LazyList<int>>
    let Morris s =
        let rec MakeMorris ll =
            LazyList.consf ll (fun () ->
                let next = NextMorris ll
                MakeMorris next
            )
        MakeMorris s

    // "main"
    // Print the nth iteration, up to a certain depth
    [1] |> LazyList.of_list |> Morris |> Seq.nth 3125 |> PrintList 10
    [1] |> LazyList.of_list |> Morris |> Seq.nth 3126 |> PrintList 10
    [1] |> LazyList.of_list |> Morris |> Seq.nth 100000 |> PrintList 35
    [1] |> LazyList.of_list |> Morris |> Seq.nth 100001 |> PrintList 35
UPDATE2
If you just want to count, that's fine too:
    let LLLength ll =
        let rec Loop ll acc =
            match ll with
            | LazyList.Cons(_,rest) -> Loop rest (acc+1N)
            | _ -> acc
        Loop ll 0N

    let Main() =
        // don't do line below, it leaks
        //let hundredth = [1] |> LazyList.of_list |> Morris |> Seq.nth 100
        // if we only want to count length, make sure we throw away the only
        // copy as we traverse it to count
        [1] |> LazyList.of_list |> Morris |> Seq.nth 100
            |> LLLength |> printfn "%A"
    Main()
The memory usage stays flat (under 16M on my box)... hasn't finished running yet, but I computed the 55th length fast, even on my slow box, so I think this should work just fine. Note also that I used 'bignum's for the length, since I think this will overflow an 'int'.
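The same counting idea can be sketched for an ordinary seq with an arbitrary-precision counter. This is an assumption-laden illustration rather than a replacement for LLLength: `bigLength` is a hypothetical name, and it uses the built-in `bigint` literals (`0I`, `1I`) instead of the PowerPack bignum literals (`0N`, `1N`).

```fsharp
// Hypothetical helper: count the elements of a sequence with a bigint,
// so the count cannot overflow the way a 32-bit int would.
let bigLength (s: seq<'T>) : bigint =
    Seq.fold (fun acc _ -> acc + 1I) 0I s

// Pipe the sequence straight into the counter rather than binding it to a
// name first, so no extra reference to it is kept while it is traversed.
let count = seq { 1 .. 1000 } |> bigLength
```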