Getting functional sieve of Eratosthenes fast

I read this other post about an F# version of this algorithm. I found it very elegant and tried to combine some ideas from the answers.

Although I optimized it to make fewer checks (it only checks numbers adjacent to multiples of 6) and to leave out unnecessary caching, it is still painfully slow. Calculating the 10,000th prime already takes more than 5 minutes. Using the imperative approach, I can test all 31-bit integers in not much more time.
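For comparison, the imperative approach I mean looks roughly like the sketch below. It is a minimal array-based sieve; the name `primesUpTo` and the small limit are only for illustration, not the exact code I timed.

```fsharp
// Minimal imperative sieve of Eratosthenes (illustrative sketch).
// `primesUpTo` is a hypothetical name, not part of any library.
let primesUpTo (limit: int) =
    // composite.[n] becomes true once n is known to have a smaller prime factor
    let composite = Array.zeroCreate<bool> (limit + 1)
    let primes = ResizeArray<int>()
    for n in 2 .. limit do
        if not composite.[n] then
            primes.Add n
            // Start striking at n * n; use int64 for the check to avoid overflow
            if int64 n * int64 n <= int64 limit then
                for m in n * n .. n .. limit do
                    composite.[m] <- true
    List.ofSeq primes

printfn "%A" (primesUpTo 30)
```

Each number is struck at most once per prime factor, and every access is a direct array lookup, which is why this version is so much faster.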

So my question is whether I am missing something that makes all this so slow. For example, in another post someone speculated that LazyList may use locking. Does anyone have an idea?

As StackOverflow's rules say not to post new questions as answers, I feel I have to start a new topic for this.

Here's the code:

#r "FSharp.PowerPack.dll"

open Microsoft.FSharp.Collections

let squareLimit = System.Int32.MaxValue |> float32 |> sqrt |> int

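// Candidates 5, 7, 11, 13, ... (6k - 1 and 6k + 1); the step alternates between 2 and 4.
let around6 = LazyList.unfold (fun (candidate, (plus, next)) -> 
        if candidate > System.Int32.MaxValue - plus then
            None // stop before candidate + plus would overflow
        else
            Some(candidate, (candidate + plus, (next, plus)))
    ) (5, (2, 4))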

let (|SeqCons|SeqNil|) s =
    if Seq.isEmpty s then SeqNil
    else SeqCons(Seq.head s, Seq.skip 1 s)

let rec lazyDifference l1 l2 =
    if Seq.isEmpty l2 then l1 else
    match l1, l2 with
    | LazyList.Cons(x, xs), SeqCons(y, ys) ->
        if x < y then
            LazyList.consDelayed x (fun () -> lazyDifference xs l2)
        elif x = y then
            lazyDifference xs ys // x is in l2: drop it
        else
            lazyDifference l1 ys // y < x: advance l2 only
    | _ -> LazyList.empty

let lazyPrimes =
    let rec loop = function
        | LazyList.Cons(p, xs) as ll ->
            if p > squareLimit then
                ll // p * p would exceed Int32.MaxValue, so every remaining candidate is prime
            else
                let increment = p <<< 1
                let square = p * p
                let remaining = lazyDifference xs {square..increment..System.Int32.MaxValue}
                LazyList.consDelayed p (fun () -> loop remaining)
        | _ -> LazyList.empty
    loop (LazyList.cons 2 (LazyList.cons 3 around6))

If you are calling Seq.skip anywhere, then there's about a 99% chance that you have an O(N^2) algorithm. For nearly every elegant functional lazy Project Euler solution involving sequences, you want to use LazyList, not Seq. (See Juliet's comment link for more discussion.)
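
To make the Seq.skip cost visible, here is a small sketch that counts how often the underlying sequence is advanced. The helpers `countedSource` and `walkWithSkip` are hypothetical names made up for this illustration. The walk uses the same `Seq.head` / `Seq.skip 1` pattern as a head/tail active pattern over `seq`.

```fsharp
// Hypothetical helper: a sequence that counts how often its body runs.
let countedSource (counter: int ref) =
    seq {
        for i in 0 .. 999 do
            counter.Value <- counter.Value + 1
            yield i
    }

// Walk n elements using Seq.head and Seq.skip 1.
// Each step wraps the sequence in one more Seq.skip layer, and every
// Seq.head call re-enumerates the source from the very beginning.
let rec walkWithSkip n (s: seq<int>) =
    if n > 0 then
        Seq.head s |> ignore
        walkWithSkip (n - 1) (Seq.skip 1 s)

let skipPulls = ref 0
walkWithSkip 100 (countedSource skipPulls)

// Walk the same 100 elements in a single enumeration for comparison.
let singlePulls = ref 0
countedSource singlePulls |> Seq.take 100 |> Seq.iter ignore

printfn "Seq.skip walk: %d pulls; single pass: %d pulls" skipPulls.Value singlePulls.Value
```

The first count grows roughly with the square of the number of elements walked, while the single pass grows linearly. A LazyList caches each cell once it has been forced, so taking its tail does not restart the computation.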

