Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

If I convert a sequence to an array and treat it as a sequence, do I get O(1) on length?

Tags:

arrays

f#

seq

I was wondering whether I would get special treatment when a sequence is converted to an array and then henceforth treated as a sequence again.

let sq = seq { for i in 0 .. 10 do yield i }
let arr = Seq.toArray sq
let len = Array.length arr      // O(1)
let sq2 = arr |> Seq.ofArray

// from converted seq
let len2 = Seq.length sq2       // O(n)???

// or direct:
let len2 = Seq.length arr       // O(n)???

On the same token, is F# smart enough with Seq.toArray arr to simply create a copy of the array, to leave it alone (not create a copy), or would it iterate over each item using the enumerator?

Put in another way, do sequence in F# remember somehow that their internal structure is an array?

I am asking this since on expensive sequences, you may need the length multiple times, and evaluating it once would be beneficial. I can either create a specific sequence type that remembers the length, or I could use the magic that's already there.

like image 557
Abel Avatar asked Mar 12 '23 01:03

Abel


1 Answers

If the sequence is actually an array type then it will simply be cast back to an array to determine the array within Seq.length. You can see this in in the implementation of the length function here:

[<CompiledName("Length")>]
let length (source : seq<'T>)    = 
    checkNonNull "source" source
    match source with 
    | :? ('T[]) as a -> a.Length
    | :? ('T list) as a -> a.Length
    | :? ICollection<'T> as a -> a.Count
    | _ -> 
        use e = source.GetEnumerator() 
        let mutable state = 0 
        while e.MoveNext() do
            state <-  state + 1;
        state

You can see this behaviour if you put it in FSI:

let arr = [|1..40000000|];;

Using Array.length:

Array.length arr;;
Real: 00:00:00.000, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 40000000

Using Seq.length:

Seq.length arr;;
Real: 00:00:00.000, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 40000000

If you use Seq.ofArray you are specifically hiding the underlying type information, creating a new enumerator that steps through the array element by element.

This can be a useful behaviour because it prevents a consumer of your API from sneakily casting seq<'T> back to 'T[] and consequently allowing said consumer to mutate something that you, the API designer, expected to be exposing an immutable view of.

The downside of this information hiding is that you can't cast back to array so the enumeration becomes significantly slower:

Seq.length <| Seq.ofArray arr;;
Real: 00:00:00.148, CPU: 00:00:00.140, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 40000000

Seq.ofArray uses the mkSeq function which just creates an anonymous IEnumerable from an ArrayEnumerator:

let mkSeq f = 
    { new IEnumerable<'U> with 
          member x.GetEnumerator() = f()
      interface IEnumerable with 
          member x.GetEnumerator() = (f() :> IEnumerator) }
like image 171
TheInnerLight Avatar answered May 05 '23 15:05

TheInnerLight