
F#: Flattening a sequence of sequences into a single Seq

I am trying to build a single sequence that contains the contents of multiple files so that it can be sorted and then passed to a graphing component. However, I am stuck trying to fold the contents of each file together. The pseudocode below won't compile, but it should show what I am trying to achieve.

Any help is greatly appreciated.

open System.IO 

// Lazily yields the lines of one file; the reader is disposed when enumeration ends.
let FileEnumerator filename =
    seq {
        use sr = System.IO.File.OpenText(filename)
        while not sr.EndOfStream do
            let line = sr.ReadLine()
            yield line
    }

let files = Directory.EnumerateFiles(@"D:\test_Data\","*.csv",SearchOption.AllDirectories)

let res =
   files 
        |> Seq.fold(fun x item -> 
        let lines =  FileEnumerator(item)
        let sq = Seq.concat x ; lines
        sq
    ) seq<string>

printfn "%A" res
Glyn Darkin asked Mar 19 '12 17:03



2 Answers

You are essentially trying to reimplement File.ReadLines, which lazily returns the contents of a file as a seq<string>. Mapping it over the file names gives a sequence of sequences, which can then be flattened with Seq.concat:

let res = Directory.EnumerateFiles(@"D:\test_Data","*.csv",SearchOption.AllDirectories)
          |> Seq.map File.ReadLines 
          |> Seq.concat
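
As a side note, mapping and then concatenating is the pattern that Seq.collect performs in one step, and the flattened result can be passed to Seq.sort as the question intends. The sketch below uses hypothetical in-memory data (`groups` and `split` are illustrative stand-ins, not part of the original code) so that it runs without any files on disk:

```fsharp
// Hypothetical stand-in for file contents: each string holds newline-separated lines.
let groups = [ "b,2\na,1"; "c,3" ]

// Hypothetical stand-in for File.ReadLines: turns one "file" into a sequence of lines.
let split (s: string) : seq<string> = s.Split('\n') |> Seq.ofArray

// Map, then flatten, as in the answer above.
let viaMapConcat = groups |> Seq.map split |> Seq.concat

// Seq.collect maps and flattens in a single step.
let viaCollect = groups |> Seq.collect split

// Sort the flattened lines before handing them on.
let sorted = viaCollect |> Seq.sort |> Seq.toList

printfn "%A" (Seq.toList viaMapConcat)
printfn "%A" sorted
```

With real files, the same shape would presumably be `files |> Seq.collect (fun path -> File.ReadLines path)`.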
Taylor Southwick answered Sep 24 '22 16:09



To fix the problem in your original approach, you need to use Seq.append instead of Seq.concat. The initial value for fold should be an empty sequence, which can be written as Seq.empty:

let res = 
   files |> Seq.fold(fun x item ->  
        let lines =  FileEnumerator(item) 
        let sq = Seq.append x lines 
        sq ) Seq.empty

If you wanted to use Seq.concat, you'd have to write Seq.concat [x; lines], because concat expects a sequence of sequences to be concatenated. On the other hand, append simply takes two sequences, so it is easier to use here.
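
The difference between the two functions can be seen in a minimal sketch. The sequences `first` and `second` are hypothetical in-memory stand-ins for the lines of two files:

```fsharp
// Hypothetical stand-ins for the lines of two files.
let first = seq { yield "a,1"; yield "b,2" }
let second = seq { yield "c,3" }

// Seq.append takes exactly two sequences.
let viaAppend = Seq.append first second

// Seq.concat takes ONE argument: a collection (here a list) of sequences.
let viaConcat = Seq.concat [first; second]

printfn "%A" (Seq.toList viaAppend)
printfn "%A" (Seq.toList viaConcat)
```

Both calls produce the same three lines in the same order; only the shape of the arguments differs.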

Another (simpler) way to concatenate all the lines is to use yield! in sequence expressions:

let res = 
  seq { for item in files do
          yield! FileEnumerator(item) }

This creates a sequence by iterating over all the files and adding all lines from each file (in order) to the resulting sequence. The yield! construct adds every element of the given sequence to the result, whereas a plain yield would add the sequence itself as a single element.
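
To illustrate the flattening behaviour of yield!, here is a small sketch that replaces the file reads with a hypothetical helper, `linesOf`, which makes up two lines per name. It is only a stand-in for FileEnumerator so the example runs without files:

```fsharp
// Hypothetical stand-in for FileEnumerator: produces two made-up lines per name.
let linesOf (name: string) : seq<string> =
    seq { yield name + ":1"; yield name + ":2" }

let names = ["a"; "b"]

// yield! splices each inner sequence into the result, giving a flat seq<string>.
let flat =
    seq { for name in names do
            yield! linesOf name }

// A plain yield keeps each inner sequence as one element, giving seq<seq<string>>.
let nested =
    seq { for name in names do
            yield linesOf name }

printfn "%A" (Seq.toList flat)
printfn "%d" (Seq.length nested)
```

Here `flat` contains four strings, while `nested` contains two inner sequences.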

Tomas Petricek answered Sep 20 '22 16:09
