Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Seq.map faster than a regular for loop?

Tags:

performance

f#

I'm learning F# and one thing that preoccupies me about this language is performance. I've written a small benchmark where I compare idiomatic F# to imperative-style code written in the same language - and much to my surprise, the functional version comes out significantly faster.

The benchmark consists of:

  1. Reading in a text file using File.ReadAllLines
  2. Reversing the order of characters within each line
  3. Writing back the result to the same file using File.WriteAllLines.

Here's the code:

open System
open System.IO
open System.Diagnostics

let reverseString(str:string) =
    new string(Array.rev(str.ToCharArray()))

let CSharpStyle() = 
    let lines = File.ReadAllLines("text.txt")
    for i in 0 .. lines.Length - 1 do
        lines.[i] <- reverseString(lines.[i])

    File.WriteAllLines("text.txt", lines)

let FSharpStyle() = 
    File.ReadAllLines("text.txt")
    |> Seq.map reverseString
    |> (fun lines -> File.WriteAllLines("text.txt", lines))

let benchmark func message = 
    // initial call for warm-up
    func()

    let sw = Stopwatch.StartNew()
    for i in 0 .. 19 do
        func()

    printfn message sw.ElapsedMilliseconds


[<EntryPoint>]
let main args = 
    benchmark CSharpStyle "C# time: %d ms"
    benchmark FSharpStyle "F# time: %d ms"
    0

Whatever the size of the file, the "F#-style" version completes in around 75% of the time of the "C#-style" version. My question is, why is that? I see no obvious inefficiency in the imperative version.

like image 403
Asik Avatar asked May 06 '12 05:05

Asik


2 Answers

Seq.map is different from Array.map. Because sequences (IEnumerable<T>) are not evaluated until they are enumerated, in the F#-style code no computation actually happens until File.WriteAllLines loops through the sequence (not array) generated by Seq.map.

In other words, your C#-style version is reversing all the strings and storing the reversed strings in an array, and then looping through the array to write out to the file. The F#-style version is reversing all the strings and writing them more-or-less directly to the file. That means the C#-style code is looping through the entire file three times (read to array, build reversed array, write array to file), while the F#-style code is looping through the entire file only twice (read to array, write reversed lines to file).

You'd get the best performance of all if you used File.ReadLines instead of File.ReadAllLines combined with Seq.map - but your output file would have to be different from your input file, as you'd be writing to output while still reading from input.

like image 53
Joel Mueller Avatar answered Oct 02 '22 14:10

Joel Mueller


The Seq.map form has several advantages over a regular loop. It can precompute the function reference just once; it can avoid the variable assignments; and it can use the input sequence length to presize the result array.

like image 21
Raymond Hettinger Avatar answered Oct 02 '22 14:10

Raymond Hettinger