I'm learning F# and one thing that preoccupies me about this language is performance. I've written a small benchmark where I compare idiomatic F# to imperative-style code written in the same language - and much to my surprise, the functional version comes out significantly faster.
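The benchmark consists of reading a text file with File.ReadAllLines, reversing the characters within each line, and writing the result back to the same file with File.WriteAllLines.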
Here's the code:
open System
open System.IO
open System.Diagnostics

// Reverse the characters of a single string.
let reverseString (str: string) =
    new string(Array.rev (str.ToCharArray()))

// Imperative version: reverse each line in place, then write the array.
let CSharpStyle () =
    let lines = File.ReadAllLines("text.txt")
    for i in 0 .. lines.Length - 1 do
        lines.[i] <- reverseString (lines.[i])
    File.WriteAllLines("text.txt", lines)

// Functional version: pipe the lines through Seq.map into the writer.
let FSharpStyle () =
    File.ReadAllLines("text.txt")
    |> Seq.map reverseString
    |> (fun lines -> File.WriteAllLines("text.txt", lines))

let benchmark func message =
    // initial call for warm-up
    func ()
    let sw = Stopwatch.StartNew()
    for i in 0 .. 19 do
        func ()
    printfn message sw.ElapsedMilliseconds

[<EntryPoint>]
let main args =
    benchmark CSharpStyle "C# time: %d ms"
    benchmark FSharpStyle "F# time: %d ms"
    0
Whatever the size of the file, the "F#-style" version completes in around 75% of the time of the "C#-style" version. My question is, why is that? I see no obvious inefficiency in the imperative version.
Seq.map is different from Array.map. Because sequences (IEnumerable<T>) are not evaluated until they are enumerated, in the F#-style code no computation actually happens until File.WriteAllLines loops through the sequence (not array) generated by Seq.map.
In other words, your C#-style version is reversing all the strings and storing the reversed strings in an array, and then looping through the array to write out to the file. The F#-style version is reversing all the strings and writing them more-or-less directly to the file. That means the C#-style code is looping through the entire file three times (read to array, build reversed array, write array to file), while the F#-style code is looping through the entire file only twice (read to array, write reversed lines to file).
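The deferred evaluation described above can be observed directly. The following is a minimal sketch, not part of the original benchmark: the call counter and the two-element input array are made up for illustration, and reverseString is the question's function with a counter added.

```fsharp
// Hypothetical counter: records how many times reverseString actually runs.
let calls = ref 0

let reverseString (str: string) =
    calls.Value <- calls.Value + 1
    new string(Array.rev (str.ToCharArray()))

let input = [| "abc"; "def" |]

// Seq.map only builds a lazy pipeline; no string has been reversed yet.
let lazyLines = input |> Seq.map reverseString
let callsBeforeEnumeration = calls.Value

// Enumerating the sequence (here via Seq.toArray) is what runs the function.
let result = Seq.toArray lazyLines
let callsAfterEnumeration = calls.Value

// Array.map, by contrast, runs immediately and allocates a full result array.
let eager = input |> Array.map reverseString
let callsAfterArrayMap = calls.Value
```

In the benchmark, File.WriteAllLines plays the role that Seq.toArray plays here: it is the consumer that pulls each reversed line out of the sequence as it writes.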
You'd get the best performance of all if you used File.ReadLines instead of File.ReadAllLines combined with Seq.map - but your output file would have to be different from your input file, as you'd be writing to output while still reading from input.
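A fully streaming variant might look like the sketch below. The function name reverseFileStreaming and its two path parameters are hypothetical and do not appear in the original code. It assumes the output path differs from the input path, since the input file is still open while the output is being written.

```fsharp
open System
open System.IO

let reverseString (str: string) =
    new string(Array.rev (str.ToCharArray()))

// Hypothetical helper: streams lines from inputPath, reverses each one,
// and writes it to outputPath without building an intermediate array.
// inputPath and outputPath must refer to different files.
let reverseFileStreaming (inputPath: string) (outputPath: string) =
    File.ReadLines(inputPath)
    |> Seq.map reverseString
    |> (fun lines -> File.WriteAllLines(outputPath, lines))
```

File.ReadLines returns a lazy IEnumerable<string>, so each line is read, reversed, and written in a single pass. Whether this is measurably faster than the ReadAllLines version would have to be confirmed by benchmarking it.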
The Seq.map form has several advantages over a regular loop. It can precompute the function reference just once, and it can avoid the variable assignments. Note that presizing a result array from the input length is something Array.map does; Seq.map produces a lazy sequence and allocates no result array at all.