F# compiler keeps dead objects alive

I'm implementing some algorithms that work on large data (~250 MB - 1 GB). To benchmark them, I needed a loop. In the process I learned that F# is doing some nasty things, which I hope some of you can clarify.

Here is my code (description of the problem is below):

open System

for i = 1 to 10 do
    Array2D.zeroCreate 10000 10000 |> ignore    
    printfn "%d" (GC.GetTotalMemory(true)) 

Array2D.zeroCreate 10000 10000 |> ignore
// should force a garbage collection, and GC.Collect() doesn't help either
printfn "%d" (GC.GetTotalMemory(true))
Array2D.zeroCreate 10000 10000 |> ignore    
printfn "%d" (GC.GetTotalMemory(true))
Array2D.zeroCreate 10000 10000 |> ignore    
printfn "%d" (GC.GetTotalMemory(true))
Array2D.zeroCreate 10000 10000 |> ignore    
printfn "%d" (GC.GetTotalMemory(true))

Console.ReadLine() |> ignore

The output looks like this:

54000
54000
54000
54000
54000
54000
54000
54000
54000
54000
400000000
800000000
1200000000

Out of memory exception

So, in the loop F# discards the result, but when I'm not in the loop F# will keep references to "dead data" (I've looked in the IL, and apparently the class Program gets fields for this data). Why? And can I fix that?

This code was run outside Visual Studio, in release mode.

Lasse Espeholt asked Jun 12 '11 16:06



1 Answer

The reason for this behavior is that the F# compiler behaves differently in global scope than in local scope. A variable declared at global scope is turned into a static field. A module declaration is a static class with let declarations compiled as fields/properties/methods. Because a static field stays reachable for the lifetime of the program, the garbage collector cannot reclaim anything stored in it.
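As a rough sketch of that difference (the names globalBuffer and localLength are made up for illustration, and the arrays are kept small), compare a binding at the top level with one inside a function:

```fsharp
// Top-level binding: compiled to a static member of the enclosing class,
// so the array stays reachable for as long as the program runs.
let globalBuffer : int[,] = Array2D.zeroCreate 1000 1000

// Binding inside a function: an ordinary local variable, so the array
// becomes eligible for collection once the function no longer uses it.
let localLength () =
    let localBuffer : int[,] = Array2D.zeroCreate 1000 1000
    localBuffer.Length
```

Only the first array is guaranteed to stay alive; the second is garbage as soon as localLength returns.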

The simplest way to fix the problem is to write your code in a function:

let main () =    
  Array2D.zeroCreate 10000 10000 |> ignore    
  printfn "%d" (GC.GetTotalMemory(true))
  Array2D.zeroCreate 10000 10000 |> ignore    
  printfn "%d" (GC.GetTotalMemory(true))
  // (...)
  Console.ReadLine() |> ignore

main ()

But why does the compiler declare fields when you don't use the value and just ignore it? This is quite interesting: the ignore function is a very simple function that is inlined when you use it. Its declaration is let inline ignore _ = (). When inlining the function, the compiler declares some variables to store the function's arguments, and at global scope those variables end up as fields.

So, another way to fix this is to omit ignore and write:

Array2D.zeroCreate 10000 10000 
printfn "%d" (GC.GetTotalMemory(true))
Array2D.zeroCreate 10000 10000 
printfn "%d" (GC.GetTotalMemory(true))
// (...)

You'll get some compiler warnings, because the result of the expression is not unit, but it will work. However, wrapping the code in a function so that it runs in local scope is probably more reliable.
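For the benchmarking loop from the question, one possible shape is sketched below. The helper names allocate and benchmark are hypothetical, and the array size is reduced from the question's 10000 x 10000 so the sketch runs quickly. The point is only that every allocation happens inside a function, so no reference ends up in a static field:

```fsharp
open System

// Hypothetical helper: allocate a rows-by-cols array in local scope and
// return only its element count, so no reference to the array escapes.
let allocate (rows: int) (cols: int) : int =
    let data : byte[,] = Array2D.zeroCreate rows cols
    data.Length

// Run the allocation several times and print the managed heap size
// reported after a forced collection on each iteration.
let benchmark (iterations: int) =
    for i = 1 to iterations do
        let count = allocate 1000 1000
        printfn "run %d: %d elements, %d bytes in use" i count (GC.GetTotalMemory(true))

benchmark 3
```

If the arrays are being released, the reported byte count should stay roughly flat from one run to the next rather than growing by the size of an array each time.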

Tomas Petricek answered Sep 29 '22 12:09
