Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is memory released after each iteration while mapping over a list?

Tags:

list

memory

f#

Sorry if this is a beginner's question, but I need to be sure.

When a function is called it may create temporary objects whose memory allocation should be released upon exit.

My question is: When a function is mapped over a list, is the memory allocated by each function call released immediately or only after the whole list has been processed?

This is an example, what the code does specifically is not meaningful, except that two objects (newList and newRec) are created within each function call.

Will the memory allocated to newList and newRec be released after each "iteration" or will all memory be released only after the call to List.map exits?

This probably should be easily figured out for a loop in an imperative language but I do not know how the F# compiler deals with such cases.

type MyRecord = { AList: int list; Name: string }
let myRecord = { AList = [1..100]; Name = "SomeRecord" }
let foo (arec: MyRecord) i =
    let newList = arec.AList |> List.filter (fun x -> x >= i)
    let newRec = { arec with AList = newList }
    List.sum newRec.AList
let res = [1..100] |> List.map (foo myRecord)
like image 501
Soldalma Avatar asked Jan 03 '23 05:01

Soldalma


2 Answers

Neither. F# has automatic memory management based on garbage collection. What causes a block of memory to be freed is not a syntactic condition, but a runtime condition. A block of memory is freed after it becomes unreachable.

An object is reachable if there is a way to get at it from variables in the current scope. While the function foo is executing, newList and newRec are reachable, so they won't be freed. When the function returns, newList and newRec are no longer directly reachable, but what makes them freeable is that they are no longer indirectly reachable either. Consider the following variation of foo:

let bar (arec: MyRecord) i =
    let newList = arec.AList |> List.filter (fun x -> x >= i)
    let newRec = { arec with AList = newList }
    newRec.AList

When bar returns, the newRec object is no longer reachable, but the newList object still is, since it's returned by the function and therefore can be used by the function's caller.

Automatic memory management means that you don't have to care about the life time of objects. In particular, it's impossible to attempt to access an object that has been freed¹: by construction, if you can access an object, it's reachable and thus not freed.

In the specific case of foo, as soon as a call to foo returns, the newRec and newList objects that it created become unreachable. This doesn't necessarily mean that they'll be freed immediately; they will be freed, at the latest, during the next full garbage collector run. How long unreachable objects remain without being freed is a matter of garbage collector quality; it's a compromise between memory usage and performance (running the GC very often leaves little uncollected garbage but costs CPU time; running the GC very rarely consumes little CPU time but leaves a lot of uncollected garbage).

In any case, the fact that you're calling foo multiple times via List.map is not relevant to memory management. Nothing special happens when List.map returns.

¹ Except by interacting with code written in other languages that use unmanaged memory.

like image 177
Gilles 'SO- stop being evil' Avatar answered Apr 12 '23 23:04

Gilles 'SO- stop being evil'


This is really a question about .NET rather than F#. The F# compiler can do things to increase or decrease allocation/deallocation, but as long as it doesn't produce code that holds on to references for longer than necessary, .NET should be able to release (garbage collect) the unused memory. For a simple function like List.map this should definitely be the case.

The question of when exactly memory will be released is a very complex one on a platform like .NET, which has an advanced garbage collector that has been finely tuned with a lot of engineering resources over many years. It's a question I personally can't even begin to answer. However, I can demonstrate something with this simple experiment.

In our mapping function, let's create a list with a million items in it and return its length:

[1 .. 100] |> List.map (fun _ -> [1 .. 1_000_000].Length)

When we map this function on a list with 100 items, it runs successfully and I get a result on my machine.

Now let's return the actual million-long list in the mapping function and run it on a hundred-long list again:

[1 .. 100] |> List.map (fun _ -> [1 .. 1_000_000])

This results in an exception: Exception of type 'System.OutOfMemoryException' was thrown.

This shows that we can't fit a hundred lists with a length of 1 million in memory. But if the first piece of code was able to run, then it must be possible to fit at least 1 million-long list in memory at a time. Therefore, we can infer that there must be some garbage collection of intermediate lists happening before List.map finishes processing all items. However, it doesn't necessarily happen immediately after each iteration.

like image 32
TheQuickBrownFox Avatar answered Apr 13 '23 00:04

TheQuickBrownFox