Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Will shadowed binding be GCed?

Suppose I have this in an long run ml file:

let l = [1;2;3]

let l = [1;2;3;4]

let _ = ...

Will the first l = [1;2;3] be GCed sometime?


What if the code is like this:

let l = [1;2;3]

let l = [1;2;3;4]

let l = [1;2;3]

let _ = ...

There are three l. 1st is shadowed by 2nd, and then 2nd is shadowed by 3rd.

Are the following situations possible, since GC's schedule is not determined?

  1. when reaching 3rd l, the GC has not collected the 1st [1;2;3], so the same memory was reused or re-referenced

  2. Immediately after 2nd l, GC collected 1st [1;2;3], then the 3rd l create new memory for [1;2;3]

like image 448
Jackson Tale Avatar asked Mar 20 '23 02:03

Jackson Tale


2 Answers

Not in the OCaml toplevel, defining a new value l does not free the previous l, which (as far as I remember the implementation) lives forever. It doesn't matter because it is a constant and only takes space proportional to the source code that engendered it, like binary code does.

$ rlwrap ocaml
        OCaml version 4.00.1

# let l = [ 1 ] ;;
val l : int list = [1]
# let w = Weak.create 1 ;;
val w : '_a Weak.t = <abstr>
# Weak.set w 0 (Some l) ;;
- : unit = ()
# Gc.full_major () ;;
- : unit = ()
# Weak.check w 0 ;;
- : bool = true
# 

This true means that l still lives in memory.

# let l = [ 2 ] ;;
val l : int list = [2]
# Weak.check w 0 ;;
- : bool = true
# Gc.full_major () ;;
- : unit = ()
# Weak.check w 0 ;;
- : bool = true
# 

And it still does not, although it is not “reachable” for a fine definition of reachable (not the definition the GC uses).

Neither compiler frees the original l either:

$ cat t.ml
let l = [ 1 ] ;;
let w = Weak.create 1 ;;
Weak.set w 0 (Some l) ;;
Gc.full_major () ;;
Printf.printf "%B\n" (Weak.check w 0) ;;
let l = [ 2 ] ;;
Printf.printf "%B\n" (Weak.check w 0) ;;
Gc.full_major () ;;
Printf.printf "%B\n" (Weak.check w 0) ;;
$ ocamlc t.ml
$ ./a.out 
true
true
true
$ ocamlopt t.ml
$ ./a.out 
true
true
true

Another example of the GC's definition of “reachability” being more approximative than the definition one might like is:

let g () = Gc.full_major ()

let f () = let l = [ 1 ] in (* do something with l; *) g(); 1

At the point when g is executed (called from f), the value l is no longer reachable (for a fine definition of reachable) and could be garbage-collected. It won't be because it is still referenced from the stack. The GC, with its coarse notion of reachable, will only be able to free it after f has terminated.

like image 110
Pascal Cuoq Avatar answered Mar 27 '23 09:03

Pascal Cuoq


that depends on whether you still have references for it elsewhere - if l is the only reference, then yes, that reference is released on the second assignment, and will get GCed when appropriate.


lists are immutable in ocaml; list literals return new list instances.

from a programming stand-point, you can treat your "third" list as an entirely new one (even though it contains the same elements as the first).

from an under-the-cover implementation standpoint, it may be that the system is smart enough to know to reuse memory from the first list on the third assignment; i can't immediately think of an efficient way that might work, though, with the code as it's currently written.

so, at the third assignment:

  • the first list might still be around
  • it's still going to be, from a programming point of view, presented to you as an entirely new list
  • it's possible, but very unlikely, that the system, under the covers, knows/is able to optimize memory use well enough to reuse the memory from the first assignment. most likely, unless referenced elsewhere, that first list is garbage collected/will be garbage collected.

NOTE

Based on @PascalCuoq's answer below, and some experimentation:

The behavior is quite different if you're defining the variables as top-level variables. The OCaml garbage collector treats top level declarations as (permanent?) garbage collection roots - so they won't be GCed even if there are no longer reachable from running code.

So, if the above example is executed at the top level, there will be three different "l"'s in memory, none of which will be garbage collected.

like image 43
blueberryfields Avatar answered Mar 27 '23 11:03

blueberryfields