as ECMAScriptv5, each time when control enters a code, the enginge creates a LexicalEnvironment(LE) and a VariableEnvironment(VE), for function code, these 2 objects are exactly the same reference which is the result of calling NewDeclarativeEnvironment(ECMAScript v5 10.4.3), and all variables declared in function code are stored in the environment record componentof VariableEnvironment(ECMAScript v5 10.5), and this is the basic concept for closure.
What confused me is how Garbage Collect works with this closure approach, suppose I have code like:
function f1() {
var o = LargeObject.fromSize('10MB');
return function() {
// here never uses o
return 'Hello world';
}
}
var f2 = f1();
after the line var f2 = f1()
, our object graph would be:
global -> f2 -> f2's VariableEnvironment -> f1's VariableEnvironment -> o
so as from my little knowledge, if the javascript engine uses a reference counting method for garbage collection, the object o
has at lease 1 refenrence and would never be GCed. Appearently this would result a waste of memory since o
would never be used but is always stored in memory.
Someone may said the engine knows that f2's VariableEnvironment doesn't use f1's VariableEnvironment, so the entire f1's VariableEnvironment would be GCed, so there is another code snippet which may lead to more complex situation:
function f1() {
var o1 = LargeObject.fromSize('10MB');
var o2 = LargeObject.fromSize('10MB');
return function() {
alert(o1);
}
}
var f2 = f1();
in this case, f2
uses the o1
object which stores in f1's VariableEnvironment, so f2's VariableEnvironment must keep a reference to f1's VariableEnvironment, which result that o2
cannot be GCed as well, which further result in a waste of memory.
so I would ask, how modern javascript engine (JScript.dll / V8 / SpiderMonkey ...) handles such situation, is there a standard specified rule or is it implementation based, and what is the exact step javascript engine handles such object graph when executing Garbage Collection.
Thanks.
tl;dr answer: "Only variables referenced from inner fns are heap allocated in V8. If you use eval then all vars assumed referenced.". In your second example, o2
can be allocated on the stack and is thrown away after f1
exits.
I don't think they can handle it. At least we know that some engines cannot, as this is known to be the cause of many memory leaks, as for example:
function outer(node) {
node.onclick = function inner() {
// some code not referencing "node"
};
}
where inner
closes over node
, forming a circular reference inner -> outer's VariableContext -> node -> inner
, which will never be freed in for instance IE6, even if the DOM node is removed from the document. Some browsers handle this just fine though: circular references themselves are not a problem, it's the GC implementation in IE6 that is the problem. But now I digress from the subject.
A common way to break the circular reference is to null out all unnecessary variables at the end of outer
. I.e., set node = null
. The question is then whether modern javascript engines can do this for you, can they somehow infer that a variable is not used within inner
?
I think the answer is no, but I can be proven wrong. The reason is that the following code executes just fine:
function get_inner_function() {
var x = "very big object";
var y = "another big object";
return function inner(varName) {
alert(eval(varName));
};
}
func = get_inner_function();
func("x");
func("y");
See for yourself using this jsfiddle example. There are no references to either x
or y
inside inner
, but they are still accessible using eval
. (Amazingly, if you alias eval
to something else, say myeval
, and call myeval
, you DO NOT get a new execution context - this is even in the specification, see sections 10.4.2 and 15.1.2.1.1 in ECMA-262.)
Edit: As per your comment, it appears that some modern engines actually do some smart tricks, so I tried to dig a little more. I came across this forum thread discussing the issue, and in particular, a link to a tweet about how variables are allocated in V8. It also specifically touches on the eval
problem. It seems that it has to parse the code in all inner functions. and see what variables are referenced, or if eval
is used, and then determine whether each variable should be allocated on the heap or on the stack. Pretty neat. Here is another blog that contains a lot of details on the ECMAScript implementation.
This has the implication that even if an inner function never "escapes" the call, it can still force variables to be allocated on the heap. E.g.:
function init(node) {
var someLargeVariable = "...";
function drawSomeWidget(x, y) {
library.draw(x, y, someLargeVariable);
}
drawSomeWidget(1, 1);
drawSomeWidget(101, 1);
return function () {
alert("hi!");
};
}
Now, as init
has finished its call, someLargeVariable
is no longer referenced and should be eligible for deletion, but I suspect that it is not, unless the inner function drawSomeWidget
has been optimized away (inlined?). If so, this could probably occur pretty frequently when using self-executing functions to mimick classes with private / public methods.
Answer to Raynos comment below. I tried the above scenario (slightly modified) in the debugger, and the results are as I predict, at least in Chrome:
When the inner function is being executed, someLargeVariable is still in scope.
If I comment out the reference to someLargeVariable
in the inner drawSomeWidget
method, then you get a different result:
Now someLargeVariable
is not in scope, because it could be allocated on the stack.
There is no standard specifications of implementation for GC, every engine have their own implementation. I know a little concept of v8, it has a very impressive garbage collector (stop-the-world, generational, accurate). As above example 2, the v8 engine has following step:
On parse function literal, it create FunctionBody. Only parse FunctionBody when function was called.The following code indicate it not throw error while parser time
function p(){
return function(){alert(a)}
}
p();
So at GC time H1, H2 will be swept, because no reference point that.In my mind if the code is lazily compiled, no way to indicate o1 variable declared in a1 is a reference to f1, It use JIT.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With