I have a question about how closures are implemented.
Say this is in a file named test.lua
:
local a = 'asdf'
local function b()
return a
end
a = 10
return b
And another file does
a = require 'test'
a()
it will print
10
If a
is a pointer on the stack to 'asdf'
(on the heap I assume, but it doesn't matter), and the closure b
is created so presumably the address that was in a
is saved for b
to use, how does a = 10
change the pointer inside the closure as well?
Wikipedia says quite well what is perplexing me:
A language implementation cannot easily support full closures if its run-time memory model allocates all local variables on a linear stack1. In such languages, a function's local variables are deallocated when the function returns.
I was thinking that perhaps b
really didn't save a pointer to 'asdf'
but a stack offset to a
, so that you can change a
and the stack offset will get you to a
which points to the last thing you set a
to, but then how does this work when a
(the pointer) is popped off the stack and the stack offset becomes invalid?
1 I know Lua doesn't allocate the values on the stack, but it allocates local pointers on the stack to values in the heap, doesn't it?
Remember that, in Lua code, a closure is a function that uses local variables from an outer function. A C closure is a C approximation to a Lua closure. One interesting fact about closures is that you can create different closures using the same function code, but with different upvalues.
A closure is the combination of a function bundled together (enclosed) with references to its surrounding state (the lexical environment). In other words, a closure gives you access to an outer function's scope from an inner function.
To use a closure, define a function inside another function and expose it. To expose a function, return it or pass it to another function. The inner function will have access to the variables in the outer function scope, even after the outer function has returned.
Fortunately, thanks to the Lua dev team, we have a solution. We use a level of indirection that they call an upvalue. An upvalue refers to a local variable in an enclosing function. Every closure maintains an array of upvalues, one for each surrounding local variable that the closure uses.
I really wish you had named these variables a bit more reasonably. So I will:
local inner = 'asdf'
local function b()
return inner
end
inner = 10
return b
and
func = require 'test'
func()
OK, now that we know what we're talking about, I can proceed.
The Lua chunk test
has a local variable called inner
. Within that chunk you create a new function b
. Since this is a new function, it has a scope within the scope of the chunk test
.
Since it is within a function, it has the right to access local variables declared outside of that function. But because it is inside of a function, it does not access those variables like it would one of its own locals. The compiler detects that inner
is a local variable declared outside of the function's scope, so it converts it into what Lua calls an "upvalue".
Functions in Lua can have an arbitrary number of values (up to 255) associated with them, called "upvalues". Functions created in C/C++ can store some number of upvalues by using lua_pushcclosure
. Functions created by the Lua compiler use upvalues to provide lexical scoping.
A scope is everything that happens within a fixed block of Lua code. So:
if(...) then
--yes
else
--no
end
The yes
block has a scope, and the no
block has a different scope. Any local
variables declared in the yes
block cannot be accessed from the no
block, because they are outside of the scope of the no
block.
The Lua constructs that define a scope are if/then/else/end
, while/do/end
, repeat/until
, do/end
, for/end
, and function/end
. Also, each script, called a Lua "chunk", has a scope.
Scopes are nested. From within one scope, you can access local variables declared in a higher scope.
A "stack" represents all variables declared as local
within a particular scope. So if you have no local variables in a certain scope, the stack for that scope is empty.
In C and C++, the "stack" that you are familiar with is just a pointer. When you call a function, the compiler has predetermined how many bytes of space that the function's stack needs. It advances the pointer by that amount. All stack variables used in the function are just byte offsets from the stack pointer. When the function exits, the stack pointer is decreased by the stack amount.
In Lua, things are different. The stack for a particular scope is an object, not merely a pointer. For any particular scope, there are some number of local
variables defined for it. When the Lua interpreter enters a scope, it "allocates" a stack of the size necessary to access those local variables. All references to local variables are just offsets into that stack. Access to local variables from higher scopes (previously defined) simply access a different stack object.
So in Lua, you conceptually have a stack of stacks (which I will refer to as the "s-stack" for clarity). Each scope creates a new stack and pushes it, and when you leave a scope, it pops the stack off of the s-stack.
When the Lua compiler encounters a reference to a local
variable, it converts that reference into an index into the s-stack, and an offset into that particular stack. So if it accesses a variable in the current local stack, the index into the s-stack refers to the top of the s-stack, and the offset is the offset into that stack where the variable is.
That's fine for most Lua constructs that access scopes. But function/end
don't just create a new scope; they create a new function. And this function is allowed to access stacks that aren't just the local stack of that function.
Stacks are objects. And in Lua, objects are subject to garbage collection. When the interpreter enters a scope, it allocates a stack object and pushes it. So long as the stack object is pushed onto the s-stack, it cannot be destroyed. The stack of stacks refers to the object. However, once the interpreter exits the scope, it pops the stack off of the s-stack. So since it is no longer referenced, it is subject to being collected.
However, a function that accesses variables outside of its own local scope can still be referencing that stack. When the Lua compiler sees a reference to a local
variable that is not within the function's local scope, it alters the function. It figures out which stack the local it is referencing belongs to, and then stores that stack as an upvalue in the function. It converts the reference to that variable to an offset into that particular upvalue, rather than an offset into a stack that is currently on the s-stack.
So as long as the function object continues to exist, so too will the stack(s) that it references.
Remember that stacks are dynamically created and destroyed as the Lua interpreter enters and exits the scope of functions. So if you were to run test
twice, by calling loadfile
and executing the returned function twice, you would get two separate functions that refer to two separate stacks. Neither function will see the value from the other.
Note that this may not be exactly how it's implemented, but that's the general idea behind it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With