Scala allows closures like
def newCounter = {
  var a = 0
  () => { a += 1; a }
}
which defines a function that on every call returns a new, independent counter function starting at 1:
scala> val counter1 = newCounter
counter1: () => Int = <function0>
scala> counter1()
res0: Int = 1
scala> counter1()
res1: Int = 2
scala> val counter2 = newCounter
counter2: () => Int = <function0>
scala> counter2()
res2: Int = 1
scala> counter1()
res3: Int = 3
This is quite impressive, as the variable a would usually correspond to a memory address on the stack frame of newCounter. I've just read the closure chapter of "Programming in Scala" and it only has the following to say on that matter (p. 155):
The Scala compiler rearranges things in cases like this so that the captured parameter lives out on the heap, instead of the stack, and thus can outlive the method call that created it. This rearrangement is all taken care of automatically, so you don't have to worry about it.
Can anyone elaborate on how this works at the bytecode level? Is the access similar to that of a member variable of a class, with all the associated synchronization and performance implications?
With this in mind, the short answer is that a captured var involves both the stack and the heap: the local slot on the stack holds only a reference, while the value itself lives in an object on the heap.
A closure is a function whose return value depends on the value of one or more variables declared outside that function. Consider an anonymous function named multiplier that uses two variables, i and factor. One of them, i, is a formal parameter of the function and is therefore bound; the other, factor, is a free variable that has to be resolved from the enclosing scope.
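The multiplier function mentioned above is not shown in the text. As a minimal sketch, assuming its body is simply i * factor and that factor is a local var (the object name, the run helper and the concrete values are made up for illustration), the difference between the parameter and the captured variable could look like this:

```scala
object MultiplierDemo {
  // Returns the results of calling the closure before and after
  // the captured variable is reassigned.
  def run(): (Int, Int) = {
    var factor = 3                           // declared outside the function literal
    val multiplier = (i: Int) => i * factor  // i is a parameter, factor is captured
    val before = multiplier(2)               // uses factor == 3
    factor = 10                              // the closure sees this update
    val after = multiplier(2)                // uses factor == 10
    (before, after)
  }

  def main(args: Array[String]): Unit =
    println(run()) // prints (6,20)
}
```

The closure captures the variable itself rather than a snapshot of its value, which is why the second call reflects the reassignment.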
Summing up, in the Spark sense a closure consists of those variables and methods that must be visible to the executor for it to perform its computations on the RDD. This closure is serialized and sent to each executor. Understanding closures is important for avoiding unexpected behaviour in your code.
A free variable of an expression is a variable that's used inside the expression but not defined inside the expression. For instance, in the function literal expression (x: Int) => (x, y) , both variables x and y are used, but only y is a free variable, because it is not defined inside the expression.
You could use scalac -Xprint:lambdalift <scala-file-name> to investigate this.
Your code is actually something like this:
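def newCounter = {
  // the captured var is boxed in a heap-allocated IntRef
  val a: runtime.IntRef = new runtime.IntRef(0)
  new Function0[Int] {
    // the function object keeps a reference to the box in a field
    private[this] val a$1 = a
    def apply(): Int = {
      a$1.elem = a$1.elem + 1
      a$1.elem
    }
  }
}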
Any var captured by a lambda is wrapped in a reference object (IntRef in this case). Other vars (those not used in closures) remain ordinary local variables.
The reference to this wrapper is stored as a field in the function instance.
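To make the wrapper idea concrete, here is a hand-written sketch of the same mechanism. Cell is a hypothetical stand-in for scala.runtime.IntRef, and newCounterPair and run are illustrative names, not anything the compiler generates. It shows that two function objects holding the same wrapper observe each other's updates:

```scala
// Hypothetical stand-in for the compiler's scala.runtime.IntRef:
// a heap object with a single mutable field.
final class Cell(var elem: Int)

object SharedCellDemo {
  // Returns an incrementer and a reader that share one Cell.
  def newCounterPair(): (() => Int, () => Int) = {
    val a = new Cell(0)                                 // lives on the heap
    val inc: () => Int = () => { a.elem += 1; a.elem }  // holds a reference to a
    val get: () => Int = () => a.elem                   // holds the same reference
    (inc, get)
  }

  def run(): Int = {
    val (inc, get) = newCounterPair()
    inc()
    inc()
    get() // both increments are visible through the shared Cell
  }

  def main(args: Array[String]): Unit =
    println(run()) // prints 2
}
```

Because the Cell outlives the call to newCounterPair, both functions keep working after the method's stack frame is gone. Access is a plain field read or write on an ordinary object, with no synchronization involved in this sketch.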
lambdalift in -Xprint:lambdalift is the name of the compiler phase. You can list all phases with -Xshow-phases. You can also use the phase number instead of the name, which is useful when you are not sure which phase you need.