Why does partially applied function defer class instantiation in Scala?

Tags:

scala

Imagine this code:

class Foo {
  println("in Foo")

  def foo(a: Int) = a + 1
}

Now, if we invoke:

new Foo().foo _

instance of class Foo will get created, as expected:

in Foo
res0: (Int) => Int = <function1>

However, if we invoke this:

new Foo().foo(_)

Foo's constructor will not get called:

res1: (Int) => Int = <function1>

If we then say:

res1(7)

that is when Foo gets instantiated:

in Foo
res2: Int = 8

Why does Eta expansion versus partial function application make a difference in class instantiation?

Nermin Serifovic asked May 25 '12 22:05


3 Answers

Boy, that's a subtle one, but as far as I can tell it's following the Scala spec completely. I'll quote from version 2.9 of the spec.

For your first example: as you rightly say, you are seeing eta expansion through a special case of a Method Value (§6.7):

The expression e _ is well-formed if e is of method type or if e is a call-by-name parameter. If e is a method with parameters, e _ represents e converted to a function type by eta expansion.

The algorithm for eta expansion is given in §6.26.5, which you can follow to give the following replacement for the expression new Foo().foo _:

{
  val x1 = new Foo();
  (y1: Int) => x1.foo(y1);
}

This implies that when eta expansion is being used, all sub-expressions are evaluated at the point where the conversion takes place (if I've understood the meaning of the phrase "maximal sub-expression" correctly) and the final expression is the creation of an anonymous function.
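A minimal sketch of that expansion, written out by hand so the timing can be observed. The `instances` counter and the `EtaExpansionDemo` object are hypothetical names added only for illustration; `println("in Foo")` from the question is replaced by the counter.

```scala
object EtaExpansionDemo {
  // Hypothetical counter, added only to observe constructor calls.
  var instances = 0

  class Foo {
    instances += 1
    def foo(a: Int): Int = a + 1
  }

  // Hand-written equivalent of `new Foo().foo _` following the expansion above:
  // the receiver is evaluated once, then captured by the closure.
  val f: Int => Int = {
    val x1 = new Foo()
    (y1: Int) => x1.foo(y1)
  }

  def main(args: Array[String]): Unit = {
    println(instances) // the constructor already ran when f was built
    f(1)
    f(2)
    println(instances) // calling f does not create further instances
  }
}
```

Both `println` calls should report the same count, because the only `new Foo()` sits outside the function body.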

In your second example, those extra parentheses mean that the compiler will look at §6.23 (specifically, "Placeholder Syntax for Anonymous Functions") and create an anonymous function directly.

An expression (of syntactic category Expr) may contain embedded underscore symbols _ at places where identifiers are legal. Such an expression represents an anonymous function where subsequent occurrences of underscores denote successive parameters.

In that case, and following the algorithm in that section, your expression ends up being this:

(x1: Int) => new Foo().foo(x1)

The difference is subtle and, as explained very well by @Antoras, only actually shows in the presence of side-effecting code.
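The placeholder form can be checked the same way. This is a sketch under the same assumptions as before: `instances` and `PlaceholderDemo` are hypothetical names, and the counter stands in for the `println` side effect in the question.

```scala
object PlaceholderDemo {
  // Hypothetical counter, added only to observe constructor calls.
  var instances = 0

  class Foo {
    instances += 1
    def foo(a: Int): Int = a + 1
  }

  // Placeholder syntax: equivalent to (x1: Int) => new Foo().foo(x1),
  // so `new Foo()` is part of the function body.
  val g: Int => Int = new Foo().foo(_)

  def main(args: Array[String]): Unit = {
    println(instances) // no Foo has been created yet
    g(1)
    g(2)
    println(instances) // one new Foo per call to g
  }
}
```

Here the count grows with every call, since the constructor runs inside the function rather than when the function is defined.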

Note that there is a bugfix under way for the case involving call-by-name code blocks (see, for example, this question, this bug and this bug).

Postscript: In both cases, the anonymous function (x1: Int) => toto, where toto stands for the function body, gets expanded to

new scala.Function1[Int, Int] {
  def apply(x1: Int): Int = toto
}
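
As a small illustration of that equivalence, the two forms below behave identically when called. The names `Function1Demo`, `viaLambda` and `viaClass` are hypothetical, and `x1 + 1` stands in for the body. Note that newer compilers (Scala 2.12 and later) may not literally generate an anonymous class for the lambda, but the observable behavior is the same.

```scala
object Function1Demo {
  // Function literal form.
  val viaLambda: Int => Int = (x1: Int) => x1 + 1

  // Explicit form: an anonymous class implementing Function1.
  val viaClass: Int => Int = new Function1[Int, Int] {
    def apply(x1: Int): Int = x1 + 1
  }

  def main(args: Array[String]): Unit = {
    println(viaLambda(7))
    println(viaClass(7))
  }
}
```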
rxg answered Nov 12 '22 13:11


I'm not totally sure, but I think the reason there is a difference is that Scala is not a purely functional programming language and allows side effects:

scala> class Adder { var i = 0; def foo(a:Int)={i+=1;println(i);a+1} }
defined class Adder

scala> val curriedFunction = new Adder().foo _
curriedFunction: (Int) => Int = <function1>

scala> val anonymousFunction = new Adder().foo(_)
anonymousFunction: (Int) => Int = <function1>    

scala> curriedFunction(5)
1
res11: Int = 6

scala> curriedFunction(5)
2
res12: Int = 6

scala> anonymousFunction(5)
1
res13: Int = 6

scala> anonymousFunction(5)
1
res14: Int = 6

The anonymous function is treated as:

val anonymousFunction = (x: Int) => new Adder().foo(x)

Whereas the curried function is treated as:

val curriedFunction = {
  val context = new Adder()
  (a:Int) => context foo a
}

The curried function conforms to the traditional way curried functions are handled in functional languages: a curried function is applied to some data and evaluates to a partially applied function. In other words, a context is created from that data, stored, and can be used later. This is exactly what curriedFunction is doing. Because Scala allows mutable state, the context can change, which can lead to unexpected behavior as seen in the question.

Purely functional languages like Haskell do not have this problem because they do not allow such side effects. In Scala, one has to ensure oneself that the context created by the curried function is really pure. If it is not, and the behavior of purely curried functions is required, anonymous functions have to be used because they do not store a context (which can be problematic if creating the context is expensive and has to be done often).
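
The trade-off can be sketched as follows. This is a variant of the Adder above, not the original: `foo` returns its counter together with the result instead of printing it, and `ContextDemo` is a hypothetical wrapper object.

```scala
object ContextDemo {
  // Variant of Adder: returns (counter, result) instead of printing the counter.
  class Adder {
    var i = 0
    def foo(a: Int): (Int, Int) = { i += 1; (i, a + 1) }
  }

  // Stored context: one Adder is created up front and its state persists.
  val curriedFunction: Int => (Int, Int) = {
    val context = new Adder()
    (a: Int) => context.foo(a)
  }

  // No stored context: a fresh Adder is created on every call.
  val anonymousFunction: Int => (Int, Int) = (a: Int) => new Adder().foo(a)

  def main(args: Array[String]): Unit = {
    println(curriedFunction(5))   // (1,6)
    println(curriedFunction(5))   // (2,6)
    println(anonymousFunction(5)) // (1,6)
    println(anonymousFunction(5)) // (1,6)
  }
}
```

The stored context makes repeated calls cheap but lets mutable state leak between them; the per-call context keeps calls independent at the cost of constructing an Adder each time.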

kiritsuku answered Nov 12 '22 12:11


Because it expands to

(x: Int) => new Foo().foo(x)

So an instance of Foo is created only when you call that function, and a new one is created on every call.

The first one instantiates Foo right away because it expands to

private[this] val c: (Int) => Int = {
  <synthetic> val eta$0$1: Foo = new Foo();
  ((a: Int) => eta$0$1.foo(a))
};
<stable> <accessor> def c: (Int) => Int = Foo.this.c;

Here, Foo is instantiated as soon as c is defined.

Bradford answered Nov 12 '22 11:11