What is the difference between the following ways to write a function in Scala?

Tags:

scala

I am new to Scala, and have seen many ways to define a function but could not find a clear explanation on the differences, and when to use which form.

What are the main differences between the following function definitions?

  1. With '='

    def func1(node: scala.xml.Node) = {
        print(node.label + " = " + node.text + ",")
    }
    
  2. Without '='

    def func2 (node: scala.xml.Node) {
        print(node.label + " = " + node.text + ",")
    }
    
  3. With '=>'

    def func3 = (node: scala.xml.Node) => {
        print(node.label + " = " + node.text + ",")
    }
    
  4. As a var

    var func4 = (node: scala.xml.Node) => {
        print(node.label + " = " + node.text + ",")
    }
    
  5. Without a block

    def func5 (node: scala.xml.Node) = print(node.label + " = " + node.text + ",")  
    

They all seem to compile and render the same result when used as a callback for

    xmlNodes.iterator.foreach(...)
  • Is there any difference in the bytecode each one generates?
  • Are there any guidelines on when to use which form?
Eran Medan asked Jun 21 '12 14:06


4 Answers

Each of these questions has been answered elsewhere on this site, but I don't think anything handles them all together. So:

Braces and equals

Methods defined with an equals sign return a value (whatever the last expression evaluates to). Methods defined with only braces return Unit. If you use an equals sign but the last expression evaluates to Unit, there is no difference. If there is only a single statement after the equals sign, braces are not required; this makes no difference to the bytecode. So 1., 2., and 5. are all essentially identical:

def f1(s: String) = { println(s) }     // println returns `Unit`
def f2(s: String) { println(s) }       // `Unit` return again
def f5(s: String) = println(s)         // Don't need braces; there's only one statement
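To make the inferred return types visible, here is a minimal sketch. The names (ReturnTypes, lengthOf, lengthDiscarded, show) are made up for illustration and are not from the question. It spells out the `: Unit =` form that the brace-only syntax stands for, rather than using the brace-only form itself, since as far as I know that form is deprecated in Scala 2.13 and no longer accepted by Scala 3 outside migration mode.

```scala
object ReturnTypes {
  // With `=`, the result type is inferred from the last expression: Int here.
  def lengthOf(s: String) = { s.length }

  // Explicit equivalent of the brace-only form: the declared result type is
  // Unit, so the value of the last expression is discarded.
  def lengthDiscarded(s: String): Unit = { s.length }

  // With `=` and a body that already evaluates to Unit, the result is Unit too.
  def show(s: String) = println(s)
}
```

So the equals sign only matters when the last expression is something other than Unit.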

Functions vs. methods

A function, often written A => B, is an instance of one of the Function traits, e.g. Function1[A,B]. Because this trait has an apply method, which Scala calls for you when you just use parens without a method name, it looks like a method call. And it is one, except that it's a call on that Function object! So if you write

def f3 = (s: String) => println(s)

then what you are saying is "f3 should create an instance of Function1[String,Unit] which has an apply method that looks like def apply(s: String) = println(s)". So if you say f3("Hi"), this first calls f3 to create the function object, and then calls the apply method on it.

It's rather wasteful to create the function object every single time you want to use it, so it makes more sense to store the function object in a val:

val f4 = (s: String) => println(s)

This holds one instance of the same function object that the def (method) would return, so you don't have to recreate it each time.
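A rough way to see the difference is to count how often the right-hand side gets evaluated. This is only a sketch: DefVsVal and its members are hypothetical names, and the counters track evaluations of the defining expression, not heap allocations (whether a fresh object is really allocated each time can depend on the compiler version and the JVM).

```scala
object DefVsVal {
  // How often each right-hand side has been evaluated.
  var defEvaluations = 0
  var valEvaluations = 0

  // A method: its body runs again on every call to viaDef.
  def viaDef: String => Int = {
    defEvaluations += 1
    (s: String) => s.length
  }

  // A value: its initialiser runs once, when DefVsVal is initialised.
  val viaVal: String => Int = {
    valEvaluations += 1
    (s: String) => s.length
  }
}
```

Calling DefVsVal.viaDef("abc") twice bumps defEvaluations twice, while any number of calls to DefVsVal.viaVal("abc") leaves valEvaluations at one.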

When to use what

People differ on the convention of : Unit = ... and { }. Personally, I write all methods that return Unit without an equals sign--this is an indication to me that the method is almost surely useless unless it has some sort of side-effect (mutates a variable, performs IO, etc.). Also, I generally only use braces when required either because there are multiple statements or because the single statement is so complex I want a visual aid to tell me where it ends.

Methods should be used whenever you want, well, a method. Function objects should be created any time you want to pass them into some other method to use them (or should be specified as parameters any time you want to be able to apply a function). For example, suppose you want to be able to scale a value:

class Scalable(d: Double) {
  def scale(/* What goes here? */) = ...
}

You could supply a constant multiplier. Or you could supply something to add and something to multiply. But most flexibly, you'd just ask for an arbitrary function from Double to Double:

def scale(f: Double => Double) = f(d)

Now, maybe you have an idea of a default scale. That's probably no scaling at all. So you might want a function that takes a Double and returns the very same Double.

val unscaled = (d: Double) => d

We store the function in a val because we don't want to keep creating it over and over again. Now we can use this function as a default argument:

class Scalable(d: Double) {
  val unscaled = (d: Double) => d
  def scale(f: Double => Double = unscaled) = f(d)
}

Now we can call x.scale() and x.scale(_*2) and x.scale(math.sqrt), and they'll all work. (The empty parens are needed to pick up the default argument.)
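Put together as something you can paste and run, the example might look like the sketch below. The explicit type annotations and the ScalableDemo object are additions for illustration, and the inner parameter is renamed to x so it doesn't shadow the constructor's d.

```scala
class Scalable(d: Double) {
  // Created once per Scalable instance and reused as the default argument.
  val unscaled: Double => Double = (x: Double) => x

  // Applies the given function to the wrapped value.
  def scale(f: Double => Double = unscaled): Double = f(d)
}

object ScalableDemo {
  val x = new Scalable(16.0)

  val same    = x.scale()           // uses the default: 16.0
  val doubled = x.scale(_ * 2)      // anonymous function: 32.0
  val root    = x.scale(math.sqrt)  // method lifted to a function: 4.0
}
```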

Rex Kerr answered Nov 16 '22 02:11


Yes, there are differences in bytecode. And yes, there are guidelines.

  1. With =: This declares a method which accepts a parameter and returns the last expression in the right hand side block, which has the type Unit here.

  2. Without =: This declares a method which does not have a return value, that is, the return type is always Unit, irrespective of what the type of the last expression in the right hand side block is.

  3. With =>: This declares a method which returns a function object of type scala.xml.Node => Unit. Every time you invoke this method func3, you will construct a new function object on the heap. If you write func3(node), you will first invoke func3 which returns the function object and then invoke the apply(node) on that function object. This is slower than just calling a plain method directly as in cases 1. and 2.

  4. As a var: This declares a variable and creates a function object as in 3., but the function object is created only once. Using this to call the function object is in most cases slower than just a plain method call (may not be inlined by JIT), but at least you do not recreate the object. If you want to avoid the danger of someone reassigning the variable func4, use a val or a lazy val instead.

  5. This is syntactic sugar for 1. when blocks contain only a single expression.
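The var / val / lazy val remark in point 4 can be illustrated with a small sketch. Holders and its members are hypothetical names, and String => Int stands in for scala.xml.Node => Unit so the example needs nothing beyond the standard library.

```scala
object Holders {
  // A var can be reassigned by any code that can see it.
  var mutableFn: String => Int = (s: String) => s.length

  // A val always refers to the function object it was initialised with.
  val fixedFn: String => Int = (s: String) => s.length

  // A lazy val is initialised on first use, and only once.
  var lazyInits = 0
  lazy val lazyFn: String => Int = {
    lazyInits += 1
    (s: String) => s.length
  }
}
```

After Holders.mutableFn = (s: String) => -1, every caller of mutableFn silently gets the new behaviour, which is the danger mentioned above. The corresponding assignment to fixedFn or lazyFn does not compile.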

Note that if you use forms 1., 2. and 5. with the higher-order foreach method, Scala will still implicitly create a function object which calls func1, func2 or func5, and pass that to foreach (it will not use a method handle or something like that, at least not in the Scala versions current at the time of writing). In these cases, the generated code will roughly correspond to:

xmlNodes.iterator.foreach((node: scala.xml.Node) => funcX(node))

So, the guideline is: unless you are using the same function object every time, just create an ordinary method as in 1., 2. or 5. It will be lifted to a function object anyway wherever one is needed. If you realize that this generates a lot of objects because such a method is called often, you might want to micro-optimize by using form 4. instead, to ensure that the function object for foreach gets created only once.
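Here is a sketch of the three ways of handing a callback to foreach that are discussed above. It uses a List[String] instead of XML nodes so that it runs without the scala-xml module; EtaExpansion, record and the pass* methods are hypothetical names.

```scala
import scala.collection.mutable.ListBuffer

object EtaExpansion {
  // Collects what the callback has seen, so the effect can be inspected.
  val seen = ListBuffer.empty[String]

  // An ordinary method, standing in for func1, func2 or func5.
  def record(s: String): Unit = { seen += s; () }

  // Passing the method name: the compiler wraps it in a function object.
  def passMethod(items: List[String]): Unit = items.foreach(record)

  // Roughly what the line above corresponds to.
  def passLambda(items: List[String]): Unit =
    items.foreach((s: String) => record(s))

  // Form 4: one function object, built once and reused on every call.
  val recordFn: String => Unit = (s: String) => record(s)
  def passStored(items: List[String]): Unit = items.foreach(recordFn)
}
```

All three behave the same from the caller's point of view; they differ only in where the function object comes from.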

Where deciding between 1., 2. and 5. is concerned, one guideline is - if you have a single statement, use form 5.

Otherwise, if the return type is Unit, then use the def foo(): Unit = { form if this is public API, so that clients looking at your code quickly and clearly see the return type. Use the def foo() { form for methods with return type Unit which are private, for your own convenience of shorter code. But this is just one particular guideline regarding style.

For more, see: http://docs.scala-lang.org/style/declarations.html#methods

axel22 answered Nov 16 '22 00:11


Well, 1, 2, and 5 aren't functions at all, they are methods, which are fundamentally different from functions: methods belong to objects and are not themselves objects, whereas functions are objects.

1, 2, and 5 are also exactly the same: if you have only one statement, then you don't need curly braces to group several statements, ergo 5 is the same as 1. Leaving off the = sign is syntactic sugar for declaring a return type of Unit, but Unit is also the inferred return type for 1 and 5, so 2 is the same as 1 and 5.

3 is a method which, when called, returns a function. 4 is a variable which points to a function.
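A short sketch of the method/function distinction, with hypothetical names (MethodsVsFunctions, incMethod, incFunction and so on). The point it tries to show is that a function is a value with methods of its own, while a method only becomes such a value when it is converted.

```scala
object MethodsVsFunctions {
  // A method: belongs to this object and is not a value by itself.
  def incMethod(x: Int): Int = x + 1

  // A function: an object of type Function1[Int, Int].
  val incFunction: Int => Int = (x: Int) => x + 1

  // Being an object, a function has methods of its own, such as andThen.
  val incThenDouble: Int => Int = incFunction.andThen(_ * 2)

  // A method is turned into a function object where one is expected.
  val lifted: Int => Int = incMethod
}
```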

Jörg W Mittag answered Nov 16 '22 00:11


1-2. When you drop the equals sign, your function becomes a procedure (it returns Unit, or just nothing).
3. In the third case you defined a method that returns a function of type scala.xml.Node => Unit.
4. Same, but you've assigned a function of type scala.xml.Node => Unit to a variable. The difference is explained in "Differences between these three ways of defining a function in Scala".
5. No difference compared with 1, but you can't write multiline statements like that.

om-nom-nom answered Nov 16 '22 01:11