Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

scala multiple assignment efficiency

Is multiple assignment (e.g. val (x, y) = (1, 2)) less efficient at runtime than the corresponding single assignments (val x = 1; val y = 2)?

I can imagine the answer being "yes" because scala might need to construct intermediate tuples. Is this correct?

What if I had an extra tuple laying around, e.g. val tup = (1, 2) Now is it more efficient to do:

(a) val (x, y) = tup

OR

(b) val x = tup._1; val y = tup._2

or are they the same?

The difference from the previous example is that the RHS no longer needs to be allocated.

like image 787
dsg Avatar asked Feb 03 '23 20:02

dsg


2 Answers

You can use the new :javap feature of the scala 2.9 REPL:

scala> class A { val (a, b) = (1, 2) }
scala> :javap -c A
Compiled from "<console>"
public class A extends java.lang.Object implements scala.ScalaObject{
...
public A();
  Code:
   0:   aload_0
   1:   invokespecial   #22; //Method java/lang/Object."<init>":()V
   4:   aload_0
   5:   new #24; //class scala/Tuple2$mcII$sp
   8:   dup
   9:   iconst_1
   10:  iconst_2
   11:  invokespecial   #27; //Method scala/Tuple2$mcII$sp."<init>":(II)V
   14:  astore_1
   15:  aload_1
   16:  ifnull  68
   19:  aload_1
   20:  astore_2
   21:  new #24; //class scala/Tuple2$mcII$sp
   24:  dup
   25:  aload_2
   26:  invokevirtual   #33; //Method scala/Tuple2._1:()Ljava/lang/Object;
   29:  invokestatic    #39; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
   32:  aload_2
   33:  invokevirtual   #42; //Method scala/Tuple2._2:()Ljava/lang/Object;
   36:  invokestatic    #39; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
   39:  invokespecial   #27; //Method scala/Tuple2$mcII$sp."<init>":(II)V
   42:  putfield    #44; //Field x$1:Lscala/Tuple2;
   45:  aload_0
   46:  aload_0
   47:  getfield    #44; //Field x$1:Lscala/Tuple2;
   50:  invokevirtual   #47; //Method scala/Tuple2._1$mcI$sp:()I
   53:  putfield    #14; //Field a:I
   56:  aload_0
   57:  aload_0
   58:  getfield    #44; //Field x$1:Lscala/Tuple2;
   61:  invokevirtual   #50; //Method scala/Tuple2._2$mcI$sp:()I
   64:  putfield    #16; //Field b:I
   67:  return
   68:  new #52; //class scala/MatchError
   71:  dup
   72:  aload_1
   73:  invokespecial   #55; //Method scala/MatchError."<init>":(Ljava/lang/Object;)V
   76:  athrow

}

scala> class B { val a = 1; val b = 2 }
scala> :javap -c B
Compiled from "<console>"
public class B extends java.lang.Object implements scala.ScalaObject{
...
public B();
  Code:
   0:   aload_0
   1:   invokespecial   #20; //Method java/lang/Object."<init>":()V
   4:   aload_0
   5:   iconst_1
   6:   putfield    #12; //Field a:I
   9:   aload_0
   10:  iconst_2
   11:  putfield    #14; //Field b:I
   14:  return

}

so i guess the answer is the tuple version is slower. i wonder why there is boxing going on, shouldn't that be gone with the specialization of tuples?!

like image 56
0__ Avatar answered Feb 11 '23 20:02

0__


I wasn't really satified with the lack of benchmarks, so here are some benchmarks done using https://github.com/sirthias/scala-benchmarking-template, which uses Google Caliper in the background. Charts are contain the calculated (ns/inside loop execution), but the text results are directly from the console. The code:

package org.example

import annotation.tailrec
import com.google.caliper.Param

class Benchmark extends SimpleScalaBenchmark {
    @Param(Array("10", "100", "1000", "10000"))
    val length: Int = 0

    var array: Array[Int] = _

    override def setUp() {
        array = new Array(length)
    }

    def timeRegular(reps: Int) = repeat(reps) {        
        var result = 0        
        array.foreach {value => {
            val tuple = (value, value)
            val (out1, out2) = tuple
            result += out1
            result += out2
        }}

        result
    }

    def timeUnpack(reps: Int) = repeat(reps) {
        var result = 0
        array.foreach {value =>{
            val tuple = (value, value)
            val out1 = tuple._1
            val out2 = tuple._2
            result += out1
            result += out2
        }}

        result
    }

    def timeBoxedUnpack(reps: Int) = repeat(reps) {
        var result = 0
        array.foreach {value =>{
            val tuple = (value, value, value)
            val out1 = tuple._1
            val out2 = tuple._2
            val out3 = tuple._3
            result += out1
            result += out2
            result += out3
        }}

        result
    }
}

Scala 2.9.2

 0% Scenario{vm=java, trial=0, benchmark=Regular,     length=10}    102.09 ns;    σ=1.04 ns    @ 10 trials
 8% Scenario{vm=java, trial=0, benchmark=Unpack,      length=10}    28.23 ns;     σ=0.27 ns    @ 6 trials
17% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10}    110.17 ns;    σ=1.95 ns    @ 10 trials
25% Scenario{vm=java, trial=0, benchmark=Regular,     length=100}   909.73 ns;    σ=6.42 ns    @ 3 trials
33% Scenario{vm=java, trial=0, benchmark=Unpack,      length=100}   271.40 ns;    σ=1.35 ns    @ 3 trials
42% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=100}   946.59 ns;    σ=8.38 ns    @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=Regular,     length=1000}  8966.33 ns;   σ=40.17 ns   @ 3 trials
58% Scenario{vm=java, trial=0, benchmark=Unpack,      length=1000}  2517.54 ns;   σ=4.56 ns    @ 3 trials
67% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=1000}  9374.71 ns;   σ=68.25 ns   @ 3 trials
75% Scenario{vm=java, trial=0, benchmark=Regular,     length=10000} 81244.84 ns;  σ=661.81 ns  @ 3 trials
83% Scenario{vm=java, trial=0, benchmark=Unpack,      length=10000} 23502.73 ns;  σ=122.83 ns  @ 3 trials
92% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10000} 112683.27 ns; σ=1101.51 ns @ 4 trials

length   benchmark       ns linear runtime
    10     Regular    102.1 =
    10      Unpack     28.2 =
    10 BoxedUnpack    110.2 =
   100     Regular    909.7 =
   100      Unpack    271.4 =
   100 BoxedUnpack    946.6 =
  1000     Regular   8966.3 ==
  1000      Unpack   2517.5 =
  1000 BoxedUnpack   9374.7 ==
 10000     Regular  81244.8 =====================
 10000      Unpack  23502.7 ======
 10000 BoxedUnpack 112683.3 ==============================

Chart showing the above data

Scala 2.10.3

 0% Scenario{vm=java, trial=0, benchmark=Regular,     length=10}    28.26 ns;     σ=0.13 ns    @ 3 trials
 8% Scenario{vm=java, trial=0, benchmark=Unpack,      length=10}    28.27 ns;     σ=0.07 ns    @ 3 trials
17% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10}    109.56 ns;    σ=2.27 ns    @ 10 trials
25% Scenario{vm=java, trial=0, benchmark=Regular,     length=100}   273.40 ns;    σ=2.73 ns    @ 5 trials
33% Scenario{vm=java, trial=0, benchmark=Unpack,      length=100}   271.25 ns;    σ=2.63 ns    @ 6 trials
42% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=100}   1088.00 ns;   σ=10.60 ns   @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=Regular,     length=1000}  2516.30 ns;   σ=7.13 ns    @ 3 trials
58% Scenario{vm=java, trial=0, benchmark=Unpack,      length=1000}  2525.00 ns;   σ=24.25 ns   @ 6 trials
67% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=1000}  10188.98 ns;  σ=101.32 ns  @ 3 trials
75% Scenario{vm=java, trial=0, benchmark=Regular,     length=10000} 25886.80 ns;  σ=116.33 ns  @ 3 trials
83% Scenario{vm=java, trial=0, benchmark=Unpack,      length=10000} 25938.97 ns;  σ=76.02 ns   @ 3 trials
92% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10000} 115629.82 ns; σ=1159.41 ns @ 5 trials

length   benchmark       ns linear runtime
    10     Regular     28.3 =
    10      Unpack     28.3 =
    10 BoxedUnpack    109.6 =
   100     Regular    273.4 =
   100      Unpack    271.2 =
   100 BoxedUnpack   1088.0 =
  1000     Regular   2516.3 =
  1000      Unpack   2525.0 =
  1000 BoxedUnpack  10189.0 ==
 10000     Regular  25886.8 ======
 10000      Unpack  25939.0 ======
 10000 BoxedUnpack 115629.8 ==============================

Chart showing the above data

Conclusion

Unpacking tuples is fast as long as the tuple arity <= 2. If it is greater than 2, there is too much indirection and for the Hotspot compiler to optimise.

Scala 2.9.2 has some sort of weird issues making assignment with tuples faster than regular assignment. Weird, but it can probably be disregarded.

This is done using

java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
like image 45
user60561 Avatar answered Feb 11 '23 19:02

user60561