
Weird behavior accessing tuple from Java

Tags: java, tuples, scala

I am looking for an explanation and/or versioning details (if possible) about a very strange behavior I found in Java when accessing tuples created in Scala.

I will show the weird behavior with a simple test. I created this Scala class:

class Foo {
  def intsNullTuple = (null.asInstanceOf[Int], 2)
  def intAndStringNullTuple =  (null.asInstanceOf[Int], "2")
}

and then I ran this Java program:

Tuple2<Object, Object> t = (new Foo()).intsNullTuple();
t._1(); // returns 0 !
t._1; // returns null
Tuple2<Object, String> t2 = (new Foo()).intAndStringNullTuple();
t2._1(); // returns null
t2._1; // returns null

Does anybody have an explanation for this? In my tests I am using Java 1.8 and Scala 2.11.8. Can anyone also advise on the compatibility of using _1 from Java code with older Scala 2.11 and 2.10 versions and with Java 1.7? I read that _1 is not accessible from Java, but I can access it in my tests, so I am looking for the versions which support it.

Thanks.

mgaido asked Dec 02 '17 08:12




2 Answers

does anybody have any explanation on the reason of this?

This happens because Scala generates a specialized variant of Tuple2 for (Int, Int), but not for (Int, String). You can see it from the signature of Tuple2:

case class Tuple2[@specialized(Int, Long, Double, Char, Boolean/*, AnyRef*/) +T1, @specialized(Int, Long, Double, Char, Boolean/*, AnyRef*/) +T2](_1: T1, _2: T2)

This means that the Scala compiler emits an extra class for the cases where T1 and T2 are both among the specialized types. In our example there is a special class taking two ints, roughly like this:

class Tuple2Special(i: Int, j: Int)

We can see this when looking at the disassembled bytecode:

Compiled from "Foo.scala"
public class com.testing.Foo {
  public scala.Tuple2<java.lang.Object, java.lang.Object> intsNullTuple();
    Code:
       0: new           #12                 // class scala/Tuple2$mcII$sp
       3: dup
       4: aconst_null
       5: invokestatic  #18                 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
       8: iconst_2
       9: invokespecial #22                 // Method scala/Tuple2$mcII$sp."<init>":(II)V
      12: areturn

  public scala.Tuple2<java.lang.Object, java.lang.String> intAndStringNullTuple();
    Code:
       0: new           #27                 // class scala/Tuple2
       3: dup
       4: aconst_null
       5: ldc           #29                 // String 2
       7: invokespecial #32                 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
      10: areturn

  public com.testing.Foo();
    Code:
       0: aload_0
       1: invokespecial #35                 // Method java/lang/Object."<init>":()V
       4: return
}

In the case of intsNullTuple, you can see that the new opcode instantiates Tuple2$mcII$sp, which is the specialized version. That is why your call to _1() yields 0: the specialized class stores its elements as primitive ints, and 0 is the default value for the value type Int. The field _1, on the other hand, isn't specialized: it is the Object-typed field of the generic Tuple2, not an Int, and it is null here.

This can also be seen by compiling with scalac's -Xprint:jvm flag:

λ scalac -Xprint:jvm Foo.scala
[[syntax trees at end of                       jvm]] // Foo.scala
package com.testing {
  class Foo extends Object {
    def intsNullTuple(): Tuple2 = new Tuple2$mcII$sp(scala.Int.unbox(null), 2);
    def intAndStringNullTuple(): Tuple2 = new Tuple2(scala.Int.box(scala.Int.unbox(null)), "2");
    def <init>(): com.testing.Foo = {
      Foo.super.<init>();
      ()
    }
  }
}
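
To make the mechanism above more concrete, here is a minimal Java sketch of how a specialized subclass can end up with a null generic field. The classes Pair, PairIntInt and SpecializationDemo are hypothetical stand-ins invented for this illustration; this is not the actual scala.Tuple2 source, only a model that reproduces the behavior observed in the question.

```java
// Hypothetical model of the specialization scheme; NOT the real scala.Tuple2 source.
class Pair<A, B> {
    public final A _1;   // generic (boxed) field, playing the role of Tuple2._1
    public final B _2;

    Pair(A first, B second) {
        this._1 = first;
        this._2 = second;
    }

    public A _1() { return _1; }
    public B _2() { return _2; }
}

// Stand-in for Tuple2$mcII$sp: keeps its values in primitive int fields.
final class PairIntInt extends Pair<Object, Object> {
    private final int first;
    private final int second;

    PairIntInt(int first, int second) {
        super(null, null);   // the generic fields are never filled in
        this.first = first;
        this.second = second;
    }

    // The accessors are overridden to box the primitive values on demand.
    @Override public Object _1() { return first; }
    @Override public Object _2() { return second; }
}

class SpecializationDemo {
    public static void main(String[] args) {
        // null.asInstanceOf[Int] has already become the int 0 at this point.
        Pair<Object, Object> t = new PairIntInt(0, 2);
        System.out.println("method: " + t._1()); // method: 0
        System.out.println("field: " + t._1);    // field: null
    }
}
```

Under these assumptions the accessor method and the public field can disagree, because only the method is overridden by the subclass while the field still belongs to the generic base class.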

Another interesting fact is that Scala 2.12 changed this behavior, and makes intAndStringNullTuple yield 0 instead:

public scala.Tuple2<java.lang.Object, java.lang.String> intAndStringNullTuple();
  Code:
     0: new           #27                 // class scala/Tuple2
     3: dup
     4: aconst_null
     5: invokestatic  #18                 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
     8: invokestatic  #31                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
    11: ldc           #33                 // String 2
    13: invokespecial #36                 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
    16: areturn

Yields:

t1 method: 0
t1 field: null
t2 method: 0
t2 field: 0

This is because null is now converted to 0 via unboxToInt and then wrapped in an Integer instance via boxToInteger.
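
The effect of that unbox/box round trip can be sketched in plain Java. The class BoxingSketch and its two helpers are hypothetical re-implementations written for this illustration; the real methods live in scala.runtime.BoxesRunTime, and the sketch only assumes that unboxing null gives 0, as the results above show.

```java
// Hypothetical re-implementation for illustration only;
// the real helpers are in scala.runtime.BoxesRunTime.
class BoxingSketch {
    // Assumed behavior: unboxing null gives the default int value 0.
    static int unboxToInt(Object o) {
        return o == null ? 0 : ((Integer) o).intValue();
    }

    static Integer boxToInteger(int i) {
        return Integer.valueOf(i);
    }

    public static void main(String[] args) {
        // Older bytecode: null is handed to the Tuple2 constructor untouched.
        Object passedThrough = null;
        // 2.12 bytecode: null is unboxed to an int and then boxed again.
        Object roundTripped = boxToInteger(unboxToInt(null));

        System.out.println(passedThrough); // null
        System.out.println(roundTripped);  // 0
    }
}
```

Once the value has gone through the round trip, both the accessor and the field of the generic Tuple2 see an Integer holding 0 rather than null.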

Edit:

According to the relevant people at Lightbend, this change is due to the rework of the bytecode generator (backend) done in 2.12 (see https://github.com/scala/scala/pull/5176 for more).

Yuval Itzchakov answered Sep 30 '22 09:09


First, it needs to be called out that in Scala everything is an object: unlike Java, the language has no separate primitive types (Int, in your code). Scala still has to compile to Java bytecode to run on the JVM, and since objects consume more memory than primitives, Scala has specialization to address this: when a type parameter is annotated with @specialized for given types, the compiler generates variants of the class and its methods that take the corresponding primitive types.

In your code the class is Tuple2, which is specialized for Int, Long, Double, Char and Boolean. This generates the corresponding primitive-typed constructors, like:

Tuple2(int _v1, int _v2) --> Tuple2$mcII$sp
Tuple2(long _v1, long _v2)
...

One more thing needs to be made clear: boxing and unboxing. The compiler decides at compile time whether a value needs to be converted to a primitive type (unboxed) or to an Object (boxed); see BoxesRunTime for more.

For intsNullTuple, see the bytecode:

 scala> :javap -c Foo
 public scala.Tuple2<java.lang.Object, java.lang.Object> intsNullTuple();
    Code:
       0: new           #17                 // class scala/Tuple2$mcII$sp
       3: dup
       4: aconst_null
       5: invokestatic  #23                 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
       8: iconst_2
       9: invokespecial #27                 // Method scala/Tuple2$mcII$sp."<init>":(II)V
      12: areturn

As you can see in the code above, the compiler has decided to unbox the Object to an int via BoxesRunTime.unboxToInt, which returns a primitive int. So it will actually invoke Tuple2$mcII$sp(int _1, int _2).

For intAndStringNullTuple, see the bytecode:

  public scala.Tuple2<java.lang.Object, java.lang.String> intAndStringNullTuple();
    Code:
       0: new           #32                 // class scala/Tuple2
       3: dup
       4: aconst_null
       5: invokestatic  #23                 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
       8: invokestatic  #36                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
      11: ldc           #38                 // String 2
      13: invokespecial #41                 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
      16: areturn

Here you can also see that the value is finally boxed back to an Object by boxToInteger, so this will actually invoke Tuple2(Object _1, Object _2).

As for why _1() returns 0 while _1 returns null: Java generics only support reference types, hence Tuple2<Object, Object>. When you invoke _1() you actually invoke java.lang.Object _1(), which is equivalent to invoking public int _1$mcI$sp() and boxing the result:

scala> :javap -c scala.Tuple2$mcII$sp
Compiled from "Tuple2.scala"
public final class scala.Tuple2$mcII$sp extends scala.Tuple2<java.lang.Object, java.lang.Object> implements scala.Product2$mcII$sp {
  public final int _1$mcI$sp;

  public final int _2$mcI$sp;

  public int _1$mcI$sp();
    Code:
       0: aload_0
       1: getfield      #14                 // Field _1$mcI$sp:I
       4: ireturn
  ...
  public java.lang.Object _1();
    Code:
       0: aload_0
       1: invokevirtual #33                 // Method _1:()I
       4: invokestatic  #56                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
       7: areturn

So _1() will return 0.

Accessing _1 directly actually reads the field of Tuple2<Object, Object>. Since that field is an Object and the specialized class keeps its value in _1$mcI$sp instead, it is null.

scala> :javap -c scala.Tuple2
Compiled from "Tuple2.scala"
public class scala.Tuple2<T1, T2> implements scala.Product2<T1, T2>, scala.Serializable {
  public final T1 _1;

  public final T2 _2;

Finally, as I understand it, because of boxing and unboxing combined with specialization, we should always invoke _1() rather than access _1.

chengpohi answered Sep 30 '22 09:09