Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Scala Implicitly for Type Equality

I've been reading some stuff on Scala type level programming. Mainly the Apocalisp blog, and also a youtube talk by Alexander Lehmann.

I am a bit stuck on something which I guess is probably very basic, which is the use of implicitly to compare two types as shown below:

implicitly[Int =:= Int]

Mark on the Apocalisp blog says:

This is useful for capturing an implicit value that is in scope and has type T.

I get how to make this work, but I don't really know why it works and so don't want to move on.

In the case above is there an implicit of type 'Int' in scope, that 'implicitly' plucks from the ether, allowing the code to compile? How does that fit with the 'function1' return type of?

res0: =:=[Int,Int] = <function1>

Also, where does this implicit come from? How about in the case of my trait 'Foo', why does

implicitly[Foo =:= Foo] 

compile? Where would the 'Foo' implicit come from in this case?

Apologies in advance if this is a very dumb question, and thanks for any help!

like image 858
bobbyr Avatar asked Mar 28 '14 13:03

bobbyr


1 Answers

X =:= Y is just syntactic sugar (infix notation) for the type =:=[X, Y].

So when you do implicitly[Y =:= Y], you are simply looking up an implicit value of type =:=[X, Y]. =:= is a generic trait defined in Predef.

Also, =:= is a valid type name because type names (just like any identifier) can contain special characters.

From now on, let's rename =:= as IsSameType and remove the infix notation so as to make our code look more straightforward and less magic. This gives us implicitly[IsSameType[X,Y]]

Here is a simplified version of how this type is defined:

sealed abstract class IsSameType[X, Y]
object IsSameType {
   implicit def tpEquals[A] = new IsSameType[A, A]{}
}

Notice how tpEquals provides an implicit value of IsSameType[A, A] for any type A. In other words, it provides an implicit value of IsSameType[X, Y] if and only if X and Y are the same type. So implicitly[IsSameType[Foo, Foo]] compiles fine. But implicitly[IsSameType[Int, String]] does not, as there is no implicit in scope of type IsSameType[Int, String], given that tpEquals is unapplicable here.

So with this very simple construct we are able to statically check that some type X is the same as another type Y.


Now here is an example of how it might be useful. Say I want to define a Pair type (ignoring the fact that it already exists in the standard library):

case class Pair[X,Y]( x: X, y: Y ) {
  def swap: Pair[Y,X] = Pair( y, x )
}

Pair is parameterized with the types of its 2 elements, which can be anything, and most importantly are unrelated. Now what if I want to define a method toList that converts the pair to a 2 elements list? This method would only really make sense in the case where X and Y are the same, otherwise I would be forced to return a List[Any]. And I certainly don't want to change the definition of Pair to Pair[T]( x: T, y: T ) because I really want to be able to have pairs of heterogeneous types. After all, it is only when calling toList that I need that X == Y. All other methods (such as swap) should be callable on any kind of heterogenous pair. So in the end I really want to statically ensure that X == Y, but only when calling toList, in which case it becomes possible and consistent to return a List[X] (or a List[Y], which is the same thing):

case class Pair[X,Y]( x: X, y: Y ) {
  def swap: Pair[Y,X] = Pair( y, x )
  def toList( implicit evidence: IsSameType[X, Y] ): List[Y] = ???
}

But there is still a serious problem when it comes to actually implement toList. If I try to write the obvious implementation, this fails to compile:

def toList( implicit evidence: IsSameType[X, Y] ): List[Y] = List[Y]( x, y )

The compiler will complain that x is not of type Y. And indeed, X and Y are still different types as far as the compiler is concerned. It is only by careful construction that we can be statically sure that X == Y (namely, the fact that toList takes an implicit value of type IsSameType[X, Y], and that they are provided by the method tpEquals only if X == Y). But the compiler certainly won't decipher this astute construction to conclude that X == Y.

What we can do to fix this situation is to provide an implicit conversion from X to Y provided that we know that X == Y (or in other words, that we have an instance of IsSameType[X, Y] in scope).

// A simple cast will do, given that we statically know that X == Y
implicit def sameTypeConvert[X,Y]( x: X )( implicit evidence: IsSameType[X, Y] ): Y = x.asInstanceOf[Y]

And now, our implementation of toList finally compiles fine: x will simply be converted to Y through the implicit conversion sameTypeConvert.

As a final tweak, we can simplify things even further: given that we are taking an implicit value (evidence) as a parameter already, why not have THIS value implement the conversion? Like this:

sealed abstract class IsSameType[X, Y] extends (X => Y) {
  def apply( x: X ): Y = x.asInstanceOf[Y]
}
object IsSameType {
   implicit def tpEquals[A] = new IsSameType[A, A]{}
}    

We can then remove the method sameTypeConvert, as the implicit conversion is now provided by the IsSameType instance itself. Now IsSameType serves a double purpose: statically ensuring that X == Y, and (if it is) providing the implicit conversion that actually allows us to treat instances of X as instances of Y.

We have now basically reimplemented the type =:= as defined in Predef


UPDATE: From the comments, it seems apparent that the use of asInstanceOf is bothering people (even though it really is just an implementation detail, and no user of IsSameType needs to ever do a cast). It turns out that it is easy to get rid of it even in the implementation. Behold:

sealed abstract class IsSameType[X, Y] extends (X => Y) {
  def apply(x: X): Y
}
object IsSameType {
  implicit def tpEquals[A] = new IsSameType[A, A]{
    def apply(x: A): A = x
  }
}

Basically, we just leave the apply abstract, and only implement it right in tpEquals where we (and the compiler) know that both the passed argument and the return value really have the same type. Hence no need for any cast. That's it really.

Note that in the end, the same cast is still present in the generated bytecode, but is now absent from the source code, and provably correct from the point of view of the compiler. And though we did introduce an additional (anonymous) class (and thus an additional indirection from the abstract class to the concrete class), it should run just as fast on any decent virtual machine because we are in the simple case of a "monomorphic method dispatch" (look it up if you are interested in the inner workings of virtual machines). Although it might still make it harder for the virtual machine to inline the call to apply (runtime virtual machine optimization is something of a black art and it's hard to make definite statements).

As a final note, I have to stress that it's really not a big deal to have a cast in the code anyway if it's provably correct. And after all, the standard library itself had this very same cast up until recently (now the whole implementation has been revamped to make it semmingly more powerful, but still contains casts in other places). If it's good enough for the standard library, it's good enough for me.

like image 87
Régis Jean-Gilles Avatar answered Nov 17 '22 11:11

Régis Jean-Gilles