Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Selectively disable subsumption in Scala? (correctly type List.contains)

List("a").contains(5)

Because an Int can never be contained in a list of String, this should generate an error at compile-time, but it does not.

It wastefully and silently tests every String contained in the list for equality to 5, which can never be true ("5" never equals 5 in Scala).

This has been named "the 'contains' problem". And some have implied that if a type system cannot correctly type such semantics, then why go through the extra effort for enforcing types. So I consider it is an important problem to solve.

The type parametrization B >: A of List.contains inputs any type that is a supertype of the type A (the type of the elements contained in the list).

trait List[+A] {
   def contains[B >: A](x: B): Boolean
}

This type parametrization is necessary because the +A declares that the list is covariant on the type A, thus A can't be used in the contravariant position, i.e. as the type of an input parameter. Covariant lists (which must be immutable) are much more powerful for extension than invariant lists (which can be mutable).

A is a String in the problematic example above, but Int is not a supertype of String, so what happened? The implicit subsumption in Scala, decided that Any is a mutual supertype of both String and Int.

The creator of Scala, Martin Odersky, suggested that a fix would be to limit the input type B to only those types that have an equals method that Any doesn't have.

trait List[+A] {
   def contains[B >: A : Eq](x: B): Boolean
}

But that doesn't solve the problem, because two types (where the input type is not supertype of the type of the elements of the list) might have a mutual supertype which is a subtype of Any, i.e. also a subtype of Eq. Thus, it would compile without error and the incorrectly typed semantics would remain.

Disabling implicit subsumption every where is not an ideal solution either, because implicit subsumption is why the following example for subsumption to Any works. And we wouldn't want to be forced to use type casts when the receiving site (e.g. passing as a function argument) has correctly typed semantics for a mutual supertype (that might not even be Any).

trait List[+A] {
   def ::[B >: A](x: B): List[B]
}

val x : List[Any] = List("a", 5) // see[1]

[1] List.apply calls the :: operator.

So my question is what is the best fix to this problem?

My tentative conclusion is that implicit subsumption should be turned off at the definition site where the semantics are otherwise not typed correctly. I will be providing an answer that shows how to turn off implicit subsumption at the method definition site. Are there alternative solutions?

Please note this problem is general, and not isolated to lists.

UPDATE: I have filed an improvement request and started a scala discussion thread on this. I have also added comments under Kim Stebel's and Peter Schmitz's answers showing that their answers have erroneous functionality. Thus there is no solution. Also at the aforementioned discussion thread, I explained why I think soc's answer is not correct.

like image 638
Shelby Moore III Avatar asked Dec 02 '11 17:12

Shelby Moore III


5 Answers

This sounds good in theory, but falls apart in real life in my opinion.

equals is not based on types and contains is building on top of that.

That's why code like 1 == BigInt(1) works and returns the result most people would expect.

In my opinion it doesn't make sense to make contains more strict than equals.

If contains would be made more strict, code like List[BigInt](1,2,3) contains 1 would stop working completely.

I don't think “unsafe” or “not type safe” are the right terms here, by the way.

like image 58
soc Avatar answered Oct 22 '22 22:10

soc


Why not use an equality typeclass?

scala> val l = List(1,2,3)
l: List[Int] = List(1, 2, 3)

scala> class EQ[A](a1:A) { def ===(a2:A) = a1 == a2 } 
defined class EQ

scala> implicit def toEQ[A](a1:A) = new EQ(a1)
toEQ: [A](a1: A)EQ[A]

scala> l exists (1===)
res7: Boolean = true

scala> l exists ("1"===)
<console>:14: error: type mismatch;
 found   : java.lang.String => Boolean
 required: Int => Boolean
              l exists ("1"===)
                           ^

scala> List("1","2")
res9: List[java.lang.String] = List(1, 2)

scala> res9 exists (1===)
<console>:14: error: type mismatch;
 found   : Int => Boolean
 required: java.lang.String => Boolean
              res9 exists (1===)
like image 32
Kim Stebel Avatar answered Oct 22 '22 22:10

Kim Stebel


I think you misunderstand Martin's solution, it is not B <: Eq, it is B : Eq, which is a shortcut for

def Contains[B >: A](x: B)(implicit ev: Eq[B])

And Eq[X] would then contains a method

def areEqual(a: X, b: X): Boolean

This is not the same as moving the equals method of Any a little lower in the hierarchy, which would indeed solve none of the problem of having it in Any.

like image 35
Didier Dupont Avatar answered Oct 22 '22 21:10

Didier Dupont


In my library extension I use:

class TypesafeEquals[A](val a: A) {
  def =*=(x: A): Boolean = a == x
  def =!=(x: A): Boolean = a != x
}
implicit def any2TypesafeEquals[A](a: A) = new TypesafeEquals(a)


class RichSeq[A](val seq: Seq[A]) { 
  ...
  def containsSafely(a: A): Boolean = seq exists (a =*=)
  ...
}
implicit def seq2RichSeq[A](s: Seq[A]) = new RichSeq(s)

So I avoid calling contains.

like image 3
Peter Schmitz Avatar answered Oct 22 '22 21:10

Peter Schmitz


The examples use L instead of List or SeqLike, because for this solution to be applied to preexisting contains method of those collections, it would require a change to the preexisting library code. One of the goals is the best way to do equality, not the best compromise to interopt with the current libraries (although backwards compatibility needs to be considered). Additionally, my other goal is this answer is generally applicable for any method function that wants to selectively disable the implicit subsumption feature of the Scala compiler for any reason, not necessarily tied to the equality semantics.

case class L[+A]( elem: A )
{
   def contains[B](x: B)(implicit ev: A <:< B) = elem == x
}

The above generates an error as desired, assuming the desired semantics for List.contains is the input should be equal to and a supertype of the contained element.

L("a").contains(5)
error: could not find implicit value for parameter ev: <:<[java.lang.String,Int]
       L("a").contains(5)
                      ^

The error is not generated when implicit subsumption was not required.

scala> L("a").contains(5 : Any)
defined class L

scala> L("a").contains("")
defined class L

This disables the implicit subsumption (selectively at the method definition site), by requiring the input parameter type B to be the same as the argument type passed as input (i.e. not implicitly subsumable with A), and then separately require implicit evidence that B is a, or has an implicitly subsumable, supertype of A.]


UPDATE May 03, 2012: The code above is not complete, as is shown below that turning off all subsumption at the method definition-site does not give the desired result.

class Super
defined class Super
class Sub extends Super
defined class Sub
L(new Sub).contains(new Super)
defined class L
L(new Super).contains(new Sub)
error: could not find implicit value for parameter ev: <:<[Super,Sub]
       L(new Super).contains(new Sub)
                            ^

The only way to get the desired form of subsumption, is to also cast at the method (call) use-site.

L(new Sub).contains(new Super : Sub)
error: type mismatch;
 found   : Super
 required: Sub
       L(new Sub).contains(new Super : Sub)
                           ^
L(new Super).contains(new Sub : Super)
defined class L

Per soc's answer, the current semantics for List.contains is that the input should be equal to, but not necessarily a supertype of the contained element. This assumes List.contains promises any matched item only equals and is not required to be a (subtype or) copy of an instance of the input. The current universal equality interface Any.equals : Any => Boolean is unityped, so equality doesn't enforce a subtyping relationship. If this is the desired semantics for List.contains, subtyping relationships can't be employed to optimize the compile-time semantics, e.g. disabling implicit subsumption, and we are stuck with the potential semantic inefficiencies that degrade runtime performance for List.contains.

While I will be studying and thinking more about equality and contains, afaics my answer remains valid for the general purpose of selectively disabling implicit subsumption at the method definition site.

My thought process is also ongoing holistically w.r.t. the best model of equality.


Update: I added a comment below soc's answer, so I now think his point is not relevant. Equality should always be based on a subtyped relationship, which afaics is what Martin Odersky is proposing for the new equality overhaul (see also his version of contains). Any ad-hoc polymorphic equivalence (e.g. BitInt(1) == 1) can be handled with implicit conversions. I explained in my comment below didierd's answer that without my improvement below, afaics Martin's proposed contains would have a semantic error, whereby a mutual implicitly subsumed supertype (other than Any) will select the wrong implicit instance of Eq (if one exists, else unnecessary compiler error). My solution disables the implicit subsumption for this method, which is the correct semantics for the subtyped argument of Eq.eq.

trait Eq[A]
{
   def eq(x: A, y: A) = x == y
}

implicit object EqInt extends Eq[Int]
implicit object EqString extends Eq[String]

case class L[+A]( elem: A )
{
   def contains[B](x: B)(implicit ev: A <:< B, eq: Eq[B]) = eq.eq(x, elem)
}
L("a").contains("")

Note Eq.eq can be optionally replaced by the implicit object (not overridden because there is no virtual inheritance, see below).

Note that as desired, L("a").contains(5 : Any) no longer compiles, because Any.equals is no longer used.

We can abbreviate.

case class L[+A]( elem: A )
{
   def contains[B : Eq](x: B)(implicit ev: A <:< B) = eq.eq(x, elem)
}

Add: The x == y must be a virtual inheritance call, i.e. x.== should be declared override, because there is no virtual inheritance in the Eq typeclass. The type parameter A is invariant (because A is used in the contravariant position as input parameter of Eq.eg). Then we can define an implicit object on an interface (a.k.a. trait).

Thus, the Any.equals override must still check if the concrete type of the input matches. That overhead can't be removed by the compiler.

like image 3
19 revs Avatar answered Oct 22 '22 21:10

19 revs