Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scala: Contains in mutable and immutable sets

I've discovered a strange behavior for mutable sets which I cannot understand:

I have a object which I want to add to a set. The equals method for the class is overridden. When I add two different objects to the set, which produces the same output for equals method, I get a different behavior between mutable and immutable sets for the contains method.

Here is the code snippet:

class Test(text:String){
  override def equals(obj:Any) = obj match {
    case t: Test => if (t.text == this.text) true else false
    case _ => false
  }
  override def toString = text
}

val mutableSet:scala.collection.mutable.Set[Test] = scala.collection.mutable.Set.empty
mutableSet += new Test("test")
println(mutableSet)
println(mutableSet.contains(new Test("test")))

val immutableSet:scala.collection.immutable.Set[Test] = scala.collection.immutable.Set.empty
immutableSet += new Test("test")
println(immutableSet)
println(immutableSet.contains(new Test("test")))

This produces as output:

Set(test)
false
Set(test)
true

In my opinion both calls of contains should produce the same output (true).

Could anybody help me to understand the difference here or is this a bug in the scala immutable set implementation? By the way, I use scala 2.8.1.final

Thanks.

like image 665
Stefan Avatar asked Sep 25 '11 18:09

Stefan


People also ask

Is set mutable or immutable in Scala?

There are two kinds of Sets, the immutable and the mutable. The difference between mutable and immutable objects is that when an object is immutable, the object itself can't be changed. By default, Scala uses the immutable Set. If you want to use the mutable Set, you'll have to import scala.

Is array mutable or immutable in Scala?

Array in scala is homogeneous and mutable, i.e it contains elements of the same data type and its elements can change but the size of array size can't change.

Are Scala lists immutable?

Specific to Scala, a list is a collection which contains immutable data, which means that once the list is created, then it can not be altered. In Scala, the list represents a linked list. In a Scala list, each element need not be of the same data type.

Are sets mutable Scala?

Some Important Points about Set in Scala In Scala, both mutable and immutable sets are available. Mutable set is those set in which the value of the object is change but, in the immutable set, the value of the object is not changed itself. By default set in Scala are immutable.


2 Answers

Rule 1 when implementing equals(): Implement hashCode() at the same time. See Overriding equals and hashCode in Java

In the first example, you're creating a mutable set, which calls hashCode to set up the hash table.

In the second, you're using an immutable set with one entry, so Scala actually uses an optimised version of Set called Set1. Set1.contains() just compares the one entry with the passed element using equals() directly. This looks like:

/** An optimized representation for immutable sets of size 1 */
@SerialVersionUID(1233385750652442003L)
class Set1[A] private[collection] (elem1: A) extends Set[A] with Serializable {
  override def size: Int = 1
  def contains(elem: A): Boolean = 
    elem == elem1
  def + (elem: A): Set[A] = 
    if (contains(elem)) this
    else new Set2(elem1, elem)
  def - (elem: A): Set[A] = 
    if (elem == elem1) Set.empty
    else this
  def iterator: Iterator[A] = 
    Iterator(elem1)
  override def foreach[U](f: A =>  U): Unit = {
    f(elem1)
  }
}

No hashCode is called. There is also a Set2, Set3 and Set4.

So if we change your code to be:

class Test(val text:String){
  override def equals(obj:Any) = {
  println("equals=" + obj)
  obj match {
    case t: Test => if (t.text == this.text) true else false
    case _ => false
  }}

  override def hashCode(): Int = {
    println("hashCode=" + super.hashCode())
    super.hashCode()
  }
  override def toString = text
}

println("mutable")
val mutableSet:scala.collection.mutable.Set[Test] = scala.collection.mutable.Set.empty
mutableSet += new Test("test")
println("mutableSet=" + mutableSet + " contains=" + mutableSet.contains(new Test("test")))

println("immutable")
var immutableSet:scala.collection.immutable.Set[Test] = scala.collection.immutable.Set.empty
immutableSet += new Test("test")
println("immutableSet=" + immutableSet + " contains=" + immutableSet.contains(new Test("test")))

adding a hashCode and a println in the equals, and the output is:

mutable
hashCode=30936685
hashCode=26956691
mutableSet=Set(test) contains=false
immutable
equals=test
immutableSet=Set(test) contains=true

which explains why the mutable.contains() isn't working correctly. It is looking up the object in the wrong hash table entry, equals() doesn't even get called. And, unsurprisingly, it doesn't find it.

You can implement hashCode using text.hashCode:

override def hashCode: Int = text.hashCode
like image 57
Matthew Farwell Avatar answered Nov 12 '22 13:11

Matthew Farwell


You need to override hashCode as well. hashCode is essential to override when you override equals.

Note there were also a few things that didn't compile, so I edited a bit more:

class Test(val text:String){ // added val
  override def equals(obj:Any) = obj match {
    case t: Test => if (t.text == this.text) true else false
    case _ => false
  }
  override def toString = text
  override def hashCode = text.hashCode
}

val mutableSet:scala.collection.mutable.Set[Test] = scala.collection.mutable.Set.empty
mutableSet += new Test("test")
println(mutableSet)
println(mutableSet.contains(new Test("test")))

val immutableSet:scala.collection.immutable.Set[Test] = scala.collection.immutable.Set.empty
val immutableSet2 = immutableSet + new Test("test") // reassignment to val
println(immutableSet2)
println(immutableSet2.contains(new Test("test")))

I recommend reading http://www.artima.com/pins1ed/object-equality.html for a lot more insights on doing object equality. It's eye opening.

like image 42
huynhjl Avatar answered Nov 12 '22 12:11

huynhjl