Automatically Hash Consed Case Classes

I'm looking for a way to have classes that behave just like case classes, but that are automatically hash consed.

One way to achieve this for integer lists would be:

import scala.collection.mutable.{Map=>MutableMap}

sealed abstract class List
class Cons(val head: Int, val tail: List) extends List
case object Nil extends List

object Cons {
  val cache : MutableMap[(Int,List),Cons] = MutableMap.empty
  // Return the cached instance if one exists; otherwise create and cache it.
  def apply(head: Int, tail: List): Cons =
    cache.getOrElseUpdate((head, tail), new Cons(head, tail))
  // Extractor so that Cons can be used in pattern matches like a case class.
  def unapply(lst: List): Option[(Int, List)] = lst match {
    case c: Cons => Some((c.head, c.tail))
    case _       => None
  }
}

And, for instance, while

scala> (5 :: 4 :: scala.Nil) eq (5 :: 4 :: scala.Nil)
resN: Boolean = false

we get

scala> Cons(5, Cons(4, Nil)) eq Cons(5, Cons(4, Nil))
resN: Boolean = true

Now what I'm looking for is a generic way to achieve this (or something very similar). Ideally, I don't want to have to type much more than:

class Cons(val head : Int, val tail : List) extends List with HashConsed2[Int,List]

(or similar). Can someone come up with some type system voodoo to help me, or will I have to wait for the macro language to be available?

Philippe asked Dec 31 '11 13:12



2 Answers

You can define a few InternableN[Arg1, Arg2, ..., ResultType] traits for N being the number of arguments to apply(): Internable1[A,Z], Internable2[A,B,Z], etc. These traits define the cache itself, the intern() method and the apply method we want to hijack.

We also need a trait (or an abstract class) that guarantees to the InternableN traits that there is indeed an apply method to override; let's call it ApplyableN.

import scala.collection.mutable.WeakHashMap

trait Applyable1[A, Z] {
  def apply(a: A): Z
}
trait Internable1[A, Z] extends Applyable1[A, Z] {
  private[this] val cache = WeakHashMap[(A), Z]()
  private[this] def intern(args: (A))(builder: => Z) = {
    cache.getOrElse(args, {
      val newObj = builder
      cache(args) = newObj
      newObj
    })
  }
  abstract override def apply(arg: A) = {
    println("Internable1: hijacking apply")
    intern(arg) { super.apply(arg) }
  }
}

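The companion object of your class has to extend a class that implements ApplyableN concretely and mix in InternableN. Defining apply directly in the companion object would not work, because the abstract override in InternableN has to be stacked on top of a concrete implementation that comes earlier in the linearization.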

// Class with one apply argument
abstract class SomeClassCompanion extends Applyable1[Int, SomeClass] {
  def apply(value: Int): SomeClass = {
    println("original apply")
    new SomeClass(value)
  }
}
class SomeClass(val value: Int)
object SomeClass extends SomeClassCompanion with Internable1[Int, SomeClass]

One good thing about this is that the original apply need not be modified to cater for interning. It only creates instances and is only called when they need to be created.

The whole thing can (and should) also be defined for classes with more than one argument. For the two-argument case:

trait Applyable2[A, B, Z] {
  def apply(a: A, b: B): Z
}
trait Internable2[A, B, Z] extends Applyable2[A, B, Z] {
  private[this] val cache = WeakHashMap[(A, B), Z]()
  private[this] def intern(args: (A, B))(builder: => Z) = {
    cache.getOrElse(args, {
      val newObj = builder
      cache(args) = newObj
      newObj
    })
  }
  abstract override def apply(a: A, b: B) = {
    println("Internable2: hijacking apply")
    intern((a, b)) { super.apply(a, b) }
  }
}

// Class with two apply arguments
abstract class AnotherClassCompanion extends Applyable2[String, String, AnotherClass] {
  def apply(one: String, two: String): AnotherClass = {
    println("original apply")
    new AnotherClass(one, two)
  }
}
class AnotherClass(val one: String, val two: String)
object AnotherClass extends AnotherClassCompanion with Internable2[String, String, AnotherClass]

The interaction shows that the Internables' apply method executes prior to the original apply() which gets executed only if needed.

scala> import SomeClass._
import SomeClass._

scala> SomeClass(1)
Internable1: hijacking apply
original apply
res0: SomeClass = SomeClass@2e239525

scala> import AnotherClass._
import AnotherClass._

scala> AnotherClass("earthling", "greetings")
Internable2: hijacking apply
original apply
res1: AnotherClass = AnotherClass@329b5c95

scala> AnotherClass("earthling", "greetings")
Internable2: hijacking apply
res2: AnotherClass = AnotherClass@329b5c95
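
As a sketch of how the same stacking could be applied to the integer list from the question: the names IntList, Empty and ConsCompanion are hypothetical (chosen to avoid shadowing Scala's own List and Nil), the println calls are dropped, and a plain HashMap stands in for the WeakHashMap, so entries are never evicted here.

```scala
import scala.collection.mutable

trait Applyable2[A, B, Z] {
  def apply(a: A, b: B): Z
}

// Same stacking idea as Internable2 above, without the println calls.
// Assumption: a plain HashMap, so cached entries are never evicted.
trait Internable2[A, B, Z] extends Applyable2[A, B, Z] {
  private val cache = mutable.HashMap.empty[(A, B), Z]
  abstract override def apply(a: A, b: B): Z =
    cache.getOrElseUpdate((a, b), super.apply(a, b))
}

// IntList and Empty are hypothetical names chosen for this sketch.
sealed abstract class IntList
case object Empty extends IntList
final class Cons(val head: Int, val tail: IntList) extends IntList

// The "original" apply only creates instances; it knows nothing about caching.
abstract class ConsCompanion extends Applyable2[Int, IntList, Cons] {
  def apply(head: Int, tail: IntList): Cons = new Cons(head, tail)
}

object Cons extends ConsCompanion with Internable2[Int, IntList, Cons] {
  // Extractor so that Cons can still be used in pattern matches.
  def unapply(c: Cons): Some[(Int, IntList)] = Some((c.head, c.tail))
}
```

With these definitions, Cons(5, Cons(4, Empty)) eq Cons(5, Cons(4, Empty)) should hold: Cons keeps reference equality, and the tuple key compares the tail by reference, which is enough because every tail is itself interned.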

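I chose a WeakHashMap so that the interning cache does not keep entries alive indefinitely. Note that a WeakHashMap holds only its keys weakly (values are held strongly), so an entry becomes eligible for removal once its key, here the argument or argument tuple, is no longer referenced elsewhere.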
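
One alternative is to hold the cached instances themselves behind weak references, so that the cache never pins an instance that nothing else uses. The following is only a minimal sketch under that assumption: WeakInternPool and its intern signature are hypothetical names, not part of the code above, and keys stay strongly referenced, so stale entries are replaced on the next lookup but never purged.

```scala
import java.lang.ref.WeakReference
import scala.collection.mutable

// WeakInternPool is a hypothetical helper, not part of the answer's code.
final class WeakInternPool[K, V <: AnyRef] {
  // Keys are held strongly; only the cached values are weakly referenced.
  private val cache = mutable.HashMap.empty[K, WeakReference[V]]

  def intern(key: K)(build: => V): V = synchronized {
    // Option(ref.get) turns a cleared reference (null) into None.
    cache.get(key).flatMap(ref => Option(ref.get)) match {
      case Some(live) => live
      case None =>
        val fresh = build
        cache(key) = new WeakReference(fresh)
        fresh
    }
  }
}
```

An InternableN trait could delegate to such a pool instead of a WeakHashMap, for example pool.intern((a, b)) { super.apply(a, b) }; whether that trade-off is worthwhile depends on how long keys and instances are expected to live.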

The code is also available as a GitHub gist.

vlfig answered Sep 22 '22 02:09



Maybe a little hacky, but you could try defining your own intern() method, like Java's String has:

import scala.collection.mutable.{Map=>MutableMap}

object HashConsed {
  val cache: MutableMap[(Class[_],Int), HashConsed] = MutableMap.empty
}

trait HashConsed {
  def intern(): HashConsed = 
    HashConsed.cache.getOrElse((getClass, hashCode), {
      HashConsed.cache((getClass, hashCode)) = this
      this
    })
}

case class Foo(bar: Int, baz: String) extends HashConsed

val foo1 = Foo(1, "one").intern()
val foo2 = Foo(1, "one").intern()

println(foo1 == foo2) // true
println(foo1 eq foo2) // true
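
One caveat with keying the cache on (getClass, hashCode): two distinct values of the same class whose hash codes collide would be interned to the same instance (on the JVM, for example, the strings "Aa" and "BB" share a hashCode, so values built from such fields can collide). A possible variant, sketched here with a hypothetical helper named canonical, keys the cache on the object itself and relies on the structural equals and hashCode that case classes already provide:

```scala
import scala.collection.mutable

object HashConsed {
  // Keyed on the value itself, so lookups use equals as well as hashCode.
  private val cache = mutable.HashMap.empty[HashConsed, HashConsed]

  // canonical is a hypothetical helper name introduced for this sketch.
  def canonical(x: HashConsed): HashConsed =
    cache.synchronized(cache.getOrElseUpdate(x, x))
}

trait HashConsed {
  def intern(): HashConsed = HashConsed.canonical(this)
}

final case class Foo(bar: Int, baz: String) extends HashConsed
```

As in the version above, this only works for classes with structural equality, and the cache holds strong references, so interned values are never released.
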
earldouglas answered Sep 22 '22 02:09
