Very often the performance bottleneck when using hash maps is the equals
method. equals
can be very expensive for deep data structures.
Note, that the following is about immutable hash maps. Thus, at least you will never remove a key. I think adding keys should be ok.
get
Suppose you query a hash map, being certain it contains the queried key. Then if there is no collision for the given key, the found single entry can be returned just based on the hash hit, because it has to be the queried object.
This can avoid calling equals
in get
in most cases (when there is no collision).
get
operation?Btw, I'm open for suggestions of a better subject line.
How is this concept called?
Identity map. It won't call equals()
when looking for elements and just use identity (i.e. ==
).
The correct solution isn't to "fix" the map but to use only keys which use the default Object.equals()
:
public boolean equals( Object other ) { return this == other; }
The problem is that finding elements in this map can be problematic unless all keys are singletons. So you can't use String
, for example, because Java doesn't guarantee that all string instances are interned. The same is true for Integer
: Instances < -128 and > 127 will be different.
But if you use your own optimized implementation for keys, you can solve this.
Such a data structure doesn't exist to my knowledge exactly because it's unsafe - there is no way to reliably encode your particular condition.
But, alas, the Scala Collection Library allows one to implement new rich data structures very quickly with a small amount of fresh code. Here's an implementation of your request:
class ParticularHashMap[A, +B] private(buckets: Vector[List[(A, B)]]) extends Map[A, B]{
def this() = this(Vector.fill(ParticularHashMap.BucketsNo)(List.empty))
def get(key: A) = {
val bucket = buckets(bucketIndex(key))
bucket match {
case List((_, v)) => Some(v) //YOUR SPECIAL OPTIMIZATION !
case list => list.find(key == _._1).map(_._2)
}
}
def iterator = buckets.flatten.iterator
def -(key: A) = mkWithUpdatedBucket(key, _ filterNot (_._1 == key))
def +[B1 >: B](kv: (A, B1)) = mkWithUpdatedBucket(kv._1, insert(kv._1, kv._2.asInstanceOf[B], _))
//if you're wondering why it's Bucket[A, Any] and not Bucket[A, B] it's because of the covariance of B
private def mkWithUpdatedBucket(key: A, f: Bucket[A, Any] => Bucket[A, Any]) = {
val index = bucketIndex(key)
val newBucket = f(buckets(index))
(new ParticularHashMap[A, Any](buckets.updated(index, newBucket))).asInstanceOf[ParticularHashMap[A, B]]
}
private def insert(k: A, v: Any, bucket: List[(A, Any)]): Bucket[A, Any] = bucket match {
case List() => List((k, v))
case (k1, v1) :: kvs if k == k1 => (k, v) :: kvs
case (k1, v1) :: kvs => (k1, v1) :: insert(k, v, kvs)
}
private def bucketIndex(key: A) = Math.abs(key.hashCode()) % ParticularHashMap.BucketsNo
}
object ParticularHashMap {
private val BucketsNo = 256
private type Bucket[+A, +B] = List[(A, B)]
def apply[A, B](kvs: (A, B)*) = new ParticularHashMap[A, B] ++ kvs
}
Please note that this is a naive implementation, but you can improve it too suit your needs.
It will work as expected:
val myMap = ParticularHashMap("hello" -> 2, "world" -> 4)
println(myMap.get("hello")) >> Some(2)
println(myMap.get("world")) >> Some(4)
println(myMap.get("world1")) >> None
But, of course, beware that if you don't respect your contract, bad things will happen. Below is an especially crafted example to show the downside:
val evilMap = ParticularHashMap(1 -> 2)
println(evilMap.get(1)) >> Some(2)
println(evilMap.get(257)) >> Some(2) //BUT 257 IS NOT IN THE MAP !!!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With