I was curious to see how Java and Scala implement switches on strings:
class Java
{
public static int java(String s)
{
switch (s)
{
case "foo": return 1;
case "bar": return 2;
case "baz": return 3;
default: return 42;
}
}
}
object Scala {
def scala(s: String): Int = {
s match {
case "foo" => 1
case "bar" => 2
case "baz" => 3
case _ => 42
}
}
}
It seems like Java switches on the hashcode and then does a single string comparison:
0: aload_0
1: dup
2: astore_1
3: invokevirtual #16 // Method java/lang/String.hashCode:()I
6: lookupswitch { // 3
97299: 40
97307: 52
101574: 64
default: 82
}
40: aload_1
41: ldc #22 // String bar
43: invokevirtual #24 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
46: ifne 78
49: goto 82
52: aload_1
53: ldc #28 // String baz
55: invokevirtual #24 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
58: ifne 80
61: goto 82
64: aload_1
65: ldc #30 // String foo
67: invokevirtual #24 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
70: ifne 76
73: goto 82
76: iconst_1
77: ireturn
78: iconst_2
79: ireturn
80: iconst_3
81: ireturn
82: bipush 42
84: ireturn
In contrast, Scala seems to compare against all the cases:
0: aload_1
1: astore_2
2: ldc #16 // String foo
4: aload_2
5: invokevirtual #20 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
8: ifeq 16
11: iconst_1
12: istore_3
13: goto 47
16: ldc #22 // String bar
18: aload_2
19: invokevirtual #20 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
22: ifeq 30
25: iconst_2
26: istore_3
27: goto 47
30: ldc #24 // String baz
32: aload_2
33: invokevirtual #20 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
36: ifeq 44
39: iconst_3
40: istore_3
41: goto 47
44: bipush 42
46: istore_3
47: iload_3
48: ireturn
Is it possible to convince Scala to employ the hashcode trick? I would rather prefer an O(1) solution to an O(n) solution. In my real code, I need to compare against 33 possible keywords.
Switching on strings can be more costly in term of execution than switching on primitive data types. Therefore, it is good to switch on strings only in cases in which the controlling data is already in string form.
Therefore, it is good to switch on strings only in cases in which the controlling data is already in string form. The comparison perform between String objects in switch statements is case sensitive. You must use break statements in switch case.
String is the only non-integer type which can be used in switch statement. Important points: Switching on strings can be more costly in term of execution than switching on primitive data types. Therefore, it is good to switch on strings only in cases in which the controlling data is already in string form.
Using a string-based switch is an improvement over using the equivalent sequence of if/else statements. We now declare a string as a String class object only as depicted below:
Definitely it seems that this case is a lack of optimization from the Scala compiler. Sure, the match
construct is much (much much) powerful than the switch/case in Java, and it is a lot harder to optimize it, but it could detect these special cases in which a simple hash comparison would apply.
Also, I don't think this case will show many times in idiomatic Scala, because you always match with case classes that have some meaning apart from having different value.
I think the problem is that you're thinking about Scala from a Java point of view (I think you're also prematurely optimizing, but hey).
I would think that the solution you want is to instead memoize your mapping. You've got a function that maps from String -> Int, right? So do this:
class Memoize1[-T, +R](f: T => R) extends (T => R) {
import scala.collection.mutable
private[this] val vals = mutable.Map.empty[T, R]
def apply(x: T): R = {
if (vals.contains(x)) {
vals(x)
}
else {
val y = f(x)
vals += ((x, y))
y
}
}
}
object Memoize1 {
def apply[T, R](f: T => R) = new Memoize1(f)
}
(this memoizing code is taken from here.
Then you can memoize your code like this:
object Scala {
def scala(s: String): Int = {
s match {
case "foo" => 1
case "bar" => 2
case "baz" => 3
case _ => 42
}
}
val memoed = Memoize1(Scala.scala)
val n = memoed("foo")
}
Tada! Now you're doing hash value comparisons. Although I will add that most memoization examples (this one included) are toys and will not survive most use cases. Real-world memoization should include an upper limit to the amount you're willing to cache, and in the case of your code where you have a tiny number of possible valid cases and a huge number of invalid cases, I would consider making a general class that pre-builds the map and has a specialized lookup that says, "in my cache, you win, not in my cache, default." which can be done very easily by tweaking the memoizer to take a List
of input to precache and change the "not-in-cache" code to return a default.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With