Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Switching on Strings

I was curious to see how Java and Scala implement switches on strings:

class Java
{
    public static int java(String s)
    {
        switch (s)
        {
        case "foo": return 1;
        case "bar": return 2;
        case "baz": return 3;
        default: return 42;
        }
    }
}
object Scala {
  def scala(s: String): Int = {
    s match {
      case "foo" => 1
      case "bar" => 2
      case "baz" => 3
      case _ => 42
    }
  }
}

It seems like Java switches on the hashcode and then does a single string comparison:

 0: aload_0       
 1: dup           
 2: astore_1      
 3: invokevirtual #16    // Method java/lang/String.hashCode:()I
 6: lookupswitch  { // 3
           97299: 40
           97307: 52
          101574: 64
         default: 82
    }
40: aload_1       
41: ldc           #22    // String bar
43: invokevirtual #24    // Method java/lang/String.equals:(Ljava/lang/Object;)Z
46: ifne          78
49: goto          82
52: aload_1       
53: ldc           #28    // String baz
55: invokevirtual #24    // Method java/lang/String.equals:(Ljava/lang/Object;)Z
58: ifne          80
61: goto          82
64: aload_1       
65: ldc           #30    // String foo
67: invokevirtual #24    // Method java/lang/String.equals:(Ljava/lang/Object;)Z
70: ifne          76
73: goto          82
76: iconst_1      
77: ireturn       
78: iconst_2      
79: ireturn       
80: iconst_3      
81: ireturn       
82: bipush        42
84: ireturn       

In contrast, Scala seems to compare against all the cases:

 0: aload_1       
 1: astore_2      
 2: ldc           #16    // String foo
 4: aload_2       
 5: invokevirtual #20    // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
 8: ifeq          16
11: iconst_1      
12: istore_3      
13: goto          47
16: ldc           #22    // String bar
18: aload_2       
19: invokevirtual #20    // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
22: ifeq          30
25: iconst_2      
26: istore_3      
27: goto          47
30: ldc           #24    // String baz
32: aload_2       
33: invokevirtual #20    // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
36: ifeq          44
39: iconst_3      
40: istore_3      
41: goto          47
44: bipush        42
46: istore_3      
47: iload_3       
48: ireturn       

Is it possible to convince Scala to employ the hashcode trick? I would rather prefer an O(1) solution to an O(n) solution. In my real code, I need to compare against 33 possible keywords.

like image 249
fredoverflow Avatar asked Mar 30 '15 20:03

fredoverflow


People also ask

When should I switch on strings?

Switching on strings can be more costly in term of execution than switching on primitive data types. Therefore, it is good to switch on strings only in cases in which the controlling data is already in string form.

Is it good to switch on strings in JavaScript?

Therefore, it is good to switch on strings only in cases in which the controlling data is already in string form. The comparison perform between String objects in switch statements is case sensitive. You must use break statements in switch case.

What is the use of string in switch statement?

String is the only non-integer type which can be used in switch statement. Important points: Switching on strings can be more costly in term of execution than switching on primitive data types. Therefore, it is good to switch on strings only in cases in which the controlling data is already in string form.

What is a string-based switch?

Using a string-based switch is an improvement over using the equivalent sequence of if/else statements. We now declare a string as a String class object only as depicted below:


2 Answers

Definitely it seems that this case is a lack of optimization from the Scala compiler. Sure, the match construct is much (much much) powerful than the switch/case in Java, and it is a lot harder to optimize it, but it could detect these special cases in which a simple hash comparison would apply.

Also, I don't think this case will show many times in idiomatic Scala, because you always match with case classes that have some meaning apart from having different value.

like image 174
Diego Sevilla Avatar answered Oct 08 '22 18:10

Diego Sevilla


I think the problem is that you're thinking about Scala from a Java point of view (I think you're also prematurely optimizing, but hey).

I would think that the solution you want is to instead memoize your mapping. You've got a function that maps from String -> Int, right? So do this:

class Memoize1[-T, +R](f: T => R) extends (T => R) {
  import scala.collection.mutable
  private[this] val vals = mutable.Map.empty[T, R]

  def apply(x: T): R = {
    if (vals.contains(x)) {
      vals(x)
    }
    else {
      val y = f(x)
      vals += ((x, y))
      y
    }
  }
}

object Memoize1 {
  def apply[T, R](f: T => R) = new Memoize1(f)
}

(this memoizing code is taken from here.

Then you can memoize your code like this:

object Scala {
  def scala(s: String): Int = {
    s match {
      case "foo" => 1
      case "bar" => 2
      case "baz" => 3
      case _ => 42
    }
  }

  val memoed = Memoize1(Scala.scala)

  val n = memoed("foo")
}

Tada! Now you're doing hash value comparisons. Although I will add that most memoization examples (this one included) are toys and will not survive most use cases. Real-world memoization should include an upper limit to the amount you're willing to cache, and in the case of your code where you have a tiny number of possible valid cases and a huge number of invalid cases, I would consider making a general class that pre-builds the map and has a specialized lookup that says, "in my cache, you win, not in my cache, default." which can be done very easily by tweaking the memoizer to take a List of input to precache and change the "not-in-cache" code to return a default.

like image 28
plinth Avatar answered Oct 08 '22 17:10

plinth