I ran into some performance discrepancy between, almost identical, implementations of Scala and Java versions. I am seeing Java version that is 68% faster than Scala version. Any idea as to why this happens?
Java version:
public class Util {
public static Set < String > toBigramsJava(String s1) {
Set <String> nx = new HashSet <String> ();
for (int i = 0; i < s1.length() - 1; i++) {
char x1 = s1.charAt(i);
char x2 = s1.charAt(i + 1);
String tmp = "" + x1 + x2;
nx.add(tmp);
}
return nx;
}
}
Scala version:
object Util {
def toBigramsScala(str: String): scala.collection.mutable.Set[String] = {
val hash: scala.collection.mutable.Set[String] = scala.collection.mutable.HashSet[String]()
for (i <-0 to str.length - 2) {
val x1 = str.charAt(i)
val x2 = str.charAt(i + 1)
val tmp = "" + x1 + x2
hash.add(tmp)
}
return hash
}
}
Test Results:
scala> Util.time(for(i<-1 to 1000000) {Util.toBigramsScala("test test abc de")})
17:00:05.034 [info] Something took: 1985ms
Util.time(for(i<-1 to 1000000) {Util.toBigramsJava("test test abc de")})
17:01:51.597 [info] Something took: 623ms
System:
I ran this on Ubuntu 14.04, with 4 cores and 8Gig RAM. Java version 1.7.0_45, Scala version 2.10.2.
There is some more info on my blog.
I've got roughly the same results with this scala version
object Util {
def toBigramsScala(str: String) = {
val hash = scala.collection.mutable.Set.empty[String]
var i: Int = 0
while (i < str.length - 1) {
val x1 = str.charAt(i)
val x2 = str.charAt(i + 1)
val tmp = new StringBuilder().append(x1).append(x2).toString()
hash.add(tmp)
i += 1
}
hash
}
}
As I remember for loop in scala implemented as call to apply() method on Function0 which is megamorphic method call (expensive from JVM/JIT point of view). Plus possibly some string concatenation optimization made by javac.
I didn't check my assumptions with looking to generated byte code, but replacing for with while and string concatenation with StringBuilder made difference negligible.
Time for Java Version: 451 millis
Time for Scala Version: 589 millis
For-comprehensions are always slower then using a while
loop or tail recursion as explained here.
The other problem in your example is the concatenation of the String
s. Scala will use the scala.collection.mutable.StringBuilder
which has some performance issues (e.g. it will box your char
s to Char
instances) as mentioned in the other answers.
Changing the for-comprehension to a tail-recursive method and using java.lang.StringBuilder
you will get mostly the same results in both Scala and Java (on my machine Scala is actually a few milliseconds faster).
I've conducted a similar test.
Here are the classes:
Java
public class JavaApp {
public static void main(String[] args) {
String s1 = args[0];
java.util.Set <String> nx = new java.util.HashSet<>();
for (int i = 0; i < s1.length() - 1; i++) {
char x1 = s1.charAt(i);
char x2 = s1.charAt(i + 1);
String tmp = "" + x1 + x2;
nx.add(tmp);
}
System.out.println(nx.toString());
}
}
Scala
object ScalaApp {
def main(args:Array[String]): Unit = {
var s1 = args(0)
val hash: scala.collection.mutable.Set[String] = scala.collection.mutable.HashSet[String]()
for (i <-0 to s1.length - 2) {
val x1 = s1.charAt(i)
val x2 = s1.charAt(i + 1)
val tmp = "" + x1 + x2
hash.add(tmp)
}
println(hash.toString())
}
}
Compilers and runtime version
Javac
javac 1.8.0_20-ea
Java
java version "1.8.0_20-ea"
Scalac
Scala compiler version 2.11.0 -- Copyright 2002-2013, LAMP/EPFL
Scala
Scala code runner version 2.11.0 -- Copyright 2002-2013, LAMP/EPFL
Scala is also slower. Taking a look at Scala
version, it creates two anonymous classes.
One thing that might be taking some time as well is an auto boxing
on the char
variable in the for
loop.
44: iload_2
45: invokestatic #61 // Method scala/runtime/BoxesRunTime.boxToCharacter:(C)Ljava/lang/Character;
48: invokevirtual #55 // Method scala/collection/mutable/StringBuilder.append:(Ljava/lang/Object;)Lscala/collection/mutable/StringBuilder;
51: iload_3
52: invokestatic #61 // Method scala/runtime/BoxesRunTime.boxToCharacter:(C)Ljava/lang/Character;
55: invokevirtual #55 // Method scala/collection/mutable/StringBuilder.append:(Ljava/lang/Object;)Lscala/collection/mutable/StringBuilder;
But that doesn't explain it all.
There a few ways to further speed up the Scala code.
We extract the loop termination into a variable max, just in case the JIT would miss optimizing that.
object Util {
def toBigramsScala(str: String) = {
val hash = scala.collection.mutable.HashSet.empty[String]
val charArray = new Array[Char](2)
var i = 0
val max = str.length - 1
while (i < max) {
charArray(0) = str.charAt(i)
charArray(1) = str.charAt(i + 1)
hash.add(new String(charArray))
i += 1
}
hash
}
}
With those changes, I was able to get the same run time between the Java and Scala code. Surprisingly (at least in this example), the java.util.HashSet didn't give any performance gain over mutable.HashSet. In fairness, we can also apply all these optimizations to the Java code as well,
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With