Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which Scala features have poor performance

I was wandering lately: as Scala is run on JVM, and latter is optimized for some types of operations, are there features whose implementation is really inefficient on JVM and which use therefore should be discouraged? Could you also explain why they are inefficient?

The first candidate would be functional programming features - as I know, functions are special classes with applymethod, which obviously creates additional overhead compared to languages where functions are just blocks of code.

like image 484
zbstof Avatar asked Apr 25 '13 08:04

zbstof


1 Answers

Performance tuning is a deep and complex issue, but three things come immediately to mind.

Scala collections are good for expressive power, but not for performance.

Consider:

(1 to 20).map(x => x*x).sum

val a = new Array[Int](20)
var i = 0
while (i < 20) { a(i) = i+1; i += 1 }  // (1 to 20)
i = 0
while (i < 20) { a(i) = a(i)*a(i); i += 1 }   // map(x => x*x)
var s = 0
i = 0
while (i < 20) { s += a(i); i += 1 }  // sum
s

The first is amazingly more compact. The second is 16x faster. Math on integers is really fast; boxing and unboxing is not. The generic collections code is, well, generic, and relies on boxing.

Function2 is only specialized on Int, Long, and Double arguments.

Anything other operation on primitives will require boxing. Beware!

Suppose you want to have a function where you can toggle a capability--maybe you want to capitalize letters or not. You try:

def doOdd(a: Array[Char], f: (Char, Boolean) => Char) = {
  var i = 0
  while (i<a.length) { a(i) = f(a(i), (i&1)==1); i += 1 }
  a
}

And then you

val text = "The quick brown fox jumps over the lazy dog".toArray
val f = (c: Char, b: Boolean) => if (b) c.toUpper else c.toLower

scala> println( doOdd(text, f).mkString )
tHe qUiCk bRoWn fOx jUmPs oVeR ThE LaZy dOg

Okay, great! Except what if we

trait Func_CB_C { def apply(c: Char, b: Boolean): Char }
val g = new Func_CB_C {
  def apply(c: Char, b: Boolean) = if (b) c.toUpper else c.toLower
}
def doOdd2(a: Array[Char], f: Func_CB_C) = {
  var i = 0
  while (i<a.length) { a(i) = f(a(i), (i&1)==1); i += 1 }
  a
}

instead? Suddenly it's 3x faster. But if it's (Int, Int) => Int, (or any other permutation of Int/Long/Double arguments and Unit/Boolean/Int/Long/Float/Double return values), rolling your own is unnecessary--it's specialized and works at maximum speed.

Just because you can parallelize easily doesn't mean it's a good idea.

Scala's parallel collections will just try to run your code in parallel. It's up to you to make sure there's enough work so that running in parallel is a smart thing to do. There's a lot of overhead in setting up threads and collecting results. Take, for example,

val v = (1 to 1000).to[Vector]
v.map(x => x*(x+1))

versus

val u = (1 to 1000).to[Vector].par
u.map(x => x*(x+1))

The second map is faster, right, because it's parallel?

Hardly! It's 10x slower because of overhead (on my machine; results can vary substantially)

Summary

These are just a few of very many issues that you'll normally never have to worry about except in the most performance-critical parts of your code. There are oodles more, which eventually you'll encounter, but as I mentioned in my comment, it would take a book to cover a decent fraction of them. Note that there are oodles of performance issues in any language, and optimization is often tricky. Save your effort for where it matters!

like image 60
Rex Kerr Avatar answered Oct 31 '22 16:10

Rex Kerr