Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why Scala's foldLeft has lower performance than iterating with an index on strings?

I am comparing the performance of two atoi implementations. The first is iterating on the input string getting chars using charAt; the second is using foldLeft.

object Atoi {
  def withRandomAccess(str: String, baze: Int): Int = {
      def process(acc: Int, place: Int, str: String, index: Int): Int = 
        if (index >= 0) process(acc + value(str.charAt(index)) * place, place * baze, str, index-1) else acc
      process(0, 1, str, str.length - 1)
    }

  def withFoldLeft(str: String, base: Int): Int = (0/:str) (_ * base + value(_))

  def value(c: Char): Int = { /* omitted for clarity */ }

  def symbol(i: Int): Char = { /* omitted for clarity */ }
}

The foldLeft version is 2x to 4x slower (the complete benchmark code is here). I didn't expect this. Do you know why? Is Scala converting the string to a List before processing it? Do you have a hint on how to boost foldLeft performance on strings?

like image 608
nakhli Avatar asked Jul 14 '11 19:07

nakhli


1 Answers

The issue is nothing to do with inlining, it is to do with the boxing/unboxing of Chars that is taking place when you use the foldLeft.

You get foldLeft on String by an implicit conversion to a StringOps, which is not specialized. Each char in the String has to be boxed into a java.lang.Character in order to be passed into the Function2 (argument to foldLeft), then unboxed (much cheaper) to be passed into the value method within the function body, then boxed again to be fed into the next iteration of the fold.

The boxing involves the overhead of creating objects and subsequently garbage-collecting them.


In terms of avoiding boxing, there follows a brief and important point:

  • you shouldn't attempt to avoid boxing, with a probability of almost 1.

(That is to say, unless you have identified a specific and unacceptable performance degradation which can be attributed to boxing, then you should not worry about it.)

If you are sure that there is an issue which you need to address, avoid the collections and for-comprehensions (which use foreach and flatMap under hood). If you are using a loop, use while.

like image 160
oxbow_lakes Avatar answered Oct 31 '22 19:10

oxbow_lakes