Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scala for comprehension performance

Why is

for (
  a <- 1 to 1000;
  b <- 1 to 1000 - a;
  c <- 1 to 1000 - a - b;
  if (a * a + b * b == c * c && a + b + c == 1000)
) println((a, b, c, a * b * c))

266 ms

slower then:

for (a <- 1 to 1000)
  for (b <- 1 to 1000 - a)
    for (c <- 1 to 1000 - a - b)
      if (a * a + b * b == c * c)
        if (a + b + c == 1000)
          println((a, b, c, a * b * c))

62 ms

If I understand correct this should be the same?


Solution after processing answers:

for (
  a <- 1 to 1000;
  b <- 1 to (1000 - a)
) {
  val c = (1000 - a - b)
  if (a * a + b * b == c * c)
    println((a, b, c, a * b * c))
}

9 ms

like image 645
Jeff Avatar asked Feb 28 '13 13:02

Jeff


2 Answers

Your understanding is wrong.

This is what happens when the condition is in the loop body:

// this
for(x <- coll) if(condition) doSomething
// will translate to
coll.foreach{ x => if(condition) doSomething }

As opposed to when the condition is in the generator itself:

// this
for(x <- coll if(condition)) dosomething
// will translate to
coll.withFilter(x => condition).foreach{ x => dosomething }

You can look into The Scala Language Specification 6.16 for more details.

like image 92
Eastsun Avatar answered Oct 01 '22 01:10

Eastsun


You may want to check this presentation (slides 13-15) for details on how for loops are translated internally.

The main difference of your examples are:

  • condition in for loop body (2. example)
  • condition within the generator (1. example)

The latter, also referred to as for loop filtering comes with a performance drawback by design. To extremely simplify what is happening: Within withFilter (which is the first step of the translation) an anonymous new function of type Function2[Object, Boolean] is created (which is used to evaluate the condition). The parameter that is passed to its apply function must be boxed, since it is defined based on Object. This boxing/unboxing is much slower than evaluating the if condition directly within the for loop body, which allows to access variables directly.

like image 29
bluenote10 Avatar answered Oct 01 '22 01:10

bluenote10