Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scala list recursion performance

This question is about the way that Scala does pattern matching and recursion with lists and the performance thereof.

If I have a function that recurses over a list and I do it with matching on a cons, for example something like:

def myFunction(xs) = xs match { 
  case Nil => Nil
  case x :: xs => «something» myFunction(xs)
}

In Haskell:

myFunction [] = []
myFunction (x:xs) = «something» : myFunction xs

I'm using the same semantics as I would do with, for example, Haskell. I don't think there would be any question about the Haskell implementation as that's simply the way that you deal with lists. For a long list (I would be operating on a list with a few thousand nodes), Haskell wouldn't blink (I imagine though; I've never tried).

But from what I understand with Scala, the match statement would call the unapply extractor method to split the list around the cons and, to extend the example to a function that does nothing to the list:

def myFunction(xs) = xs match { 
  case Nil => Nil
  case x :: xs => x :: myFunction(xs)
}

In Haskell:

myFunction [] = []
myFunction (x:xs) = x : myFunction xs

it would call apply extractor method to cons it back together. For a long list I imagine this would be very expensive.

To illustrate, in my specific case I want to recurse over a list of characters and accumulate various things, where the input string is anything up to a few tens of kilobytes.

Will I really be calling constructors and extractors for each step of the recursion if I want to recurse over a long list? Or are there optimisations? Or better ways to do it? In this case I would need several accumulator variables and obviously I wouldn't just be recursing over a list doing nothing...

(Please excuse my Haskell, I've not written a line for two years.)

(And yes, I'm going for tail recursion.)

like image 665
Joe Avatar asked Sep 03 '09 12:09

Joe


1 Answers

First, Haskell is non-strict, so these function calls on the tail might never be evaluated at all. Scala, on the other hand, will compute all of the list before returning. A closer implementation to what happens in Haskell would be this:

def myFunction[T](l: List[T]): Stream[T] = l match {   
  case Nil => Stream.empty  
  case x :: xs => x #:: myFunction(xs)
}

That receives a List, which is strict, and returns a Stream which is non-strict.

Now, if you want to avoid pattern matching and extractors (though none is called in this particular case -- see below), you might just do this:

def myFunction[T](xs: List[T]): Stream[T] =
  if (xs.isEmpty) Stream.empty else xs.head #:: myFunction(xs.tail)

I just realized you intend to go for tail recursion. What you wrote is not tail-recursive, because you prepend x to the result of the recursion. When handling lists, you'll get tail recursion if you compute the results backwards and then invert:

def myFunction[T](xs: List[T]): List[T] = {
  def recursion(input: List[T], output: List[T]): List[T] = input match {
    case x :: xs => recursion(xs, x :: output)
    case Nil => output
  }
  recursion(xs, Nil).reverse
}

Last, let's decompile an example to see how it works:

class ListExample {
  def test(o: Any): Any = o match {
    case Nil => Nil
    case x :: xs => xs
    case _ => null
  }
}

Generates:

public class ListExample extends java.lang.Object implements scala.ScalaObject{
public ListExample();
  Code:
   0:   aload_0
   1:   invokespecial   #10; //Method java/lang/Object."<init>":()V
   4:   return

public java.lang.Object test(java.lang.Object);
  Code:
   0:   aload_1
   1:   astore_2
   2:   getstatic       #18; //Field scala/Nil$.MODULE$:Lscala/Nil$;
   5:   aload_2
   6:   invokestatic    #24; //Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z
   9:   ifeq    18
   12:  getstatic       #18; //Field scala/Nil$.MODULE$:Lscala/Nil$;
   15:  goto    38
   18:  aload_2
   19:  instanceof      #26; //class scala/$colon$colon
   22:  ifeq    35
   25:  aload_2
   26:  checkcast       #26; //class scala/$colon$colon
   29:  invokevirtual   #30; //Method scala/$colon$colon.tl$1:()Lscala/List;
   32:  goto    38
   35:  aconst_null
   36:  pop
   37:  aconst_null
   38:  areturn

public int $tag()   throws java.rmi.RemoteException;
  Code:
   0:   aload_0
   1:   invokestatic    #42; //Method scala/ScalaObject$class.$tag:(Lscala/ScalaObject;)I
   4:   ireturn

}

Decoding, it first calls the method equals on the passed parameter and the object Nil. Returns the latter if true. Otherwise, it calls instanceOf[::] on the object. If true, it casts the object to that, and invokes the method tl on it. Failing all that, loads the cosntant null and returns it.

So, you see, x :: xs is not calling any extractor.

As for accumulating, there's another pattern you might want to consider:

val list = List.fill(100)(scala.util.Random.nextInt)
case class Accumulator(negative: Int = 0, zero: Int = 0, positive: Int = 0)
val accumulator = list.foldLeft(Accumulator())( (acc, n) => 
  n match {
    case neg if neg < 0 => acc.copy(negative = acc.negative + 1)
    case zero if zero == 0 => acc.copy(zero = acc.zero + 1)
    case pos if pos > 0 => acc.copy(positive = acc.positive + 1)
  })

The default parameters and copy methods are a Scala 2.8 feature I used just to make the example simpler, but the point is using the foldLeft method when you want to accumulate things over a list.

like image 65
Daniel C. Sobral Avatar answered Oct 24 '22 23:10

Daniel C. Sobral