I made a change to some code and it got 4.5x faster. I'm wondering why. It used to be essentially:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] = queue match {
case Queue((thing, stuff), _*) => doThing(queue.tail)
case _ => queue
}
and I changed it to this to get a huge speed boost:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] = queue.headOption match {
case Some((thing, stuff)) => doThing(queue.tail)
case _ => queue
}
What does _* do and why is it so expensive compared to headOption?
My guess after running scalac with -Xprint:all is that at the end of patmat in the queue match { case Queue((thing, stuff), _*) => doThing(queue.tail) } example I see the following methods being called (edited for brevity):
val o9 = scala.collection.immutable.Queue.unapplySeq[(String, String)](x1);
if (o9.isEmpty.unary_!)
if (o9.get.!=(null).&&(o9.get.lengthCompare(1).>=(0)))
{
val p2: (String, String) = o9.get.apply(0);
val p3: Seq[(String, String)] = o9.get.drop(1);
So lengthCompare compare the length of the collection in a possibly optimized way. For Queue, it creates an iterator and iterates one time. So that should be somewhat fast. On the other hand drop(1) also constructs an iterator, skips one element and adds the rest of the elements to the result queue, so that would be linear in the size of the collection.
The headOption example is more straightforward, it checks if the list is empty (two comparisons), and if not returns a Some(head), which then just has its _1 and _2 assigned to thing and stuff. So no iterators are created and nothing linear in the length of the collection.
There should be no significant difference between your code samples.
case Queue((thing, stuff), _*) is actually translated by compiler to call of head (apply(0)) method. You could use scalac -Xprint:patmat to investigate this:
<synthetic> val p2: (String, String) = o9.get.apply(0);
if (p2.ne(null))
matchEnd6(doThing(queue.tail))
The cost of head and cost of headOption are almost the same.
Methods head, tail and dequeue could cause reverce on internal List of Queue (with cost O(n)). In both you code samples there will be at most 2 reverce calls.
You should use dequeue like this to get at most a single reverce call:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] =
if (queue.isEmpty) queue
else queue.dequeue match {
case (e, q) => doThing(q)
}
You could also replace (thing, stuff) with _. In this case compiler will generate only call of lengthCompare without head or tail:
if (o9.get != null && o9.get.lengthCompare(1) >= 0)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With