I made a change to some code and it got 4.5x faster. I'm wondering why. It used to be essentially:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] = queue match {
case Queue((thing, stuff), _*) => doThing(queue.tail)
case _ => queue
}
and I changed it to this to get a huge speed boost:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] = queue.headOption match {
case Some((thing, stuff)) => doThing(queue.tail)
case _ => queue
}
What does _*
do and why is it so expensive compared to headOption?
My guess after running scalac with -Xprint:all
is that at the end of patmat in the queue match { case Queue((thing, stuff), _*) => doThing(queue.tail) }
example I see the following methods being called (edited for brevity):
val o9 = scala.collection.immutable.Queue.unapplySeq[(String, String)](x1);
if (o9.isEmpty.unary_!)
if (o9.get.!=(null).&&(o9.get.lengthCompare(1).>=(0)))
{
val p2: (String, String) = o9.get.apply(0);
val p3: Seq[(String, String)] = o9.get.drop(1);
So lengthCompare
compare the length of the collection in a possibly optimized way. For Queue
, it creates an iterator and iterates one time. So that should be somewhat fast. On the other hand drop(1)
also constructs an iterator, skips one element and adds the rest of the elements to the result queue, so that would be linear in the size of the collection.
The headOption
example is more straightforward, it checks if the list is empty (two comparisons), and if not returns a Some(head)
, which then just has its _1
and _2
assigned to thing
and stuff
. So no iterators are created and nothing linear in the length of the collection.
There should be no significant difference between your code samples.
case Queue((thing, stuff), _*)
is actually translated by compiler to call of head
(apply(0)
) method. You could use scalac -Xprint:patmat
to investigate this:
<synthetic> val p2: (String, String) = o9.get.apply(0);
if (p2.ne(null))
matchEnd6(doThing(queue.tail))
The cost of head
and cost of headOption
are almost the same.
Methods head
, tail
and dequeue
could cause reverce
on internal List
of Queue
(with cost O(n)
). In both you code samples there will be at most 2 reverce
calls.
You should use dequeue
like this to get at most a single reverce
call:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] =
if (queue.isEmpty) queue
else queue.dequeue match {
case (e, q) => doThing(q)
}
You could also replace (thing, stuff)
with _
. In this case compiler will generate only call of lengthCompare
without head
or tail
:
if (o9.get != null && o9.get.lengthCompare(1) >= 0)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With