When we have two lists a
and b
, how can one concatenate those two (order is not relevant) to a new list in an efficient way ?
I could not figure out from the Scala API, if a ::: b
and a ++ b
are efficient. Maybe I missed something.
1. Concatenation operator (+) for List Concatenation. The '+' operator can be used to concatenate two lists. It appends one list at the end of the other list and results in a new list as output.
The most conventional method to perform the list concatenation, the use of “+” operator can easily add the whole of one list behind the other list and hence perform the concatenation. List comprehension can also accomplish this task of list concatenation.
For large lists with one million elements, the runtime of the extend() method is 60% faster than the runtime of the append() method.
In Scala 2.9, the code for :::
(prepend to list) is as follows:
def :::[B >: A](prefix: List[B]): List[B] =
if (isEmpty) prefix
else (new ListBuffer[B] ++= prefix).prependToList(this)
whereas ++
is more generic, since it takes a CanBuildFrom
parameter, i.e. it can return a collection type different from List
:
override def ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[List[A], B, That]): That = {
val b = bf(this)
if (b.isInstanceOf[ListBuffer[_]]) (this ::: that.seq.toList).asInstanceOf[That]
else super.++(that)
}
So if your return type is List
, the two perform identical.
The ListBuffer
is a clever mechanism in that it can be used as a mutating builder, but eventually "consumed" by the toList
method. So what (new ListBuffer[B] ++= prefix).prependToList(this)
does, is first sequentially add all the elements in prefix
(in the example a
), taking O(|a|) time. It then calls prependToList
, which is a constant time operation (the receiver, or b
, does not need to be taken apart). Therefore, the overall time is O(|a|).
On the otherhand, as pst pointed out, we have reverse_:::
:
def reverse_:::[B >: A](prefix: List[B]): List[B] = {
var these: List[B] = this
var pres = prefix
while (!pres.isEmpty) {
these = pres.head :: these
pres = pres.tail
}
these
}
So with a reverse_::: b
, this again takes O(|a|), hence is no more or less efficient that the other two methods (although for small list sizes, you save the overhead of having an intermediate ListBuffer
creation).
In other words, if you have knowledge about the relative sizes of a
and b
, you should make sure that the prefix is the smaller of the two lists. If you do not have that knowledge, there is nothing you can do, because the size
operation on a List
takes O(N) :)
On the other hand, in a future Scala version you may see an improved Vector
concatenation algorithm, as demonstrated in this ScalaDays talk. It promises to solve the task in O(log N) time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With