Which implementation from scala.collection.mutable
package should I take if I intend to do lots of by-index-deletions, like remove(i: Int)
, in a single-threaded environment? The most obvious choice, ListBuffer
, says that it may take linear time depending on buffer size. Is there some collection with log(n)
or even constant time for this operation?
Removal operators, including buf remove i
, are not part of Seq
, but it's actually part of Buffer
trait under scala.mutable
. (See Buffers)
See the first table on Performance Characteristics. I am guessing buf remove i
has the same characteristic as insert, which are linear for both ArrayBuffer
and ListBuffer
.
As documented in Array Buffers, they use arrays internally, and Link Buffers use linked lists (that's still O(n) for remove).
As an alternative, immutable Vector may give you an effective constant time.
Vectors are represented as trees with a high branching factor. Every tree node contains up to 32 elements of the vector or contains up to 32 other tree nodes. [...] So for all vectors of reasonable size, an element selection involves up to 5 primitive array selections. This is what we meant when we wrote that element access is "effectively constant time".
scala> import scala.collection.immutable._
import scala.collection.immutable._
scala> def remove[A](xs: Vector[A], i: Int) = (xs take i) ++ (xs drop (i + 1))
remove: [A](xs: scala.collection.immutable.Vector[A],i: Int)scala.collection.immutable.Vector[A]
scala> val foo = Vector(1, 2, 3, 4, 5)
foo: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)
scala> remove(foo, 2)
res0: scala.collection.immutable.Vector[Int] = Vector(1, 2, 4, 5)
Note, however, a high constant time with lots of overhead may not win a quick linear access until the data size is significantly large.
Depending on your exact use case, you may be able to use LinkedHashMap
from scala.collection.mutable
.
Although you cannot remove by index, you can remove by a unique key in constant time, and it maintains a deterministic ordering when you iterate.
scala> val foo = new scala.collection.mutable.LinkedHashMap[String,String]
foo: scala.collection.mutable.LinkedHashMap[String,String] = Map()
scala> foo += "A" -> "A"
res0: foo.type = Map((A,A))
scala> foo += "B" -> "B"
res1: foo.type = Map((A,A), (B,B))
scala> foo += "C" -> "C"
res2: foo.type = Map((A,A), (B,B), (C,C))
scala> foo -= "B"
res3: foo.type = Map((A,A), (C,C))
Java's ArrayList
effectively has constant time complexity if the last element is the one to be removed. Look at the following snippet copied from its source code,
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // clear to let GC do its work
As you can see, if numMoved
is equal to 0, remove
will not shift and copy the array at all. This in some scenarios can be quite useful. For example, if you do not care about the ordering that much, to remove an element, you can always swap it with the last element, and then delete the last element from the ArrayList
, which effectively makes the remove
operation all the way constant time. I was hoping ArrayBuffer
would do the same, unfortunately that is not the case.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With