Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Vector or MutableList / ListBuffer for performance

Apologies if this is a duplicate - I did a few searches and didn't quite find what I need.

We have a performance critical piece of our application that converts a Play 2.0 Enumerator (can be thought of as a Stream) of incoming data to a List (or similar). We will use the fold method on Enumerator and the question is what will be the most performant way to do it. (I will use Stream instead of Enumerator in the code, but the idea should be the same.)

val incoming: Stream[Int] = ???
val result: Seq[Int] = incoming.fold(Seq.empty)(_ + _)
val result2: Seq[Int] = incoming.fold(MutableList.empty(_ += _).toSeq

So the question is essentially, how does repeatedly appending to an immutable Vector compare to repeatedly appending to a mutable MutableList or ListBuffer in performance critical code? We've thrown out just List because we need O(1) appending (not prepending). But does the mutable data-structure buy us anything in terms of performance or garbage collection?

like image 631
3 revs Avatar asked Dec 27 '12 17:12

3 revs


1 Answers

You are probably best off using ArrayBuffer. On my machine, you get about the following number of appends per second:

preallocated Array[Int]    -- 830M
resized (x2) Array[Int]    -- 263M
Vector.newBuilder + result -- 185M
mutable.ArrayBuffer        -- 125M
mutable.ListBuffer         -- 100M
mutable.MutableList        --  71M
immutable.List + reverse   --  68M
immutable.Vector           --   8M

I assume you're not always just storing ints, and you want all the collections goodness without extra wrappings, so ArrayBuffer is the best-performing solution as long as you only need to append to one end. The lists support bidirectional addition and are comparable. Vector is horribly slow in comparison--only use it if you can take advantage of a lot of data sharing, or create it all in one swoop (see Vector.newBuilder result, which is fantastic; it's a great data structure for access, iteration, and creation and sparing updates, not updates-all-the-time).

like image 56
Rex Kerr Avatar answered Nov 15 '22 04:11

Rex Kerr