Is it possible to sort an ArrayBuffer, or other mutable Scala collection, in place? I see that ArrayBuffer.sorted (and sortBy) returns a new collection, and Sorting.quicksort does sort an Array in place but doesn't work on ArrayBuffers.
The reason I ask is that I'm using combineByKey in Spark to build collections of scored objects that are limited in size (like a "top ten" list by key). If I merge in a new object and the collection is already at capacity, I need to drop the lowest-scored object. I could use a sorted collection like a PriorityQueue or SortedSet, but I don't need to keep the collections sorted all the time, only in the case when a collection fills up.
So is there some way to sort an ArrayBuffer or ListBuffer in place? Or is there some other collection that supports appending and sorting in place? I'm sure there's a better way to do this, but I'm new to Scala.
You can use Java's sorting utilities.
Here is an example:
val myArray = Array(1,12,5,6)
java.util.Arrays.sort(myArray)
At the REPL:
> myArray
res3: Array[Int] = Array(1, 5, 6, 12)
If what you have is a Scala ArrayBuffer
then call toArray
to convert it to an Array.
Of course, the toArray
on ArrayBuffer
induces the cost of coping the entire Buffer again. If this is costly, check if you can get your initial results in an Array
instead of an ArrayBuffer
. If the results are of fixed length and unlikely to grow, then you don't rally need the dynamic expansion features of ArrayBuffer
.
There are presently no facilities for sorting collections in place. That said, if you expect to have to do sorting extremely rarely, you could investigate supporting both separately e.g. as Either[PriorityQueue[A], ArrayBuffer[A]]
; or if you expect sorting to be fairly common you should use a data structure where you don't pay such a penalty each time you add an element--meaning just use the SortedSet
or PriorityQueue
. Otherwise you'll get slow really fast. (n^2 log n
gets big quickly, which is what you get if you do a full sort each time you add a new element.)
You can use Scala's JavaConverters
to delegate to Java's Arrays.sort
with 1 line of code.
Assume you have instances of Foo
in a mutable buffer that you want to sort in place with the comparator fooComparator
.
import scala.collection.mutable
import scala.collection.JavaConverters._
…
val buffer = mutable.ArrayBuffer[Foo]()
…
buffer.asJava.sort(fooComparator) // sort "in place" (actually hides 1 copy)
However for extreme performance it seems that ArrayBuffer
just can't be used and the plain fixed size Array
is the way to go. The good thing is that JavaConverters.asJava
does not copy the items. However Java's List.sort
method internally copies the items to an Array
and calls Arrays.sort
. (It then assigns the sorted items back to original collection)
Perhaps the "full solution" would be to define your own version of Scala's ArrayBuffer
that exposes the underlying array for sorting. Implementing your own collection types that can do all the same things as the original, plus your own tricks in Scala is usually easy due to the way how Scala's collection library is set up.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With