Often I produce a collection on demand to save on instance data size. The consumer iterates through the collection probably only once, before its garbage collected. The consumer doesn't care about the order of the collection doesn't need to sort on it certainly doesn't need to mutate it, or any of its elements. What's the most efficient type safe collection in Scala? - an Array?
Later edit: It occurs to me that there are probably quite a few situations when I could use Sets. Is it good to use Sets when possible or just to use them when set functionality is really needed?
Yes, of all the collection data structures, arrays come with the least amount of overhead when you know their size in advance.
If you don't know the size ahead of time, I would still pick an ArrayBuffer*. The algorithm used to expand the underlying array when it runs out of space is as efficient as it gets.
Do not* use a (linked) List or a Stream because these classes involve one heap allocation per element. Modern JVM garbage collectors are good, but they don't work for free.
*: But see @user unknown's comment on the question for links to some micro-benchmarks. The current ArrayBuffer
implementation might be suboptimal.
Also have a look at .view
. Often you don't need to actually store intermediate results. Instead you can use .map
, .filter
and others to build up a "description" of a collection. The operations (map, filter, etc.) will only be performed when you iterate over the collection, often in O(1)
space. The downside is, that these views will be re-computed every time they're queried. (though this might still be more efficient with simple filters and huge underlying collections)
Also, be extra careful with views on mutable data structures. Views don't capture the state of the underlying data structure. When it changes, so does the view. Views on immutable data structures, however, behave very nicely. Finally, views obviously contain a reference to the underlying data structure, meaning it won't be garbage collected while your program holds onto the view.
(Updated) Vectors seem to strike a good balance between storage efficiency versus flexibility, especially for large sequences.
Do you need to store the elements? Can't you compute them on-demand? If you can compute the values on-demand instead of storing them, you can create a Traversable
or an Iterable
that will do the job with barely any memory spent (no memory spent for Traversable
, except for the class itself).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With