In Scala Collection documentation, there is some clue to this question:
Trait Seq has two subtraits LinearSeq, and IndexedSeq. These do not add any new operations, but each offers different performance characteristics: A linear sequence has efficient head and tail operations, whereas an indexed sequence has efficient apply, length, and (if mutable) update operations.
But this does not address me when to use IndexedSeq
instead of Seq
?
I need some real example of IndexedSeq
or LinearSeq
where these collections do better than Seq
.
Companion object IndexedSeqA base trait for indexed sequences. Indexed sequences support constant-time or near constant-time element access and length computation. They are defined in terms of abstract methods apply for indexing and length .
A Seq is an Iterable that has a defined order of elements. Sequences provide a method apply() for indexing, ranging from 0 up to the length of the sequence. Seq has many subclasses including Queue, Range, List, Stack, and LinkedList. A List is a Seq that is implemented as an immutable linked list.
Scala Seq is a trait to represent immutable sequences. This structure provides index based access and various utility methods to find elements, their occurences and subsequences. A Seq maintains the insertion order.
Wait, What?! Scala's default Seq class is mutable. Yes, you read that right.
Seq
is the super-trait, so it's more generic, it has the characteristics common to all sequences, both Linear and Indexed.
If you're wondering what kind of sequence is created by the Seq.apply
method in the companion object of Seq, we can have a look at the implementation.
Keep in mind that if you use Seq.apply, you're implying that you just need a Seq and that your code doesn't care if it's Linear or Indexed
The tl;dr answer is then: you use LinearSeq
or IndexedSeq
when you need to have a certain performance characteristics, you use the more general Seq
when you don't care about the difference
This is the companion object of Seq
:
object Seq extends SeqFactory[Seq] {
implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Seq[A]] = ReusableCBF.asInstanceOf[GenericCanBuildFrom[A]]
def newBuilder[A]: Builder[A, Seq[A]] = immutable.Seq.newBuilder[A]
}
The newBuilder[A]
method is what's used to build the Seq, as you can verify in the Seq.apply method (defined on the trait GenericCompanion
):
def apply[A](elems: A*): CC[A] = {
if (elems.isEmpty) empty[A]
else {
val b = newBuilder[A]
b ++= elems
b.result()
}
}
Now the question is: what does an immutable.Seq.newBuilder[A]
build?
Let's go see, this time on the immutable.Seq
companion object:
object Seq extends SeqFactory[Seq] {
// stuff
def newBuilder[A]: Builder[A, Seq[A]] = new mutable.ListBuffer
}
It builds a mutable ListBuffer
! Why is that? That's because a mutable.ListBuffer
also happens to be a Builder[A, Seq[A]]
, that is a class that is used by the collection library to build new collections.
The actual output collection comes from this line (as you can see above):
b.result()
So, what's the return type of the ListBuffer.result()
? Let's go see in the ListBuffer:
// Implementation of abstract method in Builder
def result: List[A] = toList
here you go: It's a list.
Seq(1,2,3)
returns a List[Int]
under hood, but the whole point here is that if you're using Seq(), you don't need to know what kind of collection you have because you're implying that the more abstract interface is enough for your needs
Seq is simply super trait of IndexedSeq and LinearSeq. Seq is ordered list, as opposed to Set, which is unique & unordered usually, and as opposed to Map, which is key value pair.
Now comes IndexedSeq vs LinearSeq.
IndexedSeq subclasses i.e. Vector, Range, String, are all fast index based access, and usually, fast updates. This is possible because they use data structures suitable for this like array internally (like in String) or a tree (actually a 32 fan out tree used by Vector.. for more details see this), or Range, which is just three numbers.. begin, end, step.. from which calculations of index are straightforward.
As opposed to this, LinearSeq are all slow index based and access, and slow in updates. Usually because they have a linked list kind of structure. Prepend is fast for all of them i.e. Queue, Stack, List.. but append is notoriously slow as it has to go to the edge of the internal linked list structure.. except for queue which uses a trick with it's own issues. The trick is described here.
So broadly, IndexedSeq are those data structures that have fast index based access and update, and LinearSeq are those data structures that have fast prepend.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With