Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between Seq and IndexedSeq/LinearSeq in Scala?

In Scala Collection documentation, there is some clue to this question:

Trait Seq has two subtraits LinearSeq, and IndexedSeq. These do not add any new operations, but each offers different performance characteristics: A linear sequence has efficient head and tail operations, whereas an indexed sequence has efficient apply, length, and (if mutable) update operations.

But this does not address me when to use IndexedSeq instead of Seq? I need some real example of IndexedSeq or LinearSeq where these collections do better than Seq.

like image 232
Milad Khajavi Avatar asked Mar 18 '16 18:03

Milad Khajavi


People also ask

What is IndexedSeq in Scala?

Companion object IndexedSeqA base trait for indexed sequences. Indexed sequences support constant-time or near constant-time element access and length computation. They are defined in terms of abstract methods apply for indexing and length .

What is the difference between Seq and list in Scala?

A Seq is an Iterable that has a defined order of elements. Sequences provide a method apply() for indexing, ranging from 0 up to the length of the sequence. Seq has many subclasses including Queue, Range, List, Stack, and LinkedList. A List is a Seq that is implemented as an immutable linked list.

What does Seq mean in Scala?

Scala Seq is a trait to represent immutable sequences. This structure provides index based access and various utility methods to find elements, their occurences and subsequences. A Seq maintains the insertion order.

Is sequence mutable Scala?

Wait, What?! Scala's default Seq class is mutable. Yes, you read that right.


2 Answers

Seq is the super-trait, so it's more generic, it has the characteristics common to all sequences, both Linear and Indexed.

If you're wondering what kind of sequence is created by the Seq.apply method in the companion object of Seq, we can have a look at the implementation.

Keep in mind that if you use Seq.apply, you're implying that you just need a Seq and that your code doesn't care if it's Linear or Indexed

The tl;dr answer is then: you use LinearSeq or IndexedSeq when you need to have a certain performance characteristics, you use the more general Seq when you don't care about the difference

This is the companion object of Seq:

object Seq extends SeqFactory[Seq] {
  implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Seq[A]] = ReusableCBF.asInstanceOf[GenericCanBuildFrom[A]]

  def newBuilder[A]: Builder[A, Seq[A]] = immutable.Seq.newBuilder[A]
}

The newBuilder[A] method is what's used to build the Seq, as you can verify in the Seq.apply method (defined on the trait GenericCompanion):

def apply[A](elems: A*): CC[A] = {
   if (elems.isEmpty) empty[A]
   else {
     val b = newBuilder[A]
     b ++= elems
     b.result()
   }
 }

Now the question is: what does an immutable.Seq.newBuilder[A] build? Let's go see, this time on the immutable.Seq companion object:

object Seq extends SeqFactory[Seq] {
// stuff
  def newBuilder[A]: Builder[A, Seq[A]] = new mutable.ListBuffer
}

It builds a mutable ListBuffer! Why is that? That's because a mutable.ListBuffer also happens to be a Builder[A, Seq[A]], that is a class that is used by the collection library to build new collections.

The actual output collection comes from this line (as you can see above):

b.result()

So, what's the return type of the ListBuffer.result()? Let's go see in the ListBuffer:

// Implementation of abstract method in Builder
def result: List[A] = toList

here you go: It's a list.

Seq(1,2,3) returns a List[Int] under hood, but the whole point here is that if you're using Seq(), you don't need to know what kind of collection you have because you're implying that the more abstract interface is enough for your needs

like image 80
Giovanni Caporaletti Avatar answered Oct 21 '22 02:10

Giovanni Caporaletti


Seq is simply super trait of IndexedSeq and LinearSeq. Seq is ordered list, as opposed to Set, which is unique & unordered usually, and as opposed to Map, which is key value pair.
Now comes IndexedSeq vs LinearSeq.
IndexedSeq subclasses i.e. Vector, Range, String, are all fast index based access, and usually, fast updates. This is possible because they use data structures suitable for this like array internally (like in String) or a tree (actually a 32 fan out tree used by Vector.. for more details see this), or Range, which is just three numbers.. begin, end, step.. from which calculations of index are straightforward.
As opposed to this, LinearSeq are all slow index based and access, and slow in updates. Usually because they have a linked list kind of structure. Prepend is fast for all of them i.e. Queue, Stack, List.. but append is notoriously slow as it has to go to the edge of the internal linked list structure.. except for queue which uses a trick with it's own issues. The trick is described here.
So broadly, IndexedSeq are those data structures that have fast index based access and update, and LinearSeq are those data structures that have fast prepend.

like image 22
Apurva Singh Avatar answered Oct 21 '22 00:10

Apurva Singh