
What's Scala's idiomatic way to split a List by separator?

Tags: java, list, scala

If I have a List of type String,

scala> val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
items: List[java.lang.String] = List(Apple, Banana, Orange, Tomato, Grapes, BREAK, Salt, Pepper, BREAK, Fish, Chicken, Beef)

how can I split it into n separate lists based on a certain string/pattern ("BREAK", in this case)?

I've thought about finding the position of "BREAK" with indexOf and splitting the list that way, or using a similar approach with takeWhile(i => i != "BREAK"), but I'm wondering if there's a better way.

If it helps, I know there will only ever be 3 sets of items in the items list (thus 2 "BREAK" markers).
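
For reference, the takeWhile idea can be sketched with span, which returns the matching prefix and the remainder together. This is only a sketch that assumes exactly two "BREAK" markers; the names first, second, and third are illustrative.

```scala
val items = List("Apple", "Banana", "Orange", "Tomato", "Grapes", "BREAK",
                 "Salt", "Pepper", "BREAK", "Fish", "Chicken", "Beef")

// span stops at the first element that fails the predicate,
// so the marker ends up at the head of the remainder.
val (first, afterFirst) = items.span(_ != "BREAK")

// drop(1) discards the marker before splitting again.
val (second, afterSecond) = afterFirst.drop(1).span(_ != "BREAK")
val third = afterSecond.drop(1)
```

This works for a fixed number of markers but has to be repeated by hand for each one, which is why a general solution is preferable.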

jbnunn asked Jan 30 '13 21:01



2 Answers

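def splitBySeparator[T](l: List[T], sep: T): List[List[T]] = {
  // span returns the longest prefix without the separator, plus the rest
  l.span( _ != sep ) match {
    // the rest starts with the separator: drop it and recurse on what follows
    case (hd, _ :: tl) => hd :: splitBySeparator(tl, sep)
    // no separator left: hd is the final group
    case (hd, _) => List(hd)
  }
}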
val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
splitBySeparator(items, "BREAK")

Result:

res1: List[List[String]] = List(List(Apple, Banana, Orange, Tomato, Grapes), List(Salt, Pepper), List(Fish, Chicken, Beef))

UPDATE: The above version, while concise, has two problems: it does not handle edge cases well (such as List("BREAK") or List("BREAK", "Apple", "BREAK")), and it is not tail recursive. So here is another (imperative) version that addresses both:

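import scala.collection.mutable.ListBuffer
def splitBySeparator[T](l: Seq[T], sep: T): Seq[Seq[T]] = {
  // b holds the groups built so far; b.last is the group being filled
  val b = ListBuffer(ListBuffer[T]())
  l foreach { e =>
    if ( e == sep ) {
      // open a new group, unless the current one is still empty
      if  ( !b.last.isEmpty ) b += ListBuffer[T]()
    }
    else b.last += e
  }
  // convert to immutable Lists so the result also type-checks on Scala 2.13+, where Seq is immutable
  b.map(_.toList).toList
}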

It internally uses a ListBuffer, much like the implementation of List.span that I used in the first version of splitBySeparator.
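
If a purely functional version is preferred, the same idea can be written as a tail-recursive loop with an accumulator. The following is a minimal sketch, not part of the answer above: the name splitBySeparatorRec is hypothetical, and dropping every empty group is an assumed design choice. The ListBuffer version, by contrast, can still return an empty group, for instance when the input ends with a separator.

```scala
import scala.annotation.tailrec

// Hypothetical name; a sketch that drops all empty groups.
def splitBySeparatorRec[T](l: List[T], sep: T): List[List[T]] = {
  @tailrec
  def loop(rest: List[T], acc: List[List[T]]): List[List[T]] =
    rest.span(_ != sep) match {
      // A separator follows the chunk: keep the chunk if it is non-empty,
      // then continue with whatever comes after the separator.
      case (chunk, _ :: tail) =>
        loop(tail, if (chunk.isEmpty) acc else chunk :: acc)
      // No separator left: the chunk is the final group.
      // Groups were prepended, so reverse to restore input order.
      case (chunk, _) =>
        (if (chunk.isEmpty) acc else chunk :: acc).reverse
    }
  loop(l, Nil)
}
```

With this sketch, List("BREAK") yields an empty result and List("BREAK", "Apple", "BREAK") yields a single group containing "Apple".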

Régis Jean-Gilles answered Nov 11 '22 03:11


Another option:

val l = Seq(1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5)

l.foldLeft(Seq(Seq.empty[Int])) {
  (acc, i) =>
    if (i == 9) acc :+ Seq.empty
    else acc.init :+ (acc.last :+ i)
}

// produces:
List(List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5))
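
One caveat: on a List-backed Seq, both acc.init and :+ copy the sequence, so the fold above does extra work on every element. A possible alternative is to fold from the right and prepend instead. This is a sketch under that assumption; the name splitFoldRight is hypothetical, and like the foldLeft version it keeps empty groups.

```scala
// Hypothetical helper; splits on `sep` and keeps empty groups.
def splitFoldRight[T](xs: Seq[T], sep: T): List[List[T]] =
  xs.foldRight(List(List.empty[T])) { (x, acc) =>
    if (x == sep) Nil :: acc           // separator: open a new group at the front
    else (x :: acc.head) :: acc.tail   // otherwise prepend to the front group
  }
```

The accumulator is never empty, because it starts with one group and every step keeps at least one, so acc.head and acc.tail are safe here.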
Ryan LeCompte answered Nov 11 '22 05:11