Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split up a list at each element satisfying a predicate (Scala)

In a text file I have data in the form:

1)
text
text
2)
more text
3)
even more text
more even text
even more text
...

I read it as a list of Strings using the following:

val input = io.Source.fromFile("filename.txt").getLines().toList

I want to break the list down into sub-lists starting with 1), 2) etc.

I've come up with:

val subLists =
  input.foldRight( List(List[String]()) ) {
    (x, acc) =>
      if (x.matches("""[0-9]+\)""")) List() :: (x :: acc.head) :: acc.tail
      else (x :: acc.head) :: acc.tail
  }.tail

Can this be achieved more simply? What would be really nice would be if there were a built-in method to split a collection on every element that satisfies a predicate (hint, hint, library designers :)).

like image 706
Luigi Plinge Avatar asked Sep 03 '11 14:09

Luigi Plinge


2 Answers

foldRight with a complicated argument is usually an indication that you might as well write this using recursion, and factor it out to its own method, while you are at it. Here's what I came up with. First, let's generalize to a generic method, groupPrefix:

 /** Returns shortest possible list of lists xss such that
  *   - xss.flatten == xs
  *   - No sublist in xss contains an element matching p in its tail
  */
 def groupPrefix[T](xs: List[T])(p: T => Boolean): List[List[T]] = xs match {
   case List() => List()
   case x :: xs1 => 
     val (ys, zs) = xs1 span (!p(_))
     (x :: ys) :: groupPrefix(zs)(p)  
 }

Now you get the result simply by calling

 groupPrefix(input)(_ matches """\d+\)""")
like image 118
Martin Odersky Avatar answered Nov 10 '22 17:11

Martin Odersky


I have the honor, to add an answer next to the great @MartinOdersky!

From Scala 2.13 we can use the List.unfold:

List.unfold(input) {
  case Nil =>
    None
  case x :: as =>
    as.span(!_.matches("""\d+\)""")) match {
      case (prefix, Nil) =>
        Some(x :: prefix, List.empty)
      case (prefix, suffix) =>
        Some(x :: prefix, suffix)
    }
}

Code run at Scastie.

like image 40
Tomer Shetah Avatar answered Nov 10 '22 17:11

Tomer Shetah