
Multiline regex capture in Scala

Tags: regex, scala

I'm trying to capture content that spans multiple lines with a regex, but it doesn't match.

val text = """<p>line1 
    line2</p>"""

val regex = """(?m)<p>(.*?)</p>""".r

var result = regex.findFirstIn(text).getOrElse("")

This returns an empty string.

I added the (?m) flag for multiline matching, but it doesn't seem to help in this case.

If I remove the line break, the regex works.

I also found this but couldn't get it working.

How do I match the content between the <p> tags? I want everything in between, including the line breaks.

Thanks in advance!

asked Jun 15 '13 21:06 by User


1 Answer

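To match the content, use the (?s) flag (DOTALL), which lets . match line terminators. The (?m) flag only changes where ^ and $ match, so it has no effect on this pattern: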

scala> val regex = """(?s)<p>(.*?)</p>""".r

scala> (regex findFirstMatchIn text).get group 1
res52: String = 
line1 
    line2
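
For comparison, here is a minimal self-contained sketch of the two flags side by side. The object and method names (FlagComparison, capture) are made up for illustration, and the input is written with an explicit \n, assuming Unix line endings:

```scala
import scala.util.matching.Regex

object FlagComparison {
  // Same input as in the question: a <p> element whose content spans two lines.
  val text: String = "<p>line1 \n    line2</p>"

  // (?m) MULTILINE only changes where ^ and $ match; '.' still stops at '\n'.
  val multiline: Regex = """(?m)<p>(.*?)</p>""".r

  // (?s) DOTALL lets '.' match line terminators as well.
  val dotAll: Regex = """(?s)<p>(.*?)</p>""".r

  // findFirstMatchIn searches anywhere in the input and returns Option[Match];
  // group(1) is the text captured by the parentheses.
  def capture(regex: Regex, input: String): Option[String] =
    regex.findFirstMatchIn(input).map(_.group(1))

  def main(args: Array[String]): Unit = {
    println(capture(multiline, text)) // nothing found across the line break
    println(capture(dotAll, text))    // the captured text includes the line break
  }
}
```

Returning an Option here avoids calling .get on a match that might not exist.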

More idiomatically,

scala> text match { case regex(content) => content }
res0: String =
line1
    line2

A pattern match like this requires the regex to match the entire string. If the element is embedded in a longer string, make the regex unanchored:

scala> val embedded = s"stuff${text}morestuff"
embedded: String =
stuff<p>line1
    line2</p>morestuff

scala> val regex = """(?s)<p>(.*?)</p>""".r.unanchored
regex: scala.util.matching.UnanchoredRegex = (?s)<p>(.*?)</p>

scala> embedded match { case regex(content) => content }
res1: String =
line1
    line2
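
Note that .get and a single-case match both throw when nothing matches (NoSuchElementException and MatchError respectively). If the input may contain no <p> element, or several, something along these lines may be safer. This is only a sketch: ParagraphExtractor, first and all are hypothetical names, and a regex is still not a real HTML parser:

```scala
object ParagraphExtractor {
  // Same pattern as above, unanchored so it can be used in a pattern match
  // against a longer string.
  private val paragraph = """(?s)<p>(.*?)</p>""".r.unanchored

  // Content of the first <p> element, or None when there is none.
  def first(input: String): Option[String] = input match {
    case paragraph(content) => Some(content)
    case _                  => None
  }

  // Content of every <p> element, in order of appearance.
  def all(input: String): List[String] =
    paragraph.findAllMatchIn(input).map(_.group(1)).toList

  def main(args: Array[String]): Unit = {
    val html = "a<p>one\ntwo</p>b<p>three</p>c"
    println(first("no paragraphs here")) // falls through to the default case
    println(all(html))                   // one entry per <p> element
  }
}
```

The reluctant quantifier .*? matters once there are several elements: a greedy .* would, with (?s), run from the first <p> to the last </p>.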
answered Oct 16 '22 09:10 by som-snytt