 

Scala Regex enable Multiline option

I'm learning Scala, so this is probably pretty noob-irific.

I want to have a multiline regular expression.

In Ruby it would be:

MY_REGEX = /com:Node/m 

My Scala looks like:

val ScriptNode =  new Regex("""<com:Node>""") 

Here's my match function:

def matchNode(value: String): Boolean = value match {
  case ScriptNode() => System.out.println("found" + value); true
  case _ => System.out.println("not found: " + value); false
}

And I'm calling it like so:

matchNode("<root>\n<com:Node>\n</root>") // doesn't work
matchNode("<com:Node>") // works

I've tried:

val ScriptNode =  new Regex("""<com:Node>?m""") 

And I'd really like to avoid having to use java.util.regex.Pattern. Any tips greatly appreciated.

asked Jul 06 '09 18:07 by ed.



2 Answers

This is a very common problem when first using Scala Regex.

When you use a Regex as a pattern in a Scala match, it has to match the whole string, as if the pattern were wrapped in "^" and "$" with multi-line mode switched off (multi-line mode is what lets ^ and $ match at every \n instead of only at the ends of the input).
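To make the whole-string behaviour concrete, here is a minimal sketch using the pattern from the question. The object and method names (AnchoredMatchDemo, matchesWhole, containsNode) are made up for illustration; only Regex, the extractor syntax and findFirstIn come from the standard library.

```scala
import scala.util.matching.Regex

// Illustrative names only; nothing here comes from the original post
// except the pattern itself.
object AnchoredMatchDemo {
  // The pattern from the question: no flags, no surrounding wildcards.
  val ScriptNode: Regex = new Regex("""<com:Node>""")

  // Extractor-style matching succeeds only if the regex covers the entire input.
  def matchesWhole(value: String): Boolean = value match {
    case ScriptNode() => true
    case _            => false
  }

  // Search-style matching succeeds if the regex occurs anywhere in the input.
  def containsNode(value: String): Boolean =
    ScriptNode.findFirstIn(value).isDefined

  def main(args: Array[String]): Unit = {
    val multiLine = "<root>\n<com:Node>\n</root>"
    println(matchesWhole("<com:Node>")) // true: the pattern is the whole string
    println(matchesWhole(multiLine))    // false: extra text surrounds the node
    println(containsNode(multiLine))    // true: the node occurs somewhere inside
  }
}
```

So the question's second call works only because the input is exactly the pattern, not because of anything line-related.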

The way to do what you want would be one of the following:

def matchNode(value: String): Boolean =
  (ScriptNode findFirstIn value) match {
    case Some(v) => println("found" + v); true
    case None => println("not found: " + value); false
  }

Which would find the first instance of ScriptNode inside value, and return that instance as v (if you want the whole string, just print value). Or else:

val ScriptNode = new Regex("""(?s).*<com:Node>.*""")

def matchNode(value: String): Boolean =
  value match {
    case ScriptNode() => println("found" + value); true
    case _ => println("not found: " + value); false
  }

Which would print the whole of value. In this example, (?s) activates dotall matching (i.e., "." also matches newlines), and the .* before and after the searched-for pattern lets the regex cover the entire string, whatever surrounds the node. If you wanted "v" as in the first example, you could do this:

val ScriptNode = new Regex("""(?s).*(<com:Node>).*""")

def matchNode(value: String): Boolean =
  value match {
    case ScriptNode(v) => println("found" + v); true
    case _ => println("not found: " + value); false
  }
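Since the question asks about a "multiline" option, it may help to see the two inline flags side by side. The sketch below is not from the answer itself. The names (RegexFlagsDemo, extracts and the vals) are hypothetical, and unanchored assumes Scala 2.10 or later.

```scala
import scala.util.matching.Regex

// Illustrative names only; the input string is the one from the question.
object RegexFlagsDemo {
  val input = "<root>\n<com:Node>\n</root>"

  // (?m): ^ and $ also match at line boundaries; "." still stops at newlines.
  val lineAnchored: Regex = """(?m)^<com:Node>$""".r

  // (?s): "." also matches newlines, so .* can span the whole input.
  val dotAll: Regex = """(?s).*<com:Node>.*""".r

  // Same wildcards without (?s): .* cannot cross the "\n" characters.
  val noDotAll: Regex = """.*<com:Node>.*""".r

  // unanchored (assumed Scala 2.10+): extractor matching uses find
  // semantics instead of requiring a full match.
  val anywhere: Regex = """<com:Node>""".r.unanchored

  // True when the regex succeeds as an extractor pattern against s.
  def extracts(re: Regex, s: String): Boolean = s match {
    case re() => true
    case _    => false
  }

  def main(args: Array[String]): Unit = {
    println(lineAnchored.findFirstIn(input)) // Some(<com:Node>)
    println(extracts(lineAnchored, input))   // false: still not a whole-string match
    println(extracts(dotAll, input))         // true
    println(extracts(noDotAll, input))       // false
    println(extracts(anywhere, input))       // true
  }
}
```

In other words, (?m) on its own does not help with extractor-style matching; either the pattern has to span the whole input, as with (?s) and .*, or the regex has to be made unanchored.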
answered Oct 02 '22 11:10 by Daniel C. Sobral


Just a quick and dirty addendum: the .r method on RichString (StringOps since Scala 2.8) converts any string to a scala.util.matching.Regex, so you can do something like this:

"""(?s)a.*b""".r replaceAllIn ( "a\nb\nc\n", "A\nB" ) 

And that will return

A
B
c

I use this all the time for quick and dirty regex-scripting in the scala console.

Or in this case:

// Build the regex with .r instead of new Regex(...)
val ScriptNode = """(?s).*(<com:Node>).*""".r

def matchNode(value: String): Boolean =
  value match {
    case ScriptNode(v) => System.out.println("found" + v); true
    case _ => System.out.println("not found: " + value); false
  }

Just my attempt to reduce the use of the word new in code worldwide. ;)

answered Oct 02 '22 11:10 by Tristan Juricek