
Matching against a regular expression in Scala

Tags: regex, scala

I fairly frequently match strings against regular expressions. In Java:

java.util.regex.Pattern.compile("\\w+").matcher("this_is").matches()

Ouch. Scala has many alternatives.

  1. "\\w+".r.pattern.matcher("this_is").matches
  2. "this_is".matches("\\w+")
  3. "\\w+".r unapplySeq "this_is" isDefined
  4. val R = "\\w+".r; "this_is" match { case R() => true; case _ => false}

The first is just as heavy-weight as the Java code.

The problem with the second is that you can't supply a compiled pattern: "this_is".matches("\\w+".r) does not compile. (This looks like an oversight, since almost every method that takes a regex string to compile also has an overload that takes a compiled regex.)

The problem with the third is that it abuses unapplySeq and thus is cryptic.

The fourth is great when decomposing parts of a regular expression, but is too heavy-weight when you only want a boolean result.

Am I missing an easy way to check for matches against a regular expression? Is there a reason why String#matches(regex: Regex): Boolean is not defined? In fact, where is String#matches(uncompiled: String): Boolean defined?

schmmd asked Nov 28 '11 20:11




2 Answers

You can define a pattern like this:

scala> val Email = """(\w+)@([\w\.]+)""".r 

findFirstIn returns Some[String] holding the first match found in the input, or None if there is no match.

scala> Email.findFirstIn("test@example.com")
res1: Option[String] = Some(test@example.com)

scala> Email.findFirstIn("test")
res2: Option[String] = None
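Because findFirstIn looks for a match anywhere in the input, it can succeed where a whole-string check fails. A minimal sketch of the difference, using a hypothetical object name and sample input chosen only for illustration:

```scala
import scala.util.matching.Regex

// Hypothetical demo object; the pattern and input are illustrative only.
object FindVsMatch {
  val Word: Regex = """\w+""".r

  // findFirstIn succeeds as soon as some substring matches the pattern.
  val partial: Option[String] = Word.findFirstIn("hello world!")

  // String#matches requires the entire input to match the pattern.
  val whole: Boolean = "hello world!".matches(Word.toString)

  def main(args: Array[String]): Unit = {
    println(partial) // Some(hello)
    println(whole)   // false
  }
}
```

If you need a yes/no answer for the whole string, findFirstIn(...).isDefined is therefore not a drop-in replacement for matches unless the pattern itself is anchored.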

You can even extract the captured groups:

scala> val Email(name, domain) = "test@example.com"
name: String = test
domain: String = example.com
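The val extractor form above throws a MatchError when the input does not match. A small sketch of a safer variant, assuming you would rather get an Option back; the object and method names (EmailParts, split) are hypothetical:

```scala
import scala.util.matching.Regex

// Hypothetical helper; reuses the Email pattern from the answer above.
object EmailParts {
  val Email: Regex = """(\w+)@([\w\.]+)""".r

  // Returns the captured (name, domain) pair, or None if the whole input does not match.
  def split(s: String): Option[(String, String)] = s match {
    case Email(name, domain) => Some((name, domain))
    case _                   => None
  }

  def main(args: Array[String]): Unit = {
    println(split("test@example.com")) // Some((test,example.com))
    println(split("test"))             // None
  }
}
```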

Finally, you can also use the conventional String.matches method (and even reuse the previously defined Email regex):

scala> "test@example.com".matches(Email.toString)
res6: Boolean = true
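One caveat: String.matches compiles its argument on every call, so passing Email.toString discards the pattern that the Regex has already compiled. A sketch of a whole-string check that reuses the compiled pattern instead; the names CompiledMatch and isEmail are hypothetical:

```scala
import scala.util.matching.Regex

// Hypothetical helper object; the Email pattern is the one from the answer above.
object CompiledMatch {
  val Email: Regex = """(\w+)@([\w\.]+)""".r

  // Reuses the java.util.regex.Pattern held by the Regex,
  // rather than recompiling the pattern string on each call.
  def isEmail(s: String): Boolean = Email.pattern.matcher(s).matches()

  def main(args: Array[String]): Unit = {
    println(isEmail("test@example.com")) // true
    println(isEmail("test"))             // false
  }
}
```

On Scala 2.13 and later, Regex also provides a matches method, so Email.matches(s) should give the same result without going through the underlying Pattern.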

Hope this will help.

David answered Sep 25 '22 15:09



I created a little "Pimp my Library" pattern for that problem. Maybe it'll help you out.

import util.matching.Regex

object RegexUtils {
  class RichRegex(self: Regex) {
    def =~(s: String) = self.pattern.matcher(s).matches
  }
  implicit def regexToRichRegex(r: Regex) = new RichRegex(r)
}

Example of use:

scala> import RegexUtils._
scala> """\w+""".r =~ "foo"
res12: Boolean = true
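On Scala 2.10 and later, the same enrichment can be written more compactly as an implicit class. A minimal sketch under that assumption; the object names RegexSyntax and RegexSyntaxDemo are hypothetical and not part of the answer above:

```scala
import scala.util.matching.Regex

// Hypothetical variant of RegexUtils using an implicit value class.
object RegexSyntax {
  implicit class RichRegex(val self: Regex) extends AnyVal {
    // True only if the whole string matches the compiled pattern.
    def =~(s: String): Boolean = self.pattern.matcher(s).matches()
  }
}

object RegexSyntaxDemo {
  import RegexSyntax._

  def main(args: Array[String]): Unit = {
    println("""\w+""".r =~ "foo")     // true
    println("""\w+""".r =~ "foo bar") // false: the space is not matched by \w
  }
}
```

Extending AnyVal is optional; it only avoids allocating a wrapper object on each call.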
Ian McLaird answered Sep 24 '22 15:09
