I fairly frequently match strings against regular expressions. In Java:
java.util.regex.Pattern.compile("\\w+").matcher("this_is").matches()
Ouch. Scala has many alternatives.
"\\w+".r.pattern.matcher("this_is").matches
"this_is".matches("\\w+")
"\\w+".r unapplySeq "this_is" isDefined
val R = "\\w+".r; "this_is" match { case R() => true; case _ => false}
The first is just as heavy-weight as the Java code.
The problem with the second is that you can't supply a compiled pattern ("this_is".matches("\\w+".r)). (This seems to be an anti-pattern, since almost every time there is a method that takes a regex string to compile, there is also an overload that takes a compiled regex.)
The problem with the third is that it abuses unapplySeq and thus is cryptic.
The fourth is great when decomposing parts of a regular expression, but is too heavy-weight when you only want a boolean result.
Am I missing an easy way to check for matches against a regular expression? Is there a reason why String#matches(regex: Regex): Boolean is not defined? In fact, where is String#matches(uncompiled: String): Boolean defined?
Regular expression matching is also one of the classic dynamic programming problems. Given a string S and a regular expression R, write a function to check whether S matches R. Assume that S contains only letters and numbers. A regular expression consists of: Letters A-Z.
In Scala, match is a language keyword that can be applied to any value. A match expression contains a sequence of alternatives, each starting with the case keyword. Each alternative consists of a pattern and one or more expressions that are evaluated if the pattern matches; the => symbol separates the pattern from the expressions.
Notes. Scala's pattern matching is most useful for matching on algebraic types expressed via case classes. Scala also allows patterns to be defined independently of case classes, using unapply methods in extractor objects.
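To make the two kinds of pattern concrete, here is a minimal sketch. The names MatchSketch, Shape, Circle, Rect and Even are made up for illustration and do not come from any library.

```scala
object MatchSketch {
  // An algebraic data type expressed with case classes (hypothetical example types).
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rect(width: Double, height: Double) extends Shape

  // An extractor object: a pattern defined independently of any case class.
  // It matches even integers and binds half of the value.
  object Even {
    def unapply(n: Int): Option[Int] =
      if (n % 2 == 0) Some(n / 2) else None
  }

  // Each alternative starts with `case`; `=>` separates pattern from result.
  def area(s: Shape): Double = s match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }

  def describe(n: Int): String = n match {
    case Even(half) => s"even, half is $half"
    case _          => "odd"
  }
}
```

The regex extractors discussed in this thread work the same way: a Regex provides unapplySeq, which is what lets it appear after case.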
You can define a pattern like this:
scala> val Email = """(\w+)@([\w\.]+)""".r
findFirstIn will return Some[String] if it matches, or else None.
scala> Email.findFirstIn("test@example.com")
res1: Option[String] = Some(test@example.com)
scala> Email.findFirstIn("test")
res2: Option[String] = None
You can even extract the groups:
scala> val Email(name, domain) = "test@example.com"
name: String = test
domain: String = example.com
Finally, you can also use the conventional String.matches method (and even reuse the previously defined Email regex):
scala> "test@example.com".matches(Email.toString)
res6: Boolean = true
Hope this will help.
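One caveat worth spelling out: findFirstIn looks for the pattern anywhere in the input, while String.matches requires the whole string to match. A rough sketch of the difference, where EmailCheck, containsEmail and isEmail are names I made up for illustration:

```scala
import scala.util.matching.Regex

object EmailCheck {
  // Same pattern as above, compiled once.
  val Email: Regex = """(\w+)@([\w\.]+)""".r

  // True if the pattern occurs anywhere in s (substring search).
  def containsEmail(s: String): Boolean =
    Email.findFirstIn(s).isDefined

  // True only if all of s matches; reuses the compiled
  // java.util.regex.Pattern instead of recompiling from a string.
  def isEmail(s: String): Boolean =
    Email.pattern.matcher(s).matches()
}
```

With this, EmailCheck.containsEmail("mail test@example.com now") is true while EmailCheck.isEmail on the same input is false. As far as I know, Scala 2.13 also added a matches method directly on Regex, which would make the second helper unnecessary there; I have not relied on it here so the sketch also works on older versions.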
I created a little "Pimp my Library" pattern for that problem. Maybe it'll help you out.
import util.matching.Regex

object RegexUtils {
  class RichRegex(self: Regex) {
    def =~(s: String) = self.pattern.matcher(s).matches
  }
  implicit def regexToRichRegex(r: Regex) = new RichRegex(r)
}
Example of use
scala> import RegexUtils._
scala> """\w+""".r =~ "foo"
res12: Boolean = true
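On Scala 2.10 or later, the same idea can probably be written more compactly with an implicit class, which removes the need for a separate conversion method. This is only a sketch of that variant; RegexSyntax is a name I picked for illustration.

```scala
import scala.util.matching.Regex

object RegexSyntax {
  // Implicit value class: adds =~ to any Regex in scope of the import,
  // without allocating a wrapper in the common case.
  implicit class RichRegex(val self: Regex) extends AnyVal {
    // Whole-string match against the already compiled pattern.
    def =~(s: String): Boolean = self.pattern.matcher(s).matches()
  }
}
```

After import RegexSyntax._ it is used exactly like the version above, for example """\w+""".r =~ "foo".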