As a follow-up to this question
Here is some code that compiles and runs correctly, using captures.
val myString = "ACATCGTAGCTGCTAGCTG"
val nucCap = "([ACTG]+)".r
myString match {
case nucCap(myNuc) => println("dna:"+myNuc)
case _ => println("not dna")
}
>scala scalaTest.scala
dna:ACATCGTAGCTGCTAGCTG
Here is simpler code, without capture, that does not compile.
val myString = "ACATCGTAGCTGCTAGCTG"
val nuc = "[ACGT]+".r
myString match {
case nuc => println("dna")
case _ => println("not dna")
}
>scala scalaTest.scala
scalaTest.scala:7: error: unreachable code
Seems like the matching should return a boolean regardless of whether a capture is used. What is going on here?
Pattern matching is a technique for checking a value against a pattern. It is one of the most widely used features in Scala and is comparable to the switch statement of Java and C.
Regular expressions describe a pattern used to match input text, which makes them useful for pattern matching in many programming languages. In Scala they are represented by the Regex class from the scala.util.matching package.
The matches() method checks whether a string matches the regular expression passed as its argument. It returns true if the whole string matches the regular expression and false otherwise.
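If all you need is a yes/no answer, a minimal sketch using java.lang.String.matches might look like the following. The object name MatchesDemo and the helper isDna are illustrative names, not part of the question.

```scala
object MatchesDemo {
  // Hypothetical helper: true only if the whole string consists of A, C, G, T.
  def isDna(s: String): Boolean = s.matches("[ACGT]+")

  def main(args: Array[String]): Unit = {
    println(isDna("ACATCGTAGCTGCTAGCTG")) // true
    println(isDna("ACGTX"))               // false
  }
}
```

This avoids pattern matching altogether, at the cost of recompiling the regex on every call.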
Using if expressions in case statements
First, another example of how to match ranges of numbers:
val i = 14 // example value; i is not defined in the original snippet
i match {
case a if 0 to 9 contains a => println("0-9 range: " + a)
case b if 10 to 19 contains b => println("10-19 range: " + b)
case c if 20 to 29 contains c => println("20-29 range: " + c)
case _ => println("Hmmm...")
}
In your match block, nuc is a pattern variable and does not refer to the nuc in the enclosing scope. This makes the default case unreachable because the simple pattern nuc will match anything.
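A small sketch can make the shadowing visible. The object name ShadowDemo, the method describe, and the sample inputs are illustrative assumptions, not part of the question. The lowercase pattern binds whatever is being matched, including values that are not strings at all:

```scala
object ShadowDemo {
  val nuc = "[ACGT]+".r

  // The lowercase `nuc` in the pattern is a fresh variable that shadows the
  // Regex above; it binds whatever value is being matched.
  def describe(x: Any): String = x match {
    case nuc => s"bound: $nuc"
  }

  def main(args: Array[String]): Unit = {
    println(describe("ACGT")) // bound: ACGT
    println(describe("xyz"))  // bound: xyz
    println(describe(42))     // bound: 42
  }
}
```

The regex is never consulted here, which is why any further case after this one can never be reached.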
An empty pair of parentheses on nuc will make the syntactic sugar work and call the unapplySeq method on the Regex:
myString match {
case nuc() => println("dna")
case _ => println("not dna")
}
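If the goal is a boolean test, the same pattern can be wrapped in a small helper. This is only a sketch: the object name DnaCheck and the method isDna are hypothetical names introduced for illustration.

```scala
object DnaCheck {
  private val nuc = "[ACGT]+".r

  // nuc() invokes Regex.unapplySeq; with zero capture groups, the empty
  // parentheses match only when the whole string matches the regex.
  def isDna(s: String): Boolean = s match {
    case nuc() => true
    case _     => false
  }

  def main(args: Array[String]): Unit = {
    println(isDna("ACATCGTAGCTGCTAGCTG")) // true
    println(isDna("ACGTX"))               // false
    println(isDna(""))                    // false
  }
}
```

Note that the match is anchored: a string with any character outside the class falls through to the default case.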
One way to avoid this pitfall is to rename nuc to Nuc. Starting with an uppercase letter makes it a stable identifier, so that it refers to the Nuc in the enclosing scope, rather than being treated by the compiler as a pattern variable.
val Nuc = "[ACGT]+".r
myString match {
case Nuc => println("dna")
case _ => println("not dna")
}
The above will print "not dna", because here we are simply comparing Nuc to myString, and they are not equal. It's a bug, but maybe a less confusing one!
Adding the parentheses will have the desired effect in this case too:
myString match {
case Nuc() => println("dna")
case _ => println("not dna")
}
// prints "dna"
By the way, it is not a boolean that is being returned, but an Option[List[String]]:
scala> nuc.unapplySeq(myString)
res17: Option[List[String]] = Some(List())
scala> nucCap.unapplySeq(myString)
res18: Option[List[String]] = Some(List(ACATCGTAGCTGCTAGCTG))
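The number of elements in that list is what the pattern's parentheses have to agree with. As a rough sketch (the object name UnapplyDemo, the two-group regex pair, and the helper groups are assumptions added for illustration), the results for zero, one, and two capture groups look like this:

```scala
object UnapplyDemo {
  val nuc    = "[ACGT]+".r             // no capture groups
  val nucCap = "([ACGT]+)".r           // one capture group
  val pair   = "([ACGT]+)-([ACGT]+)".r // hypothetical two-group pattern

  // Hypothetical helper: extracts both groups, or None if there is no match.
  def groups(s: String): Option[(String, String)] = s match {
    case pair(a, b) => Some((a, b))
    case _          => None
  }

  def main(args: Array[String]): Unit = {
    val s = "ACATCGTAGCTGCTAGCTG"
    println(nuc.unapplySeq(s))          // Some(List())
    println(nucCap.unapplySeq(s))       // Some(List(ACATCGTAGCTGCTAGCTG))
    println(pair.unapplySeq("ACG-TTA")) // Some(List(ACG, TTA))
    println(nuc.unapplySeq("xyz"))      // None
    println(groups("ACG-TTA"))          // Some((ACG,TTA))
  }
}
```

A pattern with n arguments succeeds only when unapplySeq returns Some of a list with n elements, which is why nuc() needs its empty parentheses and nucCap(myNuc) needs exactly one variable.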