I found that when using pattern matching with alternatives (for strings), Scala accepts variables starting with upper case (in the example below, MyValue1 and MyValue2), but not those starting with lower case (myValue1, myValue2). Is this a bug or a feature of Scala? I get this in version 2.8. If this is a feature, can anyone explain the rationale behind it? This is the code I used:
val myValue1 = "hello"
val myValue2 = "world"
val MyValue1 = "hello"
val MyValue2 = "world"
var x:String = "test"
x match {
case MyValue1 | MyValue2 => println ("first match")
case myValue1 | myValue2 => println ("second match")
}
On running, I get the following:
scala> val myValue1 = "hello"
myValue1: java.lang.String = hello
scala> val myValue2 = "world"
myValue2: java.lang.String = world
scala> val MyValue1 = "hello"
MyValue1: java.lang.String = hello
scala> val MyValue2 = "world"
MyValue2: java.lang.String = world
scala> var x:String = "test"
x: String = test
scala> x match {
| case MyValue1 | MyValue2 => println ("first match")
| case myValue1 | myValue2 => println ("second match")
| }
<console>:11: error: illegal variable in pattern alternative
case myValue1 | myValue2 => println ("second match")
^
<console>:11: error: illegal variable in pattern alternative
case myValue1 | myValue2 => println ("second match")
^
EDIT:
So it is indeed a feature and not a bug... Can anyone provide an example when this might be useful?
When I use:
x match {
case myValue1 => println ("match")
case _ =>
}
I get an unreachable code warning on the last case, implying that the first one always matches.
Notes. Scala's pattern matching statement is most useful for matching on algebraic types expressed via case classes. Scala also allows the definition of patterns independently of case classes, using unapply methods in extractor objects.
Pattern matching is a technique for checking a value against a pattern, for example checking a given sequence of tokens for the presence of a specific pattern. It is one of the most widely used features in Scala, and it is similar to the switch statement of Java and C.
Using if expressions in case statements. First, another example of how to match ranges of numbers:
i match {
case a if 0 to 9 contains a => println("0-9 range: " + a)
case b if 10 to 19 contains b => println("10-19 range: " + b)
case c if 20 to 29 contains c => println("20-29 range: " + c)
case _ => println("Hmmm...")
}
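The lowercase rule is not specific to patterns with alternatives, and it is not a bug. An identifier that begins with a lowercase letter in a pattern represents a new variable that will be bound if the pattern matches. The error shows up only with alternatives because a pattern alternative is not allowed to bind variables.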
So, your example is equivalent to writing:
x match {
case MyValue1 | MyValue2 => println ("first match")
case y | z => println ("second match")
}
You can work around this by using backticks:
x match {
case MyValue1 | MyValue2 => println ("first match")
case `myValue1` | `myValue2` => println ("second match")
}
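To make the difference concrete, here is a minimal sketch that wraps both forms in functions so the results can be checked. The object name BacktickDemo and the function names classify and shadowed are my own, purely for illustration:

```scala
object BacktickDemo {
  val myValue1 = "hello"
  val myValue2 = "world"

  // Backticks make the compiler treat the lowercase names as references
  // to the existing vals instead of as new pattern variables.
  def classify(x: String): String = x match {
    case `myValue1` | `myValue2` => "matched an existing value"
    case _                       => "no match"
  }

  // Without backticks, the lowercase name is a fresh variable that binds
  // to whatever x is and shadows the outer myValue1.
  def shadowed(x: String): String = x match {
    case myValue1 => "bound: " + myValue1
  }

  def main(args: Array[String]): Unit = {
    println(classify("hello")) // matched an existing value
    println(classify("test"))  // no match
    println(shadowed("test"))  // bound: test
  }
}
```

Note that shadowed never compares against "hello" at all, which is the same reason the compiler reports later cases as unreachable.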
It is a feature. Stable identifiers beginning with an uppercase letter are treated like literals for the purpose of pattern matching, and lowercase identifiers are "assigned to" so you can use the matched value for something else.
You gave an example of it not making sense:
x match {
case myValue1 => println ("match")
case _ =>
}
But the sense is easy to see if we change that a little:
x match {
case MyValue1 => println("match")
case MyValue2 => println("match")
case other => println("no match: "+other)
}
Of course, one could use x instead of other above, but here are some examples where that would not be convenient:
(pattern findFirstIn text) match {
// "group1" and "group2" are extracted by the match, so they were not available before
case Some(pattern(group1, group2)) =>
// "other" is the result of an expression, which you'd have to repeat otherwise
case other =>
}
getAny match {
// Here "s" is already a String, whereas "getAny" would have to be typecast
case s: String =>
// Here "i" is already an Int, whereas "getAny" would have to be typecast
case i: Int =>
}
So there are many reasons why it is convenient for pattern matching to assign the matched value to an identifier.
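A runnable sketch of both kinds of binding, in case it helps. The regex KeyValue, the object BindingDemo and the functions parse and describe are hypothetical names chosen only for this example:

```scala
import scala.util.matching.Regex

object BindingDemo {
  // Hypothetical pattern with two capture groups: a key and a numeric value.
  val KeyValue: Regex = """(\w+)=(\d+)""".r

  def parse(text: String): String = text match {
    // "key" and "value" are new variables bound to the capture groups.
    case KeyValue(key, value) => "key " + key + ", value " + value
    // "other" is bound to the whole input when the regex does not match.
    case other                => "unparsed: " + other
  }

  def describe(any: Any): String = any match {
    // "s" is typed as String here, so no cast is needed to call length.
    case s: String => "string of length " + s.length
    // "i" is typed as Int here, so arithmetic works directly.
    case i: Int    => "int plus one: " + (i + 1)
    case other     => "something else: " + other
  }

  def main(args: Array[String]): Unit = {
    println(parse("port=8080"))
    println(describe(41))
  }
}
```

In every branch the lowercase name on the left of => is a new binding, which is what makes the extracted or narrowed value usable on the right.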
Now, though I think this is one of the greatest misfeatures of Scala, because it is so subtle and unique, there is reasoning behind it. In the recommended Scala style, constants are camel cased starting with an uppercase letter, while methods, vals and vars (which are really methods too) are camel cased starting with a lowercase letter. So constants are naturally treated as literals, while the others are treated as assignable identifiers (which may shadow identifiers defined in an outer context).
What's happening here is that myValue1 and myValue2 are being treated as variable identifiers (i.e., the definition of new variables that are bound to the value being matched), whereas MyValue1 and MyValue2 are treated as stable identifiers that refer to values declared earlier. In a pattern match case, variable identifiers must start with a lower case letter, which is why the first case behaves intuitively. See section 8.1 of the Scala Language Specification (http://www.scala-lang.org/docu/files/ScalaReference.pdf) for exact details.
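As far as I understand that section, the lowercase restriction applies only to simple names, so a qualified name such as Greetings.hello is still read as a stable identifier. A small sketch, where Greetings, QualifiedDemo and classify are made-up names for illustration:

```scala
object Greetings {
  val hello = "hello"
  val world = "world"
}

object QualifiedDemo {
  def classify(x: String): String = x match {
    // A qualified name refers to the existing val, even though its last
    // part starts with a lowercase letter, so it is allowed in alternatives.
    case Greetings.hello | Greetings.world => "greeting"
    case _                                 => "other"
  }

  def main(args: Array[String]): Unit = {
    println(classify("world")) // greeting
    println(classify("test"))  // other
  }
}
```

This can be an alternative to backticks when the values already live in some object.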
Altering your example slightly, you can see the variable identifier:
scala> x match {
| case MyValue1 | MyValue2 => println ("first match")
| case myValue1 => println (myValue1)
| }
test
If it helps, I posted an article on this topic a week or so ago at http://asoftsea.tumblr.com/post/2102257493/magic-match-sticks-and-burnt-fingers