There are some strings:
111/aaa
111/aaa|222/bbb
They follow this pattern:
(.*)/(.*)(|(.*)/(.*))?
I tried to use it to match a string and extract the values:
var rrr = """(.*)/(.*)(|(.*)/(.*))?""".r
"123/aaa|444/bbb" match {
case rrr(pid,pname, cid,cname) => println(s"$pid, $pname, $cid, $cname")
case _ => println("not matched ?!")
}
But it prints:
not matched ?!
And I want to get:
123, aaa, 444, bbb
How can I fix it?
UPDATE
Thanks to the answers from @BartKiers and @Barmar, I found that my regex had several mistakes, and I finally arrived at this solution:
var rrr = """(.*?)/(.*?)([|](.*?)/(.*?))?""".r
"123/aaa|444/bbb" match {
case rrr(pid,pname, _, cid,cname) => println(s"$pid, $pname, $cid, $cname")
case _ => println("not matched ?!")
}
It works, but as you can see there is a _ that serves no purpose. Is there any way to rewrite the regex so that I can just write rrr(pid,pname,cid,cname) to match it?
.* can cause a lot of backtracking because it first matches the complete string and then gives back one character at a time until the rest of the pattern can match, so it stops at the last / rather than the first one.
It also won't capture the values in the groups you would expect.
You should use the lazy form .*? instead.
Your regex should be:
^(.*?)/(.*?)(?:\|(.*?)/(.*?))?$
There won't be any noticeable performance difference for small strings, but the values will be captured in the right groups.
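To make the grouping difference concrete, here is a small Scala sketch that compares a greedy and a lazy version of the pattern on the sample string. The object name GreedyVsLazy and the helper groups are made up for illustration. The greedy variant is the pattern from the question with the pipe escaped (an assumption on my part, so that only the greediness differs), and the lazy variant is the one from the UPDATE.

```scala
import scala.util.matching.Regex

object GreedyVsLazy {
  // Pattern from the question, with the pipe escaped so it is a literal '|'
  // (assumption: escaping added here to isolate the effect of greediness).
  val greedy: Regex = """(.*)/(.*)(\|(.*)/(.*))?""".r

  // Lazy version from the UPDATE; [|] is also a literal pipe.
  val lazyRe: Regex = """(.*?)/(.*?)([|](.*?)/(.*?))?""".r

  // Hypothetical helper: capture groups of a whole-string match, or None.
  // Groups that did not take part in the match come back as null.
  def groups(re: Regex, s: String): Option[List[String]] =
    re.unapplySeq(s)

  def main(args: Array[String]): Unit = {
    val input = "123/aaa|444/bbb"
    // Greedy: the first group swallows everything up to the last '/'.
    println(groups(greedy, input)) // Some(List(123/aaa|444, bbb, null, null, null))
    // Lazy: each group stops as early as it can.
    println(groups(lazyRe, input)) // Some(List(123, aaa, |444/bbb, 444, bbb))
  }
}
```

Both patterns yield five groups, which is why a four-variable pattern such as rrr(pid,pname,cid,cname) cannot match either of them.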
Notice the ?: in the regex. It makes the group (?:\|(.*?)/(.*?))? non-capturing, so the result has only 4 capture groups.
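Put together in Scala, this could look like the sketch below. The object name PairParser and the helper parse are hypothetical, and wrapping the optional values in Option is my own choice, made because Scala binds a group that did not take part in the match to null.

```scala
object PairParser {
  // The (?: ... ) wrapper is non-capturing, so only the four inner groups
  // are extracted. ^ and $ are harmless here: Scala's Regex extractor
  // already requires the whole string to match.
  val rrr = """^(.*?)/(.*?)(?:\|(.*?)/(.*?))?$""".r

  // Hypothetical helper: returns the first pair plus the optional second pair.
  def parse(s: String): Option[(String, String, Option[String], Option[String])] =
    s match {
      // cid and cname are null when the "|..." part is absent,
      // so Option(...) turns them into None.
      case rrr(pid, pname, cid, cname) =>
        Some((pid, pname, Option(cid), Option(cname)))
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    println(parse("123/aaa|444/bbb")) // Some((123,aaa,Some(444),Some(bbb)))
    println(parse("111/aaa"))         // Some((111,aaa,None,None))
    println(parse("no-slash"))        // None
  }
}
```

If you match directly with case rrr(pid,pname,cid,cname) instead, remember that cid and cname will be null for inputs like 111/aaa.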