I expected the following to work (and it does):
x = '"aa","bb","cc"'
x =~ /\A(".*?",){2}".*?"\Z/
#=> 0
...but I did not expect the following two to work (and I don't want them to). I purposely used ? to make .* non-greedy:
x =~ /\A(".*?",){0}".*?"\Z/
#=> 0
x =~ /\A(".*?",){1}".*?"\Z/
#=> 0
I expect: beginning of line (\A), followed by "aa", and then "bb", (that's two matches of the group, i.e. {2}), then "cc", and finally the end of line (\Z).
I understand why they are working, but I want to understand how to achieve what I want...
I want it to fail on the last two examples above (but it doesn't). Put another way, I want the following to fail:
x = '"aa","bb","cc","dd"'
x =~ /\A(".*?",){2}".*?"\Z/
#=> 0
It should see \A, "aa", "bb", "cc", and then FAIL on the subsequent comma (because it is not \Z).
The problem is that . is too generic: even a non-greedy .*? will match , or " when that is the only way for the overall pattern to succeed:
'"aa","bb","cc"'.match(/\A(".*?",){1}(".*?")\Z/).captures
#=> ["\"aa\",", "\"bb\",\"cc\""]
Also, there is no difference between a greedy and a non-greedy match if they both need to continue until the end of the string: /.*\Z/ matches the same text as /.*?\Z/.
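To make that concrete, here is a small sketch (the variable names are only illustrative) comparing greedy and non-greedy quantifiers with and without the end anchor:

```ruby
text = '"aa","bb","cc"'

# With \Z, both quantifiers are forced to run to the end of the string.
greedy = text[/.*\Z/]
lazy   = text[/.*?\Z/]
puts greedy == lazy
# true

# Without the anchor, the non-greedy version stops at the first closing quote.
puts text[/".*?"/]
# "aa"
puts text[/".*"/]
# "aa","bb","cc"
```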
You cannot remove \Z, so you could instead replace . with [^"] to avoid matching ":
three = '"aa","bb","cc"'
four = '"aa","bb","cc","dd"'
pattern = /\A("[^"]*",){2}"[^"]*"\Z/
(three =~ pattern) && (four !~ pattern)
#=> true
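If the number of fields varies, the same idea can be wrapped in a helper. This is only a sketch: quoted_fields_pattern is a hypothetical name, and it assumes n is at least 1 and that fields never contain an escaped double quote.

```ruby
# Hypothetical helper: builds a pattern that accepts exactly n quoted fields.
# Assumes n >= 1 and that no field contains a double quote.
def quoted_fields_pattern(n)
  field = '"[^"]*"'
  # n - 1 fields followed by a comma, then one final field without a comma.
  /\A(?:#{field},){#{n - 1}}#{field}\Z/
end

three = '"aa","bb","cc"'
four  = '"aa","bb","cc","dd"'

puts quoted_fields_pattern(3).match?(three)
# true
puts quoted_fields_pattern(3).match?(four)
# false
puts quoted_fields_pattern(4).match?(four)
# true
```

The group is written as (?:...) because nothing needs to be captured here; the plain (...) from the pattern above would work just as well.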
If the regex becomes too unreadable, an alternative would be to try to parse your text as a JSON array:
require 'json'
three = '"aa","bb","cc"'
four = '"aa","bb","cc","dd"'
def has_n_strings?(text, n)
words = JSON.parse("[#{text}]")
words.all?(String) && words.size == n
end
puts has_n_strings?(three, 3)
# true
puts has_n_strings?(three, 4)
# false
puts has_n_strings?(four, 4)
# true
puts has_n_strings?(four, 3)
# false
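One caveat with the JSON approach: JSON.parse raises JSON::ParserError on malformed input instead of returning false. If you would rather treat such input as "no match", a possible variant looks like this (has_n_strings_safe? is a hypothetical name, not part of any library):

```ruby
require 'json'

# Hypothetical variant: returns false for malformed input instead of raising.
def has_n_strings_safe?(text, n)
  words = JSON.parse("[#{text}]")
  words.all?(String) && words.size == n
rescue JSON::ParserError
  false
end

puts has_n_strings_safe?('"aa","bb","cc"', 3)
# true
puts has_n_strings_safe?('"aa","bb', 2)
# false
```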