I'm working through the Strings chapter of R4DS and am confused by the following regular expression example:
x <- "1888 is the longest year in Roman numerals: MDCCCLXXXVIII"
str_view(x, "C?")
This code appears to return no match.
As I understand it, ? matches the preceding token 0 or 1 times, and repetition is "greedy", so it will match the longest string possible. So why isn't a single "C" matched?
Additionally, the code below matches the first "CC":
x <- "1888 is the longest year in Roman numerals: MDCCCLXXXVIII"
str_view(x, "CC?")
Thanks
I think it does return a match, but it's the empty string.
Explanation:
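The engine starts at the first character of the string ("1"), which does not match C. But C is optional, so the engine is satisfied with a zero-length match right there and stops. CC?, on the other hand, requires at least one C, so it can't match at the start of the string. The engine has to step through the string until it finds the first C, and will then match whether that C is followed by another C or not.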
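To see the empty match for yourself, here is a small sketch using str_extract() and str_detect() from stringr instead of str_view(). The variable names empty_match and first_match are mine, not from the book, and I'm assuming stringr is installed.

```r
library(stringr)

x <- "1888 is the longest year in Roman numerals: MDCCCLXXXVIII"

# "C?" succeeds immediately at the start of the string with zero characters,
# so the pattern is detected but the extracted match is the empty string.
str_detect(x, "C?")
empty_match <- str_extract(x, "C?")
nchar(empty_match)

# "CC?" needs at least one C, so the engine scans forward to the first C
# and then greedily takes the optional second C as well.
first_match <- str_extract(x, "CC?")
first_match
```

If the explanation above is right, str_detect() reports a match, nchar(empty_match) is 0, and first_match is "CC".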
Moral: Never construct a regex where all tokens are optional, allowing an empty match (unless you're planning to do exactly that).