I have an automatically generated regular expression, which basically is one big "or" group like so:
(\bthe\b|\bcat\b|\bin\b|\bhat\.\b|\bhat\b)
I've noticed that for the input
hat.
it matches only "hat", not "hat." as I want. Is there a way to make it more greedy?
UPDATE: forgot about word boundaries, sorry for that.
Put hat\. before hat in the regular expression. The first matching alternative in an alternation wins. Since hat already matches the start of "hat.", hat\. is never checked.
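To see the ordering effect in isolation, here is a minimal sketch in Python. The choice of Python's re module and the sample sentence are my assumptions, since the question does not say which regex engine is used. The word boundaries are left out, as in the original version of the question.

```python
import re

text = "the cat in the hat."  # sample input, assumed for illustration

# Shorter alternative listed first: "hat" succeeds, so "hat\." is never tried.
short_first = re.compile(r"(the|cat|in|hat|hat\.)")

# Longer alternative listed first: "hat\." is tried before "hat".
long_first = re.compile(r"(the|cat|in|hat\.|hat)")

short_matches = short_first.findall(text)
long_matches = long_first.findall(text)

print(short_matches[-1])  # last match with the shorter alternative first
print(long_matches[-1])   # last match with the longer alternative first
```

If the pattern is generated automatically, sorting the terms by length (longest first) before joining them with | should give the same effect.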
A better way would be to write that part as hat\.? rather than hat\.|hat. That makes the period optional, so you don't need two terms in the alternation.
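A short sketch of the optional-period form, again assuming Python's re module and a made-up input string. The ? quantifier is greedy, so the period is consumed whenever it is present.

```python
import re

# One term with an optional trailing period instead of two alternatives.
pattern = re.compile(r"(the|cat|in|hat\.?)")

matches = pattern.findall("the hat. the hat")  # sample input, assumed
print(matches)
```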
After your edit:
There is no word boundary between . and, say, a space (both are non-word characters). So \bhat\.\b is only going to match in things like hat.x, where a word character (a letter, digit, or underscore) immediately follows the period. This means that in, e.g., a sentence, \bhat\b will be the one that gets matched. I see you found a solution, however.
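The boundary behaviour can be checked with a small Python sketch (Python's re module and the test strings are assumptions on my part). The last pattern is only one possible workaround, not necessarily the solution the asker found: it requires the word boundary only when no period follows.

```python
import re

# Pattern from the question, reduced to the two "hat" alternatives.
original = re.compile(r"\bhat\.\b|\bhat\b")

# "." followed by a space: no boundary after the period, so only "hat" matches.
in_sentence = original.findall("the hat. is red")

# "." followed by a letter: a boundary exists, so "hat." matches.
before_letter = original.findall("hat.x")

# Possible workaround (hypothetical): accept either a period or a word boundary.
relaxed = re.compile(r"\bhat(?:\.|\b)")
relaxed_matches = relaxed.findall("the hat. is red, hats off, my hat")

print(in_sentence, before_letter, relaxed_matches)
```

Note that the relaxed pattern still skips "hats", because after "hat" there is neither a period nor a word boundary.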