Regular expression languages use \w to match A..Z, a..z, 0..9, and _, and \b is defined as a word boundary.
How can I write a regular expression that matches all valid Spanish words, including characters such as: á, í, ó, é, ñ, etc.?
I'm using .NET.
Use a Spanish locale and make your regex locale-sensitive.
Your regex system should have something equivalent to Python's re.L (aka re.LOCALE) to make a regex locale-dependent, so that what's a word-character and what isn't changes with locale, as do "word boundaries" etc. Are you instead asking for a way to compensate for some given regex system not supporting locale, trying to force the issue anyway...?
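To make the idea concrete, here is a minimal sketch in Python (not .NET), since Python's re module is the one referenced above. It rests on a few assumptions: in Python 3, re.LOCALE only applies to bytes patterns, so the sketch relies instead on str patterns being Unicode-aware by default, which already makes \w and \b treat á, é, í, ó, ú, ü and ñ as word characters. The names SPANISH_WORD and find_words are made up for illustration. As far as I know, .NET's \w is likewise Unicode-aware unless RegexOptions.ECMAScript is set, but check that against the .NET documentation.

```python
import re
import unicodedata

# Python 3 str patterns are Unicode-aware by default, so \w and \b already
# treat accented letters and ñ as word characters.
# [^\W\d_] narrows \w down to letters only (no digits, no underscore).
SPANISH_WORD = re.compile(r"\b[^\W\d_]+\b")


def find_words(text: str) -> list:
    """Return the runs of letters in text (hypothetical helper)."""
    # Normalize to NFC so that "n" + combining tilde becomes the single
    # letter "ñ"; a bare combining mark is not matched by \w.
    normalized = unicodedata.normalize("NFC", text)
    return SPANISH_WORD.findall(normalized)


words = find_words("El niño comió jamón y bebió agüita.")
```

Note that this matches letters from any script, not only Spanish ones. If only Spanish letters should be accepted, an explicit class such as [A-Za-zÁÉÍÓÚÜÑáéíóúüñ]+ is the stricter alternative.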