In JavaScript, it's easy to match letters and accents with this regex:
text.match(/[a-z\u00E0-\u00FC]+/i);
And only the lowercase letters and accents, without the i option:
text.match(/[a-z\u00E0-\u00FC]+/);
But what is the correct regular expression to match only capitalized letters and accents?
EDIT: as the answers below already mention, the regex above also matches some other signs, and misses some accented characters like ý and Ý, ć and Ć, and many others.
The range U+00C0 - U+00DC should be the uppercase equivalent of U+00E0 - U+00FC, so this should be what you are looking for:
text.match(/[A-Z\u00C0-\u00DC]+/);
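A quick check of that uppercase range, as a minimal sketch. The sample string and variable names are made up for illustration, and the accented characters are written as escapes so the code points are unambiguous:

```javascript
// Uppercase ASCII letters plus the Latin-1 range U+00C0 to U+00DC.
const upperLatin1 = /[A-Z\u00C0-\u00DC]+/g;

// Hypothetical sample text: "ÀÉ été ÜBER"
const sampleText = '\u00C0\u00C9 \u00E9t\u00E9 \u00DCBER';

// With the g flag, match() returns every run of matching characters.
const upperRuns = sampleText.match(upperLatin1);
console.log(upperRuns); // [ 'ÀÉ', 'ÜBER' ]
```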
A site like graphemica can help you to determine the ranges you need yourself.
EDIT: as the other answers already mention, this also matches some other signs.
Replace a-z with A-Z and \u00E0-\u00FC with \u00C0-\u00DC to match the same letters in uppercase as text.match(/[a-z\u00E0-\u00FC]+/); matches in lowercase.
However! This is not a proper implementation, neither for lowercase nor for uppercase letters. For example, your lowercase match includes ÷ (division sign), which is not a letter at all, and my uppercase version will match × (multiplication sign), which looks like an X but isn't actually a letter either.
In addition to that, you're missing characters like ý and Ý, ć and Ć, and many, many others.
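One way to see both problems, and a possible alternative, is sketched below. It is not part of the original answers: it assumes an engine with ES2018 Unicode property escapes (which need the u flag), where \p{Lu} matches any uppercase letter and \p{Ll} any lowercase letter. The variable names are only illustrative.

```javascript
// The fixed ranges discussed above, anchored so test() checks the whole string.
const upperRange = /^[A-Z\u00C0-\u00DC]+$/;
const lowerRange = /^[a-z\u00E0-\u00FC]+$/;

// Assumed alternative: Unicode property escapes (ES2018+, u flag required).
const upperLetters = /^\p{Lu}+$/u;
const lowerLetters = /^\p{Ll}+$/u;

// False positives and misses of the fixed ranges:
console.log(upperRange.test('\u00D7'));       // × multiplication sign: true
console.log(lowerRange.test('\u00F7'));       // ÷ division sign: true
console.log(upperRange.test('\u00DD\u0106')); // ÝĆ: false

// The property escapes accept only letters of the requested case:
console.log(upperLetters.test('\u00D7'));       // false
console.log(upperLetters.test('\u00DD\u0106')); // true
console.log(lowerLetters.test('\u00F7'));       // false
console.log(lowerLetters.test('\u00FD\u0107')); // ýć: true
```

Note that these classes match precomposed characters. If the input may contain a base letter followed by a combining accent, calling text.normalize('NFC') first is one option to consider.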