
JavaScript Regex for capitalized letters with accents

In JavaScript, it's easy to match letters, including accented ones, with this regex:

text.match(/[a-z\u00E0-\u00FC]+/i);

Without the i flag, it matches only the lowercase letters and accented characters:

text.match(/[a-z\u00E0-\u00FC]+/);

But what is the correct regular expression to match only capitalized letters and accents?

EDIT: as the answers below already mention, the regex above also matches some non-letter signs, and it misses some accented characters like ý and Ý, ć and Ć, and many others.

Etienne asked Apr 19 '15 14:04


2 Answers

The range U+00C0 - U+00DC should be the uppercase equivalent of U+00E0 - U+00FC.

So text.match(/[A-Z\u00C0-\u00DC]+/); should be what you are looking for.
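To see what that range actually picks up, here is a minimal sketch. The sample string is made up for illustration, and the accented characters are written as escapes so the result does not depend on how the file is encoded.

```javascript
// Uppercase range from the answer above, with the g flag to collect every run.
const upperLatin1 = /[A-Z\u00C0-\u00DC]+/g;

// Hypothetical sample text, reads as: "ÉCOLE élève ÀÉÎ abc XYZ"
const sample = "\u00C9COLE \u00E9l\u00E8ve \u00C0\u00C9\u00CE abc XYZ";

// match() with the g flag returns all matching runs (or null if there are none).
const upperRuns = sample.match(upperLatin1);

console.log(upperRuns); // [ 'ÉCOLE', 'ÀÉÎ', 'XYZ' ]
```

The lowercase words are skipped entirely, because é (U+00E9) and è (U+00E8) fall outside U+00C0 - U+00DC.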

A site like graphemica can help you to determine the ranges you need yourself.

EDIT: as the other answers already mention, this also matches some other signs.

t.niese answered Sep 30 '22 19:09


Replace a-z with A-Z and \u00E0-\u00FC with \u00C0-\u00DC to match the same letters in uppercase as text.match(/[a-z\u00E0-\u00FC]+/); matches in lowercase.

However!
This is not a proper implementation for either lowercase or uppercase letters. For example, your lowercase match includes ÷ (division sign, U+00F7), which is not a letter at all, and my uppercase range matches × (multiplication sign, U+00D7), which looks like an X but isn't actually a letter either.
In addition to that, you're missing characters like ý and Ý, ć and Ć and many, many others.
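The problems above can be checked directly. The sketch below also tries one possible alternative that is not part of the original answers: Unicode property escapes (\p{Lu} and \p{Ll} with the u flag), which assume an ES2018+ engine. The helper name upperRunsOf is hypothetical.

```javascript
// Ranges from the question and answers (Latin-1 only).
const rangeUpper = /[A-Z\u00C0-\u00DC]/;
const rangeLower = /[a-z\u00E0-\u00FC]/;

// Assumed alternative: Unicode general categories (requires the u flag).
const unicodeUpper = /\p{Lu}/u; // Lu = uppercase letter
const unicodeLower = /\p{Ll}/u; // Ll = lowercase letter

const checks = {
  rangeUpperMatchesTimes: rangeUpper.test("\u00D7"),     // × -> true (false positive)
  rangeLowerMatchesDivide: rangeLower.test("\u00F7"),    // ÷ -> true (false positive)
  rangeUpperMatchesYAcute: rangeUpper.test("\u00DD"),    // Ý -> false (missed)
  rangeUpperMatchesCAcute: rangeUpper.test("\u0106"),    // Ć -> false (missed)
  unicodeUpperMatchesTimes: unicodeUpper.test("\u00D7"), // × -> false
  unicodeUpperMatchesYAcute: unicodeUpper.test("\u00DD"),// Ý -> true
  unicodeUpperMatchesCAcute: unicodeUpper.test("\u0106"),// Ć -> true
  unicodeLowerMatchesDivide: unicodeLower.test("\u00F7"),// ÷ -> false
  unicodeLowerMatchesYAcute: unicodeLower.test("\u00FD"),// ý -> true
};
console.log(checks);

// Hypothetical helper: collect runs of uppercase letters.
// normalize("NFC") composes base letter + combining accent pairs first,
// since \p{Lu} does not match a standalone combining mark.
function upperRunsOf(text) {
  return text.normalize("NFC").match(/\p{Lu}+/gu) || [];
}

console.log(upperRunsOf("\u00DDL \u0106A abc \u00D7")); // [ 'ÝL', 'ĆA' ]
```

Whether this fits depends on the target environments: older engines without property-escape support will throw a syntax error on these patterns, so the plain ranges may still be the pragmatic choice there.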

Siguza answered Sep 30 '22 18:09