I want to validate a text input field in an HTML page so that it accepts only Cyrillic letters. I have written the validation code in JavaScript using a regular expression like this:
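// Read the text typed into the input, not the element itself
var namevalue = document.getElementById("name").value;
var letters = /^[А-Яа-я]+$/;
// RegExp.prototype.test checks a string against the pattern
// (Element.matches expects a CSS selector, not a regex)
if (letters.test(namevalue)) {
  alert("Accepted");
} else {
  alert("Enter only Cyrillic letters");
}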
This code works fine for all Cyrillic letters except Ё and ё.
The definition of a Cyrillic letter for this list is a character encoded in the Unicode standard that has a script property of 'Cyrillic' and the general category of 'Letter'. An overview of the distribution of Cyrillic letters in Unicode is given in Cyrillic script in Unicode.
If your regex flavor supports Unicode blocks, you can match Cyrillic characters with \p{IsCyrillic}, which matches a character from the Unicode block "Cyrillic" (U+0400–U+04FF). This thread explains that: stackoverflow.com/questions/7926514/…
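JavaScript does not accept the \p{IsCyrillic} block syntax, but engines that implement ES2018 support Unicode property escapes when the u flag is set. The following is a minimal sketch under that assumption; the name isCyrillicOnly is hypothetical and used only for illustration. Note that Script=Cyrillic covers every Cyrillic-script character (including non-Russian letters and a few non-letter signs), so it is broader than the Russian alphabet.

```javascript
// Requires an engine with ES2018 Unicode property escapes (note the "u" flag).
const cyrillicOnly = /^\p{Script=Cyrillic}+$/u;

// Hypothetical helper: true when the text is non-empty and every
// character belongs to the Cyrillic script.
function isCyrillicOnly(text) {
  return cyrillicOnly.test(text);
}

console.log(isCyrillicOnly("Ёлка"));   // Ё is part of the Cyrillic script
console.log(isCyrillicOnly("Privet")); // Latin letters are rejected
```

If only Russian letters should be accepted, an explicit character class such as [А-Яа-яЁё] is the narrower choice.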
Form a regular expression to remove duplicate words from sentences: regex = "\\b(\\w+)(?:\\W+\\1\\b)+"; The details of this regular expression can be understood as follows. "\\b": a word boundary. Boundaries are needed for special cases; for example, in "My thesis is great", the "is" inside "thesis" won't be treated as a duplicate of the following "is". "\\w+": one or more word characters, [a-zA-Z_0-9].
With regular expressions, you can match whole classes of characters. For example, [A-Z] matches any uppercase Latin letter for one character position, and [a-z] matches any lowercase Latin letter for one character position.
ё does not match because it lies outside the ranges А-Я and а-я. Those ranges cover the basic Russian alphabet, U+0410–U+042F for А-Я and U+0430–U+044F for а-я, but Ё (U+0401) and ё (U+0451) sit among the Cyrillic extensions on either side of it (U+0400–U+040F and U+0450–U+045F). The JavaScript regex engine compares the characters in a range by their character codes, not by alphabetical order, so ё is simply out of range.
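To see this for yourself, you can print the code points of the boundary letters. This is only an illustrative sketch; codePointHex is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: format a character's code point as U+XXXX.
function codePointHex(ch) {
  return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
}

// Range boundaries from the original pattern, plus Ё and ё.
const boundaries = ["А", "Я", "а", "я", "Ё", "ё"];
for (const ch of boundaries) {
  console.log(ch, codePointHex(ch));
}

// The original pattern: ё (U+0451) is above я (U+044F), Ё (U+0401) is below А (U+0410).
const basicOnly = /^[А-Яа-я]+$/;
console.log(basicOnly.test("елка")); // only letters inside the ranges
console.log(basicOnly.test("ёлка")); // contains ё, which is outside both ranges
```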
I presume you mean modern Russian, where ё is rare but still in wide use, so I suggest this solution:
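// Read the text typed into the input, not the element itself
var namevalue = document.getElementById("name").value;
// Please note that I added "ёЁ" to your pattern.
// It now matches all Russian Cyrillic letters, both lowercase and uppercase,
// plus ё and Ё.
var letters = /^[А-Яа-яёЁ]+$/;
// RegExp.prototype.test checks a string against the pattern
// (Element.matches expects a CSS selector, not a regex)
if (letters.test(namevalue)) {
  alert("Accepted");
} else {
  alert("Enter only Cyrillic letters");
}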
Unfortunately, the problem with А-Я and Ё is buried deep in the Unicode specification, and there is no plain and simple solution. For robust code, you always need to be prepared for such cases.