
JavaScript RegExp Cyrillic pattern

I know that this is a dumb question, but I spent two days googling without any result. What should the RegExp pattern be to allow my user to type only Cyrillic characters and spaces? Thanks in advance!

Emil Avramov asked Mar 30 '26 18:03



1 Answer

You cannot do this in JavaScript engines that predate ES2018, because their regexes do not provide even the most basic Level 1 Unicode support. On those engines you would have to switch languages to do this correctly.
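Engines that implement ES2018 Unicode property escapes do accept \p{Script=Cyrillic} when the u flag is set. A minimal sketch for such an engine follows. The function name isCyrillicAndSpaces is made up for illustration, and "spaces" is assumed to mean the plain U+0020 space only:

```javascript
// Assumption: the engine supports ES2018 Unicode property escapes (\p{...} with the u flag).
// Assumption: "spaces" means U+0020 only; widen the class if tabs etc. should pass.
const CYRILLIC_AND_SPACES = /^[\p{Script=Cyrillic} ]+$/u;

// Hypothetical helper: true when every character is Cyrillic-script or a space.
// The + quantifier means the empty string is rejected.
function isCyrillicAndSpaces(text) {
  return CYRILLIC_AND_SPACES.test(text);
}

console.log(isCyrillicAndSpaces("Привет мир"));
console.log(isCyrillicAndSpaces("Привет world"));
```

Note that combining marks such as U+0301 carry Script=Inherited rather than Script=Cyrillic, so accented input would be rejected by this sketch.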

You cannot use enumerated block ranges for this. That confuses blocks and scripts, which is deeply flawed. There are 150 code points that have the \p{Script=Cyrillic} property but which lack the \p{Block=Cyrillic} property. They are in different blocks. Watch:

$ unichars '\p{Script=Cyrillic}' '\P{Block=Cyrillic}' | wc -l
150

Furthermore, there are a couple of non-Cyrillic code points within the Cyrillic block.
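The block/script mismatch can be seen from JavaScript itself on an engine with ES2018 property escapes. This is only a sketch: the helper name classify is hypothetical, and the three sample code points were picked for illustration.

```javascript
// Block-based test: only the U+0400..U+04FF "Cyrillic" block.
const IN_CYRILLIC_BLOCK = /^[\u0400-\u04FF]$/u;
// Script-based test (assumes ES2018 Unicode property escapes are available).
const IN_CYRILLIC_SCRIPT = /^\p{Script=Cyrillic}$/u;

// Hypothetical helper: report how a single character fares under each test.
function classify(ch) {
  return {
    block: IN_CYRILLIC_BLOCK.test(ch),
    script: IN_CYRILLIC_SCRIPT.test(ch),
  };
}

// U+0416 CYRILLIC CAPITAL LETTER ZHE: in the block and in the script.
console.log(classify("\u0416"));
// U+0500 CYRILLIC CAPITAL LETTER KOMI DE: Cyrillic script, but in the Cyrillic Supplement block.
console.log(classify("\u0500"));
// U+0485 COMBINING CYRILLIC DASIA PNEUMATA: inside the block, but Script=Inherited.
console.log(classify("\u0485"));
```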

The best you could do is to enumerate all 404 Cyrillic code points as a character class, which may prove prohibitively large.

$ unichars '\p{Script=Cyrillic}'  | wc -l
404
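If you do go the enumeration route, merging consecutive code points into ranges keeps the class much shorter than listing each one. A rough sketch: the helper name toCharacterClass is hypothetical, the input array is a tiny illustrative sample rather than the real unichars output, and the \u{...} escapes it emits assume an engine that supports the u flag.

```javascript
// Hypothetical helper: compress a list of code points into a regex character class.
function toCharacterClass(codePoints) {
  // Deduplicate and sort numerically so that runs become adjacent.
  const sorted = [...new Set(codePoints)].sort((a, b) => a - b);
  const ranges = [];
  for (const cp of sorted) {
    const last = ranges[ranges.length - 1];
    if (last && cp === last[1] + 1) {
      last[1] = cp; // extend the current run
    } else {
      ranges.push([cp, cp]); // start a new run
    }
  }
  // Emit \u{HEX} escapes, which require the u flag on the final RegExp.
  const esc = (cp) => `\\u{${cp.toString(16).toUpperCase()}}`;
  const body = ranges
    .map(([lo, hi]) => (lo === hi ? esc(lo) : `${esc(lo)}-${esc(hi)}`))
    .join("");
  return `[${body}]`;
}

// Illustrative input only: a handful of code points, not the full Cyrillic list.
const sample = [0x0410, 0x0411, 0x0412, 0x0430, 0x0451];
const cls = toCharacterClass(sample);
const re = new RegExp(`^${cls}+$`, "u");
console.log(cls);
console.log(re.test("АБВ"));
```

In practice the array would be filled from the full list of Cyrillic code points for whichever Unicode version you target.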

You can use the unichars script to list them all out if you really want to. You might also want to grab the uniprops script while you’re there.

tchrist answered Apr 02 '26 08:04



