I recently wrote a regex for my PHP code that allows only letters (including special characters) and spaces, but now I'm having trouble converting it into a JavaScript-compatible regex. Here it is: /^[\s\p{L}]+$/u. The problem is the /u modifier at the end of the pattern, as JavaScript doesn't allow such a flag.
How can I rewrite this so that it works in JavaScript as well?
Is there a way to allow only Polish characters: Ł, Ą, Ś, Ć, ...?
Flag u enables the support of Unicode in regular expressions. That means two things: Characters of 4 bytes are handled correctly: as a single character, not two 2-byte characters. Unicode properties can be used in the search: \p{…} .
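The two effects described above can be checked with a short sketch. It assumes an engine with ES2018 support (needed for \p{…}); the variable names are illustrative only and do not come from the original post.

```javascript
// U+1F600 is stored as two UTF-16 code units (a surrogate pair).
const emoji = "\u{1F600}";

// Without the u flag, "." matches a single code unit, so the
// two-unit emoji is not matched as one character.
const withoutU = /^.$/.test(emoji);

// With the u flag, "." matches the whole code point.
const withU = /^.$/u.test(emoji);

// Unicode property escapes such as \p{L} (any letter) need the u flag.
const lettersAndSpaces = /^[\s\p{L}]+$/u;

console.log(withoutU);                                    // false
console.log(withU);                                       // true
console.log(lettersAndSpaces.test("Zażółć gęślą jaźń"));  // true
console.log(lettersAndSpaces.test("abc123"));             // false
```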
The "g" modifier specifies a global match. A global match finds all matches (compared to only the first).
The $ number language element includes the last substring matched by the number capturing group in the replacement string, where number is the index of the capturing group. For example, the replacement pattern $1 indicates that the matched substring is to be replaced by the first captured group.
Even though JavaScript operates on Unicode strings, it did not implement Unicode-aware character classes before ES2018 (which added \p{…} under the u flag), and it has no concept of POSIX character classes or Unicode blocks/sub-ranges.
The /u modifier is for Unicode support. Support for it was added to JavaScript in ES2015.
Read http://stackoverflow.com/questions/280712/javascript-unicode to learn more about Unicode in regex with JavaScript.
Ą \u0104
Ć \u0106
Ę \u0118
Ł \u0141
Ń \u0143
Ó \u00D3
Ś \u015A
Ź \u0179
Ż \u017B
ą \u0105
ć \u0107
ę \u0119
ł \u0142
ń \u0144
ó \u00F3
ś \u015B
ź \u017A
ż \u017C
All special Polish characters:
[\u0104\u0106\u0118\u0141\u0143\u00D3\u015A\u0179\u017B\u0105\u0107\u0119\u0142\u0144\u00F3\u015B\u017A\u017C]
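As a rough sketch of how the character class above could be used, the snippet below combines it with \s and the ASCII ranges a-z and A-Z so that whole strings can be validated. The name polishLettersOnly and the sample inputs are assumptions made for illustration.

```javascript
// ASCII letters, whitespace, and the 18 Polish-specific letters,
// written as \uXXXX escapes so the source file encoding does not matter.
const polishLettersOnly =
  /^[\sa-zA-Z\u0104\u0106\u0118\u0141\u0143\u00D3\u015A\u0179\u017B\u0105\u0107\u0119\u0142\u0144\u00F3\u015B\u017A\u017C]+$/;

console.log(polishLettersOnly.test("Zażółć gęślą jaźń")); // true
console.log(polishLettersOnly.test("Łódź 2024"));         // false (digits)
console.log(polishLettersOnly.test("Müller"));            // false (ü is not listed)
```

Because the pattern only uses \uXXXX escapes for code points in the Basic Multilingual Plane, it does not need the u flag.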
JavaScript doesn't have any notion of UTF-8 strings, so it's unlikely that you need the /u flag for the reason PHP does, where it marks the pattern and subject as UTF-8. (Your strings are probably already in the usual JavaScript form, one UTF-16 code unit per "character".)
The bigger problem is that JavaScript engines without ES2018 support don't understand \p{L}, nor any equivalent notation; their regexes have no awareness of Unicode character properties. (From ES2018 on, \p{L} is available, but only when the u flag is set.) See the answers to this StackOverflow question for some ways to approximate it.
Edited to add: If you only need to support Polish letters, then you can write /^[\sa-zA-ZĄĆĘŁŃÓŚŹŻąćęłńóśźż]+$/. The a-z and A-Z parts cover the ASCII letters, and then the remaining letters are listed out individually.
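One possible way to cover both old and new engines is to try the Unicode-aware pattern first and fall back to the Polish-only class above. This is only a sketch: buildLetterPattern is a hypothetical helper, not something from the original answers, and it assumes that an engine lacking the u flag or \p{…} throws when the pattern is constructed.

```javascript
// Hypothetical helper (not from the original answers).
function buildLetterPattern() {
  try {
    // Built from a string so that an unsupported flag or escape raises
    // a catchable SyntaxError instead of breaking the whole script.
    return new RegExp("^[\\s\\p{L}]+$", "u");
  } catch (err) {
    // Fallback: ASCII letters plus the Polish letters listed individually.
    return /^[\sa-zA-ZĄĆĘŁŃÓŚŹŻąćęłńóśźż]+$/;
  }
}

const lettersOnly = buildLetterPattern();

console.log(lettersOnly.test("Łukasz Żółć")); // true
console.log(lettersOnly.test("abc123"));      // false
```

Note that the two branches are not equivalent: the \p{L} pattern accepts letters from any script, while the fallback accepts only ASCII and Polish letters.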