I nedd to add a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ
x time but I find this very ugly. So I try \p{L}
but it does not working in JavaScript.
Any Idea ?
my actual regex : [a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ][a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ' ,"-]*[a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ'",]+
I want to have a thing like that : [\p{L}][\p{L}' ,"-]*[\p{L}'",]+
(or smaller than the actual expression)
RegExp.prototype.unicode The unicode property indicates whether or not the " u " flag is used with a regular expression. unicode is a read-only property of an individual regular expression instance.
With JavaScript regular expressions, it is also possible to use character classes and especially \w or \d to match letters or digits. However, such forms only match characters from the Latin script (in other words, a to z and A to Z for \w and 0 to 9 for \d ).
XRegExp brings support for Unicode properties to JavaScript. RegexBuddy’s regex engine is fully Unicode-based starting with version 2.0.0. RegexBuddy 1.x.x did not support Unicode at all.
For instance, unicode property escapes can be used to match emojis, punctuations, letters (even letters from specific languages or scripts), etc. Note: For Unicode property escapes to work, a regular expression must use the u flag which indicates a string must be considered as a series of Unicode code points. See also RegExp.prototype.unicode.
What you need to add is a subset of what you asked for. First you should define what set of characters you need. \pL
means every letter from every language.
It's kind of ugly but doesn't affect performance and rather the best solution to get around such kind of problems in JS. ECMA2018 has a support for \pL
but way far to be implemented by all major browsers.
If it's a personal taste, you could reduce this ugliness a bit:
var characterSet = 'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';
var re = new RegExp('[' + characterSet + ']' + '[' + characterSet + '\' ,"-]*' + '[' + characterSet + '\'",]+');
This update credits go to @Francesco:
var pCL = 'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';
var re = new RegExp(`[${pCL}][${pCL}' ,"-]*[${pCL}'",]+`);
console.log(re.source);
You have XRegExp addon to support unicode letter matcher:
var unicodeWord = XRegExp("^\\pL+$"); // L: Letter
Here you can see more example matching unicode in javascript
http://xregexp.com/plugins/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With