I need to match Unicode letters, similar to PCRE's \p{L}.
Now, since Dart's RegExp class is based on ECMAScript's, it sadly doesn't have the concept of \p{L}.
I'm looking into perhaps constructing a big character class that matches all Unicode letters, but I'm not sure where to start.
So, I want to match letters like:
foobar
מכון ראות
But the R symbol shouldn't be matched:
BlackBerry®
Neither should any ASCII control characters or punctuation marks, etc. Essentially every letter in every language Unicode supports, whether it's å, ä, φ or ת, they should match if they are actual letters.
To match a character that has special meaning in a regex, escape it with a backslash ( \ ). E.g., \. matches "." ; \+ matches "+" ; and \( matches "(" . To match a backslash itself, use \\ .
\u000d — Carriage return — \r. \u2028 — Line separator. \u2029 — Paragraph separator.
\\pL is a Unicode property shortcut. It can also be written as \p{L} or \p{Letter} . It matches any kind of letter from any language.
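To illustrate the equivalent spellings, here is a minimal Dart sketch. It assumes a Dart version whose RegExp accepts the unicode flag, and the helper name matchesLetter is made up for this example. As far as I know, the brace-less \pL form comes from other engines and is not accepted by ECMAScript-style regexes in Unicode mode, so only the braced forms are used here.

```dart
// Hypothetical helper: compiles `pattern` in Unicode mode and tests `input`.
bool matchesLetter(String pattern, String input) =>
    RegExp(pattern, unicode: true).hasMatch(input);

void main() {
  // Three spellings of the same "any letter" property.
  const forms = [r'\p{L}', r'\p{Letter}', r'\p{General_Category=Letter}'];
  for (final form in forms) {
    // "å" is a letter; "®" is a symbol, not a letter.
    print('$form: ${matchesLetter(form, "å")} ${matchesLetter(form, "®")}');
  }
}
```

Each line printed should end in "true false", since all three patterns name the same category.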
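As mentioned in other answers, JavaScript regexes historically had no support for Unicode character classes. ECMAScript 2018 added \p{...} property escapes, which only work when the u (unicode) flag is set.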
I know this is an old question, but RegExp now supports Unicode categories (since Dart 2.4), so you can do something like this:
RegExp alpha = RegExp(r'\p{Letter}', unicode: true);
print(alpha.hasMatch("f")); // true
print(alpha.hasMatch("ת")); // true
print(alpha.hasMatch("®")); // false
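Building on that, here is a rough sketch of how you might pull out runs of letters, or check that a whole string consists only of letters. The function names extractLetterRuns and isAllLetters are just illustrative, and the code assumes null-safe Dart (2.12 or later).

```dart
// One or more consecutive Unicode letters.
final RegExp letterRun = RegExp(r'\p{L}+', unicode: true);

// The whole string must be letters, with nothing else in it.
final RegExp onlyLetters = RegExp(r'^\p{L}+$', unicode: true);

// Illustrative helper: returns every run of letters found in `input`.
List<String> extractLetterRuns(String input) =>
    letterRun.allMatches(input).map((m) => m.group(0)!).toList();

// Illustrative helper: true only if `input` is non-empty and all letters.
bool isAllLetters(String input) => onlyLetters.hasMatch(input);

void main() {
  print(extractLetterRuns('BlackBerry®')); // [BlackBerry]
  print(extractLetterRuns('מכון ראות')); // [מכון, ראות]
  print(isAllLetters('foobar')); // true
  print(isAllLetters('BlackBerry®')); // false
}
```

Note that \p{L} only covers letters. Combining marks (category M) are a separate category, so if your text contains decomposed accents you may want something like [\p{L}\p{M}] instead.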