
Is it possible to construct a \p JavaScript regexp that matches Grapheme_Cluster_Break=Extend?


Unicode text segmentation requires access to the Grapheme_Cluster_Break property of characters, which JavaScript famously doesn't expose directly. I was hoping to use Unicode property escapes in a regexp to work around this, but it doesn't seem to be as simple as /\p{Grapheme_Cluster_Break=Extend}/u or something like that. You can write \p{Grapheme_Extend}, but that tests a different property.
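To make the problem concrete, here is a minimal sketch of what I tried. The helper name `compilesWithUFlag` is made up for illustration. The last pattern is only my guess at an approximation: it assumes, from my reading of UAX #29, that Grapheme_Cluster_Break=Extend covers Grapheme_Extend=Yes plus Emoji_Modifier=Yes, and I have not checked it against every Unicode version.

```javascript
// Hypothetical helper: reports whether the engine accepts a pattern
// source when compiled with the `u` flag.
function compilesWithUFlag(source) {
  try {
    new RegExp(source, "u");
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Rejected: Grapheme_Cluster_Break is not among the property names
// that ECMAScript allows inside \p{...}.
const gcbExtendCompiles = compilesWithUFlag("\\p{Grapheme_Cluster_Break=Extend}");

// Accepted: Grapheme_Extend is a supported binary property.
const graphemeExtendCompiles = compilesWithUFlag("\\p{Grapheme_Extend}");

// The two properties differ. An emoji skin-tone modifier such as U+1F3FB
// is not Grapheme_Extend, although (assumption, per UAX #29) its
// Grapheme_Cluster_Break value is Extend.
const graphemeExtend = /\p{Grapheme_Extend}/u;

// Assumed approximation of Grapheme_Cluster_Break=Extend, not a verified
// equivalent.
const approxGcbExtend = /[\p{Grapheme_Extend}\p{Emoji_Modifier}]/u;

console.log(gcbExtendCompiles, graphemeExtendCompiles);
console.log(graphemeExtend.test("\u0301"), graphemeExtend.test("\u{1F3FB}"));
console.log(approxGcbExtend.test("\u{1F3FB}"));
```

Even if a union like this holds up for Extend, it does not obviously generalize to the other Grapheme_Cluster_Break values, which is what I am really asking about.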

Is there a way to trick JavaScript runtimes into giving me information about characters' Grapheme_Cluster_Break value through property escapes? (And if not, why not?)