
JavaScript implementation of UAX #29 Unicode Text Segmentation? [closed]

Is anyone aware of any JavaScript implementations of UAX #29, Unicode Text Segmentation? I'm specifically interested in Word Boundaries.

I was hopeful when I came across XRegExp, but it seems to rely on the standard JavaScript behaviour of \b rather than the UAX #29 word boundary rules.

Paul Butcher asked May 05 '14 10:05


1 Answer

https://github.com/orling/grapheme-splitter is a pure JavaScript implementation of the UAX #29 Grapheme Cluster Boundaries rules.

There is also an ECMAScript proposal for Intl.Segmenter, which builds on UAX #29 and covers word boundaries as well; see https://github.com/tc39/proposal-intl-segmenter.
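
As a rough sketch of how the proposal's API could be used for word boundaries, assuming a runtime that already ships Intl.Segmenter (recent Node.js versions and current browsers do; older ones will need a polyfill). The helper names splitWords and splitGraphemes are illustrative only and are not part of any library.

```javascript
// Minimal sketch, assuming Intl.Segmenter is available in the runtime.
// splitWords and splitGraphemes are hypothetical helper names.

// Return only the word-like segments of `text` (punctuation and
// whitespace segments are skipped via the isWordLike flag).
function splitWords(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const words = [];
  for (const { segment, isWordLike } of segmenter.segment(text)) {
    if (isWordLike) {
      words.push(segment);
    }
  }
  return words;
}

// Return the grapheme clusters (user-perceived characters) of `text`.
function splitGraphemes(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

console.log(splitWords("Hello, world!"));
console.log(splitGraphemes("e\u0301a")); // "e" + combining acute accent, then "a"
```

Each segment object also carries an index property, so the boundary offsets themselves can be recovered if the positions matter more than the words.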

Junliang Huang answered Sep 24 '22 15:09