I accidentally answered a question where the original problem involved splitting sentence to separate words.
And the author suggested to use BreakIterator
to tokenize input strings and some people liked this idea.
I just don't get that madness: how 25 lines of complicated code can be better than a simple one-liner with regexp?
Please, explain me the pros of using BreakIterator and the real cases when it should be used.
If it's really so cool and proper then I wonder: do you really use the approach with BreakIterator
in your projects?
From looking at the code posted at that answer, it looks like BreakIterator
takes into consideration the language and locale of the text. Getting that level of support via regex will surely be a considerable pain. Perhaps that is the main reason it is preferred over a simple regex?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With