When using the whitespace tokenizer, a text like "there, he is." is split into "there,", "he" and "is.". Naturally, I want to remove the punctuation that the standard tokenizer would have removed automatically.
My questions are:
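You can use a char filter to remove the ",". Char filters run before the tokenizer, so the punctuation is already gone when the whitespace tokenizer splits the text.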
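As a rough sketch of that idea, the index settings below pair a `pattern_replace` char filter with the `whitespace` tokenizer. The names `strip_punctuation` and `whitespace_no_punct`, and the exact set of characters in the pattern, are assumptions chosen for illustration and are not from the original question. The `simulate` helper is only a local approximation in plain Python, not a call to Elasticsearch, so check the real behaviour with the `_analyze` API.

```python
import json
import re

# Index settings sketch. "strip_punctuation" and "whitespace_no_punct" are
# hypothetical names; the punctuation set in "pattern" is also an assumption.
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "char_filter": {
                "strip_punctuation": {
                    "type": "pattern_replace",
                    "pattern": "[.,!?;:]",
                    "replacement": "",
                }
            },
            "analyzer": {
                "whitespace_no_punct": {
                    "type": "custom",
                    "char_filter": ["strip_punctuation"],
                    "tokenizer": "whitespace",
                }
            },
        }
    }
}


def simulate(text):
    """Approximate the analyzer locally: apply the char filter, then split on whitespace."""
    char_filter = INDEX_SETTINGS["settings"]["analysis"]["char_filter"]["strip_punctuation"]
    cleaned = re.sub(char_filter["pattern"], char_filter["replacement"], text)
    return cleaned.split()


if __name__ == "__main__":
    # The JSON printed here is what would go in the body of the index creation request.
    print(json.dumps(INDEX_SETTINGS, indent=2))
    print(simulate("there, he is."))
```

With this pattern the example text should come out as "there", "he" and "is". Note that a blanket pattern like this also strips punctuation inside tokens (for example the dot in "3.14"), so narrow the pattern if that matters for your data.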