How can I tokenize a string inside a Java class using the Stanford Parser?
I can only find examples of DocumentPreprocessor and PTBTokenizer that read their text from an external file:
// option #1: By sentence
DocumentPreprocessor dp = new DocumentPreprocessor("hello.txt");
for (List<HasWord> sentence : dp) {
    System.out.println(sentence);
}
// option #2: By token
PTBTokenizer<CoreLabel> ptbt = new PTBTokenizer<>(new FileReader("hello.txt"),
        new CoreLabelTokenFactory(), "");
while (ptbt.hasNext()) {
    CoreLabel label = ptbt.next();
    System.out.println(label);
}
Thanks.
The PTBTokenizer constructor takes a java.io.Reader, so you can wrap your string in a java.io.StringReader and pass that in place of the FileReader.
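As a minimal sketch of that suggestion, the snippet below feeds an in-memory string to PTBTokenizer through a StringReader. It assumes the Stanford Parser (or CoreNLP) jar is on the classpath. The class name StringTokenizerSketch, the helper method tokenize, and the sample sentence are illustrative and do not come from the original post.

```java
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class StringTokenizerSketch {

    // Hypothetical helper: tokenizes a string held in memory rather than a file.
    static List<String> tokenize(String text) {
        // StringReader is a java.io.Reader, so it can stand in for the FileReader.
        PTBTokenizer<CoreLabel> ptbt = new PTBTokenizer<>(
                new StringReader(text), new CoreLabelTokenFactory(), "");
        List<String> tokens = new ArrayList<>();
        while (ptbt.hasNext()) {
            CoreLabel label = ptbt.next();
            tokens.add(label.word()); // surface form of the token
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print one token per line for a sample sentence.
        for (String token : tokenize("Hello, world! This is a test.")) {
            System.out.println(token);
        }
    }
}
```

The same three-argument constructor is used as in the file-based example; only the Reader changes, so the tokenizer options string (left empty here) behaves the same way.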