The OpenNLP documentation says that a model should be trained on about 15,000 lines for good performance. I need to extract several different entity types from my documents, which means hand-tagging many tokens across those 15,000 lines of training data, and that will take a lot of time. Is there any other way to do this that would reduce the time, or another method I could use instead?
Thanks.
Here are some tools:
GATE http://gate.ac.uk/
GATE Teamware (web-based) http://gate.ac.uk/teamware/
XConc Suite http://www-tsujii.is.s.u-tokyo.a...
Sapient (sentence-based) http://www.aber.ac.uk/en/cs/rese...
Knowtator (Protégé plug-in) http://knowtator.sourceforge.net/
CorpusTool http://www.wagsoft.com/CorpusToo...
UIMA CAS Editor http://uima.apache.org/
Callisto http://callisto.mitre.org/
Wordfreak http://wordfreak.sourceforge.net/
MMax2 http://mmax2.sourceforge.net/
Reference: https://www.quora.com/Natural-Language-Processing-What-are-the-best-tools-for-manually-annotating-a-text-corpus-with-entities-and-relationships
This one is also worth trying:
brat rapid annotation tool
I've used it myself and recommend it.
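Whichever tool you annotate with, the result still has to end up in the OpenNLP name finder training format, where each entity is wrapped as `<START:type> ... <END>` with whitespace-separated tokens. Below is a minimal sketch of that conversion for brat's standoff `.ann` files. The function names `parse_ann` and `to_opennlp` are my own (hypothetical, not part of brat or OpenNLP), and it assumes one sentence per text, non-overlapping entities, and text that is already roughly tokenized.

```python
def parse_ann(ann_text):
    """Read brat text-bound annotations ("T" lines) into sorted (start, end, label) spans."""
    spans = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, events, notes, etc.
        _ann_id, info, _surface = line.split("\t", 2)
        if ";" in info:
            continue  # assumption: discontinuous spans are simply skipped
        label, start, end = info.split(" ")
        spans.append((int(start), int(end), label))
    return sorted(spans)


def to_opennlp(sentence, spans):
    """Wrap each character span in <START:label> ... <END> markers."""
    pieces = []
    pos = 0
    for start, end, label in spans:
        pieces.append(sentence[pos:start])
        # Pad with spaces so the markers are always separate tokens.
        pieces.append(f" <START:{label}> {sentence[start:end]} <END> ")
        pos = end
    pieces.append(sentence[pos:])
    # Collapse the extra whitespace introduced by the padding.
    return " ".join("".join(pieces).split())


# Hypothetical example data, not taken from a real corpus.
sentence = "Pierre Vinken joined Sony in 2010."
ann = "T1\tPerson 0 13\tPierre Vinken\nT2\tOrganization 21 25\tSony"

print(to_opennlp(sentence, parse_ann(ann)))
# <START:Person> Pierre Vinken <END> joined <START:Organization> Sony <END> in 2010.
```

Each printed line can go straight into the training file, one sentence per line. For a multi-sentence document you would have to split the text and shift the offsets first, which this sketch does not do.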