
Are Stanford CoreNLP Annotators thread-safe?

The Stanford CoreNLP website (http://nlp.stanford.edu/software/corenlp.shtml) lists dozens of Annotators, which work like a charm. I would like to share instances of the Annotators for common tasks (lemmatization, tagging, parsing) across multiple threads, for example to split the processing of a very large corpus (GBs of text) across threads, or to provide web services.

There has been some discussion in the past about using ThreadLocal, which, as I understand it, gives each thread its own Annotator instance (thus avoiding thread-safety problems). This is an option, but it means all model files and resources have to be loaded n times as well.

Are the Annotators (or some of them) thread-safe? I couldn't find anything conclusive or official in the discussions, docs, or FAQs.

Rüdiger asked May 05 '15 07:05


1 Answer

Yes, the annotators are intended to be thread-safe. You can create a new AnnotationPipeline (e.g., a new StanfordCoreNLP object), and then many threads can pass annotations into this pipeline without reloading the model for each thread.
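
To make the pattern concrete, here is a minimal sketch of sharing one pipeline object across a thread pool. It uses only the JDK so that it runs on its own: the `Pipeline` class, the `LOAD_COUNT` counter, and `annotateAll` are hypothetical stand-ins, not CoreNLP API. With CoreNLP itself, the shared object would be a `StanfordCoreNLP` built once from a `Properties` object, and each task would create its own `Annotation` for its document and pass it to `pipeline.annotate(...)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SharedPipelineDemo {

    // Counts how often the (pretend) expensive model load happens.
    static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    // Hypothetical stand-in for a CoreNLP pipeline. It holds no mutable
    // state after construction, so concurrent calls to annotate() are safe.
    static final class Pipeline {
        Pipeline() {
            LOAD_COUNT.incrementAndGet(); // models would be loaded here, once
        }

        String annotate(String text) {
            int tokens = text.trim().split("\\s+").length;
            return text + " -> " + tokens + " tokens";
        }
    }

    // Submits one task per document; every task uses the same Pipeline instance.
    // Results are returned in the same order as the input documents.
    static List<String> annotateAll(Pipeline shared, List<String> docs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                Callable<String> task = () -> shared.annotate(doc);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Pipeline pipeline = new Pipeline(); // built once, before any worker starts
        List<String> docs = Arrays.asList("one two", "three", "four five six");
        for (String line : annotateAll(pipeline, docs, 4)) {
            System.out.println(line);
        }
        System.out.println("pipeline loads: " + LOAD_COUNT.get());
    }
}
```

The point of the design is that the expensive construction happens once, outside the worker threads, while each unit of work carries its own per-document data. This avoids the n-fold model loading that a per-thread instance would cause.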

Gabor Angeli answered Sep 20 '22 03:09