
Error in creating the StanfordCoreNLP object

I have downloaded and installed the required jar files from http://nlp.stanford.edu/software/corenlp.shtml#Download.

I have included these five jar files:

stanford-postagger.jar

stanford-postagger-3.3.1.jar

stanford-postagger-3.3.1.jar-javadoc.jar

stanford-postagger-3.3.1.jar-src.jar

stanford-corenlp-3.3.1.jar

and the code is

import java.util.Properties;

import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class lemmafirst {

    protected StanfordCoreNLP pipeline;

    public lemmafirst() {
        // Create StanfordCoreNLP object properties, with POS tagging
        // (required for lemmatization), and lemmatization
        Properties props;
        props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma");

        /*
         * This is a pipeline that takes in a string and returns various analyzed linguistic forms. 
         * The String is tokenized via a tokenizer (such as PTBTokenizerAnnotator), 
         * and then other sequence model style annotation can be used to add things like lemmas, 
         * POS tags, and named entities. These are returned as a list of CoreLabels. 
         * Other analysis components build and store parse trees, dependency graphs, etc. 
         * 
         * This class is designed to apply multiple Annotators to an Annotation. 
         * The idea is that you first build up the pipeline by adding Annotators, 
         * and then you take the objects you wish to annotate and pass them in and 
         * get in return a fully annotated object.
         * 
         *  StanfordCoreNLP loads a lot of models, so you probably
         *  only want to do this once per execution
         */
        // The exception below is thrown on this line.
        this.pipeline = new StanfordCoreNLP(props);
    }

    // Remainder of the class (including main) omitted.
}

My problem is in creating the pipeline.

The error that I get is:

Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:563)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:262)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
    at lemmafirst.<init>(lemmafirst.java:39)
    at lemmafirst.main(lemmafirst.java:83)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:758)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:88)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:76)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:561)
    ... 6 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:434)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:753)
    ... 11 more

Can anyone please correct the errors? Thank you

Lohath Unique asked Mar 05 '14 18:03


2 Answers

The exception is thrown because the POS tagger model is missing from the classpath (see the last "Caused by" entry in the stack trace). This happens because the Stanford packages can be downloaded in versions with and without the model files.
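To see why the innermost "Caused by" is the line that matters, here is a minimal sketch that walks an exception's cause chain down to the root. The class name `RootCause` and the stand-in exceptions are hypothetical and only mimic the nesting in the question's stack trace. No CoreNLP classes are used.

```java
import java.io.IOException;

/** Hypothetical helper: finds the innermost exception in a cause chain. */
class RootCause {

    /** Follows getCause() until there is no further cause and returns that exception. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in exceptions nested like the trace in the question:
        // RuntimeException -> RuntimeException -> IOException.
        Throwable nested = new RuntimeException(
            "wrapper",
            new RuntimeException(
                "Unrecoverable error while loading a tagger model",
                new IOException("Unable to resolve model as either class path, filename or URL")));

        Throwable root = of(nested);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In the asker's program, the same helper could be called in a `catch (RuntimeException e)` block around the `new StanfordCoreNLP(props)` call to print only the "Unable to resolve ..." message.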

Either add stanford-postagger-full-3.3.1.jar, which can be found on the following page (in stanford-postagger-full-2014-01-04.zip): http://nlp.stanford.edu/software/tagger.shtml .

Or do the same for the whole CoreNLP package (stanford-corenlp-full....jar): http://nlp.stanford.edu/software/corenlp.shtml (You can then drop all the postagger dependencies too, since they are included in CoreNLP.)

In case you only want to add the model files, look at Maven Central and download "stanford-corenlp-3.3.1-models.jar".
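Whichever jar you add, you can check that the model is actually visible before building the pipeline. The following is a minimal sketch using only the standard library. The class name `ModelCheck` is hypothetical, and the resource path is copied from the "Unable to resolve" message in the question.

```java
import java.net.URL;

/** Hypothetical helper: checks whether a resource is visible on the classpath. */
class ModelCheck {

    // Resource path taken verbatim from the IOException message in the question.
    static final String TAGGER_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    /** Returns the URL of the resource on the classpath, or null if it is absent. */
    static URL locate(String resourcePath) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourcePath);
    }

    public static void main(String[] args) {
        URL url = locate(TAGGER_MODEL);
        if (url == null) {
            System.out.println("Tagger model NOT found on the classpath; add a jar that contains the models.");
        } else {
            System.out.println("Tagger model found at: " + url);
        }
    }
}
```

Run it with the same classpath as your application. If it reports that the model is not found, the pipeline constructor will most likely fail in the same way as in the question.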

Christopher Schröder answered Oct 03 '22 06:10


An easier way to add those model files is to add the following dependencies to your pom.xml and let Maven manage them for you:

<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
</dependency>
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
  <classifier>models</classifier> <!--  will get the dependent model jars -->
</dependency>

Sruthi Poddutur answered Oct 03 '22 07:10