
Using Stanford Parser (CoreNLP) to find phrase heads

I am going to use Stanford CoreNLP (the 2013 release) to find phrase heads. I saw this thread.

However, the answer was not clear to me, and I couldn't add a comment to continue that thread, so I apologize for the duplicate.

What I have at the moment is the parse tree of a sentence, produced by Stanford CoreNLP (I also tried the CoNLL format that Stanford CoreNLP can output). What I need is exactly the heads of the noun phrases.

I don't know how to use the dependencies and the parse tree to extract the heads of noun phrases. What I know is that if I have nsubj(x, y), y is the head of the subject. If I have dobj(x, y), y is the head of the direct object. If I have iobj(x, y), y is the head of the indirect object.

However, I am not sure whether this is the correct way to find all phrase heads. If it is, which rules should I add to get the heads of all noun phrases?
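For illustration, the dependency-based rule described above can be sketched roughly as follows. This is a minimal sketch that assumes the typed dependencies are available as plain strings of the form rel(governor-i, dependent-j); the class name DependencyHeads and the method npHeads are hypothetical and not part of CoreNLP. It only covers the three relations named above, so noun phrases in other positions (for example inside prepositional phrases) are missed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DependencyHeads {
    // Relations whose dependent is the head word of a noun-phrase argument.
    // Only the three relations from the question; this set is not exhaustive.
    private static final Set<String> NP_RELATIONS =
        new HashSet<>(Arrays.asList("nsubj", "dobj", "iobj"));

    // Matches lines such as "nsubj(chased-3, dog-2)" (assumed input format).
    private static final Pattern DEP =
        Pattern.compile("(\\w+)\\((\\S+)-(\\d+), (\\S+)-(\\d+)\\)");

    // Returns the dependent word y of every rel(x, y) whose relation is listed above.
    public static List<String> npHeads(List<String> dependencies) {
        List<String> heads = new ArrayList<>();
        for (String line : dependencies) {
            Matcher m = DEP.matcher(line.trim());
            if (m.matches() && NP_RELATIONS.contains(m.group(1))) {
                heads.add(m.group(4)); // group 4 is the dependent word
            }
        }
        return heads;
    }

    public static void main(String[] args) {
        List<String> deps = Arrays.asList(
            "det(dog-2, The-1)",
            "nsubj(chased-3, dog-2)",
            "root(ROOT-0, chased-3)",
            "det(cat-5, the-4)",
            "dobj(chased-3, cat-5)");
        System.out.println(npHeads(deps));
    }
}
```

Lines that do not match the assumed format, or whose relation is not in the set, are skipped silently.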

It may be worth adding that I need to extract the heads of noun phrases in Java code.

Alice1989 asked Oct 17 '13 16:10



2 Answers

Since I couldn't comment on the answer given by Chaitanya, I am adding more to his answer here.

The Stanford CoreNLP suite implements the Collins head-finding heuristics and a semantic head-finding heuristic in the following classes:

  1. CollinsHeadFinder
  2. ModCollinsHeadFinder
  3. SemanticHeadFinder

All you need to do is instantiate one of the three and do the following:

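// sentence is a CoreMap for one parsed sentence; headFinder is e.g. new CollinsHeadFinder()
Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
// determineHead returns the head child of the given tree node; out is a PrintWriter
headFinder.determineHead(tree).pennPrint(out);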

You can iterate through the nodes of the tree and determine head words wherever required.
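To make the idea behind these head finders concrete, here is a heavily simplified sketch of rule-based head finding for noun phrases. It does not use CoreNLP: the Node class and the methods headChild and headWord are hypothetical, and the rule (rightmost noun tag, else leftmost NP, else last child) is only loosely modeled on Collins-style NP rules, which are considerably more detailed.

```java
import java.util.Arrays;
import java.util.List;

public class SimpleNpHead {
    // Minimal stand-in for a parse-tree node (hypothetical, not CoreNLP's Tree).
    public static class Node {
        final String label;
        final List<Node> children;

        public Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() { return children.isEmpty(); }

        // A POS tag node that dominates exactly one word.
        boolean isPreTerminal() {
            return children.size() == 1 && children.get(0).isLeaf();
        }
    }

    private static final List<String> NOUN_TAGS =
        Arrays.asList("NN", "NNS", "NNP", "NNPS");

    // Simplified rule: rightmost noun tag, else leftmost NP, else last child.
    public static Node headChild(Node node) {
        for (int i = node.children.size() - 1; i >= 0; i--) {
            Node child = node.children.get(i);
            if (NOUN_TAGS.contains(child.label)) {
                return child;
            }
        }
        for (Node child : node.children) {
            if (child.label.equals("NP")) {
                return child;
            }
        }
        return node.children.get(node.children.size() - 1);
    }

    // Follow head children downward until a word (leaf) is reached.
    public static String headWord(Node node) {
        while (!node.isLeaf()) {
            node = node.isPreTerminal() ? node.children.get(0) : headChild(node);
        }
        return node.label;
    }

    public static void main(String[] args) {
        // (NP (DT the) (JJ quick) (NN fox))
        Node np = new Node("NP",
            new Node("DT", new Node("the")),
            new Node("JJ", new Node("quick")),
            new Node("NN", new Node("fox")));
        System.out.println(headWord(np));
    }
}
```

In practice the CoreNLP classes listed above should be preferred, since they encode rules for every phrase category rather than this single NP shortcut.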

PS: My answer is based on the Stanford CoreNLP suite as released on 2014-01-04.

Here is a simple depth-first search (DFS) that prints the head word of every noun phrase in a sentence:

import java.util.List;
import edu.stanford.nlp.trees.HeadFinder;
import edu.stanford.nlp.trees.Tree;

// Depth-first traversal that prints each noun phrase and its head word.
public static void dfs(Tree node, Tree parent, HeadFinder headFinder) {
    if (node == null || node.isLeaf()) {
        return;
    }
    // For an NP node, collect its terminal nodes to get the words of the phrase.
    if ("NP".equals(node.value())) {
        System.out.println(" Noun Phrase is ");
        List<Tree> leaves = node.getLeaves();
        for (Tree leaf : leaves) {
            System.out.print(leaf.toString() + " ");
        }
        System.out.println();

        System.out.println(" Head string is ");
        System.out.println(node.headTerminal(headFinder, parent));
    }
    // Recurse into the children, passing the current node as their parent.
    for (Tree child : node.children()) {
        dfs(child, node, headFinder);
    }
}
TheGT answered Sep 20 '22 13:09


You could extract the phrase of interest as an object of the class Tree. You can then call the determineHead(Tree t) method of any class that implements the HeadFinder interface.

Chaitanya Shivade answered Sep 21 '22 13:09