
Extracting noun phrases from a text file using the Stanford typed parser

I have a text from which I want to extract the noun phrases. I can easily get the typed parser output for the text, but I am wondering how I can extract the noun phrases from it.

S Gaber Avatar asked Jun 11 '12 04:06


2 Answers

You can extract noun phrases from a Tree using the following code. It assumes the parsed sentence is stored in parse (that is, parse is the Tree returned by the apply method of the LexicalizedParser class).

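import java.util.ArrayList;
import java.util.List;
import edu.stanford.nlp.trees.Tree;

// Collects every subtree labeled NP, including NPs nested inside other NPs.
public static List<Tree> getNounPhrases(Tree parse)
{
    List<Tree> phraseList = new ArrayList<Tree>();
    // Iterating over a Tree visits the tree itself and all of its subtrees.
    for (Tree subtree : parse)
    {
        if (subtree.label().value().equals("NP"))
        {
            phraseList.add(subtree);
            System.out.println(subtree);
        }
    }
    return phraseList;
}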
alan turing Avatar answered Nov 07 '22 09:11


Try this link as well. I am not sure whether the Stanford POS tagger and the tagger available in CoreNLP are the same, but I found this link more useful.

After POS tagging, you will have to detect patterns like this: (Adjective | Noun)* (Noun Preposition)? (Adjective | Noun)* Noun

Try this link for some details on Noun phrase detection.
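The tag pattern in this answer can be sketched with a plain regular expression, without depending on any particular tagger. This is a minimal illustration, not part of the Stanford API: the class name NounPhrasePattern, the helpers classify and extract, and the one-letter encoding (A for adjective, N for noun, P for preposition, O for anything else) are all hypothetical. It assumes the words already come with Penn Treebank tags such as JJ, NN, NNS and IN.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds noun phrases in an already POS-tagged sentence.
final class NounPhrasePattern {
    // (Adjective | Noun)* (Noun Preposition)? (Adjective | Noun)* Noun,
    // written over one letter per token.
    private static final Pattern NP = Pattern.compile("[AN]*(?:NP)?[AN]*N");

    // Maps a Penn Treebank tag to a one-letter class (assumed encoding).
    static char classify(String tag) {
        if (tag.startsWith("JJ")) return 'A';   // JJ, JJR, JJS
        if (tag.startsWith("NN")) return 'N';   // NN, NNS, NNP, NNPS
        if (tag.equals("IN")) return 'P';       // IN also covers subordinating conjunctions
        return 'O';
    }

    // words[i] must carry the tag tags[i]; returns the matched phrases in order.
    static List<String> extract(String[] words, String[] tags) {
        StringBuilder classes = new StringBuilder();
        for (String tag : tags) {
            classes.append(classify(tag));
        }
        List<String> phrases = new ArrayList<>();
        Matcher m = NP.matcher(classes);
        while (m.find()) {
            // Character offsets equal token indices because each token is one letter.
            phrases.add(String.join(" ", Arrays.copyOfRange(words, m.start(), m.end())));
        }
        return phrases;
    }

    public static void main(String[] args) {
        String[] words = {"The", "large", "number", "of", "small", "dogs", "barked"};
        String[] tags  = {"DT",  "JJ",    "NN",     "IN", "JJ",    "NNS",  "VBD"};
        System.out.println(extract(words, tags)); // prints [large number of small dogs]
    }
}
```

Encoding each tag as a single letter keeps the regular expression identical in shape to the pattern above. Because the pattern has no slot for determiners, "The" is left out of the match; extend the pattern if you need them.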

MARK Avatar answered Nov 07 '22 10:11