Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to extract the noun phrases using Open nlp's chunking parser

I am newbie to Natural Language processing.I need to extract the noun phrases from the text.So far i have used open nlp's chunking parser for parsing my text to get the Tree structure.But i am not able to extract the noun phrases from the tree structure, is there any regular expression pattern in open nlp so that i can use it to extract the noun phrases.

Below is the code that i am using

    InputStream is = new FileInputStream("en-parser-chunking.bin");
    ParserModel model = new ParserModel(is);
    Parser parser = ParserFactory.create(model);
    Parse topParses[] = ParserTool.parseLine(line, parser, 1);
        for (Parse p : topParses){
                 p.show();}

Here I am getting the output as

(TOP (S (S (ADJP (JJ welcome) (PP (TO to) (NP (NNP Big) (NNP Data.))))) (S (NP (PRP We)) (VP (VP (VBP are) (VP (VBG working) (PP (IN on) (NP (NNP Natural) (NNP Language) (NNP Processing.can))))) (NP (DT some) (CD one) (NN help)) (NP (PRP us)) (PP (IN in) (S (VP (VBG extracting) (NP (DT the) (NN noun) (NNS phrases)) (PP (IN from) (NP (DT the) (NN tree) (WP stucture.))))))))))

Can some one please help me in getting the noun phrases like NP,NNP,NN etc.Can some one tell me do I need to use any other NP Chunker to get the noun phrases?Is there any regex pattern to achieve the same.

Please help me on this.

Thanks in advance

Gouse.

like image 805
user2024234 Avatar asked Feb 05 '13 12:02

user2024234


People also ask

What is noun phrase chunking in NLP?

Chunking is a process of extracting phrases from unstructured text, which means analyzing a sentence to identify the constituents(Noun Groups, Verbs, verb groups, etc.) However, it does not specify their internal structure, nor their role in the main sentence.

What is Ne chunking?

Chunking is defined as the process of natural language processing used to identify parts of speech and short phrases present in a given sentence.


1 Answers

The Parse object is a tree; you can use getParent() and getChildren() and getType() to navigate the tree.

List<Parse> nounPhrases;

public void getNounPhrases(Parse p) {
    if (p.getType().equals("NP")) {
         nounPhrases.add(p);
    }
    for (Parse child : p.getChildren()) {
         getNounPhrases(child);
    }
}
like image 95
icecream Avatar answered Sep 25 '22 13:09

icecream