How to get POS tagging using Stanford Parser

Tags: nlp, stanford-nlp

I'm using the Stanford Parser to get the dependency relations between pairs of words, but I also need the POS tag of each word. However, ParseDemo.java only outputs the tagged parse tree. I need each word's tag, like this:

My/PRP$ dog/NN also/RB likes/VBZ eating/VBG bananas/NNS ./.

not like this:

(ROOT
  (S
    (NP (PRP$ My) (NN dog))
    (ADVP (RB also))
    (VP (VBZ likes)
      (S
        (VP (VBG eating)
          (S
            (ADJP (NNS bananas))))))
    (. .)))

Can anyone help me? Thanks a lot.


asked Sep 17 '10 08:09

Charlie Epps

2 Answers

If you're mainly interested in manipulating the tags in a program, and don't need the TreePrint functionality, you can just get the tagged words as a List:

LexicalizedParser lp =
  LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
Tree parse = lp.apply(Arrays.asList(sent));
List<TaggedWord> taggedWords = parse.taggedYield();
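To make it concrete what the tagged yield contains: it is the sequence of (tag, word) pairs at the leaves of the tree, read left to right. The following is only a minimal sketch of that idea in plain Java. It does not use the Stanford classes at all; it pulls the innermost "(TAG word)" groups out of the bracketed tree string with a regular expression. The class name TaggedYieldSketch and the method wordsAndTags are hypothetical names made up for this illustration, and the approach assumes Penn Treebank style output in which literal parentheses inside words are escaped (for example as -LRB- and -RRB-).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Stanford Parser API.
public class TaggedYieldSketch {

    // Matches a preterminal "(TAG word)": a group with no nested parentheses.
    private static final Pattern PRETERMINAL =
        Pattern.compile("\\(([^\\s()]+)\\s+([^\\s()]+)\\)");

    // Turns a bracketed tree string into "word/TAG word/TAG ..." form.
    public static String wordsAndTags(String bracketedTree) {
        List<String> pairs = new ArrayList<>();
        Matcher m = PRETERMINAL.matcher(bracketedTree);
        while (m.find()) {
            // group(1) is the tag, group(2) is the word
            pairs.add(m.group(2) + "/" + m.group(1));
        }
        return String.join(" ", pairs);
    }

    public static void main(String[] args) {
        String tree = "(ROOT (S (NP (PRP$ My) (NN dog)) (ADVP (RB also)) "
            + "(VP (VBZ likes) (S (VP (VBG eating) (S (ADJP (NNS bananas)))))) (. .)))";
        // prints: My/PRP$ dog/NN also/RB likes/VBZ eating/VBG bananas/NNS ./.
        System.out.println(wordsAndTags(tree));
    }
}
```

When the parser objects are available, reading the tags from the Tree itself is the more robust route; the string-based version is mainly useful if all you have is the printed tree.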

answered Jan 03 '23 13:01

Christopher Manning

When running edu.stanford.nlp.parser.lexparser.LexicalizedParser on the command line, you want to use:

-outputFormat "wordsAndTags"

Programmatically, use the TreePrint class constructed with formatString="wordsAndTags" and call printTree, like this:

TreePrint posPrinter = new TreePrint("wordsAndTags", yourPrintWriter);
posPrinter.printTree(yourLexParser.getBestParse());

answered Jan 03 '23 15:01

msbmsb
