How to obtain better results using NLTK pos tag

Tags: python, nltk, pos-tagger

I am just learning NLTK with Python. I tried running pos_tag on various sentences, but the results are not accurate. How can I improve them?

broke = NN
flimsy = NN
crap = NN

Also, I am getting a lot of extra words categorized as NN. How can I filter these out to get better results?

asked Nov 16 '11 04:11

SyncMaster

1 Answer

Please give the context in which you obtained these results. As an example, I get different results when I run pos_tag on the phrase "They broke flimsy crap":

import nltk
text=nltk.word_tokenize("They broke flimsy crap")
nltk.pos_tag(text)

[('They', 'PRP'), ('broke', 'VBP'), ('flimsy', 'JJ'), ('crap', 'NN')]
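On the filtering part of the question: pos_tag returns a list of (word, tag) pairs, so dropping words by tag can be done with a plain list comprehension. A minimal sketch, assuming the output above as input (hard-coded here so it runs without NLTK); the helper name drop_tags is made up for illustration:

```python
# Output of nltk.pos_tag from the example above, hard-coded so this runs without NLTK.
tagged = [('They', 'PRP'), ('broke', 'VBP'), ('flimsy', 'JJ'), ('crap', 'NN')]


def drop_tags(tagged_tokens, unwanted=('NN',)):
    """Return the (word, tag) pairs whose tag is not in `unwanted`."""
    return [(word, tag) for word, tag in tagged_tokens if tag not in unwanted]


filtered = drop_tags(tagged)
print(filtered)  # [('They', 'PRP'), ('broke', 'VBP'), ('flimsy', 'JJ')]
```

Note that this only removes words already tagged 'NN'; it does not correct wrong tags.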

Anyway, if you find that a lot of words are wrongly categorized as 'NN', you can apply another technique specifically to the words that are marked as 'NN'. For instance, you can take an appropriate tagged corpus and train a trigram tagger on it (in the same way the authors do it with bigrams at http://nltk.googlecode.com/svn/trunk/doc/book/ch05.html).

Something like this:

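import nltk
# your_text: a list of tokens; tagged_corpora: a list of tagged sentences, each a list of (word, tag) pairs
pos_tag_results = nltk.pos_tag(your_text)  # tokens tagged with pos_tag
trigram_tagger = nltk.TrigramTagger(tagged_corpora)  # train a trigram tagger on your tagged_corpora
trigram_tag_results = trigram_tagger.tag(your_text)  # the same tokens tagged with the trigram tagger
for i, (word, tag) in enumerate(pos_tag_results):
    new_tag = trigram_tag_results[i][1]
    # for 'NN', take the trigram tag instead (it is None when the tagger has not seen the context)
    if tag == 'NN' and new_tag is not None:
        pos_tag_results[i] = (word, new_tag)  # tuples are immutable, so replace the whole pair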
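The replacement step can also be written as a small standalone function that works on any two tag lists. This is only a sketch: the name override_tag and the sample data are invented for illustration, and it assumes both taggers were run over the same tokens. NLTK's n-gram taggers give None as the tag for a context they have not seen (when no backoff tagger is set), so the function keeps the original tag in that case:

```python
def override_tag(primary, secondary, target='NN'):
    """Replace `target` tags in `primary` with the tag from `secondary`.

    Both arguments are lists of (word, tag) pairs over the same tokens.
    A secondary tag of None (no prediction) leaves the primary tag alone.
    """
    if len(primary) != len(secondary):
        raise ValueError("both taggers must tag the same token sequence")
    merged = []
    for (word, tag), (_, other) in zip(primary, secondary):
        if tag == target and other is not None:
            tag = other
        merged.append((word, tag))
    return merged


# Hand-written example data, not real tagger output.
pos_tag_results = [('broke', 'NN'), ('flimsy', 'NN'), ('crap', 'NN')]
trigram_tag_results = [('broke', 'VBD'), ('flimsy', 'JJ'), ('crap', None)]
print(override_tag(pos_tag_results, trigram_tag_results))
# [('broke', 'VBD'), ('flimsy', 'JJ'), ('crap', 'NN')]
```

Returning a new list instead of modifying pos_tag_results in place makes it easy to compare the tags before and after.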

Let me know if it improves your results.

answered Nov 03 '22 01:11

Max Li


