I am new to NLTK in Python and I am looking for a sample application that can do word sense disambiguation. Search results give me plenty of algorithms but no sample application. I just want to pass in a sentence and get the sense of each word by referring to the WordNet library. Thanks.
I have found a similar module in Perl: http://marimba.d.umn.edu/allwords/allwords.html Is there such a module in NLTK for Python?
Many semantic applications can benefit from WordNet, including word sense disambiguation (WSD), question answering, and sentiment analysis. Many papers have been published on WordNet and WSD, exploring different approaches and algorithms; this is the main field in which it is used.
Two (or more) words are disambiguated by finding the pair of dictionary senses with the greatest word overlap in their dictionary definitions. For example, when disambiguating the words in pine cone, the definitions of the appropriate senses both include the words evergreen and tree (at least in one dictionary).
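The overlap idea described above can be sketched with a toy sense inventory. Note the two-sense mini dictionary and the `overlap_lesk` name below are invented for illustration; a real implementation would use WordNet glosses:

```python
# Toy sketch of the Lesk overlap idea: pick the sense whose
# dictionary definition shares the most words with the context.
# The mini sense inventory is made up for illustration only.

def overlap_lesk(context, senses):
    """Return the sense whose gloss has the largest word overlap with the context."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in senses.items():
        overlap = len(context_words & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

PINE_SENSES = {
    "pine#1": "a kind of evergreen tree with needle-shaped leaves",
    "pine#2": "to waste away through sorrow or illness",
}

# The context shares 'evergreen' and 'tree' with pine#1's gloss.
print(overlap_lesk("fruit of an evergreen tree such as a fir", PINE_SENSES))
# prints pine#1
```

This is exactly why "pine cone" works in the example above: the definitions of the appropriate senses overlap in words like "evergreen" and "tree".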
Word sense disambiguation is an important NLP task that determines which meaning a word carries in a particular context. NLP systems often face the challenge of properly identifying words, and determining the specific usage of a word in a given sentence has many applications.
Word sense disambiguation approaches are classified into three main categories: a) knowledge-based, b) supervised, and c) unsupervised. Knowledge-based approaches rely on knowledge sources such as machine-readable dictionaries, sense inventories, and thesauri.
Recently, part of the pywsd code has been ported into the bleeding-edge version of NLTK, in the wsd.py module. Try:
>>> from nltk.wsd import lesk
>>> sent = 'I went to the bank to deposit my money'
>>> ambiguous = 'bank'
>>> lesk(sent, ambiguous)
Synset('bank.v.04')
>>> lesk(sent, ambiguous).definition()
u'act as the banker in a game or in gambling'
For better WSD performance, use the pywsd library instead of the NLTK module. In general, simple_lesk() from pywsd does better than lesk from NLTK. I'll try to update the NLTK module as much as possible when I'm free.
In response to Chris Spencer's comment: please note the limitations of the Lesk algorithm. I'm simply giving an accurate implementation of the algorithm; it's not a silver bullet. http://en.wikipedia.org/wiki/Lesk_algorithm
Also please note that, although
lesk("My cat likes to eat mice.", "cat", "n")
doesn't give you the right answer, you can use pywsd's implementation of max_similarity():
>>> from pywsd.similarity import max_similarity
>>> max_similarity('my cat likes to eat mice', 'cat', 'wup', pos='n').definition
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
>>> max_similarity('my cat likes to eat mice', 'cat', 'lin', pos='n').definition
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
@Chris, if you want a python setup.py, just make a polite request and I'll write it...