
Word sense disambiguation in NLTK Python

I am new to NLTK in Python and I am looking for a sample application that can do word sense disambiguation. My searches turned up plenty of algorithms but no sample application. I just want to pass in a sentence and get the sense of each word by looking it up in the WordNet library. Thanks.

I have found a similar module in Perl: http://marimba.d.umn.edu/allwords/allwords.html. Is there such a module in NLTK for Python?

thesensemakers asked Sep 13 '10 11:09



1 Answer

Recently, part of the pywsd code has been ported into the bleeding-edge version of NLTK, in the wsd.py module. Try:

>>> from nltk.wsd import lesk
>>> sent = 'I went to the bank to deposit my money'
>>> ambiguous = 'bank'
>>> lesk(sent, ambiguous)
Synset('bank.v.04')
>>> lesk(sent, ambiguous).definition()
u'act as the banker in a game or in gambling'

For better WSD performance, use the pywsd library instead of the NLTK module. In general, simple_lesk() from pywsd does better than lesk from NLTK. I'll try to update the NLTK module as much as possible when I'm free.


In response to Chris Spencer's comment, please note the limitations of Lesk algorithms. I'm simply giving an accurate implementation of the algorithms. It's not a silver bullet; see http://en.wikipedia.org/wiki/Lesk_algorithm
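To make those limitations concrete, here is a minimal sketch of the simplified Lesk idea: pick the sense whose dictionary gloss shares the most words with the sentence. It is not the NLTK or pywsd code. The sense inventory `TOY_SENSES`, the stopword list, and the function names are all hypothetical and made up for illustration; a real implementation would use WordNet glosses.

```python
# Toy sense inventory with hypothetical glosses (NOT taken from WordNet).
TOY_SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits of money and makes loans",
        "bank.river": "sloping land beside a river or lake",
    },
}

# Small hand-picked stopword list, assumed for this sketch only.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "that",
             "i", "my", "we", "in", "on", "at"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?'\"").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def simplified_lesk(sentence, word, senses=TOY_SENSES):
    """Return the sense id whose gloss overlaps most with the sentence.

    Returns None if the word has no entry in the inventory. On a tie
    (including zero overlap everywhere) the first listed sense wins.
    """
    context = tokenize(sentence)
    context.discard(word)  # the target word itself carries no evidence
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("I went to the bank to deposit my money", "bank"))
    print(simplified_lesk("We sat on the bank of the river", "bank"))
```

Note that "deposit" in the sentence does not match "deposits" in the gloss, so the first example is decided by the single word "money". With no overlapping word at all, the function just falls back to the first listed sense. That reliance on exact word overlap is the kind of weakness the Wikipedia article describes.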

Also please note that, although:

lesk("My cat likes to eat mice.", "cat", "n") 

doesn't give you the right answer, you can use the pywsd implementation of max_similarity():

>>> from pywsd.similarity import max_similarity
>>> max_similarity('my cat likes to eat mice', 'cat', 'wup', pos='n').definition()
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
>>> max_similarity('my cat likes to eat mice', 'cat', 'lin', pos='n').definition()
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
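For intuition, the maximum-similarity approach can be sketched roughly as follows: score each candidate sense of the target word by how similar it is to the best-matching sense of every other word in the sentence, then keep the top-scoring sense. This is only an illustration of the general idea and is not pywsd's actual code. The taxonomy `PARENT`, the inventory `SENSES`, and all function names are hypothetical; the similarity is a Wu-Palmer-style measure computed on the toy tree rather than on WordNet.

```python
# Hypothetical toy taxonomy, child -> parent ("entity" is the root).
PARENT = {
    "animal": "entity",
    "person": "entity",
    "artifact": "entity",
    "feline": "animal",
    "rodent": "animal",
    "cat.feline": "feline",
    "cat.gossip": "person",      # 'cat' as a spiteful person
    "mouse.rodent": "rodent",
    "mouse.device": "artifact",  # computer mouse
}

# Hypothetical sense inventory: surface word -> candidate senses.
SENSES = {
    "cat": ["cat.feline", "cat.gossip"],
    "mice": ["mouse.rodent", "mouse.device"],
}


def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu-Palmer-style similarity: 2*depth(lcs) / (depth(a) + depth(b))."""
    ancestors_of_a = set(path_to_root(a))
    # First ancestor of b that is also an ancestor of a is the deepest shared one.
    lcs = next(n for n in path_to_root(b) if n in ancestors_of_a)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


def max_similarity_toy(context_words, target):
    """Pick the sense of `target` most similar to the other words' senses."""
    best_sense, best_score = None, -1.0
    for sense in SENSES.get(target, []):
        score = 0.0
        for word in context_words:
            if word == target or word not in SENSES:
                continue  # skip the target itself and words we know nothing about
            score += max(wup(sense, other) for other in SENSES[word])
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


if __name__ == "__main__":
    print(max_similarity_toy("my cat likes to eat mice".split(), "cat"))
```

In this toy tree the feline sense wins because it shares the ancestor "animal" with the rodent sense of "mice", while the other sense of "cat" only meets the senses of "mice" at the root.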

@Chris, if you want a python setup.py, just ask politely and I'll write it...

alvas answered Oct 13 '22 01:10