The code to get the synonyms of a word in python is say:
from nltk.corpus import wordnet
dog = wordnet.synset('dog.n.01')
print dog.lemma_names
>>['dog', 'domestic_dog', 'Canis_familiaris']
However dog.n.02 gives different words. For any words i can't know how many words there may be. How can i return all of the synonyms for a word?
Using wn.synset('dog.n.1').lemma_names
is the correct way to access the synonyms of a sense. It's because a word has many senses and it's more appropriate to list synonyms of a particular meaning/sense. To enumerate words with similar meanings, possibly you can also look at the hyponyms.
Sadly, the size of Wordnet is very limited so there are very few lemma_names available for each senses.
Using Wordnet as a dictionary/thesarus is not very apt per se, because it was developed as an inventory of sense/meaning rather than a inventory of words. However you can use access the a particular sense and several (not a lot) related words to the sense. One can use Wordnet as a:
Dictionary: given a word, what are the different meaning of the word
for i,j in enumerate(wn.synsets('dog')):
print "Meaning",i, "NLTK ID:", j.name
print "Definition:",j.definition
Thesarus: given a word, what are the different words for each meaning of the word
for i,j in enumerate(wn.synsets('dog')):
print "Meaning",i, "NLTK ID:", j.name
print "Definition:",j.definition
print "Synonyms:", ", ".join(j.lemma_names)
print
Ontology: given a word, what are the hyponyms (i.e. sub-types) and hypernyms (i.e. super-types).
for i,j in enumerate(wn.synsets('dog')):
print "Meaning",i, "NLTK ID:", j.name
print "Hypernyms:", ", ".join(list(chain(*[l.lemma_names for l in j.hypernyms()])))
print "Hyponyms:", ", ".join(list(chain(*[l.lemma_names for l in j.hyponyms()])))
print
[Ontology Output]
Meaning 0 NLTK ID: dog.n.01
Hypernyms words domestic_animal, domesticated_animal, canine, canid
Hyponyms puppy, Great_Pyrenees, basenji, Newfoundland, Newfoundland_dog, lapdog, poodle, poodle_dog, Leonberg, toy_dog, toy, spitz, pooch, doggie, doggy, barker, bow-wow, cur, mongrel, mutt, Mexican_hairless, hunting_dog, working_dog, dalmatian, coach_dog, carriage_dog, pug, pug-dog, corgi, Welsh_corgi, griffon, Brussels_griffon, Belgian_griffon
Meaning 1 NLTK ID: frump.n.01
Hypernyms: unpleasant_woman, disagreeable_woman
Hyponyms:
Meaning 2 NLTK ID: dog.n.03
Hypernyms: chap, fellow, feller, fella, lad, gent, blighter, cuss, bloke
Hyponyms:
Meaning 3 NLTK ID: cad.n.01
Hypernyms: villain, scoundrel
Hyponyms: perisher
Note this other answer:
>>> wn.synsets('small')
[Synset('small.n.01'),
Synset('small.n.02'),
Synset('small.a.01'),
Synset('minor.s.10'),
Synset('little.s.03'),
Synset('small.s.04'),
Synset('humble.s.01'),
Synset('little.s.07'),
Synset('little.s.05'),
Synset('small.s.08'),
Synset('modest.s.02'),
Synset('belittled.s.01'),
Synset('small.r.01')]
Keep in mind that in your code you were trying to get the lemmas, but that's one level too deep for what you want. The synset level is about meaning, while the lemma level gives you words. In other words:
In WordNet (and I’m speaking of English WordNet here, though I think the ones in other langauges are similarly organized) a lemma has senses. Specifically, a lemma (that is, a base word form that is indexed in WordNet) has exactly as many senses as the number of synsets that it participates in. Conversely, and as you say, synsets contain one more more lemmas, which means that multiple lemmas (words) can represent the same sense, or meaning.
Also have a look at the NLTK's WordNet how to for a few more ways of exploring around a meaning or a word.
The documentation suggests
wordnet.synsets('dog')
to get all synsets for dog.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With