
NLTK wordnet similarity returns "None" for adjectives

I have seen that, for verbs, the WordNet similarity measures in NLTK can sometimes return "None", but I understood that this should not happen for other parts of speech. Looking at the code, it seems clear that when there is no relation between two words of any other part of speech, the result should be -1, not "None". Yet I have been getting this result:

>>> from nltk.corpus import wordnet as wn
>>> plodding1 = wn.synset('plodding.a.01')
>>> for sense in wn.synsets('unsteady', 'a'):
...     print(sense.name(), sense.path_similarity(plodding1))
...

unsteady.a.01 None
unfirm.s.01 None

Any thoughts?

nmi asked Nov 25 '12 20:11



1 Answer

The adjectives in WordNet are not arranged in a hierarchy, so shortest-path measures will not work for adjectives. The same is true for adverbs. The only measures that will work for adjectives and adverbs are measures of relatedness, like the Lesk measure. Verbs in WordNet are organized into hierarchies, but there are many of them and they are rather "short", so you sometimes can't find a path between two verbs (since they may belong to different verb hierarchies). In general you can find a shortest path between two nouns, as they all belong to one big noun hierarchy (as of WordNet 3.0 at least).
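To make the point about hierarchies concrete, here is a minimal sketch of a shortest-path similarity over a toy hypernym graph. It does not use NLTK or real WordNet data: the HYPERNYMS table and the helper names are made up for illustration, and the 1 / (1 + path length) score is only meant to mirror the general idea of a path-based measure. Words with no shared ancestor (the two toy verbs, and the two adjectives that have no hypernym links at all) come back as None.

```python
from collections import deque

# Toy hypernym links (child -> parents). Made up for illustration only;
# this is NOT real WordNet data and not the NLTK implementation.
HYPERNYMS = {
    "dog": ["canine"],
    "cat": ["feline"],
    "canine": ["carnivore"],
    "feline": ["carnivore"],
    "carnivore": ["animal"],
    "animal": [],
    # Two separate, "short" verb hierarchies.
    "walk": ["travel"],
    "travel": [],
    "sing": ["perform"],
    "perform": [],
    # Adjectives: no hypernym links at all.
    "unsteady": [],
    "plodding": [],
}


def ancestor_distances(node):
    """Map node and each of its ancestors to the number of links from node."""
    distances = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in HYPERNYMS.get(current, []):
            if parent not in distances:
                distances[parent] = distances[current] + 1
                queue.append(parent)
    return distances


def path_similarity(a, b):
    """Return 1 / (1 + shortest path length), or None if no path connects a and b."""
    dist_a = ancestor_distances(a)
    dist_b = ancestor_distances(b)
    shared = set(dist_a) & set(dist_b)
    if not shared:
        # No common ancestor: the two words live in different hierarchies
        # (or in no hierarchy), so there is no path to measure.
        return None
    shortest = min(dist_a[n] + dist_b[n] for n in shared)
    return 1.0 / (1 + shortest)


print(path_similarity("dog", "cat"))            # nouns share "carnivore"
print(path_similarity("walk", "sing"))          # different verb hierarchies
print(path_similarity("unsteady", "plodding"))  # adjectives, no hierarchy
```

In this toy setup the noun pair gets a numeric score because both words climb to a common ancestor, while the verb and adjective pairs have nothing to meet at, which is the same situation that produces None in the question.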

I hope this helps. More discussion of these matters can also be found on the WordNet::Similarity list (WordNet::Similarity is not a part of NLTK, but rather a standalone Perl package that does these kinds of measurements): http://wn-similarity.sourceforge.net

Good luck, Ted

Ted Pedersen answered Oct 05 '22 16:10
