
WordNet - What do n and the number represent?

My question is about the NLTK WordNet interface.

    >>> from nltk.corpus import wordnet as wn
    >>> wn.synsets('cat')
    [Synset('cat.n.01'), Synset('guy.n.01'), Synset('cat.n.03'),
     Synset('kat.n.01'), Synset('cat-o'-nine-tails.n.01'),
     Synset('caterpillar.n.02'), Synset('big_cat.n.01'),
     Synset('computerized_tomography.n.01'), Synset('cat.v.01'),
     Synset('vomit.v.01')]

I could not find what the n and the number after it mean in names such as cat.n.01 or caterpillar.n.02.

malocho asked Jan 16 '16 19:01



1 Answer

Per the NLTK docs, a <lemma>.<pos>.<number> Synset string is composed of the following parts:

  • <lemma> is the word’s morphological stem
  • <pos> is one of the module attributes ADJ, ADJ_SAT, ADV, NOUN or VERB
  • <number> is the sense number, counting from 0

Thus, the <pos> is the part of speech. According to the wordnet man page, the part of speech character has the following meaning:

n    NOUN
v    VERB
a    ADJECTIVE
s    ADJECTIVE SATELLITE
r    ADVERB 
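To make the naming scheme concrete, here is a minimal sketch that splits a synset name into its three parts using only string handling. It does not call NLTK at all; parse_synset_name and POS_NAMES are hypothetical names made up for this illustration, and the mapping is simply the table above.

```python
# Part-of-speech characters, copied from the table above.
POS_NAMES = {
    "n": "NOUN",
    "v": "VERB",
    "a": "ADJECTIVE",
    "s": "ADJECTIVE SATELLITE",
    "r": "ADVERB",
}


def parse_synset_name(name):
    """Split '<lemma>.<pos>.<number>' into (lemma, part of speech, sense number).

    Hypothetical helper for illustration; not part of NLTK.
    """
    # Split from the right so a lemma that itself contains dots stays intact.
    lemma, pos, number = name.rsplit(".", 2)
    return lemma, POS_NAMES[pos], int(number)


print(parse_synset_name("cat.n.01"))          # ('cat', 'NOUN', 1)
print(parse_synset_name("caterpillar.n.02"))  # ('caterpillar', 'NOUN', 2)
print(parse_synset_name("vomit.v.01"))        # ('vomit', 'VERB', 1)
```

So in cat.n.01 the n says the synset is a noun, and 01 is its sense number.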

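The <number> distinguishes different senses that share the same lemma and part of speech: cat.n.01 and cat.n.03, for example, are two separate noun senses of cat. Note that although the docs describe the number as counting from 0, the names in the output above start at 01.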
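As a rough illustration of that disambiguation, the sketch below groups the synset names from the question by lemma, again with plain string handling rather than NLTK. The helper senses_by_lemma is a hypothetical name introduced here, and the list is copied by hand from the output in the question.

```python
from collections import defaultdict

# Synset names copied from the wn.synsets('cat') output in the question.
names = [
    "cat.n.01", "guy.n.01", "cat.n.03", "kat.n.01",
    "cat-o'-nine-tails.n.01", "caterpillar.n.02", "big_cat.n.01",
    "computerized_tomography.n.01", "cat.v.01", "vomit.v.01",
]


def senses_by_lemma(synset_names):
    """Group (pos, sense number) pairs under each lemma (hypothetical helper)."""
    grouped = defaultdict(list)
    for name in synset_names:
        # Split from the right so dots inside a lemma are preserved.
        lemma, pos, number = name.rsplit(".", 2)
        grouped[lemma].append((pos, int(number)))
    return dict(grouped)


senses = senses_by_lemma(names)
print(senses["cat"])  # [('n', 1), ('n', 3), ('v', 1)]
```

The lemma cat therefore names two different noun synsets and one verb synset in this output, and the number is what tells the two noun senses apart.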

unutbu answered Oct 05 '22 20:10