NLTK - WordNet: list of long words

I would like to find words in WordNet that are at least 18 characters long. I tried the following code:

from nltk.corpus import wordnet as wn
sorted(w for w in wn.synset().name() if len(w)>18)

I get the following error message:

sorted(w for w in wn.synset().name() if len(w)>18)

TypeError: synset() missing 1 required positional argument: 'name'

I am using Python 3.4.3.

How can I fix my code?

Cornelius asked Dec 04 '15 07:12



1 Answer

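The error occurs because wn.synset() looks up a single synset and requires its name as an argument, so it cannot be used to enumerate words. Use wn.all_lemma_names() instead to iterate over every lemma name. I believe that's all the words you'll get out of WordNet, so there should be no need to iterate over synsets (but you could look up the synsets for each lemma with wn.synsets() if you are so inclined).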
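For comparison, here is a rough sketch of what the synset route would look like. wn.synset() expects the name of one synset (for example 'dog.n.01'), whereas wn.all_synsets() iterates over all of them, and each synset exposes its words through lemma_names(). The helper names long_names_via_synsets and run_on_wordnet and the min_len parameter are made up for this illustration, and the sketch assumes the WordNet corpus has already been downloaded for NLTK.

```python
def long_names_via_synsets(synsets, min_len=18):
    """Collect lemma names of at least min_len characters from synset-like objects.

    Hypothetical helper: works on anything that has a lemma_names() method.
    """
    found = set()  # the same name can occur in several synsets
    for syn in synsets:
        for name in syn.lemma_names():
            if len(name) >= min_len:
                found.add(name)
    # Longest first; ties broken alphabetically for a stable order.
    return sorted(found, key=lambda n: (-len(n), n))


def run_on_wordnet(min_len=18):
    """Apply the helper to the real corpus (assumes NLTK and its WordNet data are installed)."""
    from nltk.corpus import wordnet as wn
    return long_names_via_synsets(wn.all_synsets(), min_len)
```

This walks every synset and needs a set to remove duplicates, which is why going straight to the lemma names is simpler.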

You'll probably want to sort your hits by length:

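from nltk.corpus import wordnet as wn
# Collect every lemma name longer than 18 characters, longest first.
longwords = [n for n in wn.all_lemma_names() if len(n) > 18]
longwords.sort(key=len, reverse=True)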
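Note that len(n) > 18 keeps only names of 19 characters or more, while the question asks for at least 18. A minimal sketch with an inclusive threshold follows; the function names long_words and wordnet_long_words and the min_len parameter are hypothetical, and the second function assumes NLTK and its WordNet data are installed.

```python
def long_words(names, min_len=18):
    """Return the distinct names with at least min_len characters, longest first."""
    hits = {n for n in names if len(n) >= min_len}
    # Sort by descending length, then alphabetically so the order is deterministic.
    return sorted(hits, key=lambda n: (-len(n), n))


def wordnet_long_words(min_len=18):
    """Run the filter over all WordNet lemma names (assumes the corpus is available)."""
    from nltk.corpus import wordnet as wn
    return long_words(wn.all_lemma_names(), min_len)
```

Calling wordnet_long_words() should then give the full list. Multiword lemmas are stored with underscores between the words, so many of the longest hits will be phrases rather than single words.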
alexis answered Sep 28 '22 01:09