
Using Wordnet Synsets from Python for Italian Language

Tags: python, nlp, nltk

I'm starting to program with NLTK in Python for natural language processing of Italian. I've seen some simple examples using the WordNet library, which has a nice set of synsets that let you navigate from a word (for example, "dog") to its synonyms, antonyms, hyponyms, hypernyms, and so on.

My question is: if I start with an Italian word (for example, "cane", which means "dog"), is there a way to navigate between synonyms, antonyms, hyponyms, and so on for the Italian word, as you can for the English one? Or is there an equivalent of WordNet for the Italian language?

Thanks in advance

Frank B. asked May 11 '17 11:05



1 Answer

You are in luck. NLTK provides an interface to the Open Multilingual Wordnet, which does include Italian among its languages. Just pass an argument specifying the desired language to the usual wordnet functions, e.g.:

>>> from nltk.corpus import wordnet as wn
>>> cane_lemmas = wn.lemmas("cane", lang="ita")
>>> print(cane_lemmas)
[Lemma('dog.n.01.cane'), Lemma('cramp.n.02.cane'), Lemma('hammer.n.01.cane'),
 Lemma('bad_person.n.01.cane'), Lemma('incompetent.n.01.cane')]
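
The session above needs the WordNet and Open Multilingual Wordnet data to be installed first. Here is a minimal sketch of that setup, plus a small helper that lists the senses of a word. The names load_wordnet and senses are made up for illustration, and the data package names ("wordnet", "omw-1.4") are an assumption that depends on your NLTK version; older releases call the multilingual package "omw".

```python
def load_wordnet():
    """Return NLTK's WordNet reader after fetching the data it needs."""
    import nltk  # imported here so the helper below does not require NLTK

    # Assumed package names; older NLTK releases use "omw" instead of "omw-1.4".
    nltk.download("wordnet", quiet=True)
    nltk.download("omw-1.4", quiet=True)

    from nltk.corpus import wordnet
    return wordnet


def senses(wn_reader, word, lang="ita"):
    """Map each synset name for `word` to that synset's lemma names in `lang`."""
    return {
        synset.name(): synset.lemma_names(lang=lang)
        for synset in wn_reader.synsets(word, lang=lang)
    }
```

With the data in place, senses(load_wordnet(), "cane") should give one entry per synset shown above, and load_wordnet().langs() lists the language codes you can pass as lang.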

The synsets have English names, because they are integrated with the English wordnet. But you can navigate the web of meanings and extract the Italian lemmas for any synset you want:

>>> hypernyms = cane_lemmas[0].synset().hypernyms()
>>> print(hypernyms)
[Synset('canine.n.02'), Synset('domestic_animal.n.01')]
>>> print(hypernyms[1].lemmas(lang="ita"))
[Lemma('domestic_animal.n.01.animale_addomesticato'), 
 Lemma('domestic_animal.n.01.animale_domestico')]

Or since you mentioned "cattiva_persona" in the comments:

>>> wn.lemmas("bad_person")[0].synset().lemmas(lang="ita")
[Lemma('bad_person.n.01.cane'), Lemma('bad_person.n.01.cattivo')]

I went from the English lemma to the language-independent synset to the Italian lemmas.
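
To tie this back to the question, the same steps can be wrapped in one helper that collects Italian synonyms, hypernyms, hyponyms and antonyms for a word. This is only a sketch: related_words is a made-up name, wn_reader is expected to be NLTK's wordnet reader (the wn object used above), and the antonym part rests on my assumption that antonym links are stored on the English lemmas, so it goes through those and then looks up the Italian lemmas of the antonym's synset.

```python
def related_words(wn_reader, word, lang="ita"):
    """Collect words in `lang` related to `word`, grouped by relation."""
    related = {
        "synonyms": set(),
        "hypernyms": set(),
        "hyponyms": set(),
        "antonyms": set(),
    }
    for synset in wn_reader.synsets(word, lang=lang):
        # Lemmas sharing a synset with `word` are its synonyms.
        related["synonyms"].update(synset.lemma_names(lang=lang))
        for hypernym in synset.hypernyms():
            related["hypernyms"].update(hypernym.lemma_names(lang=lang))
        for hyponym in synset.hyponyms():
            related["hyponyms"].update(hyponym.lemma_names(lang=lang))
        # Assumption: antonyms are attached to English lemmas, so follow
        # them and translate the antonym's synset back into `lang`.
        for lemma in synset.lemmas():
            for antonym in lemma.antonyms():
                related["antonyms"].update(
                    antonym.synset().lemma_names(lang=lang)
                )
    related["synonyms"].discard(word)  # a word is not its own synonym
    return {relation: sorted(words) for relation, words in related.items()}
```

Calling related_words(wn, "cane") mixes all senses of "cane" together (dog, hammer, bad person and so on). If you need them separated, loop over the synsets yourself and keep the results per synset.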

alexis answered Sep 22 '22 15:09