
Find Python NLTK WordNet Synsets for each item of a list

I've been learning basic Python, but I am new to NLTK. I want to use NLTK to extract hyponyms for a given list of words. It works fine when I enter every term manually, but it does not seem to work when I try to iterate through the items of a list.

This works:

from nltk.corpus import wordnet as wn

syn_sets = wn.synsets("car")

for syn_set in syn_sets:
    print(syn_set, syn_set.lemma_names())
    print(syn_set.hyponyms())

But how do I get Wordnet methods to work with a list of items like

token = ["cat", "dog", "car"]
syn_sets = wn.synsets((*get each item from the list*))

in a loop?

Thank you!

Supersquirrel asked Mar 17 '23 06:03


2 Answers

List comprehensions to the rescue!

Totally possible, even using very similar syntax to what you had before. Python has a construct known as a list comprehension made exactly for this application. Basically, it's a compact syntax for building a list from an inline for loop. Comprehensions tend to be cleaner than the equivalent explicit loop and are often slightly faster.

Example:

from nltk.corpus import wordnet as wn
tokens = ["cat", "dog", "car"]
# One list of synsets per token, in the same order as tokens
syn_sets = [wn.synsets(token) for token in tokens]

This will even scale to slightly more complex data structures pretty easily, for instance:

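# syn_sets is a list of lists (one per token), so flatten it with a nested comprehension
split_syn_sets = [(syn_set.lemma_names(), syn_set.hyponyms()) for token_syn_sets in syn_sets for syn_set in token_syn_sets]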

Not sure if that's exactly what you're looking for, but it should generalize to similar tasks.
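
One thing to watch: a comprehension over the tokens yields one list of synsets per token, so the link between a word and its synsets is only positional. A minimal sketch of a dict comprehension that keeps the token as the key is below. The names `synsets_by_token` and `fake_lookup` are hypothetical; the stand-in lookup is there only so the snippet runs without NLTK, and it returns made-up strings rather than real synsets.

```python
def synsets_by_token(tokens, lookup):
    """Return {token: lookup(token)} for every token (hypothetical helper)."""
    return {token: lookup(token) for token in tokens}


# Stand-in for wn.synsets so the sketch runs without NLTK installed.
# It returns made-up strings, not real WordNet synsets.
def fake_lookup(word):
    return [f"{word}.n.01", f"{word}.n.02"]


tokens = ["cat", "dog", "car"]
by_token = synsets_by_token(tokens, fake_lookup)
print(by_token["car"])
```

With NLTK installed and the WordNet corpus downloaded, passing `wn.synsets` as `lookup` should give you the real synsets for each word.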

If it's useful, I asked a question a while ago about grabbing all related synsets.

Slater Victoroff answered Mar 24 '23 09:03


I believe you have no choice but to loop through your words. I modified your code to have an outer loop, and it seems to work:

from nltk.corpus import wordnet as wn

tokens = ["cat", "dog", "car"]

for token in tokens:
    syn_sets = wn.synsets(token)
    for syn_set in syn_sets:
        print(syn_set, syn_set.lemma_names())
        print(syn_set.hyponyms())

Here is the output (from Python 2, hence the tuple formatting and the u'' prefixes):

(Synset('cat.n.01'), [u'cat', u'true_cat'])
[Synset('domestic_cat.n.01'), Synset('wildcat.n.03')]
(Synset('guy.n.01'), [u'guy', u'cat', u'hombre', u'bozo'])
[Synset('sod.n.04')]
...
(Synset('cable_car.n.01'), [u'cable_car', u'car'])
[]
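
If you want to keep the results rather than print them, the same nested loop can fill a dictionary. This is only a sketch under some assumptions: `hyponym_names` and `StubSynset` are hypothetical names, and the stub data is made up so the snippet runs without NLTK. The stub exposes only `lemma_names()` and `hyponyms()`, the two synset methods used in the loop above.

```python
class StubSynset:
    """Tiny stand-in exposing the two Synset methods used below."""

    def __init__(self, names, hyponyms=()):
        self._names = list(names)
        self._hyponyms = list(hyponyms)

    def lemma_names(self):
        return self._names

    def hyponyms(self):
        return self._hyponyms


def hyponym_names(tokens, lookup):
    """Map each token to the sorted lemma names of its direct hyponyms."""
    result = {}
    for token in tokens:
        names = set()
        for syn_set in lookup(token):
            for hyponym in syn_set.hyponyms():
                names.update(hyponym.lemma_names())
        result[token] = sorted(names)
    return result


# Made-up stand-in data, not looked up from WordNet.
stub_data = {
    "car": [StubSynset(["car", "auto"],
                       [StubSynset(["cab", "taxi"]), StubSynset(["jeep"])])],
    "dog": [StubSynset(["dog"])],
}

# Words missing from the stub data simply get an empty list of synsets.
found = hyponym_names(["car", "dog", "cat"], lambda w: stub_data.get(w, []))
print(found)
```

In real use, `lookup` would be `wn.synsets`, and a word with no synsets would map to an empty list.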
Darren Cook answered Mar 24 '23 07:03