Convert words between verb/noun/adjective forms

Tags: python, nlp, nltk, wordnet

I would like a Python library function that converts words across different parts of speech. Sometimes it should output multiple words (e.g. "coder" and "code" are both nouns from the verb "to code"; one is the subject, the other is the object).

    # :: String => List of String
    print(verbify('writer'))     # => ['write']
    print(nounize('written'))    # => ['writer']
    print(adjectivate('write'))  # => ['written']

I mostly care about verbs <=> nouns, for a note-taking program I want to write. That is, I can write "caffeine antagonizes A1" or "caffeine is an A1 antagonist", and with some NLP it can figure out that they mean the same thing. (I know that's not easy, and that it will take NLP that parses and doesn't just tag, but I want to hack up a prototype.)
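To make the goal concrete, here is a minimal sketch of the comparison such a prototype would need. Everything in it is made up for illustration: `TOY_FAMILIES` is a tiny hand-written table standing in for a real derivational lookup, and `word_family` / `same_family` are hypothetical names. It also assumes the words have already been reduced to their base form ("antagonizes" to "antagonize"), which is a separate step.

```python
# Hypothetical, hand-written table standing in for a real derivational
# lookup (for example one built on WordNet). Entries are illustrative only.
TOY_FAMILIES = {
    "antagonize": {"antagonize", "antagonist", "antagonism"},
    "antagonist": {"antagonist", "antagonize", "antagonism"},
    "write": {"write", "writer", "writing"},
}


def word_family(word, table=TOY_FAMILIES):
    """Return the set of forms related to `word`, or just the word itself."""
    return table.get(word, {word})


def same_family(a, b, table=TOY_FAMILIES):
    """True if the two words share at least one related form."""
    return bool(word_family(a, table) & word_family(b, table))


print(same_family("antagonize", "antagonist"))  # True
print(same_family("write", "antagonist"))       # False
```

With a real lookup in place of the toy table, the two example sentences would match once "antagonizes" and "antagonist" land in the same family.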

Similar questions: "Converting adjectives and adverbs to their noun forms" (that answer only stems down to the root POS; I want to go between POS).

P.S. This is called conversion in linguistics: http://en.wikipedia.org/wiki/Conversion_%28linguistics%29

asked Jan 23 '13 21:01

sam boosalis

2 Answers

This is more of a heuristic approach. I have just coded it, so apologies for the style. It uses derivationally_related_forms() from WordNet. I have implemented nounify; I guess verbify works analogously. From what I've tested, it works pretty well:

    from nltk.corpus import wordnet as wn

    # Note: written against the older NLTK API, where synset.lemmas,
    # synset.name, lemma.synset and lemma.name are attributes, not methods.


    def nounify(verb_word):
        """Transform a verb to the closest noun: die -> death"""
        verb_synsets = wn.synsets(verb_word, pos="v")

        # Word not found
        if not verb_synsets:
            return []

        # Get all verb lemmas of the word
        verb_lemmas = [l for s in verb_synsets
                       for l in s.lemmas if s.name.split('.')[1] == 'v']

        # Get related forms
        derivationally_related_forms = [(l, l.derivationally_related_forms())
                                        for l in verb_lemmas]

        # Filter only the nouns
        related_noun_lemmas = [l for drf in derivationally_related_forms
                               for l in drf[1]
                               if l.synset.name.split('.')[1] == 'n']

        # Extract the words from the lemmas
        words = [l.name for l in related_noun_lemmas]
        len_words = len(words)

        # Build the result as a list of tuples (word, probability)
        result = [(w, float(words.count(w)) / len_words) for w in set(words)]
        result.sort(key=lambda w: -w[1])

        # Return all the possibilities sorted by probability
        return result
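The "probability" in the last step is just the share of related lemmas that carry a given word. As a sketch of that step on its own, the same ranking can be written with `collections.Counter` from the standard library. The helper name `rank_by_frequency` is made up for this example and is not part of NLTK.

```python
from collections import Counter


def rank_by_frequency(words):
    """Return (word, share) pairs, most frequent word first.

    `share` is the word's count divided by the total number of words,
    which is the same score the answer above calls "probability".
    """
    counts = Counter(words)
    total = len(words)
    # most_common() yields (word, count) pairs sorted by descending count.
    # For an empty input it yields nothing, so no division by zero occurs.
    return [(word, count / total) for word, count in counts.most_common()]


print(rank_by_frequency(['hunger', 'hunger', 'hunger', 'thirst']))
# [('hunger', 0.75), ('thirst', 0.25)]
```

This counts each word once instead of calling `words.count(w)` for every distinct word, which only matters for long lists.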

answered Sep 29 '22 13:09

bogs

Here is a function that is, in theory, able to convert words between noun/verb/adjective/adverb forms. I updated it from here (originally written by bogs, I believe) to be compliant with NLTK 3.2.5, now that synset.lemmas and synset.name are functions.

    from nltk.corpus import wordnet as wn

    # Just to make it a bit more readable
    WN_NOUN = 'n'
    WN_VERB = 'v'
    WN_ADJECTIVE = 'a'
    WN_ADJECTIVE_SATELLITE = 's'
    WN_ADVERB = 'r'


    def convert(word, from_pos, to_pos):
        """Transform words given from/to POS tags"""
        synsets = wn.synsets(word, pos=from_pos)

        # Word not found
        if not synsets:
            return []

        # Get all lemmas of the word (consider 'a' and 's' equivalent)
        lemmas = []
        for s in synsets:
            for l in s.lemmas():
                if (s.name().split('.')[1] == from_pos
                        or (from_pos in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE)
                            and s.name().split('.')[1] in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE))):
                    lemmas += [l]

        # Get related forms
        derivationally_related_forms = [(l, l.derivationally_related_forms())
                                        for l in lemmas]

        # Filter only the desired pos (consider 'a' and 's' equivalent)
        related_noun_lemmas = []
        for drf in derivationally_related_forms:
            for l in drf[1]:
                if (l.synset().name().split('.')[1] == to_pos
                        or (to_pos in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE)
                            and l.synset().name().split('.')[1] in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE))):
                    related_noun_lemmas += [l]

        # Extract the words from the lemmas
        words = [l.name() for l in related_noun_lemmas]
        len_words = len(words)

        # Build the result as a list of tuples (word, probability)
        result = [(w, float(words.count(w)) / len_words) for w in set(words)]
        result.sort(key=lambda w: -w[1])

        # Return all the possibilities sorted by probability
        return result


    convert('direct', 'a', 'r')
    convert('direct', 'a', 'n')
    convert('quick', 'a', 'r')
    convert('quickly', 'r', 'a')
    convert('hunger', 'n', 'v')
    convert('run', 'v', 'a')
    convert('tired', 'a', 'r')
    convert('tired', 'a', 'v')
    convert('tired', 'a', 'n')
    convert('tired', 'a', 's')
    convert('wonder', 'v', 'n')
    convert('wonder', 'n', 'a')

As you can see below, it doesn't work very well. It is unable to switch between adjective and adverb forms (my specific goal), but it does give some interesting results in other cases.

    >>> convert('direct', 'a', 'r')
    []
    >>> convert('direct', 'a', 'n')
    [('directness', 0.6666666666666666), ('line', 0.3333333333333333)]
    >>> convert('quick', 'a', 'r')
    []
    >>> convert('quickly', 'r', 'a')
    []
    >>> convert('hunger', 'n', 'v')
    [('hunger', 0.75), ('thirst', 0.25)]
    >>> convert('run', 'v', 'a')
    [('persistent', 0.16666666666666666), ('executive', 0.16666666666666666), ('operative', 0.16666666666666666), ('prevalent', 0.16666666666666666), ('meltable', 0.16666666666666666), ('operant', 0.16666666666666666)]
    >>> convert('tired', 'a', 'r')
    []
    >>> convert('tired', 'a', 'v')
    []
    >>> convert('tired', 'a', 'n')
    [('triteness', 0.25), ('banality', 0.25), ('tiredness', 0.25), ('commonplace', 0.25)]
    >>> convert('tired', 'a', 's')
    []
    >>> convert('wonder', 'v', 'n')
    [('wonder', 0.3333333333333333), ('wonderer', 0.2222222222222222), ('marveller', 0.1111111111111111), ('marvel', 0.1111111111111111), ('wonderment', 0.1111111111111111), ('question', 0.1111111111111111)]
    >>> convert('wonder', 'n', 'a')
    [('curious', 0.4), ('wondrous', 0.2), ('marvelous', 0.2), ('marvellous', 0.2)]
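One possible reason for the empty adverb results: WordNet records "adverb derived from adjective" as a pertainym link, not as a derivationally related form, and NLTK exposes that link as `Lemma.pertainyms()`. Below is a rough sketch of the adverb-to-adjective direction only. The function name `adverb_to_adjectives` and its `reader` parameter are made up for this example, and I have not checked how many adverbs actually carry the link, so treat it as a starting point.

```python
def adverb_to_adjectives(word, reader=None):
    """Collect adjectives linked to an adverb through pertainym pointers.

    `reader` is any object with a WordNet-style `synsets(word, pos=...)`
    method. By default the NLTK WordNet corpus reader is used.
    """
    if reader is None:
        # Imported lazily so the function can be tried with a stand-in reader.
        from nltk.corpus import wordnet as reader

    adjectives = []
    for synset in reader.synsets(word, pos='r'):  # 'r' is WordNet's adverb tag
        for lemma in synset.lemmas():
            # For adverb lemmas, pertainyms() follows the
            # "derived from adjective" pointer.
            for related in lemma.pertainyms():
                name = related.name()
                if name not in adjectives:
                    adjectives.append(name)
    return adjectives
```

With NLTK and the WordNet data installed, `adverb_to_adjectives('quickly')` is the kind of call to try. Going the other way (adjective to adverb) is not covered here.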

Hope this saves someone a little trouble.

answered Sep 29 '22 12:09

stuart
