
Convert adjective to adverb

Does anyone know how to convert an English adjective to its corresponding adverb? Python would be ideal, but any programmatic approach would be great.

I've tried pattern.en, nltk wordnet, and spacy to no avail.

Converting adverbs to their root adjective form is no problem. I'm using the SO solution here.

What I want is to go the other way. From adjective to adverb.

Here is NLTK WordNet code that partly converts words between different word forms, but it fails for adjective <--> adverb conversions.

Specifically, I'd like a function getAdverb like this:

getAdverb('quick')
>>> quickly
getAdverb('notable')
>>> notably
getAdverb('happy')
>>> happily

Any code, resources, or suggestions would be greatly appreciated!

stuart asked Nov 07 '22 12:11


1 Answer

Idea

Let's fetch pre-trained word embeddings and use word vector arithmetic properties to get the set of words that are semantically similar to our target word, then choose the most promising ones:

[Figure: word2vec]

But we'll try to exploit adjective - adverb relationships.
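To make the vector arithmetic concrete, here is a minimal sketch on hand-made 2-D vectors. The numbers and the `analogy` helper are invented for illustration and are not real GloVe values; it is also a simplified version of what an embedding library does, not its actual implementation.

```python
import math

# Toy 2-D vectors, invented for illustration only (not real GloVe values).
# Axis 0 loosely stands for "meaning", axis 1 for "is an adverb".
toy_vectors = {
    "happy":   [1.0, 0.0],
    "happily": [1.0, 1.0],
    "quick":   [3.0, 0.0],
    "quickly": [3.0, 1.0],
    "slow":    [-3.0, 0.0],
    "slowly":  [-3.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def analogy(a, b, c, vectors):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding a, b, c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# happy is to happily as quick is to ...?
print(analogy("happy", "happily", "quick", toy_vectors))  # quickly
```

With real embeddings the "adverb direction" is far noisier than in this toy, which is why the code further down re-ranks the neighbours instead of trusting the top hit.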

Code

First, you need to download the word embeddings. I usually take GloVe from Stanford. Then convert the GloVe text format to the word2vec text format that Gensim can load:

$ python -m gensim.scripts.glove2word2vec -i glove.6B.100d.txt -o glove-word2vec.6B.100d.txt
2018-01-13 09:54:04,133 : MainThread : INFO : running /usr/lib/python2.7/site-packages/gensim/scripts/glove2word2vec.py -i glove.6B.100d.txt -o glove-word2vec.6B.100d.txt
2018-01-13 09:54:04,248 : MainThread : INFO : converting 400000 vectors from glove.6B.100d.txt to glove-word2vec.6B.100d.txt
2018-01-13 09:54:04,622 : MainThread : INFO : Converted model with 400000 vectors and 100 dimensions
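As far as I can tell, the only difference between the two text formats is a first line holding the vector count and the dimensionality, which matches the "400000 vectors and 100 dimensions" in the log above. If the script is not available, a rough equivalent can be sketched in plain Python. The helper name `add_word2vec_header` is made up here, and it assumes every line is a token followed by space-separated numbers:

```python
def add_word2vec_header(glove_path, out_path):
    """Copy a GloVe text file, prepending a 'count dims' header line.

    Hypothetical helper: assumes one vector per line, token first.
    """
    with open(glove_path, encoding="utf-8") as src:
        lines = [line for line in src if line.strip()]
    dims = len(lines[0].split()) - 1  # first token is the word itself
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.write(f"{len(lines)} {dims}\n")
        dst.writelines(lines)
    return len(lines), dims

# Usage, with the file names from the command above:
# add_word2vec_header('glove.6B.100d.txt', 'glove-word2vec.6B.100d.txt')
```

This reads the whole file into memory, which is acceptable for the 6B.100d file but may not be for the largest GloVe releases.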

After that loading is fairly easy:

from gensim.models.keyedvectors import KeyedVectors
glove_filename = '../../_data/nlp/glove/glove-word2vec.6B.100d.txt'
model = KeyedVectors.load_word2vec_format(glove_filename, binary=False)
print(model.most_similar(positive=['woman', 'king'], negative=['man']))

This test should output words that relate to woman the way king relates to man:

(u'queen', 0.7698541283607483)
(u'monarch', 0.6843380928039551)
(u'throne', 0.6755735874176025) 
(u'daughter', 0.6594556570053101)
(u'princess', 0.6520534753799438)

Finally, this is how we can navigate to the closest adverbs:

from difflib import SequenceMatcher

def close_adv(adjective, num=5, model_topn=50):
  # Analogy query: happy -> happily, so adjective -> ?
  positive = [adjective, 'happily']
  negative = [           'happy']
  all_similar = model.most_similar(positive, negative, topn=model_topn)

  def score(candidate):
    # Spelling similarity to the input word, between 0 and 1
    ratio = SequenceMatcher(None, candidate, adjective).ratio()
    # Crude adverb check: +1 for an '-ly' ending
    looks_like_adv = 1.0 if candidate.endswith('ly') else 0.0
    return ratio + looks_like_adv

  # Re-rank the model's neighbours by our score, best first
  close = sorted([(word, score(word)) for word, _ in all_similar], key=lambda x: -x[1])
  return close[:num]

print(close_adv('strong'))
print(close_adv('notable'))
print(close_adv('high'))
print(close_adv('quick'))
print(close_adv('terrible'))
print(close_adv('quiet'))

The result is not ideal, but looks pretty promising:

[(u'strongly', 1.8571428571428572), (u'slowly', 1.3333333333333333), (u'increasingly', 1.3333333333333333), (u'sharply', 1.3076923076923077), (u'largely', 1.3076923076923077)]
[(u'notably', 1.8571428571428572), (u'principally', 1.3333333333333333), (u'primarily', 1.25), (u'prominently', 1.2222222222222223), (u'chiefly', 1.1428571428571428)]
[(u'rapidly', 1.1818181818181819), (u'briefly', 1.1818181818181819), (u'steadily', 1.1666666666666667), (u'dangerously', 1.1333333333333333), (u'continuously', 1.125)]
[(u'quickly', 1.8333333333333335), (u'quietly', 1.5), (u'briskly', 1.3333333333333333), (u'furiously', 1.2857142857142856), (u'furtively', 1.2857142857142856)]
[(u'horribly', 1.625), (u'heroically', 1.4444444444444444), (u'silently', 1.375), (u'uncontrollably', 1.3636363636363638), (u'stoically', 1.3529411764705883)]
[(u'quietly', 1.8333333333333335), (u'silently', 1.4615384615384617), (u'patiently', 1.4285714285714286), (u'discreetly', 1.4), (u'fitfully', 1.3076923076923077)]
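To get a single answer in the spirit of the `getAdverb` function from the question, one option is to take the top candidate and reject it when its score is low. The sketch below assumes a cutoff of 1.7, which is only a guess based on the six outputs above (the good matches score above 1.8, while 'high' tops out near 1.18); `pick_adverb` and `min_score` are names invented for this example.

```python
def pick_adverb(ranked, min_score=1.7):
    """Return the top-ranked word, or None if its score is below min_score.

    `ranked` is a list of (word, score) pairs sorted best first, as
    returned by close_adv. The default cutoff is an untested assumption.
    """
    if not ranked:
        return None
    word, score = ranked[0]
    return word if score >= min_score else None

# Top two candidates copied from the output above.
quick_ranked = [("quickly", 1.8333333333333335), ("quietly", 1.5)]
high_ranked = [("rapidly", 1.1818181818181819), ("briefly", 1.1818181818181819)]

print(pick_adverb(quick_ranked))  # quickly
print(pick_adverb(high_ranked))   # None
```

Returning None for 'high' seems preferable to returning 'rapidly', but the right cutoff would need checking against a larger word list.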

Of course, you can improve on this: use a better check for whether a candidate is an adverb, use nltk.edit_distance to measure word similarity, and so on. This is just an idea and the results are probabilistic, but it looks interesting to me.
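As a rough illustration of the edit-distance variant, here is a sketch that ranks a fixed candidate list by Levenshtein distance instead of SequenceMatcher. The distance is implemented by hand so the snippet has no dependencies; with its default arguments, nltk.edit_distance should give the same numbers, though I have not compared them here. `rank_by_edit_distance` is a hypothetical helper and the candidate list is made up.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def rank_by_edit_distance(adjective, candidates):
    """Sort candidates: '-ly' words first, then by smallest edit distance."""
    return sorted(candidates,
                  key=lambda w: (not w.endswith("ly"), edit_distance(adjective, w)))

print(rank_by_edit_distance("quick", ["quietly", "fast", "quickly", "briskly"]))
# ['quickly', 'quietly', 'briskly', 'fast']
```

In practice the candidates would come from model.most_similar, as in close_adv above.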

Maxim answered Nov 15 '22 10:11