 

How to calculate the similarity of English words that do not appear in WordNet?

A common natural language processing practice is to compute the similarity between two words using WordNet. I'll start my question with the following Python code:

from nltk.corpus import wordnet
sport = wordnet.synsets("sport")[0]
badminton = wordnet.synsets("badminton")[0]
print(sport.wup_similarity(badminton))

We will get 0.8421

Now what if I look up "haha" and "lol" as follows:

haha = wordnet.synsets("haha")
lol = wordnet.synsets("lol")
print(haha)
print(lol)

We will get

[]
[]

Both lookups return empty lists, so we cannot compute a similarity between them. What can we do in this case?

asked Jul 08 '16 by Duong Trung Nghia

2 Answers

You can build a semantic space from co-occurrence matrices using a tool like Dissect (DIStributional SEmantics Composition Toolkit), and then you are set to measure semantic similarity between words or phrases (if you compose words).

In your case, for "haha" and "lol", you'll first need to collect those co-occurrences from a corpus that actually contains them.

Another thing to try is word2vec.
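The distributional idea above can be sketched without any external toolkit: count each word's context words in a window, then compare the resulting count vectors with cosine similarity. The toy corpus, window size, and helper names below are illustrative assumptions, not Dissect's actual API.

```python
# Toy distributional-similarity sketch: words that appear in
# similar contexts get similar co-occurrence vectors.
from collections import Counter
from math import sqrt

# Illustrative toy corpus (assumption): "haha" and "lol" share contexts.
corpus = [
    "haha that joke was funny",
    "lol that joke was funny",
    "haha so funny",
    "lol so funny",
    "rain is wet today",
]

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within +/- window."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            ctx = vectors.setdefault(word, Counter())
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = cooccurrence_vectors(corpus)
print(cosine(vecs["haha"], vecs["lol"]))   # high: shared contexts
print(cosine(vecs["haha"], vecs["rain"]))  # low: no shared contexts
```

On a real corpus you would typically also weight the counts (e.g. with PPMI) and reduce dimensionality, which is essentially what Dissect automates.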

answered Oct 13 '22 by sarnthil

There are two word2vec model architectures worth knowing:

CBOW (continuous bag-of-words): predicts a target word from its surrounding context words.

Skip-gram: the inverse of CBOW: it predicts the context words given the target word.

Look at this: https://www.quora.com/What-are-the-continuous-bag-of-words-and-skip-gram-architectures-in-laymans-terms

These models are well presented here: https://www.tensorflow.org/tutorials/word2vec. Also, gensim is a good Python library for this kind of task.


Also look at TensorFlow's word2vec example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/word2vec/word2vec_basic.py

For background on word2vec, see: https://en.wikipedia.org/wiki/Word2vec

answered Oct 13 '22 by Masoud