I am trying to calculate semantic similarity between two words. I am using WordNet-based similarity measures, i.e. the Resnik measure (RES), Lin measure (LIN), Jiang and Conrath measure (JNC), and Banerjee and Pedersen measure (BNP).
To do that, I am using nltk and WordNet 3.0. Next, I want to combine the similarity values obtained from the different measures. To do that I need to normalize the similarity values, as some measures give values between 0 and 1, while others give values greater than 1.
So, my question is: how do I normalize the similarity values obtained from different measures?
Extra detail on what I am actually trying to do: I have a set of words. I calculate pairwise similarity between the words, and remove the words that are not strongly correlated with the other words in the set.
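A minimal sketch of that filtering step, assuming we already have some normalized, symmetric pairwise similarity function. The threshold and the helper toy_sim below are hypothetical stand-ins, not actual WordNet measures:

```python
def filter_weak_words(words, sim, threshold):
    """Keep only words whose average similarity to the other
    words in the set is at least `threshold`.

    `sim` is any symmetric similarity function returning a value
    in [0, 1] (e.g. a normalized WordNet measure)."""
    kept = []
    for w in words:
        others = [u for u in words if u != w]
        avg = sum(sim(w, u) for u in others) / len(others)
        if avg >= threshold:
            kept.append(w)
    return kept

# Toy stand-in similarity: character-set overlap, for illustration only.
def toy_sim(w, u):
    a, b = set(w), set(u)
    return len(a & b) / len(a | b)

print(filter_weak_words(["cat", "cart", "act", "xyz"], toy_sim, 0.4))
# -> ['cat', 'cart', 'act']
```

In practice `sim` would be one of the normalized WordNet measures (or a weighted combination of them, as discussed in the answer below).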
print wn.synset('gorgeous.a.01').wup_similarity(wn.synset('amazing.a.01')) # None (!!!)
Note that wup_similarity can return None for some pairs, as above: adjective synsets are not organized in a hypernym taxonomy, so there is no path between them. There are several issues with how WordNet computes word similarity; although the method has a number of drawbacks, it performs fairly well.
It calculates the similarity based on how similar the word senses are and where the synsets occur relative to each other in the hypernym tree. For example, hello and selling come out as roughly 27% similar, because they share common hypernyms further up the tree.
It calculates relatedness by considering the depths of the two synsets in the WordNet taxonomies, along with the depth of the LCS (Least Common Subsumer). The score satisfies 0 < score <= 1; it can never be zero, because the depth of the LCS is never zero (the depth of the taxonomy root is one).
One of the core quantities used to calculate similarity is the shortest-path distance between the two synsets via their common hypernym. Note: when two synsets are many steps away from each other, the similarity score is very low, because they are not very similar.
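The depth-based score described above can be written out as a small sketch. wup_from_depths below is an illustrative helper, not the actual nltk implementation:

```python
def wup_from_depths(depth_a, depth_b, depth_lcs):
    """Wu-Palmer similarity from taxonomy depths:
    2 * depth(LCS) / (depth(a) + depth(b)).
    With the root at depth 1, the score is always in (0, 1]."""
    return 2.0 * depth_lcs / (depth_a + depth_b)

# Identical synsets: the LCS is the synset itself -> score 1.0.
print(wup_from_depths(5, 5, 5))   # -> 1.0
# Distant synsets whose LCS sits near the root -> low score.
print(wup_from_depths(8, 7, 1))   # ~0.133
```

This makes the claim in the previous paragraph concrete: since depth(LCS) >= 1, the numerator is never zero, so the score is never zero.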
Let's consider a single arbitrary similarity measure M and take an arbitrary word w. Define m = M(w, w). Then m takes the maximum possible value of M.

Let's define MN as a normalized measure M. For any two words w, u you can compute MN(w, u) = M(w, u) / m.

It's easy to see that if M takes non-negative values, then MN takes values in [0, 1].
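As a sketch in Python, the normalization step might look like this. The toy measure M below is hypothetical, standing in for a raw score (such as Resnik's) that can exceed 1:

```python
def normalize(M, w):
    """Given a raw similarity measure M and a reference word w,
    return MN(a, b) = M(a, b) / M(w, w): if M is non-negative and
    M(w, w) is its maximum, MN maps into [0, 1]."""
    m = M(w, w)
    return lambda a, b: M(a, b) / m

# Toy symmetric measure with values above 1 (hypothetical raw scores).
raw = {("dog", "dog"): 8.0, ("dog", "cat"): 5.2, ("dog", "car"): 1.3}
def M(a, b):
    return raw.get((a, b), raw.get((b, a), 0.0))

MN = normalize(M, "dog")
print(MN("dog", "cat"))  # -> 0.65
```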
In order to compute your own defined measure F, combined of k different measures m_1, m_2, ..., m_k, first normalize each m_i independently using the above method, and then define weights alpha_1, alpha_2, ..., alpha_k such that alpha_i denotes the weight of the i-th measure. All alphas must sum up to 1, i.e.:

alpha_1 + alpha_2 + ... + alpha_k = 1

Then, to compute your own measure for w, u, you do:

F(w, u) = alpha_1 * m_1(w, u) + alpha_2 * m_2(w, u) + ... + alpha_k * m_k(w, u)
It's clear that F takes values in [0, 1].
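A minimal sketch of the weighted combination, with two hypothetical already-normalized measures standing in for the real ones:

```python
def combine(measures, alphas):
    """Combine k normalized measures with weights alpha_i that sum
    to 1: F(w, u) = sum_i alpha_i * m_i(w, u).
    If every m_i maps into [0, 1], so does F."""
    assert abs(sum(alphas) - 1.0) < 1e-9, "weights must sum to 1"
    def F(w, u):
        return sum(a * m(w, u) for a, m in zip(alphas, measures))
    return F

# Two toy normalized measures (stand-ins for e.g. LIN and JNC scores).
m1 = lambda w, u: 0.8
m2 = lambda w, u: 0.4
F = combine([m1, m2], [0.5, 0.5])
print(F("dog", "cat"))  # -> 0.6
```

The weights let you tune how much each measure contributes; with equal alphas, F is simply the average of the normalized scores.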