Is there an algorithm that tells the semantic similarity of two phrases?

3 Answers

You might want to check out this paper:

Sentence similarity based on semantic nets and corpus statistics (PDF)

I've implemented the algorithm described. Our context was very general (effectively any two English sentences), and we found that the approach was too slow and that the results, while promising, were not good enough (nor likely to become so without considerable extra effort).

You don't give a lot of context, so I can't necessarily recommend this, but reading the paper could help you understand how to tackle the problem.

Regards,

Matt.

193

answered Sep 21 '22 07:09

Matt Mower

There's a short and a long answer to this.

The short answer:

Use the WordNet::Similarity Perl package. If Perl is not your language of choice, check the WordNet project page at Princeton, or search for a wrapper library in your language.
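To give a feel for what a taxonomy-based measure computes, here is a minimal sketch of a path-length similarity: the score is 1 / (1 + number of edges on the shortest path between two words through a shared ancestor). The `HYPERNYM` table is a tiny hierarchy invented for illustration, not WordNet data, and the function names are hypothetical; a real WordNet-based tool works over a far larger graph and offers several other measures.

```python
# Toy is-a hierarchy: child -> parent. These entries are invented for
# illustration and are NOT taken from WordNet.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "chair": "furniture",
    "furniture": "artifact",
    "animal": "entity",
    "artifact": "entity",
}


def ancestors(word):
    """Return {node: distance} for the word itself and everything above it."""
    chain = {}
    depth = 0
    while word is not None:
        chain[word] = depth
        word = HYPERNYM.get(word)  # None once we pass the root
        depth += 1
    return chain


def path_similarity(a, b):
    """1 / (1 + edges on the shortest path joining a and b via a common ancestor)."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return 0.0  # the two words share no ancestor in this toy hierarchy
    shortest = min(up_a[node] + up_b[node] for node in common)
    return 1.0 / (1.0 + shortest)


for pair in [("dog", "wolf"), ("dog", "cat"), ("dog", "chair")]:
    print(pair, round(path_similarity(*pair), 3))
```

In this toy hierarchy "dog" scores higher against "wolf" than against "cat", and lowest against "chair", which is the behaviour you would hope for from such a measure.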

The long answer:

Determining word similarity is a complicated issue, and research is still very hot in this area. To compute similarity, you need an appropriate representation of the meaning of a word. But what would be a representation of the meaning of, say, 'chair'? In fact, what is the exact meaning of 'chair'? If you think long and hard about this, it will twist your mind, you will go slightly mad, and you will finally take up a research career in Philosophy or Computational Linguistics to find the truth™. Both philosophers and linguists have tried to come up with an answer for literally thousands of years, and there's no end in sight.

So, if you're interested in exploring this problem in a little more depth, I highly recommend reading Chapter 20.7 of Speech and Language Processing by Jurafsky and Martin, some of which is available through Google Books. It gives a very good overview of the state of the art in distributional methods, which use word co-occurrence statistics to define a measure of word similarity. You are not likely to find libraries implementing these, however.
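As a rough illustration of the distributional idea (not the specific methods from the book), the sketch below represents each word by counts of the words that appear near it and compares two words by the cosine of their count vectors. The corpus, the window size of 2, and the function names are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Tiny invented corpus; real distributional models need far more text.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "we sat on the chair",
]
vectors = cooccurrence_vectors(corpus)
print("cat ~ dog  :", round(cosine(vectors["cat"], vectors["dog"]), 3))
print("cat ~ chair:", round(cosine(vectors["cat"], vectors["chair"]), 3))
```

Even on this tiny corpus "cat" ends up closer to "dog" than to "chair", because the first two share contexts such as "drinks" and "chases". With raw counts, frequent words like "the" dominate the vectors, which is one reason practical systems reweight the counts.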

answered Sep 24 '22 07:09

nfelger

For anyone just coming to this, I would suggest taking a look at SEMILAR (http://www.semanticsimilarity.org/). It implements many of the modern research methods for calculating word and sentence similarity, and it is written in Java.

The SEMILAR API comes with various similarity methods based on WordNet, Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), BLEU, Meteor, Pointwise Mutual Information (PMI), dependency-based methods, optimized methods based on Quadratic Assignment, etc. The similarity methods work at different granularities: word to word, sentence to sentence, or larger texts.
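To make one of the listed measures concrete, here is a minimal sketch of Pointwise Mutual Information, PMI(x, y) = log2(P(x, y) / (P(x) P(y))), estimated from how often two words occur in the same document. This is written from the textbook definition and is not SEMILAR's API; the `pmi` function and the sample documents are hypothetical.

```python
import math


def pmi(word_x, word_y, documents):
    """PMI of two words, with probabilities estimated as document frequencies."""
    docs = [set(d.lower().split()) for d in documents]
    n = len(docs)
    count_x = sum(1 for d in docs if word_x in d)
    count_y = sum(1 for d in docs if word_y in d)
    count_xy = sum(1 for d in docs if word_x in d and word_y in d)
    if count_xy == 0:
        # The words never co-occur, so the ratio is 0 and its log is -infinity.
        return float("-inf")
    # P(x,y) / (P(x) * P(y)) simplifies to count_xy * n / (count_x * count_y).
    return math.log2((count_xy * n) / (count_x * count_y))


# Invented sample documents for illustration only.
documents = [
    "strong tea with milk",
    "strong tea and biscuits",
    "powerful computer with memory",
    "strong coffee",
    "powerful engine",
]
print("strong/tea  :", pmi("strong", "tea", documents))
print("powerful/tea:", pmi("powerful", "tea", documents))
```

A positive score means the two words co-occur more often than chance would predict; a score of negative infinity means they never appear together in the sample.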

answered Sep 20 '22 07:09

kyrenia


Tags:

algorithm

semantics

nlp

btw0
