I'm building a website that should collect various news feeds and would like the texts to be compared for similarity. What i need is some sort of a news text similarity algorithm. I know that php has the similar_text function and am not sure how good it is + i need it for javascript. So if anyone could point me to an example or a plugin or any instruction on how this is possible or at least where to look and start investigating.
There's a javascript implementation of the Levenshtein distance metric, which is often used for text comparisons. If you want to compare whole articles or headlines though you might be better off looking at intersections between the sets of words that make up the text (and frequencies of those words) rather than just string similarity measures.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With