Is there a fast algorithm for finding the Largest Common Substring in two strings
or is it an NPComplete problem?
In PHP I can find a needle in a haystack:
<?php
if (strstr("there is a needle in a haystack", "needle")) {
echo "found<br>\n";
}
?>
I guess I could do this in a loop over one of the strings
but that would be very expensive! Especially since my application of this is to search a database of email and look for spam (i.e. similar emails sent by the same person).
Does anyone have any PHP code they can throw out there?
php function getBetween($content,$start,$end){ $r = explode($start, $content); if (isset($r[1])){ $r = explode($end, $r[1]); return $r[0]; } return ''; } ?>
Naive Approach: Let strings X and Y be the lengths m and n respectively. Generate all possible substrings of X which requires a time complexity of O(m2) and search each substring in the string Y which can be achieved in O(n) time complexity using the KMP algorithm. Overall time complexity will be O(n * m2).
The simplest approach to solve this problem is to generate all the substrings of the given string and among all substrings having all unique characters, return the maximum length. To generate all substrings of a string, loop from the start till the end index of the string.
The similar_text function may be what you want.
This calculates the similarity between two strings. Returns the number of matching chars in both strings
You may also want to look at levenshtein
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With