I have spent days working on a function that extracts the characters common to an array of strings, in the right order, to build a wildcard pattern.
Here is an example to explain my problem. I have written about 3 functions, but each one fails when the absolute position of a letter differs between strings.
Let's assume "+" is the wildcard character:
Array(
0 => '48ca135e0$5',
1 => 'b8ca136a0$5',
2 => 'c48ca13730$5',
3 => '48ca137a0$5');
Should return :
$wildcard='+8ca13+0$5';
In this example, the tricky thing is that $array[2] has one more char than the others.
Another example:
Array(
0 => "case1b25.occHH&FmM",
1 => "case11b25.occHH&FmM",
2 => "case12b25.occHH&FmM",
3 => "case20b25.occHH&FmM1");
Should return :
$wildcard='case+b25.occHH&FmM+';
In this example, the tricky parts are :
- Repeating chars, such as 1 -> 11 in the "to delete" part, and c -> cc in the common part
- The "2" char in $array[2] & [3] in the "to delete" part is not in the same position
- The "1" char at the end of the last string
I really need help because I can't find a solution to this function and it is a main part of my application.
Thanks in advance, don't hesitate to ask questions, I will answer as fast as possible.
Mykeul
Initialize the count array with Integer.MAX_VALUE. Iterate through the given input array from start (index 0) to end (n-1, where n is the length of the array); on each iteration, convert the string into a character array and record the frequency of each character in the count array.
Approach: Count the frequencies of all the characters in both strings. Then, for every character, if its frequency in string s1 is freq1 and in string s2 is freq2, the number of valid pairs for that character is min(freq1, freq2). The sum of this value over all characters is the required answer.
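The min(freq1, freq2) approach above could be sketched in PHP roughly as follows. The function name countCommonChars is made up for illustration, and the sketch assumes single-byte characters, since count_chars() works on bytes.

```php
<?php
// Hypothetical helper: sums min(freq1, freq2) over every byte in both strings.
function countCommonChars(string $s1, string $s2): int
{
    // Mode 1 returns only bytes that occur, keyed by byte value.
    $freq1 = count_chars($s1, 1);
    $freq2 = count_chars($s2, 1);

    $total = 0;
    foreach ($freq1 as $byte => $count) {
        if (isset($freq2[$byte])) {
            $total += min($count, $freq2[$byte]);
        }
    }
    return $total;
}

echo countCommonChars('case1b25', 'case20b25'), "\n"; // 7
```

Note that this only counts shared characters and ignores their order, so on its own it cannot produce the wildcard string asked for in the question.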
It seems you want to create something like a regular expression out of a set of example strings. This can be quite tricky in general. I found this link, not sure if it's relevant: http://scholar.google.com/scholar?hl=en&rlz=1B3GGGL_enEE351EE351&q=%22regular%20expression%20by%20example%22&oq=&um=1&ie=UTF-8&sa=N&tab=ws
On the other hand, if you need only one specific wildcard meaning "0 or more characters", then it should be much easier. The Levenshtein distance algorithm computes the edit distance between 2 strings. Normally only the resulting number is needed, but in your case the places where the strings differ are what matter. You also need to adapt it to N strings.
So I recommend studying this algorithm; hopefully you'll get some ideas on how to solve your problem (at the very least you'll get some practice with text algorithms and dynamic programming).
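To make the idea concrete, here is a minimal sketch that uses a close relative of Levenshtein, the longest common subsequence (LCS), computed with the same kind of dynamic programming table. The function names lcs and buildWildcard are mine, not from any library. Folding the pairwise LCS over N strings is a heuristic (the result can depend on the order of the strings and is not guaranteed to be the longest subsequence common to all of them), and the code assumes single-byte characters.

```php
<?php
// Longest common subsequence of two strings (classic dynamic programming).
function lcs(string $a, string $b): string
{
    $n = strlen($a);
    $m = strlen($b);
    // $len[$i][$j] = LCS length of the suffixes starting at $a[$i] and $b[$j].
    $len = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $len[$i][$j] = ($a[$i] === $b[$j])
                ? $len[$i + 1][$j + 1] + 1
                : max($len[$i + 1][$j], $len[$i][$j + 1]);
        }
    }
    // Walk the table from the start to recover one LCS.
    $out = '';
    $i = 0;
    $j = 0;
    while ($i < $n && $j < $m) {
        if ($a[$i] === $b[$j]) {
            $out .= $a[$i];
            $i++;
            $j++;
        } elseif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            $i++;
        } else {
            $j++;
        }
    }
    return $out;
}

// Hypothetical helper: builds the wildcard pattern for an array of strings.
function buildWildcard(array $strings, string $wild = '+'): string
{
    $strings = array_values($strings);
    if (count($strings) === 0) {
        return '';
    }
    // Heuristic for N strings: fold the pairwise LCS over the whole array.
    $common = $strings[0];
    foreach ($strings as $s) {
        $common = lcs($common, $s);
    }
    // $gap[$p] is true if any string has extra chars before common char $p
    // (or after the last common char when $p equals the common length).
    $k = strlen($common);
    $gap = array_fill(0, $k + 1, false);
    foreach ($strings as $s) {
        $pos = 0; // index of the next common char to match (leftmost match)
        for ($i = 0, $n = strlen($s); $i < $n; $i++) {
            if ($pos < $k && $s[$i] === $common[$pos]) {
                $pos++;
            } else {
                $gap[$pos] = true;
            }
        }
    }
    $out = '';
    for ($p = 0; $p <= $k; $p++) {
        if ($gap[$p]) {
            $out .= $wild;
        }
        if ($p < $k) {
            $out .= $common[$p];
        }
    }
    return $out;
}

echo buildWildcard(['48ca135e0$5', 'b8ca136a0$5', 'c48ca13730$5', '48ca137a0$5']), "\n";
// +8ca13+0$5
echo buildWildcard(['case1b25.occHH&FmM', 'case11b25.occHH&FmM', 'case12b25.occHH&FmM', 'case20b25.occHH&FmM1']), "\n";
// case+b25.occHH&FmM+
```

Both examples from the question come out as expected with this sketch. The leftmost matching in the second step is a simple choice; when a common character is repeated inside a "to delete" part, a different alignment could place the wildcard elsewhere, so treat it as a starting point.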
Here's the algorithm in PHP: http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#PHP
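PHP also ships a built-in levenshtein() function. A quick sketch with two of the strings from the question shows why the built-in alone is not enough here: it returns only the distance, not the positions of the differences.

```php
<?php
$a = '48ca135e0$5';
$b = 'b8ca136a0$5';

// Number of single-character insertions, deletions and substitutions
// needed to turn $a into $b.
$distance = levenshtein($a, $b);

echo $distance, "\n"; // 3
```

To know where the strings differ, you would need your own implementation that keeps the dynamic programming table and walks back through it.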
You might also want to search for PHP implementations of "diff": http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/