Find common chars in array of strings, in the right order

I spent days working on a function to get common chars in an array of strings, in the right order, to create a wildcard.

Here is an example to explain my problem. I have written about three functions, but each one fails when the absolute position of a letter differs between strings.

Let's assume "+" is the "wildcard char":

Array(
0 => '48ca135e0$5',
1 => 'b8ca136a0$5',
2 => 'c48ca13730$5',
3 => '48ca137a0$5');

Should return :

$wildcard='+8ca13+0$5';

In this example, the tricky thing is that $array[2] has one more char than the others.

Other example :

Array(
0 => "case1b25.occHH&FmM",
1 => "case11b25.occHH&FmM",
2 => "case12b25.occHH&FmM",
3 => "case20b25.occHH&FmM1");

Should return :

$wildcard='case+b25.occHH&FmM+';

In this example, the tricky parts are :
- Repeating chars, such as 1 -> 11 in the "to delete" part, and c -> cc in the common part
- The "2" char in $array[2] & [3] in the "to delete" part is not in the same position
- The "1" char at the end of the last string

I really need help because I can't find a solution to this function and it is a main part of my application.

Thanks in advance, don't hesitate to ask questions, I will answer as fast as possible.

Mykeul

asked Jan 29 '10 10:01 by Mykeul



1 Answer

It seems you want to create something like a regular expression out of a set of example strings. This can be quite tricky in general. I found this link, though I'm not sure it's relevant: http://scholar.google.com/scholar?hl=en&rlz=1B3GGGL_enEE351EE351&q=%22regular%20expression%20by%20example%22&oq=&um=1&ie=UTF-8&sa=N&tab=ws

On the other hand, if you only need one specific wildcard meaning "0 or more characters", it should be much easier. The Levenshtein distance algorithm computes the edit distance between two strings, that is, how many single-character insertions, deletions and substitutions turn one into the other. Normally only the resulting number is needed, but in your case the positions of the differences are what matter. You also need to adapt the algorithm to N strings.
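To make the "positions of the differences" idea concrete, here is a minimal sketch. It is my own assumption about one way to do it, not a proven solution: it uses a longest-common-subsequence (LCS) table, a close relative of the Levenshtein table, rather than Levenshtein itself. The helper names mergePattern and buildWildcard are hypothetical. It folds the strings in pairwise, which is not guaranteed to give the best pattern for N strings in general, and it treats strings as single-byte.

```php
<?php
// Sketch only: hypothetical helpers, pairwise LCS folding, single-byte strings.

/**
 * Merge a pattern (array of single chars, null = wildcard) with a string.
 * Keeps one longest common subsequence and marks every gap with one null.
 */
function mergePattern(array $pattern, string $s): array
{
    $t = $s === '' ? [] : str_split($s);
    $n = count($pattern);
    $m = count($t);

    // $lcs[$i][$j] = LCS length of $pattern[$i..] and $t[$j..].
    // A wildcard (null) in the pattern never matches a character.
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            if ($pattern[$i] !== null && $pattern[$i] === $t[$j]) {
                $lcs[$i][$j] = $lcs[$i + 1][$j + 1] + 1;
            } else {
                $lcs[$i][$j] = max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
            }
        }
    }

    // Walk the table from the start, matching as early as possible.
    $out = [];
    $gap = false;
    $i = 0;
    $j = 0;
    while ($i < $n && $j < $m) {
        if ($pattern[$i] !== null && $pattern[$i] === $t[$j]) {
            if ($gap) {            // close the pending gap with one wildcard
                $out[] = null;
                $gap = false;
            }
            $out[] = $t[$j];
            $i++;
            $j++;
        } elseif ($lcs[$i + 1][$j] >= $lcs[$i][$j + 1]) {
            $i++;                  // drop a pattern token
            $gap = true;
        } else {
            $j++;                  // drop a char of the new string
            $gap = true;
        }
    }
    if ($gap || $i < $n || $j < $m) {
        $out[] = null;             // leftover on either side: trailing wildcard
    }
    return $out;
}

/** Fold all strings into one pattern and render it with the wildcard char. */
function buildWildcard(array $strings, string $wildcard = '+'): string
{
    if ($strings === []) {
        return '';
    }
    $first = array_shift($strings);
    $pattern = $first === '' ? [] : str_split($first);
    foreach ($strings as $s) {
        $pattern = mergePattern($pattern, $s);
    }
    return implode('', array_map(fn($c) => $c ?? $wildcard, $pattern));
}

echo buildWildcard(['48ca135e0$5', 'b8ca136a0$5', 'c48ca13730$5', '48ca137a0$5']), PHP_EOL;
// +8ca13+0$5
echo buildWildcard([
    "case1b25.occHH&FmM", "case11b25.occHH&FmM",
    "case12b25.occHH&FmM", "case20b25.occHH&FmM1",
]), PHP_EOL;
// case+b25.occHH&FmM+
```

Wildcards are kept as null inside the pattern and only rendered at the end, so a literal "+" in the input would not be confused with a wildcard while merging (the rendered result would still be ambiguous, so pick a wildcard char that cannot occur in your data). When several alignments tie, a different tie-breaking rule can give a different but equally valid pattern.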

So I recommend studying this algorithm; hopefully it will give you some ideas for solving your problem (at the very least you'll get some practice with text algorithms and dynamic programming).

Here's the algorithm in PHP: http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#PHP
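Note that PHP also ships a built-in levenshtein() function, but it returns only the distance, not where the edits happen, which is why a hand-written table like the one linked above is needed for this problem. A quick illustration using the first two strings from the question:

```php
<?php
$a = '48ca135e0$5';
$b = 'b8ca136a0$5';

// Built-in: gives the number of single-character edits only.
$distance = levenshtein($a, $b);

echo $distance, PHP_EOL; // 3 (4->b, 5->6, e->a), but no positions are reported
```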

You might also want to search for PHP implementations of "diff", for example: http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/

answered Nov 15 '22 07:11 by Aivar