Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to find the union of multiple strings grouped by character index

I have been trying to think of a performance efficient way of finding the union of character occurrences in a set of fixed width strings grouped by index. Something like this;

s1 = "013965"
s2 = "015935"
s3 = "310012"

Resulting in the following where each group of digits exists in all strings at char index n:

out = "[03][1][350][90][631][52]"

I have thought of doing it the very naive way of iterating through every string, at every index while storing the intermediate strings in an array and then iterating through that array to build the output value. However my approach seems to me as a highly inefficient way which is way too far from an asymptotically optimal solution.

like image 895
pondigi Avatar asked Feb 13 '14 21:02

pondigi


1 Answers

If the set of all possible characters is known in advance, let's say their number is n, with n being not too high (e.g. 10, if you're doing only digits), you can do this by creating m boolean arrays of length n, with m being the number of positions, or digits in input strings and n. The n-th position in m-th array will be true, if the n-th character is present in m-th position in any of input strings. False will denote, that no such character was in m-th position before.

Then you can iterate over each string, and when you encounter character n in position m you mark down true in n-th position of m-th array. At the end, you will have m arrays, each describing the content of m-th group

pos[0] = {true, true, false, false, false, true, true, false, true, false}
pos[1] = {true, false, false, false, false, false, false, false, false, false}
pos[2] = {false, false, true, true, false, false, false, false, false, true}

translates to

[0,1,5,6,8] [0] [2,3,9]

As all the structures are direct access arrays, there is no lookup involved, all access is in constant time and you only need to visit every character once, with no comparison involved. Hope this helps.

like image 79
Warlord Avatar answered Nov 05 '22 23:11

Warlord