Algorithm to find the minimum length of substring having all characters of other string

1 Answers

Keep two pointer l and r, and a hash table M = character -> count for characters in string2 that do not occur in s[l..r].

Initially set l = 0 and r so that string1[l..r] contains all the characters of string2 (if possible). You do that by removing characters from M until it is empty.

Then proceed by incrementing r by one in each step and then incrementing l as much as possible while still keeping M empty. The minimum over all r - l + 1 (the length of the substring s[l..r]) is the solution.

Pythonish pseudocode:

n = len(string1)
M = {}   # let's say M is empty if it contains no positive values
for c in string2:
    M[c]++
l = 0
r = -1
while r + 1 < n and M not empty:
    r++
    M[string1[r]]--
if M not empty: 
    return "no solution"
answer_l, answer_r = l, r
while True:
    while M[string1[l]] < 0:
        M[string1[l]]++
        l++
    if r - l + 1 < answer_r - anwer_l + 1:
        answer_l, answer_r = l, r
    r++
    if r == n:
        break
    M[string1[r]]--
return s[answer_l..answer_r]

The "is empty" checks can be implemented in O(1) if you maintain the number of positive entries when performing the increment and decrement operations.

Let n be the length of string1 and m be the length of string2.

Note that l and r are only ever incremented, so there are at most O(n) increments and thus at most O(n) instructions are executed in the last outer loop.

If M is implemented as an array (I assume the alphabet is constant size), you get runtime O(n + m), which is optimal. If the alphabet is too large, you can use a hash table to get expected O(n + m).

Example execution:

string1 = "abbabcdbcb"
string2 = "cbb"

# after first loop
M = { 'a': 0, 'b': 2, 'c': 1, 'd': 0 }

# after second loop
l = 0
r = 5
M = { 'a': -2, 'b': -1, 'c': 0, 'd': 0 }

# increment l as much as possible:
l = 2
r = 5
M = { 'a': -1, 'b': 0, 'c': 0, 'd': 0 }

# increment r by one and then l as much as possible
l = 2
r = 6
M = { 'a': -1, 'b': 0, 'c': 0, 'd': -1 }

# increment r by one and then l as much as possible
l = 4
r = 7
M = { 'a': 0, 'b': 0, 'c': 0, 'd': -1 }

# increment r by one and then l as much as possible
l = 4
r = 8
M = { 'a': 0, 'b': 0, 'c': -1, 'd': -1 }

# increment r by one and then l as much as possible
l = 7
r = 9
M = { 'a': 0, 'b': 0, 'c': 0, 'd': 0 }

The best solution is s[7..9].

101

answered Sep 20 '22 16:09

Niklas B.

Related questions
                            
                                Open a lock with the least number of moves
                            
                                How to arrange a graph linearly with no overlapping?
                            
                                Obtaining the minimum number of tails of coin after flipping the entire row or column multiple times
                            
                                Algorithm to make a polynomial fit of a part of a data set
                            
                                OpenCV find coloured in circle and position value Python
                            
                                Reasoning behind shifting over the text whem mismatch occurs in KMP algorithm?
                            
                                Three elements in array whose xor is maximum
                            
                                Finding all subsets of a multiset
                            
                                Push, Pop, Range in constant time
                            
                                Find a single number in a list when other numbers occur more than twice
                            
                                Rotating right an array of int in c#?
                            
                                Subset-Sum in Linear Time
                            
                                NetworkX multi-directed graph possible?
                            
                                Efficient data structure for very long array of characters with insertions
                            
                                Calculating the Point3Ds of a Cuboid around a Line
                            
                                Insertion sort much faster than shell sort
                            
                                Codility - min average slice
                            
                                Betweenness centrality Brandes algorithm
                            
                                Recursion vs bitmasking for getting all combinations of vector elements
                            
                                Implementing in-place modification algorithms in R

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Algorithm to find the minimum length of substring having all characters of other string

Tags:

algorithm

instance

People also ask

1 Answers

Niklas B.

Recent Activity

Donate For Us