Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find smallest substring which contains all characters from a given string?

I have recently come across an interesting question on strings. Suppose you are given following:

Input string1: "this is a test string" Input string2: "tist" Output string: "t stri" 

So, given above, how can I approach towards finding smallest substring of string1 that contains all the characters from string 2?

like image 646
Rajendra Uppal Avatar asked Mar 17 '10 03:03

Rajendra Uppal


People also ask

What is contiguous substring?

In formal language theory and computer science, a substring is a contiguous sequence of characters within a string. For instance, "the best of" is a substring of "It was the best of times".

How to find the smallest substring of a string in JavaScript?

Given an array of unique characters arr and a string str, Implement a function getShortestUniqueSubstring that finds the smallest substring of str containing all the characters in arr. Return "" (empty string) if such a substring doesn’t exist. Come up with an asymptotically optimal solution and analyze the time and space complexities.

Which is the shortest substring that contains the string ‘B’?

Explanation : The substring A [2 : 5] is the shortest substring that contains the string ‘B’ as a subsequence. Explanation : Although both the substrings A [2:4] and A [5:7] have the same length, the substring which has the smallest starting index is “bcd” so it is the final answer.

How do you find the smallest window length of a string?

Given a string, find the smallest window length with all distinct characters of the given string. For eg. str = “aabcbcdbca”, then the result would be 4 as of the smallest window will be “dbca” . Input: aabcbcdbca Output: dbca Explanation: Possible substrings= {aabcbcd, abcbcd, bcdbca, dbca....}

What is the smallest window containing all distinct characters in Python?

Smallest window containing all distinct characters is: dbca Time Complexity: O (N^2). This time is required to generate all possible sub-strings of a string of length “N”. Space Complexity: O (N). Method 2: Here we have used Sliding Window technique to arrive at the solution.


1 Answers

To see more details including working code, check my blog post at:

http://www.leetcode.com/2010/11/finding-minimum-window-in-s-which.html

To help illustrate this approach, I use an example: string1 = "acbbaca" and string2 = "aba". Here, we also use the term "window", which means a contiguous block of characters from string1 (could be interchanged with the term substring).

alt text

i) string1 = "acbbaca" and string2 = "aba".

alt text

ii) The first minimum window is found. Notice that we cannot advance begin pointer as hasFound['a'] == needToFind['a'] == 2. Advancing would mean breaking the constraint.

alt text

iii) The second window is found. begin pointer still points to the first element 'a'. hasFound['a'] (3) is greater than needToFind['a'] (2). We decrement hasFound['a'] by one and advance begin pointer to the right.

alt text

iv) We skip 'c' since it is not found in string2. Begin pointer now points to 'b'. hasFound['b'] (2) is greater than needToFind['b'] (1). We decrement hasFound['b'] by one and advance begin pointer to the right.

alt text

v) Begin pointer now points to the next 'b'. hasFound['b'] (1) is equal to needToFind['b'] (1). We stop immediately and this is our newly found minimum window.

The idea is mainly based on the help of two pointers (begin and end position of the window) and two tables (needToFind and hasFound) while traversing string1. needToFind stores the total count of a character in string2 and hasFound stores the total count of a character met so far. We also use a count variable to store the total characters in string2 that's met so far (not counting characters where hasFound[x] exceeds needToFind[x]). When count equals string2's length, we know a valid window is found.

Each time we advance the end pointer (pointing to an element x), we increment hasFound[x] by one. We also increment count by one if hasFound[x] is less than or equal to needToFind[x]. Why? When the constraint is met (that is, count equals to string2's size), we immediately advance begin pointer as far right as possible while maintaining the constraint.

How do we check if it is maintaining the constraint? Assume that begin points to an element x, we check if hasFound[x] is greater than needToFind[x]. If it is, we can decrement hasFound[x] by one and advancing begin pointer without breaking the constraint. On the other hand, if it is not, we stop immediately as advancing begin pointer breaks the window constraint.

Finally, we check if the minimum window length is less than the current minimum. Update the current minimum if a new minimum is found.

Essentially, the algorithm finds the first window that satisfies the constraint, then continue maintaining the constraint throughout.

like image 115
1337c0d3r Avatar answered Oct 04 '22 00:10

1337c0d3r