Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find length of smallest window that contains all the characters of a string in another string

Recently i have been interviewed. I didn't do well cause i got stuck at the following question

suppose a sequence is given : A D C B D A B C D A C D and search sequence is like: A C D

task was to find the start and end index in given string that contains all the characters of search string preserving the order.

Output: assuming index start from 1:

start index 10 end index 12

explanation :

1.start/end index are not 1/3 respectively because though they contain the string but order was not maintained

2.start/end index are not 1/5 respectively because though they contain the string in the order but the length is not optimum

3.start/end index are not 6/9 respectively because though they contain the string in the order but the length is not optimum

Please go through How to find smallest substring which contains all characters from a given string?.

But the above question is different since the order is not maintained. I'm still struggling to maintain the indexes. Any help would be appreciated . thanks

like image 486
Ankush Dubey Avatar asked Oct 06 '13 07:10

Ankush Dubey


People also ask

What is minimum window substring?

Minimum Window Substring. Given two strings s and t of lengths m and n respectively, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such substring, return the empty string "" .

How do you find the smallest string?

To find the smallest of given List of strings based on length in Python, call min() builtin function, pass the strings to compare as arguments to min() function, and also pass len() function for key parameter. min() function returns the string with the smallest length based on the key.


2 Answers

I tried to write some simple c code to solve the problem:

Update:

I wrote a search function that looks for the required characters in correct order, returning the length of the window and storing the window start point to ìnt * startAt. The function processes a sub-sequence of given hay from specified startpoint int start to it's end

The rest of the algorithm is located in main where all possible subsequences are tested with a small optimisation: we start looking for the next window right after the startpoint of the previous one, so we skip some unnecessary turns. During the process we keep track f the 'till-now best solution

Complexity is O(n*n/2)

Update2:

unnecessary dependencies have been removed, unnecessary subsequent calls to strlen(...) have been replaced by size parameters passed to search(...)

#include <stdio.h>

// search for single occurrence
int search(const char hay[], int haySize, const char needle[], int needleSize, int start, int * startAt)
{
    int i, charFound = 0;

    // search from start to end
    for (i = start; i < haySize; i++)
    {
        // found a character ?
        if (hay[i] == needle[charFound])
        {               
            // is it the first one?
            if (charFound == 0) 
                *startAt = i;   // store starting position
            charFound++;    // and go to next one
        }
        // are we done?
        if (charFound == needleSize)
            return i - *startAt + 1;    // success
    }
    return -1;  // failure
}

int main(int argc, char **argv)
{

    char hay[] = "ADCBDABCDACD";
    char needle[] = "ACD";

    int resultStartAt, resultLength = -1, i, haySize = sizeof(hay) - 1, needleSize = sizeof(needle) - 1;

    // search all possible occurrences
    for (i = 0; i < haySize - needleSize; i++)
    {
        int startAt, length;

        length = search(hay, haySize, needle, needleSize, i, &startAt);

        // found something?
        if (length != -1)
        {
            // check if it's the first result, or a one better than before
            if ((resultLength == -1) || (resultLength > length))
            {
                resultLength = length;
                resultStartAt = startAt;
            }
            // skip unnecessary steps in the next turn
            i = startAt;
        }
    }

    printf("start at: %d, length: %d\n", resultStartAt, resultLength);

    return 0;
}
like image 123
xmoex Avatar answered Sep 28 '22 05:09

xmoex


Start from the beginning of the string.

If you encounter an A, then mark the position and push it on a stack. After that, keep checking the characters sequentially until
1. If you encounter an A, update the A's position to current value.
2. If you encounter a C, push it onto the stack.

After you encounter a C, again keep checking the characters sequentially until,
1. If you encounter a D, erase the stack containing A and C and mark the score from A to D for this sub-sequence.
2. If you encounter an A, then start another Stack and mark this position as well.
2a. If now you encounter a C, then erase the earlier stacks and keep the most recent stack.
2b. If you encounter a D, then erase the older stack and mark the score and check if it is less than the current best score.

Keep doing this till you reach the end of the string.

The pseudo code can be something like:

Initialize stack = empty;
Initialize bestLength = mainString.size() + 1; // a large value for the subsequence.
Initialize currentLength = 0;
for ( int i = 0; i < mainString.size(); i++ ) {

  if ( stack is empty ) {
    if ( mainString[i] == 'A' ) {
      start a new stack and push A on it.
      mark the startPosition for this stack as i.
    }
    continue;
  }

  For each of the stacks ( there can be at most two stacks prevailing, 
                           one of size 1 and other of size 0 ) {
    if ( stack size == 1 ) // only A in it {
      if ( mainString[i] == 'A' ) {
        update the startPosition for this stack as i.
      }
      if ( mainString[i] == 'C' ) {
        push C on to this stack.
      }
    } else if ( stack size == 2 ) // A & C in it {
      if ( mainString[i] == 'C' ) {
        if there is a stack with size 1, then delete this stack;// the other one dominates this stack.
      }
      if ( mainString[i] == 'D' ) {
        mark the score from startPosition till i and update bestLength accordingly.
        delete this stack.
      }
    }

  }

}
like image 23
Abhishek Bansal Avatar answered Sep 28 '22 05:09

Abhishek Bansal