Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Minimum window width in string x that contains all characters of string y

Tags:

algorithm

Find minimum window width in string x that contains all characters of another string y. For example:

String x = "coobdafceeaxab"
String y = "abc"

The answer should be 5, because the shortest substring in x that contains all three letters of y is "bdafc".

I can think of a naive solution with complexity O(n^2 * log(m)), where n = len(x) and m = len(y). Can anyone suggest a better solution? Thanks.

Update: now think of it, if I change my set to tr1::unordered_map, then I can cut the complexity down to O(n^2), because insertion and deletion should both be O(1).

like image 805
grokus Avatar asked Aug 28 '10 19:08

grokus


4 Answers

time: O(n) (One pass)
space: O(k)

This is how I would do it:
Create a hash table for all the characters from string Y. (I assume all characters are different in Y).

First pass:
Start from first character of string X.
update hash table, for exa: for key 'a' enter location (say 1).
Keep on doing it until you get all characters from Y (until all key in hash table has value).
If you get some character again, update its newer value and erase older one.

Once you have first pass, take smallest value from hash table and biggest value.
Thats the minimum window observed so far.

Now, go to next character in string X, update hash table and see if you get smaller window.


Edit:

Lets take an example here:
String x = "coobdafceeaxab"
String y = "abc"

First initialize a hash table from characters of Y.
h[a] = -1
h[b] = -1
h[c] = -1

Now, Start from first character of X.
First character is c, h[c] = 0
Second character (o) is not part of hash, skip it.
..
Fourth character (b), h[b] = 3
..
Sixth character(a), enter hash table h[a] = 5.
Now, all keys from hash table has some value.
Smallest value is 0 (of c) and highest value is 5 (of a), minimum window so far is 6 (0 to 5).
First pass is done.

Take next character. f is not part of hash table, skip it.
Next character (c), update hash table h[c] = 7.
Find new window, smallest value is 3 (of b) and highest value is 7 (of c).
New window is 3 to 7 => 5.

Keep on doing it till last character of string X.

I hope its clear now.


Edit

There are some concerns about finding max and min value from hash.
We can maintain sorted Link-list and map it with hash table.
Whenever any element from Link list changes, it should be re-mapped to hash table.
Both these operation are O(1)

Total space would be m+m


Edit

Here is small visualisation of above problem: For "coobdafceeaxab" and "abc"

step-0:
Initial doubly linked-list = NULL
Initial hash-table = NULL

step-1:
Head<->[c,0]<->tail
h[c] = [0, 'pointer to c node in LL']

step-2:
Head<->[c,0]<->[b,3]<->tail
h[c] = [0, 'pointer to c node in LL'], h[b] = [3, 'pointer to b node in LL'],

Step-3:
Head<->[c,0]<->[b,3]<->[a,5]<->tail
h[c] = [0, 'pointer to c node in LL'], h[b] = [3, 'pointer to b node in LL'], h[a] = [5, 'pointer to a node in LL']
Minimum Window => difference from tail and head => (5-0)+1 => Length: 6

Step-4:
Update entry of C to index 7 here. (Remove 'c' node from linked-list and append at the tail)
Head<->[b,3]<->[a,5]<->[c,7]<->tail
h[c] = [7, 'new pointer to c node in LL'], h[b] = [3, 'pointer to b node in LL'], h[a] = [5, 'pointer to a node in LL'],
Minimum Window => difference from tail and head => (7-3)+1 => Length: 5

And so on..

Note that above Linked-list update and hash table update are both O(1).
Please correct me if I am wrong..


Summary:

TIme complexity: O(n) with one pass
Space Complexity: O(k) where k is length of string Y

like image 60
12 revs Avatar answered Oct 31 '22 19:10

12 revs


I found this very nice O(N) time complexity version here http://leetcode.com/2010/11/finding-minimum-window-in-s-which.html, and shortened it slightly (removed continue in a first while , which allowed to simplify condition for the second while loop). Note, that this solution allows for duplicates in the second string, while many of the above answers do not.

private static String minWindow(String s, String t) {
    int[] needToFind = new int[256];
    int[] hasFound = new int[256];

    for(int i = 0; i < t.length(); ++i) {
       needToFind[t.charAt(i)]++;
    }

    int count = 0;
    int minWindowSize = Integer.MAX_VALUE;
    int start = 0, end = -1;
    String window = "";

    while (++end < s.length()) {
        char c = s.charAt(end);
        if(++hasFound[c] <= needToFind[c]) {
           count++;
        }

        if(count < t.length()) continue;
        while (hasFound[s.charAt(start)] > needToFind[s.charAt(start)]) {
           hasFound[s.charAt(start++)]--;
        }

        if(end - start + 1 < minWindowSize) {
           minWindowSize = end - start + 1;
           window = s.substring(start, end + 1);
        }
    }
    return window;
}
like image 28
javabrew Avatar answered Oct 31 '22 20:10

javabrew


Here's my solution in C++:

int min_width(const string& x, const set<char>& y) {
  vector<int> at;
  for (int i = 0; i < x.length(); i++)
    if (y.count(x[i]) > 0)
      at.push_back(i);

  int ret = x.size();
  int start = 0;
  map<char, int> count;

  for (int end = 0; end < at.size(); end++) {
    count[x[at[end]]]++;
    while (count[x[at[start]]] > 1)
      count[x[at[start++]]]--;
    if (count.size() == y.size() && ret > at[end] - at[start] + 1)
      ret = at[end] - at[start] + 1;
  }
  return ret;
}

Edit: Here's an implementation of Jack's idea. It's the same time complexity as mine, but without the inner loop that confuses you.

int min_width(const string& x, const set<char>& y) {
  int ret = x.size();
  map<char, int> index;
  set<int> index_set;

  for (int j = 0; j < x.size(); j++) {
    if (y.count(x[j]) > 0) {
      if (index.count(x[j]) > 0)
        index_set.erase(index[x[j]]);
      index_set.insert(j);
      index[x[j]] = j;
      if (index.size() == y.size()) {
        int i = *index_set.begin();
        if (ret > j-i+1)
          ret = j-i+1;
      }
    }
  }
  return ret;
}

In Java it can be implemented nicely with LinkedHashMap:

static int minWidth(String x, HashSet<Character> y) {
    int ret = x.length();
    Map<Character, Integer> index = new LinkedHashMap<Character, Integer>();

    for (int j = 0; j < x.length(); j++) {
        char ch = x.charAt(j);
        if (y.contains(ch)) {
            index.remove(ch);
            index.put(ch, j);
            if (index.size() == y.size()) {
                int i = index.values().iterator().next();
                if (ret > j - i + 1)
                    ret = j - i + 1;
            }
        }
    }
    return ret;
}

All operations inside the loop take constant time (assuming hashed elements disperse properly).

like image 39
Sheldon L. Cooper Avatar answered Oct 31 '22 20:10

Sheldon L. Cooper


There is an O(n solution to this problem). It very well described in this article. http://www.leetcode.com/2010/11/finding-minimum-window-in-s-which.html Hope it helps.

like image 3
nmd Avatar answered Oct 31 '22 19:10

nmd