Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Smallest window (substring) that has both uppercase and corresponding lowercase characters

I was asked the following question in an onsite interview:

A string is considered "balanced" when every letter in the string appears both in uppercase and lowercase. For e.g., CATattac is balanced (a, c, t occur in both cases), while Madam is not (a, d only appear in lowercase). Write a function that, given a string, returns the shortest balanced substring of that string. For e.g.,:
“azABaabza” should return “ABaab”
“TacoCat” should return -1 (not balanced)
“AcZCbaBz” should returns the entire string

Doing it with the brute force approach is trivial - calculating all the pairs of substrings and then checking if they are balanced, while keeping track of the size and starting index of the smallest one.

How do I optimize? I have a strong feeling it can be done with a sliding-window/two-pointer approach, but I am not sure how. When to update the pointers of the sliding window?

Edit: Removing the sliding-window tag since this is not a sliding-window problem (as discussed in the comments).

like image 238
Someone Avatar asked Aug 11 '21 02:08

Someone


People also ask

How to find the shortest and smallest window in string?

Find the shortest substring in a given string that contains all the characters of a given word or Find the Smallest window in a string containing all characters of another string 1. Generate all the substrings of string S 2. For every substring of S, check whether the substring contains all characters of string T 3.

Which is the shortest substring having all the distinct characters of string?

Of the set of possible substrings 'dbca' is the shortest substring having all the distinct characters of given string. Input: aaab Output: ab Explanation: Possible substrings= {aaab, aab, ab} Of the set of possible substrings 'ab' is the shortest substring having all the distinct characters of given string. Try It!

What is the minimum window substring of a string?

76. Minimum Window Substring Given two strings s and t of lengths m and n respectively, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such substring, return the empty string "".

Which string contains a string that is made of lowercase English alphabets?

You are given a string S that is made of lowercase English alphabets. Determine the length of the smallest substring that contains the maximum number of distinct characters. The only line of input consists of a string that is made of lower case English alphabets. Print the required answer.


1 Answers

Due to the special property of string. There is only 26 uppercase letters and 26 lowercase letters.
We can loop every 26 letter j and denote the minimum length for any substrings starting from position i to find matches for uppercase and lowercase letter j be len[i][j]
Demo C++ code:

string s = "CATattac";
// if len[i] >= s.size() + 1, it denotes there is no matching
vector<vector<int>> len(s.size(), vector<int>(26, 0));
for (int i = 0; i < 26; ++i) {
    int upperPos = s.size() * 2;
    int lowerPos = s.size() * 2;
    for (int j = s.size() - 1; j >= 0; --j) {
        if (s[j] == 'A' + i) {
            upperPos = j;
        } else if (s[j] == 'a' + i) {
            lowerPos = j;
        }
        len[j][i] = max(lowerPos - j + 1, upperPos - j + 1);
    }
}

We also keep track of the count of characters.

// cnt[i][j] denotes the number of characters j in substring s[0..i-1]
// cnt[0][j] is always 0
vector<vector<int>> cnt(s.size() + 1, vector<int>(26, 0));
for (int i = 0; i < s.size(); ++i) {
    for (int j = 0; j < 26; ++j) {
        cnt[i + 1][j] = cnt[i][j];
        if (s[i] == 'A' + j || s[i] == 'a' + j) {
            ++cnt[i + 1][j];
        }
    }
}

Then we can loop over s.

int m = s.size() + 1;
for (int i = 0; i < s.size(); ++i) {
    bool done = false;
    int minLen = 1;
    while (!done && i + minLen <= s.size()) {
        // execute at most 26 times, a new character must be added to change minLen
        int prevMinLen = minLen;
        done = true;
        for (int j = 0; j < 26 && i + minLen <= s.size(); ++j) {
            if (cnt[i + minLen][j] - cnt[i][j] > 0) {
                // character j exists in the substring, have to find pair of it
                minLen = max(minLen, len[i][j]);
            }
        }
        if (prevMinLen != minLen) done = false;
    }
    // find overall minLen
    if (i + minLen <= s.size())
        m = min(m, minLen);
    cout << minLen << '\n';
}

Output: (if i + minLen <= s.size(), it is valid. Otherwise substring doesn't exist if starting at that position)
The invalid output difference is due to how the array len is generated.

8
4
15
14
13
12
11
10

I'm not sure whether there is a simpler solution but it is the best I could think of right now. Time complexity: O(N) with a constant of 26 * 26

like image 125
attempt0 Avatar answered Jun 10 '23 09:06

attempt0