Input: str="abcdeefuiuiwiwwaaaa" n=3 output: "iwiwwaaaa" (longest substr with 3 diff chars)
I have a solution as below. My questions:
The solution below can not cover the whole ASCII, can we improve this without additional space?
public static String getSubstrOfMChars(String str, int m)
{
if (str==null || str.length()==0)
return "";
int len = str.length();
String max = "";
for(int i=0; i<len;)
{
StringBuilder sb = new StringBuilder();
int counter = 1;
int checker = 0;
char firstChar = str.charAt(i);
int firstCharPos = i; // first char position in the string
sb.append(firstChar);
checker |= 1 << (firstChar - 'a');
for(int j=i+1; j<len; j++)
{
char currChar = str.charAt(j);
if (currChar == firstChar)
firstCharPos++;
int tester = checker & (1<<(currChar - 'a'));
if ( tester > 0 ) // already have such character
{
sb.append(currChar);
continue;
}
// new character
if (++counter > m)
{
i = firstCharPos + 1;
if (sb.length() > max.length())
{
max = sb.toString();
}
break;
}
sb.append(currChar);
checker |= 1 << (currChar - 'a');
}
if (counter <= m)
{
if ((counter==m) && sb.length() > max.length())
{
max = sb.toString();
}
break;
}
}
return max;
}
If the length of string is n, then there can be n*(n+1)/2 possible substrings. A simple way is to generate all the substring and check each one whether it has exactly k unique characters or not. If we apply this brute force, it would take O(n2) to generate all substrings and O(n) to do a check on each one.
Run a loop from i = 0 till N – 1 and consider a visited array. Run a nested loop from j = i + 1 to N – 1 and check whether the current character S[j] has already been visited. If true, break from the loop and consider the next window. Else, mark the current character as visited and maximize the length as j – i + 1.
There is an O(n). Let S
be the string.
Just go through the array with two pointers i
and j
and keep track of number K
of different letters between S[i]
and S[j]
. Increment j
whenever this number is smaller or equal n
and increment i
whenever K
is greater than n
. Also remember the longest substring for which K
was equal to n
.
In the implementation you also need a hash table to keep track of the last occurrence of the letter.
Python implementation:
def longest(S,n):
i = j = K = 0
res = (0,0)
last = {}
while i < len(S):
# if current substring is better than others than save
if K == n and j - i > res[1] - res[0]:
res = (i,j)
# if we reach the end of the string, we're done.
if j + 1 > len(S):
break
# if we can go further with the right end than do it
elif K <= n and j + 1 <= len(S):
if not last.has_key(S[j]):
K = K + 1
last[S[j]] = j
j = j + 1
# if we must go further with the left end than do it
else:
if last.has_key(S[i]):
del last[S[i]]
K = K - 1
i = i + 1
return S[res[0]:res[1]]
Your present code complexity is O(N^2) since you are using nested for loops to check for substrings that start from each character.
IMO you can do this in O(N*k) time and O(k) extra space (where k = number of unique characters allowed):
map
and search for the value of minimum position index. calculate present position - min(last position index)
and update max length substring accordingly. Decrement count. Pop out this character from the map. If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With