Given a string, I want to find the longest substring such that the count of each character does not exceed the number of unique characters in that substring.
For example:
Input: 'aaabb'
Output: 'aabb'
I understand how to do this brute-force: Generate all substrings while tracking the count of each character and the number of distinct characters in each substring. If there's a substring where the condition is violated, we move the start of the window forward. Else, we move the end of the window forward.
The brute-force approach is in O(n^2). Is there a more efficient way to do this?
Here's an O(n^2) algorithm, I think. (note that I don't think the O(n^2) solution given in the question works)
n >>> # of unique characters case:Let S be the input string.
S and produce C the set of distinct characters in S. Let |C| denote the length of C.c in C, produce a cumulative-occurrences array O_c, where each O_c[i] is the number of c in the substring S[0:i]. This is O(n * |C|).c in the substring via O_c[end] - O_c[start]. Obtain the maximum occurrences and the number of non-zero occurrences and use these to check validity.n, that's O(n^2).It seems like if there is an improvement over O(n^2), it'd be some way to avoid checking every start and endpoint in 4.
n >>> (# of unique characters)^2 case:S[i:j] there are more occurrences of some character c than |C|, then any S[i:k] with k > j is an invalid substring (because all the unique characters in the string cannot make up for the current occurrences of c). Thus, we can immediately increment i once we find this condition.l, at least one character will have at least ceil(l / |C|) occurrences (pigeonhole argument).|C|^2.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With