This is an interview problem that I am stuck on:
Given a string consisting of a's, b's and c's, we can perform the following operation: take any two adjacent distinct characters and replace them with the third character. For example, if 'a' and 'c' are adjacent, they can be replaced with 'b'. What is the smallest string which can result from applying this operation repeatedly?
My attempted solution:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.LinkedList;
import java.util.List;

public class Solution {
    public static void main(String[] args) {
        try {
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    System.in));

            System.out.println(solve(in.readLine()));

            in.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static int solve(String testCase) {
        LinkedList<String> temp = new LinkedList<String>(deconstruct(testCase));

        for (int i = 0; i < (temp.size() - 1); i++) {
            if (!temp.get(i).equals(temp.get(i + 1))) {
                temp.add(i, getThirdChar(temp.remove(), temp.remove()));
                i = -1;
            }
        }

        return reconstruct(temp).length();
    }

    private static List<String> deconstruct(String testCase) {
        List<String> temp = new LinkedList<String>();

        for (int i = 0; i < testCase.length(); i++) {
            temp.add(testCase.charAt(i) + "");
        }

        return temp;
    }

    private static String reconstruct(List<String> temp) {
        String testCase = "";

        for (int i = 0; i < temp.size(); i++) {
            testCase += temp.get(i);
        }

        return testCase;
    }

    private static String getThirdChar(String firstChar, String secondChar) {
        return "abc".replaceAll("[" + firstChar + secondChar + "]+", "");
    }
}

The code seems to work fine on test inputs "cab" (prints "2"), "bcab" (prints "1"), and "ccccc" (prints "5"). But I keep getting told that my code is wrong. Can anyone help me figure out where the bug is?
As people have already pointed out, the error is that your algorithm makes the substitutions in a predefined order. Your algorithm would make the transformation:
abcc --> ccc instead of abcc --> aac --> ab --> c
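To make the effect of the substitution order concrete, here is a minimal sketch that compares a fixed leftmost-first strategy with an exhaustive search over every possible order. The class and method names (`ReductionOrderDemo`, `leftmostFirst`, `exhaustive`) are made up for this illustration, and `leftmostFirst` is only similar in spirit to the code in the question, not a copy of it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; none of these names come from the question.
class ReductionOrderDemo {
    // Returns the third letter of {a, b, c} given two distinct letters.
    static char third(char x, char y) {
        return (char) ('a' + 'b' + 'c' - x - y);
    }

    // Always reduces the leftmost reducible pair, i.e. a predefined order.
    static int leftmostFirst(String s) {
        StringBuilder sb = new StringBuilder(s);
        int i = 0;
        while (i < sb.length() - 1) {
            if (sb.charAt(i) != sb.charAt(i + 1)) {
                char t = third(sb.charAt(i), sb.charAt(i + 1));
                sb.replace(i, i + 2, String.valueOf(t));
                i = 0; // restart the scan from the left
            } else {
                i++;
            }
        }
        return sb.length();
    }

    // Tries every reduction order and returns the smallest reachable length.
    static int exhaustive(String s) {
        return exhaustive(s, new HashMap<>());
    }

    private static int exhaustive(String s, Map<String, Integer> memo) {
        Integer cached = memo.get(s);
        if (cached != null) {
            return cached;
        }
        int best = s.length();
        for (int i = 0; i + 1 < s.length(); i++) {
            char x = s.charAt(i), y = s.charAt(i + 1);
            if (x != y) {
                String next = s.substring(0, i) + third(x, y) + s.substring(i + 2);
                best = Math.min(best, exhaustive(next, memo));
            }
        }
        memo.put(s, best);
        return best;
    }

    public static void main(String[] args) {
        System.out.println(leftmostFirst("abcc")); // prints 3 (abcc --> ccc)
        System.out.println(exhaustive("abcc"));    // prints 1 (abcc --> aac --> ab --> c)
    }
}
```

The exhaustive search is only practical for short strings, since the number of distinct intermediate strings grows quickly with the input length.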
If you want to use the technique of generating the reduced strings, you need to either:
If all you need is the length of the reduced string, there is however a much simpler implementation which does not require the reduced strings to be generated. This is an extended version of @Matteo's answer, with some more details and a working (very simplistic) algorithm.
I postulate that the following three properties are true about abc-strings under the given set of rules.
1. If it is impossible to reduce a string further, all the characters in that string must be the same character.
2. It is impossible for 2 < answer < string.length to hold.
3. While performing a reduction operation, if the counts of all three letters are even prior to the operation, the counts will all be odd after it. Conversely, if the counts are all odd prior to the operation, they will all be even after it.
Property one is trivial.
For property two, assume we have a reduced string of length 5 which cannot be reduced any further:
AAAAA
As this string is the result of a reduction operation, the previous string must've contained one B and one C. Following are some examples of possible "parent strings":
BCAAAA, AABCAA, AAACBA
For all of the possible parent strings we can easily see that at least one of the C's and the B's can be combined with an A instead of with each other. This will result in a string of length 5 which is further reducible. Hence, the only reason we ended up with an irreducible string of length 5 was that we had made an incorrect choice of which characters to combine while performing the reduction operation.
This reasoning applies for all reduced strings of any length k such that 2 < k < string.length.
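The argument about parent strings can be checked mechanically. The sketch below, with hypothetical names (`ParentStringDemo`, `reducibleChildren`), lists every string reachable from a parent by exactly one reduction that is still reducible, using the uppercase alphabet of the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class illustrating the "parent string" argument.
class ParentStringDemo {
    // Returns the third letter of {A, B, C} given two distinct letters.
    static char third(char x, char y) {
        return (char) ('A' + 'B' + 'C' - x - y);
    }

    // True if every character equals the first one, so no reduction is possible.
    static boolean isUniform(String s) {
        for (int i = 1; i < s.length(); i++) {
            if (s.charAt(i) != s.charAt(0)) {
                return false;
            }
        }
        return true;
    }

    // Strings reachable from s by exactly one reduction that can be reduced further.
    static List<String> reducibleChildren(String s) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < s.length(); i++) {
            char x = s.charAt(i), y = s.charAt(i + 1);
            if (x != y) {
                String child = s.substring(0, i) + third(x, y) + s.substring(i + 2);
                if (!isUniform(child)) {
                    out.add(child);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reducibleChildren("BCAAAA")); // prints [BBAAA]
        System.out.println(reducibleChildren("AABCAA")); // prints [ACCAA, AABBA]
        System.out.println(reducibleChildren("AAACBA")); // prints [AABBA, AAACC]
    }
}
```

Each of the three example parents has at least one child of length 5 that is not AAAAA, which is the alternative choice the argument relies on.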
For property three, suppose we have for example [numA, numB, numC] = [even, even, even] and perform a reduction operation in which we substitute AB with a C. The counts of A and B will each decrease by one, making them odd, while the count of C will increase by one, making that count odd as well.
Similarly, if two counts are even and one is odd, two counts will be odd and one even after the operation, and vice versa.
In other words, if all three counts have the same "evenness", no reduction operation can change that. And, if there are differences in the "evenness" of the counts, no reduction operation can change that.
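The parity argument can be sketched on the counts alone. In the snippet below the names `ParityDemo`, `reduce` and `sameParity` are illustrative assumptions; `reduce` models one operation as "two counts go down by one, the third goes up by one".

```java
import java.util.Arrays;

// Hypothetical demo class for the parity ("evenness") invariant.
class ParityDemo {
    // Applies one reduction to counts {numA, numB, numC}: the letters at
    // indices i and j lose one each, and the remaining letter gains one.
    static int[] reduce(int[] counts, int i, int j) {
        int[] next = counts.clone();
        next[i]--;
        next[j]--;
        next[3 - i - j]++; // indices are 0, 1, 2, so the third index is 3 - i - j
        return next;
    }

    // True if all three counts are even or all three counts are odd.
    static boolean sameParity(int[] c) {
        return (c[0] - c[1]) % 2 == 0 && (c[1] - c[2]) % 2 == 0;
    }

    public static void main(String[] args) {
        int[] even = {2, 2, 2};
        int[] afterEven = reduce(even, 0, 1); // substitute AB with C
        System.out.println(Arrays.toString(afterEven)); // prints [1, 1, 3]
        System.out.println(sameParity(even) + " " + sameParity(afterEven)); // prints true true

        int[] mixed = {2, 2, 1};
        int[] afterMixed = reduce(mixed, 0, 1);
        System.out.println(Arrays.toString(afterMixed)); // prints [1, 1, 2]
        System.out.println(sameParity(mixed) + " " + sameParity(afterMixed)); // prints false false
    }
}
```

Every count changes by exactly one per operation, so all three parities flip together, and whether they agree with each other stays the same.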
Consider the two irreducible strings:
A and AA
For A notice that [numA, numB, numC] = [odd, even, even]
For AA notice that [numA, numB, numC] = [even, even, even]
Now forget those two strings and assume we are given an input string of length n.
If all characters in the string are equal, the answer is obviously string.length.
Else, we know from property 2 that it is possible to reduce the string to a length smaller than 3. We also know the effect on evenness of performing reduction operations. If the input string contains even counts of all letters or odd counts of all letters, it is impossible to reduce it to a single-letter string, since no sequence of reduction operations can change the evenness structure from [even, even, even] (or [odd, odd, odd]) to [odd, even, even].
Hence a simpler algorithm would be as follows:
Count the number of occurrences of each letter in the input string [numA, numB, numC]
If two of these counts are 0, then return string.length
Else if (all counts are even) or (all counts are odd), then return 2
Else, then return 1

This problem also appears on HackerRank as an introduction to dynamic programming. Even though there are nice closed-form solutions, as many posters have already suggested, I still find it helpful to work it out the good old dynamic-programming way, i.e. find a good recurrence relation and cache intermediate results to avoid unnecessary computations.
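For reference, the counting rule described earlier (return the length when only one letter occurs, otherwise 2 or 1 depending on the parities) can be sketched in Java as follows. The class and method names are hypothetical, and the sketch assumes the input contains only the letters 'a', 'b' and 'c'.

```java
// Hypothetical sketch of the counting rule; assumes input uses only 'a', 'b', 'c'.
class ClosedFormReduction {
    static int smallestLength(String s) {
        int[] n = new int[3]; // n[0] = numA, n[1] = numB, n[2] = numC
        for (char ch : s.toCharArray()) {
            n[ch - 'a']++;
        }

        // If at most one letter occurs, nothing can be reduced.
        int zeros = 0;
        for (int count : n) {
            if (count == 0) {
                zeros++;
            }
        }
        if (zeros >= 2) {
            return s.length();
        }

        // All counts even or all counts odd: a single letter is unreachable.
        boolean sameParity = (n[0] % 2 == n[1] % 2) && (n[1] % 2 == n[2] % 2);
        return sameParity ? 2 : 1;
    }

    public static void main(String[] args) {
        System.out.println(smallestLength("cab"));   // prints 2
        System.out.println(smallestLength("bcab"));  // prints 1
        System.out.println(smallestLength("ccccc")); // prints 5
        System.out.println(smallestLength("abcc"));  // prints 1
    }
}
```

This runs in time linear in the input length, since it only counts letters.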
As some people have already noted, the brute-force method of iterating through consecutive letters of the input string and all the resulting reduced strings will not work when the input string is long. Such a solution will only pass one test case on HackerRank. Storing all reduced strings is also not feasible, as the number of such strings can grow exponentially. I benefited from some people's comments that the order of the letters does not matter, and only the count of each letter matters.
Each string can be reduced as long as it contains at least two distinct letters. Each time a string is reduced, one of each of the two distinct letters goes away and a letter of the third kind is added to the string. This gives us a recurrence relation. Let f(a,b,c) be the length of the smallest string obtainable from an input string containing a copies of the letter 'a', b copies of the letter 'b', and c copies of the letter 'c'; then
f(a,b,c) = min(f(a-1,b-1,c+1), f(a-1,b+1,c-1), f(a+1,b-1,c-1));

since there are three possibilities when we reduce a string. Of course, every recurrence relation is subject to some initial conditions. In this case, we have
if(a < 0 || b < 0 || c < 0)
    return MAX_SIZE+1;
if(a == 0 && b == 0 && c == 0)
    return 0;
if(a != 0 && b == 0 && c == 0)
    return a;
if(a == 0 && b != 0 && c == 0)
    return b;
if(a == 0 && b == 0 && c != 0)
    return c;

Here MAX_SIZE bounds the number of occurrences of any one letter in the HackerRank problem. Whenever a reduction would use a letter we have run out of (a count goes negative), MAX_SIZE+1 is returned to mark that reduction as invalid, so it can never win the min. We can then compute the size of the smallest reduced string using these initial conditions and the recurrence relation.
However, this will still not pass the HackerRank test cases. Also, this incurs too many repeated calculations. Therefore, we want to cache the computed result given the tuple (a,b,c). The fact that we can cache the result is due to the fact that the order of the letters does not change the answer, as many of the posts above have proved.
My solution is posted below
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
#include <assert.h>

#define MAX_SIZE 101

/* cache[a][b][c] holds the answer for the counts (a, b, c), or -1 if not computed yet */
int cache[MAX_SIZE][MAX_SIZE][MAX_SIZE];

void init_cache() {
    for(int i = 0 ; i < MAX_SIZE; i++) {
        for (int j = 0; j < MAX_SIZE; j++) {
            for(int k = 0; k < MAX_SIZE; k++)
                cache[i][j][k] = -1;
        }
    }
}

/* Counts the letters of the input; anything that is not 'a' or 'b' is counted as 'c' */
void count(char* array, int* a, int* b, int* c) {
    int len = strlen(array);

    for(int i = 0; i < len; i++) {
        if(array[i] == 'a')
            (*a)++;
        else if(array[i] == 'b')
            (*b)++;
        else
            (*c)++;
    }
}

int solve(int a, int b, int c) {
    /* a negative count means the reduction that led here was invalid */
    if(a < 0 || b < 0 || c < 0)
        return MAX_SIZE+1;
    /* at most one kind of letter left: nothing more can be reduced */
    if(a == 0 && b == 0 && c == 0)
        return 0;
    if(a != 0 && b == 0 && c == 0)
        return a;
    if(a == 0 && b != 0 && c == 0)
        return b;
    if(a == 0 && b == 0 && c != 0)
        return c;
    if(cache[a][b][c] != -1) {
        return cache[a][b][c];
    }
    int ci = solve(a-1, b-1, c+1);   /* replace "ab" with 'c' */
    int bi = solve(a-1, b+1, c-1);   /* replace "ac" with 'b' */
    int ai = solve(a+1, b-1, c-1);   /* replace "bc" with 'a' */
    int best = ci < bi ? (ci < ai ? ci : ai) : (ai < bi ? ai : bi);
    /* remember the answer for this state so it is computed only once */
    cache[a][b][c] = best;
    return best;
}

int main() {
    int T, i;
    scanf("%d", &T);
    assert(T<=100);

    char arr[100001];

    init_cache();

    for(i = 0; i < T; i++) {
        scanf("%s",arr);

        int a = 0;
        int b = 0;
        int c = 0;
        count(arr, &a, &b, &c);

        int len = solve(a, b, c);
        printf("%d\n", len);
    }

    return 0;
}
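As a sanity check, the recurrence can be compared against the parity-based counting rule for all small counts. The Java sketch below is an illustration only: the class name `RecurrenceCheck`, the bound `LIMIT` and the helper `agree` are assumptions made for this example and are not part of the HackerRank problem.

```java
// Hypothetical cross-check of the memoized recurrence against the counting rule.
class RecurrenceCheck {
    static final int LIMIT = 12;                  // illustrative bound on a + b + c
    static final int INVALID = Integer.MAX_VALUE; // marks a reduction that is not possible
    // memo[a][b][c] == 0 means "not computed yet"; stored answers are always >= 1
    static final int[][][] memo = new int[LIMIT + 1][LIMIT + 1][LIMIT + 1];

    static int kinds(int a, int b, int c) {
        return (a > 0 ? 1 : 0) + (b > 0 ? 1 : 0) + (c > 0 ? 1 : 0);
    }

    // The recurrence f(a,b,c) with the same initial conditions as in the text.
    static int f(int a, int b, int c) {
        if (a < 0 || b < 0 || c < 0) {
            return INVALID;
        }
        if (kinds(a, b, c) <= 1) {
            return a + b + c;
        }
        if (memo[a][b][c] != 0) {
            return memo[a][b][c];
        }
        int best = Math.min(f(a - 1, b - 1, c + 1),
                   Math.min(f(a - 1, b + 1, c - 1), f(a + 1, b - 1, c - 1)));
        memo[a][b][c] = best;
        return best;
    }

    // The counting rule: length if one letter kind, else 2 or 1 by parity.
    static int closedForm(int a, int b, int c) {
        if (kinds(a, b, c) <= 1) {
            return a + b + c;
        }
        boolean sameParity = (a % 2 == b % 2) && (b % 2 == c % 2);
        return sameParity ? 2 : 1;
    }

    // True if both methods agree for every (a, b, c) with a + b + c <= LIMIT.
    static boolean agree() {
        for (int a = 0; a <= LIMIT; a++) {
            for (int b = 0; a + b <= LIMIT; b++) {
                for (int c = 0; a + b + c <= LIMIT; c++) {
                    if (f(a, b, c) != closedForm(a, b, c)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agree()); // prints true
    }
}
```

Each reduction lowers the total count by one, so no index can exceed `LIMIT` as long as the starting counts sum to at most `LIMIT`.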