Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find the minimum number of operation(s) to make the string balanced?

From Codechef:

A string is considered balanced if and only if all the characters occur in it equal number of times.

You are given a string S; this string may only contain uppercase English letters. You may perform the following operation any number of times (including zero): Choose one letter in S and replace it by another uppercase English letter. Note that even if the replaced letter occurs in S multiple times, only the chosen occurrence of this letter is replaced.

Find the minimum number of operations required to convert the given string to a balanced string.

Example:

For input: ABCB

Here, we can replace C with A, TO GET: ABAB, where each character of the string occurs for 2 times.

So, the number of minimum operation(s)=1.

How to make the string good?

Can I apply dynamic programming to it ?

like image 515
Firex Firexo Avatar asked Feb 02 '19 09:02

Firex Firexo


People also ask

What is the minimum number of Operation(s) to make the string balanced?

- Stack Overflow How to find the minimum number of operation (s) to make the string balanced? A string is considered balanced if and only if all the characters occur in it equal number of times. You are given a string S; this string may only contain uppercase English letters.

How to find the minimum number of operations required to convert?

To find the minimum number of operations required to convert the first string to the second string, the idea is to simultaneously traverse both strings from the end. If the last characters of both strings match, move to the next pair of characters.

How many deletions are needed to make a string balanced?

1653. Minimum Deletions to Make String Balanced You are given a string s consisting only of characters 'a' and 'b' ​​​​. You can delete any number of characters in s to make s balanced. s is balanced if there is no pair of indices (i,j) such that i < j and s [i] = 'b' and s [j]= 'a'. Return the minimum number of deletions needed to make s balanced.

How do you know if a string is already balanced?

A string is said to be balanced if and only if the number of occurrences of each of the characters is equal. Add 2 ‘g’, 2 ‘k’, 2 ‘s’, 3 ‘f’, 3 ‘o’ and 3 ‘r’. The string is already balanced.


2 Answers

I don't think you really need dynamic programming here.

One approach in O(length(S)) time:

  • Iterate over S, build a frequency-map (a mapping from distinct letters A–Z to counts). For your ABCB example, that would be A->1 B->2 C->1 D->0 E->0 ... Z->0, which we can represent as the array [1, 2, 1, 0, 0, ..., 0].
    • We can do this because we don't actually care about the positions of letters at all; there's no real difference between ABCB and ABBC, in that each can be balanced by replacing its C with an A.
  • Sort the array. For your example, that gives [0, 0, ..., 0, 1, 1, 2].
    • We can do this because we don't actually care about which letter was which; there's no real difference between ABCB and ABDB, in that each can be balanced by replacing one of its singleton letters with its other one.
  • Assuming the string is nonempty (since if it's empty then the answer is just 0), the final balanced string will contain at least 1 and at most 26 distinct letters. For each integer i between 1 and 26, determine how many operations you'd need to perform in order to produce a balanced string with i distinct letters:
    • First, check that length(S) is a multiple of i; if not, this isn't possible, so skip to the next integer.
    • Next, compute length(S)/i, the count of each distinct letter in the final balanced string.
    • To count how many operations need to be performed, we're going to examine all the counts that need to be increased. (We could, equivalently, examine the counts that need to be decreased: they'll have to match.)
    • We're only interested in the last i elements of the sorted array. Any elements before that point represent letters that won't occur in the balanced string: either the counts are already zero and will remain so, or they are nonzero but will be decreased to zero. Either way, since we're only tracking increases, we can ignore them.
    • For each of the last i counts that are less than length(S)/i, add the difference. This sum is the number of operations. (Note that, since the counts are sorted, you can exit this inner loop as soon as you get to a count that's greater than or equal to the target count.)
    • You can exit this loop after the first i that's greater than or equal to the number of distinct letters in the original S (aside from values of i that we had to skip because they don't evenly divide length(S)). For example, if length(S) = 100, and the original S has 19 distinct letters, then we only need to consider i as high as 20. (Hat-tip to Eric Wang for suggesting something along these lines.)
  • Return the least of these up-to-26 sums. (Note that you don't actually need to store all the sums; you can just keep track of the minimum.)
like image 82
ruakh Avatar answered Oct 19 '22 23:10

ruakh


Following code implement the solution, in Java, with unit test.

The algorithm is almost identical as in @ruakh's answer, if not identical.


Code

BalanceString.java

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Assume string only contains A-Z, the 26 uppercase letter,
 * <p>given a string, you can replace a char with another char from the 26 letter,
 * <p>figure out the minimum replacement required to make the string balance,
 * <p>which means each char in the string occurs the same time,
 *
 * @author eric
 * @date 2/2/19 8:54 PM
 */
public class BalanceString {
    private final char minChar;
    private final char maxChar;
    private final int distinctChars; // total distinct char count,

    public static final BalanceString EN_UPPER_INSTANCE = new BalanceString('A', 'Z');

    public BalanceString(char minChar, char maxChar) {
        this.minChar = minChar;
        this.maxChar = maxChar;
        this.distinctChars = maxChar - minChar + 1;

        if (distinctChars <= 0)
            throw new IllegalArgumentException("invalid range of chars: [" + minChar + ", " + maxChar + "]");
    }

    /**
     * Check minimal moves needed to make string balanced.
     *
     * @param str
     * @return count of moves
     */
    public int balanceCount(String str) {
        // System.out.printf("string to balance:\t%s\n", str);
        int len = str.length(); // get length,
        if (len <= 2) return 0; // corner cases,

        Map<Character, Integer> coMap = figureOccurs(str); // figure occurrences,
        Integer[] occurs = sortOccursReversely(coMap); // reversely order occurrences,

        int m = coMap.size(); // distinct char count,
        int maxN = (len < distinctChars ? len : distinctChars); // get max possible distinct char count, included,

        int smallestMoves = Integer.MAX_VALUE; // smallest moves, among all possible n,

        // check each possible n, and get its moves,
        for (int n = 1; n <= maxN; n++) {
            if (len % n == 0) {
                int moves = figureMoves(len, coMap, occurs, m, n);
                if (moves < smallestMoves) smallestMoves = moves;
            }
        }

        return smallestMoves;
    }

    /**
     * Figure occurs for each char.
     *
     * @param str
     * @return
     */
    protected Map<Character, Integer> figureOccurs(String str) {
        Map<Character, Integer> coMap = new HashMap<>();
        for (char c : str.toCharArray()) {
            if (c < minChar || c > maxChar)
                throw new IllegalArgumentException(c + " is not within range 'A-Z'");

            if (!coMap.containsKey(c)) coMap.put(c, 1);
            else coMap.put(c, coMap.get(c) + 1);
        }

        return coMap;
    }

    /**
     * Get reverse sorted occurrences.
     *
     * @param coMap
     * @return
     */
    protected Integer[] sortOccursReversely(Map<Character, Integer> coMap) {
        Integer[] occurs = new Integer[coMap.size()];

        coMap.values().toArray(occurs);
        Arrays.sort(occurs, Collections.reverseOrder());

        return occurs;
    }

    /**
     * Figure moves needed to balance.
     *
     * @param len   length of string,
     * @param coMap
     * @param m     original distinct elements count,
     * @param n     new distinct elements count,
     * @return count of moves,
     */
    protected int figureMoves(int len, Map<Character, Integer> coMap, Integer[] occurs, int m, int n) {
        int avgOccur = len / n; // average occurrence,
        int moves = 0;

        if (n == m) { // distinct chars don't change,
            for (Integer occur : occurs) {
                if (occur <= avgOccur) break;
                moves += (occur - avgOccur);
            }
        } else if (n < m) { // distinct chars decrease,
            for (int i = 0; i < n; i++) moves += Math.abs(occurs[i] - avgOccur); // for elements kept,
            for (int i = n; i < m; i++) moves += occurs[i]; // for elements to replace,
            moves >>= 1;
        } else { // distinct chars increase,
            for (int i = 0; i < occurs.length; i++) moves += Math.abs(occurs[i] - avgOccur); // for existing elements,
            moves += ((n - m) * avgOccur); // for new elements,
            moves >>= 1;
        }

        return moves;
    }

    public char getMinChar() {
        return minChar;
    }

    public char getMaxChar() {
        return maxChar;
    }

    public int getDistinctChars() {
        return distinctChars;
    }
}

BalanceStringTest.java
(Unit test, via TestNG)

import org.testng.Assert;
import org.testng.annotations.Test;

/**
 * BalanceString test.
 *
 * @author eric
 * @date 2/2/19 9:36 PM
 */
public class BalanceStringTest {
    private BalanceString bs = BalanceString.EN_UPPER_INSTANCE;

    @Test
    public void test() {
        // n < m case,
        Assert.assertEquals(bs.balanceCount("AAAABBBC"), 1); // e.g 1A -> B,
        Assert.assertEquals(bs.balanceCount("AAAAABBC"), 2); // e.g 1A -> B, 1C -> B,
        Assert.assertEquals(bs.balanceCount("AAAAAABC"), 2); // e.g 1B -> A, 1C -> A,
        Assert.assertEquals(bs.balanceCount("AAAAAAAB"), 1); // e.g 1B -> A,

        // n > m case,
        Assert.assertEquals(bs.balanceCount("AAAABBBBCCCCDDDDEEEEAAAA"), 4); // add 1 new char, e.g change 4 A to 4 F,
        Assert.assertEquals(bs.balanceCount(genIncSeq(10)), 15); // A-J, 10 distinct chars, 55 in length; solved by add 1 new char, need 15 steps,

        // n == m case,
        Assert.assertEquals(bs.balanceCount(genIncSeq(3)), 1); // A-C, 3 distinct chars, 6 in length; symmetric, solved with same distinct chars, need 1 steps,
        Assert.assertEquals(bs.balanceCount(genIncSeq(11)), 15); // A-K, 11 distinct chars, 66 in length; symmetric, solved with same distinct chars, need 15 steps,

        // n < m, or n > m case,
        Assert.assertEquals(bs.balanceCount("ABAC"), 1); // e.g 1A -> B, or 1A -> D,
    }

    // corner case,
    @Test
    public void testCorner() {
        // m <= 2,
        Assert.assertEquals(bs.balanceCount(""), 0);
        Assert.assertEquals(bs.balanceCount("A"), 0);
        Assert.assertEquals(bs.balanceCount("AB"), 0);
        Assert.assertEquals(bs.balanceCount("AA"), 0);

        /*------ m == n == distinctChars ------*/
        String mndBalanced = genMndBalancedSeq(); // each possible char occurs exactly once, already balanced,
        Assert.assertEquals(mndBalanced.length(), bs.getDistinctChars());
        Assert.assertEquals(bs.balanceCount(mndBalanced), 0); // no need change,

        char lastChar = mndBalanced.charAt(mndBalanced.length() - 1);
        String mndOneDup = mndBalanced.replace(lastChar, (char) (lastChar - 1)); // (distinctChars -2) chars occur exactly once, one occurs twice, one is missing, thus it's one step away to balance,
        Assert.assertEquals(mndOneDup.length(), bs.getDistinctChars());
        Assert.assertEquals(bs.balanceCount(mndOneDup), 1); // just replace the duplicate char with missing char,
    }

    // invalid input,
    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testInvalidInput() {
        Assert.assertEquals(bs.balanceCount("ABAc"), 1);
    }

    // invalid char range, for constructor,
    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testInvalidRange() {
        new BalanceString('z', 'a');
    }

    /**
     * Generate a string, with first char occur once, second twice, third three times, and so on.
     * <p>e.g A, ABB, ABBCCC, ABBCCCDDDD,
     *
     * @param m distinct char count,
     * @return
     */
    private String genIncSeq(int m) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < m; i++) {
            for (int j = 0; j <= i; j++) sb.append((char) (bs.getMinChar() + i));
        }
        return sb.toString();
    }

    /**
     * Generate a string that contains each possible char exactly once.
     * <p>For [A-Z], it could be: "ABCDEFGHIJKLMNOPQRSTUVWXYZ".
     *
     * @return
     */
    private String genMndBalancedSeq() {
        StringBuilder sb = new StringBuilder();
        char minChar = bs.getMinChar();
        int distinctChars = bs.getDistinctChars();

        for (int i = 0; i < distinctChars; i++) {
            sb.append((char) (minChar + i));
        }
        return sb.toString();
    }
}

All test cases would pass.


Complexity

  • Time: O(len) + O(m * lg(m)) + O(m * factorCount)
    • Each sequential scan takes O(len), there are several sequential loop.
    • Sorting of array takes O(m*lg(m)), which is at most O(distinctChars * lg(distinctChars)), thus constant, and only sort once.
    • To figure out moves for each n, takes O(m).
    • The count of n that needs to figure moves, depends on count of divisible numbers for len, in the range [minChar, maxChar].
      This count if kind small & constant too.
  • Space: O(len)
    • Input string need O(len).
    • Counter hashmap need O(m).
    • Sorted occurrence array need O(m).

Where:

  • len is string length.
  • m is distinct char count in original string
  • distinctChars is distinct char count, e.g 26.
  • maxN max possible distinct char count, included,
  • factorCount divisible number count in range [1, n], by len,
  • minChar min char, e.g A
  • maxChar max char, e.g Z

And:

  • len >= m
  • m <= distinctChars
like image 26
user218867 Avatar answered Oct 20 '22 01:10

user218867