Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert array to a sorted one using only two operations

I found this question on an online forum: Really interested on how it can be solved:

Given an array A of positive integers. Convert it to a sorted array with minimum cost. The only valid operation are:
1) Decrement with cost = 1
2) Delete an element completely from the array with cost = value of element

This is an interview question asked for a tech company

like image 229
TimeToCodeTheRoad Avatar asked Jan 19 '12 14:01

TimeToCodeTheRoad


2 Answers

NOTE : The original answer has been replaced with one in which I have a lot more confidence (and I can explain it, too). Both answers produced the same results on my set of test cases.

You can solve this problem using a dynamic programming approach. The key observation is that it never makes sense to decrement a number to a value not found in the original array. (Informal proof: suppose that you decremented a number O1 to a value X that is not in the original sequence in order to avoid removing a number O2 > X from the result sequence. Then you can decrement O1 to O2 instead, and reduce the cost by O2-X).

Now the solution becomes easy to understand: it is a DP in two dimensions. If we sort the elements of the distinct elements of the original sequence d into a sorted array s, the length of d becomes the first dimension of the DP; the length of s becomes the second dimension.

We declare dp[d.Length,s.Length]. The value of dp[i,j] is the cost of solving subproblem d[0 to i] while keeping the last element of the solution under s[j]. Note: this cost includes the cost of eliminating d[i] if it is less than s[j].

The first row dp[0,j] is computed as the cost of trimming d[0] to s[j], or zero if d[0] < s[j]. The value of dp[i,j] next row is calculated as the minimum of dp[i-1, 0 to j] + trim, where trim is the cost of trimming d[i] to s[j], or d[i] if it needs to be eliminated because s[j] is bigger than d[i].

The answer is calculated as the minimum of the last row dp[d.Length-1, 0 to s.Length].

Here is an implementation in C#:

static int Cost(int[] d) {
    var s = d.Distinct().OrderBy(v => v).ToArray();
    var dp = new int[d.Length,s.Length];
    for (var j = 0 ; j != s.Length ; j++) {
        dp[0, j] = Math.Max(d[0] - s[j], 0);
    }
    for (var i = 1; i != d.Length; i++) {
        for (var j = 0 ; j != s.Length ; j++) {
            dp[i, j] = int.MaxValue;
            var trim = d[i] - s[j];
            if (trim < 0) {
                trim = d[i];
            }
            dp[i, j] = int.MaxValue;
            for (var k = j ; k >= 0 ; k--) {
                dp[i, j] = Math.Min(dp[i, j], dp[i - 1, k] + trim);
            }
        }
    }
    var best = int.MaxValue;
    for (var j = 0 ; j != s.Length ; j++) {
        best = Math.Min(best, dp[d.Length - 1, j]);
    }
    return best;
}

This direct implementation has space complexity of O(N^2). You can reduce it to O(N) by observing that only two last rows are used at the same time.

like image 75
Sergey Kalinichenko Avatar answered Nov 15 '22 17:11

Sergey Kalinichenko


I'm assuming that "sorted" means smallest values at the start of the array, given the nature of the allowed operations.

The performance boundary between the two operations occurs when the cost of removing an out of sequence element is equal to the cost of either decrementing all greater-valued elements up to and including the offender, or removing all lesser-valued elements after the offender. You choose between decrementing preceding elements or removing later elements based on why the offending element is out of sequence. If it's less than the previous element, consider decrementing the earlier elements; if it's greater than the next element, consider removing later elements.

Some examples:

10 1 2 3 4 5

Decrement 10 to 1, cost 9.

1 2 3 4 10 4

Remove 4, cost 4.

1 2 3 4 10 5

Remove 5 or decrement 10 to 5, cost 5.

5 6 7 8 1 10

Remove 1, cost 1.

5 6 7 8 6 10

Decrement 7 and 8 to 6, cost 3.

2 1 1 4 2 4 4 3

Decrement the first 1, the first 4 by two, and the other two fours once each, cost 5.

The simplest implementation to find the solutions relies on having set knowledge, so it's very inefficient. Thankfully, the question doesn't care about that. The idea is to walk the array, and make the decision whether to remove or decrement to fix the set when an out of sequence element is encountered. A much more efficient implementation of this would be to use running totals (as opposed to calculate methods) and walk the array twice, forwards and backwards. I've written a mock up of the simpler version, as I think it's easier to read.

Pseudocode, returns total cost:

if array.Length < 2 : return 0; // no sorting necessary

resultArray = array.Copy();
int cost = 0;

for i = 0 to array.Length - 1 :

if i > 0 and array[i-1] > array[i] :

if CostToDecrementPreviousItems(i, array[i]) > array[i]) :
resultArray[i] = -1;
cost += array[i];
else :
cost += DecrementItemsThroughIndexGreaterThanValue(resultArray, i, array[i]);
end if

else if i < array.Length - 1 and array[i+1] < array[i] :

if CostToRemoveLaterItems(i, array[i]) > array[i] :
resultArray[i] = -1;
cost += array[i];
else :
cost += RemoveItemsAfterIndexGreaterThanValue(resultArray, i, array[i]);
end if

end if
end for

RemoveNegativeElements(resultArray);
array = resultArray;

return cost;

Hopefully the undefined method calls are self explanatory.

like image 44
Esoteric Screen Name Avatar answered Nov 15 '22 16:11

Esoteric Screen Name