
minimum steps required to make array of integers contiguous

Tags:

algorithm

Given a sorted array of distinct integers, what is the minimum number of steps required to make the integers contiguous? The condition is that in one step only one element can be changed, and it can be either increased or decreased by 1. For example, if we have 2,4,5,6, then '2' can be made '3', making the elements contiguous (3,4,5,6). Hence the minimum number of steps here is 1. Similarly, for the array 2,4,5,8:

  • Step 1: '2' can be made '3'
  • Step 2: '8' can be made '7'
  • Step 3: '7' can be made '6'

Thus the sequence now is 3,4,5,6 and the number of steps is 3.

I tried the following but am not sure if it's correct:

    //n is the number of elements in array a
    int count=a[n-1]-a[0]-1;
    for(i=1;i<=n-2;i++)
    {
        count--;
    }
    printf("%d\n",count);

Thanks.

pranay asked Feb 19 '12 11:02



1 Answer

The intuitive guess is that the "center" of the optimal sequence will be the arithmetic average, but this is not the case. Let's find the correct solution with some vector math:

Part 1: Assuming the first number is to be left alone (we'll deal with this assumption later), calculate the differences between the target and the input, so (1 2 3 4 5 6) - (1 12 3 14 5 16) would yield 0 -10 0 -10 0 -10.
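As a minimal sketch of Part 1 (the function name `difference_vector` is made up for illustration and is not part of the original answer), the difference vector can be computed by building a target that starts at the first element and rises by 1:

```python
def difference_vector(array):
    """Target minus input, where the target starts at array[0] and rises by 1."""
    target = [array[0] + i for i in range(len(array))]
    return [t - a for t, a in zip(target, array)]


print(difference_vector([1, 12, 3, 14, 5, 16]))  # [0, -10, 0, -10, 0, -10]
```

The sign convention here is target minus input, so a negative entry means that element has to come down.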

sidenote: Notice that a "contiguous" array by your implied definition would be an increasing arithmetic sequence with difference 1. (Note that there are other reasonable interpretations of your question: some people may consider 5 4 3 2 1 to be contiguous, or 5 3 1 to be contiguous, or 1 2 3 2 3 to be contiguous. You also did not specify if negative numbers should be treated any differently.)

theorem: The contiguous numbers must lie between the minimum and maximum number. [proof left to reader]

Part 2: Now returning to our example, suppose we took the 30 steps (sum(abs(0 -10 0 -10 0 -10))=30) required to turn 1 12 3 14 5 16 into 1 2 3 4 5 6. This is one valid way to make the array contiguous. But (0 -10 0 -10 0 -10)+c also yields an arithmetic sequence of difference 1, for any constant c. In order to minimize the number of "steps", we must pick an appropriate c. Each time we increase or decrease c by 1, every one of the N=6 entries (the length of the vector) changes by 1, so the total can change by as much as N=6; whether it actually goes up or down depends on how many entries lie on each side of zero. For example, if we wanted to turn our original sequence 1 12 3 14 5 16 into 3 4 5 6 7 8 (c=2), then the differences would have been 2 -8 2 -8 2 -8, and sum(abs(2 -8 2 -8 2 -8))=30, the same as for c=0.
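To see how the total depends on c for this particular vector, here is a small sketch (the helper name `steps_for_shift` is an assumption for illustration). It evaluates sum(abs(diffs + c)) for a few shifts:

```python
def steps_for_shift(diffs, c):
    """Total number of steps if every entry of the difference vector is shifted by c."""
    return sum(abs(d + c) for d in diffs)


diffs = [0, -10, 0, -10, 0, -10]
for c in (-1, 0, 2, 10, 11):
    print(c, steps_for_shift(diffs, c))
```

For this vector the total stays at 30 for every c from 0 to 10 and grows by 6 per unit of c outside that range, because only then do all six entries lie on the same side of zero.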

Now this is very clear if you could picture it visually, but it's sort of hard to type out in text. First we took our difference vector. Imagine you drew it like so:

 4|
 3|     *
 2|  *  |
 1|  |  |  *
 0+--+--+--+--+--*
-1|           |
-2|           *

We are free to "shift" this vector up and down by adding or subtracting 1 from everything. (This is equivalent to finding c.) We wish to find the shift which minimizes the number of | you see (the area between the curve and the x-axis). This is NOT the average (that would be minimizing the standard deviation or RMS error, not the absolute error).

To find the minimizing c, let's think of the total as a function of c and consider its derivative. If the differences are all far away from the x-axis (we're trying to make 101 112 103 114 105 116), it makes sense to just not add this extra stuff, so we shift the function down towards the x-axis. Each time we decrease c, we improve the solution by 6. Now suppose that one of the *s passes the x-axis. Each time we decrease c, we improve the solution by 5-1=4 (we save 5 steps of work, but have to do 1 extra step of work for the * below the x-axis). Eventually, when HALF the *s are past the x-axis, we can NO LONGER IMPROVE THE SOLUTION (derivative: 3-3=0). (In fact, soon we begin to make the solution worse, and can never make it better again. Not only have we found the minimum of this function, but we can see it is a global minimum.)
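The derivative argument can be checked numerically. The sketch below assumes integer entries and uses the vector 2 3 1 -2 0, which is my reading of the drawing above; `total_steps` and `slope` are illustrative names. The slope at c is the number of entries at or above the axis minus the number below it, which is exactly how much the total changes when c goes up by 1:

```python
def total_steps(diffs, c):
    """Area between the shifted vector and the x-axis."""
    return sum(abs(d + c) for d in diffs)


def slope(diffs, c):
    """Change in total_steps when c increases to c + 1 (integer entries assumed)."""
    above = sum(1 for d in diffs if d + c >= 0)  # these entries move away from the axis
    below = len(diffs) - above                   # these entries move towards the axis
    return above - below


diffs = [2, 3, 1, -2, 0]
for c in range(-4, 4):
    print(c, total_steps(diffs, c), slope(diffs, c))
```

The slope changes sign where the entries split evenly around the axis, which is the shift that puts the median of the vector on the x-axis.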

Thus the solution is as follows: Pretend the first number is in place. Calculate the vector of differences. Minimize the sum of the absolute values of this vector; do this by finding the median OF THE DIFFERENCES and subtracting it from the differences to obtain an improved differences-vector. The sum of the absolute values of the "improved" vector is your answer. This is O(N), provided the median is found with a linear-time selection algorithm rather than by sorting. The solutions of equal optimality will (as per the above) always be "adjacent". The solution is always unique if there is an odd number of numbers; if there is an even number of numbers AND the two medians of the differences are different, the equally-optimal solutions will have corrective factors of any integer between the two medians (inclusive).
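For the median step, a randomized quickselect is one way to stay within expected linear time (sorting also works but costs O(N log N)). This is only a sketch; `quickselect` and `median_pair` are illustrative names, not something the original answer defines:

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) in expected linear time."""
    values = list(values)
    while True:
        pivot = random.choice(values)
        lows = [v for v in values if v < pivot]
        pivots = [v for v in values if v == pivot]
        if k < len(lows):
            values = lows
        elif k < len(lows) + len(pivots):
            return pivot
        else:
            # The answer is among the elements larger than the pivot.
            k -= len(lows) + len(pivots)
            values = [v for v in values if v > pivot]


def median_pair(diffs):
    """Lower and upper median; they coincide when len(diffs) is odd."""
    n = len(diffs)
    return quickselect(diffs, (n - 1) // 2), quickselect(diffs, n // 2)


print(median_pair([0, 0, 0, -5, -8, -7, -7, -91]))  # (-7, -5)
```

Any integer between the two returned values (inclusive) can be subtracted from the differences to reach the minimum.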

So I guess this wouldn't be complete without a final example.

  1. input: 2 3 4 10 14 14 15 100
  2. difference vector: 2 3 4 5 6 7 8 9-2 3 4 10 14 14 15 100 = 0 0 0 -5 -8 -7 -7 -91
  3. note that the medians of the difference-vector are not in the middle anymore, we need to perform an O(N) median-finding algorithm to extract them...
  4. medians of difference-vector are -5 and -7
  5. let us take -5 to be our correction factor (any number between the medians, such as -6 or -7, would also be a valid choice)
  6. thus our new goal is 2 3 4 5 6 7 8 9+5=7 8 9 10 11 12 13 14, and the new differences are 5 5 5 0 -3 -2 -2 -86*
  7. this means we will need to do 5+5+5+0+3+2+2+86=108 steps

*(we obtain this by repeating step 2 with our new target, or by adding 5 to each number of the previous difference vector; note that the absolute values must then be summed again, because the shift flips the sign of some entries, so the new total cannot be obtained just by adding 8*5 to the old one)

Alternatively, we could have also taken -6 or -7 to be our correction factor. Let's say we took -7...

  • then the new goal would have been 2 3 4 5 6 7 8 9+7=9 10 11 12 13 14 15 16, and the new differences would have been 7 7 7 2 -1 0 0 -84
  • this would have meant we'd need to do 7+7+7+2+1+0+0+84=108 steps, the same as above

If you simulate this yourself, you can see the number of steps becomes >108 as we take offsets further away from the range [-7,-5].
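The simulation is easy to run. In this sketch the hypothetical helper `steps_to_reach` counts the steps needed to turn the example input into the contiguous run beginning at `start`; starting values 7, 8 and 9 correspond to the correction factors -5, -6 and -7 used above:

```python
def steps_to_reach(array, start):
    """Steps needed to turn array into start, start+1, ..., start+len(array)-1."""
    return sum(abs(start + i - a) for i, a in enumerate(array))


array = [2, 3, 4, 10, 14, 14, 15, 100]
for start in range(5, 12):
    print(start, steps_to_reach(array, start))
```

Starting values 7 through 9 all give 108 steps, while 6 gives 110 and 10 gives 112.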

Pseudocode:

    def minSteps(array A of size N):
        A' = [0,1,...,N-1]
        diffs = A'-A
        medianOfDiffs = leftMedian(diffs)
        return sum(abs(diffs-medianOfDiffs))

Python:

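    # Lower of the (up to two) middle values; either median gives the same total.
    leftMedian = lambda x: sorted(x)[(len(x) - 1) // 2]

    def minSteps(array):
        # Any contiguous run works as the starting target; 0..N-1 is the simplest.
        target = range(len(array))
        diffs = [t - a for t, a in zip(target, array)]
        medianOfDiffs = leftMedian(diffs)
        # Shifting the target by the median minimizes the sum of absolute differences.
        return sum(abs(d - medianOfDiffs) for d in diffs)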
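One way to gain confidence in the median approach is to compare it against a brute-force search over every plausible starting value. The sketch below is a self-contained check under that idea; `min_steps_median` restates the function above in compact form and `min_steps_brute` is an illustrative helper, neither being part of the original answer:

```python
import random


def min_steps_median(array):
    """Median-of-differences method, using sorting for brevity."""
    diffs = [i - a for i, a in enumerate(array)]
    m = sorted(diffs)[len(diffs) // 2]
    return sum(abs(d - m) for d in diffs)


def min_steps_brute(array):
    """Try every contiguous run whose start lies near the range of the input."""
    n = len(array)
    lo, hi = min(array) - n, max(array) + n
    return min(
        sum(abs(start + i - a) for i, a in enumerate(array))
        for start in range(lo, hi + 1)
    )


random.seed(0)
for _ in range(200):
    arr = sorted(random.sample(range(50), random.randint(1, 8)))
    assert min_steps_median(arr) == min_steps_brute(arr)

print(min_steps_median([2, 4, 5, 6]), min_steps_median([2, 4, 5, 8]))  # 1 3
```

The two examples from the question come out as 1 and 3 steps, matching the counts given there.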

edit:

It turns out that for arrays of distinct integers, this is equivalent to a simpler solution: picking one of the (up to 2) medians, assuming it doesn't move, and moving the other numbers accordingly. This simpler method often gives incorrect answers if you have any duplicates, but the OP didn't ask about that case, so for this question it is a simpler and more elegant solution. Additionally, we can use the proof I've given in this solution to justify the "assume the median doesn't move" solution as follows: the corrective factor will always be in the center of the array (i.e. the median of the differences will come from the median of the numbers). Thus any restriction which also guarantees this can be used to create variations of this brainteaser.
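A minimal sketch of the simpler method, assuming the input is sorted and its elements are distinct (the function name `min_steps_fixed_median` and the choice of the upper median are illustrative assumptions):

```python
def min_steps_fixed_median(sorted_distinct):
    """Keep the middle element where it is and line the others up around it."""
    n = len(sorted_distinct)
    mid = n // 2                      # upper median when n is even
    pivot = sorted_distinct[mid]
    return sum(
        abs(a - (pivot + (i - mid)))  # element i must end up at pivot + (i - mid)
        for i, a in enumerate(sorted_distinct)
    )


print(min_steps_fixed_median([2, 4, 5, 6]))  # 1
print(min_steps_fixed_median([2, 4, 5, 8]))  # 3
```

On the earlier example with duplicates, 2 3 4 10 14 14 15 100, this version returns 112 rather than 108, which illustrates why the shortcut is only safe for distinct values.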

ninjagecko answered Oct 30 '22 11:10