Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

More efficient method of finding minimum sum after k operations

Tags:

python

An array is manipulated k times so that each time the max value is devided by 2 and rounded up. I need to find its minimum sum after these k manipulations. k and all numbers in array num > 1. The method minSum recieves an array called num and an integer k. The brute Python code that works for me with very bad time complexity is:

import bisect

def minSum(num, k):
    num.sort()
    lastIndex = len(num)-1
    for i in range(k):
        if num[lastIndex]==1:
            return sum(num)

        num[lastIndex]= math.ceil(num[lastIndex]/2)
        lastNumber = num[len(num)-1]
        num.pop()
        bisect.insort(num, lastNumber)

    return sum(num)

I've also tried to sort the array first but then every algorithm I tried failed. could you think of a good algorithm with good time complexity? preferably without using special imports.

like image 358
Ohad Marbach Avatar asked Feb 26 '19 12:02

Ohad Marbach


People also ask

How to find minimum sum?

Minimum sum = (sumOfArr - subSum) + (subSum/ X) where sumOfArr is the sum of all elements of the array and subSum is the maximum sum of the subarray.


2 Answers

The standard library (ie. no special imports) comes with a heapq module that makes the algorithm O(3n + k*(2 lg n)) => O(lg n):

import math
import heapq

def minSum(num, k):
    heap = [-n for n in num]  # negate values for max-heap
    heapq.heapify(heap)

    for i in range(k):
        # Find max value
        max_value = heapq.heappop(heap)
        # Change max value to rounded half
        # use floor since we've negated the values
        heapq.heappush(heap, math.floor(max_value/2))

    # Calculate minimum sum
    return -sum(heap)  # reverse polarity again

update 1: (from comment by @raury-daulton) combine pop/push, O(3n + k*(lg n)) => O(lg n):

def minSum(num, k):
    heap = [-n for n in num]  # negate values for max-heap
    heapq.heapify(heap)

    for i in range(k):
        max_value = heap[0]
        heapq.heapreplace(heap, math.floor(max_value/2))

    # Calculate minimum sum
    return -sum(heap)  # reverse polarity again

update 2: use a max heap directly O(2n + k*(lg n)) => O(lg n)

def heapreplace_max(heap, item):
    "We need to implement this ourselves from primitives."
    returnitem = heap[0]
    heap[0] = item
    heapq._siftup_max(heap, 0)
    return returnitem


def minSum(num, k):
    heap = num   # alias for consistency with other samples
    heapq._heapify_max(heap)  # make it a max heap (no negation hack)

    for i in range(k):
        max_value = heap[0]
        heapreplace_max(heap, math.ceil(max_value/2))

    return sum(heap)

update 3: final optimizations (this requires the input array to be an array of ints).

def minSum(num, k):
    heap = num   # alias for consistency with other samples
    heapq._heapify_max(heap)

    for i in range(k):
        max_value = heap[0]
        if max_value == 1:  # exit early if there is no more work
            break
        new_val = (max_value >> 1) + (max_value & 1)  # same as, but much faster than math.ceil(max_value/2)
        heapreplace_max(heap, new_val)

    return sum(heap)

The upper bound on k is sum(math.floor(math.log(v, 2)+1) for v in num) (i.e. the total number of bits required to represent all the input numbers). It may be faster to precompute the loop variable than to have if max_value == 1: in the loop, i.e.:

for i in range(min(k, int(sum(floor(log(v, 2)+1) for v in num)))):
    max_value = heap[0]
    new_val = ...

but I haven't actually measured this.

like image 89
thebjorn Avatar answered Oct 13 '22 00:10

thebjorn


Each loop of your code goes through the complete array twice, so your algorithm is of execution order kn, where n is the length of your array. You can speed this up by not taking the sum after every operation, since your operation "divide by 2 and round up" cannot decrease the sum. (This depends on the fact that none of the array values may be negative.) Take the sum only after applying all k operations. This should cut your execution roughly in half.

Another, more significant optimization is to avoid the search through the array for the maximum value. I suggest you use a heapq (priority queue) structure. You store the negatives of the array values, so the minimum of the heapq is the negative of the maximum of your array. Finding the max array value is then an order 1 operation, and changing it is an order log(n) operation. Use the heapreplace method to do the replacing, to get maximum speed. Making the array into a heap using heapify is the only order n operation, but that happens only once.

I'll add code later today if I get the time. This algorithm does require one import, namely of heapq, but that is in the standard library so does not really count as a "special import."

like image 40
Rory Daulton Avatar answered Oct 12 '22 23:10

Rory Daulton