
Divide a list into three lists such that their sums are close to each other

Let's say that I have an array of numbers S = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]. I want to divide it into three arrays. The order of the numbers and the number of items in each array do not matter.

Let's say A1, A2, and A3 are the sub-arrays. I want to minimize the function

f(x) = ( SUM(A1) - SUM(S) / 3 )^2 / 3 +
       ( SUM(A2) - SUM(S) / 3 )^2 / 3 +
       ( SUM(A3) - SUM(S) / 3 )^2 / 3
  • I don't need an optimal solution; I just need a solution that is good enough.
  • I don't want an algorithm that is too slow. I can trade some speed for a better result, but not too much.
  • The length of S is around 10 to 30.
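For concreteness, the objective above can be written as a small helper. This is only a sketch: the name `objective` and the sample partition are illustrative and do not come from the original post. Note that f is simply the population variance of the three sums.

```python
def objective(parts):
    """Mean squared deviation of the three part sums from SUM(S) / 3."""
    total = sum(sum(p) for p in parts)
    target = total / 3
    return sum((sum(p) - target) ** 2 for p in parts) / 3


# Illustrative partition of S with sums 14, 14 and 13 (hypothetical example)
parts = [[9, 3, 1, 1], [7, 4, 3], [6, 5, 2]]
print(objective(parts))  # about 0.2222, i.e. 2/9
```

Since the total of S is 41, no split into integer sums can be more balanced than 14, 14, 13, so 2/9 is the smallest value f can take for this input.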

Why

Why do I need to solve this problem? I want to arrange boxes neatly into three columns such that the total heights of the columns do not differ too much from each other.


What have I tried

My first instinct was to use a greedy approach. The result is not that bad, but it does not guarantee an optimal solution. Is there a better way?

s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]
s = sorted(s, reverse=True)

a = [[], [], []]
sum_a = [0, 0, 0]

for x in s:
    i = sum_a.index(min(sum_a))
    sum_a[i] += x
    a[i].append(x)

print(a)
asked Dec 19 '16 at 11:12 by invisal



2 Answers

As you said you don't mind a non-optimal solution, I thought I would re-use your initial function and add a way to find a good starting arrangement for your initial list s.

Your initial function:

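def pigeon_hole(s):
    a = [[], [], []]
    sum_a = [0, 0, 0]
    for x in s:
        # Put each item into the sub-list with the smallest running sum
        i = sum_a.index(min(sum_a))
        sum_a[i] += x
        a[i].append(x)
    # Return a real list (not a lazy map object) so np.std can consume it
    return [sum(x) for x in a]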

This is a way to find a sensible initial ordering for your list. It works by creating every rotation of the list in both sorted and reverse-sorted order. Each rotation is pigeon-holed, and the best rotation is the one whose three sums have the smallest standard deviation:

import random
import numpy as np

def rotate(l):
    l = sorted(l)
    lr = l[::-1]
    # Every cyclic shift of the ascending list, then of the descending list
    rotation = [np.roll(l, i) for i in range(len(l))] + [np.roll(lr, i) for i in range(len(l))]
    blocks = [pigeon_hole(i) for i in rotation]
    return rotation[np.argmin(np.std(blocks, axis=1))]  # the best rotation

print(pigeon_hole(rotate([random.randint(0, 20) for i in range(20)])))

# Testing with some random numbers, these are the sums of the three sub lists
>>> [64, 63, 63]
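To make the rotation idea easier to follow without NumPy, here is a pure-Python sketch of the same search. The helper names (`greedy_sums`, `rotations`, `best_rotation`) are made up for this illustration, and `statistics.pstdev` is used as the population standard deviation, which is what `np.std` computes by default.

```python
from statistics import pstdev


def greedy_sums(values):
    """Place each value, in the given order, into the currently smallest bin."""
    sums = [0, 0, 0]
    for x in values:
        sums[sums.index(min(sums))] += x
    return sums


def rotations(values):
    """All cyclic shifts; shift i moves the last i items to the front (like np.roll)."""
    n = len(values)
    return [values[n - i:] + values[:n - i] for i in range(n)]


def best_rotation(values):
    """Try every shift of the ascending and descending orderings, keep the most balanced."""
    ascending = sorted(values)
    candidates = rotations(ascending) + rotations(ascending[::-1])
    return min(candidates, key=lambda r: pstdev(greedy_sums(r)))


s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]
order = best_rotation(s)
print(order, greedy_sums(order))
```

The search costs one greedy pass per rotation, so 2n passes of n items each, which stays cheap for lists of 10 to 30 elements.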

Although this could be optimized further, it is quite quick, taking 0.0013s for 20 numbers. Here is a quick comparison with @Mo Tao's answer, using a = rotate(range(1, 30)):

# This method
a = rotate(range(1, 30))
>>> [[29, 24, 23, 18, 17, 12, 11, 6, 5], [28, 25, 22, 19, 16, 13, 10, 7, 4, 1], [27, 26, 21, 20, 15, 14, 9, 8, 3, 2]]
list(map(sum, a))
# Sums to [145, 145, 145] in 0.002s

# Mo Tao's method
>>> [[25, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1], [29, 26, 20, 19, 18, 17, 16], [28, 27, 24, 23, 22, 21]]
# Sums to [145, 145, 145] in 1.095s

This method also seems to find the optimal solution in many cases, although this probably won't hold for all cases. I tested this implementation 500 times against Mo Tao's answer, using lists of 30 numbers and checking whether the sub-lists sum to the same quantities:

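c = 0
for i in range(500):
    r = [random.randint(1, 10) for j in range(30)]
    res = pigeon_hole(rotate(r))
    # tao(r) is assumed to wrap Mo Tao's search and return the three sums
    d, e = sorted(res), sorted(tao(r))  # Comparing this to the optimal solution by Mo Tao
    # Compare the sums element-wise. (Wrapping the comparison in a list, as in
    # `[k == kk]`, would make every element truthy and count all 500 runs.)
    if all(k == kk for k, kk in zip(d, e)):
        c += 1
    # Presumably resets the globals used by Mo Tao's search between runs
    memory = {}
    best_f = pow(sum(s), 3)
    best_state = None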

>>> 500 # (they do)

I thought I would provide an update with a more optimized version of my function here:

def rotate2(l):
    # Calculate an acceptable minimum stdev of the pigeon holed list
    if sum(l) % 3 == 0:
        std = 0
    else:
        std = np.std([0, 0, 1])

    l = sorted(l, reverse=True)
    best_rotation = None
    best_std = float("inf")  # any real standard deviation will be smaller

    for i in range(len(l)):
        rotation = np.roll(l, i)
        sd = np.std(pigeon_hole(rotation))

        if sd == std:
            return rotation  # Stop as soon as the minimum stdev is found

        elif sd < best_std:
            best_std = sd
            best_rotation = rotation

    return best_rotation
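The early-exit threshold in rotate2 can be checked in isolation. The sketch below assumes integer inputs and uses a hypothetical helper, `ideal_std`, that builds the most balanced triple of integer sums for a given total and returns its population standard deviation. This is a lower bound that a given list may or may not be able to reach.

```python
from math import isclose, sqrt
from statistics import pstdev


def ideal_std(total):
    """Population std of the most balanced three integer sums adding up to total."""
    q, r = divmod(total, 3)
    sums = [q + 1] * r + [q] * (3 - r)  # e.g. 41 -> [14, 14, 13]
    return pstdev(sums)


print(ideal_std(42))  # 0.0, the total divides evenly
print(ideal_std(41))  # about 0.4714, same as the std of [0, 0, 1]
print(isclose(ideal_std(40), ideal_std(41)))  # remainders 1 and 2 give the same value
```

Because rotate2 compares floating-point values with ==, a tolerant comparison such as math.isclose may be a safer way to detect that the threshold has been reached.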

The main change is that the search for a good ordering stops once a suitable rotation has been found. Also, only the reverse-sorted list is searched, which doesn't appear to alter the result. Timing this with

import timeit
print(timeit.timeit("rotate2([random.randint(1, 10) for i in range(30)])", "from __main__ import rotate2, random", number=1000) / 1000.)

results in a large speed-up. On my current computer rotate takes about 1.84ms and rotate2 takes about 0.13ms, so about a 14x speed-up. For comparison, גלעד ברקן's implementation took about 0.99ms on my machine.

answered Oct 12 '22 at 11:10 by kezzos


As I mentioned in a comment on the question, this is the straightforward dynamic programming method. It takes less than 1 second for s = range(1, 30) and gives an optimal solution.

I think the code is self-explanatory if you know memoization.
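One step that may not be obvious is why the search below can score a state by s1 * s1 + s2 * s2 + s3 * s3 rather than by the f from the question. A short derivation, writing T for SUM(S) and s_k for SUM(A_k) (these symbols are introduced here for brevity and are not in the original post):

```latex
\begin{aligned}
f &= \frac{1}{3}\sum_{k=1}^{3}\left(s_k - \frac{T}{3}\right)^2
   = \frac{1}{3}\sum_{k=1}^{3} s_k^2 \;-\; \frac{2T}{9}\sum_{k=1}^{3} s_k \;+\; \frac{T^2}{9} \\
  &= \frac{1}{3}\sum_{k=1}^{3} s_k^2 \;-\; \frac{T^2}{9},
  \qquad \text{since } \sum_{k=1}^{3} s_k = T .
\end{aligned}
```

T is fixed by the input, so minimizing the sum of squares minimizes f as well. The sum of squares also never decreases as non-negative items are added, which is what makes the f >= best_f pruning safe.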

s = range(1, 30)
# s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]
n = len(s)

memory = {}
best_f = pow(sum(s), 3)
best_state = None

def search(state, pre_state):
    global memory, best_f, best_state    
    s1, s2, s3, i = state
    f = s1 * s1 + s2 * s2 + s3 * s3
    if state in memory or f >= best_f:
        return
    memory[state] = pre_state
    if i == n:
        best_f = f
        best_state = state
    else:
        search((s1 + s[i], s2, s3, i + 1), state)
        search((s1, s2 + s[i], s3, i + 1), state)
        search((s1, s2, s3 + s[i], i + 1), state)

search((0, 0, 0, 0), None)

a = [[], [], []]
state = best_state
while state[3] > 0:
    pre_state = memory[state]
    for j in range(3):
        if state[j] != pre_state[j]:
            a[j].append(s[pre_state[3]])
    state = pre_state

print(a)
print(best_f, best_state, list(map(sum, a)))
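The same search can also be packaged as a function without module-level globals, which makes it easier to call repeatedly (for instance in a comparison loop like the `tao(r)` call in the other answer). This is a sketch under assumptions, not Mo Tao's code: the name `three_way_sums` is hypothetical, it returns only the three sums rather than the sub-lists, and it assumes non-negative values so that the pruning remains valid.

```python
def three_way_sums(values):
    """Memoized exhaustive search; returns the sorted sums of the best 3-way split."""
    n = len(values)
    seen = set()                      # states (s1, s2, s3, i) already expanded
    best = {"f": float("inf"), "sums": (0, 0, 0)}

    def search(s1, s2, s3, i):
        f = s1 * s1 + s2 * s2 + s3 * s3
        state = (s1, s2, s3, i)
        # Skip repeated states and branches that cannot beat the best so far
        if state in seen or f >= best["f"]:
            return
        seen.add(state)
        if i == n:
            best["f"] = f
            best["sums"] = (s1, s2, s3)
            return
        x = values[i]
        search(s1 + x, s2, s3, i + 1)
        search(s1, s2 + x, s3, i + 1)
        search(s1, s2, s3 + x, i + 1)

    search(0, 0, 0, 0)
    return sorted(best["sums"])


print(three_way_sums([6, 2, 1, 7, 4, 3, 9, 5, 3, 1]))  # [13, 14, 14]
```

Keeping the visited set and the best score inside the function means every call starts from a clean state, so nothing needs to be reset between runs.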
answered Oct 12 '22 at 10:10 by Mo Tao