Algorithm to evenly distribute values into containers?

Tags: algorithm

Does anyone know a way to evenly distribute numbers into a set number of containers, making sure that the total values of the containers are as even as possible?

EDIT: by "as even as possible" I mean that the total of each container should be as close as possible to the average, i.e. the overall total divided by the X containers.

Right now I simply sort the array of numbers (descending) and then distribute them, oblivious of their value, into the containers. A set of 1000, 200, 20, 1000 distributed into three containers would equal [2000], [200], [20].

What I want to do is:

Example

Set of numbers: 10 30 503 23 1 85 355
If I were to distribute these into three containers I would just pick the highest first and then distribute them as I go, like this:
Cont 1 = 503
Cont 2 = 355
Cont 3 = 85 + 30 + 23 + 10 + 1

This will give the best possible distribution that you can get with the values provided.

But I do not know of a neat way to express this in code.

Ideas?

Root, asked Oct 05 '13 12:10


2 Answers

Do you have a large dataset, with much variance in the size of objects, and a cast iron requirement that you must find the very best solution? If so, this is not realistic.

But the good news is that many problems that are NP-complete in theory are quite easy in the real world! If your number of data points is relatively small, then you can probably do an intelligent (but still thorough) search and find the globally optimal solution.
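As a rough illustration of such a thorough search, here is a minimal branch-and-bound sketch in C. It assumes non-negative values and measures evenness by the sum of the squared container totals (which is smallest exactly when the squared deviations from the average are smallest). The names `distribute_exact`, `place`, `struct search` and the `MAX_BINS`/`MAX_ITEMS` limits are made up for this example.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define MAX_BINS  8    /* illustrative limits, chosen for this sketch */
#define MAX_ITEMS 32

struct search {
    const int *values;
    size_t nvalues, nbins;
    long long totals[MAX_BINS];   /* running total of each container */
    int where[MAX_ITEMS];         /* container chosen for each value so far */
    int best_where[MAX_ITEMS];    /* best complete assignment found */
    long long best_cost;          /* its sum of squared totals */
};

/* Try every container for value i, abandoning branches that cannot win. */
static void place(struct search *s, size_t i)
{
    long long cost = 0;
    size_t j;

    for (j = 0; j < s->nbins; j++)
        cost += s->totals[j] * s->totals[j];

    /* values are assumed non-negative, so the cost can only grow from here */
    if (cost >= s->best_cost)
        return;

    if (i == s->nvalues) {
        s->best_cost = cost;
        memcpy(s->best_where, s->where, i * sizeof s->where[0]);
        return;
    }

    for (j = 0; j < s->nbins; j++) {
        int was_empty = (s->totals[j] == 0);

        s->totals[j] += s->values[i];
        s->where[i] = (int)j;
        place(s, i + 1);
        s->totals[j] -= s->values[i];

        /* all empty containers are interchangeable: trying one is enough */
        if (was_empty)
            break;
    }
}

/* Fills assignment[i] with the container index for values[i] and returns
   the minimal sum of squared totals, or -1 if the limits are exceeded. */
long long distribute_exact(const int *values, size_t nvalues,
                           size_t nbins, int *assignment)
{
    struct search s;
    size_t i;

    if (nbins == 0 || nbins > MAX_BINS || nvalues > MAX_ITEMS)
        return -1;

    memset(&s, 0, sizeof s);
    s.values = values;
    s.nvalues = nvalues;
    s.nbins = nbins;
    s.best_cost = LLONG_MAX;

    place(&s, 0);

    for (i = 0; i < nvalues; i++)
        assignment[i] = s.best_where[i];
    return s.best_cost;
}
```

The running time is still exponential in the worst case, so this is only practical for small inputs; the pruning and the empty-container shortcut just cut away branches that cannot improve on the best result so far.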

Also, if the variance in the values is quite small, i.e. you have a nicely behaved dataset, you might quickly stumble across a solution that fills all the containers exactly evenly. If so, then this is obviously the best possible answer. This could work well even on very large datasets. (I think that what you want here is a dataset with lots of small values that can be used to easily tidy things up at the end.)

So, don't give up! First, sort your data and consider the data points from the largest to the smallest. At each stage, assign the next value to the container which is currently smallest. This probably won't give you the optimal solution in all cases, but it might be quite reasonable in practice.

Sorting 1000, 200, 20, 1000, would give you 1000, 1000, 200, 20. This algorithm would then give you:

1000        = 1000
1000        = 1000
200   +20   =  220

This happens to be the optimal solution, but it won't always be the case.
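The greedy rule described above (largest value first, always into the container that is currently smallest) could be sketched in C roughly as follows. The function names `distribute_greedy` and `cmp_desc` are placeholders for this example, and the sketch only tracks the container totals, not which value went where.

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: orders ints from largest to smallest */
int cmp_desc(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x < y) - (x > y);
}

/* Sorts values in place (descending), then adds each value to the
   container whose running total is currently the smallest.
   totals must have room for ncontainers entries; it is overwritten. */
void distribute_greedy(int *values, size_t nvalues,
                       long *totals, size_t ncontainers)
{
    size_t i, j;

    for (j = 0; j < ncontainers; j++)
        totals[j] = 0;

    qsort(values, nvalues, sizeof values[0], cmp_desc);

    for (i = 0; i < nvalues; i++) {
        size_t smallest = 0;

        /* on ties the lowest-numbered container wins */
        for (j = 1; j < ncontainers; j++)
            if (totals[j] < totals[smallest])
                smallest = j;

        totals[smallest] += values[i];
    }
}
```

Called with 1000, 200, 20, 1000 and three containers, this leaves the totals at 1000, 1000 and 220, matching the table above. The linear scan for the smallest container is fine for a handful of containers; with many containers a min-heap would be the usual replacement.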

====

If you are willing and able to try more complex algorithms, look up the partition problem:

Although the partition problem is NP-complete, there is a pseudo-polynomial time dynamic programming solution, and there are heuristics that solve the problem in many instances, either optimally or approximately. For this reason, it has been called "The Easiest Hard Problem".

There is an optimization version of the partition problem, which is to partition the multiset S into two subsets S1, S2 such that the difference between the sum of elements in S1 and the sum of elements in S2 is minimized.
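For the two-subset optimization version quoted above, the pseudo-polynomial dynamic programme can be sketched as a subset-sum table. This is only an illustration under the assumption that all values are non-negative; the function name `partition_min_difference` is invented for the example, and it reports only the smallest achievable difference, not the subsets themselves.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns the smallest possible |sum(S1) - sum(S2)| over all ways of
   splitting the values into two subsets, or -1 if allocation fails.
   Assumes every value is non-negative. */
long partition_min_difference(const int *values, size_t nvalues)
{
    long total = 0, half, s, best = 0;
    size_t i;
    unsigned char *reachable;

    for (i = 0; i < nvalues; i++)
        total += values[i];
    half = total / 2;

    /* reachable[s] is 1 when some subset of the values sums to exactly s */
    reachable = calloc((size_t)half + 1, 1);
    if (reachable == NULL)
        return -1;
    reachable[0] = 1;

    for (i = 0; i < nvalues; i++) {
        /* walk downwards so each value is used at most once */
        for (s = half; s >= values[i]; s--)
            if (reachable[s - values[i]])
                reachable[s] = 1;
    }

    /* the best split puts the largest reachable sum <= half on one side */
    for (s = half; s >= 0; s--) {
        if (reachable[s]) {
            best = s;
            break;
        }
    }

    free(reachable);
    return total - 2 * best;
}
```

The table has one entry per possible sum up to half the total, so time and memory grow with the magnitude of the numbers, not just with how many there are; that is what "pseudo-polynomial" means here. Extending this to three or more containers is not a simple change.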

Aaron McDaid, answered Nov 16 '22 02:11


Interesting. This C program seems to give the expected result so far. It starts by sorting the data, then, for n containers, immediately stores the n highest numbers, one in each container. (You can omit that step, actually.) Then, from the largest remaining number to the smallest, it finds the container where adding that number gives the lowest total relative to the optimal average, i.e. the container that is currently least full. Because this runs from high to low, the big numbers are spread out first and the small ones even out the totals at the end. This is a heuristic, so it is not guaranteed to find the optimum for every input.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>

int sort_numeric (const void *a, const void *b)
{
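    int x = *(const int *)a;
    int y = *(const int *)b;
//  compare instead of subtracting, which could overflow for large values
    return (x > y) - (x < y);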
}

int main (int argc, char **argv)
{
    int list[] = { 10, 30, 503, 23, 1, 85, 355 };
    int i,j, nnumber, ncontainer, total, avgPerContainer, nextError, smallestError, containerForSmallest;
    int *containers, *errors;

    ncontainer = 3;

    nnumber = sizeof(list)/sizeof(list[0]);

    qsort (list, nnumber, sizeof(list[0]), sort_numeric);

    containers = (int *)malloc(ncontainer * sizeof(int));
    for (i=0; i<ncontainer; i++)
        containers[i] = 0;

    errors = (int *)malloc(ncontainer * sizeof(int));
    for (i=0; i<ncontainer; i++)
        errors[i] = 0;


    printf ("input:");
    for (i=0; i<nnumber; i++)
    {
        printf (" %d", list[i]);
    }
    printf ("\n");

//  how much is to be in each container?
    total = 0;
    for (i=0; i<nnumber; i++)
        total += list[i];

//  this will result in a fraction:
//      avgPerContainer = total/ncontainer;
//  so instead we'll use 'total' and *keeping in mind*
//  that the number needs to be divided by ncontainer
    avgPerContainer = total;

    printf ("per container: ~%d\n", (2*avgPerContainer+ncontainer)/(2*ncontainer) );

//  start by putting highest values into each container
    for (i=0; i<ncontainer; i++)
        containers[i] += list[nnumber-ncontainer+i];
//  .. remove from list ..
    nnumber -= ncontainer;

//  print current totals
    for (i=0; i<ncontainer; i++)
    {
        errors[i] = containers[i]*ncontainer - avgPerContainer;
        printf ("#%d: %d, error = %d/%d ~ %+d\n", i, containers[i], errors[i], ncontainer, (2*errors[i]+ncontainer)/(2*ncontainer) );
    }

    printf ("remaining:");
    for (i=0; i<nnumber; i++)
    {
        printf (" %d", list[i]);
    }
    printf ("\n");

//  add the remainders
    for (i=nnumber-1; i>=0; i--)
    {
        smallestError = INT_MAX;
        containerForSmallest = 0;
        for (j=0; j<ncontainer; j++)
        {
//  scale by ncontainer, because avgPerContainer holds the undivided total
            nextError = (containers[j] + list[i]) * ncontainer - avgPerContainer;
            if (nextError < smallestError)
            {
                containerForSmallest = j;
                smallestError = nextError;
                printf ("error for %d, %d + %d, is %+d\n", j, containers[j], list[i], smallestError);
            }
        }
        printf ("we put %d into #%d\n", list[i], containerForSmallest);
        containers[containerForSmallest] += list[i];
    }

    for (i=0; i<ncontainer; i++)
    {
        printf ("#%d: %d, error = %d/%d ~ %+d\n", i, containers[i], containers[i]*ncontainer - avgPerContainer, ncontainer, (2*(containers[i]*ncontainer - avgPerContainer)+ncontainer)/(2*ncontainer) );
    }

    free (errors);
    free (containers);

    return 0;
}

Jongware, answered Nov 16 '22 02:11