What's the most efficient way to evenly fill an unsorted list of "buckets" of varying sizes

Tags: c++, algorithm

Suppose I have an unsorted list of buckets. (Each bucket has a size property.) Suppose I have a quantity Q that I must distribute across the list of buckets as evenly as possible (i.e. minimize the maximum amount placed in any bucket).

If the buckets were sorted in increasing size, the solution would be obvious: fully fill each bucket buckets[i] in turn until the even share of the remaining quantity, Q/(buckets.length-i) (with Q reduced by what has already been placed), is <= buckets[i]->size, and from that point on fill every remaining bucket with that same share, Q/(buckets.length-i), as illustrated:

Filling buckets.
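(For concreteness, here is a minimal C++ sketch of that sort-then-fill idea. The Bucket struct, its size/filled fields, and the function name are illustrative assumptions only, not part of the original problem.)

#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical bucket type, for illustration only.
struct Bucket {
    double size;        // capacity
    double filled = 0;  // amount placed into this bucket
};

// Distribute Q over the buckets so that the maximum fill is minimized,
// using the sort-then-fill idea described above.
void fillEvenly(std::vector<Bucket*>& buckets, double Q) {
    std::sort(buckets.begin(), buckets.end(),
              [](const Bucket* a, const Bucket* b) { return a->size < b->size; });

    const std::size_t n = buckets.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Even share of what is left, spread over the remaining buckets.
        double share = Q / static_cast<double>(n - i);
        // A bucket too small for a full share is filled completely; once
        // share <= size, every remaining (larger) bucket gets that same share.
        double q = std::min(share, buckets[i]->size);
        buckets[i]->filled = q;
        Q -= q;
    }
}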

What's the most efficient way to solve this if the buckets aren't sorted?

I can only think of iterating like this (pseudocode):

// Assumes buckets[i]->size is the remaining capacity of bucket i and
// that fill(q) reduces it by q; otherwise later passes would overfill.
while Q > 0
    for i in 0..buckets.length-1
        q = Q/(buckets.length-i)
        if q > buckets[i]->size
            q = buckets[i]->size
        buckets[i]->fill(q)
        Q -= q

But I'm not sure if there's a better way, or if sorting the list would be more efficient.

(The actual problem I face has more to it, e.g. this "unsorted" list is actually sorted by a separate property "rank", which determines which buckets would get the extra fills when quantities don't divide evenly, etc. So, for example, to use the sort-then-fill method, I'd sort the list by bucket size and rank. But knowing an answer to this would help me figure out the rest.)

asked Jan 08 '13 by slackwing

1 Answer

In many cases where the solution would be simple and efficient if the data were sorted, but complicated or inefficient if it isn't, the best approach is often to just sort the data first and then apply the simple, efficient solution. Even though this means you pay the overhead of sorting, there are excellent sorting algorithms available for practically any purpose, and the total cost of "sort first, then run a simple, efficient algorithm" is frequently lower than that of "skip the sort and run a complicated, inefficient algorithm on unsorted data".

The fact that you also need the data sorted by a different key just means you need two lists, each sorted by a different criterion. Unless we are talking about several thousand buckets, the memory overhead of a second list will most likely not be an issue: both lists contain only pointers to your bucket objects, i.e. 4 or 8 bytes per pointer depending on whether your code is 32- or 64-bit. One list holds the buckets sorted by size, the other holds them sorted by "rank"; when adding new quantity as described in your question, you walk the "sorted by size" list, while you keep using the "sorted by rank" list exactly the way you already do.
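A minimal sketch of that two-list idea, again with a hypothetical Bucket type (now carrying a rank field) and an illustrative BucketViews helper; the names and fields are assumptions, not from the original question:

#include <algorithm>
#include <vector>

// Hypothetical bucket type, for illustration only.
struct Bucket {
    double size;        // capacity
    int    rank;        // decides which buckets get the extra fills
    double filled = 0;  // amount placed into this bucket
};

// Two views over the same bucket objects: one ordered by size (used for
// the fill pass), one ordered by rank (used wherever rank matters).
// Each view stores only pointers, so the extra cost per bucket is one
// pointer (4 or 8 bytes) per list. The underlying vector must not
// reallocate while the views are in use.
struct BucketViews {
    std::vector<Bucket*> bySize;
    std::vector<Bucket*> byRank;

    explicit BucketViews(std::vector<Bucket>& buckets) {
        for (Bucket& b : buckets) {
            bySize.push_back(&b);
            byRank.push_back(&b);
        }
        std::sort(bySize.begin(), bySize.end(),
                  [](const Bucket* a, const Bucket* b) { return a->size < b->size; });
        std::sort(byRank.begin(), byRank.end(),
                  [](const Bucket* a, const Bucket* b) { return a->rank < b->rank; });
    }
};

The fill pass from the question then walks bySize, while any rank-based tie-breaking walks byRank.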

answered Sep 18 '22 by Mecki