Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Spread objects evenly over multiple collections

Tags:

The scenario is that there are n objects, of different sizes, unevenly spread over m buckets. The size of a bucket is the sum of all of the object sizes that it contains. It now happens that the sizes of the buckets are varying wildly.

What would be a good algorithm if I want to spread those objects evenly over those buckets so that the total size of each bucket would be about the same? It would be nice if the algorithm leaned towards less move size over a perfectly even spread.

I have this naïve, ineffective, and buggy solution in Ruby.

buckets = [ [10, 4, 3, 3, 2, 1], [5, 5, 3, 2, 1], [3, 1, 1], [2] ]  avg_size = buckets.flatten.reduce(:+) / buckets.count + 1  large_buckets = buckets.take_while {|arr| arr.reduce(:+) >= avg_size}.to_a  large_buckets.each do |large|   smallest = buckets.last    until ((small_sum = smallest.reduce(:+)) >= avg_size)     break if small_sum + large.last >= avg_size     smallest << large.pop   end    buckets.insert(0, buckets.pop) end  => [[3, 1, 1, 1, 2, 3], [2, 1, 2, 3, 3], [10, 4], [5, 5]] 
like image 872
Jonas Elfström Avatar asked May 16 '13 13:05

Jonas Elfström


1 Answers

I believe this is a variant of the bin packing problem, and as such it is NP-hard. Your answer is essentially a variant of the first fit decreasing heuristic, which is a pretty good heuristic. That said, I believe that the following will give better results.

  • Sort each individual bucket in descending size order, using a balanced binary tree.
  • Calculate average size.
  • Sort the buckets with size less than average (the "too-small buckets") in descending size order, using a balanced binary tree.
  • Sort the buckets with size greater than average (the "too-large buckets") in order of the size of their greatest elements, using a balanced binary tree (so the bucket with {9, 1} would come first and the bucket with {8, 5} would come second).
  • Pass1: Remove the largest element from the bucket with the largest element; if this reduces its size below the average, then replace the removed element and remove the bucket from the balanced binary tree of "too-large buckets"; else place the element in the smallest bucket, and re-index the two modified buckets to reflect the new smallest bucket and the new "too-large bucket" with the largest element. Continue iterating until you've removed all of the "too-large buckets."
  • Pass2: Iterate through the "too-small buckets" from smallest to largest, and select the best-fitting elements from the largest "too-large bucket" without causing it to become a "too-small bucket;" iterate through the remaining "too-large buckets" from largest to smallest, removing the best-fitting elements from them without causing them to become "too-small buckets." Do the same for the remaining "too-small buckets." The results of this variant won't be as good as they are for the more complex variant because it won't shift buckets from the "too-large" to the "too-small" category or vice versa (hence the search space will be smaller), but this also means that it has much simpler halting conditions (simply iterate through all of the "too-small" buckets and then halt), whereas the complex variant might cause an infinite loop if you're not careful.

The idea is that by moving the largest elements in Pass1 you make it easier to more precisely match up the buckets' sizes in Pass2. You use balanced binary trees so that you can quickly re-index the buckets or the trees of buckets after removing or adding an element, but you could use linked lists instead (the balanced binary trees would have better worst-case performance but the linked lists might have better average-case performance). By performing a best-fit instead of a first-fit in Pass2 you're less likely to perform useless moves (e.g. moving a size-10 object from a bucket that's 5 greater than average into a bucket that's 5 less than average - first fit would blindly perform the movie, best-fit would either query the next "too-large bucket" for a better-sized object or else would remove the "too-small bucket" from the bucket tree).

like image 198
Zim-Zam O'Pootertoot Avatar answered Oct 02 '22 14:10

Zim-Zam O'Pootertoot