I have a collection of objects, each of which has a weight and a value. I want to pick the pair of objects with the highest total value subject to the restriction that their combined weight does not exceed some threshold. Additionally, I am given two arrays, one containing the objects sorted by weight and one containing the objects sorted by value.
I know how to do it in O(n2) but how can I do it in O(n)?
This is a combinatorial optimization problem, and the fact the values are sorted means you can easily try a branch and bound approach.
I think that I have a solution that works in O(n log n) time and O(n) extra space. This isn't quite the O(n) solution you wanted, but it's still better than the naive quadratic solution.
The intuition behind the algorithm is that we want to be able to efficiently determine, for any amount of weight, the maximum value we can get with a single item that uses at most that much weight. If we can do this, we have a simple algorithm for solving the problem: iterate across the array of elements sorted by value. For each element, see how much additional value we could get by pairing a single element with it (using the values we precomputed), then find which of these pairs is maximum. If we can do the preprocessing in O(n log n) time and can answer each of the above queries in O(log n) time, then the total time for the second step will be O(n log n) and we have our answer.
An important observation we need to do the preprocessing step is as follows. Our goal is to build up a structure that can answer the question "which element with weight less than x has maximum value?" Let's think about how we might do this by adding one element at a time. If we have an element (value, weight) and the structure is empty, then we want to say that the maximum value we can get using weight at most "weight" is "value". This means that everything in the range [0, max_weight - weight) should be set to value. Otherwise, suppose that the structure isn't empty when we try adding in (value, weight). In that case, we want to say that any portion of the range [0, weight) whose value is less than value should be replaced by value.
The problem here is that when we do these insertions, there might be, on iteration k, O(k) different subranges that need to be updated, leading to an O(n2) algorithm. However, we can use a very clever trick to avoid this. Suppose that we insert all of the elements into this data structure in descending order of value. In that case, when we add in (value, weight), because we add the elements in descending order of value, each existing value in the data structure must be higher than our value. This means that if the range [0, weight) intersects any range at all, those ranges will automatically be higher than value and so we don't need to update them. If we combine this with the fact that each range we add always spans from zero to some value, the only portion of the new range that could ever be added to the data structure is the range [weight, x), where x is the highest weight stored in the data structure so far.
To summarize, assuming that we visit the (value, weight) pairs in descending order of value, we can update our data structure as follows:
Notice that this means that we are always splitting ranges at the front of the list of ranges we have encountered so far. Because of this, we can think about storing the list of ranges as a simple array, where each array element tracks the upper endpoint of some range and the value assigned to that range. For example, we might track the ranges [0, 3), [3, 9), and [9, 12) as the array
3, 9, 12
If we then needed to split the range [0, 3) into [0, 1) and [1, 3), we could do so by prepending 1 to he list:
1, 3, 9, 12
If we represent this array in reverse (actually storing the ranges from high to low instead of low to high), this step of creating the array runs in O(n) time because at each point we just do O(1) work to decide whether or not to add another element onto the end of the array.
Once we have the ranges stored like this, to determine which of the ranges a particular weight falls into, we can just use a binary search to find the largest element smaller than that weight. For example, to look up 6 in the above array we'd do a binary search to find 3.
Finally, once we have this data structure built up, we can just look at each of the objects one at a time. For each element, we see how much weight is left, use a binary search in the other structure to see what element it should be paired with to maximize the total value, and then find the maximum attainable value.
Let's trace through an example. Given maximum allowable weight 10 and the objects
Weight | Value
------+------
2 | 3
6 | 5
4 | 7
7 | 8
Let's see what the algorithm does. First, we need to build up our auxiliary structure for the ranges. We look at the objects in descending order of value, starting with the object of weight 7 and value 8. This means that if we ever have at least seven units of weight left, we can get 8 value. Our array now looks like this:
Weight: 7
Value: 8
Next, we look at the object of weight 4 and value 7. This means that with four or more units of weight left, we can get value 7:
Weight: 7 4
Value: 8 7
Repeating this for the next item (weight six, value five) does not change the array, since if the object has weight six, if we ever had six or more units of free space left, we would never choose this; we'd always take the seven-value item of weight four. We can tell this since there is already an object in the table whose range includes remaining weight four.
Finally, we look at the last item (value 3, weight 2). This means that if we ever have weight two or more free, we could get 3 units of value. The final array now looks like this:
Weight: 7 4 2
Value: 8 7 3
Finally, we just look at the objects in any order to see what the best option is. When looking at the object of weight 2 and value 3, since the maximum allowed weight is 10, we need tom see how much value we can get with at most 10 - 2 = 8 weight. A binary search over the array tells us that this value is 8, so one option would give us 11 weight. If we look at the object of weight 6 and value 5, a binary search tells us that with five remaining weight the best we can do would be to get 7 units of value, for a total of 12 value. Repeating this on the next two entries doesn't turn up anything new, so the optimum value found has value 12, which is indeed the correct answer.
Hope this helps!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With