Note: this is an abstract rewording of a real-life problem regarding ordering records in a SWF file. A solution will help me improve an open-source application.
Bob has a store and wants to hold a sale. His store carries a number of products, and he has a certain integer quantity of units of each product in stock. He also has a number of shelf-mounted price labels (as many as the number of products), with the prices already printed on them. He can place any price label on any product (the label sets the unit price for his entire stock of that product). However, some products have an additional restriction: any such product may not be cheaper than a certain other product.
You must find how to arrange the price labels, such that the total cost of all of Bob's wares is as low as possible. The total cost is the sum of each product's assigned price label multiplied by the quantity of that product in stock.
Given:
The program must find:
To satisfy the conditions:
Note that if not for the first condition, the solution would be simply sorting labels by price and products by quantity, and matching both directly.
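A minimal sketch of that unconstrained matching, in Python. The function name and the sample data are only illustrative; the sketch ignores the "may not be cheaper than" restriction entirely, so it gives a lower bound on the constrained total rather than a solution to the actual problem.

```python
def cheapest_unconstrained(prices, quantities):
    """Pair the cheapest label with the largest stock, and so on down the line."""
    labels = sorted(prices)                    # ascending prices
    stock = sorted(quantities, reverse=True)   # descending quantities
    return sum(price * qty for price, qty in zip(labels, stock))


# Ten labels priced $1..$10 and ten products with 1..10 units in stock.
lower_bound = cheapest_unconstrained(range(1, 11), range(1, 11))
print(lower_bound)  # 220
```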
Typical values for input will be N,K<10000. In the real-life problem, there are only several distinct price tags (1,2,3,4).
Here's one example of why most simple solutions (including topological sort) won't work:
You have 10 items with the quantities 1 through 10, and 10 price labels with the prices $1 through $10. There is one condition: the item with the quantity 10 must not be cheaper than the item with the quantity 1.
The optimal solution is:
Price, $   1   2   3   4   5   6   7   8   9  10
Qty        9   8   7   6   1  10   5   4   3   2
with a total cost of $249. If you place the 1,10 pair near either extreme, the total cost will be higher.
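The effect of moving the pair can be checked with a short Python sketch. It only tries arrangements where the 1 and 10 items sit on adjacent labels and the other items fill the remaining labels in descending quantity order; that restriction is an assumption made to keep the check small, not a proof that nothing better exists. The helper names are illustrative.

```python
def total_cost(quantities_by_price):
    """Sum of price * quantity; index 0 carries the $1 label, index 1 the $2 label, ..."""
    return sum(price * qty for price, qty in enumerate(quantities_by_price, start=1))


def arrangement_with_pair_at(position):
    """Put the constrained pair (qty 1, then qty 10) on labels `position` and
    `position + 1`, and fill the other labels with quantities 9..2 in descending order."""
    others = list(range(9, 1, -1))  # 9, 8, ..., 2
    return others[:position - 1] + [1, 10] + others[position - 1:]


costs = {p: total_cost(arrangement_with_pair_at(p)) for p in range(1, 10)}
for p, cost in costs.items():
    print(p, arrangement_with_pair_at(p), cost)
```

Under these assumptions the totals run 265, 258, 253, 250, 249, 250, 253, 258, 265 as the pair moves from the cheap end to the expensive end, so the minimum is in the middle.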
The problem is NP-complete in the general case. This can be shown via a reduction from 3-partition (a variant of bin packing that is still strongly NP-complete).
Let w1, ..., wn be the weights of objects of the 3-partition instance, let b be the bin size, and k = n/3 the number of bins that are allowed to be filled. Hence, there is a 3-partition if objects can be partitioned such that there are exactly 3 objects per bin.
For the reduction, we set N=kb and each bin is represented by b price labels of the same price (think of Pi increasing every bth label). Let ti, 1≤i≤k, be the price of the labels corresponding to the ith bin. For each wi we have one product Sj of quantity wi + 1 (let's call this the root product of wi) and another wi - 1 products of quantity 1 which are required to be cheaper than Sj (call these the leaf products).
For ti = (2b + 1)i, 1≤i≤k, there is a 3-partition if and only if Bob can sell for 2b·Σ(1≤i≤k) ti:
So, this is the destructive part ;-) However, if the number of different price tags is a constant, you can use dynamic programming to solve it in polynomial time.
This problem resembles many scheduling problems considered in the CS literature. Allow me to restate it as one.
Problem ("nonpreemptive single-machine scheduling with precedence, weights, and general lateness penalties")
Input:
jobs 1, …, n
a "treelike" precedence relation prec on the jobs (Hasse diagram is a forest)
weights w1, …, wn
a nondecreasing lateness penalty function L(t) from {1, …, n} to Z+
Output:
Correspondence: job <=> product; i prec j <=> i must not be priced higher than j; weight <=> quantity; L(t) <=> tth lowest price
When L is linear, there is an efficient polynomial-time algorithm due to Horn [1]. The article is behind a paywall, but the main idea is:
1. For all j, find the connected set of jobs containing only j and its successors whose mean weight is maximum. For example, if n = 6 and the precedence constraints are 1 prec 2 and 2 prec 3 and 2 prec 4 and 4 prec 5, then the sets under consideration for 2 are {2}, {2, 3}, {2, 4}, {2, 3, 4}, {2, 4, 5}, {2, 3, 4, 5}. We actually only need the maximum mean weight, which can be computed bottom up by dynamic programming.
2. Schedule jobs greedily in order of the mean weight of their associated sets.
In CyberShadow's example, we have n = 10 and 1 prec 10 and wj = j and L(t) = t. The values computed in Step 1 are
job 1: 5.5 (mean of 1 and 10)
job 2: 2
job 3: 3
job 4: 4
job 5: 5
job 6: 6
job 7: 7
job 8: 8
job 9: 9
job 10: 10
The optimal order is 9, 8, 7, 6, 1, 10, 5, 4, 3, 2.
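The two steps can be sketched in Python as follows. This is a reading of the description above, not a transcription of Horn's paper: the maximum mean weight is found by repeatedly re-solving "keep a child subtree only if it raises the total of (weight - λ)" until λ stops changing, and the greedy step is interpreted as "always schedule the available job with the highest value". All function names are illustrative, and the precedence relation is assumed to be a forest.

```python
from fractions import Fraction
import heapq


def best_set(j, lam, weights, children):
    """(weight sum, size) of the set rooted at j that maximizes sum(w - lam)."""
    total, size = weights[j], 1
    for c in children.get(j, []):
        s, n = best_set(c, lam, weights, children)
        if s - lam * n > 0:          # keep a child subtree only if it helps
            total, size = total + s, size + n
    return total, size


def max_mean_weight(j, weights, children):
    """Largest mean weight over connected sets made of j and some successors."""
    lam = Fraction(weights[j])
    while True:
        s, n = best_set(j, lam, weights, children)
        if Fraction(s, n) == lam:    # no set beats the current mean
            return lam
        lam = Fraction(s, n)


def schedule(weights, prec):
    """Order jobs by repeatedly taking the available job with the highest value."""
    children, pending = {}, {j: 0 for j in weights}
    for before, after in prec:
        children.setdefault(before, []).append(after)
        pending[after] += 1
    value = {j: max_mean_weight(j, weights, children) for j in weights}
    heap = [(-value[j], j) for j in weights if pending[j] == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        _, j = heapq.heappop(heap)
        order.append(j)
        for c in children.get(j, []):
            pending[c] -= 1
            if pending[c] == 0:
                heapq.heappush(heap, (-value[c], c))
    return order, value


order, value = schedule({j: j for j in range(1, 11)}, [(1, 10)])
print(order)  # [9, 8, 7, 6, 1, 10, 5, 4, 3, 2]
```

Exact fractions are used so that ties such as 5.5 versus 5 are compared without floating-point error. The recursion in best_set is fine for small examples; for deep trees an explicit stack would be safer.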
This algorithm might work well in practice even for a different choice of L, as the proof of optimality uses local improvement. Alternatively, perhaps someone on the CS Theory Stack Exchange will have an idea.
[1] W. A. Horn. Single-Machine Job Sequencing with Treelike Precedence Ordering and Linear Delay Penalties. SIAM Journal on Applied Mathematics, Vol. 23, No. 2 (Sep., 1972), pp. 189–202.
Since I thought the problem was fun, I wrote a model for finding solutions using constraint programming. The model is written in a modelling language called MiniZinc.
include "globals.mzn";
%%% Data declaration
% Number of products
int: n;
% Quantity of stock
array[1..n] of int: stock;
% Number of distinct price labels
int: m;
% Labels
array[1..m] of int: labels;
constraint assert(forall(i,j in 1..m where i < j) (labels[i] < labels[j]),
                  "All labels must be distinct and ordered");
% Quantity of each label
array[1..m] of int: num_labels;
% Number of precedence constraints
int: k;
% Precedence constraints
array[1..k, 1..2] of 1..n: precedences;
%%% Variables
% Price given to product i
array[1..n] of var min(labels)..max(labels): prices :: is_output;
% Objective to minimize
var int: objective :: is_output;
%%% Constraints
% Each label value is used exactly num_labels times
constraint global_cardinality_low_up_closed(prices, labels, num_labels, num_labels);
% Prices respect precedences
constraint forall(i in 1..k) (
    prices[precedences[i, 1]] <= prices[precedences[i, 2]]
);
% Calculate the objective
constraint objective = sum(i in 1..n) (prices[i]*stock[i]);
%%% Find the minimal solution
solve minimize objective;
Data for a problem is given in a separate file.
%%% Data definitions
n = 10;
stock = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
m = 10;
labels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
num_labels = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
k = 1;
precedences = [| 1, 10 |];
The model is fairly naive and straightforward, no fancy stuff. Using the Gecode back-end to solve the example problem, the following output is generated (assuming the model is in model.mzn and the data in data.dzn):
$ mzn2fzn -I/usr/local/share/gecode/mznlib/ model.mzn data.dzn
$ fz -mode stat -threads 0 model.fzn
objective = 265;
prices = array1d(1..10, [1, 10, 9, 8, 7, 6, 5, 4, 3, 2]);
----------
objective = 258;
prices = array1d(1..10, [2, 10, 9, 8, 7, 6, 5, 4, 1, 3]);
----------
objective = 253;
prices = array1d(1..10, [3, 10, 9, 8, 7, 6, 5, 2, 1, 4]);
----------
objective = 250;
prices = array1d(1..10, [4, 10, 9, 8, 7, 6, 3, 2, 1, 5]);
----------
objective = 249;
prices = array1d(1..10, [5, 10, 9, 8, 7, 4, 3, 2, 1, 6]);
----------
==========
%% runtime: 0.027 (27.471000 ms)
%% solvetime: 0.027 (27.166000 ms)
%% solutions: 5
%% variables: 11
%% propagators: 3
%% propagations: 136068
%% nodes: 47341
%% failures: 23666
%% peak depth: 33
%% peak memory: 237 KB
For larger problems it is of course much slower, but the model will typically generate successively better solutions over time.
Posting some thoughts as a community wiki, feel free to edit.
This problem is easier to visualise if you think about the additional constraints as having to lay out or rearrange a set of top-to-bottom trees in such a way that every node must be to the right of its parent (products on the left are cheaper and those on the right are more expensive).
Let's say that two products are conflicting if the first has more stock than the second, and yet the first must not be cheaper than the other (so they are being "pulled" in different directions price-wise). Similarly, a conflicting group of products is one where at least two products are conflicting, and none of its products conflicts with any product outside the group.
We can make a few observations:
The main problem with this algorithm is how to deal with displacement of already-placed constrained pairs. I imagine that simply trying to re-place displaced chains by iterative search might work, but the algorithm already looks too complicated to work right.
If the number of distinct prices is low, you can use a deque (or doubly-linked list) for each distinct price, holding all the items that have that price assigned to them. The deques are ordered from lowest to highest price. Inserting an item into a full deque shifts its last item to the start of the next deque (for the next higher distinct price), and so on for all deques after that.
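A minimal Python sketch of that cascading insert, assuming each price has a fixed number of labels (the `capacities` list) and that items are plain values. The function name and the sample data are made up for illustration.

```python
from collections import deque


def insert_with_cascade(buckets, capacities, b, pos, item):
    """Insert `item` at index `pos` of bucket `b`, then push any overflow
    from the end of each bucket to the front of the next (pricier) one."""
    buckets[b].insert(pos, item)
    while len(buckets[b]) > capacities[b]:
        if b + 1 == len(buckets):
            raise OverflowError("no free label slot left")
        buckets[b + 1].appendleft(buckets[b].pop())
        b += 1


# Three distinct prices with two labels each; the last bucket has one free slot.
buckets = [deque(["a", "b"]), deque(["c", "d"]), deque(["e"])]
capacities = [2, 2, 2]
insert_with_cascade(buckets, capacities, 0, 0, "x")
print([list(bucket) for bucket in buckets])  # [['x', 'a'], ['b', 'c'], ['d', 'e']]
```

Each cascade step is O(1), so an insert costs at most one step per distinct price after the insertion point, plus the cost of the initial `insert` inside the first deque.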
One thing to note about iterative / bubble-sort-ish algorithms: when you have a conflicting pair of products, it is not enough to greedily walk in either direction one position at a time until the next step no longer yields an improvement. Here is a test case I found while playing around with a test case generator I wrote in Mathematica:
Price, $   1   2   7   9
Qty        3   2   1   4
The constraint is to have the 4-qty item to the right of the 1-qty item. As shown above, the total price is $50. If you move the pair one position to the left (so it's 3 1 4 2), the total goes up to $51, but if you go once further (1 4 3 2) it goes down to $48.
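The three totals can be reproduced with a few lines of Python; the variable names are only illustrative.

```python
PRICES = [1, 2, 7, 9]


def total_cost(quantities):
    """Total price when quantities[i] units carry the label PRICES[i]."""
    return sum(price * qty for price, qty in zip(PRICES, quantities))


# The starting arrangement, then the (1, 4) pair moved left once and twice.
arrangements = [[3, 2, 1, 4], [3, 1, 4, 2], [1, 4, 3, 2]]
costs = [total_cost(q) for q in arrangements]
print(costs)  # [50, 51, 48]
```

A search that stops at the first non-improving step would stay at $50 and never reach the $48 arrangement.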