The problem "specification":
It's Christmas! You have to buy presents!
You have a set of already existing bundles of toys, and the corresponding price of the bundle:
1 0 0 1 0 1 1 1 0 => 58
0 1 0 0 1 1 1 0 0 => 27
1 1 1 0 0 0 1 0 0 => 46
0 0 0 0 1 1 1 1 0 => 73
...
Where a 1
indicates that the toy is in the bundle, and a 0
that it is not.
Now, Santa Claus promo is coming and a leftover bundle X
is offered to you at a "special promo price". We will say X
is a bad deal if there exists another bundle Y
so that:
Edit: to make it easier, I dropped condition 3 but changed condition 1 from "subset" into "strict subset"
X
is a strict subset of Y
X
is more expensive than Y
The objective is to implement a function bool isBadSubset(X)
which finds out efficiently whether X
is good or not.
Given that there are millions of bundles, comparing it to each one is obviously not feasible. Moreover, you can take the assumption that in the existing collection of bundles, a subset of toys is always cheaper than a superset.
Tips:
N
more toys and cheaper. However, the list might still be big.The challenge: is it possible to achieve this in constant time? (independent of the amount of bundles presently in the collection) ...or at least in log(n)?
I managed to find some relevant literature on a quick search, and it seems that in the general case your problem is not easy.
Charikar, Indyk and Panigrahy (2002) study the subset query problem: given a set P of N subsets of some universe U of m elements, and a query set Q, is there a set in P that is a superset of Q ? They present two algorithms which trade of storage space for query speed. In order to achieve O(N/k) query time they need to increase space usage by a factor exponential in the square root of k.
Bevc and Sannik (2010) describe a simple trie-based data structure for subset queries with no analysis of query speed, but it seems clear that it is worst-case linear in the number N of stored sets.
Asking for a solution in O(1) is unrealistic, I think. The only solution I could think of would be generating a full list of bundles and for each indicating whether it is a good one or not... I doubt this is what you are asking for.
A simple binary search might prove interesting though, not even going down into the details of which toys are concerned, we can simply index on the price and the number of items. Our item is a bad bundle if there exists another with a lower price and a higher number of items.
We can therefore define a key (price, nb items)
and order them efficiently. The search will be O(n log n)
, and then the inclusion tests on the subset will still be linear.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With