Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

looking for efficient algorithm (non-trivial) [closed]

The problem "specification":

It's Christmas! You have to buy presents!

You have a set of already existing bundles of toys, and the corresponding price of the bundle:

1 0 0 1 0 1 1 1 0 => 58
0 1 0 0 1 1 1 0 0 => 27
1 1 1 0 0 0 1 0 0 => 46
0 0 0 0 1 1 1 1 0 => 73
...

Where a 1 indicates that the toy is in the bundle, and a 0 that it is not.

Now, Santa Claus promo is coming and a leftover bundle X is offered to you at a "special promo price". We will say X is a bad deal if there exists another bundle Y so that:

Edit: to make it easier, I dropped condition 3 but changed condition 1 from "subset" into "strict subset"

  1. X is a strict subset of Y
  2. X is more expensive than Y

The objective is to implement a function bool isBadSubset(X) which finds out efficiently whether X is good or not.

Given that there are millions of bundles, comparing it to each one is obviously not feasible. Moreover, you can take the assumption that in the existing collection of bundles, a subset of toys is always cheaper than a superset.

Tips:

  • comparing whether a set is a subset or not of another set is easy
  • one may limit the comparisons to set which are both containing at least N more toys and cheaper. However, the list might still be big.
  • something in the direction of a sieve would be good
  • you don't need to know which bundle is better ...just that there exists one which is better

The challenge: is it possible to achieve this in constant time? (independent of the amount of bundles presently in the collection) ...or at least in log(n)?

like image 511
dagnelies Avatar asked Sep 11 '11 09:09

dagnelies


2 Answers

I managed to find some relevant literature on a quick search, and it seems that in the general case your problem is not easy.

Charikar, Indyk and Panigrahy (2002) study the subset query problem: given a set P of N subsets of some universe U of m elements, and a query set Q, is there a set in P that is a superset of Q ? They present two algorithms which trade of storage space for query speed. In order to achieve O(N/k) query time they need to increase space usage by a factor exponential in the square root of k.

Bevc and Sannik (2010) describe a simple trie-based data structure for subset queries with no analysis of query speed, but it seems clear that it is worst-case linear in the number N of stored sets.

like image 136
han Avatar answered Nov 15 '22 09:11

han


Asking for a solution in O(1) is unrealistic, I think. The only solution I could think of would be generating a full list of bundles and for each indicating whether it is a good one or not... I doubt this is what you are asking for.

A simple binary search might prove interesting though, not even going down into the details of which toys are concerned, we can simply index on the price and the number of items. Our item is a bad bundle if there exists another with a lower price and a higher number of items.

We can therefore define a key (price, nb items) and order them efficiently. The search will be O(n log n), and then the inclusion tests on the subset will still be linear.

like image 26
Matthieu M. Avatar answered Nov 15 '22 09:11

Matthieu M.