Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating in-order constrained sets

First I will paste the scenario and then pose my question:

Suppose you have a list of Categories, for example:

Food,Meat,Dairy,Fruit,Vegetable,Grain,Wheat,Barley

Now you have a list of items that fits into one or more of the categories listed above.

Here is a sample list of items: Pudding,Cheese,Milk,Chicken,Barley,Bread,Couscous,Fish,Apple,Tomato, Banana,Grape,Lamb,Roast,Honey,Potato,Rice,Beans,Legume,Barley Soup

As you see every item fits into at least one category, it could fit into more, or possibly all but the minimum is always one.

For example Cheese is a Food and Dairy.

Each item has two attributes:
1) A Price Tag
2) A Random Value

A set is defined as having every category mapped to an item.
In other words all categories must be present in a set.

A set from the items above could be:

[Pudding,Lamb,Milk,Apple,Tomato,Legume,Bread,Barley Soup]

As you see each item is mapped to a category slot:

  • Pudding is mapped to Food Category
  • Lamb is mapped to Meat Category
  • Milk is mapped to Dairy Category
  • Apple is mapped to Fruit Category
  • Tomato is mapped to Vegetable Category
  • Legume is mapped to Grain Category
  • Bread is mapped to Wheat Category
  • Barley Soup is mapped to Barley Category

My question is, what is the most efficient algorithm for generating in-order sets of the above categories from a list of items given.

The best set is defined as having the highest Random Value in total.

The only constraint is that any generated set cannot, in total, exceed a certain fixed amount, in other words, all generated sets should be within this Price Cap.

Hope I am clear, thank you!

like image 609
Moshe Rabaev Avatar asked Dec 27 '18 04:12

Moshe Rabaev


Video Answer


1 Answers

What you are trying to achieve is a form of maximal matching, and I don't know if there is an efficient way to compute in-order sets, but still this reduction might help you.

Define a bipartite graph with on one side one node per category, and on the other side one node per item. Add an edge between an item and a category if that items belongs in that category, with a weight defined by the random value of the item.

A "set" as you defined it is a maximum-cardinality matching in that graph. They can be enumerated in reasonable time, as proved by Takeaki Uno in "A Fast Algorithm for Enumerating Non-Bipartite Maximal Matchings", and it is likely to be even faster in your situation because your graph is bipartite.

Among those sets, you are looking for the ones with maximal weight and under a price constraint. Depending on your data, it may be enough to just enumerate them all, filter them based on the price, and sort the remaining results if there are not too many.

If that is not the case, then you may find the best set by solving the combinatorial optimization problem whose objective function is the total weight, and whose constraints are the price limit and the cardinal (known as maximum-weighted matching in the litterature). There may even be solvers already online once you write the problem in this form. However, this will only provide one such set rather than a sorted list, but as this problem is very hard in the general case, this is the best you can hope for. You would need more assumptions on your data to have better results (like bounds on the maximum number of such sets, the maximum number of items that can belong to more than k categories, etc.)

like image 88
Robindar Avatar answered Nov 15 '22 06:11

Robindar