 

Efficient structure for storing pareto optimal solutions

I am trying to solve a problem which requires storing Pareto optimal solutions during the calculation. I will call a set of Pareto optimal solutions a Bag.

So far I had only two criteria, which allowed for a quite efficient solution based on an array in which elements were sorted in descending order according to the first criterion and in ascending order according to the second. An example of such an array would be:

[(100, 0), (50, 1), (-10, 3)]

(about Pareto optimality - wiki)
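For concreteness, a minimal C++ sketch of this two-criteria scheme (the names `Pair` and `insert2d` are mine, not from the original post): the vector is kept sorted descending by the first criterion, which, for mutually non-dominating pairs, makes the second criterion ascend automatically.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;

// Bag kept sorted descending by .first; for mutually non-dominating
// pairs the .second values are then automatically ascending.
void insert2d(std::vector<Pair>& bag, Pair p) {
    // First position whose .first is <= p.first.
    auto pos = std::lower_bound(bag.begin(), bag.end(), p,
        [](const Pair& a, const Pair& b) { return a.first > b.first; });
    // Everything before pos has .first > p.first; the largest .second
    // among those elements sits immediately before pos.
    if (pos != bag.begin() && std::prev(pos)->second >= p.second)
        return;  // p is dominated by an existing element
    // At most one element can share p.first; if its .second is >= ours,
    // p is dominated by (or equal to) it.
    if (pos != bag.end() && pos->first == p.first && pos->second >= p.second)
        return;
    // Erase the contiguous run of elements dominated by p, then insert p.
    auto last = pos;
    while (last != bag.end() && last->second <= p.second) ++last;
    pos = bag.erase(pos, last);
    bag.insert(pos, p);
}
```

Both dominance checks and the erased run are found via the sort order, so only the actually dominated elements are touched.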

Recently, however, I found out that I need to add a third criterion, and for such an extension the above approach doesn't seem to be applicable. I tried to google whether someone has already solved this, but found nothing satisfactory. Perhaps I was asking Google the wrong question.

To be more precise about what I need: a structure capable of storing mutually non-dominating, Pareto optimal elements. I need to insert elements into the structure, and I need to iterate over the elements, but in no particular order. In my case there usually won't be more than 4-5 elements, but sometimes up to 10-20. Insertions into the bag happen VERY often in my algorithm, so I need them to be as fast as possible.

The application is written in C++, but it is probably not really relevant.

Any help would be much appreciated.

Edit: I already had some ideas of my own - arranging the elements into some sort of triangular structure - but I am quite unable to formalize that idea.

Edit2: Please note that I require that, after each insertion, only mutually non-dominating elements remain in the structure. For example, if I have the set of non-dominating triples {(1,2,3), (3,1,1)} and add the triple (3,3,3), I will get the set {(3,3,3)}.

Edit3: To be more clear about dominance of elements - we say, in this particular case, that a triple (a,b,c) dominates (e,f,g) if and only if a >= e && b >= f && c >= g and at least one of the inequalities is strict (>).
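In C++ this dominance condition can be written down directly (`Triple` and `dominates` are illustrative names, not from the original post):

```cpp
#include <array>

using Triple = std::array<int, 3>;

// (a,b,c) dominates (e,f,g) iff every component is >= the corresponding
// one and at least one comparison is strict.
bool dominates(const Triple& x, const Triple& y) {
    bool allGE = x[0] >= y[0] && x[1] >= y[1] && x[2] >= y[2];
    bool oneGT = x[0] > y[0] || x[1] > y[1] || x[2] > y[2];
    return allGE && oneGT;
}
```

Note that an element never dominates itself, since no comparison is strict.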

Jendas asked Oct 19 '22

2 Answers

The trivial approach first, so we can see what we are trying to improve, and to validate that this answer is actually relevant to the problem.

// The dominance condition, taken from the question.
let x_doms_y x y =
    let (a,b,c) = x
    let (e,f,g) = y
    a >= e && b >= f && c >= g &&
    (a > e || b > f || c > g)

The naive approach requires O(n) dominance tests in order to filter out the existing elements in the data set which are dominated by the new item. The following shows the naive O(n) solution, which we subsequently try to improve.

type Triplet = int * int * int
type TripletSet = Triplet list

module Naive =
    let insert (t : Triplet) (tset : TripletSet) : TripletSet =
        // Reject t if an existing element dominates it; otherwise
        // keep only the elements that t does not dominate.
        if tset |> List.exists (fun u -> x_doms_y u t) then tset
        else t :: (tset |> List.filter (fun u -> not (x_doms_y t u)))

Starting with an empty list, then subsequently adding one triplet after the next yields:

let foo =
    [] |> show
    |> Naive.insert (1,2,3) |> show
    |> Naive.insert (3,1,1) |> show
    |> Naive.insert (3,3,3) |> show

> []
  [(1, 2, 3)]
  [(3, 1, 1); (1, 2, 3)]
  [(3, 3, 3)]

This appears to meet the expectations.

To improve the speed, we try to reduce the number of dominance comparisons to a value below n, on average at least. (The insertion cost of the chosen data structure is not considered here, but it can also be relevant.)

The problem can be interpreted in a geometric sense: a triplet, e.g. (1,2,3), can be seen as the diagonal of an axis-aligned box with one corner at the origin (0,0,0) and the opposite corner at the triplet itself.

Can a box with smaller volume ever dominate one with larger volume? The answer is no, provided all coordinates are non-negative. We can see this by analogy in one dimension: if x < y, then x cannot dominate y, because dominance would require x >= y. (With negative coordinates the volume argument breaks down; a monotone key such as the sum a + b + c works in general, since dominance implies a strictly larger sum.)

A similar equivalence can be conceived for two dimensions, and it holds in the same sense for our triplets.

This narrows our search space: the items in the existing set which have a smaller volume than the new triplet can be, but need not be, dominated by the new triplet. Those which have a larger volume cannot be dominated by it.

Hence, an improved approach would be:

  • Let qset be the sorted sequence of quadruples already inserted.
  • Let vt be the volume of the new triplet t = (a,b,c): vt = a * b * c.
  • Let qt = (vt, a, b, c).
  • Use binary search to find the index pos in qset, by volume.
  • All quadruples left of pos (with v < vt) are candidates for filtering.
  • All quadruples to the right of pos cannot be dominated, as they are bigger.
  • So we only have to apply the naive filter to the sub-sequence qset[0..pos-1]. If the inserted values are random with respect to this ordering, on average we only filter n/2 elements, where n is the size of qset.
  • Insert qt at pos into qset and return qset.
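A sketch of these steps in C++, under the assumption of strictly positive coordinates (so that the volume a * b * c is strictly monotone under dominance). The names `Entry` and `insert` are mine; unlike the bullet list above, this version also rejects a new triplet that is itself dominated, so the set stays mutually non-dominating.

```cpp
#include <algorithm>
#include <array>
#include <vector>

using Triple = std::array<int, 3>;

// The dominance condition from the question.
bool dominates(const Triple& x, const Triple& y) {
    return x[0] >= y[0] && x[1] >= y[1] && x[2] >= y[2] &&
           (x[0] > y[0] || x[1] > y[1] || x[2] > y[2]);
}

struct Entry {
    long long volume;  // a * b * c, the monotone sort key
    Triple t;
};

// Bag kept sorted ascending by volume.
void insert(std::vector<Entry>& bag, const Triple& t) {
    long long v = 1LL * t[0] * t[1] * t[2];
    // Binary search: first entry with volume >= v.
    auto pos = std::lower_bound(bag.begin(), bag.end(), v,
        [](const Entry& e, long long key) { return e.volume < key; });
    // Only entries with volume >= v can dominate t.
    if (std::any_of(pos, bag.end(),
                    [&](const Entry& e) { return dominates(e.t, t); }))
        return;
    // Only entries with volume < v can be dominated by t; filter them out.
    auto kept = std::remove_if(bag.begin(), pos,
        [&](const Entry& e) { return dominates(t, e.t); });
    pos = bag.erase(kept, pos);
    bag.insert(pos, Entry{v, t});
}
```

With random inputs, roughly half the entries fall below the binary-search position, so on average about n/2 dominance tests remain instead of n.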
BitTickler answered Nov 15 '22



You could, for example, order the elements by their norm (like a*a + b*b + c*c). Then you only need to check the elements with a bigger norm to see if they dominate the new element, and only the elements with a smaller norm to see if they are dominated by the new element. (Note that this assumes non-negative components; otherwise the norm is not monotone under dominance.)

However, I'm not sure ordering your elements is of particular value if you only have so few of them to begin with. Maintaining that order (whatever it is) comes with its own overhead and might very well outweigh any benefit you get in terms of algorithmic complexity. In particular, I'd refrain from anything that involves dynamic per-element allocation and deallocation, like std::list or std::map with standard allocators. Even a heap on an array might not bring a noticeable advantage.

I'd probably just use a plain, unsorted std::vector:

std::vector<Element> frontier;

void insert(const Element& newElement) {
    // Reject newElement if any existing element dominates it.
    if (std::none_of(frontier.begin(), frontier.end(),
                     [&](const auto& e) { return dominates(e, newElement); })) {
        // Drop every existing element that newElement dominates, then append.
        frontier.erase(std::remove_if(frontier.begin(), frontier.end(),
                           [&](const auto& e) { return dominates(newElement, e); }),
                       frontier.end());
        frontier.push_back(newElement);
    }
}
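A self-contained version of the snippet above, where Element is taken to be the triple type from the question and dominates is the condition from Edit3 (both definitions are filled in here as an illustration, not part of the original answer):

```cpp
#include <algorithm>
#include <array>
#include <vector>

using Element = std::array<int, 3>;

// The dominance condition from Edit3 of the question.
bool dominates(const Element& x, const Element& y) {
    return x[0] >= y[0] && x[1] >= y[1] && x[2] >= y[2] &&
           (x[0] > y[0] || x[1] > y[1] || x[2] > y[2]);
}

std::vector<Element> frontier;

void insert(const Element& newElement) {
    // Reject newElement if any existing element dominates it.
    if (std::none_of(frontier.begin(), frontier.end(),
                     [&](const auto& e) { return dominates(e, newElement); })) {
        // Drop every existing element that newElement dominates, then append.
        frontier.erase(std::remove_if(frontier.begin(), frontier.end(),
                           [&](const auto& e) { return dominates(newElement, e); }),
                       frontier.end());
        frontier.push_back(newElement);
    }
}
```

Inserting (1,2,3), (3,1,1) and then (3,3,3) leaves only (3,3,3) in the frontier, matching the example from Edit2.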
MikeMB answered Nov 15 '22