Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Balanced partition of iterator range without LegacyRandomAccessIterator

Tags:

c++

Let's start with a simple example:

std::vector<int> foo;
... // fill it
auto begin = foo.begin();
auto end = foo.end();
auto middle = begin + std::distance(begin, end) / 2;

If the iterators have the LegacyRandomAccessIterator trait, ideal bi-partition of the iterator range is embarrassingly easy, and so is therefor any form of divide-and-conquer type algorithm. This pattern is commonly found in the STL implementations too.

If they provide only the LegacyForwardIterator trait, only thing we can do in constant time is:

std::set<int> foo;
... // fill it
auto begin = foo.begin();
auto end = foo.end();
auto middle = (begin != end) ? ++(begin) : begin;

No chance of processing the iterator range in an even remotely balanced way without a linear scan.

It's easy to understand that for most containers not providing LegacyRandomAccessIterator, an ideal partition scheme can't be for free. However, most containers would still be able to provide a partitioning scheme better than 1:n-1 on average at a constant cost.

Any container based on a self balancing tree would be trivially able to guarantee a worst-case ratio based on intrinsic implementation details. Any hash table - be it bucket or multiple round based - still has a high chance of providing a good partition, just by slicing the hash table itself in half.

Only imbalanced trees or plain lists are by design unsuitable.

This deficit also shows in STL internal algorithms depending on a somewhat balanced division, e.g. like std::for_each(std::execution::par_unseq, ...). Even though most STL containers in about all implementations have implementation details which allow constant time partition into remotely balanced chunks, I have yet to see any STL implementation handling anything except LegacyRandomAccessIterator better than pure single element round-robin. Commonly resulting in false-sharing of cache lines, and synchronization overhead far beyond any practical value.

This begs the question, how come that the C++ language provides no interface to request a better partition scheme for any STL containers? Or is there one I am just not aware of?

like image 511
Ext3h Avatar asked Jun 10 '20 14:06

Ext3h


1 Answers

While this question makes sense in a vacuum, I couldn't find a container where it would matter.

  1. All associate containers are already inherently sorted and already provides specialized accessors and api's.

  2. We can ignore sequence containers that provide random access since we are not concerned with those.

  3. That leaves just std::list and std::forward_list. Both do not have the layout neccesary for providing a balanced division.

So there is no container left where this issue pops up, at least not in the standard library. Unless, perhaps, you are talking about using eg. std::lower_bound with eg. a std::map (perhaps for some generic purpose) ?


Concerning the case std::for_each(std::execution::par_unseq, ...) working on a std::set:

This is specialized case. There is choice even in this case and theres a tradeoff in performance (Remember that the std::set implementation are strictly private - there is not a guarentee that it is implemented with RB tree and therefore doesn't neccesarily know the median/precisely divisions). And without a precise division you could be worse off for a lot of use cases for processing, Im guessing. Therefore, you case/need is too specialized for the standard library as it stands and you should probably just implement a specialized for_each that handles you actual need. Or try to use a sorted std::vector instead (wrether you keep it sorted or sort it just when needed can make or break the performance), ie. if you don't have too many inserts in the middle and needs it sorted throughout - you could experiment with the std::deque as well.

That being said, it is possible that this idea should be added into a new iterator concept in the future, but I guess no one really gave it any thought as of yet.

like image 181
darune Avatar answered Oct 19 '22 18:10

darune