Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Boost any_range vs. "canonical form" - what is the latter?

Boost's any_range documentation says the following:

Despite the underlying any_iterator being the fastest available implementation, the performance overhead of any_range is still appreciable due to the cost of virtual function calls required to implement increment, decrement, advance, equal etc. Frequently a better design choice is to convert to a canonical form.

What does the author mean by a "canonical form"? Can someone give an example?

EDIT: As suggested here, I asked the same question on the boost users' mailing list. Here is what Neil Groves, the original author of this text had to say:

For example, copying the range into a vector.

Yes, this is exactly the alternative design that I had in mind when writing the documentation. The overhead of iterating over an any_range is quite considerable, and often compares poorly with copying a concrete result-type into a container such as a vector. However, this is not always the case and some of the users of Boost.Range have desired the ability to implement algorithms that operate upon any_range instances. This is sometimes desirable to allow, for example, exposure of algorithms from a shared library that supports various containers. The use of any_range may also make sense where the number of passes over the range are small, but the memory size of the underlying container is very large.

In many cases, the performance overhead will not matter. I wanted to ensure that I did not mislead anyone into the widespread adoption of any_range usage. I believe that the valid usages for this class are few, but sometimes it is exactly the correct design choice. I shall improve the documentation with some additional clarification and examples in due course.

like image 498
paperjam Avatar asked Apr 08 '11 22:04

paperjam


1 Answers

I think they mean converting your range to an std::vector, or whatever the standard container is in your project, and returning an iterator into that.

The tradeoff is between the cost of making a copy of your range from the original range type to the canonical container type, and the cost of heap allocations and virtual function calls associated with the type erasure used to implement any_range. Depending on how many elements are in the range, how large each element is, and how many passes will be made over the range, one option may be better than the other.

like image 194
HighCommander4 Avatar answered Sep 30 '22 00:09

HighCommander4