I'd like to use a std::pmr::unordered_map
with a std::pmr::monotonic_buffer_resource
. The two fit well together, because the set's nodes are stable, so I don't create a lot of holes in the buffer resource by reallocation:
std::pmr::monotonic_buffer_resource res;
std::pmr::unordered_set<T> set(&res);
That is, except for the bucket list, which needs to be reallocated when the set rehashes as it exceeds the max_load_factor()
. Assuming I can't reserve()
my way out of this, and I actually care about the holes in the buffer resource left by old bucket lists since grown, what are my options?
If I know that unordered_set
is implemented as std::vector<std::forward_list>
, as in (some versions of) MSVC, then I should be able to use a scoped_allocator
to give different allocators for the vector
and the forward_list
. But a) I can't rely on unordered_set
being a vector<forward_list>
in portable code and b) scoped_allocator
is an Allocator
whereas monotonic_buffer_resource
is a memory_resource
, an impedance mismatch that will make for very complicated initialization.
Or I could write a switch_memory_resource
that delegates to other memory_resource
s based on the size of the request. I could then use a monotonic_buffer_resource
for requests that match the size of the nodes (which, however, I cannot, portably, know, either) and default_memory_resource()
for everything else. I could probably make an educated guess that the nodes are at most sizeof(struct {void* next; size_t hash; T value;})
, add some error margin by multiplying that by two and use that as the cut-off between the two memory_resource
s, but I wonder whether there's a cleaner way?
Keys are immutable, therefore, the elements in an unordered_set cannot be modified once in the container - However, they can be inserted and removed. Unordered sets do not allow duplicates and are initialized using comma-delimited values enclosed in curly braces.
Unordered sets are containers that store unique elements in no particular order, and which allow for fast retrieval of individual elements based on their value. In an unordered_set, the value of an element is at the same time its key, that identifies it uniquely.
Comparing the timings for std::set , std::unordered_set with the original hash, and std::unordered_set with the extra hash, we see that for sufficiently large number of elements, std::unordered_set actually becomes faster than std::set : Another solution would be not to use a power of two for the number of buckets.
unordered_set only contains keys, and no values. There is no mapping from a key to a value, so no need for an operator[] . unordered_map maps a key to a value.
Unordered sets are containers that store unique elements in no particular order, and which allow for fast retrieval of individual elements based on their value. In an unordered_set, the value of an element is at the same time its key, that identifies it uniquely.
If you want to look at implementation details of set and unordered_set in c++ STL, see Set Vs Map. Set allows to traverse elements in sorted order whereas Unordered_set doesn’t allow to traverse elements in sorted order. Set can be modified to find predecessor or successor whereas Unordered_set doesn’t allow to find predecessor/Successor.
The HTML <ul> tag defines an unordered (bulleted) list. An unordered list starts with the <ul> tag. Each list item starts with the <li> tag. The CSS list-style-type property is used to define the style of the list item marker.
bucket_size () – Returns the total number of elements present in a specific bucket in an unordered_set container. erase () – Remove either a single element or a range of elements ranging from start (inclusive) to end (exclusive). size () – Return the number of elements in the unordered_set container.
The small number of concrete resource types that I proposed a number of years ago and that were adopted into C++17 was a minimalist set of useful allocators. As evidenced by your question, they do not provide optimal behavior for every circumstance. There are not many tuning dials and I have some regrets about missing functionality, but they are still useful for most cases.
For your specific situation, you say "Assuming I can't reserve()
my way out of this, and I actually care about the holes in the buffer resource left by old bucket lists since grown." I'm not sure any general allocator can help you. The geometric growth of the bucket list will leave holes in any allocation strategy. The question is whether those holes can be re-used and/or minimized. As you point out, only a very-carefully customized allocator for the very specific situation will minimize these holes. But maybe your assumptions are too strong.
Consider a std::pmr::vector<int>
. This is the worst-case scenario for a monotonic_buffer_resource
because every reallocation results in leaked memory. And yet, even this case has a worst-case memory waste of only 50%; i.e., it will never use more than twice as much memory as it would with a resource that perfectly reuses memory blocks. Granted, 50% can be pretty bad, but in your scenario, we are talking much, much less. For a reasonably large set, the bucket list is small compared to the buckets and the data itself, and you can use reserve
to minimize reallocation. So my first piece of advice is to go ahead and use the monotonic_buffer_resource
without alteration, and measure to see if you have unacceptable memory use. A second experiment would be to use an unsynchronized_pool_resource
backed by an (upstream) monotonic_buffer_resource
.
If you decide you want to create a custom resource for this purpose, which might be fruitful and might even be fun, your approach of choosing some lower threshold for passing to the monotonic allocator would probably work and would not actually be a lot of effort. You could also consider making it adaptive: Keep a list of the last, say, 4, allocation sizes. If any size gets more than two hits, then assume it is your node size and allocate those nodes from the monotonic resource while other requests get passed directly to the upstream resource. Be careful, however, that you use such a custom allocator only when you know what's going on. If you have a std::pmr::unordered_set<std::pmr::string>
, then both approaches may result in many of the strings getting allocated from the upstream resource and thus losing the benefit of the monotonic buffer. As you can see, being too stingy with memory for your bucket list can backfire in a generic context. You're likely to discover that the unmodified monotonic_buffer_resource
was a better bet.
Good luck, and please report your findings back here. Also, if you have an idea for a general-purpose resource that could address your problem (or any other common allocation problem), I'd love to hear about it. There's certainly room in the standard for a few more useful resource types.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With