Floating point keys in std::map

Tags:

c++

floating-point

stl

The following code is supposed to find the key 3.0, which exists in a std::map. Due to floating point precision, however, it is never found.

    map<double, double> mymap;
    mymap[3.0] = 1.0;

    double t = 0.0;
    for(int i = 0; i < 31; i++)
    {
      t += 0.1;
      bool contains = (mymap.count(t) > 0);
    }

In the above example, contains will always be false. My current workaround is to compute t as 0.1 * i instead of repeatedly adding 0.1, like this:

    for(int i = 0; i < 31; i++)
    {
      t = 0.1 * i;
      bool contains = (mymap.count(t) > 0);
    }

Now the question:

Is there a way to introduce a fuzzyCompare to the std::map if I use double keys? The common solution for floating point comparison is usually something like fabs(a-b) < epsilon. But I don't see a straightforward way to do this with std::map. Do I really have to encapsulate the double type in a class and overload operator<(...) to implement this functionality?

620

asked Jul 13 '11 19:07

pokey909

2 Answers

So there are a few issues with using doubles as keys in a std::map.

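First, NaN is a problem: every ordering comparison involving NaN returns false, so under the default operator< a NaN key looks equivalent to every other key, which breaks the ordering the container relies on. If there is any chance of NaN being inserted, use this: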

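    #include <cmath>  // std::isnan

    struct safe_double_less {
      bool operator()(double left, double right) const {
        bool leftNaN = std::isnan(left);
        bool rightNaN = std::isnan(right);
        // Exactly one side is NaN: the non-NaN side sorts first.
        if (leftNaN != rightNaN)
          return leftNaN<rightNaN;
        // Both NaN (equivalent) or both ordinary doubles.
        return left<right;
      }
    };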

but that may be overly paranoid. Do not, I repeat do not, include an epsilon threshold in the comparison object you pass to a std::map, std::set or the like: this violates the strict weak ordering the container requires and results in undefined behavior.

(I placed NaN as greater than all doubles, including +inf, in my ordering, for no good reason. Less than all doubles would also work).

So either use the default operator<, or the above safe_double_less, or something similar.
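As a minimal sketch of how such a comparator might be plugged in, it goes in as the third template argument of std::map. The function name nan_sorts_last and the sample keys are made up for illustration; the comparator is the one from above.

```cpp
#include <cmath>
#include <iterator>
#include <limits>
#include <map>

// Comparator from the answer: orders NaN after every other double.
struct safe_double_less {
  bool operator()(double left, double right) const {
    bool leftNaN = std::isnan(left);
    bool rightNaN = std::isnan(right);
    if (leftNaN != rightNaN)
      return leftNaN < rightNaN;
    return left < right;
  }
};

// Hypothetical check: inserts a few keys (NaN twice) and reports whether
// the map holds 4 entries, with -2.5 first and NaN last.
bool nan_sorts_last() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double inf = std::numeric_limits<double>::infinity();

  std::map<double, int, safe_double_less> m;
  m.insert({nan, 0});
  m.insert({1.0, 1});
  m.insert({inf, 2});
  m.insert({-2.5, 3});
  m.insert({nan, 4});  // equivalent to the first NaN key, so not inserted

  return m.size() == 4 &&
         m.begin()->first == -2.5 &&
         std::isnan(std::prev(m.end())->first);
}
```

With this ordering all NaN keys collapse into a single equivalence class, so the container's invariants hold even if a NaN slips in.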

Next, I would advise using a std::multimap or std::multiset, because you should expect multiple values for each lookup. You might as well make handling several matches the everyday case instead of a corner case, which increases the test coverage of your code. (I would rarely recommend these containers otherwise.) Plus this blocks operator[], which is not advisable with floating point keys.

The point where you want to use an epsilon is when you query the container. Instead of using the direct interface, create a helper function like this:

    // works on both `const` and non-`const` associative containers:
    template<class Container>
    auto my_equal_range( Container&& container, double target, double epsilon = 0.00001 )
    -> decltype( container.equal_range(target) )
    {
      auto lower = container.lower_bound( target-epsilon );
      auto upper = container.upper_bound( target+epsilon );
      return std::make_pair(lower, upper);
    }

which works on both std::map and std::set (and multi versions).

(In a more modern code base, I'd expect a range<?> object that is a better thing to return from an equal_range function. But for now, I'll make it compatible with equal_range).

This finds a range of things whose keys are "sufficiently close" to the one you are asking for, while the container maintains its ordering guarantees internally and doesn't invoke undefined behavior.

To test for existence of a key, do this:

    template<typename Container>
    bool key_exists( Container const& container, double target, double epsilon = 0.00001 )
    {
      auto range = my_equal_range(container, target, epsilon);
      return range.first != range.second;
    }
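Put together with the loop from the question, the two helpers might be used roughly as follows. This is only a sketch: count_fuzzy_hits is a made-up driver, and the default epsilon of 0.00001 is the one from the helper above, not a recommendation for every data set.

```cpp
#include <map>
#include <utility>

// Range of entries whose keys lie in [target - epsilon, target + epsilon].
template <class Container>
auto my_equal_range(Container&& container, double target, double epsilon = 0.00001)
    -> decltype(container.equal_range(target)) {
  auto lower = container.lower_bound(target - epsilon);
  auto upper = container.upper_bound(target + epsilon);
  return std::make_pair(lower, upper);
}

// True when at least one key is within epsilon of target.
template <typename Container>
bool key_exists(Container const& container, double target, double epsilon = 0.00001) {
  auto range = my_equal_range(container, target, epsilon);
  return range.first != range.second;
}

// Hypothetical driver: repeats the question's loop and counts how often
// the accumulated t is "close enough" to a stored key.
int count_fuzzy_hits() {
  std::map<double, double> mymap;
  mymap[3.0] = 1.0;

  int hits = 0;
  double t = 0.0;
  for (int i = 0; i < 31; i++) {
    t += 0.1;  // accumulates rounding error, so t is only approximately 3.0
    if (key_exists(mymap, t))
      ++hits;
  }
  return hits;
}
```

The map itself still uses the plain operator< ordering; only the query widens the search window, so the accumulated value near 3.0 is matched once.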

and if you want to delete/replace entries, you should deal with the possibility that there might be more than one entry hit.
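One way to handle several hits when deleting is to erase the whole range at once. The sketch below assumes a std::multimap and a made-up helper named erase_near; it uses the same closed window of target plus or minus epsilon as the lookup helper.

```cpp
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical helper: erases every entry whose key lies in
// [target - epsilon, target + epsilon] and returns how many were removed.
template <class Container>
std::size_t erase_near(Container& container, double target, double epsilon = 0.00001) {
  auto lower = container.lower_bound(target - epsilon);
  auto upper = container.upper_bound(target + epsilon);
  auto removed = static_cast<std::size_t>(std::distance(lower, upper));
  container.erase(lower, upper);
  return removed;
}

// Three keys sit within epsilon of 3.0; the key 3.1 does not.
std::size_t demo_erase_near() {
  std::multimap<double, int> readings;
  readings.insert({3.0, 1});
  readings.insert({3.0 + 1e-9, 2});
  readings.insert({3.0 - 1e-9, 3});
  readings.insert({3.1, 4});
  return erase_near(readings, 3.0);
}
```

Replacing works the same way: erase the range first, then insert the single new entry.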

The shorter answer is "don't use floating point values as keys for std::set and std::map", because it is a bit of a hassle.
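If the keys are known to be multiples of a fixed step, one alternative (not part of the original answer, just a sketch) is to key the map on an integer step count instead of the double itself. The helper to_ticks and the step of 0.1 are assumptions chosen to match the question's loop.

```cpp
#include <cmath>
#include <map>

// Hypothetical helper: converts a value to the nearest whole number of steps.
// Assumes every key is meant to be a multiple of `step`.
long long to_ticks(double value, double step = 0.1) {
  return std::llround(value / step);
}

// Repeats the question's loop with integer keys and counts the matches.
int count_tick_hits() {
  std::map<long long, double> mymap;
  mymap[to_ticks(3.0)] = 1.0;  // stored under key 30

  int hits = 0;
  double t = 0.0;
  for (int i = 0; i < 31; i++) {
    t += 0.1;
    if (mymap.count(to_ticks(t)) > 0)
      ++hits;
  }
  return hits;
}
```

Integer keys compare exactly, so find and operator[] are safe again; the rounding happens once, at the boundary where a double is turned into a key.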

If you do use floating point keys for std::set or std::map, almost never do a .find or a [] on them, as that is highly likely to be a source of bugs. You can use it for an automatically sorted collection of stuff, so long as exact order doesn't matter (i.e., whether one particular 1.0 is ahead of, behind, or in exactly the same spot as another 1.0). Even then, I'd go with a multimap/multiset, because I would not want to depend on collisions happening or not happening.

Reasoning about the exact value of IEEE floating point values is difficult, and fragility of code relying on it is common.

106

answered Oct 23 '22 22:10

Yakk - Adam Nevraumont

Here's a simplified example of how using soft-compare (aka epsilon or almost equal) can lead to problems.

Let epsilon = 2 for simplicity. Put 1 and 4 into your map. It now might look like this:

    1
     \
      4

So 1 is the tree root.

Now put in the numbers 2, 3, 4 in that order. Each will replace the root, because it compares equal to it. So then you have

    4
     \
      4

which is already broken. (Assume no attempt to rebalance the tree is made.) We can keep going with 5, 6, 7:

    7
     \
      4

and this is even more broken, because now if we ask whether 4 is in there, it will say "no", and if we ask for an iterator for values less than 7, it won't include 4.
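The underlying reason can be checked without touching a container at all: with a fuzzy comparator, "equivalent" is no longer transitive, which an ordered container requires. The sketch below uses a made-up fuzzy_less comparator with epsilon = 2 and deliberately never passes it to a std::map, since doing so would be undefined behavior.

```cpp
// Hypothetical fuzzy comparator: a is "less" than b only when it is
// smaller by more than epsilon, so nearby values compare equivalent.
struct fuzzy_less {
  double epsilon;
  bool operator()(double a, double b) const { return a < b - epsilon; }
};

// An ordered container treats two keys as equivalent when neither is
// less than the other.
bool equivalent(const fuzzy_less& cmp, double a, double b) {
  return !cmp(a, b) && !cmp(b, a);
}

// With epsilon = 2: 1 ~ 2.5 and 2.5 ~ 4, yet 1 and 4 are distinct.
bool equivalence_is_not_transitive() {
  fuzzy_less cmp{2.0};
  return equivalent(cmp, 1.0, 2.5) &&
         equivalent(cmp, 2.5, 4.0) &&
         !equivalent(cmp, 1.0, 4.0);
}
```

Because 1 and 4 are both equivalent to 2.5 but not to each other, there is no consistent position for 2.5 in the tree, which is what the diagrams above illustrate.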

Though I must say that I've used maps based on this flawed fuzzy compare operator numerous times in the past, and whenever I dug up a bug, it was never due to this. This is because the datasets in my application areas never actually stress-test this problem.

answered Oct 23 '22 21:10

Evgeni Sergeev
