Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it safe to use floats as keys of hashtables?

I need to store pairs of float,int in which the int value stores the number of occurrences of a float value inside a model I'm using for a tool I'm developing and I was wondering if it's safe to do such things..

Finite precision should be an issue when talking floats used to direct comparisons (or as content to be hashed) so I think that a similar approach is discouraged, am I right?

Actually the problem is that I don't have any other information coupled with these floats so I simply can't use anything else as a key for the hashtable but in the same time, since keys will be many, having a good performance would be nice.

Maybe the best solution is to use a binary search tree (or an even more advanced data structure) to obtain at least an average case of O(logn) also if a constant factor would be better.

Do you have any suggestion? Just to let you know I'm developing in OCaml but I think that these considerations can be considered language agnostic

like image 945
Jack Avatar asked Aug 03 '10 17:08

Jack


2 Answers

The usual problem with floating-point numbers is that calculations are approximate. If you calculate the same value in two different ways, the results are likely to be very slightly different. (In some cases, you can get slight differences by calculating the same value twice the same way.)

Therefore, if you're doing any calculations on floats, you will get approximations, and shouldn't rely on equality. If your source was calculating floats in various ways, the data passed to you will be approximate. If you are getting exact floating-point values, and can count on any numbers that should be the same being the exact same bit representation, then equality works as normal, and you can use a hash table.

like image 189
David Thornley Avatar answered Sep 28 '22 13:09

David Thornley


I think there are a couple of questions here

Is it safe to use floats as keys to a hash table?

Yes. I can't think of a language right now where floats do not fit the requirements needed for a key in a hash table (typically stable hash code and equality semantics)

Is it OK to have a hash table with lots of keys?

Depends on how many. If the number of keys is so great it causes the table to expand past the allowable memory size then certainly no as it will cause out of memory situations. It's really impossible to answer this part of the question without more context. Likely you're the only one who will be able to answer it.

Does the precision of float make it any worse than other types like int?

This is implementation specific but I believe in OCaml a float has double precision (8 bytes). So asking if the precision makes it invalid as a key is the equivalent of asking is the C# long type unsuitable as a hash table key. They both have the same number of possible values (they're both 8 bytes). I would certainly say long is a valid key type (used it often and there is nothing wrong with it).

I think the real question is are you irresponsibly creating instances of the float to use as a key.

If I'm running out of memory with a hash table will a binary tree be any better?

Possibly but not by a lot. There is overhead involved with both binary trees and hash tables. For hashtables it's typically the unused buckets and the next pointers in the lists inside the buckets. For a binary tree every element in the tree has 2 extra pieces of overhead (the left and right pointer). If you're running out of memory I'm not sure swvitching to a binary tree will be significantly better.

like image 24
JaredPar Avatar answered Sep 28 '22 12:09

JaredPar