What's the best heuristic I can use to identify whether a chunk of X 4-bytes are integers or floats? A human can do this easily, but I wanted to do it programmatically.
I realize that since every combination of bits will result in a valid integer and (almost?) all of them will also result in a valid float, there is no way to know for sure. But I still would like to identify the most likely candidate (which will virtually always be correct; or at least, a human can do it).
For example, let's take a series of 4-bytes raw data and print them as integers first and then as floats:
1 1.4013e-45 10 1.4013e-44 44 6.16571e-44 5000 7.00649e-42 1024 1.43493e-42 0 0 0 0 -5 -nan 11 1.54143e-44
Obviously they will be integers.
Now, another example:
1065353216 1 1084227584 5 1085276160 5.5 1068149391 1.33333 1083179008 4.5 1120403456 100 0 0 -1110651699 -0.1 1195593728 50000
These will obviously be floats.
PS: I'm using C++ but you can answer in any language, pseudo code or just in english.
The "common sense" heuristic from your example seems to basically amount to a range check. If one interpretation is very large (or a tiny fraction, close to zero), that is probably wrong. Check the exponent of the float interpretation and compare it to the exponent that results from a proper static cast of the integer interpretation to a float.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With