Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sorting floating-point values using their byte-representation

If have an 8-byte section of data and write a double-precision floating-point value to it, under what conditions will comparison by numerical comparison and lexicographic sorting of the bytes agree?

Current theory: positive, big-endian

I believe that if the number is positive, and the representation is big-endian, then numerical ordering of the floating-point values will match the lexicographic ordering of the bytes.

The idea is that it would first sort on the exponent, then on the mantissa. Even the "denormalized" IEEE representation shouldn't cause any problems.

Is this true?

(I'm using Node's Buffer::writeDoubleBE, but that shouldn't matter.)

Follow-up

I think a simple modification can extend this to negative numbers: XOR all positive numbers with 0x8000... and negative numbers with 0xffff.... This should flip the sign bit on both (so negative numbers go first), and then reverse the ordering on negative numbers. Does anyone see a problem with this?

like image 611
cloudfeet Avatar asked Apr 08 '17 19:04

cloudfeet


People also ask

How are floating point numbers represented in binary?

This allows the way that the bits are allocated to vary so that both very large and very small numbers can be represented. Binary floating point numbers are expressed in the form mantissa × 2, start superscript, e, x, p, o, n, e, n, t, end superscript,2exponent, e.g. 0, point, 101,0.101 x 2, to the power 4 ,24.

Which data is stored using a floating point representation?

Floating point numbers are used to represent noninteger fractional numbers and are used in most engineering and technical calculations, for example, 3.256, 2.1, and 0.0036. The most commonly used floating point standard is the IEEE standard.

What is the difference between range and accuracy in floating point representation?

Accuracy in floating point representation is governed by number of significant bits, whereas range is limited by exponent. Not all real numbers can exactly be represented in floating point format.

What is the precision of a floating point number?

Similarly, in case of double precision numbers the precision is log (10) (2 52) = 15.654 = 16 decimal digits. Accuracy in floating point representation is governed by number of significant bits, whereas range is limited by exponent. Not all real numbers can exactly be represented in floating point format.

Can all real numbers be represented in floating point format?

Not all real numbers can exactly be represented in floating point format. For any numberwhich is not floating point number, there are two options for floating point approximation, say, the closest floating point number less than x as x_ and the closest floating point number greater than x as x+.

How to convert a decimal to a floating point number?

To convert the decimal into floating point, we have 3 elements in a 32-bit floating point representation: i) Sign (MSB) ii) Exponent (8 bits after MSB) iii) Mantissa (Remaining 23 bits) Sign bitis the first bit of the binary representation. ‘1’ implies negative number and ‘0’ implies positive number.


1 Answers

Your approach:

I think a simple modification can extend this to negative numbers: XOR all positive numbers with 0x8000... and negative numbers with 0xffff.... This should flip the sign bit on both (so negative numbers go first), and then reverse the ordering on negative numbers. Does anyone see a problem with this?

is definitely the answer. Moreover, it was used, for example, in dBase and clones to organize sorting on a float column, and I guess it's followed by newer DB generations.

Also, it is identical to the "total order" according to IEEE-754 for binary representations. (But not for decimal ones, the latter is much more complex.)

UPDATE: as suggested by @Sneftel: you could find replacing -0 with +0 as useful before converting to bit string.

like image 50
Netch Avatar answered Sep 29 '22 12:09

Netch