 

flush-to-zero behavior in floating-point arithmetic

While, as far as I remember, IEEE 754 says nothing about a flush-to-zero mode to handle denormalized numbers faster, some architectures offer this mode (e.g. http://docs.sun.com/source/806-3568/ncg_lib.html ).

In the particular case of this technical documentation, standard handling of denormalized numbers is the default, and flush-to-zero has to be activated explicitly. In the default mode, denormalized numbers are also handled in software, which is slower.

I work on a static analyzer for embedded C which tries to predict correct (if sometimes imprecise) ranges for the values that can occur at run-time. It has to be correct because it is intended to be used to rule out the possibility of something going wrong at run-time (for instance in critical embedded code). This requires capturing all possible behaviors during the analysis, and therefore all possible values produced by floating-point computations.

In this context, my question is twofold:

  1. Among the embedded architectures, are there architectures that offer only flush-to-zero? They would perhaps not have the right to advertise themselves as "IEEE 754", but could offer close-enough IEEE 754-style floating-point operations.

  2. For the architectures that offer both, in an embedded context, isn't flush-to-zero likely to be activated by the system, in order to make the reaction time more predictable (a common constraint for these embedded systems)?

Handling flush-to-zero in the interval arithmetic that I use for floating-point values is simple enough if I know I have to do it. My question is more whether I have to do it.

Pascal Cuoq asked Jan 18 '10 02:01





1 Answer

Yes to both questions. There are platforms that support flush-to-zero only, and there are many platforms where flush-to-zero is the default.

You should also be aware that many embedded and DSP platforms use a "Denormals Are Zero" mode, which is another wrinkle in the floating-point semantics.


Edit: further explanation of FTZ vs. DAZ:

In FTZ, when an operation would produce a denormal result under the usual arithmetic, a zero is returned instead. Note that some implementations always flush to positive zero, whereas others may flush to either positive or negative zero. It's probably best not to depend on either behavior.

In DAZ, when an input to an operation is a denormal, a zero is substituted in its place. Again, there's no general guarantee about which zero will be substituted.
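As a rough illustration of the two definitions above, here is a minimal software model in C, written under stated assumptions rather than taken from any particular platform. The names `is_subnormal`, `flush` and `mul_model` are hypothetical, the host is assumed to run in default IEEE 754 mode with gradual underflow, and the choice to keep the sign of the flushed zero is arbitrary, since implementations differ on that point.

```c
#include <float.h>
#include <stdbool.h>

/* True for nonzero values smaller in magnitude than the smallest normal
   double. NaN compares false on both sides and is left alone. */
bool is_subnormal(double x)
{
    return (x > 0.0 && x < DBL_MIN) || (x < 0.0 && x > -DBL_MIN);
}

/* Replace a subnormal by a zero of the same sign.
   Assumption: real hardware may always produce +0.0 instead. */
double flush(double x)
{
    if (!is_subnormal(x))
        return x;
    return x < 0.0 ? -0.0 : 0.0;
}

/* Hypothetical model of one multiplication with DAZ and FTZ as
   independent switches, so each mode can be studied alone or combined.
   Caveat: this flushes based on the rounded result. Hardware that
   detects tiny results before rounding can also flush a result that
   would have rounded up to DBL_MIN, which this sketch does not model. */
double mul_model(double a, double b, bool daz, bool ftz)
{
    if (daz) {              /* DAZ acts on the inputs */
        a = flush(a);
        b = flush(b);
    }
    double r = a * b;       /* ordinary IEEE multiplication on the host */
    return ftz ? flush(r) : r;   /* FTZ acts on the result */
}
```

For instance, `mul_model(DBL_MIN, 0.5, false, true)` yields zero because the result is subnormal, while `mul_model(DBL_MIN / 2, 2.0, false, true)` yields `DBL_MIN`, because FTZ alone does not touch the subnormal input. With DAZ enabled instead, the second call yields zero.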

Some implementations that support these modes allow them to be set independently (and some support only one of the two), so it may be necessary for you to be able to model either mode independently as well as together.
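For an interval-based analysis, one way to cover every combination of the two modes at once is to over-approximate: whenever an interval bound lies in the range that could be flushed, extend the interval to include zero. The sketch below is only an illustration under assumptions. The `interval` type and the functions `interval_of`, `widen_for_flush`, `mul_nonneg_ieee` and `mul_nonneg_any_mode` are hypothetical, only non-negative operands in round-to-nearest are handled, and the sign of zero is ignored.

```c
#include <float.h>

/* Hypothetical closed interval [lo, hi] of doubles, lo <= hi, no NaN. */
typedef struct { double lo, hi; } interval;

interval interval_of(double lo, double hi)
{
    interval v = { lo, hi };
    return v;
}

/* Extend an interval to contain zero when one of its bounds could be
   flushed. The test uses <= DBL_MIN on purpose: it is deliberately
   conservative, to also cover hardware that decides to flush before
   rounding and may therefore flush a result that rounds to DBL_MIN. */
interval widen_for_flush(interval v)
{
    if (v.lo > 0.0 && v.lo <= DBL_MIN)
        v.lo = 0.0;
    if (v.hi < 0.0 && v.hi >= -DBL_MIN)
        v.hi = -0.0;
    return v;
}

/* IEEE multiplication of intervals with non-negative bounds. Rounding
   is monotone, so the endpoint products bound every possible result. */
interval mul_nonneg_ieee(interval a, interval b)
{
    return interval_of(a.lo * b.lo, a.hi * b.hi);
}

/* Sound for IEEE, FTZ only, DAZ only, and both together: widening the
   inputs accounts for DAZ, widening the output accounts for FTZ. */
interval mul_nonneg_any_mode(interval a, interval b)
{
    interval r = mul_nonneg_ieee(widen_for_flush(a), widen_for_flush(b));
    return widen_for_flush(r);
}
```

The cost of this choice is precision: an interval such as `[DBL_MIN / 2, 1.0]` becomes `[0.0, 1.0]` even on a platform that never flushes. If the target's mode is known, the widening can be applied only to inputs, only to outputs, or not at all.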

Note also that some implementations combine these two modes into "Flush to Zero". The ARM VFP "flush to zero" mode is both FTZ and DAZ, for example.

Stephen Canon answered Sep 19 '22 05:09
