flush-to-zero behavior in floating-point arithmetic

As far as I remember, IEEE 754 says nothing about a flush-to-zero mode for handling denormalized numbers faster, but some architectures offer one (e.g. http://docs.sun.com/source/806-3568/ncg_lib.html).

On the platform described by that documentation, standard handling of denormalized numbers is the default, and flush-to-zero has to be activated explicitly. In the default mode, denormalized numbers are handled in software, which is slower.

I work on a static analyzer for embedded C which tries to predict correct (if sometimes imprecise) ranges for the values that can occur at run-time. It aims to be correct because it is intended to be usable to rule out the possibility of something going wrong at run-time (for instance in critical embedded code). This requires capturing all possible behaviors during the analysis, and therefore all possible values produced by floating-point computations.

In this context, my question is twofold:

1. Among the embedded architectures, are there any that offer only flush-to-zero? They would perhaps not have the right to advertise themselves as "IEEE 754", but could offer close-enough IEEE 754-style floating-point operations.
2. For the architectures that offer both, in an embedded context, isn't flush-to-zero likely to be activated by the system in order to make the reaction time more predictable (a common constraint for these embedded systems)?

Handling flush-to-zero in the interval arithmetic that I use for floating-point values is simple enough if I know I have to do it; my question is more whether I have to do it.
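To make the "simple enough" part concrete, here is a minimal sketch of how a result interval computed under standard IEEE 754 semantics might be widened so that it also covers flushed results. The type `interval` and the function `widen_for_ftz` are hypothetical names used for illustration, not part of any real analyzer. The sketch assumes that a flush may produce either zero, and that the interval domain treats -0.0 as the lowest zero and +0.0 as the highest.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical interval type: every value x with lo <= x <= hi. */
typedef struct { double lo, hi; } interval;

/* Widen an interval computed with gradual underflow so that it also
 * contains the zeros a flush-to-zero unit could return instead of a
 * denormal. Assumption: the flush may yield either -0.0 or +0.0. */
interval widen_for_ftz(interval r) {
    assert(r.lo <= r.hi);
    /* Lower bound is a positive denormal: zero becomes reachable. */
    if (r.lo > 0.0 && r.lo < DBL_MIN) r.lo = -0.0;
    /* Upper bound is a negative denormal: zero becomes reachable. */
    if (r.hi < 0.0 && r.hi > -DBL_MIN) r.hi = 0.0;
    return r;
}
```

An interval whose bounds are both normal, or which already straddles zero, is returned unchanged: any denormal it contains flushes to a zero that is already inside it. This only over-approximates; it does not try to tighten bounds that are themselves denormal.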

asked Jan 18 '10 02:01

Pascal Cuoq

1 Answer

Yes to both questions. There are platforms that support flush-to-zero only, and there are many platforms where flush-to-zero is the default.

You should also be aware that many embedded and DSP platforms use a "Denormals Are Zero" mode, which is another wrinkle in the floating-point semantics.

Edit: further explanation of FTZ vs. DAZ:

In FTZ, when an operation would produce a denormal result under the usual arithmetic, a zero is returned instead. Note that some implementations always flush to positive zero, whereas others may flush to either positive or negative zero. It's probably best not to depend on either behavior.

In DAZ, when an input to an operation is a denormal, a zero is substituted in its place. Again, there's no general guarantee about which zero will be substituted.

Some implementations that support these modes allow them to be set independently (and some support only one of the two), so it may be necessary for you to be able to model either mode independently as well as together.
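As a rough illustration of modelling the two modes independently, here is a minimal sketch in C. The names `fp_mode`, `flush` and `model_mul` are hypothetical. It assumes the host itself evaluates `a * b` with standard gradual underflow, and it flushes to a zero of the same sign as the denormal, which is only one of the behaviors mentioned above; a sound analysis would have to allow for either zero.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical mode descriptor: the two flags can be set separately. */
typedef struct { bool ftz; bool daz; } fp_mode;

/* Replace a denormal by a zero of the same sign (an assumption; some
 * implementations always produce +0.0 instead). */
static double flush(double x) {
    return fpclassify(x) == FP_SUBNORMAL ? copysign(0.0, x) : x;
}

/* Model a multiplication under the given mode. */
double model_mul(double a, double b, fp_mode m) {
    if (m.daz) {            /* DAZ: denormal inputs are read as zero */
        a = flush(a);
        b = flush(b);
    }
    double r = a * b;       /* host is assumed to use IEEE 754 semantics */
    if (m.ftz) {            /* FTZ: a denormal result is returned as zero */
        r = flush(r);
    }
    return r;
}
```

With only `ftz` set, a denormal input multiplied by 2.0 can still give a normal, nonzero result; with only `daz` set, the same product is zero because the input is discarded first. Setting both flags models the two modes together.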

Note also that some implementations combine these two modes into "Flush to Zero". The ARM VFP "flush to zero" mode is both FTZ and DAZ, for example.

answered Sep 19 '22 05:09

Stephen Canon

Tags: c, floating-point, embedded, ieee-754
