
Range analysis of floating point values?

I have an image processing program which uses floating point calculations. However, I need to port it to a processor which does not have floating point support in it. So, I have to change the program to use fixed point calculations. For that I need proper scaling of those floating point numbers, for which I need to know the range of all values, including intermediate values of the floating point calculations.

Is there a method where I just run the program and it automatically gives me the range of all the floating point calculations in the program? Working out the ranges manually would be too cumbersome, so if there is a tool that does this, that would be awesome!

MetallicPriest asked Feb 13 '23 04:02


2 Answers

You could use some "measuring" replacement for your floating type, along these lines (live example):

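#include <cmath>
#include <iostream>
#include <limits>

// Wrapper around a floating-point type T that records the smallest and largest
// decimal exponent (floor/ceil of log10|v|) of every value stored in it.
template<typename T>
class foo
{
    T val;

    using lim = std::numeric_limits<int>;

    // Function-local statics: one shared min/max pair per instantiation foo<T>.
    static int& min_val() { static int e = lim::max(); return e; }
    static int& max_val() { static int e = lim::min(); return e; }

    static void sync_min(T e) { if (e < min_val()) min_val() = int(e); }
    static void sync_max(T e) { if (e > max_val()) max_val() = int(e); }

    static void sync(T v)
    {
        v = std::abs(v);
        // Zero, infinity and NaN have no finite exponent, so they are skipped.
        if (v == 0 || !std::isfinite(v)) return;
        T e = std::log10(v);
        sync_min(std::floor(e)); sync_max(std::ceil(e));
    }

public:
    // Every construction or assignment from a raw value is recorded.
    foo(T v = T()) : val(v) { sync(v); }
    foo& operator=(T v) { val = v; sync(v); return *this; }

    template<typename U> foo(U v) : foo(T(v)) {}
    template<typename U> foo& operator=(U v) { return *this = T(v); }

    // Implicit conversions let foo be used wherever T is expected; arithmetic
    // done through them happens in plain T and is not recorded.
    operator T&() { return val; }
    operator const T&() const { return val; }

    static int min() { return min_val(); }
    static int max() { return max_val(); }
};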

to be used like

int main ()
{
    using F = foo<float>;
    F x;

    for (F e = -10.2; e <= 30.4; e += .2)
        x = std::pow(10, e);

    std::cout << F::min() << " " << F::max() << std::endl;  // -11 31
}

This means you need to define an alias (say, Float) for your floating type (float or double) and use it consistently throughout your program. This may be inconvenient but it may prove beneficial eventually (because then your program is more generic). If your code is already templated on the floating type, even better.

After this parametrization, you can switch your program to "measuring" or "release" mode by defining Float to be either foo<T> or T, where T is your float or double.
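The switch between the two modes could look roughly like the sketch below. The macro name MEASURE_RANGES, the stand-in wrapper Measured<T> and the function brighten are all hypothetical and only there to keep the example self-contained; in a real program the foo<T> class above would take the place of Measured<T>, and the macro would normally come from the build flags.

```cpp
#include <cmath>

// Hypothetical minimal stand-in for the measuring wrapper (foo<T> would go
// here). It only remembers the largest magnitude it has been constructed with.
template<typename T>
struct Measured
{
    T val;
    static T& peak() { static T p = 0; return p; }
    Measured(T v = T()) : val(v) { if (std::abs(v) > peak()) peak() = std::abs(v); }
    operator T() const { return val; }
};

// Assumption: normally set by the build system (e.g. -DMEASURE_RANGES),
// defined here only so that the sketch runs in measuring mode.
#define MEASURE_RANGES 1

#ifdef MEASURE_RANGES
using Float = Measured<float>;   // "measuring" mode
#else
using Float = float;             // "release" mode
#endif

// Application code is written once against the alias Float.
Float brighten(Float pixel, Float gain) { return pixel * gain; }
```

In measuring mode, calling brighten(2.0f, 3.0f) records the two arguments and the returned product. In release mode the same source compiles to plain float arithmetic with no overhead.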

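The good thing is that you don't need external tools: your own code carries out the measurements. The bad thing is that, as currently designed, it won't catch all intermediate results. Because foo converts implicitly to T&, an expression such as a * b + c is evaluated in plain T, and only a value that is finally stored in a foo gets recorded. To catch the rest you would have to define all (e.g. arithmetic) operators on foo. This can be done but needs some more work.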
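To catch intermediate results, every operator has to return the wrapper type rather than the raw floating-point type. The following is a minimal sketch of that idea, not an extension of the class above: Tracked<T>, min_abs, max_abs and demo_peak are hypothetical names, it records magnitudes directly instead of decimal exponents, and it covers only the four basic operators.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical tracking wrapper: records the smallest and largest nonzero
// magnitude seen, including the results of arithmetic operators.
template<typename T>
class Tracked
{
    T val;

    static T& lo() { static T m = std::numeric_limits<T>::max(); return m; }
    static T& hi() { static T m = T(0); return m; }

    // Records v (zero, infinity and NaN are ignored) and passes it through.
    static T record(T v)
    {
        T a = std::abs(v);
        if (a != 0 && std::isfinite(a)) {
            lo() = std::min(lo(), a);
            hi() = std::max(hi(), a);
        }
        return v;
    }

public:
    Tracked(T v = T()) : val(record(v)) {}

    T value() const { return val; }   // explicit access, no implicit conversion to T

    // Each operator constructs a new Tracked, so every intermediate is recorded.
    friend Tracked operator+(Tracked a, Tracked b) { return Tracked(a.val + b.val); }
    friend Tracked operator-(Tracked a, Tracked b) { return Tracked(a.val - b.val); }
    friend Tracked operator*(Tracked a, Tracked b) { return Tracked(a.val * b.val); }
    friend Tracked operator/(Tracked a, Tracked b) { return Tracked(a.val / b.val); }

    static T min_abs() { return lo(); }
    static T max_abs() { return hi(); }
};

// Example: the intermediate a * b (1200) is larger than the final result (600).
double demo_peak()
{
    Tracked<double> a = 3.0, b = 400.0, c = 0.5;
    Tracked<double> r = a * b * c;
    (void)r;
    return Tracked<double>::max_abs();
}
```

Dropping the implicit conversion to T is deliberate here: with it in place, the compiler could silently fall back to built-in arithmetic and skip the recording. A complete version would also need compound assignment, comparisons and wrappers for the math functions the program calls.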

iavr answered Feb 24 '23 05:02


It is not true that you cannot use floating point code on hardware that does not support floating point: the compiler will provide software routines to perform the floating point operations. They may be rather slow, but if that is fast enough for your application, it is the path of least resistance.

It is probably simplest to implement a fixed point data type class and have its member functions detect over/underflow as a debug option (because the checking will otherwise slow your code).
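As a rough illustration of such a class, here is a minimal sketch. The class name Fixed, the Q16.16 layout (32-bit storage, 16 fractional bits) and the helper multiply_is_flagged are assumptions made for the example, and the checks are tied to the standard NDEBUG macro so that they disappear in release builds.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical Q16.16 fixed-point type. Range checks are compiled in only
// when NDEBUG is not defined, i.e. in debug builds.
class Fixed
{
    std::int32_t raw = 0;
    static constexpr int frac_bits = 16;
    static std::int64_t one() { return std::int64_t(1) << frac_bits; }

    // Narrows a 64-bit intermediate to the 32-bit storage type.
    static std::int32_t narrow(std::int64_t wide)
    {
#ifndef NDEBUG
        if (wide > std::numeric_limits<std::int32_t>::max() ||
            wide < std::numeric_limits<std::int32_t>::min())
            throw std::overflow_error("Fixed: result out of range");
#endif
        return static_cast<std::int32_t>(wide);
    }

    explicit Fixed(std::int32_t r) : raw(r) {}

public:
    Fixed() = default;

    static Fixed from_double(double d)
    { return Fixed(narrow(std::llround(d * static_cast<double>(one())))); }
    double to_double() const { return raw / static_cast<double>(one()); }

    friend Fixed operator+(Fixed a, Fixed b)
    { return Fixed(narrow(std::int64_t(a.raw) + b.raw)); }
    friend Fixed operator-(Fixed a, Fixed b)
    { return Fixed(narrow(std::int64_t(a.raw) - b.raw)); }

    friend Fixed operator*(Fixed a, Fixed b)
    {
        std::int64_t product = std::int64_t(a.raw) * b.raw;  // Q32.32 intermediate
        std::int64_t scaled = product / one();               // back to Q16.16
#ifndef NDEBUG
        // Underflow: a nonzero product that is too small to be represented.
        if (product != 0 && scaled == 0)
            throw std::underflow_error("Fixed: product rounded to zero");
#endif
        return Fixed(narrow(scaled));
    }
};

// Returns true if a * b trips one of the debug checks (always false with NDEBUG).
bool multiply_is_flagged(double a, double b)
{
    try { (void)(Fixed::from_double(a) * Fixed::from_double(b)); return false; }
    catch (const std::runtime_error&) { return true; }
}
```

Running the ported program in a debug build over representative images then points at the exact operation where the chosen scaling is too narrow or too coarse, for example 300 * 300 (overflow) or 0.001 * 0.001 (underflow) in this layout.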

I suggest you look at Anthony Williams' fixed-point math C++ library. It defines a fixed class with extensive function and operator overloading, so it can largely be used simply by replacing float or double in your existing code with fixed. It uses int64_t as the underlying integer data type, with 34 integer bits and 28 fractional bits (34Q28), so it is good for about 8 decimal places and has a wider range than int32_t.

It does not have the under/overflow checking I suggested, but it is a good starting point for you to add your own.

On 32-bit ARM this library performs about 5 times faster than software floating point and is comparable in performance to ARM's VFP unit for C code.

Note that the sqrt() function in this library has poor precision for very small values, because it loses lower-order bits in intermediate calculations that could be preserved. It can be improved by replacing it with the version I presented in this question.

Clifford answered Feb 24 '23 05:02