 

How to force pow(float, int) to return float

Tags:

c++

c++11

pow

The overloaded function float pow(float base, int iexp) was removed in C++11, and pow now returns a double. In my program I am computing lots of these powers (in single precision), and I am interested in the most efficient way to do it.

Is there some special function (in standard libraries or any other) with the above signature?

If not, is it better (in terms of single-precision performance) to explicitly cast the result of pow to float before any other operations (which would otherwise promote everything else to double), or to cast iexp to float and use the overload float pow(float base, float exp)?
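For concreteness, here is a minimal sketch of the two options being asked about (the values are just placeholders, not from any real program):

```cpp
#include <cmath>

int main()
{
    float base = 1.5f;
    int   iexp = 7;

    // Option 1: arguments promote to double, then cast the double result back.
    float r1 = static_cast<float>(std::pow(base, iexp));

    // Option 2: convert the exponent so the float overload
    // float std::pow(float, float) is selected and everything stays in single precision.
    float r2 = std::pow(base, static_cast<float>(iexp));

    return (r1 == r2) ? 0 : 1;
}
```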

EDIT: Why do I need float instead of double?

The primary reason is RAM -- I need tens or hundreds of GB, so this reduction is a huge advantage. So from a float input I need a float output, and now I need the most efficient way to achieve that (fewest casts, use of already-optimized algorithms, etc.).

Michal asked Jan 16 '18



1 Answer

Another question that can only be honestly answered with "wrong question". Or at least: "Are you really willing to go there?". float theoretically needs ca. 80% less die space (for the same number of cycles) and so can be much cheaper for bulk processing. GPUs love float for this reason.

However, let's look at x86 (admittedly, you didn't say what architecture you're on, so I picked the most common). The price in die space has already been paid. You literally gain nothing by using float for calculations. Actually, you may even lose throughput because additional extensions from float to double are required, and additional rounding to intermediate float precision. In other words, you pay extra to have a less accurate result. This is typically something to avoid except maybe when you need maximum compatibility with some other program.

See Jens' comment as well. These options give the compiler permission to disregard some language rules to achieve higher performance. Needless to say this can sometimes backfire.

There are two scenarios where float might be more efficient, on x86:

  • GPU (including GPGPU): in fact, many GPUs don't even support double, and when they do it is usually much slower. You will only notice a difference, though, when doing very many calculations of this sort.
  • CPU SIMD aka vectorization

You'd know if you did GPGPU. Explicit vectorization by using compiler intrinsics is also a choice – one you could make, for sure, but this requires quite a cost-benefit analysis. Possibly your compiler is able to auto-vectorize some loops, but this is usually limited to "obvious" applications, such as where you multiply each number in a vector<float> by another float, and this case is not so obvious IMO. Even if you pow each number in such a vector by the same int, the compiler may not be smart enough to vectorize this effectively, especially if pow resides in another translation unit, and without effective link time code generation.
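To illustrate the kind of "obvious" loop that auto-vectorizers usually do handle without help (a sketch of my own, not code from the question):

```cpp
#include <vector>

// Scaling every element of a vector<float> by one float: simple, branch-free,
// and with no calls into other translation units, so compilers can typically
// vectorize it automatically. A per-element call to pow generally cannot be
// vectorized this easily.
void scale(std::vector<float>& v, float factor)
{
    for (float& x : v)
        x *= factor;
}
```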

If you are not ready to consider changing the whole structure of your program to allow effective use of SIMD (including GPGPU), and you're not on an architecture where float is indeed much cheaper by default, I suggest you stick with double by all means, and consider float at best a storage format that may be useful to conserve RAM, or to improve cache locality (when you have a lot of them). Even then, measuring is an excellent idea.
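If float ends up being only a storage format, a rough sketch (my own reading of that suggestion, assuming the arithmetic itself stays in double) could look like this:

```cpp
#include <cmath>
#include <vector>

// Keep the big array in float to save RAM, but widen to double for the
// computation and narrow back only when storing the result.
void apply_pow(std::vector<float>& data, int iexp)
{
    for (float& x : data) {
        double tmp = std::pow(static_cast<double>(x), iexp);  // compute in double
        x = static_cast<float>(tmp);                          // store as float
    }
}
```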

That said, you could try ivaigult's algorithm (only with double for the intermediate and for the result), which is related to a classical algorithm called Egyptian multiplication (known under a variety of other names), except that the operands are multiplied rather than added. I don't know how pow(double, double) works exactly, but it is conceivable that this algorithm could be faster in some cases. Again, you should be OCD about benchmarking.
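For reference, here is a minimal sketch of exponentiation by squaring (the multiplicative relative of Egyptian multiplication mentioned above), with double intermediates and a non-negative integer exponent. This is my own illustration, not necessarily ivaigult's exact code:

```cpp
// Compute base^exp for a non-negative integer exponent using
// exponentiation by squaring: O(log exp) multiplications.
double pow_uint(double base, unsigned exp)
{
    double result = 1.0;
    while (exp != 0) {
        if (exp & 1u)        // lowest bit set: fold the current square in
            result *= base;
        base *= base;        // square the base
        exp >>= 1;           // move on to the next bit
    }
    return result;
}
```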

Arne Vogel answered Oct 06 '22