Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to roll a significantly faster version of modf

In an app I'm profiling, I found that in some scenarios this function is able to take over 10% of total execution time.

MSVC++ 2008 compiler is being used, for reference... I don't recall if modf maps to a single instruction or if there is likely any way to make it faster.

see also here for similar question on sqrt function

Unlike sqrt, I don't really know how modf works. Is there an assembly operation? For instance you could do:

modf(float input,int &intPart, float &floatPart)
{
 intPart= (int)input;
 floatPart= input - intPart;
}

But I assume this incurs penalties from casting/conversion, etc. how does a fast implementation work?

like image 352
Mr. Boy Avatar asked Apr 14 '10 13:04

Mr. Boy


3 Answers

You're getting good answers here, but where does the other 90% of time go?

Don't look at exclusive % time per routine.

Look at inclusive % time per line of code, and if possible, have it include blocked time, not just CPU time.

That way, you may well find out that part of the time, you don't even need to be in the modf function, or other functions.

An easy way to get that information is this technique.

Added: As you find optimizations that you can do, expect total execution time to decrease, but don't expect the percentages to necessarily go down. Your % time in modf and / or sqrt may actually go up if you get rid of other stuff, or they may go down if you find that you can memoize them (and therefore call them less), for example.

In this approach to optimization, you can think of the program's execution history as a big call tree, and what you are looking for is whole branches that you can prune off. What's more, since a line of code can appear in multiple branches of the call tree, pruning it in one prunes it in all.

like image 142
Mike Dunlavey Avatar answered Nov 16 '22 08:11

Mike Dunlavey


A good implementation of modf can be quite fast (on the order of 10 cycles on current hardware). A bad implementation can be quite slow (on the order of 100 cycles). A really poorly conceived implementation could conceivably take 1000 cycles. I don't know what the state of Microsoft's implementation is, but there are a number of good implementations in various open source C libraries that you might look at.

Your proposed implementation takes some shortcuts and wouldn't comply with the C standard; in particular, it will misbehave rather severely in the case where input is too large to be successfully converted to integer. It also gets the sign of zero wrong in some cases, but you might not care about that.

Note also that you would be well served to use a compiler / C library that supports the C99 standard, as you could then take advantage of the modff function and avoid the overhead of converting to and from double precision. I know that Intel's math library (which comes with their compilers) has an excellent modf and modff implementations. GCC also supports the C99 single-precision variants.

FWIW, I benchmarked your proposed implementation, and (assuming excellent compiler codegen), it's about 50% faster than the Intel library modff (Intel's implementation however, delivers the correct result for all inputs). The fastest correct implementation that I've benchmarked is only about 15% slower than your implementation (but again, gives the correct result for all inputs, and even sets the floating-point state flags correctly to boot).

like image 2
Stephen Canon Avatar answered Nov 16 '22 10:11

Stephen Canon


modf should be a very fast function indeed, so the problem might be that it is still a function (ie, not being inlined). You could try using exactly the same code from the library but in an inline static function in a header to allow the compiler to inline it.

When the function is inlined, if you're always only using one of the mantissa/exponent, the compiler ought to be clever enough to only emit code to calculate that part, further speeding things up.

If you're still interested in rolling your own have a look at wikipedia on the floating point format

like image 1
James Avatar answered Nov 16 '22 09:11

James