Replacing extraordinarily slow pow() function

Tags: c, pow, fortran, libc

We have a CFD solver that runs extraordinarily slowly on some machines but not others. Using Intel VTune, we traced the problem to the following line (in Fortran):

RHOV= RHO_INF*((1.0_wp - COEFF*EXP(F0)))**(1.0_wp/(GAMM - 1.0_wp))

Drilling in with VTune, we traced the problem to the "call pow" instruction in the generated assembly, and the call stack showed that it was ending up in __slowpow(). After some searching, this page showed up complaining about the same thing.

On the machine with libc version 2.12, the simulation took 18 seconds. On the machine with libc version 2.14, the simulation took 0 seconds.

Based on the information on the aforementioned page, the problem arises when the base passed to pow() is close to 1.0. So we ran another simple test in which we scaled the base by an arbitrary number before the pow() call and then divided the result by that number raised to the same exponent. This dropped the runtime from 18 seconds to 0 seconds on the libc 2.12 machine as well.

However, it's impractical to put this everywhere in the code where we do a**b. How would one go about replacing the pow() function in libc? For instance, I would like the "call pow" instruction generated by the Fortran compiler to call a custom pow() function that we write, which does the scaling, calls the libc pow() and then divides by the scaling. How does one create an intermediate layer that is transparent to the compiler?

Edit

To clarify, we're looking for something like (pseudo-code):

double pow(a, b) {
    a *= 5.0
    tmp = pow_from_libc(a, b)
    return tmp / pow_from_libc(5.0, b)
}

Is it possible to load the pow from libc and rename it in our custom function to avoid the naming conflicts? If the customPow.o file could rename pow from libc, what happens if libc is still needed for other things? Would that cause a naming conflict between pow in customPow.o and pow in libc?
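One way to get at the library's pow under a different name is to look it up at run time with dlopen() and dlsym() and call it through a function pointer. The sketch below assumes glibc on Linux, where pow lives in the math library with the soname "libm.so.6"; the names libm_pow and wrapped_pow are hypothetical, and the scale of 5.0 is just the value from the pseudo-code.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

typedef double (*pow_fn)(double, double);

/* Look up the library's pow() once and cache the pointer.
 * "libm.so.6" is an assumption (glibc on Linux). */
static pow_fn libm_pow(void)
{
    static pow_fn real = NULL;
    if (real == NULL) {
        void *handle = dlopen("libm.so.6", RTLD_LAZY);
        assert(handle != NULL);
        /* POSIX idiom for converting dlsym's void* to a function pointer */
        *(void **)(&real) = dlsym(handle, "pow");
        assert(real != NULL);
    }
    return real;
}

/* The scaling wrapper from the pseudo-code, calling the looked-up pow. */
double wrapped_pow(double a, double b)
{
    const double scale = 5.0;   /* arbitrary, as in the question */
    pow_fn real_pow = libm_pow();
    return real_pow(a * scale, b) / real_pow(scale, b);
}
```

To make this transparent to the Fortran code, the wrapper would have to be renamed to pow, built as a shared object (for example with gcc -shared -fPIC, plus -ldl on older glibc) and loaded ahead of the math library via LD_PRELOAD. Because the library's pow is only reached through the pointer returned by dlsym(), the two symbols do not clash and the wrapper does not call itself. This only works if the executable links the math library dynamically.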

asked Feb 14 '12 05:02

tpg2114

1 Answer

Well, hold on now. The library isn't calling __slowpow() just to toy with you; it's calling __slowpow() because it believes the extra precision is necessary to give an accurate result for the values you're giving it (in this case, base very near 1, exponent of order 1). If you care about the accuracy of this computation, you should understand why that is and if it matters before trying to work around it. It might be the case that for (say) large negative F0 this whole thing can be safely rounded to 1; or it might not, depending on what's done with this value later. If you ever need 1.d0 minus this result, you're going to want that extra precision.

answered Sep 22 '22 10:09

Jonathan Dursi
