I noticed a big difference in the performance of my C program depending on the -fPIC flag: with it, the program is about 30% slower than without it. I am comparing it with a Lua program that calls a C function (where all the heavy calculation is done). First I built a shared object containing the C function, so I had to use the -fPIC flag. Its performance is very similar to the C code built with -fPIC. Then I tried to do the same without the .so, by calling Lua from C:
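#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>

/* Defined elsewhere: wraps the heavy calculation for Lua. */
int my_c_function(lua_State *L);

int main(void)
{
    lua_State *L = luaL_newstate();                   /* create a fresh Lua state */
    luaL_openlibs(L);                                 /* load the standard libraries */
    lua_register(L, "my_c_function", my_c_function);  /* expose the C function as a global */
    luaL_dofile(L, "my_lua_program.lua");             /* run the Lua script */
    lua_close(L);
    return 0;
}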
But here the performance is the same whether or not I use the -fPIC flag (and the same as with the .so approach). I was expecting some improvement without -fPIC. Any advice on how I can investigate this further? Is the second approach producing position-independent code anyway, and is that why the performance is similar? Thanks!
More information, as suggested in the comments: I use the -O3 flag, gcc 4.7.2, Ubuntu 12.04.2, x86_64. Yes, I was quite surprised by such a big overhead. My program computes the Mandelbrot fractal: two loops iterate over x and y, and the C function is isMandelbrot, which takes the number of iterations and returns a bool (whether the point belongs to the Mandelbrot set or not). I load the shared object with 'require'.
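For reference, a kernel of this kind might look like the sketch below. Only the name isMandelbrot and the iteration-count argument come from the description above; the coordinate parameters (cr, ci) and the escape radius of 2 are assumptions made for illustration, not the original code.

```c
#include <stdbool.h>

/* Hypothetical signature: cr/ci are the real and imaginary parts of the
 * point c, max_iter is the iteration limit mentioned in the question. */
bool isMandelbrot(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;

    for (int i = 0; i < max_iter; ++i) {
        double zr2 = zr * zr;
        double zi2 = zi * zi;

        /* |z| > 2 means the orbit escapes, so c is outside the set. */
        if (zr2 + zi2 > 4.0)
            return false;

        /* z = z^2 + c, written out in real arithmetic. */
        zi = 2.0 * zr * zi + ci;
        zr = zr2 - zi2 + cr;
    }

    /* No escape within max_iter iterations: treat c as inside the set. */
    return true;
}
```

A tight loop like this touches no globals and calls no external functions, so it is a useful baseline when comparing builds with and without -fPIC.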
Moreover, for a C function to be called from Lua, we must register it, that is, we must give its address to Lua in an appropriate way. When Lua calls a C function, it uses the same kind of stack that C uses to call Lua. The C function gets its arguments from the stack and pushes the results on the stack.
The Stack. Lua uses a virtual stack to pass values to and from C. Each element in this stack represents a Lua value (nil, number, string, etc.). Whenever Lua calls C, the called function gets a new stack, which is independent of previous stacks and of stacks of C functions that are still active.
I think the code you are running is 32-bit x86. That platform has performance issues with -fPIC: locating any imported function or global requires the current eip to be found first, and the code that does this adds a small overhead to every function that needs it. Unfortunately Lua is full of very small functions, which increases the relative overhead.
On x64, -fPIC doesn't have this overhead, because instructions can address data relative to rip directly.
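One way to check whether a given build really is position independent is to test the predefined macros __PIC__ and __PIE__, which GCC and Clang define when the corresponding code-generation mode is active. The helper names pic_level and report_pic_mode below are hypothetical; this is a minimal sketch, not a complete diagnostic.

```c
#include <stdio.h>

/* Returns 0 for non-PIC code, otherwise the value of __PIC__
 * (GCC documents 1 for -fpic and 2 for -fPIC). */
int pic_level(void)
{
#if defined(__PIC__)
    return __PIC__;
#else
    return 0;
#endif
}

/* Prints which mode this translation unit was compiled in. */
void report_pic_mode(void)
{
#if defined(__PIE__)
    printf("compiled as PIE (__PIC__ = %d)\n", pic_level());
#elif defined(__PIC__)
    printf("compiled as PIC (__PIC__ = %d)\n", pic_level());
#else
    printf("compiled as non-PIC\n");
#endif
}
```

Compiling the same file once with -fPIC and once without, then calling report_pic_mode() in each build, shows which mode each object actually used. Some toolchains enable PIE by default, so it is worth checking this before comparing timings.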