When I disassembled my program, I saw that GCC was using jmp for the second pthread_barrier_wait call when compiled with -O3. Why is that?
What advantage does it gain by using jmp instead of call? What trick is the compiler playing here? I guess it's performing tail-call optimization.
By the way, I'm using static linking here.
__attribute__ ((noinline)) void my_pthread_barrier_wait(
    volatile int tid, pthread_barrier_t *pbar )
{
    pthread_barrier_wait( pbar );
    if ( tid == 0 )
    {
        if ( !rollbacked )
        {
            take_checkpoint_or_rollback( ++iter == 4 );
        }
    }
    //getcontext( &context[tid] );
    SETJMP( tid );
    asm("addr2jmp:");
    pthread_barrier_wait( pbar );
    // My suspicion was right: gcc was performing tail-call optimization,
    // which was messing up my SETJMP/LONGJMP implementation, so here I
    // put a dummy function call to avoid that.
    dummy_var = dummy_func();
}
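For reference, an alternative to the dummy call is to switch off GCC's sibling/tail-call pass, either for the whole translation unit with -fno-optimize-sibling-calls or per function with the optimize attribute. A minimal sketch, assuming GCC; my_pthread_barrier_wait_nosib is a made-up name and the checkpoint logic is elided:

// Build the file with:  gcc -O3 -fno-optimize-sibling-calls ...
// so the final pthread_barrier_wait stays a real `call` and its return
// address stays on the stack for the SETJMP/LONGJMP machinery.
// Per-function form (GCC-specific; the optimize attribute is intended
// mainly for debugging, so the command-line flag is usually preferable):
__attribute__ ((noinline, optimize("no-optimize-sibling-calls")))
void my_pthread_barrier_wait_nosib( volatile int tid, pthread_barrier_t *pbar )
{
    pthread_barrier_wait( pbar );
    /* ... checkpoint / SETJMP logic as above ... */
    pthread_barrier_wait( pbar );   // emitted as call, not jmp
}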
As you don't show an example, I can only guess: the called function has the same return type as the calling one, and the call looks like
return func2(...);
or the function has no return type at all (void).
In this case, "we" leave "our" return address on the stack, leaving it to "them" to use it to return to "our" caller.
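To make that concrete, here is a minimal sketch with made-up names (callee, with_call, with_jmp); the assembly in the comments is only roughly what GCC tends to emit on x86-64 at -O2/-O3 and will vary with target and flags:

int callee(int x);

int with_call(int x)
{
    // The result is still used here, so the call is not in tail position:
    // roughly `call callee` / `add eax, 1` / `ret`.
    return callee(x) + 1;
}

int with_jmp(int x)
{
    // Nothing is left to do after callee returns, so GCC may emit
    // `jmp callee`: our frame and return address are reused, and
    // callee's `ret` goes straight back to our caller.
    return callee(x);
}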
Perhaps it was a tail call; GCC has an optimization pass that performs tail-call (sibling-call) elimination.
But why should you bother? If the called function is an extern function, then it is public, and GCC has to call it following the ABI conventions (which means it follows the calling convention).
You should not care whether the function was reached by a jmp.
And it might also be a call to a dynamic library function (i.e. through the PLT when linking dynamically).
jmp has less overhead than call: jmp just jumps, while call pushes the return address onto the stack and then jumps (a later ret pops that address to return).
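If it helps, here is a toy C model of the difference (all names made up, not real machine semantics): call pushes a return address before transferring control, jmp does not, which is why a function reached by jmp returns directly to the original caller.

#include <stdio.h>

static const char *ret_stack[16];   // toy stand-in for the hardware stack
static int sp = 0;

// call: push the return address, then jump
static void do_call(const char *ret_to) { ret_stack[sp++] = ret_to; }
// ret: pop it and jump there
static const char *do_ret(void) { return ret_stack[--sp]; }

int main(void)
{
    do_call("back_in_main");        // main enters func1 with `call`
    /* func1 enters func2 with `jmp`: nothing is pushed */
    printf("func2 returns to %s\n", do_ret());   // prints "back_in_main"
    return 0;
}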