I guess you all heard of the 'swap problem'; SO is full of questions about it. The version of the swap without use of a third variable is often considered to be faster since, well, you have one variable less. I wanted to know what was going on behind the curtains and wrote the following two programs:
int main () {
int a = 9;
int b = 5;
int swap;
swap = a;
a = b;
b = swap;
return 0;
}
and the version without third variable:
int main () {
int a = 9;
int b = 5;
a ^= b;
b ^= a;
a ^= b;
return 0;
}
I generated the assembly code using clang and got this for the first version (that uses a third variable):
...
Ltmp0:
movq %rsp, %rbp
Ltmp1:
movl $0, %eax
movl $0, -4(%rbp)
movl $9, -8(%rbp)
movl $5, -12(%rbp)
movl -8(%rbp), %ecx
movl %ecx, -16(%rbp)
movl -12(%rbp), %ecx
movl %ecx, -8(%rbp)
movl -16(%rbp), %ecx
movl %ecx, -12(%rbp)
popq %rbp
ret
Leh_func_end0:
...
and this for the second version (that does not use a third variable):
...
Ltmp0:
movq %rsp, %rbp
Ltmp1:
movl $0, %eax
movl $0, -4(%rbp)
movl $9, -8(%rbp)
movl $5, -12(%rbp)
movl -12(%rbp), %ecx
movl -8(%rbp), %edx
xorl %ecx, %edx
movl %edx, -8(%rbp)
movl -8(%rbp), %ecx
movl -12(%rbp), %edx
xorl %ecx, %edx
movl %edx, -12(%rbp)
movl -12(%rbp), %ecx
movl -8(%rbp), %edx
xorl %ecx, %edx
movl %edx, -8(%rbp)
popq %rbp
ret
Leh_func_end0:
...
The second one is longer but I don't know much about assembly code so I have no idea if that means that it is slower so I'd like to hear the opinion of someone more knowledgable about it.
Which of the above versions of a variable swap is faster and takes less memory?
Look at some optimised assembly. From
void swap_temp(int *restrict a, int *restrict b){
int temp = *a;
*a = *b;
*b = temp;
}
void swap_xor(int *restrict a, int *restrict b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}
gcc -O3 -std=c99 -S -o swapping.s swapping.c
produced
.file "swapping.c"
.text
.p2align 4,,15
.globl swap_temp
.type swap_temp, @function
swap_temp:
.LFB0:
.cfi_startproc
movl (%rdi), %eax
movl (%rsi), %edx
movl %edx, (%rdi)
movl %eax, (%rsi)
ret
.cfi_endproc
.LFE0:
.size swap_temp, .-swap_temp
.p2align 4,,15
.globl swap_xor
.type swap_xor, @function
swap_xor:
.LFB1:
.cfi_startproc
movl (%rsi), %edx
movl (%rdi), %eax
xorl %edx, %eax
xorl %eax, %edx
xorl %edx, %eax
movl %edx, (%rsi)
movl %eax, (%rdi)
ret
.cfi_endproc
.LFE1:
.size swap_xor, .-swap_xor
.ident "GCC: (SUSE Linux) 4.5.1 20101208 [gcc-4_5-branch revision 167585]"
.section .comment.SUSE.OPTs,"MS",@progbits,1
.string "Ospwg"
.section .note.GNU-stack,"",@progbits
To me, swap_temp
looks as efficient as can be.
The problem with XOR swap trick is that it's strictly sequential. It may seem deceptively fast, but in reality, it is not. There's an instruction called XCHG
that swaps two registers, but this can also be slower than simply using 3 MOVs
, due to its atomic nature. The common technique with temp is an excellent choice ;)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With