 

Interchange 2 variables in C++ with asm code

Tags: c++, assembly

I have a huge function that sorts a very large amount of int data. The code works fine, except that it's slower than it should be. My first step toward solving this is to place some asm code inside the C++. How can I interchange 2 variables using asm? I've tried this:

_asm{ push a[x]; push a[y]; pop a[x]; pop a[y];}

and this:

_asm(mov eax, a[x];mov ebx,a[y]; mov a[x],ebx; mov a[y],eax;}

but both crash. How can I save some time on these interchanges? I use Visual Studio 2010.

user775476 asked Apr 11 '26 12:04


2 Answers

In general, it is very difficult to do better than your compiler with simple code like this.

A compiler, when faced with a swap operation on integers, will typically issue code like this:

mov eax, [x]
mov ebx, [y]
mov [x], ebx
mov [y], eax

Before you try to override the compiler, first check what it is actually generating. If it's something like this, don't bother going any further; you won't be able to do better. Moreover, if you leave the swap to the compiler and these variables are used immediately afterwards, it may keep the values in registers and save on variable loads/stores as well. This is impossible with hand-coded assembly: the inline asm is a black box to the compiler, so it must reload the variables after it.
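As a rough sketch of what "leave it to the compiler" can look like, the swap can be written in plain C++ and the generated assembly inspected afterwards. The helper name `swap_elements` is hypothetical and not from the question; with Visual Studio, the `/FA` compiler option writes an assembly listing that can be checked for a mov sequence like the one above.

```cpp
#include <algorithm>  // std::swap (C++03 location)
#include <cstddef>    // std::size_t
#include <utility>    // std::swap (C++11 location)

// Hypothetical helper: swaps a[x] and a[y] without any inline asm.
// The optimizer is free to inline this call and keep the swapped
// values in registers if they are used again right away.
inline void swap_elements(int* a, std::size_t x, std::size_t y) {
    std::swap(a[x], a[y]);
}
```

Calling `swap_elements(a, x, y)` inside the sort loop and comparing the listing of an optimized build against the hand-written version is one way to confirm whether there is anything left to gain.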

Note that the push/push/pop/pop sequence is likely to be much slower; not only does it add four extra memory operations on the stack, it also makes every instruction depend on the stack pointer, which limits how much the CPU can overlap them. With the simple mov sequence, it is at least possible to run the pair of reads and the pair of writes in parallel if they are on different memory banks, or one is in cache, etc. It also does not introduce stalls on the stack pointer in later code.

As such, you should not try to micro-optimize the cost of an interchange; instead, reduce the number of interchanges performed. There are many sorting algorithms available, each with slightly different characteristics. You may find that some are better (cause fewer swaps) on your dataset than others.
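To illustrate how much the swap count can differ between algorithms, here is a minimal sketch that counts swaps for two textbook sorts. The function names are hypothetical, and these two algorithms are chosen only because their swap counts are easy to reason about; neither is a recommendation for large inputs.

```cpp
#include <algorithm>  // std::swap (C++03 location)
#include <cstddef>    // std::size_t
#include <utility>    // std::swap (C++11 location)
#include <vector>

// Bubble sort: swaps adjacent out-of-order pairs.
// Returns the number of swaps performed.
std::size_t bubble_sort_swaps(std::vector<int>& v) {
    std::size_t swaps = 0;
    for (std::size_t pass = 0; pass + 1 < v.size(); ++pass) {
        for (std::size_t i = 0; i + 1 < v.size() - pass; ++i) {
            if (v[i] > v[i + 1]) {
                std::swap(v[i], v[i + 1]);
                ++swaps;
            }
        }
    }
    return swaps;
}

// Selection sort: finds the minimum of the unsorted tail and moves it
// into place, so it performs at most n - 1 swaps.
// Returns the number of swaps performed.
std::size_t selection_sort_swaps(std::vector<int>& v) {
    std::size_t swaps = 0;
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        std::size_t min_index = i;
        for (std::size_t j = i + 1; j < v.size(); ++j) {
            if (v[j] < v[min_index]) {
                min_index = j;
            }
        }
        if (min_index != i) {
            std::swap(v[i], v[min_index]);
            ++swaps;
        }
    }
    return swaps;
}
```

On the reversed input 5, 4, 3, 2, 1 the bubble sort performs 10 swaps and the selection sort performs 2. Both still do a quadratic number of comparisons, so a lower swap count alone does not make a sort fast; measuring on the real dataset is the only reliable guide.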

bdonlan answered Apr 13 '26 08:04


What makes you think you can produce faster assembly than an optimizing compiler?
Even if you get it to work properly, all you're likely to achieve is to confuse the optimizer into producing even slower code.

shoosh answered Apr 13 '26 08:04


