"cqo", "cdq" and "cwd" x86_64 instructions. Why not use just cqo?

Tags:

assembly

x86-64

I'm not the most experienced assembly programmer, and I ran into the "cqo", "cdq" and "cwd" instructions, which are all valid x86_64 assembly.

I was wondering whether there are any advantages to using cdq or cwd when operating on smaller values. Is there some difference in performance?

EDIT: I originally started looking into this when calculating the absolute value of one-digit numbers.

For example, if we have the value -9 in al (sign-extended into ax, since cwd reads the sign bit of ax):

cwd
xor al,dl
sub al,dl

versus having it as a 32-bit value and calculating

cdq
xor eax,edx
sub eax,edx

or, if we have -9 as a 64-bit value:

cqo
xor rax,rdx
sub rax,rdx

If the original value is 64 bits wide and holds a value from -9 to 9, all three sequences seem to have the same effect.

asked Nov 19 '15 19:11

Husky

1 Answer

You only have a choice if your value is already sign-extended to fill more than 16 bits of rax.

If you have a signed 16bit int in ax, but the upper16 of eax is unknown or zero, you must keep using 16bit instructions. cdq would set edx based on the garbage bit at the top of eax, not the sign-bit of your value in ax.
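To make that concrete, here is a small Python model of the two instructions (not real assembly, just a sketch of the documented semantics). The function names and the sample register values are made up for illustration; the "garbage" pattern is an arbitrary assumption.

```python
def cwd(rax):
    """Model of cwd: return the 16-bit dx derived from the sign bit of ax."""
    ax = rax & 0xFFFF
    return 0xFFFF if ax & 0x8000 else 0x0000


def cdq(rax):
    """Model of cdq: return the 32-bit edx derived from the sign bit of eax."""
    eax = rax & 0xFFFFFFFF
    return 0xFFFFFFFF if eax & 0x80000000 else 0x00000000


# ax holds -9 as a 16-bit value (0xFFF7); the upper 16 bits of eax are zero.
rax_zero_upper = 0x0000FFF7
# Same ax, but with an arbitrary leftover pattern in the upper 16 bits.
rax_garbage_upper = 0x1234FFF7

dx_from_cwd = cwd(rax_zero_upper)             # looks at bit 15 of ax
edx_from_cdq = cdq(rax_zero_upper)            # looks at bit 31 of eax
edx_from_cdq_garbage = cdq(rax_garbage_upper)
```

In this model cwd yields an all-ones dx for the negative 16-bit value, while cdq yields zero in both cases because bit 31 of eax happens to be clear, so a following xor/sub would leave the value un-negated.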

Similarly, if you were using 32bit ops to generate a signed 32bit int in eax, the upper32 will be zeroed, not sign-extended.
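The same kind of Python sketch shows the 32-bit vs. 64-bit case. The helper names (write_eax, cdq, cqo) are hypothetical; they only model the architectural behaviour that a 32-bit register write zero-extends into the full 64-bit register.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_eax(value):
    """Model a 32-bit write to eax: the result is zero-extended into rax."""
    return value & MASK32


def cdq(rax):
    """Model of cdq: edx is all-ones if bit 31 of eax is set, else zero."""
    return MASK32 if rax & (1 << 31) else 0


def cqo(rax):
    """Model of cqo: rdx is all-ones if bit 63 of rax is set, else zero."""
    return MASK64 if rax & (1 << 63) else 0


# A 32-bit operation produced -9 in eax, so rax is 0x00000000FFFFFFF7.
rax = write_eax(-9)

rdx_from_cqo = cqo(rax)  # bit 63 is clear, so cqo treats the value as positive
edx_from_cdq = cdq(rax)  # bit 31 is set, so cdq sees the negative value
```

Here cqo produces a zero mask because the upper half of rax was cleared, which is why cdq is the matching instruction for a value produced with 32-bit operations.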

If you can, use cdq. You only need cqo if you need all 64 bits of rdx set: cdq writes only edx, and that 32-bit write zeroes the upper half of rdx.
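As a rough check of that advice, the following Python sketch models the question's three-instruction absolute-value sequence at 32-bit and 64-bit operand size, starting from a value that is sign-extended to 64 bits. The function names are invented for this example, and the model covers only register contents, not flags or timing.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def abs_with_cdq(rax):
    """Model of: cdq / xor eax,edx / sub eax,edx. Returns (rax, rdx)."""
    eax = rax & MASK32
    edx = MASK32 if eax & (1 << 31) else 0
    eax = (eax ^ edx) & MASK32
    eax = (eax - edx) & MASK32
    # 32-bit writes zero-extend, so the upper halves of rax and rdx are zero.
    return eax, edx


def abs_with_cqo(rax):
    """Model of: cqo / xor rax,rdx / sub rax,rdx. Returns (rax, rdx)."""
    rax &= MASK64
    rdx = MASK64 if rax & (1 << 63) else 0
    rax = (rax ^ rdx) & MASK64
    rax = (rax - rdx) & MASK64
    return rax, rdx


minus_nine_64 = -9 & MASK64  # 0xFFFFFFFFFFFFFFF7, sign-extended to 64 bits

result_cdq = abs_with_cdq(minus_nine_64)
result_cqo = abs_with_cqo(minus_nine_64)
```

Both sequences leave 9 in rax for this input; the difference is in rdx, which ends up as 0x00000000FFFFFFFF after cdq and as all-ones after cqo.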

See http://agner.org/optimize/ to learn about making asm that runs fast on x86. 32bit operand size is the default in 64bit mode, so 16 or 64bit operands require an extra prefix. This means larger code size, which means worse I-cache efficiency (and often more decode bottlenecks on pre-Sandybridge CPUs; SnB's uop cache usually means decode isn't a problem.)
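The prefix cost is visible in the machine code. The snippet below simply records the 64-bit-mode encodings of the three instructions as I understand them from the instruction-set reference (opcode 0x99, with an operand-size prefix 0x66 for cwd and a REX.W prefix 0x48 for cqo); the dictionary name is just for illustration.

```python
# 64-bit-mode encodings; all three share the same opcode byte, 0x99.
ENCODINGS = {
    "cwd": bytes.fromhex("6699"),  # 0x66 operand-size prefix + opcode
    "cdq": bytes.fromhex("99"),    # default 32-bit operand size, no prefix
    "cqo": bytes.fromhex("4899"),  # REX.W prefix + opcode
}

# Instruction length in bytes for each mnemonic.
LENGTHS = {name: len(code) for name, code in ENCODINGS.items()}
```

So cdq is one byte, while cwd and cqo are two bytes each.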

16bit also has a false dependency on the previous contents of the register, since writing ax doesn't clear the rest of rax. Fortunately, AMD64 was designed with out-of-order CPUs in mind, so it avoided repeating that design choice, which is inconvenient for high-performance implementations: writing the low 32 bits of a GP register clears the upper 32. (x86 CPUs already used out-of-order execution when AMD64 was designed, unlike when ax was extended to eax.)

answered Oct 07 '22 12:10

Peter Cordes
