I'm working on an x86 assembly code golf puzzle. I'm assembling the source file using NASM:
nasm -f elf32 -O0 main.s
ld -m elf_i386 -s -O0 -o main main.o
Using -O0, all optimizations should be turned off. The goal is to reduce the size of the ELF binary.
While working on the "reference implementation" for the puzzle, I stumbled over a strange behavior. This is a reduced code sample:
section .text
global _start ; Must be declared for linker
_start: ; Entry point for linker
read_stdin:
add esp, 8 ; Ignore argc and argv[0] on stack
pop eax ; Store pointer to 'argv[1]' into EAX
mov eax, [eax] ; Dereference pointer
and eax, 0xff ; We only want the least significant byte
add eax, -0x30 ; Subtract ascii offset
exit:
mov eax, 1 ; Syscall: sys_exit
mov ebx, 0 ; Exit code 0
int 0x80 ; Invoke syscall
The binary is 264 bytes:
$ wc -c main
264 main
Now when I simply replace all occurrences of eax in the read_stdin section with ebx, ecx or edx, the binary gets larger:
$ wc -c main
268 main
When comparing the sizes of the object files, the difference is even larger (480 vs 496 bytes). What's special about the eax register that this happens? Is NASM doing some kind of optimization, even though -O0 has been specified?
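EAX is the accumulator register. It has special short encodings, a single opcode byte with no ModRM byte, for the immediate-operand forms of the nine basic operations (ADD, ADC, AND, CMP, OR, SBB, SUB, TEST, and XOR). The MOV instruction also has a one-byte opcode for moving data into the accumulator from a constant memory location. In your sample, and eax, 0xff and add eax, -0x30 both use the short accumulator form, so each is one byte shorter than the same instruction on ebx, ecx or edx. This is a property of the instruction set, not an optimization performed by NASM.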
The Art of Picking Intel Registers
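To make the size difference concrete, here is a minimal sketch that puts the two immediate-operand instructions from the question side by side for eax and ebx. The byte sequences in the comments are the encodings I would expect for 32-bit code assembled with -O0, where immediates keep their full 32-bit form; the file name used below is only an example, and it is worth confirming the bytes against a listing on your own setup.

```nasm
; Sketch: accumulator short forms vs. the general ModRM forms.
; Expected encodings assume 32-bit code and nasm -O0 (no immediate shortening).
bits 32
section .text
global _start
_start:
    and eax, 0xff      ; 25 FF 00 00 00     opcode + imm32          (5 bytes)
    and ebx, 0xff      ; 81 E3 FF 00 00 00  opcode + ModRM + imm32  (6 bytes)
    add eax, -0x30     ; 05 D0 FF FF FF     opcode + imm32          (5 bytes)
    add ebx, -0x30     ; 81 C3 D0 FF FF FF  opcode + ModRM + imm32  (6 bytes)

    mov eax, 1         ; Syscall: sys_exit
    mov ebx, 0         ; Exit code 0
    int 0x80           ; Invoke syscall
```

Assembling with a listing file, for example nasm -f elf32 -O0 -l main.lst main.s, shows the emitted bytes next to each source line. The pop and mov eax, [eax] instructions from the question are the same length for any of these registers, so only the and and add lines change, which accounts for two bytes of code. The larger jumps in file size (4 bytes in the executable, 16 in the object file) are presumably those two bytes rounded up by section alignment padding.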