The following 32bit x86 Linux program prints a string of arbitrary length (as long as a program can be, anyway) and does exit(0)
afterwards:
.global _start ; notice on entry here, all regs but %esp are zero
_start:
call .L0 ; offset == strlen, provided by your assembler
.byte 'H','e','l','l','o',',',' ','W','o','r','l','d'
.L0:
pop %ecx ; ret addr is starting addr of string
mov -4(%ecx),%edx ; argument to `call`, 4 bytes: strlen
inc %ebx ; stdout == 1
movb $4, %al ; SYS_write == 4
int $0x80
xchg %eax,%ebp ; %ebp is still zero
xchg %eax,%ebx ; SYS_exit == 1, return value == 0
int $0x80
If one's willing to sacrifice the position-independence (instead, force the linker to insert the string address), and not care about the program returning zero, one can get it down to:
.global _start
_start:
movb $4, %al
inc %ebx
mov $.L0, %ecx ; this address is calculated when linking
movb $.Lend-.L0, %dl ; strlen, calculated by assembler
int $0x80
xchg %eax,%ebx
int %0x80
.L0:
.byte 'H','e','l','l','o',',',' ','W','o','r','l','d'
.Lend:
Both of these can be assembled/linked via as --32 -o x.o x.S; ld -s -m elf_i386 x.o
, and run just fine. The second one is 26 Bytes of code. If you permit a crash after printing Hello, World
then leave the last two instructions out, 23 Bytes. That's as low as I could go.
Question that's always bugged me, is it possible to squeeze a few more bytes off this ? Pure speculation of mine gives these possible leads:
movw $.L0, %cx
could be used (saves one byte) ?jmp
to a place that's known (or created via assembler / linker invocation magic) to contain the necessary instructions for the exit(...)
syscall, saving one byte over the xchg; int
sequence ?Or else, can it be proven that this actually is the smallest well-behaved (no crash / return code zero) Linux/x86 "Hello, World" ?
To clarify, the question is not about minimizing the size of the ELF executable; techniques for that are long-known. I'm explicitly inquiring about the size of a Linux 32bit x86 assembly program that performs the equivalent of what the compiled code for:
int main(int argc, char **argv)
{
puts("Hello, World");
exit(0); /* or whatever code */
}
would do.
In fact, I'll be happy about anything that does not require manual editing of the ELF headers. If you find a way to e.g. stuff the "Hello, World"
into some ELF object and referencing that from the assembly source, using only the assembler / linker command line and/or a mapfile input, I'd consider it valid enough, even though that increases the size of the ELF executable. I just want to know if the instruction sequence to print "Hello, World" and exit()
afterwards can be shrunk still.
The question is about code size, not executable size.
It's been done back in 1999. Have a look at this page (spoiler: the end result is a 45-byte ELF file). Make sure to read the postscript too.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With