Suppose I wanted to write a program to display "hello world", and I wanted to write it in binary. How could I do this?
I have some idea that:
Can anybody walk me through this?
01001000 01100101 01101100 01101100 01101111 00100001 Those ones and zeros might not look like anything to you, but in binary code the numbers are actually saying “Hello!”
Our Hello World will only be in 32 bits. We'll transition from 32-bit mode to 64-bit mode in the next chapter, but it's a bit involved. So let's just stay in protected mode for now. This new line is the most complicated bit of assembly we've seen yet.
Find the 8-bit binary code sequence for each letter of your name, writing it down with a small space between each set of 8 bits. For example, if your name starts with the letter A, your first letter would be 01000001.
It's bit more complicated, because actually printing "Hello, world!" to stdout is a system call, thus you need to know the correct kernel syscall number. Which of course varies by operating system. Also you need to know the binary format, which also tend to vary, although ELF (Executable and Linkable Format) is universal across few flavors of Unix and Linux.
See Hello, world! in assembler.
This is Linux assembler code:
section .text
global _start ;must be declared for linker (ld)
_start: ;tell linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!',0xa ;our dear string
len equ $ - msg ;length of our dear string
... which on 32-bit Linux, compilation results in binary of 360 bytes, although it's mostly zeros:
00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 02 00 03 00 01 00 00 00 80 80 04 08 34 00 00 00 |............4...|
00000020 c8 00 00 00 00 00 00 00 34 00 20 00 02 00 28 00 |........4. ...(.|
00000030 04 00 03 00 01 00 00 00 00 00 00 00 00 80 04 08 |................|
00000040 00 80 04 08 9d 00 00 00 9d 00 00 00 05 00 00 00 |................|
00000050 00 10 00 00 01 00 00 00 a0 00 00 00 a0 90 04 08 |................|
00000060 a0 90 04 08 0e 00 00 00 0e 00 00 00 06 00 00 00 |................|
00000070 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000080 ba 0e 00 00 00 b9 a0 90 04 08 bb 01 00 00 00 b8 |................|
00000090 04 00 00 00 cd 80 b8 01 00 00 00 cd 80 00 00 00 |................|
000000a0 48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 0a 00 2e |Hello, world!...|
000000b0 73 68 73 74 72 74 61 62 00 2e 74 65 78 74 00 2e |shstrtab..text..|
000000c0 64 61 74 61 00 00 00 00 00 00 00 00 00 00 00 00 |data............|
000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000f0 0b 00 00 00 01 00 00 00 06 00 00 00 80 80 04 08 |................|
00000100 80 00 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 |................|
00000110 10 00 00 00 00 00 00 00 11 00 00 00 01 00 00 00 |................|
00000120 03 00 00 00 a0 90 04 08 a0 00 00 00 0e 00 00 00 |................|
00000130 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 |................|
00000140 01 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 |................|
00000150 ae 00 00 00 17 00 00 00 00 00 00 00 00 00 00 00 |................|
00000160 01 00 00 00 00 00 00 00 |........|
Since you want to "compile by hand", this basically means translating assembler mnemonics above to their opcodes, and then wrapping the result in correct binary format (ELF in the example above)
UPDATE: As this answer shows by @adam-rosenfield, the ELF binary for "Hello, world!" can be handcrafted down to 116 bytes. Original answer is now deleted, but still visible to moderators, so here's a copy:
Here's a 32-byte version using Linux system calls:
.globl _start _start: movb $4, %al xor %ebx, %ebx inc %ebx movl $hello, %ecx xor %edx, %edx movb $11, %dl int $0x80 ;;; sys_write(1, $hello, 11) xor %eax, %eax inc %eax int $0x80 ;;; sys_exit(something) hello: .ascii "Hello world"
When compiled into a minimal ELF file, the full executable is 116 bytes:
00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............| 00000010 02 00 03 00 01 00 00 00 54 80 04 08 34 00 00 00 |........T...4...| 00000020 00 00 00 00 00 00 00 00 34 00 20 00 01 00 00 00 |........4. .....| 00000030 00 00 00 00 01 00 00 00 00 00 00 00 00 80 04 08 |................| 00000040 00 80 04 08 74 00 00 00 74 00 00 00 05 00 00 00 |....t...t.......| 00000050 00 10 00 00 b0 04 31 db 43 b9 69 80 04 08 31 d2 |......1.C.i...1.| 00000060 b2 0b cd 80 31 c0 40 cd 80 48 65 6c 6c 6f 20 77 |[email protected] w| 00000070 6f 72 6c 64 |orld| 00000074
Normally, you'd use a hex editor for this. Figure out the assembly code, hand-assemble it, use the hex editor to enter the binary values, then save them to a file. Once you have your file, drop into your machine monitor and load the file at an available address, then jump to the first instruction. This was pretty common practice on single-board computers and is still done on microcontrollers today, but it's not something you're going to do on a contemporary OS. If you really want to do this, I'd recommend running a low-level emulator (SIMH will work) or working with a microcontroller (you can pick up a TI MSP430 development kit for less than five bucks).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With