I am running my code in a 64-bit Linux environment where the Linux Kernel is built with IA32_EMULATION and X86_X32 disabled.
In the book Programming from the Ground Up the very first program doesn't do anything except produce a segfault:
.section .data
.section .text
.globl _start
_start:
movl $1, %eax
movl $0, %ebx
int $0x80
I convert the code to use x86-64 instructions but it also segfaults:
.section .data
.section .text
.globl _start
_start:
movq $1, %rax
movq $0, %rbx
int $0x80
I assembled both these programs like this:
as exit.s -o exit.o
ld exit.o -o exit
Running ./exit
gives Segmentation fault
for both. What am I doing wrong?
P.S. I have seen a lot of tutorials assembling code with gcc, however I'd like to use gas.
Combining comments and the answer, here's the final version of the code:
.section .data
.section .text
.globl _start
_start:
movq $60, %rax
xor %rbx, %rbx
syscall
int $0x80
is the 32bit ABI. On normal kernels (compiled with IA32 emulation), it's available in 64bit processes, but you shouldn't use it because it only supports 32bit pointers, and some structs have a different layout.
See the x86 tag wiki for info on making 64bit Linux system calls. (Also ZX485's answer on this question). There are many differences, including the fact that the syscall
instruction clobbers %rcx
and %r11
, unlike the int $0x80
ABI.
In a kernel without IA32 emulation, like yours, running int $0x80
is probably the same as running any other invalid software interrupt, like int $0x79
. Single-stepping that instruction in gdb (on my 64bit 4.2 kernel that does include IA32 emulation) results in a segfault on that instruction.
It doesn't return and keep executing garbage bytes as instructions (which would also result in a SIGSEGV or SIGILL), or keep executing until it jumped to (or reached normally) an unmapped page. If it did, that would be the mechanism for segfaulting.
You can run a process under strace
, e.g. strace /bin/true --version
to make sure it's making the system calls you thought it would. You can also use gdb
to see where a program segfaults. Using a debugger is essential, moreso than in most languages, because the failure mode in asm is usually just a segfault.
The first observation is that the code in both your examples effectively do the same thing, but are encoded differently. The site x86-64.org has some good information for those starting out with x86-64 development. The first code snippet that uses 32-bit registers is equivalent to the second because of Implicit Zero Extend:
Implicit zero extend
Results of 32-bit operations are implicitly zero extended to 64-bit values. This differs from 16 and 8 bit operations, that don't affect the upper part of registers. This can be used for code size optimisations in some cases, such as:
movl $1, %eax # one byte shorter movq $1, %rax xorq %rax, %rax # three byte equivalent of mov $0,%rax andl $5, %eax # equivalent for andq $5, %eax
The question is, why does this code segfault? If you had run this code on a typical x86-64 Linux distro your code may have exited as expected without generating a segfault
. The reason that your code is failing is because you are using a custom kernel with IA32 emulation off.
IA32 emulation in the Linux kernel does allow you to use the 32-bit int 0x80
interrupt to make calls using the traditional 32-bit system call mechanism. This is an emulation layer, and doesn't support passing pointers that can't be represented in a 32-bit register. This is the case for stack based pointers since they fall outside the 4gb address space, and can't be accessed with 32-bit pointers.
Your system has IA32 emulation off, and because of that int 0x80
doesn't exist for backwards compatibility. The result is that the int 0x80
interrupt will throw a segmentation fault and your application will fail.
In x86-64 code it is preferred that you use the syscall
instruction to make system calls to the 64-bit Linux kernel. This mechanism supports 64-bit operands and pointers where necessary. Ryan Chapman's site has some good information on the 64-bit SYSCALL interface which differs considerably from the 32-bit int 0x80
mechanism.
Your code could have been written this way to work in a 64-bit environment without IA32 emulation:
.section .text
.globl _start
_start:
mov $60, %eax
xor %ebx, %ebx
syscall
Other useful information on doing 64-bit development can be found in the 64-bit System V ABI. This document also better describes the general syscall
convention used by the Linux kernel including side effects in Section A.2 . This document is also very informative if you also wish to interface with third party libraries and modules (like the C library etc).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With