"Segmentation fault", x86_64 assembly, AT&T syntax

I am running my code in a 64-bit Linux environment where the Linux Kernel is built with IA32_EMULATION and X86_X32 disabled.

In the book Programming from the Ground Up the very first program doesn't do anything except produce a segfault:

.section .data

.section .text
.globl _start
_start:
movl $1, %eax
movl $0, %ebx

int $0x80

I converted the code to use x86-64 instructions, but it also segfaults:

.section .data

.section .text
.globl _start
_start:
movq $1, %rax
movq $0, %rbx

int $0x80

I assembled both these programs like this:

as exit.s -o exit.o
ld exit.o -o exit

Running ./exit gives Segmentation fault for both. What am I doing wrong?

P.S. I have seen a lot of tutorials assembling code with gcc, however I'd like to use gas.

Update

Combining comments and the answer, here's the final version of the code:

.section .data
.section .text

.globl _start
_start:

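movq $60, %rax    # 60 = exit in the x86-64 syscall table
xor  %edi, %edi   # exit status 0; syscall takes its first argument in %rdi, not %rbx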

syscall

asked Apr 15 '16 08:04 by Cág


2 Answers

int $0x80 is the 32bit ABI. On normal kernels (compiled with IA32 emulation), it's available in 64bit processes, but you shouldn't use it because it only supports 32bit pointers, and some structs have a different layout.

See the x86 tag wiki for info on making 64bit Linux system calls. (Also ZX485's answer on this question). There are many differences, including the fact that the syscall instruction clobbers %rcx and %r11, unlike the int $0x80 ABI.
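As a minimal sketch of those differences, assuming the standard x86-64 Linux syscall numbers (1 for write, 60 for exit), the following writes a short string and then exits. The label msg and the symbol msg_len are names made up for this illustration. Arguments go in %rdi, %rsi and %rdx rather than %ebx, %ecx and %edx, and anything kept in %rcx or %r11 is lost across syscall.

```asm
.section .rodata
msg:
    .ascii "hello\n"
    .set msg_len, . - msg        # byte count of the string (hypothetical symbol name)

.section .text
.globl _start
_start:
    movl $1, %eax                # syscall number 1 = write
    movl $1, %edi                # arg 1: file descriptor 1 (stdout)
    leaq msg(%rip), %rsi         # arg 2: full 64-bit address of the buffer
    movl $msg_len, %edx          # arg 3: number of bytes to write
    syscall                      # result in %rax; %rcx and %r11 are clobbered

    movl $60, %eax               # syscall number 60 = exit
    xorl %edi, %edi              # arg 1: exit status 0
    syscall
```

It can be assembled and linked the same way as in the question, for example as hello.s -o hello.o followed by ld hello.o -o hello, where hello.s is a hypothetical file name.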

In a kernel without IA32 emulation, like yours, running int $0x80 is probably the same as running any other invalid software interrupt, like int $0x79. Single-stepping int $0x79 in gdb (on my 64bit 4.2 kernel, which does include IA32 emulation) results in a segfault on that instruction.

In other words, the fault is raised on the int instruction itself. It doesn't return and go on executing garbage bytes as instructions (which would also end in a SIGSEGV or SIGILL), and it doesn't keep running until it jumps to, or falls through into, an unmapped page. Either of those would have been a different mechanism for segfaulting.

You can run a process under strace, e.g. strace /bin/true --version, to make sure it's making the system calls you thought it would. You can also use gdb to see where a program segfaults. Using a debugger is essential, more so than in most languages, because the failure mode in asm is usually just a segfault.

answered Sep 19 '22 23:09 by Peter Cordes


The first observation is that the code in both your examples effectively does the same thing but is encoded differently. The site x86-64.org has some good information for those starting out with x86-64 development. The first code snippet, which uses 32-bit registers, is equivalent to the second because of Implicit Zero Extend:

Implicit zero extend

Results of 32-bit operations are implicitly zero extended to 64-bit values. This differs from 16 and 8 bit operations, that don't affect the upper part of registers. This can be used for code size optimisations in some cases, such as:

movl $1, %eax         # one byte shorter than movq $1, %rax
xorq %rax, %rax       # three byte equivalent of mov $0,%rax
andl $5, %eax         # equivalent for andq $5, %rax
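The zero-extension rule can be checked directly. The following is only a sketch, assuming a 64-bit kernel where the syscall instruction is available: it fills %rax with ones, overwrites just the low 32 bits, and reports through the exit status whether the upper half was cleared.

```asm
.section .text
.globl _start
_start:
    movq $-1, %rax        # %rax = 0xFFFFFFFFFFFFFFFF
    movl $1, %eax         # 32-bit write: upper 32 bits are zeroed, so %rax = 1
                          # (movw $1, %ax would instead leave 0xFFFFFFFFFFFF0001)
    xorl %edi, %edi       # %edi = 0, the default exit status
    cmpq $1, %rax         # compare the whole 64-bit register against 1
    sete %dil             # %dil = 1 only if %rax == 1, i.e. it was zero-extended
    movl $60, %eax        # syscall number 60 = exit
    syscall               # exit status 1 means the upper half was cleared
```

Replacing the movl with the movw variant from the comment should change the exit status to 0, because 16-bit writes leave the rest of the register untouched.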

The question is, why does this code segfault? Had you run it on a typical x86-64 Linux distro, it would most likely have exited as expected without a segfault. Your code fails because you are using a custom kernel with IA32 emulation off.

IA32 emulation in the Linux kernel allows you to use the 32-bit int 0x80 interrupt to make calls through the traditional 32-bit system call mechanism. This is an emulation layer, and it doesn't support passing pointers that can't be represented in a 32-bit register. That is the case for stack-based pointers in a 64-bit process, since the stack lies outside the low 4 GiB of the address space and can't be reached with 32-bit pointers.

Your system has IA32 emulation off, so the int 0x80 backwards-compatibility entry point doesn't exist. Executing int 0x80 therefore raises a segmentation fault and your application fails.

In x86-64 code it is preferred that you use the syscall instruction to make system calls to the 64-bit Linux kernel. This mechanism supports 64-bit operands and pointers where necessary. Ryan Chapman's site has some good information on the 64-bit SYSCALL interface which differs considerably from the 32-bit int 0x80 mechanism.

Your code could have been written this way to work in a 64-bit environment without IA32 emulation:

.section .text

.globl _start
_start:

mov  $60, %eax    # syscall number 60 = exit
xor  %edi, %edi   # exit status 0 in %rdi, the first syscall argument
syscall
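A quick way to confirm which register carries the exit status is to exit with a non-zero value and inspect it from the shell. This is a sketch under the same assumptions as above (64-bit kernel, syscall number 60 for exit); the value 42 is arbitrary. The 64-bit convention takes the first argument in %rdi, not in %rbx as the int 0x80 convention does.

```asm
.section .text
.globl _start
_start:
    movl $60, %eax        # syscall number 60 = exit
    movl $42, %edi        # arg 1: exit status (arbitrary value for the demo)
    syscall
```

After assembling and linking as in the question, running ./exit followed by echo $? should print 42. Running it under strace should also show the exit call and its argument.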

Other useful information on doing 64-bit development can be found in the 64-bit System V ABI. This document also better describes the general syscall convention used by the Linux kernel, including side effects, in Section A.2. It is also very informative if you wish to interface with third party libraries and modules (like the C library etc).

answered Sep 17 '22 23:09 by Michael Petch
