
How can I manually increment an instruction pointer from a context?

First, let me say that I am doing things here that most people have no legitimate reason to EVER do. 99.99...% of all segfaults should result in definite termination, and merrily handling them in any but the most simple of situations WILL result in really bad behavior and a corrupted stack. If you came here looking to resolve a segfault, please look at the following link: https://www.securecoding.cert.org/confluence/display/seccode/SIG35-C.+Do+not+return+from+a+computational+exception+signal+handler

That said, I am implementing an environment from an external standard, which defines the behavior for returning from a computational logic error's signal handler as skipping ahead one instruction. I understand this is bad, but I have no control over it. I can't just nuke the definition: it is for an embedded system, and other software elements already written depend on the defined behavior. They are often safety critical and need to be able to exit gracefully, even when they do ungraceful or awful things. Further, I don't have the source, so I can't just fix the segfault, and any existing bad segfault/crash behavior is actually desired because I am simulating the behavior of an existing system.

While the target system runs on PowerPC, which has fixed-length instructions, our development is happening in a parallel x86/x64 environment, where instruction lengths vary. I know that the following code works, albeit badly on x86:

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <ucontext.h>
#include <sys/mman.h>

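/* volatile keeps the compiler from optimizing the null store away or replacing it with a trap */
#define CRASHME (*(volatile int *)NULL = 0)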
//for x86
#ifdef REG_EIP
#define INCREMENT(x) (x)->uc_mcontext.gregs[REG_EIP]++
//for x64
#elif defined REG_RIP
#define INCREMENT(x) (x)->uc_mcontext.gregs[REG_RIP]++
//for PPC arch
#elif defined PT_NIP
#define INCREMENT(x) (x)->uc_mcontext.uc_regs->gregs[PT_NIP]+=4
#endif

static void handler(int sig, siginfo_t *si, void *vcontext)
{
    ucontext_t *context = (ucontext_t *)vcontext;
    INCREMENT(context);
}

void crashme_function(void)
{
    printf("entered new context, segfaulting!\n");
    CRASHME;
    printf("SEGFAULT handled!\n");
}

int main(int argc, char *argv[])
{
    struct sigaction sa;
    printf("Printing a thing\n");
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = handler;
    sigaction(SIGSEGV, &sa, NULL);
    printf("Entering new context...\n");
    crashme_function();
    printf("context exited successfully\n");
    return 0;
}

On an Intel-based architecture running Linux kernel 3.11.x, this code advances the instruction pointer by one byte each time the handler runs, so the handler is re-entered until the pointer finally moves past the faulting instruction. I know this probably won't work for all instructions. In my test environment, the handler is entered 6 times (for the 6 bytes of the instruction) and then execution continues past CRASHME.

It seems like it should be trivial to advance an instruction pointer to the next instruction, given the current one; the processor does it for every instruction it executes. Elsewhere I have been told to "look at the instruction table and build your own" or "implement a disassembler". Neither is appropriate or necessary here, since both have already been done by others, but the results are posted (almost?) exclusively in places on the web that my work computer can't reach and that I don't trust my home PC to visit. Where can I find tables or libraries that do only the instruction-length calculation, hosted somewhere I can actually access?

user2149140 asked Jan 03 '14 16:01




1 Answer

The Linux kernel sources include an encoding of the x86 opcode map, which an AWK script parses to generate a set of tables that can be used to decode instructions. It has enough information to give you accurate instruction sizes, although you may need to extend it to cover floating point instructions and some of the newer Intel extensions, such as AVX.

If you have access to the Linux kernel source tree, take a look at arch/x86/lib/x86-opcode-map.txt.

That contains all the data you need to determine instruction sizes.

There's an AWK script at arch/x86/tools/gen-insn-attr-x86.awk that reads the opcode file and produces a series of tables encoding the information in the opcode map.

Finally, if you look at arch/x86/lib/insn.c, there's a function in there, insn_get_length(...), that will give you the length of an instruction using the tables generated from the opcode map. That should be sufficient to answer your particular question: "how large is this instruction?"

There's nothing particularly "kernely" about that code. You can adapt it to user mode without doing anything special.
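To give a feel for what a length calculation involves, here is a minimal sketch of a toy decoder. It is not the kernel's decoder and is not derived from insn.c. The names toy_insn_length and modrm_len are hypothetical, and it assumes 32/64-bit addressing with no legacy prefixes (0x66, 0x67, segment overrides). It only understands three MOV opcodes (0x89, 0x8B, and 0xC7 /0) and returns 0 for anything else, so treat it as an illustration of how the ModRM and SIB bytes drive the length, not as something to put in a signal handler.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes occupied by the ModRM byte plus any SIB byte and displacement.
 * Assumes 32/64-bit addressing (no 0x67 address-size prefix). */
static size_t modrm_len(const uint8_t *p)
{
    uint8_t mod = p[0] >> 6;
    uint8_t rm  = p[0] & 7;
    size_t len = 1;                      /* the ModRM byte itself */

    if (mod == 3)
        return len;                      /* register operand, no memory form */

    if (rm == 4) {                       /* a SIB byte follows */
        uint8_t base = p[1] & 7;
        len += 1;
        if (mod == 0 && base == 5)
            len += 4;                    /* no base register: disp32 */
    }

    if (mod == 0 && rm == 5)
        len += 4;                        /* disp32 (RIP-relative in 64-bit mode) */
    else if (mod == 1)
        len += 1;                        /* disp8 */
    else if (mod == 2)
        len += 4;                        /* disp32 */

    return len;
}

/* Hypothetical toy decoder: returns the instruction length in bytes,
 * or 0 if the opcode is outside the tiny subset handled here. */
size_t toy_insn_length(const uint8_t *code, int is_64bit)
{
    size_t i = 0;

    /* An optional REX prefix (0x40..0x4F) exists only in 64-bit mode. */
    if (is_64bit && (code[i] & 0xF0) == 0x40)
        i++;

    switch (code[i]) {
    case 0x89:                           /* mov r/m, r */
    case 0x8B:                           /* mov r, r/m */
        return i + 1 + modrm_len(code + i + 1);
    case 0xC7:                           /* mov r/m, imm32 (only /0 is MOV) */
        if (((code[i + 1] >> 3) & 7) != 0)
            return 0;
        return i + 1 + modrm_len(code + i + 1) + 4;
    default:
        return 0;                        /* not handled by this sketch */
    }
}
```

For example, the byte sequence C7 00 00 00 00 00 (movl $0, (%eax) or movl $0, (%rax)) comes out as 6 bytes under this sketch. A complete decoder also has to handle prefixes, multi-byte opcodes, and per-opcode immediate sizes, which is exactly the information the opcode map tables carry.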

I'm assuming accessing the Linux Kernel sources shouldn't be a security issue for you, and that there is nothing encumbering you from reading / adopting GPL code.

Scott Wisniewski answered Sep 17 '22 18:09
