
x86 Assembly: Data in the Text Section

I don't quite understand how variables can be stored in the .text section and how they can be manipulated there. Shouldn't all variables be in the .data section, and isn't the whole .text section read-only? How does this code work, then?

[Code taken from Shellcoder's Handbook]

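Section .text
global _start

_start:
    jmp short GotoCall          ; skip ahead to the call at the end

shellcode:
    pop esi                     ; esi = address of the string after the call
    xor eax, eax                ; eax = 0
    mov byte [esi + 7], al      ; overwrite 'J' with NUL to terminate "/bin/sh"
    lea ebx, [esi]              ; ebx = address of "/bin/sh"
    mov dword [esi + 8], ebx    ; overwrite "AAAA" with that address (argv[0])
    mov dword [esi + 12], eax   ; overwrite "KKKK" with NULL (argv[1], envp[0])
    mov al, 0x0b                ; syscall number 11 = execve
    mov ebx, esi                ; arg 1: path
    lea ecx, [esi + 8]          ; arg 2: argv
    lea edx, [esi + 12]         ; arg 3: envp
    int 0x80                    ; invoke the kernel

GotoCall:
    call shellcode              ; pushes the address of the string below
    db '/bin/shJAAAAKKKK'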
nanoman asked Sep 13 '17 16:09



2 Answers

Well, data and code are just bytes; only how you interpret them makes them what they are. Code can be interpreted as data and vice versa. In most cases the result will be something invalid, but it is possible.

The attributes of a section depend on the linker. Most linkers make the .text section read-only by default, but that doesn't mean it can't be changed.

The whole example is a clever way to obtain the address of the /bin/sh string using nothing but call. The call pushes the address of the next instruction (the bytes that follow it) onto the stack. Here those bytes are the string itself, so pop esi takes that address off the stack and the rest of the code uses it.
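To make the writes into the string concrete, here is a small C sketch that replays the three stores from the listing on a copy of the 16 bytes. It is only a simulation under stated assumptions: patch_blob is a made-up helper name, and the esi value passed in is a hypothetical 32-bit address standing in for whatever pop esi would actually yield at run time.

```c
#include <stdint.h>
#include <string.h>

/* The 16 bytes that follow `call shellcode` in the listing. */
#define BLOB "/bin/shJAAAAKKKK"

/*
 * Hypothetical helper: applies the shellcode's three stores to a copy of
 * the blob. `esi` is an assumed 32-bit address of the blob in the target
 * process; it is written little-endian, as an x86 store would write it.
 */
void patch_blob(uint8_t blob[16], uint32_t esi)
{
    /* store al (0) at [esi + 7]: 'J' becomes the NUL terminator */
    blob[7] = 0;

    /* store ebx (= esi) at [esi + 8]: "AAAA" becomes argv[0] */
    for (int i = 0; i < 4; i++)
        blob[8 + i] = (uint8_t)(esi >> (8 * i));

    /* store eax (0) at [esi + 12]: "KKKK" becomes a NULL pointer */
    memset(blob + 12, 0, 4);
}
```

After the call, bytes 0 to 7 hold the C string "/bin/sh", bytes 8 to 11 hold a pointer to it, and bytes 12 to 15 hold NULL. Those are exactly the path, argv and envp that execve receives in ebx, ecx and edx, and every one of those stores lands inside .text.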

Paweł Łukasik answered Nov 02 '22 05:11



The top-level answer is that an x86 machine is not aware of ".text" and ".data" sections. A modern x86 CPU gives the OS tools to create a virtual address space with specific access rights (such as read-only, no-execute, and read-write).

But the content of memory is just bytes, and those can be read, written, or executed. The CPU has no means to guess which parts of memory are data and which are code, and it will happily execute anything you point it at.

The .text/.data/... sections are a logical construct supported by the compiler, linker, and OS (executable loader), which cooperate to prepare the runtime environment for the code in such a way that .text is read-only nowadays, and you need to put writable variables into .data, .bss, or similar. Some OSes and configurations also provide a non-executable stack.

The OS usually also has an API that lets an application change the access rights or memory mapping, or allocate further memory with the attributes it needs (JIT compilers, for example, would get nowhere if they could not first write compiled code into memory and then execute it).
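On Linux and other POSIX-like systems that API is mmap and mprotect. The following is a minimal sketch, not a definitive implementation: map_with_contents is a made-up helper name, and it assumes the data fits in a single page. It allocates a writable page, copies bytes into it, and only then switches the page to the protection the caller asks for.

```c
#define _DEFAULT_SOURCE     /* for MAP_ANONYMOUS on strict compilers */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `len` bytes into a fresh anonymous page, then
 * change the page's protection to `final_prot` (e.g. PROT_READ, or
 * PROT_READ | PROT_EXEC as a JIT would request).
 * Returns the mapping, or NULL on failure or if the data exceeds one page.
 */
void *map_with_contents(const void *bytes, size_t len, int final_prot)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len > (size_t)page)
        return NULL;

    /* Step 1: ask the OS for memory that is readable and writable. */
    void *mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    /* Step 2: fill it while it is still writable. */
    memcpy(mem, bytes, len);

    /* Step 3: ask the OS to change the access rights of the page. */
    if (mprotect(mem, (size_t)page, final_prot) != 0) {
        munmap(mem, (size_t)page);
        return NULL;
    }
    return mem;
}
```

Passing PROT_READ gives a page that faults on any later write, much like a default .text mapping. Passing PROT_READ | PROT_EXEC is the write-first, execute-later pattern mentioned for JIT compilers, although some hardened configurations may refuse that request.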

So if you run your code example on a common Linux in its default configuration, it will very likely segfault, as .text will be read-only. Many of those "exploit" books have a whole chapter dedicated to compiling the examples and setting up the runtime environment in such a way that several protections (ASLR, NX, ...) are switched off, which allows their samples to work.

A real exploit in the wild will then usually use some bug or weak spot in an application to inject its payload somewhere. Depending on the hostility of that "somewhere", the exploit may first have to elevate its rights to get writable and executable memory (or it must be written so that it does not write into code areas and uses other memory for its variables), unless the app itself already provides a friendly environment for the exploit due to its internal needs.

Keep in mind that the OS and applications are not written to make sure exploits will work; quite the opposite. Each exploit usually targets a particular vulnerable version of an application on a particular version of an OS, and it is expected to break with a later security update. So if you know you have writable and executable memory, you just exploit it as is, without worrying about what will happen in the next version, when they fix the app to keep its code memory read-only.
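One way to check this on your own machine is to ask the kernel how a given address is mapped. The sketch below is Linux-specific and rests on assumptions: addr_is_writable is a made-up helper name, and it relies on /proc/self/maps being readable and on its usual "start-end perms ..." line layout.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (Linux only): looks up `addr` in /proc/self/maps.
 * Returns 1 if the containing mapping is writable, 0 if it is not,
 * and -1 if the address is not found or the file cannot be read.
 */
int addr_is_writable(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    unsigned long target = (unsigned long)(uintptr_t)addr;
    char line[1024];
    int result = -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char perms[5];

        /* Each line starts with "start-end perms", e.g. "... r-xp". */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;

        if (target >= start && target < end) {
            result = (strchr(perms, 'w') != NULL) ? 1 : 0;
            break;
        }
    }

    fclose(maps);
    return result;
}
```

In a normally built program, the address of a function should come back as not writable, while the address of a local variable on the stack should come back as writable. A store into the first kind of address is what produces the segfault described above.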

Ped7g answered Nov 02 '22 07:11
