How to interpret segment register accesses on x86-64?

With this function:

mov    1069833(%rip),%rax        # 0x2b5c1bf9ef90 <_fini+3250648>
add    %fs:0x0,%rax
retq

How do I interpret the second instruction and find out what was added to RAX?

Alex B asked Oct 21 '11 04:10



2 Answers

This code:

mov    1069833(%rip),%rax        # 0x2b5c1bf9ef90 <_fini+3250648>
add    %fs:0x0,%rax
retq

is returning the address of a thread-local variable. The quadword at %fs:0x0 holds the address of the TCB (Thread Control Block), and the quadword loaded from 1069833(%rip) is the offset from there to the variable. That offset is known because the variable resides either in the program or in a dynamic library loaded at program load time (libraries loaded at run time via dlopen() need different code).

This is explained in great detail in Ulrich Drepper's TLS document (ELF Handling For Thread-Local Storage), especially §4.3 and §4.3.6.
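
As a rough illustration of the arithmetic in those two instructions, here is a minimal sketch in C, assuming GCC or Clang on x86-64 Linux. The names tls_counter, read_fs0, tls_offset and tls_address_via_fs are hypothetical and not part of the original code. The offset is derived here from the variable's real address, so the sketch only shows how the pieces fit together; in the question's code the offset is loaded from memory at 1069833(%rip) instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thread-local variable standing in for the one in the question. */
_Thread_local int tls_counter = 42;

/* Read the quadword stored at %fs:0, i.e. the operand of "add %fs:0x0,%rax".
 * On x86-64 Linux this quadword holds the thread pointer (address of the TCB).
 * On any other target this sketch just returns 0. */
uintptr_t read_fs0(void) {
#if defined(__x86_64__) && defined(__linux__)
    uintptr_t tp;
    __asm__ volatile ("movq %%fs:0, %0" : "=r"(tp));
    return tp;
#else
    return 0;
#endif
}

/* Offset of tls_counter from the thread pointer. This plays the role of the
 * value that the first instruction loads into %rax. */
ptrdiff_t tls_offset(void) {
    return (ptrdiff_t)((uintptr_t)&tls_counter - read_fs0());
}

/* Rebuild the variable's address the way the question's code does:
 * start with the offset, then add the quadword at %fs:0. */
int *tls_address_via_fs(void) {
    uintptr_t rax = (uintptr_t)tls_offset();  /* mov <offset>,%rax   */
    rax += read_fs0();                        /* add %fs:0x0,%rax    */
    return (int *)rax;                        /* retq                */
}

/* Sanity check: the rebuilt address is the address the compiler uses. */
void check_tls_address(void) {
    assert(tls_address_via_fs() == &tls_counter);
}
```

Calling check_tls_address() from each thread should pass, since every thread has its own %fs base and its own copy of tls_counter while the offset between the two stays the same.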

ninjalj answered Oct 16 '22 13:10


I'm not sure they've been called segment registers since the bad old days of segmented architectures. I believe the proper term is a selector (but I could be wrong).

However, I think you just need to look at the first quadword (64 bits) in the fs area.

The %fs:0x0 bit means the contents of the memory at fs:0. Since you've used the generic add (rather than addl for example), I think it will take the data width from the target %rax.

In terms of getting the actual value, it depends on whether you're in legacy or long mode.

In legacy mode, you'll have to get the fs value and look it up in the GDT (or possibly LDT) in order to get the base address.

In long mode, you'll need to look at the relevant model-specific registers (MSRs). If you're at this point, you've moved beyond my level of expertise unfortunately.
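
For what it's worth, user-space code on Linux cannot read those registers directly, but it can ask the kernel for the FS base. The following is a minimal sketch, assuming x86-64 Linux with the kernel headers installed; the function names fs_base, fs_first_quadword and check_fs_base are hypothetical. It uses the arch_prctl system call with ARCH_GET_FS and compares the result with the quadword at %fs:0, which on this platform is expected to hold the same address.

```c
#define _GNU_SOURCE          /* for the syscall() declaration */
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) && defined(__linux__)
#include <asm/prctl.h>       /* ARCH_GET_FS */
#include <sys/syscall.h>     /* SYS_arch_prctl */
#include <unistd.h>          /* syscall */
#endif

/* Ask the kernel for the current thread's FS base.
 * Returns 0 if the call fails or the target is not x86-64 Linux. */
uintptr_t fs_base(void) {
#if defined(__x86_64__) && defined(__linux__)
    unsigned long base = 0;
    if (syscall(SYS_arch_prctl, ARCH_GET_FS, &base) != 0)
        return 0;
    return (uintptr_t)base;
#else
    return 0;
#endif
}

/* Read the first quadword of the fs area, i.e. the memory at %fs:0.
 * Returns 0 on targets other than x86-64 Linux. */
uintptr_t fs_first_quadword(void) {
#if defined(__x86_64__) && defined(__linux__)
    uintptr_t value;
    __asm__ volatile ("movq %%fs:0, %0" : "=r"(value));
    return value;
#else
    return 0;
#endif
}

/* If the kernel reported a base, the quadword at %fs:0 should match it. */
void check_fs_base(void) {
    uintptr_t base = fs_base();
    assert(base == 0 || base == fs_first_quadword());
}
```

In a debugger, GDB on x86-64 Linux also exposes the same value as the pseudo-register $fs_base, which may be the quicker route when inspecting a running process.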

paxdiablo answered Oct 16 '22 14:10