What does the following line mean:
...
401147: ff 24 c5 80 26 40 00 jmpq *0x402680(,%rax,8)
...
What does the asterisk in front of the memory address mean? Also, what does it mean when the memory access method is missing its first register value?
Usually it's something like (%register, %rax, 8), but in this case it doesn't have the first register.
Any tips?
The * marks an indirect operand for call or jump instructions in AT&T assembly syntax. It means the jump goes to the address held in the operand (a register or, as here, a memory location) rather than to the operand's own address. The alternative is a direct jump, which is normally encoded relative to the current instruction.
Operands can be immediate (that is, constant expressions that evaluate to an inline value), register (a value in the processor number registers), or memory (a value stored in memory).
Square brackets mean "the variable at the memory address stored in RAX".
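It's AT&T assembly syntax: mnemonic suffixes indicate the operand size (q for quad, etc.); registers are prefixed with % and immediate values with $; effective addresses are written DISP(BASE, INDEX, SCALE), meaning DISP + BASE + INDEX * SCALE; and indirect jump/call operands are marked with * (as opposed to direct).
So, you have a jmpq that jumps to the absolute address stored in memory at %rax * 8 + 0x402680, and that stored address is a quad word long.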
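The address arithmetic above can be checked with a small model. This is only a sketch in Python, not assembler output: the function name effective_address and the sample value %rax = 3 are assumptions made for illustration.

```python
def effective_address(disp=0, base=0, index=0, scale=1):
    """Return DISP + BASE + INDEX * SCALE, as in AT&T DISP(BASE, INDEX, SCALE)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale must be 1, 2, 4 or 8")
    # Addresses wrap around at 64 bits in 64-bit mode.
    return (disp + base + index * scale) & 0xFFFFFFFFFFFFFFFF

# 0x402680(,%rax,8) with a hypothetical %rax = 3.
# The base register is omitted, so it contributes 0.
addr = effective_address(disp=0x402680, index=3, scale=8)
print(hex(addr))  # 0x402698
```

The jump then loads its 8-byte target from that address rather than jumping to the address itself.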
AT&T syntax needed a way to distinguish RIP = foo (jmp foo) from RIP = load from some symbol address (jmp *foo). Remember that movl $1, foo is a store to the absolute address foo.
With other addressing modes, there's no ambiguity about what kind of jump / call you're doing: anything other than a bare label must be indirect. (GAS will infer that and warn about an indirect jump without * if you do jmp %rax or jmp 24(%rax) or anything other than a bare symbol name.)
(In 64-bit mode you'd normally use jmp *foo(%rip) to load a global variable into RIP, not a 32-bit absolute address like jmp *foo. But the possibility exists, and before x86-64, when AT&T syntax was designed, it was the normal way to do things.)
Actually this is a computed jump through a table, where 0x402680 is the address of the table and rax is the index of an 8-byte (qword) pointer in it.
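To illustrate the table lookup, here is a rough Python model of it. The handler addresses are hypothetical (the real ones are whatever the binary stores at 0x402680), and indirect_jump_target is a name made up for this sketch.

```python
import struct

TABLE_BASE = 0x402680  # table address taken from the instruction

# Hypothetical handler addresses; the real values live in the binary's table.
handlers = [0x401150, 0x401160, 0x401170, 0x401180]

# Lay the table out as consecutive 8-byte little-endian pointers,
# which is how x86-64 stores qword pointers in memory.
table_bytes = struct.pack("<4Q", *handlers)

def indirect_jump_target(rax):
    """Model jmpq *0x402680(,%rax,8): load the qword at TABLE_BASE + rax * 8."""
    address = TABLE_BASE + rax * 8
    offset = address - TABLE_BASE  # position of that address inside our table copy
    (target,) = struct.unpack_from("<Q", table_bytes, offset)
    return target

print(hex(indirect_jump_target(2)))  # 0x401170
```

Compilers commonly emit this pattern for dense switch statements, with one table entry per case.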
Getting things into Intel syntax always makes stuff clearer:
FF24C5 80264000 JMP QWORD PTR [RAX*8+402680]
It's a jump to an address contained in memory. The address is stored in memory at address rax*8+0x402680, where rax is the current rax value (when this instruction executes).
jmpq is just an unconditional jump to a given address. The 'q' means that we're dealing with quad words (64 bits long).
*0x402680(,%rax,8): This is a way to write an address in x86 assembly. You are correct in saying that usually there is a register before the first comma, but you still follow the same rules if no register is specified.
The format works this way:
D(reg1, reg2, scalingFactor)
where D stands for displacement. The displacement is just an integer constant. reg1 is the first, or base, register; reg2 is the second, or index, register; and scalingFactor is one of 1, 2, 4, or 8. You obtain the address by adding the values this way: Displacement + (value in reg1) + scalingFactor * (value in reg2). An omitted register contributes 0.
The asterisk in front of the address marks the jump as indirect: the jump target is not the computed address itself but the 8-byte value stored in memory at that address.
Hope this helps.
As Necrolis wrote, Intel syntax makes it a bit more obvious, but RTN is really clearer. The line
jmpq *0x402680(,%rax,8)
would be described in RTN by:
RIP <- M[0x402680 + (8 * RAX)]
where M is the system memory.
As such, we can write the general form jmpq *c(r1, r2, k), where c is a constant displacement, r1 and r2 are general purpose registers and k is either 1 (default), 2, 4 or 8:
RIP <- M[c + r1 + (k * r2)]
Minimal example
To make things clearer:
.data
# Store the address of the label in the data section.
symbol: .int label
.text
# Jumps to label.
jmp *symbol
label:
Without the *, it would jump to the address of symbol in the .data section and segfault.
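The difference between the two forms can be modelled in a few lines of Python. The addresses below are hypothetical, picked only so the two results are easy to tell apart, and jmp_direct / jmp_indirect are illustrative names, not real tools.

```python
# Hypothetical addresses chosen for illustration only.
SYMBOL = 0x600000  # address of `symbol` in .data
LABEL = 0x400010   # address of `label` in .text

# Memory model: the word stored at `symbol` is the address of `label`.
memory = {SYMBOL: LABEL}

def jmp_direct(target):
    """jmp symbol: RIP becomes the operand's address itself."""
    return target

def jmp_indirect(pointer):
    """jmp *symbol: RIP becomes the value loaded from that address."""
    return memory[pointer]

print(hex(jmp_direct(SYMBOL)))    # 0x600000 (inside .data, not code)
print(hex(jmp_indirect(SYMBOL)))  # 0x400010 (the address of label)
```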
I feel this syntax is a bit inconsistent, because for most instructions:
mov symbol, %eax
mov label, %eax
already moves the data at the address symbol, and $symbol is used for the address. Intel syntax is more consistent on this point as it always uses [] for dereference.
The * is of course a mnemonic for the C dereference operator *ptr.