x86 assembly has instruction suffixes such as l (long), w (word), and b (byte).
So I thought jmpl would be a long jmp.
But it behaved strangely when I assembled it:
Test1 jmp: assembly source and disassembly
main:
jmp main
eb fe jmp 0x0804839b <main>
Test2 jmpl: assembly source and disassembly
main:
jmpl main # added l suffix
ff 25 9b 83 04 08 jmp *0x0804839b
Compared to Test1, the Test2 result is unexpected.
I think it should assemble the same as Test1.
Question:
Is jmpl a different instruction in the 8086 design?
(According to here, jmpl in SPARC means "jump and link". Is it something like this?)
...Or is this just a bug in the GNU assembler?
An l operand-size suffix implies an indirect jmp, unlike with calll main which is still a relative near-call. This inconsistency is pure insanity in AT&T syntax design.
(And since you're using it with an operand like main, it becomes a memory-indirect jump, doing a data load from main and using that as the new EIP value.)
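To make the difference between the two encodings in the question concrete, here is a minimal Python sketch that decodes just those two byte sequences. The function name decode_jmp is made up for illustration, and it only handles 32-bit mode and these two forms (EB rel8, and FF with ModRM byte 0x25); it is not a general disassembler.

```python
import struct

def decode_jmp(code: bytes, addr: int) -> str:
    """Decode the two jmp encodings from Test1 and Test2 (32-bit mode only)."""
    if code[0] == 0xEB:
        # EB cb: short jump, signed 8-bit displacement relative to the
        # address of the *next* instruction.
        (rel8,) = struct.unpack("<b", code[1:2])
        target = (addr + len(code) + rel8) & 0xFFFFFFFF
        return f"jmp 0x{target:08x} (relative, rel8={rel8})"
    if code[0] == 0xFF and code[1] == 0x25:
        # FF /4 with ModRM 0x25: near indirect jump through a 32-bit
        # absolute address; the new EIP is *loaded from* that address.
        (disp32,) = struct.unpack("<I", code[2:6])
        return f"jmp *0x{disp32:08x} (indirect: loads new EIP from memory)"
    raise ValueError("encoding not covered by this sketch")

MAIN = 0x0804839B  # address of main in the question's disassembly
print(decode_jmp(bytes.fromhex("ebfe"), MAIN))          # Test1
print(decode_jmp(bytes.fromhex("ff259b830408"), MAIN))  # Test2
```

The first line decodes to a jump back to main itself; the second would fetch 4 bytes from main (the machine code of the jmp) and use them as the new EIP.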
You never need to use the jmpl mnemonic; you can and should indicate indirect jumps using * on the operand. Like jmp *%eax to set EIP = EAX, or jmp *4(%edi, %ecx, 4) to index a jump table, or jmp *func_pointer. Using jmpl is optional in all of these.
(You could use jmpw *%ax to truncate EIP to a 16-bit value. That assembles to 66 ff e0 jmpw *%ax)
Compare What is callq instruction? and What is the difference between retq and ret?: that's just the operand-size suffix behaving like you expected it would, same as plain call or plain ret. But jmp is different.
semi-related: far jmp or call (to a new CS:[ER]IP) in AT&T syntax is ljmp / lcall. These are very different.
It's also insane that GAS accepts jmpl main as equivalent to jmpl *main. It only warns instead of erroring.
$ gcc -no-pie -fno-pie -m32 jmp.s
jmp.s: Assembler messages:
jmp.s:3: Warning: indirect jmp without `*'
And then disassembling it to see what we got, with objdump -drwC a.out:
08049156 <main>: # corresponding source line (added by hand)
8049156: ff 25 56 91 04 08 jmp *0x8049156 # jmpl main
804915c: ff 25 56 91 04 08 jmp *0x8049156 # jmp *main
8049162: ff 25 56 91 04 08 jmp *0x8049156 # jmpl *main
08049168 <foo>:
8049168: e8 fb ff ff ff call 8049168 <foo> # calll foo
804916d: ff 15 68 91 04 08 call *0x8049168 # calll *foo
8049173: ff 15 68 91 04 08 call *0x8049168 # call *foo
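In the listing above, the indirect jmp and call forms share the opcode byte FF and differ only in the second (ModRM) byte: 25 vs. 15. A small Python sketch of how that byte splits into fields, as a rough illustration only: the helper names are hypothetical, and the "disp32" interpretation applies to 32-bit mode (in 64-bit mode the same mod/rm combination means RIP-relative).

```python
def modrm_fields(byte: int) -> tuple[int, int, int]:
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

# For opcode FF, the reg field selects the instruction (an opcode extension).
FF_GROUP = {2: "call (near, indirect)", 4: "jmp (near, indirect)"}

def describe_ff(modrm: int) -> str:
    mod, reg, rm = modrm_fields(modrm)
    op = FF_GROUP.get(reg, "other opcode-FF instruction")
    if mod == 3:
        operand = f"register #{rm}"            # e.g. rm=0 is AX/EAX
    elif mod == 0 and rm == 5:
        operand = "memory at disp32 (32-bit mode)"
    else:
        operand = "other memory addressing form"
    return f"{op}, operand: {operand}"

print(describe_ff(0x25))  # ff 25 ...: jmp *0x8049156
print(describe_ff(0x15))  # ff 15 ...: call *0x8049168
print(describe_ff(0xE0))  # ff e0: jmp *%eax (or jmpw *%ax with a 66 prefix)
```

Nothing in the encoding records whether the source said jmp, jmpl, or jmpl without the *; all three source lines produce the same FF 25 bytes, which is why the disassembly is identical.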
We get the same thing if we replace l with q in the source and build without -m32 (using the default -m64), including the same warning about a missing *. But the disassembly has an explicit jmpq and callq on every instruction. (Except for a relative direct jmp I added, which uses the jmp mnemonic in the disassembly.)
It's like objdump thinks 32-bit is the default operand-size for jmp/call in both 32 and 64-bit mode, so it wants to always use a q suffix in 64-bit, but leaves it implicit in 32-bit mode. Anyway, that's just disassembly choice between implicit / explicit size suffixes, no weirdness for a programmer writing source code.
Clang's built-in assembler does reject jmpl main, requiring jmpl *main.
$ clang -m32 jmp.s
jmp.s:3:8: error: invalid operand for instruction
jmpl main
^~~~
calll main is the same as call main. call *main and calll *main are both accepted for indirect calls.
YASM's GAS-syntax mode assembles jmpl main to a near relative jmp, like jmp main! So it disagrees with gcc/clang about jmpl implying indirect. (Very few people use YASM in GAS mode, and these days its maintenance hasn't kept up with NASM for new instructions like AVX512. I like YASM's good defaults for long NOPs, but otherwise I'd recommend NASM.)
You have fallen victim to the awfulness that is AT&T syntax.
x86 assembly design has instruction suffix, such as l(long), w(word), b(byte).
No, it doesn't. The abomination that is AT&T syntax has this.
In the sane Intel syntax there are no such suffixes.
Is jmpl something different.
Yes, this is an indirect jump to an absolute address. A -near- jump to a -long- address.
(ljmp in GNU syntax is a -far- jump, but that's totally different, setting a new CS:EIP.)
The default for a jump is a near jump, to a relative address.
Note that the Intel syntax for this jump is:
jmp dword [ds:0x0804839b] //note the [] specifying the indirectness.
//or, this is the same
jmp [0x0804839b]
//or
jmp [main]
//or
jmp DWORD PTR ds:0x804839b //the PTR makes it indirect.
I prefer the [], to highlight the indirectness.
It does not jump to 0x0804839b, but reads a dword from the specified address and then jumps to the address specified in this dword. In the Intel syntax the indirectness is explicit.
Of course you intended to jump to 0x0804839b (aka main:) directly, which is done by:
Hm, most assemblers do not allow absolute far jumps!
It cannot be done.
See also: How to code a far absolute JMP/CALL instruction in MASM?
A near/short relative jump is (almost) always better, because it will still be valid when your code changes; the long jump can become invalid. Also shorter instructions are usually better, because they occupy less space in the instruction cache. The assembler (in Intel mode) will automatically select the correct jmp encoding for you.
SPARC
This is a totally different processor than the x86. From a different manufacturer, using a different paradigm. Obviously the SPARC documentation bears no relation to the x86 docs.
An HTML copy of the official Intel documentation for jmp is here:
https://www.felixcloutier.com/x86/jmp
Note that Intel does not specify different mnemonics for the relative and absolute forms of jmp. This is because Intel wants the assembler to always use the short (relative) jump unless the target is too far away, in which case the near jmp rel32 encoding is used. (Or in 16-bit mode, jmp foo could assemble to a far absolute jump to a different CS value, aka segment. In 32-bit mode, a relative jmp rel32 can reach any other EIP value from anywhere.)
The beauty of this is that the assembler automatically uses the proper jump for you.
(In 64-bit mode jumping more than +-2GiB requires extra instructions or a pointer in memory; there is no 64-bit absolute direct far jump, so the assembler can't do this for you automatically.)
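The reachability claims above can be checked with a little arithmetic. This is a rough Python sketch, not how any particular assembler is implemented; rel32_for is a hypothetical helper that returns the signed 32-bit displacement a relative jmp or call would need, or None when no rel32 can reach the target.

```python
def rel32_for(next_insn: int, target: int, bits: int = 32):
    """Displacement from the end of the jump instruction to the target.

    next_insn is the address of the instruction *after* the jump, since
    relative displacements are measured from there.
    """
    if bits == 32:
        # EIP arithmetic wraps modulo 2**32, so every target is reachable.
        delta = (target - next_insn) & 0xFFFFFFFF
        if delta >= 0x80000000:
            delta -= 1 << 32        # reinterpret as signed
        return delta
    # 64-bit mode: rel32 is sign-extended, so only +-2GiB is reachable.
    delta = target - next_insn
    return delta if -(1 << 31) <= delta < (1 << 31) else None

# The "calll foo" line from the GAS listing: e8 fb ff ff ff at 0x8049168,
# next instruction at 0x804916d, target 0x8049168.
print(rel32_for(0x804916D, 0x8049168))                     # -5, i.e. fb ff ff ff

# 32-bit mode: a target near the top of the address space is still in range.
print(rel32_for(0x1000, 0xFFFFF000))                       # -8192 (wraps around)

# 64-bit mode: exactly 2GiB forward is one byte too far for rel32.
print(rel32_for(0x400000, 0x400000 + (1 << 31), bits=64))  # None
```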
Forcing gnu back to sanity
You can use
.intel_syntax noprefix <<-- as the first line in your assembly
mov eax,[eax+100+ebx*2]
....
This makes GNU as use Intel syntax, putting things back the way Intel designed them and away from the PDP-11-derived syntax that GNU uses by default.