When writing assembly for x86 or amd64, a programmer can use either "Intel" syntax (e.g. with the nasm assembler) or "AT&T" syntax (e.g. with the gas assembler). "Intel" syntax is more popular on Windows, but "AT&T" is more popular on UNIX(-like) systems.
But the manuals from both Intel and AMD, that is, the manuals written by the creators of the chips, use the "Intel" syntax.
I'm wondering, what was the original idea behind the design of the "AT&T" syntax? What was the benefit of moving away from the notation used by the creators of the processor?
UNIX was for a long time developed on the PDP-11, a 16-bit computer from DEC, which had a fairly simple instruction set. Nearly every instruction has two operands, each of which can use one of the following eight addressing modes, here shown in the MACRO-11 assembly language:
0n  Rn      register
1n  (Rn)    deferred
2n  (Rn)+   autoincrement
3n  @(Rn)+  autoincrement deferred
4n  -(Rn)   autodecrement
5n  @-(Rn)  autodecrement deferred
6n  X(Rn)   index
7n  @X(Rn)  index deferred
Immediates and direct addresses can be encoded by cleverly re-using some addressing modes on R7, the program counter:
27  #imm   immediate
37  @#imm  absolute
67  addr   relative
77  @addr  relative deferred
As the UNIX tty driver used @ and # as control characters, $ was substituted for # and * for @.
The first operand in a PDP-11 instruction word refers to the source operand while the second operand refers to the destination. This is reflected in the assembly language's operand order, which is source, then destination. For example, the opcode 011203 refers to the instruction mov (R2),R3, which moves the word pointed to by R2 to R3.
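The packing of the source and destination fields can be sketched in a few lines of Python. This is only an illustration built from the addressing-mode table above; the helper names `operand` and `encode_double_operand` are made up for this example.

```python
# Hypothetical helper names; the field layout follows the addressing-mode table above.
MOV = 0o01  # opcode field of the word-sized mov instruction


def operand(mode: int, reg: int) -> int:
    """Pack a 3-bit addressing mode and a 3-bit register number into 6 bits."""
    assert 0 <= mode <= 7 and 0 <= reg <= 7
    return (mode << 3) | reg


def encode_double_operand(opcode: int, src: int, dst: int) -> str:
    """Return the 16-bit instruction word as a six-digit octal string."""
    word = (opcode << 12) | (src << 6) | dst
    return format(word, "06o")


# mov (R2),R3: source is mode 1 (deferred) on R2,
# destination is mode 0 (register) on R3.
print(encode_double_operand(MOV, operand(1, 2), operand(0, 3)))  # 011203
```

Reading the octal digits from left to right gives the opcode, then the source mode and register, then the destination mode and register, which is the same order as in the assembly text.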
This syntax was adapted to the 8086 CPU and its addressing modes:
mr0  X(bx,si)  bx + si indexed
mr1  X(bx,di)  bx + di indexed
mr2  X(bp,si)  bp + si indexed
mr3  X(bp,di)  bp + di indexed
mr4  X(si)     si indexed
mr5  X(di)     di indexed
mr6  X(bp)     bp indexed
mr7  X(bx)     bx indexed
3rR  R         register
0r6  addr      direct
Where m is 0 if there is no index, m is 1 if there is a one-byte index, m is 2 if there is a two-byte index and m is 3 if instead of a memory operand, a register is used. If two operands exist, the other operand is always a register and encoded in the r digit. Otherwise, r encodes another three bits of the opcode.
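The three digits in the table above can be read as the octal digits of a single byte (what Intel's manuals call the ModR/M byte). A minimal Python sketch of that packing, with a made-up function name and assuming the 2-bit/3-bit/3-bit layout implied by the digit ranges:

```python
def modrm_octal(m: int, r: int, rm: int) -> str:
    """Pack the fields into one byte (2 bits for m, 3 for r, 3 for the
    last digit) and return it as three octal digits."""
    assert 0 <= m <= 3 and 0 <= r <= 7 and 0 <= rm <= 7
    return format((m << 6) | (r << 3) | rm, "03o")


# X(bx) with a one-byte index and register field 1: m=1, r=1, last digit 7
print(modrm_octal(1, 1, 7))  # 117

# register operand number 3 with register field 0: m=3, r=0, last digit 3
print(modrm_octal(3, 0, 3))  # 303
```

Writing the byte in octal rather than hexadecimal is what makes the three fields line up with individual digits.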
Immediates aren't possible in this addressing scheme; all instructions that take immediates encode that fact in their opcode. Immediates are spelled $imm, just like in the PDP-11 syntax.
While Intel always used a dst, src operand ordering for its assembler, there was no particularly compelling reason to adopt this convention, and the UNIX assembler was written to use the src, dst operand ordering known from the PDP-11.
The implementation of the 8087 floating-point instructions is somewhat inconsistent with this ordering, possibly because Intel gave the two possible directions of non-commutative floating-point instructions different mnemonics, which do not match the operand ordering used by AT&T syntax.
The PDP-11 instructions jmp (jump) and jsr (jump to subroutine) jump to the address of their operand. Thus, jmp foo would jump to foo and jmp *foo would jump to the address stored in the variable foo, similar to how lea works on the 8086.
The syntax for the x86's jmp and call instructions was designed as if these instructions worked as on the PDP-11, which is why jmp foo jumps to foo and jmp *foo jumps to the value at address foo, even though the 8086 doesn't actually have deferred addressing. This has the advantage of syntactically distinguishing direct jumps from indirect jumps without requiring a $ prefix for every direct jump target, but it doesn't make a lot of sense logically.
The syntax was expanded to specify segment prefixes using a colon:
seg:addr
When the 80386 was introduced, this scheme was adapted to its new SIB addressing modes using a four-part generic addressing mode:
disp(base,index,scale)
where disp is a displacement, base is a base register, index is an index register, and scale is 1, 2, 4, or 8 to scale the index register by one of these amounts. This is equivalent to the Intel syntax:
[disp+base+index*scale]
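To make the correspondence concrete, here is a rough Python sketch that rewrites an AT&T memory operand into the Intel bracket form. The function name `att_to_intel` and the regular expression are illustrative assumptions, not part of any assembler, and segment prefixes are deliberately not handled.

```python
import re

# Matches disp(base,index,scale); every part except the parentheses is optional.
_ATT_MEM = re.compile(
    r"^(?P<disp>[^()]*)"
    r"\((?P<base>[^,()]*)"
    r"(?:,(?P<index>[^,()]*)(?:,(?P<scale>[1248]))?)?\)$"
)


def att_to_intel(operand: str) -> str:
    """Rewrite an AT&T memory operand as an Intel-style bracket expression."""
    m = _ATT_MEM.match(operand.replace(" ", ""))
    if m is None:
        raise ValueError(f"not a disp(base,index,scale) operand: {operand!r}")
    parts = []
    if m["disp"]:
        parts.append(m["disp"])
    if m["base"]:
        parts.append(m["base"].lstrip("%"))  # Intel syntax has no % prefix
    if m["index"]:
        index = m["index"].lstrip("%")
        scale = m["scale"] or "1"  # a missing scale means 1
        parts.append(index if scale == "1" else f"{index}*{scale}")
    return "[" + "+".join(parts) + "]"


print(att_to_intel("8(%ebx,%esi,4)"))  # [8+ebx+esi*4]
print(att_to_intel("(%eax)"))          # [eax]
```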
Another remarkable feature of the PDP-11 is that most instructions are available in a byte and a word variant. Which one you use is indicated by a b or w suffix to the opcode, which directly toggles the first bit of the opcode:
010001  movw r0,r1
110001  movb r0,r1
This was also adapted for AT&T syntax, as most 8086 instructions are indeed also available in a byte mode and a word mode. Later, the 80386 and the AMD K8 introduced 32-bit instructions (suffixed l for long) and 64-bit instructions (suffixed q for quad), respectively.
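The PDP-11 byte/word toggle mentioned above amounts to setting the most significant bit of the 16-bit instruction word. A small Python sketch, using the two opcodes from the listing and a made-up helper name:

```python
BYTE_BIT = 0o100000  # most significant bit of the 16-bit instruction word


def to_byte_variant(word: int) -> int:
    """Turn a word-sized PDP-11 instruction into its byte-sized variant."""
    return word | BYTE_BIT


movw = 0o010001                # movw r0,r1
movb = to_byte_variant(movw)   # movb r0,r1
print(format(movw, "06o"), format(movb, "06o"))  # 010001 110001
```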
Last but not least, the original convention was to prefix C language symbols with an underscore (as is still done on Windows) so that you can distinguish a C function named ax from the register ax. When Unix System Laboratories developed the ELF binary format, they decided to get rid of this decoration. As there would otherwise be no way to distinguish a direct address from a register, a % prefix was added to every register:
mov direct,%eax # move memory at direct to %eax
And that's how we got today's AT&T syntax.