
NASM Vs GAS (Practical differences)

I'm not trying to prompt an Intel vs AT&T war (a moot point anyway, now that they both support Intel syntax) or ask which one is "better" per se. I just want to know the practical differences in choosing one or the other.

Basically, when I was picking up some basic x86 assembly a few years back, I used NASM for no reason other than that the book I was reading did too -- which put me firmly but involuntarily in the NASM camp. Since then, I've had little cause to use assembly, so I haven't had the opportunity to try GAS.

Bearing in mind that they both support Intel syntax (which I personally prefer) and should, theoretically at least, produce the same binary (I know they probably won't but the meaning shouldn't be changed), what are the reasons to favour one or the other?

Is it command line options? Macros? Non-mnemonic keywords? Or something else?

Thanks :)

Elliott asked Dec 10 '12 00:12




2 Answers

NASM actually uses its own variation of Intel syntax, different from the MASM syntax used in Intel's official documentation. The opcode names and operand order are the same as in Intel's, so the instructions look the same at first glance, but any significant program will have differences. For example, with MASM the instruction generated for MOV ax, foo depends on the type of foo: a load from memory if foo is a variable, a move immediate if it is a constant. NASM doesn't have types, so the same line always assembles to a move immediate instruction (to load the contents of memory at foo you write MOV ax, [foo]). When the size of an operand can't be determined implicitly, MASM requires something like DWORD PTR where NASM uses DWORD to mean the same thing. Most of the syntax beyond the instruction mnemonics and the basic operand format and ordering is different.

In terms of functionality NASM and GAS are pretty much the same. Both have assembler macro facilities, though NASM's is more extensive and more mature. Many GAS source code files use the C preprocessor instead of GAS's own macro support.

The biggest difference between the two assemblers is their support for 16-bit code. GAS doesn't have any support for defining x86 segments. With GAS you're limited to creating simple single-segment 16-bit binary images, basically just boot sectors and .COM files. NASM has full support for segments and supports OMF format object files which you can use with a suitable linker to create segmented 16-bit executables.

In addition to the OMF object file format, NASM supports a number of formats that GAS doesn't. GAS normally only supports the native format for the machine it's running on, basically ELF, PE-COFF, or Mach-O. If you want to support a different format you need to build a "cross-compiling" version of GAS for that format.

Another notable difference is that GAS has support for creating DWARF and Windows 64-bit unwind information (the latter required by the Windows x64 ABI), while with NASM you have to create the sections and fill in the data yourself.

Ross Ridge answered Sep 20 '22 17:09


Intel Syntax: mov eax, 1 (instruction destination, source)

AT&T Syntax: movl $1, %eax (instruction source, destination)

The Intel syntax is pretty self-explanatory. In the above example, the amount of data moved is inferred from the size of the register (32 bits in the case of eax). The addressing mode used is inferred from the operands themselves.

There are some quirks when it comes to the AT&T syntax. Firstly, notice the l suffix at the end of the mov instruction: it stands for long and signifies 32 bits of data. Other instruction suffixes include w for a word (16 bits - not to be confused with the word size of your CPU!), q for a quad-word (64 bits) and b for a single byte. The suffix is not always required, but assembly code written in AT&T syntax typically states explicitly the amount of data each instruction operates on.

The addressing mode of each operand must also be spelled out. $ signifies immediate addressing, i.e. the value is encoded in the instruction itself. If the above example were written without the $, direct addressing would be used: the CPU would try to fetch the value at memory address 1 (which will more than likely result in a segmentation fault). % signifies register addressing; without it, eax would be treated as a symbol, i.e. a labelled memory address, which would more than likely result in an undefined reference at link time. So you must be explicit about the addressing mode of both the source and the destination operand.
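The rules above (swapped operand order, size suffix, $ for immediates, % for registers) can be sketched as a toy translator. This is only an illustration in Python, not part of either assembler: the names intel_to_att, att_operand, REGISTER_BITS and SUFFIX_FOR_BITS are made up for this example, and it handles only two-operand instructions whose operands are plain registers or integer immediates.

```python
import re

# Registers this toy translator knows about, mapped to their width in bits.
# (Hypothetical table for illustration; real assemblers know many more.)
REGISTER_BITS = {
    "al": 8, "bl": 8, "cl": 8, "dl": 8,
    "ax": 16, "bx": 16, "cx": 16, "dx": 16,
    "eax": 32, "ebx": 32, "ecx": 32, "edx": 32,
    "rax": 64, "rbx": 64, "rcx": 64, "rdx": 64,
}

# AT&T size suffixes: byte, word, long, quad-word.
SUFFIX_FOR_BITS = {8: "b", 16: "w", 32: "l", 64: "q"}


def att_operand(operand):
    """Spell one Intel-syntax register or immediate operand the AT&T way."""
    operand = operand.strip()
    if operand.lower() in REGISTER_BITS:
        return "%" + operand.lower()          # % marks a register
    if re.fullmatch(r"-?(0x[0-9a-fA-F]+|\d+)", operand):
        return "$" + operand                  # $ marks an immediate
    raise ValueError(f"unsupported operand: {operand!r}")


def intel_to_att(line):
    """Turn 'mnemonic dest, src' (Intel) into 'mnemonic+suffix src, dest' (AT&T)."""
    mnemonic, _, rest = line.strip().partition(" ")
    dest, src = [part.strip() for part in rest.split(",")]
    # Infer the size suffix from the first register operand, destination first.
    for candidate in (dest, src):
        bits = REGISTER_BITS.get(candidate.lower())
        if bits is not None:
            break
    else:
        raise ValueError("need at least one register operand to infer the size")
    suffix = SUFFIX_FOR_BITS[bits]
    return f"{mnemonic.lower()}{suffix} {att_operand(src)}, {att_operand(dest)}"


print(intel_to_att("mov eax, 1"))    # movl $1, %eax
print(intel_to_att("add rbx, rax"))  # addq %rax, %rbx
```

Memory operands are deliberately left out of this sketch.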

The way memory operands are specified is also different:

Intel: [base register + index register * scale + offset]

AT&T: offset(base register, index register, scale)

The Intel syntax makes it a little clearer what calculation is taking place to find the memory address. With the AT&T syntax the result is the same, but you are expected to know the calculation taking place.
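To make the calculation concrete, here is a small Python sketch of the address both notations describe, plus helpers that render the same operand in each spelling. The function names are invented for this example, and the formatting helpers assume a non-negative offset.

```python
def effective_address(base, index, scale, displacement):
    """Address both notations describe: base + index * scale + displacement."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors of 1, 2, 4 or 8")
    return base + index * scale + displacement


def intel_memory_operand(base_reg, index_reg, scale, displacement):
    """Intel spelling: the arithmetic is written out inside the brackets."""
    return f"[{base_reg} + {index_reg}*{scale} + {displacement}]"


def att_memory_operand(base_reg, index_reg, scale, displacement):
    """AT&T spelling: the same four parts, in a fixed positional order."""
    return f"{displacement}(%{base_reg},%{index_reg},{scale})"


print(intel_memory_operand("rbx", "rcx", 4, 8))  # [rbx + rcx*4 + 8]
print(att_memory_operand("rbx", "rcx", 4, 8))    # 8(%rbx,%rcx,4)

# With rbx = 0x1000 and rcx = 3, both operands refer to the same address.
print(hex(effective_address(0x1000, 3, 4, 8)))   # 0x1014
```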

should, theoretically at least, produce the same binary

This is entirely dependent on your toolchain.

what are the reasons to favour one or the other?

Personal preference, of course. In my opinion it comes down to which syntax you feel more comfortable with when addressing memory. Do you prefer the forced explicitness of the AT&T syntax? Or do you prefer your assembler figuring out these low-level minutiae for you?

Is it command line options? Macros? Non-mnemonic keywords?

This has to do with the assembler (GAS, NASM) itself. Again, personal preference.

uname01 answered Sep 20 '22 17:09