
What's the difference between the .asciz and the .string assembler directives?

I know that the .ascii directive doesn't put a null character at the end of the string, as the .asciz directive is used for that purpose. However, I don't know whether the .string directive puts a null character at the end of the string.

If it does append the null character, then what's the difference between the .asciz and the .string directives? To me, having both .asciz and .string seems redundant.

Aadit M Shah asked Apr 26 '16 02:04



1 Answer

According to the binutils docs:

.ascii "string" (Here for completeness)

.ascii expects zero or more string literals separated by commas. It assembles each string (with no automatic trailing zero byte) into consecutive addresses.

.asciz "string"

.asciz is just like .ascii, but each string is followed by a zero byte. The “z” in ‘.asciz’ stands for “zero”.
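
To make the difference concrete, here is a small Python model of the bytes the two directives emit when given several strings. It only illustrates the documented behavior and is not how gas is implemented; the function names `ascii_bytes` and `asciz_bytes` are made up for this sketch, and it assumes plain ASCII input with no escape sequences.

```python
def ascii_bytes(*strings):
    """Model of .ascii: concatenate the strings, no terminator added."""
    return b"".join(s.encode("ascii") for s in strings)


def asciz_bytes(*strings):
    """Model of .asciz: each string is followed by one zero byte."""
    return b"".join(s.encode("ascii") + b"\x00" for s in strings)


# Equivalent of: .ascii "ab", "cd"
print(ascii_bytes("ab", "cd").hex(" "))   # 61 62 63 64
# Equivalent of: .asciz "ab", "cd"
print(asciz_bytes("ab", "cd").hex(" "))   # 61 62 00 63 64 00
```

Note that the zero byte follows every string in the list, not just the last one.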

.string "str", .string8 "str", .string16 "str", .string32 "str", .string64 "str"

Copy the characters in str to the object file. You may specify more than one string to copy, separated by commas. Unless otherwise specified for a particular machine, the assembler marks the end of each string with a 0 byte.

...

The variants string16, string32 and string64 differ from the string pseudo opcode in that each 8-bit character from str is copied and expanded to 16, 32 or 64 bits respectively. The expanded characters are stored in target endianness byte order.
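
The widening described above can be sketched in Python as follows. This is a rough model rather than the assembler's actual code: `string_bytes` and its parameters are hypothetical names, the input is assumed to be plain ASCII without escapes, and the sketch assumes the terminator is widened to the same size as the characters.

```python
def string_bytes(s, width_bits=8, byteorder="little", terminate=True):
    """Model of .string / .string16 / .string32 / .string64 for one string.

    Each 8-bit character is widened to width_bits and stored in the given
    byte order. `terminate` models the default trailing zero; it is assumed
    here to be as wide as one character.
    """
    size = width_bits // 8
    out = b"".join(ch.to_bytes(size, byteorder) for ch in s.encode("ascii"))
    if terminate:
        out += (0).to_bytes(size, byteorder)
    return out


# Equivalent of: .string16 "Hi" on a little-endian target
print(string_bytes("Hi", 16, "little").hex(" "))  # 48 00 69 00 00 00
# The same directive on a big-endian target
print(string_bytes("Hi", 16, "big").hex(" "))     # 00 48 00 69 00 00
```

Passing `terminate=False` models a target on which .string does not append the zero.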

They all support escape sequences and accept multiple arguments. As for the difference between .string and .asciz:

  • On certain architectures, .string does not add the null byte, whereas .asciz always does. To test your own system, you can run this:
    • echo '.string ""' | gcc -c -o stdout.o -xassembler -; objdump -sj .text stdout.o
    • If the first byte is 00, the null character was inserted. If the section has no contents at all, it was not.
  • .string also has suffixes to expand characters to certain widths (16, 32, or 64), but by default it is 8.

As stated in the comments to the question, in most cases, there is no difference other than semantics, but technically, the two pseudo-ops are different.

Addendum:

As it turns out, the docs do mention two architectures that behave differently:

  • HPPA (HP Precision Architecture) - .string does not add the zero byte, but there is a special .stringz directive that does.
  • TI-C54X (a DSP family from Texas Instruments) - zero-fills the upper 8 bits of each word (2 bytes). It has a related .pstring directive that packs the characters and zero-fills unused space.

Digging through the source code in the gas/config folder, we can confirm this and find one more:

  • IA-64 (Intel Itanium architecture) - .string and .stringz behave as they do on HPPA.
General Grievance answered Oct 18 '22 23:10
