
What is the difference between .rodata and .rodata.str1.4 section in compiled output for string literals?

Tags: c, compilation

For example when I have:

const char mesg [] = "Hello World";

it is put directly in .rodata, but when I have:

const char* mesg = "Hello World";

it is put directly in .rodata.str1.4

What is the difference between the two sections, and why is .rodata.str1.4 used when the string is referenced through a pointer?

mehmetozer asked Jan 10 '13 09:01


2 Answers

I ran a couple of experiments. It looks like the compiler places string literals in special sections in the object files. The interesting part happens when the binary is linked: the strings end up in .rodata, as expected. Further experiments show that if the same string appears in different object files, the copies are unified into a single string in the resulting binary.

So I suspect the reason is that the compiler wants to tell the linker more about this read-only data than just "it's read-only", so that the final link can make more intelligent decisions about how to handle it, including deduplication. (On ELF targets, such string sections carry the SHF_MERGE and SHF_STRINGS flags, which mark their contents as mergeable NUL-terminated strings.)

$ cat foo.c
const char *
fun(int i)
{
    const char *foo = "foofoo foo foo foo";
    const char *bar = "barbar bar bar bar";
    return i ? foo : bar;
}
$ cat bar.c
#include <stdio.h>
extern const char *fun(int);

int
main(int argc, char **argv)
{
    const char *foo = "foofoo foo foo foo";

    printf("%s%s\n", foo, fun(1));
    return 0;
}
$ cc -c -O2 foo.c
$ cc -c -O2 bar.c
$ objdump -s foo.o
[...]
Contents of section .rodata.str1.1:
 0000 62617262 61722062 61722062 61722062  barbar bar bar b
 0010 61720066 6f6f666f 6f20666f 6f20666f  ar.foofoo foo fo
 0020 6f20666f 6f00                        o foo.
[...]
$ objdump -s bar.o
[...]
Contents of section .rodata.str1.1:
 0000 666f6f66 6f6f2066 6f6f2066 6f6f2066  foofoo foo foo f
 0010 6f6f0025 7325730a 00                 oo.%s%s..
[...]
$ cc -o foobar foo.o bar.o
$ objdump -s foobar
[...]
Contents of section .rodata:
 400608 01000200 00000000 00000000 00000000  ................
 400618 62617262 61722062 61722062 61722062  barbar bar bar b
 400628 61720066 6f6f666f 6f20666f 6f20666f  ar.foofoo foo fo
 400638 6f20666f 6f002573 25730a00           o foo.%s%s..
[...]
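
To make the deduplication in the transcript above concrete, here is a minimal conceptual sketch in C of merging NUL-terminated strings from input sections into one table. It is not how ld or any real linker is implemented; `merge_section`, `table_contains`, and `MERGE_ERROR` are hypothetical names chosen for illustration, and real linkers may also merge a string into the tail of a longer one, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error value: returned when the input is malformed or the
 * output table is too small. */
#define MERGE_ERROR ((size_t)-1)

/* Return 1 if the NUL-terminated string s already appears as a whole entry
 * in the table (table_len bytes of back-to-back NUL-terminated strings). */
static int table_contains(const char *table, size_t table_len, const char *s)
{
    size_t pos = 0;

    while (pos < table_len) {
        if (strcmp(table + pos, s) == 0)
            return 1;
        pos += strlen(table + pos) + 1;   /* skip the entry and its NUL */
    }
    return 0;
}

/* Append every string from `section` that is not yet in `table`.
 * `section` has the layout of a .rodata.str1.1 section: back-to-back
 * NUL-terminated strings, section_len bytes in total.
 * Returns the new table length, or MERGE_ERROR. */
size_t merge_section(char *table, size_t table_len, size_t table_cap,
                     const char *section, size_t section_len)
{
    size_t pos = 0;

    assert(table_len <= table_cap);

    /* The last byte must be a NUL, otherwise strlen could run off the end. */
    if (section_len > 0 && section[section_len - 1] != '\0')
        return MERGE_ERROR;

    while (pos < section_len) {
        const char *s = section + pos;
        size_t n = strlen(s) + 1;         /* string plus its terminator */

        if (!table_contains(table, table_len, s)) {
            if (n > table_cap - table_len)
                return MERGE_ERROR;
            memcpy(table + table_len, s, n);
            table_len += n;
        }
        pos += n;
    }
    return table_len;
}
```

Feeding it the 38 bytes of foo.o's .rodata.str1.1 and then the 25 bytes of bar.o's gives a 44-byte table, the same size as the string data at 0x400618 through 0x400643 in the linked foobar above, because the second copy of "foofoo foo foo foo" is dropped.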
Art answered Nov 02 '22 02:11


Different compilers may use different sections for read-only data, depending on the type, declarations, etc.

.rodata, by convention, may be used for anything that the loader needs to place in a read-only portion of memory.
So it would have been fine to place the string literal behind the const char * there.

But compilers usually also generate sections whose names are prefixed with .rodata, to categorize read-only data.
The linker may simply ignore the distinction and treat such a section exactly like .rodata (I think this is the most common case), but the separate name allows some specific arrangement in memory, if needed.

This is why linker scripts often specify both .rodata and .rodata*.
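
For the two declarations in the question, the "type, declarations" difference can be sketched as follows. The names `mesg_array`, `mesg_ptr`, and `check_layout` are illustrative, and the section placements mentioned in the comments describe typical GCC output on ELF targets rather than a guarantee; running `objdump -s` on your own object file is the way to confirm.

```c
#include <assert.h>
#include <string.h>

/* Array: the 12 bytes "Hello World\0" ARE the object named mesg_array.
 * A named const object like this is typically emitted into plain .rodata. */
const char mesg_array[] = "Hello World";

/* Pointer: mesg_ptr is a separate, writable pointer object (typically in
 * .data). The bytes it points to are an anonymous string literal, which the
 * compiler is free to share with identical literals, so GCC typically puts
 * it in a mergeable section such as .rodata.str1.N. */
const char *mesg_ptr = "Hello World";

void check_layout(void)
{
    /* The array's size is the string length plus the terminating NUL. */
    assert(sizeof mesg_array == 12);

    /* The pointer's size is just the size of a pointer. */
    assert(sizeof mesg_ptr == sizeof(char *));

    /* Both name the same characters, stored in different places. */
    assert(strcmp(mesg_array, mesg_ptr) == 0);

    /* The pointer itself is not const, so it can be reseated. */
    mesg_ptr = mesg_array;
    assert(mesg_ptr == mesg_array);
}
```

As far as I know, GCC derives the numeric suffix from the character size and the alignment in bytes, so .rodata.str1.4 would hold 1-byte characters aligned to 4 bytes, while the .rodata.str1.1 seen in the other answer is byte-aligned.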

Macmade answered Nov 02 '22 00:11