For example, when I have:
const char mesg [] = "Hello World";
the string is put directly in .rodata,
but when I have:
const char* mesg = "Hello World";
the string is put in .rodata.str1.4 instead.
What is the difference between the two, and why is .rodata.str1.4 used when we use the pointer?
I did a couple of experiments, and it looks like the compiler places string literals in special sections in the object files. The interesting part happens when the binary is linked: the strings end up in .rodata, as expected. Further experiments show that if the same string appears in different object files, the copies are unified into a single string in the resulting binary.
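So I suspect the reason is that the compiler wants to tell the linker more about the read-only data than just "it's read-only", so that the final link can make more intelligent decisions about how to handle it, including deduplication. As far as I know, GCC marks these sections as mergeable string sections (ELF flags SHF_MERGE and SHF_STRINGS), and the numeric suffix encodes the character size and the alignment in bytes, so .rodata.str1.4 would hold 1-byte characters aligned to 4 bytes.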
$ cat foo.c
const char *
fun(int i)
{
    const char *foo = "foofoo foo foo foo";
    const char *bar = "barbar bar bar bar";
    return i ? foo : bar;
}
$ cat bar.c
#include <stdio.h>
extern const char *fun(int);
int
main(int argc, char **argv)
{
    const char *foo = "foofoo foo foo foo";
    printf("%s%s\n", foo, fun(1));
    return 0;
}
$ cc -c -O2 foo.c
$ cc -c -O2 bar.c
$ objdump -s foo.o
[...]
Contents of section .rodata.str1.1:
0000 62617262 61722062 61722062 61722062  barbar bar bar b
0010 61720066 6f6f666f 6f20666f 6f20666f  ar.foofoo foo fo
0020 6f20666f 6f00                        o foo.
[...]
$ objdump -s bar.o
[...]
Contents of section .rodata.str1.1:
0000 666f6f66 6f6f2066 6f6f2066 6f6f2066  foofoo foo foo f
0010 6f6f0025 7325730a 00                 oo.%s%s..
[...]
$ cc -o foobar foo.o bar.o
$ objdump -s foobar
[...]
Contents of section .rodata:
400608 01000200 00000000 00000000 00000000  ................
400618 62617262 61722062 61722062 61722062  barbar bar bar b
400628 61720066 6f6f666f 6f20666f 6f20666f  ar.foofoo foo fo
400638 6f20666f 6f002573 25730a00           o foo.%s%s..
[...]
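To connect this back to the two declarations in the question, here is a minimal sketch. It is not part of the experiment above, and the names mesg_array, mesg_ptr and the three helper functions are made up for illustration. It shows that the array form defines one named object that holds the bytes, while the pointer form involves an anonymous string literal plus a pointer variable. Only the anonymous literal is a natural candidate for a mergeable string section, because the C standard leaves it unspecified whether identical literals occupy distinct storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Array form: the 12 bytes (11 characters plus the terminating NUL) are the
 * object itself. It is a named object, so its address is significant. */
const char mesg_array[] = "Hello World";

/* Pointer form: two things exist, the anonymous string literal and a
 * (writable) pointer variable that holds the literal's address. */
const char *mesg_ptr = "Hello World";

/* Size of the array object: the text plus its terminating NUL. */
size_t array_object_size(void)
{
    return sizeof mesg_array;
}

/* Size of the pointer variable, which is unrelated to the text length. */
size_t pointer_object_size(void)
{
    return sizeof mesg_ptr;
}

/* Returns 1 if two identical literals ended up at the same address and 0
 * otherwise. The C standard leaves this unspecified, so both are valid. */
int literals_share_storage(void)
{
    const char *a = "Hello World";
    const char *b = "Hello World";

    assert(strcmp(a, b) == 0); /* the contents always match */
    return a == b;
}
```

Calling literals_share_storage() from a small main and printing the result is a quick way to see what a particular compiler and optimization level did. Within a single source file it is the compiler, not the linker, that may pool the two literals, so this complements the cross-object merge shown in the transcript rather than reproducing it.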
Different compilers may use different sections for read-only data, depending on the type, the declaration, and so on.
By convention, .rodata may be used for anything that needs to end up in a read-only portion of memory once the program is loaded.
So it would have been fine to place the string literal behind the const char * there as well.
Usually, though, compilers also generate additional sections whose names are prefixed with .rodata, to categorize read-only data.
The linker may simply ignore the distinction and treat such a section exactly like .rodata (I think that is the most common case), but the separate name allows a specific arrangement in memory if one is needed.
This is why linker scripts often match both .rodata and .rodata*, where the wildcard picks up names such as .rodata.str1.4.
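As a rough illustration of a .rodata-prefixed section, the sketch below asks the compiler to put a constant table into a section of its own. It assumes GCC or Clang on an ELF target (the section attribute is a compiler extension, not standard C). The section name .rodata.demo_table and the identifiers demo_table, demo_lookup and demo_self_check are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC or Clang targeting ELF. The section name is hypothetical;
 * it is chosen to start with ".rodata." so that a ".rodata*" pattern in a
 * linker script would match it. */
const int demo_table[4] __attribute__((section(".rodata.demo_table"))) =
    { 10, 20, 30, 40 };

/* Returns the entry at index i, or -1 when i is out of range. */
int demo_lookup(size_t i)
{
    if (i >= sizeof demo_table / sizeof demo_table[0])
        return -1;
    return demo_table[i];
}

/* Small self-check showing how the lookup is meant to be used. */
void demo_self_check(void)
{
    assert(demo_lookup(0) == 10);
    assert(demo_lookup(4) == -1);
}
```

After compiling with cc -c, objdump -h on the object file should list .rodata.demo_table as a separate section. With a linker script that matches .rodata*, its contents would be expected to end up inside .rodata in the final binary, just like the string sections discussed above.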