I'm doing something for class where I want to use a different format string based on certain conditions. I defined it like so:
const char *fmts[] = {"this one is a little long", "this one is short"};
later, I can use
printf(fmts[0]);
or
printf(fmts[1]);
and it works.
Is the compiler doing something for us? My guess is that it would take the longest string and store all of them aligned like that. But I'd like to know from someone who knows. Thanks
It does it the same way as for any other data type. An array of "strings" is actually an array of character pointers, which all have the same size. So, in order to get the right address for the pointer, it multiplies the index by the size of an individual element, then adds that to the base address.
Your array will look like this:
       <same-size>
       +---------+
fmts:  | fmts[0] | ------+
       +---------+       |
       | fmts[1] | ------|--------------------------+
       +---------+       |                          |
                         V                          V
                         this one is a little long\0this one is short\0
The characters for the strings themselves are not stored in the array; they exist elsewhere. The way you have it, they're usually stored in read-only memory, although you can malloc them as well, or even define them as a modifiable character array with something like:
char f0[] = "you can modify me without invoking undefined behaviour";
You can see the pointer layout in operation with the following code:
#include <stdio.h>

const char *fmts[] = {
    "This one is a little long",
    "Shorter",
    "Urk!"
};

int main (void) {
    /* Addresses of the array elements (the pointers themselves). */
    printf ("Address of fmts[0] is %p\n", (void*)(&(fmts[0])));
    printf ("Address of fmts[1] is %p\n", (void*)(&(fmts[1])));
    printf ("Address of fmts[2] is %p\n", (void*)(&(fmts[2])));

    printf ("\n");

    /* Values of the elements (where each string lives) and its first three characters. */
    printf ("Content of fmts[0] (%p) is %c%c%c...\n",
        (void*)(fmts[0]), *(fmts[0]+0), *(fmts[0]+1), *(fmts[0]+2));
    printf ("Content of fmts[1] (%p) is %c%c%c...\n",
        (void*)(fmts[1]), *(fmts[1]+0), *(fmts[1]+1), *(fmts[1]+2));
    printf ("Content of fmts[2] (%p) is %c%c%c...\n",
        (void*)(fmts[2]), *(fmts[2]+0), *(fmts[2]+1), *(fmts[2]+2));

    return 0;
}
which outputs:
Address of fmts[0] is 0x40200c
Address of fmts[1] is 0x402010
Address of fmts[2] is 0x402014

Content of fmts[0] (0x4020a0) is Thi...
Content of fmts[1] (0x4020ba) is Sho...
Content of fmts[2] (0x4020c2) is Urk...
Here you can see that the actual addresses of the array elements are equidistant: 0x40200c + 4 = 0x402010, and 0x402010 + 4 = 0x402014.
However, the values are not, because they refer to differently sized strings. The strings are in a single memory block (in this case; that is not required by any means) as shown below, with the * characters indicating the start and end of individual strings:
         | +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +a +b +c +d +e +f +0123456789abcdef
---------+------------------------------------------------------------------
0x4020a0 |*54 68 69 73 20 6f 6e 65 20 69 73 20 61 20 6c 69  This one is a li
0x4020b0 | 74 74 6c 65 20 6c 6f 6e 67 00*53 68 6f 72 74 65  ttle long.Shorte
0x4020c0 | 72 00*55 72 6b 21 00*                            r.Urk!.