
How can different strings have the same address [duplicate]

I know that in order to compare two strings in C, you need to use the strcmp() function. But I tried to compare two strings with the == operator, and it worked. I don't know how, because it just compares the address of the two strings. It shouldn't work if the strings are different. But then I printed the address of the strings:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    char* str1 = "First";
    char* str2 = "Second";
    char* str3 = "First";

    printf("%p %p %p", str1, str2, str3);

    return 0;
}

And the output was:

00403024 0040302A 00403024
Process returned 0 (0x0)   execution time : 0.109 s
Press any key to continue.

How is it possible that str1 and str3 have the same address? They may contain the same string, but they aren't the same variable.

Drakalex asked Mar 06 '18 16:03


4 Answers

There is no guarantee that this will always happen. Implementations typically maintain a literal pool that stores each distinct string literal only once, so every use of that literal refers to the same address. An implementation may do it differently, though: the standard imposes no constraint here.

As for your question: you are looking at the values of two pointers that point to the same string literal. Each occurrence of the literal decayed into a pointer to its first element, and those two pointers are equal for the reason given in the first paragraph: the compiler stored the literal only once.

Also, note that the %p format specifier expects an argument of type void*, so each pointer passed to it should be cast with (void*).

user2736738 answered Oct 21 '22 01:10


There is an interesting point here. What you actually have are just 3 pointers, all pointing to string literals, which a program must not modify. So the compiler is free to create one single string for "First" and have both str1 and str3 point there.

This would be a completely different case:

char str1[] = "First";
char str2[] = "Second";
char str3[] = "First";

I have declared 3 different char arrays, each initialized from a string literal. Test it, and you will see that the compiler has assigned different addresses to the 3 arrays.

What you should remember from this: pointers and arrays are different animals, even if arrays can decay to pointers (more on this in the C FAQ).

Serge Ballesta answered Oct 21 '22 01:10


When a particular string literal appears multiple times in a source file, the compiler may choose to have all instances of that literal point to the same place.

Section 6.4.5 of the C standard, which describes String Literals, states the following:

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

Where "unspecified behavior" is defined in section 3.4.4 as:

use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance

In your case, the string literal "First" appears twice in the source. Your compiler used the same instance of the literal for both occurrences, so str1 and str3 point to the same place.

As stated above, this behavior is not guaranteed. The two instances of "First" could be distinct from each other, resulting in str1 and str3 pointing to different places. Whether two identical instances of a string literal reside in the same place is unspecified.

dbush answered Oct 21 '22 00:10


String literals, just like const-qualified compound literals (C99 and later), may be pooled. That means that two different occurrences in the source code might in fact result in only one instance in the running program.
That might even be the case if your target does not support hardware write-protection.

Deduplicator answered Oct 21 '22 00:10