
Basic c-style string memory allocation

I am working on a project with existing code which uses mainly C++ but with c-style strings. Take the following:

#include <iostream>

int main(int argc, char *argv[])
{
    char* myString = "this is a test";
    myString = "this is a very very very very very very very very very very very long string";
    std::cout << myString << std::endl;
    return 0;
}

This compiles and runs fine with the output being the long string.

However I don't understand WHY it works. My understanding is that

char* myString 

is a pointer to an area of memory big enough to hold the string literal "this is a test". If that's the case, then how am I able to then store a much longer string in the same location? I expected it to crash when doing this due to trying to cram a long string into a space set aside for the shorter one.

Obviously there's a basic misunderstanding of what's going on here so I appreciate any help understanding this.

TheOx asked Nov 15 '11 20:11


3 Answers

You're not changing the content of the memory, you're changing the value of the pointer to point to a different area of memory which holds "this is a very very very very very very very very very very very long string".

Note that the declaration char* myString only allocates enough bytes for the pointer itself (usually 4 or 8 bytes). The text "this is a test" was placed in the executable image by the compiler, before your program even started. When char* myString = "this is a test"; runs, all it does is allocate space for the pointer and make the pointer point to that memory, which was already set aside at compile time.

So if you like diagrams:

char* myString = "this is a test";

(allocate memory for myString)

              ---> "this is a test"
            / 
myString---

                   "this is a very very very very very very very very very very very long string"

Then

myString = "this is a very very very very very very very very very very very long string";

                   "this is a test"

myString---
            \
              ---> "this is a very very very very very very very very very very very long string"

Seth Carnegie answered Sep 22 '22 07:09


There are two strings in memory. The first is "this is a test"; let's say it begins at address 0x1000. The second is "this is a very very ... long string", and it begins at address 0x1200.

By

char* myString = "this is a test";

you create a variable called myString and assign the address 0x1000 to it. Then, by

myString = "this is a very very ... long string";

you assign 0x1200. By

cout << myString << endl;

you just print the string beginning at 0x1200.

Adam Trhon answered Sep 24 '22 07:09


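You have two string literals of type const char[n]. Older compilers let you assign these to a variable of type char*, which is nothing more than a pointer to a char; that conversion was deprecated in C++03 and is ill-formed since C++11, although many compilers still accept it with a warning. Whenever you declare a variable of type pointer-to-T you are only declaring the pointer, and not the memory to which it points.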

The compiler reserves memory for both literals and you just take your pointer variable and point it at those literals one after the other. String literals are read-only and their allocation is taken care of by the compiler. Typically they are stored in the executable image in protected read-only memory. A string literal typically has a lifetime equal to that of the program itself.

Now, it would be undefined behavior (UB) if you attempted to modify the contents of a literal, but you don't. To keep yourself from attempting such a modification by mistake, you would be wise to declare your variable as const char*.

David Heffernan answered Sep 24 '22 07:09