
When do C++ POD types get zero-initialized?

Coming from a C background, I've always assumed the POD types (eg ints) were never automatically zero-initialized in C++, but it seems this was plain wrong!

My understanding is that only 'naked' non-static POD values don't get zero-filled, as shown in the code snippet. Have I got it right, and are there any other important cases that I've missed?

static int a;

struct Foo { int a;};

void test()
{
  int b;     
  Foo f;
  int *c = new(int); 
  std::vector<int> d(1);

  // At this point...
  // a is zero
  // f.a is zero
  // *c is zero
  // d[0] is zero
  // ... BUT ... b is undefined     
}  
Roddy asked Jun 23 '10 13:06




3 Answers

Assuming you haven't modified a before calling test(), a has a value of zero, because objects with static storage duration are zero-initialized when the program starts.

d[0] has a value of zero. In C++03, the constructor invoked by std::vector<int> d(1) has a second parameter with a default argument of T(), and that argument is copied into every element of the vector being constructed. Since C++11, a separate single-argument constructor value-initializes the elements instead, which gives the same result for int. In C++03 terms, your code is equivalent to:

std::vector<int> d(1, int());
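As a small sketch of the point above (the function name sized_vector_is_zeroed is made up for illustration), the following checks that a vector<int> built from a size alone holds the same zeros as one built with an explicit int() fill value:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if a vector<int> constructed from a size
// alone contains only zeros and matches the explicit int() fill spelling.
bool sized_vector_is_zeroed(std::size_t n)
{
    std::vector<int> d(n);         // elements are value-initialized, so zero for int
    std::vector<int> e(n, int());  // explicit spelling: copy int(), i.e. 0, into each element

    for (std::size_t i = 0; i < n; ++i) {
        if (d[i] != 0 || e[i] != 0) {
            return false;
        }
    }
    return d == e;
}
```

Calling sized_vector_is_zeroed(1) mirrors the d(1) case from the question.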

You are correct that b has an indeterminate value.

f.a and *c both have indeterminate values as well, so the comments in your snippet are wrong about them. To value-initialize them (which for POD types is the same as zero-initialization), you can use:

Foo f = Foo();      // You could also use Foo f((Foo()))
int* c = new int(); // Note the parentheses
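Here is a minimal sketch that reads back only the cases with guaranteed values. The function names and static_counter are hypothetical; Foo is the struct from the question. It deliberately never reads a default-initialized object, because reading an indeterminate value is undefined behavior.

```cpp
struct Foo { int a; };      // same POD struct as in the question

static int static_counter;  // hypothetical name; static storage duration, so zero-initialized

// Reads the static object without ever assigning to it.
int read_static()
{
    return static_counter;
}

// Value-initializes a local Foo and returns its member.
int value_initialized_local()
{
    Foo f = Foo();           // value-initialization zeroes f.a
    return f.a;
}

// Value-initializes an int on the heap and returns its value.
int value_initialized_heap()
{
    int* c = new int();      // the parentheses request value-initialization
    int v = *c;
    delete c;
    return v;
}
```

All three functions return 0. Writing Foo f; or new int instead would leave the value indeterminate.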
James McNellis answered Oct 02 '22 18:10


Some of the values may be zero only because you ran this code in a debug build of the application.

If I'm not mistaken, in your code:

  • a is zero-initialized, because it has static storage duration.
  • b is uninitialized.
  • c points to a new (uninitialized) int.
  • d is initialized to [0] (as you correctly guessed).
utnapistim answered Oct 02 '22 19:10


Note that the zero-initialization done by the OS as a security feature is usually only done the first time memory is allocated. By that I mean any segment in the heap, stack, and data sections. The stack and data sections are typically of fixed size, and are initialized when the application is loaded into memory.

The data segment (containing static/global data and code) typically doesn't get "re-used", although that may not be the case if you dynamically load code at runtime.

The memory in the stack segment gets re-used all the time. Local variables, function stack frames, etc. are constantly used and re-used, and are not initialized every time, just when the application is first loaded.

However, when the application makes requests for heap memory, the memory manager will typically zero-initialize segments of memory before granting the request, but only for new segments. If you make a request for heap memory, and there is free space in a segment that was already initialized, the initialization isn't done a second time. Therefore, there is no guarantee that if that particular segment of memory is re-used by your application, it will get zero-initialized again.

So, for example, if you allocate a Foo on the heap, assign its field a value, delete the Foo instance, and then create a new Foo on the heap, there is a chance that the new Foo will be allocated in the same exact memory location as the old Foo, and so its field will initially have the same value as the old Foo's field.
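To illustrate re-used memory without relying on allocator behavior, here is a sketch that uses placement new over a buffer deliberately filled with a marker byte. It assumes C++11 for alignas; HeapFoo is a hypothetical stand-in for the question's Foo, and 0xAB is an arbitrary marker. It shows only the safe direction: value-initialization overwrites whatever was in the memory. A plain new (storage) HeapFoo would leave the member indeterminate, and reading it would be undefined behavior.

```cpp
#include <cstring>
#include <new>

struct HeapFoo { int a; };   // hypothetical stand-in for the question's Foo

// Constructs a HeapFoo in memory that already holds non-zero bytes and
// returns the member value seen after value-initialization.
int value_init_over_dirty_memory()
{
    alignas(HeapFoo) unsigned char storage[sizeof(HeapFoo)];
    std::memset(storage, 0xAB, sizeof storage);  // simulate leftover data from earlier use

    HeapFoo* p = new (storage) HeapFoo();        // parentheses: value-initialization zeroes a
    int v = p->a;
    p->~HeapFoo();                               // trivial destructor, called for symmetry
    return v;
}
```

The function returns 0 even though the buffer started out full of 0xAB bytes.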

If you think about it, this makes sense, because the OS is only initializing the data to prevent one application from accessing the data from another application. There is less risk in allowing an application access to its own data, so for performance reasons the initialization isn't done every time - just the first time a particular segment of memory is made available for use by the application (in any segment).

Some debug-mode runtimes, however, initialize stack and heap data at every allocation (so your Foo field will always be initialized). Different debug runtimes initialize the data to different values: some zero-initialize, and some initialize to a "marker" value.

The point is - never ever use uninitialized values anywhere in your code. There is absolutely no guarantee that they will be zero initialized. Also, be sure to read the previously linked article regarding parens and default vs value initialization as this affects the definition of an "uninitialized" value.
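One way to follow that advice, sketched here with a hypothetical SafeFoo type, is to give the struct a constructor that initializes every member. Then even the forms without parentheses produce a defined value:

```cpp
// Hypothetical variant of Foo whose constructor initializes every member.
struct SafeFoo {
    int a;
    SafeFoo() : a(0) {}
};

// Default-initialized local: the constructor runs, so a is 0.
int default_constructed_local()
{
    SafeFoo f;
    return f.a;
}

// Default-initialized heap object: no parentheses needed, the constructor still runs.
int default_constructed_heap()
{
    SafeFoo* p = new SafeFoo;
    int v = p->a;
    delete p;
    return v;
}
```

The trade-off is that SafeFoo is no longer a POD type in the C++03 sense, which matters if the struct has to be shared with C code.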

Jeremy Bell answered Oct 02 '22 17:10