
Object is already initialized on declaration?

Tags:

c++

I'm trying to understand something in C++. Basically I have this:

class SomeClass {
    public:
        SomeClass();
    private:
        int x;
};

SomeClass::SomeClass(){
    x = 10;
}

int main() {
    SomeClass sc;
    return 0;
}

I thought that sc is an uninitialized variable of type SomeClass, but from the various tutorials I found it looks like this declaration is actually an initialization that calls the SomeClass() constructor, without me needing to call "sc = new SomeClass();" or something like that.

As I come from the C# world (and know a bit of C, but no C++), I'm trying to understand when I need stuff like new and when to release objects like that. I found a pattern called RAII which seems to be unrelated.

What is this type of initialization called and how do I know if something is a mere declaration or a full initialization?

Michael Stum asked Aug 17 '10 20:08



2 Answers

I think there are several things here:

  • Difference between automatic and dynamically allocated variables
  • Lifetime of objects
  • RAII
  • C# parallel

Automatic vs Dynamic

An automatic variable is a variable whose lifetime the system manages. Let's set global variables aside for the moment (it's complicated) and concentrate on the usual case:

int main(int argc, char* argv[])  // 1
{                                 // 2
  SomeClass sc;                   // 3
  sc.foo();                       // 4
  return 0;                       // 5
}                                 // 6

Here sc is an automatic variable. It is guaranteed to be fully initialized (i.e. the constructor is guaranteed to have run) once line (3) has executed successfully. Its destructor will be automatically invoked on line (6).

We generally speak of the scope of a variable: from the point of declaration to the corresponding closing brace. The language guarantees destruction when the scope is exited, be it with a return or an exception.

There is of course no guarantee if you invoke the dreaded "Undefined Behavior", which generally results in a crash.
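To make the scope guarantee concrete, here is a minimal sketch. Tracer, use_tracer and run are names I made up for illustration (they are not part of the question's code); Tracer simply records when its constructor and destructor run, so the order of events can be inspected on both exit paths.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type: records when its constructor and destructor run.
struct Tracer {
    std::vector<std::string>& events;
    explicit Tracer(std::vector<std::string>& e) : events(e) { events.push_back("ctor"); }
    ~Tracer() { events.push_back("dtor"); }
};

// Leaves the scope either through a normal return or through an exception.
void use_tracer(std::vector<std::string>& events, bool fail) {
    Tracer t(events);                         // fully constructed after this line
    if (fail) throw std::runtime_error("boom");
    events.push_back("body");
}                                             // t is destroyed here on both paths

// Runs use_tracer and returns the recorded events in order.
std::vector<std::string> run(bool fail) {
    std::vector<std::string> events;
    try {
        use_tracer(events, fail);
    } catch (const std::runtime_error&) {
        events.push_back("caught");
    }
    return events;
}
```

run(false) yields ctor, body, dtor, while run(true) yields ctor, dtor, caught: the destructor runs during stack unwinding, before the handler gets control.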

On the other hand, C++ also has dynamic variables, that is, variables that you allocate using new.

int main(int argc, char* argv[])  // 1
{                                 // 2
  SomeClass* sc = 0;              // 3
  sc = new SomeClass();           // 4
  sc->foo();                      // 5
  return 0;                       // 6
}                                 // 7 (!! leak)

Here sc is still an automatic variable; however, its type differs: it's now a pointer to a variable of type SomeClass.

On line (3) sc is assigned a null pointer value (nullptr in C++0x, now C++11) because it doesn't yet point to any instance of SomeClass. Note that the language does not guarantee any initialization of a pointer on its own, so you need to assign something explicitly; otherwise you'll have a garbage value.

On line (4) we build a dynamic variable (using the new operator) and assign its address to sc. Note that the dynamic variable itself is unnamed, the system only gives us a pointer (address) to it.

On line (7) the system automatically destroys sc; however, it does not destroy the dynamic variable it pointed to, and thus we now have a dynamic variable whose address is not stored anywhere. Unless we are using a garbage collector (which isn't the case in standard C++), we have leaked memory, since the variable's memory won't be reclaimed before the process ends... and even then the destructor will not be run (too bad if it had side effects).
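A small sketch of the difference, under the assumption of a hypothetical Counted type that tracks how many instances are alive. The manual delete is shown only to make the contrast visible; the RAII section further down explains why you would normally not write it by hand.

```cpp
// Hypothetical type: counts how many instances are currently alive.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Mirrors the leaking example: the pointer is destroyed at the closing
// brace, but the object it pointed to is not.
void leak_one() {
    Counted* p = new Counted();
    (void)p;                    // silence "unused variable" warnings
}

// Same allocation, but the dynamic variable is destroyed explicitly.
void no_leak() {
    Counted* p = new Counted();
    delete p;                   // runs ~Counted() and frees the memory
}
```

After no_leak() the value of Counted::alive is unchanged; after leak_one() it has grown by one and nothing can bring it back down, because the address is gone.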

Lifetime of Objects

Herb Sutter has some very interesting articles on this subject. Here is the first.

As a summary:

  • An object lives as soon as its constructor runs to completion. It means that if the constructor throws, the object never lived (consider it an accident of pregnancy).
  • An object is dead as soon as its destructor is invoked. If the destructor throws (this is EVIL), destruction cannot be attempted again, because you cannot invoke any method on a dead object: it's undefined behavior.

If we go back to the first example:

int main(int argc, char* argv[])  // 1
{                                 // 2
  SomeClass sc;                   // 3
  sc.foo();                       // 4
  return 0;                       // 5
}                                 // 6

sc is alive from line (4) to line (5) inclusive. On line (3) it's being constructed (which may fail for any number of reasons) and on line (6) it's being destructed.
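The first rule in the summary above (an object whose constructor throws never lived) can be observed directly. This is only a sketch: Fragile and destructor_runs are hypothetical names, and the constructor fails on demand purely for demonstration.

```cpp
#include <stdexcept>

// Hypothetical type whose constructor can be told to fail.
struct Fragile {
    static int destroyed;
    explicit Fragile(bool fail) {
        if (fail) throw std::runtime_error("constructor failed");
    }
    ~Fragile() { ++destroyed; }
};
int Fragile::destroyed = 0;

// Returns how many Fragile destructors ran while attempting one construction.
int destructor_runs(bool fail) {
    int before = Fragile::destroyed;
    try {
        Fragile f(fail);        // either lives until the closing brace or never lives
    } catch (const std::runtime_error&) {
        // f never lived, so there is nothing to destroy
    }
    return Fragile::destroyed - before;
}
```

destructor_runs(false) returns 1, and destructor_runs(true) returns 0: no destructor is invoked for an object whose constructor did not complete.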

RAII

RAII means Resource Acquisition Is Initialization. It's an idiom for managing resources, and notably for making sure that resources will eventually be released once they've been acquired.

In C++, since we do not have garbage collection, this idiom is mainly applied to memory management, but it's also useful for any other kind of resource: locks in multithreaded environments, file locks, sockets / connections in networking, etc.
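For a non-memory resource the idiom looks roughly like this. Resource and Guard are hypothetical stand-ins for a lock, file or socket and its wrapper; std::lock_guard follows the same shape for mutexes.

```cpp
#include <stdexcept>

// Hypothetical resource with explicit acquire/release, standing in for a
// lock, a file or a socket.
struct Resource {
    bool held = false;
    void acquire() { held = true; }
    void release() { held = false; }
};

// RAII wrapper: acquires in the constructor, releases in the destructor.
class Guard {
public:
    explicit Guard(Resource& r) : res_(r) { res_.acquire(); }
    ~Guard() { res_.release(); }
    Guard(const Guard&) = delete;             // one owner per acquisition
    Guard& operator=(const Guard&) = delete;
private:
    Resource& res_;
};

// Does some work while holding the resource; may exit by exception.
void work(Resource& r, bool fail) {
    Guard g(r);
    if (fail) throw std::runtime_error("failed while holding the resource");
}                                             // released here on every exit path

// True while a Guard is in scope.
bool held_while_guarded() {
    Resource r;
    Guard g(r);
    return r.held;
}

// True if the resource is free again after work() returns or throws.
bool released_after(bool fail) {
    Resource r;
    try { work(r, fail); } catch (const std::runtime_error&) {}
    return !r.held;
}
```

Both released_after(false) and released_after(true) return true, which is the whole point: the release cannot be forgotten, whichever way the scope is left.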

When used for memory management, it's used to couple the lifetime of a dynamic variable to the lifetime of a given set of automatic variables, ensuring that the dynamic variable will not outlive them (and be lost).

In its simplest form, it's coupled to a single automatic variable:

int main(int argc, char* argv[])
{
  std::unique_ptr<SomeClass> sc(new SomeClass());  // explicit constructor; needs <memory>
  sc->foo();
  return 0;
}

It's very similar to the first example, except that I dynamically allocate an instance of SomeClass. The address of this instance is then handed to the sc object, of type std::unique_ptr<SomeClass> (a C++0x facility, standardized in C++11; use boost::scoped_ptr if unavailable). unique_ptr guarantees that the object pointed to will be destroyed when sc is destroyed.
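Here is a sketch of that guarantee, assuming a hypothetical Widget type that counts live instances (C++11 or later). It also shows that ownership can be moved to another unique_ptr but never copied.

```cpp
#include <memory>
#include <utility>

// Hypothetical type: counts how many instances are currently alive.
struct Widget {
    static int alive;
    Widget() { ++alive; }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Returns the number of live Widgets while the owner is still in scope.
int alive_inside_scope() {
    std::unique_ptr<Widget> w(new Widget());
    return Widget::alive;       // evaluated before w is destroyed
}                               // w is destroyed here and deletes the Widget

// Ownership is transferred by moving; the source is left empty.
bool source_is_empty_after_move() {
    std::unique_ptr<Widget> a(new Widget());
    std::unique_ptr<Widget> b = std::move(a);
    return a == nullptr && b != nullptr;
}
```

Starting from zero live Widgets, alive_inside_scope() returns 1 and Widget::alive is back to 0 once it has returned, with no delete anywhere in sight.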

In a more complicated form, it might be coupled to several automatic variables using (for example) std::shared_ptr, which as the name implies allows an object to be shared and guarantees that the object will be destroyed when the last sharer is destroyed. Beware that this is not equivalent to using a garbage collector and there can be issues with cycles of references. I won't go into depth here, so just remember that std::shared_ptr isn't a panacea.
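The cycle problem can be sketched as follows. Node is a hypothetical type, and std::weak_ptr is the standard tool I am assuming for breaking the cycle; the text above does not go into it.

```cpp
#include <memory>

// Hypothetical node type that counts live instances.
struct Node {
    static int alive;
    std::shared_ptr<Node> strong_next;  // owning link
    std::weak_ptr<Node> weak_next;      // non-owning link
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Two nodes owning each other: neither reference count can reach zero.
int leaked_by_strong_cycle() {
    int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_next = b;
        b->strong_next = a;             // cycle of owners
    }
    return Node::alive - before;
}

// Same shape, but the back link does not own its target.
int leaked_by_weak_back_link() {
    int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_next = b;
        b->weak_next = a;               // observes a without keeping it alive
    }
    return Node::alive - before;
}
```

leaked_by_strong_cycle() returns 2 (both nodes outlive the scope), whereas leaked_by_weak_back_link() returns 0.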

Because it's very complicated to perfectly manage the lifetime of a dynamic variable without RAII in the face of exceptions and multithreaded code, the recommendation is:

  • use automatic variables as much as possible
  • for dynamic variables, never invoke delete on your own and always make use of RAII facilities

I personally consider any occurrence of delete to be strongly suspicious, and I always ask for its removal in code reviews: it's a code smell.

C# parallel

In C# you mainly use dynamic variables*. This is why:

  • If you just declare a variable of class type without assigning it, no object is created: in essence you are only manipulating pointers. A field starts out as null, and the compiler refuses to let you read a local variable before it has been assigned (so you never see garbage, thank goodness)
  • You use new to create values; this invokes the constructor of your object and yields the address of the object; note how the syntax is similar to C++ for dynamic variables

However, unlike C++, C# is garbage collected so you don't have to worry about memory management.

Being garbage collected also means that the lifetime of objects is more difficult to understand: they are built when you ask for them but destroyed at the system's convenience. This can be an issue when implementing RAII, for example if you really wish to release a lock rapidly, and the language has a number of facilities to help you out: the using keyword plus the IDisposable interface, from memory.

*: it's easy to check: if the default value of a variable's type is null, then it is a dynamic variable. I believe that for int the default is 0, indicating it's not, but it's been 3 years already since I fiddled with C# for a course project so...

Matthieu M. answered Oct 22 '22 13:10



What you are doing in the first line of main() is allocating a SomeClass object on the stack. The new operator instead allocates objects on the heap, returning a pointer to the class instance. This leads to two different access syntaxes: . (with the instance) or -> (with the pointer).
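As a quick illustration of the two access syntaxes, here is a sketch with a hypothetical Point type (not the question's SomeClass):

```cpp
// Hypothetical type with one public member.
struct Point {
    int x;
    explicit Point(int v) : x(v) {}
};

// Object with automatic storage: use '.' on the object itself.
int read_via_instance() {
    Point p(3);
    return p.x;
}

// Object allocated with new: use '->' on the pointer.
int read_via_pointer() {
    Point* p = new Point(4);
    int v = p->x;               // same as (*p).x
    delete p;                   // heap objects are released explicitly
    return v;
}
```

read_via_instance() returns 3 and read_via_pointer() returns 4; only the second one needs an explicit release.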

Since you know C: you perform stack allocation every time you say, for example, int i; inside a function. On the other hand, heap allocation is performed in C with malloc(). malloc() returns a pointer to a newly allocated space, which is then cast to a pointer to something. Example:

int *i;
i = (int *)malloc(sizeof(int));  /* heap allocation; needs <stdlib.h> */
*i = 5;
free(i);                         /* must be released by the programmer */

While deallocation of stuff allocated on the stack is done automatically, deallocation of stuff allocated on the heap must be done by the programmer.

Your confusion comes from the fact that C# (which I don't use, but I know it is similar to Java) does not have stack allocation for class instances. When you say SomeClass sc in C#, you declare a SomeClass reference that refers to no object until you say new, which is the moment when the object springs into existence. Before the new, you have no object. In C++ this is not the case: SomeClass sc; creates the object itself. C++ has no concept of references quite like that of C# (or Java). It does have a reference type of its own, but a C++ reference is an alias for an existing object and is mostly seen in function parameters (pass-by-reference). By default C++ passes by value, meaning that objects are copied at the function call. However, this is not the whole story. Check the comments for more accurate details.
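To illustrate passing by value versus by reference, here is a minimal sketch. Counter and the function names are hypothetical; the last function shows that a C++ reference can also be bound to a local object, outside of any function call.

```cpp
// Hypothetical type holding a single value.
struct Counter { int value = 0; };

void bump_copy(Counter c) { ++c.value; }   // by value: modifies a private copy
void bump_ref(Counter& c) { ++c.value; }   // by reference: modifies the caller's object

// The caller's object is untouched when passed by value.
int after_by_value() {
    Counter c;
    bump_copy(c);
    return c.value;
}

// The caller's object is modified when passed by reference.
int after_by_reference() {
    Counter c;
    bump_ref(c);
    return c.value;
}

// A reference is just another name for an existing object.
int after_local_alias() {
    Counter c;
    Counter& alias = c;
    alias.value = 7;
    return c.value;
}
```

after_by_value() returns 0, after_by_reference() returns 1, and after_local_alias() returns 7.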

Stefano Borini answered Oct 22 '22 14:10
