Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Returning vector of pointers - understanding

I'm trying to understand the following (Lets pretend MyStorageClass is huge) :

class MyStorageClass
{
public:
  string x;
  string y;
  string z;
};

class storage
{
public:
  storage();
  ~storage()
  {
    vector<MyStorageClass *>::iterator it = myVec.begin(); it != myVec.end(); it++)
    {
      delete *it;
    }

  vector<MyStorageClass*> getItems()
  {
    for(int i = 0; i < 10; ++i)
    { 
      v.push_back(new MyStorageClass());
    }
    return v;
  }

private:
  vector<MyStorageClass*> v;

};



main()
{
  storage s;
  vector<MyStorageClass*> vm = s.getItems();
}

From my understading, when s returns the vector and is assigned to vm this is done as a copy (by value). So if s went out of scope and calls it destructor, vm has its own copy and its structure is not affected. However, passing by value is not efficient. So, if you change it to pass by reference:

vector<MyStorageClass*>& getItems()
{
  for(int i = 0; i < 10; ++i)
  { 
    v.push_back(new MyStorageClass());
  }
  return v;
}

You pass the memory location of v (in class Storage). But you still assign a copy of it using the = operator to the vector vm in the Main class. So, vm is independent of v, and if Storage destructor is called vm unaffected.

Finally, if getItems returned a reference and in main you had the following:

main()
{
  storage s;
  vector<MyStorageClass*> &vm = s.getItems();
}

Now, vm holds the address of v. So it is affected by the Storage destructor.

Question: Is what I've stated above true?

like image 679
nf313743 Avatar asked Feb 23 '23 06:02

nf313743


2 Answers

Yes, you have understood this correctly. Now, if you want to take it to the next level, start by finding reasons why this is bad:

  1. Returning pointer or references to class internals breaks encapsulation.
  2. The reference you get will become a dangling reference once the corresponding s-object is destroyed, thereby breaking the invariant that references are always valid.
  3. By returning a vector of pointers you leave the caller wondering weather he has to delete those pointers or not.

A better solution would be for storage to expose methods begin and end that return the corresponding iterators from s. Alternatively, you can use the Visitor-pattern for algorithms that need to operate on s.

Also, in your case it seems like the vector s is supposed to own the objects it contains. That would be a good indicator for use of boost::ptr_vector.

like image 188
Björn Pollex Avatar answered Feb 24 '23 18:02

Björn Pollex


Even though a vector copies its values, the values in your case are the pointers, and not the objects pointed to. So "vm has it's own copy" is not fully true: The resulting vector in your first piece of code will have a copy of the pointers, but not of the MyStorageClass objects they are pointing to; so, actually in all of your 3 code examples the pointers stored in vm would not be valid anymore if the Storage destructor is called!

If you need to make sure, though, that Storage isn't destroyed before the last access to one of the MyStorageClass objects anyway, then, of the presented methods, the third way would be the preferred one, since the vector data (i.e. the pointers) is in memory only once then. You should consider returning a const vector, however, else every caller of getItems can modify the v vector of the Storage class through the returned reference.

As others have already pointed out, exposing the vector itself might not be that good of an idea in the first place; you might consider using iterators instead or the Visitor pattern.

Furthermore, the usage of pointers - where not absolutely needed - is in bigger projects usually rather frowned upon. Consider using smart pointers, such as std::auto_ptr (not very recommendable though due to strange copy semantics), or the more easily understandable boost::shared_ptr / std::tr1::shared_ptr.

Just as a side note, the first and second code example will (at least in most modern compilers) have the exact same performance, since in the first case, the compiler can optimize the temporary return value away (check out "Return value optimization").

like image 45
codeling Avatar answered Feb 24 '23 19:02

codeling