Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is preferred way of passing pointer/reference to existing object in a constructor?

Tags:

c++

I'll start from example. There is a nice "tokenizer" class in boost. It take a string to be tokenized as a parameter in a constructor:

std::string string_to_tokenize("a bb ccc ddd 0");
boost::tokenizer<boost::char_separator<char> > my_tok(string_to_tokenize);
/* do something with my_tok */

The string isn't modifed in the tokenizer, so it is passed by const object reference. Therefore I can pass a temporary object there:

boost::tokenizer<boost::char_separator<char> > my_tok(std::string("a bb ccc ddd 0"));
/* do something with my_tok */

Everything looks fine, but if I try to use the tokenizer, a disaster occurs. After short investigation I realized, that the tokenizer class store the reference that I gave to it, and use in further use. Of course it cannot work well for reference to temporary object.

The documentation doesn't say explicitly that the object passed in the constructor will be used later, but ok, it is also not stated, that it won't be :) So I cannot assume this, my mistake.

It is a bit confusing however. In general case, when one object take another one by const reference, it suggest that temporary object can be given there. What do you think? Is this a bad convention? Maybe pointer to object (rather than reference) should be used in such cases? Or even further - won't it be useful to have some special keyword to argument that allow/disallow giving temporary object as parameter?

EDIT: The documentation (version 1.49) is rather minimalistic and the only part that may suggest such a problem is:

Note: No parsing is actually done upon construction. Parsing is done on demand as the tokens are accessed via the iterator provided by begin.

But it doesn't state explicitly, that the same object that was given will be used.

However, the point of this question is rather discussion about coding style in such a case, this is only an example that inspired me.

like image 836
peper0 Avatar asked Mar 05 '12 10:03

peper0


People also ask

Can we pass this pointer in constructor?

So it is perfectly valid to use a "this" pointer in a class' constructor and assume that it points to a completely constructed object. Of course, you still need to beware of uninitialized member variables, if you haven't already initialized them in your constructor code.

When should I use references and when should I use pointers?

References are usually preferred over pointers whenever you don't need “reseating”. This usually means that references are most useful in a class's public interface. References typically appear on the skin of an object, and pointers on the inside.

Why is it better to pass by reference?

Pass-by-references is more efficient than pass-by-value, because it does not copy the arguments. The formal parameter is an alias for the argument. When the called function read or write the formal parameter, it is actually read or write the argument itself.

When should you pass by reference C++?

In C++, variables are passed by reference due to following reasons: 1) To modify local variables of the caller function: A reference (or pointer) allows called function to modify a local variable of the caller function.


4 Answers

If some function (such as a constructor) takes an argument as reference-to-const then it should either

  • Document clearly that the lifetime of the referenced object must satisfy certain requirements (as in "Is not destroyed before this and that happens")

or

  • Create copies internally if it needs to make use of the given object at a later point.

In this particular case (the boost::tokenizer class) I'd assume that the latter isn't done for performance reasons and/or to make the class usable with container types which aren't even copyable in the first place. For this reason, I'd consider this a documentation bug.

like image 112
Frerich Raabe Avatar answered Nov 07 '22 20:11

Frerich Raabe


Personally I think it's a bad idea, and it would be better write the constructor either to copy the string, or to take a const std::string* instead. It's only one extra character for the caller to type, but that character stops them accidentally using a temporary.

As a rule: don't create responsibilities on people to maintain objects without making it very obvious that they have that responsibility.

I think a special keyword wouldn't be a complete enough solution to justify changing the language. It's not actually temporaries that are the problem, it's any object that lives for less time than the object being constructed. In some circumstances a temporary would be fine (for example if the tokenizer object itself were also a temporary in the same full-expression). I don't really want to mess about with the language for the sake of half a fix, and there are fuller fixes available (for example take a shared_ptr, although that has its own issues).

"So I cannot assume this, my mistake"

I don't think it really is your mistake, I agree with Frerich that as well as being against my personal style guide to do this at all, if you do it and don't document then that's a documentation bug in any reasonable style guide.

It's absolutely essential that the required lifetime of by-reference function parameters is documented, if it's anything other than "at least as long as the function call". It's something that docs are often lax about, and needs to be done properly to avoid errors.

Even in garbage-collected languages, where lifetime itself is automatically handled and so tends to get neglected, it matters whether or not you can change or re-use your object without changing the behavior of some other object that you passed it to method of, some time in the past. So functions should document whether they retain an alias to their arguments in any language that lacks referential transparency. Especially so in C++ where object lifetime is the caller's problem.

Unfortunately the only mechanism to actually ensure that your function cannot retain a reference is to pass by value, which has a performance cost. If you can invent a language that allows aliasing normally, but also has a C-style restrict property that is enforced at compile-time, const-style, to prevent functions from squirreling away references to their arguments, then good luck and sign me up.

like image 24
4 revs Avatar answered Nov 07 '22 21:11

4 revs


As others said, the boost::tokenizer example is the result of either a bug in the tokenizer or a warning missing from the documentation.

To generally answer the question, I found the following priority list useful. If you can't choose an option for some reason, you go to the next item.

  1. Pass by value (copyable at an acceptable cost and don't need to change original object)
  2. Pass by const reference (don't need to change original object)
  3. Pass by reference (need to change original object)
  4. Pass by shared_ptr (the lifetime of the object is managed by something else, this also clearly shows the intention to keep the reference)
  5. Pass by raw pointer (you got an address to cast to, or you can't use a smart pointer for some reason)

Also, if your reasoning to choose the next item from the list is "performance", then sit down and measure the difference. In my experience, most people (especially with Java or C# backgrounds) tend to over-estimate the cost of passing an object by value (and under-estimate the cost of dereferencing). Passing by value is the safest option (it will not cause any surprises outside the object or function, not even in another thread), don't give up that huge advantage easily.

like image 42
Tamás Szelei Avatar answered Nov 07 '22 21:11

Tamás Szelei


A lot of time it will depend on context, for example if it's a functor which will be called in a for_each or similar, then you will often store a reference or a pointer within your functor to an object you expect will have a lifetime beyond your functor.

If it is a general use class then you have to consider how people are going to use it.

If you are writing a tokenizer, you need to consider that copying what you are tokenizing over might be expensive, however you also need to consider that if you are writing a boost library you are writing it for the general public who will use it in a multi-purpose way.

Storing a const char * would be better than a std::string const& here. If the user has a std::string then the const char * will remain valid as long as they don't modify their string, and they probably won't. If they have a const char * or something that holds an array of chars and passes it in, it will copy it anyway to create the std::string const & and you are in great danger of the fact that it won't live past your constructor.

Of course, with a const char * you can't use all the lovely std::basic_string functions in your implementation.

There is an option to take, as parameter, a std::string& (not const reference) which should guarantee (with a compliant compiler) that nobody will pass in a temporary, but you will be able to document that you don't actually change it, and the rationale behind your seemingly not const-correct code. Note, I have used this trick once in my code too. And you can happily use string's find functions. (As well as, if you wish, taking basic_string rather than string so you can tokenize wide character strings too).

like image 45
CashCow Avatar answered Nov 07 '22 20:11

CashCow