
C++ const-reference semantics?

Consider the sample application below. It demonstrates what I would call a flawed class design.

#include <iostream>

using namespace std;

struct B
{
 B() : m_value(1) {}

 long m_value;
};

struct A
{
 const B& GetB() const { return m_B; }

 void Foo(const B &b)
 {
  // assert(&m_B != &b);
  m_B.m_value += b.m_value;
  m_B.m_value += b.m_value;
 }

protected:
 B m_B;
};

int main(int argc, char* argv[])
{
 A a;

 cout << "Original value: " << a.GetB().m_value << endl;

 cout << "Expected value: 3" << endl;
 a.Foo(a.GetB());

 cout << "Actual value: " << a.GetB().m_value << endl;

 return 0;
}

Output:
Original value: 1
Expected value: 3
Actual value: 4

Obviously, the programmer is fooled by the constness of b. By mistake, b refers to this object's own m_B, which yields the undesired behavior.

My question: What const-rules should you follow when designing getters/setters?

My suggestion: Never return a reference to a member variable if it can be set by reference through a member function. Hence, either return by value or pass parameters by value. (Modern compilers will optimize away the extra copy anyway.)
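A minimal sketch of the "return by value" half of that suggestion, reusing A and B from the question. The helper foo_with_copying_getter is hypothetical and only wraps the call from main:

```cpp
#include <cassert>

struct B
{
    B() : m_value(1) {}
    long m_value;
};

struct A
{
    // Returns a copy, so callers never hold an alias to m_B.
    B GetB() const { return m_B; }

    void Foo(const B& b)
    {
        m_B.m_value += b.m_value;
        m_B.m_value += b.m_value;
    }

protected:
    B m_B;
};

// Hypothetical helper: repeats the call from the question and returns the result.
long foo_with_copying_getter()
{
    A a;
    assert(a.GetB().m_value == 1);  // starting value, as in the question
    a.Foo(a.GetB());                // b binds to a temporary copy whose value stays 1
    return a.GetB().m_value;
}
```

With the copying getter, the same call adds 1 twice and the helper returns the expected 3.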

l33t asked Jun 11 '10 11:06


2 Answers

"Obviously, the programmer is fooled by the constness of b"

As someone once said, "You keep using that word. I do not think it means what you think it means."

Const means that you cannot change the value. It does not mean that the value cannot change.

If the programmer is fooled by the fact that some other code can change something that they cannot, they need a better grounding in aliasing.

If the programmer is fooled by the fact that the token 'const' sounds a bit like 'constant' but means 'read only', they need a better grounding in the semantics of the programming language they are using.

So if you have a getter which returns a const reference, then it is an alias for an object you don't have the permission to change. That says nothing about whether its value is immutable.
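The distinction can be seen without any classes at all. This is a small sketch; the function name observe_through_const_alias is made up for illustration:

```cpp
#include <cassert>

// Hypothetical helper: returns what a read-only alias sees after the
// underlying object has been modified through another name.
long observe_through_const_alias()
{
    long value = 1;
    const long& view = value;   // read-only alias of value

    // view = 2;                // would not compile: no writes through a const reference
    value = 2;                  // legal: the object itself is not const

    assert(&view == &value);    // both names refer to the same object
    return view;                // reads the current value of the object
}
```

The reference only restricts what can be done through it; it does not freeze the object it refers to.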


Ultimately, this comes down to a lack of encapsulation, and not applying the Law of Demeter. In general, don't mutate the state of other objects. Send them a message to ask them to perform an operation, which may (depending on their own implementation details) mutate their state.

If you make B::m_value private, then you can't write the Foo you have. You either make Foo into:

void Foo(const B &b)
{
    m_B.increment_by(b);
    m_B.increment_by(b);
}

void B::increment_by (const B& b)
{
    // assert ( this != &b ) if you like 
    m_value += b.m_value;
}

or, if you want to ensure that the value is constant, use a temporary

void Foo(B b)
{
    m_B.increment_by(b);
    m_B.increment_by(b);
}

Now, incrementing a value by itself may or may not be reasonable, and is easily tested for within B::increment_by. You could also test whether &m_B == &b in A::Foo. However, once you have a couple of levels of objects, and objects that hold references to other objects rather than values (so &a1.b.c == &a2.b.c does not imply that &a1.b == &a2.b or &a1 == &a2), you really just have to be aware that any operation is potentially aliased.
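Putting the two variants side by side as a compilable sketch. The names FooByRef, FooByValue, value() and compare_foo_variants are assumptions added here so that both versions can live in one class; they are not part of the original code:

```cpp
#include <cassert>

class B
{
public:
    B() : m_value(1) {}

    long value() const { return m_value; }   // hypothetical read accessor

    void increment_by(const B& b) { m_value += b.m_value; }

private:
    long m_value;
};

class A
{
public:
    const B& GetB() const { return m_B; }

    // b may alias m_B, so the second increment can see the updated value.
    void FooByRef(const B& b)
    {
        m_B.increment_by(b);
        m_B.increment_by(b);
    }

    // b is a private copy taken at the call, so it cannot alias m_B.
    void FooByValue(B b)
    {
        m_B.increment_by(b);
        m_B.increment_by(b);
    }

private:
    B m_B;
};

// Hypothetical demonstration of both variants with an aliased argument.
void compare_foo_variants()
{
    A by_ref;
    by_ref.FooByRef(by_ref.GetB());      // 1 + 1 = 2, then 2 + 2 = 4
    assert(by_ref.GetB().value() == 4);

    A by_val;
    by_val.FooByValue(by_val.GetB());    // 1 + 1 = 2, then 2 + 1 = 3
    assert(by_val.GetB().value() == 3);
}
```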

Aliasing means that incrementing by an expression twice is not the same as incrementing twice by the value the expression had the first time you evaluated it. There's no real way around it, and in most systems the cost of copying the data outweighs the benefit of avoiding the alias.

Passing in arguments that have the least structure also works well. If Foo() took a long rather than an object from which it has to get a long, it would not suffer from aliasing, and you wouldn't need to write a different Foo() to increment m_B by the value of a C.
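A sketch of that last idea, assuming Foo is changed to take a long. The struct C, its m_count member and the helper foo_with_plain_long are invented here purely to show a second caller:

```cpp
#include <cassert>

struct B
{
    B() : m_value(1) {}
    long m_value;
};

// Hypothetical unrelated type that also carries a long.
struct C
{
    C() : m_count(5) {}
    long m_count;
};

struct A
{
    const B& GetB() const { return m_B; }

    // delta is copied when the call is made, so it cannot alias m_B.
    void Foo(long delta)
    {
        m_B.m_value += delta;
        m_B.m_value += delta;
    }

protected:
    B m_B;
};

// Hypothetical demonstration: the same Foo serves both a B and a C.
void foo_with_plain_long()
{
    A a;
    a.Foo(a.GetB().m_value);          // delta == 1: 1 + 1 + 1
    assert(a.GetB().m_value == 3);

    C c;
    a.Foo(c.m_count);                 // delta == 5: 3 + 5 + 5
    assert(a.GetB().m_value == 13);
}
```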

Pete Kirkham answered Oct 12 '22 09:10


I propose a slightly different solution to this that has several advantages (especially in an ever-increasing, multi-threaded world). It's a simple idea to follow, and that is to "commit" your changes last.

To explain via your example you would simply change the 'A' class to:

struct A
{
 const B& GetB() const { return m_B; }

 void Foo(const B &b)
 {
  // copy out what we are going to change;
  long itm_value = m_B.m_value;

  // perform operations on the copy, not our internal value
  itm_value += b.m_value;
  itm_value += b.m_value;

  // copy over final results
  m_B.m_value = itm_value;
 }

protected:
 B m_B;
};

The idea here is to place all assignments to memory that is visible outside the current function at the end, where they pretty much can't fail. This way, if an error is thrown in the middle of the operation (say there was a division between those two additions and the divisor just happened to be 0), we aren't left with half-baked data.
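A sketch of the failure case, under two assumptions: Foo gains a hypothetical divisor parameter, and the zero check throws explicitly, because integer division by zero in C++ is undefined behaviour rather than an exception:

```cpp
#include <cassert>
#include <stdexcept>

struct B
{
    B() : m_value(1) {}
    long m_value;
};

struct A
{
    const B& GetB() const { return m_B; }

    // divisor is a hypothetical extra parameter added for this illustration.
    void Foo(const B& b, long divisor)
    {
        long result = m_B.m_value;    // copy out what we are going to change

        result += b.m_value;
        if (divisor == 0)
            throw std::invalid_argument("divisor must be non-zero");
        result /= divisor;
        result += b.m_value;

        m_B.m_value = result;         // commit: the only write to shared state
    }

protected:
    B m_B;
};

// Hypothetical demonstration: a failed call leaves the object untouched.
void foo_commits_last()
{
    A a;
    bool threw = false;
    try { a.Foo(a.GetB(), 0); }
    catch (const std::invalid_argument&) { threw = true; }

    assert(threw);
    assert(a.GetB().m_value == 1);    // no half-finished update is visible

    a.Foo(a.GetB(), 1);               // (1 + 1) / 1 + 1
    assert(a.GetB().m_value == 3);
}
```

Because m_B is only written on the last line of Foo, the aliased argument also keeps its original value throughout the computation.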

Furthermore, in a multi-threaded situation, you can perform the whole operation and then just check at the end whether anything has changed before you "commit". This is an optimistic approach, which will usually pass and usually yields much better results than locking the structure for the entire operation. If something has changed, you simply discard the values and try again (or return a value saying the operation failed, if the caller can do something else instead).
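One way to express this check-then-commit pattern in standard C++ is a compare-and-swap loop on std::atomic. This is only a sketch for a single long, not the general structure the answer has in mind; the Counter class and add_twice are hypothetical names:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an object whose state is one shared long.
class Counter
{
public:
    Counter() : m_value(1) {}

    long value() const { return m_value.load(); }

    void add_twice(long delta)
    {
        long seen = m_value.load();   // copy out the current state
        long wanted;
        do
        {
            wanted = seen + delta + delta;   // work on local copies only
            // Commit only if m_value still equals seen. On failure, seen is
            // refreshed with the current value and the work is redone.
        } while (!m_value.compare_exchange_weak(seen, wanted));
    }

private:
    std::atomic<long> m_value;
};

// Hypothetical single-threaded demonstration of the commit step.
void optimistic_commit_demo()
{
    Counter c;
    c.add_twice(1);
    assert(c.value() == 3);
}
```

The loop never blocks other threads; a thread that loses the race simply recomputes from the fresh value.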

On top of this, the compiler can usually optimise this version better, because it no longer has to write the intermediate results back to memory: we force only one read of the value being changed and one write. The compiler then has the option of keeping the relevant data in a register, which saves L1 cache accesses, if not cache misses. Otherwise, the compiler will probably have to write to memory after each step, because it doesn't know what aliasing might be taking place and so can't assume those values stay the same. If the values are all local, it knows there can be no aliasing, because the current function is the only one that knows about them.

Many different things can happen with the original code posted. I wouldn't be surprised if some compilers (with optimizations enabled) actually produced code that gives the "expected" result, whereas others don't. All of this is simply because the point at which variables that aren't 'volatile' are actually written to or read from memory isn't well defined by the C++ standard.

Grant Peters answered Oct 12 '22 09:10