Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to deal with reference arguments that may point to internal data?

Tags:

c++

reference

More than once I've tracked down a rather baffling bug only to find it was due to a const reference parameter changing value part way through a method. This has always been the result of a method receiving a reference argument that happens to refer to (part of) one of its own data members. So when the method mutates that member, the reference (despite being const) is mutated as well.

A simple example of what I'm talking about:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

class RecentStringBuffer
{
public:
    const std::string & at(size_t index)
    {
        return v.at(index);
    }

    void add(const std::string & s)
    {
        // Remove any previous occurances
        v.erase(std::remove(v.begin(), v.end(), s), v.end());
        // Prepend the new entry
        v.insert(v.begin(), s);
        // Truncate older entries
        v.resize(std::min<size_t>(v.size(), maxEntries));
    }

private:
    const int maxEntries = 10;
    std::vector<std::string> v;
};

int main()
{
    RecentStringBuffer r;
    r.add("A");     // r == [A]
    r.add("B");     // r == [B, A]
    r.add("C");     // r == [C, B, A]
    r.add(r.at(1)); // r == [B, C, A] one would assume?

    std::cout << r.at(0) << r.at(1) << r.at(2); // Prints "A C A"
}

In this example, we get a surprising result, but if v had reallocated, the reference would have been left pointing outside of v's memory, which would have been much worse. Technically, the reference was invalidated either way, so what ever happens is undefined behaviour.

Similar scenarios can obviously happen with global variables instead of data members and with pointers instead of references, but members and references generally feel safer, so this seems like a much more surprising pitfall.

Now, I'm not asking why this happens (I understand what's going on) or how to work around it (there are several obvious ways). My questions are more related to best practices:

  • Is there a name for this problem?
  • Who is responsible for worrying about this problem?
  • Or said another way, where does the bug lie in the above application? When viewed individually, it seems perfectly reasonable and beneficial for at() to return a const reference and for add() to accept one. On the other hand, it doesn't seem all that fair to say that the caller, main(), should have known better, either, especially considering that the reference may be passed down through multiple functions before the problem occurs.
  • Are there any general strategies for noticing and avoiding such constructs?
like image 229
Parker Coates Avatar asked Apr 17 '17 19:04

Parker Coates


1 Answers

Typically, the implied contract of a function is that it should do what it's supposed to do even if referential arguments refer to something that the function might modify.

If a function doesn't support this then it should be documented clearly.

An example from the standard library is:

  • std::vector::insert( const_iterator pos, InputIt first, InputIt last ) is specifically documented to say "The behavior is undefined if first and last are iterators into *this".
  • std::vector::insert( const_iterator pos, const T& value ) has no such documentation, therefore it must work even if value refers to an element of the vector. (This was confirmed by the committee).

So, in your code you either need to modify add() to work even if s refers to a member of v; or document that it won't work.

like image 165
M.M Avatar answered Nov 07 '22 22:11

M.M