
size_t, ptrdiff_t and std::vector::size()

Tags:

c++

I thought that the correct type to use to store the difference between pointers was ptrdiff_t.

As such, I'm confused by the way that my STL (MSVC 2010) implements its std::vector::size() function. The return type is size_t (this is mandated by the standard, as far as I understand it), and yet it's computed as the difference of pointers:

// _Mylast, _Myfirst are of type pointer
// size_type, pointer are inherited from allocator<_Ty>
size_type size() const 
{
    return (this->_Mylast - this->_Myfirst);
}

Obviously, there's a bit of meta-magic that goes on in order to determine exactly what types size_type and pointer are. In order to be "sure" what types they are I checked this:

bool bs = std::is_same<size_t, std::vector<int>::size_type>::value;
bool bp = std::is_same<int * , std::vector<int>::pointer>::value;
// both bs and bp evaluate as true, therefore:
//   size_type is just size_t
//   pointer is just int*

Compiling the following with /Wall gives me a signed-to-unsigned mismatch for mysize2, but no warnings for mysize1:

std::vector<int> myvector(100);
int *tail = &myvector[99];
int *head = &myvector[ 0];
size_t mysize1 = myvector.size();
size_t mysize2 = (tail - head + 1);

Changing the type of mysize2 to ptrdiff_t results in no warning. Changing the type of mysize1 to ptrdiff_t results in an unsigned-to-signed mismatch.

Obviously I'm missing something...

EDIT: I'm not asking how to suppress the warning with a cast or a #pragma warning(disable: xxxx). The issue I'm concerned about is that size_t and ptrdiff_t may have different allowable ranges (they do on my machine).

Consider std::vector<char>::max_size(). My implementation returns a max_size equal to std::numeric_limits<size_t>::max(). Since vector::size() creates an intermediate value of type ptrdiff_t before converting it to size_t, it seems that there could be problems here: ptrdiff_t is not big enough to hold vector<char>::max_size().
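The range concern above can be checked on any given implementation. The following is only a sketch: max_size_fits_in_ptrdiff is a hypothetical helper name, and the result depends on the platform and standard library in use, so no particular answer is assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not part of any library): reports whether
// vector<char>::max_size() is representable as a ptrdiff_t on this platform.
bool max_size_fits_in_ptrdiff() {
    // uintmax_t is wide enough to hold any size_t value.
    const std::uintmax_t max_elems = std::vector<char>().max_size();
    const std::uintmax_t max_diff =
        static_cast<std::uintmax_t>(std::numeric_limits<std::ptrdiff_t>::max());
    assert(max_diff > 0);  // the largest ptrdiff_t value is always positive
    return max_elems <= max_diff;
}
```

If this returns false, a vector<char> of max_size() elements would have an end() - begin() distance that ptrdiff_t cannot hold, which is exactly the situation described above.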

Darren Engwirda asked Jul 19 '11 02:07


1 Answer

Generally speaking, ptrdiff_t is a signed integral type of the same size as size_t. It must be signed so that it can represent both p1 - p2 and p2 - p1.
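As a small illustration of why the difference type has to be signed, here is a sketch in which signed_distance and demo_signed_distance are hypothetical names introduced only for this example:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: signed distance, in elements, from `from` to `to`.
// Both pointers must point into (or one past the end of) the same array.
std::ptrdiff_t signed_distance(const int* from, const int* to) {
    return to - from;
}

void demo_signed_distance() {
    int a[4] = {10, 20, 30, 40};
    assert(signed_distance(a, a + 3) == 3);   // p2 - p1 is positive
    assert(signed_distance(a + 3, a) == -3);  // p1 - p2 is negative
}
```

An unsigned result type could represent only one of these two directions.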

In the specific case of the internals of std::vector, the implementor is effectively deriving size() from end() - begin(). Because of the guarantees of std::vector (contiguous, array-based storage), the end pointer never compares less than the begin pointer (the two are equal for an empty vector), and thus there is no risk of generating a negative value. On typical implementations, size_t can also represent a larger positive range than ptrdiff_t, as it doesn't have to use half its range to represent negative values. The two types are usually the same width, so the conversion from ptrdiff_t to size_t is not a widening one in the strict sense, but for a non-negative value it preserves the value and has well defined (and intuitively obvious) results.
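To make the conversion explicit, here is a minimal sketch of the same computation with the signed intermediate spelled out. range_size is a hypothetical helper, not the MSVC implementation, and it assumes the pointers delimit a valid range in a single array.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the size() computation discussed above.
// Precondition: first and last delimit a valid range in one array (first <= last).
std::size_t range_size(const int* first, const int* last) {
    const std::ptrdiff_t diff = last - first;  // signed intermediate value
    assert(diff >= 0);                         // holds for any valid range
    return static_cast<std::size_t>(diff);     // value-preserving when diff >= 0
}
```

Writing the static_cast explicitly also documents the intent, which an implicit conversion in the return statement does not.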

Also, note that this is not the only possible implementation of std::vector. It could just as easily be implemented as a single pointer and a size_t value holding the size, deriving end() as begin() + size(). That implementation would also resolve your max_size() concern. In reality, max_size is never actually attainable--it would require your program's entire address space to be allocated for the vector's buffer, leaving no room for the begin()/end() pointers, function call stack, etc.
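The alternative layout can be sketched as follows. PtrAndSize is a hypothetical, non-owning type written only for illustration; it does not describe how any particular standard library implements std::vector.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, non-owning sketch: one pointer plus an explicit element count.
template <typename T>
struct PtrAndSize {
    T* first;           // start of the storage
    std::size_t count;  // number of elements, stored directly

    T* begin() const { return first; }
    T* end() const { return first + count; }    // derived as begin() + size()
    std::size_t size() const { return count; }  // no pointer subtraction needed
};

void demo_ptr_and_size() {
    int storage[5] = {};
    PtrAndSize<int> v{storage, 5};
    assert(v.size() == 5);
    assert(v.end() - v.begin() == 5);
}
```

Here size() never passes through ptrdiff_t, so the signed intermediate value discussed in the question does not arise.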

Drew Hall answered Oct 06 '22 00:10