I thought that the correct type to use to store the difference between pointers was ptrdiff_t.
As such, I'm confused by the way that my STL (MSVC 2010) implements its std::vector::size() function. The return type is size_t (this is mandated by the standard, as far as I understand it), and yet it's computed as the difference of pointers:
// _Mylast, _Myfirst are of type pointer
// size_type, pointer are inherited from allocator<_Ty>
size_type size() const
{
    return (this->_Mylast - this->_Myfirst);
}
Obviously, there's a bit of meta-magic that goes on in order to determine exactly what types size_type and pointer are. In order to be "sure" what types they are, I checked this:
bool bs = std::is_same<size_t, std::vector<int>::size_type>::value;
bool bp = std::is_same<int * , std::vector<int>::pointer>::value;
// both bs and bp evaluate as true, therefore:
// size_type is just size_t
// pointer is just int*
Compiling the following with /Wall gives me a signed-to-unsigned mismatch for mysize2, but no warnings for mysize1:
std::vector<int> myvector(100);
int *tail = &myvector[99];
int *head = &myvector[ 0];
size_t mysize1 = myvector.size();
size_t mysize2 = (tail - head + 1);
Changing the type of mysize2 to ptrdiff_t results in no warning.
Changing the type of mysize1 to ptrdiff_t results in an unsigned-to-signed mismatch.
Obviously I'm missing something...
EDIT: I'm not asking how to suppress the warning with a cast or a #pragma warning(disable: xxx). The issue I'm concerned about is that size_t and ptrdiff_t may have different allowable ranges (they do on my machine).
Consider std::vector<char>::max_size(). My implementation returns a max_size equal to std::numeric_limits<size_t>::max(). Since vector::size() is creating an intermediate value of type ptrdiff_t before casting to size_t, it seems that there could be problems here: ptrdiff_t is not big enough to hold vector<char>::max_size().
Generally speaking, ptrdiff_t is a signed integral type of the same size as size_t. It must be signed so that it can represent both p1 - p2 and p2 - p1.
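As a rough illustration of why the type has to be signed, here is a minimal sketch. The helper names signed_distance and check_both_directions are made up for this example and are not part of any library. The static_asserts only check signedness; that the two types have the same size is typical of mainstream platforms but is not checked here.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Signedness of the two types, checked at compile time.
static_assert(std::is_signed<std::ptrdiff_t>::value, "ptrdiff_t is signed");
static_assert(std::is_unsigned<std::size_t>::value, "size_t is unsigned");

// Hypothetical helper: signed distance from `from` to `to`.
// Both pointers must point into (or one past the end of) the same array.
std::ptrdiff_t signed_distance(const int* from, const int* to)
{
    return to - from;
}

// Hypothetical helper: subtracts two pointers in both orders.
bool check_both_directions()
{
    int data[8] = {};
    const int* p1 = data + 1;
    const int* p2 = data + 6;

    assert(signed_distance(p1, p1) == 0);

    // p2 - p1 is positive, p1 - p2 is negative; both must be representable.
    return signed_distance(p1, p2) == 5 && signed_distance(p2, p1) == -5;
}
```

An unsigned result type could not hold the second of those two differences.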
In the specific case of the internals of std::vector, the implementor is effectively deriving size() from end() - begin(). Because of the guarantees of std::vector (contiguous, array-based storage), the end pointer never compares less than the begin pointer (the two are equal for an empty vector), and thus there is no risk of generating a negative value. When the two types have the same width, size_t can represent a larger positive range than ptrdiff_t, as it doesn't have to use half its range to represent negative values. Effectively, this means that every non-negative ptrdiff_t value converts to size_t unchanged, so the conversion in this case has well-defined (and intuitively obvious) results.
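The reasoning above can be sketched outside of the library. This is only an illustration, not the MSVC source: size_from_pointers and matches_vector_size are hypothetical names, and the sketch assumes a compiler that provides std::vector::data().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the quoted size(): `first` and `last` must
// delimit the same contiguous buffer, with `last` not preceding `first`.
std::size_t size_from_pointers(const int* first, const int* last)
{
    assert(first <= last);

    // Pointer subtraction yields a ptrdiff_t; it is non-negative here.
    std::ptrdiff_t diff = last - first;

    // A non-negative ptrdiff_t value is preserved by the conversion.
    return static_cast<std::size_t>(diff);
}

// Hypothetical check: the pointer-derived size agrees with vector::size().
bool matches_vector_size(std::size_t n)
{
    std::vector<int> v(n);
    const int* first = v.data();
    const int* last = first + v.size();
    return size_from_pointers(first, last) == v.size();
}
```

The explicit static_cast is also what silences the signed-to-unsigned warning, but the point here is that no value is lost as long as the difference is non-negative.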
Also, note that this is not the only possible implementation of std::vector. It could just as easily be implemented as a single pointer and a size_t value holding the size, deriving end() as begin() + size(). That implementation would also resolve your max_size() concern. In reality, max_size is never actually attainable: it would require your program's entire address space to be allocated for the vector's buffer, leaving no room for the begin()/end() pointers, function call stack, etc.
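To make the alternative layout concrete, here is a deliberately simplified sketch. CountedBuffer and filled_size are invented for this example; it is a fixed-capacity buffer of int, not a real std::vector implementation, and it assumes C++11 for std::unique_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical container storing one pointer plus an element count.
class CountedBuffer
{
public:
    explicit CountedBuffer(std::size_t capacity)
        : data_(new int[capacity]()), capacity_(capacity), size_(0) {}

    void push_back(int value)
    {
        assert(size_ < capacity_);  // no reallocation in this sketch
        data_[size_++] = value;
    }

    // The size is stored directly, so no pointer subtraction is involved.
    std::size_t size() const { return size_; }

    int* begin() { return data_.get(); }
    int* end() { return data_.get() + size_; }  // begin() + size()

private:
    std::unique_ptr<int[]> data_;
    std::size_t capacity_;
    std::size_t size_;
};

// Hypothetical helper: fills a buffer with `count` elements, reports size().
std::size_t filled_size(std::size_t count)
{
    CountedBuffer buf(count);
    for (std::size_t i = 0; i < count; ++i)
        buf.push_back(static_cast<int>(i));
    return buf.size();
}
```

With this layout size() never produces an intermediate ptrdiff_t, at the cost of an addition every time end() is needed.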