
Why is std::ssize being forced to a minimum size for its signed size type?

In C++20, std::ssize is being introduced to obtain the signed size of a container for generic code. (And the reason for its addition is explained here.)

Somewhat peculiarly, the definition given there (combining with common_type and ptrdiff_t) has the effect of forcing the return value to be "either ptrdiff_t or the signed form of the container's size() return value, whichever is larger".
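As a rough sketch of that definition (the name my_ssize and the SmallBuffer type are hypothetical, used only for illustration; this is not the library's actual source), the return type can be reproduced and inspected like this:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical name: my_ssize mirrors the shape of the C++20 container
// overload of std::ssize; it is an illustration, not the library source.
template <class C>
constexpr auto my_ssize(const C& c)
    -> std::common_type_t<std::ptrdiff_t,
                          std::make_signed_t<decltype(c.size())>>
{
    using R = std::common_type_t<std::ptrdiff_t,
                                 std::make_signed_t<decltype(c.size())>>;
    return static_cast<R>(c.size());
}

// Hypothetical container whose size() returns a 16-bit unsigned value.
struct SmallBuffer {
    std::uint16_t count = 0;
    constexpr std::uint16_t size() const { return count; }
};

// Assuming ptrdiff_t is wider than 16 bits (true on mainstream platforms),
// the result is widened to ptrdiff_t even though size() is 16 bits wide.
static_assert(std::is_same_v<decltype(my_ssize(SmallBuffer{})), std::ptrdiff_t>);
```

On a platform where ptrdiff_t is wider than 16 bits, the common_type step always picks ptrdiff_t here, which is the "whichever is larger" behaviour described above.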

P1227R1 indirectly offers a justification for this ("it would be a disaster for std::ssize() to turn a size of 60,000 into a size of -5,536").
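The arithmetic behind that quote can be checked with a small sketch (to_signed16 is a hypothetical helper name). It assumes the usual wrap-around behaviour: C++20 defines this conversion modulo 2^16, while earlier standards leave it implementation-defined, although mainstream compilers wrap in the same way.

```cpp
#include <cstdint>

// Hypothetical helper: converts a 16-bit unsigned size to a 16-bit signed
// value without widening, which is the conversion the paper warns about.
constexpr std::int16_t to_signed16(std::uint16_t n)
{
    return static_cast<std::int16_t>(n);
}

// 60,000 does not fit in int16_t, so it wraps: 60000 - 65536 == -5536.
static_assert(to_signed16(60000) == -5536);
// Values up to 32,767 survive unchanged.
static_assert(to_signed16(32767) == 32767);
```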

This seems to me like an odd way to try to "fix" that, however.

  • Containers which intentionally define a uint16_t size and are known to never exceed 32,767 elements will still be forced to use a larger type than required.
    • The same thing would occur for containers using a uint8_t size and 127 elements, respectively.
    • In desktop environments, you probably don't care; but this might be important for embedded or otherwise resource-constrained environments, especially if the resulting type is used for something more persistent than a stack variable.
  • Containers which use the default size_t size on 32-bit platforms but which nevertheless do contain between 2B and 4B items will hit exactly the same problem as above.
  • If there still exist platforms for which ptrdiff_t is smaller than 32 bits, they will hit the same problem as well.

Wouldn't it be better to just use the signed type as-is (without extending its size) and to assert that a conversion error has not occurred (eg. that the result is not negative)?

Am I missing something?
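A minimal sketch of the alternative being suggested, assuming a hypothetical function name narrow_ssize and a hypothetical SmallBuffer container (neither is part of any library):

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical name: narrow_ssize keeps the width of size() and only
// asserts that the converted value did not come out negative.
template <class C>
constexpr auto narrow_ssize(const C& c)
    -> std::make_signed_t<decltype(c.size())>
{
    using R = std::make_signed_t<decltype(c.size())>;
    const R n = static_cast<R>(c.size());
    assert(n >= 0 && "size does not fit in the signed type");
    return n;
}

// Hypothetical container with a 16-bit size type.
struct SmallBuffer {
    std::uint16_t count = 0;
    constexpr std::uint16_t size() const { return count; }
};

// The result stays 16 bits wide instead of being widened to ptrdiff_t.
static_assert(std::is_same_v<decltype(narrow_ssize(SmallBuffer{})), std::int16_t>);
```

With this version, any size above 32,767 trips the assertion in a debug build and silently wraps when assertions are disabled.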


To expand on that last suggestion a bit (inspired by Nicol Bolas' answer): if it were implemented the way that I suggested, then this code would Just Work™:

void DoSomething(int16_t i, T const& item);

for (int16_t i = 0, len = std::ssize(rng); i < len; ++i)
{
    DoSomething(i, rng[i]);
}

With the current implementation, however, this produces warnings and/or errors unless static_casts are explicitly added to narrow the result of ssize, or int i is used instead and then narrowed in the function call (and the range indexing); neither of these seems like an improvement.

asked May 22 '19 03:05 by Miral



1 Answer

Containers which intentionally define a uint16_t size and are known to never exceed 32,767 elements will still be forced to use a larger type than required.

It's not like the container is storing the size as this type. The conversion happens only when the value is accessed.

As for embedded systems, embedded systems programmers already know about C++'s propensity to increase the size of small types. So if they expect a type to be an int16_t, they're going to spell that out in the code, because otherwise C++ might just promote it to an int.

Furthermore, there is no standard way to ask about what size a range is "known to never exceed". decltype(size(range)) is something you can ask for; sized ranges are not required to provide a max_size function. Without such an ability, the safest assumption is that a range whose size type is uint16_t can assume any size within that range. So the signed size should be big enough to store that entire range as a signed value.

Your suggestion is basically that any ssize call is potentially unsafe, since half of any size range cannot be validly stored in the return type of ssize.

Containers which use the default size_t size on 32-bit platforms but which nevertheless do contain between 2B and 4B items will hit exactly the same problem as above.

Assuming that it is valid for ptrdiff_t to not be a signed 64-bit integer on such platforms, there isn't really a valid solution to that problem. So yes, there will be cases where ssize is potentially unsafe.

ssize currently is potentially unsafe in cases where it is not possible to be safe. Your proposal would make ssize potentially unsafe in all cases.

That's not an improvement.

And no, merely asserting/contract checking is not a viable solution. The point of ssize is to make for(int i = 0; i < std::ssize(rng); ++i) work without the compiler complaining about signed/unsigned mismatch. To get an assert because of a conversion failure that didn't need to happen (and BTW, cannot be corrected without using std::size, which we are trying to avoid), one which is ultimately irrelevant to your algorithm? That's a terrible idea.


if it were implemented the way that I suggested, then this code would Just Work™:

Let us ignore the question of how often it is that a user would write this code.

The reason your compiler will expect/require you to use a cast there is because you are asking for an inherently dangerous operation: you are potentially losing data. Your code only "Just Works™" if the current size fits into an int16_t; that makes the conversion statically dangerous. This is not something that should implicitly take place, so the compiler suggests/requires you to explicitly ask for it. And users looking at that code get a big, fat eyesore reminding them that a dangerous thing is being done.
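One way to make that narrowing explicit at the call site is a small checked helper. This is only a sketch; narrow_to_i16 is a hypothetical name, not a standard facility.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: narrows a ptrdiff_t length to int16_t, visibly and
// with a debug-time range check.
inline std::int16_t narrow_to_i16(std::ptrdiff_t n)
{
    assert(n >= 0 && n <= std::numeric_limits<std::int16_t>::max());
    return static_cast<std::int16_t>(n);
}
```

The loop from the question would then start with for (int16_t i = 0, len = narrow_to_i16(std::ssize(rng)); i < len; ++i), so the lossy step is visible at the point where it happens.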

That is all to the good.

See, if your suggested implementation were how ssize behaved, then that means we must treat every use of ssize as just as inherently dangerous as the compiler treats your attempted implicit conversion. But unlike static_cast, ssize is small and easily missed.

Dangerous operations should be called out as such. Since ssize is small and difficult to notice by design, it therefore should be as safe as possible. Ideally, it should be as safe as size, but failing that, it should be unsafe only to the extent that it is impossible to make it safe.

Users should not look on ssize usage as something dubious or disconcerting; they should not fear to use it.

answered Oct 26 '22 01:10 by Nicol Bolas