C++ vector::size_type: signed vs unsigned; int vs. long

I have been doing some testing of my application by compiling it on different platforms, and the shift from a 64-bit system to a 32-bit system is exposing a number of issues.

I make heavy use of vectors, strings, etc., and as such need to count them. However, my functions also make use of 32-bit unsigned numbers because in many cases I need to explicitly consume a positive integer.

I'm having issues with seemingly simple tasks such as std::min and std::max, which may be more systemic. Consider the following code:

uint32_t getmax()
{
    return _vecContainer.size();
}

Seems simple enough: I know that a vector can't have a negative number of elements, so returning an unsigned integer makes complete sense.

void setRowCol(const uint32_t &r_row, const uint32_t &r_col)
{
    myContainer_t mc;
    mc.row = r_row;
    mc.col = r_col;
    _vecContainer.push_back(mc);
}

Again, simple enough.

Problem:

uint32_t foo(const uint32_t &r_row)
{
    return std::min(r_row, _vecContainer.size());
}

This gives me errors such as:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/algorithm:2589:1: note: candidate template ignored: deduced conflicting types for parameter '_Tp' ('unsigned long' vs. 'unsigned int')
min(const _Tp& __a, const _Tp& __b)

I did a lot of digging, and on one platform vector::size_type is an 8 byte number. However, by design I am using unsigned 4-byte numbers. This is presumably causing things to be wacky because you cannot implicitly convert from an 8-byte number to a 4-byte number.
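
For reference, the diagnostic above comes from template argument deduction rather than from a forbidden conversion: std::min takes two parameters of the same type T, and deduction does not consider conversions, so 'unsigned int' and 'unsigned long' conflict. A minimal sketch of one way around it follows. The name clamp_row and the vector<int> parameter are hypothetical stand-ins for the code in the question, and the sketch assumes std::size_t is at least 32 bits wide.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: clamps a row index to the number of stored rows.
// Writing std::min(row, rows.size()) would fail to compile wherever
// size_type is not std::uint32_t, because both arguments must deduce
// to the same T.
std::uint32_t clamp_row(std::uint32_t row, const std::vector<int>& rows)
{
    // Naming T explicitly skips deduction. Widening row to std::size_t is
    // lossless on platforms where std::size_t has at least 32 bits.
    const std::size_t smaller = std::min<std::size_t>(row, rows.size());

    // smaller <= row, so the result always fits back into 32 bits.
    return static_cast<std::uint32_t>(smaller);
}
```

Comparing in the wider type and narrowing only the result avoids truncating the container size before the comparison happens.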

The solution was to do this the old-fashioned way:

#define MIN_M(a,b) ((a) < (b) ? (a) : (b)) /* parenthesized so it expands safely inside larger expressions */
return MIN_M(r_row, _vecContainer.size());

Which works dandy. But the systemic issue remains: when planning for multiple platform support, how do you handle instances like this? I could use size_t as my standard size, but that adds other complications (e.g. moving from one platform which supports 64 bit numbers to another which supports 32 bit numbers at a later date). The bigger issue is that size_t is unsigned, so I can't update my signatures:

size_t foo(const size_t &r_row)
// bad, this allows -1 to be passed, which I don't want

Any suggestions?

EDIT: I had read somewhere that size_t was signed, and I've since been corrected. So far it looks like this is a limitation of my own design (e.g. 32-bit numbers vs. using std::vector::size_type and/or size_t).

tendim asked May 26 '17 16:05


2 Answers

One way to deal with this is to use

std::vector<Type>::size_type

as the underlying type of your function parameters/returns, or auto returns if using C++14.
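
A minimal sketch of what that might look like. The Cell and Grid names are hypothetical, not taken from the question, and the deduced return type needs C++14 or later.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical element type standing in for myContainer_t.
struct Cell {
    unsigned row;
    unsigned col;
};

// Hypothetical wrapper class: its public size type follows the container's.
class Grid {
public:
    using size_type = std::vector<Cell>::size_type;

    void add(unsigned row, unsigned col) { cells_.push_back({row, col}); }

    // Spelled out: the same type on every platform the vector is built for.
    size_type count() const { return cells_.size(); }

    // C++14 return type deduction: the compiler picks size_type for us.
    auto count_auto() const { return cells_.size(); }

private:
    std::vector<Cell> cells_;
};

// Both spellings yield exactly the container's size_type.
static_assert(
    std::is_same<decltype(std::declval<const Grid&>().count_auto()),
                 Grid::size_type>::value,
    "auto return deduces the vector's size_type");
```

Callers that store the result in Grid::size_type (or auto) never see a narrowing conversion, whichever platform the code is compiled on.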

vsoftco answered Sep 30 '22 01:09


An answer in the form of a set of tidbits:

  1. Instead of relying on the compiler to deduce the type, you can explicitly specify the type when using function templates like std::min<T>. For example: std::min<std::uint32_t>(4, my_vec.size());

  2. Turn on all the compiler warnings related to signed versus unsigned comparisons and implicit narrowing conversions. Use brace initialization where you can, as it will treat narrowing conversions as errors.

  3. If you explicitly want to use 32-bit values like std::uint32_t, I'd try to find the minimal number of places to explicitly convert (i.e., static_cast) the "sizes" to the smaller types. You don't want casts everywhere, but if you're using library container sizes internally and you want your API to use std::uint32_t, explicitly cast at the API boundaries so that a user of your class never has to worry about doing the conversion themselves. If you can keep the conversions to just a couple places, it becomes practical to add run-time checks (i.e., assertions) that the size has not actually outgrown the range of the smaller type.

  4. If you don't care about the exact size, use std::size_t, which is almost certainly identical to std::XXX::size_type for all of the standard containers. It's theoretically possible for them to be different, but it doesn't happen in practice. In most contexts, std::size_t is less verbose than std::vector::size_type, so it makes a good compromise.

  5. Lots of people (including many people on the C++ standards committee) will tell you to avoid unsigned values even for sizes and indexes. I understand and respect their arguments, but I don't find them persuasive enough to justify the extra friction at the interface with the standard library. Whether or not it's an historical artifact that std::size_t is unsigned, the fact is that the standard library uses unsigned sizes extensively. If you use something else, your code ends up littered with implicit conversions, all of which are potential bugs. Worse, those implicit conversions make turning on the compiler warnings impractical, so all those latent bugs remain relatively invisible. (And even if you know your sizes will never exceed the smaller type, being forced to turn off the compiler warnings for signedness and narrowing means you could miss bugs in completely unrelated parts of the code.) Match the types of the APIs you're using as much as possible, assert and explicitly convert when necessary, and turn on all the warnings.

  6. Keep in mind that auto is not a panacea. for (auto i = 0; i < my_vec.size(); ++i) ... is just as bad as for (int i .... But if you generally prefer algorithms and iterators to raw loops, auto will get you pretty far.

  7. With division you must never divide unless you know the denominator is not 0. Similarly, with unsigned integral types, you must never subtract unless you know the subtrahend is smaller than or equal to the original value. If you can make that a habit, you can avoid the bugs that the always-use-a-signed-type folks are concerned about.
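
A rough sketch of how tidbits 3, 6 and 7 could look in code. The helper names to_u32, element_count and remaining are hypothetical, invented for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Tidbit 3. Hypothetical helper: the one place where a container size is
// narrowed to 32 bits, with a run-time check that the value actually fits.
inline std::uint32_t to_u32(std::size_t n)
{
    assert(n <= std::numeric_limits<std::uint32_t>::max());
    return static_cast<std::uint32_t>(n);
}

// Hypothetical API-boundary function: callers see std::uint32_t and never
// have to cast themselves.
inline std::uint32_t element_count(const std::vector<int>& v)
{
    return to_u32(v.size());
}

// Tidbit 6. A literal 0 is an int, so `auto i = 0` declares a signed int,
// not the container's size_type.
static_assert(std::is_same<decltype(0), int>::value,
              "auto i = 0 deduces int");

// Tidbit 7. Hypothetical helper: check before subtracting so an unsigned
// result can never wrap around to a huge value.
inline std::size_t remaining(std::size_t capacity, std::size_t used)
{
    return used <= capacity ? capacity - used : 0;
}
```

Keeping the cast inside a single helper means the assertion guards every narrowing in the class, and the signedness and narrowing warnings can stay switched on everywhere else.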

Adrian McCarthy answered Sep 30 '22 02:09