Is the max value of size_t (SIZE_MAX) defined relative to the other integer types?

I'm writing a library of functions that will safely convert between various numeric types or die trying. My intent is roughly equal parts create-useful-library and learn-C-edge-cases.

My int-to-size_t function is triggering a GCC -Wtype-limits warning that claims I shouldn't test if an int is greater than SIZE_MAX, because it will never be true. (Another function that converts int to ssize_t produces an identical warning about SSIZE_MAX.)

My MCVE, with extra comments and baby steps, is:

#include <stdint.h>  /* SIZE_MAX */
#include <stdlib.h>  /* exit EXIT_FAILURE size_t */

extern size_t i2st(int value) {
    if (value < 0) {
        exit(EXIT_FAILURE);
    }
    // Safe to cast --- not too small.
    unsigned int const u_value = (unsigned int) value;
    if (u_value > SIZE_MAX) {  /* Line 10 */
        exit(EXIT_FAILURE);
    }
    // Safe to cast --- not too big.
    return (size_t) u_value;
}

The Compiler Warnings

I'm getting similar warnings from GCC 4.4.5 on Linux 2.6.34:

$ gcc -std=c99 -pedantic -Wall -Wextra -c -o math_utils.o math_utils.c

math_utils.c: In function ‘i2st’:
math_utils.c:10: warning: comparison is always false due to limited range of data type

...and also from GCC 4.8.5 on Linux 3.10.0:

math_utils.c: In function ‘i2st’:
math_utils.c:10:5: warning: comparison is always false due to limited range of data type [-Wtype-limits]
     if (u_value > SIZE_MAX) {  /* Line 10 */
     ^

These warnings don't appear justified to me, at least not in the general case. (I don't deny that the comparison might be "always false" on some particular combination of hardware and compiler.)

The C Standard

The C 1999 standard does not appear to rule out an int being greater than SIZE_MAX.

Section "6.5.3.4 The sizeof operator" doesn't address size_t at all, except to describe it as "defined in <stddef.h> (and other headers)".

Section "7.17 Common definitions <stddef.h>" defines size_t as "the unsigned integer type of the result of the sizeof operator". (Thanks, guys!)

Section "7.18.3 Limits of other integer types" is more helpful --- it defines "limit of size_t" as:

SIZE_MAX 65535

...meaning SIZE_MAX could be as small as 65535. An int (signed or unsigned) could be much greater than that, depending on the hardware and compiler.

Stack Overflow

The accepted answer to "unsigned int vs. size_t" seems to support my interpretation (emphasis added):

The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.

This answer cites the same "Section 7.17" of the C standard that I've already quoted.

Other Documents

My searches turned up the Open Group's paper "Data Size Neutrality and 64-bit Support", which claims under "64-bit Data Models" (emphasis added):

ISO/IEC 9899:1990, Programming Languages - C (ISO C) left the definition of the short int, the int, the long int, and the pointer deliberately vague [...] The only constraints were that ints must be no smaller than shorts, and longs must be no smaller than ints, and size_t must represent the largest unsigned type supported by an implementation. [...] The relationship between the fundamental data types can be expressed as:

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) = sizeof(size_t)

If this is true, then testing an int against SIZE_MAX is indeed futile... but this paper doesn't cite chapter-and-verse, so I can't tell how its authors reached their conclusion. Their own "Base Specification Version 7" sys/types.h docs don't address this either way.

My Question

I understand that size_t is unlikely to be narrower than an int, but does the C standard guarantee that comparing some_unsigned_int > SIZE_MAX will always be false? If so, where?

Not-Duplicates

There are two semi-duplicates of this question, but they are both asking more general questions about what size_t is supposed to represent and when it should / should-not be used.

  • "What is size_t in C?" does not address the relationship between size_t and the other integer types. Its accepted answer is just a quote from Wikipedia, which doesn't provide any information beyond what I've already found.

  • "What is the correct definition of size_t?" starts off nearly a duplicate of my question, but then veers off course, asking when size_t should be used and why it was introduced. It was closed as a duplicate of the previous question.

asked Oct 01 '17 by Kevin J. Chase



2 Answers

The current C standard does not require size_t to be at least as wide as an int, and I'm skeptical about any version of the standard ever doing so. size_t needs to be able to represent any number which might be the size of an object; if the implementation limits object sizes to be 24 bits wide, then size_t could be a 24-bit unsigned type, regardless of what an int is.

The GCC warning does not refer to theoretical possibilities. It is checking a particular hardware platform and a particular compiler and runtime. That means it sometimes triggers on code which is trying to be portable. (There are other cases where portable code will trigger optional GCC warnings.) That might not be what you were hoping the warning would do, but there are probably users whose expectations are precisely matched by the implemented behaviour, and the standard provides no guidelines whatsoever for compiler warnings.


As OP mentions in a comment, there is a long history related to this warning. The warning was introduced in version 3.3.2 or so (in 2003), apparently not controlled by any -W flag. This was reported as bug 12963 by a user who evidently felt, as you do, that the warning discourages portable programming. As can be seen in the bug report, various GCC maintainers (and other well-known members of the community) weighed in with strongly-felt but conflicting opinions. (This is a common dynamic in open source bug reports.) After several years, the decision was made to control the warnings with a flag, and to not enable that flag by default or as part of -Wall. In the meantime, the -W option had been renamed -Wextra, and the newly-created flag (-Wtype-limits) was added to the -Wextra collection. To me, this appears to be the correct resolution.


The remainder of this answer contains my personal opinion.

-Wall, as documented in the GCC manual, does not actually enable all warnings. It enables those warnings "about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros." There are a number of other conditions which GCC can detect:

Note that some warning flags are not implied by -Wall. Some of them warn about constructions that users generally do not consider questionable, but which occasionally you might wish to check for; others warn about constructions that are necessary or hard to avoid in some cases, and there is no simple way to modify the code to suppress the warning. Some of them are enabled by -Wextra but many of them must be enabled individually.

These distinctions are somewhat arbitrary. For example, I have to grit my teeth every time that GCC decides to "suggest parentheses around ‘&&’ within ‘||’". (It doesn't seem to feel the need to suggest parentheses around ‘*’ within ‘+’, which doesn't feel different to me.) But I recognize that all of us have different comfort levels with operator precedence, and not all of GCC's suggestions about parentheses seem excessive to me.

But on the whole, the distinction seems reasonable. There are warnings which are generally applicable, and those are enabled with -Wall, which should always be specified because these warnings almost always demand action to correct a deficiency. There are other warnings which might be useful in particular circumstances, but which also have lots of false positives; these warnings need to be investigated individually because they do not always (or even often) correspond with a problem in your code.

I'm aware that there are people who feel that the mere fact that GCC knows how to warn about some condition is sufficient to demand action to avoid that warning. Everyone is entitled to their stylistic and aesthetic judgements, and it is right and just that such programmers add -Wextra to their build flags. I am not in that crowd, however. At a given point in a project, I will try a build with a large collection of optional warnings enabled, and consider whether or not to modify my code on the basis of the reports, but I really don't want to spend my development time thinking about non-problems every time I rebuild a file. The -Wtype-limits flag falls into this category for me.

answered Sep 21 '22 by rici


Nothing requires the maximum value of size_t to be larger than that of int. Architectures where SIZE_MAX <= INT_MAX are rare, though, and I doubt GCC would support any of them.

As for the fix, you can use #if (note that INT_MAX comes from <limits.h>):

#if INT_MAX > SIZE_MAX
if (u_value > SIZE_MAX) {
    exit(EXIT_FAILURE);
}
#endif
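Putting the pieces together, the whole function might look like the sketch below. The #if guard means GCC only ever sees the comparison on (hypothetical) platforms where an int really can exceed SIZE_MAX, so the warning disappears on ordinary hosts without giving up portability:

```c
#include <limits.h>  /* INT_MAX */
#include <stdint.h>  /* SIZE_MAX */
#include <stdlib.h>  /* exit EXIT_FAILURE size_t */

size_t i2st(int value) {
    if (value < 0) {
        exit(EXIT_FAILURE);
    }
    /* Safe to cast --- not negative. */
    unsigned int const u_value = (unsigned int) value;
#if INT_MAX > SIZE_MAX
    /* Compiled only where size_t is narrower than int, so the
     * comparison is never "always false" where the compiler sees it. */
    if (u_value > SIZE_MAX) {
        exit(EXIT_FAILURE);
    }
#endif
    /* Safe to cast --- not too big. */
    return (size_t) u_value;
}
```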