I'm writing a library of functions that will safely convert between various numeric types or die trying. My intent is roughly equal parts create-useful-library and learn-C-edge-cases.
My int-to-size_t function is triggering a GCC -Wtype-limits warning that claims I shouldn't test if an int is greater than SIZE_MAX, because it will never be true. (Another function that converts int to ssize_t produces an identical warning about SSIZE_MAX.)
My MCVE, with extra comments and baby steps, is:
#include <stdint.h> /* SIZE_MAX */
#include <stdlib.h> /* exit EXIT_FAILURE size_t */

extern size_t i2st(int value) {
    if (value < 0) {
        exit(EXIT_FAILURE);
    }
    // Safe to cast --- not too small.
    unsigned int const u_value = (unsigned int) value;
    if (u_value > SIZE_MAX) { /* Line 10 */
        exit(EXIT_FAILURE);
    }
    // Safe to cast --- not too big.
    return (size_t) u_value;
}
I'm getting similar warnings from GCC 4.4.5 on Linux 2.6.34:
$ gcc -std=c99 -pedantic -Wall -Wextra -c -o math_utils.o math_utils.c
math_utils.c: In function ‘i2st’:
math_utils.c:10: warning: comparison is always false due to limited range of data type
...and also from GCC 4.8.5 on Linux 3.10.0:
math_utils.c: In function ‘i2st’:
math_utils.c:10:5: warning: comparison is always false due to limited range of data type [-Wtype-limits]
if (u_value > SIZE_MAX) { /* Line 10 */
^
These warnings don't appear justified to me, at least not in the general case. (I don't deny that the comparison might be "always false" on some particular combination of hardware and compiler.)
The C 1999 standard does not appear to rule out an int being greater than SIZE_MAX.

Section 6.5.3.4 "The sizeof operator" doesn't address size_t at all, except to describe it as "defined in <stddef.h> (and other headers)". Section 7.17 "Common definitions <stddef.h>" defines size_t as "the unsigned integer type of the result of the sizeof operator". (Thanks, guys!)
Section 7.18.3 "Limits of other integer types" is more helpful: it defines the "limit of size_t" as:

SIZE_MAX 65535

...meaning SIZE_MAX could be as small as 65535.
An int (signed or unsigned) could be much greater than that, depending on the hardware and compiler.
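As a quick sanity check (my own throwaway probe, not part of the library), a program like this prints the two limits side by side; on a hypothetical implementation with a 16-bit size_t and a 32-bit int, it would show SIZE_MAX well below INT_MAX:

#include <limits.h> /* INT_MAX */
#include <stdint.h> /* SIZE_MAX uintmax_t */
#include <stdio.h>  /* printf */

int main(void) {
    /* Print both limits in the widest unsigned type so they are comparable. */
    printf("INT_MAX  = %ju\n", (uintmax_t) INT_MAX);
    printf("SIZE_MAX = %ju\n", (uintmax_t) SIZE_MAX);
    return 0;
}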
The accepted answer to "unsigned int vs. size_t" seems to support my interpretation (emphasis added):

The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.

This answer cites the same Section 7.17 of the C standard that I've already quoted.
My searches turned up the Open Group's paper "Data Size Neutrality and 64-bit Support", which claims under "64-bit Data Models" (emphasis added):

ISO/IEC 9899:1990, Programming Languages - C (ISO C) left the definition of the short int, the int, the long int, and the pointer deliberately vague [...] The only constraints were that ints must be no smaller than shorts, and longs must be no smaller than ints, and size_t must represent the largest unsigned type supported by an implementation. [...] The relationship between the fundamental data types can be expressed as:

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) = sizeof(size_t)
If this is true, then testing an int against SIZE_MAX is indeed futile... but this paper doesn't cite chapter and verse, so I can't tell how its authors reached their conclusion.
Their own "Base Specification Version 7" sys/types.h docs don't address this either way.
I understand that size_t is unlikely to be narrower than an int, but does the C standard guarantee that comparing some_unsigned_int > SIZE_MAX will always be false? If so, where?
There are two semi-duplicates of this question, but they are both asking more general questions about what size_t is supposed to represent and when it should or should not be used.

"What is size_t in C?" does not address the relationship between size_t and the other integer types. Its accepted answer is just a quote from Wikipedia, which doesn't provide any information beyond what I've already found.

"What is the correct definition of size_t?" starts off nearly a duplicate of my question, but then veers off course, asking when size_t should be used and why it was introduced. It was closed as a duplicate of the previous question.
So the minimum maximum value that size_t must be able to hold is 65535 (16 bits of precision), and size_t is otherwise only defined to be some unspecified unsigned integer type.

size_t is the unsigned integer type of the result of sizeof, _Alignof (since C11) and offsetof, depending on the data model. The bit width of size_t is not less than 16.

In practice, the sizes of size_t and ptrdiff_t usually coincide with the pointer size. Because of this, these are the types that should be used as indexes for large arrays, for storage of pointers, and for pointer arithmetic. Linux application developers often use the long type for these purposes instead.
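As a small illustration of that advice (my example, not from the quoted docs), a loop over a possibly huge array is typically indexed with size_t rather than int:

#include <stddef.h> /* size_t */

/* Sum an array; size_t lets the index cover any object size the
   implementation supports, which an int may not. */
double sum_array(const double *a, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}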
The current C standard does not require size_t to be at least as wide as an int, and I'm skeptical about any version of the standard ever doing so. size_t needs to be able to represent any number which might be the size of an object; if the implementation limits object sizes to be 24 bits wide, then size_t could be a 24-bit unsigned type, regardless of what an int is.
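If a program nonetheless wants to assume that size_t is at least as wide as int, one option (my suggestion, and it needs C11 rather than the question's C99) is to state the assumption explicitly, so that any platform violating it fails to compile:

#include <limits.h> /* INT_MAX */
#include <stdint.h> /* SIZE_MAX */

/* Fail the build on any platform where an int could exceed SIZE_MAX. */
_Static_assert(SIZE_MAX >= INT_MAX, "size_t is narrower than int");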
The GCC warning does not refer to theoretical possibilities. It is checking a particular hardware platform and a particular compiler and runtime. That means it sometimes triggers on code which is trying to be portable. (There are other cases where portable code will trigger optional GCC warnings.) That might not be what you were hoping the warning would do, but there are probably users whose expectations are precisely matched by the implemented behaviour, and the standard provides no guidelines whatsoever for compiler warnings.
As OP mentions in a comment, there is a long history related to this warning. The warning was introduced in version 3.3.2 or so (in 2003), apparently not controlled by any -W flag. This was reported as bug 12963 by a user who evidently felt, as you do, that the warning discourages portable programming. As can be seen in the bug report, various GCC maintainers (and other well-known members of the community) weighed in with strongly-felt but conflicting opinions. (This is a common dynamic in open source bug reports.) After several years, the decision was made to control the warning with a flag, and not to enable that flag by default or as part of -Wall. In the meantime, the -W option had been renamed -Wextra, and the newly-created flag (-Wtype-limits) was added to the -Wextra collection. To me, this appears to be the correct resolution.
The remainder of this answer contains my personal opinion.
-Wall, as documented in the GCC manual, does not actually enable all warnings. It enables those warnings "about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros." There are a number of other conditions which GCC can detect:

Note that some warning flags are not implied by -Wall. Some of them warn about constructions that users generally do not consider questionable, but which occasionally you might wish to check for; others warn about constructions that are necessary or hard to avoid in some cases, and there is no simple way to modify the code to suppress the warning. Some of them are enabled by -Wextra but many of them must be enabled individually.
These distinctions are somewhat arbitrary. For example, I have to grit my teeth every time GCC decides to "suggest parentheses around '&&' within '||'". (It doesn't seem to feel the need to suggest parentheses around '*' within '+', which doesn't feel any different to me.) But I recognize that we all have different comfort levels with operator precedence, and not all of GCC's suggestions about parentheses seem excessive to me.
But on the whole, the distinction seems reasonable. There are warnings which are generally applicable, and those are enabled with -Wall, which should always be specified because these warnings almost always demand action to correct a deficiency. There are other warnings which might be useful in particular circumstances, but which also have lots of false positives; these warnings need to be investigated individually, because they do not always (or even often) correspond to a problem in your code.
I'm aware that there are people who feel that the mere fact that GCC knows how to warn about some condition is sufficient to demand action to avoid that warning. Everyone is entitled to their stylistic and aesthetic judgements, and it is right and just that such programmers add -Wextra to their build flags. I am not in that crowd, however. At a given point in a project, I will try a build with a large collection of optional warnings enabled, and consider whether or not to modify my code on the basis of the reports, but I really don't want to spend my development time thinking about non-problems every time I rebuild a file. The -Wtype-limits flag falls into this category for me.
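(For what it's worth, if you want the rest of -Wextra without this particular diagnostic, GCC accepts the usual negated form, so a command line along these lines should work:)

$ gcc -std=c99 -pedantic -Wall -Wextra -Wno-type-limits -c -o math_utils.o math_utils.c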
Nothing requires the maximum of size_t to be larger than int. Such architectures, where SIZE_MAX <= INT_MAX, are rare though, and I doubt GCC would support any of them.
As for the fix, you can use #if (note that INT_MAX lives in <limits.h>; if it isn't included, the preprocessor silently treats the undefined identifier as 0):

#if INT_MAX > SIZE_MAX
    if (u_value > SIZE_MAX) { /* Line 10 */
        exit(EXIT_FAILURE);
    }
#endif
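Folded back into the question's MCVE, a sketch of the warning-free function might look like this (same logic; the range check is simply compiled only where it can matter):

#include <limits.h> /* INT_MAX */
#include <stdint.h> /* SIZE_MAX */
#include <stdlib.h> /* exit EXIT_FAILURE size_t */

extern size_t i2st(int value) {
    if (value < 0) {
        exit(EXIT_FAILURE);
    }
    // Safe to cast --- not too small.
    unsigned int const u_value = (unsigned int) value;
#if INT_MAX > SIZE_MAX
    // Only compiled on platforms where an int can actually exceed SIZE_MAX.
    if (u_value > SIZE_MAX) {
        exit(EXIT_FAILURE);
    }
#endif
    // Safe to cast --- not too big.
    return (size_t) u_value;
}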