
Does strtol("-2147483648", 0, 0) overflow if LONG_MAX is 2147483647?

Per the specification of strtol:

If the subject sequence has the expected form and the value of base is 0, the sequence of characters starting with the first digit shall be interpreted as an integer constant. If the subject sequence has the expected form and the value of base is between 2 and 36, it shall be used as the base for conversion, ascribing to each letter its value as given above. If the subject sequence begins with a minus-sign, the value resulting from the conversion shall be negated. A pointer to the final string shall be stored in the object pointed to by endptr, provided that endptr is not a null pointer.

The issue at hand is that, prior to the negation, the value is not in the range of long. For example, in C89 (where the integer constant can't take on type long long), writing -2147483648 is possibly an overflow; you have to write (-2147483647-1) or similar.

Since the wording using "integer constant" could be interpreted to apply the C rules for the type of an integer constant, this might be enough to save us from undefined behavior here, but the same issue (without such an easy out) would apply to strtoll.

Edit: Finally, note that even if it did overflow, the "right" value should be returned. So this question is really just about whether errno may or must be set in this case.
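One way to see what a particular C library does is to probe it directly. The sketch below is only an illustration: `probe_long_min` is a hypothetical helper name, and the text is built from `LONG_MIN` (rather than hard-coding "-2147483648") so that it exercises the same edge case whatever the width of long is. It shows what one implementation does, not what the standard requires.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of any library): converts the decimal
 * spelling of LONG_MIN with strtol and reports what happened.
 * Returns 1 if strtol consumed the whole string and returned LONG_MIN,
 * 0 otherwise. *err_out receives errno as observed right after the call.
 */
int probe_long_min(int *err_out)
{
    char buf[64];
    char *end;
    long v;

    assert(err_out != NULL);

    /* Build the text from LONG_MIN so the probe works for any width of long. */
    snprintf(buf, sizeof buf, "%ld", LONG_MIN);

    errno = 0;                 /* strtol never clears errno on success */
    v = strtol(buf, &end, 0);
    *err_out = errno;

    return v == LONG_MIN && *end == '\0';
}
```

If the helper returns 1 and leaves 0 in `*err_out`, the library under test treats the most negative value as in range and does not report ERANGE for it.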

R.. GitHub STOP HELPING ICE asked Jun 08 '13 19:06



1 Answer

Although I cannot point to a particular bit of wording in the standard today, when I wrote strtol for 4BSD back in the 1990s I was pretty sure that this should not set errno, and made sure that mine would not. Whether this was based on wording in the standard or on a personal discussion with someone, I no longer recall.

In order to avoid overflow, this means the calculation has to be done pretty carefully. I did it in unsigned long and included this comment (still in the libc source in the various BSDs):

    /*
     * Compute the cutoff value between legal numbers and illegal
     * numbers.  That is the largest legal value, divided by the
     * base.  An input number that is greater than this value, if
     * followed by a legal input character, is too big.  One that
     * is equal to this value may be valid or not; the limit
     * between valid and invalid numbers is then based on the last
     * digit.  For instance, if the range for longs is
     * [-2147483648..2147483647] and the input base is 10,
     * cutoff will be set to 214748364 and cutlim to either
     * 7 (neg==0) or 8 (neg==1), meaning that if we have accumulated
     * a value > 214748364, or equal but the next digit is > 7 (or 8),
     * the number is too big, and we will return a range error.
     *
     * Set 'any' if any `digits' consumed; make it negative to indicate
     * overflow.
     */

I was (and still am, to some extent) annoyed by the asymmetry between this action in the C library and the syntax of the language itself, where negative numbers are two separate tokens, - followed by the number, so that writing -2147483648 means -(2147483648), which under C89 rules becomes -(2147483648U), which is of course 2147483648U and hence positive! (Assuming 32-bit int and long of course; the problematic value varies for other bit sizes.)
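The "hence positive" step is just unsigned wraparound, which can be shown without depending on how a given compiler types the constant. A minimal sketch, using a hypothetical helper `negate_u32` and a fixed-width `uint32_t` to stand in for a 32-bit unsigned type:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: negates a 32-bit unsigned value. Unsigned
 * arithmetic wraps modulo 2^32, so the result is never negative.
 */
uint32_t negate_u32(uint32_t x)
{
    uint32_t r = (uint32_t)(0u - x);

    /* Modular arithmetic: x + (-x) is always 0 modulo 2^32. */
    assert((uint32_t)(x + r) == 0u);
    return r;
}
```

Here `negate_u32(2147483648U)` gives back 2147483648U, since 2^32 - 2^31 is 2^31, so negating the unsigned constant changes nothing.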

torek answered Dec 19 '22 07:12
