C safely taking absolute value of integer

Tags:

Consider following program (C99):

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

int main(void)
{
    printf("Enter int in range %jd .. %jd:\n > ", INTMAX_MIN, INTMAX_MAX);
    intmax_t i;
    if (scanf("%jd", &i) == 1)
        printf("Result: |%jd| = %jd\n", i, imaxabs(i));
}

Now as I understand it, this contains easily triggerable undefined behaviour, like this:

Enter int in range -9223372036854775808 .. 9223372036854775807:
 > -9223372036854775808
Result: |-9223372036854775808| = -9223372036854775808

Questions:

Is this really undefined behaviour, as in "code is allowed to trigger any code path, which any code that stroke compiler's fancy", when user enters the bad number? Or is it some other flavor of not-completely-defined?
How would a pedantic programmer go about guarding against this, without making any assumptions not guaranteed by standard?

(There are a few related questions, but I didn't find one which answers question 2 above, so if you suggest duplicate, please make sure it answers that.)

507

asked Feb 07 '16 08:02

2 Answers

If the result of imaxabs cannot be represented, can happen if using two's complement, then the behavior is undefined.

7.8.2.1 The imaxabs function

The imaxabs function computes the absolute value of an integer j. If the result cannot be represented, the behavior is undefined. 221)

221) The absolute value of the most negative number cannot be represented in two’s complement.

The check that makes no assumptions and is always defined is:

intmax_t i = ... ;
if( i < -INTMAX_MAX )
{
    //handle error
}

(This if statement cannot be taken if using one's complement or sign-magnitude representation, so the compiler might give a unreachable code warning. The code itself is still defined and valid. )

answered Sep 28 '22 10:09

2501

How would a pedantic programmer go about guarding against this, without making any assumptions not guaranteed by standard?

One method is to use unsigned integers. The overflow behaviour of unsigned integers is well-defined as is the behaviour when converting from a signed to an unsigned integer.

So I think the following should be safe (turns out it's horriblly broken on some really obscure systems, see later in the post for an improved version)

uintmax_t j = i;
if (j > (uintmax_t)INTMAX_MAX) {
  j = -j;
}
printf("Result: |%jd| = %ju\n", i, j);

So how does this work?

uintmax_t j = i;

This converts the signed integer into an unsigned one. IF it's positive the value stays the same, if it's negative the value increases by 2ⁿ (where n is the number of bits). This converts it to a large number (larger than INTMAX_MAX)

if (j > (uintmax_t)INTMAX_MAX) {

If the original number was positive (and hence less than or equal to INTMAX_MAX) this does nothing. If the original number was negative the inside of the if block is run.

  j = -j;

The number is negated. The result of a negation is clearly negative and so cannot be represented as an unsigned integer. So it is increased by 2ⁿ.

So algebraically the result for negative i looks like

j = - (i + 2ⁿ) + 2ⁿ = -i

Clever, but this solution makes assumptions. This fails if INTMAX_MAX == UINTMAX_MAX, which is allowed by C Standard.

Hmm, lets look at this (i'm reading https://busybox.net/~landley/c99-draft.html which is apprarently the last C99 draft prior to standardisation, if anything changed in the final standard please do tell me.

When typedef names differing only in the absence or presence of the initial u are defined, they shall denote corresponding signed and unsigned types as described in 6.2.5; an implementation shall not provide a type without also providing its corresponding type.

In 6.2.5 I see

For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.

In 6.2.6.2 I see

#1

For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2N-1, so that >objects of that type shall be capable of representing values from 0 to 2N-1 >using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.39)

#2

For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; there shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M<=N). If the sign bit is zero, it shall not affect the resulting value.

So yes it seems you are right, while the signed and unsigned types have to be the same size it does seem to be valid for the unsigned type to have one more padding bit than the signed type.

Ok, based on the analysis above revealing a flaw in my first attempt i've written a more paranoid variant. This has two changes from my first version.

I use i < 0 rather than j > (uintmax_t)INTMAX_MAX to check for negative numbers. This means that the algorithm proceduces correct results for numbers grater than or equal to -INTMAX_MAX even when INTMAX_MAX == UINTMAX_MAX.

I add handling for the error case where INTMAX_MAX == UINTMAX_MAX, INTMAX_MIN == -INTMAX_MAX -1 and i == INTMAX_MIN. This will result in j=0 inside the if condition which we can easilly test for.

It can be seen from the requirements in the C standard that INTMAX_MIN cannot be smaller than -INTMAX_MAX -1 since there is only one sign bit and the number of value bits must be the same or lower than in the corresponding unsigned type. There are simply no bit patterns left to represent smaller numbers.

uintmax_t j = i;
if (i < 0) {
  j = -j;
  if (j == 0) {
    printf("your platform sucks\n");
    exit(1);
  }
}
printf("Result: |%jd| = %ju\n", i, j);

@plugwash I think 2501 is correct. For example, -UINTMAX_MAX value becomes 1: (-UINTMAX_MAX + (UINTMAX_MAX + 1)), and is not caught by your if. – hyde 58 mins ago

Umm,

assuming INTMAX_MAX == UINTMAX_MAX and i = -INTMAX_MAX

uintmax_t j = i;

after this command j = -INTMAX_MAX + (UINTMAX_MAX + 1) = 1

if (i < 0) {

i is less than zero so we run the commands inside the if

j = -j;

after this command j = -1 + (UINTMAX_MAX + 1) = UINTMAX_MAX

which is the correct answer, so no need to trap it in an error case.

answered Sep 28 '22 11:09

plugwash

Related questions
                            
                                Mongo: network error while attempting to run command 'whatsmyuri' on host
                            
                                Pandas dropping columns by index drops all columns with same name
                            
                                How do I make PyCharm recognize placeholders in SQL statements?
                            
                                ggplot2: geom_text resize with the plot and force/fit text within geom_bar
                            
                                WebSocket connection failure. Error during Websocket handshake. Response Code 403?
                            
                                Queue like Subject in RxJava
                            
                                Redux - How to add entry to array in reducer
                            
                                C++11: the difference between memory_order_relaxed and memory_order_consume
                            
                                docker-compose up -d <name>; No such service: <name>
                            
                                Spring Environment backed by Typesafe Config
                            
                                Const member function and typedef, C++
                            
                                Can I share ACM SSL certificates between AWS acounts [closed]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

C safely taking absolute value of integer

Tags:

hyde

People also ask

2 Answers

2501

plugwash

Recent Activity

Donate For Us