Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does this program with bounded recursion have undefined behavior?

I know that infinite recursion or iteration is undefined behavior, but bounded is not. But, this program segment faults for most inputs. Does it have undefined behavior, why or why not? If it has undefined behavior, is there some modification I could make that would remove the undefined behavior?

#include <cstdint>

using bigint = ::std::uint64_t;
constexpr const bigint add_val = 1442695040888963407;
constexpr const bigint mult_val = 6364136223846793005;

bigint compute(bigint t)
{
   if (t > 1) {
      return add_val + compute(t * mult_val);
   } else {
      return 1;
   }
}

int main(int argc, char const * const argv[])
{
   return compute(argc < 0 ? -argc : argc);
}

This is basically using recursion to cycle through all the values of a linear congruential random number generator with a period of 2^64 so it's guaranteed to eventually come to 0 or 1.

I'm asking a very clear and specific question. Does this program that anybody can compile and run in its entirety invoke undefined behavior according to the C++ standard?

like image 552
Omnifarious Avatar asked Jun 28 '19 17:06

Omnifarious


People also ask

What is undefined behavior in programming?

In computer programming, undefined behaviour is defined as 'the result of compiling computer code which is not prescribed by the specs of the programming language in which it is written'.

Does Java have undefined behavior?

Java is thus two steps removed from Modern C's model of "Undefined Behavior"; it rigidly prescribes many more behaviors, and even in cases where behaviors are not rigidly defined implementations are still limited to choosing from among various possibilities.

Does Python have undefined behavior?

Python has undefined behavior because it's implementations are written in languages that do.

Is printf undefined behavior?

printf("%f\n",0); Above line of code is undefined behavior.


2 Answers

With thanks to 101010's standard link, I think the relevant phrase is

4.1 Implementation compliance [intro.compliance]

(2.1) - If a program contains no violations of the rules in this document, a conforming implementation shall, within its resource limits, accept and correctly execute that program.

and

Annex B (informative)

Implementation quantities [implimits]

mostly addresses compiler limitations (max symbol length, characters per line etc.), but the language sounds related:

  1. Because computers are finite, C++ implementations are inevitably limited in the size of the programs they can successfully process. Every implementation shall document those limitations where known. This documentation may cite fixed limits where they exist, say how to compute variable limits as a function of available resources, or say that fixed limits do not exist or are unknown.

  2. The limits may constrain quantities that include those described below or others ...

So, the standard doesn't say which quantities may be limited by an implementation, or give anything other than guidelines for the minimum values of those limits it does describe.

If your compiler doesn't document its stack depth limit, I guess it may be non-conforming according to the bold sentence, but a statement that "stack depth is limited by these runtime properties" might be sufficient..

Does this program that anybody can compile and run in its entirety invoke undefined behavior according to the C++ standard?

No. But per 4.1/2.1 implementations are permitted to fail to compile, or fail correctly to execute, a correct program.

like image 101
Useless Avatar answered Oct 01 '22 14:10

Useless


Facts first, from draft standard N4800 §7.1/P4 [expr.pre] (Emphasis Mine):

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. [Note: Treatment of division by zero, forming a remainder using a zero divisor, and all floating-point exceptions vary among machines, and is sometimes adjustable by a library function. — end note]

Also the clause §6.7.1/p2 Fundamental types [basic.fundamental] (Emphasis Mine):

For each of the standard signed integer types, there exists a corresponding (but different) standard unsigned integer type: “unsigned char”, “unsigned short int”, “unsigned int”, “unsigned long int”, and “unsigned long long int”. Likewise, for each of the extended signed integer types, there exists a corresponding extended unsigned integer type. The standard and extended unsigned integer types are collectively called unsigned integer types. An unsigned integer type has the same range exponent N as the corresponding signed integer type. The range of representable values for the unsigned type is 0 to 2^N − 1 (inclusive); arithmetic for the unsigned type is performed modulo 2^N . [Note: Unsigned arithmetic does not overflow. Overflow for signed arithmetic yields undefined behavior (7.1). — end note]

Also §5.13.2/p2 Integer literals [lex.icon]

The type of an integer literal is the first of the corresponding list in Table 7 in which its value can be represented.

enter image description here

Also §7.4 Usual arithmetic conversions [expr.arith.conv]

(1.5) — Otherwise, the integral promotions (7.3.6) shall be performed on both operands60. Then the following rules shall be applied to the promoted operands:

  • (1.5.1) — If both operands have the same type, no further conversion is needed.

  • (1.5.2) — Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.

  • (1.5.3) — Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.

  • (1.5.4) — Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.

  • (1.5.5) — Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type

So the question is: Are the arithmetic results in this program mathematically defined and within the range of representable values for its types. More specifically, is the expression 1442695040888963407 + compute(t * 6364136223846793005) within the range of representable values for its types?

For this to happen the type of integer literals 1442695040888963407 and 6364136223846793005 must fall to lesser or equal conversion rank with std::uint64_t so that the results convert to std::uint64_t. Unfortunatelly, there's no such guarantee.

Thus, for your program to avoid UB I would mark the integer literals with LU.

bigint compute(bigint t)
{
   if (t > 1) {
      return 1442695040888963407LU + compute(t * 6364136223846793005LU);
   } else {
      return 1;
   }
}

Now, as to why you get segmentation fault, is due to the fact that you overflow your stack. Although, the above program theoretically doesn't have infinite number of recursions which is UB, the number of recursions exhausts your machine's resources.

like image 38
101010 Avatar answered Oct 01 '22 15:10

101010