
Undefined behavior allowed in constexpr -- compiler bug?

My understanding is that:

  • Signed integer overflow in C++ is undefined behavior
  • Constant expressions are not allowed to contain undefined behavior.

It seems to follow that something like the following should not compile, and indeed on my compiler it doesn't.

template<int n> struct S { };

template<int a, int b>
S<a * b> f()
{
  return S<a * b>();
}

int main(int, char **)
{
  f<50000, 49999>();
  return 0;
}

However, now I try the following instead:

#include <numeric>

template<int n> struct S { };

template<int a, int b>
S<std::lcm(a, b)> g()
{
  return S<std::lcm(a,b)>();
}

int main(int, char **)
{
  g<50000, 49999>();
  return 0;
}

Each of g++, clang, and MSVC will happily compile this, despite the fact that

The behavior is undefined if |m|, |n|, or the least common multiple of |m| and |n| is not representable as a value of type std::common_type_t<M, N>.

(Source: https://en.cppreference.com/w/cpp/numeric/lcm)

Is this a bug in all three compilers? Or is cppreference wrong about lcm's behavior being undefined if it can't represent the result?

Daniel McLaury asked Apr 17 '26 01:04


2 Answers

According to [expr.const]/5, "an operation that would have undefined behavior as specified in [intro] through [cpp]" is not permitted during constant evaluation, but:

If E satisfies the constraints of a core constant expression, but evaluation of E would evaluate an operation that has undefined behavior as specified in [library] through [thread], or an invocation of the va_start macro ([cstdarg.syn]), it is unspecified whether E is a core constant expression.

We usually summarize this as "language UB must be diagnosed in a context that requires a constant expression, but library UB does not necessarily need to be diagnosed".

The reason for this rule is that an operation that causes library UB may or may not cause language UB, and it would be difficult for compilers to consistently diagnose library UB even in cases when it doesn't cause language UB. (In fact, even some forms of language UB are not consistently diagnosed by current implementations.)
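If you want the failure to be diagnosed reliably rather than left unspecified, one option is to check the precondition yourself, so that a violation reaches a construct the language itself forbids in constant evaluation. The following is only a sketch: `checked_lcm` is a hypothetical helper, not a standard library function, and it assumes `long long` is at least 64 bits and wider than `int`.

```cpp
#include <limits>
#include <numeric>
#include <stdexcept>

// Hypothetical helper (not part of the standard library).
// Assumes long long is wider than int, so the intermediate
// values below cannot overflow.
constexpr int checked_lcm(int m, int n)
{
    if (m == 0 || n == 0)
        return 0;  // same convention as std::lcm

    // Take absolute values in the wider type so that |INT_MIN| is representable.
    const long long wm = m < 0 ? -static_cast<long long>(m) : m;
    const long long wn = n < 0 ? -static_cast<long long>(n) : n;

    // Divide before multiplying to keep the intermediate small.
    const long long result = wm / std::gcd(wm, wn) * wn;

    // Reaching a throw-expression during constant evaluation means the call
    // is not a constant expression, so the compiler must reject it there.
    if (result > std::numeric_limits<int>::max())
        throw std::overflow_error("lcm is not representable as int");

    return static_cast<int>(result);
}

static_assert(checked_lcm(4, 6) == 12, "in-range results are constant expressions");
```

With this helper, `S<checked_lcm(50000, 49999)>` should fail to compile on any conforming implementation, because the out-of-range case no longer depends on how the library happens to implement `std::lcm`. At run time the same call throws `std::overflow_error` instead.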

Some people also refer to language UB as "hard" UB and library UB as "soft" UB, but I don't like this terminology because (in my opinion) it encourages users to think of "code for which it's unspecified whether language UB occurs" as somehow less bad than "code that unambiguously has language UB". But in both cases, the result is that the programmer cannot write a program that executes such code and expect anything to work properly.

Brian Bi answered Apr 19 '26 16:04


A late reply, but this was a bug in GCC and is now fixed; see https://gcc.gnu.org/PR105844

Quoting some details:

    When I fixed PR libstdc++/92978 I introduced a regression whereby
    std::lcm(INT_MIN, 1) and std::lcm(50000, 49999) would no longer produce
    errors during constant evaluation. Those calls are undefined, because
    they violate the preconditions that |m| and the result can be
    represented in the return type (which is int in both those cases). The
    regression occurred because __absu<unsigned>(INT_MIN) is well-formed,
    due to the explicit casts to unsigned in that new helper function, and
    the out-of-range multiplication is well-formed, because unsigned
    arithmetic wraps instead of overflowing.
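
To illustrate the point in the quote, here is a small sketch (assuming a 32-bit `int` and `unsigned`; the variable names are made up for illustration). Unsigned arithmetic is modular, so it is always valid in a constant expression, which is why the compiler had nothing to diagnose once the computation was done in `unsigned`. For the values in the question, the product even fits in `unsigned` without wrapping, but it exceeds `INT_MAX`.

```cpp
#include <climits>

// Assumption: int and unsigned are 32 bits wide.
static_assert(sizeof(int) * CHAR_BIT == 32, "sketch assumes 32-bit int");

// Unsigned multiplication never has undefined behavior, so this is a
// valid constant expression.
constexpr unsigned product_u = 50000u * 49999u;
static_assert(product_u == 2499950000u, "fits in 32-bit unsigned");
static_assert(product_u > static_cast<unsigned>(INT_MAX),
              "but is not representable as int");

// When the mathematical result is too large, unsigned arithmetic wraps
// modulo 2^N instead of overflowing.
constexpr unsigned wrapped = UINT_MAX * 2u;
static_assert(wrapped == UINT_MAX - 1u, "wraps modulo 2^N");

// By contrast, the signed version has language UB and is rejected
// if uncommented:
// constexpr int product_s = 50000 * 49999;
```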
doctorlove answered Apr 19 '26 16:04


