Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which enum values are undefined behavior in C++14, and why?

A footnote in the standard implies that any enum expression value is defined behavior; why does Clang's undefined behavior sanitizer flag out-of-range values?

Consider the following program:

enum A {B = 3, C = 7};

int main() {
  A d = static_cast<A>(8);
  return d + B;
}

The output under the undefined behavior sanitizer is:

$ clang++-5.0 -fsanitize=undefined -ggdb3 enum.cc && ./a.out 
enum.cc:5:10: runtime error: load of value 8, which is not a valid value for type 'A'

Note that the error is not on the static_cast, but on the addition. This is also true when an A is created (but not initialized) and then an int with value 8 is memcpyied into the A - the ubsan error is on the addition, not the initial load.

IIUC, ubsan in newer clangs does flag an error on the static_cast in C++17 mode. I don't know if that mode also finds an error in the memcpy. In any case, this question is focused on C++14.

The reported error comports with the following parts of the standard:

dcl.enum:

For an enumeration whose underlying type is fixed, the values of the enumeration are the values of the underlying type. Otherwise, the values of the enumeration are the values representable by a hypothetical integer types with minimal range exponent M such that all enumerators can be represented. The width of the smallest bit-field large enough to hold all the values of the enumeration type is M. It is possible to define an enumeration that has values not defined by any of its enumerators. If the enumerator-list is empty, the values of the enumeration are as if the enumeration had a single enumerator with value 0.100

So the values of the enumeration A are 0 through 7, inclusive, and the "range exponent" M is 3. Evaluating an expression of type A with value 8 is undefined behavior according to expr.pre:

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.

But there is one hiccup: that footnote from dcl.enum reads:

This set of values is used to define promotion and conversion semantics for the enumeration type. It does not preclude an expression of enumeration type from having a value that falls outside this range. [emphasis mine]

Question: Why is an expression with value 8 and type A undefined behavior if "[dcl.enum] does not preclude an expression of enumeration type from having a value that falls outside this range"?

like image 932
jbapple Avatar asked Jan 26 '19 15:01

jbapple


People also ask

Can enum have float values C++?

Enums can only be ints, not floats in C# and presumably unityScript.

What are enums used for C++?

Enum, which is also known as enumeration, is a user-defined data type that enables you to create a new data type that has a fixed range of possible values, and the variable can select one value from the set of values.

How do you declare an enum in C++?

An enumeration is a user-defined data type that consists of integral constants. To define an enumeration, keyword enum is used. enum season { spring, summer, autumn, winter }; Here, the name of the enumeration is season .

How do you pass an enum to a function in C++?

static class Myclass { ... public: enum encoding { BINARY, ASCII, ALNUM, NUM }; Myclass(Myclass::encoding); ... } Then in the method definition: Myclass::Myclass(Myclass::encoding enc) { ... }


2 Answers

Note the phrasing of footnote 100: "[This set of values] does not preclude [stuff]." This is not an endorsement of the "stuff" as valid; it merely emphasizes that this section does not declare the stuff invalid. It is in fact a neutral statement that should bring to mind the fallacy of the excluded middle. As far as this section goes, values outside the values of the enumeration are neither approved nor disapproved. This section defines which values are outside the values of the enumeration, but it is left to other sections (like expr.pre) to decide the validity of using such values.

You can think of this footnote as a warning to those writing compilers: do not assume! An expression of enumeration type need not have a value within the enumeration's set of values. Such a case must compile correctly unless another section classifies that case as undefined behavior.


For a better understanding of what exactly clang is complaining about, try the following code:

enum A {B = 3, C = 7};

int main() {
  // Set a variable of type A to a value outside A's set of values.
  A d = static_cast<A>(8);

  // Try to evaluate an expression of type A with this too-big value.
  if ( !static_cast<bool>(static_cast<A>(8)) )
    return 2;

  // Try again, but this time load the value from d.
  if ( !static_cast<bool>(d) ) // Sanitizer flags only this
    return 1;

  return 0;
}

The sanitizer does not complain about forcing a value of 8 into a variable of type A. It does not complain about evaluating an expression of type A that happens to have the value 8 (the first if). It does, though, complain when the value of 8 comes from (is loaded from) a variable of type A.

like image 186
JaMiT Avatar answered Oct 22 '22 12:10

JaMiT


Clang flags the use of static_cast on a value that is out of range. The behavior is undefined, if the integral value is not within the range of the enumeration.

C++ standard 5.2.9 Static cast [expr.static.cast] paragraph 7

A value of integral or enumeration type can be explicitly converted to an enumeration type. The value is unchanged if the original value is within the range of the enumeration values (7.2). Otherwise, the resulting enumeration value is unspecified / undefined (since C++17).

like image 5
KamilCuk Avatar answered Oct 22 '22 12:10

KamilCuk