
Clarification on difference in ODR rules for structs in C and C++

I am aware of how ODR, linkage, static, and extern "C" work with functions. But I am not sure about visibility of types since they cannot be declared static and there are no anonymous namespaces in C.

In particular, I would like to know the validity of the following code if compiled as C and C++

// A.{c,cpp}
typedef struct foo_t{
    int x;
    int y;
} Foo;

static int use_foo() 
{ 
    Foo f;
    f.x=5;
    return f.x;
}

// B.{c,cpp}
typedef struct foo_t{
    double x;
} Foo;

static int use_foo() 
{ 
    Foo f;
    f.x=5.0;
    return f.x;// Cast on purpose
}

I compiled the files with the following two commands (I know both compilers autodetect the language based on extensions, hence the different names):

  • g++ -std=c++17 -pedantic -Wall -Wextra a.cpp b.cpp
  • gcc -std=c11 -pedantic -Wall -Wextra a.c b.c

Version 8.3 of both compilers happily compiles both programs without any errors. Clearly, if both struct names have external linkage, there is an ODR violation because the definitions are not identical. The compiler is not required to report it, and neither did, hence my question.

Is it a valid C++ program?

I do not think so, that is what anonymous namespaces are for.

Is it a valid C program?

I am not sure here. I have read that types are considered static, which would make the program valid. Can someone please confirm?

C/C++ compatibility

If these definitions were in public header files, perhaps in different C libraries, and a C++ program included both, each in a different TU, would that be an ODR violation? How can one prevent this? Does extern "C" play any role?

Quimby asked Oct 20 '21 08:10




3 Answers

As references, I will use draft n1570 (C11) for the C language and draft n4860 (C++20) for the C++ language.

  1. C language

    Types have no linkage in C: 6.2.2 Linkages of identifiers §6:

    The following identifiers have no linkage: an identifier declared to be anything other than an object or a function...

    That means that the types used in a.c and b.c are unrelated: each translation unit correctly declares objects of its own, separate type.

  2. C++ language

    Types do have linkage in C++. 6.6 Program and linkage [basic.link] says (emphasis mine):

    • §2:

    A name is said to have linkage when it might denote the same object, reference, function, type, template, namespace or value as a name introduced by a declaration in another scope

    • §4

    An unnamed namespace or a namespace declared directly or indirectly within an unnamed namespace has internal linkage. All other namespaces have external linkage. A name having namespace scope that has not been given internal linkage above and that is the name of
    ...
    a named class...
    ...
    has its linkage determined as follows:
    — if the enclosing namespace has internal linkage, the name has internal linkage;
    — otherwise, if the declaration of the name is attached to a named module (10.1) and is not exported (10.2), the name has module linkage;
    — otherwise, the name has external linkage

    The types declared in a.cpp and b.cpp share the same name with external linkage, but their definitions differ: this violates the one-definition rule, and the program is ill-formed (no diagnostic required).
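As the question already hints, one way to make the C++ version well-formed is to give each translation unit's type internal linkage. A minimal sketch of what a.cpp might look like, assuming Foo is only needed inside that file (b.cpp would wrap its own Foo the same way):

```cpp
// a.cpp (sketch): the unnamed namespace gives Foo internal linkage,
// so it no longer names the same entity as a Foo defined in b.cpp.
namespace {

struct Foo {
    int x;
    int y;
};

}  // unnamed namespace

static int use_foo()
{
    Foo f{};   // value-initialize so that y is not left indeterminate
    f.x = 5;
    return f.x;
}
```

This only helps for types private to a C++ source file. It is not available for a header that must also compile as C.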


That being said, most common compilers are able to compile either C or C++ sources, and I would bet a coin that they try hard to share most of the implementation of both languages. For that reason, I would trust real-world implementations to produce the expected results even for the C++ language. But Undefined Behaviour does not forbid expected results...

Serge Ballesta answered Oct 23 '22 10:10


For C: the program is valid. The only requirement that applies here is the "strict aliasing rule", which says that an object can be accessed only via an lvalue of a compatible type (plus a few exceptions described in 6.5p7).

The compatibility of structures/unions defined in separate translation units is defined in 6.2.7p1.

... two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values.

Therefore the structures are not compatible in the example.

However, this is not an issue, because the f object is created and accessed via the locally defined type. UB would be invoked if the object were created with the Foo type defined in one translation unit and accessed via the other Foo type in the other translation unit:

// A.c
typedef struct foo_t{
    int x;
    int y;
} Foo;

void bar(void *f);

void foo() 
{ 
    Foo f;
    bar(&f);
}

// B.c
typedef struct foo_t{
    double x;
} Foo;

// using void* to avoid passing pointer to incompatible types
void bar(void *f_) 
{ 
    Foo *f = f_;
    f->x=5.0; // UB!
}
tstanisl answered Oct 23 '22 11:10


Other answers point out that this is an ill-formed program in C++.

In practice, link errors on overloaded functions would be possible if you have two separate definitions of (non-static) void foo(bar); in separate translation units. I expect this is (part of) why C++ has this rule that (some) types have external linkage.

If types were truly private, those wouldn't conflict. But they'll name-mangle the same way, because if both TUs do have the same definition of the type bar (e.g. via a .h or manual copying), they need to resolve to calling the same function.

// A.cpp
typedef struct foo{  // names ending with _t are reserved
    int x;
    int y;
} Foo;

int take_foo(Foo f) {
    return f.x;
}

int main(){}  // so it's linkable without special options like -nostdlib and linker entry-point defaults

// B.{c,cpp}
typedef struct foo{
    double x;
} Foo;

double take_foo(Foo f) {
    return f.x;
}

In case it matters, these functions will compile to different machine code on some targets, including x86-64 System V ABI where I tested it. (The first double arg is already in the return-value register, even if inside a struct containing only a couple doubles. But unlike ARM64 and some other RISCs, the first integer arg is not passed in the return-value register, so a mov is required before the ret.)

$ g++ [AB].cpp
/usr/bin/ld: /tmp/ccM89kvx.o: in function `take_foo(foo)':
B.cpp:(.text+0x0): multiple definition of `take_foo(foo)'; /tmp/cckZ5qRG.o:A.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status

There's no error if the functions or the struct tags have different names. (And yes, I compiled with optimization disabled, and no link-time optimization, so nothing had a chance to remove unused functions before they conflicted.)
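To illustrate why distinct type names avoid the clash: once the two structs are different entities, the two take_foo functions are simply overloads with different mangled names. The sketch below puts both in one file and uses two hypothetical namespaces, lib_a and lib_b, as stand-ins for the two translation units. These namespace names are assumptions for illustration and were not part of the test above.

```cpp
#include <type_traits>

// Stand-in for the type from A.cpp (namespace name is hypothetical).
namespace lib_a {
typedef struct foo {
    int x;
    int y;
} Foo;
}

// Stand-in for the type from B.cpp (namespace name is hypothetical).
namespace lib_b {
typedef struct foo {
    double x;
} Foo;
}

// The two structs share a tag but live in different namespaces,
// so they are distinct types.
static_assert(!std::is_same<lib_a::Foo, lib_b::Foo>::value,
              "lib_a::foo and lib_b::foo are different types");

// Because the parameter types differ, these are two overloads,
// each with its own mangled name, and they can be linked together.
int take_foo(lib_a::Foo f) { return f.x; }
double take_foo(lib_b::Foo f) { return f.x; }
```

Calling take_foo with a lib_a::Foo selects the int-returning overload, and calling it with a lib_b::Foo selects the double-returning one.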

However, just changing the typedef name without changing the struct tag isn't sufficient. That makes sense; all typedefs for the same type need to resolve to the same asm name, so GCC mangles based on the struct tag even if you don't use it directly. Note the linker error messages demangling it back to take_foo(foo) not Foo.
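That a typedef is only an alias can be checked directly in the language, independent of any particular mangling scheme. A small sketch, where AnotherName and same_type_at_runtime are hypothetical names added for illustration:

```cpp
#include <type_traits>
#include <typeinfo>

typedef struct foo {
    int x;
    int y;
} Foo;

// A second alias for the same struct.
typedef foo AnotherName;

// All three names denote one and the same type.
static_assert(std::is_same<Foo, foo>::value,
              "Foo is an alias for struct foo");
static_assert(std::is_same<AnotherName, foo>::value,
              "AnotherName is an alias for struct foo");

// The same check done through run-time type information.
bool same_type_at_runtime()
{
    return typeid(Foo) == typeid(foo);
}
```

Since every alias refers to the same type, a function taking Foo and a function taking AnotherName have the same signature, which is consistent with the mangled name being derived from the struct tag.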

I didn't go through the standard wording to see if two typedef ... Foo would be legal in ISO C++, despite not being a problem in practice for real-world C++ implementations.

Making either function static would fix the problem, too, because it's fine for static functions to have the same asm name.

This would also be a linker error if compiled as C: C has no function overloading, so two non-static take_foo functions in the same program are already a problem, whether or not their arguments are structs with the same tag name.

Peter Cordes answered Oct 23 '22 11:10