Why is GCC tricked into allowing undefined behavior simply by putting it in a loop?

The following is nonsensical yet compiles cleanly with g++ -Wall -Wextra -Werror -Winit-self (I tested GCC 4.7.2 and 4.9.0):

#include <iostream>
#include <string>

int main()
{
  for (int ii = 0; ii < 1; ++ii)
  {
    const std::string& str = str; // !!
    std::cout << str << std::endl;
  }
}

The line marked !! results in undefined behavior, yet is not diagnosed by GCC. However, commenting out the for line makes GCC complain:

error: ‘str’ is used uninitialized in this function [-Werror=uninitialized]

I would like to know: why is GCC so easily fooled here? When the code is not in a loop, GCC knows that it is wrong. But put the same code in a simple loop and GCC doesn't understand anymore. This bothers me because we rely quite a lot on the compiler to notify us when we make silly mistakes in C++, yet it fails for a seemingly trivial case.

Bonus trivia:

If you change std::string to int and turn on optimization, GCC will diagnose the error even with the loop.
If you build the broken code with -O3, GCC literally calls the ostream insert function with a null pointer for the string argument. If you thought you were safe from null references if you didn't do any unsafe casting, think again.

I have filed a GCC bug for this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63203 - I'd still like to get a better understanding here of what went wrong and how it may impact the reliability of similar diagnostics.

472

asked Sep 08 '14 05:09

John Zwinck

1 Answer

"I'd still like to get a better understanding here of what went wrong and how it may impact the reliability of similar diagnostics."

Unlike Clang, GCC doesn't have logic to detect self-initialized references, so getting a warning here relies on the code for detecting use of uninitialized variables, which is quite temperamental and unreliable (see Better Uninitialized Warnings for discussion).

With an int the compiler can figure out that you write an uninitialized int to the stream, but with a std::string there are apparently too many layers of abstraction between an expression of type std::string and getting the const char* it contains, and GCC fails to detect the problem.

For example, GCC does give a warning for a simpler example with less code between the declaration and the use of the variable, as long as you enable some optimization:

extern "C" int printf(const char*, ...);

struct string {
  string() : data(99) { }
  int data;
  void print() const { printf("%d\n", data); }
};

int main()
{
  for (int ii = 0; ii < 1; ++ii)
  {
    const string& str = str; // !!
    str.print();
  }
}

d.cc: In function ‘int main()’:
d.cc:6:43: warning: ‘str’ is used uninitialized in this function [-Wuninitialized]
   void print() const { printf("%d\n", data); }
                                           ^
d.cc:13:19: note: ‘str’ was declared here
     const string& str = str; // !!
                   ^

I suspect this kind of missing diagnostic is only likely to affect a handful of diagnostics which rely on heuristics to detect problems. These would be the ones that give a warning of the form "may be used uninitialized" or "may violate strict aliasing rules", and probably the "array subscript is above array bounds" warning. Those warnings are not 100% accurate and "complicated" logic like loops(!) can cause the compiler to give up trying to analyse the code and fail to give a diagnostic.

IMHO the solution would be to add checking for self-initialized references at the point of initialization, and not rely on detecting it is uninitialized later when it gets used.

198

answered Sep 23 '22 02:09

Jonathan Wakely

Tags: c++, gcc, undefined-behavior, g++, compiler-warnings
