What is the difference between "safe" and "unsafe" code in C/C++?
I've read that "C++ is unsafe in ways which cause serious security vulnerabilities" in the article How Rust Compares to Other Languages and More. What is insecure about unsafe code?
What Does Unsafe Mean? In C#, unsafe is a keyword used to mark a section of code that is not managed by the Common Language Runtime (CLR) of the .NET Framework, i.e. unmanaged code. It can be used in the declaration of a type or member, or to mark a block of code.
In general, safe code doesn't directly access memory using pointers. It also doesn't allocate raw memory. It creates managed objects instead. C# supports an unsafe context, in which you may write unverifiable code.
In C#, unsafe code is code marked with the unsafe keyword, denoting a section that is not handled by the Common Language Runtime (CLR). Pointers are not supported by default in C#, but the unsafe keyword allows the use of pointer variables.
The unsafe keyword denotes an unsafe context, which is required for any operation involving pointers. For more information, see Unsafe Code and Pointers. You can use the unsafe modifier in the declaration of a type or a member. The entire textual extent of the type or member is therefore considered an unsafe context.
I believe John Regehr's article A Guide to Undefined Behavior in C and C++, Part 1 gives a good overview of what the article is getting at:
Programming languages typically make a distinction between normal program actions and erroneous actions. For Turing-complete languages we cannot reliably decide offline whether a program has the potential to execute an error; we have to just run it and see.
In a safe programming language, errors are trapped as they happen. Java, for example, is largely safe via its exception system. In an unsafe programming language, errors are not trapped. Rather, after executing an erroneous operation the program keeps going, but in a silently faulty way that may have observable consequences later on. Luca Cardelli’s article on type systems has a nice clear introduction to these issues. C and C++ are unsafe in a strong sense: executing an erroneous operation causes the entire program to be meaningless, as opposed to just the erroneous operation having an unpredictable result. In these languages erroneous operations are said to have undefined behavior.
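To make the trapped/untrapped distinction concrete, here is a minimal C sketch. The function names unchecked_get and checked_get are hypothetical and do not come from either article. The first function reads whatever index it is given, so an out-of-range index is undefined behavior rather than a reported error; the second approximates by hand the check that a safe language performs automatically.

```c
#include <stdbool.h>
#include <stddef.h>

/* Unsafe: nothing stops idx from running past the end of arr.
   An out-of-range idx is undefined behavior, not a trapped error. */
int unchecked_get(const int *arr, size_t idx) {
    return arr[idx];
}

/* Hand-written "trap": report the error instead of performing the access.
   Returns true and stores the element in *out only when idx is in range. */
bool checked_get(const int *arr, size_t len, size_t idx, int *out) {
    if (arr == NULL || out == NULL || idx >= len) {
        return false;
    }
    *out = arr[idx];
    return true;
}
```

A caller would test the return value of checked_get before using the result, which is roughly the role that an exception plays in Java.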
So once we go into the realm of undefined behavior we now have "unsafe" code. Another good article that covers undefined behavior as unsafe code is What Every C Programmer Should Know About Undefined Behavior #2/3:
In Part 1 of our series, we discussed what undefined behavior is, and how it allows C and C++ compilers to produce higher performance applications than "safe" languages. This post talks about how "unsafe" C really is, explaining some of the highly surprising effects that undefined behavior can cause. In Part #3, we talk about what friendly compilers can do to mitigate some of the surprise, even if they aren't required to.
I like to call this "Why undefined behavior is often a scary and terrible thing for C programmers". :-)
C and C++ are specified by their respective standards (links to the latest ones can be found here), and those standards leave a lot of behavior undefined, which basically means the behavior is unpredictable. The C++ standard defines undefined behavior as follows:
behavior for which this International Standard imposes no requirements [ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. —end note ]
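One construct that falls under this definition in both C and C++ is signed integer overflow. As a hedged illustration, the sketch below checks the operands before adding so that the overflowing addition is never executed. The name add_checked is an assumption made for this example and is not taken from the standard or the articles.

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b only when the mathematical result fits in an int.
   Returns false (leaving *out untouched) when a + b would overflow,
   so the undefined operation is never performed. */
bool add_checked(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

The comparisons are arranged so that the check itself cannot overflow: INT_MAX - b is only evaluated for positive b, and INT_MIN - b only for negative b.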
The compiler is not required to provide a diagnostic for undefined behavior, and there are many cases where undefined behavior has led to security vulnerabilities. One of the better-known cases is probably the Linux kernel null pointer check removal:
The idea is to look for code that becomes dead when a C/C++ compiler is smart about exploiting undefined behavior. The classic example of this class of error was found in the Linux kernel several years ago. The code was basically:
    struct foo *s = ...;
    int x = s->f;
    if (!s) return ERROR;
    ... use s ...
The problem is that the dereference of s in line 2 permits a compiler to infer that s is not null (if the pointer is null then the function is undefined; the compiler can simply ignore this case). Thus, the null check in line 3 gets silently optimized away and now the kernel contains an exploitable bug if an attacker can find a way to invoke this code with a null pointer
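A self-contained sketch of the same pattern, next to the usual fix, might look like the following. The struct layout, the ERROR value, and the two function names are assumptions made for illustration; the actual kernel code is not reproduced here.

```c
#include <stddef.h>

#define ERROR (-1)          /* hypothetical error code */

struct foo { int f; };      /* hypothetical layout */

/* Same shape as the quoted code: the dereference comes first, so a
   compiler may assume s is non-null and drop the check that follows.
   Calling this with a null pointer is undefined behavior. */
int read_f_deref_first(struct foo *s) {
    int x = s->f;
    if (!s) return ERROR;
    return x;
}

/* Fix: test the pointer before the first dereference, so the check
   is reached on every path and cannot be treated as dead code. */
int read_f_check_first(struct foo *s) {
    if (!s) return ERROR;
    return s->f;
}
```

Only the second function can safely be handed a null pointer; the first one is well defined only when the caller already guarantees that s is valid.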
Most of the time, avoiding undefined behavior is a matter of good coding practice and using the right tools, such as UBSan, but there are some obscure cases, such as infinite loops, that many developers may find surprising.
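As a rough sketch of how a tool and a coding practice complement each other, consider shifting by an amount that is at least the width of the type, which is undefined behavior in C and C++. The function names below are hypothetical. Built with -fsanitize=undefined on GCC or Clang, the unchecked variant should produce a run-time report when it is called with an out-of-range shift count, while the checked variant rejects such a count up front.

```c
#include <limits.h>
#include <stdbool.h>

/* Undefined behavior when n >= the bit width of unsigned;
   UBSan is designed to flag this at run time. */
unsigned shift_left_unchecked(unsigned x, unsigned n) {
    return x << n;
}

/* Good-practice variant: refuse shift counts the language does not define. */
bool shift_left_checked(unsigned x, unsigned n, unsigned *out) {
    if (n >= sizeof(unsigned) * CHAR_BIT) {
        return false;
    }
    *out = x << n;
    return true;
}
```

The sanitizer only reports problems on paths that actually execute, so it works best alongside tests that exercise edge cases such as the maximum shift count.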